Tuesday, 26 May 2015

Endorsing web scraping

With more than 200 projects delivered, we stand firmly for new challenges every day. We have served above 60 clients and have won 86% of repeat business, as our main core is customer delight. Successive Softwares was approached by a client having a very exclusive set of requirements. For their project they required customised data mining, in real time to offer profitable information to their customers. Requirement stated scrapping of stock exchange data in real time so that end users can be eased in their marketing decisions. This posed as an ambitious task for us because it required processing of huge amount of data on a routine basis. We welcomed it as an event to evolve and do something aside of classic web application development.

We started with mock-ups, pursuing our very first step of IMPART Framework (Innovative Mock-up based Prototypes Analyzed to develop Reengineered Technology). Our team of experts thought of all the potential requirements with a flow and materialized it flawlessly into our mock up. It was a strenuous tasks but our excitement to do something which others still do not think of, filled our team with confidence and energy and things began to roll out perfectly. We presented our mock-up and statistics to the client as per our expectation client choose us, impressed with the efforts.

We started gathering requirements from client side and started to formulate design about the flow. The project required real time monitoring of stock exchange together with Prices, Market Turnover and then implement them into graphs. The front end part was an easy deal, we were already adept in playing with data the way required. The intractable task was to get the data. We researched and found that it can be achieved either with API or with Web Scarping and we moved with latter because of the limitations in API.

Web scraping is a compelling technique to get the required information straight out of the web page. Lack of documentation and not much forbearance forced us to make a slow start, but we kept all the requirements clear and new that we headed in the right direction.  We divided the scraping process into bits of different but related tasks. Firstly we needed to find the data which has to be captured, some of the problems faced were pagination and use of AJAX but with examination of endpoints in URL and the requests made when data is drawn, we surmounted these problems easily.

After targeting our data we focused on HTML parser which could extract data form all the targets. Using PHP we developed a parser extracting all the information and saving them in Database in a structured way.  After the required data present at our end we easily manipulated it into tables and charts and we used HIGHSTOCK for that. Entire Client side was developed in PHP with Zend frame work and we used MySQL 5.7 for server side.

During the whole development cycle our QA team insured we were delivering a quality product following all standards. We kept our client in the loop during the whole process keeping them informed about every step. Clients were also assured as they watched their project starting from scratch which developed into full fledge website. The process followed a strict time line releasing regular builds and implementing new improvements. We stood up to the expectation our client and delivered a product just as they visualized it to be.

Source: http://www.successivesoftwares.com/endorsing-web-scraping/

No comments:

Post a Comment