Webscraping - easy

In progress Posted Jan 5, 2011 Paid on delivery

Freelancer Job FSA scraper

I need a scraping program (compiled and source code, plus the data) that is able to scrape the following website:

[url removed, login to view]

The tricky bit is that the website requires the user to enter at least the first three letters of the company name in the “Firm Name” field. One possible solution is for the scraper to input all possible combinations of the letters A-Z and the digits 0-9, i.e. starting with AAA, then AAB, AAC, etc., to retrieve all entries in the tables. “Match Level” should be set to “Starts with” and “Currently authorised” to TRUE. The scraper then needs to click Submit and scrape all entries displayed. Demo: search for “AXA”. If there are CAPTCHAs we may need to abandon the project, so please test the approach above before investing too much time in details. Details are in the attached file.
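To illustrate the brute-force idea above, here is a minimal Python sketch of the prefix enumeration (Python is used only for illustration; the same logic ports directly to .NET, Java or PHP). The form field names in `search_params` are hypothetical placeholders, since the real names have to be read from the search page, and no network request is made here.

```python
import itertools
import string

# Characters named in the posting: letters A-Z and digits 0-9.
ALPHABET = string.ascii_uppercase + string.digits


def firm_name_prefixes(length=3):
    """Yield every prefix of the given length: AAA, AAB, AAC, ..."""
    for chars in itertools.product(ALPHABET, repeat=length):
        yield "".join(chars)


def search_params(prefix):
    """Build the form values for one search.

    The keys are hypothetical; replace them with the real field
    names taken from the search form's HTML.
    """
    return {
        "firm_name": prefix,
        "match_level": "Starts with",
        "currently_authorised": "true",
    }


if __name__ == "__main__":
    prefixes = list(firm_name_prefixes())
    print(len(prefixes))   # 46656, i.e. 36 ** 3
    print(prefixes[:3])    # ['AAA', 'AAB', 'AAC']
    print(search_params("AXA"))
```

With 36 characters and three positions this gives 46,656 searches, so it may be worth checking whether the site throttles requests before running the full set.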

The scraper should then cycle through all result pages and extract all records by opening each record and saving its data to an XML file.
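The paging and export step could be sketched as follows, again in Python and only as an assumption about how the pieces fit together. `fetch_page` and `fetch_record` are hypothetical callables standing in for the real HTTP and parsing code, the "stop at the first empty page" rule is an assumed paging convention, and the element names `firms` and `firm` are illustrative rather than taken from the attached specification.

```python
import xml.etree.ElementTree as ET


def collect_records(fetch_page, fetch_record):
    """Walk result pages until one comes back empty, opening each record.

    fetch_page(page_number) -> list of record ids on that page (hypothetical)
    fetch_record(record_id) -> dict of field name to value (hypothetical)
    """
    records = []
    page = 1
    while True:
        record_ids = fetch_page(page)
        if not record_ids:
            break
        for record_id in record_ids:
            records.append(fetch_record(record_id))
        page += 1
    return records


def records_to_xml(records):
    """Serialise records as <firms><firm>...</firm></firms>.

    Dict keys become element names, so they must be valid XML names.
    """
    root = ET.Element("firms")
    for record in records:
        firm = ET.SubElement(root, "firm")
        for field, value in record.items():
            ET.SubElement(firm, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Fake two-page result set, used instead of the real website.
    pages = {1: ["r1", "r2"], 2: ["r3"]}
    records = collect_records(
        lambda page: pages.get(page, []),
        lambda record_id: {"id": record_id, "name": "Firm " + record_id},
    )
    print(records_to_xml(records))
```

Passing the fetchers in as arguments keeps the loop testable without touching the site, which also makes it easy to supply the basic demo version requested from bidders.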

This is a simple project with a fixed budget of 70 USD. If this goes well, there are plenty of follow-up projects. I will prefer bidders who can demonstrate their skills by providing a very basic demo version. Thanks, I look forward to hearing from you.

Best regards

.NET Java PHP Web Data Extraction

Project ID: #903679

About the project

15 proposals Remote project Active Jan 6, 2011