Seeking a provider to develop a small prototype data grabber for scraping and extracting address data from different types of commercial websites.
It must also be able to identify, select and extract both static and dynamic address data within general websites, such as address search pages, covering all available individual branch/subsidiary addresses.
Expected output data:
Results:
- General Web-Address (www)
- Individual Web-Address
- Name of organisation
- Brand name
- Address as country/state/city/street/no.
- Internal identifier (if applicable)
- Phone, fax, e-mail (if available)
- Type of business (optional)
Unsuccessful: list of websites that could not be searched successfully
Output format: Access or Excel
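For illustration only, the expected output could be modelled roughly as follows. This is a minimal Python sketch, not a required design: the names AddressRecord, write_results and write_failed, and all field names, are assumptions derived from the list above. It writes CSV, which Excel can open; a native Access or Excel export would need an additional library.

```python
import csv
from dataclasses import dataclass, asdict, fields
from typing import Iterable, Optional, TextIO


@dataclass
class AddressRecord:
    """One extracted branch/subsidiary address (field names are hypothetical)."""
    general_url: str                 # General Web-Address (www)
    individual_url: str              # Individual Web-Address
    organisation: str                # Name of organisation
    brand: str                       # Brand name
    country: str
    state: str
    city: str
    street: str
    street_number: str
    internal_id: Optional[str] = None     # if applicable
    phone: Optional[str] = None           # if available
    fax: Optional[str] = None             # if available
    email: Optional[str] = None           # if available
    business_type: Optional[str] = None   # optional


def write_results(records: Iterable[AddressRecord], out: TextIO) -> None:
    """Write a header row plus one CSV row per record; None becomes an empty cell."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(AddressRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


def write_failed(urls: Iterable[str], out: TextIO) -> None:
    """Write the list of websites that could not be searched, one URL per row."""
    writer = csv.writer(out)
    writer.writerow(["failed_url"])
    for url in urls:
        writer.writerow([url])
```

When writing to a real file on Windows, open it with newline="" so the csv module controls the line endings.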
Admin Console:
The (prototype) software should run on Windows and must have a management console enabling the following functions:
- Start/interrupt/continue/end process
- Reporting of frontier generation
- Reporting of crawl progress, results, log, faults etc.
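The console functions listed above could be backed by a controller along these lines. This is only a sketch under stated assumptions: the class CrawlController, its method names, and the fetch callable (which takes a URL and returns newly discovered URLs) are hypothetical, and a real prototype would add logging, politeness delays and persistence.

```python
import threading
from collections import deque
from typing import Callable, Dict, Iterable, List


class CrawlController:
    """Runs a crawl in a worker thread that can be paused, resumed and ended."""

    def __init__(self, seeds: Iterable[str], fetch: Callable[[str], List[str]]):
        self.frontier = deque(seeds)       # URLs still to visit
        self.seen = set(self.frontier)     # avoids queueing a URL twice
        self.fetch = fetch                 # hypothetical: url -> discovered URLs
        self.done: List[str] = []
        self.failed: List[str] = []
        self._running = threading.Event()  # set = crawling, cleared = paused
        self._stopped = threading.Event()  # set = end requested
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._running.set()
        self._thread.start()

    def interrupt(self) -> None:
        self._running.clear()              # worker pauses before the next URL

    def resume(self) -> None:
        self._running.set()

    def end(self) -> None:
        self._stopped.set()
        self._running.set()                # wake a paused worker so it can exit
        self.wait()

    def wait(self, timeout: float = None) -> None:
        if self._thread.is_alive():
            self._thread.join(timeout)

    def progress(self) -> Dict[str, int]:
        """Snapshot for the console: frontier size and crawl results so far."""
        return {"frontier": len(self.frontier),
                "done": len(self.done),
                "failed": len(self.failed)}

    def _run(self) -> None:
        while not self._stopped.is_set():
            self._running.wait()           # blocks while interrupted
            if self._stopped.is_set() or not self.frontier:
                break
            url = self.frontier.popleft()
            try:
                for link in self.fetch(url):
                    if link not in self.seen:
                        self.seen.add(link)
                        self.frontier.append(link)
                self.done.append(url)
            except Exception:
                self.failed.append(url)    # later reported as unsuccessful
```

A console front end would call start, interrupt, resume and end from its buttons and poll progress for the reports. The failed list corresponds to the unsuccessful websites in the expected output.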
Implementation:
Previous experience of crawler/grabber development and examples/references are essential.
Process must be fast and efficient.
Process search strategy and application structure must be clearly documented.
Deliverable must be a working prototype that demonstrates the functionality, together with the associated compilable source code, inline documentation, and instructions for recompilation.
After a successful demonstration of the prototype's functionality, a follow-on job for a commercial application might be awarded.