Crawl 10,000 URLs and put HTML blobs in Elasticsearch
$250-750 USD
Closed
Posted about 9 years ago
Paid on delivery
Crawl 10,000 URLs and put the HTML blobs in Elasticsearch.
Need to store:
name, ID, full URL, beginning domain URL
Need to limit each crawl to the core domain (or subdomain)
Need to limit each crawl to 5,000 pages per site
Would be nice to run this on several AWS spot instances at the same time so we can crawl more quickly
Elasticsearch will run on a single large AWS instance (lots of RAM and CPU)
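
A minimal sketch of how the scope and page-cap requirements might fit together, using only the Python standard library. Everything named here is an assumption for illustration and not part of the brief: the helper names (`in_scope`, `crawl_site`), the injected `fetch` callable, the document field names, the use of a SHA-1 of the URL as the ID, and the reading of "core domain (or subdomain)" as "the seed's host plus any subdomain of it".

```python
import hashlib
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

MAX_PAGES_PER_SITE = 5000  # cap taken from the brief


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def in_scope(url, seed_host):
    """Assumed rule: same host as the seed, or a subdomain of it."""
    host = urlparse(url).hostname or ""
    return host == seed_host or host.endswith("." + seed_host)


def crawl_site(seed_url, fetch, max_pages=MAX_PAGES_PER_SITE):
    """Breadth-first crawl of one site.

    `fetch` is a hypothetical callable: url -> HTML string, or None on failure.
    Returns one dict per stored page (field names are assumptions).
    """
    seed_host = urlparse(seed_url).hostname or ""
    queue = deque([seed_url])
    seen = {seed_url}
    docs = []
    while queue and len(docs) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be fetched
        docs.append({
            "id": hashlib.sha1(url.encode("utf-8")).hexdigest(),
            "name": seed_host,       # assumed meaning of "name"
            "url": url,              # full URL of the page
            "domain_url": seed_url,  # URL the crawl started from
            "html": html,            # raw HTML blob
        })
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            link = urldefrag(urljoin(url, href)).url
            if (link not in seen
                    and link.startswith(("http://", "https://"))
                    and in_scope(link, seed_host)):
                seen.add(link)
                queue.append(link)
    return docs
```

Because each site is crawled independently, the list of seed URLs could be split across several worker instances with no shared state. The returned dictionaries could then be sent to Elasticsearch in batches, for example with the bulk helper in the official Python client.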