
Write some Software

$250-750 USD

Cancelled
Posted more than 8 years ago


Paid on delivery
I am looking for custom software or a script that scrapes the outgoing links from a particular website, which we call the "seed site"; that is, every domain the seed site links out to. The project has two parts.

Part 1: SCRAPER

Example: consider [login to view URL] as the seed site. I want to scrape all the domains that link out from [login to view URL], in other words all the domains that [login to view URL] is backlinking to. For example, a post about the domain "[login to view URL]" is published on bbc and carries a backlink to that domain. bbc links out to thousands of sites, and I want to extract all of them. This must work not just for bbc but for any seed site I enter in the software.

Part 2: Check domain metrics by integration with APIs

After the tool scrapes these domains, I want to check their metrics, such as PA, DA, TF, etc. The tool should integrate with the APIs of the [login to view URL], [login to view URL] and [login to view URL] services. It should also check whether each domain is available for registration.

I am aware that many similar scripts have been built successfully on Freelancer. I would be glad to award this project to one of those developers.

__________________________________________________________________________

Inputs to the tool
------------------
* Mandatory - 1 or more seed URLs
* Optional - Crawl depth (default value = 0, max value = 10)
* Optional - TLD list (default values = [.org, .net, .com, .info, .biz]). If the user enters TLDs, append them to the existing ones.
* Optional - Number of parallel threads to use (default value = 6)
* Optional - Proxy server configuration

Output from the tool
--------------------
* CSV file with the list of scraped domain names

Requirements
-------------
SCRAPER:
1) Take 1 or more seed URLs as input via a UI field or from a file
2) Take the crawl/scrape depth (e.g., 1, 2, 3 and so forth), set in a parameter field
3) Take the TLDs from a list, set in a parameter field (.org, .net, .com, .info, .biz; the customer must be able to add more preferred TLDs)
4) It also needs to work with subdomains
5) Crawl the URLs for backlinks and show progress, such as a running count of what has been processed, so the customer knows the tool is working
6) If a backlink is invalid (e.g., HTTP 404 Not Found), write it to a separate file
7) If the depth is 0, crawl only the seed URL and domain. If the depth is 1, crawl backlink domain [login to view URL] depth is 3 count backlinks of the backlinks, and so forth.
8) Support proxies, set in a parameter field
9) Use multiple threads to scrape
10) Save the invalid backlinks to a CSV file
11) Build a web application using JSP that runs on Tomcat. The WordPress site / pop-up window
    a) should display the status of the scraping
    b) should work in all browsers

DOMAIN METRICS CHECKER:
1) Upload all the domains via the UI or a text file
2) Check Moz (DA, PA), Majestic (Trust Flow and Citation Flow), and domain availability

Deliverables & Scope
---------------------
The developer will provide the employer with the following deliverables:
1) A standalone Java program that performs the scraping
2) A web page to enter the inputs mentioned above

Example of an existing, working domain scraper: [login to view URL]
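To make the scraper requirements more concrete, here is a minimal Java sketch of two pieces of the spec: appending user-entered TLDs to the default list, and pulling external domains out of one page's HTML. It is only an illustration under stated assumptions, not a proposed implementation. The class name `OutboundDomainExtractor`, its method names, and the `example.*` hosts are all hypothetical, and the regex-based `href` matching stands in for a proper HTML parser.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative only.
final class OutboundDomainExtractor {
    // Default TLD list taken from the posting.
    static final List<String> DEFAULT_TLDS = List.of(".org", ".net", ".com", ".info", ".biz");

    // Naive href matcher (assumption: quoted attribute values only).
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    /** Appends user-supplied TLDs to the defaults, keeping order and dropping duplicates. */
    static Set<String> mergeTlds(List<String> userTlds) {
        Set<String> merged = new LinkedHashSet<>(DEFAULT_TLDS);
        for (String tld : userTlds) {
            String t = tld.trim().toLowerCase(Locale.ROOT);
            if (t.isEmpty()) continue;
            merged.add(t.startsWith(".") ? t : "." + t);
        }
        return merged;
    }

    /** Returns the lower-cased host of a URL, or null if it is relative or malformed. */
    static String hostOf(String url) {
        try {
            String host = new URI(url.trim()).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    /** Collects linked hosts that differ from the seed host and end in an allowed TLD. */
    static Set<String> extractExternalDomains(String html, String seedUrl, Set<String> tlds) {
        String seedHost = hostOf(seedUrl);
        Set<String> domains = new TreeSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String host = hostOf(m.group(1));
            if (host == null || host.equals(seedHost)) continue;
            for (String tld : tlds) {
                if (host.endsWith(tld)) {
                    domains.add(host);
                    break;
                }
            }
        }
        return domains;
    }

    public static void main(String[] args) {
        String html = "<a href=\"https://blog.example.org/post\">a</a>"
                    + "<a href='/local'>b</a>"
                    + "<a href=\"http://seed.example.com/about\">c</a>"
                    + "<a href=\"http://shop.example.io/x\">d</a>";
        Set<String> tlds = mergeTlds(List.of("io"));
        for (String d : extractExternalDomains(html, "http://seed.example.com/", tlds)) {
            System.out.println(d);
        }
    }
}
```

The sketch keeps the full host name, subdomain included, which is one reading of requirement 4. Reducing hosts to registrable domains would need something like the Public Suffix List and is left out here. Fetching pages, crawl depth, threading, proxies and the CSV output are also outside this sketch.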
Project ID: 8159075

About the project

1 proposal
Remote project
Active 9 years ago

1 freelancer is bidding an average of $555 USD for this job
A proposal has not yet been provided
$555 USD in 10 days
0.0 (0 reviews)

About the client

Belgaum, India
5.0
13
Payment method verified
Member since Jan 24, 2013

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)