Create a scraper for CFR regulations

$250-750 USD

Closed
Posted more than 10 years ago

Paid on delivery
File scraper, downloader, and file processor.

This project consists of two parts:
1. Spider through a website and download all files that result from the spidering.
2. Convert each downloaded file to a specific format.

Part one:

You will be given a batch of starting URLs that look like this: [login to view URL]

Follow each of these URLs to a page with links that look like this: [login to view URL]

Follow each of those links to another page with links that look like this: [login to view URL]

Follow each of those links to a page that links to specific documents. The links within these pages tend to look like this:

<table width="480"><tr> <td><table width="120"> <tr><td> <a class="tpl" href="/cgi/t/text/text-idx?c=ecfr&SID=f68f503ab8017206c54fb367aaaa7851&amp;rgn=div8&amp;view=text&amp;node=10:1.0.1.1.4.1.56.1&amp;idno=10"> &sect;5.100</a></td></tr> </table></td> <td><table width="354"> <tr><td>Purpose and effective date.</td></tr> </table></td> </tr></table>

Each of these links leads to a page that needs to be saved with a naming structure that looks like this: [login to view URL]

Other examples of naming structures: 6cfrAppendix A to Part [login to view URL]

Part two:

After you have downloaded each file, you will need to put it into a specific HTML page structure.

1. Strip everything before <!-- startDynamic --> and after <!-- endDynamic -->.
2. Create a header for each record that looks like the headers in the sample files.
3. Replace the following strings wherever the text references a graphic.
   Replace: <img src="/graphics/
   With: <img src="[login to view URL]
   And replace: <a href="/graphics/pdfs/
   With: <a href="[login to view URL]
4. Create a footer at the bottom of each section, after the <p class="cita"> paragraph, that looks like this example:
   <p class="cita">[54 FR 53314, Dec. 28, 1989]</p> <br><p><center>Copyright 2013 Compliance Publishing Corporation (877) 500-6737</center> </body> </html>
5. The program must accommodate both regular regulations and the Appendix sections.
6. Some of the titles have one less level. The number of levels down at which the individual text is located must be selectable in the program.
7. All of the search-and-replace definitions must be kept outside of the program, in text files that can be modified as needed.
8. We require the source code as well as the finished program at the end of the project.
9. Attached is a program that completed most of these tasks but no longer works correctly because of a minor change in the text formatting (the programmer is no longer available). You may wish to use this program as a guide.
10. Attached are raw data documents and finished documents to be used as a guide.

Please review the information carefully before you provide a bid, as there will be no changes to the contract price once we accept your bid. Please view the attached file for a sample of what the file format will be when completed. The sample contains both regular and appendix text.

We provide all funds in an escrow account.
You must complete this project within 30 days (or less).
You must reply to all communications within 24 hours.
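The link-following step of part one and the post-processing steps 1, 3, 4, and 7 of part two could be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached program: the function names, the ` => ` rule-file separator, and the `example.invalid` replacement URL are all hypothetical, since the real replacement URLs and header layout are not shown in the posting.

```python
from html.parser import HTMLParser

START_MARKER = "<!-- startDynamic -->"
END_MARKER = "<!-- endDynamic -->"


class TplLinkParser(HTMLParser):
    """Collects href values from <a class="tpl"> anchors, as in the sample markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == "tpl" and attr_map.get("href"):
            # HTMLParser has already decoded entities such as &amp; in the href.
            self.links.append(attr_map["href"])


def extract_tpl_links(page):
    """Return the document links found on an index page (part one)."""
    parser = TplLinkParser()
    parser.feed(page)
    parser.close()
    return parser.links


def extract_dynamic(page):
    """Keep only the text between the two markers (step 1).

    Raises ValueError if either marker is missing.
    """
    start = page.index(START_MARKER) + len(START_MARKER)
    end = page.index(END_MARKER, start)
    return page[start:end]


def parse_rules(text, sep=" => "):
    """Parse 'search => replacement' lines from an external rules file (step 7).

    The separator and the '#' comment convention are assumptions for this sketch.
    """
    rules = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        old, new = line.split(sep, 1)
        rules.append((old.strip(), new.strip()))
    return rules


def apply_rules(body, rules):
    """Apply each literal search-and-replace rule in order (step 3)."""
    for old, new in rules:
        body = body.replace(old, new)
    return body


def build_page(raw_page, header, footer, rules):
    """Wrap the cleaned body in a caller-supplied header and footer (steps 2 and 4)."""
    return header + apply_rules(extract_dynamic(raw_page), rules) + footer
```

With this layout, a rules file would hold one line per replacement, for example `<img src="/graphics/ => <img src="https://example.invalid/graphics/`, and could be edited without touching the program. The header and footer are passed in as plain strings so they can also live in external files.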
Project ID: 5335612

About the project

20 proposals
Remote project
Active 10 years ago

20 freelancers are bidding an average of $605 USD for this job
Hi, our team is interested in your project. We specialize in web crawlers and have already implemented a number of web scrapers. We propose to use C# for development. Best regards, Kate ProTeam SPb
$752 USD in 15 days
4.9 (32 reviews)
6.5
Hi, I reviewed the requirements and checked your attachment. Everything is clear and I am ready to start on the project. Of course, I am ready to provide a demo on request. Thanks
$1,263 USD in 10 days
4.9 (74 reviews)
6.0
Hi, I didn't see finished documents in the attachment. Let me know which one of the files is final. Do you have the source code of the program that is attached? My development estimates are around 50 hours. Please check my rating and feedback. Regards, Artur
$750 USD in 15 days
5.0 (25 reviews)
5.3
Hi Alan, We can complete this project easily like previous projects. Please award us and discuss more. Thanks and best regards, Mai.
$366 USD in 5 days
5.0 (4 reviews)
3.3
I've done several successful scraping jobs with excellent feedback (see profile). I can do this task with Python in a week or so. I can only accept funds either through freelancer.com or by PayPal transfer. Let me know if that works for you
$400 USD in 10 days
5.0 (1 review)
2.3
Dear Sir, I have six years of experience in .NET and C# and can complete your work to your satisfaction. I have done projects such as a University Automation System, Site Scraping, an OCR System, a Counseling System, a Stock and Inventory System, etc. I am new to Freelancer and need reputation more than money. I will provide you quality output if you award this project to me. Regards, Munish Kumar Matolia
$555 USD in 30 days
5.0 (1 review)
1.5
Hello Sir, I am from vSol CORP. We are a 19-person team and have worked on more than 500 completed projects. We provide round-the-clock service. Sir, we have worked together before. Can this be done manually? We would really like to discuss this project with you. We are open to any type of negotiation. Let's talk. Thanks, vSol CORP
$257 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I have read your description. Yes, I can do it. I will create a Chrome add-on that does what you want: replace content and download files as required. Do you have a mockup for this? In any case, let's discuss it and work out the details together. Regards
$722 USD in 28 days
0.0 (0 reviews)
0.0
Dear Employer, Your project sounds interesting and I can do it. I have been a programmer for 20+ years. Your specification is pretty good, but I'll have questions if you choose me. Sincerely, Laszlo Nyakas
$555 USD in 15 days
0.0 (0 reviews)
0.0
Hi, I can create this kind of application using C#, but is there any special reason why you chose to scrape the HTML rather than use the XML data they provide and format it as you want? Regards,
$833 USD in 30 days
0.0 (0 reviews)
0.0
I have 7+ years of experience as a programmer in various languages. If you want a demo based on the requirements, I'll prepare it for you. Hoping for a positive response
$250 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I am a skilled application developer with 5+ years of experience in enterprise development. I am new to this website as I just recently began the transition to self-employment through my LLC, Luntspark Systems. Upon reviewing your requirements, I feel that I am highly qualified for your project as I have years of experience with data-intensive systems, including .NET applications, HTML5/JavaScript/CSS front-ends, and scripting. Parsing out the required information from HTML documents would not be difficult. From there, it's just a matter of composing an HTML template that is to your liking and formatting the data appropriately. If you're interested, we could make this a customizable template that you could modify down the line if you would like to. Also, I am not a fan of embedding "magic strings" within an application. I will be sure to make all of the options configurable, i.e., the target and replacement URLs for graphics. You might find my bid to be on the low side. My main goal right now is not to make as much money as possible but rather to provide the best service that I possibly can. I take great pride in my work and I would love to give you a product that goes over and above your expectations and inspires you to not only recommend me, but to come back with any future needs. Please do not hesitate to contact me. I appreciate your time and look forward to hearing from you! Best Regards, Tony Lunt, Luntspark Systems, LLC
$416 USD in 15 days
0.0 (0 reviews)
0.0

About this client

Edina, United States
4.9
76
Payment method verified
Member since Aug 13, 2008

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)