Bulk Scanned PDFs doing selective area OCR Application Dev for vyadzmak

$250-750 USD

In progress
Posted over 11 years ago

Paid on delivery
Project Description: I have thousands of scanned form PDFs. By "form" I don't mean editable or fillable PDFs; they are strictly rasterized, TIFF-based graphics. The forms are of different types, and the scan quality of some of the PDFs is medium at best. I need someone to develop Win32/64 desktop software that OCRs specific areas on a form and saves the captured data to a database.

For picking out the areas to OCR, I envision a Win32/64 desktop application (call it templateApp from now on). An administrator (generally only a select few users) who has rights to set up the capture and OCR specifics opens a PDF and marks multiple "areas of interest", the same way we select an area to crop in MS Paint by dragging from the upper-left corner to the lower-right corner. This information is stored in a database to be used by the actual OCR application (call it ocrProcessingApp from now on).

ocrProcessingApp then processes all incoming PDF files of the same form format, in bulk (hundreds or thousands at a time): it OCRs the "areas of interest" and stores all text found in a MySQL database.

The OCR requirement is English text only for this project. If you can OCR additional languages, please keep in mind that we can extend the project to a later phase handling multiple languages. I don't have an OCR library of choice; to be seriously considered as a candidate, please PM me which OCR library you intend to use. It is very important to obtain high OCR accuracy when the incoming PDFs/images are clear. If you have done any customization such as pre-processing images/PDFs (e.g. scaling, de-noising, or anything else that makes images clearer before feeding them into OCR engines like Tesseract, gocr, etc.), that would be a bonus. Experience with grid table parsing based on Tesseract/Cuneiform would also be a bonus.
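To make the templateApp / ocrProcessingApp hand-off concrete, here is a minimal sketch of how marked "areas of interest" might be stored and read back. Everything in it is an assumption for illustration, not part of the requirements: the `Area` class, the `areas` table and its column names are hypothetical, coordinates are stored as fractions of the page size so that one template can be applied to scans rendered at another resolution, and `sqlite3` stands in for MySQL only so the sketch runs without a database server.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """One "area of interest", stored as fractions (0.0-1.0) of the page size."""
    form_type: str
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def to_pixels(self, page_width, page_height):
        """Return (left, top, right, bottom) in pixels for a rendered page."""
        return (
            round(self.left * page_width),
            round(self.top * page_height),
            round(self.right * page_width),
            round(self.bottom * page_height),
        )


def area_from_drag(form_type, name, start, end, page_width, page_height):
    """Build an Area from a mouse drag; the drag may go in any direction."""
    (x0, y0), (x1, y1) = start, end
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    return Area(form_type, name,
                left / page_width, top / page_height,
                right / page_width, bottom / page_height)


def save_areas(conn, areas):
    """Create the (hypothetical) areas table if needed and upsert the areas."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS areas ("
        "form_type TEXT, name TEXT, left_f REAL, top_f REAL, "
        "right_f REAL, bottom_f REAL, PRIMARY KEY (form_type, name))"
    )
    # INSERT OR REPLACE is SQLite syntax; a MySQL version would need
    # REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE instead.
    conn.executemany(
        "INSERT OR REPLACE INTO areas VALUES (?, ?, ?, ?, ?, ?)",
        [(a.form_type, a.name, a.left, a.top, a.right, a.bottom) for a in areas],
    )
    conn.commit()


def load_areas(conn, form_type):
    """Return all stored areas for one form type, ordered by name."""
    rows = conn.execute(
        "SELECT form_type, name, left_f, top_f, right_f, bottom_f "
        "FROM areas WHERE form_type = ? ORDER BY name",
        (form_type,),
    )
    return [Area(*row) for row in rows]


# Template marked on a 1700x2200 page, applied to a 3400x4400 rendering.
conn = sqlite3.connect(":memory:")
total = area_from_drag("invoice", "total", (850, 1100), (600, 1000), 1700, 2200)
save_areas(conn, [total])
print(load_areas(conn, "invoice")[0].to_pixels(3400, 4400))
# prints (1200, 2000, 1700, 2200)
```

The pixel box returned by `to_pixels` is what an OCR step would crop before recognition. Storing fractions rather than raw pixels is a design choice that only holds if every scan of a form type has the same page proportions.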
Some of the forms have a grid between the values; others don't. Please PM me how you plan to tackle this issue in order to make out the text we need to capture. Please keep in mind that I intend to create a general-purpose tool, i.e. not one geared towards a specific job only. You can assume all scanned PDFs are straight and not skewed.

You can use any programming language, but if it's not a .NET language, Java, C++ or Python, please check with me first. Please include the language you intend to use for the two applications (templateApp and ocrProcessingApp) in your quote, or PM it to me. The application must work on XP, Vista and Windows 7, on both 32-bit and 64-bit OS.

Please make sure the application is bug free. Please try not to use any 3rd-party components for which I would have to pay for licenses. If you must, please include the 3rd-party component info and cost in your quote or PM me. Ultimately, cost, reliability, and licenses for royalty distribution are the major factors. I need all source code, and the rights to the source and binary code, in the end.

Thank you for your interest in bidding on this project. Follow-on projects are possible based on satisfactory work on this project. If you have any questions, please don't hesitate to ask. Thanks.

Skills required: .NET, C# Programming, Java, OCR, Visual Basic

Per our previous discussion via private messaging... Thanks.
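For the forms that have a grid between the values, one possible approach (a sketch only, not a chosen design) is to erase long ruled lines from each cropped area before it reaches the OCR engine. The example below assumes the crop has already been binarized into a list of rows with 1 for dark pixels and 0 for light ones, and relies on the stated assumption that scans are straight and not skewed. The function names and the 0.8 threshold are hypothetical, and a real tool would need to tune the threshold per form.

```python
def find_rules(image, min_fraction=0.8):
    """Return (rows, cols): indices whose share of dark pixels is >= min_fraction.

    `image` is a list of equal-length rows; 1 = dark (ink), 0 = light (paper).
    """
    height, width = len(image), len(image[0])
    rows = [y for y, row in enumerate(image) if sum(row) >= min_fraction * width]
    cols = [x for x in range(width)
            if sum(row[x] for row in image) >= min_fraction * height]
    return rows, cols


def erase_rules(image, min_fraction=0.8):
    """Return a copy of `image` with detected rule rows and columns set to 0."""
    rows, cols = find_rules(image, min_fraction)
    row_set, col_set = set(rows), set(cols)
    return [
        [0 if (y in row_set or x in col_set) else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


# A tiny crop: rules along the top, bottom and left edges, two ink pixels inside.
crop = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
for row in erase_rules(crop):
    print(row)
```

This projection-based check is deliberately simple: it also erases any character strokes that cross a detected line, and very dense text in a small crop can be mistaken for a rule, so it is a starting point rather than a complete answer to the grid question.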
Project ID: 3993579

About the project

3 proposals
Remote project
Active 11 years ago

About this client

Scarborough, Canada
4.9
47
Member since Jan 7, 2009
