Automatic PDF Processing Concept - from Scan to finished text+Image
$100-500 USD
Completed
Posted more than 13 years ago
Paid on delivery
We have thousands of scanned books that need to be processed. We are looking for a good concept for processing them automatically!
General Information:
1. Currently we have the scanned books in PDF format. A typical PDF file is about 400 MB. It has already been OCRed automatically.
2. The current images are often crooked and have the wrong contrast, brightness, etc.
3. PDF files will be submitted via a web interface (yes, 400 MB per book!).
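Because each upload is around 400 MB, the receiving side should probably write the file to disk in chunks rather than hold it in memory. The following is only a minimal sketch of that idea in Python. The function name `save_upload`, the chunk size, and the 1 GiB cap are assumptions made for illustration and are not part of the requirements.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024          # read 1 MiB at a time (assumed value)
MAX_BYTES = 1024 * 1024 * 1024    # hypothetical 1 GiB cap per book


def save_upload(source, dest, chunk_size=CHUNK_SIZE, max_bytes=MAX_BYTES):
    """Copy a binary stream to dest chunk by chunk.

    Returns (total size in bytes, SHA-256 hex digest) so a later
    processing step can verify that it received the complete file.
    """
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("upload exceeds the configured size limit")
        digest.update(chunk)
        dest.write(chunk)
    return total, digest.hexdigest()


if __name__ == "__main__":
    # Stand-in for a real upload: an in-memory stream of about 5 MB.
    fake_pdf = io.BytesIO(b"%PDF-1.4\n" + b"x" * 5_000_000)
    out = io.BytesIO()
    size, checksum = save_upload(fake_pdf, out)
    print(size, checksum[:12])
```

In a real setup `source` would be the request body of the web interface and `dest` a file opened in binary mode on the Linux server. The checksum is optional but makes it easy to detect truncated uploads before the processing starts.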
Steps needed:
After the file has been uploaded to a Linux server, we would like you to provide a concept for the following process:
1. auto rotating (to get the individual pages straight)
2. auto cropping
3. auto brightness and contrast
4. auto OCR
5. auto reduction of the image resolution to produce PDF files for the web (about 20 MB instead of 400 MB)
6. PDF with text behind image.
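To make steps 2, 3 and 5 more concrete, here is a rough sketch in plain Python. It treats a page as a list of rows of grayscale values (0 = black, 255 = white). This representation, the function names, and the ink threshold of 128 are assumptions chosen for illustration. The size calculation also assumes that file size grows roughly in proportion to the pixel count, which depends on the compression used. Rotation, OCR and writing the final PDF would most likely come from existing packages and are not shown.

```python
import math


def crop_box(page, ink_threshold=128):
    """Return (top, left, bottom, right) enclosing all pixels darker than the threshold."""
    rows = [y for y, row in enumerate(page) if any(p < ink_threshold for p in row)]
    cols = [x for x in range(len(page[0])) if any(row[x] < ink_threshold for row in page)]
    if not rows or not cols:
        return 0, 0, len(page), len(page[0])  # blank page: keep everything
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(page, box):
    """Cut the page down to the given box (bottom and right are exclusive)."""
    top, left, bottom, right = box
    return [row[left:right] for row in page[top:bottom]]


def stretch_contrast(page):
    """Linearly map the darkest pixel to 0 and the brightest pixel to 255."""
    lo = min(min(row) for row in page)
    hi = max(max(row) for row in page)
    if hi == lo:
        return [row[:] for row in page]  # flat image: nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in page]


def linear_scale_for_size(source_mb, target_mb):
    """Per-axis scale factor, assuming file size is proportional to pixel count."""
    return math.sqrt(target_mb / source_mb)


if __name__ == "__main__":
    page = [
        [200, 200, 200, 200, 200],
        [200,  60, 200, 200, 200],
        [200, 200,  80, 200, 200],
        [200, 200, 200, 200, 200],
    ]
    box = crop_box(page)
    print(box)
    print(stretch_contrast(crop(page, box)))
    print(round(linear_scale_for_size(400, 20), 3))
```

Under the stated assumption, going from 400 MB to 20 MB means scaling each axis by about 0.22, so a scan made at a hypothetical 600 dpi would end up near 134 dpi. A production version would use a percentile-based stretch instead of the plain minimum and maximum so that a few stray pixels do not dominate the result.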
Your task in this contract is to provide a written concept (which software packages to use and which software components we should develop ourselves) for how we can achieve the above.
If you can only provide part of the needed concept, this is also okay. We plan to accept more than one bid on this project.
If you can also help with the actual implementation of the above, we can arrange this in a separate contract.