Web scraper for prices on a specific webpage (I spent 1 hour scoping this; spend 5 minutes reading it... then apply for the job)

$30-250 USD

Cancelled
Posted more than 6 years ago

Paid on delivery
I need price information from n different pages at the same time, every x milliseconds (around 30 minutes). It needs to run 24/7 as a Windows service or as a Docker service in the cloud (preferred). Scraping of all pages listed in the XML must use concurrent (threaded) requests so that all data is captured at the same point in time. Once all data has been collected, run the calculations and then save all values. A possible diagram is attached. Log every action to a log file with the related date in its name, so that errors can be traced.

1- You will get the configuration of the running process from the XML file (time, log, mails...).
2- You will get the pages, XPath and regex from the XML file.
3- You will get the calculations from the XML.
4- You will get the warning conditions from the XML.
5- SQLite is enough for the DB, but you may use another one if you prefer it for better results.
6- I wish to run this project on Amazon Web Services, but if that is not possible I will provide a VM for the setup.

Alert logic

Alert logic is easy.

<alert id='warn1' if='balances>5' ops='admin;operator' alerttype='sms' text='balance is low'/>

This is an SMS-type alert. Find the admins listed in ops (such as admin) and look up each one's GSM number (if you can't find it, skip that admin, but log it). Then call the SMS web page, using all the details under the SMS children in the XML, with the formula:

{URL}?GSMNO={findadminGSMnumber}&text={GETALERTTEXT}&{child}={child/text()}....

Currently it is [login to view URL];123123&text=balance%20is%20low&user=tako&pass=mako&title=POAS&attr1=123&attr2=345

You will make GET requests.

If it is an email-type alert, get the mail address and send the mail; the title will be the same as the text. You will use the SMTP details from the config.
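The SMS URL formula above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the gateway address `sms.example.invalid` and the mail addresses are placeholders, the helper name `build_sms_urls` is hypothetical, and it assumes that each entry in `ops` matches the `id` attribute of an `<admin>` element. It only builds the GET URLs; sending them is left out.

```python
import logging
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

log = logging.getLogger("alerts")

# Hypothetical config fragment; the URL and mail addresses are placeholders.
CONFIG = """
<config>
  <settings>
    <admins>
      <admin id='cats' mail='cats@example.invalid' gsm='213123'/>
      <admin id='felicia' mail='felicia@example.invalid' gsm=''/>
    </admins>
    <sms>
      <url>http://sms.example.invalid/send</url>
      <user>tako</user>
      <pass>mako</pass>
      <title>POAS</title>
      <attr1>123</attr1>
      <attr2>345</attr2>
    </sms>
  </settings>
</config>
"""

def build_sms_urls(root, ops, text):
    """Return one GET URL per entry in `ops` that has a GSM number."""
    sms = root.find("settings/sms")
    base = sms.findtext("url")
    # Every <sms> child except <url> becomes a {child}={child/text()} pair.
    extras = [(child.tag, child.text or "") for child in sms if child.tag != "url"]
    urls = []
    for op_id in ops.split(";"):
        admin = root.find(f"settings/admins/admin[@id='{op_id}']")
        gsm = admin.get("gsm") if admin is not None else None
        if not gsm:
            # Bypass admins without a GSM number, but leave a trace in the log.
            log.warning("no GSM number for %r, skipping SMS", op_id)
            continue
        # quote_via=quote encodes spaces as %20, as in the example URL above.
        query = urlencode([("GSMNO", gsm), ("text", text)] + extras, quote_via=quote)
        urls.append(f"{base}?{query}")
    return urls

root = ET.fromstring(CONFIG)
urls = build_sms_urls(root, "cats;felicia;nobody", "balance is low")
```

Here `felicia` (empty GSM) and `nobody` (no such admin) are skipped and logged, so only one URL is produced. Each URL could then be requested with any HTTP client, for instance `urllib.request.urlopen`.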
I need an XML configuration file such as:

<config>
  <settings>
    <time>65000</time>
    <logfile>log{date}.log</logfile>
    <admins>
      <admin id='cats' mail='asdasd@[login to view URL]' gsm='213123'/>
      <admin id='felicia' mail='test@[login to view URL]' gsm='123123'/>
    </admins>
    <smtp>
      <serveraddress/> <!-- generally planning to send mail from Gmail -->
      <username/>
      <password/>
      <TLSport/> <!-- if needed -->
      <SSLport/> <!-- if needed -->
    </smtp>
    <sms>
      <url>[login to view URL]</url>
      <user>tako</user>
      <pass>mako</pass>
      <title>POAS</title>
      <attr1>123</attr1>
      <attr2>345</attr2>
    </sms>
  </settings>
  <pages>
    <page id='scrape1' startUrl='[login to view URL]' selector='//*[@id="root"]/div/main/header/div/div/div[2]/span[1]' regex='\\d{5}' />
    <page id='scrape2' startUrl='[login to view URL]' selector='//*[@id="tab10"]/table/tbody/tr[2]/td[3]' regex='\\d{3}' />
    <page id='scrape3' startUrl='[login to view URL]' selector='//*[@id="jsParityTable"]/div[3]/div[1]/div[2]' regex='\\d{9}' />
  </pages>
  <calculates>
    <calc id='convert1' formula='(scrape3/scrape2)*scrape1' />
    <calc id='balances' formula='(scrape1+scrape3)*(scrape1-scrape2)' />
  </calculates>
  <alerts>
    <alert id='warn1' if='balances>5' ops='admin;operator' alerttype='sms' text='balance is low'/>
    <alert id='warn2' if='convert1=5' ops='operator' alerttype='email' text='convert is 5'/>
    <alert id='warn3' if='scrape = 0' ops='admin' alerttype='email;sms' text='scrape is null' />
  </alerts>
</config>

General assumptions

- The number of pages won't be more than 10, probably 4, but the service must be able to handle 10 (I mention this because of threading).
- The minimum x time will be 5 minutes, so you have 5 minutes to calculate.
- There won't be more than 10 calculations.
- All scraped values will be numeric, mostly money.
- All values default to 0. If a value can't be calculated for some reason, log why and where it couldn't be calculated (XML id) and set it to 0.
- Calculations are done in XML order. They do not run in concurrent threads; calculations are made after all concurrent page scraping has finished.
- Calculations can use other calculation IDs in their formulas. For example, calc1's formula can be 10*5 and calc2's formula could be calc1*5, so I need to get 250. If calc1 has not been calculated yet because of the XML order, calc2's result will be 0, but you need to log that problem in the log file as "calc1 was null" - calc2 couldn't be calculated.
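The scrape-then-calculate flow described in the points above could look roughly like this in Python. It is a minimal sketch under assumptions, not a full implementation: `scrape_all`, `run_calcs` and `evaluate` are hypothetical names, `fetch` stands in for the real HTTP request plus XPath and regex extraction, and formulas are assumed to use only +, -, *, / and parentheses.

```python
import ast
import logging
import operator
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("calc")

# Arithmetic operators allowed in a formula (an assumption of this sketch).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(formula, values):
    """Evaluate +, -, *, / over numbers and known ids; raise on anything else."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in values:
                raise LookupError(f"{node.id} was null")
            return values[node.id]
        raise ValueError(f"unsupported expression in {formula!r}")
    return walk(ast.parse(formula, mode="eval").body)

def scrape_all(pages, fetch, max_workers=10):
    """Fetch every page concurrently; a failed page is logged and set to 0."""
    def safe_fetch(page_id):
        try:
            return float(fetch(page_id))
        except Exception as exc:
            log.error("%s: scrape failed (%s), using 0", page_id, exc)
            return 0.0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(pages, pool.map(safe_fetch, pages)))

def run_calcs(calcs, values):
    """Evaluate calcs sequentially in XML order; failures are logged and set to 0."""
    results = {}
    for calc_id, formula in calcs:
        try:
            results[calc_id] = evaluate(formula, {**values, **results})
        except (LookupError, ValueError, ZeroDivisionError, SyntaxError) as exc:
            log.error("%s could not be calculated: %s", calc_id, exc)
            results[calc_id] = 0
    return results

# Stand-in for real scraping: a dict lookup instead of HTTP + XPath + regex.
fake_pages = {"scrape1": "10", "scrape2": "5", "scrape3": "oops"}
scraped = scrape_all(list(fake_pages), fake_pages.__getitem__)

in_order = run_calcs([("calc1", "10*5"), ("calc2", "calc1*5")], scraped)
out_of_order = run_calcs([("calc2", "calc1*5"), ("calc1", "10*5")], scraped)
```

With this stand-in data, `scrape3` cannot be parsed as a number and falls back to 0, `in_order` yields 250 for calc2, and `out_of_order` logs "calc1 was null" and leaves calc2 at 0. Parsing formulas with `ast` rather than calling `eval` is a design choice of this sketch, so that only plain arithmetic from the XML is ever executed.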
Project ID: 15893667

About the project

2 proposals
Remote project
Active 6 years ago

About this client

İstanbul, Turkey
5.0
13
Payment method verified
Member since Nov 8, 2008
