This is a data mining application, some of which is already finished.
The data mining is for information regarding address records in the USA.
Bonuses will be given for useful/pretty reporting, and for data mining a realistic rent from a source I have been unable to find.
Skills required include HTML parsing, HTTP POST/GET queries, MySQL databases, and PDF creation.
The acceptable error rate is 1 in every 500 records.
That is to say, 1 in every 500 records is allowed to have data import (data mining) errors.
The other 499 records are required to have 100% of their data mined correctly.
Must work with Strawberry Perl on Windows 7.
Any additional modules/libs used must be installable by the ppm install command
OR must be installable by you providing them in a ZIP/RAR file.
No build or make commands.
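As a rough illustration of how this acceptance rule could be checked on a delivered batch, here is a minimal sketch. The helper name `within_error_budget` is hypothetical, and it assumes the rule is read as a ratio (bad records divided by total records must not exceed 1/500).

```perl
use strict;
use warnings;

# Hypothetical helper: true if a batch stays within the 1-in-500 error budget.
# Assumption: the rule is applied as a ratio over the whole batch.
sub within_error_budget {
    my ($total_records, $bad_records) = @_;
    return 1 if $total_records == 0;    # nothing imported, nothing wrong
    # Integer comparison avoids floating-point rounding issues.
    return $bad_records * 500 <= $total_records ? 1 : 0;
}

print within_error_budget(1000, 2) ? "pass\n" : "fail\n";    # pass
print within_error_budget(1000, 3) ? "pass\n" : "fail\n";    # fail
```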
## Deliverables
--- Please see attached zip file ---
Command line syntax:
perl foredownloader
-- this option causes the program to download the data right now
perl foredownloader 10-23-2010 10-26-2010
-- this option causes the program to download the date range 10-23-2010 to 10-26-2010
perl foredownloader 10-23-2010 10-26-2010 always
-- this option causes the program to download without the confirm response
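The three invocations above, plus the fixgeo mode described further down, could be parsed roughly as follows. This is only a sketch: the helper names `parse_date` and `parse_args` and the returned hash layout are hypothetical, and it assumes that a run without arguments still asks for confirmation.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical helper: turn 'MM-DD-YYYY' into an epoch value, or die on bad input.
sub parse_date {
    my ($text) = @_;
    my ($m, $d, $y) = $text =~ /^(\d{2})-(\d{2})-(\d{4})$/
        or die "Bad date '$text', expected MM-DD-YYYY\n";
    return timelocal(0, 0, 0, $d, $m - 1, $y);    # dies on impossible dates
}

# Hypothetical helper: map the argument list to a small options hash.
sub parse_args {
    my @args = @_;
    return { mode => 'today', confirm => 1 } if @args == 0;
    return { mode => 'fixgeo' } if @args == 1 && $args[0] eq 'fixgeo';
    die "Usage: perl foredownloader [FROM TO [always] | fixgeo]\n"
        unless @args == 2 || @args == 3;
    my ($from, $to, $flag) = @args;
    die "Unknown option '$flag'\n" if defined $flag && $flag ne 'always';
    die "FROM date is after TO date\n" if parse_date($from) > parse_date($to);
    return {
        mode    => 'range',
        from    => $from,
        to      => $to,
        confirm => defined $flag ? 0 : 1,    # 'always' skips the prompt
    };
}

my $opts = parse_args(@ARGV);
print "mode: $opts->{mode}\n";
```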
The program downloads the list from:
-- Search By 'Document Type' - Left hand side
[login to view URL]
-- Use types "HL,L,LISP,DEF,B,NTS,TSD,TXDUE,DETS,BETS" / Foreclosure Documents
[login to view URL]
-- Click "Create Export File"
The exported file will provide the following data points
push(@headers, 'DocumentID');
push(@headers, 'CrossPartyName');
push(@headers, 'Consideration');
push(@headers, 'Comments');
push(@headers, 'DocTypeKey');
push(@headers, 'FullName');
push(@headers, 'RecordDate');
push(@headers, 'ClerkFileNumber');
push(@headers, 'DOR1ParcelID');
push(@headers, 'Comments2');
-- If there is an error that the date range is too large (i.e. we have tried to download over 10,000 records), the program should automatically divide the date range
until it is successful. It should download all such sections and reassemble them.
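One way to handle the "range too large" case is to split the range in half recursively until every piece downloads. The sketch below shows the idea under simplifying assumptions: days are plain integers rather than real dates, and `$fetch` is a hypothetical callback that returns an array ref of records, or undef when the site reports that the range is too large. The fake callback and its 4,000 records per day exist only for the demonstration.

```perl
use strict;
use warnings;

my $LIMIT = 10_000;    # record cap mentioned in the spec

# Download the range [$from, $to]; on overflow, split it and retry each half.
sub download_range {
    my ($from, $to, $fetch) = @_;
    my $records = $fetch->($from, $to);
    return @$records if defined $records;
    die "Day $from alone still exceeds the limit\n" if $from == $to;
    my $mid = int(($from + $to) / 2);
    # Concatenating the halves in order reassembles the full list.
    return (
        download_range($from,    $mid, $fetch),
        download_range($mid + 1, $to,  $fetch),
    );
}

# Fake callback for the demo: 4,000 made-up records per day.
my $fake_fetch = sub {
    my ($from, $to) = @_;
    my $count = ($to - $from + 1) * 4_000;
    return undef if $count > $LIMIT;
    return [ map { "days $from-$to record $_" } 1 .. $count ];
};

my @all = download_range(1, 4, $fake_fetch);
print scalar(@all), " records\n";    # 16000 records
```

Here the four-day request fails once, then succeeds as two two-day requests of 8,000 records each.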
After downloading the list from [login to view URL],
it should show the total number of records about to be downloaded and request a confirmation to start downloading.
The program should take the Parcel ID information from the exported Excel file and download the information from the Assessor website.
The program should also add data fields for any URL referenced from the Assessor website, and also a data field for the Assessor website URL itself.
Example URLs
[login to view URL]
[login to view URL]
[login to view URL]:05188
[login to view URL]
[login to view URL]
and the following data points
# GENERAL INFORMATION
push(@headers, 'Assessor URL');
push(@headers, 'Parcel NO.');
push(@headers, 'OWNER AND MAILING ADDRESS');
push(@headers, 'LOCATION ADDRESS CITY/UNINCORPORATED TOWN');
push(@headers, 'ASSESSOR DESCRIPTION');
push(@headers, 'ASSESSOR DESCRIPTION URL');
push(@headers, 'RECORDED DOCUMENT NO.');
push(@headers, 'RECORDED DOCUMENT NO. URL');
push(@headers, 'RECORDED DATE');
push(@headers, 'VESTING');
# ASSESSMENT INFORMATION AND SUPPLEMENTAL VALUE
push(@headers, 'TAX DISTRICT');
push(@headers, 'APPRAISAL YEAR');
push(@headers, 'FISCAL YEAR');
push(@headers, 'SUPPLEMENTAL IMPROVEMENT VALUE');
push(@headers, 'SUPPLEMENTAL IMPROVEMENT ACCOUNT NUMBER');
#REAL PROPERTY ASSESSED VALUE 1
push(@headers, 'FISCAL YEAR 1');
push(@headers, 'LAND 1');
push(@headers, 'IMPROVEMENTS 1');
push(@headers, 'PERSONAL PROPERTY 1');
push(@headers, 'EXEMPT 1');
push(@headers, 'GROSS ASSESSED (SUBTOTAL) 1');
push(@headers, 'TAXABLE LAND+IMP (SUBTOTAL) 1');
push(@headers, 'COMMON ELEMENT ALLOCATION ASSD 1');
push(@headers, 'TOTAL ASSESSED VALUE 1');
push(@headers, 'TOTAL TAXABLE VALUE 1');
#REAL PROPERTY ASSESSED VALUE 2
push(@headers, 'FISCAL YEAR 2');
push(@headers, 'LAND 2');
push(@headers, 'IMPROVEMENTS 2');
push(@headers, 'PERSONAL PROPERTY 2');
push(@headers, 'EXEMPT 2');
push(@headers, 'GROSS ASSESSED (SUBTOTAL) 2');
push(@headers, 'TAXABLE LAND+IMP (SUBTOTAL) 2');
push(@headers, 'COMMON ELEMENT ALLOCATION ASSD 2');
push(@headers, 'TOTAL ASSESSED VALUE 2');
push(@headers, 'TOTAL TAXABLE VALUE 2');
push(@headers, 'Treasurer Property Taxes URL');
#ESTIMATED LOT SIZE AND APPRAISAL INFORMATION
push(@headers, 'ESTIMATED SIZE');
push(@headers, 'ORIGINAL CONST. YEAR');
push(@headers, 'LAST SALE PRICE MONTH/YEAR');
push(@headers, 'LAND USE');
push(@headers, 'DWELLING UNITS');
#PRIMARY RESIDENTIAL STRUCTURE
push(@headers, 'TOTAL LIVING SQ. FT.');
push(@headers, '1ST FLOOR SQ. FT.');
push(@headers, '2ND FLOOR SQ. FT.');
push(@headers, 'BASEMENT SQ. FT.');
push(@headers, 'GARAGE SQ. FT.');
push(@headers, 'CARPORT SQ. FT.');
push(@headers, 'STORIES');
push(@headers, 'BEDROOMS');
push(@headers, 'BATHROOMS');
push(@headers, 'FIREPLACE');
push(@headers, 'ADDN/CONV');
push(@headers, 'POOL');
push(@headers, 'SPA');
push(@headers, 'TYPE OF CONSTRUCTION');
push(@headers, 'ROOF TYPE');
#ASSESSORMAP VIEWING GUIDELINES
push(@headers, 'MAP');
push(@headers, 'MAP URL');
The program should then download all the data points from the Treasurer Property Taxes URL. Example: [login to view URL]
## The list of data points from the Treasurer website is not given here, but download them all
The program should then use a free geocoding service which allows at least 10,000 records to be geocoded per day.
Any records that could not be geocoded during that day should be able to be geocoded later by running the command
perl foredownloader fixgeo
The geocoding should provide at least the following data points
#from geocode
push(@headers, 'Geo_Number'); # Street number
push(@headers, 'Geo_Street'); # Street name
push(@headers, 'Geo_Type');   # Street type (Circle, Ave, Blvd, St., etc.)
push(@headers, 'Geo_City');
push(@headers, 'Geo_State');
push(@headers, 'Geo_Zip');
push(@headers, 'Geo_Suffix');
push(@headers, 'Geo_Prefix'); # such as North, S. E.
push(@headers, 'Geo_Lat');
push(@headers, 'Geo_Long');
It should include a data field for the Google Maps URL of each address.
Example [login to view URL],+NV+89014&sll=36.114646,-115.172816&sspn=0.745514,1.244202&ie=UTF8&hq=&hnear=635+Pepper+Tree+Cir,+Henderson,+Clark,+Nevada+89014&z=16
The program needs to download the following from [login to view URL] for each address.
# From eppraisal
push(@headers, 'Eppraisal');
push(@headers, 'Zillow_appraisal');
It should also download the data for Recently Sold Homes (all 5 of them):
Address, Sales Price, Sale Date, Bed/Bath, Sq. Ft.
#### DATABASE WORK ####
All the data should go into MySQL, with a timestamp for the query which imported it.
There should be a [login to view URL] file to hold the configuration values.
Records need to be imported multiple times, each time with a different importID and timestamp.
When a record for, let's say, Parcel=191-24-111-040 is imported on Oct 10th,
it should not overwrite the record imported earlier on Oct 2nd.
There needs to be a [login to view URL] file which will create all the needed database tables.
There needs to be a [login to view URL] file which will prompt the user to confirm they really wish to delete the database tables, and then delete them.
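To make the "never overwrite an earlier import" requirement concrete, here is one possible table layout, held as SQL strings in Perl. Everything except the name importID is an assumption: the table names `imports` and `parcels`, the column names, and the column types are hypothetical, and only one data column is shown. The key point is the composite primary key, which lets the same parcel appear once per import.

```perl
use strict;
use warnings;

# Hypothetical schema; only 'importID' comes from the spec.
my @schema = (
    <<'SQL',
CREATE TABLE IF NOT EXISTS imports (
    importID    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    imported_at TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (importID)
)
SQL
    <<'SQL',
CREATE TABLE IF NOT EXISTS parcels (
    importID  INT UNSIGNED NOT NULL,
    parcel_no VARCHAR(32)  NOT NULL,
    owner     TEXT,
    PRIMARY KEY (importID, parcel_no),
    FOREIGN KEY (importID) REFERENCES imports (importID)
)
SQL
);

# In the real script each statement would be run through DBI,
# e.g. $dbh->do($stmt); here the statements are only printed.
for my $sql (@schema) {
    (my $stmt = $sql) =~ s/\s+\z//;    # trim the trailing newline
    print "$stmt;\n\n";
}
```

With this layout, an Oct 2nd import and an Oct 10th import of parcel 191-24-111-040 are two separate rows with different importID values.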
### Bonus ###
Up to a $20 USD bonus will be given for useful reporting,
such as looking to see which multi-family homes with 4 units were built between 1998 and 2010, with an Eppraisal between 65,000 and 200,000.
The better looking the reports, the better.
Using a background image (same background for each page) and then creating a multipage PDF file, one page per address, is perfect.
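The example report above boils down to a filter over the collected fields. A minimal sketch, assuming the records are already loaded as hashes keyed by the header names from this spec; the sample rows and the helper name `matches_report` are made up for illustration, and "multi-family with 4 units" is approximated by DWELLING UNITS alone.

```perl
use strict;
use warnings;

# Made-up sample rows; the keys reuse header names from the spec.
my @records = (
    { 'Parcel NO.' => 'sample-1', 'DWELLING UNITS' => 4, 'ORIGINAL CONST. YEAR' => 2001, 'Eppraisal' => 150_000 },
    { 'Parcel NO.' => 'sample-2', 'DWELLING UNITS' => 4, 'ORIGINAL CONST. YEAR' => 1985, 'Eppraisal' => 120_000 },
    { 'Parcel NO.' => 'sample-3', 'DWELLING UNITS' => 1, 'ORIGINAL CONST. YEAR' => 2005, 'Eppraisal' => 90_000 },
    { 'Parcel NO.' => 'sample-4', 'DWELLING UNITS' => 4, 'ORIGINAL CONST. YEAR' => 2008, 'Eppraisal' => 250_000 },
);

# Hypothetical filter: 4 units, built 1998-2010, Eppraisal 65,000-200,000.
sub matches_report {
    my ($r) = @_;
    return $r->{'DWELLING UNITS'} == 4
        && $r->{'ORIGINAL CONST. YEAR'} >= 1998
        && $r->{'ORIGINAL CONST. YEAR'} <= 2010
        && $r->{'Eppraisal'} >= 65_000
        && $r->{'Eppraisal'} <= 200_000;
}

my @hits = grep { matches_report($_) } @records;
print "$_->{'Parcel NO.'}\n" for @hits;    # prints sample-1
```

In the finished program the same conditions would more likely live in a SQL WHERE clause, with each matching row rendered as one PDF page.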
### Additional BONUS ###
Up to a $20 USD bonus will be given if a realistic suggested rental price can be generated/data mined for each address.
I need the realistic rent I could charge if I were to buy a property. It should take into account things such as property type (House, Condo, Apartment), number of bedrooms
and bathrooms, square feet, etc.
The more realistic the suggested rental price is, the closer to $20 USD you will get.