I need code written in either Java or R to mine data on historic baseball games from the web; [login to view URL] has the data available in table format but I am open to other sources. I need the data then exported to a CSV or similar format in a reasonably organized manner.
## Deliverables
I need a program that takes a start date and an end date as arguments (with today as the default end date if none is supplied). The program will then go to [login to view URL] and collect data on all MLB games played during the specified date span.
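To illustrate the argument handling described above, here is a minimal Java sketch. It assumes dates are passed in ISO `yyyy-MM-dd` form, which the posting does not specify. The class name `DateSpan` and its methods are hypothetical, chosen only for this example. The actual fetching of box scores is left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: holds the inclusive date span the scraper should cover.
final class DateSpan {
    final LocalDate start;
    final LocalDate end;

    DateSpan(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end date precedes start date");
        }
        this.start = start;
        this.end = end;
    }

    // args[0] = start date (required), args[1] = end date (optional).
    // Assumption: dates use ISO format, e.g. 2012-04-05.
    // "today" is passed in so the default can be tested deterministically.
    static DateSpan fromArgs(String[] args, LocalDate today) {
        if (args.length < 1 || args.length > 2) {
            throw new IllegalArgumentException("usage: <start yyyy-MM-dd> [end yyyy-MM-dd]");
        }
        LocalDate start = LocalDate.parse(args[0]);
        LocalDate end = (args.length == 2) ? LocalDate.parse(args[1]) : today;
        return new DateSpan(start, end);
    }

    // Every calendar day in the span, both endpoints included.
    List<LocalDate> days() {
        List<LocalDate> out = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            out.add(d);
        }
        return out;
    }

    public static void main(String[] args) {
        DateSpan span = fromArgs(args, LocalDate.now());
        for (LocalDate day : span.days()) {
            // A real scraper would look up that day's box scores here.
            System.out.println(day);
        }
    }
}
```

Run as, for example, `java DateSpan 2012-04-01 2012-04-03`; omitting the second argument makes the span end today.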
Data from each game will then be written to a CSV file in such a manner that, when read into R (or Excel), each game reads in as a single row. Each row will contain the date of the game, followed by the information below for both the visiting and home teams (written with one element per column):
League
Starting players at all 9 Positions
Each starting player's season total (to-date) batting average, on-base percentage and slugging percentage
Each starting player's results for the game: number of at-bats, hits, doubles, triples, home runs, walks, strikeouts and fielding errors
For the starting pitchers, I also need the number of innings pitched, runs allowed, earned runs allowed, walks, strikeouts, home runs allowed, pitches, strikes and season-to-date ERA
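One possible way to lay out the single-row-per-game format is sketched below in Java. The column names, the position codes, and the class `BoxScoreCsv` are all assumptions made for illustration. In particular, the sketch treats the nine positions as the standard fielding positions and does not handle a designated hitter. Quoting follows the usual CSV convention of wrapping a value in double quotes when it contains a comma, quote, or line break.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch of the one-row-per-game column layout.
final class BoxScoreCsv {
    static final String[] SIDES = {"visitor", "home"};
    // Assumption: nine fielding positions; a designated hitter is not modeled.
    static final String[] POSITIONS = {"p", "c", "1b", "2b", "3b", "ss", "lf", "cf", "rf"};
    // Player name, season-to-date rates, then game results.
    static final String[] BATTER_FIELDS = {
        "name", "avg", "obp", "slg",
        "ab", "h", "doubles", "triples", "hr", "bb", "so", "e"};
    // Starting pitcher's line for the game plus season-to-date ERA.
    static final String[] PITCHER_FIELDS = {
        "ip", "r", "er", "bb", "so", "hr", "pitches", "strikes", "era"};

    // Builds the header: date, then the same block of columns for each team.
    static List<String> header() {
        List<String> cols = new ArrayList<>();
        cols.add("date");
        for (String side : SIDES) {
            cols.add(side + "_league");
            for (String pos : POSITIONS) {
                for (String field : BATTER_FIELDS) {
                    cols.add(side + "_" + pos + "_" + field);
                }
            }
            for (String field : PITCHER_FIELDS) {
                cols.add(side + "_sp_" + field);
            }
        }
        return cols;
    }

    // Quotes a value only when it contains a comma, quote, or line break.
    static String escape(String v) {
        if (v.contains(",") || v.contains("\"") || v.contains("\n") || v.contains("\r")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    // Joins already-ordered values into one CSV line (no trailing newline).
    static String toLine(List<String> values) {
        StringJoiner joiner = new StringJoiner(",");
        for (String v : values) {
            joiner.add(escape(v));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLine(header()));
    }
}
```

Under these assumptions each game row has 237 columns: the date plus 118 per team (league, 9 positions with 12 fields each, and 9 starting-pitcher fields). Writing the header once and then one `toLine` call per game keeps the file readable by `read.csv` in R or by Excel.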
All data is available on the same web-page (the box-score for the game).