This dataset contains historical playing data from Major League Baseball (tm), disaggregated at various levels. The data are based on the files produced by Retrosheet.
There are two main groupings of files:
- daybyday: Game-by-game records for players and teams, for all seasons available from Retrosheet.
- splits: Various batting and pitching splits generated from event-level data, that is, splits which cannot be computed from the game-by-game records. These cover 1974 through the present, the period for which Retrosheet has full play-by-play coverage.
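
As a rough illustration of why the splits need event-level data, the sketch below aggregates a handful of made-up plate appearances two ways. The records, the field names (`game`, `pitcher_hand`, `at_bat`, `hit`) and the helper `totals_by` are hypothetical and are not the actual columns or tooling of this dataset.

```python
from collections import defaultdict

# Hypothetical event-level records for one batter (illustrative only;
# these are NOT the dataset's real column names or values).
events = [
    {"game": "G1", "pitcher_hand": "L", "at_bat": 1, "hit": 1},
    {"game": "G1", "pitcher_hand": "R", "at_bat": 1, "hit": 0},
    {"game": "G1", "pitcher_hand": "R", "at_bat": 1, "hit": 1},
    {"game": "G2", "pitcher_hand": "L", "at_bat": 1, "hit": 0},
]


def totals_by(records, key):
    """Sum at-bats and hits, grouping the records by the given field."""
    out = defaultdict(lambda: {"at_bat": 0, "hit": 0})
    for rec in records:
        out[rec[key]]["at_bat"] += rec["at_bat"]
        out[rec[key]]["hit"] += rec["hit"]
    return {k: dict(v) for k, v in out.items()}


# Game-by-game totals: one row per game, pitcher handedness is gone.
by_game = totals_by(events, "game")

# A split by pitcher handedness: only possible from the event records.
by_hand = totals_by(events, "pitcher_hand")

print(by_game)
print(by_hand)
```

Once the events are summed per game, the per-game rows no longer record which at-bats came against left-handed or right-handed pitchers, so the handedness split has to be built from the events themselves.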
The underlying data is obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at www.retrosheet.org.
This dataset is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/.
The maintainer of this dataset is Dr Theodore Turocy, Chadwick Baseball Bureau; ted.turocy@gmail.com / http://www.chadwick-bureau.com.