jimedmonds Verified Member Posted March 29, 2019

hi guys, I will keep this really brief but was hoping you guys could help me out here (mods: feel free to delete thread and warn/ban/permaban me if I've violated the R&R somehow).

What's the best or authoritative source for baseball data in terms of downloading gargantuan data sets? I only know of Sean Lahman and just standard stuff kicking around on fangraphs and baseball-reference, but was hoping to find as many different and comprehensive sources as possible. Also open to any other sites in general that can get me up to speed on the latest tools, metrics, and analytics.

long story short - I'm a Data Scientist/Data Engineer at a FAANG company and wanted to try out various technical alchemy and also share both my findings and my visualisations on the interwebz. I obviously can't use proprietary information so thought I'd work with data on a topic that I'm most interested in. At the very least, I'd be creating one additional website dedicated to baseball analytics with graphical interfaces - something that I noticed is severely lacking on fangraphs.

Thanks!
Captain Adama Old-Timey Member Posted March 29, 2019

Baseball Savant is a pretty good site. It offers a comprehensive view of statcast data. https://baseballsavant.mlb.com/
Abomination Old-Timey Member Posted March 29, 2019 (edited)

You seem to be able to download the data from MLB's gameday service. However, it can only be used for non-commercial purposes (likely the same for any of the free/cheap sources).

There are two ways to get it. This explains the older method: http://www.michealwillard.com/mlbam_api/

So for example, here's yesterday's game: http://gd2.mlb.com/components/game/mlb/year_2019/month_03/day_28/gid_2019_03_28_detmlb_tormlb_1/

And here's the full game data (it looks like this year's spin rates aren't displayed here for some reason): http://gd2.mlb.com/components/game/mlb/year_2019/month_03/day_28/gid_2019_03_28_detmlb_tormlb_1/inning/inning_all.xml

You also seem to be able to access a more modern API (the JSON is a lot easier to work with directly): https://statsapi.mlb.com/api/v1/schedule?sportId=1&date=03/28/2019

The live data section has pitches and information, but I don't see it listed for yesterday's games for some reason. Here's a random game from last year: https://statsapi.mlb.com/api/v1/game/529942/feed/live

Do keep in mind that the data is NOT scrubbed. This means that once in a while you get duplicate events listed, events out of order, etc. It's a nightmare to work with if you care about 100% accuracy, but extremely useful if you only care about it being right 99% of the time.

Edited March 29, 2019 by Abomination
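For anyone who wants to script this, here is a rough Python sketch built around the two statsapi.mlb.com URLs in the post above. Treat it as a starting point under stated assumptions, not a reference client: the helper names are made up for illustration, and the JSON field names (`dates`, `games`, `gamePk`, `liveData.plays.allPlays`, `about.atBatIndex`) are assumptions about the response shape that you should check against a real response. The cleanup step is just one possible way to handle the duplicate and out-of-order events mentioned above, keyed on the assumed at-bat index.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://statsapi.mlb.com/api/v1"


def schedule_url(date, sport_id=1):
    """Build the schedule URL for a date given as MM/DD/YYYY."""
    # safe="/" leaves the slashes in the date unescaped, as in the URL posted above.
    query = urlencode({"sportId": sport_id, "date": date}, safe="/")
    return f"{BASE}/schedule?{query}"


def live_feed_url(game_pk):
    """Build the live feed URL for one game id."""
    return f"{BASE}/game/{game_pk}/feed/live"


def fetch_json(url):
    """Download a URL and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def game_pks(schedule):
    """Collect game ids from a schedule response (field names are assumed)."""
    return [
        game["gamePk"]
        for day in schedule.get("dates", [])
        for game in day.get("games", [])
    ]


def clean_plays(plays):
    """Keep the first copy of each play, then sort by the assumed at-bat index."""
    seen = set()
    unique = []
    for play in plays:
        idx = play["about"]["atBatIndex"]
        if idx not in seen:
            seen.add(idx)
            unique.append(play)
    return sorted(unique, key=lambda p: p["about"]["atBatIndex"])


# Offline demo with made-up plays: one duplicate, and out of order.
sample = [
    {"about": {"atBatIndex": 1}},
    {"about": {"atBatIndex": 0}},
    {"about": {"atBatIndex": 1}},
]
print(schedule_url("03/28/2019"))
print([p["about"]["atBatIndex"] for p in clean_plays(sample)])
```

To run it against live data, something like `feed = fetch_json(live_feed_url(529942))` followed by `clean_plays(feed["liveData"]["plays"]["allPlays"])` should work if the assumed field names hold. Keeping only the first copy of a duplicate is a simplification; later copies may be corrections, so compare them before deciding which to keep.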
Spanky99 Old-Timey Member Posted March 29, 2019

Here's a few...

https://baseballsavant.mlb.com/
http://www.brooksbaseball.net/dashboard.php
http://www.statcorner.com/BatLeaderboardR.php
jimedmonds Verified Member Author Posted March 29, 2019

thanks so much, guys - this is awesome. really appreciated! once I get set up, I will definitely share my work with this board for feedback and perspectives