Jays Centre

Posted

hi guys,

 

I will keep this really brief but was hoping you guys could help me out here (mods: feel free to delete thread and warn/ban/permaban me if I've violated the R&R somehow). What's the best or most authoritative source for baseball data in terms of downloading gargantuan data sets? I only know of Sean Lahman and just the standard stuff kicking around on FanGraphs and Baseball-Reference, but was hoping to find as many different and comprehensive sources as possible. Also open to any other sites in general that can get me up to speed on the latest tools, metrics, and analytics.

 

Long story short: I'm a Data Scientist/Data Engineer at a FAANG company and wanted to try out various technical alchemy and also share both my findings and my visualisations on the interwebz. I obviously can't use proprietary information, so I thought I'd work with data on the topic that I'm most interested in. At the very least, I'd be creating one additional website dedicated to baseball analytics with graphical interfaces, something that I noticed is severely lacking on FanGraphs.

 

Thanks!

Posted

Baseball Savant is a pretty good site. It offers a comprehensive view of Statcast data.

 

https://baseballsavant.mlb.com/

Posted (edited)

You seem to be able to download the data from MLB's Gameday service. However, it can only be used for non-commercial purposes (likely the same for any of the free/cheap sources). There are two ways to get it. This explains the older method:

http://www.michealwillard.com/mlbam_api/

 

So for example, here's yesterday's game:

http://gd2.mlb.com/components/game/mlb/year_2019/month_03/day_28/gid_2019_03_28_detmlb_tormlb_1/

 

And here's the full game data (it looks like this year's spin rates aren't displayed here for some reason):

http://gd2.mlb.com/components/game/mlb/year_2019/month_03/day_28/gid_2019_03_28_detmlb_tormlb_1/inning/inning_all.xml
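If you want to generate those Gameday URLs rather than paste them, here is a rough sketch that just reproduces the pattern visible in the two links above. The away-team-then-home-team ordering and the trailing game number are inferred from this one example, and the `pitch` element name used for parsing is an assumption about the XML layout, so check both against a real file before relying on them. The function names are made up for illustration.

```python
from datetime import date
import xml.etree.ElementTree as ET

GD2_ROOT = "http://gd2.mlb.com/components/game/mlb"


def gameday_game_url(day, away, home, game_number=1):
    """Build the game directory URL, following the pattern in the link above.

    Assumption: the gid is <away>mlb_<home>mlb_<game number>, inferred from
    the single example in this thread.
    """
    gid = f"gid_{day:%Y_%m_%d}_{away}mlb_{home}mlb_{game_number}"
    return f"{GD2_ROOT}/year_{day:%Y}/month_{day:%m}/day_{day:%d}/{gid}/"


def inning_all_url(day, away, home, game_number=1):
    """URL of the full-game XML file for one game."""
    return gameday_game_url(day, away, home, game_number) + "inning/inning_all.xml"


def pitch_attributes(xml_text):
    """Return the attributes of every <pitch> element as plain dicts.

    Assumption: pitches are stored as <pitch .../> elements somewhere under
    the root; iter() finds them at any depth.
    """
    root = ET.fromstring(xml_text)
    return [dict(p.attrib) for p in root.iter("pitch")]
```

Downloading is then a matter of fetching `inning_all_url(date(2019, 3, 28), "det", "tor")` with whatever HTTP client you prefer and passing the response text to `pitch_attributes`.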

 

 

And you seem to be able to access a more modern API as well (the JSON is a lot easier to work with directly):

https://statsapi.mlb.com/api/v1/schedule?sportId=1&date=03/28/2019
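A minimal sketch of working with that schedule endpoint from Python, using only the standard library. The query parameters are the ones in the URL above. The response layout assumed here (a `dates` list, each entry holding a `games` list whose items carry a `gamePk` id) is my reading of the JSON rather than anything documented in this thread, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

STATSAPI_ROOT = "https://statsapi.mlb.com/api/v1"


def schedule_url(date_mmddyyyy, sport_id=1):
    """Build the schedule URL; safe="/" keeps the slashes in the date as-is."""
    query = urlencode({"sportId": sport_id, "date": date_mmddyyyy}, safe="/")
    return f"{STATSAPI_ROOT}/schedule?{query}"


def fetch_json(url):
    """Download a URL and decode the body as JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def game_pks(schedule):
    """Collect game ids from a decoded schedule response.

    Assumption: the response looks like
    {"dates": [{"games": [{"gamePk": ...}, ...]}, ...]}.
    """
    return [
        game["gamePk"]
        for day in schedule.get("dates", [])
        for game in day.get("games", [])
    ]
```

With that in place, `game_pks(fetch_json(schedule_url("03/28/2019")))` should give the ids for that day's games, assuming the layout above holds.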

 

The live data section has the pitches and related information, but I don't see it listed for yesterday's games for some reason. Here's a random game from last year:

https://statsapi.mlb.com/api/v1/game/529942/feed/live
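To pull the pitches out of a feed like that one, something along these lines should work. It is a sketch under assumptions: the nesting `liveData` > `plays` > `allPlays` > `playEvents`, and the `isPitch` and `atBatIndex` fields, are how I understand the feed to be laid out, so verify them against the actual JSON. The function names are only illustrative.

```python
def live_feed_url(game_pk):
    """Build the live feed URL for one game id, as in the link above."""
    return f"https://statsapi.mlb.com/api/v1/game/{game_pk}/feed/live"


def iter_pitches(feed):
    """Yield (at-bat index, event) for every event flagged as a pitch.

    Assumptions: plays live under liveData.plays.allPlays, each play has a
    playEvents list, and pitch events carry isPitch == True.
    """
    plays = feed.get("liveData", {}).get("plays", {}).get("allPlays", [])
    for play in plays:
        for event in play.get("playEvents", []):
            if event.get("isPitch"):
                yield play.get("atBatIndex"), event
```

Using `.get` with defaults everywhere means a feed with no live data section (like the one for yesterday's games mentioned above) simply yields nothing instead of raising.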

 

Do keep in mind that the data is NOT scrubbed. This means that once in a while you get duplicate events listed, events out of order, etc. It's a nightmare to work with if you care about 100% accuracy, but extremely useful if you only care about it being right 99% of the time.
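As a first pass at scrubbing, you can drop exact duplicates and re-sort the events, roughly like this. It only catches records that are identical field for field; near-duplicates and genuinely wrong events need baseball-specific rules that this sketch does not attempt. The `index` field used in the usage example is an assumption about what the events carry.

```python
import json


def clean_events(events, order_key):
    """Drop exact duplicate events, then sort the rest by order_key.

    Events are JSON-style dicts; two events count as duplicates only if
    every field matches. sorted() is stable, so events with equal keys
    keep their original relative order.
    """
    seen = set()
    unique = []
    for event in events:
        fingerprint = json.dumps(event, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(event)
    return sorted(unique, key=order_key)


# Usage with made-up events: one duplicate, one out of order.
events = [
    {"index": 1, "des": "Ball"},
    {"index": 0, "des": "Called Strike"},
    {"index": 1, "des": "Ball"},
]
cleaned = clean_events(events, order_key=lambda e: e["index"])
```

Keeping the ordering key as a parameter lets the same function handle at-bats and the pitches within them, whatever those ordering fields turn out to be called.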

Edited by Abomination
Posted

Here are a few...

 

https://baseballsavant.mlb.com/

 

http://www.brooksbaseball.net/dashboard.php

 

http://www.statcorner.com/BatLeaderboardR.php

Posted
thanks so much, guys - this is awesome. really appreciated! once I get set up, I will definitely share my work with this board for feedback and perspectives
