jays4life19 Old-Timey Member Posted March 12, 2015
Working on a baseball FD/DK optimizer. Here is what it spits out for today (if there was a contest) for DK (salaries already up for opening day). It takes the SEA/COL game as being at Coors.
[TABLE]
[TR][TD]SP[/TD][TD]SP2[/TD][TD]C[/TD][TD]1B[/TD][TD]2B[/TD][TD]SS[/TD][TD]3B[/TD][TD]OF1[/TD][TD]OF2[/TD][TD]OF3[/TD][TD]Salary[/TD][TD]Points[/TD][/TR]
[TR][TD]David Price[/TD][TD]Matt Harvey[/TD][TD]Wilin Rosario[/TD][TD]Logan Morrison[/TD][TD]Rickie Weeks[/TD][TD]Brad Miller[/TD][TD]Kyle Seager[/TD][TD]Seth Smith[/TD][TD]Nelson Cruz[/TD][TD]Corey Dickerson[/TD][TD=align: right]49600[/TD][TD=align: right]129.04[/TD][/TR]
[/TABLE]
Things not yet included: Weather and catcher for framing
Looking forward to daily PM's of JFAS optimal lineups when baseball season starts. jk...kinda.
NorthOf49 Old-Timey Member Posted March 12, 2015
Do these sites systematically undervalue Coors? Wow lol
jays4life19 Old-Timey Member Posted March 12, 2015
[QUOTE]Do these sites systematically undervalue Coors? Wow lol[/QUOTE]
You usually get great value by taking a visiting hitter at Coors.
jays4life19 Old-Timey Member Posted March 12, 2015
https://drive.google.com/folderview?id=0ByKIdMktKhYNb09tWGtDOFNyb28&usp=sharing
I'm hoping to have automatic updates in this folder by the beginning of the season. Right now an update requires two clicks from me and ~30 seconds to run. Timestamps are in MST. Today's optimal for example (if there was a contest on DK):
[TABLE]
[TR][TD]System[/TD][TD]SP1[/TD][TD]SP2[/TD][TD]C[/TD][TD]1B[/TD][TD]2B[/TD][TD]SS[/TD][TD]3B[/TD][TD]OF1[/TD][TD]OF2[/TD][TD]OF3[/TD][TD]Salary[/TD][TD]Points[/TD][TD]Updated[/TD][/TR]
[TR][TD]Steamer[/TD][TD]Masahiro Tanaka[/TD][TD]Jacob deGrom[/TD][TD]Jason Castro[/TD][TD]Adam Lind[/TD][TD]Scooter Gennett[/TD][TD]Jean Segura[/TD][TD]Pedro Alvarez[/TD][TD]Steven Moya[/TD][TD]George Springer[/TD][TD]Mike Trout[/TD][TD=align: right]50000[/TD][TD=align: right]119.8[/TD][TD=align: right]15-03-12 15:40[/TD][/TR]
[TR][TD]ZiPS[/TD][TD]Masahiro Tanaka[/TD][TD]Jacob deGrom[/TD][TD]Salvador Perez[/TD][TD]Andy Wilkins[/TD][TD]Scooter Gennett[/TD][TD]Jean Segura[/TD][TD]Pedro Alvarez[/TD][TD]Carlos Gonzalez[/TD][TD]Mike Trout[/TD][TD]Kole Calhoun[/TD][TD=align: right]49900[/TD][TD=align: right]124.5[/TD][TD=align: right]15-03-12 15:40[/TD][/TR]
[TR][TD]FGDC[/TD][TD]Masahiro Tanaka[/TD][TD]Jacob deGrom[/TD][TD]Salvador Perez[/TD][TD]Andy Wilkins[/TD][TD]Scooter Gennett[/TD][TD]Jean Segura[/TD][TD]Pedro Alvarez[/TD][TD]Kole Calhoun[/TD][TD]Carlos Gonzalez[/TD][TD]Mike Trout[/TD][TD=align: right]49900[/TD][TD=align: right]121.16[/TD][TD=align: right]15-03-12 15:40[/TD][/TR]
[/TABLE]
You're awesome man.
Boxcar Old-Timey Member Posted March 12, 2015
[QUOTE]You're awesome man.[/QUOTE]
I second this.
LTR Verified Member Posted March 12, 2015
JFaS: Cool. Interested to know some of your tricks (if you're willing to share).
Where are you getting your projections from? I'm guessing Fangraphs, in which case you are probably having to manually update that as needed (by manually I mean downloading the XLS)?
Same question goes for handedness data. And what time frame are you basing the handedness data on? What stat are you using as a baseline to evaluate each player's handedness?
How are you vetting guys who are not in the lineup due to injury and other reasons?
Where are you getting the daily DK player data?
I probably have more questions but those come to mind initially.
jays4life19 Old-Timey Member Posted March 12, 2015
Dude's a Robot, lol. We're really lucky to have him on this forum.
LTR Verified Member Posted March 12, 2015
[QUOTE]From Fangraphs yes. Do have to manually download. Haven't figured out the java_dopostback yet. Using career splits and regressing them for each rate stat needed (BB, K, 1B on BIP, 2B on BIP, 3B on BIP, HR on BIP). Same thing for pitchers but just K, BB and FB (for HR).[/QUOTE]
I haven't figured out a way to automatically gather this either. But since projections and "handedness" data isn't going to change much on a daily basis I was only updating once a week (manually).
[QUOTE]Automatically pulling lineups from BaseballPress. A player who isn't in the lineup won't appear. This does mean it will miss late games if the lineups are not in on time.[/QUOTE]
How do you get it automatically (it's not table based so can't be imported easily into Excel) unless you're doing some nifty parsing?
[QUOTE]From DK itself (Export CSV). Right now manually, but I know it can be done with a bot and I am going to try and do that.[/QUOTE]
Haven't used DK myself, so was just curious. I was gathering my data automatically from RotoGuru. For example, FD: http://rotoguru1.com/cgi-bin/stats.cgi?04d - They have the up-to-date data for the next day usually at 11 PM the night before if I recall correctly (they also have data for DK and other sites there). I like to do my lineups the night before (since I can't really do them during work hours). So I would prep my lineups using all active players and offset that list with this simple injury list I created: http://users16.jabry.com/eragonth/test.asp (parses the MLB Injury List data into a simpler form). Still requires an additional step of taking out any players not in that day's "optimum" lineup and then re-running the optimizer.
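As a rough illustration of what "regressing" a career split could mean in the post above: one common reading is shrinkage toward the league rate, where a small sample is pulled strongly toward league average and a large sample mostly keeps its observed rate. This is only a sketch under that assumption, not the method actually used in the thread. The function name `regress_rate` and the `stabilization` constant are hypothetical, and the numbers in the usage note are made up.

```python
def regress_rate(successes, trials, league_rate, stabilization):
    """Shrink an observed rate toward the league rate.

    successes / trials is the observed split (e.g. strikeouts / PA vs. LHP).
    stabilization is a hypothetical constant: the number of league-average
    trials blended in. Larger values mean heavier regression.
    """
    if trials < 0 or successes < 0 or stabilization < 0:
        raise ValueError("counts and stabilization must be non-negative")
    denominator = trials + stabilization
    if denominator == 0:
        # No data and no prior weight: fall back to the league rate.
        return league_rate
    return (successes + league_rate * stabilization) / denominator
```

For example, `regress_rate(30, 100, 0.20, 100)` blends an observed 30% rate over 100 trials with 100 league-average trials at 20%, which gives 0.25. The right stabilization value differs by stat and would have to be estimated separately.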
LTR Verified Member Posted March 12, 2015
I also added this document to the folder:

Current Batter projections take into account the following:
- Park (by handedness)
- Regressed handedness splits (batter and pitcher interactive)
- GB/FB profile adjustments (vs. pitcher)
- Adjustment for defense of opposing team lineup
- PA projection based on spot in lineup, quality of teammates in lineup, opposing pitchers, and home team or away with win% (lose last inning if winning at home)
- R and RBI projected with adjustments for surrounding players in lineup
- Projected amount of PA vs. SP and vs. bullpen

Current Pitcher projections take into account the following:
- Park
- Interactive matchups with opposing hitters (including opposing pitcher in the lineup)
- Batters faced and IP projection based on projected event rates
- Adjustment for defense of their lineup
- Adjustment for baserunning skills of opposing lineup
- Projected odds of team winning game and projected chance of pitcher W
- Projected chance of CG

If you've done this correctly you should be able to make a lot of money.
jays4life19 Old-Timey Member Posted March 12, 2015
LTR - Are you going to play other sites this year or just stick to FD still?
LTR Verified Member Posted March 12, 2015
[QUOTE]LTR - Are you going to play other sites this year or just stick to FD still?[/QUOTE]
I see no reason to venture out really, I'm just going to focus on one site so I can streamline my system.
LTR Verified Member Posted March 12, 2015
Ya I just wrote some VBA to load the webpage and "parse" it with some nifty formulae and send it to a CSV which my DB uses.
I'm guessing you're changing (hacking) the 'page size' to max and then triggering the post back to get the entire list?
nmrch Verified Member Posted March 12, 2015
I appreciate the insight from both you and JFaS. I won't be using his lineups, but it'll help cross-check my methodology because I'm pretty much doing the same thing as him. I'm also using my own formula to project runs and RBI instead of using the projections. Since you guys are so generous with your secrets, allow me to add my insight.
[QUOTE]I haven't figured out a way to automatically gather this either. But since projections and "handedness" data isn't going to change much on a daily basis I was only updating once a week (manually).[/QUOTE]
I wrote a program that scrapes data from Fangraphs. I only have to feed it a list of Fangraphs IDs and it spits out data on the players, including projections, splits or anything else I need.
[QUOTE]How do you get it automatically (it's not table based so can't be imported easily into Excel) unless you're doing some nifty parsing?[/QUOTE]
I'm doing this with another scraping program. It gets the players' names and matches them to a master list of Fangraphs IDs I have, and then I feed the lineups, just in the form of player IDs, to my other program. One problem I'm running into is players with the same names. I'm still working on fixing that.
[QUOTE]Haven't used DK myself, so was just curious. I was gathering my data automatically from RotoGuru. For example, FD: http://rotoguru1.com/cgi-bin/stats.cgi?04d - They have the up-to-date data for the next day usually at 11 PM the night before if I recall correctly (they also have data for DK and other sites there). I like to do my lineups the night before (since I can't really do them during work hours). So I would prep my lineups using all active players and offset that list with this simple injury list I created: http://users16.jabry.com/eragonth/test.asp (parses the MLB Injury List data into a simpler form). Still requires an additional step of taking out any players not in that day's "optimum" lineup and then re-running the optimizer.[/QUOTE]
I plan on running my program 30 minutes before the first game, this should ensure all the lineups are in on normal days. Weekends are a problem with the afternoon games but the DFS sites usually skip those games anyway.
nmrch Verified Member Posted March 12, 2015
I'd also like to include weather adjustments (and automatically exclude guys with high chance of PPD). Also want to include catcher framing that day, but have to wean the framing out of projections first. Also include the slight home field advantage in called strikes.
Would you be interested in sharing your formula for projected runs and RBIs? The problem I have is that mine are somewhat instinct-based, without a lot of concrete data backing them up. For runs, my weighting is:
- 55% wOBA
- 15% speed score
- 30% lineup position and surrounding teammates' wOBA
Similar methodology for RBIs, but I also include the speed scores of surrounding teammates.
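The 55/15/30 weighting for runs described above can be written as a plain weighted sum. The post does not say how the three inputs are scaled, so this sketch assumes each one has already been turned into an index where 1.0 means league average. That scaling, the function name `run_index` and the dictionary keys are all assumptions made for illustration.

```python
# Weights taken from the post: 55% wOBA, 15% speed score,
# 30% lineup position and surrounding teammates' wOBA.
RUN_WEIGHTS = {"woba": 0.55, "speed": 0.15, "lineup": 0.30}


def run_index(woba_idx, speed_idx, lineup_idx, weights=RUN_WEIGHTS):
    """Combine three league-relative indexes (1.0 = average) into one score.

    How each index is computed is not specified in the thread; this function
    only applies the weights.
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1.0")
    return (
        weights["woba"] * woba_idx
        + weights["speed"] * speed_idx
        + weights["lineup"] * lineup_idx
    )
```

With these weights, a hitter who is 20% above average by wOBA, average in speed and 10% below average in lineup context comes out at `run_index(1.2, 1.0, 0.9)`, roughly 1.08. Checking such weights against actual run totals would be one way to replace instinct with data.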
LTR Verified Member Posted March 12, 2015
[QUOTE]I'd also like to include weather adjustments (and automatically exclude guys with high chance of PPD). Also want to include catcher framing that day, but have to wean the framing out of projections first. Also include the slight home field advantage in called strikes.[/QUOTE]
Weather makes sense if you can trust it. You're being finicky with the "home field advantage of called strikes". If you go that far you might as well start taking into account ump biases (relatively difficult to quantify).
LTR Verified Member Posted March 12, 2015
[QUOTE]I'm doing this with another scraping program. It gets the players' names and matches them to a master list of Fangraphs IDs I have, and then I feed the lineups, just in the form of player IDs, to my other program. One problem I'm running into is players with the same names. I'm still working on fixing that.[/QUOTE]
I also have the problem with duplicate player names. It's a relatively minor issue (there's only a handful of players with the same name). I'm not sure there's a real solution since you're dealing with different sites using different IDs, etc. Though in theory you could probably cross-reference the team to make sure it's the right player.
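The team cross-reference suggested above might look roughly like this. It is a sketch only: the master-list layout of `(player_id, name, team)` tuples, the function names and the IDs in the usage note are hypothetical, and real sites may use different team abbreviations that would need normalizing first.

```python
from collections import defaultdict


def build_index(master):
    """Index a master list of (player_id, name, team) tuples by lowercased name."""
    by_name = defaultdict(list)
    for player_id, name, team in master:
        by_name[name.strip().lower()].append((player_id, team.strip().upper()))
    return by_name


def resolve(by_name, name, team):
    """Return the player ID for a scraped name, using team only to break ties.

    Returns None when the name is unknown or still ambiguous, so those
    cases can be handled by hand (or hardcoded).
    """
    candidates = by_name.get(name.strip().lower(), [])
    if len(candidates) == 1:
        # Unique name: the team is ignored, since it may be stale after a trade.
        return candidates[0][0]
    matches = [pid for pid, cand_team in candidates if cand_team == team.strip().upper()]
    return matches[0] if len(matches) == 1 else None
```

Given a master list such as `[("p1", "John Smith", "NYY"), ("p2", "John Smith", "SEA")]`, calling `resolve(index, "John Smith", "SEA")` returns `"p2"`, while a team that matches neither entry returns `None` rather than guessing.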
nmrch Verified Member Posted March 12, 2015
[QUOTE]I also have the problem with duplicate player names. It's a relatively minor issue (there's only a handful of players with the same name). I'm not sure there's a real solution since you're dealing with different sites using different IDs, etc. Though in theory you could probably cross-reference the team to make sure it's the right player.[/QUOTE]
Yeah, I'll probably just hardcode the IDs on my master list for guys with the same names. Not worth spending too much time on it. Right now I'm working on automating weather and Vegas line data. I don't know if the latter is worth it or if it's easier to manually load the game lines.
nmrch Verified Member Posted March 13, 2015
[QUOTE]For runs and RBI I just multiply by the 'basic' park factor and other variables that are weighted accordingly (opposing pitcher, handedness, etc.). The formula gets a little crazy when you start accounting for surrounding teammates. Batting order is important for these 2 stats but there's no real way to account for batting order in projections, therefore I just ignore it.[/QUOTE]
I adjust the wOBAs for those factors. I run my program each day after the lineups are posted, so it's not hard to adjust the daily projections based on lineup order.
LTR Verified Member Posted March 13, 2015
[QUOTE]I adjust the wOBAs for those factors. I run my program each day after the lineups are posted, so it's not hard to adjust the daily projections based on lineup order.[/QUOTE]
I deleted that post because honestly it requires more thought. Projections are based off past performance... if Jose Reyes, for example, always (or at least in recent years) batted in the 2-hole with X batting before and Y batting after and the same lineup occurs in the game in question, there's no real reason to adjust anything... the projections would be in line with this. So for that reason I ignore these variables altogether, if that makes sense.
nmrch Verified Member Posted March 13, 2015
[QUOTE]I deleted that post because honestly it requires more thought than I can come up with in this instance. Projections are based off past performance... if Jose Reyes, for example, always batted in the 2-hole with X batting before and Y batting after and the same lineup occurs in the game in question, there's no real reason to adjust anything... the projections would be in line with this. So I tend to ignore these variables, perhaps I'm wrong.[/QUOTE]
I agree with this, but lineups change from year to year: players could be shifted around from season to season and they could have new teammates, and this could have a significant impact on runs and RBIs. I just find the projections to be too crude when it comes to forecasting runs and RBIs, so I'm using a combination of individual factors and teammate factors to predict runs and RBIs on a given day.
LTR Verified Member Posted March 13, 2015
[QUOTE]I agree with this, but lineups change from year to year: players could be shifted around from season to season and they could have new teammates, and this could have a significant impact on runs and RBIs. I just find the projections to be too crude when it comes to forecasting runs and RBIs, so I'm using a combination of individual factors and teammate factors to predict runs and RBIs on a given day.[/QUOTE]
Well, JFaS said he's doing the same (of sorts) so perhaps I'm wrong, but I just don't see how you can account for past lineups and batting orders and weigh those according to the current lineup and batting order. It's too tricky.
nmrch Verified Member Posted March 13, 2015
[QUOTE]Well, JFaS said he's doing the same (of sorts) so perhaps I'm wrong, but I just don't see how you can account for past lineups and batting orders and weigh those according to the current lineup and batting order. It's too tricky.[/QUOTE]
Sorry, I'm being unclear on this. I only use my own calculations to project runs, RBIs and plate appearances. I still use Steamer and ZiPS for rate stats. I just think the regression analysis those systems use to project counting stats like runs and RBIs is too crude. You gotta take into account lineup order and composition.
LTR Verified Member Posted March 13, 2015
[QUOTE]Sorry, I'm being unclear on this. I only use my own calculations to project runs, RBIs and plate appearances. I still use Steamer and ZiPS for rate stats. I just think the regression analysis those systems use to project counting stats like runs and RBIs is too crude. You gotta take into account lineup order and composition.[/QUOTE]
Yeah, I was on the same page as you. Nevertheless, if a guy has been playing with the same people in the same spot in the order for 3+ years... I think you would be deflating or inflating his projected runs and RBIs (because of his spot in the order and those batting around him) even though in this case you should probably do nothing. I believe most projection systems use around 3-5 years of data, so the only way to validate what you're doing is to compare against past lineups and batting order positions during the years used for projection, which obviously gets too crazy. UNLESS projection systems like ZiPS / Steamer completely ignore past data for RBI and runs when projecting, but I doubt it.
nmrch Verified Member Posted March 13, 2015
[QUOTE]Yeah, I was on the same page as you. Nevertheless, if a guy has been playing with the same people in the same spot in the order for a bunch of years... you would be deflating or inflating his projected runs and RBIs (because of his spot in the order and those batting around him) even though in this case you should probably do nothing. I believe most projection systems use around 3-5 years of data, so the only way to really validate what you're doing is to compare against past lineups and batting order positions during the years used for projection, which obviously gets too crazy.[/QUOTE]
Oh, I get what you're saying and it's an excellent point. I just completely ignore what Steamer and ZiPS say about RBIs and runs and start from scratch, so there's no threat of overestimating or underestimating guys. I start with a theoretical league-average rate of scoring runs and getting RBIs per plate appearance and then adjust them accordingly, trying to account for a number of factors.
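A minimal sketch of the "start from a league-average rate per plate appearance and adjust" idea described above. The post does not say how the adjustments are combined, so this assumes they are multiplicative factors (1.0 = no change). The function name, the factor list and the numbers in the usage note are illustrative placeholders, not real league data.

```python
from math import prod


def projected_count(league_rate_per_pa, projected_pa, factors):
    """Scale a league-average per-PA rate by adjustment factors, then by PA.

    factors is a sequence of multipliers (park, lineup spot, opposing
    pitcher, ...). Combining them by multiplication is an assumption made
    for this sketch.
    """
    if league_rate_per_pa < 0 or projected_pa < 0:
        raise ValueError("rate and plate appearances must be non-negative")
    if any(f < 0 for f in factors):
        raise ValueError("adjustment factors must be non-negative")
    adjusted_rate = league_rate_per_pa * prod(factors)
    return adjusted_rate * projected_pa
```

For instance, `projected_count(0.11, 4.5, [1.10, 0.95])` applies a 10% boost and a 5% penalty to a placeholder rate of 0.11 runs per PA over 4.5 projected PA. Because the baseline is league average rather than the player's own past totals, past lineup context is not counted twice.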
LTR Verified Member Posted March 13, 2015
[QUOTE]Oh, I get what you're saying and it's an excellent point. I just completely ignore what Steamer and ZiPS say about RBIs and runs and start from scratch, so there's no threat of overestimating or underestimating guys. I start with a theoretical league-average rate of scoring runs and getting RBIs per plate appearance and then adjust them accordingly, trying to account for a number of factors.[/QUOTE]
That's an interesting way of doing it, but I'd prefer to go by the experts in this case (and adjust accordingly). Though I would like to know exactly how ZiPS and Steamer project RBIs and runs.
Deadpool Old-Timey Member Posted March 13, 2015
So, if we follow JFaS' system, are we all gunna get rich?
LTR Verified Member Posted March 13, 2015
https://drive.google.com/folderview?id=0ByKIdMktKhYNb09tWGtDOFNyb28&usp=sharing
I have it updating automatically every 10 minutes now. Though it is powered by a laptop that I move around a lot so it won't be updating all the time.
Well done.
Nox Verified Member Posted March 13, 2015
For automatically downloading prices from DK and FanDuel (which need you to authenticate before you can gain access), I'd suggest using Selenium to build the active scraper. It has Java and .NET flavours.
Nox Verified Member Posted March 13, 2015
FanDuel and DraftKings really need to expose a public API to get this stuff. Really not sure why they don't.