Fangraphs - How (Not) to Set Up a Fastball (http://www.fangraphs.com/blogs/how-not-to-set-up-a-fastball/)
I have a couple of quibbles and some questions about this article (plus I wanted to post it since I thought it was interesting).
Quibbles
Starting graphs away from zero is something that I don't think should be done without a very good reason. I'm not sure there's justification here, so the disparities presented seem much larger than is perhaps appropriate.
Not presenting the number of observations for each preceding pitch type was probably a clutter/brevity decision, but without them it's difficult to even quickly ballpark whether the differences between the preceding pitch types are statistically different from zero.
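To show why the counts matter, here's a rough sketch of the ballpark check I have in mind: a plain two-proportion z-test. The counts and rates below are made up for illustration (the article doesn't give them), and `two_proportion_z` is just my own helper name.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical: the same 10% vs 12% gap at two very different sample sizes.
z_small, p_small = two_proportion_z(50, 500, 60, 500)
z_large, p_large = two_proportion_z(5000, 50000, 6000, 50000)
print(f"n=500 each:   z={z_small:.2f}, p={p_small:.3f}")
print(f"n=50000 each: z={z_large:.2f}, p={p_large:.3g}")
```

The same two-point gap is indistinguishable from noise with a few hundred pitches per group and overwhelming with tens of thousands, which is why I'd like to see the counts.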
Questions
The bolded text - simply pretending that these effects don't exist doesn't mean that they don't in fact matter. I understand that it may seem like they don't matter when answering the question, "Is there evidence that there exists an optimal pitch preceding a four-seam fastball?" But it seems like there could be significant interaction effects between those components and the selection of preceding pitch, no?
There don't appear to be any explicitly stated sample restrictions. I don't really know the answer to this, but how much does it matter that we're including non-starters (specifically, closers)? I know that they pitch fewer innings, but intuitively sequencing matters less for a guy like Sanchez (or perhaps Osuna would be a better example?) than it does for a guy like Greinke, right? I understand that imposing certain types of restrictions will introduce attrition bias, but not imposing them could also be problematic.
There's a fundamental assumption imposed on, or if not "imposed" then underlying, the analysis: that fastballs are the pitch being primed, not the priming pitch. Is this a reasonable starting point? For example, in this article (http://www.fangraphs.com/blogs/marco-estrada-has-maybe-the-changiest-changeup/) about Estrada's changeup, it seems that the ordering is fastball then changeup.
Additionally, in the comments section MGL (I assume the real MGL?) wrote:
Wouldn't this only be the case with perfect information? Does anyone have any links or anything about this particular point?
EDIT: Found more pitch sequencing stuff where MGL makes an appearance:
http://www.hardballtimes.com/defining-the-pitch-sequencing-question/
http://mglbaseball.com/2013/10/08/do-ex-pitchers-understand-how-to-pitch/
And here are a few other links on it for those interested in reading more:
http://cheaptalk.org/2011/10/19/serial-correlation-in-baseball-pitch-sequences/
http://www.baseballprospectus.com/article.php?articleid=11585
http://www.galvanize.com/blog/the-data-science-behind-baseball-pitching-strategy/#.VnpgjJMrKRs
http://fivethirtyeight.com/features/game-theory-says-r-a-dickey-should-throw-more-knuckleballs/
EDIT 2: On further thought I understand MGL's comment about equal value. It is a direct result of the equilibrium. Just had a brain fart.
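For anyone else who tripped on the same thing, here's a toy version of the equilibrium argument as I understand it: in a 2x2 zero-sum game where the pitcher mixes two pitches and the batter mixes two guesses, the batter's equilibrium mix leaves both pitches with the same expected value. The run values below are invented for illustration and are not from the article or from MGL.

```python
import math

# Hypothetical run values for the batter (made-up numbers).
# Keys: (pitch thrown, pitch the batter is sitting on).
payoff = {
    ("FB", "FB"): 0.08,   # batter guesses fastball and gets one
    ("FB", "CH"): -0.04,  # batter sits changeup, gets a fastball
    ("CH", "FB"): -0.06,  # batter sits fastball, gets a changeup
    ("CH", "CH"): 0.04,   # batter guesses changeup and gets one
}

denom = (payoff[("FB", "FB")] - payoff[("FB", "CH")]
         - payoff[("CH", "FB")] + payoff[("CH", "CH")])

# Pitcher's fastball probability that makes the batter indifferent between guesses.
p_fastball = (payoff[("CH", "CH")] - payoff[("CH", "FB")]) / denom
# Batter's "sit fastball" probability that makes the pitcher indifferent between pitches.
q_sit_fastball = (payoff[("CH", "CH")] - payoff[("FB", "CH")]) / denom

# Expected run value of each pitch against the batter's equilibrium mix.
value_fb = (q_sit_fastball * payoff[("FB", "FB")]
            + (1 - q_sit_fastball) * payoff[("FB", "CH")])
value_ch = (q_sit_fastball * payoff[("CH", "FB")]
            + (1 - q_sit_fastball) * payoff[("CH", "CH")])

print(f"pitcher throws FB {p_fastball:.1%} of the time")
print(f"value of FB: {value_fb:.5f}, value of CH: {value_ch:.5f}")
```

If one pitch had a lower expected run value than the other, the pitcher would throw it more often until the batter adjusted, so the two values can only differ when somebody is off their equilibrium mix.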