So I was reading the Hardball Times today and ran across this gem. The article discusses home run length and uses statistics from Hit Tracker to break homers down into "no doubt," "plenty," "just enough," and "lucky" categories. The first two categories seem throwaway, since they contain homers hit far enough that they were leaving the yard no matter what. It's the last two categories ("just enough" and "lucky"), which Rob Neyer talks about here (subscription needed), that warrant discussion. The Jaffe piece argues that teams and players with high percentages of "just enoughs" are more likely to regress the next year. Notable teams and players include the Red Sox (36% compared to a league average of 29%), Giants (43% cheap), Brandon Phillips (14 of 30 cheap), and David Wright (12 of 30 cheap). Jaffe then suggests that one can better predict a team or player's home run rate by normalizing their "just enoughs" to the league average of roughly 29%.
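To make the normalization idea concrete, here's a rough sketch of one way it could work. This is my own reading, not necessarily Jaffe's exact formula: I assume the "cheap" counts above are just-enoughs, keep a hitter's other homers as they are, and rescale the total so that just-enoughs make up the league-average 29% of it. The function name `normalize_just_enough` is made up for illustration.

```python
def normalize_just_enough(total_hr, just_enough_hr, league_rate=0.29):
    """Estimate a homer total with just-enoughs rescaled to the league share.

    Assumption (mine, not from the article): every homer that was not a
    "just enough" is kept, and the total is re-derived so that just-enoughs
    account for `league_rate` of it.
    """
    if not 0 <= just_enough_hr <= total_hr:
        raise ValueError("just_enough_hr must be between 0 and total_hr")
    if not 0 <= league_rate < 1:
        raise ValueError("league_rate must be in [0, 1)")
    solid_hr = total_hr - just_enough_hr   # homers that were not borderline
    return solid_hr / (1 - league_rate)    # total implied by the league share


# Figures quoted above: Phillips 14 of 30 cheap, Wright 12 of 30 cheap.
phillips = normalize_just_enough(30, 14)
wright = normalize_just_enough(30, 12)
print(f"Phillips: {phillips:.1f}, Wright: {wright:.1f}")
```

Under this assumption both hitters get marked down from 30, Phillips more than Wright, which matches the direction of the regression argument. A different reading of "normalize" would give different numbers.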
My problem with his analysis is that he does not take into account what I call "just misses." For every ball that clears the wall by less than ten feet or lands less than one wall's height beyond it (the threshold for "just enough"), there is in all likelihood a ball that falls short by the same margin (a "just miss"). If we normalize home run rates by adjusting "just enough" rates to the league average, we need to do the same for "just misses." Without "just miss" rates, we have no concrete way to know whether Gary Sheffield's ridiculously low "just enough" rate of 4.17% is attributable to the kind of contact he makes or to horribly bad luck on fly balls within the "just miss" threshold. For all we know, he hit tons of balls that died just short of the wall and could be homers next year. We need "just miss" rates for individual players as well as the league average so we can normalize correctly on both ends.
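Here's a rough sketch of what normalizing on both ends might look like. Everything in it is hypothetical: nobody publishes "just miss" counts as far as I know, so the player lines and the league conversion rate below are invented purely to show the arithmetic. The idea is to pool a hitter's just-enoughs and just-misses as "borderline" fly balls and ask how many would have left the yard at a league-average conversion rate.

```python
def two_sided_estimate(solid_hr, just_enough, just_miss, league_conversion):
    """Expected homers if borderline fly balls converted at the league rate.

    Hypothetical model: borderline = just_enough + just_miss, and
    `league_conversion` is the league-wide share of borderline fly balls
    that end up as homers.
    """
    if min(solid_hr, just_enough, just_miss) < 0:
        raise ValueError("counts must be non-negative")
    if not 0 <= league_conversion <= 1:
        raise ValueError("league_conversion must be in [0, 1]")
    borderline = just_enough + just_miss
    return solid_hr + borderline * league_conversion


# Invented numbers, for illustration only.
LEAGUE_CONVERSION = 0.5  # hypothetical: half of borderline flies go out

# Hitter A: lots of just-enoughs, few just-misses (got the breaks).
hitter_a = two_sided_estimate(solid_hr=16, just_enough=14, just_miss=6,
                              league_conversion=LEAGUE_CONVERSION)

# Hitter B: almost no just-enoughs, lots of just-misses (did not).
hitter_b = two_sided_estimate(solid_hr=23, just_enough=1, just_miss=15,
                              league_conversion=LEAGUE_CONVERSION)

print(hitter_a, hitter_b)
```

With these made-up inputs, hitter A drops from 30 actual homers to an expected 26, while hitter B climbs from 24 to 31. A one-sided adjustment that only looked at just-enoughs would have left hitter B roughly where he was, and that is exactly the information we lose without "just miss" data.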
Furthermore, Jaffe doesn't take the full effect of weather into account. For every fly ball that becomes a homer only because of favorable weather conditions ("lucky"), there is in all likelihood a fly ball that fails to clear the fence because of adverse conditions ("unlucky"). For example, David Wright had 10 "lucky" home runs last year. To accurately predict the overall effect of weather on Wright's home run rate, however, we have to know how many "unlucky" fly balls he hit. We also have to know the league-average rate of "unlucky" fly balls so we can normalize his rate for predictive purposes. That would give us a complete picture of Wright's overall luck factor, covering both the positive and negative effects of weather.
TL;DR: Unless we can normalize for "just misses" and "unlucky" fly balls, we have no way to accurately predict home run rates through Jaffe's method. I hope he continues his analysis with this in mind, because it's a fascinating line of research; he simply needs both halves of the picture to see whether there's a correlation to be found.
Sorry that there's not more to be posted today. Some things came up and I don't have more time to blog at the moment. But don't worry, I'll be plenty bored at work tomorrow so there should be more updates then.