ERA+: Looking Behind The Stat
ERA+ is a great analytical tool. It permits comparisons of ERAs across different eras and different run environments by adjusting for general league scoring levels and park factors. Its advantages over simple ERA are obvious. It is the single pitching statistic most often regarded as the definitive tool for analyzing pitching careers. Some stat geeks have become so enamored of ERA+ and its derivatives that they deny certain baseball truisms that might call into question the validity of judging pitchers primarily on the basis of ERA+. They tend to deny the concept of clutch pitching, despite the fact that certain pitchers evince a tendency to pitch measurably better or worse in high leverage situations (see this post for a discussion of Leveraged ERA+, or LevERA+, which weights runs allowed (and runs prevented) based on the impact on win expectancy). They also tend to discount the theory that most pitchers "pitch to the score" by changing their pitching approach depending on the game situation.
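For readers who want the adjustment spelled out, here is a minimal sketch of the commonly cited approximation of ERA+: league ERA, scaled by a park factor, divided by the pitcher's ERA, times 100. The function names and the convention of expressing the park factor as a ratio (1.00 = neutral) are assumptions made for illustration; Baseball-Reference's exact calculation may differ in its details.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * earned_runs / innings_pitched


def era_plus(pitcher_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Approximate ERA+ as 100 * (league ERA * park factor) / pitcher ERA.

    park_factor is an assumed ratio: 1.00 is a neutral park, values above
    1.00 mean a hitter-friendly park. 100 is league average; higher is better.
    """
    if pitcher_era <= 0:
        raise ValueError("pitcher_era must be positive")
    return 100.0 * league_era * park_factor / pitcher_era
```

Under this sketch, a pitcher with a 3.00 ERA in a 4.00 league and a neutral park comes out at roughly 133, and the same ERA in a hitter-friendly park rates higher still.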
A host of statistics confirm that most pitchers do indeed pitch to the score. Pitchers as a group subscribe to the theory that when granted a big lead it is better to put the ball over the plate and make the opposition hit their way back into the game rather than risking a rally fueled by bases on balls. Virtually all successful pitchers walk fewer batters when working with a significant lead. Virtually all pitchers, successful or not, surrender more runs when working with tremendous run support from their teammates. Baseball-Reference.com recently added pitching splits based on team run support, showing a pitcher's performance in games in which they received between 0 and 2 runs of support, 3 to 5 runs of support and 6 or more runs of support. The vast majority of pitchers will surrender more runs on average when working with 6 or more runs than they do when working with 5 or fewer runs. The run-support splits further confirm that variations in ERA in high run-support scenarios have little or no impact on a pitcher's winning percentage in these scenarios, with good pitchers winning between 90% and 95% of these decisions regardless of how much their ERAs increase with great run support.
These statistics don't reveal defects in the ERA+ statistic but rather reveal the limitations of the statistic. They reveal that the ERA+ of pitchers who are blessed with generally superior run support, like Jack Morris, may be misleading. In games in which Morris received six or more runs of support he allowed 18% more earned runs than he did when working with 3 to 5 runs of support. This didn't prevent Morris from winning 93.3% of his decisions in these games, approximately the same percentage as pitchers who had much smaller increases in ERA in similar situations. The incremental runs allowed by Morris in high run-support games significantly inflated his ERA and ERA+ but had virtually no impact on game outcomes or his teams' fortunes.
Morris is representative of most elite starting pitchers in this regard. They tend to allow significantly more runs when they have good run support to work with. The following list shows the percentage by which these pitchers' ERAs increased or decreased in games in which they received 6 or more runs of support (relative to games in which they received 5 or fewer runs of support).
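The percentage described above is simple to reproduce from run-support splits. The sketch below computes an ERA for each bucket and the percentage change between them. The split totals are hypothetical, chosen only so that the result lands near the 18% scale mentioned for Morris; they are not any pitcher's actual numbers, and the function names are made up for illustration.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * earned_runs / innings_pitched


def era_change_pct(high_support_era: float, baseline_era: float) -> float:
    """Percent change in ERA in high-support games relative to the baseline.

    Positive means the pitcher allowed more earned runs with big support.
    """
    if baseline_era <= 0:
        raise ValueError("baseline_era must be positive")
    return 100.0 * (high_support_era - baseline_era) / baseline_era


# Hypothetical split totals (earned runs, innings pitched); not real data.
baseline = era(200, 600)  # games with 5 or fewer runs of support
high = era(118, 300)      # games with 6 or more runs of support
change = era_change_pct(high, baseline)
print(f"ERA change with 6+ runs of support: {change:+.1f}%")
```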
Obviously, for a given ERA (or ERA+) the optimal distribution of runs allowed by a pitcher would have the pitcher allowing the fewest runs in games in which his run support was weak and the most runs in games where his run support was strong. Pitchers who pitch relatively better where their run support is particularly weak or strong see little benefit to their winning percentages; even the best pitchers in the lowest run scoring environments will win less than 25% of their decisions when they receive 2 or fewer runs of support, and even average pitchers will generally win nearly 90% of their decisions in games in which they receive 6 or more runs of support. The impact of a pitcher's performance is greatest in those games where his run support is in the middle range - three to five runs of support - and those pitchers who pitch well in those games see the most beneficial impact on their winning percentages.
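A toy example makes the distribution point concrete. The sketch below is a deliberate simplification: it credits a win whenever run support exceeds runs allowed and ignores bullpens, unearned runs and no-decisions. The three games and both pitchers' lines are invented for illustration.

```python
def count_wins(run_support: list[int], runs_allowed: list[int]) -> int:
    """Count games in which run support exceeds runs allowed.

    Simplifying assumption: the starter is charged with every run his team
    allows, and a tie counts as no win.
    """
    if len(run_support) != len(runs_allowed):
        raise ValueError("both lists must cover the same games")
    return sum(1 for s, a in zip(run_support, runs_allowed) if s > a)


# Hypothetical run support in three games: weak, middling, strong.
support = [1, 4, 8]

# Both pitchers allow nine runs in total, so their run totals are identical.
pitcher_a = [2, 2, 5]  # pitches best in the middling-support game
pitcher_b = [2, 5, 2]  # pitches best in the strong-support game

print(count_wins(support, pitcher_a), count_wins(support, pitcher_b))
```

Both pitchers lose the weak-support game and win the strong-support game regardless of how they pitch; only the middling-support game separates them.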
Palmer pitched in a slightly lower run-scoring environment in Baltimore, and accordingly 3 to 5 runs represented slightly better run support than the same number of runs when scored in the parks Blyleven pitched in during the '70s. However, this potential mitigating factor is offset by the fact that Blyleven received better run support overall when receiving 3 to 5 runs of support, getting an average of 3.93 runs/game as compared to Palmer's 3.77 runs/game. After adjusting for the different scoring environments, the run support received by each within the 3 to 5 run category is almost precisely the same. The huge disparity in their winning percentages when receiving between 3 and 5 runs of support cannot be explained by disparate run support, and is almost solely a function of the fact that Palmer pitched significantly better when receiving middling run support.
Blyleven had a slightly better ERA+ than Palmer when receiving 6 or more runs of support, but winning percentage in this category is largely inelastic (meaning that it doesn't vary much even with significant fluctuations in ERA+). Palmer lost only one such game in the '70s; Blyleven lost two. Blyleven also had a better ERA+ than Palmer when receiving between 0 and 2 runs of support, but Palmer had a significantly better winning percentage, .267 to Blyleven's .211. Palmer's advantage when receiving weak run support can be explained by his far superior record in one-run games, which constitute a significant percentage of the games in which a pitcher receives two or fewer runs of support.
As the Palmer/Blyleven comparison demonstrates, relatively similar ERA+ figures can mask significant differences in pitcher performance. Although Palmer's ERA+ in the '70s was only marginally better than Blyleven's, Palmer's substantially better performance in high leverage situations and better performance in those games where pitcher performance is most likely to affect the outcome (i.e., the 3 to 5 run support category) produced a substantially better W-L record.
Ron Guidry. Guidry pitched much better in higher leverage situations, compiling a LevERA+ more than five points higher than his nominal ERA+. Guidry also pitched significantly better in games where he received 3 to 5 runs of support, compiling an ERA+ in those games of 130.5 as compared to an overall ERA+ of 119 and an ERA+ of 109.4 in games in which he had run support of 6 runs or more.
John Tudor. Tudor had nearly a 129 LevERA+ (as compared to a 124 ERA+). He also excelled in matching his performance to the game scoring environment, pitching his best in lower scoring games while allowing more runs in high run support scenarios.
Whitey Ford. Ford's LevERA+ of 137 was even more impressive than his outstanding 133 ERA+. Ford also allowed nearly 9% fewer runs when receiving 5 or fewer runs of support than he did with 6 or more runs of support.
Tommy John. John's 114 LevERA+ was approximately three points higher than his ERA+, and his ERA was nearly a full run higher when receiving support of 6 runs or more than when he was working with 5 runs or fewer. His ERA in high run support scenarios hurt his ERA and ERA+ but not his winning percentage, and accordingly his ERA+ is deceptively low.
Juan Marichal. Marichal had a slightly higher LevERA+ than ERA+, 125 to 123, and he allowed approximately half a run more when supported with 6 or more runs than he did when working with 4 to 5 runs. Between his fine clutch pitching and his tendency to allow insignificant runs when working with great run support, Marichal's 123 career ERA+ is deceptively low.
On the other end of the spectrum - the Blyleven end, so to speak - Dave Stieb, Curt Schilling, Orel Hershiser and Steve Rogers are notable examples of pitchers whose LevERA+s were lower than their ERA+ and who tended to pitch better when graced with huge run support than they did in games in the critical 3 to 5 run support category. Like Blyleven, their ERA+ figures don't tell the full story.
In short, any apparent comparability between Bert Blyleven's performance in the '70s and Jim Palmer's is illusory. Palmer was clearly the better pitcher and it's not even particularly close. This may not be apparent if one looks only at ERA+, but one doesn't have to look too hard behind the ERA+ stat to learn that while they may have allowed a similar number of runs, Palmer generally allowed them when he could afford to and Blyleven too frequently allowed them at the worst possible times. This fact, not disparate run support, accounts for the huge difference in their W-L records. ERA+ won't tell you that. It's still an important measure of pitching performance, but there are now statistics readily available that, when viewed together with ERA+, give a much fuller and more accurate picture of a pitcher's performance.
_____________________
* Koufax's +59% figure is an anomaly produced by the fact that Koufax played in wildly disparate scoring environments, pitching in distinctly hitter-favorable parks until '62, and then switching to the pitcher-friendly Dodger Stadium just as he was hitting his stride. As a consequence, a disproportionate number of games in which Koufax received 6 or more runs of support occurred early in his career when he was not yet the Koufax of legend, and this significantly skews the numbers.