Leverage Adjusted ERA (Or "Not All Runs Are Equal")
It's been surprising to me, given the profusion of new pitching statistics (FIP, VORP, Component ERA), that we haven't seen an expression of ERA or ERA+ that adjusts for leverage, weighting runs allowed in high-leverage situations more and runs allowed in low-leverage situations less. The data is available in the game logs at Baseball-Reference.com, but aggregating the data would be a tedious exercise. Fangraphs.com aggregates the data on a seasonal basis in the WPA, WPA/LI and Clutch statistics, but expresses the statistics in terms of incremental games won or lost rather than adjusted ERA.
Fangraphs calculates "Clutch" by subtracting WPA/LI, which aggregates the unleveraged increase or decrease in win probabilities associated with each plate appearance against a pitcher, from WPA, which also aggregates the win probabilities but assigns a leverage factor to each event based on the game situation (score, inning, base and out situation). Generally speaking, a pitcher with a positive Clutch factor performed better in high-leverage situations relative to his overall seasonal performance, or declined in performance in low-leverage situations relative to his overall seasonal performance, or some combination of the two. A better performance in high-leverage situations means that the incremental outs the pitcher got in high-leverage situations count for more than an average out (i.e., an out obtained in a game situation with a leverage factor of 1.0). A worse performance in low-leverage situations means that the incremental runs the pitcher allowed in low-leverage situations count for less than the average run (i.e., a run scored in a game situation with a leverage factor of 1.0).
The significance of the Clutch statistic should be obvious: not all runs allowed (and runs prevented) are equal. For example, the run surrendered in the bottom of the ninth of a tie game should be counted differently than the run surrendered in the bottom of the first inning after the visiting team took a six-run lead in the top half of the inning. ERA and ERA+ count each run the same, notwithstanding that the two runs I used as examples are likely to have had hugely disparate impacts on the outcome of the game. The advantage of expressing the number of leverage-weighted runs allowed as a variation on ERA should also be obvious: most fans will not know whether a Clutch factor of 0.74 is merely above average, or very good, or a spectacular achievement, but fans know how to compare a 116 ERA+ to a 135 ERA+.
It turns out that a Clutch factor of 0.74 - meaning that the pitcher's clutch performance was worth 0.74 wins for the season - is very high. A Clutch factor of 2.0 in a season is truly spectacular, and 3.0 or above exceedingly rare. Curt Schilling was a spectacular clutch performer in 2001, improving on his usual performance that year by 30% with runners on, by 48% with runners in scoring position and two outs, and by 20% in "late and close" situations. These spectacularly clutch performances translated into 2.03 incremental wins for Schilling as measured by win probabilities added. But if one expresses this same statistic by adjusting his ERA and ERA+ to reflect not only how many runs he allowed, but the impact of these runs given the game situation, what happens to Curt's 2.98 ERA and 157 ERA+ for 2001?
Leverage-adjusted ERA+ (or "LevERA+") is calculated by expressing the 2.03 wins Curt added by virtue of his clutch performance in terms of the equivalent number of runs. The concept of expressing wins in terms of equivalent runs is a common one in sabermetrics, although the appropriate win-to-runs conversion factor is difficult to calculate and varies depending on the league scoring level and "park run environment". Fortunately, Fangraphs has already calculated the conversion factor for us, and it can be found simply by dividing RE24 by REW (here's the Fangraphs glossary that describes these two stats). Curt's RE24 for 2001 was 50.87 and his REW was 5.09. That means the appropriate win-to-runs conversion factor for Curt in 2001 was 50.87/5.09, or 9.99 (which is a fairly typical conversion factor in today's game). Multiplying Curt's 2.03 Clutch factor by 9.99 reveals that Curt's clutch performance was the equivalent of allowing 20.28 fewer runs than the 86 runs Curt allowed in 2001, or 23.58% fewer runs. Reducing Curt's earned runs allowed by the same 23.58% results in a figure of 65 leverage-adjusted earned runs (as compared to Curt's actual 85 earned runs in 2001). That means Curt's LevERA in 2001 was 2.28 and his LevERA+ was 205.
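For anyone who wants to check the arithmetic, here is a minimal Python sketch of the steps just described, using only the Schilling figures quoted above. The function name and its parameters are my own labels for illustration, not anything Fangraphs publishes. It also assumes that innings pitched stay fixed, so that cutting earned runs by a given share cuts ERA by the same share and raises ERA+ by the inverse of that share.

```python
def leverage_adjusted_era(era, era_plus, runs_allowed, clutch_wins, re24, rew):
    """Sketch of the LevERA / LevERA+ arithmetic (names are illustrative).

    Assumes innings pitched are unchanged, so ERA scales with earned runs
    and ERA+ scales inversely with ERA.
    """
    runs_per_win = re24 / rew                  # win-to-runs conversion factor
    clutch_runs = clutch_wins * runs_per_win   # Clutch wins expressed as runs
    share = clutch_runs / runs_allowed         # fraction of runs "saved" by clutch pitching
    lev_era = era * (1 - share)                # same proportional cut applied to earned runs
    lev_era_plus = era_plus / (1 - share)      # a lower ERA means a higher ERA+
    return runs_per_win, clutch_runs, share, lev_era, lev_era_plus


# Schilling, 2001, using the figures quoted in the text
runs_per_win, clutch_runs, share, lev_era, lev_era_plus = leverage_adjusted_era(
    era=2.98, era_plus=157, runs_allowed=86,
    clutch_wins=2.03, re24=50.87, rew=5.09,
)
print(f"runs per win: {runs_per_win:.2f}")
print(f"clutch runs:  {clutch_runs:.2f} ({share:.2%} of runs allowed)")
print(f"LevERA:       {lev_era:.2f}")
print(f"LevERA+:      {lev_era_plus:.0f}")
```

A negative Clutch factor works the same way: the share comes out negative, so LevERA rises above ERA and LevERA+ falls below ERA+.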
This is an extreme example, of course, because a Clutch factor of 2.03 for a season is extremely high. Most pitchers will have a Clutch factor much closer to 0 (that is, they were neither "clutch" nor "anti-clutch") and accordingly a LevERA+ that varies very little from their ERA+.
Just to give a further idea of how extreme Curt's 20 run Clutch improvement was, consider that since 1974 (the date from which the Clutch stats are available at Fangraphs) the largest career Clutch improvements and declines measure less than 100 runs. Curt, oddly enough, had a career Clutch factor of negative 5.26, which translates into 53.4 more runs and 50.75 more earned runs over his career. Again, that may not sound like much but it moves Curt's career ERA+ of 127 to a 122.1 LevERA+. That's still excellent, of course, but it's a difference to which most fans, and certainly most sabermetrically inclined fans, would attach some significance.
Another HOF aspirant with a significantly negative Clutch factor is our old buddy Bert Blyleven. Bert's Clutch factors for the first four seasons of his career - '70 to '73 - are not available, but it can be estimated on the basis of the leverage statistics at Baseball-Reference.com that Bert's Clutch factor for those four years would be slightly negative (very negative in '70, very positive in '71, mildly negative in each of '72 and '73). Let's assume for purposes of calculating career LevERA+ that his Clutch factor was precisely zero for his first four years. That leaves Bert with the -3.88 Clutch factor he accumulated from '74 to the end of his career. That translates to 37.12 more runs and 33.48 more earned runs over his career, making Bert's career LevERA+ 115.9.
Here are the top 25 LevERA+'s among pitchers with 2000 or more innings pitched since 1952:
Guidry, Appier, Tudor, Palmer and Santana had the largest increases over ERA+. Maddux and Brown had the largest decreases.
There were three pitchers who ranked in the top 15 in ERA+ but not in LevERA+: Schilling, Smoltz and Mussina. Schilling had the largest drop (127 ERA+, 122.1 LevERA+). Smoltz dropped from a 125 ERA+ to 122.4 LevERA+. Mussina dropped from a 123 ERA+ to a 122.4 LevERA+.
In evaluating levels of run support provided to pitchers, increasing attention has been focused on run distribution and the potential that an adverse run distribution can make a pitcher's run support look better than it really was. A pitcher whose run support looks healthy when measured by the average runs per game scored by his team in his starts may really have received relatively poor run support, because he received a very large number of runs in a small number of games, or had a concentration of games at both the low end and the high end of the spectrum. Although these kinds of adverse run distribution situations can (and do) occur within seasons, it is very unlikely that the phenomenon could persist over a lengthy pitching career, and I've seen no data that suggests that any pitcher in fact suffered from adverse run distribution over the course of his career.
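To make the run-distribution point concrete, here is a small Python sketch. The two run-support lines are invented purely for illustration and belong to no actual pitcher, and the two-run cutoff for a "low-support" start is my own arbitrary assumption. Both lines average 4.5 runs per start, but one of them leaves the pitcher with two runs or fewer in most of his starts.

```python
from statistics import mean

# Hypothetical runs scored by a pitcher's team in ten starts (made-up data).
even_support = [4, 5, 4, 5, 4, 5, 4, 5, 4, 5]
lumpy_support = [1, 1, 2, 1, 2, 1, 2, 12, 11, 12]


def low_support_share(runs, threshold=2):
    """Fraction of starts in which the team scored `threshold` runs or fewer."""
    return sum(r <= threshold for r in runs) / len(runs)


for label, runs in (("even", even_support), ("lumpy", lumpy_support)):
    print(f"{label}: {mean(runs):.1f} runs/start, "
          f"{low_support_share(runs):.0%} of starts with 2 runs or fewer")
```

The average hides the difference entirely; the share of low-support starts exposes it.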
Very little attention has been paid, however, to the distribution of runs allowed by pitchers, despite the fact that certain pitchers have exhibited distinct tendencies to pitch differently in high-leverage and low-leverage situations. Unlike distribution of run support, adverse distribution of runs allowed is far more likely to persist over a career because of the potential that a given pitcher possesses the tendency to pitch better or worse in high-leverage situations (or pitch better or worse in low-leverage situations, or some combination of the two). LevERA+ is a measure of the impact of the distribution of runs allowed by a pitcher. It reveals that certain pitchers, like Bert Blyleven, contributed to their own mediocre W-L records by performing relatively poorly in critical situations, and it reveals that other pitchers, like Ron Guidry, produced spectacular W-L records not only because of superior run support but because of their superior performance in critical situations.
Did You Know...
There are eleven pitchers since 1954 who won at least 50% of their starts. Here they are:
Seaver, 47.9%
Maddux, 47.9%
Morris, 47.6%
Schilling, 47.2%
Gooden, 47.1%
Hunter, 46.6%
Carlton, 46.1%
Tiant, 45.2%
Glavine, 44.7%
UPDATE: As of May 1, 2010, here are the top 100 pitchers in terms of percentage of starts won since 1952 (clicking on the preceding link will take you to the spreadsheet at Google Docs):
BlyLeverage
Bill James did a piece a few years ago on Bert Blyleven in which he addressed the great mystery surrounding Blyleven's conspicuously mediocre W-L record. While conceding that Bert's critics make some good points - "Blyleven did not do an A+ job of matching his effort to the runs he had to work with" - he ultimately concluded that Bert's biggest problem was his lack of run support, not his failure to pitch better in critical situations. Bill attributed roughly two-thirds of Bert's relatively poor record to lack of run support and one-third to Bert's tendency to pitch relatively poorly in tight games.
Bill's analysis was disappointing in certain respects, however. First, he didn't note that Bert's relatively poor career W-L record is almost purely a function of his performance in the first nine years of his career ('70 to '78). Had Bert compiled a W-L record commensurate with his ERAs and run support in the '70s Bert would already be in the Hall and Bill James and I wouldn't be writing about him. Second, Bill didn't discuss Bert's pertinent statistics from this period that likely explain the disparity between Bert's excellent ERAs during that period and his pedestrian W-L record. As I've previously noted, Bert had a terrible record in "late and close" situations in that period, far worse than any premier pitcher of that era that I've examined, and lost a disproportionate number of close games. While it strikes me as reasonable and logical to infer that a pitcher who performs poorly in the late innings of tight games will lose a disproportionate number of close games, I thought I'd look at the records of various pitchers in one-run games and attempt to determine if there is any significant correlation between a pitcher's performance in close games and his record in one-run games.
I began by identifying pitchers who either distinctly improved their performance in high-leverage situations or exhibited a distinct decline in performance in high-leverage situations.* I then compiled their records in relatively low-scoring one-run games in which they started and pitched at least 5 innings, reasoning that higher-scoring one-run games and games in which they pitched fewer than five innings are less a function of their performance and more a function of other factors. Accordingly, I looked at one-run games with scores of 4-3, 3-2, 2-1 and 1-0. A comparison of these one-run games to Bill James's data on all one-run games pitched by the pitchers referenced in his Blyleven article indicated no significant differences, meaning that none of the pitchers performed materially differently in higher-scoring one-run games.
Here are the pitchers in the two categories:
Now, to the analysis. The six pitchers who improved in HL situations improved by an average of 7.5%, ranging from Guidry at 3% to Palmer at 13%. The nine pitchers who declined in HL situations did so by an average of 7.44%, ranging from Gibson at 2% to Rogers at 18% (I probably should have excluded Gibson, the only pitcher whose performance varied by less than 3%, but I left him in to make the point that this analysis is not intended to be any kind of dispositive argument about clutchness). The six improvers had a winning percentage of .614 in one-run games in which they started and pitched at least five innings. The nine decliners had a winning percentage of .520 in such games. The correlation coefficient between performance in HL situations and one-run game winning percentage was a fairly strong .69.
There were outliers in each category. Ford improved by 6% in HL situations but had only a 32-29 record in one-run games (but, as with Gibson and Hunter, you can't tell me Whitey wasn't clutch). On the other end, Carlton and Sutton each declined by 12% but had winning percentages of .566 and .545, respectively. The best record in one-run games was Koufax, who had a winning percentage of .682 (and improved in HL situations by 6%). The worst record in one-run games was Blyleven, who had a winning percentage of .432 (and declined in HL situations by 6%).
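For readers unfamiliar with the correlation coefficient, here is a minimal Python sketch of the calculation (a plain Pearson correlation). It uses only the five pitchers whose figures are quoted in the paragraph above, so it will not reproduce the .69 figure, which was computed on all fifteen pitchers; it is meant only to show the mechanics. The variable names are my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# (pitcher, % change in HL situations, winning pct in low-scoring one-run games)
# Only the five pitchers quoted in the text; Ford's 32-29 is converted to a pct.
sample = [
    ("Ford", 6, 32 / (32 + 29)),
    ("Carlton", -12, 0.566),
    ("Sutton", -12, 0.545),
    ("Koufax", 6, 0.682),
    ("Blyleven", -6, 0.432),
]

hl_change = [row[1] for row in sample]
one_run_pct = [row[2] for row in sample]
r = pearson(hl_change, one_run_pct)
print(f"correlation on this five-pitcher subset: {r:.2f}")
```

With so few points, and with Carlton and Sutton pulling against the trend, the subset figure should be read as an illustration of the method and nothing more.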
This is obviously a very small sample set. There are more pitchers in the "decline" category than the "improve" category simply because that category seemed to fill out faster (primarily because I began by looking at pitchers referenced in James's Blyleven article and most of them just happened to exhibit performance declines in HL situations). I'm considering adding more pitchers to the analysis but compiling the records of one-run games is a fairly tedious exercise. If I can bring myself to pore through the game logs I'll update this analysis.
_______________________
* I opted to go with the high-leverage statistics at Baseball-Reference.com rather than the "late and close" statistics for various reasons but principally because the "late and close" statistics are just too narrow for this purpose, excluding anything before the sixth inning and even many situations in the late innings in which the difference in the score is only two runs. Additionally, "late and close" statistics have become increasingly less relevant over the last 30 years, as pitchers accumulate very few innings beyond the sixth inning. Whereas "late and close" situations typically constituted between 15% and 20% of a pitcher's innings in the '60s and '70s, they generally constitute less than 10% of a contemporary pitcher's innings.
Welcome To The Club, Bill
Bill James is coming out shortly with his Bill James Gold Mine 2010. He has a chapter in the book entitled "Comparing Starting Pitchers Across History." The chapter has been pre-released online and you can read it here.
In this chapter, Bill returns to one of his favorite subjects: Hall of Fame standards for starting pitchers. He's noted many times in the past that Hall of Fame voting in recent years appears to reflect a movement away from traditional HOF standards for starting pitchers toward an emphasis on longer careers and the accumulation of huge career statistics (high career win totals, strikeouts, etc.). If it were up to today's HOF voters, would pitchers like Drysdale, Lemon, Newhouser, Bunning, Hunter, Gomez and Dean be in the Hall of Fame? It's not at all clear.
Bill introduces a system based on something called "Season Scores"* that awards points to a pitcher for ranking among the top starting pitchers in the league and awards bonus points for a particularly dominant season: for ranking as the top pitcher in the league, leading the league in Season Score by 50 points or more, and compiling a historically high Season Score. A pitcher can accumulate a maximum of 9 bonus points, three for each of the three achievements listed in the preceding sentence. Bill identified twelve seasons since 1930 in which a pitcher earned the maximum 9 bonus points. Oddly, this hasn't been achieved since Doc Gooden's 1985 season; not Pedro, not Randy, not Roger, not Greg. Here are the twelve seasons:
Grove, '30 and '31
Hubbell, '33
Dean, '34
Roberts, '52
Koufax, '63, '65, '66
McLain, '68
Carlton, '72
Guidry, '78
Gooden, '85
The method for calculating Season Scores and then cumulating to arrive at a career HOF-worthy point total is rather prosaic and so I won't go into the precise method here. It is notable less for sophisticated statistical analysis than for the high correlation between top Season Scores and the balloting that awards Cy Young Awards and HOF inductions. It is, in other words, an insight into the approach the baseball writers have used, wittingly or unwittingly, in evaluating pitchers for the Hall and for Cy Young Awards. Bill found that his system very accurately predicts which pitchers make the Hall, at least among pitchers who retired prior to 1990; after that the system breaks down, in the sense that various pitchers who've accumulated enough career HOF-worthy point totals to have traditionally qualified for induction have failed to even merit serious consideration among HOF voters. Dwight Gooden and David Cone are two such examples.
Bill identifies only two pitchers who retired prior to 1990, accumulated career HOF-worthy point totals well in excess of the traditional requirement, and yet have failed to make the Hall. Bill calls these two the "true outliers" in the analysis. They are Bert Blyleven (of course) and Ron Guidry. As Bill puts it:
"Blyleven and Guidry are so far above the Hall of Fame line that one would think that their Hall of Fame selection would not be an issue. Blyleven, of course, has become a popular candidate. Guidry has not."
Bill's system for comparing starting pitchers across history has led to a re-evaluation by Bill of Guidry's HOF qualifications, although it's important to note that Bill doesn't claim that his system is the "right" system or is preferable to the judgments of the HOF balloters. Accordingly, he doesn't really take a position on whether Blyleven or Guidry belong in the Hall, in his own estimation, but merely concludes that they belong in the Hall if the standard is the one historically used by HOF voters.
Bill returns to his "Guidry and Gomez" comparison that he's made before, most notably in his Historical Baseball Abstract. But whereas Bill formerly believed that Gomez was a marginal HOF inductee, and Guidry a shade behind Gomez in his qualifications, he now believes that both are clear HOFers and that Guidry is actually more deserving.
"In the past, I have analyzed this comparison in this way:
1) Gomez was fortunate to make the Hall of Fame, being very marginally qualified,
2) Guidry was similar but a little bit behind Gomez, thus not in a range where his Hall of Fame selection was likely,
3) Gomez had three outstanding seasons; Guidry only one, 1978, and
4) Gomez made the Hall of Fame, in part, based on his post-career reputation as an entertainer and ambassador for the game.
But the implications of this new method are totally incompatible with that analysis. As this method sees it, putting Gomez in the Hall of Fame was not a reach. Gomez is well qualified based on the number of high-quality seasons that he produced. And Guidry, rather than ranking behind Gomez, in fact ranks far ahead of him."
Bill then examines his rankings of Guidry among the league's best pitchers in various seasons and observes that "Guidry [had] four seasons among the league's four best pitchers, and he was competing in a 14-team league. Gomez had four such seasons, competing in an eight-team league."
Bill sums it up this way:
"By Guidry's era, career win totals had come to dominate the Hall of Fame discussion. Perhaps this is right; perhaps it is wrong. I am not suggesting that my new method here should substitute for all other judgments about Hall of Fame selections, not at all. There are many other ways to look at the issue. Perhaps those other ways are better.
But while those other pitchers have 100+ wins more than Guidry, Guidry's winning percentage was far better than Carlton's, or Sutton's, or Niekro's, or Kaat's, or Tommy John's, or Ryan's, or Blyleven's or Gaylord Perry's; it was even far better than Tom Seaver's. Guidry was further over .500 - wins minus losses - than most of those pitchers.
Steve Carlton's ERA was 41 points better than the league norm for this career. Don Sutton's ERA was 45 points better-than-league, Tommy John's was 42 points better, Blyleven's 50 points better. Jim Kaat was 15 points better than league. Ron Guidry's ERA was 76 points better than the league average.
I am merely pointing this out: in general, through baseball history, pitchers who have this many seasons as one of the best pitchers in their league have been almost automatic Hall of Fame selections. Historically, the Hall of Fame has made room for all pitchers with 250+ wins, but also for pitchers who were more dominant in shorter careers."
Well put, Bill. Your analysis is already driving some of the stat geeks crazy, but your point is nonetheless valid. Ron Guidry does indeed belong in the Hall of Fame. Even forgetting many of the aspects of Guidry's career I've discussed in this blog, Ron still qualifies for the Hall based on the standards employed by the HOF in the first 60 years of HOF balloting. Throw in the fact that he was the best big-game pitcher of his generation and it's a no-brainer.
I know Bill will insist he wasn't campaigning for Ron's induction, but I'm inviting him into the "Put Gator In The Hall" club anyway.
On The Subject of "Clutch"
It's a word you hear a lot about in discussions of athletics. It's a given among most sports fans and commentators that some performers are clutch and some aren't. Does anyone dispute that Michael Jordan was clutch? Does anyone dispute that John Elway was clutch? We all remember those game-winning shots and game-winning 4th quarter drives. Those were clutch, right? Ron Guidry's 26 wins in 30 September pennant race starts? That's gotta be clutch, doesn't it? And does anyone really dispute that Derek Jeter is clutch?
Well, yes, some people do dispute that Derek Jeter is clutch. And, frankly, they make some pretty good points. They correctly caution us that we should be careful about placing too much emphasis on "the flip" in the '01 ALDS against the A's, or the walk-off home run against the D'backs in game 4 of the 2001 World Series. And they're right to warn against relying on anecdotes, or isolated instances of "clutch plays", or, more generally, very small sample sets. Those may have been clutch plays, but do they necessarily make Derek Jeter a clutch player? They point out that Derek Jeter in the post-season is pretty much like Derek Jeter in the regular season - almost identical batting average, OPS, and just a little bit more HR power in Oct/Nov than in April to September. Jeter's not being "clutch", they argue; he's merely being Jeter.
Reggie Jackson? Surely a .755 slugging average across five World Series establishes beyond question Reggie's clutch bona fides, right? Well, what about those 11 ALCS, the skeptics ask. Those were big games, too, and Reggie slugged .380 and had an OBP under .300.
Here's Bill James on the subject of "clutch":
"The prominence of clutch performance as an element in player ratings can be attributed to three factors: (1) Hero worship journalism; (2) Self-aggrandizement by athletes, particularly retired athletes serving as TV announcers; (3) The fact that we all need, at times, to escape the implications of our logic."
Bill then cites ex-athletes like Joe Morgan, Ray Knight and Reggie Jackson as commentators who have a tendency to cast every contest as a "test of character, determination, and fortitude."
"My attitude toward this can probably be inferred from my tone. I do not believe that athletes are better people than the rest of us, I do not believe that athletic contests are tests of character, and I do not believe that there is any such thing as an ability to perform in clutch situations. It's just a lot of poppycock."
While rejecting the notion of an ability to perform in the clutch, however, Bill agrees that certain players have performed so well in clutch situations, for whatever reason, that they deserve credit for it and extra consideration when assessing their historical standing.
Bill contrasts Don Drysdale and Bob Gibson to illustrate his point. Bill cites Gibson's well-known big-game reputation, his tremendous performance down the stretch in the 1964 NL pennant race, and his remarkable World Series record (7-2, seven straight wins, two game 7 victories, 2 World Series MVPs) and contrasts Gibson's clutch achievements with Drysdale's pennant race performances.
"This is an absolute fact that doesn't change depending on how you feel about it: Don Drysdale started 13 games in his career in the heat of the pennant race against the team the Dodgers were trying to beat - and never won. Not even once. He never pitched particularly well without winning; 0 for 13.*
"I don't believe that this reflects a character failing on Drysdale's part. I think it's just something that happened. Sometimes he had been overworked; sometimes maybe a pitch or two got away from him. Sometimes you make good pitches and get beat. If there was a big game next week, I'd as soon have Drysdale pitching for me as anybody else.
"Nonetheless, it did happen; he did, in general, pitch poorly in pennant races (with some exceptions), and he did repeatedly fail to beat the Dodgers' key opponent in the heat of the pennant race. In rating Drysdale's career, is this something that should be ignored, or something that should be considered?"
Bill answers his own question directly and succinctly, stating "if a player really does come through in big games or fail in big games, I don't think we can afford to ignore that."
Bill then argues that there are, in his opinion, about 20 players who should be rated up or down "a little bit" because of their clutch performances. In addition to Drysdale and Gibson, Bill mentions five other players for whom the clutch factor would figure in Bill's analysis: Yogi Berra, Joe Carter, George Brett, Steve Garvey and Reggie Jackson. Although Bill doesn't say so, I think it's fairly clear that each of these five would be uprated by Bill for clutch performance. But Bill doesn't explain why, and it's really not clear to me what Bill's methodology was in arriving at these five examples. If it's post-season performance (and I believe that is what Bill primarily relied on) then it should be noted that none of these players has aggregate post-season numbers that put him among the all-time post-season performers (with the possible exception of Reggie Jackson). And each of them has very notable chinks in his post-season record. The point is this: why these guys but not Lou Gehrig? Henry Aaron? Lou Brock? Allie Reynolds? Lefty Gomez? Babe Ruth? Mickey Mantle?
The fact is that there were two different Yogi Berras in the World Series - the one that hit .188 in his first five World Series and the one that was a fire-breathing monster in the next 7 Series in which he played. Gehrig and Ruth virtually never had a poor World Series - why don't they deserve Bill's uprate?
Why Joe Carter? His aggregate post-season numbers are even weaker than his generally mediocre regular season numbers. And he only played in five post-season series. What about Lou Brock and Henry Aaron, each of whom may have played in only three post-season series but put up numbers that are off the charts? And if it's Carter's World Series winning walk-off HR in the '93 WS that qualifies as a clutch uprate for Bill, then what about Bill Mazeroski, whose HR to win the classic 7 game Series in 1960 is even bigger than Carter's walk-off, and whose aggregate post-season numbers are far better than Carter's? Or what about Mantle, who hit more hugely consequential World Series HRs than anyone?
Well, Bill himself pointed out that the subject of clutch performance is inevitably very subjective and, as he put it so eloquently, "it's a dangerous area to get into, because when you reach into the bullshit dump, you're not going to come out with a handful of diamonds."
Still, you can't avoid the whole "clutch" debate; it's a classic sports fan subject. And it's a subject that in many ways is an implicit premise of this blog about so-called big-game pitchers. There are some players who were so undeniably great in big games, in tight pennant races, or in post-season competition that you have to take notice. And in the final analysis, I suppose I don't care if these performances were the result of some innate clutch gene, or some identifiable super-ability in the clutch. These performances occurred, they took place in the biggest games on the biggest stage, and the implications for their team and for baseball history were profound. So I'm with Bill, here: I don't think we can afford to ignore that.
____________________
* Bill, although generally correct about Drysdale's conspicuously poor record in September against other contenders, is simply wrong in his claim that Drysdale never won such a game. Drysdale beat the Giants on September 19, 1959 to draw the Dodgers even with the Giants with six games to go, pitching six innings, giving up one unearned run and striking out 8; and Drysdale beat the Pirates on September 15, 1966 to put the Dodgers up by 2.5 games over the Pirates with 17 games to go, going 8.2 innings and giving up 5 hits and 3 runs.
Bill James Ranks The Lefties
Bill James ranked the 100 greatest pitchers in baseball history in The New Bill James Historical Baseball Abstract. The book was written after the 1999 season (and ultimately published in 2001) and consequently greats like Roger Clemens, Greg Maddux, Randy Johnson, Pedro Martinez and Mariano Rivera are not as elevated in James's rankings as they would be today (ranking 11th, 14th, 29th, 49th and unranked, respectively). Based on more recent commentary by James I think it's pretty clear that Randy Johnson, in particular, would make a huge jump in James's ranking.
Here is James's list of the top 20 southpaws of all time through 1999 (my list is here; open it up in a new window and do a side-by-side comparison of the lists):
Bill's top 10 would look pretty much the same today, I believe, with two exceptions: Randy Johnson would jump ahead of Koufax and probably even Spahn, taking the two spot behind Grove; and Glavine would likely crack the top 10 given his superior 2000 to 2002 seasons (55-27, 133 ERA+).
It's harder to speculate about what Bill's second 10 would look like today. Bill and his buddy Rob Neyer seem to think fairly highly of Andy Pettitte (Bill has said Andy will likely make the Hall of Fame, and Neyer has said that Andy is "qualitatively" better than Jack Morris). And Johan Santana is probably already on the cusp of Bill's second 10. However, the presence of Wilbur Cooper and Eppa Rixey on Bill's list suggests to me that he perhaps adhered too slavishly to his Win Shares index, and I don't think either Pettitte or Santana fares too well on that basis (although Santana will certainly get there with another four or five good years).
Bill has seven pitchers ranked ahead of Guidry whom I'd ranked behind him: Plank, Newhouser, Waddell, Cooper, Pierce, John and Kaat. Conversely, Bill has Gomez ranked one spot behind Guidry whereas I had ranked Gomez one spot ahead of Guidry. The difference between our approaches to Plank and Waddell is easily explainable: I copped out and argued that it was simply too difficult to compare "deadball era" pitchers to post-1920 era pitchers. As for the rest, this is my take on Bill's take:
Newhouser. Bill seems to hold Prince Hal in unusually high regard. It's not clear to me how Bill weights the War Years, but it seems as if he doesn't discount them as much as I do. He also made a couple of very strange claims about Newhouser's post-War Years in his Historical Baseball Abstract, arguing that Newhouser was the best pitcher in the AL in '47 and '48. Well, it was Feller in '47 and, for my money, either or both of Gene Bearden and Bob Lemon were better than Hal in '48 (I admit, Hal's and Lemon's numbers in '48 are pretty damn similar, but Lemon pitched TEN shutouts, and that tipped the balance to Bob). Anyway, I had Prince Hal 14th on my list of post-1920 lefties.
Cooper. Cooper was good, and very consistent, between '16 and '24, but I think Bill's ranking of Cooper is a case of too much reliance on the Win Shares analysis. Also, while consistency is certainly a virtue, Bill seems to value it more highly than I do. See the discussion of Jim Kaat, below. Cooper was honorable mention on my list (i.e., not in the top 15).
It's difficult to tell what Bill thought of Cooper, with Cooper's mini-bio in Bill's Abstract consisting only of two extended quotes from two other writers that dealt more with Cooper's personality than his pitching.
Pierce. I, too, am a Billy Pierce fan (perhaps I just fancy slightly built lefthanded power pitchers), but he barely missed my list of the top 15 post-1920 lefties. Here are my two reservations regarding Billy.
First, in considering his excellent W-L record during his peak (i.e., '51 to '60) you have to recall that the White Sox had some very good teams in those years. In fact, the White Sox had a cumulative .568 winning percentage over that decade; Billy managed a .584 winning percentage over that period. That's not much of a difference.
Second, Billy seemed to disappear a bit on the Chisox down the stretch of those '50s pennant races. Take his great '55 season, for instance (1.97 ERA, 199 ERA+). The White Sox were half a game up on the Yanks and 1.5 games up on the Indians when Billy took the mound against the Indians on Sept. 3, but he gave up 8 hits and 6 walks in 5.1 innings and lost, 6-1. A week later he faced the Yanks and couldn't make it out of the 2nd inning, giving up 6 runs in 1.1 innings. Ten days later, with the White Sox now five games back of the Yankees but still alive, Billy lost to the Indians and Early Wynn, 3-2, eliminating the Sox from the pennant race.
In 1953 the Sox were just 6.5 games back of the Yanks after Billy shut out the Tigers on August 14th, but Billy won only two more games the rest of the way, and his second win was on the last day of the season, long after the Sox had been eliminated.
In '57 the Sox were staying within striking distance of the Yanks through August and September (between 4 and 6.5 games back for most of those two months), but Billy went only 5-5 with a 5.04 ERA in August and September, after taking a 15-7 record and 2.45 ERA into August. He was hammered in both his starts against the Yanks down the stretch, giving up 9 earned runs in 10.1 innings.
And when the Sox finally won the pennant in '59 Billy struggled down the stretch, winning only two games in August and September. It was the same story in '60 - the Sox were right there with the Yanks and Orioles in August and September, but Billy pitched very poorly over his last 8 starts (ironically, the Sox won six of those starts anyway, despite Billy averaging less than 4.5 innings per start and posting a 5.14 ERA).
I don't think these late season swoons figured into Bill's analysis at all. They figure into mine, however, and they exerted a pretty heavy drag on Pierce's ranking in my list of all-time lefties.
John. Honorable mention on my list. He was a big winner when healthy for the Dodgers and Yanks in the late '70s and in 1980. And he was consistently good even during his abbreviated seasons. But there were just too many seasons when he didn't make enough starts, didn't pitch enough innings, and therefore didn't have enough impact.
Kaat. Bill has an extended take on Kaat in the Abstract, and emphasizes how consistent Kaat was during the '60s and '70s. It's true, Kaat was very consistent, but he was too often consistently mediocre from '68 to '73, during a period that should have been the peak of his career (ages 29 to 34). I've lauded Kaat in this blog - for his back-to-back 20 win seasons for poor White Sox teams in '74 and '75, and for his spectacular stretch run in the great four-team pennant scramble of '67 - but I had Kaat ranked 15th on my list of post-1920 lefties, five spots behind Guidry.
All in all, I think Bill's list is pretty similar to mine. And even if Bill is right and Guidry is the 14th best post-1920 southpaw rather than the 10th, I still think he belongs in the Hall.
If You Had To Win One Game...
If you had to choose one pitcher to start a critical, late September game in a tight division race, who would you choose? Sabathia? Halladay? Santana? Carpenter?
I know who I would choose, and you know who I'd choose, too, because his picture is to the right. I'd choose Roy Oswalt, the Astros ace, hands down. Year after year Roy has put up Guidry-like numbers in September with the Astros in contention for a division title or wild-card spot. He's as close to infallible in a battle for the post-season as any pitcher of his generation.
Thanks to the three-division, wild-card format, the Astros have been in contention for a post-season berth in every year of Oswalt's career other than 2007 and 2009. Oswalt has made 40 September starts in the seven tight races in which he's participated and his record is 28-7 with a 2.49 ERA in 267.1 innings pitched. However, two of his losses came in late September 2002 after the Astros had been eliminated (the only two of his 40 September starts that occurred after the Astros had either clinched or been eliminated). Take away those starts and Oswalt is 28-5 with a 2.39 ERA in 38 September starts while the Astros were still in contention.
And Roy's been getting better as he goes along. In his last five post-season races ('03, '04, '05, '06 and '08) he is 24-3 with a 2.33 ERA. In his last two - '06 and '08 - he's 10-1 with a 1.64 ERA. Four times Roy has won five games in September in the heat of races for the post-season, a feat matched only by Ron Guidry since 1954.
Over his last 30 September starts in tight races, Oswalt has won 24 games. As I said, Guidry-esque.
Throw in his 4-0 post-season record and it should be pretty obvious why Roy is my go-to guy.
There is no doubt that Roy is a late starter and fast closer; his August and September career statistics are vastly superior to his career statistics for the first four months of the season. But Roy's spectacular numbers in the heat of post-season races are more than just a function of his fast finishes. As I mentioned, the Astros were out of contention in September in only two of Roy's seasons - '07 and '09 - during which Roy made seven starts, going 0-2 with a 4.73 ERA in 40 innings. Throw in the last two starts of 2002 (after the Astros had been eliminated) and Roy's career record in September when the Astros are not contending is 0-4 in nine starts with a 4.76 ERA in 51 innings. That's right: 28-5 when it meant the most, but 0-4 when it meant little or nothing.
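For anyone who wants to check that the contending and non-contending splits hang together, here's a rough sketch of the arithmetic. The earned run totals are not taken from any box score; they are my assumption, back-computed from the rounded ERAs and innings quoted above, and the function names are just illustrative.

```python
def innings_to_float(ip: str) -> float:
    """Convert baseball innings notation ('267.1' means 267 and 1/3) to a number."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def implied_earned_runs(era_value: float, innings: float) -> int:
    """Back out an earned run total from a rounded ERA (an assumption, not a box-score figure)."""
    return round(era_value * innings / 9)


# All 40 September starts in the seven races: 2.49 ERA in 267.1 innings.
all_ip = innings_to_float("267.1")
all_er = implied_earned_runs(2.49, all_ip)

# Non-contending Septembers: 40 innings in '07 and '09, 51 innings once
# the last two starts of 2002 are added.
er_07_09 = implied_earned_runs(4.73, 40)
er_non_contending = implied_earned_runs(4.76, 51)

# So the two post-elimination starts of 2002 account for the difference.
er_2002 = er_non_contending - er_07_09
ip_2002 = 51 - 40

# Remove those two starts from the 40 to get the 38 starts in contention.
contending_era = era(all_er - er_2002, all_ip - ip_2002)
print(f"ERA while in contention: {contending_era:.2f}")
```

Under those assumptions the in-contention ERA comes out at the 2.39 figure quoted above, so the two sets of numbers are at least consistent with each other.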
It's pretty plain that Roy likes the big stakes, thrives on pressure, and wants the ball in the big games. There's a term for guys like that. They're called "big game pitchers," and they are just about the most precious commodity in major league baseball.
Demythologizing Bert's Famous "Bad Luck"
The June 1976 SI article was published shortly after Bert's first appearance with the Rangers, in which Bert and Mark "The Bird" Fidrych each went 11 innings, with the Tigers prevailing, 3-2. Including Bert's last two starts with the Twins, this made the third consecutive start where Bert had pitched well and been tied going into the late innings but lost. But the SI article wasn't a product of Bert's disappointing results in tight games over the preceding few weeks. The SI article was prompted by two conspicuous aspects of Bert's record that had persisted for years. First, Bert's W-L record never seemed to match the rest of his record - the superior ERAs, the shutouts, the complete games and the strikeouts. Second, Bert had a propensity to lose a lot of close games in the late innings.
Bert's fanatical supporters always have two deceptively simple explanations at the ready for Bert's mediocre W-L records in the '70s: he received poor run support, and he was unlucky. Each of these two rationalizations offered by Bert's backers fails completely to explain Bert's relatively poor W-L records, and each is particularly absurd for having been offered by people who purport to possess some degree of sophistication in statistical analysis. Each can be dismissed quickly and definitively.
Run Support. Bert's run support was slightly below average for much of the '70s. In this respect, Bert's backers are correct. But Bert's run support in the '70s was enough so that a pitcher with Bert's record of stinginess in allowing runs should have had a W-L percentage of approximately .600!
Bert had completed six seasons in the major leagues when the SI article appeared in June 1976, and had allowed an average of 3.16 runs/game while receiving approximately 4.0 runs/game from the Twins. Plug those two figures into the so-called Pythagorean Theorem with which any of Bert's statistically inclined supporters is familiar and you receive a projected winning percentage of approximately .615. Bert's actual winning percentage from 1970 to 1975 was .528. The Pythagorean Theorem suggests Bert should have been able to compile his .528 winning percentage while receiving only 3.35 runs/game from the Twins.
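Here's a minimal sketch of that calculation for anyone who wants to reproduce it. I'm assuming the classic form of the Pythagorean formula with an exponent of 2; other exponents are in circulation and would shift the results slightly. The function names are mine, for illustration only.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Projected winning percentage from runs scored and allowed per game."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def support_needed(target_pct: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Invert the formula: run support that would project to target_pct."""
    return runs_allowed * (target_pct / (1 - target_pct)) ** (1 / exponent)


# Bert through 1975: about 4.0 runs/game of support, 3.16 runs/game allowed.
projected = pythagorean_pct(4.0, 3.16)

# Run support that would have projected to his actual .528 winning percentage.
needed = support_needed(0.528, 3.16)

print(f"projected winning pct: {projected:.3f}")
print(f"run support needed for .528: {needed:.2f}")
```

With an exponent of 2 the projection lands within a point or so of the .615 quoted above, and the break-even run support comes out right around the 3.35 runs/game figure.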
This disparity between Bert's actual record and his projected Pythagorean record persisted for the remainder of the '70s. At the end of the decade, Bert's winning percentage was .536. The Pythagorean Theorem says Bert should have had a winning percentage of .602.
To put it another way, if Bert's run support in the '70s had been 3.3 runs/game, Bert's backers would have a point about poor run support accounting for Bert's poor W-L record. But Bert's run support was approximately 4.0 runs/game during the decade, approximately 20% higher than the 3.3 runs/game that might have explained Bert's .536 winning percentage.
Bad Luck. It is particularly curious that Bert's backers would resort to this argument, one born of superstition and anti-rationalism rather than the rigorous statistical analysis Bert's backers purport to favor. It is a wholly unworthy argument - even silly - for two simple reasons. First, the suggestion that Bert's "bad luck" could persist for a solid decade, across 350 starts and more than 2600 innings, following Bert from Minnesota to Texas to Pittsburgh, is pure nonsense as a statistical matter. Second, it is particularly nonsensical given the abundance of statistical evidence demonstrating that Bert had an astoundingly bad record in tight, low-scoring games and that this record was attributable to Bert's unusually poor performance in the late innings of tight games.
The rap on Blyleven, as expressed in the SI article, was that you could get to him in the late innings, and the statistics bear out that reputation. The following table shows Bert's performance in "late and close" situations (i.e., plate appearances against Bert in the 7th inning or later with the batting team tied, ahead by one, or the tying run at least on deck).
The pertinent numbers here are in the last two columns - the tOPS+ (Bert's OPS+ in late and close situations relative to his general OPS+ for the season) and the sOPS+ (Bert's OPS+ in late and close situations relative to the general league-wide OPS+, which by definition is 100 each year). Throughout the '70s Bert suffered declines in performance in late and close situations greater than any other elite pitcher of the era - his tOPS+ for the decade was 119. From 1980 to the end of his career the more mature Bert compiled a very good 86 tOPS+. Not coincidentally, Bert's winning percentage of .533 after 1979 almost exactly matches his projected Pythagorean winning percentage of .530, a stark contrast to the huge disparity between Bert's actual and Pythagorean winning percentages in the '70s.
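To give a feel for what a tOPS+ in the 120s means, here's a rough sketch of how a split-relative OPS+ can be computed. I'm assuming the formula as I understand Baseball-Reference publishes it: 100 times the sum of the OBP ratio and the SLG ratio, minus one. The OBP and SLG figures below are made up purely for illustration and are not Blyleven's actual lines.

```python
def relative_ops_plus(split_obp: float, split_slg: float,
                      base_obp: float, base_slg: float) -> float:
    """OPS+ of a split relative to a baseline (100 = same as the baseline).

    For tOPS+ the baseline is the pitcher's own season totals; for sOPS+
    it is a league baseline. For a pitcher, higher means worse.
    """
    return 100 * (split_obp / base_obp + split_slg / base_slg - 1)


# Hypothetical pitcher: hitters post a .300 OBP and .360 SLG against him
# overall, but a .330 OBP and .414 SLG in late and close situations.
t_ops_plus = relative_ops_plus(0.330, 0.414, 0.300, 0.360)
print(f"tOPS+ in late and close situations: {t_ops_plus:.0f}")
```

In this made-up case hitters reach base 10% more often and slug 15% better in late and close situations than they do against the pitcher overall, which works out to a tOPS+ of 125.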
Bert's improved performance in late and close situations from '80 to the end of his career coincided with a dramatic drop in the number of plate appearances against Bert in late and close situations. During the '70s approximately 14% of the plate appearances against Blyleven came in late and close situations. This figure dropped to less than 9% after 1979. This change is explained primarily by the fact that Bert was pulled from the game earlier in the '80s when he got in trouble in late and close situations, which contributed greatly to Bert's improved late and close performances later in his career. Consequently, Bert faced on average more than 8 batters per late and close game in the '70s, but just over 6 batters from '80 to the end of his career. This trend started in '79, when plate appearances against Bert in late and close situations dropped dramatically as a result of Chuck Tanner's decision to pull Bert at the first sign of trouble. The strategy worked spectacularly for the Pirates, who won 23 of Bert's 37 starts that year (a .622 winning percentage) despite the fact that in the great majority of Bert's 20 no-decisions he left the game either tied or behind.
The late and close statistics reveal that in the seven seasons preceding Tanner's '79 "quick hook" strategy, Bert had five seasons in which his tOPS+ was greater than 130 and his sOPS+ was 100 or greater. During these seven seasons Bert's tOPS+ was 128. No other elite pitcher of the era comes close to this record of performance decline in late and close situations over such an extended period. Bert's sOPS+ was 105, meaning that in these five seasons Bert was an average or below average pitcher in late and close situations. Of the other premier pitchers of the era who, like Bert, had a dozen or more seasons in which they faced 100 or more batters in late and close situations, no other pitcher had more than three seasons in which their tOPS+ was greater than 130 and the sOPS+ was greater than 100. Steve Carlton and Gaylord Perry each had three such seasons and a fourth season that nearly qualified, but these were spread over careers more than 20 years long. The occurrence of five such seasons in a period of seven years during the peak of Blyleven's career explains to a significant degree the striking discrepancy between his actual W-L record and the kinds of W-L records projected by the Pythagorean Theorem.
Boswell on Blyleven (or, "Bert Backers Bash Boswell")
Of all the commentary in the aftermath of the HOF voting results I was most struck by the following comments by Thomas Boswell during the course of an online chat at the Washington Post website:
"The push for Blyleven drives me crazy. I followed his whole career. His reputation was that, more than any other top stuff pitcher, he would find a way to lose or not to win. He's just not a HOFer, in my book. He only won 20 games one time and more than 17 only twice! And he pitched in the era when top starters got 4-5 more starts a year and 20 wins was easier. BB had nine seasons with 36-to-40 starts and averaged 38 in those years. When Chuck Tanner got him in Pittsburgh the word went around that Chuck had decided, over BB's protestations, to take him out of late-and-close games because he'd never had the stomach for it. 'Take him out before he can lose.' Tanner never said it in public. But BB's winning percentage gets better."
Well, we'll never know what was in Chuck Tanner's head, and Chuck is a classy guy and he ain't sayin'. But we do know the following: Boswell is absolutely correct regarding Blyleven's reputation, and Chuck Tanner did indeed resort to a quick hook with Blyleven beginning in the 1979 season, a strategy that succeeded wildly and was a critical part of the Bucs' march to the World Series that year.
I get the impression a lot of Bert Backers are too young to have closely followed the game back in the '70s, but Bert's reputation as a guy who lost the close ones and stumbled in the late innings of tight games is simply a fact, and also a matter of record. That was Bert's reputation; Boswell remembers it correctly. I remember it, too, and anyone else who followed the game back then would also remember it. Of course, reputations aren't always earned, and reputations in baseball are sometimes born unfairly out of an incident or two, or out of nothing at all.
However, if the issue is whether Bert actually had the reputation claimed by Boswell, we don't have to rely on Boswell's recollection. Bert's reputation for "finding a way to lose", as Boswell put it, was the subject of a Sports Illustrated article in 1976 published shortly after Bert's trade from the Twins to the Rangers, entitled "The Stuff, and No Nonsense: As a Texas Ranger He is Richer, But Will He Pay Attention?"
After recounting the rather ugly facts regarding Bert's infamous exit from Minnesota (i.e., Bert's heated salary dispute with Twins owner Calvin Griffith and his flip-off to Twins fans after Bert's last appearance for the Twins) and offering a comic tableau of Bert losing a battle of concentration with a rosin bag, the article shifted to the crux of the matter: "However, what was really at issue was not Blyleven's bad manners or the size of his paycheck, but whether he might now become the big winner so many think he ought to be." The following two paragraphs of the SI article neatly capture the gist of the matter.
When Blyleven does lose, his downfalls seem to occur in the late innings. For this he has blamed the Twins' relievers. Given a better bullpen, he claims "I would have 40 more career victories."
But many baseball people believe his late-inning reversals have been mostly his own doing. "Bert throws basically two pitches," says Bonds, "a hard fastball and a hard curveball. Everything comes in at the same speed, so sooner or later you can get your timing down. It takes a few innings and by then maybe Bert's lost a bit off his fastball. It starts to flatten out. And maybe in later innings his curveball will hang every so often."
Bert's problems in late and close situations were common knowledge in baseball, although the theories for the problem varied. (None of the theories, however, focused on the Twins bullpen; notwithstanding Bert's claim of 40 lost victories, Bert lost only 11 wins to bullpen malfunctions between '70 and '76, fewer than Niekro, Kaat, Hunter, Ken Holtzman, Joe Coleman, Andy Messersmith, Carl Morton, Fritz Peterson and Dave McNally, among others.)
The SI article used virtually the precise language used by Boswell in recollecting Bert's reputation as a pitcher who pitched just well enough to lose.
If Blyleven's parts have seemed greater than the whole, he attributes it to his struggles with a mediocre team. But as Dick Williams, the manager of the Angels, says, "I've seen a lot of pitchers who never had Blyleven's stuff win 20 games with teams a lot worse. Some pitchers pitch just good enough to win, whether it's 1-0 or 9-8, and others always seem to pitch just good enough to lose."
Dick Williams didn't name his "20 wins for bad teams" all-star team, but he wouldn't have had any problem filling out the rotation. Randy Jones won 20 for a Padres team in '75 that won only 71 games and 22 for a 73-win Padres team in '76; Steve Busby won 22 games for a Royals team that was 16 games under .500 when Busby wasn't the pitcher of record; and Jim Colborn won 20 games in '73 for a Brewers team that won only 74 games. And then there were pitchers who seemed to specialize in winning 20 games a season for mediocre teams, like Ferguson Jenkins, Mel Stottlemyre and Wilbur Wood, each of whom won 20 games three times for teams that were either .500 or below or would have been had their ace pitcher's W-L records been subtracted from their teams' record.
Perhaps the most glaring example of a pitcher who won 20 games without benefit of Bert's stuff and for teams worse than Bert's teams was Jim Kaat. Kaat was rebounding from arm problems when the Twins traded him to the White Sox in '73 and he no longer had the stuff he'd had for the Twins in the '60s. But Kaat put together back-to-back 20 win seasons for White Sox teams that finished behind the Twins in the AL West in '74 and '75. Bert, meanwhile, was winning 17 and 15 games, respectively, in '74 and '75.
The SI article from June 1976 is pretty compelling evidence that Boswell's recollection is correct: Bert had the reputation, fairly or unfairly, as a pitcher who pitched just well enough to lose, a pitcher who didn't produce results worthy of his nasty stuff, and a pitcher who seemed to sag in the late innings of tight games. Bert Backers can contest the fairness of this reputation but they cannot deny the existence of the reputation. I'm fairly certain that won't stop them from attacking Pat Jordan, the celebrated SI writer who wrote the article, or those the article quoted, like Bobby Bonds, Jim Palmer, Dick Williams and Gene Mauch. But they might consider that the source for Bert's alleged tendency to lose his concentration in tight spots was Bert himself, and that Bert's shabby attempt to blame the Twins bullpen for his troubles, absurdly blaming his teammates for costing him more wins than Bert's total number of no-decisions in that period, suggests that Bert was aware of his reputation and rather defensive about it.
I distinctly recall that this SI article was not the only notice the media took of Bert's reputation, but few publications maintain archives of 34-year-old articles. I also recall, as apparently Boswell does, that Bert's unwanted reputation only grew after this article, as his late inning troubles in '77 for the Rangers and '78 for the Pirates exceeded his Minnesota woes and became a source of contention with Pittsburgh manager Chuck Tanner.
I believe anyone who reads the SI article will agree that Boswell is owed an apology by those Bert Backers who accused him of fabricating his claim regarding Bert's reputation. Boswell's recollection is correct. The reputation existed. I'll examine in a subsequent post whether the reputation was deserved.