Put Gator In The Hall: 3/1/10

ERA+: Looking Behind The Stat

Posted by Gator Guy on Wednesday, March 31, 2010, under Sabremetrics, Statistics

ERA+ is a great analytical tool. It permits comparisons of ERAs across different eras and different run environments by adjusting for general league scoring levels and park factors. Its advantages over simple ERA are obvious. It is the single pitching statistic most often regarded as the definitive tool for analyzing pitching careers. Some stat geeks have become so enamored of ERA+ and its derivatives that they deny certain baseball truisms that might call into question the validity of judging pitchers primarily on the basis of ERA+. They tend to deny the concept of clutch pitching, despite the fact that certain pitchers evince a tendency to pitch measurably better or worse in high leverage situations (see this post for a discussion of Leveraged ERA+, or LevERA+, which weights runs allowed (and runs prevented) based on the impact on win expectancy). They also tend to discount the theory that most pitchers "pitch to the score" by changing their pitching approach depending on the game situation.
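As a rough illustration of the adjustment described above, here is a minimal sketch of a simplified ERA+ calculation. It assumes the commonly cited form (league ERA, scaled by a park factor, divided by the pitcher's ERA, times 100); the published figures at Baseball-Reference.com are computed with more detail than this, and the function names, the park-factor convention and the sample numbers below are all hypothetical.

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def era_plus(pitcher_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Simplified ERA+ (assumed form): 100 is league average, higher is better.

    park_factor > 1.0 is taken to mean a hitter-friendly home park, which
    raises the pitcher's ERA+ for the same raw ERA.
    """
    return 100.0 * league_era * park_factor / pitcher_era


# Hypothetical numbers, for illustration only.
print(round(era_plus(pitcher_era=3.00, league_era=3.90)))                    # 130
print(round(era_plus(pitcher_era=3.00, league_era=3.90, park_factor=0.96)))  # 125
```

The same 3.00 ERA is worth less in a pitcher-friendly park, which is the sense in which ERA+ permits comparisons across run environments.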

A host of statistics confirm that most pitchers do indeed pitch to the score. Pitchers as a group subscribe to the theory that when granted a big lead it is better to put the ball over the plate and make the opposition hit their way back into the game rather than risking a rally fueled by bases on balls. Virtually all successful pitchers walk fewer batters when working with a significant lead. Virtually all pitchers, successful or not, surrender more runs when working with tremendous run support from their teammates. Baseball-Reference.com recently added pitching splits based on team run support, showing a pitcher's performance in games in which they received between 0 and 2 runs of support, 3 to 5 runs of support and 6 or more runs of support. The vast majority of pitchers will surrender more runs on average when working with 6 or more runs than they do when working with 5 or fewer runs. The run-support splits further confirm that variations in ERA in high run-support scenarios have little or no impact on a pitcher's winning percentage in these scenarios, with good pitchers winning between 90% and 95% of these decisions regardless of how much their ERAs increase with great run support.

These statistics don't reveal defects in the ERA+ statistic but rather reveal the limitations of the statistic. They reveal that the ERA+ of pitchers who are blessed with generally superior run support, like Jack Morris, may be misleading. In games in which Morris received six or more runs of support he allowed 18% more earned runs than he did when working with 3 to 5 runs of support. This didn't prevent Morris from winning 93.3% of his decisions in these games, approximately the same percentage as pitchers who had much smaller increases in ERA in similar situations. The incremental runs allowed by Morris in high run-support games significantly inflated his ERA and ERA+ but had virtually no impact on game outcomes or his teams' fortunes.

Morris is representative of most elite starting pitchers in this regard. They tend to allow significantly more runs when they have good run support to work with. The following list shows the percentage by which these pitchers' ERAs increased or decreased in games in which they received 6 or more runs of support (relative to games in which they received 5 or fewer runs of support).
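The percentage described above can be computed directly from a pitcher's run-support splits. The sketch below is only an illustration: the function names are made up, the earned-run and innings totals are hypothetical rather than any real pitcher's splits, and innings are assumed to be plain decimals (200.333 rather than the 200.1 box-score notation).

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def pct_change_in_era(low_support, high_support) -> float:
    """Percent change in ERA in high-support games relative to low-support games.

    Each argument is an (earned_runs, innings) pair for that split.
    A positive result means the pitcher allowed more runs with big support.
    """
    low = era(*low_support)
    high = era(*high_support)
    return 100.0 * (high - low) / low


# Hypothetical splits: 3.00 ERA with 5 or fewer runs, 3.54 ERA with 6 or more.
change = pct_change_in_era(low_support=(200, 600.0), high_support=(118, 300.0))
print(f"{change:+.0f}%")  # +18%
```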

Obviously, for a given ERA (or ERA+) the optimal distribution of runs allowed by a pitcher would have the pitcher allowing the fewest runs in games in which his run support was weak and the most runs in games where his run support was strong. Pitchers who pitch relatively better where their run support is particularly weak or strong see little benefit to their winning percentages; even the best pitchers in the lowest run scoring environments will win less than 25% of their decisions when they receive 2 or fewer runs of support, and even average pitchers will generally win nearly 90% of their decisions in games in which they receive 6 or more runs of support. The impact of a pitcher's performance is greatest in those games where his run support is in the middle range - three to five runs of support - and those pitchers who pitch well in those games see the most beneficial impact on their winning percentages.

* * * * * *

A vivid demonstration of how misleading ERA+ can be is the comparison between Jim Palmer and Bert Blyleven in the 1970s. Although they had similar ERA+s in the '70s (Palmer, 137; Blyleven, 130), Palmer exhibited both aspects of a pitcher who maximized his run support by pitching well in high-leverage situations and by optimally distributing his performances, surrendering significantly fewer runs when supported with 5 or fewer runs. Blyleven, by contrast, pitched relatively worse in high-leverage situations and surrendered approximately as many runs when getting 5 or fewer runs as he did when getting 6 or more. The effect of these contrasting characteristics can be seen in their winning percentages during the '70s in games where they received between 3 and 5 runs of support: Palmer had a .752 winning percentage; Blyleven had a .596 winning percentage. Palmer had a 154 ERA+ in these games; Blyleven had a 123 ERA+.

Palmer pitched in a slightly lower run-scoring environment in Baltimore, and accordingly 3 to 5 runs represented slightly better run support than the same number of runs when scored in the parks Blyleven pitched in during the '70s. However, this potential mitigating factor is offset by the fact that Blyleven received better run support overall when receiving 3 to 5 runs of support, getting an average of 3.93 runs/game as compared to Palmer's 3.77 runs/game. After adjusting for the different scoring environments, the run support received by each within the 3 to 5 run category is almost precisely the same. The huge disparity in their winning percentages when receiving between 3 and 5 runs of support cannot be explained by disparate run support, and is almost solely a function of the fact that Palmer pitched significantly better when receiving middling run support.

Blyleven had a slightly better ERA+ than Palmer when receiving 6 or more runs of support, but winning percentage in this category is largely inelastic (meaning that it doesn't vary much even with significant fluctuations in ERA+). Palmer lost only one such game in the '70s; Blyleven lost two. Blyleven also had a better ERA+ than Palmer when receiving between 0 and 2 runs of support, but Palmer had a significantly better winning percentage, .267 to Blyleven's .211. Palmer's advantage when receiving weak run support can be explained by Palmer's far superior record in one-run games, which will constitute a significant percentage of games in which a pitcher receives two or fewer runs of support.

As the Palmer/Blyleven comparison demonstrates, relatively similar ERA+ figures can mask significant differences in pitcher performance. Although Palmer's ERA+ in the '70s was only marginally better than Blyleven's, Palmer's substantially better performance in high leverage situations and better performance in those games where pitcher performance is most likely to affect the outcome (i.e., the 3 to 5 run support category) produced a substantially better W-L record.

* * * * * *

Most pitchers perform essentially the same in clutch situations as they generally perform and exhibit fairly small deviations in performance under high and low run support scenarios, allowing slightly fewer runs in lower run support scenarios and slightly more runs when working with great run support. For these pitchers ERA+ is a reasonably accurate measure of their contribution to winning games. There is a smaller set of pitchers who exhibit countervailing characteristics, for instance pitching better in clutch situations but performing slightly worse in lower run support scenarios than in high run support scenarios (Mike Mussina is an example of such a pitcher). For these pitchers, too, ERA+ is a reasonably accurate measure of their true pitching performance. But there is a third category of pitchers, like Palmer, for whom ERA+ materially understates their performance. Other examples of such pitchers are:

Ron Guidry. Guidry pitched much better in higher leverage situations, compiling a LevERA+ more than five points higher than his nominal ERA+. Guidry also pitched significantly better in games where he received 3 to 5 runs of support, compiling an ERA+ in those games of 130.5 as compared to an overall ERA+ of 119 and an ERA+ of 109.4 in games in which he had run support of 6 runs or more.

John Tudor. Tudor had nearly a 129 LevERA+ (as compared to a 124 ERA+). He also excelled in matching his performance to the game scoring environment, pitching his best in lower scoring games while allowing more runs in high run support scenarios.

Whitey Ford. Ford's LevERA+ of 137 was even more impressive than his outstanding 133 ERA+. Ford also allowed nearly 9% fewer runs when receiving 5 or fewer runs of support than he did with 6 or more runs of support.

Tommy John. John's 114 LevERA+ was approximately three points higher than his ERA+, and his ERA was nearly a full run higher when receiving support of 6 runs or more than when he was working with 5 runs or less. His ERA in high run support scenarios hurt his ERA and ERA+ but not his winning percentage, and accordingly his ERA+ is deceptively low.

Juan Marichal. Marichal had a slightly higher LevERA+ than ERA+, 125 to 123, and he allowed approximately half a run more when supported with 6 or more runs than he did when working with 4 to 5 runs. Between his fine clutch pitching and his tendency to allow insignificant runs when working with great run support, Marichal's 123 career ERA+ is deceptively low.

On the other end of the spectrum - the Blyleven end, so to speak - Dave Stieb, Curt Schilling, Orel Hershiser and Steve Rogers are notable examples of pitchers whose LevERA+s were lower than their ERA+ and who tended to pitch better when graced with huge run support than they did in games in the critical 3 to 5 run support category. Like Blyleven, their ERA+ figures don't tell the full story.

* * * * * *

ERA+ is an analytical tool, not the dispositive argument many stat geeks believe it to be when discussing the relative merits of pitchers. In most instances, other factors being approximately equal, the pitcher with the 120 ERA+ will be better than the pitcher with the 110 ERA+. But there are many instances where the pitcher with the distinctly lower ERA+ is the superior pitcher because he made more optimal use of his run support, pitching well in high leverage situations and matching his performance to his team's, surrendering fewer runs when runs are at a premium and/or more runs when his team has provided great run support.

In short, any apparent comparability between Bert Blyleven's performance in the '70s and Jim Palmer's is illusory. Palmer was clearly the better pitcher and it's not even particularly close. This may not be apparent if one looks only at ERA+, but one doesn't have to look too hard behind the ERA+ stat to learn that while they may have allowed a similar number of runs, Palmer generally allowed them when he could afford to and Blyleven too frequently allowed them at the worst possible times. This fact, not disparate run support, accounts for the huge difference in their W-L records. ERA+ won't tell you that. It's still an important measure of pitching performance, but there are now statistics readily available that, when viewed together with ERA+, give a much fuller and more accurate picture of a pitcher's performance.
_____________________

* Koufax's +59% figure is an anomaly produced by the fact that Koufax played in wildly disparate scoring environments, pitching in distinctly hitter-favorable parks until '62, and then switching to the pitcher friendly Dodger Stadium just as he was hitting his stride. As a consequence, a disproportionate number of games in which Koufax received 6 or more runs of support occurred early in his career when he was not yet the Koufax of legend, and this significantly skews the numbers.

Clutch Septembers of the '20s and '30s

Posted by Gator Guy on Thursday, March 25, 2010, under Big Game Pitchers, Pennant Races

I've discussed the great pennant race performances of pitchers over the last 50 years. It's time to look at some of the legendary pennant race performances from long ago. These performances help to explain why certain pitchers with conspicuously thin career qualifications for the Hall were nonetheless inducted into the Hall of Fame. They also help to explain why some pitchers who were never seriously considered for the Hall are nonetheless revered by the oldtimers. Some names will be very familiar, others less so. But each of these pitchers put together performances in the heat of pennant races that lifted their teams to glory.

Dizzy Dean, 1934

Let's start with Dizzy Dean. Everybody knows about Dizzy's 30-win season for the Gashouse Gang in '34. They might not recall, however, that it was Dizzy's performance in August and September of '34 that made him a national figure and a baseball legend, as Dizzy led the Cards' comeback to catch the defending World Champion Giants.

After briefly leading the NL in late May the Cards had fallen behind the Giants and remained between 5 and 7 games behind the defending World Champs for most of July and August. The Cards kept pace with the Giants throughout August and early September but failed to make up much ground. They were still 4.5 games back on September 16th when Dizzy squared off against the Giants. Dean and the Cards beat the Giants 5-3 to close to 3.5 games back with 14 left to play. With time running out, the Cards decided to hitch their fortunes to Dean's arm and began pitching him virtually every other day in an effort to catch the Giants. Dean pitched a three-hit shutout against the Dodgers on Sept. 21, pitched in relief in both ends of a doubleheader two days later on the 23rd, pitched a complete-game victory on Sept. 25, pitched a complete-game shutout on two days' rest on the 28th, and then pitched a complete-game shutout on one day's rest on the 30th. All in all, Dizzy pitched in six of the Cards' last 11 games as they caught and passed the Giants.

From August 1 to the end of the season, Dizzy went 12-3 with 3 saves, posting an incredible 1.48 ERA in 155.1 innings and winning his last nine starts in a row. Dean's fast finish not only brought a pennant to the Gashouse Gang, it permitted him to win 30 games - the last time a National League pitcher would ever accomplish that feat.

Dean topped off his dream season by winning two of his three starts against the Tigers in the World Series, including the clincher in game seven. There is no question that Dean's superhuman achievements during the Cards' dash to the NL pennant in '34 form the bulk of the Dean legend and were a significant part of his elevation to the Hall. Without that performance, and the 30-win season that resulted from the Cards' decision to pitch Dean every other day down the stretch, it's likely Dizzy wouldn't be in the Hall.

Jesse Haines, 1928

The Cardinals were perennial contenders in the mid and late-20's, and Jesse Haines was their ace. After spending most of 1926 as the Cards' No. 3 starter, Haines came to the fore in the legendary World Series matchup with the Yankees of the Murderers Row era, pitching a complete-game shutout in game 3 and winning the decisive 7th game with a 6.2 inning, two run effort against Ruth, Gehrig and Co.

Haines was the unquestioned ace of the Cards staff in 1927, going 24-10 as the Cards narrowly missed winning another NL pennant. Haines pitched brilliantly in August, helping the Cards keep pace with the Pirates and Giants in a torrid three-way race, but stumbled in September and the Cards came up short. Haines redeemed himself in 1928, however, putting together a pennant race performance that ranks among the best in baseball history.

The Cards were just half a game in front on August 24th when Haines took the mound against the Phillies. Haines' shutout against the Phils triggered a five-game winning streak that extended the Cards' lead to 5.5 games by August 28th. But the lead slowly dwindled through early and mid-September and remained between one and two games for much of the last two weeks of the season. As the Cards were trying to hang on, Jesse Haines was the Cards' personal life preserver. Beginning with his win against the Phils on August 24th, Haines reeled off eight consecutive complete-game victories, compiling a 1.38 ERA over that stretch. Three of Haines' last four starts came with the Cards up by one game or less. Haines didn't allow as many as three earned runs in any of those eight starts until the last one, when he beat the Boston Braves to keep the Cards up by one with three games to go.

The Cards held on to win the NL pennant but were swept by the Murderers Row Yankees in the World Series. Haines started and lost game 3 of the Series, after two errors on one play by Cards catcher Jimmie Wilson led to three Yankees runs that broke a 3-3 tie in the sixth. Even with this loss, however, Haines' numbers against the great Yankee lineups in the '26 and '28 World Series are impressive: in four appearances against Murderers Row in those two World Series, Haines won two of his three starts and put up a 1.99 ERA. Haines added a complete-game four-hitter against the A's in the 1930 World Series, and finished his World Series career with a 3-1 record and 1.67 ERA, World Series stats virtually identical to Ron Guidry's.

Jesse Haines' 210 career wins and .571 winning percentage didn't much impress the BBWAA during the '50s and early '60s, but Jesse finally made the Hall in 1970 because enough Veterans Committee members remembered his central role on those Cardinals teams that fought Murderers Row to a draw in the '26 and '28 World Series.

Big Bill Lee, 1938

Big Bill's remarkable stretch drive in the great NL pennant race of 1938 has been largely overshadowed by Gabby Hartnett's legendary "homer in the gloamin'" that gave the Cubs a crucial victory over the Pirates just as umpires were preparing to call the game due to darkness. Hartnett's homer, however, wouldn't even be a footnote to history but for Lee's astounding September performance because the Cubs would have already been eliminated.

The Pirates entered September with a fairly comfortable lead over the Cubs, Giants and Reds, who appeared to be in a tight race for 2nd place. The Pirates faltered in early September, however, and by September 14 the four teams were separated by just 3.5 games. Lee began September by shutting out the Pirates. He then pitched shutouts against the Reds and Giants, helping to move the Cubs into 2nd place just 2.5 games behind the Pirates. Lee pitched a fourth consecutive shutout on Sept. 22nd against the Phillies, but the Cubs were still 3.5 games back with 13 to play. Lee's scoreless streak was finally snapped by the Cardinals on Sept. 26, but Lee pitched his fifth straight complete-game victory. The Pirates arrived in Chicago the next day with a 1.5 game lead to begin a three game series with the 2nd place Cubs. The stage was set for one of the most remarkable finishes in NL history.

A diminished but still formidable Dizzy Dean was tapped by the Cubs to pitch the first game against the Bucs. Dean's arm was no longer what it was, damaged as a result of his attempt to compensate by overthrowing after a line drive in the '37 All-Star game broke a toe on his landing foot and restricted his ability to follow through. Dizzy was only a once-a-week pitcher for the Cubs in '38, but when he pitched he was spectacular, taking a 6-1 record and 1.91 ERA to the mound to face the Pirates. Dean pitched brilliantly against the Pirates and took a 2-0 lead into the ninth. Dizzy had runners on 2nd and 3rd with two outs in the ninth when Hartnett, the Cubs manager, waved in Bill Lee. Lee promptly threw a wild pitch that allowed the runner to score from third, but with the tying run just 90 feet away Lee struck out Pirate catcher Al Todd to end the game. The Cubs were just half a game behind the Bucs.

The next day the Cubs and Pirates were tied 3-3 when the Pirates scored two runs in the 8th inning off Cub pitcher Larry French to take a 5-3 lead. Hartnett brought in Big Bill with no one out in the 8th to stem the rally and Lee managed to finish the inning without permitting further damage. It was Lee's third appearance in three days. The Cubs responded with two runs in the bottom of the 8th to tie the game 5-5, and Lee, who was slated to start the next day's game, was replaced by Charlie Root to pitch the top of the 9th. Root held the Pirates scoreless in the 9th, and the rest is history. Hartnett's bottom of the 9th shot in the gathering darkness at Wrigley Field remains one of the most famous home runs in baseball history.

The Cubs were now in first place for the first time since early June. Lee took the mound to make his fourth appearance in four days; his last start had been just three days prior. The Cubs, perhaps conscious of the fact that Lee was running on fumes, scored three runs in the bottom of the first to take a quick lead. By the end of the fifth inning the Cubs had an 8-1 lead, having pounded the Pirates pitching trio of Bauers, Brandt and Blanton. The Cubs won the game 10-1 to finish the series with the Bucs with a 1.5 game lead. Lee recorded his sixth complete game victory in September. For the month, Lee was 6-0 with two saves and a microscopic 0.64 ERA. He had started four games against the other contenders in the NL race and won them all, with wins over the Pirates bookending his month. The Cubs held on to win the pennant, maintaining their lead over the Pirates for the last three games of the season.

Lee started the first and fourth games of the World Series for the Cubs against the Yankee juggernaut manned by a roster of Hall of Famers. Lee pitched well but to no avail, surrendering just three earned runs in 11 innings against the likes of Gehrig, DiMaggio, Dickey, Gordon and Henrich, but losing both games. Ruffing and Gomez were too much for the Cubs batters, and the Yankees swept the Series.

When one considers Lee's iron-man performance against the Pirates in late September and the fact that three of his four September shutouts came against other contenders, Big Bill's pennant race performance for the Cubs in '38 might be the most spectacular in National League history.

A Recipe For Catfish

Posted by Gator Guy on Wednesday, March 24, 2010, under Hall of Fame, Hunter

Catfish Hunter is frequently cited by the stat geeks as a prime example of an unworthy HOF inductee. He doesn't have a plaque at the Baseball Think Factory's Hall of Merit, where Dave Stieb, Bret Saberhagen and Wes Ferrell are enshrinees. Hunter's ERA+ is presumably the problem the HOM balloters have with Hunter. It can't be the 224 career wins, since Stieb, Saberhagen and Ferrell each have significantly fewer. I've offered my explanation for Hunter's induction into the HOF, an induction I believe was more than worthy. I thought I'd look at Catfish's Team Relative performance.

During his ten-year prime from '67 to '76 Catfish outperformed his team by 11.3%. That's not a very good figure for a Hall of Famer, and I wasn't particularly surprised by it. What I was surprised by was Hunter's Team Relative index for his five-year prime of '71 to '75, which covers the A's World Series years and his first season with the Yankees. I apparently had assimilated the argument of the stat geeks that Hunter's record during that period was purely a function of pitching for a great team and getting huge run support. Not true, as it turns out. Hunter's Team Relative index for that five-year period is 28%. If you remove the '75 season, Catfish outperformed his A's teams by 29.3%. And if you limit the analysis to just the three World Series championship years with the A's, Catfish's Team Relative index was 34.6%.

To be clear, I'm not arguing that Catfish didn't benefit from great run support. He did. And I'm not arguing that Catfish would've had five consecutive 20-win seasons if he'd played for Blyleven's Twins teams in the '70s. What I am arguing, however, is that the claim that Hunter's great record during this period was just a function of great run support from a great team is demonstrably untrue. Take away the great run support and Hunter was still outperforming his team by 28% over a five-year period and a robust 34.6% during the A's championship years. Those are Hall of Famer-type numbers, albeit for a relatively brief period. It is simply a myth to argue that any pitcher with a Team Relative index like Hunter's was merely a product of great run support and great teams.

Let's look at another pitcher generally dismissed by the stat geeks as a mere product of great run support: Jack Morris. Morris's Team Relative index during his peak nine-year period of '79-'87 was 15.8%, not much by HOF standards but right there with Bunning's 16% index for his 11-year peak. That means if Morris had played for an average hitting team with a .500 record he still would have posted a .579 win% over those nine years. I think it's fair to conclude therefore that Morris's actual winning percentage of .615 during his peak was perhaps 30% attributable to his run support; the bulk of the credit, however, has to go to Morris. If I'm not mistaken, Morris detractors would look at his 105 ERA+ and conclude that Morris's .577 career winning percentage was attributable 95% to his superior run support. This is plainly not the case. The Team Relative analysis demonstrates that Morris was able to perform far above the standard a career ERA+ of 105 would typically indicate.
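The arithmetic behind the Morris figures can be sketched as follows. This assumes the Team Relative index is applied multiplicatively to a .500 baseline, which is inferred from the .500 to .579 step above; the adjustments that go into the index itself are not reproduced, and the function names are hypothetical.

```python
def projected_win_pct(team_relative_index: float, baseline: float = 0.500) -> float:
    """Winning percentage on a baseline team, given a Team Relative index.

    Assumed reading: an index of 0.158 means winning 15.8% more often
    than the team does.
    """
    return baseline * (1.0 + team_relative_index)


def run_support_share(actual_win_pct: float, neutral_win_pct: float,
                      baseline: float = 0.500) -> float:
    """Share of the margin over the baseline not explained by the pitcher himself."""
    return (actual_win_pct - neutral_win_pct) / (actual_win_pct - baseline)


neutral = projected_win_pct(0.158)       # Morris, '79-'87
print(f"{neutral:.3f}")                  # 0.579
share = run_support_share(0.615, neutral)
print(f"{share:.0%}")                    # 31%
```

On this reading, roughly .036 of the .115 by which Morris exceeded .500 is left over for run support, which is where the "perhaps 30%" figure comes from.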

It's no mystery why Catfish is in the Hall. He's in for the same reason Waite Hoyt, Jesse Haines, Lefty Gomez, and Red Ruffing are in the Hall despite falling well short of 300 wins, and for the reason Curt Schilling will make the Hall. They excelled on the big stage and made a huge impact for great teams. They put their imprint on legendary pennant races and World Series contests. That counts for a lot in HOF balloting, and it should.

The Celebrated Mr. K

Posted by Gator Guy on Tuesday, March 23, 2010, under Koufax, Team Relative Performance

His blazing five-year stretch from '62-'66 has become the standard by which all other great pitchers are measured. The Gold Standard. The definition of pitching dominance. Anyone who considers a new mode of analyzing pitching greatness has to insert his five peak seasons into the formulas and see what comes out. If you plug into your formulas his stats from these five seasons, during which he won five straight ERA titles, three pitching triple crowns and three 25+ win seasons in four years, and a historic result doesn't come out the other end, then maybe you need to double check your methods and formulas.

From '62 to '66 Sandy Koufax outperformed his team by 41%. If you exclude the '62 season, where Koufax's injury and the Dodgers' decision to rush him back into the rotation in late September significantly skew the numbers, then Koufax outperformed his team by 49.5% from '63 to '66*. That's Randy Johnson territory. A 50% Team Relative performance over a period of years could be known as the Sandy-Randy Standard.

Randy Johnson's peak period by Team Relative analysis was actually the five-year period from '93 to '97, which includes his injury-shortened '96 season when he went 5-0. It also includes the strike-abbreviated '94 and '95 seasons. Over that five-year period Johnson's Team Relative performance was 58.2%. History suggests, however, that Johnson would not have maintained the .920 winning percentage he compiled in '95-'96 had he pitched full seasons. Johnson's true peak, as measured by wins, ERA+ and most other measures, actually occurred with the D'backs from '99 to '02, and he compiled a 49.9% Team Relative performance during that period.

Maddux compiled a 52.6% Team Relative performance from '94 to '97, but that period also included two strike-shortened seasons.

Guidry's Team Relative performance over his three-year peak from '77 to '79 was 40%. Seaver had a 44.2% Team Relative performance for four years between '68 and '71. If one excludes Gibson's injury-shortened '67 season, Gibson maintained a 41.1% Team Relative performance from '65 to '70. If one excludes Marichal's injury-shortened '67 season, he maintained a 33.8% Team Relative performance from '63 to '69.

For Schilling's three 20-win seasons - '01, '02 and '04 - he had a Team Relative performance of 48.2%. For Guidry's three 20-win seasons he had a Team Relative performance of 43.5%.

Team Relative analysis confirms that Koufax's great run was indeed among the very best four or five year stretches in baseball history. Throw in the huge innings totals Koufax put up in these years, the no-hitters, strikeout records, pennant race and post-season performances, and it's clear why Mr. K became a legend.
_________________________

* Koufax's best season by far as measured by Team Relative performance was his injury-shortened '64 season, when he posted a 19-5 record for a Dodger team that was truly terrible but for Koufax, compiling a .442 winning percentage in games in which Koufax was not the pitcher of record. Koufax outperformed that team by nearly 89%.

Lost In Translation

Posted by Gator Guy on Monday, March 22, 2010, under Blyleven, Team Relative Performance

Not surprisingly, an analysis of Bert's prime years of '70 to '79 demonstrates that despite his superlative ERAs he didn't significantly improve his team when he was on the mound. Yes, Bert didn't get good run support from his teams, who scored .35 runs/game fewer for Bert than they did for other starting pitchers. It is also true that in measuring Bert's performance against his teams', Bert was competing against some pretty good pitchers. For the entire decade, Bert pitched on staffs that were slightly above average even without Bert's contribution, and the staffs on his '70, '72, '77 and '79 teams were among the very best in their leagues. But the Team Relative analysis controls for these factors, of course.

Even after increasing Bert's run support to team average, and adjusting his team's W-L record downward to reflect what it would have been with an average pitching staff, Bert still only outperformed his team's W-L record by 10.2%. That's down in Drysdale territory. As I've previously noted, Bert hugely underperformed his Pythagorean projection during those ten years, compiling a .536 winning percentage as compared to a .599 PythPro. If Bert had been able to perform to his PythPro he wouldn't be such a hot topic today because he would have been inducted into the Hall years ago.
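For readers unfamiliar with the Pythagorean projection, a minimal sketch follows. It assumes the classic Bill James form with an exponent of 2; how the "PythPro" figure above was actually computed (exponent, run-support adjustments, rounding) is not specified here, and the inputs below are hypothetical.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed.

    Classic form: RS^x / (RS^x + RA^x), with x = 2 assumed here.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical per-game figures, for illustration only.
print(f"{pythagorean_win_pct(4.0, 4.0):.3f}")  # 0.500
print(f"{pythagorean_win_pct(4.5, 3.7):.3f}")
```

The gap cited above, .536 actual against .599 projected, is 63 points of winning percentage.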

Bert's Team Relative performance was worst during his first six years with the Twins, the period that prompted the Sports Illustrated article wondering why Bert wasn't a bigger winner. Bert outperformed his team by 8.3% during those years. He improved slightly in his stints with the Rangers and Pirates from '76 to '79, outperforming his team by 13.9%, still well short of what we'd expect from a top-flight pitcher. Bert's worst year in this regard was '72, when he performed only 2.2% better than his team despite receiving .44 runs/game more than the other Twins starting pitchers. This is one year where Bert can truly be called a victim of a poor distribution of run support, with a disproportionate number of games falling at either end of the spectrum - a large number of games in which he received three runs or less and a large number of games where he received seven or more. Amazingly, in Bert's 38 starts there were only five games in which he received 4, 5 or 6 runs of support. Bert's run support distribution was also very poor in '75 and, to a lesser extent, in '73 and '74.

Bert's run support distribution was more conventional in '76 to '79, although his average run support in '76 was terrible - only 2.75 runs/game. But remember, the Team Relative analysis controls for poor average run support*; it doesn't control for poor run support distribution. In '77, '78 and '79, Bert's run support was almost precisely team average. After controlling for Bert's run support for the '76 to '79 period, a period during which his run support distribution was more conventional and his ERA+ was a very good 125.5, he still only managed to outperform his team by 13.9%. And since we've acknowledged Bert's poor run distribution in the period '72 to '75, we should also acknowledge that Bert's Team Relative performance in the period '76 to '79 was skewed by his W-L record in the '79 season, when the Pirates' bullpen and bats bailed out Bert an extraordinary 13 times after Bert left the game in a position to lose. To give you some idea of how extraordinary this "bailout" total is, consider that Bert was similarly bailed out only 18 times in the preceding nine years. Bert's 12-5 record in '79 is extremely misleading, and if not for his good fortune and the Bucs' late-inning dramatics for Bert in '79 his record would have been something like 12-15, which more than eliminates the improvement we see in Bert's Team Relative performance in the late '70s.

I've not modeled Bert's projected record from '72 to '75 assuming a more optimal distribution of run support. It could be done using a system that generates a random distribution of run support, and Bert's projected W-L record under such a system would no doubt benefit. There's also no doubt, however, that any benefit to Bert from a more conventional distribution of run support from '72 to '75 would be largely offset by his '79 season, when Bert easily could have lost an additional 10 games and his 12-5 record and .706 winning percentage were not reflective of his performance: he had a 109 ERA+, a LevERA+ of 99, and won only 12 of his 37 starts for a World Series championship team.
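For the curious, here is a rough Python sketch of what such a randomized run-support system might look like. It is not a model I have run on Bert's actual game logs. Everything in it is an assumption for illustration: the function name `simulate_wpct`, the idea of drawing each start's run support at random from a pool of team game scores, and the simplification that a start counts as a win when the support drawn exceeds the runs the pitcher allowed, a loss when it falls short, and nothing at all when the two are equal.

```python
import random


def simulate_wpct(runs_allowed_by_start, support_pool, trials=10_000, seed=0):
    """Estimate a winning percentage under randomly assigned run support.

    runs_allowed_by_start: runs the pitcher allowed in each start (hypothetical input).
    support_pool: runs scored by the team in each of its games (hypothetical input).
    Each trial re-deals run support to every start by sampling from the pool.
    Simplifying assumption: support > allowed is a win, support < allowed is a
    loss, and equal totals are dropped as no-decisions.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    wins = 0
    losses = 0
    for _ in range(trials):
        for allowed in runs_allowed_by_start:
            support = rng.choice(support_pool)
            if support > allowed:
                wins += 1
            elif support < allowed:
                losses += 1
    decisions = wins + losses
    return wins / decisions if decisions else 0.0


if __name__ == "__main__":
    # Made-up numbers, not Blyleven's real game logs.
    starts = [2, 3, 1, 4, 0, 2, 5, 3]
    team_scores = [0, 1, 2, 3, 4, 4, 5, 6, 7, 9]
    print(f"simulated W%: {simulate_wpct(starts, team_scores):.3f}")
```

Comparing the simulated figure to the actual W% would give a crude read on how much a lopsided distribution of run support cost (or helped) a pitcher in a given stretch.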

Again, this Team Relative analysis is limited to Bert's peak decade of '70 to '79. As the Bert Backers would no doubt argue, Bert had some great seasons outside this period, primarily '84 and '89. But the '70 to '79 period forms the overwhelming bulk of Bert's argument for the Hall. For the balance of his career he had a .533 winning percentage and 108 ERA+, and despite the excellent '84 and '89 seasons the 80's were an exceedingly erratic period for Bert, a period in which poor seasons ('80 and '88) and injury limited seasons ('82 and '83) detract from his case for the Hall. The argument of the Bert Backers is almost exclusively based on the '70s, during which he posted his best ERA+ figures, almost 2/3s of his shutouts and six of his eight 200 strikeout seasons. But the fact is that for all of Bert's statistical achievements in the '70s, they didn't translate into a commensurate win total and W-L record and Bert didn't improve his team as much as he should have. And the problem wasn't run support.

__________________________

* For example, Bert posted a 13-16 record in '76, worse than his team's 79-83 record. But the Team Relative analysis has Bert outperforming his team by 32% after adjusting for his run support.

Big Unit, Indeed

Posted by Gator Guy on Saturday, March 20, 2010 , under Randy Johnson, Team Relative Performance | comments (0)

It's an amazing sight when you're watching great athletes compete at the highest levels in their sport and one competitor is so great that the opposition is just overmatched. I mean dominated; not just beat, not just bested, but dominated, almost completely helpless. In the realm of baseball, the greatest pitchers, at their best, will do this. Major league hitters, the best in the world, men with preternatural reflexes and freakish hand-eye coordination, are left to wave futilely at pitches or are so flummoxed they can't even swing.

I remember watching Clemens pitch against the Yanks in '97 and wondering how in the hell anyone ever hit the guy. I remember watching Pedro against the Yanks in September '99, the game he struck out 17, and feeling sorry for Yankee batters. Jorge Posada couldn't even get the bat off his shoulder. He had no idea what was coming - 96 mph fastball, or slider, or change-up or curveball. Yankee after Yankee left the plate after striking out shaking their head on the way back to the dugout, no doubt feeling the way Mickey Mantle felt after facing Koufax for the first time in the '63 World Series, when he said to the umpire as he was turning to leave the plate after striking out, "now how in the hell am I supposed to hit that shit."

But for sheer dominance, the ability to induce not only helplessness in big league batters but terror, there has perhaps never been anyone like Randy Johnson. It was sometimes like watching little league baseball, where the big kid is on the mound, the one that seemed to mature about two years ahead of the rest of the kids, and the ball is blowing by the batter before they can even think about swinging. One kid gets smoked and the next batter approaches the batter's box looking like they're going to the gallows. They have no chance. When he was at his best, that was Randy Johnson on the mound. Too big, too nasty, too fast. And that slider - Christ, you pitied lefthanded hitters who had to face Randy Johnson.

The ironic thing is that Johnson was arguably never considered the best pitcher in the game during his prime. Before you say, "hey, wait a minute...", consider this: Maddux was off the charts in '94 and '95, throwing strike after strike without ever hitting the white of the plate. Clemens was spinning back-to-back pitching triple crowns in '97 and '98. And then Pedro was dominating from '99 to '02, like a Marichal with more speed and a Hoffman-like change-up. Note that I didn't say Johnson was never actually the best during this period - he unquestionably was in 2001 when Pedro missed half the season. I said he was never considered the best. Johnson had the misfortune during his peak of '93 to '02 of always seeming to be in the shadow of another all-time great. Even in 2001 Johnson was a bit overshadowed by his loquacious and self-promoting mound-mate, he of the Bloody Sock.

During his ten-year peak, Johnson posted seasons of 18-2 (the strike-shortened '94 season), 20-4, 21-6 and 24-5. After sulking his way through the first half of the season in Seattle in '98, he went to the National League and went 10-1 with a 1.28 ERA in eleven starts for the Astros. I remember thinking as Johnson was doing this, "man, those NL batters have never seen anything like this."

Judging just by the W-L records, Randy Johnson was as close to unbeatable for those ten years as anyone has ever been in major league baseball. He was 175-58, for a .751 winning percentage. Only Grove had a comparable winning percentage across a similar number of decisions, going 172-54 for a .761 winning percentage between '27 and '33. There was a difference, however. Grove was pitching for Connie Mack's Athletics, the team that sent Ruth and Gehrig packing for home at the end of the season in '29, '30 and '31. Grove was lavished with spectacular run support by Foxx, Simmons and Cochrane. He pitched to one of the all-time great field generals in Mickey Cochrane. He had a hell of a supporting cast. Johnson received generally good run support from his Seattle and Arizona teams, but it was actually slightly less than team average, and needless to say the overall quality of those teams didn't approach the Athletics of the 30's.

The team relative analysis for Randy Johnson yielded a number that made me go back and double-check the formulas in the spreadsheet. After those checked out, I reconsidered the whole concept of the team relative analysis as a metric for pitchers. But I think its validity still holds. I had to reconsider the concept and double-check the methodology because it produced for Randy Johnson a result so outlandish, so amazing, that it was hard to believe.

Between '93 and '02, Randy Johnson outperformed his team, after adjusting for factors other than his own performance that might have affected his W-L record and his team's W-L record, by 50%. Yeah, you read that right - 50%.

It's a figure that significantly exceeds Seaver and Maddux. And it's a figure that I'm pretty certain will exceed Koufax's figure for his peak period of '62 to '66. If I don't miss my guess, it's a figure that only Walter Johnson will be able to approach over a decade period. And even Walter won't hit Randy's mark unless either his run support was significantly worse than other Senator pitchers received or those Senator pitching staffs were better than they appear to be at first glance.

This result is not the product of any significant adjustment to Johnson's record produced by the methodology. A straight comparison of Johnson's W-L record to his teams' records in games other than those where Johnson got the decision shows that Johnson outperformed his team by nearly 46%. After adjusting for the fact that (i) Johnson's run support was slightly below team average and (ii) Johnson was outperforming a very good D'back pitching staff from '99 to '02, that figure increased to 50%.

Unless I can identify some flaw in this concept I'm forced to reconsider my opinion that Grove was the greatest southpaw in the history of the game. Hell, Randy Johnson might have been the greatest pitcher, period. True, these analyses are restricted to peak periods of about a decade, but it's not like Randy had a short career; he won 300 games, after all. And I think it unlikely that other candidates for greatest ever will have achievements outside their peak decade that will militate in their favor, although Clemens might.

In terms of sheer power-pitching dominance, we were watching Walter Johnson at his peak when we were watching Randy Johnson from '93 to '02. We were watching peak Koufax. We were watching Grove at his very peak, say from '28 to '33. I'm not sure I appreciated that at the time. In fact, I'm pretty certain I didn't. But the Big Unit was indeed that great.

More Team Relative Analyses: Lefty Grove

Posted by Gator Guy on Friday, March 19, 2010 , under Grove, Team Relative Performance | comments (0)

Lefty Grove weighs in at 36%. I didn't realize what tremendous run support Grove got from the Athletics in the early 30's.

It may be a shade behind Seaver and Maddux, but it doesn't change my opinion that Lefty was The Man among post-1920 pitchers. What it does do is make me appreciate how great Seaver and Maddux were. Man, that 35%+ improvement over team performance is one upscale neighborhood.

Reappraisals of Palmer, Bunning and Drysdale

Posted by Gator Guy on , under Team Relative Performance | comments (0)

The analyses of pitcher performance relative to his team have yielded some interesting results.

As I described in the "The Theory of Relativity" post, it is possible to compare a pitcher's W-L record and winning percentage to his team's and adjust for factors that distort the comparison. These adjustments involve adjusting the pitcher's run support to equalize it with the run support the team provided to the other pitchers on the team and normalizing the ERAs and runs allowed by the rest of the staff to league average. These two adjustments assure that a pitcher won't benefit or suffer by virtue of run support that deviated from team average, or by virtue of the fact that the rest of the pitching staff, to whom the pitcher is effectively being compared, were either better than league average or worse than league average. A pitcher may be a great pitcher but his W-L record relative to his team's won't be very impressive if the rest of the team's pitching staff is comprised of great pitchers. In comparing Greg Maddux's W-L record to his team's it is obviously necessary to adjust for the fact that the Braves' pitching staffs were great and produced tremendous winning percentages because of the presence of guys like Glavine and Smoltz.

Here are the new results. Curt Schilling's ten-year peak from 1997 to 2006 was pretty impressive, as his excellent winning percentage and ERAs would suggest. Schilling outperformed his team by approximately 27% over that decade. Bob Gibson outperformed his team by approximately 28% over his nine-year peak of '64 to '72; not sure whether people will find that disappointing or impressive. Both these results obviously cast Ron Guidry in a very good light, because Schilling's and Gibson's team relative performance figures are right in Guidry territory. Ron is in good company.

Here are the results I found surprising. I ran the numbers on Bunning, Drysdale and Palmer. I've always thought of Bunning and Drysdale as being very similar, and I've conceived of Palmer as an American League version of Tom Seaver, although a shade behind Tom Terrific. My team relative analyses have fundamentally changed my appraisals of these pitchers.

I didn't expect Drysdale to fare very well in this analysis, simply because a straight comparison of his W-L records to his teams' isn't very impressive. I expected, however, that Drysdale would benefit from the fact that the Dodger pitching staffs were generally exceptional during his era. Drysdale came in at 9.75%. Now remember that Dave Stieb came in at approximately 17%. That's a big difference. I know Drysdale made it into the Hall in large part because of his participation in a lot of great pennant races and World Series, but those who question Drysdale's HOF bona fides have new ammunition for their argument.

Bunning polled in at 16%. That's a pretty good figure, and it's far better than Drysdale.

Now here was a bit of a stunner. A straight comparison of W-L records for Jim Palmer and his Oriole teams never looked all that impressive, because those Orioles teams posted great records during Palmer's peak from '70 to '78. Still, I expected that Palmer would benefit greatly from the fact that those were great pitching staffs the Orioles fielded in the '70s. Well, they were great staffs in the early '70s, but from '74 to '78 they really weren't all that good when you take away Palmer. For the full nine-year period the Orioles posted a 103.6 ERA+ when you subtract Palmer's ERAs. Good, but not great. After adjusting for the quality of the Oriole pitching, Jim Palmer outperformed his team by 20.4%. That's better than Bunning, but not by much. And it's nowhere near Tom Seaver's neighborhood of 37%.

O.K., Jim Palmer wasn't Tom Seaver. Does that make me reconsider whether Palmer was a legitimate first-ballot HOFer? No, not at all. Eight 20-win seasons in nine years is quite an accomplishment, and Palmer was an undoubted big-game pitcher, posting some pretty impressive pennant race performances and superlative post-season numbers. But he wasn't as good as I thought he was.

As with any other statistic or metric, it's important to put it in context. It is an all too common failing of many fans who fancy themselves sabremetricians that they attach too much weight to a single statistic. Schilling comes out well ahead of Palmer in team relative performance. That tends to confirm what Schilling's great winning percentage and ERAs had already told us: Schilling was a damn good pitcher. But Palmer was rock-solid consistent and was generally a bigger winner than Schilling over their respective peaks, even after adjusting for the difference between the four-man rotation that Palmer pitched in and the five-man rotation Schilling pitched in. Palmer had one stinker in '75 when he had significant arm issues, but was excellent every other year. Every pitcher gets a pass for one season where he had arm issues; Seaver had one during his peak, Guidry had one in '84, and most other great pitchers also had one. Schilling had more than his share of them, however, and was significantly limited in his contribution to his team in '99, 2000, '03 and '05 as a result of poor performance or limited innings due to arm issues. He was excellent when he was on, like Saberhagen, but like Saberhagen you can't just ignore all the seasons where his team didn't get what they justifiably expected.

Bunning would appear at first glance to have had more than his share of off seasons during his peak, but that's deceptive. Bunning was very durable and never really missed time due to injury issues during his peak. Some of his pedestrian W-L records, particularly in 1960, are easily explained: poor team, terrible run support. There's reason to believe that if you put Bunning on those '70s Orioles teams he might have posted another four or five 20-win seasons. But there's also reason to believe he might not have. It has to be noted that Bunning pitched on some good teams that gave him pretty good run support, like the '61 Tigers and '64 Phillies, and he didn't post 20-win seasons. He did well enough, posting good ERAs and winning 17 games in '61 and 19 in '64. But a Hall of Famer should have been winning 20 games, and probably 22 or 23. Bunning had a bit of a tendency to pitch to his team's level. That's not a particularly damning observation, but it's an important consideration.

Back to Guidry for a moment. The myth is that Guidry's spectacular W-L record was in large part a function of great run support from a powerful Yankee offense. It's a myth. The '77 and '80 Yankee teams could hit, no question, but the Yankees of the pennant winning years of '77 to '81 were unquestionably pitching and defense oriented teams. Look at the stats. And Guidry's run support from the Yankees was in any event strictly average for those teams. Guidry outperformed those teams, and those exceptional pitching staffs, because he was an excellent pitcher, made optimal use of his run support, was a clutch pitcher who pitched best in critical situations, and was in short a winner. That term - winner - is a term a lot of the self-styled stat geeks scoff at, but if they looked a little closer at the stats the concept would be plain. Some pitchers are winners. Some aren't. Blyleven wasn't. Steve Rogers wasn't. To a lesser degree Stieb wasn't either. It doesn't mean they weren't good pitchers. But it means they weren't as good as their generally superior ERAs and ERA+s would suggest. The stat geeks should ponder that for a moment.

The Theory of Relativity

Posted by Gator Guy on Thursday, March 18, 2010 , under Sabremetrics, Statistics, Team Relative Performance | comments (0)

I love a lot of the new pitching stats. They're great analytical tools. Take FIP, for example ("fielding independent pitching"). It's based on the proposition that what happens on a ball put in play is frequently a function of random chance and team fielding. Bill James recognized its utility and cited Wally Bunker's 1964 season as an example of a pitcher apparently benefiting from some good luck insofar as his BAbip that year was .216. It turns out that Bunker in fact had a pretty good facility for generating low BAbip's in his career, presumably because, like Maddux in his prime, he was adept at keeping the ball away from the fat part of the bat and inducing batters to hit pitches outside the hitter's sweet spots in the strike zone. But Bunker never again came close to posting the .216 BAbip he posted in '64, despite being backed by the legendary team defense of the '60s Orioles.

FIP tends to understate the effectiveness of a pitcher with a demonstrated ability to consistently generate very low BAbips. Take The Great Rivera, for example. I was skeptical of Mariano's decision in '97 to move almost exclusively to the cutter because it seemed to sharply cut into his strikeouts. "Throw the high fastball!", I would shout, longing for Mariano's incredible strikeout ratio in '96 when he K'd 130 batters in 107 innings. Still, I had to admit that batters seemed almost incapable of getting good wood on the cutter, but bloops and dribblers can and do become hits, while strikeouts can't and don't. Bloops and dribblers that found holes in the defense became, in my mind, "Mariano Specials." Obviously Mariano's decision to go with the cutter has been thoroughly vindicated and my early concerns were unfounded. But Mariano never fares too well in the FIP stat, and that's misleading because Mariano has demonstrated an ability to consistently generate low BAbips (Mariano's career BAbip is .265, as compared to a major league average of .299).

As Bill James has noted regarding FIP and various other new and sophisticated measures of pitching performance, they have a tendency to throw out a lot of information in an effort to isolate and identify a pitcher's performance independent of non-pitching factors. Bill is a little unsettled by this, and so am I. As he's argued, W-L records are the antipode to FIP and similar stats, incorporating all information, including unfortunately things that have nothing to do with a pitcher's performance, like offensive support and team fielding. However, the inclination of the stat geeks to summarily dismiss W-L records is extremely misguided. It is possible to start with W-L records and make appropriate adjustments, and that's what I'm about to propose.

The Theory of Relativity, in contrast to FIP, throws out nothing but attempts to adjust for everything (or at least most things) that happens outside of the pitcher's performance. Simply put, it compares a pitcher's W-L record to his team's record in games where the pitcher was not the pitcher of record (i.e., it subtracts the pitcher's W-L record from the team's), adjusting for factors that affect the pitcher's and team's W-L records but are largely unrelated to the pitcher's own performance. If a pitcher received run support better or worse than the run support a team generally provided its pitchers, the pitcher's W-L record is adjusted (via the Pythagorean theorem) to reflect what his W-L record would have been had he received run support equal to his team's average. It also adjusts for the performance of the rest of the team's pitching staff, because even a good pitcher who receives excellent run support will appear to fare poorly relative to his team's W-L record if the rest of the starting pitching staff is comprised of Walter Johnson, Pete Alexander, Tom Seaver and Randy Johnson, with Gossage, Eckersley and Rivera coming out of the bullpen.
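To make the run-support adjustment concrete, here is a minimal Python sketch of one way it could be done. The exact mechanics aren't spelled out above, so this rests on an assumption: it computes the pitcher's Pythagorean projection twice, once with his actual run support and once with team-average support, and moves his actual W% by the difference. The function names, the classic exponent of 2, and the per-game figures in the example are all hypothetical.

```python
def pythagorean_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Bill James's Pythagorean projection: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def adjust_for_run_support(actual_wpct, support, team_support, runs_allowed,
                           exponent=2.0):
    """Shift a pitcher's actual W% to what team-average run support implies.

    Assumption (not stated in the post): the shift equals the change in the
    Pythagorean projection when the pitcher's own run support is swapped
    for the team-average figure, holding his runs allowed constant.
    """
    projected = pythagorean_wpct(support, runs_allowed, exponent)
    neutral = pythagorean_wpct(team_support, runs_allowed, exponent)
    return actual_wpct + (neutral - projected)


if __name__ == "__main__":
    # Hypothetical pitcher: 4.00 runs/game of support on a team that
    # averaged 4.20 for its pitchers, while allowing 3.50 runs/game himself.
    adjusted = adjust_for_run_support(0.600, 4.00, 4.20, 3.50)
    print(f"adjusted W%: {adjusted:.3f}")
```

A pitcher with below-average support comes out with a higher adjusted W%, and one with above-average support comes out lower, which is the direction of adjustment the method calls for.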

These pitching staff adjustments are accomplished by taking the team's ERA+ (exclusive of the subject pitcher's own ERA+) and adjusting the team's W-L record to reflect what it would have been had the rest of the staff generated a 100 ERA+ (again, based on the Pythagorean theorem). It simply takes the team's ERA+ exclusive of the subject pitcher's ERA+, calculates the runs allowed or saved by the staff's performance above or below the assumed 100 ERA+, and adds or subtracts those incremental runs to the team's runs allowed. A Pythagorean record is then generated assuming a league-average staff.
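The staff adjustment can be sketched the same way. This Python snippet assumes that ERA+ is 100 times league ERA over the staff's ERA (ignoring the park adjustment), so a staff with an ERA+ of 106.9 would have allowed 6.9% more runs had it pitched at league average. It also assumes all runs allowed scale with ERA+, which glosses over unearned runs. The function names and the team totals in the example are hypothetical.

```python
def league_average_runs_allowed(runs_allowed, era_plus):
    """Runs the rest of the staff would have allowed at a 100 ERA+.

    Assumes runs allowed scale in proportion to ERA+ / 100.
    """
    return runs_allowed * era_plus / 100.0


def staff_neutral_wpct(runs_scored, runs_allowed, era_plus, exponent=2.0):
    """Pythagorean W% for the team with its staff reset to league average."""
    neutral_ra = league_average_runs_allowed(runs_allowed, era_plus)
    rs = runs_scored ** exponent
    ra = neutral_ra ** exponent
    return rs / (rs + ra)


if __name__ == "__main__":
    # Hypothetical team totals in games not decided by the subject pitcher.
    scored, allowed, era_plus = 700, 650, 106.9
    extra = league_average_runs_allowed(allowed, era_plus) - allowed
    print(f"incremental runs allowed: {extra:.1f}")
    print(f"staff-neutral W%: {staff_neutral_wpct(scored, allowed, era_plus):.3f}")
```

For a staff with an ERA+ below 100 the same formula subtracts runs instead of adding them, so a pitcher on a poor staff sees his team's adjusted record improve.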

Once you've adjusted the pitcher's record for run support and adjusted the team's record for the rest of the pitching staff's performance, you compare the pitcher's adjusted W-L record to his team's adjusted W-L record. The impact of run support on the pitcher's W-L record relative to his team's is thereby eliminated, and the impact of the rest of the staff's performance on the team's W-L record is similarly eliminated. A good pitcher will have an adjusted W-L record much better than his team's adjusted record, and a poor pitcher will have a worse one. Measuring the difference between the adjusted records of the pitcher and the team provides a good measure of the pitcher's performance. It doesn't expressly adjust for team defense (a notoriously difficult aspect of team performance to measure), but it implicitly incorporates it because bad team defense will lower the denominator representing the team's W-L record and therefore increase the relative impact of the pitcher's W-L record (adjusted for run support) relative to his team's W-L record (adjusted for the performance of the rest of the pitching staff).

The concept of simply comparing a pitcher's record to his team's is not novel, but the defects in the system became apparent to me when I was comparing Phil Niekro's relative W-L record to Don Sutton's. Even if the records were adjusted for variations in run support, Sutton would still tend to fare poorly compared to Niekro because Niekro would benefit by being compared to the poor Braves pitching staffs of the '70s, while Sutton would suffer from being compared to the generally excellent Dodger pitching staffs of the '70s. It was easy for Niekro to outperform the sub-average pitchers on the Braves staff, but more difficult for Sutton to outperform the Tommy Johns, Claude Osteens and Andy Messersmiths who generally populated the Dodger staffs. It's fairly easy, however, to adjust for this, and the conceptual validity of the adjustment should be obvious. Still, the process of collating the team pitching data from different years, incorporating it into the adjustment formulas and generating the Pythagorean adjustments is a little involved and so for the moment I'll only present an analysis of three pitchers: Tom Seaver, Ron Guidry and Dave Stieb.

I selected these three pitchers because I thought they would be illustrative. Bill James has noted how spectacular Seaver's winning percentage was given the generally mediocre nature of the Mets teams he pitched for in the late '60s/early and mid-70s. I selected Guidry because I knew that his record was spectacular even after accounting for the fact that the Yankees teams he pitched for were generally pretty good, but I didn't know how his relative record had been affected by his run support and the quality of the Yankee pitching staffs. And I selected Stieb because (i) I knew that he had significantly underperformed relative to Pythagorean projections during his prime years in the early and mid-80s, and (ii) I was tired of beating up on Bert Blyleven. (I knew Blyleven also underperformed his Pythagorean projections in his prime, but I genuinely like the guy and he was by many measures a borderline great pitcher - certainly better than Stieb - albeit not a Hall of Famer).

I compared nine-year peaks for each of the pitchers. This was convenient because both Guidry and Stieb had distinct nine-year peaks that account for all of their superior seasons. One could select various nine-year periods for Seaver, because his peak extended well beyond nine years, but I selected his first nine seasons, comprising substantially his entire Met career. I'll begin the comparison by noting some things that you probably already know. For instance, the Mets were not a good team once you subtract Seaver, notwithstanding their two NL pennants and their '69 World Series championship. Their team winning percentage from '67 to '75 was .495 (mediocre, but not bad), but was only .463 once you subtract Seaver's .636 winning percentage from the equation, and that obviously stinks. You probably also knew that the Mets' problem was poor hitting. They actually had very good pitching, even after stupidly trading Nolan Ryan, posting a team ERA+ of 108 from '67 to '75. Once you subtract Seaver's superlative ERAs, however, the team ERA+ was 102.2. That's not great, but it's pretty good considering the staff's ace pitcher is excluded. Another way to look at it is that the Mets staff was above average even without the great Seaver.

I was somewhat surprised by how good Stieb's winning percentage was from '82 to '90. He was 135-90 for a very good .600 winning percentage. But I was also slightly surprised by how good the Jays teams were in that period. They had a .548 winning percentage, and were generally a pretty good team even aside from the excellent '85 and '87 seasons, other than in '82. Even subtracting Stieb's W-L record the Jays still had a .539 winning percentage. I was very surprised, however, by how good the Jays pitching was in that period. They had a team ERA+ of 109.9 and an ERA+ of 106.9 even after subtracting Stieb. Even without Stieb the Jays staff in the '80s was as good as the Yankees pitching in the period '77 to '85 (primarily because the Yankees pitching sagged significantly from '82 to '84). Jimmy Key and Doyle Alexander were no slouches, and Jim Clancy was a pretty good No. 4 starter. And the Tom Henke-led bullpen was generally pretty solid and sometimes excellent.

The Yankees had a team winning percentage of .575 from '77 to '85, and were well over .500 every year other than '82. The Yanks' winning percentage drops to .552 without Guidry, still very good but not that much better than the Jays' .539 W% without Stieb. The Yanks' pitching was better than the Mets' but not as good as the Jays', posting an overall 106.3 ERA+ and a 103.7 ERA+ without Guidry. The period of '77 to '85 was really a tale of two Yankee pitching staffs: the excellent staff from '77 to '81 and the generally mediocre staff from '82 to '85.

On the offensive support side both Guidry and Seaver received run support slightly better than team average, in each case about 3%. Stieb's run support was 1.2% below team average. Accordingly Guidry's and Seaver's adjusted W% was slightly lower than their actual W% and Stieb's slightly higher. The adjustments were quite small in each case, with Stieb's W% going up from .600 to .606. Guidry's adjusted W% dropped 18 points to .679 and Seaver's dropped 14 points to .622.

The big beneficiary of the adjustment to team W% by assuming an average pitching staff was Stieb. The Jays W% (exclusive of Stieb) drops from .539 to .518. A Jays staff with a 100 ERA+ would have added about 36 runs per year to the Jays' runs allowed total.

The effects of these adjustments were essentially negligible for Guidry and Seaver, with the reduction in their personal W%'s being largely offset by the reduction in the team W% resulting from translating their good team pitching staffs into average staffs. Stieb, by contrast, saw a significant increase in his W% relative to his team's. Simply comparing Stieb's .600 W% to his team's .548 W% shows that Stieb outperformed his team by 9.5%. Adjusting for run support and pitching staff, however, increases Stieb's relative performance figure to 17%. That's a pretty good figure, and though I've not yet run the figures for various HOFers I'm willing to bet that it compares favorably to some of the more marginal inductees.

Seaver outperformed his team after adjusting for run support and pitching staff by a tremendous 37%, which is almost precisely the figure obtained by comparing his straight W% to his team's.

Guidry outperformed his team after adjusting for run support and pitching staff by 27%, which represents less than a one point increase over the approximately 26% figure obtained by comparing his .697 W% to his team's .552 W% without Guidry.
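The "outperformed his team by X%" figures above appear to be simple ratios of winning percentages. The short Python check below treats that as an assumption and runs it against the W% figures quoted in this post. The function name `relative_performance` is mine, for illustration only.

```python
def relative_performance(pitcher_wpct, team_wpct):
    """Fraction by which the pitcher's W% exceeds the team's (0.17 means 17%)."""
    return pitcher_wpct / team_wpct - 1.0


if __name__ == "__main__":
    # W% pairs as quoted in the post: (pitcher, team).
    cases = {
        "Stieb, unadjusted (.600 vs. .548)": (0.600, 0.548),
        "Stieb, adjusted (.606 vs. .518)": (0.606, 0.518),
        "Guidry, unadjusted (.697 vs. .552)": (0.697, 0.552),
        "Seaver, unadjusted (.636 vs. .463)": (0.636, 0.463),
    }
    for label, (pitcher, team) in cases.items():
        print(f"{label}: {relative_performance(pitcher, team) * 100:.1f}%")
```

The ratios land on the roughly 9.5%, 17%, 26% and 37% figures cited above, so the simple-ratio reading looks right.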

Just to give some idea of how astounding Seaver's figure is, my preliminary calculations appear to suggest that Koufax outperformed his team during his historic five-year run from '62 to '66 by slightly north of 40%. Seaver's 37% relative performance figure maintained over a nine-year period, therefore, appears to be a historic feat, and I'm willing to bet that few other pitchers since 1920, if any, can match it.

Stieb's figures demonstrate how a pitcher who had run support below team average and pitched on a good staff can have actually outperformed his team by a larger margin than a simple comparison of W% between pitcher and team would indicate. On the flip side, a pitcher whose performance relative to his team's at first glance appears to be superlative can be revealed as a fundamentally average pitcher if he received both great run support relative to his team's average run support and pitched on a team with an inferior pitching staff. Obviously neither Seaver nor Guidry are examples of this, and I'm not sure off the top of my head which pitcher might fit this profile. I know Andy Pettitte has received tremendous run support throughout his career, but he's also pitched on generally excellent pitching staffs. If anyone can suggest such a pitcher in the comments section I'd appreciate it. I'm going to start looking by first identifying poor pitching staffs from recent decades and then examining the run support received by their starting pitchers.

The performance of Seaver, Guidry and Stieb relative to each other was not a complete surprise. For one thing, Stieb slightly underperformed his Pythagorean record from '82 to '90, compiling a .600 W% relative to a .613 Pythagorean projection (Stieb significantly underperformed the Pythagorean projection during his very best years of '82 to '85, indicating that he slightly outperformed his Pythagorean projection over the balance of his nine-year stretch). The Pythagorean comparison doesn't provide for any of the adjustments in the Relativity method I've described, but it does indicate that Stieb didn't make particularly good use of his run support. Guidry, by contrast, hugely outperformed his Pythagorean projection from '77 to '85, posting a .697 W%, more than 40 points higher than his .654 Pythagorean projection. That's a big difference. Seaver underperformed his Pythagorean projection but by an insignificant amount, posting a .636 W% from '67 to '75 as compared to a .641 Pythagorean projection, well within the margin of error in Pythagorean projections.
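The Pythagorean gaps quoted above are easy to verify. The sketch below assumes the classic form of the Pythagorean projection with an exponent of 2; the post doesn't say which exponent was used for these projections, so treat that default, and both function names, as my own assumptions.

```python
def pythagorean_wpct(runs_scored: float, runs_allowed: float,
                     exponent: float = 2.0) -> float:
    """Projected W% from runs scored and allowed (exponent 2 is assumed)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def points_vs_projection(actual_wpct: float, projected_wpct: float) -> int:
    """Gap between actual and projected W%, in points (thousandths)."""
    return round((actual_wpct - projected_wpct) * 1000)


# Actual and projected W% figures as quoted in the post.
for name, actual, projected in [
    ("Stieb", 0.600, 0.613),
    ("Guidry", 0.697, 0.654),
    ("Seaver", 0.636, 0.641),
]:
    print(f"{name}: {points_vs_projection(actual, projected):+d} points")
# Stieb: -13 points
# Guidry: +43 points
# Seaver: -5 points
```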

What did we learn by comparing a pitcher's performance to his team's after making the Relativity adjustments? Well, without having finished fully computing the figures for a meaningful number of other pitchers, I think we learned that Stieb was a pretty good pitcher; Seaver, as one must have expected, was a truly great pitcher and a worthy member of the inner sanctum in the Hall of Fame; and Guidry was precisely between Stieb and Seaver. My own takeaway is that the gap between Seaver and Guidry was about what I'd expected: it's significant, because Seaver is unquestionably among the very elite in the history of baseball, and Guidry, although deserving of HOF induction in my opinion, is admittedly a marginal candidate if one focuses solely on career statistics and ignores the astounding big-game record and his degree of dominance over a decade. I think the Relativity analysis also suggests strongly that the gap between Stieb and Guidry is about as big as the gap between Guidry and Seaver. It's significant, and it belies any comparison of the two based on nothing more than ERA+.

The results for Stieb and Guidry confirm a few things and dispense with a few myths. They confirm that Guidry's improved performance in high leverage situations translated into incremental wins, and Stieb's poor performance in high leverage situations translated into incremental losses. Stieb may have had the superior ERA+, but Guidry's LevERA+ was distinctly superior, and the difference explains in part the disparity in their ability to outperform their teams. The Relativity analysis also dispenses with the myth that Guidry's outstanding career winning percentage was just a product of good run support and great teams. Guidry did indeed get good run support and pitched for good teams, but the fact remains that he outperformed his teams by a huge margin. A .600 winning percentage for a Yankee pitcher in the years '77 to '85 would be good but not that much better than the Yanks' record for those years. A .697 winning percentage, however, is spectacular even after adjusting for run support and the quality of the Yankee teams.

Based on what we've seen so far I think it's clear that elite pitchers will outperform their teams by 17% after adjusting for run support and the quality of the rest of the pitching staff. All time greats - and I mean pitchers among the top six or eight of all time - may outperform their team on an adjusted basis by more than 35%. And it should be clear that pitchers who outperform their team on an adjusted basis by more than 25% are no doubt Hall of Famers. If there are any doubts about that, the Relativity analyses of pitchers like Drysdale, Bunning, Sutton, Niekro, Ryan, Palmer are likely to resolve those doubts.

UPDATE: I just ran the numbers for Greg Maddux for the period '92-'02. He's an interesting case, of course, because he pitched on such great pitching teams, and so his 15% outperformance of his team's record on a straight comparison of W% could be expected to rise significantly. But - wow. Maddux shoots up to a relative performance index of 42% when adjusted for run support and pitching staff. I didn't appreciate how poor Maddux's run support was relative to team average. The Braves scored 4.86 runs/game when Maddux wasn't pitching, but only 4.41 for Greg. Maddux is just north of Seaver's 37%. I guess no one should be surprised.

The Guidry Decade

Posted by Gator Guy on Monday, March 15, 2010, under Guidry

I've noted before the fact that Ron Guidry is the only pitcher in baseball history to lead the major leagues in wins and lead his own league in ERA and SO over a ten-year period and yet be rejected by the Hall. He averaged nearly 17 wins per season in the decade between '77 and '86 and had a 3.23 ERA (121 ERA+). When apprised of Guidry's achievement, my fellow baseball fans have had remarkably similar reactions, initially expressing some surprise at Guidry's accomplishment but then arguing that Guidry's statistics during this period, while impressive, were pre-eminent during this period only because this decade happened to occur at an odd interregnum in baseball, when greats like Seaver, Palmer and Carlton had just passed their prime and before the rise of Clemens, Maddux, Johnson and Martinez. They suggest that Guidry's performance really wouldn't have been that exceptional in any other era in baseball.

I must admit that I was inclined to give some credence to this argument. I assumed the win total wouldn't be that impressive when compared to all the titans who pitched during the eras of four-man rotations that prevailed in baseball until the '80s. I believed it was probably true that averaging about 17 wins a season over a decade while posting an ERA+ of 120 or greater was not all that unusual during many other eras in modern baseball history, and so I decided to check the record book. It turns out I was wrong. Averaging nearly 17 wins a season over a decade while compiling an ERA 20% better than the park-adjusted ERAs of your contemporaries has always been an achievement only the greats have attained. It turns out that this level of excellence over a decade gives a pitcher an almost automatic entree into Cooperstown. By my count, there have been 27 pitchers who accomplished this since 1920. All but four have already been inducted into the Hall of Fame or are almost certain to be inducted upon eligibility. And it further turns out that Guidry's accomplishment is becoming exceedingly rare in the age of the five-man rotation and seven-inning starts.

The last pitcher to average as many wins as Guidry did over a decade was Randy Johnson from '97 to '06. Maddux came very close during the same period but finished a win shy of equaling Guidry's 16.9 wins per season. Maddux averaged 16.9 or more wins per season from '95 to '04 and for each of the preceding seven ten-year periods* (i.e., every ten-year period commencing between 1988 and 1994). Since Guidry, only five pitchers have averaged 16.9 wins or more per season over a ten-year period while compiling an ERA+ of 120 or better: Maddux, Johnson, Clemens, Glavine and Mussina. Only two pitchers currently pitching have a realistic chance at accomplishing the feat any time in the next four years - Halladay and Santana. Halladay needs to win 39 games in the next two seasons to do it. Santana can do it by winning 70 games over the next four years. Although each has a realistic chance, the odds are long. (For the most recent ten-year period - 2000 to 2009 - Andy Pettitte led all major league pitchers with 148 wins.)

Guidry's statistical accomplishments between '77 and '86 were relatively rare even in the day of the four-man rotation, at least in the American League. Just as only Clemens and Mussina matched Guidry's feat in the AL over the 25 years since Guidry did it, only three American League pitchers accomplished the feat in the 30 years before Guidry. Bob Lemon did it for each of the ten-year periods concluding in '55, '56 and '57, averaging 19.7 wins per season with a 122 ERA+ during his best ten-year stretch. Whitey Ford did it for each of the ten-year periods concluding in '63, '64 and '65, averaging 17.3 wins with a 136 ERA+ during his best ten-year stretch. And Jim Palmer did it for each of the ten-year periods ending in '77, '78, '79, '80, '81 and '82, averaging 19.2 wins and a 139 ERA+ from '70 to '79.

In the history of the American League since 1920, only ten pitchers have averaged 16.9 wins per season and a 120 ERA+ over a ten-year span: Grove, Ferrell, Feller, Newhouser, Lemon, Ford, Palmer, Guidry, Clemens and Mussina. Only four other pitchers have averaged 16.9 wins per season in the AL since 1920: Hunter, Morris, Lolich and Wynn. There have been only two ten-year periods in the American League in which as many as two pitchers have accomplished this: Grove and Ferrell from '28 to '37 and Grove and Ruffing from '31 to '40. Feller and Newhouser would have done it within the same ten-year period if Feller hadn't lost four years to military service. In other words, had Guidry's magnificent decade occurred at any other time in AL history since 1920 he would have been one of only three pitchers at most to accomplish this feat in that ten-year period.

There was only one brief era in modern baseball history - the late '60s to late '70s - when there were more than three pitchers in any ten-year period to average 16.9 wins per season while maintaining a 120 or better ERA+. There were five pitchers to accomplish the feat in the decade from '69 to '78: Jenkins, Perry, Palmer, Carlton and Seaver. There were four pitchers in each of the decade periods of '68 to '77 and '71 to '80, and it was the same four pitchers for each period: Perry, Palmer, Carlton and Seaver.

In other words, Guidry's statistical accomplishments from '77 to '86 would have placed him among the very top echelon of elite pitchers in any era, generally accompanied by only one or two other pitchers in averaging nearly 17 wins per season and a 120 ERA+.
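For readers who want to run this kind of search themselves, here is a rough Python sketch of the ten-year-window test. It is not the spreadsheet I used. The `Season` record and the function name are hypothetical, and the decade ERA+ is only approximated by an innings-weighted average of single-season ERA+ figures, whereas a proper decade ERA+ would be rebuilt from earned runs, league ERA and park factors. The sketch also skips the strike-season adjustment described in the footnote.

```python
from dataclasses import dataclass


@dataclass
class Season:
    # Hypothetical record layout; fill it from whatever source you trust.
    year: int
    wins: int
    innings: float
    era_plus: float


def qualifying_windows(seasons, span=10, min_avg_wins=16.9, min_era_plus=120.0):
    """Return (first_year, last_year) for every window meeting both thresholds.

    Years missing from the data count as zero wins and zero innings.
    ERA+ over the window is an innings-weighted approximation.
    """
    by_year = {s.year: s for s in seasons}
    if not by_year:
        return []
    hits = []
    first, last = min(by_year), max(by_year)
    for start in range(first, last - span + 2):
        window = [by_year[y] for y in range(start, start + span) if y in by_year]
        innings = sum(s.innings for s in window)
        if innings == 0:
            continue
        avg_wins = sum(s.wins for s in window) / span
        approx_era_plus = sum(s.era_plus * s.innings for s in window) / innings
        if avg_wins >= min_avg_wins and approx_era_plus >= min_era_plus:
            hits.append((start, start + span - 1))
    return hits


# Made-up career, for illustration only: twelve identical seasons.
career = [Season(year, 17, 200.0, 125.0) for year in range(2001, 2013)]
print(qualifying_windows(career))
# [(2001, 2010), (2002, 2011), (2003, 2012)]
```

Running the same function over every pitcher since 1920 and grouping the hits by window would show how many pitchers qualified in any given decade.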

* * * * * *

Here's the full roster of pitchers on the 17 wins/120 ERA+ list since 1920, listed in chronological order (click here to see a spreadsheet with the full details of their peak ten-year stretches):

Besides Guidry, there are three other pitchers on this list who failed to make the Hall of Fame. Two of them, Wes Ferrell and Lon Warneke, had their peak years in the '30s. The other, Bucky Walters, had his peak years in the late '30s and during the war years. They had remarkably similar careers, each winning between 192 and 198 games and compiling ERA+s between 115 and 119. Each won between 170 and 175 games during their peak decade, meaning that for each pitcher his peak decade comprised substantially his entire productive career. Of the three, Walters was the only one to receive more than token support among HOF voters, twice topping 20% in the balloting in the mid-60s.

Bill James argues in his recent Gold Mine article that none of these three pitchers have as strong a case as Guidry for inclusion in the Hall. According to Bill, none had as many seasons as Guidry in which he ranked among the very best pitchers in the league, and though each had one or more truly superlative seasons, none had a season of historical significance comparable to Guidry's '78 season. As Bill saw it, Ferrell had a marginal case for the Hall, but Walters and Warneke fell distinctly short of the standards HOF voters have historically applied to pitchers.

I generally agree with Bill's analysis (although I think he sells Ferrell a little short). I would make another point, however. While each of Walters, Ferrell and Warneke was among the leading winners in baseball during his peak decade, their ten-year win totals (175 for Ferrell and Warneke; 170 for Walters) were not particularly notable for the period. Certain of their contemporaries, as well as premier pitchers in succeeding decades, far exceeded their 10-year win totals, with Grove, Hubbell, Spahn, Marichal, Feller, Roberts, Lemon and Jenkins all averaging approximately 20 wins per season or more. Guidry's ten-year total, by contrast, ranks with the very best ten-year win totals during the era of the five-man rotation. Since Guidry's ten-year peak, only Maddux, Glavine and Johnson have eclipsed Guidry's tally by as much as one victory per season, and only Maddux's best 10-year stretch ('92 to '01) topped Guidry's total by two victories per season.

Guidry won almost as many games in a decade as did Ferrell, Warneke and Walters, despite pitching in an era where the five-man rotation and sharp decline in complete games have rendered the 17-win season the functional equivalent of the 20-win gold standard of prior eras. That fact alone would seem to distinguish Guidry from the three other pitchers who have failed to make the Hall despite a decade of averaging 17 wins per season and a 120 or better ERA+, and would seem to dictate that he join the 23 other pitchers who have accomplished this feat and are either already in the Hall or on a glide path to Cooperstown.

* * * * * *

Guidry's ERA+ during his peak decade is roughly equivalent to the peak-decade ERA+ of seven Hall of Fame pitchers on the list of those who averaged 16.9 wins or more and a 120 ERA+ over a ten-year stretch, and it is equal to or better than various Hall of Fame pitchers who didn't make the list. The following HOF pitchers on the list had an ERA+ under 130 during their best 10-year stretch: Bob Lemon (122), Don Drysdale (121), Ferguson Jenkins (123), Mike Mussina (129), Red Ruffing (124), Robin Roberts (123), Steve Carlton (127), Warren Spahn (128). The following HOF pitchers narrowly missed making the list and had an ERA+ during their peak decade roughly equivalent to or lower than Guidry's:

Jim Bunning, 164 wins, 124 ERA+
Don Sutton, 164 wins, 120 ERA+
Nolan Ryan, 160 wins, 116 ERA+
Early Wynn, 188 wins, 116 ERA+
Lefty Gomez, 160 wins, 127 ERA+
Waite Hoyt, 166 wins, 114 ERA+
Eppa Rixey, 166 wins, 119 ERA+
Herb Pennock, 163 wins, 116 ERA+
Catfish Hunter, 184 wins, 111 ERA+

* * * * * *

Let me anticipate the reaction of many: "O.K., Guidry had a ten-year peak comparable to many Hall of Famers, perhaps even most, but he had no career to speak of outside of that ten-year peak, and had a shorter career than virtually all HOF pitchers other than Koufax and Dean." I would respond, simply, by asserting that (i) most HOF pitchers have very little to speak of in terms of HOF-worthy accomplishments outside of their peak decade, and (ii) Guidry's career was as long as, or virtually as long as, more than a dozen HOF pitchers.

Guidry pitched in 14 seasons and pitched enough innings to qualify for the ERA title in ten of those. Guidry's ten full seasons are as many or more than Koufax, Dean, Gomez, Lemon, Walsh, Chesbro, McGinnity, Sutter, Joss and Waddell. The following pitchers only had 11 full major league seasons: Hunter, Newhouser, Vance, Haines and Coveleski. Drysdale and Three-Finger Brown each pitched in 14 major league seasons, twelve of which were substantially full seasons. If Guidry's career was too short, the shortfall seems too insignificant a reason to exclude him from the Hall.

The list of HOF pitchers with far longer careers but virtually no HOF qualifications outside their peak decade is a much longer list. For every Spahn, Maddux or Carlton who truly had more than ten HOF worthy seasons there are two or three HOF pitchers whose accomplishments outside their ten-year peak did little more than pad their career statistics.

Let's begin by taking two very striking examples: Early Wynn and Don Sutton, each of whom had very long careers and joined the cherished 300 win club.

Wynn's peak decade was '50 to '59, during which he won 188 games and had a 116 ERA+. On either side of that peak decade Wynn was 83-94 with a 92 ERA+ ('39 to '49) and 29-31 with a 105 ERA+ ('60 to '63). Outside of his peak decade, Wynn had two seasons where he won more than 13 games: 1943, when he went 18-12 with a 110 ERA+, and 1947, when he went 17-15 with a 103 ERA+. Neither season was the equal of his average season during his peak decade.

Sutton's peak decade was '71 to '80, in which he won 164 games with a 120 ERA+. On either side of that peak Sutton was 66-73 with a 95 ERA+ ('66 to '70) and 94-81 with a 102 ERA+. Like Wynn, Sutton didn't have a single season outside his decade peak where his W-L record and ERA approached his average season during his peak. Like Wynn, he didn't have a single season outside of his peak that, if replicated over a 10 or 12 season period, would have given him a credible argument for the HOF.

The same is true of many other pitchers who most would agree are greater than Wynn or Sutton. Take Bob Gibson, for example. Gibson had a brilliant peak between '63 and '72, winning 191 games and posting a 136 ERA+. Gibson didn't have a single season outside of that peak decade that would have qualified him for the list I've discussed in this post if replicated over a decade (i.e., 16 or more wins and a 120 ERA+). His '62 season was by many measures an excellent season (he led the league with a 151 ERA+), but his 15-13 record for a winning team was worse than any of his peak decade years.

Carl Hubbell had a ten-year peak very similar to Gibson's where he was a consistently big winner with superlative ERAs. Outside that peak decade he didn't have a single season where he won more than 11 games.

The following HOF or presumptive HOF pitchers didn't have a single season outside their ten-year peak in which they won 15 games and had a 112 or better ERA+: Drysdale, Lemon, Newhouser, Marichal, Wynn, Ruffing, Roberts, Vance, Hubbell, Sutton, Gomez, Hunter, Feller, Coveleski, Pedro Martinez and Curt Schilling. The following pitchers had exactly one such season outside their peak decade: Bunning, Jenkins, Palmer, Gibson, Ryan, Hoyt and Randy Johnson.

The fact is that most HOF pitchers were truly great for a period of about ten years. Pitchers like Walter Johnson, Maddux, Spahn, Clemens and Seaver who had multiple outstanding seasons outside their peak decade are the exception, not the rule. It is clear that Ron Guidry had a peak decade that is comparable to the peak decades of many Hall of Famers - Bunning, Drysdale, Lemon, Wynn, Sutton, Gomez, Hunter, Jenkins, Ruffing and Roberts, among others. It is also clear that none of these pitchers did anything outside of their peak decade that materially added to their HOF qualifications.

* * * * * *

I would humbly submit that by any statistical measure Guidry's HOF qualifications are the equal of Bunning's, Drysdale's, Lemon's, Newhouser's, Vance's and Gomez's. To the extent they won more games in their career it is because they pitched in the era of four-man rotations. I would also submit that Guidry's HOF qualifications are the equal of Ruffing's, Hunter's, Sutton's and Niekro's. To the extent they won more games than Guidry they did so primarily because they had many more seasons where they were perhaps competent major league pitchers but not HOF quality pitchers.

There will no doubt be those who argue that many of these pitchers don't meet their particular idea of HOFers. Hunter, Bunning and Drysdale are examples of more recent HOF inductees who are frequently characterized as marginal inductees. Vance, Newhouser, Coveleski, Pennock, Hoyt and Faber are just a few examples of other pitchers who have been deemed by many to be marginal HOFers. I think it is fair to say that Guidry's HOF qualifications stack up pretty well against the qualifications of all the pitchers I've named in this paragraph. If one wants to argue nonetheless that Guidry doesn't belong in the Hall then they are in effect arguing for a much smaller Hall of Fame and for HOF standards that are radically more restrictive than the standards that have been observed for the last 75 years.

P.S. Here's a list of pitchers who just missed making the 17 wins/120 ERA+ list, either because they had too few wins, an ERA+ less than 120 or because some of their peak seasons occurred prior to 1920.
______________________________
* Win totals for any pitcher who pitched during the strike-shortened '81, '94 and '95 seasons have been adjusted to reflect the shortened seasons. For example, Maddux's win totals for any decade that includes both the '94 and '95 seasons were divided by 9.6 rather than 10 because approximately 40% of a season was lost between the premature end to the '94 season and the belated start of the '95 season.
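The adjustment in this footnote amounts to shrinking the divisor by the fraction of a season lost. A minimal Python sketch, with a function name of my own and a made-up win total used purely for illustration:

```python
def wins_per_season(total_wins: float, seasons: int = 10,
                    seasons_lost: float = 0.0) -> float:
    """Average wins per season, dividing by the seasons actually played."""
    effective_seasons = seasons - seasons_lost
    if effective_seasons <= 0:
        raise ValueError("seasons_lost must be smaller than seasons")
    return total_wins / effective_seasons


# Hypothetical 170-win decade spanning both '94 and '95 (0.4 seasons lost).
print(f"{wins_per_season(170):.1f}")                    # 17.0 unadjusted
print(f"{wins_per_season(170, seasons_lost=0.4):.1f}")  # 17.7 adjusted
```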
