Stat of the Week...Top 15 in percentage of starts won since 1952 (min. 120 wins): 1. Warren Spahn 53.9%... 2. Juan Marichal 52.1%... 3. Ron Guidry 51.7%... 4. Whitey Ford 51.2%... 5. Roy Halladay 51.0%... 6. Pedro Martinez 50.9%... 7. Johan Santana 50.8%... 8. Bob Gibson 50.8%... 9. Sandy Koufax 50.6%... 10. Mike Mussina 50.4%... 11. Jim Palmer 50.3%... 12. Roger Clemens 50.1%... 13. Randy Johnson 49.9%... 14. Andy Pettitte 49.9%... 15. Jim Maloney 49.6%...
Previous Articles
Showing posts with label Team Relative Performance. Show all posts

The Celebrated Mr. K

Posted by Gator Guy on Tuesday, March 23, 2010




His blazing five-year stretch from '62-'66 has become the standard by which all other great pitchers are measured. The Gold Standard. The definition of pitching dominance. Anyone who considers a new mode of analyzing pitching greatness has to insert his five peak seasons into the formulas and see what comes out. If you plug into your formulas his stats from these five seasons, during which he won five straight ERA titles, three pitching triple crowns and three 25+ win seasons in four years, and a historic result doesn't come out the other end, then maybe you need to double check your methods and formulas.

From '62 to '66 Sandy Koufax outperformed his team by 41%. If you exclude the '62 season, where Koufax's injury and the Dodgers' decision to rush him back into the rotation in late September significantly skew the numbers, then Koufax outperformed his team by 49.5% from '63 to '66*. That's Randy Johnson territory. A 50% Team Relative performance over a period of years could be known as the Sandy-Randy Standard.

Randy Johnson's peak period by Team Relative analysis was actually the five-year period from '93 to '97, which includes his injury-shortened '96 season, when he went 5-0. It also includes the strike-abbreviated '94 and '95 seasons. Over that five-year period Johnson's Team Relative performance was 58.2%. History suggests, however, that Johnson would not have maintained the .920 winning percentage he compiled in '95-'96 had he pitched full seasons. Johnson's true peak, as measured by wins, ERA+ and most other measures, actually occurred with the D'backs from '99 to '02, and he compiled a 49.9% Team Relative performance during that period.

Maddux compiled a 52.6% Team Relative performance from '94 to '97, but that period also included two strike-shortened seasons.

Guidry's Team Relative performance over his three-year peak from '77 to '79 was 40%. Seaver had a 44.2% Team Relative performance for four years between '68 and '71. If one excludes Gibson's injury-shortened '67 season, Gibson maintained a 41.1% Team Relative performance from '65 to '70. If one excludes Marichal's injury-shortened '67 season, he maintained a 33.8% Team Relative performance from '63 to '69.

For Schilling's three 20-win seasons - '01, '02 and '04 - he had a Team Relative performance of 48.2%. For Guidry's three 20-win seasons he had a Team Relative performance of 43.5%.

Team Relative analysis confirms that Koufax's great run was indeed among the very best four- or five-year stretches in baseball history. Throw in the huge innings totals Koufax put up in these years, the no-hitters, strikeout records, pennant race and post-season performances, and it's clear why Mr. K became a legend.
_________________________

* Koufax's best season by far as measured by Team Relative performance was his injury-shortened '64 season, when he posted a 19-5 record for a Dodger team that was truly terrible but for Koufax: the team compiled a .442 winning percentage in games in which Koufax was not the pitcher of record. Koufax outperformed that team by nearly 89%.

Lost In Translation

Posted by Gator Guy on Monday, March 22, 2010



Not surprisingly, an analysis of Bert's prime years of '70 to '79 demonstrates that despite his superlative ERAs he didn't significantly improve his team when he was on the mound. Yes, Bert didn't get good run support from his teams, who scored .35 runs/game fewer for Bert than they did for other starting pitchers. It is also true that in measuring Bert's performance against his teams', Bert was competing against some pretty good pitchers. For the entire decade, Bert pitched on staffs that were slightly above average even without Bert's contribution, and the staffs on his '70, '72, '77 and '79 teams were among the very best in their leagues. But the Team Relative analysis controls for these factors, of course.

Even after increasing Bert's run support to team average, and adjusting his team's W-L record downward to reflect what it would have been with an average pitching staff, Bert still only outperformed his team's W-L record by 10.2%. That's down in Drysdale territory. As I've previously noted, Bert hugely underperformed his Pythagorean projection during those ten years, compiling a .536 winning percentage as compared to a .599 PythPro. If Bert had been able to perform to his PythPro he wouldn't be such a hot topic today because he would have been inducted into the Hall years ago.

Bert's Team Relative performance was worst during his first six years with the Twins, the period that prompted the Sports Illustrated article wondering why Bert wasn't a bigger winner. Bert outperformed his team by 8.3% during those years. He improved slightly in his stints with the Rangers and Pirates from '76 to '79, outperforming his team by 13.9%, still well short of what we'd expect from a top-flight pitcher. Bert's worst year in this regard was '72, when he performed only 2.2% better than his team despite receiving .44 runs/game more than the other Twins starting pitchers. This is one year where Bert can truly be called a victim of a poor distribution of run support, with a disproportionate number of games falling at either end of the spectrum - a large number of games in which he received three runs or fewer and a large number of games where he received seven or more. Amazingly, in Bert's 38 starts there were only five games in which he received 4, 5 or 6 runs of support. Bert's run support distribution was also very poor in '75 and, to a lesser extent, in '73 and '74.

Bert's run support distribution was more conventional in '76 to '79, although his average run support in '76 was terrible - only 2.75 runs/game. But remember, the Team Relative analysis controls for poor average run support*; it doesn't control for poor run support distribution. In '77, '78 and '79, Bert's run support was almost precisely team average. After controlling for Bert's run support for the '76 to '79 period, a period during which his run support distribution was more conventional and his ERA+ was a very good 125.5, he still only managed to outperform his team by 13.9%. And since we've acknowledged Bert's poor run support distribution in the period '72 to '75, we should also acknowledge that Bert's Team Relative performance in the period '76 to '79 was skewed by his W-L record in the '79 season, when the Pirates' bullpen and bats bailed out Bert an extraordinary 13 times after Bert left the game in a position to lose. To give you some idea of how extraordinary this "bailout" total is, consider that Bert was similarly bailed out only 18 times in the preceding nine years. Bert's 12-5 record in '79 is extremely misleading, and if not for his good fortune and the Bucs' late-inning dramatics for Bert in '79 his record would have been something like 12-15, which more than eliminates the improvement we see in Bert's Team Relative performance in the late '70s.

I've not modeled Bert's projected record from '72 to '75 assuming a more optimal distribution of run support. It could be done using a system that generates a random distribution of run support, and Bert's projected W-L record under such a system would no doubt benefit. There's also no doubt, however, that any benefit to Bert from a more conventional distribution of run support from '72 to '75 would be largely offset by his '79 season, when Bert easily could have lost an additional 10 games and his 12-5 record and .706 winning percentage were not reflective of his performance: he had a 109 ERA+, a LevERA+ of 99, and won only 12 of his 37 starts for a World Series championship team.

Again, this Team Relative analysis is limited to Bert's peak decade of '70 to '79. As the Bert Backers would no doubt argue, Bert had some great seasons outside this period, primarily '84 and '89. But the '70 to '79 period forms the overwhelming bulk of Bert's argument for the Hall. For the balance of his career he had a .533 winning percentage and 108 ERA+, and despite the excellent '84 and '89 seasons the '80s were an exceedingly erratic period for Bert, a period in which poor seasons ('80 and '88) and injury-limited seasons ('82 and '83) detract from his case for the Hall. The argument of the Bert Backers is almost exclusively based on the '70s, during which he posted his best ERA+ figures, almost 2/3s of his shutouts and six of his eight 200-strikeout seasons. But the fact is that for all of Bert's statistical achievements in the '70s, they didn't translate into a commensurate win total and W-L record and Bert didn't improve his team as much as he should have. And the problem wasn't run support.

__________________________

* For example, Bert posted a 13-16 record in '76, worse than his team's 79-83 record. But the Team Relative analysis has Bert outperforming his team by 32% after adjusting for his run support.

Big Unit, Indeed

Posted by Gator Guy on Saturday, March 20, 2010




It's an amazing sight when you're watching great athletes compete at the highest levels in their sport and one competitor is so great that the opposition is just overmatched. I mean dominated; not just beat, not just bested, but dominated, almost completely helpless. In the realm of baseball, the greatest pitchers, at their best, will do this. Major league hitters, the best in the world, men with preternatural reflexes and freakish hand-eye coordination, are left to wave futilely at pitches or are so flummoxed they can't even swing.

I remember watching Clemens pitch against the Yanks in '97 and wondering how in the hell anyone ever hit the guy. I remember watching Pedro against the Yanks in September '99, the game he struck out 17, and feeling sorry for Yankee batters. Jorge Posada couldn't even get the bat off his shoulder. He had no idea what was coming - 96 mph fastball, or slider, or change-up or curveball. Yankee after Yankee left the plate after striking out shaking their head on the way back to the dugout, no doubt feeling the way Mickey Mantle felt after facing Koufax for the first time in the '63 World Series, when he said to the umpire as he was turning to leave the plate after striking out, "now how in the hell am I supposed to hit that shit."

But for sheer dominance, the ability to induce not only helplessness in big league batters but terror, there has perhaps never been anyone like Randy Johnson. It was sometimes like watching Little League baseball, where the big kid is on the mound, the one that seemed to mature about two years ahead of the rest of the kids, and the ball is blowing by the batter before they can even think about swinging. One kid gets smoked and the next batter approaches the batter's box looking like they're going to the gallows. They have no chance. When he was at his best, that was Randy Johnson on the mound. Too big, too nasty, too fast. And that slider - Christ, you pitied left-handed hitters who had to face Randy Johnson.

The ironic thing is that Johnson was arguably never considered the best pitcher in the game during his prime. Before you say, "hey, wait a minute...", consider this: Maddux was off the charts in '94 and '95, throwing strike after strike without ever hitting the white of the plate. Clemens was spinning back-to-back pitching triple crowns in '97 and '98. And then Pedro was dominating from '99 to '02, like a Marichal with more speed and a Hoffman-like change-up. Note that I didn't say Johnson was never actually the best during this period - he unquestionably was in 2001 when Pedro missed half the season. I said he was never considered the best. Johnson had the misfortune during his peak of '93 to '02 of always seeming to be in the shadow of another all-time great. Even in 2001 Johnson was a bit overshadowed by his loquacious and self-promoting mound-mate, he of the Bloody Sock.

During his ten-year peak, Johnson posted seasons of 18-2 (the strike-shortened '94 season), 20-4, 21-6 and 24-5. After sulking his way through the first half of the season in Seattle in '98, he went to the National League and went 10-1 with a 1.28 ERA in eleven starts for the Astros. I remember thinking as Johnson was doing this, "man, those NL batters have never seen anything like this."

Judging just by the W-L records, Randy Johnson was as close to unbeatable for those ten years as anyone has ever been in major league baseball. He was 175-58, for a .751 winning percentage. Only Grove had a comparable winning percentage across a similar number of decisions, going 172-54 for a .761 winning percentage between '27 and '33. There was a difference, however. Grove was pitching for Connie Mack's Athletics, the team that sent Ruth and Gehrig packing for home at the end of the season in '29, '30 and '31. Grove was lavished with spectacular run support by Foxx, Simmons and Cochrane. He pitched to one of the all-time great field generals in Mickey Cochrane. He had a hell of a supporting cast. Johnson received generally good run support from his Seattle and Arizona teams, but it was actually slightly less than team average, and needless to say the overall quality of those teams didn't approach the Athletics of the '30s.

The team relative analysis for Randy Johnson produced a result so outlandish, so amazing, that it was hard to believe. It made me go back and double-check the formulas in the spreadsheet. After those checked out, I reconsidered the whole concept of the team relative analysis as a metric for pitchers. But I think its validity still holds.

Between '93 and '02, Randy Johnson outperformed his team, after adjusting for factors other than his own performance that might have affected his W-L record and his team's W-L record, by 50%. Yeah, you read that right - 50%.

It's a figure that significantly exceeds Seaver's and Maddux's. And it's a figure that I'm pretty certain will exceed Koufax's figure for his peak period of '62 to '66. If I don't miss my guess, it's a figure that only Walter Johnson will be able to approach over a ten-year period. And even Walter won't hit Randy's mark unless either his run support was significantly worse than other Senators pitchers received or those Senators pitching staffs were better than they appear to be at first glance.

This result is not the product of any significant adjustment to Johnson's record produced by the methodology. A straight comparison of Johnson's W-L record to his teams' records in games other than those where Johnson got the decision shows that Johnson outperformed his team by nearly 46%. After adjusting for the fact that (i) Johnson's run support was slightly below team average and (ii) Johnson was outperforming a very good D'back pitching staff from '99 to '02, that figure increased to 50%.

Unless I can identify some flaw in this concept I'm forced to reconsider my opinion that Grove was the greatest southpaw in the history of the game. Hell, Randy Johnson might have been the greatest pitcher, period. True, these analyses are restricted to peak periods of about a decade, but it's not like Randy had a short career; he won 300 games, after all. And I think it unlikely that other candidates for greatest ever will have achievements outside their peak decade that will militate in their favor, although Clemens might.

In terms of sheer power-pitching dominance, we were watching Walter Johnson at his peak when we were watching Randy Johnson from '93 to '02. We were watching peak Koufax. We were watching Grove at his very peak, say from '28 to '33. I'm not sure I appreciated that at the time. In fact, I'm pretty certain I didn't. But the Big Unit was indeed that great.

More Team Relative Analyses: Lefty Grove

Posted by Gator Guy on Friday, March 19, 2010



Lefty Grove weighs in at 36%. I didn't realize what tremendous run support Grove got from the Athletics in the early '30s.

It may be a shade behind Seaver and Maddux, but it doesn't change my opinion that Lefty was The Man among post-1920 pitchers. What it does do is make me appreciate how great Seaver and Maddux were. Man, that 35%+ improvement over team performance is one upscale neighborhood.

Reappraisals of Palmer, Bunning and Drysdale

Posted by Gator Guy




The analyses of pitcher performance relative to his team have yielded some interesting results.

As I described in "The Theory of Relativity" post, it is possible to compare a pitcher's W-L record and winning percentage to his team's and adjust for factors that distort the comparison. These adjustments involve adjusting the pitcher's run support to equalize it with the run support the team provided to the other pitchers on the team and normalizing the ERAs and runs allowed by the rest of the staff to league average. These two adjustments assure that a pitcher won't benefit or suffer by virtue of run support that deviated from team average, or by virtue of the fact that the rest of the pitching staff, to whom the pitcher is effectively being compared, were either better than league average or worse than league average. A pitcher may be a great pitcher but his W-L record relative to his team's won't be very impressive if the rest of the team's pitching staff is comprised of great pitchers. In comparing Greg Maddux's W-L record to his team's it is obviously necessary to adjust for the fact that the Braves' pitching staffs were great and produced tremendous winning percentages because of the presence of guys like Glavine and Smoltz.

Here are the new results. Curt Schilling's ten-year peak from 1997 to 2006 was pretty impressive, as his excellent winning percentage and ERAs would suggest. Schilling outperformed his team by approximately 27% over that decade. Bob Gibson outperformed his team by approximately 28% over his nine-year peak of '64 to '72; not sure whether people will find that disappointing or impressive. Both these results obviously cast Ron Guidry in a very good light, because Schilling's and Gibson's team relative performance figures are right in Guidry territory. Ron is in good company.

Here are the results I found surprising. I ran the numbers on Bunning, Drysdale and Palmer. I've always thought of Bunning and Drysdale as being very similar, and I've conceived of Palmer as an American League version of Tom Seaver, although a shade behind Tom Terrific. My team relative analyses have fundamentally changed my appraisals of these pitchers.

I didn't expect Drysdale to fare very well in this analysis, simply because a straight comparison of his W-L records to his teams' isn't very impressive. I expected, however, that Drysdale would benefit from the fact that the Dodger pitching staffs were generally exceptional during his era. Drysdale came in at 9.75%. Now remember that Dave Stieb came in at approximately 17%. That's a big difference. I know Drysdale made it into the Hall in large part because of his participation in a lot of great pennant races and World Series, but those who question Drysdale's HOF bona fides have new ammunition for their argument.

Bunning polled in at 16%. That's a pretty good figure, and it's far better than Drysdale.

Now here was a bit of a stunner. A straight comparison of W-L records for Jim Palmer and his Oriole teams never looked all that impressive, because those Orioles teams posted great records during Palmer's peak from '70 to '78. Still, I expected that Palmer would benefit greatly from the fact that those were great pitching staffs the Orioles fielded in the '70s. Well, they were great staffs in the early '70s, but from '74 to '78 they really weren't all that good when you take away Palmer. For the full nine-year period the Orioles posted a 103.6 ERA+ when you subtract Palmer's ERAs. Good, but not great. After adjusting for the quality of the Oriole pitching, Jim Palmer outperformed his team by 20.4%. That's better than Bunning, but not by much. And it's nowhere near Tom Seaver's neighborhood of 37%.

O.K., Jim Palmer wasn't Tom Seaver. Does that make me reconsider whether Palmer was a legitimate first-ballot HOFer? No, not at all. Eight 20-win seasons in nine years is quite an accomplishment, and Palmer was an undoubted big-game pitcher, posting some pretty impressive pennant race performances and superlative post-season numbers. But he wasn't as good as I thought he was.

As with any other statistic or metric, it's important to put it in context. It is an all too common failing of many fans who fancy themselves sabermetricians that they attach too much weight to a single statistic. Schilling comes out well ahead of Palmer in team relative performance. That tends to confirm what Schilling's great winning percentage and ERAs had already told us: Schilling was a damn good pitcher. But Palmer was rock-solid consistent and was generally a bigger winner than Schilling over their respective peaks, even after adjusting for the difference between the four-man rotation that Palmer pitched in and the five-man rotation Schilling pitched in. Palmer had one stinker in '74 when he had significant arm issues, but was excellent every other year. Every pitcher gets a pass for one season where he had arm issues; Seaver had one during his peak, Guidry had one in '84, and most other great pitchers also had one. Schilling had more than his share of them, however, and was significantly limited in his contribution to his team in '99, 2000, '03 and '05 as a result of poor performance or limited innings due to arm issues. He was excellent when he was on, like Saberhagen, but like Saberhagen you can't just ignore all the seasons where his team didn't get what they justifiably expected.

Bunning would appear at first glance to have had more than his share of off seasons during his peak, but that's deceptive. Bunning was very durable and never really missed time due to injury issues during his peak. Some of his pedestrian W-L records, particularly in 1960, are easily explained: poor team, terrible run support. There's reason to believe that if you put Bunning on those '70s Orioles teams he might have posted another four or five 20-win seasons. But there's also reason to believe he might not have. It has to be noted that Bunning pitched on some good teams that gave him pretty good run support, like the '61 Tigers and '64 Phillies, and he didn't post 20-win seasons. He did well enough, posting good ERAs and winning 17 games in '61 and 19 in '64. But a Hall of Famer should have been winning 20 games, and probably 22 or 23. Bunning had a bit of a tendency to pitch to his team's level. That's not a particularly damning observation, but it's an important consideration.

Back to Guidry for a moment. The myth is that Guidry's spectacular W-L record was in large part a function of great run support from a powerful Yankee offense. It's a myth. The '77 and '80 Yankee teams could hit, no question, but the Yankees of the pennant-winning years of '77 to '81 were unquestionably pitching- and defense-oriented teams. Look at the stats. And Guidry's run support from the Yankees was in any event strictly average for those teams. Guidry outperformed those teams, and those exceptional pitching staffs, because he was an excellent pitcher, made optimal use of his run support, was a clutch pitcher who pitched best in critical situations, and was in short a winner. That term - winner - is a term a lot of the self-styled stat geeks scoff at, but if they looked a little closer at the stats the concept would be plain. Some pitchers are winners. Some aren't. Blyleven wasn't. Steve Rogers wasn't. To a lesser degree Stieb wasn't either. It doesn't mean they weren't good pitchers. But it means they weren't as good as their generally superior ERAs and ERA+s would suggest. The stat geeks should ponder that for a moment.

The Theory of Relativity

Posted by Gator Guy on Thursday, March 18, 2010



I love a lot of the new pitching stats. They're great analytical tools. Take FIP, for example ("fielding independent pitching"). It's based on the proposition that what happens on a ball put in play is frequently a function of random chance and team fielding. Bill James recognized its utility and cited Wally Bunker's 1964 season as an example of a pitcher apparently benefiting from some good luck insofar as his BAbip that year was .216. It turns out that Bunker in fact had a pretty good facility for generating low BAbips in his career, presumably because, like Maddux in his prime, he was adept at keeping the ball away from the fat part of the bat and inducing batters to hit pitches outside the hitter's sweet spots in the strike zone. But Bunker never again came close to posting the .216 BAbip he posted in '64, despite being backed by the legendary team defense of the '60s Orioles.

FIP tends to understate the effectiveness of a pitcher with a demonstrated ability to consistently generate very low BAbips. Take The Great Rivera, for example. I was skeptical of Mariano's decision in '97 to move almost exclusively to the cutter because it seemed to sharply cut into his strikeouts. "Throw the high fastball!" I would shout, longing for Mariano's incredible strikeout ratio in '96 when he K'd 130 batters in 107 innings. Still, I had to admit that batters seemed almost incapable of getting good wood on the cutter, but bloops and dribblers can and do become hits, while strikeouts can't and don't. Bloops and dribblers that found holes in the defense became, in my mind, "Mariano Specials." Obviously Mariano's decision to go with the cutter has been thoroughly vindicated and my early concerns were unfounded. But Mariano never fares too well in the FIP stat, and that's misleading because Mariano has demonstrated an ability to consistently generate low BAbips (Mariano's career BAbip is .265, as compared to a major league average of .299).

As Bill James has noted regarding FIP and various other new and sophisticated measures of pitching performance, they have a tendency to throw out a lot of information in an effort to isolate and identify a pitcher's performance independent of non-pitching factors. Bill is a little unsettled by this, and so am I. As he's argued, W-L records are the antipode to FIP and similar stats, incorporating all information, including unfortunately things that have nothing to do with a pitcher's performance, like offensive support and team fielding. However, the inclination of the stat geeks to summarily dismiss W-L records is extremely misguided. It is possible to start with W-L records and make appropriate adjustments, and that's what I'm about to propose.

The Theory of Relativity, in contrast to FIP, throws out nothing but attempts to adjust for everything (or at least most things) that happens outside of the pitcher's performance. Simply put, it compares a pitcher's W-L record to his team's record in games where the pitcher was not the pitcher of record (i.e., it subtracts the pitcher's W-L record from the team's), adjusting for factors that affect the pitcher's and team's W-L records but are largely unrelated to the pitcher's own performance. If a pitcher received run support better or worse than the run support a team generally provided its pitchers, the pitcher's W-L record is adjusted (via the Pythagorean winning percentage formula) to reflect what his W-L record would have been had he received run support equal to his team's average. It also adjusts for the performance of the rest of the team's pitching staff, because even a good pitcher who receives excellent run support will appear to fare poorly relative to his team's W-L record if the rest of the starting pitching staff is comprised of Walter Johnson, Pete Alexander, Tom Seaver and Randy Johnson, with Gossage, Eckersley and Rivera coming out of the bullpen.
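The two ingredients just described, subtracting the pitcher's decisions from the team's record and correcting for run support, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet behind these posts. The function names are invented for the example, the exponent of 2 is the classic Pythagorean default, and the way the run-support correction is applied (shifting the actual winning percentage by the change in Pythagorean expectation) is just one plausible reading, since the post doesn't spell out the exact step.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def team_without_pitcher(team_w, team_l, p_w, p_l):
    """Team W-L in games where the pitcher was not the pitcher of record."""
    return team_w - p_w, team_l - p_l


def run_support_adjusted_pct(p_w, p_l, support_pg, team_support_pg, allowed_pg):
    """Pitcher's winning percentage as if he had received team-average support.

    ASSUMPTION: the actual percentage is shifted by the difference between
    the Pythagorean expectation at team-average support and at the support
    the pitcher actually received. The post does not specify this step.
    """
    actual = p_w / (p_w + p_l)
    shift = pythag_pct(team_support_pg, allowed_pg) - pythag_pct(support_pg, allowed_pg)
    return actual + shift


# Koufax in '64: 19-5 for a Dodger team that finished 80-82.
w, l = team_without_pitcher(80, 82, 19, 5)
print(w, l, round(w / (w + l), 3))  # 61 77 0.442
```

The subtraction step reproduces the .442 figure cited in the Koufax footnote. The run-support function is only as good as its assumption; a different spreadsheet could apply the Pythagorean correction to wins and losses directly and get slightly different numbers.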

These pitching staff adjustments are accomplished by taking the team's ERA+ (exclusive of the subject pitcher's own ERA+) and adjusting the team's W-L record to reflect what it would have been had the rest of the staff generated a 100 ERA+ (again, based on the Pythagorean formula). It simply takes the team's ERA+ exclusive of the subject pitcher's ERA+, calculates the runs allowed or saved by the staff's performance above or below the assumed 100 ERA+, and adds those incremental runs to, or subtracts them from, the team's runs allowed. A Pythagorean record is then generated assuming a league-average staff.
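Here is a rough sketch of that staff adjustment, again in Python and again with invented function names. It assumes ERA+ can be treated as a simple ratio of league ERA to staff ERA, so a staff with a 103.6 ERA+ is taken to have allowed 3.6% fewer runs than a league-average staff would have; park effects and unearned runs are ignored. The season totals in the example (700 runs scored, 650 allowed) are hypothetical; only the 103.6 ERA+ comes from these posts, where it is the Orioles' figure without Palmer.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def league_average_runs_allowed(rest_runs_allowed, rest_era_plus):
    """Runs the rest of the staff would have allowed at a 100 ERA+.

    ASSUMPTION: ERA+ is treated as a plain ratio (league ERA / staff ERA),
    so the incremental runs are rest_runs_allowed * (ERA+/100 - 1).
    """
    incremental = rest_runs_allowed * (rest_era_plus / 100.0 - 1.0)
    return rest_runs_allowed + incremental


def staff_adjusted_team_pct(runs_scored, rest_runs_allowed, rest_era_plus):
    """Team Pythagorean percentage with the rest of the staff set to league average."""
    neutral_ra = league_average_runs_allowed(rest_runs_allowed, rest_era_plus)
    return pythag_pct(runs_scored, neutral_ra)


# Hypothetical totals for the games not decided by the subject pitcher.
raw = pythag_pct(700, 650)
adjusted = staff_adjusted_team_pct(700, 650, 103.6)
print(round(raw, 3), round(adjusted, 3))
```

With an above-average staff the adjusted team percentage comes out lower than the raw one, which is the point of the adjustment: the pitcher is then measured against a team backed by ordinary pitching rather than against Glavine and Smoltz.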

Once you've adjusted the pitcher's record for run support and adjusted the team's record for the rest of the pitching staff's performance, you compare the pitcher's adjusted W-L record to his team's adjusted W-L record. The impact of run support on the pitcher's W-L record relative to his team's is thereby eliminated, and the impact of the rest of the staff's performance on the team's W-L record is similarly eliminated. A good pitcher will have an adjusted W-L record much better than his team's adjusted record, and a poor pitcher will have a worse one. Measuring the difference between the adjusted records of the pitcher and the team provides a good measure of the pitcher's performance. It doesn't expressly adjust for team defense (a notoriously difficult aspect of team performance to measure), but it implicitly incorporates it, because bad team defense will lower the denominator representing the team's W-L record and therefore increase the pitcher's W-L record (adjusted for run support) relative to his team's W-L record (adjusted for the performance of the rest of the pitching staff).
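The final comparison can be sketched as a single ratio. The reference to a "denominator" suggests the team's winning percentage is divided into the pitcher's, so the sketch below assumes the headline figure is the pitcher's adjusted percentage divided by the team's adjusted percentage, minus one. That form is an inference, and the function name is invented. The inputs in the example are the unadjusted Mets figures quoted later in this post (.636 for Seaver, .463 for the team without him), so the output is a raw comparison, not the fully adjusted number.

```python
def team_relative(pitcher_pct, team_pct):
    """Fraction by which the pitcher's winning percentage exceeds his team's.

    ASSUMPTION: the comparison is a ratio minus one, with the team's
    percentage (in games not decided by the pitcher) as the denominator.
    """
    if team_pct <= 0:
        raise ValueError("team winning percentage must be positive")
    return pitcher_pct / team_pct - 1.0


# Unadjusted figures for Seaver and the Mets, '67 to '75.
seaver_raw = team_relative(0.636, 0.463)
print(f"{seaver_raw:.1%}")  # 37.4%
```

Feeding the function the adjusted percentages instead of the raw ones would give the Team Relative figure proper, to the extent the ratio assumption holds.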

The concept of simply comparing a pitcher's record to his team's is not novel, but the defects in the system became apparent to me when I was comparing Phil Niekro's relative W-L record to Don Sutton's. Even if the records were adjusted for variations in run support, Sutton would still tend to fare poorly compared to Niekro because Niekro would benefit by being compared to the poor Braves pitching staffs of the '70s, while Sutton would suffer from being compared to the generally excellent Dodgers pitching staffs of the '70s. It was easy for Niekro to outperform the sub-average pitchers on the Braves staff, but more difficult for Sutton to outperform the Tommy Johns, Claude Osteens and Andy Messersmiths who generally populated the Dodger staffs. It's fairly easy, however, to adjust for this, and the conceptual validity of the adjustment should be obvious. Still, the process of collating the team pitching data from different years, incorporating it into the adjustment formulas and generating the Pythagorean adjustments is a little involved and so for the moment I'll only present an analysis of three pitchers: Tom Seaver, Ron Guidry and Dave Stieb.

I selected these three pitchers because I thought they would be illustrative. Bill James has noted how spectacular Seaver's winning percentage was given the generally mediocre nature of the Mets teams he pitched for in the late '60s and early and mid-'70s. I selected Guidry because I knew that his record was spectacular even after accounting for the fact that the Yankees teams he pitched for were generally pretty good, but I didn't know how his relative record had been affected by his run support and the quality of the Yankee pitching staffs. And I selected Stieb because (i) I knew that he had significantly underperformed relative to Pythagorean projections during his prime years in the early and mid-'80s, and (ii) I was tired of beating up on Bert Blyleven. (I knew Blyleven also underperformed his Pythagorean projections in his prime, but I genuinely like the guy and he was by many measures a borderline great pitcher - certainly better than Stieb - albeit not a Hall of Famer).

I compared nine-year peaks for each of the pitchers. This was convenient because both Guidry and Stieb had distinct nine-year peaks that account for all of their superior seasons. One could select various nine-year periods for Seaver, because his peak extended well beyond nine years, but I selected his first nine seasons, comprising substantially his entire Met career. I'll begin the comparison by noting some things that you probably already know. For instance, the Mets were not a good team once you subtract Seaver, notwithstanding their two NL pennants and their '69 World Series championship. Their team winning percentage from '67 to '75 was .495 (mediocre, but not bad), but was only .463 once you subtract Seaver's .636 winning percentage from the equation, and that obviously stinks. You probably also knew that the Mets' problem was poor hitting. They actually had very good pitching, even after stupidly trading Nolan Ryan, posting a team ERA+ of 108 from '67 to '75. Once you subtract Seaver's superlative ERAs, however, the team ERA+ was 102.2. That's not great, but it's pretty good considering the staff's ace pitcher is excluded. Another way to look at it is that the Mets staff was above average even without the great Seaver.

I was somewhat surprised by how good Stieb's winning percentage was from '82 to '90. He was 135-90 for a very good .600 winning percentage. But I was also slightly surprised by how good the Jays teams were in that period. They had a .548 winning percentage, and were generally a pretty good team even aside from the excellent '85 and '87 seasons, other than in '82. Even subtracting Stieb's W-L record the Jays still had a .539 winning percentage. I was very surprised, however, by how good the Jays' pitching was in that period. They had a team ERA+ of 109.9 and an ERA+ of 106.9 even after subtracting Stieb. Even without Stieb the Jays staff in the '80s was as good as the Yankees pitching in the period '77 to '85 (primarily because the Yankees pitching sagged significantly from '82 to '84). Jimmy Key and Doyle Alexander were no slouches, and Jim Clancy was a pretty good No. 4 starter. And the Tom Henke-led bullpen was generally pretty solid and sometimes excellent.

The Yankees had a team winning percentage of .575 from '77 to '85, and were well over .500 every year other than '82. The Yanks' winning percentage drops to .552 without Guidry, still very good but not that much better than the Jays' .539 W% without Stieb. The Yanks' pitching without its ace was better than the Mets' but not as good as the Jays', posting an overall 106.3 ERA+ and a 103.7 ERA+ without Guidry. The period of '77 to '85 was really a tale of two Yankee pitching staffs: the excellent staff from '77 to '81 and the generally mediocre staff from '82 to '85.

On the offensive support side both Guidry and Seaver received run support slightly better than team average, in each case about 3%. Stieb's run support was 1.2% below team average. Accordingly Guidry's and Seaver's adjusted W% was slightly lower than their actual W% and Stieb's slightly higher. The adjustments were quite small in each case, with Stieb's W% going up from .600 to .606. Guidry's adjusted W% dropped 18 points to .679 and Seaver's dropped 14 points to .622.

The big beneficiary of adjusting team W% to assume an average pitching staff was Stieb. The Jays' W% (exclusive of Stieb) drops from .539 to .518. A Jays staff with a 100 ERA+ would have added about 36 runs per year to the Jays' runs allowed total.

The effects of these adjustments were essentially negligible for Guidry and Seaver, with the reduction in their personal W%'s being largely offset by the reduction in the team W% resulting from translating their good team pitching staffs into average staffs. Stieb, by contrast, saw a significant increase in his W% relative to his team's. Simply comparing Stieb's .600 W% to his team's .548 W% shows that Stieb outperformed his team by 9.5%. Adjusting for run support and pitching staff, however, increases Stieb's relative performance figure to 17%. That's a pretty good figure, and though I've not yet run the figures for various HOFers I'm willing to bet that it compares favorably to some of the more marginal inductees.

Seaver outperformed his team after adjusting for run support and pitching staff by a tremendous 37%, which is almost precisely the figure obtained by comparing his straight W% to his team's.

Guidry outperformed his team after adjusting for run support and pitching staff by 27%, which represents less than a one point increase over the approximately 26% figure obtained by comparing his .697 W% to his team's .552 W% without Guidry.
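
For anyone who wants to check the arithmetic, here is a minimal sketch of the straight and adjusted comparisons. It assumes the relative performance figure is simply the pitcher's W% divided by the team's W%, minus one, which reproduces the numbers quoted above. The function name is mine, and the run-support and pitching-staff adjustments themselves are not reproduced here; the adjusted W% figures are just plugged in.

```python
def relative_performance(pitcher_wpct: float, team_wpct: float) -> float:
    """Percentage by which a pitcher's W% exceeds the comparison team W%.

    Assumption: the figure is the simple ratio of the two winning
    percentages minus one, expressed as a percentage.
    """
    return (pitcher_wpct / team_wpct - 1.0) * 100.0


# All inputs below are W% figures quoted in the post.
comparisons = {
    "Stieb, straight (.600 vs. team .548)": (0.600, 0.548),
    "Stieb, adjusted (.606 vs. team .518)": (0.606, 0.518),
    "Seaver, straight (.636 vs. Mets without him .463)": (0.636, 0.463),
    "Guidry, straight (.697 vs. Yanks without him .552)": (0.697, 0.552),
}

for label, (pitcher, team) in comparisons.items():
    print(f"{label}: {relative_performance(pitcher, team):.1f}%")
```

Run as written, the four lines come out at roughly 9.5%, 17.0%, 37.4% and 26.3%, in line with the figures above.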

Just to give some idea of how astounding Seaver's figure is, my preliminary calculations appear to suggest that Koufax outperformed his team during his historic five-year run from '62 to '66 by slightly north of 40%. Seaver's 37% relative performance figure maintained over a nine-year period, therefore, appears to be a historic feat, and I'm willing to bet that few other pitchers since 1920, if any, can match it.

Stieb's figures demonstrate how a pitcher who had run support below team average and pitched on a good staff can have actually outperformed his team by a larger margin than a simple comparison of W% between pitcher and team would indicate. On the flip side, a pitcher whose performance relative to his team's at first glance appears to be superlative can be revealed as a fundamentally average pitcher if he both received great run support relative to his team's average run support and pitched on a team with an inferior pitching staff. Obviously neither Seaver nor Guidry is an example of this, and I'm not sure off the top of my head which pitcher might fit this profile. I know Andy Pettitte has received tremendous run support throughout his career, but he's also pitched on generally excellent pitching staffs. If anyone can suggest such a pitcher in the comments section I'd appreciate it. I'm going to start looking by first identifying poor pitching staffs from recent decades and then examining the run support received by their starting pitchers.

The performance of Seaver, Guidry and Stieb relative to each other was not a complete surprise. For one thing, Stieb slightly underperformed his Pythagorean record from '82 to '90, compiling a .600 W% relative to a .613 Pythagorean projection (Stieb significantly underperformed the Pythagorean projection during his very best years of '82 to '85, indicating that he slightly outperformed Pythagoras over the balance of his nine-year stretch). The Pythagorean comparison doesn't provide for any of the adjustments in the Relativity method I've described, but it does indicate that Stieb didn't make particularly good use of his run support. Guidry, by contrast, hugely outperformed his Pythagorean projection from '77 to '85, posting a .697 W%, more than 40 points higher than his .654 Pythagorean projection. That's a big difference. Seaver underperformed his Pythagorean projection but by an insignificant amount, posting a .636 W% from '67 to '75 as compared to a .641 Pythagorean projection, well within the margin of error in Pythagorean projections.
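
For reference, the Pythagorean projection in its classic Bill James form can be sketched as follows. The exponent of 2 is an assumption (other exponents are in common use), the function names are mine, and the 5.0 and 4.0 runs-per-game inputs are made up purely for illustration. Only the W% and projection figures in the second half come from the comparison above.

```python
def pythagorean_wpct(runs_scored: float, runs_allowed: float,
                     exponent: float = 2.0) -> float:
    """Projected W% from runs scored and runs allowed.

    Assumption: classic form RS^x / (RS^x + RA^x) with x = 2.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def points_vs_projection(actual_wpct: float, projected_wpct: float) -> int:
    """Gap between actual and projected W% in 'points' (thousandths)."""
    return round((actual_wpct - projected_wpct) * 1000)


# Hypothetical inputs, not taken from any pitcher discussed here.
print(f"5.0 RS / 4.0 RA projects to {pythagorean_wpct(5.0, 4.0):.3f}")

# Actual W% vs. Pythagorean projection, as quoted in the post.
quoted = {
    "Stieb '82-'90": (0.600, 0.613),
    "Guidry '77-'85": (0.697, 0.654),
    "Seaver '67-'75": (0.636, 0.641),
}
for name, (actual, projected) in quoted.items():
    print(f"{name}: {points_vs_projection(actual, projected):+d} points")
```

A positive number means the pitcher beat his projection; a negative number means he fell short of it.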

What did we learn by comparing a pitcher's performance to his team's after making the Relativity adjustments? Well, without having finished fully computing the figures for a meaningful number of other pitchers, I think we learned that Stieb was a pretty good pitcher; Seaver, as one must have expected, was a truly great pitcher and a worthy member of the inner sanctum in the Hall of Fame; and Guidry was precisely between Stieb and Seaver. My own takeaway is that the gap between Seaver and Guidry was about what I'd expected: it's significant, because Seaver is unquestionably among the very elite in the history of baseball, and Guidry, although deserving of HOF induction in my opinion, is admittedly a marginal candidate if one focuses solely on career statistics and ignores the astounding big-game record and his degree of dominance over a decade. I think the Relativity analysis also suggests strongly that the gap between Stieb and Guidry is about as big as the gap between Guidry and Seaver. It's significant, and it belies any comparison of the two based on nothing more than ERA+.

The results for Stieb and Guidry confirm a few things and dispense with a few myths. They confirm that Guidry's improved performance in high leverage situations translated into incremental wins, and Stieb's poor performance in high leverage situations translated into incremental losses. Stieb may have had the superior ERA+, but Guidry's LevERA+ was distinctly superior, and the difference explains in part the disparity in their ability to outperform their teams. The Relativity analysis also dispenses with the myth that Guidry's outstanding career winning percentage was just a product of good run support and great teams. Guidry did indeed get good run support and pitched for good teams, but the fact remains that he outperformed his teams by a huge margin. A .600 winning percentage for a Yankee pitcher in the years '77 to '85 would be good but not that much better than the Yanks' record for those years. A .697 winning percentage, however, is spectacular even after adjusting for run support and the quality of the Yankee teams.

Based on what we've seen so far I think it's clear that elite pitchers will outperform their teams by 17% or more after adjusting for run support and the quality of the rest of the pitching staff. All time greats - and I mean pitchers among the top six or eight of all time - may outperform their team on an adjusted basis by more than 35%. And it should be clear that pitchers who outperform their team on an adjusted basis by more than 25% are no doubt Hall of Famers. If there are any doubts about that, the Relativity analyses of pitchers like Drysdale, Bunning, Sutton, Niekro, Ryan and Palmer are likely to resolve those doubts.

UPDATE: I just ran the numbers for Greg Maddux for the period '92-'02. He's an interesting case, of course, because he pitched on such great pitching teams, and so his 15% outperformance of his team's record on a straight comparison of W% could be expected to rise significantly. But - wow. Maddux shoots up to a relative performance index of 42% when adjusted for run support and pitching staff. I didn't appreciate how poor Maddux's run support was relative to team average. The Braves scored 4.86 runs/game when Maddux wasn't pitching, but only 4.41 for Greg. Maddux is five points north of Seaver's 37%. I guess no one should be surprised.
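
To put the Maddux run-support gap on the same percentage footing as the Guidry, Seaver and Stieb figures, here is a quick sketch. It assumes the gap is measured as the pitcher's runs/game divided by the team's runs/game in his absence, minus one; the function name is mine, and the exact baseline used in the Relativity adjustment may differ.

```python
def run_support_gap(pitcher_rpg: float, team_rpg: float) -> float:
    """Run support relative to the team baseline, as a percentage.

    Assumption: simple ratio minus one. Negative means the pitcher got
    fewer runs per game than the team scored for everyone else.
    """
    return (pitcher_rpg / team_rpg - 1.0) * 100.0


# Runs/game quoted in the update: 4.41 for Maddux, 4.86 otherwise.
gap = run_support_gap(4.41, 4.86)
print(f"Maddux run support vs. rest of Braves: {gap:.1f}%")
```

That works out to a shortfall of a little over 9%, far larger than the 1.2% deficit noted for Stieb.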