Has WAR Derailed the MVP Discourse?


On Wednesday, I wrote about one of my favorite topics: The impact of sabermetrics on the practice and analysis of baseball. Specifically, in this case: How MVP voters behave in the post-Fire Joe Morgan era. And for those of you who got to the end of that 2,000-word post and did not feel sated, there’s good news! This was not the question I actually set out to answer when I started kicking the topic around.

Welcome to Part 2.

The very name of the MVP award invites voters to consider the value of a certain player’s contributions. For nearly 100 years, that was a tricky proposition. How do you weigh differences in position, in playing style, park factors, hitting versus pitching versus fielding versus baserunning? It’s enough to boggle the mind.

One of the earliest and most enduring projects of the sabermetrics movement has been the pursuit of a catch-all metric that captures the entire picture. To measure every player’s production, put it in context, and spit out a single number that says he was worth X number of runs or Y number of wins. The implications of such a number are obvious for an enterprise such as identifying the most valuable player in the league in a given season.

Critics from days gone by — or critics from now, who are so behind the times they only sound like they’re from days gone by — would wield WAR as a straw man against more numerate analysts. “If you believe in WAR, why not just vote the WAR leaderboard?”

The answer is that WAR, while published to the tenth of a win, is a model built on assumptions that might not hold to that level of precision. It’s a good starting point, but it’s not the whole conversation, especially in a close race.

And you can tell that WAR is not settled science because even the sabermetrics eggheads can’t agree on what WAR is. FanGraphs, obviously, has its own WAR, and I use that because I’m a good company man. But there are dozens of MVP voters, and many thousands of people who publish analysis of baseball for public consumption. Unfortunately, some of them get their information from other sources. (But we’re working on that, I promise.)

When people talk about WAR, they’re either talking about FanGraphs WAR, Baseball Reference WAR, or WARP, from Baseball Prospectus. (The “p” at the end is for PECOTA, I think.) Each makes its own assumptions about the role of defense in pitcher evaluation, just to name one point of divergence. So while all three mainstream WARs usually arrive at similar conclusions, they don’t arrive at the same conclusion. Which is what you’d need to make “sort by WAR” a coherent award voting strategy.

I’ve compiled a list of every player since 2000 to finish in the top 10 in MVP voting, and their WAR totals that year for fWAR, bWAR, and WARP — 481 seasons in all. On only one occasion — Justin Turner in 2017 — did all three major flavors of WAR agree on a player’s value to within a tenth of a win. So I feel pretty confident that Turner was worth exactly 5.6 WAR in 2017. Everything else is open to interpretation.
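For anyone who wants to run the same check at home, a minimal Python sketch of the "within a tenth of a win" test might look like this. The function names are invented for illustration, and the rounding step is an assumption: it treats every WAR figure as a one-decimal number so floating-point noise doesn't tip a borderline case. The sample call uses Aaron Judge's 2017 figures as listed in the tables later in this piece.

```python
def war_spread(bwar, fwar, warp):
    """Gap, in wins, between the highest and lowest of the three WAR figures."""
    values = (bwar, fwar, warp)
    # Round to one decimal so float noise doesn't distort a one-decimal stat.
    return round(max(values) - min(values), 1)


def agrees_within(bwar, fwar, warp, tolerance=0.1):
    """True when all three figures sit within `tolerance` wins of each other."""
    return war_spread(bwar, fwar, warp) <= tolerance


# Aaron Judge, 2017 AL: bWAR 8.0, fWAR 8.7, WARP 8.3
print(war_spread(8.0, 8.7, 8.3))     # 0.7
print(agrees_within(8.0, 8.7, 8.3))  # False
```

Run over all 481 player seasons, a filter like this would flag only the seasons where the three sites land in essentially the same place.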

There’s another complicating factor: All three flavors of WAR get updated every so often, as wins above replacement is not some inviolable concept that was considered perfect when it was first willed to humanity by the gods. It’s a statistical model, after all, and those ought to get updated when new information emerges. So the WAR totals you see now might not be exactly the same as they were when voters were looking at the various sites’ leaderboards back in the day.

So there are different ways of answering the question of how closely WAR and the MVP vote line up. From a normative standpoint, you could look at it as the voters getting the question right, or as a check on WAR — do the numbers match what we see with our eyes, or is this math left to run amok?

From 2000 to 2023 (since we don’t have the MVP vote tallies from this year yet), there were 134 individual league leaders in WAR. That’s 24 seasons times two leagues times three WAR brands, minus the five years (2000 to 2004) for which BP doesn’t have published WAR data.
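The arithmetic behind that 134, written out as a quick Python check. The variable names are just labels for the figures in the paragraph above.

```python
seasons = 2023 - 2000 + 1      # 24 seasons, 2000 through 2023
leagues = 2                    # AL and NL
war_brands = 3                 # bWAR, fWAR, WARP
warp_missing_seasons = 5       # 2000 through 2004, per the text

possible = seasons * leagues * war_brands     # every league/brand/season slot
missing = warp_missing_seasons * leagues      # WARP leaders that don't exist
leader_slots = possible - missing

print(possible, missing, leader_slots)  # 144 10 134
```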

Out of those 134 individual player seasons, the WAR leader — for each league and WAR type — finished in the top 10 in MVP voting 120 times. Here are the exceptions.

WAR Leaders Outside the MVP Top 10

It should surprise no one that starting pitchers are heavily represented here. MVP voters hate pitchers for reasons that are not entirely clear to me. A starting pitcher can lead the league in WAR and not get the time of day from voters. Greinke, apparently, can lead the league in two different WARs and still be ignored.

On Wednesday, I dragged Max Nichols, the poor soul who cast the one dissenting MVP vote against Carl Yastrzemski in 1967. Today, I’d like to praise another singularly iconoclastic voter: Nick Piecoro of the Arizona Republic in 2018.

That year, Christian Yelich ran away with the MVP vote, winning 29 first-place votes for a season in which he missed the Triple Crown by two home runs and one RBI. He hit .326/.402/.598, which was not only a paradigm-shifting breakout for Yelich himself, but also a transformative one for his team. Yelich, who was traded from Miami to Milwaukee the previous winter, led the unfashionable Brewers to the No. 1 seed in the National League. Narratively, it was not unlike Yaz in 1967: an unbelievable individual season at the center of the league’s best team-level story.

And yet Jacob deGrom was so much better. This was the 10-9, 1.70 ERA season that spawned a million memes and ended up, somehow, going underrated by voters. The Mets ace led the league in all three WAR categories and had more than a win on Yelich in all three cases, up to 2.6 WAR over the eventual MVP according to B-Ref. And yet he only finished fifth.

Not every MVP debate has a right and wrong answer. This one did, and Piecoro was the only voter who found it.

Most of the rest of the ignored WAR leaders topped the table in WARP, which can be a bit stingy and occasionally spits out an unexpected result. Nevertheless, it’s a reminder that Soto was really good even in a down year in 2022, and McCann was hugely underrated in general.

I did want to highlight Markakis because, at the risk of being unkind, he’s the last person I expected to lead the league in anything, ever. He led the league in bWAR in 2008 by having, in some respects, a typical Markakis season: He hit around .300, with about 20 homers and 40-odd doubles, and played good defense in right. What was unusual is that he also walked a career-high 99 times, which elevated his OBP to .406, and DRS credited him with 22 runs saved above average, which is a ridiculous number. That was enough to boost a very, very good season to 7.4 bWAR (and 6.1 fWAR, so it’s not like B-Ref was an outlier here), and in a weak season that topped the AL.

This was my favorite moment in researching this piece. Finding out Markakis led the league in WAR once was like finding out Cesar Tovar was actually better than Yastrzemski in 1967. Suffice it to say, the voters did not notice. Not only did Markakis miss out on the top 10 in MVP balloting, he didn’t get a single vote from anyone.

Now that we’ve seen the WAR leaders who didn’t get much love from the MVP voters, let’s look at the other side of the equation: MVPs who didn’t lead the league in any version of WAR. Since 2000, there have been 18 such cases. (Incidentally, that list includes the past two Twins MVPs and the past three Phillies MVPs, so in cases where voters dislike WAR they apparently love sarcasm and smoked and/or cured meats.)

That group of 18 also includes both MVPs in 2000 and nine of the 20 MVPs between 2000 and 2009, which leaves nine among the 28 MVPs from 2010 to 2023. That inflection point would seem to line up with the historical proliferation of advanced stats. The Fire Joe Morgan era ended in 2008, and the early 2010s were to the anti-intellectual voter what the Hundred Days were to Napoleon. Those last stands included the Jack Morris Hall of Fame debate and the Miguel Cabrera vs. Mike Trout MVP campaigns of 2012 and 2013.

But recent history doesn’t line up with the WAR leaderboards either. From 2020 to 2022, four out of six MVPs — José Abreu, Freddie Freeman, Bryce Harper, and Paul Goldschmidt — failed to top any WAR leaderboard. Of course, all of those races were hard cases: 2020 because of the 60-game season, 2021 and 2022 because of a diffuse and pitcher-dominated WAR leaderboard.

Indeed, many of these supposedly WAR-deficient MVPs only missed out on league leadership by fractions of a win. So let’s narrow that group to MVPs who won in years where a different player led the league in at least two different WAR categories by at least a win in each case.

Miscarriages of WAR

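| Year | League | Actual MVP | bWAR | fWAR | WARP | WAR MVP | bWAR | fWAR | WARP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2002 | AL | Miguel Tejada | 5.7 | 4.5 | n/a | Alex Rodriguez | 8.8 | 10.0 | n/a |
| 2012 | AL | Miguel Cabrera | 7.1 | 7.3 | 6.5 | Mike Trout | 10.5 | 10.1 | 6.5 |
| 2013 | AL | Miguel Cabrera | 7.5 | 8.6 | 7.0 | Mike Trout | 8.9 | 10.1 | 7.5 |
| 2015 | AL | Josh Donaldson | 7.1 | 8.7 | 7.0 | Mike Trout | 9.6 | 9.3 | 8.0 |
| 2017 | AL | Jose Altuve | 7.7 | 7.7 | 5.4 | Aaron Judge | 8.0 | 8.7 | 8.3 |
| 2018 | NL | Christian Yelich | 7.3 | 7.7 | 5.9 | Jacob deGrom | 9.9 | 9.0 | 7.0 |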
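The "at least two categories, at least a win" filter can be sketched in Python against the rows in the table above. One loud assumption: the table only lists the MVP and the WAR leader, so this version measures each margin against the MVP's own total, which may not be the comparison used to build the list (the gap over the league's runner-up is another plausible reading). The function names are hypothetical, and seasons with no WARP are simply skipped for that category.

```python
# Rows copied from the table above:
# (year, league, actual MVP, MVP's WARs, WAR leader, leader's WARs).
# Each WAR tuple is (bWAR, fWAR, WARP); None marks a missing WARP figure.
ROWS = [
    (2002, "AL", "Miguel Tejada", (5.7, 4.5, None), "Alex Rodriguez", (8.8, 10.0, None)),
    (2012, "AL", "Miguel Cabrera", (7.1, 7.3, 6.5), "Mike Trout", (10.5, 10.1, 6.5)),
    (2013, "AL", "Miguel Cabrera", (7.5, 8.6, 7.0), "Mike Trout", (8.9, 10.1, 7.5)),
    (2015, "AL", "Josh Donaldson", (7.1, 8.7, 7.0), "Mike Trout", (9.6, 9.3, 8.0)),
    (2017, "AL", "Jose Altuve", (7.7, 7.7, 5.4), "Aaron Judge", (8.0, 8.7, 8.3)),
    (2018, "NL", "Christian Yelich", (7.3, 7.7, 5.9), "Jacob deGrom", (9.9, 9.0, 7.0)),
]


def categories_won_by_a_win(mvp_wars, leader_wars, margin=1.0):
    """Count WAR flavors where the leader beats the MVP by at least `margin`."""
    count = 0
    for mvp, leader in zip(mvp_wars, leader_wars):
        if mvp is None or leader is None:
            continue  # no data for this flavor of WAR in this season
        # Round to one decimal so 8.7 - 7.7 counts as a full win.
        if round(leader - mvp, 1) >= margin:
            count += 1
    return count


def is_miscarriage(mvp_wars, leader_wars):
    """Apply the two-category threshold described in the text."""
    return categories_won_by_a_win(mvp_wars, leader_wars) >= 2


for year, league, mvp, mvp_wars, leader, leader_wars in ROWS:
    n = categories_won_by_a_win(mvp_wars, leader_wars)
    print(f"{year} {league}: {leader} over {mvp} in {n} categories")
```

Under that reading, all six rows clear the bar, and deGrom in 2018 is the only one who does it in all three categories.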

This list doesn’t include cases like the 2000 AL race or both races in 2006, where it’s clear that a purely WAR-based voter would not have given the MVP to the player who won in real life, but the top of the leaderboard is close enough that it was less clear which player should’ve won. (I had forgotten, for instance, how good Jason Giambi was in 2001 and Carlos Beltrán was in 2006.)

Every one of these cases has a strong narrative basis. Cabrera won the Triple Crown in 2012. Donaldson, Altuve, and Yelich turned up-and-coming teams into juggernauts. Nobody respects pitchers or A-Rod, and for some reason Trout became a proto-culture war volleyball in the waning days of the sport’s battle over empirics.

From 2000 to 2004, FanGraphs and Baseball Reference WAR had the same leader nine times out of 10 chances. That might have something to do with methodological convergence, but it probably has more to do with Barry Bonds and Alex Rodriguez just dragging everyone else in the league during that period. The voters — most of whom had probably never heard of WAR at the time — gave the consensus WAR leader the MVP five times out of those nine chances, and voted him second on two other occasions.

In the 19 seasons that followed, one player led the league in all three WARs on 15 occasions. The voters agreed 10 times.

WAR Triple Crown Winners, 2005-Present

| Year | League | Name | bWAR | fWAR | WARP | MVP |
| --- | --- | --- | --- | --- | --- | --- |
| 2007 | AL | Alex Rodriguez | 9.4 | 9.6 | 6.3 | Rodriguez |
| 2008 | NL | Albert Pujols | 9.2 | 8.7 | 9.2 | Pujols |
| 2009 | NL | Albert Pujols | 9.7 | 8.4 | 10.4 | Pujols |
| 2012 | NL | Buster Posey | 7.6 | 9.8 | 8.0 | Posey |
| 2013 | AL | Mike Trout | 8.9 | 10.1 | 7.5 | Miguel Cabrera |
| 2015 | NL | Bryce Harper | 9.7 | 9.3 | 8.0 | Harper |
| 2015 | AL | Mike Trout | 9.6 | 9.3 | 8.0 | Josh Donaldson |
| 2016 | AL | Mike Trout | 10.5 | 8.7 | 8.9 | Trout |
| 2017 | AL | Aaron Judge | 8.0 | 8.7 | 8.3 | Jose Altuve |
| 2018 | NL | Jacob deGrom | 9.9 | 9.0 | 7.0 | Christian Yelich |
| 2019 | NL | Cody Bellinger | 8.6 | 7.9 | 6.9 | Bellinger |
| 2020 | AL | Shane Bieber | 3.2 | 3.1 | 2.5 | José Abreu |
| 2021 | AL | Shohei Ohtani | 8.9 | 8.0 | 10.2 | Ohtani |
| 2022 | AL | Aaron Judge | 10.5 | 11.1 | 10.0 | Judge |
| 2023 | AL | Shohei Ohtani | 9.9 | 8.9 | 9.3 | Ohtani |

In three of the five cases where the voters disagreed (Trout in 2013 and 2015 and Judge in 2017), the consensus WAR leader came second. The other two involved pitchers, deGrom in 2018 and Shane Bieber in 2020, and while neither won the MVP, they are the two most recent full-time pitchers to finish in the top five.
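The 15-and-10 tally can be reproduced from the WAR Triple Crown table with a few lines of Python. This is only a sketch: the list below retypes the table's leader and MVP columns with full names filled in, and the variable names are my own.

```python
# (year, league, consensus WAR leader, actual MVP), from the table above.
TRIPLE_CROWNS = [
    (2007, "AL", "Alex Rodriguez", "Alex Rodriguez"),
    (2008, "NL", "Albert Pujols", "Albert Pujols"),
    (2009, "NL", "Albert Pujols", "Albert Pujols"),
    (2012, "NL", "Buster Posey", "Buster Posey"),
    (2013, "AL", "Mike Trout", "Miguel Cabrera"),
    (2015, "NL", "Bryce Harper", "Bryce Harper"),
    (2015, "AL", "Mike Trout", "Josh Donaldson"),
    (2016, "AL", "Mike Trout", "Mike Trout"),
    (2017, "AL", "Aaron Judge", "Jose Altuve"),
    (2018, "NL", "Jacob deGrom", "Christian Yelich"),
    (2019, "NL", "Cody Bellinger", "Cody Bellinger"),
    (2020, "AL", "Shane Bieber", "José Abreu"),
    (2021, "AL", "Shohei Ohtani", "Shohei Ohtani"),
    (2022, "AL", "Aaron Judge", "Aaron Judge"),
    (2023, "AL", "Shohei Ohtani", "Shohei Ohtani"),
]

# Voters "agreed" when the consensus WAR leader also won the award.
agreed = [row for row in TRIPLE_CROWNS if row[2] == row[3]]
disagreed = [row for row in TRIPLE_CROWNS if row[2] != row[3]]

print(len(TRIPLE_CROWNS), len(agreed), len(disagreed))  # 15 10 5
for year, league, leader, mvp in disagreed:
    print(f"{year} {league}: {leader} led all three WARs, {mvp} won MVP")
```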

Has WAR turned MVP voting into a leaderboard-reading exercise, as critics predicted? Not really. Thanks to the three publishers’ subtle differences in methodology, I’m not sure that it can. When there is consensus, the voters usually follow suit (again, unless there’s a pitcher involved). But that was basically the case 20 years ago, when on-base percentage was state of the art.

If MVP voting has become predictable in the 2020s, and if it’s the result of pressures and innovations brought on by sabermetrics and online media, I don’t think it’s because everyone is just blindly following WAR. We’re all getting better information now, and people with clubhouse access and award votes are better equipped to use that information. Whether that leads to conformity is of secondary importance to the quality of the analysis.

Even in the 1960s, there was a risk of being brigaded for expressing an unpopular opinion; just ask the ill-fated Max Nichols. And fear of the dogpile has only grown since then. It’s an incentive not to disagree with the consensus, sure, but it’s an even bigger incentive to get your facts straight. Surely nobody would place a league-average utilityman over an 11-WAR slugger on an MVP ballot in 2024. (With that said, nothing would make me happier than finding out, in a week’s time, that someone voted for Matt Vierling over Judge for AL MVP. I will throw an actual party if that happens, and you’re all invited.)

So is it that voters no longer have the courage of their convictions, or is it just that they have better convictions these days?
