My day job is in the world of data science, where I build evaluation and prediction models. All of these models are built on certain assumptions that I hold to be true, and although I update the models over time, my views on these base assumptions rarely change. However, it is good practice to revisit these assumptions every now and then to make sure that they still hold in light of any new knowledge you may have.
In baseball there are certain advanced stats that I hold to be correct even though I never built them from first principles myself. This article is the first in a series in which I check through the assumptions of Wins Above Replacement (WAR) for MLB.
First, a quick sabermetrics history lesson. In 2008, David Appelman gave the world an early Christmas present when “win values” were added to the pages of FanGraphs on Christmas Eve, though the stat wasn’t labelled WAR for a little while longer. In 2010, thanks to the efforts of Sean Forman and Sean Smith, Baseball-Reference added it to their collection of statistics.
As many of you will know, these models aren’t exactly the same, but if you weren’t following these developments at the time you may not have seen, or heard of, the biggest historical difference: the replacement level. The two systems had chosen different values for replacement level, which is rather key given that WAR is defined as wins above replacement.
Replacement level is a concept that sets a baseline for expected performance: that of a team that invested the absolute minimum in putting together a Major League roster. This replacement level team would have no farm system, so it wouldn’t have to spend money on draft picks, coaches, equipment, or facilities. It would build out its roster entirely from league minimum players.
FanGraphs’ historic replacement level was a .265 winning percentage team, which meant that it had 1,142 wins to distribute to players each season (in a 162 game, 30 team season). Baseball-Reference’s historic replacement level was a .320 team, which gave it 875 wins. That is not an insignificant difference. Within an individual season the gap doesn’t really change how important a player is: it might add or remove one win from everyone, but players would be ranked the same.
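The win totals above follow from simple arithmetic: a 30 team, 162 game season has 2,430 actual wins, and whatever a league of replacement level teams would win is subtracted from that. A minimal sketch, where the function name `war_pool` is my own label for illustration and not something either site uses:

```python
def war_pool(replacement_win_pct, teams=30, games_per_team=162):
    """Wins above replacement available league-wide in one season."""
    team_games = teams * games_per_team      # 4,860 team-games
    actual_wins = team_games / 2             # one winner per game: 2,430
    replacement_wins = replacement_win_pct * team_games
    return actual_wins - replacement_wins

print(round(war_pool(0.265)))  # FanGraphs' historic level -> 1142
print(round(war_pool(0.320)))  # Baseball-Reference's historic level -> 875
```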
What it did impact, however, is how much WAR a player accumulated over their career. If we brought the replacement level down by the equivalent of 1 win per season, a player who played 22 full seasons would suddenly gain an additional 22 wins, but one who played 15 would gain just 15. If the two were on the same career WAR before, there is now a 7 win difference, which is quite significant and could change our overall evaluation of the two players and impact things like Hall of Fame voting.
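To make the career effect concrete, here is a small sketch. The starting career total of 60 WAR is a made-up number purely for illustration, and the function name is hypothetical:

```python
def career_war_after_shift(career_war, full_seasons, shift_per_season=1.0):
    """Career WAR after lowering replacement level by a fixed amount per full season."""
    return career_war + shift_per_season * full_seasons

# Two hypothetical players with identical career WAR before the shift.
long_career = career_war_after_shift(60.0, full_seasons=22)
short_career = career_war_after_shift(60.0, full_seasons=15)

print(long_career, short_career)   # 82.0 75.0
print(long_career - short_career)  # 7.0
```

The gap depends only on the difference in seasons played, not on the starting total.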
These potential differences in lifetime WAR were noted by David Appelman and Sean Forman, and in 2013 they chose to measure players on the same scale. The new unified replacement level, which is the same one we have today, was set at 1,000 WAR per 2,430 Major League games (the current number of games per season). An easier way to put it is that the new replacement level is equal to a .294 winning percentage, which works out to 47.7 wins over a full season.
That number was almost exactly halfway between FanGraphs’ previous value and Baseball-Reference’s, though it wasn’t chosen solely as an equal compromise. In Tom Tango’s original methodology post back in 2008, the model he laid out used a replacement level equal to 1,009 wins, or a .292 winning percentage.
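Going from a WAR total to a winning percentage is the reverse calculation: the wins not handed out as WAR belong to the replacement level teams. A rough sketch, with a function name that is my own choice for illustration:

```python
def replacement_win_pct(war_total, league_games=2430):
    """Winning percentage of a replacement level team implied by a league-wide WAR total."""
    team_games = 2 * league_games            # each game involves two teams
    replacement_wins = league_games - war_total
    return replacement_wins / team_games

pct = replacement_win_pct(1000)
print(round(pct, 3))        # 0.294
print(round(pct * 162, 1))  # 47.7 wins over a 162 game season
```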
Since 2013, this value of replacement has been kept. I wanted to check whether it still holds true in today’s game and, if it doesn’t, what we could or should do with replacement level and its impact on WAR.
To do that, the question of how we measure replacement level has to be answered. Thankfully, Tango’s original piece gave us a definition of replacement we can work from: it’s the talent level for which you would pay the minimum salary on the open market, or which you can obtain at minimal cost in a trade.
So Minor League free agents who go on to play in the Majors are the epitome of a replacement level player. They generally get paid the league minimum, though there are some who have performance related bonuses (especially pitchers). If a Major League team was attempting to build out a roster of league minimum players, these are exactly the kinds of talents they’d be picking from. However, they could also supplement those players with waiver claims.
One way to check the current replacement level is to look at the players who have been signed by these methods and see how they performed. This is an updated look at a Dave Cameron piece from 2013.
Using the MLBTradeRumors transaction tracker, I got a list of every player who had signed a Minor League contract or been claimed off waivers between the end of last season and the end of March 2019. I then filtered for players who had either pitched 30 IP or had 100 PA over the two previous seasons (2017 & 2018).
The reason for using the two previous seasons and not just the last one is to combat selection bias. The fact that we’re identifying players who were forced to sign Minor League deals means that we’re starting with a group that likely underachieved last year. Therefore we don’t want to focus only on what these players did last year. We don’t want to go back too far, of course, as many of these guys aren’t what they used to be, so I settled on two years. That gives us the following results.
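The filtering step can be sketched as below. This is not the code used for the article: the `Signing` record, its field names, and the sample rows are all hypothetical, and I am assuming the thresholds are inclusive ("at least" 100 PA or 30 IP).

```python
from dataclasses import dataclass

MIN_PA = 100    # plate appearances threshold from the text
MIN_IP = 30.0   # innings pitched threshold from the text

@dataclass
class Signing:
    name: str       # placeholder names, not real players
    pa_2yr: int     # plate appearances, 2017 and 2018 combined
    ip_2yr: float   # innings pitched, 2017 and 2018 combined

def split_sample(signings):
    """Return (hitters, pitchers) who meet the two-year playing-time thresholds."""
    hitters = [s for s in signings if s.pa_2yr >= MIN_PA]
    pitchers = [s for s in signings if s.ip_2yr >= MIN_IP]
    return hitters, pitchers

signings = [
    Signing("Hitter A", 412, 0.0),
    Signing("Hitter B", 58, 0.0),    # too few PA, dropped
    Signing("Pitcher C", 4, 71.0),
    Signing("Pitcher D", 0, 12.0),   # too few IP, dropped
]
hitters, pitchers = split_sample(signings)
print([s.name for s in hitters])    # ['Hitter A']
print([s.name for s in pitchers])   # ['Pitcher C']
```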
There were 90 players who matched the 100 PA over 2017-2018 criterion, and they combined for a grand total of 2.2 WAR. Given that this came from over 35k plate appearances, scaling it to a per 600 plate appearances rate gives 0.04 WAR per full season. In reality 0.04 WAR is equivalent to 0, suggesting that the players teams pick up via these methods are, on average, at replacement level.
This matches Cameron’s findings for 2013 players, but this time we have a much larger sample size (90 compared to 24), which suggests that players of this level are moved about more these days than before.
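The per 600 PA figure is a straightforward rate conversion. In the sketch below I use exactly 35,000 PA as a stand-in for "over 35k", so treat the input as an approximation; the helper name is hypothetical.

```python
def war_per_full_season(total_war, total_volume, full_season_volume):
    """Scale a WAR total to a full-season rate (e.g. per 600 PA or per 180 IP)."""
    return total_war / total_volume * full_season_volume

# 2.2 WAR over roughly 35,000 PA (approximate input), scaled to 600 PA.
hitter_rate = war_per_full_season(2.2, 35_000, 600)
print(round(hitter_rate, 2))  # 0.04
```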
But before we move on to the pitchers, I wanted to quickly show you how these 90 players have done in 2019.
Just over half (46 of 90) of these players have played a game in 2019 (as of September 2nd), and they currently have a pro rata WAR of just over 1 win per 600 PA. What is really interesting here is that players who had negative WAR over 2017-2018 combined have performed much better than those who had positive WAR, but fewer of them managed to play a game in 2019.
This would seem to suggest that the players who hadn’t performed that well over the last two seasons have been given a chance to shine; not all make it, but those who have made it have done quite well. The players who had done OK in the past two seasons, by contrast, have continued their decline. Think of players such as Gio Urshela, Tom Murphy & Hunter Pence compared to Curtis Granderson, Carlos Gonzalez & Lucas Duda.
The names high on the negative to positive WAR side are mostly swing changers, so here we may be seeing some impact of the player development revolution highlighted in Ben Lindbergh and Travis Sawchik’s The MVP Machine.
71 pitchers met the criteria and produced 18.6 WAR across the 2017 & 2018 seasons. If we scale that to a per 180 innings pitched rate, that’s 0.53 WAR per full season. This is quite a bit above the expected replacement level, so I looked at the players to see if there was an issue.
Straight away I spotted an outlier that needed to be removed: Gio González. Although he did sign a Minor League deal (with incentives) with the Yankees before the start of the season, he opted out of it when they didn’t promote him to the Majors and got a $2m deal with the Brewers. In my book that means he isn’t the type of replacement level player we are looking for here.
This exercise is a little trickier for pitchers, since so many Minor League deals for starters include incentives that push them well over the league minimum. Ervin Santana had a Minor League deal with a base salary of $4.3 million if he made the roster, which he did.
After removing these 2 and a few others (a cursory glance didn’t suggest many more with such deals), the picture looked a bit different.
Removing these players brought the estimated replacement level down to 0.19 WAR/180 IP. That is much closer to replacement, and given that these players averaged just under 40 IP across the 2 year window, adjusting to that amount of pitching gives 0.04 WAR/40 IP.
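Converting the per 180 IP rate to the workload these pitchers actually averaged is just a change of denominator. A quick sketch, with a hypothetical helper name:

```python
def rescale_rate(rate, from_volume, to_volume):
    """Re-express a rate quoted per `from_volume` as a rate per `to_volume`."""
    return rate * to_volume / from_volume

# 0.19 WAR per 180 IP, re-expressed per 40 IP.
per_40_ip = rescale_rate(0.19, from_volume=180, to_volume=40)
print(round(per_40_ip, 2))  # 0.04
```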
Once again these roughly replacement level players have come out very close to replacement level. Let’s look at how they have done in 2019.
Just under half (32 of 66) of these players have played in 2019, and overall their performance has slightly increased. As with hitters, more of the pitchers who had positive WAR over the previous 2 years have pitched in 2019, and they have performed slightly worse this season, but unlike hitters these players have outperformed their negative WAR counterparts. The main improvement has come from the previously bad pitchers being less bad, but not actually good.
This all suggests that the theoretical replacement level is quite close to the actual MLB level. That being said, if teams are using a WAR model with the current replacement level to help them decide who to cut and who to acquire, then we might expect to get this answer, as we may be having a chicken and egg moment.
So this may not be the true replacement level but it does seem to be the replacement level that MLB teams are using. And because of that it still seems reasonable to me that this can be used as the base of WAR going forward.
The next piece in this series will be looking at the split of the wins between pitchers and hitters.