Chasing Probabilities and The Postseason Crapshoot

The end of the regular season is when I check the playoff odds every day to see how much they have changed. As I said about WAR in my previous pieces, these are values that I generally treat as good and don’t question. They are not perfect, but they are accurate enough that we don’t need to find a baseball equivalent of Paul the Octopus.

In the spirit of my current endeavours, let’s have a look at a couple of projection systems and see how good they are. FanGraphs and FiveThirtyEight are two sites which not only produce playoff odds but also publish the win probabilities for each game that make up these projections. Do the teams they project to win 60% of the time actually win 60% of the time?

With that in mind, I took the win probabilities for all games from 2014 onwards (as far back as FanGraphs goes) and grouped them into 1% buckets (anything from 48% up to, but not including, 49% goes in the 48% bucket). Then I looked at all the teams in each bucket to see what their actual win percentage was.
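For anyone who wants to try the bucketing at home, here is a minimal sketch in Python. It is not the code used for this analysis: the function name `calibration_table` and the handful of sample games are made up purely for illustration.

```python
import math
from collections import defaultdict

def calibration_table(games):
    """Group (win_probability, won) pairs into 1% buckets.

    `games` is an iterable of (probability, won) pairs, with probability
    between 0 and 1 and won as True/False. Returns a dict mapping each
    bucket (48 means 48% up to but not including 49%) to a tuple of
    (number_of_games, actual_win_percentage).
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [games, wins]
    for prob, won in games:
        # The tiny epsilon stops values such as 0.29 * 100 = 28.999...
        # from dropping into the bucket below because of float rounding.
        bucket = math.floor(prob * 100 + 1e-9)
        counts[bucket][0] += 1
        counts[bucket][1] += int(won)
    return {b: (n, 100.0 * w / n) for b, (n, w) in sorted(counts.items())}

# Invented sample data: three games in the 48% bucket, two in the 60% bucket.
sample = [(0.485, True), (0.481, False), (0.489, True),
          (0.602, True), (0.609, False)]
table = calibration_table(sample)
for bucket, (n, pct) in table.items():
    print(f"{bucket}% bucket: {n} games, won {pct:.1f}%")
```

A well-calibrated model would show an actual win percentage close to the bucket value once each bucket holds enough games.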

As you can see from the graph above, both sites have pretty good prediction models: the actual win percentages sit very close to the projected probabilities. The values at the high and low ends are affected by low volume, so they should be ignored. So with both models, if they think something will happen 60% of the time, it happens about 60% of the time.

While this analysis suggests that both models are very good at doing their job, there are some small discrepancies. The biggest is the FanGraphs 48% bucket, where teams won 52.6% of the time.

Over the six-year period, these models predicted the correct winner (i.e. the team with the higher win probability won) 57.16% (FiveThirtyEight) and 57.02% (FanGraphs) of the time. Compare that to the simplest of models, where we assume the home team wins every game: that model would be correct 53.38% of the time.
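The comparison between a model's picks and the always-pick-the-home-team baseline could be sketched as below. The function name `pick_accuracy`, the input layout and the sample games are assumptions for illustration, and counting a 50/50 game as a home pick is an arbitrary tie-break of mine.

```python
def pick_accuracy(games):
    """Return (model_accuracy, home_baseline_accuracy) as fractions.

    `games` is a list of (home_win_probability, home_won) pairs.
    The model's pick is whichever side has the higher probability;
    an exact 50/50 game is counted as a home pick (arbitrary choice).
    """
    model_hits = 0
    home_hits = 0
    for home_prob, home_won in games:
        picked_home = home_prob >= 0.5
        model_hits += int(picked_home == bool(home_won))
        home_hits += int(bool(home_won))
    n = len(games)
    return model_hits / n, home_hits / n

# Invented sample: the model picks three of five correctly,
# while the home team wins only two of the five.
sample = [(0.60, True), (0.45, False), (0.55, False),
          (0.40, True), (0.35, False)]
model_acc, home_acc = pick_accuracy(sample)
print(f"model: {model_acc:.1%}, home baseline: {home_acc:.1%}")
```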

The models have gotten better over the six-year period, though, getting closer to 60% in 2019. This shows that while these models are quite good, baseball is inherently hard to predict.

Even the best hitters in baseball fail to get on base more often than they succeed; Ted Williams’ OBP is .482. This is one of many reasons why we don’t see teams going 162-0. If you look at the graph for the models, you can see that, for both of them, the teams with the highest predicted win percentage lost their games. In fact, the top two for each lost.

FanGraphs had the Indians at an 84.2% win probability against the White Sox for their matchup on 30th September 2017. On the mound it was Corey Kluber versus Carson Fulmer, and the Sox went on to win 2-1. This was 10 days after the Sox beat the Astros 3-1 in the game with the previous highest odds FanGraphs had given, when the Astros had an 83.9% win probability. That time it was Fulmer versus Dallas Keuchel.

FiveThirtyEight had the Red Sox at 81.2% against the Orioles, with Chris Sale on the mound versus Jimmy Yacabonis; Baltimore won 10-3. They gave their second-highest win probability to Houston in the game against Detroit on August 22nd this season, now infamous among gamblers. Justin Verlander pitched the distance, allowing just two hits, both of them home runs, but Tigers pitching, led by starter Daniel Norris, held the Astros to just 1 run.

The models correlate well with each other (the r-squared is 0.74), but they do disagree on 16% of games. Of this 16%, FiveThirtyEight was correct 50.2% of the time and FanGraphs 49.8%, so there is not much difference there. If we only look at games where they agree, the predicted result occurs 58.43% of the time overall and 61.52% of the time in 2019.

So, how do these game odds get turned into playoff probabilities? The sites use their prediction models to get win probabilities for all the games in a season, then run tens of thousands of simulations of all the remaining games to get the odds of each team making the playoffs.
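To make the simulation idea concrete, here is a heavily simplified sketch. It is not how FanGraphs or FiveThirtyEight actually implement their odds: the function name `simulate_playoff_odds`, the teams "A", "B" and "C", and the rule that the top few teams by wins qualify (no divisions, no wild cards, ties broken at random) are all assumptions for illustration.

```python
import random

def simulate_playoff_odds(current_wins, remaining_games, spots,
                          runs=10_000, seed=0):
    """Estimate each team's chance of finishing in the top `spots` by wins.

    current_wins: dict of team -> wins so far
    remaining_games: list of (home, away, home_win_probability)
    Simplifications of this sketch: no divisions or wild cards, and
    ties in the standings are broken at random.
    """
    rng = random.Random(seed)
    made_it = {team: 0 for team in current_wins}
    for _ in range(runs):
        wins = dict(current_wins)
        # Play out every remaining game once using its win probability.
        for home, away, home_prob in remaining_games:
            winner = home if rng.random() < home_prob else away
            wins[winner] += 1
        # Rank by wins, with a random number as the tie-break.
        ranking = sorted(wins, key=lambda t: (wins[t], rng.random()),
                         reverse=True)
        for team in ranking[:spots]:
            made_it[team] += 1
    return {team: made_it[team] / runs for team in current_wins}

# Hypothetical three-team race for two spots with three games left.
standings = {"A": 95, "B": 94, "C": 93}
schedule = [("A", "B", 0.55), ("C", "A", 0.48), ("B", "C", 0.52)]
odds = simulate_playoff_odds(standings, schedule, spots=2, runs=2000)
print(odds)
```

With only a few games left, each simulated result moves the standings a lot, which is why the estimated odds become so sensitive late in the season.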

When we get to the last few weeks of the season, these playoff odds can jump quite significantly, as the impact of a single result is much greater. The graph below shows the playoff odds for Cleveland, Tampa Bay and Oakland throughout this season; you can see that the lines swing much more violently at the end of the season.

What about when we get to the postseason? As I said before, predicting baseball is hard and when MLB gets to the postseason you will hear a lot of us call it a crapshoot.

For all but one game in the last 5 postseasons, FanGraphs and FiveThirtyEight have had the win probability of the favoured team between 50% and 70%. In a one-game wild card, it is easy to see that even the best team has only a 70% chance of winning. But how would those odds change over 5 or 7 games?

If we make the simplifying assumption that a team has the same chance of winning every game of a series (no adjustments for pitchers or stadium), you end up with the following table.

So if a team was the most heavily favoured team in recent playoff history and was projected to win 70% of the time (in a 1 game series), they would win a 5 game series 83.7% of the time and a 7 game series 87.4% of the time. That is roughly 5 out of 6 times and 7 out of 8 times. But remember that a team has to win 3 separate series to win the World Series (if we exclude the wild card) so this team would project to win it all 63.9% of the time (83.7% * 87.4% * 87.4%).
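The series figures above follow from a standard best-of-N calculation, sketched here in Python under the same assumption of a constant, independent per-game win probability. The function name `series_win_probability` is my own label for it.

```python
from math import comb

def series_win_probability(p, games):
    """Chance of winning a best-of-`games` series given per-game chance p.

    Assumes every game is independent with the same p (no adjustment
    for pitchers or home field).
    """
    needed = games // 2 + 1  # wins required, e.g. 3 in a best-of-5
    q = 1.0 - p
    # The series winner always wins the final game played. Before that
    # game they have needed-1 wins and k losses, for k = 0 .. needed-1.
    return sum(comb(needed - 1 + k, k) * p**needed * q**k
               for k in range(needed))

for p in (0.55, 0.60, 0.65, 0.70):
    print(f"{p:.0%} per game -> "
          f"best-of-5: {series_win_probability(p, 5):.1%}, "
          f"best-of-7: {series_win_probability(p, 7):.1%}")

# One 5-game series followed by two 7-game series at 70% per game.
title_odds = (series_win_probability(0.70, 5)
              * series_win_probability(0.70, 7) ** 2)
print(f"win it all: {title_odds:.1%}")
```

At 70% per game this gives the 83.7%, 87.4% and 63.9% quoted above.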

Teams aren’t regularly that heavily favoured in the postseason. Before the start of the playoffs, both sites gave Houston the best World Series odds, with FanGraphs at 32.5% and FiveThirtyEight at 25%. Simplistically, this means that FanGraphs has Houston at a 59% win probability for each game and FiveThirtyEight at 56%.
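One way to back out a per-game figure from World Series odds is to search for the constant per-game probability that produces those odds across one 5-game and two 7-game series. The sketch below does this by bisection. It is an illustration under that simplified model, and the function names are my own.

```python
from math import comb

def series_win_probability(p, games):
    """Best-of-`games` win chance at a constant per-game probability p."""
    needed = games // 2 + 1
    return sum(comb(needed - 1 + k, k) * p**needed * (1 - p)**k
               for k in range(needed))

def title_probability(p):
    """Win one best-of-5 and then two best-of-7 series at per-game p."""
    return series_win_probability(p, 5) * series_win_probability(p, 7) ** 2

def per_game_probability(title_odds, tolerance=1e-6):
    """Find the per-game p whose title probability equals title_odds.

    Uses bisection, which works because title_probability rises
    steadily as p goes from 0 to 1.
    """
    low, high = 0.0, 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if title_probability(mid) < title_odds:
            low = mid
        else:
            high = mid
    return (low + high) / 2

for site, odds in (("FanGraphs", 0.325), ("FiveThirtyEight", 0.25)):
    print(f"{site}: {per_game_probability(odds):.0%} per game")
```

Under these assumptions the search lands on roughly 59% and 56% per game, in line with the figures above.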

This means that Houston would probably win its 5-game divisional series 2 out of 3 times. With Houston being the most favoured team and there being 3 other divisional series, we should expect one of the four to be an upset, with the unfavoured team winning. Last season was the first time since 2009 that all 4 higher-seeded teams won their series, and the time before that was 1995.
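A quick back-of-the-envelope check on the upset claim: if each of the four favourites wins its series at most 2 times out of 3, and the series are independent, the arithmetic looks like this. The 2/3 figure for the other three series is an assumed upper bound taken from Houston's number, not a published probability, and the function name is my own.

```python
from math import prod

def upset_summary(favourite_probs):
    """Return (chance every favourite wins, expected number of upsets).

    Assumes the series are independent of one another.
    """
    all_favourites = prod(favourite_probs)
    expected_upsets = sum(1 - p for p in favourite_probs)
    return all_favourites, expected_upsets

# Assumed: all four favourites win their series 2 times out of 3.
chalk, upsets = upset_summary([2 / 3] * 4)
print(f"all four favourites advance: {chalk:.1%}")
print(f"expected upsets: {upsets:.2f}")
```

Even with every favourite as strong as Houston, all four would advance only about one year in five, with a little over one upset expected per round.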

Odds and historical performance tell us that we should expect at least one of the top 4 teams not to make it to the championship series. Who did you think wouldn’t make it this time: the Yankees, the Astros, the Dodgers or the Braves?  (Ed – We’re late getting to this post, but yes, we have already lost two!)

These models give fans a great insight into the potential outcomes of the season but in the end there is only one season that matters. The one that actually happens.
