In the companion piece to this, I went through how I created a couple of metrics to show the cluster luck that can happen in a game. Armed with Expected Runs (xRuns) and Run Luck (RL), we could see how lucky teams got in a game and across the season.
But there is an additional source of luck/randomness to account for, which will take xRuns to the next level. In the previous piece, we treated all singles as equal, all doubles as equal, etc. But as the saying goes, ‘All singles are equal but some singles are more equal than others’.
I am talking about ‘xwOBAcon’ or, in maybe simpler English, expected weighted on-base average on contact. Thanks to MLB Statcast and all the wonderful people behind it, there is data that we can now use to estimate the value of a batted ball on the wOBA scale (a home run being about two and an out being zero) using the exit velocity and launch angle of a batted ball.
The diagrams above show all the batted balls from 2018 on the left, and on the right, you have the expected model which they built using exit velocity, launch angle and player speed (for weakly hit balls and grounders). On a simplistic level, for flyballs and liners, it looks at the exit velocity and launch angle of a batted ball, and averages the wOBA values of approximately 400 of the most similar balls in play to predict wOBAcon. You can read more details on xwOBAcon here.
Now, imagine if we did something similar but, instead of averaging out the wOBA value, we averaged out the run expectancy. So, I built a model using the 2020 data. I reduced my comparisons to 100 events, because the 2020 season had less data than an average season, and that got me the following:
With this model, I can now get the expected runs of any ball in play, based on the exit velocity and launch angle. I was lazy here and didn’t include the player-speed part for grounders but the overall model is still good without it.
For example, a ball hit at 110 mph with a 20-degree launch angle has an expected runs value of 1.13, a bit shy of the full home run value (1.38). But one hit at a -11-degree launch angle has an expected runs value of -0.06, suggesting that an out is more likely than a hit.
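To make the nearest-neighbour idea concrete, here is a minimal Python sketch rather than the actual model code. The function name `expected_runs`, its arguments, and the decision to scale exit velocity and launch angle by their standard deviations before measuring distance are all assumptions for illustration. Only the idea of averaging the run values of the 100 most similar batted balls comes from the description above.

```python
import numpy as np

def expected_runs(ev, la, past_ev, past_la, past_run_value, k=100):
    """Average run value of the k past batted balls most similar to (ev, la).

    ev, la          -- exit velocity (mph) and launch angle (degrees) to evaluate
    past_ev/past_la -- the same measurements for previously observed batted balls
    past_run_value  -- run value credited to each of those batted balls
    """
    past_ev = np.asarray(past_ev, dtype=float)
    past_la = np.asarray(past_la, dtype=float)
    past_run_value = np.asarray(past_run_value, dtype=float)

    # Assumption: put mph and degrees on a comparable footing by dividing
    # each axis by its spread before computing a straight-line distance.
    d_ev = (past_ev - ev) / past_ev.std()
    d_la = (past_la - la) / past_la.std()
    dist = np.hypot(d_ev, d_la)

    # Indices of the k closest batted balls (k is capped at the sample size).
    k = min(k, dist.size)
    nearest = np.argpartition(dist, k - 1)[:k]
    return past_run_value[nearest].mean()
```

Called with a season of Statcast-style data, `expected_runs(110, 20, ...)` would return one number per batted ball. The value it gives depends entirely on the data and the distance scaling, so this sketch will not reproduce the 1.13 and -0.06 quoted above.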
With this, we then do the same as we did in the last piece but change the run expectancies for batted ball outcomes. Linear weights based on the outcomes we had last time are replaced with these new run expectancies based on the exit velocity and launch angle. This gives us new xRuns and RL for every game.
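As a rough illustration of that swap, the sketch below totals a team's event values for one game, using the batted-ball model where exit velocity and launch angle are available and the linear weight otherwise. All names here (`team_xruns`, `run_luck`, the event dictionary keys and the `baseline` argument) are hypothetical. How the per-event values are turned into a game total follows the previous piece and is not reproduced here, so `baseline` is only a placeholder for any constant that method might add.

```python
def team_xruns(events, batted_ball_value, linear_weights, baseline=0.0):
    """Sum expected run values for one team's plate appearances in a game.

    events            -- list of dicts; batted balls carry 'exit_velocity' and
                         'launch_angle', other events carry an 'outcome' key
    batted_ball_value -- function (exit_velocity, launch_angle) -> run value
    linear_weights    -- dict mapping non-batted outcomes (e.g. 'walk') to run values
    baseline          -- hypothetical constant added to the total (assumption)
    """
    total = baseline
    for event in events:
        if event.get("exit_velocity") is not None:
            # Batted ball: value comes from exit velocity and launch angle.
            total += batted_ball_value(event["exit_velocity"], event["launch_angle"])
        else:
            # Walks, strikeouts and so on keep their outcome-based weight.
            total += linear_weights[event["outcome"]]
    return total

def run_luck(actual_runs, xruns):
    """Run Luck: runs actually scored minus expected runs."""
    return actual_runs - xruns
```

A positive `run_luck` means a team scored more than its batted balls and other events suggested, and a negative one means it scored less.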
Now let us look at how well teams did using these new run expectancies for batted ball outcomes. Same caveat as before: this does not include stolen bases, caught stealing, passed balls and wild pitches, which would move the expected runs slightly.
Expected to Win
In the 951 games that were played last year, the team with the higher xRuns won 758 of them, or 80% of the time. This is slightly down on the 84% from the last model, which was probably to be expected given that an actual hit has more impact on winning a game than a probable hit.
In the games where the xRuns leader did not win, the Run Luck difference between the two teams was 3.38 runs, with the winner getting on average 1.53 more runs than expected and the loser getting 1.85 runs less than expected.
As with the previous model, the likelihood that a team wins increases the further they are ahead by xRuns. If we compare that to the linear model (graph below), you can see that the win rate is very similar for 0-2 xRuns differences but is lower in the new model for teams in the 2-5 xRuns difference territory.
You might have noticed the dip in win percentage from 100% for teams who were 8-9 xRuns ahead. That comes from a game I talked about in the last piece, when the Blue Jays lost 14-11 to the Marlins.
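For anyone wanting to tabulate win rate by xRuns lead themselves, a minimal sketch could look like this. The function name, the input layout and the one-run bucket width are assumptions, not the code behind the graph.

```python
import math
from collections import defaultdict

def win_rate_by_xruns_lead(games, bin_width=1.0):
    """Group games by the xRuns lead of the xRuns leader and compute win rates.

    games -- iterable of (xruns_lead, leader_won), where xruns_lead >= 0 is the
             margin of the team with more xRuns and leader_won is True/False
    Returns {bucket_start: (win_rate, number_of_games)}.
    """
    tally = defaultdict(lambda: [0, 0])  # bucket_start -> [wins, games]
    for lead, leader_won in games:
        bucket = math.floor(lead / bin_width) * bin_width
        tally[bucket][0] += int(leader_won)
        tally[bucket][1] += 1
    return {b: (wins / n, n) for b, (wins, n) in sorted(tally.items())}
```

Returning the game count alongside each rate matters here: a bucket such as 8-9 xRuns holds very few games, so a single loss like Toronto's moves its win percentage a long way.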
Biggest Positive Diff in xRuns in a Loss – 8.5 Runs, Toronto Blue Jays (12 Aug v Marlins)
Toronto was the team with the biggest xRuns lead to lose a game in the linear model and they come out on top for this model as well. The xRuns difference though has increased from 4.8 runs to 8.5 runs.
The Marlins scored five runs in the third and three runs in the fifth, but the combined xRuns for the third to the fifth was 2.1 runs. Toronto did commit three errors in the game but hit 20 barrels compared to just four by the Marlins. This game gets more ridiculous the more I investigate it.
Highest RL (Nine-inning game) – 12.9 runs, Atlanta Braves (9 Sept v Marlins).
This is another game which matches the previous model. The Braves outhit their xRuns by 12.9 runs. That being said, their 16.1 expected runs were the third-highest of the season and they thoroughly deserved that win.
Lowest RL (Nine-inning game) – -5.2 runs, Houston Astros (11 Oct @ Rays).
This is one of a trio of games that were the main driver behind me looking into this. This was Game One of the 2020 ALCS, in which the Astros lost 2-1 to the Rays. The Astros took the lead off a solo shot from Altuve in the first inning, and despite hitting the ball well multiple times later in the game, they failed to put another run on the board. I had them ahead by 2.3 xRuns and their odds of winning with that difference were 76%. Then came Game Two.
The Astros once again outperformed the Rays by xRuns but failed to catch the Rays after going three down in the first inning thanks to a single, an error and a home run. The Astros were ahead by 3.5 xRuns and their odds of winning with that difference were 91%. Then there was Game Three.
The Astros led this game, again off a first-inning home run from Altuve, but the Rays piled on the runs in the sixth, which included this very fortuitous bloop double to drive in the fourth and fifth. The Astros managed just one further run.
The odds of the Astros winning this game, given their 2.7 xRuns lead, were 85%. If we combine those three games, the likelihood (based on the outcomes of all other 2020 games) of the Astros losing all three of them was 0.2%. The odds of winning all three were 59%.
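The combination step is just multiplication, as long as you assume the three games are independent of each other (an assumption on my part for this sketch). Using the rounded win odds quoted above:

```python
from math import prod

# Astros win odds for Games One to Three, as quoted above (rounded).
win_probs = [0.76, 0.91, 0.85]

# Assumption: the games are independent, so probabilities multiply.
p_win_all = prod(win_probs)
p_lose_all = prod(1 - p for p in win_probs)

print(f"win all three:  {p_win_all:.1%}")
print(f"lose all three: {p_lose_all:.1%}")
```

With these rounded inputs, the product comes out at roughly 59% for winning all three and roughly 0.3% for losing all three. The small gap to the 0.2% quoted above would come from using unrounded odds for each game.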
They managed to take the Rays to Game Seven in that series but failed to overcome the 0-3 deficit. Any one of those games going in their favour and it probably would have been an Astros vs. Dodgers World Series.
Who was the luckiest?
The White Sox and the Rays sit atop the list of teams with the most luck, and the Brewers and Reds had way worse luck than the other teams.
If we look purely from an offensive perspective, the Rockies got 0.6 runs more than expected from their games but that can probably be explained away by the fact that hits in Colorado always outperform these models, as the mile-high conditions lead to balls travelling further. On the other end of the offensive spectrum, we have the Reds who were unlucky to the tune of -0.8 runs per game.
Defensively, you could argue that some of this luck is actually down to defensive performance. That is definitely true, but it wouldn’t account for all of it, and it is hard to split it out. The Rays and the Cubs were two of the top three teams defensively according to FanGraphs and my defensive luck stat. The Padres, however, were second on FanGraphs, but only had middling defensive luck.
Maybe the Rays were right to trade the players away as they knew their results in 2020 had over-performed the underlying metrics. And the Brewers may have been happy to stand pat with their team as their underlying performance suggested they should have done better.
What is next?
I would really love to hear people’s opinions on the expected runs models and graphs. The idea was for them to show if teams got lucky or unlucky in games. So please, comment here or on Twitter with your suggestion about which version you preferred. Do they make sense by looking at them? What more do you want from the graphics? Also, if there is a game you want to see the graph for, send me a tweet and I will endeavour to respond.
Finally, shout out to Bob @BravesintheUK for mentioning exploring football’s xG but for baseball.
Russell is Bat Flips and Nerds’ resident analytical genius, and arguably Europe’s finest sabermetrician. If you’re not following Russell on Twitter @REassom then you’re doing baseball wrong.