The Compassionate Umpire or The Cold Automated Zone

At the halfway point and at the end of each season, Jeff Sullivan of FanGraphs writes pieces on the worst called ball and worst called strike of the season so far. These are usually quite bizarre calls with unusual circumstances behind them, but for the most part they don’t have much influence on the game. This postseason, though, there was a ‘strike’ call that was very bad and had a very big impact on a game.

In the bottom of the second inning of Game 3 of the NLDS between the Braves and the Dodgers, Walker Buehler was in a difficult situation, with 2 outs and runners on 2nd and 3rd after an error from Cody Bellinger. The Dodgers decided to intentionally walk Charlie Culberson, loading the bases, to get to Braves pitcher Sean Newcomb, a fairly standard approach in the NL. But Buehler threw 4 balls to Newcomb and walked in the first run of the game. That brought up Ronald Acuna, and Buehler threw another three balls to fall behind 3-0 in the count.

Then came ‘ball four’, but it wasn’t called a ball: home plate umpire Gary Cederstrom called a strike. That meant Buehler had to throw another pitch to Acuna, who launched it for a grand slam. The umpire’s decision meant the game was 5-0 and not 2-0. The arguably ‘pitcher friendly’ call cost the Dodgers 3 runs in a game they ended up losing by just one.

With a bit of hyperbole, that call meant they lost the game, had to play a further game in the series, were more tired than the Brewers, were forced into a 7 game series, were therefore more tired than the Red Sox and so lost the World Series. That is a bit of a stretch, but the call probably did cost them the game, as the Braves managed just 3 runs in the other 35 innings of their 4 game series.

Not every umpire mistake has a ramification as easily identifiable as that one, but mistakes do happen in most games and, unsurprisingly, MLB and the WUA (World Umpires Association) want as few of them as possible. Nowadays they can measure this by looking at how many calls an umpire got right or wrong, thanks to systems that track the speeds and trajectories of pitched baseballs.

Measuring the umpire performance

In the 2006 playoffs MLB debuted a system called PITCHf/x, which tracked the baseball during a pitch and stored data points such as pitch velocity, release point, location as the ball crossed the plate and where the ball landed after being hit. It also tracked each batter’s stance to calculate the top and bottom of the zone for that individual. The system became standard across all stadiums in the 2008 season, which means MLB and the WUA could compare the call given by an umpire with the location of the pitch and decide whether it was correct.

This system was replaced in 2017 by Trackman, a more accurate system. All of this information, going back to 2008, is publicly available via Statcast Search on the Baseball Savant site. So one can grab all this data and determine the accuracy of MLB umpires by taking every pitch an umpire called, working out whether the ball went through the strike zone and checking whether that matched the call.

Note: The data given by PITCHf/x and Trackman give the coordinates of the centre of the ball. Since just a piece of the baseball needs to cross home plate for the pitch to be considered a strike, I have widened the zone by the diameter of the ball (about 2.9 inches) on each side of the 17-inch plate, making the horizontal range [−0.95; 0.95] in feet, with 0 being the centre of the plate. The height of the strike zone varies from batter to batter, and the per-batter values are used for the calculations, but for any graphical representations of the zone I have chosen to use average values for the top and the bottom of the zone (3.5 and 1.6 feet, respectively).
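The check described in the note above can be sketched in a few lines. This is a minimal Python sketch, not the R code used for the article. The field names (`plate_x`, `plate_z`, `sz_top`, `sz_bot`) are modelled on Statcast-style columns and should be treated as assumptions, as should the `"strike"`/`"ball"` labels and the decision to add no extra padding to the top and bottom of the zone. The four sample pitches are made up.

```python
# Horizontal half-width of the zone in feet, as quoted in the article.
ZONE_HALF_WIDTH_FT = 0.95


def in_zone(plate_x, plate_z, sz_top, sz_bot, half_width=ZONE_HALF_WIDTH_FT):
    """Return True if the centre of the ball falls inside the widened zone."""
    return abs(plate_x) <= half_width and sz_bot <= plate_z <= sz_top


def call_is_correct(call, plate_x, plate_z, sz_top, sz_bot):
    """Compare the umpire's call ('strike' or 'ball') with the pitch location."""
    return (call == "strike") == in_zone(plate_x, plate_z, sz_top, sz_bot)


def correct_call_rate(pitches):
    """Share of called pitches where the call matched the location."""
    pitches = list(pitches)
    if not pitches:
        raise ValueError("no pitches supplied")
    return sum(call_is_correct(**p) for p in pitches) / len(pitches)


# Made-up pitches using the article's average zone (1.6 to 3.5 feet).
sample_pitches = [
    {"call": "strike", "plate_x": 0.0, "plate_z": 2.5, "sz_top": 3.5, "sz_bot": 1.6},
    {"call": "ball", "plate_x": 1.4, "plate_z": 2.5, "sz_top": 3.5, "sz_bot": 1.6},
    {"call": "strike", "plate_x": 1.2, "plate_z": 2.5, "sz_top": 3.5, "sz_bot": 1.6},
    {"call": "ball", "plate_x": 0.5, "plate_z": 1.0, "sz_top": 3.5, "sz_bot": 1.6},
]

print(correct_call_rate(sample_pitches))  # 3 of the 4 calls match: 0.75
```

Only the third pitch is a miss here: it was called a strike even though its centre is 1.2 feet from the middle of the plate.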

Doing this for every called pitch since the start of the 2008 season gives us the graph above. Every single year since the introduction in 2008 the umpires have gotten better, with a correct call rate of 91.6% in 2018 compared to 86.6% in 2008. This to me shows that whatever MLB and the WUA are doing with this data and their training for the umpires is working. But how does it compare for balls and strikes?

Umpires get the calls right more often for balls than for strikes, which makes sense when you consider that a number of balls are not even close and are easy decisions. What stands out from this graph is that most of the improvement in the overall performance of umpires comes from them calling strikes better, improving by 10 percentage points over the 11 year period.

If you have watched baseball for a while you may have heard people say that the zone changes depending on the count, i.e. the strike zone shrinks in pitcher’s counts and grows in hitter’s counts. To see if this is true I looked at the correct call rate of balls and strikes for each count.

The graph definitely agrees with the hypothesis above: the lowest correct call rates for strikes are in pitcher’s counts (0-2, 67%) & (1-2, 73%) and the lowest correct call rates for balls are in hitter’s counts (2-0, 88%) & (3-0, 82%). This clearly shows that umpires are affected by the count, but how does it affect the strike zone?
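A breakdown like this amounts to grouping the called pitches by count and by whether the pitch was really a strike or a ball. The Python sketch below shows one way to do it, assuming that the rate "for strikes" means pitches that crossed the zone and the rate "for balls" means pitches that missed it. The function name, the tuple layout and the six sample pitches are illustrative, not taken from the article's data.

```python
from collections import defaultdict


def correct_rate_by_count(pitches):
    """Correct call rate keyed by (count, true pitch type).

    pitches: iterable of (balls, strikes, in_zone, called_strike) tuples,
    where in_zone and called_strike are booleans.
    """
    tallies = defaultdict(lambda: [0, 0])  # key -> [correct calls, all calls]
    for balls, strikes, in_zone, called_strike in pitches:
        truth = "strike" if in_zone else "ball"
        tally = tallies[(f"{balls}-{strikes}", truth)]
        tally[0] += int(in_zone == called_strike)
        tally[1] += 1
    return {key: correct / total for key, (correct, total) in tallies.items()}


# Made-up called pitches: (balls, strikes, in_zone, called_strike).
sample = [
    (0, 2, True, True),
    (0, 2, True, False),   # a strike by location, called a ball
    (0, 2, True, True),
    (0, 2, False, False),
    (3, 0, False, True),   # a ball by location, called a strike
    (3, 0, False, False),
]

rates = correct_rate_by_count(sample)
for key in sorted(rates):
    print(key, round(rates[key], 2))
```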

Modelling the bias

To determine the impact of the count on the size and shape of the zone I had to build a model that predicts the chance a pitch is called a strike by an umpire based on its location. This is where R and RStudio are any budding baseball analyst’s best friend. R is a programming language which is very useful for statistical computing and modelling. If you want to get into baseball analytics I highly recommend the book ‘Analyzing Baseball Data with R’ by Max Marchi & Jim Albert. It is a great book which covers what you can do with R beyond Excel and where the publicly available data is.

There are modelling functions pre-built in R, so you can use your input data to get predictions of how likely a pitch is to be called a strike based on its location. With those predictions you can then identify the line where the chance of a strike call is 50% (for the more analytically inclined, it is a contour generated by a loess function). This gives you a good idea of where the zone is, i.e. inside the line a pitch has an above 50% chance of being called a strike and outside it is below 50%. The graph below shows the 50% lines for left handers and right handers in 2008 and 2018.
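To make the idea of a 50% line concrete, here is a small Python sketch. It is not the loess fit described above: it swaps in a much simpler Gaussian-kernel weighted average of nearby calls, and it only locates one point on the line (the right-hand edge at a given height) rather than the full contour. The function names, the 0.15 ft bandwidth and the synthetic grid of pitches are all assumptions made for illustration.

```python
import math


def strike_probability(x, z, pitches, bandwidth=0.15):
    """Kernel-weighted share of called strikes near location (x, z), in feet.

    pitches: list of (plate_x, plate_z, called_strike) tuples.
    """
    num = den = 0.0
    for px, pz, called_strike in pitches:
        dist2 = (px - x) ** 2 + (pz - z) ** 2
        weight = math.exp(-dist2 / (2 * bandwidth ** 2))
        num += weight * (1.0 if called_strike else 0.0)
        den += weight
    return num / den if den > 0 else float("nan")


def right_edge_at_half(z, pitches, x_max=2.0, step=0.01, bandwidth=0.15):
    """Scan outward from the middle of the plate at height z and return the
    first x where the estimated strike probability drops below 50%."""
    for i in range(round(x_max / step) + 1):
        x = i * step
        if strike_probability(x, z, pitches, bandwidth) < 0.5:
            return x
    return None


# Synthetic pitches on a 0.1 ft grid; a pretend umpire calls a strike
# whenever |x| <= 1.0 and 1.5 <= z <= 3.5.
synthetic = [
    (i / 10, j / 10, abs(i / 10) <= 1.0 and 1.5 <= j / 10 <= 3.5)
    for i in range(-20, 21)
    for j in range(5, 46)
]

edge = right_edge_at_half(2.5, synthetic)
print(round(edge, 2))  # lands between the last strike column (1.0) and the first ball column (1.1)
```

Repeating the scan in many directions from the middle of the zone would trace out the whole 50% line. A smoother fitted in R handles that, and the noise in real calls, far more gracefully than this toy version.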

As you can see, for all of these splits the strike zone for an 0-2 pitch is considerably smaller than the one for an 0-0 pitch, and the 3-0 pitch has a larger strike zone, but not by that much. You can also see the differences between 2008 and 2018: the side edges of the zone have tightened to sit much closer to the actual zone size, but the bottom part of the zone has got larger, dropping outside the actual zone. According to my model, even with the enlarged zone expected for the 3-0 count, the pitch that Buehler threw had a 2.2% chance of being called a strike.

There is a clear difference between the most extreme hitter’s count (3-0) and pitcher’s count (0-2) when each is compared to 0-0. What about all 12 counts? What do their strike zones look like? The graph below is for 2018 right handers only.

If we order these strike zones by area, they match up well with the run potential for each of those counts, which to me means that the umpires are trying to even the game out by being compassionate to whoever is struggling in the count.
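Ordering zones by area needs a way to measure the area enclosed by each 50% line. One common option, sketched below in Python, is the shoelace formula applied to the points along the contour. The rectangles used here are made-up stand-ins for three counts, not the fitted contours from the graph, and the function names are illustrative.

```python
def polygon_area(vertices):
    """Area enclosed by a closed polygon (shoelace formula).

    vertices: list of (x, z) points in order around the outline, in feet.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, z1 = vertices[i]
        x2, z2 = vertices[(i + 1) % n]
        twice_area += x1 * z2 - x2 * z1
    return abs(twice_area) / 2.0


def rectangle(half_width, bottom, top):
    """Corner points of a rectangular zone centred on the plate."""
    return [(-half_width, bottom), (half_width, bottom),
            (half_width, top), (-half_width, top)]


# Illustrative shapes only: NOT the zones estimated in the article.
example_zones = {
    "0-0": rectangle(0.95, 1.6, 3.5),
    "0-2": rectangle(0.80, 1.8, 3.3),
    "3-0": rectangle(1.00, 1.5, 3.6),
}

ordered = sorted(example_zones, key=lambda count: polygon_area(example_zones[count]))
print(ordered)  # ['0-2', '0-0', '3-0']
```

The 0-0 rectangle uses the article's average zone, which works out at 1.9 ft by 1.9 ft, or about 3.61 square feet.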

So if the umpires are impacted by the count, are they influenced by other factors, like the players involved or home bias?

Additional umpire bias

Thankfully it took only a little digging to show no bias between home team and away team, but there is some potential bias for batters and pitchers. I split 2018’s pitchers and batters into thirds based on their wOBA for 2018 and ran the models again to see if there was any visual difference between their zones for the 0-0, 0-2 & 3-0 counts. This produced the graphs below, again for right handed batters only.
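The split into thirds can be sketched as a simple ranking. In the Python sketch below the player names and wOBA values are invented, the function name is hypothetical, and the `higher_is_better` switch is an assumption about how pitchers were handled: for a pitcher a lower wOBA allowed is better, so the ranking is reversed.

```python
def split_into_thirds(woba_by_player, higher_is_better=True):
    """Rank players by wOBA and cut the ranking into bottom, middle and top thirds."""
    ranked = sorted(woba_by_player, key=woba_by_player.get,
                    reverse=not higher_is_better)  # worst players first
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "bottom": ranked[:cut1],
        "middle": ranked[cut1:cut2],
        "top": ranked[cut2:],
    }


# Invented values for six hypothetical players.
example_woba = {"A": 0.290, "B": 0.310, "C": 0.330,
                "D": 0.350, "E": 0.370, "F": 0.400}

batters = split_into_thirds(example_woba)
pitchers = split_into_thirds(example_woba, higher_is_better=False)
print(batters["top"], pitchers["top"])  # ['E', 'F'] ['B', 'A']
```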

The top 3 graphs are the pitcher splits. While there isn’t much difference between the middle 3rd and the top 3rd (the top 3rd’s zone is 1-3% bigger in all counts), there is quite a difference for the bottom 3rd of pitchers in the 0-2 count: their strike zone is significantly smaller (by 25%). Just over 2,000 pitches were thrown in that scenario, which is a smaller sample than for the other two groups (both just over 5,000), but it is still enough for the difference to be significant. So it looks like the poorer pitchers are getting a bad deal from the umpires when they are ahead in the count.

For batters, in all counts, the umpire strike zones are lower for the middle 3rd than for the top 3rd, and the zone for the bottom 3rd is even lower than for the middle 3rd. This would be a worrying trend if the umpires were lowering the zone, but on further inspection of the average top and bottom of the zone it doesn’t look like there is an issue here.

In the table above (values given in feet), the batters in the bottom 3rd have a lower zone than the other two groups. If we take this into account, the ratio by which the zone shrinks or grows with the count is similar (within 2%) across all three groups, suggesting no bias from the umpires here. While there is no observed bias, it is interesting to note that there may be some correlation between strike zone height, and therefore batter height, and overall wOBA.

(The 3-0 strike zone for the bottom 3rd of batters is slightly skewed due to the lower volume of occurrences in 2018: only 449.)

An Automated Zone

We have seen that umpires definitely have some bias. What would happen if MLB switched to an automated strike zone? Beyond the fact that we wouldn’t see the 32,000 incorrect calls across a season, the biggest benefit for MLB right now might be improved pace of play. In 2018 there were 2,605 at-bats ended early by incorrect calls, but there were 4,039 that were extended by incorrect calls. That’s a net 1,434 at-bats that ran longer than they should have.

The 4,039 at-bats which were extended by bad calls had on average a further 5 pitches after the incorrect call. Using the net of 1,434 extended at-bats, we saw just over 7,000 extra pitches thrown, which accounted for 1% of the pitches thrown in 2018. So just looking at the specific scenario of pitches which could end at-bats, the time for a baseball game could be reduced by about 1% by an automated strike zone.
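The arithmetic behind those figures is short enough to write out. The Python sketch below uses only the numbers quoted above. The season pitch total is not given in the text, so the last line merely backs out the total that the 1% figure implies rather than quoting an official count. The variable and function names are illustrative.

```python
# Figures quoted in the article for 2018.
SHORTENED_AT_BATS = 2605      # at-bats ended early by an incorrect call
EXTENDED_AT_BATS = 4039       # at-bats extended by an incorrect call
AVG_PITCHES_AFTER_MISS = 5    # average further pitches in an extended at-bat
QUOTED_SHARE = 0.01           # "1% of the pitches thrown in 2018"


def net_extended_at_bats(extended, shortened):
    """Extended at-bats left over once the shortened ones are netted off."""
    return extended - shortened


def extra_pitches(net_at_bats, avg_pitches):
    """Extra pitches thrown across the net extended at-bats."""
    return net_at_bats * avg_pitches


net = net_extended_at_bats(EXTENDED_AT_BATS, SHORTENED_AT_BATS)
extra = extra_pitches(net, AVG_PITCHES_AFTER_MISS)
implied_total = round(extra / QUOTED_SHARE)  # implied by the 1% figure, not an official count

print(net, extra, implied_total)  # 1434 7170 717000
```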

I am relatively certain that most MLB teams have done similar analysis and know that the umpires aren’t good at calling pitches in the corners of the zone and are bad at calling strikes in the 0-2 count. To me that means pitchers are probably not targeting those spots, as they expect a bad call. I believe that with an automated strike zone pitchers would be more aggressive about pitching in the zone, especially the pitchers with better command.

One thing I haven’t talked about so far is pitch framing by catchers. With an automated zone that would become a defunct skill, and this would severely impact the defensive output of catchers. According to the 2018 catcher stats from Baseball Prospectus, framing accounted for 80%+ of the defensive runs of all of the top catchers. For example, Yasmani Grandal led MLB with 16.3 runs saved, of which 15.7 were down to his pitch framing ability.

If we were to remove pitch framing, the highest defensive contribution based on just blocking and throwing would be just 3.3 runs, by Tucker Barnhart, which would rank 26th overall for catchers under the current metric. Barnhart is an extreme case of the switch that would happen if we removed framing: BP gives him -11.5 defensive runs from framing, and overall he is the 9th worst catcher (out of 117) in 2018. Many other catchers would benefit too, like Willson Contreras, who would go from second worst to second best. Others would be hurt by the change, such as Jorge Alfaro, who would go from 9th best to 4th worst.
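The effect of stripping out framing can be checked with the figures quoted above. The Python sketch below assumes a catcher's total defensive runs are simply framing runs plus blocking-and-throwing runs, which is a simplification of the Baseball Prospectus metric. Barnhart's overall total is derived under that assumption rather than quoted from BP, and the function names are hypothetical.

```python
def framing_share(total_runs, framing_runs):
    """Fraction of a catcher's defensive runs that come from framing."""
    return framing_runs / total_runs


def runs_without_framing(total_runs, framing_runs):
    """Defensive runs left once framing is removed (additive assumption)."""
    return total_runs - framing_runs


# Yasmani Grandal, 2018: 16.3 runs saved, 15.7 of them from framing.
grandal_share = framing_share(16.3, 15.7)
grandal_rest = runs_without_framing(16.3, 15.7)

# Tucker Barnhart, 2018: 3.3 runs from blocking and throwing, -11.5 from framing.
barnhart_total = 3.3 + (-11.5)

print(f"{grandal_share:.0%}")       # 96%
print(round(grandal_rest, 1))       # 0.6
print(round(barnhart_total, 1))     # -8.2
```

Under that assumption Grandal keeps well under one run of defensive value once framing is gone, while Barnhart moves from roughly -8 runs to +3.3.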

Removing pitch framing shrinks the defensive output of catchers, with the range of defensive runs saved going from [+16.3 : -15.7] to [+3.3 : -4.9]. With their defensive contribution reduced, catchers’ offense would have to improve for their value to stay the same, which wouldn’t be possible for most of the current catchers: MLB catchers in 2018 averaged a wRC+ of 84. This might lead to some good offensive players being converted into catchers, as their negative defensive runs would be countered by their high offensive production, making them better overall for a team than the current catchers.

Final Comment

I have honestly been a fan of introducing an automated strike zone for a long time because it would make the game more accurate, but in doing this research I have become even more of a fan, given the added benefits of probably reducing pitch counts and increasing hits and runs scored. I genuinely believe this is a change MLB should be looking into, but I do see there being pushback from the MLB Players Association due to the impact on catchers. With the power the players’ union has, if MLB proposed a change like this to the Collective Bargaining Agreement, the union would expect something of benefit to the players in return.
