MLB: The art of expected bases

I don’t quite remember which player statistics the Hillsboro Hops had up on their right field scoreboard during the Northwest League minor league game. One for sure was batting average, and one for sure was OPS.

OPS, of course, stands for On-base Plus Slugging, or, put simply, the sum of a player’s on-base percentage (how often the player reaches base safely) and slugging percentage (total bases per at-bat, which rewards extra base hits).

My problem in that moment? I hate OPS. The number on the scoreboard had functionally no use to me, apart from the fact the third and final stat, the one I don’t remember, was either slugging percentage or on base percentage, so I could perform mental subtraction. In my time working in baseball as a statistics stringer, I’ve gravitated towards observational statistics over analytical statistics. I’m not setting up an argument against analytical statistics – as a data analyst, I think they’re very important, but they’re difficult to digest in the moment.

Sitting at the ballpark, I thought to myself – what I really want to know isn’t OPS, but rather expected bases. In any given plate appearance, how far down the base line would you expect the player to get?

The formula came to mind easily: (total bases + walks + hit by pitches) / (at bats + walks + hit by pitches) – functionally, slugging percentage, but with walks and hit by pitches added to both the numerator and the denominator.
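For anyone who wants to try this at home, here is a minimal sketch of the formula in Python. The function name, parameter names, and the example stat line are all made up for illustration; only the formula itself comes from the paragraph above.

```python
def expected_bases(total_bases, walks, hit_by_pitch, at_bats):
    """Expected bases per near-plate appearance.

    (TB + BB + HBP) / (AB + BB + HBP): slugging percentage with walks
    and hit-by-pitches added to both numerator and denominator.
    """
    opportunities = at_bats + walks + hit_by_pitch
    if opportunities == 0:
        raise ValueError("no at-bats, walks, or hit-by-pitches to average over")
    return (total_bases + walks + hit_by_pitch) / opportunities


# Hypothetical stat line, invented for illustration (not a real player).
example = expected_bases(total_bases=250, walks=60, hit_by_pitch=5, at_bats=500)
print(f"{example:.3f}")  # 0.558
```

Sacrifices never appear in the inputs, so they drop out of the calculation entirely.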

What excited me is that this number actually has meaning – instead of adding one rate to another, the formula represents how many bases you could expect a player to gain in a given trip to the plate. When I calculated it for the National League, Christian Yelich had far and away the best mark in the 2019 season with .721 expected bases per near-plate appearance, whereas Miami’s Lewis Brinson was the outlier at the other end with only .282. (The NL average was .502 expected bases, or about half a base every near-plate appearance.)

What about sacrifices? I’ve excluded those, because in a sacrifice, it could be argued a player isn’t actually trying to reach base – instead, they’re trying to move another runner along at their own expense. As a result, sacrifices are treated as null, hence my near-plate appearance term used above.

I’ve looked around on the internet and haven’t found the formula anywhere, though I’d be very surprised if this were a novel idea. The concept isn’t perfect, since a fraction of a base is indeed kind of odd – but Christian Yelich gaining roughly .22 more bases per near-plate appearance than the league average means that over a game, you could expect Yelich to reach about one base more than the average player, which, let’s be honest, is incredible. That being said, reaching base means different things to different players – Ronald Acuña, for instance, scores 48% of the time he reaches base, in part due to his base-stealing ability, while the Reds’ Eugenio Suarez is 9th in expected bases but scores only 36% of the time he reaches base. I’m sure someone has looked at these scoring stats and how much they vary by player over seasons, but again, this is a tangent – a player only has so much control over what happens when they’re at the plate, and a metric which measures how many bases they would be expected to achieve only as a result of their plate appearance makes logical sense to me.
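The per-game claim is easy to sanity-check with a rough sketch. The .721 and .502 figures are the ones quoted earlier; the 4.5 trips to the plate per game is purely my assumption for an everyday starter, not something taken from the data.

```python
# Figures quoted earlier for the 2019 National League.
YELICH_XB = 0.721
NL_AVERAGE_XB = 0.502

# Assumption: roughly 4.5 near-plate appearances per game for a regular.
PLATE_APPEARANCES_PER_GAME = 4.5

# Gap between Yelich and the league average, per appearance and per game.
extra_per_appearance = YELICH_XB - NL_AVERAGE_XB
extra_per_game = extra_per_appearance * PLATE_APPEARANCES_PER_GAME

print(f"extra bases per appearance: {extra_per_appearance:.3f}")
print(f"extra bases per game: {extra_per_game:.2f}")
```

Under that assumption the gap works out to just under one base per game; a different guess at appearances per game scales the result proportionally.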

Also, while a single may be a better result than a walk (since it allows baserunners to move more than one base and isn’t reliant on forces), I’m happy to let other metrics measure those outcomes. One of the oddities of OPS is that on-base percentage sort of gets lost, but with expected bases, walks and HBP are treated the same as a single, since the player achieved the same result, albeit in different ways: gaining first base. If over ten plate appearances a player only walks, his expected bases will be 1.000, the same as his OBP, and if the same player only hits home runs, his expected bases would match his slugging at 4.000.
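Those edge cases can be checked with a few lines of Python. This is only a sketch: the function and variable names are mine, and the three stat lines are the hypothetical ten-appearance players described above.

```python
def expected_bases(total_bases, walks, hit_by_pitch, at_bats):
    """(TB + BB + HBP) / (AB + BB + HBP)."""
    return (total_bases + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch)


# Ten trips to the plate, all walks: no official at-bats, no total bases.
all_walks = expected_bases(total_bases=0, walks=10, hit_by_pitch=0, at_bats=0)

# Ten at-bats, all singles: one base each, same result as the walks.
all_singles = expected_bases(total_bases=10, walks=0, hit_by_pitch=0, at_bats=10)

# Ten at-bats, all home runs: four bases each.
all_home_runs = expected_bases(total_bases=40, walks=0, hit_by_pitch=0, at_bats=10)

print(f"{all_walks:.3f}")      # 1.000
print(f"{all_singles:.3f}")    # 1.000
print(f"{all_home_runs:.3f}")  # 4.000
```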

The ranking actually matches OPS fairly closely, with an r² of .96, which makes sense – OPS isn’t a terrible metric. However, there were some stark differences. The biggest gainers were players such as Derek Dietrich, Hunter Renfroe, and, at the bottom end of the rankings, Austin Riley, who all have an exceptionally high number of extra base hits. Players who dropped in the rankings compared to their OPS included Howie Kendrick, who rarely walks and hits a lot of singles, and Bryan Reynolds. Pittsburgh’s Kevin Newman had the biggest overall drop: OPS ranks him 59th, but he finishes only 91st in expected bases. Newman does not walk all that often and has a very low percentage of extra base hits, with 75% of his hits going for singles. Interestingly, Ronald Acuña did not drop in the rankings much in spite of not really being an extra base hitter. Adding walks into the equation will hurt a player who typically hits doubles, but this could also be considered a feature of the metric – if someone hits triples consistently, a triple has a much greater value than a walk.
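To show how a rank comparison like this could be set up, here is a small sketch. The three stat lines are invented archetypes, not real players or real 2019 numbers, and the class and function names are hypothetical. OPS uses the standard definitions: OBP = (H + BB + HBP) / (AB + BB + HBP + SF) and SLG = TB / AB.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    name: str
    at_bats: int
    hits: int
    total_bases: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0

    def ops(self):
        reached = self.hits + self.walks + self.hit_by_pitch
        obp = reached / (
            self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        )
        slg = self.total_bases / self.at_bats
        return obp + slg

    def expected_bases(self):
        bases = self.total_bases + self.walks + self.hit_by_pitch
        return bases / (self.at_bats + self.walks + self.hit_by_pitch)


def ranks(players, metric):
    """Map each player name to a 1-based rank, best value first."""
    ordered = sorted(players, key=metric, reverse=True)
    return {p.name: i for i, p in enumerate(ordered, start=1)}


# Invented stat lines for illustration only.
players = [
    StatLine("Singles hitter", at_bats=500, hits=170, total_bases=210, walks=20),
    StatLine("Power hitter", at_bats=500, hits=115, total_bases=245, walks=35),
    StatLine("Patient hitter", at_bats=450, hits=120, total_bases=190, walks=90),
]

ops_rank = ranks(players, StatLine.ops)
xb_rank = ranks(players, StatLine.expected_bases)

for p in players:
    # Positive shift: the player climbs when ranked by expected bases.
    shift = ops_rank[p.name] - xb_rank[p.name]
    print(f"{p.name}: OPS rank {ops_rank[p.name]}, "
          f"expected bases rank {xb_rank[p.name]}, shift {shift:+d}")
```

In this made-up sample the power hitter is last by OPS but first by expected bases, while the singles hitter slides the other way, which is the same pattern as the gainers and droppers described above.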

I’ve been thinking about this for a few months and I like it a lot. As I’ve said above, I’m sure someone else has thought of this somewhere, but I strongly prefer it as an alternative to OPS, especially because it’s a statistic which can be easily visualized.