Tuesday, March 23, 2010

The Absurdity of Quality

Joe Mauer and Hanley Ramirez were the 2009 American and National League batting champions. Did they deserve the honor?

Carlos Pena, with 39 home runs, was the 2009 home run king, even though Albert Pujols smacked 47.

Absurd, right? But Pena averaged a home run in every 12.08 at bats compared to 12.09 for Pujols. So Pena hit home runs more often. And since Pena had at least 3.1 plate appearances per Tampa Bay game, he qualified to be the home run champ based on his exceptional rate.

This is sarcasm, of course. But what is the difference between making such a claim and awarding the "Batting Champ" in the same way?

When evaluating season-long performance, isn't the end game -- total accumulation -- ultimately what is most important? Would you rather have a player who finished with more home runs or who had the higher home run rate? I've never understood why we analyze statistics so inconsistently. Why is rate important for batting but not for home runs?

Don't get me wrong, there are times when "quality" over "quantity" makes sense. If a batter hit 10 home runs in 30 games but missed the rest of the season due to injury, it's good to know his home run rate when evaluating his ability. And you can't just say the best pitcher was the one who gave up the fewest runs since the crown would go to a pitcher who barely played.

But there are so many reasons to hate the crowning of a player based on quality stats. In a typical season, a player needs only 502 plate appearances (3.1 X 162 games, rounded) to qualify for the batting championship (which, of course, is awarded to the player with the highest batting average). Accounting for walks, hit by pitch and sacrifices, the typical player with 502 plate appearances would record about 445 at bats.

It should be noted that Prince Fielder, the only hitter who played 162 games last season, had 719 plate appearances and 591 at bats, a difference of more than 200 plate appearances and nearly 150 at bats compared with the above example.

Let's assume the player with 445 at bats hit .350 and was crowned batting champion. Let's assume there was another player with 591 at bats who hit .345. The batting champion recorded 156 hits while the runner-up had 204.
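The arithmetic behind those hit totals can be checked with a short Python sketch. The function name `hits_from_average` is only illustrative, and rounding to the nearest whole hit is an assumption about how the totals above were derived.

```python
def hits_from_average(at_bats, batting_average):
    """Hits implied by a batting average, rounded to the nearest whole hit."""
    return round(at_bats * batting_average)


# The two hypothetical players from the paragraph above.
champion_hits = hits_from_average(445, 0.350)
runner_up_hits = hits_from_average(591, 0.345)

print("Champion hits:", champion_hits)
print("Runner-up hits:", runner_up_hits)
print("Runner-up's extra hits:", runner_up_hits - champion_hits)
```

The runner-up loses the title by five points of batting average while finishing with dozens more hits.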

What sense does this make? The problem is that when we draw an imaginary, arbitrary line to qualify for this honor, we anoint all above that line as equals. In reality, they are not.

To put it another way, a hitter could bat .450 and collect 200 hits in 444 at bats but lose the batting title to a hitter who batted .350 (or worse) and collected 156 hits in 445 at bats.

While these are extreme examples, I think we can all agree that a .325 batting average in 502 plate appearances does not equal a .325 batting average in 700 plate appearances. The player who maintained that level of play longer was more valuable.

Comparing the players with identical batting averages is relatively easy. But at what point is a player with a lower batting average more valuable? You could apply a formula like (Player Average - League Average) X Plate Appearances, but I doubt we'll ever do that in the mainstream. It gets fuzzy in a hurry, and instead we just get sloppy with analysis.
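As a rough sketch of how that formula might be applied, here it is in Python. The league average of .265, the function name `hits_above_average` and the three sample stat lines are all hypothetical, chosen only to illustrate the idea.

```python
def hits_above_average(player_avg, league_avg, plate_appearances):
    """(Player Average - League Average) X Plate Appearances."""
    return (player_avg - league_avg) * plate_appearances


LEAGUE_AVG = 0.265  # hypothetical league average, for illustration only

# Hypothetical players: one near the qualifying minimum, two full-timers.
short_season = hits_above_average(0.325, LEAGUE_AVG, 500)
full_season = hits_above_average(0.325, LEAGUE_AVG, 700)
lower_avg_full = hits_above_average(0.315, LEAGUE_AVG, 700)

for label, value in [
    (".325 in 500 PA", short_season),
    (".325 in 700 PA", full_season),
    (".315 in 700 PA", lower_avg_full),
]:
    print(f"{label}: {value:.1f} hits above average")
```

Under these assumed numbers, the .315 hitter with 700 plate appearances grades out ahead of the .325 hitter with 500. Since batting average is calculated per at bat, swapping at bats in for plate appearances would be a reasonable variant.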

The home run champion is not always the player with the highest home run rate. The RBI champion is not always the player with the highest RBI rate. Why, then, is the batting champion the player with the highest hitting rate?

I'd actually suggest that this problem of lazy statistical analysis goes much deeper, and that we overvalue the hit while ignoring other ways a player gets on base. While On Base Percentage is finally gaining acceptance as a conventional statistic that many see as more valuable than Batting Average, we completely ignore the cumulative On Base statistic (Hits + Walks + Hit By Pitch).

Joe Mauer was the AL batting champ with a .365 batting average, and Hanley Ramirez led the NL at .342. Yet four players in the American League accumulated more hits than Mauer (led by Ichiro with 225, 34 more than Mauer). Two players (led by Ryan Braun's 203) had more hits than Hanley Ramirez in the National League. And if you want to focus on times on base, five players topped Mauer's 269 in the AL (led by Derek Jeter's 289) while six players exceeded Ramirez's 267 (led by Albert Pujols' 310).

Batting average has value, but should it be the factor that determines the batting champion? And if not, who should have been the batting champion for each league in 2009? It would seem that arguments could be made for Jeter and Ichiro in the AL and Braun and Pujols in the NL.


Nathan on March 31, 2010 said...

Posted this on another site discussing your article, but thought you might be interested in some feedback as well.

The whole idea of "batting champion" is just semantics. So let's throw that one out for now.

The real question the article touches on is "why do we value batting average over cumulative number of hits, while in other stats we look at the cumulative rather than the rate number"? And which one you use, I think, depends on what you are trying to measure: the skill of the player, or how much a player has contributed to the team:

* Skill of player should always be measured in rate stats. Durability or other factors aside, given a single at bat, how likely are they to get a hit, HR, etc.? That's what rate stats measure.
* Contribution is a bit muddier, and I think this is where the divide lies: cumulative numbers matter for HRs, while rate stats like batting average or on-base percentage matter more for hits and times on base. Home runs are just home runs - if you get one, great, but if you didn't, a number of other things could have happened in the plate appearance, so it's hard to assign anything but a neutral value to that. On-base percentage is different though - if you get a hit or walk, great, but every additional plate appearance in which you *don't* get one means something very important - you made an out. And that hurts the team. That's why a player with 150 hits but a .250 batting average is not as valuable a contributor as a player with only 100 hits but a .400 batting average. The BA (or OBP) encapsulates much more information than just how much the player contributed on the offensive side; it also tells you how much a player *hurt* the team offensively by producing outs. This is why rate stats for BA and OBP are much more valuable indicators of "contribution" than rate stats for other things such as HRs.
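
To put rough numbers on the 150-hit vs. 100-hit comparison, here is a small Python sketch. It assumes, as a simplification, that every at bat without a hit counts as an out; the function name `at_bats_and_outs` is made up for illustration.

```python
def at_bats_and_outs(hits, batting_average):
    """At bats implied by hits and average, plus hitless at bats ("outs").

    Simplifying assumption: every at bat without a hit is an out.
    """
    at_bats = round(hits / batting_average)
    return at_bats, at_bats - hits


# The two hypothetical players from the comparison above.
low_avg_ab, low_avg_outs = at_bats_and_outs(150, 0.250)
high_avg_ab, high_avg_outs = at_bats_and_outs(100, 0.400)

print("150 hits at .250:", low_avg_ab, "at bats,", low_avg_outs, "outs")
print("100 hits at .400:", high_avg_ab, "at bats,", high_avg_outs, "outs")
```

The .250 hitter's 50 extra hits come with 300 extra outs under this assumption.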

Jon Loomer on March 31, 2010 said...

Thanks for the feedback, Nathan! Awesome points.

First, would love to know where you are discussing this. Care to share a link?

You're absolutely right about the value of rate stats. I agree, there is a place for them. And you do make a good point that BA or OBP gives more information than AB/HR.

Ultimately, I feel that this is a subject that isn't discussed enough. Maybe you are correct that because BA (or OBP) also measures the number of times a player made an out, it is the right measure of a player's greatness over an entire season. There is merit to that. But as discussed above, it certainly does get muddy when you start comparing two players of differing plate appearances or at bats.

That said, it would also get muddy the other way around. For example, if you were to award the batting championship to the player with the most hits (or most times on base), you could have two players with the identical number of times on base. However, one may have significantly fewer plate appearances/AB. So it goes both ways.

But it's a good discussion to have. It's one of those baseball traditions that we've grown to accept. The most important statistics, we've been told, are Batting Average, Home Runs and Runs Batted In. That's why they make up the Triple Crown. As many have said before me, the value of the RBI is also in question given the uncontrolled factor of opportunity.

Thanks for making me think!

