The Speedrunning community and Zipf’s Law

Rajiv Krishnakumar
May 11, 2021

Can speedrunner game preferences be described by a statistical linguistics law from the 1930s? A rather strange question, you might say. And yeah, you’re right, but the answer is, surprisingly, yes!

The idea for this small post started while I was watching Michael’s YouTube video titled The Zipf Mystery. Big shout out to Michael! I really recommend you give it a watch, but to summarize very simplistically, he discusses how many things tend to follow Zipf’s law. This law says that when you have a bunch of things from a certain category (e.g. words in a story), the frequency with which each thing appears is roughly inversely proportional to its rank. Basically, if the most frequent thing appears n times, the second most frequent thing will appear about n/2 times, the third most frequent thing about n/3 times, and so on. Again, watch the video! It’s very clear and entertaining.

So for some reason I wondered if the number of people speedrunning each game also follows this law! As of today, on speedrun.com, the most active game listed is Minecraft: Java Edition with 1,310 active players. If speedrunner game preferences followed Zipf’s law, we would expect the second most popular game to have roughly 655 active players, the third most popular game to have roughly 437 active players, and so on.
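To make the arithmetic concrete, here is a minimal Python sketch of the ideal Zipf prediction. The function name `zipf_expected` is my own invention for illustration, and the only real number in it is the 1,310 figure quoted above; it is not the code that produced the plot.

```python
def zipf_expected(top_count, n_ranks):
    """Return the ideal Zipf count for ranks 1..n_ranks, given the rank-1 count."""
    # Under Zipf's law the item at rank r appears about top_count / r times.
    return [top_count / rank for rank in range(1, n_ranks + 1)]


# 1,310 is the Minecraft: Java Edition figure quoted above.
for rank, expected in enumerate(zipf_expected(1310, 5), start=1):
    print(f"rank {rank}: about {expected:.0f} active players")
```

Comparing these ideal values with the actual active player counts, rank by rank, is all the check amounts to.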

So do speedrunner game preferences follow Zipf’s law? Well, kind of! Here’s the data:

Hover over each dot to see which game that data point represents.

Although the tail is a bit higher than the law would predict, it definitely follows the trend. Also, it looks like the second most popular game, Super Mario 64, is a bit of an anomaly, i.e. if speedrunners were truly Zipfian, then we would expect roughly twice as many active players in Super Mario 64. But hey, still pretty remarkable given the theory was invented over 20 years before the first video game even came into existence!
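If you want to put a number on "follows the trend", one common approach (not necessarily what I did for the plot) is to fit a straight line to log(count) against log(rank): a perfectly Zipfian data set gives a slope of -1. The sketch below is only an illustration. The helper `loglog_slope` is a hypothetical name, and the counts are synthetic values generated from the law itself, not real speedrun.com data.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank), with ranks starting at 1.

    Assumes counts are positive and already sorted from most to least popular.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic, perfectly Zipfian counts (NOT real data): count at rank r is 1310 / r.
ideal_counts = [1310 / rank for rank in range(1, 51)]
slope = loglog_slope(ideal_counts)
print(f"fitted slope: {slope:.3f}")
```

With real counts, a fitted slope noticeably shallower than -1 would be consistent with a tail that sits above the Zipf prediction.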

I like speedrunning, and I also like math. I am by no means an expert in either, but I suppose I am somewhere in the range between dabbler and knowledgeable in both categories. Thanks for reading, everyone!

Thanks to Khuyen for her post on embedding plots into Medium. The code used to create the data can be found here.
