[Editor’s note: I never got around to completing this post, but a post I just read about how the sound of popcorn popping is the sound of the normal distribution seemed like a worthy addition to it.]

Many everyday statistics problems involve making repeated observations of something quantifiable and working out the most likely value of that observation. For example, maybe you’re trying to figure out what mileage your car gives you, so every time you fill up the tank, you record how many miles you drive before it runs empty. You might ask: what is the average distance my car will go on a full tank?

One way to solve this problem is to take an average of all your observations. But the simple average is not a very good statistical measure, mainly because it’s influenced by “outliers.” If one of your full-tank observations involved a very long highway road trip, then the average will make it look like you get really awesome mileage. That’s why car companies have to report a highway and a city mileage – the average highway mileage is bound to be higher than the average city-only mileage, and taking an average that combines the two would be misleading.
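To see how much a single outlier can move a simple average, here is a minimal sketch in Python. The miles-per-tank figures are made up for illustration, and the 410-mile tank stands in for the highway road trip.

```python
from statistics import mean

# Hypothetical miles-per-tank readings from ordinary driving.
typical_tanks = [295, 302, 298, 305, 300]

# The same readings plus one long highway road trip (the outlier).
with_road_trip = typical_tanks + [410]

typical_average = mean(typical_tanks)
skewed_average = mean(with_road_trip)

print(f"average without the road trip: {typical_average:.1f} miles")
print(f"average with the road trip:    {skewed_average:.1f} miles")
```

One unusual tank out of six is enough to lift the average by about 18 miles, even though none of the ordinary tanks went past 305.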

So how does one get an accurate picture of averages that is less affected by these outliers? One way is to use *probability distributions*. A “distribution” is a mathematical equation that tells you how frequently a specific value is observed when you’re making a series of observations. For example, when you are tabulating your car’s mileage for each tank of gas you fill, you end up with a distribution that might go something like – 40% of the time, you get about 30 mpg, 20% of the time you get about 28 mpg, 22% of the time you get about 31 mpg, and so on.
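Tabulating a distribution from raw observations is mostly a matter of counting. The sketch below uses made-up mpg readings, rounded to whole numbers, and works out what share of fill-ups landed on each value.

```python
from collections import Counter

# Hypothetical mpg readings, one per tank, rounded to the nearest whole number.
mpg_readings = [30, 28, 31, 30, 30, 29, 31, 28, 30, 32]

counts = Counter(mpg_readings)
total = len(mpg_readings)

# Share of fill-ups at each mpg value: an empirical distribution.
shares = {mpg: count / total for mpg, count in sorted(counts.items())}

for mpg, share in shares.items():
    print(f"{mpg} mpg: {share:.0%} of the time")
```

The shares always add up to 100%, which is what makes this a distribution rather than just a list of counts.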

A “normal” distribution is a very special kind of distribution for two reasons. The first reason is that it occurs a lot in nature, and anything that is natural holds a special fascination for us human beings. If for example you measure how much each dog in a pack weighs, or how fast each one of a school of fish moves in an aquarium, or how big the pumpkins in a patch get – the distribution of these observations follows a common pattern. There is a single equation – this is where things get a bit spooky – that will describe every one of these natural phenomena. This equation is called the “normal distribution.”

Part of what makes this pattern so fascinating is that it’s very simple. The mathematical equation that describes a normal distribution requires only two parameters – the “mean” of the distribution and its “standard deviation.” The mean tells you where your observations are centered, and the standard deviation tells you how widely they spread around that center. Once you’ve made a few observations, and figured out that your observations fit a normal distribution – we’ll see how you do that, in a little bit – you can then use the distribution to predict what percentage of future observations will lie within a certain range.
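For the curious, here is a small sketch of that equation written out in Python. The function name `normal_pdf` and the example numbers are my own choices for illustration; the result is checked against `NormalDist` from Python’s standard `statistics` module.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mean, std_dev):
    """Height of the normal curve at x, given only the mean and standard deviation."""
    coefficient = 1 / (std_dev * math.sqrt(2 * math.pi))
    exponent = -((x - mean) ** 2) / (2 * std_dev ** 2)
    return coefficient * math.exp(exponent)

# Hypothetical values: centered on 300 miles per tank, standard deviation of 15 miles.
mean_miles, std_dev_miles = 300, 15

# The hand-written formula and the standard library should agree.
print(normal_pdf(310, mean_miles, std_dev_miles))
print(NormalDist(mean_miles, std_dev_miles).pdf(310))
```

Nothing else goes into the formula: change the mean and the curve slides left or right, change the standard deviation and it gets wider or narrower.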

If you’re starting to notice, for example, that your car goes about 300 miles on a full tank, you can then use the normal distribution to predict how far it will go, most of the time, in the future. It’s the predictability of various phenomena using the normal distribution that makes a lot of economics and planning possible in the real world.
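Here is a rough sketch of that kind of prediction, using `NormalDist` from Python’s standard `statistics` module. The 300-mile mean comes from the example above; the 15-mile standard deviation is an assumption I made up so there is something to compute with, as is the helper name `share_between`.

```python
from statistics import NormalDist

# 300 miles per tank on average; the 15-mile standard deviation is hypothetical.
tank_range = NormalDist(mu=300, sigma=15)

def share_between(dist, low, high):
    """Fraction of future tanks expected to land between low and high miles."""
    return dist.cdf(high) - dist.cdf(low)

one_sd = share_between(tank_range, 285, 315)  # within one standard deviation
two_sd = share_between(tank_range, 270, 330)  # within two standard deviations

print(f"285 to 315 miles: {one_sd:.1%} of tanks")
print(f"270 to 330 miles: {two_sd:.1%} of tanks")
```

With these numbers, roughly 68% of tanks should land between 285 and 315 miles, and roughly 95% between 270 and 330. That only holds if the mileage really does follow a normal distribution.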

There’s another reason the normal distribution is so awesome. It’s because of something called the *central limit theorem.* The central limit theorem has something to say about what happens when you have a lot of random events occurring together, and how they are related to the mathematics of the normal distribution. To understand this, let’s consider a lake full of frogs.

Imagine that in this lake, each frog acts independently of the others. This is a very critical requirement of the theorem, by the way, but it’s something that tends to be true in most of the natural world – things happen fairly independently of each other, at least for all practical purposes. Now every minute, let’s say you keep count of how many times each frog went *rib-bit.* Each frog has its own random behavior and there is no specific pattern that they are all following.

If you took the average number of *rib-bits* you got, across all the frogs, and kept track of that every minute, then the average number of *rib-bits* you observe will, as you tally them over time, start to fall into… you guessed it, a normal distribution. This is true no matter what behavior each individual frog is exhibiting with regard to its *rib-bit*ing, as long as there are enough frogs in the lake.

The central limit theorem means that even when something isn’t following a normal distribution, combining it with many other independent phenomena produces something that does, at least approximately. A large number of frogs, acting independently of each other, will combine to create a roughly normal distribution of the average number of *rib-bits* per minute.
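The frog lake is easy to simulate. In the sketch below, the number of frogs, the number of minutes, and each frog’s croaking behavior are all assumptions I invented for illustration: every frog gets its own maximum, and each minute it croaks a uniformly random number of times up to that maximum, which looks nothing like a bell curve. The check at the end uses a rule of thumb for normal distributions, namely that about 68% of values fall within one standard deviation of the mean.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

NUM_FROGS = 200     # hypothetical lake size
NUM_MINUTES = 2000  # how many minutes we listen for

# Each frog gets its own hypothetical maximum number of rib-bits per minute.
frog_max = [random.randint(1, 20) for _ in range(NUM_FROGS)]

def average_ribbits_one_minute():
    """Average rib-bit count across all frogs for a single minute."""
    # Every frog croaks independently: uniform between 0 and its own maximum.
    return sum(random.randint(0, m) for m in frog_max) / NUM_FROGS

averages = [average_ribbits_one_minute() for _ in range(NUM_MINUTES)]

# If the averages are roughly normal, about 68% of them should lie
# within one standard deviation of their own mean.
center, spread = mean(averages), stdev(averages)
share = sum(center - spread <= a <= center + spread for a in averages) / NUM_MINUTES

print(f"share of minutes within one standard deviation: {share:.2f}")
```

This is a sanity check rather than a proof. A histogram of `averages` would show the bell shape more directly.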