Five sigma — a simple explanation

You may have heard the term 5 sigma before. It came to the attention of a lot of people when the Higgs boson was discovered by the Large Hadron Collider (LHC) at CERN in 2012.

Basically, 5 sigma is a level of statistical confidence in an experimental result that science, and physics in particular, considers strong enough to count as a discovery.

Experimental measurement is a careful and time-consuming task. Scientists have to eliminate any potential errors in the design of the experiment or the accuracy of the measuring equipment. They also have to rule out, as far as they can, the possibility that the result they see arose purely by chance.

Sometimes it’s handy to talk about an average or mean result. If we run an experiment 10 times we might glean something from the average value of our result.

Suppose each result can be any value between 0 and 10. If our ten results are 4, 4, 4, 4, 4, 6, 6, 6, 6, 6 then the average is 5. That tells us something but it’s not the whole picture.

If our results are 0,0,0,0,0,10,10,10,10,10, the average is also 5, but I’m sure you can see that in this example we’re getting the edge values all the time. That might tell us something a mere average doesn’t.
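As a quick check of the two examples, here is a minimal Python sketch. The variable names are just illustrative. It shows that both sets of results share the same average even though their lowest and highest values are very different.

```python
# The two example result sets from the text.
clustered = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]
extreme = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]

# The average (mean) is the sum of the results divided by how many there are.
mean_clustered = sum(clustered) / len(clustered)
mean_extreme = sum(extreme) / len(extreme)

print(mean_clustered, mean_extreme)   # 5.0 5.0

# The lowest and highest values hint at what the average hides.
print(min(clustered), max(clustered))  # 4 6
print(min(extreme), max(extreme))      # 0 10
```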

We can learn something new by looking at the distribution of the results as well as the average.

One useful statistical way to analyse a set of results is to use something called a standard deviation. Don’t worry about the technicalities for now, but it is a formula that, applied to a set of results like our numbers between 0 and 10 above, measures how far they typically sit from the average.
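To make that concrete, here is a small Python sketch that works out the standard deviation of the two example sets. It assumes the "population" form of the formula (dividing by the number of results), which is what `statistics.pstdev` in Python's standard library computes. The "sample" form, `statistics.stdev`, divides by one less and would give slightly larger numbers.

```python
from statistics import pstdev

clustered = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]
extreme = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]

# Population standard deviation: the typical distance of a result from the average.
sd_clustered = pstdev(clustered)
sd_extreme = pstdev(extreme)

print(sd_clustered)  # 1.0, every result is 1 away from the average of 5
print(sd_extreme)    # 5.0, every result is 5 away from the average of 5
```

Same average, very different standard deviations: that is the extra information the average alone doesn't give you.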

When many sets of measurements are plotted on a graph, they follow what’s called a normal distribution, and the standard deviation sets how wide it is. The graph traces a shape called a bell curve, which looks vaguely like a bell:

Bell curve showing up to 3 sigma.

The average is in the centre and you’d expect there to be more values around the average than at the extremes of the graph.

Sigma (the Greek letter σ) is the symbol for standard deviation. It divides the normal distribution graph into sections, with the sigma numbers getting higher the further you move from the central average.

For a normal distribution, the share of results that fall within a given number of sigma either side of the average is:

  • within 1 sigma: 68.27% of all results,
  • within 2 sigma: 95.45% of all results,
  • within 3 sigma: 99.73% of all results,
  • within 4 sigma: 99.9936658% of all results,
  • within 5 sigma: 99.9999427% of all results.
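If you’d like to work out these percentages yourself, they come from a standard function called the error function, available in Python as `math.erf`. This is a minimal sketch, and the helper name `coverage_within` is made up for this example. It assumes a perfect normal distribution.

```python
import math

def coverage_within(k):
    """Fraction of a normal distribution lying within k sigma either side of the average."""
    return math.erf(k / math.sqrt(2))

# Print the percentage of results within 1 to 5 sigma of the average.
for k in range(1, 6):
    print(f"{k} sigma: {coverage_within(k) * 100:.7f}%")
```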

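In particle physics, 5 sigma is usually quoted for one side of the curve only: the chance of a result landing at least 5 sigma above the average purely at random is about 1 in 3.5 million. (Counting both sides of the curve, it is about 1 in 1.7 million.)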
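The 1 in 3.5 million figure can be checked with a few lines of Python. This sketch assumes the figure refers to one tail of a normal distribution, meaning the chance of landing at least 5 sigma above the average, which is the convention usually used in particle physics. The helper name `one_sided_tail` is made up for this example.

```python
import math

def one_sided_tail(k):
    """Chance of a normally distributed result landing at least k sigma above the average."""
    return 0.5 * math.erfc(k / math.sqrt(2))

p = one_sided_tail(5)
odds = 1 / p

print(f"probability: {p:.3e}")           # roughly 2.9e-07
print(f"about 1 in {odds:,.0f}")          # roughly 1 in 3.5 million

# Counting both tails (5 sigma above or below) doubles the probability.
odds_two_sided = 1 / (2 * p)
print(f"two-sided: about 1 in {odds_two_sided:,.0f}")  # roughly 1 in 1.7 million
```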

When the Higgs boson was discovered at CERN, it was seen by two different experiments using two different detectors (ATLAS and CMS), so not all their eggs were in one basket, and this strengthens confidence in the result.

Wording is everything though. To be precise about the Higgs boson discovery, 5 sigma says there’s only about a 1 in 3.5 million chance they’d get results at least as striking as the ones they did if there were no Higgs boson. The conclusion was that they had therefore discovered the Higgs boson.

Results in science are normally presented within a margin of error. It might be a small margin, but it’ll be there. In a sense, you never know anything with 100% certainty, but if you see something at 5 sigma you know the chances of it occurring at random (or by other means, maybe) are so ludicrously small you can accept it as fact.

The end, but for anyone interested in the standard deviation formula, it’s as follows:

Standard deviation formula.
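Written out, the commonly used "population" form of the formula looks like this. It is given here as an assumption about which version is meant. The "sample" version divides by N − 1 instead of N. Here N is the number of results, x_i is each individual result and μ is the average.

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( x_i - \mu \right)^2},
\qquad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
```

In words: find the average, measure how far each result is from it, square those distances, average the squares, then take the square root.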