With experiments run 40 million times per second every day, the Large Hadron Collider (LHC) is a Big Data fanatic’s idea of heaven.

The last couple of days of reporting, first on the pre-announcement and then on the official announcement, about how near CERN scientists were to finding the subatomic Higgs boson or ‘God’ particle showed how important a role statistics plays in the sciences, not least in pinpointing outliers and measuring degrees of uncertainty.

In statistics, if something is quantifiably less uncertain, it is more certain. Pre-announcement reports of CERN scientists’ findings from the LHC were shared in terms of a ‘four-sigma’ observation. This, said David Hand, Professor of Statistics at Imperial College, equated “in rough ballpark terms” to a 1 in 30,000 chance that they had made an error.

A four-sigma observation meant that the scientists were not able to say that they had definitely found the Higgs boson. A four-sigma observation is one that is so far out (4 standard deviations from the mean) that, if there were no real effect, it or something more extreme would occur by chance only about 1 time in 30,000.

The ‘threshold of certainty’ as to whether or not this was indeed the particle was a five-sigma observation (or result). A five-sigma result is further out still: it, or something more extreme, would occur by chance only about 1 time in 3.5 million.
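For readers who want to check the arithmetic, here is a minimal sketch of where the ‘1 in 30,000’ and ‘1 in 3.5 million’ figures come from. It assumes the measurement follows a normal (bell-curve) distribution and that only one tail counts, i.e. the chance of landing at least that many standard deviations above the mean. The function name one_sided_tail is made up for this illustration; it is not taken from any CERN analysis.

```python
import math

def one_sided_tail(sigma: float) -> float:
    """Probability that a standard normal value lands at least
    `sigma` standard deviations above the mean (one tail only)."""
    # For a standard normal Z, P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(sigma / math.sqrt(2))

for sigma in (4, 5):
    p = one_sided_tail(sigma)
    # Print the tail probability and its "1 in N" equivalent.
    print(f"{sigma}-sigma: p = {p:.3g}, about 1 in {1 / p:,.0f}")
```

Under these assumptions the four-sigma figure comes out at roughly 1 in 31,600, which is the ‘ballpark’ 1 in 30,000, and the five-sigma figure at roughly 1 in 3.5 million.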

Today the scientists have officially announced a five-sigma result, which gives us more confidence than a four-sigma observation that there is something real there and not just a fluke or experimental error.

NB: Please see the Understanding Uncertainty site for a great explanation of the Higgs boson and the role of sigma observations. See Explaining five-sigma – How well did they do? for more on how various media outfits coped with explaining the statistics behind the news.