With experiments run 40 million times per second every day, the Large Hadron Collider (LHC) is a Big Data fanatic’s idea of heaven.
The last couple of days of pre-announcement versus official-announcement reporting on how near CERN scientists were to finding the subatomic Higgs boson, or ‘God’ particle, showed how important a role statistics plays in the sciences, not least in pinpointing outliers and measuring degrees of uncertainty.
In statistics, if something is quantifiably less uncertain, it is more certain. Pre-announcement reports of the CERN scientists’ findings from the LHC were expressed in terms of a ‘four-sigma’ observation. This, said David Hand, Professor of Statistics at Imperial College, equated “in rough ballpark terms” to a 1 in 30,000 chance that they had made an error.
A four-sigma observation meant that the scientists could not yet say that they had definitely found the Higgs boson. Such an observation lies so far out (four standard deviations from the mean) that it, or something more extreme, would occur by chance only about 1 time in 30,000.
The ‘threshold of certainty’ as to whether or not this was indeed the particle was a five-sigma observation (or result). A five-sigma result is further out still: it, or something more extreme, would occur by chance only about 1 time in 3.5 million.
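For readers who want to check the arithmetic, here is a minimal sketch of where the ‘1 in 30,000’ and ‘1 in 3.5 million’ figures come from. It assumes the measurement follows a normal (bell curve) distribution and that only the upper tail is counted, i.e. a one-sided test; neither assumption is spelled out in the reports, although the quoted figures are consistent with them. The function name `one_sided_tail` is purely illustrative.

```python
import math


def one_sided_tail(z):
    """Chance that a standard normal value falls z or more
    standard deviations above the mean (upper tail only)."""
    # erfc is the complementary error function from the standard library;
    # 0.5 * erfc(z / sqrt(2)) equals 1 minus the normal CDF at z.
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (4, 5):
    p = one_sided_tail(z)
    print(f"{z}-sigma: p = {p:.3g}, or about 1 in {1 / p:,.0f}")
```

Under these assumptions the four-sigma tail works out to a little over 1 in 31,000 and the five-sigma tail to roughly 1 in 3.5 million, matching the rounded figures quoted above. Counting both tails instead would double each probability.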
Today the scientists have officially announced a five-sigma result, which gives us more confidence than a four-sigma observation that there is something real there and not just a fluke or experimental error.
NB: Please see the Understanding Uncertainty site for a great explanation of the Higgs boson and the role of sigma observations. See ‘Explaining five-sigma – How well did they do?’ for more on how various media outfits coped with explaining the statistics behind the news.