Although its discovery has now been disputed (via Gizmodo), the “apparent” recent discovery of the Higgs Boson has elicited much discussion. Wayne Myrvold, a philosopher at the University of Western Ontario, wrote a nice though slanted explanation of the statistical background of the claim. I wrote a somewhat lengthy reply as a comment on his article, but for whatever reason it has yet to be posted, so I’m reproducing it here. Please read Wayne’s article first.
We can all agree that reasoning and decision making in science is complicated. Scientists reason in many different contexts: in the lab, in their published papers, as career-minded professionals, as interested consumers of science, and as people going about their lives. It’s plausible to think that they reason in different ways in all of these contexts. When we’re discussing their reasoning as scientists, I believe distinguishing between the first three contexts is especially important. While Wayne’s explanation of the statistics behind the Higgs Boson discovery is very interesting, informative, and as far as I can tell correct, I think there are some confusions arising from his failure to make these distinctions.
Wayne’s explanation fails to distinguish between two different statistical methodologies: the one underlying the gradual changes in subjective belief that he describes as arising from the accumulation of data, and the one underlying the “five sigma signal” that justifies CERN’s discovery claim. Changes in belief, as he described, occur through Bayesian updating. Bayesianism is an account of how our beliefs change, or ought to change, as we accumulate new experiences. The five sigma signal, in contrast, is a measure of statistical significance. Such significance measures come from an entirely different statistical methodology, often known as “standard” or “Neyman-Pearson” statistics. While the product of Bayesian updating is a degree of belief that a claim is true (say, that the Higgs Boson exists and has a certain energy level), the product of standard statistics is the probability that we would observe results like ours through random chance alone. That is, if no Higgs Boson existed, how often would we observe the results we do? A five sigma result means that, as Wayne explained, there is about a one in a million chance that we would. Importantly, this *does not* mean that there is a one in a million chance that the Higgs Boson doesn’t exist, or that we are 99.9999% certain the Higgs Boson does exist. There is no way to translate between our beliefs that the Higgs Boson exists, as measured by Bayesian statistics, and the probability that we would get the observed results through chance, as measured by standard statistics.
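To make the “one in a million” figure concrete, here is a minimal sketch of the tail probability behind a five sigma result. It assumes the test statistic is normally distributed when there is no Higgs Boson and that only upward fluctuations count (a one-sided test); the function name `one_sided_p_value` is my own, not something from Wayne’s article or from CERN’s analysis.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma.

    Assumption: the no-Higgs (null) distribution of the test statistic
    is Gaussian, and only upward fluctuations count as evidence.
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


# Tail probability for a five sigma fluctuation under the null hypothesis.
p = one_sided_p_value(5.0)
print(f"p-value at 5 sigma: {p:.3e}")
print(f"roughly 1 in {1 / p:,.0f}")
```

Under these assumptions the tail probability comes out near 3 in 10 million, or roughly 1 in 3.5 million, so “about one in a million” is right to within an order of magnitude. Note what the function takes as input: only the size of the fluctuation, assuming no Higgs Boson exists. Nothing in it refers to how probable the Higgs Boson was to begin with, which is why the output cannot be read as a degree of belief.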
Given this distinction, it isn’t really correct for Wayne to imply that the five sigma signal is a “threshold of certainty for telling the world you’ve made a discovery”, because you can’t get from the five sigma result to any measure of certainty that the Higgs Boson exists. Instead, I think a better way to say this would be that a five sigma result has been set by the community of physicists as a standard for claiming discovery. Scientists agreed that once this threshold was reached (that is, once the probability that the result would have been observed if the Higgs Boson did not exist was small enough), they could claim discovery. Thus, I would say there is a moment at which the Higgs Boson was discovered: the moment that the statistical analysis was completed showing that their data had met or exceeded the previously established criterion. This is similar to claiming that a house is built once the last nail has been hammered and the last coat of paint has been applied. Although the construction of the house took place over perhaps months, it was built, or completed, at a specific time. Yes, there is some grey area for both houses and science, but that grey area is relatively narrow.
There is also a distinction between the moment of discovery and the moment of claiming the discovery. Once a discovery has been made, there is a further decision of whether and when to announce or publish that discovery. The history of science is full of examples of scientists declining to announce discoveries for years after they have been made, sometimes waiting until after their deaths. I think Wayne is right to apply decision theory to the question of whether to announce. However, it’s important to clearly distinguish between this announcement decision and the discovery moment. In Heather Douglas’s account, values indirectly enter science through the setting of significance thresholds. There is nothing magical about the five sigma threshold. It is a complicated product of many different considerations, including the practical limits of present technology, the phenomena being studied, and the social and political context. Douglas’s argument is that those latter considerations can properly be considered part of scientific reasoning: scientists can and must consider social and political values when setting their thresholds for making scientific claims. However, setting this threshold is a separate procedure from deciding whether to announce the discovery. Social and political considerations may enter into the decision of whether to announce a discovery as well, but such considerations are already baked into the discovery itself. Whether or not CERN decided to announce, they had already, objectively, made their discovery, and the criteria for making that discovery already included value considerations.
So in summary, I think Wayne’s explanation of the discovery is valuable and mostly accurate, but he failed to clearly distinguish between (a) what scientists believe about the Higgs Boson, (b) whether and when scientists discovered the Higgs Boson, and (c) how scientists decided to announce that discovery. These three processes can involve different styles of reasoning, and indeed completely different statistical methodologies. Not making these distinctions can lead to an incomplete or incorrect understanding of the scientific process.