What does it mean to justify your statistical decisions?

Lakens and 87 other researchers, including me, recently published a position paper on the p-level controversy. You can find the OSF repo here: https://osf.io/by2kc/

Part of our argument is that instead of swallowing a specific alpha-threshold (e.g. 0.05 or 0.005) as gospel, every researcher is asked to justify her or his choices (ideally before data collection). What the paper did not describe was how such a justification would look.

Why do we even need a justification?

Apparently, it is also hard to find a paper with such a justification. Constructive examples are missing. The following tweet can be seen as a condensation point of this issue:

As I argued in a blog post a while ago, statistical decision theory cannot bridge the naturalistic fallacy. The search for truth comes into conflict with other values, e.g. beneficence, justice, gratitude. If one value enters the game, all values enter.

Also, colleagues and I argued in an older paper that no amount of evidence can justify a certain decision. It requires normative “bridging principles”. My PhD supervisor Tom Potthast likes to call this kind of collaboration an “epistemic-moral hybrid”. But even if you do not believe in the analytical disjunction of ethical and empirical statements (1), you might need to accept that the social practice of justifying an action or decision is different from the social practice of justifying what is. Morality is counterfactual, and therefore a researcher's decisions are at least partially counterfactual, be it the question of whether to do a follow-up, replicate, explore novel hypotheses or exploit knowledge. This is the case regardless of whether you follow a Bayesian, Likelihoodist or Frequentist approach to statistical evidence. Sooner or later, you have to decide on your next step and how to invest resources. This step inevitably requires bridging principles, and therefore warrants justification.

But consider also that there is a need for justification only if the decision or action is not determined, i.e. if you have degrees of freedom. A stone rolling down a hill needs no justification. But throwing a stone down the hill does. Similarly, data, scripts and the statistical parameters calculated from them need no justification. They can be proven by showing the data, your notebooks or by math. What instead needs justification is your prior, why you did this experiment, your analysis workflow, or your alpha-level. Again, justification only makes sense if more than one action is justifiable. Otherwise there would be a unique solution to your decision problem, and your action would be determined. Therefore, justification only exists in professions, i.e. jobs requiring expertise, creativity or solving NP-hard problems. These are jobs that inherently cannot be transformed into a set of strict guidelines. One might paraphrase: The easy decision is always clear, the hard decision needs to be justified.

So how does a principled, justified decision look?

Sure, you are a researcher, and you want to be rational in your decisions. But first, let’s take a look at what can be considered justification. As Hans Albert (2) nicely argued, whenever you attempt to justify, you run into one of these three problems:

- infinite regress, where you forever add principles because your current principle needs to be justified.

- circular reasoning, where you justify principle A with principle B, but you justify principle B with principle A.

- dogma, where you decide that a certain principle is inevitably true, and therefore the foundation of all others.

Therefore, one extreme standpoint is the demand for ultimate justification, which is not possible. In my experience, this demand usually pairs with naive foundationalism, i.e. ignorance of the dogmatism of one’s own principles. On the other extreme is methodological anarchism, which is often described by Feyerabend's famous remark that “anything goes” (3).

Instead of these extreme positions, we might aim for something of a middle ground. Maybe justification can be considered sufficient if you have a coherent theory? Maybe if you have a set of principles, and your decision is not in direct violation of these principles, or is close to an equilibrium of opposing demands? This approach to justification is famous in medical ethics (4), where you try to balance non-maleficence, beneficence, respect for patient autonomy, justice and the trust within the professional relationship. Clearly, science has its own set of principles when it comes to decisions within its profession. Without claiming completeness, I might just throw in truth, curiosity, replicability, openness, inclusivity, transparency, frugality, methodological soundness or rationality.

Yet such a coherentist approach will often find that the set of accepted principles, and how they are balanced, is rarely fixed. Therefore, equilibria are rarely stationary. What is considered “ok” depends on the tradition and the current practice, and can change over time. In a similar vein, van Fraassen argued that science has its own traditions, and one has to understand decisions made by scientists within their specific traditions (5). Justification would therefore occur in front of an audience of peers within a similar tradition.

So, how can we justify a specific alpha-level?

Well, one approach is tradition. Certainly, history shows us that 0.05 is a point with a very stable equilibrium. Not by math, but by coherence with a set of principles. Now, in the last year, some principles became more important. This caused pressure on the alpha-level (6), and it might shift to 0.005 to please reproducibility, or this gradient can be countered by calls for frugality and curiosity. Additionally, how is a statistical decision actually linked to the principles of scientific tradition? Most of these links are not sufficiently understood, so how would one sufficiently justify statistical decisions by calling upon certain principles? In my opinion, that is why sticking to an arbitrary value feels like treading water, when instead we need to learn to swim. Changing alpha won't solve this at all. Yet, remember that all decisions are epistemic-moral hybrids? I predict there will be opposition by scientistic dogmatists against actual attempts at justification, because proper justification requires that "non-scientific" values enter the debate. Yet, while I personally feel (at the moment) sufficiently justified in keeping my alpha at 0.05, I know deep in my heart that anything goes and a lot can be justified. Even 0.005.
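To make the idea of an equilibrium between opposing demands a bit more tangible, here is a minimal sketch in Python. It is not the method of the position paper, just one possible toy formalisation: a one-sided z-test with a known effect size, where the cost of a false positive (think reproducibility) is weighed against the cost of a false negative (think frugality and curiosity). All function names, the cost weights and the prior are assumptions made purely for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def error_rates(alpha, effect_size, n):
    """Type I and Type II error rates of a one-sided, one-sample z-test.

    Assumes a known standard deviation of 1, so `effect_size` is a
    standardized mean difference (hypothetical illustration value).
    """
    critical_z = STANDARD_NORMAL.inv_cdf(1 - alpha)
    power = 1 - STANDARD_NORMAL.cdf(critical_z - effect_size * n ** 0.5)
    return alpha, 1 - power


def expected_cost(alpha, effect_size, n,
                  cost_false_positive, cost_false_negative, prior_h1=0.5):
    """Expected cost of a decision rule, given value-laden cost weights.

    The weights are the normative part: no data can supply them.
    """
    type_1, type_2 = error_rates(alpha, effect_size, n)
    return ((1 - prior_h1) * cost_false_positive * type_1
            + prior_h1 * cost_false_negative * type_2)


def least_costly_alpha(effect_size, n,
                       cost_false_positive, cost_false_negative, prior_h1=0.5):
    """Grid search over alpha in steps of 0.001 for the lowest expected cost."""
    candidates = [i / 1000 for i in range(1, 500)]
    return min(
        candidates,
        key=lambda alpha: expected_cost(
            alpha, effect_size, n,
            cost_false_positive, cost_false_negative, prior_h1,
        ),
    )


if __name__ == "__main__":
    # Same study design, two different value judgements about the errors.
    balanced = least_costly_alpha(0.5, 50, 1.0, 1.0)
    cautious = least_costly_alpha(0.5, 50, 10.0, 1.0)
    print("alpha when both errors cost the same:", balanced)
    print("alpha when false positives cost ten times more:", cautious)
```

The point of the sketch is not the numbers it prints, but where they come from: the math only tells you which alpha follows from your weights, while the weights themselves are exactly the bridging principles that need justification.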

Just as it is never the easy decision that requires justification, justification is never easy.

References

1: Quine, W. V. O. (1951). “Two Dogmas of Empiricism” https://doi.org/10.2307%2F2181906

2: Albert, Hans (1968). “Traktat über kritische Vernunft”

3: https://de.wikipedia.org/wiki/Anything_goes

4: Beauchamp & Childress (2009). “Principles of biomedical ethics”

5: van Fraassen, Bas (2002). “The empirical stance”

6: Benjamin et al. (2017). “Redefine Statistical Significance” https://scholar.harvard.edu/files/dtingley/files/sig-naturehumanbehaviour.pdf