#BringOutYerNulls is being used on Twitter to help highlight the challenge of publishing so-called “null results” – when no effect is seen. For example, in an experiment comparing a group of men and women on a parallel parking task, a null result would indicate that males and females did not differ in their ability to squeeze into tricky parking spots. Null results are important because they can, as in this example, dispel myths or challenge assumptions. However, they are also hard to publish in academic journals. I recently had a null result from our randomised controlled trial of an iPad app published after an arduous process which took 18 months, five journals and 16 expert reviewers. Despite this, I think it is a good paper. I am obviously biased, because I wrote it, but I think it is fair to say that well-designed randomised controlled trials of supports for autism are still few and far between, and so this trial adds a considerable amount to the sum of knowledge. But it obviously wasn’t good enough for the first four journals we sent it to. And what did they think was wrong with it?
It is clear from the reviewer comments that one reason was the null result. The two groups in our study weren’t different from each other. The intervention didn’t “work”, or at least not the way we measured it. One reviewer illustrated this very clearly when they commented: “Now comes the hard part. I am afraid that I am recommending that the paper be rejected. The reason for this recommendation is that, despite the innovative intervention and the above mentioned design strengths, the results showed no intervention effect.” Another reviewer noted: “Given the lack of meaningful results, I do not recommend publication of this manuscript at this time, as it will not advance the literature on the use of technology in autism treatment.”
This is really disappointing because null results are still informative and add to understanding. As another reviewer said: “Some may note that the intervention does not have a significant effect, but this is useful information. If iPad interventions do not survive the rigour of RCTs, the area really needs to be aware of this, and to reflect upon what to do about this. If future RCT publications do find an effect, this study will be a useful comparison.” Another reviewer noted: “Even without treatment effects I think that this paper teaches us something new about intervention approaches.”
So why do so many reviewers fixate on statistically significant results as being more ‘meaningful’ than null results? Well, one possibility is that it is just more exciting when there is something new to report – some pattern or deviation from what we consider to be ordinary. In this case, what we reported in our paper is that all the children in our study (whether they played with our special iPad app or not) just made a normal amount of progress. This is like a newspaper with a headline saying that Adele sang perfectly in tune at a live show, or Madonna performed without falling over her cape, or Leicester City is in the middle of the Premier League table. It’s not news – we want to know when something happens which is different, unusual, exciting. Children with autism just gradually learning a bit of new stuff over six months? Not news.
That doesn’t sound very scientific though. Is there a systematic difference between a null result and a significant result which can explain reviewers’ attitudes? Well, sort of. In an experimental setting, a statistically significant result gives us evidence that an effect exists, whereas a null result simply fails to find one. Going back to the parking example, a significant result might allow me to say “women parallel park better than men” but a null result simply says “I haven’t found a difference in how well men and women park”. What we can’t say is “women and men have the same parking ability” because a test of this kind can never prove the (null) hypothesis.
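To make that distinction concrete, here is a minimal sketch in Python. The parking scores are invented purely for illustration (they are not data from any study), and the function name is my own. It works out the difference between the two groups along with an approximate 95% confidence interval, assuming a simple normal approximation.

```python
import statistics

# Hypothetical parking scores (0-100), invented purely for illustration.
men = [62, 71, 58, 66, 74, 60, 69, 65]
women = [64, 73, 61, 70, 68, 75, 63, 72]


def difference_with_interval(group_a, group_b, z=1.96):
    """Mean difference (b minus a) with an approximate 95% confidence interval.

    Uses a normal approximation (z = 1.96); with samples this small a
    t-based interval would be somewhat wider.
    """
    diff = statistics.mean(group_b) - statistics.mean(group_a)
    # Standard error of the difference between two independent means.
    se = (statistics.variance(group_a) / len(group_a)
          + statistics.variance(group_b) / len(group_b)) ** 0.5
    return diff, diff - z * se, diff + z * se


diff, low, high = difference_with_interval(men, women)
print(f"difference = {diff:.1f}, 95% CI roughly {low:.1f} to {high:.1f}")
```

With these made-up numbers the interval runs from about -2.7 to about 8.0. It includes zero, so this would be a null result, but it also includes differences of several points in either direction. The data are compatible with "no difference" without demonstrating it.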
The problem with ignoring null results, or discouraging them from being published, is that we end up with a skewed evidence base for medical practice. The single trial showing that X is a wonder drug for hypertension would be viewed very differently alongside another twenty trials where no effect was found. In fact, the real message here is to get away from focusing on single results at all – it is only by building a collection of evidence around a particular approach that we can be confident in making recommendations for practice. And to do that, we need to make sure null results weigh in the balance too.
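A rough simulation shows how this skew can arise. The sketch below is an illustration under stated assumptions, not a model of any real literature: the treatment has no true effect at all, outcomes are normally distributed with a standard deviation of 1, each trial has 30 people per group, and only trials that come out significantly in favour of the treatment get "published". All names and numbers are my own.

```python
import random
import statistics


def simulate_trial(n_per_group, true_effect, rng):
    """Return (observed mean difference, z statistic) for one two-group trial."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / n_per_group
          + statistics.variance(control) / n_per_group) ** 0.5
    return diff, diff / se


def run(n_trials=2000, n_per_group=30, true_effect=0.0, seed=1):
    """Simulate many trials and split off the ones that would be 'published'."""
    rng = random.Random(seed)
    trials = [simulate_trial(n_per_group, true_effect, rng)
              for _ in range(n_trials)]
    all_diffs = [d for d, _ in trials]
    # Assumption: a trial is "published" only if it favours the treatment
    # and clears the usual threshold (z > 1.96, normal approximation).
    published = [d for d, z in trials if z > 1.96]
    return all_diffs, published


all_diffs, published = run()
print(f"trials run: {len(all_diffs)}, 'published': {len(published)}")
print(f"mean effect across all trials:    {statistics.mean(all_diffs):.2f}")
print(f"mean effect in published trials:  {statistics.mean(published):.2f}")
```

Across all the simulated trials the average effect sits close to zero, as it should. The small "published" subset, by construction, contains only positive differences, so anyone reading those papers alone would see a treatment that appears to work.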
For more information: