Thread: What made you go WTF today?

  1. #1
    Good catch on the journal name.

    If the purpose of the study is to see whether X has a statistically significant effect on Y, you can't just keep upping the sample size until your p-value reaches an acceptable level. This is p-hacking taken to the extreme, and the authors nonchalantly admit to it.

    If your data has substantial variability, then it should be hard to obtain statistical significance. That's the whole point. If you want to claim there's a large effect size, that's fine. If you want to claim hypothesis testing is unnecessary and that you're just providing summary statistics, that's more or less fine. But you can't have your cake and eat it too by doing hypothesis testing and then explicitly changing the parameters of the study until you get the desired result. Or at least you shouldn't admit to doing it.
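
    To make the point concrete, here's a minimal simulation sketch (mine, not from the paper; all numbers are illustrative) of what "adding subjects until p < 0.05" does when there is no real effect: if you peek at the p-value after every batch and stop at the first significant result, the false-positive rate climbs well above the nominal 5%.

        # Minimal optional-stopping simulation under a true null (illustrative only).
        # Two groups with identical means; add a batch of subjects, run a t-test,
        # and stop as soon as p < 0.05 or the maximum sample size is reached.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)

        def optional_stopping_trial(batch=10, max_n=200, alpha=0.05):
            a, b = [], []
            while len(a) < max_n:
                a.extend(rng.normal(0, 1, batch))   # group A: no true effect
                b.extend(rng.normal(0, 1, batch))   # group B: no true effect
                if stats.ttest_ind(a, b).pvalue < alpha:
                    return True                     # "significant" despite the null being true
            return False

        n_sims = 2000
        hits = sum(optional_stopping_trial() for _ in range(n_sims))
        print(f"false-positive rate with peeking: {hits / n_sims:.3f}")
        # A fixed-n test keeps this near 0.05; peeking after every batch pushes it well above that.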
    Hope is the denial of reality

  2. #2
    Quote Originally Posted by Loki
    Good catch on the journal name.

    If the purpose of the study is to see whether X has a statistically significant effect on Y, you can't just keep upping the sample size until your p-value reaches an acceptable level. This is p-hacking taken to the extreme, and the authors nonchalantly admit to it.

    If your data has substantial variability, then it should be hard to obtain statistical significance. That's the whole point. If you want to claim there's a large effect size, that's fine. If you want to claim hypothesis testing is unnecessary and that you're just providing summary statistics, that's more or less fine. But you can't have your cake and eat it too by doing hypothesis testing and then explicitly changing the parameters of the study until you get the desired result. Or at least you shouldn't admit to doing it.
    I suspect your critique is fair, but I think it's important to distinguish this from p-hacking. P-hacking is essentially going in with no hypothesis, getting a shitload of data, and then running every test you can imagine until something 'significant' appears. This is distinct: they had a hypothesis but didn't know either the effect size or the population variance, so they scaled the sample size until they could detect the effect.

    That potentially shares some of p-hacking's problems if they didn't account for the multiple post-hoc tests when reporting p-values (I suspect this is the case), but it's not as dishonest as p-hacking. They're essentially saying 'we're pretty sure there's a signal here, and the size of the effect we saw was X' rather than 'we had a hypothesis that effect Y exists and is this size, and sure enough we were right'. It's a bit of sloppy science, since ideally they would have run a pilot study to inform their power analysis, but the fact of the matter is that no one does that.

    For subtle signals in in vivo data, I would indeed prefer to see at least one replication (or partial replication) before submitting this kind of data, and that would obviate the need to justify the sample size in this manner. But it's still informative IMO, even if the conclusions are far weaker than they would otherwise be.
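
    For what it's worth, here's a rough sketch of the kind of up-front power calculation a pilot study would feed, so the sample size is fixed before collecting the real data. The effect size (d = 0.5), alpha, and power below are assumptions for illustration, not numbers from the study.

        # A priori sample-size calculation for a two-sample t-test, using the standard
        # normal approximation n ~= 2 * (z_{1-alpha/2} + z_{1-beta})^2 / d^2 per group.
        import math
        from scipy.stats import norm

        def n_per_group(d, alpha=0.05, power=0.80):
            z_alpha = norm.ppf(1 - alpha / 2)   # two-sided critical value
            z_beta = norm.ppf(power)            # quantile matching the target power
            return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

        # Suppose a pilot suggested a moderate standardized effect, d around 0.5:
        print(n_per_group(0.5))   # ~63 per group; exact t-based methods give a similar number

    Fixing n in advance like this is what lets you report a single p-value without the optional-stopping problem the previous post describes.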
    "When I meet God, I am going to ask him two questions: Why relativity? And why turbulence? I really believe he will have an answer for the first." - Werner Heisenberg (maybe)
