Sometimes multiple tests are conducted of the same hypothesis, using the same model each time but with some model parameter changed. The graph above shows 21 different tests. As results for similar values of the parameter are similar, this can be viewed as a calibration of the model rather than as data-dredging folly.

Applying a Bonferroni correction to these tests would probably render all findings insignificant (i.e. instead of a 1% significance level, require a 1%/21 ≈ 0.048% significance level to attain significance). But, as adjacent test results in parameter space are strongly correlated, intuitively we haven't conducted 21 independent "tests" here - maybe closer to 5? Is there a way to quantify the effective number of independent tests in this manner? The graph above is illustrative - in this case I had sound physical reasons to believe the correlation would have a smooth peak somewhere between 1m, and on testing it indeed did, so I'm not very worried about it (if anything, it confirms the mechanism).

But let's suppose the test had turned out differently: that there was no peak where I expected it but a very good peak elsewhere, and that it also worked on a different dataset altogether. Suppose the unexpected peak invalidated my model, though it could be explained by another model I'd thought about before (albeit one I didn't happen to be testing this time) - let's call it model X. And suppose I had no alternative data for an independent test.

Eupraxis' answer would then imply that I had been data dredging because of the issues with my model, and cbeleites' answer would imply that I had to collect independent data to validate. This is surely a situation that crops up in practice, and to my knowledge there is no good answer to it.
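The Bonferroni arithmetic mentioned above can be sketched in a few lines. The 21 tests and the 1% family-wise level come from the discussion; everything else here is just illustration:

```python
# Bonferroni correction: divide the family-wise significance level
# by the number of tests to get the per-test threshold.
alpha_family = 0.01   # the 1% level from the question
n_tests = 21          # the 21 parameter settings tested

alpha_per_test = alpha_family / n_tests
print(f"per-test threshold: {alpha_per_test:.5f} ({alpha_per_test * 100:.3f}%)")
# prints: per-test threshold: 0.00048 (0.048%)
```

This is exactly the 1%/21 ≈ 0.048% figure quoted in the question, and it is conservative precisely because it treats all 21 tests as independent.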
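On the question of quantifying the effective number of independent tests: one published approach (Nyholt 2004, developed for correlated SNP tests) estimates it from the eigenvalues of the correlation matrix of the test statistics - the more the spectrum departs from the identity, the fewer effective tests. A minimal sketch; the AR(1) correlation structure below is an invented stand-in for the real data, not the questioner's actual correlations:

```python
import numpy as np

def effective_tests_nyholt(corr):
    """Nyholt (2004) estimate of the effective number of independent
    tests, from the eigenvalue variance of the correlation matrix."""
    m = corr.shape[0]
    eig = np.linalg.eigvalsh(corr)
    # Sample variance: 0 for independent tests (all eigenvalues 1),
    # m for perfectly correlated tests (one eigenvalue m, rest 0).
    return 1 + (m - 1) * (1 - np.var(eig, ddof=1) / m)

# Invented stand-in: 21 tests with AR(1) decay, neighbouring
# parameter values correlated at rho = 0.8.
m, rho = 21, 0.8
corr = rho ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))

m_eff = effective_tests_nyholt(corr)
print(f"effective number of tests: {m_eff:.1f} out of {m}")
```

The estimate interpolates between the two extremes: an identity correlation matrix gives back 21, perfect correlation gives 1, and a smoothly correlated parameter sweep lands somewhere in between - which is the intuition behind "maybe closer to 5" above. The resulting effective count could then replace 21 in the Bonferroni divisor, though whether that fully controls the family-wise error rate is itself debated in the literature.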