Research Integrity: Common pitfalls in design, analysis, and interpretation
Researchers make many decisions over the course of a study, from its purpose (describe, predict, diagnose, explain) and design (choice of variables, data collection) to its analysis (data properties, missingness, modeling choices) and the interpretation of results. Irreproducibility results when these decisions are handled in an ad hoc manner. In a Nature survey of 1,576 scientists, the top-rated factors contributing to irreproducibility were time pressure, selective reporting (p-hacking), and low statistical power or poor analysis. Computational algorithms applied without a clear understanding of variation, bias, and the proper interpretation of coefficients in statistical models may produce confounded or overfitted results.
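To make the effect of selective reporting concrete, the following is a minimal simulation sketch, not taken from any of the readings. It assumes a two-group study with no true effect, independent outcomes with known unit variance, and a two-sided z-test; the function names and parameter values (30 participants per group, 2,000 simulated studies, 10 outcomes) are illustrative choices only. It compares how often a "significant" result appears when one prespecified outcome is tested versus when the smallest of several p-values is reported.

```python
import math
import random


def two_sample_p_value(x, y):
    """Two-sided z-test p-value for a difference in means (assumes known unit variance)."""
    n, m = len(x), len(y)
    z = (sum(x) / n - sum(y) / m) / math.sqrt(1 / n + 1 / m)
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_outcomes, n_per_group=30, n_studies=2000, alpha=0.05, seed=1):
    """Share of null studies with p < alpha when only the smallest of n_outcomes p-values is reported."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same distribution, so any "effect" is noise.
            treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
            control = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(two_sample_p_value(treated, control))
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


single = false_positive_rate(n_outcomes=1)
selective = false_positive_rate(n_outcomes=10)
print(f"one prespecified outcome: {single:.3f}")
print(f"best of ten outcomes:     {selective:.3f}")
```

Under these assumptions the first rate should sit near the nominal 0.05, while the second should approach 1 - 0.95^10, roughly 0.40. Real outcomes are usually correlated, which would make the inflation smaller than in this idealized case, but still present.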
Reading list
- Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016). https://doi.org/10.1038/533452a
- van Smeden M. A Very Short List of Common Pitfalls in Research Design, Data Analysis, and Reporting. PRiMER. 2022;6:26. https://journals.stfm.org/primer/2022/van-smeden-2022-0059/
- Wicherts JM, Veldkamp CL, Augusteijn HE, Bakker M, van Aert RC, van Assen MA. Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking. Front Psychol. 2016 Nov 25;7:1832. doi: 10.3389/fpsyg.2016.01832.
- Baillie, Huebner, et al. on behalf of STRATOS TG3. Ten simple rules for initial data analysis. PLoS Comput Biol. https://doi.org/10.1371/journal.pcbi.1009819