Generating reproducible statistical analyses and evaluation reports
Dr. Steven J. Pierce recently presented a talk titled “Generating reproducible statistical analyses and evaluation reports: Principles, practices, and free software tools” at the 2024 conference of the American Evaluation Association.
Slides and example files are available at https://github.com/sjpierce/Pierce.AEA2024.
Fully reproducible statistical analyses are ones for which investigators have shared all the materials required to exactly recreate their findings so others can verify them or conduct alternative analyses. That requires sharing the original (usually de-identified) data, supporting documentation, and the software code used to analyze the data. While reproducibility has been described as an attainable minimum standard for trustworthy, credible scientific work, it is not yet well embedded in evaluators’ professional training. This session will introduce the audience to a set of principles, practices, and free, open-source software tools that enable evaluators to efficiently generate reproducible statistical analyses and evaluation reports. We will cover why reproducibility is important in an evaluation context, then offer a vision of how to improve the reproducibility of your work and suggest concrete steps you can take to achieve that goal. We will discuss tailoring the degree of reproducibility you aim to achieve for a given project, which may vary due to project context or constraints. In terms of software, we will describe how R, RStudio, Quarto, and TinyTeX form a powerful suite of tools for generating dynamic documents, which mix narrative text with R code and can be compiled to automatically produce a fully formatted report, manuscript, or set of slides complete with narrative text, analysis results, figures, tables, and references. Git and GitHub.com add further value through support for version control and collaboration on the source code for dynamic documents. The session will include conceptual content, examples of dynamic documents, and links to supporting resources the audience can use to accelerate learning how to make their work more reproducible.