Data analysis
Analysis plan
The analysis plan is part of the protocol and is written before you see the data. Think about what to expect of the data and what the data may look like, and make a sample dataset with the variables you are considering. What are the potential strengths and limitations of your analysis plan? Address potential bias. What statistical models are appropriate for your study? Statisticians can make sure that the statistical models are appropriate for the research question. Statisticians can also suggest more complex models you may not have thought of, which may lead to a revision of your research questions. If you plan to do the statistical analysis yourself, check with CSTAT for advice. If you have questions about using statistical software, CSTAT consultants may be able to give you advice.
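One way to make a sample dataset before any real data exist is to simulate a few records. The sketch below is only an illustration: the variable names (`group`, `age`, `outcome`), the value ranges, the assumed treatment effect, and the share of missing values are all hypothetical placeholders, to be replaced by the variables in your own protocol.

```python
import random

def make_sample_dataset(n=20, seed=1):
    """Simulate n records with hypothetical variables, before real data exist."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    rows = []
    for i in range(1, n + 1):
        group = rng.choice(["control", "treatment"])
        age = rng.randint(18, 90)
        # Assumed effect for illustration only: treatment shifts the mean by 5 units
        mean = 50 + (5 if group == "treatment" else 0)
        outcome = round(rng.gauss(mean, 10), 1)
        # Leave roughly 10% of outcomes missing to mimic incomplete data
        if rng.random() < 0.1:
            outcome = None
        rows.append({"id": i, "group": group, "age": age, "outcome": outcome})
    return rows

sample = make_sample_dataset()
for row in sample[:3]:
    print(row)
```

Looking at such a mock dataset can help you decide which tables, figures, and models the plan should name, and which variables still need a clear definition.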
Think before you analyze ("Initial data analysis")
- Perform initial data analysis, namely data cleaning and data screening that do not touch the research questions.
- Keep a record of all your analysis steps and changes you may make to the data.
- What descriptive statistics will you report?
- Do you notice data properties that are unexpected? For example, the distribution of a variable differs from what you expected. What is the reason for this? Are there patterns in missing values?
- For examples of what to look out for, see: A contemporary framework for initial data analysis. Huebner et al.
- On the basis of your results, do you need to perform sensitivity analyses?
- Should there be an update or refinement of your analysis plan?
- Be aware that data cleaning and data screening take time. In our experience, about 2-10% of data in Excel spreadsheets may be in error (incorrect dates, unusually high or low numbers, missing data).
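The screening steps above (impossible dates, unusually high or low numbers, missing data) can be sketched as a small check that only reports problems and never changes the data. This is a minimal sketch under stated assumptions: the records, the variable names (`visit_date`, `age`, `sbp`), the date format, and the plausible ranges are hypothetical and stand in for your own data dictionary.

```python
from datetime import datetime

# Hypothetical records as they might arrive from a spreadsheet export
records = [
    {"id": "1", "visit_date": "2023-03-14", "age": "54", "sbp": "128"},
    {"id": "2", "visit_date": "2023-02-30", "age": "47", "sbp": "135"},
    {"id": "3", "visit_date": "2023-04-02", "age": "", "sbp": "1220"},
    {"id": "4", "visit_date": "2023-04-09", "age": "61", "sbp": ""},
]

# Assumed plausible ranges; in practice these come from subject-matter knowledge
PLAUSIBLE = {"age": (18, 110), "sbp": (60, 260)}

def screen(rows):
    """Return (id, variable, problem) tuples without altering the data."""
    issues = []
    for row in rows:
        # A date that cannot be parsed (e.g. 30 February) is flagged as invalid
        try:
            datetime.strptime(row["visit_date"], "%Y-%m-%d")
        except ValueError:
            issues.append((row["id"], "visit_date", "invalid date"))
        for var, (low, high) in PLAUSIBLE.items():
            value = row[var].strip()
            if value == "":
                issues.append((row["id"], var, "missing"))
                continue
            try:
                number = float(value)
            except ValueError:
                issues.append((row["id"], var, "not a number"))
                continue
            if not low <= number <= high:
                issues.append((row["id"], var, "out of range"))
    return issues

issues = screen(records)
for issue in issues:
    print(issue)
```

Saving the resulting list alongside the raw file gives you a record of what was found, so that any later correction to the data can be traced back to a documented reason.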
Perform statistical analysis
- Be prepared to review summary statistics provided by your statistician promptly and then proceed with modeling.
- Prompt communication is crucial for avoiding delays.
- Consider a workshop on the statistical software R or on statistical models, offered by CSTAT or elsewhere.
For abstract deadlines, keep in mind that statisticians need 2-3 weeks of leeway, provided the data are already clean and you have an account number.