Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges
Dr. Lara Lusa, Natural Sciences and Information Technologies, University of Primorska, Slovenia and Institute for Biostatistics and Medical Informatics, University of Ljubljana, Ljubljana, Slovenia
In high-dimensional data (HDD), the number of measured variables greatly exceeds the number of participants included in the study. In biomedical research, omics HDD measure characteristics of the genome, proteome or metabolome; electronic health records can also provide HDD because of the large number of variables recorded.
The aim of this talk is to provide an overview of the characteristics of HDD, of the possible aims of the experiments that use them, and of the approaches most commonly used to prepare and analyze the data. The topics, presented briefly, will be: initial data analysis, exploratory data analysis, multiple testing, and prediction.
The talk is intended for researchers who would like to understand the challenges of the statistical analysis of HDD and of the interpretation of its results, including researchers who have no prior experience with HDD or who are newly embarking on research involving HDD. The talk is based on the paper recently published by the Topic Group "High-dimensional data" of the STRATOS initiative (Rahnenführer et al., BMC Medicine (2023) 21:182, https://doi.org/10.1186/s12916-023-02858-y).