Data quality, simple yet complicated
Professor Carsten O. Schmidt, University of Greifswald, Germany
Data quality is a necessary ingredient of valid and sustainable research. Yet there is often confusion about how to define it and how to assess it. This talk focuses on precisely these two aspects, starting with concepts and frameworks for addressing data quality. The terms used to describe different aspects of data quality vary considerably within and across disciplines. Two frameworks are presented, one relating to observational health studies and one to electronic health records. The two approaches are compared to illustrate the demands that arise from the different application scenarios. Thereafter, ways to implement data quality assessments are highlighted, with a focus on the R programming language. The talk also covers the prerequisites for implementing data quality assessments in a structured way, foremost adequate metadata management. It will be shown how measurement data should be described to enable automated assessments.
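
As a rough illustration of the last point, the sketch below shows how describing each variable with metadata can drive automated checks. It is not taken from the talk and does not reproduce either framework or any R package: it is written in Python for brevity, and the metadata fields (data_type, lower_limit, upper_limit, missing_codes), the function assess, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class VariableMetadata:
    """Hypothetical description of one measured variable."""
    name: str
    data_type: str                      # "float", "integer" or "string"
    lower_limit: Optional[float] = None
    upper_limit: Optional[float] = None
    missing_codes: List[Any] = field(default_factory=list)


# Python types accepted for each declared data type (an assumption of this sketch).
TYPE_MAP = {"float": (int, float), "integer": (int,), "string": (str,)}


def assess(records: List[Dict[str, Any]],
           metadata: List[VariableMetadata]) -> Dict[str, Dict[str, int]]:
    """Count missing, mistyped and out-of-range values per variable."""
    report: Dict[str, Dict[str, int]] = {}
    for var in metadata:
        counts = {"n": 0, "missing": 0, "wrong_type": 0, "out_of_range": 0}
        for rec in records:
            counts["n"] += 1
            value = rec.get(var.name)
            # Absent values and declared missing codes count as missing.
            if value is None or value in var.missing_codes:
                counts["missing"] += 1
                continue
            # Mistyped values are counted and excluded from the range check.
            if not isinstance(value, TYPE_MAP[var.data_type]):
                counts["wrong_type"] += 1
                continue
            below = var.lower_limit is not None and value < var.lower_limit
            above = var.upper_limit is not None and value > var.upper_limit
            if below or above:
                counts["out_of_range"] += 1
        report[var.name] = counts
    return report


# Invented example: systolic blood pressure (mmHg) and sex.
metadata = [
    VariableMetadata("sbp", "float", lower_limit=60, upper_limit=260,
                     missing_codes=[99999]),
    VariableMetadata("sex", "string"),
]
records = [
    {"sbp": 120.0, "sex": "f"},
    {"sbp": 99999, "sex": "m"},     # declared missing code
    {"sbp": 400.0, "sex": None},    # implausible value, missing sex
    {"sbp": "n/a", "sex": "f"},     # wrong data type
]
report = assess(records, metadata)
print(report)
```

Because the checks read everything they need from the metadata, adding a variable or changing a plausibility limit requires no change to the assessment code, which is the sense in which well-described data enable automation.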