Mar 19
R-Ladies: Introduction to Data Cleaning with dataMaid
VUMC Biostatistics Conference Room

Molly Olson will be leading us on a data cleaning journey! Data cleaning can be the most time-consuming part of a statistician’s workflow, and the validity of an analysis’s conclusions can depend on this seemingly simple step. At the Conference on Statistical Practice 2018, Molly took a short course titled “Cleaning up the Data Cleaning Process: Challenges and Solutions in R,” taught by Anne Helby Petersen and Claus Thorn Ekstrøm of the Section of Biostatistics, University of Copenhagen. The short course introduced an R package called “dataMaid” that offers a “systematic, analytical approach to data cleaning that will ensure the data cleaning process to be just as structured and well-documented as the rest of the data analysis.” In this tutorial, Molly will give an overview of the dataMaid package and demonstrate how to use it to improve efficiency and accuracy in the statistician’s everyday data cleaning.