- Level Foundation
- Duration 9 hours
- Course by University of Michigan
About
By the end of this third course in the Total Data Quality Specialization, learners will be able to:
1. Learn about design tools and techniques for maximizing TDQ across all stages of the TDQ framework during a data collection or data gathering process.
2. Identify aspects of the data generating or data gathering process that impact TDQ, and assess whether and how such aspects can be measured.
3. Understand TDQ maximization strategies that can be applied when gathering designed and found/organic data.
4. Develop solutions to hypothetical design problems arising during the process of data collection or data gathering and processing.
This specialization as a whole aims to explore the Total Data Quality framework in depth and to give learners more information about the detailed evaluation of total data quality that needs to happen prior to data analysis. The goal is for learners to incorporate evaluations of data quality into their process as a critical component of all projects.
We sincerely hope to disseminate knowledge about total data quality to all learners, such as data scientists and quantitative analysts, who have not had sufficient training in the initial steps of the data science process that focus on data collection and evaluation of data quality. We feel that extensive knowledge of data science techniques and statistical analysis procedures will not help a quantitative research study if the data collected or gathered are not of sufficiently high quality.
This specialization will focus on the essential first steps in any type of scientific investigation using data: generating or gathering data, understanding where the data come from, evaluating the quality of the data, and taking steps to maximize the quality of the data prior to performing any kind of statistical analysis or applying data science techniques to answer research questions.
Given this focus, there will be little material on the analysis of data, which is covered in myriad existing Coursera specializations. The primary focus of this specialization will be on understanding and maximizing data quality prior to analysis.
Modules
Welcome!
1
Videos
- Welcome to Course 3 and the final course in the Specialization!
2
Readings
- Course Syllabus
- Course Pre-Survey
Validity
1
Assignment
- Design Strategies for Maximizing Validity
4
Videos
- Maximizing Validity for Designed Data
- Case Study: Improving Questions Based on Pre-Testing Results
- Maximizing Validity for Gathered Data
- Case Study: Improving the Validity of Gathered Data using Auxiliary Data and Transformations
1
Readings
- Case Study pre-read: Improving Google Flu Trends Estimates for the United States through Transformation
Data Origin
1
Assignment
- Design Strategies for Maximizing Data Origin Quality
4
Videos
- Maximizing Data Origin Quality for Designed Data
- Case Study: Standardized vs. Conversational Interviewing
- Maximizing Data Origin Quality for Gathered Data
- Case Study: Simple Lessons Learned for Improving Data Origin Quality While Web Scraping
1
Readings
- Optional: links from previous lecture on Maximizing Data Origin Quality for Gathered Data
Processing
1
Assignment
- Design Strategies for Maximizing Processing Quality
4
Videos
- Maximizing Processing Quality for Designed Data
- Example: Double Data Entry and Imputation to Maximize Data Processing Quality
- Maximizing Processing Quality for Gathered Data
- Example: Maximizing Processing Quality for Gathered Data
1
Readings
- Files for the next example
Data Access
1
Assignment
- Strategies for Maximizing Access Quality
3
Videos
- Maximizing Data Access Quality for Designed Data
- Maximizing Data Access Quality for Gathered Data
- Example: Maximizing Data Access Quality for Gathered Data
1
Readings
- Exploring and Evaluating Enhancements for ABS Sampling Frames
Data Source
1
Assignment
- Strategies for Maximizing Source Quality
3
Videos
- Maximizing Data Source Quality for Designed Data
- Example: Maximizing Data Source Quality for Designed Data
- Maximizing Data Source Quality for Gathered Data
1
Readings
- Probability Samples of Twitter
Data Missingness
1
Assignment
- Strategies for Minimizing Data Missingness
5
Videos
- Minimizing Data Missingness for Designed Data
- Example: Imputation and Weighting Adjustment
- Minimizing Data Missingness for Designed Data: Responsive and Adaptive Survey Design
- Minimizing Data Missingness for Gathered Data
- Example: Minimizing Data Missingness for Gathered Data
2
Readings
- Files for next example
- Optional: .csv and .py files for the next lecture
Maximizing the Quality of Data Analysis
1
Assignment
- Maximizing Data Analysis Quality
1
Peer Review
- A Study of Wordle Performance
4
Videos
- Maximizing the Quality of an Analysis of Designed Data
- Case Studies in Analytic Error
- Maximizing the Quality of an Analysis of Gathered Data
- Case Study: Maximizing the Quality of an Analysis of Video Image Data
Course and Specialization Conclusion
3
Readings
- Course and Specialization Conclusion
- References for Design Strategies for Maximizing Total Data Quality
- Course and Specialization Post-Survey
Auto Summary
Enhance your data science skills with "Design Strategies for Maximizing Total Data Quality," a University of Michigan course offered on Coursera. This foundational course, part of the Big Data and Analytics domain, focuses on optimizing data quality throughout the collection and gathering stages. Over 540 minutes, you'll learn design tools, assess data processes, and develop solutions for maximizing data quality. Aimed at data scientists and quantitative analysts, the course emphasizes the importance of high-quality data for effective analysis.

Brady T. West

James Wagner

Jinseok Kim

Trent D Buskirk