- Level Foundation
- Duration 14 hours
- Offered by Johns Hopkins University
About
Data never arrive in the condition you need them in for effective data analysis. They need to be reshaped, rearranged, and reformatted so that they can be visualized or fed into a machine learning algorithm. This course addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data. This course covers many of the critical details of handling tidy and non-tidy data in R, such as converting from wide to long formats, manipulating tables with the dplyr package, understanding different R data types, processing text data with regular expressions, and conducting basic exploratory data analyses. Investing the time to learn these data wrangling techniques will make your analyses more efficient, more reproducible, and more understandable to your data science team. In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
Modules
About This Course
2
Readings
- About This Course
- Tidy Data Review
Reshaping Data
1
Assignment
- Reshaping Data Quiz
4
Readings
- Reshaping Data
- Wide Data
- Long Data
- Reshaping Data
Data Wrangling
1
Assignment
- Data Wrangling Quiz
13
Readings
- Data Wrangling
- R Packages
- The Pipe Operator
- Filtering Data
- Reordering
- Creating New Columns
- Separating Columns
- Merging Columns
- Cleaning Column Names
- Combining Data Across Data Frames
- Grouping Data
- Summarizing Data
- Operations Across Columns
Working with Factors
1
Assignment
- Working With Factors Quiz
10
Readings
- Working with Factors
- Factor Review
- Manually Changing the Labels of Factor Levels: fct_relevel()
- Keeping the Order of the Factor Levels: fct_inorder()
- Advanced Factoring
- Re-ordering Factor Levels by Frequency: fct_infreq()
- Reversing Order Levels: fct_rev()
- Re-ordering Factor Levels by Another Variable: fct_reorder()
- Combining Several Levels into One: fct_recode()
- Converting Numeric Levels to Factors: ifelse() + factor()
Working With Dates and Times
1
Assignment
- Working With Dates Quiz
4
Readings
- Dates and Times Basics
- Creating Dates and Date-Time Objects
- Working with Dates
- Time Spans
Working with Strings
1
Assignment
- Working With Strings Quiz
5
Readings
- Working with Strings
- stringr
- String Basics
- Regular Expressions
- glue
Working With Text
3
Readings
- Tidy Text Format
- Sentiment Analysis
- Word and Document Frequency
Functional Programming
1
Assignment
- Functional Programming Quiz
5
Readings
- Functional Programming
- For Loops vs. Functionals
- map Functions
- Multiple Vectors
- Anonymous Functions
Exploratory Data Analysis
2
Readings
- Exploratory Data Analysis
- General Principles of EDA
Case Studies
1
Readings
- Case Studies
Case Study #1: Health Expenditures
1
Labs
- Case Study #1: Health Expenditures
3
Readings
- Healthcare Coverage Data
- Healthcare Spending Data
- Join the Data
Case Study #2: Firearms
1
Labs
- Case Study #2: Firearms
7
Readings
- Census Data
- Violent Crime
- Brady Scores
- The Counted Fatal Shootings
- Unemployment Data
- Population Density: 2015
- Firearm Ownership
Project
1
Assignment
- Wrangling Data in the Tidyverse Course Project
1
Readings
- Important information before you start the project
Auto Summary
"Wrangling Data in the Tidyverse" is a foundation-level data science course, offered by Johns Hopkins University, focused on transforming non-tidy data into tidy data using R. It covers essential techniques such as reshaping data, manipulating tables with dplyr, and processing text with regular expressions. The course lasts 14 hours (840 minutes) and suits learners already familiar with R who want to make their analyses more efficient and reproducible. Subscription options include Starter and Professional plans.

Carrie Wright, PhD

Shannon Ellis, PhD

Stephanie Hicks, PhD

Roger D. Peng, PhD