- Level Foundation
- Duration 20 hours
- Published by Johns Hopkins University
About
Before you can work with data, you have to get some. This course covers the basic ways that data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data "tidy". Tidy data dramatically speeds up downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data, and the basics needed for collecting, cleaning, and sharing data.
Modules
Week 1
9
Videos
- Obtaining Data Motivation
- Raw and Processed Data
- Components of Tidy Data
- Downloading Files
- Reading Local Files
- Reading Excel Files
- Reading XML
- Reading JSON
- The data.table Package
3
Readings
- Welcome to Week 1
- Syllabus
- Pre-Course Survey
Practical R Exercises in swirl
1
Readings
- Practical R Exercises in swirl Part 1
Quiz
1
Assignment
- Week 1 Quiz
Week 2
5
Videos
- Reading from MySQL
- Reading from HDF5
- Reading from The Web
- Reading From APIs
- Reading From Other Sources
Quiz
1
Assignment
- Week 2 Quiz
Week 3
7
Videos
- Subsetting and Sorting
- Summarizing Data
- Creating New Variables
- Reshaping Data
- Managing Data Frames with dplyr - Introduction
- Managing Data Frames with dplyr - Basic Tools
- Merging Data
Practical R Exercises in swirl
- swirl Lesson 1: Manipulating Data with dplyr
- swirl Lesson 2: Grouping and Chaining with dplyr
- swirl Lesson 3: Tidying Data with tidyr
1
Readings
- Practical R Exercises in swirl Part 2
Quiz
1
Assignment
- Week 3 Quiz
Week 4
5
Videos
- Editing Text Variables
- Regular Expressions I
- Regular Expressions II
- Working with Dates
- Data Resources
Practical R Exercises in swirl
- swirl Lesson 1: Dates and Times with lubridate
1
Readings
- Practical R Exercises in swirl Part 4
Quiz
1
Assignment
- Week 4 Quiz
Course Project
1
Peer Review
- Getting and Cleaning Data Course Project
Post-Course Survey
1
Readings
- Post-Course Survey
Auto Summary
"Getting and Cleaning Data" is a foundational course in the Data Science & AI domain, offered by Coursera. Taught by industry experts, it focuses on acquiring data from various sources, including the web, APIs, and databases, and transforming it into tidy, analyzable formats. The content covers data collection, cleaning, and sharing, making it well suited to beginners and those new to data science. With a duration of 1200 minutes (20 hours), the course is available under the Starter subscription plan.

Jeff Leek, PhD

Roger D. Peng, PhD

Brian Caffo, PhD