- Level Foundation
- Duration 15 hours
- Course by Johns Hopkins University
About
Getting data into your statistical analysis system can be one of the most challenging parts of any data science project. Data must be imported and harmonized into a coherent format before any insights can be obtained. You will learn how to get data into R from commonly used formats and how to harmonize different kinds of datasets from different sources. If you work in an organization where different departments collect data using different systems and different storage formats, this course provides essential tools for bringing those datasets together and making sense of the wealth of information in your organization. This course introduces the Tidyverse tools for importing data into R so that it can be prepared for analysis, visualization, and modeling. Common data formats are introduced, including delimited files, spreadsheets, and relational databases, and techniques for obtaining data from the web are demonstrated, such as web scraping and web APIs. This specialization assumes familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
Modules
About This Course
Readings (1)
- About This Course
Tibbles
Readings (3)
- Tibbles
- Creating a tibble
- Subsetting
Spreadsheets
Readings (3)
- Spreadsheets
- Excel files
- Google Sheets
CSVs
Readings (3)
- CSVs
- Downloading CSV files
- Reading CSVs into R
TSVs
Readings (2)
- TSVs
- Reading TSV Files into R
Delimited Files
Readings (2)
- Delimited Files
- Reading Delimited Files into R
Exporting Data from R
Assignment (1)
- Importing and Exporting Data Quiz
Readings (1)
- Exporting Data from R
JSON
Readings (1)
- JSON
XML
Readings (1)
- XML
Databases
Assignment (1)
- JSON, XML, and Databases Quiz
Readings (8)
- Databases
- Relational Data
- Relational Databases: SQL
- Connecting to Databases: RSQLite
- Working with Relational Data: dplyr & dbplyr
- Mutating Joins
- Filtering Joins
- How to Connect to a Database Online
Web Scraping
Readings (5)
- Web Scraping
- rvest Basics
- SelectorGadget
- Web Scraping Example
- A final note: SelectorGadget
Application Programming Interfaces (APIs)
Assignment (1)
- Getting Data from the Internet Quiz
Readings (6)
- API
- Getting Data: httr
- Example 1: GitHub’s API
- Example 2: Obtaining a CSV
- read_csv() from a URL
- API keys
Foreign Formats
Readings (1)
- haven
Images
Readings (1)
- Images
googledrive
Assignment (1)
- Foreign Formats, Images and googledrive Quiz
Readings (1)
- googledrive
Case Study #1: Health Expenditures
Labs (1)
- Health Expenditures Lab
Readings (3)
- Case Study #1: Health Expenditures
- Healthcare Coverage Data
- Healthcare Spending Data
Case Study #2: Firearms
Labs (1)
- Firearms Case Study Lab
Readings (8)
- New Case Study #2: Firearms
- Census Data
- Counted Data
- Suicide Data
- Brady Data
- Crime Data
- Land Area Data
- Unemployment Data
Introduction, Instructions and Datasets
Readings (2)
- Introduction and Background
- Datasets
Importing Data into R
Assignment (1)
- Importing Data into R Project
Auto Summary
"Importing Data in the Tidyverse" is a foundational course in Big Data and Analytics, offered by Coursera. Led by expert instructors, it focuses on importing and harmonizing data in R using Tidyverse tools. The course covers various data formats, web scraping, and APIs across 900 minutes (15 hours) of material. Ideal for those familiar with R, it offers Starter and Professional subscription options for a comprehensive learning experience.

Carrie Wright, PhD

Shannon Ellis, PhD

Stephanie Hicks, PhD

Roger D. Peng, PhD