- Level Foundation
- Duration 17 hours
- Published by IBM
About
After taking this course, you will be able to describe two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The contrasting approach is the Extract, Load, Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting or calling application. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both. You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and importing data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline several methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module.
Modules
Welcome
1
Videos
- Course Intro video
1
Readings
- Course Introduction
ETL and ELT Processes
6
Videos
- ETL Fundamentals
- ELT Basics
- Comparing ETL and ELT
- Data Extraction Techniques
- Introduction to Data Transformation Techniques
- Data Loading Techniques
Module 1 Summary, Practice Quiz, and Graded Quiz
2
Assignment
- ETL and ELT Processes
- Graded Quiz: ETL and ELT Processes
1
Readings
- Summary & Highlights
ETL using Shell Scripts
2
Assignment
- Practice Quiz: ETL using Shell Scripts
- Graded Quiz: ETL using Shell Scripts
1
External Tool
- Hands-On Lab: ETL using Shell Scripts
1
Videos
- ETL Using Shell Scripting
3
Readings
- Linux Commands and Shell Scripting
- ETL Techniques
- Summary & Highlights
An Introduction to Data Pipelines
2
Assignment
- Practice Quiz: An Introduction to Data Pipelines
- Graded Quiz: An Introduction to Data Pipelines
4
Videos
- Introduction to Data Pipelines
- Key Data Pipeline Processes
- Batch versus Streaming Data Pipeline Use Cases
- Data Pipeline Tools and Technologies
1
Readings
- Summary & Highlights
Using Apache Airflow to build Data Pipelines
2
Assignment
- Practice Quiz: Building Data Pipelines using Airflow
- Graded Quiz: Building Data Pipelines using Airflow
4
External Tool
- Hands-on Lab: Getting Started with Apache Airflow
- Hands-on Lab: Create a DAG for Apache Airflow with PythonOperator
- Hands-on Lab: Create a DAG for Apache Airflow with BashOperator
- Hands-on Lab: Monitoring a DAG
5
Videos
- Apache Airflow Overview
- Advantages of Representing Data Pipelines as DAGs in Apache Airflow
- Apache Airflow UI
- Build a DAG Using Airflow
- Airflow Logging and Monitoring
1
Readings
- Summary & Highlights
Using Apache Kafka to build Pipelines for Streaming Data
2
Assignment
- Practice Quiz: Building Streaming Pipelines using Kafka
- Graded Quiz: Building Streaming Pipelines using Kafka
3
External Tool
- Hands-on Lab: Working with Streaming Data using Kafka
- [Optional] Hands-on Lab: Kafka Message Keys and Offset
- [Optional] Hands-on Lab: Kafka Python Client
4
Videos
- Distributed Event Streaming Platform Components
- Apache Kafka Overview
- Building Event Streaming Pipelines using Kafka
- Kafka Streaming Process
1
Readings
- Summary & Highlights
Final Assignment
3
External Tool
- Hands-on Lab: Build ETL Data Pipelines with BashOperator using Apache Airflow
- [Optional] Hands-on Lab: Build an ETL Pipeline using PythonOperator with Apache Airflow
- [Optional] Hands-on Lab: Build a Streaming ETL Pipeline using Kafka
1
Peer Review
- Peer Review: Project Submission and Peer Review
1
Readings
- Project Overview
Final Quiz
1
Assignment
- Timed Final Quiz
1
Readings
- Graded Timed Final Exam Instructions
Course Wrap-up
2
Readings
- Congrats & Next Steps
- Thanks from the Course Team
Auto Summary
Explore the foundational course "ETL and Data Pipelines with Shell, Airflow, and Kafka" in IT & Computer Science by Coursera. Master the ETL and ELT processes for converting raw data into analytics-ready data. Learn to extract, transform, and load data using tools like Shell, Airflow, and Kafka. This 1020-minute course includes practical use cases, data transformation techniques, and project work to showcase your skills. Ideal for beginners, the course offers a starter subscription option. Join now to enhance your data pipeline expertise!

Jeff Grossman

Yan Luo

Lavanya Thiruvali Sunderarajan

Ramesh Sannareddy

Sabrina Spillner