- Level Professional
- Duration 17 hours
- Offered by Duke University
About
Welcome to the third course in the Building Cloud Computing Solutions at Scale Specialization! In this course, you will learn how to apply Data Engineering to real-world projects using the Cloud computing concepts introduced in the first two courses of this series. By the end of this course, you will be able to develop data engineering applications using software development best practices, including continuous deployment, code quality tools, logging, instrumentation and monitoring. Finally, you will use Cloud-native technologies to tackle complex data engineering solutions. This course is ideal for beginners as well as intermediate students interested in applying Cloud computing to data science, machine learning and data engineering. Students should have beginner level Linux and intermediate level Python skills. For your project in this course, you will build a serverless data engineering pipeline in a Cloud platform: Amazon Web Services (AWS), Azure or Google Cloud Platform (GCP).
Modules
Welcome to the Course!
1
Discussions
- Introductions
4
Videos
- Instructor Introduction
- Course Introduction
- Lab Onboarding
- Course 3 Project Overview
3
Readings
- Getting Started and Course Gotchas
- Specialization Project Roadmap: Course 3
- Course Structure and Discussion Etiquette
The End of Moore's Law
1
Assignment
- Quiz-The End of Moore's Law
1
Discussions
- Implications of the End of Moore's Law
1
Labs
- Threads and Processes
7
Videos
- Introduction to the End of Moore's Law
- The Problem with Concurrency in Python
- Exploring the End of Moore's Law
- Using CUDA and Numba
- What is an ASIC?
- Taking Advantage of Colab Pro
- Exploring Colab AI
5
Readings
- Key Terms
- The End of Moore's Law
- Numba User Guide
- Custom Silicon
- Lesson Reflection
Build Distributed Systems
1
Assignment
- Quiz-Build Distributed Systems
1
Discussions
- Monitoring and Logging In Distributed Systems
1
Labs
- Debug with Python PDB
7
Videos
- Introduction to Distributed Systems
- Logging and Instrumentation Distributed Systems
- CAP Theorem
- Amdahl's Law
- Elasticity
- Highly Available Nine Nine's
- Debugging Python Code
5
Readings
- Key Terms
- Python Debugger
- Python Logging
- Implementing elasticity, high availability, and monitoring in the AWS cloud
- Lesson Reflection
Understand Big Data and Data Processing Platforms
1
Assignment
- Quiz-Understand Big Data and Data Processing Platforms
1
Discussions
- Challenges to Big Data Platforms
1
Labs
- Understanding Map Reduce
7
Videos
- Exploring Google BigQuery
- What is Big Data?
- Data Lakes
- Big Data Processing
- Introduction to AWS Data Engineering Design Principles
- Processing Big Data with AWS
- Transform Data with Databricks Spark SQL
5
Readings
- Key Terms
- Introduction to Cloud Data Platforms
- What is a Lakehouse Platform
- Snowpark ML
- Lesson Reflection
Applied Practice: Creating High-Performance Code
1
Assignment
- Quiz-High-Performance Code
4
Readings
- Key Terms
- Turbocharging Python with Command Line Tools
- Creating high-performance code
- Lesson Reflection
Graded Quiz
1
Assignment
- Quiz
What is Data Engineering?
1
Assignment
- Quiz-What is Data Engineering?
1
Discussions
- The Power of Events
1
Labs
- Streaming Data with Python Generators
5
Videos
- Introduction to Data Engineering
- Data Driven Organizations
- What is Data Engineering?
- Batch vs. Streaming vs. Events
- Ingesting by Batch or Stream
3
Readings
- Key Terms
- Python Generators
- Lesson Reflection
Using Command-line tools in Rust and Python
1
Assignment
- Quiz-Using Command-line tools in Rust and Python
1
Discussions
- Why is a Containerized CLI so Useful?
2
Labs
- Build a Click Command-Line Tool
- Rust CLI Tool
5
Videos
- Building CLI Tools with Click
- Building Containerized Command-line Tools
- Rust and Python
- Python Calculator CLI
- Caesar Cipher CLI
3
Readings
- Key Terms
- Python and Rust
- Lesson Reflection
Advanced Code Testing and AI Enhanced Development Techniques
1
Assignment
- Quiz-Advanced Code Testing and AI Enhanced Development Techniques
1
Discussions
- Using Advanced Code Analysis Tools
1
Labs
- Building Command-Line Data Processing Tool
6
Videos
- Advanced Testing with Amazon CodeGuru
- Advanced Testing with AWS CodeBuild
- Mapping Functions to CLI: Part 1
- Mapping Functions to CLI: Part 2
- AWS CodeWhisperer CLI
- AWS CodeWhisperer SDK
3
Readings
- Key Terms
- AWS CodeWhisperer for Data Engineering
- Lesson Reflection
Applied Practice: Extending Command-Line Data Processing Tool
1
Readings
- Extending Command-Line Data Processing Tool
Graded Assessment
1
Assignment
- Quiz
Build a Serverless Data Engineering System
1
Assignment
- Quiz-Build a Serverless Data Engineering System
1
Discussions
- Serverless Tools that use a CLI
15
Videos
- Introduction to Serverless Data Engineering
- Automating Pipelines
- What is Serverless?
- Serverless Concepts: Service Model
- Serverless Concepts: Functions
- Serverless Concepts: Ecosystem
- AWS Lambda: Overview
- Introduction to AWS Cloud9
- AWS Lambda: Event Handling
- AWS Lambda: Hello World
- AWS Lambda: Deploy Testing
- AWS Lambda: Wikipedia Example
- Build a Serverless Data Pipeline
- Serverless Cookbook with AWS
- Serverless Cookbook with GCP
4
Readings
- Key Terms
- Lambda Console Gotchas
- Using Amazon EFS for AWS Lambda
- Lesson Reflection
Effective Data Governance
1
Assignment
- Quiz-Effective Data Governance
1
Discussions
- Startup Founder Wants To "Move Fast and Break Things"
8
Videos
- Introduction to Data Governance
- What is Data Governance?
- The Principle of Least Privilege
- Cloud Security with IAM on AWS
- AWS Shared Security Model
- AWS IAM Service
- AWS Cloud Security Operations
- Encrypt at Rest and Transit
2
Readings
- Key Terms
- Lesson Reflection
Applied Practice: Building Computer Vision Label Trigger for S3
1
Readings
- Building Computer Vision Label Trigger for S3
Graded Assignment
1
Assignment
- Quiz
Effective ETL
1
Assignment
- Quiz-Effective ETL
1
Discussions
- Why spend four minutes when you can spend four days?
5
Videos
- Introduction to Extract, Transform, Load (ETL)
- What is ETL?
- Ingesting and Preparing Data on AWS
- Using Amazon Athena with AWS Glue
- Real-World Problems in ETL
2
Readings
- Key Terms
- Lesson Reflection
Cloud Databases
1
Assignment
- Quiz-Cloud Databases
1
Discussions
- Relational Database Conundrum
1
Labs
- MySQL for Data Engineering
11
Videos
- Introduction to Cloud Databases
- One Size Does Not Fit All in the Cloud?
- MySQL Overview
- MySQL from Terminal
- Archive and Drop Database
- Import external database Sakila
- Modify database Sakila
- Bash pipelines with MySQL
- MySQL to Python Standard Library Web Server
- Big Query with Prompt Engineering
- Big Query Colab Pipeline
3
Readings
- Key Terms
- Cloud Databases
- Lesson Reflection
Cloud Storage
1
Assignment
- Quiz-Cloud Storage
1
Discussions
- Disaster Recovery and Amazon S3
4
Videos
- Introduction to Cloud Storage
- Why Cloud Storage?
- Cloud Storage Deep Dive
- Using Amazon S3
3
Readings
- Key Terms
- Cloud Storage Solutions
- Lesson Reflection
Graded Assignment
1
Assignment
- Quiz
Putting it all Together: Final Course Project
3
Labs
- Rust Data Engineering Sandbox
- Jupyter Sandbox
- VS Code Sandbox
2
Readings
- Create a serverless Data Engineering Pipeline
- Next Steps
Auto Summary
Unlock the potential of Cloud Data Engineering with this course from Coursera! Designed for both beginners and intermediate students, it focuses on applying data engineering to real-world projects using Cloud computing. Throughout the course, you'll develop applications using best practices, such as continuous deployment and monitoring. Tackle complex solutions with Cloud-native technologies on platforms like AWS, Azure, or GCP. Ideal for those with beginner Linux and intermediate Python skills. Duration: 1020 minutes. Subscription options: Starter, Professional.

Noah Gift