- Level Professional
- Duration 13 hours
About
In this class, Introduction to Designing Data Lakes on AWS, we will help you understand how to create and operate a data lake in a secure and scalable way, without previous knowledge of data science. Starting with why you may want a data lake, we will look at the data lake value proposition, characteristics, and components. Designing a data lake is challenging because of the scale and growth of data, and developers need to understand best practices to avoid common mistakes that could be hard to rectify. In this course we will cover the foundations of what a data lake is and how to ingest and organize data into it, then dive into the data processing that can be done to optimize performance and costs when consuming the data at scale. This course is for professionals (Architects, System Administrators, and DevOps) who need to design and build an architecture for secure and scalable data lake components. Students will learn about the use cases for a data lake and contrast them with a traditional infrastructure of servers and storage.
Modules
Week 1 Assessment
2
Assignment
- Pre-Assessment
- Knowledge Check
5
Videos
- Course Introduction and Overview
- Module Objectives
- Why Data Lakes?
- Data Lake vs Data Warehouse
- Components and Architectures
1
Readings
- Pre-Course Survey
Week 2 Assessment
1
Assignment
- Knowledge Check
10
Videos
- Module Objectives
- Data Lake Storage
- Data Ingestion
- Crawl and Catalog Data
- Demo: Creating and Running a Glue Crawler
- Formatting the Data in the Lake
- Partitioning, Compressing, and Compacting the Data in the Lake
- Query Data with Amazon Athena
- Demo: Querying CSV data with Amazon Athena
- Demo: Columnar data formats with Amazon Athena - a performance and cost comparison
1
Readings
- Mid-Course Survey
Week 3 Assessment
2
Assignment
- Are you ready for the lab?
- Knowledge Check
1
External Tool
- Lab 1: Building a Data Lake using AWS Lake Formation
3
Videos
- Module Objectives
- AWS Lake Formation Overview
- AWS Lake Formation Basic Permission Model
1
Readings
- Reading: AWS Lake Formation Basics
Week 4 Assessment
1
Assignment
- Knowledge Check
1
Discussions
- AWS Glue
7
Videos
- Module Objectives
- Data Transformation
- Data Processing with AWS Glue
- AWS Glue Jobs and Workflows
- Demo - Glue DataBrew
- Demo - Glue Studio
- Tech Talk - Glue / Athena Federated Queries
Untitled Lesson
2
Assignment
- Are you ready for the lab?
- Knowledge Check
1
External Tool
- Lab 2: Automate Data Lake Creation Using AWS Lake Formation Blueprints
5
Videos
- Module Objectives
- Using Blueprints and Workflows
- Fine-grained Access Control
- Demo - LF-Tags
- Visualizing Data with QuickSight - Demo
Untitled Lesson
2
Assignment
- Are you ready for the lab?
- Post-Assessment
1
External Tool
- Lab 3: Publishing and Managing Data Product in AWS Lake Formation
6
Videos
- Module Objectives
- Modern Data Architecture
- Data Movement Scenarios
- Data Sharing Models
- Tech Talk - What's your favorite data lake feature?
- Course Recap/Key Takeaways
3
Readings
- Reading: Modern data architecture
- Reading: Data Movement Scenarios
- Post-Course Survey
Auto Summary
Dive into "Introduction to Designing Data Lakes on AWS," an expert-led Coursera course crafted for professionals in Data Science & AI. Guided by industry leaders, this 780-minute course demystifies the creation and management of secure, scalable data lakes. Ideal for Architects, System Administrators, and DevOps, it covers foundational concepts, best practices, and performance optimization techniques. With a focus on practical application, learners will contrast data lakes with traditional infrastructures, ensuring robust knowledge to avoid common pitfalls. Access this professional-level course with a Starter subscription and elevate your data architecture skills.