- Level Professional
- Duration 7 hours
- Course by Google Cloud
About
Data lakes and data warehouses are the two key components of any data pipeline. This course highlights use cases for each type of storage and examines the data lake and data warehouse solutions available on Google Cloud in technical detail. It also describes the role of a data engineer, explains the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. This is the first course of the Data Engineering on Google Cloud series. After completing this course, enroll in the Building Batch Data Pipelines on Google Cloud course.
Modules
Course Introduction
2
Videos
- Course series introduction
- Course introduction
Introduction to Data Engineering
12
Videos
- Module introduction
- The role of a data engineer
- Data engineering challenges
- Introduction to BigQuery
- Data lakes and data warehouses
- Transactional databases versus data warehouses
- Partner effectively with other data teams
- Manage data access and governance
- Demo: Finding PII in your dataset with the DLP API
- Build production-ready pipelines
- Google Cloud customer case study
- Recap
Lab: Using BigQuery to do Analysis
1
External Tool
- Using BigQuery to do Analysis
2
Videos
- Lab Intro: Using BigQuery to do Analysis
- Getting Started with Google Cloud and Qwiklabs
Quiz
1
Assignment
- Introduction to Data Engineering
Building a data lake
7
Videos
- Module Introduction
- Introduction to Data Lakes
- Data storage and ETL options on Google Cloud
- Build a data lake using Cloud Storage
- Secure Cloud Storage
- Store all sorts of data types
- Cloud SQL as a relational Data Lake
Lab: Loading Taxi Data into Google Cloud SQL
1
External Tool
- Loading Taxi Data into Google Cloud SQL
1
Videos
- Lab Intro: Loading Taxi Data into Google Cloud SQL
Quiz
1
Assignment
- Building a Data Lake
Building a data warehouse
6
Videos
- Module Introduction
- The modern data warehouse
- Introduction to BigQuery
- Demo: Querying TB of data in seconds
- Get started with BigQuery
- Load data into BigQuery
Lab: Loading Data into BigQuery
1
External Tool
- Loading Data into BigQuery
1
Videos
- Lab Intro: Loading Data into BigQuery
BigQuery as a data warehousing solution
2
Videos
- Explore schemas
- Demo: Exploring Schemas
Schema Design
4
Videos
- Schema design
- Nested and repeated fields
- Demo: Nested and repeated fields
- Design the optimal schema for BigQuery
Lab: Working with JSON and Array data in BigQuery
1
External Tool
- Working with JSON and Array data in BigQuery
1
Videos
- Lab Intro: Working with JSON and Array data in BigQuery
Partitioning and Clustering in BigQuery
1
Videos
- Optimize with partitioning and clustering
Lab: Partitioned Tables in BigQuery
1
External Tool
- Partitioned Tables in BigQuery
1
Videos
- Lab Intro: Partitioned Tables in BigQuery
Review
1
Videos
- Review
Quiz
1
Assignment
- Building a Data Warehouse
Summary
1
Videos
- Course Summary
Auto Summary
Elevate your data management skills with the "Modernizing Data Lakes and Data Warehouses with Google Cloud" course, designed for IT and Computer Science professionals. This in-depth course explores the pivotal roles of data lakes and warehouses within data pipelines, focusing on their use-cases and the comprehensive solutions offered by Google Cloud. You'll gain a thorough understanding of the data engineer's responsibilities and the significant advantages a well-crafted data pipeline brings to business operations. Additionally, the course emphasizes the importance of performing data engineering in a cloud environment. This is the first installment in the Data Engineering on Google Cloud series, making it an essential starting point for those looking to specialize in this domain. Upon completion, you can advance your expertise by enrolling in the subsequent course, "Building Batch Data Pipelines on Google Cloud." Offered by Coursera, this professional-level course spans approximately 420 minutes and is available through various subscription options, including Starter, Professional, and Paid plans. It is ideal for professionals aiming to modernize their data handling capabilities and leverage Google Cloud's cutting-edge tools for optimal data management.

Google Cloud Training