- Level Expert
- Duration 10 hours
- Course by Google Cloud
About
In the last installment of the Dataflow course series, we will introduce the components of the Dataflow operational model. We will examine tools and techniques for troubleshooting and optimizing pipeline performance. We will then review testing, deployment, and reliability best practices for Dataflow pipelines. We will conclude with a review of Templates, which make it easy to scale Dataflow pipelines to organizations with hundreds of users. These lessons will help ensure that your data platform is stable and resilient to unanticipated circumstances.
Modules
Overview
2
Videos
- Course Introduction
- Coursera: Getting Started with Google Cloud Platform and Qwiklabs
1
Readings
- Important note about hands-on labs
Course Feedback
1
Readings
- How to Send Feedback
Monitoring
5
Videos
- Job List
- Job Info
- Job Graph
- Job Metrics
- Metrics Explorer
Quiz
1
Assignment
- Monitoring
Additional Resources
1
Readings
- Additional Resources
Logging and Error Reporting
2
Videos
- Logging
- Error Reporting
Quiz
1
Assignment
- Logging and Error Reporting
Additional Resources
1
Readings
- Additional Resources
Troubleshooting and Debug
2
Videos
- Troubleshooting workflow
- Types of troubles
Quiz
1
Assignment
- Troubleshooting and Debug
Lab: Monitoring, Logging and Error Reporting for Dataflow Jobs
1
External Tool
- Lab: Monitoring, Logging and Error Reporting for Dataflow Jobs
Additional Resources
1
Readings
- Additional Resources
Performance
4
Videos
- Pipeline Design
- Data Shape
- Source, Sinks & external systems
- Shuffle and streaming engine
Quiz
1
Assignment
- Performance
Additional Resources
1
Readings
- Additional Resources
Testing and CI/CD
5
Videos
- Testing and CI/CD Overview
- Unit Testing
- Integration Testing
- Artifact Building
- Deployment
Quiz
1
Assignment
- Testing and CI/CD
Lab: Testing with Apache Beam
2
External Tool
- Lab: Testing with Apache Beam (Java)
- Lab: Testing with Apache Beam (Python)
Lab: CI/CD with Dataflow
1
External Tool
- Lab: CI/CD with Dataflow
Additional Resources
1
Readings
- Additional Resources
Reliability
5
Videos
- Introduction to Reliability
- Monitoring
- Geolocation
- Disaster Recovery
- High Availability
Quiz
1
Assignment
- Reliability
Additional Resources
1
Readings
- Additional Resources
Flex Templates
4
Videos
- Classic templates
- Flex templates
- Using flex templates
- Google provided templates
Quiz
1
Assignment
- Flex Templates
Lab: Custom Dataflow Flex Templates
2
External Tool
- Lab: Custom Dataflow Flex Templates (Java)
- Lab: Custom Dataflow Flex Templates (Python)
Additional Resources
1
Readings
- Additional Resources
Summary
1
Videos
- Course Summary
Auto Summary
Unlock expert-level skills in the "Serverless Data Processing with Dataflow: Operations" course. Dive into Dataflow's operational model, learn troubleshooting and optimization techniques, and master testing, deployment, and reliability best practices. Ideal for data science and AI professionals, this Coursera course spans 600 minutes and offers scalable solutions for large organizations. Available with Starter and Professional subscriptions.

Google Cloud Training