- Level Expert
- Duration 33 hours
- Course by DeepLearning.AI
About
In the fourth course of the Machine Learning Engineering for Production Specialization, you will learn how to deploy ML models and make them available to end users. You will build scalable and reliable hardware infrastructure to serve inference requests in real time or in batch, depending on the use case. You will also implement workflow automation and progressive delivery that comply with current MLOps practices to keep your production system running. Additionally, you will continuously monitor your system to detect model decay, remediate performance drops, and avoid system failures so that it can operate without interruption.
Understanding machine learning and deep learning concepts is essential, but if you’re looking to build an effective AI career, you need production engineering capabilities as well. Machine learning engineering for production combines the foundational concepts of machine learning with the functional expertise of modern software development and engineering roles to help you develop production-ready skills.
- Week 1: Model Serving Introduction
- Week 2: Model Serving Patterns and Infrastructures
- Week 3: Model Management and Delivery
- Week 4: Model Monitoring and Logging
Modules
A conversation with Andrew Ng, Robert Crowe and Laurence Moroney
1
Videos
- Course Overview
Introduction to Model Serving
1
Assignment
- Introduction to Model Serving
1
External Tool
- Ungraded Lab - Introduction to Docker (Google Cloud)
1
Videos
- Introduction to Model Serving
3
Readings
- Ungraded Labs - Best Practices
- Ungraded Lab - Introduction to Docker
- [IMPORTANT] Have questions, issues or ideas? Join our Forum!
Introduction to Model Serving Infrastructure
1
Assignment
- Introduction to Model Serving Infrastructure
1
External Tool
- Ungraded Lab - Vertex AI Workbench Notebook: Qwik Start
4
Videos
- Introduction to Model Serving Infrastructure
- Deployment Options
- Improving Prediction Latency and Reducing Resource Costs
- Creating and deploying models to AI Prediction Platform
1
Readings
- Optional: Build, train, and deploy an XGBoost model on Cloud AI Platform
Installing TensorFlow Serving
1
Assignment
- TensorFlow Serving
1
Videos
- Installing TensorFlow Serving
2
Readings
- Ungraded Lab - TensorFlow Serving with Docker
- Ungraded Lab - Serve a model with TensorFlow Serving
Lecture Notes (Optional)
1
Readings
- Lecture Notes Week 1
Model Serving Architecture
1
Assignment
- Model serving architecture
3
Videos
- Model Serving Architecture
- Model Servers: TensorFlow Serving
- Model Servers: Other Providers
2
Readings
- Documentation on model servers
- Ungraded Lab - Deploy a ML model with FastAPI and Docker
Scaling Infrastructure
1
Assignment
- Scaling Infrastructure
1
External Tool
- Ungraded Lab - Orchestrating the Cloud with Kubernetes
1
Videos
- Scaling Infrastructure
2
Readings
- Learn about scaling with boy bands
- Explore Kubernetes and Kubeflow
Online Inference
1
Assignment
- Online Inference
1
External Tool
- Ungraded Lab - Distributed Load Testing with Kubernetes (Optional)
1
Videos
- Online Inference
1
Readings
- Ungraded Lab - Latency testing with Docker Compose and Locust
Data Preprocessing
1
Assignment
- Data Preprocessing
1
Videos
- Data Preprocessing
1
Readings
- Data preprocessing
Batch Inference Scenarios
1
Assignment
- Batch inference scenarios
1
Videos
- Batch Inference Scenarios
Batch Processing with ETL
1
Assignment
- Batch Processing with ETL
1
External Tool
- Autoscaling TensorFlow model deployments with TF Serving and Kubernetes
1
Videos
- Batch Processing with ETL
1
Readings
- Ungraded Lab (Optional): Machine Learning with Apache Beam and TensorFlow
Lecture Notes (Optional)
1
Readings
- Lecture Notes Week 2
ML Experiments Management and Workflow Automation
1
Assignment
- ML Experiments Management and Workflow Automation
3
Videos
- Experiment Tracking
- Tools for Experiment Tracking
- Introduction to MLOps
1
Readings
- Experiment Tracking
MLOps Methodology
1
Assignment
- MLOps Methodology
1
External Tool
- TFX on Google Cloud Vertex Pipelines
1
Labs
- Ungraded Lab: Developing TFX Custom Components
3
Videos
- MLOps Level 0
- MLOps Levels 1&2
- Developing Components for an Orchestrated Workflow
3
Readings
- MLOps Resources
- (Optional) Ungraded Lab: Intro to Kubeflow Pipelines
- Architecture for MLOps using TFX, Kubeflow Pipelines, and Cloud Build
Model Management and Deployment Infrastructure
1
Assignment
- Model Management and Deployment Infrastructure
1
External Tool
- Implementing Canary Releases of TensorFlow Model Deployments with Kubernetes and Anthos Service Mesh
3
Videos
- Managing Model Versions
- Continuous Delivery
- Progressive Delivery
5
Readings
- Ungraded Lab - Model Versioning with TF Serving
- ML Model Management
- Ungraded Lab - CI/CD pipelines with GitHub Actions
- Continuous Delivery
- Progressive Delivery
Lecture Notes (Optional)
1
Readings
- Lecture Notes Week 3
Model Monitoring and Logging
1
Assignment
- Model Monitoring and Logging
1
External Tool
- Data Loss Prevention: Qwik Start - JSON
5
Videos
- Why Monitoring Matters
- Observability in ML
- Monitoring Targets in ML
- Logging for ML Monitoring
- Tracing for ML Systems
2
Readings
- Monitoring Machine Learning Models in Production
- [IMPORTANT] Reminder about end of access to Lab Notebooks
Model Decay
1
Assignment
- Model Decay
3
Videos
- What is Model Decay?
- Model Decay Detection
- Ways to Mitigate Model Decay
1
Readings
- Addressing Model Decay
GDPR and Privacy
1
Assignment
- GDPR and Privacy
4
Videos
- Responsible AI
- Legal Requirements for Secure and Private AI
- Anonymization and Pseudonymisation
- Right to be Forgotten
2
Readings
- Responsible AI
- GDPR and CCPA
Lecture Notes (Optional)
1
Readings
- Lecture Notes Week 4
Specialization recap and farewell
1
Videos
- Specialization recap and farewell
Course Resources
1
Readings
- Course 4 Optional References
Acknowledgments
2
Readings
- Acknowledgements
- (Optional) Opportunity to Mentor Other Learners
Auto Summary
"Deploying Machine Learning Models in Production" is an expert-level course by DeepLearning.AI, offered on Coursera, that focuses on making ML models operational for end users. Part of the Machine Learning Engineering for Production Specialization, it covers scalable infrastructure, workflow automation, and system monitoring to keep production systems running. The course runs 33 hours (1980 minutes) and is organized into weekly modules on model serving, management, and monitoring. It is aimed at learners with a background in data science and AI; subscriptions are available in Starter and Professional tiers.

Laurence Moroney

Robert Crowe