- Level Professional
- Duration 26 hours
- Course by Microsoft
About
In this course, you will learn how to harness Apache Spark and powerful clusters running on the Azure Databricks platform to run data science workloads in the cloud. This is the fourth course in a five-course program that prepares you to take the DP-100: Designing and Implementing a Data Science Solution on Azure certification exam. The certification exam is an opportunity to prove your knowledge and expertise in operating machine learning solutions at cloud scale using Azure Machine Learning. This specialization teaches you to leverage your existing knowledge of Python and machine learning to manage data ingestion and preparation, model training and deployment, and machine learning solution monitoring in Microsoft Azure. Each course teaches you the concepts and skills that are measured by the exam. This specialization is intended for data scientists with existing knowledge of Python and machine learning frameworks like Scikit-Learn, PyTorch, and TensorFlow, who want to build and operate machine learning solutions in the cloud. It teaches data scientists how to create end-to-end solutions in Microsoft Azure. Students will learn how to manage Azure resources for machine learning; run experiments and train models; deploy and operationalize machine learning solutions; and implement responsible machine learning. They will also learn to use Azure Databricks to explore, prepare, and model data, and to integrate Databricks machine learning processes with Azure Machine Learning.
Modules
Welcome to the Course
1
Discussions
- Meet and greet
1
Videos
- Introduction to the course
2
Readings
- Course syllabus
- How to be successful in this course
Describe Azure Databricks
1
Assignment
- Exercise quiz
2
Videos
- Explain Azure Databricks
- Lesson summary
3
Readings
- Create an Azure Databricks workspace and cluster
- Create and execute a notebook
- Exercise: Work with Notebooks
1
Quiz
- Knowledge check
Spark architecture fundamentals
2
Assignment
- Knowledge check
- Test prep
4
Videos
- Lesson introduction
- Understand the architecture of an Azure Databricks Spark cluster
- Understand the architecture of a Spark job
- Lesson summary
Use Azure Databricks to prepare the data for advanced analytics and machine learning operations
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
6
Readings
- Read data in CSV format
- Read data in JSON format
- Read data in Parquet format
- Read data stored in tables and views
- Write data
- Exercises: Read and write data
Work with DataFrames in Azure Databricks
2
Assignment
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
4
Readings
- Describe a DataFrame
- Use common DataFrame methods
- Use the display function
- Exercise: Distinct articles
Build and query a Delta Lake
3
Assignment
- Exercise quiz
- Exercise quiz
- Knowledge check
2
Videos
- Describe the open source Delta Lake
- Lesson summary
4
Readings
- Get started with Delta using Spark APIs
- Exercise: Work with basic Delta Lake functionality
- Describe how Azure Databricks manages Delta Lake
- Exercise: Use the Delta Lake Time Machine and perform optimization
Work with user-defined functions
3
Assignment
- Exercise quiz
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
3
Readings
- Write user-defined functions
- Exercise: Perform Extract, Transform, Load (ETL) operations using user-defined functions
- Additional resources
Perform machine learning with Azure Databricks
4
Assignment
- Exercise quiz
- Exercise quiz
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
6
Readings
- Understand machine learning
- Exercise: Train a model and create predictions
- Understand data using exploratory data analysis
- Exercise: Perform exploratory data analysis
- Describe machine learning workflows
- Exercise: Build and evaluate a baseline machine learning model
Train a machine learning model
4
Assignment
- Exercise quiz
- Exercise quiz
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
5
Readings
- Perform featurization of the dataset
- Exercise: Finish featurization of the dataset
- Understand regression modeling
- Exercise: Build and interpret a regression model
- Additional resources
Work with MLflow in Azure Databricks
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
2
Readings
- Use MLflow to track experiments, log metrics, and compare runs
- Exercise: Work with MLflow to track experiment metrics, parameters, artifacts, and models
Perform model selection with hyperparameter tuning
3
Assignment
- Exercise quiz
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
3
Readings
- Describe model selection and hyperparameter tuning
- Exercise: Select optimal model by tuning hyperparameters
- Additional resources
Deep learning with Horovod for distributed training
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
3
Readings
- Use Horovod to train a deep learning model
- Use Petastorm to read in Apache Parquet format with Horovod for distributed model training
- Exercise: Work with Horovod and Petastorm for training a deep learning model
Work with Azure Machine Learning to deploy serving models
2
Assignment
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
2
Readings
- Use Azure Machine Learning to deploy serving models
- Additional resources
Course wrap up
1
Discussions
- Reflect on learning
1
Videos
- Congratulations
1
Readings
- Next steps
Auto Summary
Unlock the potential of cloud-based data science with "Perform Data Science with Azure Databricks." This advanced course, part of a five-course series for the DP-100 certification, empowers data scientists to leverage Apache Spark and Azure Databricks for scalable machine learning solutions. Ideal for professionals familiar with Python and frameworks like Scikit-Learn and TensorFlow, the course covers data ingestion, model training, and deployment on Azure. Offered by Coursera, this 26-hour (1,560-minute) course provides essential skills for creating end-to-end solutions in Microsoft Azure, with flexible subscription options available.