- Level: Professional
- Duration: 22 hours
- Published by Microsoft
About
In this course, you will learn how to harness Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud. You will discover the capabilities of Azure Databricks and the Apache Spark notebook for processing huge files, come to understand the Azure Databricks platform, and identify the types of tasks well suited to Apache Spark. You will also be introduced to the architecture of an Azure Databricks Spark cluster and of Spark jobs. You will work with large amounts of data from multiple sources in different raw formats, and you will learn how Azure Databricks supports day-to-day data-handling functions such as reads, writes, and queries.
This course is part of a Specialization intended for data engineers and developers who want to demonstrate their expertise in designing and implementing data solutions that use Microsoft Azure data services, and for anyone interested in preparing for Exam DP-203: Data Engineering on Microsoft Azure (beta). You will take a practice exam that covers key skills measured by the certification exam.
This is the eighth course in a program of 10 courses that help you prepare for the exam. The Data Engineering on Microsoft Azure exam is an opportunity to prove your expertise in integrating, transforming, and consolidating data from various structured and unstructured data systems into structures that are suitable for building analytics solutions that use Microsoft Azure data services. Each course teaches the concepts and skills measured by the exam. By the end of this Specialization, you will be ready to sign up for and take Exam DP-203: Data Engineering on Microsoft Azure (beta).
Modules
Welcome to the course
1
Discussions
- Meet and greet
1
Videos
- Introduction to the course
2
Readings
- Course syllabus
- How to be successful in this course
Describe Azure Databricks
2
Videos
- Explain Azure Databricks
- Lesson summary
3
Readings
- Create an Azure Databricks workspace and cluster
- Create and execute a notebook
- Exercise: Work with Notebooks
2
Quiz
- Exercise quiz
- Knowledge check
Spark architecture fundamentals
2
Assignment
- Knowledge check
- Test prep
4
Videos
- Lesson introduction
- Understand the architecture of Azure Databricks Spark cluster
- Understand the architecture of a Spark job
- Lesson summary
Use Azure Databricks to prepare the data for advanced analytics and machine learning operations
3
Assignment
- Exercise quiz
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
6
Readings
- Read data in CSV format
- Read data in JSON format
- Read data in Parquet format
- Read data stored in tables and views
- Write data
- Exercises: Read and write data
Work with DataFrames in Azure Databricks
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
4
Readings
- Describe a DataFrame
- Use common DataFrame methods
- Use the display function
- Exercise: Distinct articles
Describe lazy evaluation and other performance features in Azure Databricks
2
Assignment
- Knowledge check
- Test prep
4
Videos
- Lesson introduction
- Describe the fundamentals of how the Catalyst Optimizer works
- Describe performance enhancements enabled by shuffle operations and Tungsten
- Lesson summary
2
Readings
- Describe the difference between eager and lazy execution
- Define and identify actions and transformations
Work with DataFrame columns in Azure Databricks
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Lesson introduction
- Lesson summary
3
Readings
- Describe the column class
- Work with column expressions
- Exercise: Washingtons and Marthas
Work with advanced DataFrame methods in Azure Databricks
3
Assignment
- Exercise quiz
- Knowledge check
- Test prep
2
Videos
- Lesson introduction
- Lesson summary
3
Readings
- Perform date and time manipulation
- Use aggregate functions
- Exercise: Deduplication of data
Describe platform architecture, security, and data protection in Azure Databricks
3
Assignment
- Exercise quiz
- Knowledge check
- Test prep
6
Videos
- Lesson introduction
- Describe the Azure Databricks platform architecture
- Perform data protection
- Secure access with Azure IAM and authentication
- Describe security
- Lesson summary
4
Readings
- Create the required resources
- Describe Azure Key Vault and Databricks security scopes
- Exercise: Access Azure Storage with Key Vault-backed secrets
- Further resources
Build and query a Delta Lake
2
Assignment
- Exercise quiz
- Knowledge check
2
Videos
- Describe the open source Delta Lake
- Lesson summary
4
Readings
- Get started with Delta using Spark APIs
- Exercise: Work with basic Delta Lake functionality
- Describe how Azure Databricks manages Delta Lake
- Exercise: Use the Delta Lake Time Machine and perform optimization
1
Quiz
- Exercise quiz
Describe Azure Databricks Delta Lake architecture
2
Assignment
- Knowledge check
- Test prep
3
Videos
- Lesson introduction
- Describe bronze, silver, and gold architecture
- Lesson summary
2
Readings
- Perform batch and stream processing
- Further resources
Process streaming data with Azure Databricks structured streaming
1
Assignment
- Knowledge check
3
Videos
- Lesson introduction
- Describe Azure Databricks structured streaming
- Lesson summary
3
Readings
- Perform stream processing using structured streaming
- Work with Time Windows
- Process data from Event Hubs with structured streaming
Create production workloads on Azure Databricks with Azure Data Factory
2
Assignment
- Knowledge check
- Test prep
3
Videos
- Lesson introduction
- Create the required resources
- Summary
3
Readings
- Schedule Databricks jobs in a Data Factory pipeline
- Pass parameters into and out of Databricks jobs in Data Factory
- Further resources
Implement CI/CD with Azure DevOps
1
Assignment
- Knowledge check
3
Videos
- Lesson introduction
- Describe CI/CD
- Lesson summary
1
Readings
- Create a CI/CD process with Azure DevOps
Integrate Azure Databricks with other Azure services
1
Assignment
- Knowledge check
2
Videos
- Lesson summary
- Lesson summary
2
Readings
- Set up Azure Synapse Analytics
- Integrate with Azure Synapse Analytics
Describe Azure Databricks best practices
2
Assignment
- Knowledge check
- Test prep
6
Videos
- Lesson introduction
- Understand workspace administration best practices
- List security best practices
- Describe tools and integration best practices
- Explain Databricks runtime best practices
- Lesson summary
2
Readings
- Understand cluster best practices
- Further resources
Course practice exam
1
Assignment
- Course practice exam
1
Videos
- Course recap
1
Readings
- About the practice exam
Course wrap up
1
Discussions
- Reflect on learning
1
Videos
- Course summary
1
Readings
- Next steps
Auto Summary
Unlock the power of data engineering with the "Microsoft Azure Databricks for Data Engineering" course, designed to provide you with the skills needed to handle large data workloads in the cloud. This professional-level course, offered by Coursera, is ideal for data engineers and developers aiming to master Microsoft Azure data services and prepare for the DP-203 certification exam. Dive into the world of Apache Spark and Azure Databricks to process massive files and perform complex data-handling tasks. You'll explore the capabilities of Azure Databricks, learn about its architecture, and understand how to efficiently run Spark Jobs on powerful clusters. The course also covers practical aspects such as reading, writing, and querying data from multiple sources and formats. As the eighth course in a comprehensive ten-course specialization, this program thoroughly prepares you for the Data Engineering on Microsoft Azure exam. You will engage in a practice exam that mirrors the key skills evaluated in the certification test, ensuring you're well-equipped to demonstrate your expertise in designing and implementing robust data solutions on the Azure platform. With a total duration of approximately 22 hours, this course offers flexible learning through a Starter subscription, making it accessible for professionals eager to enhance their data engineering skills and advance their careers. Join the course today and take a significant step towards becoming a certified Azure data engineer.
