- Level Professional
- Course by Johns Hopkins University
About
This course helps students learn concepts that scale the use of GPUs, and the CPUs that manage them, beyond the most common consumer-grade GPU installations. Students will learn how to manage asynchronous workflows, sending and receiving events to encapsulate data transfers and control signals. Students will also walk through the application of GPUs to sorting data and processing images, implementing their own software using these techniques and libraries.
By the end of the course, you will be able to do the following:
- Develop software that can use multiple CPUs and GPUs
- Develop software that uses CUDA’s events and streams capability to create asynchronous workflows
- Use the CUDA computational model to solve canonical programming challenges including data sorting and image processing
To be successful in this course, you should have an understanding of parallel programming and experience programming in C/C++. This course is most applicable to software developers and data scientists working in the fields of high performance computing, data processing, and machine learning.
Modules
GPU Programming Specialization
1
Discussions
- Enterprise Data Processing Discussion Prompt
1
Videos
- GPU Specialization Overview
Course Technical and Programming Expectations
1
Videos
- Course Expectations
Coursera Lab and Assignment Usage
- CUDA Device Memory Analysis Assignment
1
Discussions
- Canonical Algorithms Discussion Prompt
1
Labs
- CUDA Device Memory Lab
1
Videos
- Coursera Lab and Assignment Overview
1
Readings
- Development Tools and C/C++ Refresher Material
Multiple CPU Systems
- Multiple CPU Assignment
1
Discussions
- Multiple CPU/Distributed Computing Discussion
1
Labs
- Multiple CPU Lab
3
Videos
- Multiple CPU Architectures
- Multiple CPU Lab
- Multiple CPU Assignment
Multiple GPU Systems
1
Videos
- Multiple CPUs vs Multiple GPUs
CUDA Multiple GPU Syntax
- CUDA GPU identification Assignment
1
Labs
- CUDA GPU Identification Laboratory Activity
3
Videos
- CUDA Multiple GPU Programming Model
- Multiple GPU Activity
- Multiple GPU Assignment
CUDA Multiple CPU/GPU Systems
1
Peer Review
- Competing GPUs
CUDA Streams and Events for Orchestration
1
Videos
- CUDA Streams and Events
1
Readings
- CUDA Streams External Resources
CUDA Streams Syntax
1
Videos
- CUDA Streams Syntax
CUDA Events Syntax
1
Labs
- CUDA Streams and Events Lab Activity
1
Videos
- CUDA Events Syntax
CUDA Streams and Events Use Cases
- CUDA Streams and Events Assignment
1
Discussions
- CUDA Streams and Events Discussion
2
Videos
- CUDA Streams and Events Use Cases
- CUDA Streams and Events Assignment Walkthrough
1
Readings
- CUDA Streams and Events External Resources
Determining Sorting Algorithm Pseudocode
1
Discussions
- Fundamental Algorithms and Programming Discussion
1
Videos
- Using Input Data to Develop GPU Pseudocode
CUDA Best Practices in Algorithm and Kernel Design
3
Videos
- Sorting Algorithm GPU Pseudocode Bubble Sort
- Sorting Algorithm GPU Pseudocode Radix Sort
- Sorting Algorithm GPU Pseudocode Quick Sort
Memory Usage Pattern Identification
3
Videos
- Memory and GPU Pseudocode Bubble Sort
- Memory and GPU Pseudocode Radix Sort
- Memory and GPU Pseudocode Quick Sort
Initial Sorting Algorithm Implementation
- CUDA Merge Sort Algorithm Assignment
1
Labs
- Sorting Algorithms Lab Activity
4
Videos
- Sorting Algorithms Lab Activity Bubble Sort
- Sorting Algorithms Lab Activity Radix Sort
- Sorting Algorithms Lab Activity Quick Sort
- Sorting Algorithms Assignment
1
Readings
- GPU Sort Algorithm Reading List
CUDA NPP Programming Syntax for Image Processing
1
Discussions
- Image and Signal Processing Discussion
1
Videos
- NPP Image Processing Syntax
CUDA NPP Image Processing Example
1
Videos
- NPP Image Processing Code Demonstration
CUDA NPP Programming Syntax for Signal Processing
1
Videos
- NPP Signal Processing Syntax
CUDA NPP Signal Processing Example
1
Videos
- NPP Signals Processing Syntax Demonstration
CUDA Programming Independent Development Project
1
Peer Review
- CUDA at Scale Independent Project
1
Labs
- NPP Box Filter Laboratory
3
Videos
- Course Independent Project Lab Overview
- Course Independent Project Overview
- Course Independent Project Rubric Overview
Auto Summary
"CUDA at Scale for the Enterprise" is designed for software developers and data scientists in high-performance computing, data processing, and machine learning. Offered by Johns Hopkins University on Coursera, this professional-level course teaches the use of GPUs and CPUs beyond consumer-grade setups. Key topics include asynchronous workflows, data transfer management, and applying GPUs to data sorting and image processing. A solid understanding of parallel programming and experience in C/C++ is recommended. The course is available through Starter and Professional subscriptions.

Chancellor Thomas Pascale