- Level Expert
- Duration 49 hours
- Course by University of Colorado Boulder
About
This course is intended for individuals who want to understand the architecture patterns needed to take large software systems that use big data to production. You will transform big data prototypes into high-quality, tested production software. After measuring the performance characteristics of distributed systems, you will identify trouble areas and implement scalable solutions to improve performance. Upon completion of the course, you will know how to scale production data stores to perform under load and how to design load tests that ensure applications meet performance requirements.
This course can be taken for academic credit as part of CU Boulder’s MS in Data Science or MS in Computer Science degrees offered on the Coursera platform. These fully accredited graduate degrees offer targeted courses, short 8-week sessions, and pay-as-you-go tuition. Admission is based on performance in three preliminary courses, not academic history. CU degrees on Coursera are ideal for recent graduates or working professionals.
Learn more:
- MS in Data Science: https://www.coursera.org/degrees/master-of-science-data-science-boulder
- MS in Computer Science: https://coursera.org/degrees/ms-computer-science-boulder
Modules
Welcome to Software Architecture Patterns for Big Data!
1
Videos
- Introduction to the Course
3
Readings
- Earn Academic Credit for Your Work!
- Course Support
- Motivation for Peer Review
Prediction Models
2
Readings
- Introduction to Prediction Models
- Evaluation Metrics for Prediction Models
1
Quiz
- Predictive Models
Building & Evaluating a Prediction Model
1
Peer Review
- Predictive Model Coding Exercise
2
Videos
- Introduction to Match Predictor Codebase
- Match Predictor Models
3
Readings
- Soccer Match Predictor
- Evaluating a Model
- Creating Reports & Automating Evaluation
Basics of Distributed Systems
1
Videos
- Contact Tracing
2
Readings
- Distributed Systems & Distributed Workloads
- Performance Increase with Distributed Systems
Messaging Queues
1
Videos
- Direct Exchange
3
Readings
- What is a Messaging Queue?
- RabbitMQ
- Messaging Queues and Large Workloads
1
Quiz
- Considerations for Messaging Queues
Performance of Distributed Systems
1
Peer Review
- Performance Test Coding Exercise
1
Videos
- Introduction to Email Verifier Codebase
4
Readings
- Introduction to Email Verifier Codebase
- Identifying Performance Concerns
- Writing Performance Tests
- Interpreting Output of Performance Test
Performance Test for Business Requirements
1
Assignment
- Improvements
1
Peer Review
- Performance Test for Business Requirement
1
Videos
- Performance Testing – Custom Benchmark
3
Readings
- Writing a Performance Test for a Business Requirement
- Making Improvements Based on Results
- More on Testing
1
Quiz
- Improvements
Messaging Queue Cont'd
1
Videos
- Consistent Hash Exchange
3
Readings
- Consistent Hash Ring
- Consistent Hash Exchange
- Messaging Queues and High Availability Databases
1
Quiz
- Consistent Hash Exchange Quiz
Availability and Distributed Systems
1
Videos
- CAP Theorem Trade-offs
2
Readings
- What is Availability?
- Hindrance to Availability
1
Quiz
- CAP Theorem
Databases
4
Readings
- High Availability Databases
- Tradeoffs Regarding High Availability Databases
- Delay, Memory & Messaging Tradeoffs in Distributed Systems
- Replication
1
Quiz
- High Availability Databases
Auto Summary
Software Architecture Patterns for Big Data is an expert-level course on Coursera. Aimed at IT and computer science professionals, it teaches you to transform big data prototypes into robust, production-ready software, tackle performance issues, implement scalable solutions, and design effective load tests. The course is part of CU Boulder’s MS in Data Science and MS in Computer Science degrees, which feature short 8-week sessions and flexible pay-as-you-go tuition, and it suits recent graduates and working professionals. Available in Starter and Professional subscription options.

Tyson Gern

Mike Barinek