Big Data - Capstone Project

Level: Foundation
Duration: 21 hours
Course by: University of California San Diego

About

Welcome to the Capstone Project for Big Data! In this culminating project, you will build a big data ecosystem using tools and methods from the earlier courses in this specialization. You will analyze a data set simulating big data generated by a large number of users playing our imaginary game "Catch the Pink Flamingo". During the five-week Capstone Project, you will walk through the typical big data science steps of acquiring, exploring, preparing, analyzing, and reporting. In the first two weeks, we will introduce you to the data set and guide you through some exploratory analysis using tools such as Splunk and Open Office. Then we will move into more challenging big data problems that require the more advanced tools you have learned, including KNIME, Spark's MLlib, and Gephi. Finally, during the fifth and final week, we will show you how to bring it all together to create engaging and compelling reports and slide presentations. As a result of our collaboration with Splunk, a software company focused on analyzing machine-generated big data, learners with the top projects will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership.

Modules

Introduction to the Capstone Project

4 Videos

Welcome to the Big Data Capstone Project
Welcome from Splunk: Rob Reed, World Education Evangelist
A Summary of Catch the Pink Flamingo
A Conceptual Schema for Catch the Pink Flamingo

4 Readings

Planning, Preparation, and Review
A Game by Eglence Inc.: Catch the Pink Flamingo
Overview of the Catch the Pink Flamingo Data Model
Overview of Final Project Design

Acquiring and Understanding the Game Data

2 Readings

Downloading the Game Data and Associated Scripts
Understanding the CSV Files Generated by the Scripts

Let's Do It: Exploring and Preparing the Data

1 Assignment

Data Exploration With Splunk

1 Peer Review

Data Exploration Technical Appendix

4 Readings

Optional Review of Splunk
“Catch the Pink Flamingo” Data Exploration with Splunk
Aggregate Calculations Using Splunk
Filtering the Data With Splunk

Get Thinking: Classifying Players' Spending Habits

2 Readings

Review: Classification Using Decision Tree in KNIME
Review: Interpreting a Decision Tree in KNIME

Let's Do It

1 Peer Review

Classifying in KNIME to identify big spenders in Catch the Pink Flamingo

2 Readings

Workflow Overview for Building a Decision Tree in KNIME
Description of combined_data.csv

Get Thinking: Clustering to Improve Eglence Inc.'s Revenue

3 Discussions

Is there only “one way” to cluster a client base?
How many clusters?
What kind of criteria might provide actionable information for Eglence Inc.?

1 Reading

Informing business strategies based on client base

Let's Do It

1 Peer Review

Recommending Actions from Clustering Analysis

1 Reading

Practice with PySpark MLlib Clustering

Get Thinking: A Graph Analytics Approach to Simulated Chat Data

1 Reading

Understanding the Simulated Chat Data Generated by the Scripts

Let's Do It: Working with Simulated Chat Data in Neo4j

1 Peer Review

Graph Analytics With Chat Data Using Neo4j

1 Reading

Graph Analytics of Catch the Pink Flamingo Chat Data Using Neo4j

Final Project Instructions

1 Video

Week 5: Bringing It All Together

1 Reading

Final project preparation

Final Project Submission

1 Peer Review

Final Project

1 Video

Congratulations! Some Final Words...

Optional Splunk Submission

1 Peer Review

Optional 3-minute video: Splunk opportunity

1 Reading

Part 2: Help us connect your video to your LinkedIn profile

Auto Summary

The Capstone Project for Big Data is a five-week course in the Data Science & AI domain, offered through Coursera. Learners apply what they learned in the earlier courses by building a big data ecosystem and analyzing a simulated data set from the fictional game "Catch the Pink Flamingo." The project follows the main steps of big data science: acquisition, exploration, preparation, analysis, and reporting. The first weeks introduce the data set and cover exploratory analysis with tools such as Splunk and Open Office. Later weeks address harder big data problems using KNIME, Spark's MLlib, and Gephi. In the final week, learners create reports and presentations. Through a collaboration with Splunk, learners with the top projects can present their work to Splunk's recruiters and engineering leadership. The course runs 1260 minutes (21 hours) in total and is available through Starter, Professional, and Paid subscription plans. This foundation-level course is aimed at aspiring data scientists and AI enthusiasts who want to build practical skills.

Instructors

Ilkay Altintas
Amarnath Gupta

Modules

Simulating Big Data for an Online Game
Introduction to the Capstone Project

Acquiring, Exploring, and Preparing the Data
Acquiring and Understanding the Game Data
Let's Do It: Exploring and Preparing the Data

Data Classification with KNIME
Get Thinking: Classifying Players' Spending Habits
Let's Do It

Clustering with Spark
Get Thinking: Clustering to Improve Eglence Inc.'s Revenue
Let's Do It

Graph Analytics of Simulated Chat Data With Neo4j
Get Thinking: A Graph Analytics Approach to Simulated Chat Data
Let's Do It: Working with Simulated Chat Data in Neo4j

Reporting and Presenting Your Work
Final Project Instructions

Final Submission
Final Project Submission
Optional Splunk Submission
