- Level Foundation
- Duration 13 hours
- Offered by University of California San Diego
About
Once you've identified a big data issue to analyze, how do you collect, store, and organize your data using Big Data solutions? In this course, you will experience various data genres and the management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include AsterixDB, HP Vertica, Impala, Neo4j, Redis, and SparkSQL. This course provides techniques to extract value from existing untapped data sources and to discover new data sources.

At the end of this course, you will be able to:
- Recognize different data elements in your own work and in everyday life problems
- Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
- Identify the frequent data operations required for various types of data
- Select a data model to suit the characteristics of your data
- Apply techniques to handle streaming data
- Differentiate between a traditional Database Management System and a Big Data Management System
- Appreciate why there are so many data management systems
- Design a big data information system for an online game company

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free. How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking "About This Mac." Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.

Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; VirtualBox 5+.

Modules
What is in this course?
1
Discussions
- Getting to know you: Tell us about yourself and why you are taking this course
2
Videos
- Welcome to Big Data Modeling and Management
- Why is this a New Course in the Big Data Specialization?
Why Big Data Modeling and Management?
1
Discussions
- Let's discuss: What area of big data management interests you most?
9
Videos
- Summary of Introduction to Big Data (Part 1)
- Summary of Introduction to Big Data (Part 2)
- Summary of Introduction to Big Data (Part 3)
- Big Data Management "Must-Ask Questions"
- Data Ingestion
- Data Storage
- Data Quality
- Data Operations
- Data Scalability and Security
3
Readings
- Slides: Summary of Introduction to Big Data
- Slides: Big Data Management
- Reading on Storage Systems
Real Big Data Management Applications
1
Discussions
- Let's discuss: What are the design criteria in the big data applications you have heard about?
3
Videos
- Energy Data Management Challenges at ConEd
- Gaming Industry Data Management: Q&A with Apmetrix CTO Mark Caldwell
- Flight Data Management at FlightStats: A Lecture by CTO Chad Berkley
2
Readings
- Slides: Energy Data Management Challenges at ConEd
- Slides: Flight Data Management at FlightStats
Hands-On
3
Readings
- Downloading and Installing Docker Desktop Instructions
- Instructions for Downloading Hands On Datasets
- Basic terminal shell commands
What is a Data Model?
1
Discussions
- Let's discuss: Modeling data in your daily life
4
Videos
- Introduction to Data Models
- Data Model Structures
- Data Model Operations
- Data Model Constraints
1
Readings
- Slides: What Is A Data Model?
Hands-On
1
Videos
- Introduction to CSV Data
2
Readings
- Introduction to CSV Data (OpenOffice)
- Introduction to CSV Data (Microsoft Excel)
Different Kinds of Data Models (Part 1)
1
Discussions
- Let's discuss: Utilization of XML or JSON on the Internet
2
Videos
- What is a Relational Data Model?
- What is a Semistructured Data Model?
2
Readings
- Slides: What Is A Relational Data Model?
- Slides: What is a Semistructured Data Model?
Hands-On
1
Assignment
- Practical Quiz for Week 2 Hands-On Lectures
4
Videos
- Exploring the Relational Data Model of CSV Files
- Exploring the Semistructured Data Model of JSON data
- Exploring the Array Data Model of an Image
- Exploring Sensor Data
7
Readings
- Exploring the Relational Data Model of Comma Separated Values (OpenOffice)
- Exploring the Relational Data Model of Comma Separated Values (Excel)
- Installing Python
- Creating a Python Virtual Environment
- Exploring the Semistructured Data Model of JSON data
- Exploring the Array Data Model of an Image
- Exploring Sensor Data
Different Kinds of Data Models (Part 2)
1
Assignment
- Data Models Quiz
3
Videos
- Vector Space Model
- Graph Data Model
- Other Data Models
3
Readings
- Slides: Vector Space Model
- Slides: Graph Data Model
- Slides: Other Data Models
Hands-On
2
Videos
- Exploring the Lucene Search Engine's Vector Data Model
- Exploring Graph Data Models with Gephi
2
Readings
- Exploring Vector Data Models with Lucene
- Exploring Graph Data Models with Gephi
Data Models vs. Data Formats
1
Videos
- Data Model vs. Data Format
1
Readings
- Slides: Data Model vs. Data Format
Working with Streaming Data
1
Assignment
- Data Formats and Streaming Data Quiz
1
Discussions
- Let's discuss: Streaming data applications
3
Videos
- What is a Data Stream?
- Why is Streaming Data Different?
- Understanding Data Lakes
3
Readings
- Slides: What is a Data Stream?
- Slides: Why is Streaming Data Different?
- Slides: Understanding Data Lakes
Hands-On: Handling Data Streams
1
Videos
- Exploring Streaming Sensor Data
1
Readings
- Exploring Streaming Sensor Data
Why Data Management?
1
Videos
- DBMS-based and non-DBMS-based Approaches to Big Data
1
Readings
- Slides: DBMS-based and non-DBMS-based Approaches to Big Data
From DBMS to BDMS
1
Assignment
- BDMS Quiz
6
Videos
- From DBMS to BDMS
- Redis: An Enhanced Key-Value Store
- Aerospike: a New Generation KV Store
- Semistructured Data – AsterixDB
- Solr: Managing Text
- Relational Data – Vertica
1
Readings
- Slides: From DBMS to BDMS
What is an Information System?
1
Peer Review
- Designing a Data Model for 'Catch the Pink Flamingo'
2
Discussions
- Let's discuss: Analytical tasks to make Catch the Pink Flamingo better
- Let's discuss: Using the data model for Catch the Pink Flamingo
1
Readings
- A Game by Eglence Inc.: Catch The Pink Flamingo
Auto Summary
Discover how to collect, store, and organize big data with "Big Data Modeling and Management Systems." This foundational course, ideal for those new to data science, explores various data genres and management tools like AsterixDB, HP Vertica, and SparkSQL. Offered by University of California San Diego through Coursera, the course includes hands-on tutorials and is designed to help you build a Big Data Infrastructure Plan. No prior programming experience is required, though basic technical skills are necessary. Choose from Starter, Professional, or Paid subscription options for a 780-minute immersive learning experience.

Ilkay Altintas

Amarnath Gupta