

Our Courses

Serverless Data Processing with Dataflow: Foundations
This course is part 1 of a 3-course series on Serverless Data Processing with Dataflow. In this first course, we start with a refresher of what Apache Beam is and its relationship with Dataflow. Next, we talk about the Apache Beam vision and the benefits of the Beam Portability framework. The Beam Portability framework achieves the vision that a developer can use their favorite programming language with their preferred execution backend.
-
Self Paced
-
3 hours
-
English

Big Data Analysis Deep Dive
The job market for architects, engineers, and analytics professionals with Big Data expertise continues to grow. The Academy’s Big Data Career path focuses on the fundamental tools and techniques needed to pursue a career in Big Data. This course covers data processing with Python, writing and reading SQL queries, transmitting data with MaxCompute, analyzing data with Quick BI, using Hive, Hadoop, and Spark on E-MapReduce, and visualizing data with data dashboards.
-
Self Paced
-
14 hours
-
English

Social Media Data Analytics
Learner Outcomes: After taking this course, you will be able to: - Utilize various Application Programming Interface (API) services to collect data from different social media sources such as YouTube, Twitter, and Flickr. - Process the collected data - primarily structured - using methods involving correlation, regression, and classification to derive insights about the sources and people who generated that data. - Analyze unstructured data - primarily textual comments - for sentiments expressed in them. - Use different tools for collecting, analyzing, and exploring social media data for resea
-
13 hours
-
English

Advanced Data Science with IBM
As a Coursera-certified specialization completer, you will have a proven, deep understanding of massively parallel data processing, data exploration and visualization, and advanced machine learning and deep learning. You'll un…
-
Self Paced
-
English

NoSQL, Big Data, and Spark Foundations
Big Data Engineers and professionals with NoSQL skills are highly sought after in the data management industry. This Specialization is designed for those seeking to develop fundamental skills for working with Big Data, Apache Spark, and NoSQL databases.
-
Self Paced
-
English

Preparing for Google Cloud Certification: Cloud Data Engineer
This program provides the skills you need to advance your career in data engineering and provides a pathway to earn the industry-recognized Google Cloud Professional Data Engineer certification.
-
Self Paced
-
English

Data Structures and Algorithms
This course explores data structures and algorithms for back-end development, focusing on performance and scalability.
-
Self Paced
-
English

I/O-efficient algorithms
I/O-efficient algorithms, also known as external-memory algorithms and closely related to cache-oblivious algorithms, are a class of algorithms designed to efficiently process data that is too large to fit entirely in a computer's main memory (RAM). These algorithms are particularly useful when dealing with massive datasets, such as those found in large-scale data processing, database management, and file systems. Operations on data become more expensive when the data item is located higher in the memory hierarchy, that is, farther from the CPU.
-
Self Paced
-
10 hours
-
English

Data Processing and Manipulation
The "Data Processing and Manipulation" course provides students with a comprehensive understanding of various data processing and manipulation concepts and tools. Participants will learn how to handle missing values, detect outliers, perform sampling and dimension reduction, apply scaling and discretization techniques, and explore data cube and pivot table operations. This course equips students with essential skills for efficiently preparing and transforming data for analysis and decision-making. Learning Objectives: 1.
-
Self Paced
-
29 hours
-
English

Serverless Data Processing with Dataflow
It is becoming harder and harder to maintain a technology stack that can keep up with the growing demands of a data-driven business. Every Big Data practitioner is familiar with the three V’s of Big Data: volume, velocity, and variety. What if there was a scale-proof technology that was designed to meet these demands? Enter Google Cloud Dataflow. Google Cloud Dataflow simplifies data processing by unifying batch & stream processing and providing a serverless experience that allows users to focus on analytics, not infrastructure.
-
Self Paced
-
English

GPU Programming
This specialization is intended for data scientists and software developers to create software that uses commonly available hardware. Students will be introduced to CUDA and libraries that allow for performing numerous computations in parallel and rapidly. Applications for these skills are machine learning, image/audio signal processing, and data processing.
-
Self Paced
-
English

Data Analysis Using Pyspark
Distributed data processing technologies are among the important topics every data analyst should be familiar with. As a data analyst, you should be able to run different queries against your dataset to extract useful information from it. But what if your data is so big that working with it on your local machine is impractical? That is when distributed data processing and Spark come in handy.
-
Self Paced
-
3 hours
-
English

How Computers Work: Demystifying Computation
Explore the fundamentals of computing: computer architecture, binary logic, data processing, circuits & more.
-
Self Paced
-
12 hours
-
English

Vertex AI: Qwik Start
This is a self-paced lab that takes place in the Google Cloud console. In this lab, you will use BigQuery for data processing and exploratory data analysis, and the Vertex AI platform to train and deploy a custom TensorFlow Regressor model to predict customer lifetime value (CLV). The goal of the lab is to introduce you to Vertex AI through a high-value, real-world use case: predictive CLV. Starting with a local BigQuery and TensorFlow workflow, you will progress toward training and deploying your model in the cloud with Vertex AI.
-
Self Paced
-
2 hours
-
English

Basic Data Processing and Visualization
This is the first course in the four-course specialization Python Data Products for Predictive Analytics, introducing the basics of reading and manipulating datasets in Python. In this course, you will learn what a data product is and go through several Python libraries to perform data retrieval, processing, and visualization. This course will introduce you to the field of data science and prepare you for the next three courses in the Specialization: Design Thinking and Predictive Analytics for Data Products, Meaningful Predictive Modeling, and Deploying Machine Learning Models.
-
Self Paced
-
11 hours
-
English

Introduction to Data Engineering
Start your journey in one of the fastest-growing professions today with this beginner-friendly Data Engineering course! You will be introduced to the core concepts, processes, and tools you need to know in order to gain a foundational knowledge of data engineering, as well as the roles that Data Engineers, Data Scientists, and Data Analysts play in the ecosystem. You will begin this course by understanding what data engineering is and how these roles fit into this exciting field.
-
Self Paced
-
13 hours
-
English

Predictive Modeling and Machine Learning with MATLAB
In this course, you will build on the skills learned in Exploratory Data Analysis with MATLAB and Data Processing and Feature Engineering with MATLAB to increase your ability to harness the power of MATLAB to analyze data relevant to the work you do. These skills are valuable for those who have domain knowledge and some exposure to computational tools, but no programming background.
-
Self Paced
-
22 hours
-
English

Building Batch Data Pipelines on Google Cloud
Data pipelines typically fall under one of the Extract and Load (EL), Extract, Load and Transform (ELT) or Extract, Transform and Load (ETL) paradigms. This course describes which paradigm should be used and when for batch data. Furthermore, this course covers several technologies on Google Cloud for data transformation including BigQuery, executing Spark on Dataproc, pipeline graphs in Cloud Data Fusion and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.
-
Self Paced
-
17 hours
-
English

Introduction to Relational Databases (RDBMS)
Are you ready to dive into the world of data engineering? In this beginner-level course, you will gain a solid understanding of how data is stored, processed, and accessed in relational databases (RDBMSes). You will work with different types of databases that are appropriate for various data processing requirements. You will begin this course with an introduction to relational database concepts and to several industry-standard relational databases, including IBM DB2, MySQL, and PostgreSQL.
-
Self Paced
-
19 hours
-
English

Introduction to Big Data with Spark and Hadoop
This self-paced IBM course will teach you all about big data! You will become familiar with the characteristics of big data and its application in big data analytics. You will also gain hands-on experience with big data processing tools like Apache Hadoop and Apache Spark. Bernard Marr defines big data as the digital trace that we are generating in this digital era. You will start the course by understanding what big data is and exploring how insights from big data can be harnessed for a variety of use cases.
-
Self Paced
-
24 hours
-
English

Fundamentals of Scalable Data Science
Apache Spark is the de facto standard for large-scale data processing. This is the first course in a series leading to the IBM Advanced Data Science Specialization. We strongly believe that it is crucial for success to start learning a scalable data science platform, since memory and CPU constraints are the most limiting factors when it comes to building advanced machine learning models. In this course we teach you the fundamentals of Apache Spark using Python and PySpark.
-
Self Paced
-
22 hours
-
English

Introduction to Business Analytics with R
Nearly every aspect of business is affected by data analytics. For businesses to capitalize on data analytics, they need leaders who understand the business analytic workflow. This course addresses the human skills gap by providing a foundational set of data processing skills that can be applied to many business settings. In this course you will use a data analytic language, R, to efficiently prepare business data for analytic tools such as algorithms and visualizations.
-
Self Paced
-
17 hours
-
English

Advanced Algorithms and Complexity
In previous courses of our online specialization you've learned the basic algorithms, and now you are ready to step into the area of more complex problems and the algorithms that solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with network flows, which are used in typical applications such as optimal matchings, finding disjoint paths, and flight scheduling, as well as in more surprising ones like image segmentation in computer vision.
-
Self Paced
-
27 hours
-
English

Big Data Integration and Processing
At the end of the course, you will be able to: retrieve data from example database and big data management systems; describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications; identify when a big data problem needs data integration; and execute simple big data integration and processing on Hadoop and Spark platforms. This course is for those new to data science. Completion of Intro to Big Data is recommended.
-
Self Paced
-
18 hours
-
English