

Our Courses

Advanced Data Science with IBM
As a Coursera-certified completer of this specialization, you will have a proven, deep understanding of massive parallel data processing, data exploration and visualization, and advanced machine learning & deep learning. You'll un…
-
Course by
-
Self Paced
-
English

Building Modern Node.js Applications on AWS
In modern cloud-native application development, the goal is often to build serverless architectures that are scalable, highly available, and fully managed. This means less operational overhead for you and your business, and more focus on the applications and business-specific projects that differentiate you in your marketplace. In this course, we will cover how to build a modern, greenfield serverless backend on AWS.
-
Course by
-
Self Paced
-
17 hours
-
English

Introduction to Designing Data Lakes on AWS
In this class, Introduction to Designing Data Lakes on AWS, we will help you understand how to create and operate a data lake in a secure and scalable way, without prior knowledge of data science! Starting with the "WHY" you may want a data lake, we will look at the data lake value proposition, characteristics, and components.
-
Course by
-
Self Paced
-
13 hours
-
English

Serverless Data Processing with Dataflow: Foundations
This course is part 1 of a 3-course series on Serverless Data Processing with Dataflow. In this first course, we start with a refresher of what Apache Beam is and its relationship with Dataflow. Next, we talk about the Apache Beam vision and the benefits of the Beam Portability framework. The Beam Portability framework achieves the vision that a developer can use their favorite programming language with their preferred execution backend.
-
Course by
-
Self Paced
-
3 hours
-
English
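To give a flavor of the Beam portability idea described above, here is a minimal word-count pipeline written with the Apache Beam Python SDK. The pipeline, runner choice, and sample strings are illustrative and not taken from the course; the point is that the same code can target a different execution backend by changing only the pipeline options.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# DirectRunner executes locally; swapping in DataflowRunner (plus GCP project,
# region, and staging options) would run the same pipeline on managed Dataflow.
options = PipelineOptions(runner="DirectRunner")

with beam.Pipeline(options=options) as p:
    (
        p
        | "Create" >> beam.Create(["apache beam", "dataflow runs beam", "beam portability"])
        | "Split" >> beam.FlatMap(str.split)
        | "PairWithOne" >> beam.Map(lambda word: (word, 1))
        | "CountPerWord" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```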

Data Analysis Using Pyspark
One of the important topics that every data analyst should be familiar with is distributed data processing technology. As a data analyst, you should be able to apply different queries to your dataset to extract useful information from it. But what if your data is so big that working with it on your local machine is no longer practical? That is when distributed data processing and Spark technology come in handy.
-
Course by
-
Self Paced
-
3 hours
-
English
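For a sense of the kind of distributed query the PySpark course above describes, here is a minimal sketch using the PySpark DataFrame API; the table, column names, and filter threshold are made up for illustration.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sales-demo").getOrCreate()

# Toy in-memory dataset standing in for a file that would normally be
# loaded with spark.read.csv(...) or spark.read.parquet(...).
df = spark.createDataFrame(
    [("Cairo", 120.0), ("Cairo", 80.0), ("Giza", 200.0)],
    ["city", "amount"],
)

# Distributed query: filter rows, then aggregate per city.
(df.filter(F.col("amount") > 50)
   .groupBy("city")
   .agg(F.sum("amount").alias("total"), F.avg("amount").alias("avg"))
   .show())

spark.stop()
```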

GPU Programming
This specialization is intended for data scientists and software developers who want to create software that uses commonly available hardware. Students will be introduced to CUDA and to libraries that allow numerous computations to be performed in parallel and rapidly. Applications of these skills include machine learning, image/audio signal processing, and data processing.
-
Course by
-
Self Paced
-
English
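As one hedged illustration of GPU-parallel computation driven from Python (rather than the raw CUDA C/C++ the course itself introduces), the sketch below uses the CuPy library; it assumes a CUDA-capable GPU and the cupy package are installed.

```python
import cupy as cp  # assumption: cupy is installed and a CUDA GPU is available

# Allocate data directly on the GPU and run a matrix multiply on the device,
# where thousands of elements are processed in parallel.
a = cp.random.random((2048, 2048), dtype=cp.float32)
b = cp.random.random((2048, 2048), dtype=cp.float32)
c = cp.matmul(a, b)

# Copy the result back to host (CPU) memory as a NumPy array.
c_host = cp.asnumpy(c)
print(c_host.shape, c_host.dtype)
```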

Serverless Data Processing with Dataflow
It is becoming harder and harder to maintain a technology stack that can keep up with the growing demands of a data-driven business. Every Big Data practitioner is familiar with the three V’s of Big Data: volume, velocity, and variety. What if there was a scale-proof technology that was designed to meet these demands? Enter Google Cloud Dataflow. Google Cloud Dataflow simplifies data processing by unifying batch & stream processing and providing a serverless experience that allows users to focus on analytics, not infrastructure.
-
Course by
-
Self Paced
-
English

Data Processing and Manipulation
The "Data Processing and Manipulation" course provides students with a comprehensive understanding of various data processing and manipulation concepts and tools. Participants will learn how to handle missing values, detect outliers, perform sampling and dimension reduction, apply scaling and discretization techniques, and explore data cube and pivot table operations. This course equips students with essential skills for efficiently preparing and transforming data for analysis and decision-making. Learning Objectives: 1.
-
Course by
-
Self Paced
-
29 hours
-
English
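A minimal pandas sketch of several techniques listed in the course description, on an invented toy table: filling missing values, flagging outliers with the 1.5 × IQR rule, min-max scaling, and a pivot-table aggregation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "sales":  [100.0, np.nan, 250.0, 90.0, 5000.0],   # one missing value, one outlier
})

# Missing values: fill with the column median.
df["sales"] = df["sales"].fillna(df["sales"].median())

# Outlier detection with the 1.5 * IQR rule.
q1, q3 = df["sales"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = ~df["sales"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Min-max scaling to [0, 1].
df["sales_scaled"] = (df["sales"] - df["sales"].min()) / (df["sales"].max() - df["sales"].min())

# Pivot-table style aggregation.
print(df.pivot_table(index="region", values="sales", aggfunc=["mean", "count"]))
```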

I/O-efficient algorithms
I/O-efficient algorithms, also known as external memory algorithms or cache-oblivious algorithms, are a class of algorithms designed to efficiently process data that is too large to fit entirely in the main memory (RAM) of a computer. These algorithms are particularly useful when dealing with massive datasets, such as those found in large-scale data processing, database management, and file systems. Operations on data become more expensive the farther the data item sits from the processor in the memory hierarchy.
-
Course by
-
Self Paced
-
10 hours
-
English
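To make the external-memory idea concrete, here is a toy external merge sort in Python: it sorts chunks that fit in "memory", spills each sorted run to a temporary file, and then streams a k-way merge that holds only one value per run in RAM at a time. The chunk size and data are illustrative, not from the course.

```python
import heapq
import tempfile

def _spill(sorted_chunk):
    # Write one sorted run to a temporary file, one value per line.
    f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
    f.write("\n".join(str(v) for v in sorted_chunk) + "\n")
    f.close()
    return f.name

def external_sort(values, chunk_size=4):
    """Sort an iterable that is (pretend-)too large for RAM by sorting
    fixed-size chunks, spilling each run to disk, then k-way merging."""
    runs, chunk = [], []
    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk)))
    # heapq.merge streams the sorted runs without loading them fully.
    return list(heapq.merge(*(map(int, open(r)) for r in runs)))

print(external_sort([9, 1, 7, 3, 8, 2, 6, 5, 4]))
```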

Vertex AI: Qwik Start
This is a self-paced lab that takes place in the Google Cloud console. In this lab, you will use BigQuery for data processing and exploratory data analysis, and the Vertex AI platform to train and deploy a custom TensorFlow Regressor model to predict customer lifetime value (CLV). The goal of the lab is to introduce you to Vertex AI through a high-value, real-world use case: predictive CLV. Starting with a local BigQuery and TensorFlow workflow, you will progress toward training and deploying your model in the cloud with Vertex AI.
-
Course by
-
Self Paced
-
2 hours
-
English
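The custom regressor in the lab is a TensorFlow model; a minimal stand-in is sketched below with synthetic features in place of the BigQuery CLV data, so the feature count, layer sizes, and training settings are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the CLV features prepared in BigQuery in the lab.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)).astype("float32")
y = (X @ np.array([3.0, -1.0, 2.0, 0.5], dtype="float32") + 10.0).reshape(-1, 1)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),           # single continuous output: predicted CLV
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:3], verbose=0))
```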

Data Structures and Algorithms
This course explores data structures and algorithms for back-end development, focusing on performance and scalability.
-
Course by
-
Self Paced
-
English

Preparing for Google Cloud Certification: Cloud Data Engineer
This program provides the skills you need to advance your career in data engineering and provides a pathway to earn the industry-recognized Google Cloud Professional Data Engineer certification.
-
Course by
-
Self Paced
-
English

NoSQL, Big Data, and Spark Foundations
Big Data Engineers and professionals with NoSQL skills are highly sought after in the data management industry. This Specialization is designed for those seeking to develop fundamental skills for working with Big Data, Apache Spark, and NoSQL databases.
-
Course by
-
Self Paced
-
English

How Computers Work: Demystifying Computation
Explore the fundamentals of computing: computer architecture, binary logic, data processing, circuits & more.
-
Course by
-
Self Paced
-
12
-
English

Social Media Data Analytics
Learner Outcomes: After taking this course, you will be able to: (1) utilize various Application Programming Interface (API) services to collect data from different social media sources such as YouTube, Twitter, and Flickr; (2) process the collected data (primarily structured) using methods involving correlation, regression, and classification to derive insights about the sources and the people who generated that data; (3) analyze unstructured data (primarily textual comments) for the sentiments expressed in them; and (4) use different tools for collecting, analyzing, and exploring social media data for research.
-
Course by
-
13 hours
-
English
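As a small, self-contained stand-in for classifying collected comments, the sketch below trains a bag-of-words sentiment classifier with scikit-learn; the comments and labels are invented for illustration, and no social media API is actually called.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-labelled comments standing in for data pulled from a social media API.
comments = ["love this video", "great content, thanks", "terrible audio quality",
            "waste of time", "really helpful tutorial", "awful and boring"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = positive, 0 = negative

# Bag-of-words features feeding a logistic regression classifier.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(comments, labels)
print(clf.predict(["this was so helpful", "boring video"]))
```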

Big Data Analysis Deep Dive
The job market for architects, engineers, and analytics professionals with Big Data expertise continues to grow. The Academy's Big Data Career Path focuses on the fundamental tools and techniques needed to pursue a career in Big Data. This course includes: data processing with Python, writing and reading SQL queries, transmitting data with MaxCompute, analyzing data with Quick BI, using Hive, Hadoop, and Spark on E-MapReduce, and visualizing data with data dashboards.
-
Course by
-
Self Paced
-
14 hours
-
English
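To illustrate the "writing and reading SQL queries" portion in a self-contained way, here is a sketch using Python's built-in sqlite3 module; the table and values are made up and merely stand in for the MaxCompute or Hive tables used in the course.

```python
import sqlite3

# In-memory database standing in for the warehouse tables used in the course.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (city TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("Hangzhou", 120.0), ("Hangzhou", 80.0), ("Beijing", 200.0)])

# Read the data back with an aggregate query.
for row in con.execute("SELECT city, SUM(amount) AS total FROM orders GROUP BY city"):
    print(row)

con.close()
```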

Foundations of Sports Analytics: Data, Representation, and Models in Sports
This course provides an introduction to using Python to analyze team performance in sports. Learners will discover a variety of techniques that can be used to represent sports data and how to extract narratives based on these analytical techniques. The main focus of the introduction will be on the use of regression analysis to analyze team and player performance data, using examples drawn from the National Football League (NFL), the National Basketball Association (NBA), the National Hockey League (NHL), the English Premier League (EPL, soccer) and the Indian Premier League (IPL, cricket).
-
Course by
-
Self Paced
-
49 hours
-
English
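A minimal sketch of the regression idea the course centers on, using scikit-learn on synthetic per-game data; the features (shots for and against) and the generated relationship are invented for illustration, not drawn from any league dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented per-game team data: shots for/against -> goal difference.
rng = np.random.default_rng(42)
shots_for = rng.integers(5, 25, size=100)
shots_against = rng.integers(5, 25, size=100)
goal_diff = 0.3 * shots_for - 0.3 * shots_against + rng.normal(0, 1, size=100)

# Fit a linear model relating shot volume to goal difference.
X = np.column_stack([shots_for, shots_against])
model = LinearRegression().fit(X, goal_diff)
print("coefficients:", model.coef_)
print("R^2:", round(model.score(X, goal_diff), 3))
```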

Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
The Library of Integrative Network-based Cellular Signatures (LINCS) was an NIH Common Fund program that ran for 10 years, from 2012 to 2021. The idea behind the LINCS program was to perturb different types of human cells with many different types of perturbations: drugs and other small molecules; genetic manipulations such as single-gene knockdown, knockout, or overexpression; manipulation of the extracellular microenvironment conditions, for example, growing cells on different surfaces; and more.
-
Course by
-
Self Paced
-
9 hours
-
English

AWS Data Processing
AWS: Data Processing is the second course of the AWS Certified Data Analytics Specialty Specialization. This course focuses on providing data processing solutions and is designed to teach learners the concepts of EMR and Extract, Transform, and Load (ETL). It also puts emphasis on ETL services and data processing solutions in AWS. The course is divided into three modules, each further segmented into lessons and video lectures. It provides approximately 3.5 to 4 hours of video lectures covering both theory and hands-on knowledge.
-
Course by
-
Self Paced
-
6 hours
-
English
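To illustrate Extract, Transform, and Load at its smallest scale, here is a hedged, plain-Python sketch; a real EMR or Glue job would read from and write to services such as S3 or Redshift rather than an in-memory string, and the column names are invented.

```python
import csv
import io

def extract(raw_csv):
    """Extract: parse rows from a raw CSV source (a file on S3 in a real job)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows):
    """Transform: keep completed orders and convert the amount to a number."""
    return [{"order_id": r["order_id"], "amount": float(r["amount"])}
            for r in rows if r["status"] == "completed"]

def load(rows):
    """Load: print here; a real pipeline would write to a warehouse or data lake."""
    for r in rows:
        print(r)

raw = "order_id,status,amount\n1,completed,19.99\n2,cancelled,5.00\n3,completed,42.50\n"
load(transform(extract(raw)))
```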

Data Platform, Cloud Networking and AI in the Cloud
The Data Platform course aims to establish a strong foundation and working knowledge of the fundamentals of data, including data mechanics, databases, and other foundational elements of data processing. This course drills into specific data management elements, including the relational taxonomy of data, the data lifecycle, fundamental databases, and data processing and analysis. The course also covers the relevance of AI with respect to data in the cloud.
-
Course by
-
Self Paced
-
7 hours
-
English

Data Processing Using Python
This course (the English copy of "用Python玩转数据") is mainly for non-computer majors. It advances level by level, starting with the basic syntax of Python, then how to acquire data in Python locally and from the network, how to present data, how to conduct basic and advanced statistical analysis and visualization of data, and finally how to design a simple GUI to present and process data.
-
Course by
-
Self Paced
-
29 hours
-
English
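In the spirit of the course's basic statistical analysis and presentation stages, here is a small sketch using only the Python standard library; the scores are an invented stand-in for data acquired locally or from the network.

```python
import statistics
from collections import Counter

# Inline dataset standing in for data read from a file or downloaded over the network.
scores = [72, 88, 91, 65, 78, 88, 95, 70]

print("mean:", statistics.mean(scores))
print("median:", statistics.median(scores))
print("stdev:", round(statistics.stdev(scores), 2))
print("mode:", Counter(scores).most_common(1)[0][0])

# Minimal text-based "visualization" of the distribution.
for s in sorted(scores):
    print(f"{s:3d} | " + "#" * (s // 10))
```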

Serverless Data Processing Dataflow em Português Brasileiro
It is becoming harder and harder to maintain a technology stack that can keep up with the growing demands of a data-driven business. Every Big Data practitioner is familiar with the three V's of Big Data: volume, velocity, and variety. What if there were a scale-proof technology designed to meet these demands? Enter Google Cloud Dataflow. Google Cloud Dataflow simplifies data processing by unifying batch and stream processing and providing a serverless experience that allows users to focus on analytics, not infrastructure.
-
Course by
-
Self Paced
-
Portuguese

Deploying Machine Learning Models
In this course we will learn about Recommender Systems (which we will study for the Capstone project), and also look at deployment issues for data products. By the end of this course, you should be able to implement a working recommender system (e.g.
-
Course by
-
Self Paced
-
11 hours
-
English
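A minimal sketch of a user-based collaborative-filtering recommender with NumPy; the rating matrix and similarity-weighted scoring are illustrative and not the specific method prescribed by the course.

```python
import numpy as np

# Invented user-item rating matrix (rows = users, columns = items, 0 = unrated).
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

def recommend(user, k=1):
    # Cosine similarity between the target user and every other user.
    norms = np.linalg.norm(R, axis=1)
    sims = (R @ R[user]) / (norms * norms[user] + 1e-9)
    sims[user] = 0.0
    # Score items by similarity-weighted ratings, hiding already-rated items.
    scores = sims @ R
    scores[R[user] > 0] = -np.inf
    ranked = np.argsort(scores)[::-1]
    return [int(i) for i in ranked if np.isfinite(scores[i])][:k]

print("items recommended for user 0:", recommend(0))
```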

Implement Real Time Analytics using Azure Stream Analytics
In this project, we are going to see how to implement real-time analytics using Azure Stream Analytics. Data processing is broadly divided into two main categories: batch processing and real-time processing.
-
Course by
-
Self Paced
-
3 hours
-
English

Serverless Data Processing with Dataflow: Develop Pipelines
In this second installment of the Dataflow course series, we are going to dive deeper into developing pipelines using the Beam SDK. We start with a review of Apache Beam concepts. Next, we discuss processing streaming data using windows, watermarks, and triggers. We then cover options for sources and sinks in your pipelines, schemas to express your structured data, and how to do stateful transformations using the State and Timer APIs. We move on to reviewing best practices that help maximize your pipeline performance.
-
Course by
-
Self Paced
-
19 hours
-
English
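As a small illustration of the windowing concepts this course covers, the sketch below assigns timestamps to a bounded toy collection and sums values per key in fixed one-minute windows using the Beam Python SDK; the events and window size are made up.

```python
import apache_beam as beam
from apache_beam.transforms import window

# (user, event_time_seconds, value): invented events spanning two 60-second windows.
events = [("user1", 5, 2.0), ("user1", 40, 3.0), ("user1", 70, 5.0), ("user2", 10, 7.0)]

with beam.Pipeline() as p:
    (
        p
        | "Create" >> beam.Create(events)
        # Attach an event-time timestamp to each element.
        | "Timestamp" >> beam.Map(lambda e: window.TimestampedValue((e[0], e[2]), e[1]))
        # Group elements into fixed, non-overlapping 60-second windows.
        | "Window" >> beam.WindowInto(window.FixedWindows(60))
        # Aggregate per key within each window.
        | "SumPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```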