

Our Courses

Identifying Patient Populations
This course teaches you the fundamentals of computational phenotyping, a biomedical informatics method for identifying patient populations. In this course you will learn how different clinical data types perform when trying to identify patients with a particular disease or trait. You will also learn how to program different data manipulations and combinations to increase the complexity and improve the performance of your algorithms.
-
Self Paced
-
13 hours
-
English

Relational Database Design
Have you ever wanted to build a database but didn't know where to start? This course provides step-by-step guidance, taking you from a raw idea to an implementable relational database. By practicing real-life mini cases along the way, you will become confident and comfortable with relational database design. Let's get started! Relational Database Design can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform.
-
Self Paced
-
71 hours
-
English

Python Project for Data Engineering
Showcase your Python skills in this Data Engineering Project! This short course is designed to apply your basic Python skills through the implementation of various techniques for gathering and manipulating data. You will take on the role of a Data Engineer by extracting data from multiple sources, converting it into specific formats, and making it ready for loading into a database for analysis. You will also demonstrate your knowledge of web scraping and of using APIs to extract data.
-
Self Paced
-
16 hours
-
English

Data Science in Stratified Healthcare and Precision Medicine
An increasing volume of data is becoming available in biomedicine and healthcare, from genomic data to electronic patient records and data collected by wearable devices. Recent advances in data science are transforming the life sciences, leading to precision medicine and stratified healthcare. In this course, you will learn about some of the different types of data and computational methods involved in stratified healthcare and precision medicine. You will gain hands-on experience working with such data, and you will learn from leaders in the field about successful case studies.
-
Self Paced
-
17 hours
-
English

MLOps Platforms: Amazon SageMaker and Azure ML
In MLOps (Machine Learning Operations) Platforms: Amazon SageMaker and Azure ML, you will learn the skills needed to build, train, and deploy machine learning solutions in a production environment using two leading cloud platforms: Amazon Web Services (AWS) and Microsoft Azure.
-
Self Paced
-
13 hours
-
English

Data Analysis Using Python
This course provides an introduction to basic data science techniques using Python. Students are introduced to core concepts like Data Frames and joining data, and learn how to use data analysis libraries like pandas, numpy, and matplotlib. This course provides an overview of loading, inspecting, and querying real-world data, and how to answer basic questions about that data. Students will gain skills in data aggregation and summarization, as well as basic data visualization.
-
Self Paced
-
17 hours
-
English

Documentation and Usability for Cancer Informatics
Cancer datasets are plentiful, complicated, and hold information that may be critical for the next research advancements. To use these data to their full potential, researchers depend on the specialized data tools that are continually being developed and published. Bioinformatics tools can often be unfriendly to their users, who often have little to no background in programming (Bolchini et al. 2008).
-
Self Paced
-
6 hours
-
English

Using probability distributions for real world problems in R
By the end of this project, you will learn how to apply probability distributions to solve real world problems in R, a free, open-source program that you can download. You will learn how to answer real world problems using the following probability distributions – Binomial, Poisson, Normal, Exponential and Chi-square. You will also learn the various ways of visualizing these distributions of real world problems.
-
Self Paced
-
2 hours
-
English

Preparing for DP-900: Microsoft Azure Data Fundamentals Exam
Microsoft certifications give you a professional advantage by providing globally recognized and industry-endorsed evidence of mastering skills in digital and cloud businesses. In this course, you will prepare to take the DP-900 Microsoft Azure Data Fundamentals certification exam. You will refresh your knowledge of fundamental database concepts in a cloud environment, basic skills in cloud data services, and foundational knowledge of cloud data services within Microsoft Azure.
-
Self Paced
-
6 hours
-
English

AI Workflow: Business Priorities and Data Ingestion
This is the first course of a six part specialization. You are STRONGLY encouraged to complete these courses in order as they are not individual independent courses, but part of a workflow where each course builds on the previous ones. This first course in the IBM AI Enterprise Workflow Certification specialization introduces you to the scope of the specialization and prerequisites. Specifically, the courses in this specialization are meant for practicing data scientists who are knowledgeable about probability, statistics, linear algebra, and Python tooling for data science and ma
-
Self Paced
-
8 hours
-
English

Materials Data Sciences and Informatics
This course aims to provide a succinct overview of the emerging discipline of Materials Informatics at the intersection of materials science, computational science, and information science. Attention is drawn to specific opportunities afforded by this new field in accelerating materials development and deployment efforts. A particular emphasis is placed on materials exhibiting hierarchical internal structures spanning multiple length/structure scales and the impediments involved in establishing invertible process-structure-property (PSP) linkages for these materials.
-
Self Paced
-
9 hours
-
English

Microsoft Azure Machine Learning for Data Scientists
Machine learning is at the core of artificial intelligence, and many modern applications and services depend on predictive machine learning models. Training a machine learning model is an iterative process that requires time and compute resources. Automated machine learning can help make it easier.
-
Self Paced
-
11 hours
-
English

Perform data science with Azure Databricks
In this course, you will learn how to harness the power of Apache Spark and powerful clusters running on the Azure Databricks platform to run data science workloads in the cloud. This is the fourth course in a five-course program that prepares you to take the DP-100: Designing and Implementing a Data Science Solution on Azure certification exam. The certification exam is an opportunity to prove your knowledge and expertise in operating machine learning solutions at cloud scale using Azure Machine Learning.
-
Self Paced
-
26 hours
-
English

Introduction to Clinical Data Science
This course will prepare you to complete all parts of the Clinical Data Science Specialization. In this course you will learn how clinical data are generated, the format of these data, and the ethical and legal restrictions on these data. You will also learn enough SQL and R programming skills to be able to complete the entire Specialization - even if you are a beginner programmer. While you are taking this course you will have access to an actual clinical data set and a free, online computational environment for data science hosted by our Industry Partner Google Cloud.
-
Self Paced
-
8 hours
-
English

Machine Learning Introduction for Everyone
This three-module course introduces machine learning and data science for everyone with a foundational understanding of machine learning models. You’ll learn about the history of machine learning, applications of machine learning, the machine learning model lifecycle, and tools for machine learning. You’ll also learn about supervised versus unsupervised learning, classification, regression, evaluating machine learning models, and more. Our labs give you hands-on experience with these machine learning and data science concepts.
-
Self Paced
-
7 hours
-
English

Big Data, Genes, and Medicine
This course distills for you the expert knowledge and skills mastered by professionals in Health Big Data Science and Bioinformatics. You will learn exciting facts about human biology and chemistry, genetics, and medicine, intertwined with the science of Big Data and the skills to harness the avalanche of openly available data that we are just starting to make sense of.
-
Self Paced
-
40 hours
-
English

Probability Theory: Foundation for Data Science
Understand the foundations of probability and its relationship to statistics and data science. We’ll learn what it means to calculate a probability, independent and dependent outcomes, and conditional events. We’ll study discrete and continuous random variables and see how this fits with data collection. We’ll end the course with Gaussian (normal) random variables and the Central Limit Theorem and understand its fundamental importance for all of statistics and data science. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science
-
Self Paced
-
48 hours
-
English

Scalable Machine Learning on Big Data using Apache Spark
This course will empower you with the skills to scale data science and machine learning (ML) tasks on Big Data sets using Apache Spark. Most real world machine learning work involves very large data sets that go beyond the CPU, memory and storage limitations of a single computer. Apache Spark is an open source framework that leverages cluster computing and distributed storage to process extremely large data sets in an efficient and cost effective manner.
-
Self Paced
-
7 hours
-
English

The exposome: cracking the science about what makes us sick
What are the causes of disease? We know that most diseases result from a combination of genes and environment (nature and nurture). Our genes alone do not determine our fate. For most complex diseases, externalities - environmental factors in the broad sense - are more important. This includes our living and working environments, diet, social support and stress, pollution, and exposure to infectious agents.
-
Self Paced
-
15 hours
-
English

AI Workflow: Data Analysis and Hypothesis Testing
This is the second course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order as they are not individual independent courses, but part of a workflow where each course builds on the previous ones. In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing will be introduced to you as part of your work. You will learn techniques of estimation
-
Self Paced
-
11 hours
-
English

Statistical Inference for Estimation in Data Science
This course introduces statistical inference, sampling distributions, and confidence intervals. Students will learn how to define and construct good estimators, method of moments estimation, maximum likelihood estimation, and methods of constructing confidence intervals that will extend to more general settings. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform.
-
Self Paced
-
28 hours
-
English

Computer Vision in Microsoft Azure
In Microsoft Azure, the Computer Vision cognitive service uses pre-trained models to analyze images, enabling software developers to easily build applications that "see" the world and make sense of it. This ability to process images is the key to creating software that can emulate human visual perception. In this course, you'll explore some of these capabilities as you learn how to use the Computer Vision service to analyze images. This course will help you prepare for Exam AI-900: Microsoft Azure AI Fundamentals.
-
Self Paced
-
8 hours
-
English

Calculus through Data & Modelling: Vector Calculus
This course continues your study of calculus by focusing on the applications of integration to vector-valued functions, or vector fields. These are functions that assign vectors to points in space, allowing us to develop advanced theories to then apply to real-world problems. We define line integrals, which can be used to find the work done by a vector field. We culminate this course with Green's Theorem, which describes the relationship between certain kinds of line integrals on closed paths and double integrals.
-
Self Paced
-
5 hours
-
English

Introduction to the Tidyverse
This course introduces a powerful set of data science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of "tidy data" and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed into tidy data, the data science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a data science project.
-
Self Paced
-
English

Google Advanced Data Analytics Capstone
You’re almost there! This is the seventh and final course of the Google Advanced Data Analytics Certificate. In this course, you have the opportunity to complete an optional capstone project that includes key concepts from each of the six preceding courses.
-
Self Paced
-
10 hours
-
English