

Our Courses

NoSQL, Big Data, and Spark Foundations
Big Data Engineers and professionals with NoSQL skills are highly sought after in the data management industry. This Specialization is designed for those seeking to develop fundamental skills for working with Big Data, Apache Spark, and NoSQL databases.
-
Course by
-
Self Paced
-
English

Data Analysis Using Pyspark
One of the important topics every data analyst should be familiar with is distributed data processing. As a data analyst, you should be able to apply different queries to your dataset to extract useful information from it. But what if your data is so big that working with it on your local machine is impractical? That is when distributed data processing and Spark come in handy.
-
Course by
-
Self Paced
-
3 hours
-
English

Machine Learning with PySpark: Data Analysis using SQL
This Guided Project is for beginning Python developers. In this 1-hour project-based course, you will learn how to describe PySpark and machine learning, use PySpark to capture data, use PySpark SQL to observe the data, use PySpark MLlib to prepare training data, and use PySpark MLlib to predict an outcome. To achieve this, we will use PySpark to read data into a PySpark DataFrame, view the data using PySpark SQL, prepare the test and training data from a heart disease data set, and attempt to predict heart disease using independent variables.
-
Course by
-
Self Paced
-
3 hours
-
English

Building Machine Learning Pipelines in PySpark MLlib
By the end of this project, you will know how to create machine learning pipelines using Python and Spark, free, open-source programs that you can download. You will learn how to load your dataset in Spark and perform basic cleaning, such as removing columns with many missing values and removing rows with missing values. You will then create a machine learning pipeline with a random forest regression model, and use cross-validation and parameter tuning to select the best model from the pipeline.
-
Course by
-
Self Paced
-
2 hours
-
English

Music Recommender System Using Pyspark
Nowadays, recommender systems are everywhere. For example, Amazon uses recommender systems to suggest products you might be interested in based on what you have bought earlier, and Spotify suggests new tracks based on the songs you listen to every day. Most of these recommender systems use algorithms based on matrix factorization, such as NMF (Non-negative Matrix Factorization) or ALS (Alternating Least Squares).
-
Course by
-
Self Paced
-
2 hours
-
English

Machine Learning with PySpark: Customer Churn Analysis
This 90-minute guided project, "Pyspark for Data Science: Customer Churn Prediction," teaches you how to use PySpark to build a machine learning model that predicts customer churn at a telecommunications company. It covers a range of essential tasks, including data loading, exploratory data analysis, data preprocessing, feature preparation, model training, evaluation, and deployment, all using PySpark.
-
Course by
-
Self Paced
-
3 hours
-
English

Graduate Admission Prediction with Pyspark ML
In this 1-hour project-based course, you will learn to build a linear regression model using PySpark ML to predict students' admission to university. We will use the Graduate Admission 2 data set from Kaggle. Our goal is to use a simple linear regression algorithm from the PySpark machine learning library to predict the chances of admission. We will carry out the entire project in the Google Colab environment, where we will install PySpark. You will need a free Gmail account to complete this project.
-
Course by
-
Self Paced
-
2 hours
-
English

Diabetes Prediction With Pyspark MLLIB
In this 1-hour project-based course, you will learn to build a logistic regression model using PySpark MLlib to classify patients as either diabetic or non-diabetic. We will use the popular Pima Indian Diabetes data set. Our goal is to use a simple logistic regression classifier from the PySpark machine learning library for diabetes classification. We will carry out the entire project in the Google Colab environment, where we will install PySpark. You will need a free Gmail account to complete this project.
-
Course by
-
Self Paced
-
3 hours
-
English

Machine Learning con Pyspark aplicado al campo sanitario
This project is a practical, effective course for learning to build Machine Learning models in a Big Data environment with PySpark for healthcare projects.
We will teach you PySpark from scratch, from the basics to the most complex functions. You will finish by developing a complete, advanced model with Spark in Jupyter Notebooks.
-
Course by
-
Self Paced
-
3 hours
-
Spanish

ML y Big Data con PySpark para la retención de clientes
A practical, effective course for learning to build Machine Learning models with PySpark in a Big Data environment to predict customer churn. We will teach you the fundamentals of Spark and MLlib from scratch, and you will finish by developing advanced Machine Learning models with MLlib and PySpark.
-
Course by
-
Self Paced
-
3 hours
-
Spanish

Machine Learning y Regresión con PySpark. Guía paso a paso
A practical, effective course for learning to build regression (Machine Learning) models with PySpark in a Big Data environment.
-
Course by
-
2 hours
-
Spanish

Fundamentals of Scalable Data Science
Apache Spark is the de facto standard for large-scale data processing. This is the first course in a series leading to the IBM Advanced Data Science Specialization. We strongly believe that it is crucial for success to start learning a scalable data science platform, since memory and CPU constraints are the most limiting factors when it comes to building advanced machine learning models.
In this course we teach you the fundamentals of Apache Spark using Python and PySpark.
-
Course by
-
Self Paced
-
22 hours
-
English