- Level Foundation
- Duration 21 hours
- Offered by University of Washington
About
Case Studies: Analyzing Sentiment & Loan Default Prediction
In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). In our second case study for this course, loan default prediction, you will tackle financial data and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.
In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!
Learning Objectives: By the end of this course, you will be able to:
- Describe the input and output of a classification model.
- Tackle both binary and multiclass classification problems.
- Implement a logistic regression model for large-scale classification.
- Create a non-linear model using decision trees.
- Improve the performance of any model using boosting.
- Scale your methods with stochastic gradient ascent.
- Describe the underlying decision boundaries.
- Build a classification model to predict sentiment in a product review dataset.
- Analyze financial data to predict loan defaults.
- Use techniques for handling missing data.
- Evaluate your models using precision-recall metrics.
- Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).
Modules
Welcome to the course
3
Videos
- Welcome to the classification course, a part of the Machine Learning Specialization
- What is this course about?
- Impact of classification
3
Readings
- Important Update regarding the Machine Learning Specialization
- Slides presented in this module
- Get help and meet other learners. Join your Community!
Course overview and details
5
Videos
- Course overview
- Outline of first half of course
- Outline of second half of course
- Assumed background
- Let's get started!
1
Readings
- Reading: Software tools you'll need
Linear classifiers
6
Videos
- Linear classifiers: A motivating example
- Intuition behind linear classifiers
- Decision boundaries
- Linear classifier model
- Effect of coefficient values on decision boundary
- Using features of the inputs
1
Readings
- Slides presented in this module
Class probabilities
4
Videos
- Predicting class probabilities
- Review of basics of probabilities
- Review of basics of conditional probabilities
- Using probabilities in classification
Logistic regression
5
Videos
- Predicting class probabilities with (generalized) linear models
- The sigmoid (or logistic) link function
- Logistic regression model
- Effect of coefficient values on predicted probabilities
- Overview of learning logistic regression models
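To make the lesson titles above concrete, here is a minimal sketch of how a logistic regression model could turn a linear score into a class probability through the sigmoid link function. It is an illustration, not the course's own code: the function names, the coefficient values, and the word-count features are all hypothetical.

```python
import math


def sigmoid(score):
    """Logistic link function: squashes any real score into (0, 1)."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    # For negative scores, this algebraically equal form avoids overflow.
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


def predict_probability(coefficients, features):
    """Estimate P(y = +1 | x, w) from the linear score w . h(x)."""
    score = sum(w * h for w, h in zip(coefficients, features))
    return sigmoid(score)


# Hypothetical example: [intercept, count of "awesome", count of "awful"]
coefficients = [0.0, 1.0, -1.5]
features = [1.0, 2.0, 1.0]  # constant feature, two "awesome", one "awful"
p_positive = predict_probability(coefficients, features)
print(f"P(y = +1 | x, w) = {p_positive:.3f}")
```

A score of 0 maps to a probability of exactly 0.5, which is why the decision boundary of a linear classifier sits where the score is zero. Larger coefficient magnitudes push probabilities toward 0 or 1 faster.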
Practical issues for classification
2
Videos
- Encoding categorical inputs
- Multiclass classification with 1 versus all
Summarizing linear classifiers & logistic regression
1
Assignment
- Linear Classifiers & Logistic Regression
1
Videos
- Recap of logistic regression classifier
Programming Assignment
1
Assignment
- Predicting sentiment from product reviews
1
Readings
- Predicting sentiment from product reviews
Maximum likelihood estimation
4
Videos
- Goal: Learning parameters of logistic regression
- Intuition behind maximum likelihood estimation
- Data likelihood
- Finding best linear classifier with gradient ascent
1
Readings
- Slides presented in this module
Gradient ascent algorithm for learning logistic regression classifier
5
Videos
- Review of gradient ascent
- Learning algorithm for logistic regression
- Example of computing derivative for logistic regression
- Interpreting derivative for logistic regression
- Summary of gradient ascent for logistic regression
Choosing step size for gradient ascent/descent
3
Videos
- Choosing step size
- Careful with step sizes that are too large
- Rule of thumb for choosing step size
(VERY OPTIONAL LESSON) Deriving gradient of logistic regression
5
Videos
- (VERY OPTIONAL) Deriving gradient of logistic regression: Log trick
- (VERY OPTIONAL) Expressing the log-likelihood
- (VERY OPTIONAL) Deriving probability y=-1 given x
- (VERY OPTIONAL) Rewriting the log likelihood into a simpler form
- (VERY OPTIONAL) Deriving gradient of log likelihood
Summarizing learning linear classifiers
1
Assignment
- Learning Linear Classifiers
1
Videos
- Recap of learning logistic regression classifiers
Programming Assignment
1
Assignment
- Implementing logistic regression from scratch
1
Readings
- Implementing logistic regression from scratch
Overfitting in classification
4
Videos
- Evaluating a classifier
- Review of overfitting in regression
- Overfitting in classification
- Visualizing overfitting with high-degree polynomial features
1
Readings
- Slides presented in this module
Overconfident predictions due to overfitting
3
Videos
- Overfitting in classifiers leads to overconfident predictions
- Visualizing overconfident predictions
- (OPTIONAL) Another perspective on overfitting in logistic regression
L2 regularized logistic regression
4
Videos
- Penalizing large coefficients to mitigate overfitting
- L2 regularized logistic regression
- Visualizing effect of L2 regularization in logistic regression
- Learning L2 regularized logistic regression with gradient ascent
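As a rough illustration of the last lesson title above, the sketch below shows one gradient ascent step on a log-likelihood with an L2 penalty on the coefficients. It assumes labels in {+1, -1}, an objective of the form log-likelihood minus `l2_penalty` times the squared coefficient norm, and an unpenalized intercept stored at index 0. These conventions and all names are assumptions made for the example.

```python
import math


def sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


def l2_gradient_step(w, X, y, step_size, l2_penalty):
    """One gradient ascent step on log-likelihood - l2_penalty * ||w||^2."""
    # Per-example error: indicator(y = +1) minus predicted P(y = +1 | x, w).
    errors = []
    for features, label in zip(X, y):
        score = sum(wj * hj for wj, hj in zip(w, features))
        indicator = 1.0 if label == +1 else 0.0
        errors.append(indicator - sigmoid(score))

    new_w = []
    for j, wj in enumerate(w):
        derivative = sum(features[j] * e for features, e in zip(X, errors))
        if j > 0:  # assumption: index 0 is the intercept, left unpenalized
            derivative -= 2.0 * l2_penalty * wj
        new_w.append(wj + step_size * derivative)
    return new_w


# Tiny hypothetical dataset: [constant feature, one numeric feature]
X = [[1.0, 1.0], [1.0, -1.0]]
y = [+1, -1]
w = [0.0, 1.0]
unregularized = l2_gradient_step(w, X, y, step_size=0.1, l2_penalty=0.0)
regularized = l2_gradient_step(w, X, y, step_size=0.1, l2_penalty=1.0)
print(unregularized, regularized)
```

With the penalty switched on, the same step pulls the non-intercept coefficient back toward zero, which is the shrinking effect the regularization lessons visualize.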
Sparse logistic regression
1
Videos
- Sparse logistic regression with L1 regularization
Summarizing overfitting & regularization in logistic regression
1
Assignment
- Overfitting & Regularization in Logistic Regression
1
Videos
- Recap of overfitting & regularization in logistic regression
Programming Assignment
1
Assignment
- Logistic Regression with L2 regularization
1
Readings
- Logistic Regression with L2 regularization
Intuition behind decision trees
3
Videos
- Predicting loan defaults with decision trees
- Intuition behind decision trees
- Task of learning decision trees from data
1
Readings
- Slides presented in this module
Learning decision trees
4
Videos
- Recursive greedy algorithm
- Learning a decision stump
- Selecting best feature to split on
- When to stop recursing
Using the learned decision tree
2
Videos
- Making predictions with decision trees
- Multiclass classification with decision trees
Learning decision trees with continuous inputs
3
Videos
- Threshold splits for continuous inputs
- (OPTIONAL) Picking the best threshold to split on
- Visualizing decision boundaries
Summarizing decision trees
1
Assignment
- Decision Trees
1
Videos
- Recap of decision trees
Programming Assignment 1
1
Assignment
- Identifying safe loans with decision trees
1
Readings
- Identifying safe loans with decision trees
Programming Assignment 2
1
Assignment
- Implementing binary decision trees
1
Readings
- Implementing binary decision trees
Overfitting in decision trees
2
Videos
- A review of overfitting
- Overfitting in decision trees
1
Readings
- Slides presented in this module
Early stopping to avoid overfitting
2
Videos
- Principle of Occam's razor: Learning simpler decision trees
- Early stopping in learning decision trees
(OPTIONAL LESSON) Pruning decision trees
3
Videos
- (OPTIONAL) Motivating pruning
- (OPTIONAL) Pruning decision trees to avoid overfitting
- (OPTIONAL) Tree pruning algorithm
Summarizing preventing overfitting in decision trees
1
Assignment
- Preventing Overfitting in Decision Trees
1
Videos
- Recap of overfitting and regularization in decision trees
Programming Assignment
1
Assignment
- Decision Trees in Practice
1
Readings
- Decision Trees in Practice
Basic strategies for handling missing data
3
Videos
- Challenge of missing data
- Strategy 1: Purification by skipping missing data
- Strategy 2: Purification by imputing missing data
1
Readings
- Slides presented in this module
Strategy 3: Modify learning algorithm to explicitly handle missing data
2
Videos
- Modifying decision trees to handle missing data
- Feature split selection with missing data
Summarizing handling missing data
1
Assignment
- Handling Missing Data
1
Videos
- Recap of handling missing data
The amazing idea of boosting a classifier
3
Videos
- The boosting question
- Ensemble classifiers
- Boosting
1
Readings
- Slides presented in this module
AdaBoost
5
Videos
- AdaBoost overview
- Weighted error
- Computing coefficient of each ensemble component
- Reweighing data to focus on mistakes
- Normalizing weights
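The four steps named in the lesson titles above (weighted error, component coefficient, reweighting, normalization) can be sketched as a single AdaBoost round. This is a minimal sketch under the standard AdaBoost formulas, assuming the weighted error is strictly between 0 and 1. The function and variable names are hypothetical.

```python
import math


def adaboost_round(weights, labels, predictions):
    """Run one AdaBoost round for a weak classifier's predictions.

    Assumes 0 < weighted error < 1; otherwise the logarithm is undefined.
    """
    # Weighted error: share of the total weight that sits on mistakes.
    total_weight = sum(weights)
    mistake_weight = sum(
        a for a, y, p in zip(weights, labels, predictions) if y != p
    )
    weighted_error = mistake_weight / total_weight

    # Coefficient of this ensemble component.
    coefficient = 0.5 * math.log((1.0 - weighted_error) / weighted_error)

    # Reweight: shrink weights of correct points, grow weights of mistakes.
    updated = [
        a * math.exp(-coefficient if y == p else coefficient)
        for a, y, p in zip(weights, labels, predictions)
    ]

    # Normalize so the weights sum to 1 again.
    norm = sum(updated)
    return weighted_error, coefficient, [a / norm for a in updated]


# Hypothetical data: four equally weighted points, one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [+1, +1, -1, -1]
predictions = [+1, +1, -1, +1]  # the last point is a mistake
error, coefficient, new_weights = adaboost_round(weights, labels, predictions)
print(error, round(coefficient, 4), new_weights)
```

In this example the single misclassified point ends up with half of the total weight after normalization, so the next weak classifier is pushed to focus on it.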
Applying AdaBoost
2
Videos
- Example of AdaBoost in action
- Learning boosted decision stumps with AdaBoost
Programming Assignment 1
1
Assignment
- Exploring Ensemble Methods
1
Readings
- Exploring Ensemble Methods
Convergence and overfitting in boosting
2
Videos
- The Boosting Theorem
- Overfitting in boosting
Summarizing boosting
1
Assignment
- Boosting
1
Videos
- Ensemble methods, impact of boosting & quick recap
Programming Assignment 2
1
Assignment
- Boosting a decision stump
1
Readings
- Boosting a decision stump
Why use precision & recall as quality metrics
2
Videos
- Case-study where accuracy is not best metric for classification
- What is good performance for a classifier?
1
Readings
- Slides presented in this module
Precision & recall explained
2
Videos
- Precision: Fraction of positive predictions that are actually positive
- Recall: Fraction of positive data predicted to be positive
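The two definitions in the lesson titles above translate directly into counts of true positives, false positives, and false negatives. The following sketch assumes labels in {+1, -1}. The function name and the example data are hypothetical, and returning `None` for an undefined ratio is a choice made here, not a convention from the course.

```python
def precision_recall(labels, predictions, positive=+1):
    """Compute (precision, recall) for the given positive class.

    Returns None for a metric whose denominator is zero (an assumption
    made for this sketch; other conventions exist).
    """
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if p == positive and y == positive)
    false_pos = sum(1 for y, p in pairs if p == positive and y != positive)
    false_neg = sum(1 for y, p in pairs if p != positive and y == positive)

    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    # Precision: fraction of positive predictions that are actually positive.
    precision = true_pos / predicted_pos if predicted_pos else None
    # Recall: fraction of positive data predicted to be positive.
    recall = true_pos / actual_pos if actual_pos else None
    return precision, recall


# Hypothetical example: four positive reviews, two negative reviews.
labels = [+1, +1, +1, +1, -1, -1]
predictions = [+1, +1, -1, -1, +1, -1]
precision, recall = precision_recall(labels, predictions)
print(precision, recall)  # precision is 2/3, recall is 0.5
```

Here the classifier makes three positive predictions, two of them correct, and finds two of the four actual positives.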
The precision-recall tradeoff
3
Videos
- Precision-recall extremes
- Trading off precision and recall
- Precision-recall curve
Summarizing precision-recall
1
Assignment
- Precision-Recall
1
Videos
- Recap of precision-recall
Programming Assignment
1
Assignment
- Exploring precision and recall
1
Readings
- Exploring precision and recall
Scaling ML to huge datasets
2
Videos
- Gradient ascent won't scale to today's huge datasets
- Timeline of scalable machine learning & stochastic gradient
1
Readings
- Slides presented in this module
Scaling ML with stochastic gradient
3
Videos
- Why gradient ascent won't scale
- Stochastic gradient: Learning one data point at a time
- Comparing gradient to stochastic gradient
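To illustrate the idea of learning one data point at a time, here is a minimal sketch of stochastic gradient ascent for logistic regression. Full gradient ascent sums the contribution of every data point before updating the coefficients. This version updates after each single point. It assumes labels in {+1, -1} and shuffles the visiting order on each pass. All names, the step size, and the toy dataset are hypothetical.

```python
import math
import random


def sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


def stochastic_gradient_ascent(X, y, step_size, n_passes, seed=0):
    """Fit logistic regression coefficients one data point at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(n_passes):
        rng.shuffle(order)  # visit the data in a fresh random order
        for i in order:
            score = sum(wj * hj for wj, hj in zip(w, X[i]))
            indicator = 1.0 if y[i] == +1 else 0.0
            error = indicator - sigmoid(score)
            # The gradient contribution of this single point drives the update.
            for j in range(len(w)):
                w[j] += step_size * X[i][j] * error
    return w


# Tiny hypothetical dataset: [constant feature, one numeric feature]
X = [[1.0, 2.0], [1.0, 1.0], [1.0, -1.0], [1.0, -2.0]]
y = [+1, +1, -1, -1]
w = stochastic_gradient_ascent(X, y, step_size=0.1, n_passes=20)
print(w)
```

Each update touches only one row of the data, so the cost per update does not grow with the dataset size. The price is a noisier path toward the optimum.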
Understanding why stochastic gradient works
2
Videos
- Why would stochastic gradient ever work?
- Convergence paths
Stochastic gradient: Practical tricks
6
Videos
- Shuffle data before running stochastic gradient
- Choosing step size
- Don't trust last coefficients
- (OPTIONAL) Learning from batches of data
- (OPTIONAL) Measuring convergence
- (OPTIONAL) Adding regularization
Online learning: Fitting models from streaming data
2
Videos
- The online learning task
- Using stochastic gradient for online learning
Summarizing scaling to huge datasets & online learning
1
Assignment
- Scaling to Huge Datasets & Online Learning
1
Videos
- Scaling to huge datasets through parallelization & module recap
Programming Assignment
1
Assignment
- Training Logistic Regression via Stochastic Gradient Ascent
1
Readings
- Training Logistic Regression via Stochastic Gradient Ascent
Auto Summary
Explore the world of Machine Learning with a focus on Classification in this foundational course by Coursera. Guided by expert instructors, you'll dive into real-world case studies like sentiment analysis and loan default prediction. Learn to build advanced classifiers using techniques such as logistic regression, decision trees, and boosting. Over approximately 1260 minutes, you'll master handling data, implementing models at scale with stochastic gradient ascent, and evaluating them using precision-recall metrics. Ideal for aspiring data scientists and AI enthusiasts, the course offers hands-on experience with Python and optional advanced content for deeper learning.

Emily Fox

Carlos Guestrin