- Level Professional
- Duration 17 hours
- Course by SAS
About
This course covers predictive modeling using SAS/STAT software, with emphasis on the LOGISTIC procedure. It also discusses selecting variables and interactions, recoding categorical variables based on the smoothed weight of evidence, assessing models, treating missing values, and using efficiency techniques for massive data sets. You learn to use logistic regression to model an individual's behavior as a function of known inputs, create effect plots and odds ratio plots, handle missing data values, and tackle multicollinearity in your predictors. You also learn to assess model performance and compare models.
Modules
Course Overview
1
Videos
- Meet the Instructor
3
Readings
- What You Learn in This Course
- Learner Prerequisites
- Using Forums and Getting Help
Logistics
3
Readings
- Access SAS Software for this Course
- Set Up Data for This Course (REQUIRED)
- About the Demos and Practices in this Course
Overview
1
Videos
- Overview
Predictive Modeling Fundamentals
2
Assignment
- Practice: Exploring the Bank Data for the Target Marketing Project
- Practice: Exploring the Veterans' Organization Data Used in the Practices
7
Videos
- Introduction
- Goals of Predictive Modeling
- Terms for Elements in Predictive Modeling
- Basic Steps of Predictive Modeling
- Applications of Predictive Modeling
- Demonstration Scenario: Target Marketing for a Bank
- Demo: Examining the Code for Generating Descriptive Statistics and Frequency Tables
Predictive Modeling Challenges
4
Assignment
- Question 1.01
- Question 1.02
- Question 1.03
- Practice: Splitting the Data
7
Videos
- Introduction
- Data Challenges
- Analytical Challenges
- Separate Sampling
- Avoiding the Optimism Bias: Honest Assessment
- Splitting the Data for Model Training and Assessment
- Demo: Splitting the Data
Review
1
Readings
- Summary
Overview
1
Videos
- Overview
Understanding the Logistic Regression Model
2
Assignment
- Question 2.01
- Question 2.02
13
Videos
- Introduction
- Understanding the Logistic Regression Model
- Constraining the Posterior Probability Using the Logit Transformation
- Understanding the Fitted Surface
- Interpreting the Model by Calculating the Odds Ratio
- Understanding Logistic Discrimination
- Estimating Unknown Parameters Using Maximum Likelihood Estimation
- Interpreting Concordant, Discordant, and Tied Pairs
- Using PROC LOGISTIC to Fit Logistic Regression Models
- Demo: Fitting a Basic Logistic Regression Model, Part 1
- Demo: Fitting a Basic Logistic Regression Model, Part 2
- Scoring New Cases
- Demo: Scoring New Cases
Correcting for Oversampling
1
Assignment
- Practice: Fitting a Logistic Regression Model
4
Videos
- Introduction
- Understanding the Effect of Oversampling
- Understanding the Offset
- Demo: Correcting for Oversampling
Review
1
Assignment
- Fitting the Model Review
1
Readings
- Summary
Overview
1
Videos
- Overview
Handling Missing Values
2
Assignment
- Question 3.01
- Practice: Imputing Missing Values
7
Videos
- Introduction
- Reasons for Missing Data
- Complete Case Analysis
- Methods for Imputing Missing Values
- Missing Value Imputation with Missing Value Indicator Variables
- Demo: Imputing Missing Values
- Cluster Imputation
Working with Categorical Inputs
5
Assignment
- Question 3.02
- Question 3.03
- Question 3.04
- Practice: Collapsing the Levels of a Nominal Input
- Practice: Computing the Smoothed Weight of Evidence
10
Videos
- Introduction
- Problems Caused by Categorical Inputs
- Solutions to Problems Caused by Categorical Inputs
- Linking to Other Data Sets
- Collapsing Categories by Thresholding
- Collapsing Categories by Using Greenacre's Method
- Demo: Collapsing the Levels of a Nominal Input, Part 1
- Demo: Collapsing the Levels of a Nominal Input, Part 2
- Replacing Categorical Levels by Using Smoothed Weight-of-Evidence Coding
- Demo: Computing the Smoothed Weight of Evidence
Reducing Redundancy by Clustering Variables
2
Assignment
- Question 3.05
- Practice: Reducing Redundancy by Clustering Variables
8
Videos
- Introduction
- Problem of Redundancy
- Variable Clustering Method
- Understanding Principal Components
- Divisive Clustering
- PROC VARCLUS Syntax
- Selecting a Representative Variable from Each Cluster
- Demo: Reducing Redundancy by Clustering Variables
Performing Variable Screening
5
Assignment
- Question 3.06
- Practice: Performing Variable Screening
- Practice: Creating Empirical Logit Plots
- Question 3.07
- Question 3.08
9
Videos
- Introduction
- Detecting Nonlinear Relationships
- Demo: Performing Variable Screening, Part 1
- Demo: Performing Variable Screening, Part 2
- Univariate Binning and Smoothing
- Demo: Creating Empirical Logit Plots
- Remedies for Nonlinear Relationships
- Demo: Accommodating a Nonlinear Relationship, Part 1
- Demo: Accommodating a Nonlinear Relationship, Part 2
Selecting Variables Sequentially
6
Assignment
- Question 3.09
- Practice: Using Forward Selection to Detect Interactions
- Question 3.10
- Practice: Using Backward Elimination to Subset the Variables
- Question 3.11
- Practice: Using Fit Statistics to Select a Model
14
Videos
- Introduction
- Specifying a Subset Selection Method in PROC LOGISTIC
- Best-Subsets Selection
- Stepwise Selection
- Backward Elimination
- Scalability of the Subset Selection Methods in PROC LOGISTIC
- Detecting Interactions
- BIC-based Significance Level
- Demo: Detecting Interactions
- Demo: Using Backward Elimination to Subset the Variables
- Demo: Displaying Odds Ratios for Variables Involved in Interactions
- Demo: Creating an Interaction Plot
- Demo: Using the Best-Subsets Selection Method
- Demo: Using Fit Statistics to Select a Model
Review
1
Assignment
- Preparing the Input Variables Review
1
Readings
- Summary of Preparing the Input Variables, Parts 1 and 2
Overview
1
Videos
- Overview
Honest Assessment of the Model
1
Assignment
- Question 4.01
4
Videos
- Introduction
- Fit versus Complexity
- Assessing Models when Target Event Data Is Rare
- Demo: Preparing the Validation Data
Common Metrics for Model Performance
3
Assignment
- Question 4.02
- Question 4.03
- Practice: Assessing Model Performance
7
Videos
- Introduction
- Understanding the Confusion Matrix
- Measuring Performance across Cutoffs by Using the ROC Curve
- Choosing Depth by Using the Gains Chart
- Effects of Oversampled Data on Performance Measures
- Adjusting a Confusion Matrix for Oversampling
- Demo: Measuring Model Performance Based on Commonly-Used Metrics
Profit-Based Metrics
2
Assignment
- Question 4.04
- Question 4.05
8
Videos
- Introduction
- Understanding the Effect of Cutoffs on Confusion Matrices
- Understanding the Profit Matrix
- Choosing the Optimal Cutoff by Using the Profit Matrix
- Using the Central Cutoff
- Using Profit to Assess Fit
- Calculating Sampling Weights
- Demo: Using a Profit Matrix to Measure Model Performance
Kolmogorov-Smirnov Statistic
1
Assignment
- Question 4.06
4
Videos
- Introduction
- Plotting Class Separation
- Assessing Overall Predictive Power
- Demo: Using the K-S Statistic to Measure Model Performance
Model Selection Plots
1
Assignment
- Question 4.07
6
Videos
- Introduction
- Comparing ROC Curves of Several Models
- Demo: Comparing ROC Curves to Measure Model Performance
- Using Macros to Compare Many Models
- Demo: Comparing and Evaluating Many Models, Part 1
- Demo: Comparing and Evaluating Many Models, Part 2
Review
1
Assignment
- Measuring Model Performance Review
1
Readings
- Summary
Certification Practice Exam
1
External Tool
- Access the Practice Exam
1
Readings
- About the Certification Exam
Auto Summary
Dive into the world of predictive modeling with SAS in this professional-level course on logistic regression. Offered by Coursera, the course focuses on using the LOGISTIC procedure in SAS/STAT software to model behaviors, assess models, and handle missing data and multicollinearity. It runs about 17 hours (1020 minutes) and includes practical techniques for managing massive data sets. Subscription options include Starter and Professional. The course suits data science and AI enthusiasts looking to enhance their skills.

Marc Huber