Wk6 DQ - Advanced Statistical Concepts and Business Analytics

bowerman_9e_chap_16.pptx

Chapter 16

Predictive Analytics II: Logistic Regression, Discriminate Analysis, and Neural Networks

Copyright ©2018 McGraw-Hill Education. All rights reserved.

Chapter Outline

16.1 Logistic Regression

16.2 Linear Discriminate Analysis

16.3 Neural Networks

16.1 Logistic Regression

The general logistic regression model relates the probability that an event will occur to k independent variables

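The general model is

$$p(x_1, \ldots, x_k) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$

where p(x_1, ..., x_k) is the probability that the event occurs at the given values of the independent variables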

Y is a dummy variable that equals one if the event has occurred and zero otherwise

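The odds are the probability of success p divided by the probability of failure 1 - p

Equation is

$$\text{odds} = \frac{p}{1 - p} = e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}$$

The odds ratio for x_j is e^{\beta_j}, where \beta_j is the model coefficient of x_j: the factor by which the odds are multiplied when x_j increases by one unit and the other variables are held fixed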

LO16-1: Use a logistic model to estimate probabilities and odds ratios.
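
As a rough illustration of LO16-1, the sketch below computes a probability, the odds, and an odds ratio, assuming the standard logistic form p = e^L / (1 + e^L) with L = b0 + b1*x1 + ... + bk*xk. The coefficient values and function names are hypothetical and are not taken from Figure 16.1 or Figure 16.3.

```python
import math


def logistic_probability(intercept, coefs, xs):
    """Return p = e^L / (1 + e^L), where L = b0 + b1*x1 + ... + bk*xk."""
    linear = intercept + sum(b * x for b, x in zip(coefs, xs))
    # 1 / (1 + e^-L) is algebraically the same as e^L / (1 + e^L).
    return 1.0 / (1.0 + math.exp(-linear))


def odds(p):
    """Odds = probability of success / probability of failure."""
    return p / (1.0 - p)


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in x_j."""
    return math.exp(coef)


# Hypothetical one-predictor model (assumed values, for illustration only).
b0, b = -2.0, [0.5]
p_at_3 = logistic_probability(b0, b, [3.0])
p_at_4 = logistic_probability(b0, b, [4.0])

print("p at x=3:", round(p_at_3, 4))
print("p at x=4:", round(p_at_4, 4))
print("ratio of odds:", round(odds(p_at_4) / odds(p_at_3), 4))
print("e^b1:", round(odds_ratio(b[0]), 4))
```

The last two printed values agree, which is the sense in which e^b1 is the odds ratio for x1.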

LO16-1

Logistic Regression of the Price Reduction Data

Figure 16.1

LO16-1

Logistic Regression of the Performance Data

Figure 16.3

16.2 Linear Discriminate Analysis

Classifies an observation and estimates the probability that the observation will fall into a particular class

For each class, calculate the squared distance between the observation’s predictor variable values and the class’s predictor variable means

The observation is assigned to the class with the smallest squared distance

Easiest classification analytic to use when there are more than two classes

LO16-2: Use linear discriminate analysis to classify observations and estimate probabilities.
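
The classification rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure behind Figure 16.11: it uses exactly two predictors, measures squared distance with a pooled within-class covariance matrix (a Mahalanobis-type distance), and assumes equal prior probabilities and normally distributed predictors when turning distances into class probabilities. The data and function names are made up.

```python
import math


def mean_vector(rows):
    """Mean of each of the two predictors over a class's observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(groups):
    """Pooled 2x2 covariance: within-class cross products / (N - number of classes)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    total = 0
    for rows in groups.values():
        m = mean_vector(rows)
        total += len(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = total - len(groups)
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]


def squared_distance(x, mean, cov):
    """(x - mean)' * inverse(cov) * (x - mean), using the 2x2 inverse formula."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [x[0] - mean[0], x[1] - mean[1]]
    return sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))


def classify(x, groups):
    """Return the class with the smallest squared distance, plus all distances."""
    cov = pooled_covariance(groups)
    distances = {label: squared_distance(x, mean_vector(rows), cov)
                 for label, rows in groups.items()}
    return min(distances, key=distances.get), distances


def class_probabilities(distances):
    """Equal-prior posterior probabilities, proportional to exp(-distance / 2)."""
    weights = {k: math.exp(-0.5 * d) for k, d in distances.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Made-up training data with three classes and two predictors.
groups = {
    "A": [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)],
    "B": [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)],
    "C": [(1.0, 8.0), (2.0, 7.0), (2.0, 9.0), (3.0, 8.0)],
}
label, distances = classify((6.5, 6.0), groups)
probs = class_probabilities(distances)
print(label, distances, probs)
```

Because every class is handled by the same distance calculation, adding a fourth or fifth class only means adding another entry to `groups`.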

LO16-2

Results of a Linear Discriminate Analysis

Figure 16.11 partial

16.3 Neural Networks

The regression techniques covered so far were developed for data sets of n = 1,000 or fewer observations

Not uncommon for data mining projects to have millions of observations

Neural network modeling developed to handle large data sets

Idea is to represent the response variable as a nonlinear function of linear combinations of the predictor variables

Most common is the single-hidden-layer, feedforward neural network

LO16-3: Use neural network modeling to estimate probabilities and predict values of quantitative response variables.

LO16-3

Single-Hidden-Layer, Feedforward Neural Network

An input layer consisting of predictor variables x1, x2, …, xk

A single hidden layer consisting of m hidden nodes

An output layer where we form a linear combination L of the m hidden node functions
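
The three layers can be sketched as a forward pass. This is a minimal illustration, not a fitted model: the hidden node function is assumed to be tanh (one common choice), the logistic transform of L for a 0/1 response is likewise an assumption, and all weights and names are hypothetical.

```python
import math


def hidden_node(weights, bias, xs):
    """Nonlinear function (tanh, assumed) of a linear combination of the predictors."""
    linear = bias + sum(w * x for w, x in zip(weights, xs))
    return math.tanh(linear)


def forward(xs, hidden_params, output_weights, output_bias):
    """Input layer xs -> m hidden nodes -> linear combination L of the hidden nodes."""
    hidden_values = [hidden_node(w, b, xs) for w, b in hidden_params]
    return output_bias + sum(c * h for c, h in zip(output_weights, hidden_values))


def to_probability(L):
    """For a 0/1 response, map L to a probability with the logistic function (assumed)."""
    return 1.0 / (1.0 + math.exp(-L))


# Hypothetical network: k = 2 predictors, m = 2 hidden nodes.
hidden_params = [([1.0, -1.0], 0.0), ([0.5, 0.5], 0.0)]  # (weights, bias) per hidden node
output_weights, output_bias = [2.0, -1.0], 0.0

L_zero = forward([0.0, 0.0], hidden_params, output_weights, output_bias)
L_other = forward([1.0, 0.0], hidden_params, output_weights, output_bias)
print("L at (0, 0):", L_zero, "probability:", to_probability(L_zero))
print("L at (1, 0):", L_other, "probability:", to_probability(L_other))
```

For a quantitative response, L itself would serve as the predicted value; the logistic step applies only when a probability is wanted.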

LO16-3

The Single Layer Perceptron

Figure 16.17

LO16-3

Neural Networks Continued

Because a neural network model employs many parameters, we say it is overparametrized

There is a danger we will overfit the model

Model finds parameter estimates that minimize a penalized least squares criterion

The penalty equals λ times the sum of the squared values of the parameter estimates

The penalty weight λ controls the tradeoff between overfitting and underfitting
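
A small sketch of the penalized least squares criterion may help show the tradeoff. It assumes the criterion is the sum of squared errors plus the penalty weight times the sum of squared parameter estimates; the data, the two candidate parameter sets, and the function name are hypothetical.

```python
def penalized_sse(ys, preds, params, penalty_weight):
    """Sum of squared errors plus penalty_weight * sum of squared parameter estimates."""
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    penalty = penalty_weight * sum(t ** 2 for t in params)
    return sse + penalty


ys = [1.0, 2.0, 3.0]

# Candidate 1: fits the data closely but uses large parameter estimates.
preds_tight, params_tight = [1.0, 2.0, 3.0], [4.0, -3.0]
# Candidate 2: fits less closely but uses small parameter estimates.
preds_loose, params_loose = [1.5, 2.0, 2.5], [1.0, -2.0]

for weight in (0.0, 0.1):
    tight = penalized_sse(ys, preds_tight, params_tight, weight)
    loose = penalized_sse(ys, preds_loose, params_loose, weight)
    print("penalty weight", weight, "tight:", tight, "loose:", loose)
```

With a penalty weight of zero the closely fitting candidate has the lower criterion; with a weight of 0.1 the candidate with smaller parameters does, which is how a larger penalty weight guards against overfitting at the cost of a looser fit.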
