How to Evaluate a Predictive Model (a Classifier)?

Part-1 Using Performance (classification)

Let’s go back to the Second Heart Attack prediction dataset.

Split the Training Data!

Use 80% of the training data to train a model.

Hold out the other 20% of the training data to test that model.

You can split the data into various ratios such as the following:

· 80/20 (80% Training a model/ 20% Testing that Model)

· 70/30 (70% Training a model/ 30% Testing that Model)

· 60/40 (60% Training a model/ 40% Testing that Model)

· 50/50 (50% Training a model/ 50% Testing that Model)

Note: In RapidMiner you express this split as fractions that sum to 1. Therefore, you write the above ratios as follows:

· 0.8/0.2 (80% Training a model/ 20% Testing that Model)

· 0.7/0.3 (70% Training a model/ 30% Testing that Model)

· 0.6/0.4 (60% Training a model/ 40% Testing that Model)

· 0.5/0.5 (50% Training a model/ 50% Testing that Model)
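Outside RapidMiner, the same idea can be sketched in a few lines of plain Python. This is only an illustration: the function name `split_data`, the `train_ratio` parameter, the fixed seed, and the stand-in list of 100 records are assumptions made for this example, not part of RapidMiner or of the heart-attack dataset.

```python
import random

def split_data(rows, train_ratio=0.8, seed=42):
    """Shuffle rows and split them into (train, test) according to train_ratio."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for 100 patient records.
patients = list(range(100))
train, test = split_data(patients, train_ratio=0.8)
print(len(train), len(test))  # 80 20
```

Shuffling before cutting matters: if the rows happen to be sorted (for example, all "Yes" patients first), an unshuffled split would give the model a distorted training set.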

1- The Logistic Regression Model

1. The Accuracy for Logistic Regression is 92.86% (the model correctly predicted whether a patient would have a second heart attack for 26 of the 28 test patients: 26/28 = 92.86%)

2. Precision:

1) Predicted Yes = 16 patients were predicted to have a second heart attack, and 14 of them actually did (True Yes = True Positive). Class Precision for Pred. Yes is (14/16 = 87.5%)

2) Predicted No = 12 patients were predicted not to have a second heart attack, and all 12 of them actually did not (True No = True Negative). Class Precision for Pred. No is (12/12 = 100%)

3. Recall:

1) Of the patients who actually had a second heart attack (True Yes), how many did the model predict as Yes?

All of them (14 out of 14). Class Recall for True Yes is (14/14 = 100%)

2) Of the patients who actually did not have a second heart attack (True No), how many did the model predict as No?

Only 12 out of 14; the other 2 were wrongly predicted as Yes. Class Recall for True No is (12/14 = 85.71%)
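The arithmetic above can be checked with a short Python sketch. The four counts come from the Logistic Regression results in this section (16 predicted Yes, of which 14 were correct; 12 predicted No, all correct); the variable names `tp`, `fp`, `fn`, and `tn` are just conventional shorthand chosen for this example.

```python
# Counts taken from the Logistic Regression results above.
tp = 14  # predicted Yes, actually Yes (true positives)
fp = 2   # predicted Yes, actually No  (false positives)
fn = 0   # predicted No,  actually Yes (false negatives)
tn = 12  # predicted No,  actually No  (true negatives)

total = tp + fp + fn + tn

accuracy = (tp + tn) / total     # share of all predictions that were correct
precision_yes = tp / (tp + fp)   # of those predicted Yes, how many were Yes
precision_no = tn / (tn + fn)    # of those predicted No, how many were No
recall_yes = tp / (tp + fn)      # of the actual Yes, how many were found
recall_no = tn / (tn + fp)       # of the actual No, how many were found

print(f"Accuracy:        {accuracy:.2%}")       # 92.86%
print(f"Precision (Yes): {precision_yes:.2%}")  # 87.50%
print(f"Precision (No):  {precision_no:.2%}")   # 100.00%
print(f"Recall (Yes):    {recall_yes:.2%}")     # 100.00%
print(f"Recall (No):     {recall_no:.2%}")      # 85.71%
```

Note that precision divides by a row total of the confusion matrix (everything predicted as that class), while recall divides by a column total (everything that truly belongs to that class).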

Practice!

Repeat the above steps for the following Models (LDA & Decision Trees).

2- Linear Discriminant Analysis Model

1. The Accuracy for Linear Discriminant Analysis is?

2. Precision:

1) Predicted True Yes

2) Predicted True No

3. Recall:

1) Of the patients who are actually True Yes, how many did the model predict as Yes?

2) Of the patients who are actually True No, how many did the model predict as No?

3- Decision Tree Model

1. The Accuracy for the Decision Tree is?

2. Precision:

1) Predicted True Yes

2) Predicted True No

3. Recall:

1) Of the patients who are actually True Yes, how many did the model predict as Yes?

2) Of the patients who are actually True No, how many did the model predict as No?

Confusion Matrix

Accuracy = ____

                True Yes    True No    Class Precision
Pred. Yes       ____        ____       ____
Pred. No        ____        ____       ____
Class Recall    ____        ____

Accuracy = ____

                True Yes    True No    Class Precision
Pred. Yes       ____        ____       ____
Pred. No        ____        ____       ____
Class Recall    ____        ____
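To see where the cells of a confusion matrix come from, the following Python sketch tallies (predicted, actual) pairs from two label lists. The function name `confusion_matrix` is a hypothetical helper written for this example, and the two lists are not the real dataset: they are constructed only so that the counts match the Logistic Regression results earlier in this handout.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (predicted, actual) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(predicted, actual))

# Illustrative labels only: 14 actual Yes followed by 14 actual No,
# with the model predicting Yes for the first 16 and No for the last 12.
actual = ["Yes"] * 14 + ["No"] * 14
predicted = ["Yes"] * 16 + ["No"] * 12

cm = confusion_matrix(actual, predicted)

# Print the matrix with predictions as rows and true classes as columns.
labels = ["Yes", "No"]
print(f"{'':<10}{'True Yes':>10}{'True No':>10}")
for p in labels:
    cells = "".join(f"{cm[(p, a)]:>10}" for a in labels)
    print(f"{'Pred. ' + p:<10}{cells}")
```

A `Counter` returns 0 for pairs that never occur, so an empty cell (here Pred. No / True Yes) needs no special handling. The same tally can be filled in by hand for the LDA and Decision Tree practice exercises.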

Part-2 Using ROC & Lift Chart

Practice!

Using the Marketing Campaign Dataset
