Redo Question Zeek
How to Evaluate a Predictive Model (a Classifier)?
Part-1 Using Performance (classification)
Let’s go back to the Second Heart Attack prediction dataset
Split the Training Data!
Use 80% of the training data to train a model
Hold out the remaining 20% of the training data to test the model
You can split the data into various ratios such as the following:
· 80/20 (80% Training a model/ 20% Testing that Model)
· 70/30 (70% Training a model/ 30% Testing that Model)
· 60/40 (60% Training a model/ 40% Testing that Model)
· 50/50 (50% Training a model/ 50% Testing that Model)
Note: In RapidMiner you express this split as fractions that add up to 1. Therefore, you write the above ratios as follows:
· 0.8/0.2 (80% Training a model/ 20% Testing that Model)
· 0.7/0.3 (70% Training a model/ 30% Testing that Model)
· 0.6/0.4 (60% Training a model/ 40% Testing that Model)
· 0.5/0.5 (50% Training a model/ 50% Testing that Model)
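Outside RapidMiner, the same split can be sketched in a few lines of Python. This is a minimal illustration, not the RapidMiner operator: the function name `split_data`, the `seed` parameter, and the 100 stand-in records are assumptions made up for this example.

```python
import random

def split_data(rows, train_ratio=0.8, seed=42):
    """Shuffle rows and split them into (train, test) by train_ratio."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for the patient records: 100 row ids.
patients = list(range(100))
for ratio in (0.8, 0.7, 0.6, 0.5):
    train, test = split_data(patients, ratio)
    print(f"{ratio:.1f}/{1 - ratio:.1f}: {len(train)} train, {len(test)} test")
```

Shuffling before cutting keeps the test rows from being simply the last rows of the file, which matters when the data is sorted by class.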
1- The Logistic Regression Model
1. The Accuracy for Logistic Regression is 92.86% (the model correctly predicted whether a patient will have a second heart attack for 26 of the 28 test patients)
2. Precision:
1) Predicted Yes: the model predicted that 16 patients would have a second heart attack, and 14 of them actually did (True Yes = True Positive). Class Precision for Pred. Yes is (14/16 = 87.5%)
2) Predicted No: the model predicted that 12 patients would not have a second heart attack, and all 12 of them actually did not (True No = True Negative). Class Precision for Pred. No is (12/12 = 100%)
3. Recall:
1) How many of the patients who are actually True Yes did we predict as Yes?
All of them (14 out of 14). Class Recall for True Yes is (14/14 = 100%)
2) How many of the patients who are actually True No did we predict as No?
Only 12 out of 14; the other 2 were wrongly predicted as Yes. Class Recall for True No is (12/14 = 85.71%)
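The arithmetic above can be checked with a short Python sketch. The four counts are read off the logistic regression results in this section; the variable names (`tp`, `fp`, `fn`, `tn`) are naming choices for this example, not RapidMiner output.

```python
# Counts taken from the logistic regression results above.
tp = 14  # predicted Yes, actually Yes (True Positive)
fp = 2   # predicted Yes, actually No  (16 predicted Yes minus 14 correct)
fn = 0   # predicted No,  actually Yes
tn = 12  # predicted No,  actually No  (True Negative)

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision_yes = tp / (tp + fp)  # of everything predicted Yes, how much is Yes
precision_no = tn / (tn + fn)   # of everything predicted No, how much is No
recall_yes = tp / (tp + fn)     # of everything actually Yes, how much we found
recall_no = tn / (tn + fp)      # of everything actually No, how much we found

print(f"Accuracy:      {accuracy:.2%}")       # 92.86%
print(f"Precision Yes: {precision_yes:.2%}")  # 87.50%
print(f"Precision No:  {precision_no:.2%}")   # 100.00%
print(f"Recall Yes:    {recall_yes:.2%}")     # 100.00%
print(f"Recall No:     {recall_no:.2%}")      # 85.71%
```

Precision divides by a row total of the confusion matrix (what was predicted), while recall divides by a column total (what is actually true).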
Practice!
Repeat the above steps for the following Models (LDA & Decision Trees).
2- Linear Discriminant Analysis Model
1. The Accuracy for Linear Discriminant Analysis is?
2. Precision:
1) Predicted Yes
2) Predicted No
3. Recall:
1) How many of the patients who are actually True Yes did we predict as Yes?
2) How many of the patients who are actually True No did we predict as No?
3- Decision Tree Model
1. The Accuracy for the Decision Tree is?
2. Precision:
1) Predicted Yes
2) Predicted No
3. Recall:
1) How many of the patients who are actually True Yes did we predict as Yes?
2) How many of the patients who are actually True No did we predict as No?
Confusion Matrix
Accuracy =

| | True Yes | True No | Class Precision |
| Pred. Yes | | | |
| Pred. No | | | |
| Class Recall | | | |

Accuracy =

| | True Yes | True No | Class Precision |
| Pred. Yes | | | |
| Pred. No | | | |
| Class Recall | | | |
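If you want to check a filled-in table, the cells can be counted from a list of actual labels and a list of predicted labels. The sketch below is one possible way to do that in Python; the helper names `confusion_counts` and `ratio` are hypothetical, and the labels are made up for illustration only (they are not taken from the heart attack dataset).

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count the four cells of a two-class confusion matrix."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def ratio(part, whole):
    """Format part/whole as a percentage, or 'n/a' for an empty row or column."""
    return f"{part / whole:.2%}" if whole else "n/a"

# Made-up labels for illustration only.
actual    = ["Yes", "Yes", "No", "No",  "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No"]
c = confusion_counts(actual, predicted)

# Print the counts in the same layout as the table above.
print(f"Accuracy = {ratio(c['tp'] + c['tn'], len(actual))}")
print(f"{'':<13}{'True Yes':>9}{'True No':>9}{'Class Precision':>17}")
print(f"{'Pred. Yes':<13}{c['tp']:>9}{c['fp']:>9}{ratio(c['tp'], c['tp'] + c['fp']):>17}")
print(f"{'Pred. No':<13}{c['fn']:>9}{c['tn']:>9}{ratio(c['tn'], c['tn'] + c['fn']):>17}")
print(f"{'Class Recall':<13}{ratio(c['tp'], c['tp'] + c['fn']):>9}{ratio(c['tn'], c['tn'] + c['fp']):>9}")
```

Rows hold the predictions and columns hold the true classes, matching the RapidMiner-style layout used in this handout.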
Part-2 Using ROC & Lift Chart
Practice!
Using the Marketing Campaign Dataset