Neural System
Week 5 Exercise - Neural Network
In this exercise, you will use the RStudio interface to run the neural network method. You will run the method with different parameters and interpret the results, including the classification accuracy.
Exercise Instructions
Part 1 - Complete the exercise in the Word document on the diabetes dataset. Get familiar with the neuralnet method and its available input parameters.
Your results might differ slightly depending on your operating system and RStudio version. You do not need to write a report for this part.
Part 2 - Run the exercise on the Vertebral Column dataset (referred to below as the column dataset) and write a report on your findings and your interpretation of the results in your own words. The report needs to cover the exercise key points below, in order.
Download the CSV file to your hard drive. Click the dataset description URL below and read the dataset description. Note: I recoded the class labels so that normal is 0 and abnormal is 1.
http://archive.ics.uci.edu/ml/datasets/Vertebral+Column#
Exercise Key points
1. Introduction
· Which variable is the dependent variable in the column dataset?
· Which variables are the independent variables in the column dataset?
· What do you expect the neuralnet method to accomplish for the column data?
2. Data pre-processing - Load the data into RStudio, and discuss the data preprocessing steps you ran before running the neuralnet method. For each step, include the command you ran, the output, and an explanation of what the step accomplishes.
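As an illustration of what a preprocessing step can look like, the sketch below inspects a small made-up data frame and rescales its predictors to the range 0 to 1 with min-max scaling. Rescaling is a common step before training a neural network, but whether it is needed here is an assumption; the data frame `toy`, its columns, and the helper `min_max` are hypothetical and are not part of the column dataset.

```r
# Toy data frame standing in for a real dataset (all values are made up).
toy <- data.frame(
  x1 = c(10, 20, 30, 40, 50),
  x2 = c(1.5, 3.0, 4.5, 6.0, 7.5),
  class = c(0, 0, 1, 1, 1)
)

# Inspect the structure and count missing values per column.
str(toy)
colSums(is.na(toy))

# Min-max scaling: maps each predictor onto the range [0, 1].
# (Hypothetical helper; assumes the column is not constant.)
min_max <- function(v) (v - min(v)) / (max(v) - min(v))

# Rescale only the predictors; leave the class label untouched.
predictors <- c("x1", "x2")
toy_scaled <- toy
toy_scaled[predictors] <- lapply(toy[predictors], min_max)

summary(toy_scaled)
```

Only the predictor columns are rescaled, so the 0/1 class label keeps its original coding.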
3. Divide the data into training and test sets, and explain why we do that. Include the commands and explanation in the report. Remember to set the seed.
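A minimal sketch of a reproducible split, using only base R on a made-up data frame. The seed value 123, the 70/30 proportion, and the names `toy`, `train_idx`, `train`, and `test` are all assumptions chosen for illustration.

```r
# Fix the random number generator so the same rows are drawn on every run.
set.seed(123)  # arbitrary seed value

# Toy data frame standing in for a real dataset (values are made up).
toy <- data.frame(x = 1:20, class = rep(c(0, 1), times = 10))

# Draw 70% of the row indices at random, without replacement.
train_idx <- sample(nrow(toy), size = round(0.7 * nrow(toy)))

# Selected rows form the training set; the remaining rows form the test set.
train <- toy[train_idx, ]
test <- toy[-train_idx, ]

nrow(train)
nrow(test)
```

Without the call to `set.seed`, each run would draw different rows, so the reported accuracy would change from run to run.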
4. Running the method on a training set
· Run the neuralnet function to build the network, and store the result in a variable nn. Include the command in the report, and discuss the input parameters you used.
· Enter nn at the prompt and hit enter. Interpret the output. Include the command, the output, and output interpretation in the report.
· Run the command nn$result.matrix. Interpret the output. Include the command, the output, and output interpretation in the report.
· Run the command nn$net.result[[1]][1:10] to preview the first 10 predicted values. What do the values mean? Include the command, the output, and output interpretation in the report.
5. Network Visualization - Run the plot(nn) command to visualize the network, and interpret the model. If the plot is hard to read, you may need to click the Zoom button in the plot panel to view the model in a new window. Include the command you ran, the plot, and the plot interpretation in the report.
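To help with reading the weights shown on a network plot, the sketch below computes one prediction by hand for a tiny network with 2 inputs, 2 hidden nodes, and 1 output. All weights and input values are hypothetical, and the sketch assumes a logistic activation on both the hidden and output layers (in neuralnet this would correspond to setting linear.output = FALSE). The first row of each weight matrix is taken to hold the bias (intercept) terms.

```r
# Logistic (sigmoid) activation: squashes any real number into (0, 1).
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical weights: row 1 = bias terms, rows 2-3 = weights for inputs 1-2.
# Each column feeds one hidden node.
w_hidden <- matrix(c( 0.5, -0.5,
                      1.0,  2.0,
                     -1.0,  0.5), nrow = 3, byrow = TRUE)

# Hypothetical weights from (bias, hidden node 1, hidden node 2) to the output.
w_output <- matrix(c(-1.0, 2.0, 1.0), ncol = 1)

# One observation with two already-scaled predictor values (made up).
x <- c(0.2, 0.8)

# Forward pass: prepend 1 for the bias, multiply by the weights, activate.
hidden <- sigmoid(c(1, x) %*% w_hidden)
output <- sigmoid(c(1, hidden) %*% w_output)
output
```

The final value lies between 0 and 1 and can be read as how strongly the network favors class 1 for that observation.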
6. Confusion Matrix for the training data
· Run the following commands to extract and store the predicted values in a variable mypredict. Round the predictions to the nearest whole number. Display the first 10 values of mypredict. Explain what the values mean. Include the commands, the output, and output interpretation in the report.
mypredict<-compute(nn, nn$covariate)$net.result
mypredict<-apply(mypredict, c(1), round)
mypredict[1:10]
· Run the table command to build the confusion matrix. Explain what the matrix shows. Include the command you ran, the matrix, and matrix interpretation in the report.
7. Use the test data to evaluate the model.
· Use the test set to run the compute command, and store the predicted values in the variable testpred. Run a second command to round the predictions. Include both commands in the report.
· Run the table command to build the confusion matrix for the test data. Explain what the matrix shows. Include the command you ran, the matrix, and matrix interpretation in the report.
· Compare the classification accuracy for the test data with the classification accuracy for the training data.
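The confusion matrix and accuracy calculations in steps 6 and 7 can be sketched with base R as follows. The vectors `actual` and `raw_pred` are made-up values, not output from the column dataset, and the variable names are hypothetical.

```r
# Made-up true class labels and raw network outputs for eight observations.
actual <- c(0, 0, 1, 1, 1, 0, 1, 0)
raw_pred <- c(0.1, 0.6, 0.9, 0.8, 0.4, 0.2, 0.7, 0.3)

# Round each raw output to the nearest whole number to get a 0/1 class.
predicted <- round(raw_pred)

# Cross-tabulate actual vs. predicted classes to build the confusion matrix.
cm <- table(actual = actual, predicted = predicted)
cm

# Accuracy = correctly classified cases (the diagonal) / all cases.
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```

Running the same calculation on the training predictions and on the test predictions gives two accuracy values that can be compared directly. Note that the diagonal shortcut assumes both classes appear among the predictions; if the model never predicts one class, the table is not square.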
8. Different Input parameters.
· Repeat steps 4-7 above with different input parameters. You may change the number of hidden layers, the number of nodes in the hidden layer, or the subset of independent variables used. Include the commands you ran, the output, and the output interpretation in the report.
· Compare the results of the two runs. Which run has a higher classification accuracy and why?
9. Summary
· What differences between decision tree classification and neural network classification methods did you observe?
· Which part of this exercise did you find the most challenging, and what approach did you take to resolve the challenge?
Exercise Deliverables
Submit the following files in the Exercise 6 Assignment folder.
· The report addressing the key points above
· An R script with the commands you ran and brief comments on each command's purpose
Exercise Grading
· This exercise is worth 2% of the course grade.
· All questions must be answered in order and in your own words.
· Grammatical and spelling errors may affect the assignment grade.
*** Start working on this exercise early in the week to allow sufficient time for debugging any potential R errors. ***
This exercise may take 4-6 hours to complete. This estimate is a little higher than last week's estimate because of the complexity of the commands. Post your questions about this exercise in the Assignment 5 Questions and Answers discussion topic.