Final Project (R programming)
Use any technique (correlation, regression, machine learning, deep learning, etc.) to answer the following two questions:
· What factors significantly impact the amount of wines (AmountWines)?
· Due date: Monday 12/13/2021 11:59pm CT
· Data: Canvas-assignment-Final Project
· Submit your code via Canvas-assignment-Final Project
· Submission format: R script and a single Excel summary output
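As a possible starting point for the first question, the sketch below checks which predictors are related to AmountWines, using pairwise correlations and the p-values from a linear regression. It is only an illustration under stated assumptions: the data are synthetic, the predictor names Income and Age are hypothetical and not taken from the course files, and the 0.05 threshold is a common convention rather than a requirement of this project.

```r
# Synthetic stand-in data; the real columns come from the course CSV files.
set.seed(1)
n <- 200
toy <- data.frame(
  Income = rnorm(n, mean = 50000, sd = 10000),  # hypothetical predictor
  Age    = rnorm(n, mean = 45, sd = 12)         # hypothetical predictor
)
# AmountWines is built to depend on Income only, so Age acts as a noise variable.
toy$AmountWines <- 0.01 * toy$Income + rnorm(n, sd = 50)

# Pairwise correlation of each predictor with the response.
cors <- sapply(toy[c("Income", "Age")], cor, y = toy$AmountWines)
print(cors)

# Linear regression; the coefficient table reports a p-value per predictor.
fit <- lm(AmountWines ~ Income + Age, data = toy)
coef_table <- summary(fit)$coefficients
print(coef_table)

# Predictors whose p-value falls below the chosen threshold (0.05 here).
p_values <- coef_table[-1, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]
print(significant)
```

The same pattern applies to other model types, although the way variable importance is measured will differ (for example, importance scores in tree-based models instead of p-values).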
You must follow the steps below. Otherwise, your code won’t work on my machine.
· Manually set your working directory (from the toolbar: Session). DO NOT hardcode the file path.
· Only use the following code to read data from your local drive
data_independent <- read.csv(file = 'data - for student - independent variables.csv')
data_dependent <- read.csv(file = 'data - for student - dependent variable.csv')
y <- data_dependent$AmountWines
· Enter your name, such as:
first_name <- 'Yan'
last_name <- 'Lang'
· Assign all independent variables you decide to use to “new_data” (lower-case)
· Assign your fitted model to “model”
· Assign your y-predictions, based on the results of your model, to “preds” (lower-case). There must be one prediction per row: if your data has 1500 rows, then you should see 1500 predictions.
· Write the code below after calculating your predictions
rmse <- sqrt(mean((y - preds)^2))
· Combine “first_name”, “last_name”, and “rmse” into one single csv output file, and name the csv file: finaloutput
· At the end of your script, write the following code (this line is for the TA; do not run it when testing your own code)
source("predictdata.R")
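The steps above can be sketched end to end as follows. This is a minimal illustration, not a model answer: the two data frames are synthetic stand-ins for the course CSV files so that the example runs on its own, the columns Income and Age are hypothetical, lm() is just one of the allowed model types, and the file name finaloutput.csv assumes that "finaloutput" is meant to carry a .csv extension. In an actual submission the data must be read with the read.csv lines given above.

```r
# Synthetic stand-ins for the two course CSV files (hypothetical columns).
set.seed(42)
n <- 1500
data_independent <- data.frame(Income = rnorm(n, 50000, 10000),
                               Age    = rnorm(n, 45, 12))
data_dependent <- data.frame(
  AmountWines = 0.01 * data_independent$Income + rnorm(n, sd = 50))
y <- data_dependent$AmountWines

first_name <- 'Yan'
last_name  <- 'Lang'

# Independent variables chosen for the model.
new_data <- data_independent[, c("Income", "Age")]

# Any model type is allowed; lm() is used here only as an example.
model <- lm(y ~ ., data = new_data)

# One prediction per row of new_data.
preds <- predict(model, newdata = new_data)
stopifnot(length(preds) == nrow(new_data))

rmse <- sqrt(mean((y - preds)^2))

# One-row summary written to finaloutput.csv in the working directory.
finaloutput <- data.frame(first_name, last_name, rmse)
write.csv(finaloutput, file = "finaloutput.csv", row.names = FALSE)
```

Because the file is written with a relative path, it lands in the working directory set from the Session menu, which keeps the script free of hardcoded paths.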
· What are your business suggestions/recommendations to the CEO?
· Due date: Monday 12/13/2021 11:59pm CT
· Submit your file via Canvas-assignment-Final Project
· Submission format: Single Word file
· Data visualization
· Executive summary reports
· etc.