Final Project (R programming)

harsh55
FinalProjectInstruction.docx

Final Project

Use any techniques (correlation, regression, machine learning, deep learning, etc.) to answer the following two questions:

· What factors significantly impact the amount of wines (AmountWines)?

· Due date: Monday 12/13/2021 11:59pm CT

· Data: Canvas-assignment-Final Project

· Submit your code via Canvas-assignment-Final Project

· Submission format: R script (.R file) and a single Excel summary output

You must follow the steps below; otherwise, your code won’t run on my machine.

· Manually set your working directory (from the toolbar: Session). DO NOT hardcode the file path.

· Use only the following code to read the data from your local drive:

data_independent <- read.csv(file = 'data - for student - independent variables.csv')

data_dependent <- read.csv(file = 'data - for student - dependent variable.csv')

y <- data_dependent$AmountWines

· Enter your name, such as:

first_name<-'Yan'

last_name<-'Lang'

· Assign all the independent variables you decide to use to an object named “new_data” (lower-case).

· Assign your fitted model to an object named “model”.

· Assign your y-predictions, based on the results of your model, to an object named “preds” (lower-case). There must be one prediction per row: if your data has 1500 rows, you should see 1500 predictions.

· Write the code below after calculating your predictions:

rmse<- sqrt(mean((y - preds)^2))
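As a rough illustration of how “new_data”, “model”, “preds”, and “rmse” fit together, here is a minimal sketch. It assumes a plain linear regression (lm), which is only one of the allowed techniques. The simulated data and the column names Income and Age are hypothetical stand-ins so the sketch runs on its own; in the real script, data_independent and data_dependent come from the read.csv lines given above.

```r
# Stand-in data (simulated, NOT the course data) so the sketch is runnable.
# Income and Age are hypothetical column names.
set.seed(1)
n <- 200
data_independent <- data.frame(
  Income = rnorm(n, mean = 50000, sd = 10000),
  Age    = sample(25:70, n, replace = TRUE)
)
data_dependent <- data.frame(
  AmountWines = 0.01 * data_independent$Income + rnorm(n, sd = 50)
)
y <- data_dependent$AmountWines

# Independent variables chosen for the model
new_data <- data_independent[, c("Income", "Age")]

# Linear regression of y on every column of new_data (one possible model)
model <- lm(y ~ ., data = new_data)

# One prediction per row of new_data
preds <- predict(model, newdata = new_data)

# Required line from the instructions
rmse <- sqrt(mean((y - preds)^2))

# Coefficient table with p-values, useful for judging which factors matter
summary(model)$coefficients
```

With a different technique (for example a tree-based model), only the model and preds lines would change; the rmse line stays as given.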

· Combine “first_name”, “last_name”, and “rmse” into one single csv output file, and name the csv file: finaloutput

· At the end of your script, write the following code (this line is for the TA; when testing your own code, do not run it):
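One possible way to produce that file is sketched below. The rmse value is a placeholder so the snippet runs on its own, the object name final_output is hypothetical, and the “.csv” extension on the file name is an assumption, since the instructions give only the name finaloutput.

```r
first_name <- 'Yan'
last_name  <- 'Lang'
rmse <- 12.34  # placeholder; in the real script this is computed from y and preds

# One row with three columns: first_name, last_name, rmse
final_output <- data.frame(first_name, last_name, rmse)

# Written to the current working directory, so no file path is hardcoded
write.csv(final_output, file = 'finaloutput.csv', row.names = FALSE)
```

Setting row.names = FALSE keeps the file to exactly the three requested columns.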

source("predictdata.R")

· What are your business suggestions/recommendations to the CEO?

· Due date: Monday 12/13/2021 11:59pm CT

· Submit your file via Canvas-assignment-Final Project

· Submission format: Single Word file

· Data visualization

· Executive summary reports

· etc.