R programming

jack12
  • 2 years ago

Guidelines:
➢ Use R and RStudio for this assignment (do not use Excel or any other software).
➢ Submit both the R code and a PDF report on your findings.
➢ Work is to be done individually for this assignment.

This exercise involves the Auto data set studied in the lab, which can be found in the file Auto.csv. Make sure that the missing values have been removed from the data.

1) Use the read.csv("Auto.csv", header=T, na.strings="?") function to read the data into R. Call the loaded data Auto. Make sure that the working directory is set to the location of the data file. (Hint: you should first set up your working directory.) Take a screenshot of your output.
2) Use the dim() function to find the number of rows and columns in this data set. Take a screenshot of your output and then answer the question.
3) Suppose mpg is our dependent variable. Which of the predictors in the Auto.csv data file are quantitative and which are qualitative?
4) Use the summary() function to produce a numerical summary of the variables in the data set. What is the range of each quantitative predictor? Take a screenshot of your output and then answer the question.
5) Use the sapply() function to produce the variance of each numerical variable. Take a screenshot of your output.
6) Use the pairs() function to produce a scatterplot matrix of the first seven columns (variables) of the data. What are the relationships between the variables? (That is, which variables do you think are positively or negatively correlated?) Take a screenshot of your output and answer the questions.
7) Use the hist() function to produce histograms for four variables (cylinders, displacement, horsepower and weight). You may find the command par(mfrow=c(2,2)) useful: it divides the plot window into four regions so that four plots can be shown at once. Take a screenshot of your output.
8) Which car has the highest mpg and which car has the lowest mpg? (Code Hint: Auto[which.min(Auto$mpg), ])
9) Create a new qualitative variable, called "FourCylinder", by binning the cylinders variable. We are going to divide cylinders into two groups based on whether the number of cylinders is equal to 4. After creating the new variable, you might use the as.factor() and summary() functions to see how many cars have 4 cylinders. Take a screenshot of your output and answer the question. (Hint: Refer to the code about Elite on page 55, Q8(c)iv, in the textbook.)
10) Now remove the 10th through 85th observations. What are the range, mean and standard deviation of each quantitative predictor in the subset of the data that remains?

What to submit:
1. R code.
   a. Should include all the code needed to accomplish the tasks.
   b. Clear and concise comments to indicate which part of the assignment each code chunk pertains to.
   c. Code should be easily readable.
   d. Filename should be in the format: LastnameFirstname_A1.R
2. Report.
   a. Take screenshots of your outputs in RStudio and answer all the questions.
   b. Submit in PDF format.
   c. Answer the questions clearly and concisely.
   d. Include appropriate plots. Make sure the plots are properly labeled.
   e. The assignment will be graded on the correctness of the answers, the comprehensiveness of the analysis, the clarity of the presentation of results and the neatness of the report.
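As a rough illustration of the non-plotting steps (1, 2, 4, 5, 8, 9 and 10), here is a minimal sketch in R. It does not use the real Auto.csv: it writes a few made-up rows to a temporary file so that it runs anywhere, and the column names are assumed to match those of the textbook's Auto data. The "toy car" rows, the csv_path and num_cols names, and the choice to drop rows 2 to 3 (standing in for rows 10 to 85) are all hypothetical; with the real file you would read "Auto.csv" from your working directory and use -(10:85).

```r
# Toy stand-in for Auto.csv: made-up rows, NOT the real data set.
# Column names are assumed to match the textbook's Auto data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "mpg,cylinders,displacement,horsepower,weight,acceleration,year,origin,name",
  "18,8,307,130,3504,12.0,70,1,toy car a",
  "25,4,113,?,2372,15.0,70,3,toy car b",
  "32,4,98,70,2120,15.5,72,1,toy car c",
  "14,8,455,225,4425,10.0,70,1,toy car d",
  "21,6,200,85,2587,16.0,70,1,toy car e"
), csv_path)

# 1) Read the file, treating "?" as missing, then drop incomplete rows.
Auto <- read.csv(csv_path, header = TRUE, na.strings = "?")
Auto <- na.omit(Auto)

# 2) Number of rows and columns.
dim(Auto)

# 4) and 5) Numerical summary, then range and variance of each numeric column.
# Note: origin is stored as a number even though it is a category code.
summary(Auto)
num_cols <- sapply(Auto, is.numeric)
sapply(Auto[, num_cols], range)
sapply(Auto[, num_cols], var)

# 8) Rows with the highest and the lowest mpg.
best  <- Auto[which.max(Auto$mpg), ]
worst <- Auto[which.min(Auto$mpg), ]

# 9) Bin cylinders into a two-level factor and count each level.
Auto$FourCylinder <- as.factor(ifelse(Auto$cylinders == 4, "Yes", "No"))
summary(Auto$FourCylinder)

# 10) Drop a block of rows with negative indexing
#     (rows 2 to 3 here; -(10:85) on the real data),
#     then compute min, max, mean and sd of each numeric column.
subset_auto <- Auto[-(2:3), ]
subset_num <- subset_auto[, sapply(subset_auto, is.numeric)]
sapply(subset_num, function(x) c(min = min(x), max = max(x),
                                 mean = mean(x), sd = sd(x)))
```

Referring to columns as Auto$mpg rather than bare mpg avoids depending on attach(Auto); the bare name only works once the data frame has been attached.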