Project One


Project One: Multiple Regression, Qualitative Variables Interactions, Quadratic Regression

For Project One, you have been asked to create different regression models analyzing a housing data set. Before beginning work on the project, be sure to read through the Project One Guidelines and Rubric to understand what you need to do and how you will be graded on this assignment. Be sure to carefully review the Project One Summary Report template, which contains all of the questions that you will need to answer about the regression analyses you are performing.

For this project, you will be writing all the scripts yourself. You may reference the textbook and your previous work on the problem sets to help you write the scripts.

Scenario

You are a data analyst working for a real estate company. You have access to a large set of historical data that you can use to analyze relationships between different attributes of a house (such as square footage or the number of bathrooms) and the house’s selling price. You have been asked to create different regression models to predict sale prices for houses based on key predictor variables. These regression models will help your company set better prices when listing a home for a client. Setting better prices will ensure that listings can be sold within a reasonable amount of time.

There are several variables in this data set, but you will be working with the following important variables:

Variable         What does it represent?
price            Sale price of the home
bedrooms         Number of bedrooms
bathrooms        Number of bathrooms
sqft_living      Size of the living area in sqft
sqft_above       Size of the upper level in sqft
sqft_lot         Size of the lot in sqft
age              Age of the home
grade            Measure of craftsmanship and the quality of materials used to build the home
appliance_age    Average age of all appliances in the home
crime            Crime rate per 100,000 people
backyard         Home has a backyard (backyard=1) or not (backyard=0)
view             Home backs out to a lake (view=2), backs out to trees (view=1), or backs out to a road (view=0)

Prepare Your Data Set

In the following code block, you have been given the R code to prepare your data set.

Click the Run button on the toolbar to run this code.

In [1]:

housing <- read.csv(file="housing.csv", header=TRUE, sep=",")
# converting appropriate variables to factors  
housing <- within(housing, {
  view <- factor(view)
  backyard <- factor(backyard)
})
# number of columns
ncol(housing)
# number of rows
nrow(housing)

22
2692

Model #1 - First Order Regression Model with Quantitative and Qualitative Variables

You have been asked to create a first order regression model for price as the response variable, and sqft_living, grade, bathrooms, and view as predictor variables. Before writing any code, review Section 3 of the Summary Report template to see the questions you will be answering about your first order multiple regression model.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

Note: Use the + (plus) button to add new code blocks, if needed.

In [ ]:

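# Quantitative variables used in Model #1
# (view is a factor, so it is left out of the correlation matrix)
myvars <- c("price", "sqft_living", "grade", "bathrooms")
housing_subset <- housing[myvars]
# Print the first six rows
print("head")
head(housing_subset, 6)
# Print the correlation matrix
print("cor")
corr_matrix <- cor(housing_subset, method = "pearson")
round(corr_matrix, 4)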
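
The first order model described for Model #1 could be fit along the following lines. This is a minimal sketch, not the assignment's required script: it assumes the housing data frame from the data preparation step, and if that data frame is not loaded it builds a small synthetic stand-in with made-up numbers so the block still runs. The name model1 and every value in the stand-in are illustrative assumptions.

```r
# Columns that Model #1 needs
needed <- c("price", "sqft_living", "grade", "bathrooms", "view")

# ASSUMPTION: if the real housing data is not loaded, create a synthetic
# stand-in (made-up numbers, for illustration only) with the same column names.
if (!exists("housing") || !all(needed %in% names(housing))) {
  set.seed(42)
  n <- 300
  housing <- data.frame(
    sqft_living = round(runif(n, 800, 4000)),
    grade       = sample(5:12, n, replace = TRUE),
    bathrooms   = sample(1:4, n, replace = TRUE),
    view        = factor(sample(0:2, n, replace = TRUE))
  )
  housing$price <- 50000 + 150 * housing$sqft_living + 20000 * housing$grade +
    10000 * housing$bathrooms + 30000 * (housing$view == "2") +
    rnorm(n, sd = 40000)
}

# First order model: three quantitative predictors plus the qualitative view
model1 <- lm(price ~ sqft_living + grade + bathrooms + view, data = housing)

# Coefficient estimates, t-tests, R-squared, and the overall F-test
summary(model1)
```

Because view is a factor with three levels, lm() represents it with two dummy variables (view1 and view2), with view=0 as the baseline level.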

In [ ]:

In [ ]:

In [ ]:

In [ ]:

Model #2 - Complete Second Order Regression Model with Quantitative Variables

You have been asked to create a complete second order regression model for price as the response variable, and appliance_age and crime as predictor variables. Before writing any code, review Section 4 of the Summary Report template to see the questions you will be answering about your complete second order multiple regression model.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

Note: Use the + (plus) button to add new code blocks, if needed.
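
A complete second order model in two quantitative predictors contains both first order terms, their interaction, and both squared terms: E(price) = b0 + b1*appliance_age + b2*crime + b3*appliance_age*crime + b4*appliance_age^2 + b5*crime^2. The sketch below shows one way this might be written in R. It assumes the housing data frame from the data preparation step and falls back to a synthetic stand-in with made-up numbers if that data is not available; the name model2 is an illustrative assumption.

```r
# Columns that Model #2 needs
needed <- c("price", "appliance_age", "crime")

# ASSUMPTION: synthetic stand-in (made-up numbers, for illustration only)
# used only when the real housing data is not loaded.
if (!exists("housing") || !all(needed %in% names(housing))) {
  set.seed(42)
  n <- 300
  housing <- data.frame(
    appliance_age = runif(n, 1, 20),
    crime         = runif(n, 50, 500)
  )
  housing$price <- 700000 - 8000 * housing$appliance_age - 600 * housing$crime +
    5 * housing$appliance_age * housing$crime +
    200 * housing$appliance_age^2 + 0.5 * housing$crime^2 +
    rnorm(n, sd = 30000)
}

# Complete second order model: main effects, interaction, and squared terms.
# I() is needed so that ^ means arithmetic squaring inside the formula.
model2 <- lm(price ~ appliance_age + crime + appliance_age:crime +
               I(appliance_age^2) + I(crime^2), data = housing)

summary(model2)
```

Without I(), the formula operator ^ is interpreted as crossing terms rather than squaring, so the quadratic terms would be silently dropped.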

In [ ]:

In [ ]:

In [ ]:

In [ ]:

In [ ]:

Nested Models F-Test

You have been asked to create a reduced model and compare it with the complete second order model (Model #2 above). Before writing any code, review Section 5 of the Summary Report template to see the questions you will need to answer.

Run your scripts to get the outputs of your regression analysis. Then use the outputs to answer the questions in your summary report.

Note: Use the + (plus) button to add new code blocks, if needed.
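
A nested models F-test compares a complete model against a reduced model whose terms are a subset of the complete model's terms. The notebook does not say which terms the reduced model drops, so the sketch below assumes, purely for illustration, that the reduced model omits the two squared terms. It also assumes the housing data frame from the data preparation step, with a synthetic stand-in (made-up numbers) as a fallback; the names model_complete, model_reduced, and f_test are hypothetical.

```r
# Columns that both models need
needed <- c("price", "appliance_age", "crime")

# ASSUMPTION: synthetic stand-in (made-up numbers, for illustration only)
# used only when the real housing data is not loaded.
if (!exists("housing") || !all(needed %in% names(housing))) {
  set.seed(42)
  n <- 300
  housing <- data.frame(
    appliance_age = runif(n, 1, 20),
    crime         = runif(n, 50, 500)
  )
  housing$price <- 700000 - 8000 * housing$appliance_age - 600 * housing$crime +
    5 * housing$appliance_age * housing$crime +
    200 * housing$appliance_age^2 + 0.5 * housing$crime^2 +
    rnorm(n, sd = 30000)
}

# Complete second order model (same form as Model #2)
model_complete <- lm(price ~ appliance_age + crime + appliance_age:crime +
                       I(appliance_age^2) + I(crime^2), data = housing)

# ASSUMPTION: reduced model drops the two squared terms
model_reduced <- lm(price ~ appliance_age + crime + appliance_age:crime,
                    data = housing)

# Nested F-test: H0 is that the coefficients of the dropped terms are all zero
f_test <- anova(model_reduced, model_complete)
f_test
```

The second row of the anova() table reports the F statistic and its p-value. A p-value below the chosen significance level suggests that at least one of the dropped terms contributes to the model, which would favor the complete model.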

In [ ]:

In [ ]:

In [ ]:

In [ ]:

In [ ]:

End of Project One Jupyter Notebook

The HTML output can be downloaded by clicking File, then Download as, then HTML. Be sure to answer all of the questions in the Summary Report template for Project One, and to include your completed Jupyter Notebook scripts as part of your submission.
