Lab 7 - CSS 300

p_patel359

CSS 300 Module 7 Activity Worksheet

Use this worksheet to complete your lab activity. Submit it to the applicable assignment submission folder when complete.

Deliverables:

· A Word document with your answers to all questions.

· An .ipynb file with all your code.

Download the Data-Weather.csv dataset from D2L and open a new Jupyter notebook:

1. First, paste and run all these import statements. The last line will import the dataset into the 'weather' dataframe.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
%matplotlib inline

weather = pd.read_csv("Data-Weather.csv")

2. Heat map

a. Look for strong linear relationships with a heatmap, using the code at the end of this step.

b. Paste a screenshot of your heatmap into your Word document.

c. Pick your independent variable and dependent variable.

d. Explain how you made your choices.

plt.figure(figsize=(10,10))

sns.heatmap(weather.corr(), vmin=-1, vmax=1, annot=True, annot_kws={"fontsize":14}, cmap="RdBu_r")
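If Data-Weather.csv contains any non-numeric columns (an assumption; the column types are not listed in this worksheet), pandas 2.0 and later raise an error from weather.corr() unless those columns are excluded. The sketch below uses a small made-up dataframe (the values and the Station column are hypothetical, not taken from the dataset) to show the numeric_only=True option and one way to read the strongest off-diagonal correlation out of the matrix that the heatmap displays:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the weather dataframe; all values are invented.
demo = pd.DataFrame({
    "Station": ["A", "B", "A", "B", "A", "B"],   # non-numeric column
    "MinTemp": [10.0, 12.0, 15.0, 18.0, 20.0, 22.0],
    "MaxTemp": [20.0, 23.0, 25.0, 29.0, 30.0, 33.0],
    "Precip":  [5.0, 0.0, 2.0, 7.0, 1.0, 3.0],
})

# numeric_only=True skips text columns such as Station.
corr = demo.corr(numeric_only=True)

# Blank out the diagonal (every variable correlates 1.0 with itself),
# then find the pair with the largest absolute correlation.
off_diag = corr.where(~np.eye(len(corr), dtype=bool))
strongest_pair = off_diag.abs().stack().idxmax()

print(corr.round(2))
print("Strongest pair:", strongest_pair)
```

A pair whose correlation is close to 1 or -1 is a reasonable candidate for the independent and dependent variables; the same matrix can be passed to sns.heatmap in place of weather.corr().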

3. Scatterplot

a. Next, plot your two variables in a scatter plot to visually check the relationship. Remember, we want the relationship to be as close to linear as possible.

b. Paste your scatterplot into your Word document.

c. Based on your scatterplot, would you say your two variables are appropriate for a linear regression? Why or why not?

weather.plot(x='MinTemp', y='MaxTemp', style='o')
plt.title('MinTemp vs MaxTemp')
plt.xlabel('MinTemp')
plt.ylabel('MaxTemp')
plt.show()

4. Divide the data into attributes and labels by using the following:

x = weather['MinTemp'].values.reshape(-1,1)
y = weather['MaxTemp'].values.reshape(-1,1)
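The reshape(-1,1) call is there because scikit-learn estimators expect the attribute array to be two-dimensional, with one row per observation and one column per feature. A small sketch with invented values (not from Data-Weather.csv) shows how the shape changes:

```python
import pandas as pd

# Invented values standing in for the weather dataframe.
demo = pd.DataFrame({
    "MinTemp": [10.0, 12.0, 15.0],
    "MaxTemp": [20.0, 23.0, 25.0],
})

flat = demo["MinTemp"].values               # 1-D array, shape (3,)
x = flat.reshape(-1, 1)                     # 2-D column, shape (3, 1); -1 lets NumPy infer the row count
y = demo["MaxTemp"].values.reshape(-1, 1)   # labels reshaped the same way

print(flat.shape, x.shape, y.shape)
```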

5. Split the data into training and testing sets, with 80% for training and 20% for testing, using the code below:

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
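To see what the split produces, the sketch below runs train_test_split on 100 made-up observations (synthetic values, not the weather data) in the same column shape as step 4. It prints the resulting shapes and checks that a fixed random_state gives the same test rows on a second call:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 100 made-up observations; y is an arbitrary function of x for illustration.
x = np.arange(100, dtype=float).reshape(-1, 1)
y = 2.0 * x + 5.0

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
print(x_train.shape, x_test.shape)

# Same random_state -> same rows in the test set on a second call.
_, x_test_again, _, _ = train_test_split(x, y, test_size=0.2, random_state=0)
print(np.array_equal(x_test, x_test_again))
```

With test_size=0.2, 20 of the 100 rows go to the test set and the remaining 80 to the training set; each x row stays paired with its y row.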

6. Train the algorithm using linear regression following the code below:

regressor = LinearRegression()

# Train the algorithm
regressor.fit(x_train, y_train)

7. Report the slope and y-intercept in your Word document. Use the following code:

# To retrieve the intercept:
print("Intercept: ", regressor.intercept_)

# To retrieve the slope:
print("Slope (beta coefficient): ", regressor.coef_)

8. In your Word document, write the equation for your model using the slope and intercept numbers, and the names of your dependent and independent variables. Here's the format:

dependent_variable = slope(independent_variable) + y_intercept
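Because y was reshaped into a column in step 4, regressor.intercept_ is a one-element array and regressor.coef_ is a 1x1 array, so both print with brackets. The sketch below is one possible way to pull out plain numbers and fill in the equation format; it fits synthetic data generated from a known line (the 0.9 and 10.0 are made up for illustration and are not results from the weather data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data on an exact line; the slope and intercept are invented.
x = np.linspace(-5.0, 30.0, 50).reshape(-1, 1)
y = 0.9 * x + 10.0

regressor = LinearRegression()
regressor.fit(x, y)

# With a 2-D y, coef_ has shape (1, 1) and intercept_ has shape (1,).
slope = float(regressor.coef_[0][0])
intercept = float(regressor.intercept_[0])

equation = f"MaxTemp = {slope:.2f}(MinTemp) + {intercept:.2f}"
print(equation)
```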

9. Use your model on your test data to generate a set of predictions:

y_predicted = regressor.predict(x_test)
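Each prediction is the equation from step 8 applied to one test value. The sketch below checks this on noisy synthetic data (the true slope, intercept, and noise level are invented, not taken from Data-Weather.csv) by comparing predict() with the slope and intercept applied by hand:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Noisy synthetic data; all generating parameters are made up.
rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 30.0, size=200).reshape(-1, 1)
y = 0.9 * x + 10.0 + rng.normal(0.0, 2.0, size=x.shape)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
regressor = LinearRegression().fit(x_train, y_train)
y_predicted = regressor.predict(x_test)

# predict() applies the fitted line to each test value.
by_hand = regressor.coef_[0][0] * x_test + regressor.intercept_[0]
print(np.allclose(y_predicted, by_hand))

# Side-by-side view of the first few actual and predicted values.
print(np.column_stack([y_test[:5], y_predicted[:5]]).round(2))
```

The side-by-side columns give a quick feel for how far individual predictions fall from the actual values before plotting them.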

10. Create a scatter plot with a line portraying the model using the following code. Paste a screenshot of this graph in your Word document.

plt.scatter(x_test, y_test, color='gray')
plt.plot(x_test, y_predicted, color='red', linewidth=2)
plt.show()