Statistic 7


Dylan

Hello class,

This week’s topic is Regression and Correlation. We were tasked with choosing two correlated variables from the week 1 data set. The variables I chose to work with were price and number of cylinders. I believe that as the number of cylinders increases, the price of the vehicle will increase, so I expect a positive relationship. It is my novice understanding of engines that a higher number of cylinders is needed for larger vehicles such as trucks. While researching car prices, I found that trucks and luxury SUVs were mostly higher priced. Price is the dependent variable, and the number of cylinders is the independent variable. By definition, the independent variable is the one you change, and the dependent variable changes in response.

First, I began by following the directions in the PDF. My data did not need to be adjusted because it was already in numerical form. I used the Excel function =CORREL(Price column, # cylinders column), which returned a value of 0.5323. This is a positive correlation and agrees with my assumption. With this value I was then able to determine R-squared: 0.5323 * 0.5323 = 0.2833, or about 28%. This means that about 28% of the variation in price is accounted for by the number of cylinders. The R-squared value is low and tells me that this is a weak positive correlation. Although the two variables tend to rise together, the relationship is not strong, so the number of cylinders in a vehicle is not a strong indicator of car price.
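The same two steps can be sketched in Python for anyone checking the numbers outside Excel. This is only a rough illustration: `pearson_r` is a hypothetical helper standing in for =CORREL, and the short lists are made-up toy values, not the week 1 data set. Only the 0.5323 figure comes from the post.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, the statistic Excel's CORREL reports."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy values for illustration only (NOT the week 1 data set).
toy_cylinders = [4, 4, 6, 6, 8]
toy_prices = [20000, 24000, 27000, 35000, 41000]
toy_r = pearson_r(toy_cylinders, toy_prices)
print(f"toy r = {toy_r:.4f}")

# The correlation reported in the post; squaring r gives R-squared.
r = 0.5323
r_squared = r ** 2
print(f"R-squared = {r_squared:.4f}")
```

Squaring the reported correlation gives roughly 0.28, which matches the 28% figure above.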

I used this data to run a regression analysis in Excel. Following the steps in the PDF was easy. This tool allowed me to verify my calculations for the correlation and R-squared.

Next, I examined the output to see if the number of cylinders was a significant predictor of price. My p-value (0.11321025) is greater than the alpha of 0.05. Therefore, I can state that the number of cylinders is not a significant predictor of price. Because the relationship was this weak, I do not think you would normally continue with a regression equation, but I did anyway for the assignment. I wrote out my regression equation using the coefficients from the data table.

Price = 4199.12921(# of cylinders) + 7714.55056
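As a rough sketch of how this equation and the significance check would be used, the snippet below plugs in the coefficients and p-value from the post. The function names and the cylinder counts 4, 6, and 8 are assumptions chosen for illustration. Since the predictor was not significant, the predicted prices should be read with caution.

```python
# Coefficients and p-value copied from the regression output in the post.
SLOPE = 4199.12921      # change in predicted price per additional cylinder
INTERCEPT = 7714.55056  # predicted price at 0 cylinders (no practical meaning)
P_VALUE = 0.11321025
ALPHA = 0.05


def predict_price(cylinders):
    """Predicted price from the fitted line: slope * cylinders + intercept."""
    return SLOPE * cylinders + INTERCEPT


def is_significant(p_value, alpha=ALPHA):
    """A predictor is called significant when its p-value is below alpha."""
    return p_value < alpha


print("Significant predictor?", is_significant(P_VALUE))

# Illustrative cylinder counts (assumed, not taken from the data set).
for cylinders in (4, 6, 8):
    print(cylinders, "cylinders ->", round(predict_price(cylinders), 2))
```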

My y-intercept does not have practical meaning for this scenario because no vehicle will have 0 cylinders. I did have one outlier in my set, and this price definitely affected my values. I ran a second regression analysis after changing that high price to match the other vehicles a little better; it is included in my attachment. This did result in a stronger positive correlation. I think the relationship between the vehicles in the data set is important: perhaps the comparison would work better with vehicles of the same make but different models and cylinder counts. I hope my data is clear. Good luck this week!

Valerie

Hi everyone,

Just based on the exercise, I already feel like I will have an easier time this week than last. I hope this proves true as I move forward with the lesson, because last week I just kept getting lost.

Based on my data set, I feel that MPG City and MPG Highway are correlated. I believe it is a positive correlation, because an increase in one should come with an increase in the other. For the sake of my regression this week, I will consider the independent variable to be MPG Highway (x) and the dependent variable to be MPG City (y). This means I am testing whether MPG Highway (x) is a significant predictor of MPG City (y).

My suspicion that the correlation is positive is confirmed by my positive correlation value of 0.8655.

My R² is 74.91%, meaning about 74.91% of the variation in MPG City is accounted for by MPG Highway. With the strongest possible value being 100%, 74.91% is strong enough to give us a good indication, so we can interpret the data further.

My Significance F (p-value) is less than alpha (0.05), so we can conclude that MPG Highway is a significant predictor of MPG City.

Based on my regression, my conclusion is as follows: for each increase of 1 in MPG Highway, the predicted MPG City increases by 0.6720.
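To make the slope interpretation concrete, here is a small Python sketch. The post reports the slope but not the intercept, so the sketch only computes changes in predicted MPG City, not the predicted values themselves. The function name and the example increases of 1, 5, and 10 are assumptions for illustration.

```python
# Slope from the post: change in predicted MPG City per 1 MPG Highway.
SLOPE = 0.6720


def predicted_city_change(highway_change):
    """Change in predicted MPG City for a given change in MPG Highway."""
    return SLOPE * highway_change


# Illustrative increases in MPG Highway (assumed values).
for delta in (1, 5, 10):
    change = predicted_city_change(delta)
    print(f"+{delta} MPG Highway -> +{change:.2f} MPG City (predicted)")
```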

Have a great week!