Regression and Correlation
Recall our Car Price data:

Observation   Car Price   Year   Years Old
1             $20,000     2015   4
2             $25,000     2016   3
3             $30,000     2018   1
4             $31,000     2018   1
5             $22,500     2016   3
6             $25,000     2016   3
7             $29,500     2018   1
8             $24,000     2015   4
9             $24,500     2017   2
10            $25,000     2017   2

The Regression output for this data appears in the SUMMARY OUTPUT below.
Lastly, I want to use my Regression Equation to predict prices, and then find a 95% prediction interval for the predicted price. What would I expect to pay for a car that was manufactured in 2014? Remember, 2019 – 2014 = 5, so the car is 5 Years Old. This is the value you substitute into the Regression Equation. DO NOT put a calendar year (2014 or 2019) into the equation.
\[ \widehat{\text{Price}} = -2{,}629.03\,(\text{Years Old}) + 31{,}959.68 \]
\[ \widehat{\text{Price}} = -2{,}629.03\,(5) + 31{,}959.68 \]
\[ \widehat{\text{Price}} = -13{,}145.16 + 31{,}959.68 \]
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.884606501
R Square             0.782528661
Adjusted R Square    0.755344744
Standard Error       1725.490814
Observations         10

ANOVA
              df    SS             MS             F          Significance F
Regression    1     85706451.61    85706451.61    28.78646   0.000673381
Residual      8     23818548.39    2977318.548
Total         9     109525000

             Coefficients    Standard Error   t Stat         P-value    Lower 95%      Upper 95%      Lower 95.0%    Upper 95.0%
Intercept    31959.67742     1296.435244      24.65196589    7.83E-09   28970.09239    34949.26245    28970.09239    34949.26245
Years Old    -2629.032258    490.0064638      -5.365301179   0.000673   -3758.98919    -1499.075326   -3758.98919    -1499.075326
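If you want to check the Excel output outside of Excel, the sketch below refits the line from the ten observations using the usual least-squares formulas and then predicts the price at 5 Years Old. It is only an illustration: the function names fit_line and residual_standard_error are made up for this example, and it assumes the data table above was transcribed correctly.

```python
# Years Old (x) and Car Price (y) for the 10 observations in the data table
years_old = [4, 3, 1, 1, 3, 3, 1, 4, 2, 2]
price = [20000, 25000, 30000, 31000, 22500, 25000, 29500, 24000, 24500, 25000]


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residual_standard_error(x, y, slope, intercept):
    """Square root of (residual sum of squares) / (n - 2)."""
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return (ss_res / (len(x) - 2)) ** 0.5


slope, intercept = fit_line(years_old, price)
std_error = residual_standard_error(years_old, price, slope, intercept)

# A car manufactured in 2014 is 5 Years Old, so substitute 5 (not the year).
predicted_price = slope * 5 + intercept

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"standard error = {std_error:.4f}")
print(f"predicted price at 5 Years Old = {predicted_price:.2f}")
```

The slope and intercept should agree with the Coefficients column, and the standard error with the Standard Error line under Regression Statistics.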
\[ \widehat{\text{Price}} = 18{,}814.52 \]

For a car manufactured in 2014, which is 5 Years Old, we expect to pay $18,814.52. Now that we know what we expect to pay for a 5-Year-Old car, let's calculate a 95% prediction interval at 5 Years Old. We need to use this equation:
\[ \hat{y} \pm t^{*}(SE)\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n - 1)\,SD_x^2}} \]
We will use the =T.INV.2T function to find the T-Critical Value. This value should look familiar. DF = n – 2 = 10 – 2 = 8, which is the same DF shown for the Residual in the Regression output.

=T.INV.2T(0.05,8) returns 2.306004135

Next, we need the mean and SD of the x-variable (Years Old). You should recall how to calculate descriptive statistics from Week 2.

Mean = 2.4
SD = 1.1737878

SE is the Standard Error from the Regression output, which is 1725.4908.

Now we can plug in what we know:
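As a quick check on the descriptive statistics, here is a small sketch that computes the DF, mean, and sample SD of Years Old. The variable names are chosen for this example only. Python's standard library has no inverse t-distribution function, so the T-Critical Value is simply copied from the Excel result rather than computed.

```python
import statistics

# Years Old for the 10 observations in the Car Price data
years_old = [4, 3, 1, 1, 3, 3, 1, 4, 2, 2]

n = len(years_old)
df = n - 2                            # degrees of freedom for simple regression
x_mean = statistics.mean(years_old)   # mean of Years Old
x_sd = statistics.stdev(years_old)    # sample SD (divides by n - 1)

# Copied from Excel: =T.INV.2T(0.05, 8). Not computed here.
t_critical = 2.306004135

print(f"DF = {df}, mean = {x_mean}, SD = {x_sd:.7f}")
```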
\[ 18814.52 \pm 2.306(1725.4908)\sqrt{1 + \frac{1}{10} + \frac{(5 - 2.4)^2}{(10 - 1)(1.1737878)^2}} \]
\[ 18814.52 \pm 3978.9817\sqrt{1 + 0.1 + \frac{6.76}{12.4}} \]
\[ 18814.52 \pm 3978.9817\sqrt{1.64516129} \]
\[ 18814.52 \pm 3978.9817(1.28263841) \]
\[ 18814.52 \pm 5103.59476 \]

($13,710.93, $23,918.11)
The 95% prediction interval for a 5-Year-Old car runs from $13,710.93 to $23,918.11.
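The hand calculation can be wrapped into a short function, which makes it easy to try other ages. This is a sketch of the prediction interval formula used in this handout, not a general-purpose routine: the function name prediction_interval and its parameter names are invented for the example, and the inputs are the rounded values from the worked steps.

```python
import math


def prediction_interval(y_hat, t_crit, se, n, x0, x_mean, x_sd):
    """Return (lower, upper) for y_hat +/- t* (SE) sqrt(1 + 1/n + (x0 - x_mean)^2 / ((n - 1) SD_x^2))."""
    spread = math.sqrt(1 + 1 / n + (x0 - x_mean) ** 2 / ((n - 1) * x_sd ** 2))
    margin = t_crit * se * spread
    return y_hat - margin, y_hat + margin


# Values taken from the worked example for a 5-Year-Old car
lower, upper = prediction_interval(
    y_hat=18814.52,     # predicted price from the Regression Equation
    t_crit=2.306,       # =T.INV.2T(0.05, 8), rounded as in the hand calculation
    se=1725.4908,       # Standard Error from the Regression output
    n=10,               # number of observations
    x0=5,               # Years Old for the car being predicted
    x_mean=2.4,         # mean of Years Old
    x_sd=1.1737878,     # sample SD of Years Old
)

print(f"95% prediction interval: (${lower:,.2f}, ${upper:,.2f})")
```

Changing x0 shows how the interval widens as the age moves further from the mean of 2.4 Years Old.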