
RSCH 665 Notes Jeremy Hodges, PhD

Not for distribution/Classroom use only. Derived from Aron, Coups, & Aron (2019).

All rights are reserved. The material contained herein is the copyright property of Embry-Riddle Aeronautical University, Daytona Beach, Florida, 32114. No part of this material may be reproduced, stored in a retrieval system or transmitted in any form, electronic, mechanical, photocopying, recording or otherwise without the prior written consent of the University.

College of Aeronautics | worldwide.erau.edu

Module 4 Notes

The hypothesis testing process generally follows the same five steps (Aron, Coups, & Aron, 2019, pp. 131-134):

1. Restate the research question as a research hypothesis and a null hypothesis about the populations.

2. Determine the characteristics of the comparison distribution.

3. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.

4. Determine your sample’s score on the comparison distribution.

5. Decide whether to reject the null hypothesis.

Here is an example:

Suppose we know through research that a population has a mean of 80 and a standard deviation of 10 for the results of a particular test. We want to evaluate our new training method to see if it improves results, as demonstrated through a higher test score. We have one volunteer to participate in this study.

Our candidate goes through the new training program and scores 93 on the test.

Let’s use the steps above to properly write up this study (Aron, Coups, & Aron, 2019, p. 140):

1. Restate the research question as a research hypothesis and a null hypothesis about the populations.

RQ: Do people who go through the new training have higher scores than those who do not go through the new training?

Population 1: People in general.

Population 2: People who go through the new training.

Ha: People who go through the new training have higher scores than people in general.

Ho: There is no difference in test scores between people who go through the new training and people in general.

2. Determine the characteristics of the comparison distribution.

The comparison distribution is a Z-distribution. (Testing against the known mean and SD of a population is atypical, but it works well for our example. In the future, we’ll use the t-distribution, F-distribution, and chi-square distribution.)

3. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.

In social science, we typically “reject the null” at the p < .05 level, which corresponds to the 95% point on the Z table. This is a one-tailed test because we are predicting the direction: that scores will go up.

Looking up .95 for a one-tailed test on the Z table, we see this is a Z-score of 1.64.

We call this the critical cutoff score, or critical score, or Z-critical in this case.

4. Determine your sample’s score on the comparison distribution.

Our subject’s raw score on the test was 93. Converting this to a Z-score:

Z = (X – M) / SD

Z = (93 – 80) / 10

Z = 1.3

5. Decide whether to reject the null hypothesis.

Because our observation, Z=1.3, is less than our Z-critical (1.64), we fail to reject the null hypothesis. That is, we fail to reject the idea that there is no difference in test scores between people who go through the new training and people in general. Because we cannot reject this idea, we do not have support for our research hypothesis. Essentially, there is no support that our training program makes a difference in test scores.

We will use this same five-step process for all our hypothesis testing. This Z-score example is just a presentation of the steps.
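The worked example above can also be checked in a few lines of Python. This is only a minimal sketch using the standard library; the function name one_sample_z_test and its parameters are made up for illustration and do not come from the textbook. Note that the computed cutoff is about 1.645, which the Z table in the example rounds to 1.64.

```python
from statistics import NormalDist

def one_sample_z_test(score, pop_mean, pop_sd, alpha=0.05):
    """One-tailed (upper) Z test for a single score against a known population.

    Hypothetical helper written for these notes, not a library function.
    """
    # Step 3: cutoff score on the standard normal comparison distribution
    z_critical = NormalDist().inv_cdf(1 - alpha)
    # Step 4: the sample's score on the comparison distribution
    z_observed = (score - pop_mean) / pop_sd
    # Step 5: reject the null only if the observed Z exceeds the cutoff
    reject_null = z_observed > z_critical
    return z_observed, z_critical, reject_null

# Population mean 80, SD 10; our candidate scored 93
z_observed, z_critical, reject_null = one_sample_z_test(93, 80, 10)
print(f"Z = {z_observed:.2f}, Z-critical = {z_critical:.3f}, reject null: {reject_null}")
```

With a score of 93 the observed Z falls below the cutoff, so the function reports that we fail to reject the null, matching step 5.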

A one-tailed test is used when your claim is directional (e.g., training improves test scores).

A two-tailed test is used when your claim is non-directional (e.g., sleep affects test scores).
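The choice between one and two tails changes the cutoff score. As a rough sketch (the helper name critical_z is an assumption made for illustration), a two-tailed test splits the .05 between both ends of the distribution, so its cutoff sits farther from the mean:

```python
from statistics import NormalDist

def critical_z(alpha=0.05, tails=1):
    """Return the positive cutoff Z for a one- or two-tailed test.

    Hypothetical helper for illustration only.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha evenly between both tails
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

one_tailed = critical_z(0.05, tails=1)  # directional claim: reject if Z exceeds this
two_tailed = critical_z(0.05, tails=2)  # non-directional claim: reject if |Z| exceeds this
print(f"one-tailed cutoff: {one_tailed:.3f}, two-tailed cutoff: {two_tailed:.3f}")
```

At the same p < .05 level, the two-tailed cutoff (about 1.96) is harder to reach than the one-tailed cutoff (about 1.645).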

Decision errors occur when the right steps lead to the wrong conclusions.

A Type 1 error, generally considered the more severe of the two, occurs when you reject the null hypothesis when in fact you should not have. Consider this scenario:

In our earlier example, suppose the observed Z-score had been 1.7; that is, our candidate scored a 97 on the test (Z = (97 – 80) / 10). Because 1.7 exceeds the Z-critical of 1.64, we would have rejected the null and supported our claim that the new training improved test scores. However, there is still a small chance that our candidate simply scored 97 without any help from the new training. In fact, if the training has no effect, there is a 4.46% chance of seeing a score at least this high. We may be instituting new training at our facility based on our claim, spending thousands of dollars, when all the while our candidate just scored well that day and it had nothing to do with our new training.
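The 4.46% figure is the area of the standard normal curve above Z = 1.7. A minimal sketch of that lookup, using Python's standard library in place of the Z table (the variable names are illustrative):

```python
from statistics import NormalDist

# The candidate's score of 97 against a population mean of 80 and SD of 10
z_observed = (97 - 80) / 10

# Area in the upper tail beyond the observed Z: the chance of a score
# at least this high if the training made no difference
upper_tail_area = 1 - NormalDist().cdf(z_observed)
print(f"Z = {z_observed:.1f}, upper-tail area = {upper_tail_area:.4f}")
```

Because this area is below .05, the result counts as significant at the p < .05 level, yet it still leaves a small chance that the high score arose without any training effect.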

A Type 2 error occurs when you fail to reject the null hypothesis, when in fact you should have. Consider this scenario:

In our earlier example, let’s say we wanted to be really sure of our results, so instead of setting the cutoff at p < .05, we set it at p < .01, changing our critical cutoff score, Z-critical, to 2.33. Suppose the new training really does help our candidate, who scores a 97 on the test (Z = 1.7). Because our observed score of 1.7 is less than 2.33, we fail to reject the null hypothesis. Therefore, instead of supporting our research hypothesis, we discard a training program that actually works.

You may notice that when deciding on the critical cutoff score, you are in a tug-of-war between setting that threshold too low (risking Type 1 errors) and too high (risking Type 2 errors). Typically, you will be fine if you follow the conventions for your field. Most graduate students use p < .05.
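To see the tug-of-war in action, the sketch below runs the same observed score (97, or Z = 1.7) against both cutoffs discussed above. It is an illustration under the same assumptions as the example (population mean 80, SD 10, one-tailed test); the decisions dictionary is just a convenient way to hold the results.

```python
from statistics import NormalDist

z_observed = (97 - 80) / 10  # the candidate's score of 97 as a Z-score

decisions = {}
for alpha in (0.05, 0.01):
    # One-tailed cutoff for this significance level
    z_critical = NormalDist().inv_cdf(1 - alpha)
    # Reject the null only if the observed Z exceeds the cutoff
    decisions[alpha] = z_observed > z_critical
    print(f"alpha = {alpha}: Z-critical = {z_critical:.2f}, reject null: {decisions[alpha]}")
```

The same data lead to rejecting the null at p < .05 but failing to reject it at p < .01, which is why the choice of cutoff should be made before looking at the results.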

Confidence Intervals

Confidence intervals are used to get a sense of the accuracy of an estimated population mean. Note that we are estimating the population mean using a sample mean. We are saying, with a certain level of confidence, that we believe the actual population mean falls within a certain range.

95% confidence interval: Z scores from -1.96 to +1.96

A confidence interval for which there is approximately a 95% chance that the population mean falls in this interval.

99% confidence interval: Z scores from -2.58 to +2.58

A confidence interval for which there is approximately a 99% chance that the population mean falls in this interval.

To calculate a confidence interval, we’ll need one more quantity: the standard error. Because we are estimating from a sample, the standard error tells us how much sample means are expected to vary around the population mean.

Standard Error (SE) = standard deviation / square root of the sample size

In our earliest example, 24 eighth graders took a test. Their scores were:

67 68 69 73 74 75 76 76 78 81 82 82

83 84 85 85 85 85 86 87 88 90 91 94

Mean = 81

SD = 7.39

SE = SD / √N

SE = 7.39 / √24

SE = 7.39 / 4.9

SE = 1.51
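The same arithmetic can be reproduced with Python's standard library. This is a sketch for checking the hand calculation; it assumes the SD of 7.39 is the sample standard deviation (dividing by N – 1), which is what statistics.stdev computes and what matches the value above.

```python
from math import sqrt
from statistics import mean, stdev

scores = [67, 68, 69, 73, 74, 75, 76, 76, 78, 81, 82, 82,
          83, 84, 85, 85, 85, 85, 86, 87, 88, 90, 91, 94]

m = mean(scores)
sd = stdev(scores)            # sample SD (divides by N - 1)
se = sd / sqrt(len(scores))   # standard error = SD / square root of N
print(f"Mean = {m:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```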

To calculate a 95% confidence interval, use -1.96 to +1.96 for the Z-score interval.

Z * SE = 1.96 * 1.51 = 2.96

95% Confidence interval lower limit = Mean – 2.96 = 81 – 2.96 = 78.04

95% Confidence interval upper limit = Mean + 2.96 = 81 + 2.96 = 83.96

This tells us that we are 95% confident that the population mean falls between 78.04 and 83.96, based on the evaluation of our sample of 24 students.
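Putting the pieces together, the sketch below computes the interval directly from the scores. The function name confidence_interval is hypothetical, and it follows the Z-based approach used in these notes rather than any other method.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(data, confidence=0.95):
    """Z-based confidence interval for the population mean.

    Hypothetical helper that mirrors the hand calculation in these notes.
    """
    m = mean(data)
    se = stdev(data) / sqrt(len(data))  # standard error from the sample SD
    # Two-sided cutoff: about 1.96 for 95%, about 2.58 for 99%
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * se
    return m - margin, m + margin

scores = [67, 68, 69, 73, 74, 75, 76, 76, 78, 81, 82, 82,
          83, 84, 85, 85, 85, 85, 86, 87, 88, 90, 91, 94]

lower, upper = confidence_interval(scores, 0.95)
print(f"95% CI: {lower:.2f} to {upper:.2f}")

lower_99, upper_99 = confidence_interval(scores, 0.99)
print(f"99% CI: {lower_99:.2f} to {upper_99:.2f}")
```

Asking for more confidence (99% instead of 95%) widens the interval, because a larger Z cutoff is multiplied by the same standard error.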

Reference

Aron, A., Coups, E. J., & Aron, E. (2019). Statistics for the behavioral and social sciences: A brief course (6th ed.). Upper Saddle River, NJ: Prentice Hall/Pearson.