7864 Course Study Guide

Table of Contents

Week 1: Basics of Data Collection and Analysis

Scales of Measurement

Hypothesis Testing

Null and Alternative Hypotheses

Type I and Type II Errors

Probability Values and the Null Hypothesis

Preview of APA Skills

Week 2: Exploring Statistical Software and Descriptive Statistics

Screening Data

Measures of Central Tendency and Dispersion

Skewness and Kurtosis

Outliers

APA Focus of the Week: Ethics

Week 3: Correlation Introduction

Statistics and Ethics

Interpreting Correlation

Assumptions of Correlation

Hypothesis Testing of Correlation

Alternative Correlation Coefficients

APA Focus of the Week: Format Requirements

Week 4: Correlation Application

Proper Reporting of Correlations

r, Degrees of Freedom, and Correlation Coefficient

Probability Values

APA Focus of the Week: Reporting Standards in APA Format

Week 5: t-Test Introduction

Logic of the t-test

Assumptions of the t-test

Hypothesis Testing for a t-test

APA Focus of the Week: Scholarly Writing

Week 6: t-Test Application

Testing Assumptions: The Levene Test

Proper Reporting of the Independent Samples t-test

t, Degrees of Freedom, and t Value

Probability Value

APA Focus of the Week: Grammar and Usage - Verb Tense

Week 7: One-Way ANOVA Introduction

Advantage of ANOVA

Logic of a "One-Way" ANOVA

Avoiding Inflated Type I Error

Hypothesis Testing in a One-Way ANOVA

Assumptions of a One-Way ANOVA

APA Focus of the Week: Bias-free Language

Week 8: ANOVA Application

Proper Reporting of the One-Way ANOVA

F, Degrees of Freedom, and F Value

Probability Value

Post-Hoc Tests

APA Focus of the Week: In-text Citations

Week 9: Regression Introduction

Logic of a Simple Linear Regression

Hypothesis Testing in Simple Linear Regression

Assumptions of a Simple Linear Regression

APA Focus of the Week: References

Week 1: Basics of Data Collection and Analysis

This study guide is designed to highlight important information and help clarify

difficult concepts. Use it as you work through your readings and assignments.

Scales of Measurement

Quantitative researchers collect data and assign numbers to their observations. An important concept in understanding variables is the scales of measurement. There

are four scales of measurement—nominal, ordinal, interval, and ratio. These four scales of measurement are routinely reviewed in introductory statistics textbooks as

the classic way of differentiating measurements. However, the boundaries between the

measurement scales are fuzzy. For example, is intelligence quotient (IQ) measured on

the ordinal or interval scale? In 7864, we rely on a simple measurement dichotomy: categorical (qualitative) variables and continuous (quantitative) variables.

A categorical variable measures things that belong to a group (a category). Nominal variables have two or more categories that are not assigned in any particular order. For example, a nominal variable of “fruit” could assign an arbitrary number for each category, such as apple = 1, banana = 2, and grape = 3. Ordinal variables consist of categories with a particular order such as first place, second place, and third place in a contest. In the 7864 data set, categorical variables like “review” are useful in comparing students who did not complete a review session (1 = no) to those who did complete a review session (2 = yes).

A continuous variable represents a difference in the magnitude of something along a continuum, such as a measurement of "low to high" statistics anxiety. Interval variables have equal intervals between adjacent points on a scale, such as the Celsius scale. A ratio variable has an additional property beyond equal intervals: a “true zero.” An example is the Kelvin scale, where the true zero is the complete absence of heat.

In the 7864 data set, an example of a continuous variable is “quiz1,” which is a student’s number of correct answers on the first quiz. It is important to distinguish between categorical variables and continuous variables in 7864. In many statistical software programs, for example, categorical variables are labeled as “Nominal” or “Ordinal,” and interval variables and ratio variables are labeled as “Scale.” Knowing how

to differentiate variables according to the scale of measurement will help you choose the correct statistical test for a given hypothesis.

Hypothesis Testing

A hypothesis is an educated guess of what the researcher will observe once the

data are gathered. Probability is crucial for hypothesis testing. In hypothesis testing, you

want to know the likelihood that your results occurred by chance. No matter how

unlikely, there is always the possibility that your results have occurred by chance, even

if that probability is less than 1 in 20 (5%). However, you are likely to feel more confident in your inferences if the probability that your results occurred by chance is less than 5%

compared to, say, 50%.

In high-stakes research (such as testing a new cancer drug), researchers may want to be even more conservative in designating an alpha level, such as a probability of less than 1 in 100 (1%) that the results are due to chance. However, most researchers in the social sciences find it reasonable to designate less than a 5% chance as a cutoff point for determining statistical significance. This cutoff point is referred to as the alpha level (for example, alpha = .05). The p value is the probability calculated from the data, and a result is statistically significant when the p value is smaller than the alpha level (p < .05). An alpha level is set to determine when a researcher will reject or fail to

reject a null hypothesis (discussed next). The alpha level is set before data are

analyzed to avoid "fishing" for statistical significance.

Null and Alternative Hypotheses

When comparing groups, the null hypothesis (H0) predicts that group means

will not differ. When testing the strength of a relationship between two variables, the

null hypothesis is no relationship between variable X and variable Y. By contrast, the

alternative hypothesis (H1) does predict a difference between the two groups, or in

the case of relationships, that two variables are significantly related. An alternative

hypothesis can be directional (H1: Group X has a higher mean score than Group Y) or nondirectional (H1: Group X and Group Y will differ).

In hypothesis testing, you either reject or fail to reject the null hypothesis. Failing to reject the null hypothesis is not stating that you accept the null hypothesis as true. You have simply failed to find statistical justification to reject the null hypothesis.

Type I and Type II Errors

If you commit a Type I error, this means that you have incorrectly rejected a

true null hypothesis. You have incorrectly concluded that there is a significant difference between groups, or a significant relationship, where no such difference or relationship actually exists. Type I errors have real-world significance, such as

concluding that an expensive new cancer drug works when actually it does not work, costing money and potentially endangering lives. Keep in mind that you will probably

never know whether the null hypothesis is "true" or not, as we can only determine whether our data reject or fail to reject it.

               Reject H0        Do Not Reject H0
H0 is True     Type I error     Correct
H0 is False    Correct          Type II error

If you commit a Type II error, this means that you have not rejected a false

null hypothesis when you should have rejected it. You have incorrectly concluded

that no differences or no relationships exist when they actually do exist. Type II errors also have real-world significance, such as concluding that a new cancer drug

does not work when it actually does work and could save lives.

Your alpha level will affect the likelihood of making a Type I or a Type II error. If your alpha level is small (such as .01, less than a 1 in 100 chance), you are less likely to reject the null hypothesis, so you are less likely to commit a Type I error. However, you are more likely to commit a Type II error.
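The link between the alpha level and the Type I error rate can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a simple z-test with a known standard deviation (a test this guide does not cover), and the names two_tailed_p and type_i_error_rate, the sample size, and the seed are all hypothetical choices. Every simulated study draws its data from a population in which the null hypothesis is true, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist, mean


def two_tailed_p(sample, mu0=0.0, sigma=1.0):
    """Two-tailed p value for a z-test of H0: population mean == mu0 (sigma known)."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha, trials=2000, n=30, seed=7864):
    """Share of simulated studies that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_tailed_p(sample) < alpha:
            rejections += 1
    return rejections / trials


rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(f"alpha = .05 -> H0 wrongly rejected in {rate_05:.1%} of studies")
print(f"alpha = .01 -> H0 wrongly rejected in {rate_01:.1%} of studies")
```

With a smaller alpha level, fewer of the simulated studies reject the true null hypothesis, which is the trade-off described above. The simulation says nothing about Type II errors, because no false null hypothesis is simulated.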

Probability Values and the Null Hypothesis

The statistic used to determine whether or not to reject a null hypothesis is

referred to as the calculated probability value or p value, denoted p. When you run

an inferential statistic in statistical software, it will provide you with a p value for that statistic. If the test statistic has a probability value of less than 1 in 20 (.05), we can

say "p < .05, the null hypothesis is rejected." Keep in mind in the coming weeks that we

are looking for values less than .05 to reject the null hypothesis. This may seem

counterintuitive at first, because usually we assume that bigger is better. In the case of null hypothesis testing, the opposite is the case. Any p value less than .05 (such as

.02, .01, or .001) means that we reject the null hypothesis. Any p value greater than

.05 (such as .15, .33, or .78) means that we do not reject the null hypothesis. Make

sure you understand this point, as it is a common area of confusion among statistics

learners.
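The decision rule can be written as a single comparison. This is a minimal sketch; the function name decide is hypothetical, and the default alpha of .05 follows the convention used in this guide.

```python
def decide(p_value, alpha=0.05):
    """Return the null-hypothesis decision for a p value at a preset alpha level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p value must lie between 0 and 1")
    # Smaller p values, not larger ones, lead to rejecting the null hypothesis.
    return "reject H0" if p_value < alpha else "fail to reject H0"


for p in (0.001, 0.02, 0.15, 0.78):
    print(f"p = {p}: {decide(p)}")
```

Note that the same p value can lead to different decisions under different alpha levels: decide(0.02) rejects the null hypothesis at alpha = .05, but decide(0.02, alpha=0.01) does not.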

Based on your understanding of the null hypothesis, the alternative hypothesis, the alpha level, and the p value, you can begin to make statements about your research results. If your results fall within the rejection region, you can claim that they

are "statistically significant," and you reject the null hypothesis. In other words, you will conclude that your groups do differ in some way or that two variables are significantly

related. If the results do not fall within the rejection region, you cannot make this claim. Your data fail to reject the null hypothesis. In other words, you do not have sufficient evidence to conclude that the groups differ or that the two variables are related.

Preview of APA Skills

A big part of learning statistics is knowing how to effectively communicate your findings to others. Therefore, each week of this course study guide will focus on one

APA-related skill relevant to statistics. Here is a quick snapshot of what you can expect to learn:

Week 2 – APA Ethics

Week 3 – APA Format Requirements

Week 4 – Reporting Statistics in APA format

Week 5 – Scholarly Writing

Week 6 – APA Grammar and Usage - Verb Tense

Week 7 – Bias-free Language

Week 8 – APA In-text Citations

Week 9 – APA References

Week 2: Exploring Statistical Software and Descriptive Statistics

Screening Data

Before getting started, it is critically important that as a researcher you ensure you have high quality data. This comes from careful methodological planning and sound data collection. It is your responsibility to check your data to make sure everything is in order. Running frequency analyses can help to highlight data errors. For example, if you are examining test scores and see a value of 143, you will know there was a data entry error if the highest possible score on the test is 100. You may also find missing data this way and will then need to consider how to address it. For example, you might ignore it, delete the whole case, or impute the missing data with additional analyses.
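A frequency check like the one described above might look as follows in Python. This is a sketch with made-up scores; the variable names and the use of None to mark a missing value are assumptions for illustration, not part of the 7864 data set.

```python
from collections import Counter

MAX_SCORE = 100  # highest possible score on the hypothetical test

# Hypothetical raw entries; None marks a missing value.
scores = [88, 92, 143, 75, None, 92, 60]

# Frequency table: how often each entered value occurs.
frequencies = Counter(scores)

# Values outside the possible range point to data entry errors.
out_of_range = [s for s in scores if s is not None and not 0 <= s <= MAX_SCORE]

# Missing values need a documented handling decision.
missing_count = sum(1 for s in scores if s is None)

print(frequencies)
print("Out of range:", out_of_range)
print("Missing:", missing_count)
```

The check only flags problems. Deciding whether to correct, delete, or impute a value remains the researcher's judgment.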

You may also need to calculate scores for the instruments you use. Suppose you give a 10-item questionnaire to learners to assess their test anxiety that uses a scale of 0 to 5 for each item with higher scores indicating more anxiety. In order to have an overall measure of test anxiety, you will need to tell a statistical software program to create a new variable “total anxiety” that is the sum of these 10 items and will have a potential range of scores from 0 to 50. (Note: For the purposes of this class, the dataset we will use has already been screened. You will not need to create new variables.)
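As a sketch of the scoring step, the function below sums 10 item responses into a total score. With 10 items each scored 0 to 5, the lowest possible total is 10 × 0 = 0 and the highest is 10 × 5 = 50. The function name total_anxiety and the example responses are hypothetical.

```python
ITEMS = 10
MIN_ITEM, MAX_ITEM = 0, 5


def total_anxiety(responses):
    """Sum the item responses into one overall test anxiety score."""
    if len(responses) != ITEMS:
        raise ValueError(f"expected {ITEMS} item responses")
    if any(not MIN_ITEM <= r <= MAX_ITEM for r in responses):
        raise ValueError("each response must be between 0 and 5")
    return sum(responses)


learner = [3, 4, 2, 5, 0, 1, 3, 4, 2, 5]  # made-up responses for one learner
print("Total anxiety:", total_anxiety(learner))
print("Possible range:", ITEMS * MIN_ITEM, "to", ITEMS * MAX_ITEM)
```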

Measures of Central Tendency and Dispersion

Descriptive statistics that measure central tendency (mean, median, mode) and dispersion (range, sum of squares, variance, standard deviation) are important to understand. Measures of centrality summarize where data clump together at the center of a distribution of scores and measures of dispersion indicate the level of variability in the scores. In a normal distribution, the mean, median, and mode are the same. Approximately 68% of the data fall within one standard deviation of the mean, approximately 95% fall within two standard deviations of the mean, and approximately 99.7% fall within three standard deviations of the mean.
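The measures named above can be computed with Python's built-in statistics module. The scores are made up for illustration. Note one assumption: statistics.variance and statistics.stdev return the sample versions, which divide the sum of squares by N − 1.

```python
import statistics

scores = [4, 8, 6, 5, 3, 8, 7, 8, 5, 6]  # hypothetical quiz scores

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Dispersion
score_range = max(scores) - min(scores)
sum_of_squares = sum((x - mean) ** 2 for x in scores)  # squared deviations from the mean
variance = statistics.variance(scores)  # sum of squares / (N - 1)
std_dev = statistics.stdev(scores)      # square root of the variance

print(f"M = {mean}, Mdn = {median}, mode = {mode}")
print(f"range = {score_range}, SS = {sum_of_squares}")
print(f"variance = {variance:.2f}, SD = {std_dev:.2f}")
```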

Skewness and Kurtosis

An assumption of the statistical tests that you will study in this course is that the scores for a dependent variable are normal (or approximately normal) in shape. This assumption is first checked by examining a histogram of the distribution.

Departures from normality and symmetry are assessed in terms of skew and kurtosis. Skewness is the tilt or extent a distribution deviates from symmetry around the mean. A distribution that is positively skewed has a longer tail extending to the right (the "positive" side of the distribution). A distribution that is negatively skewed has a longer tail extending to the left (the "negative" side of the distribution). In contrast to skewness, kurtosis is defined as the peakedness of a distribution of scores. A distribution with negative kurtosis is a "flat" distribution (platykurtic), and a distribution with positive kurtosis has a "sharp" peak (leptokurtic).

The use of these terms is not limited to your description of a distribution following a visual inspection. They are included in your list of descriptive statistics and should be included when analyzing your distribution of scores. Skew and kurtosis scores of near zero indicate a shape that is symmetric or close to normal respectively.

In terms of assumptions of normality, skew and kurtosis values of −1 to +1 are considered ideal, whereas skew and kurtosis values ranging from −2 to +2 are considered acceptable for psychometric purposes.
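To make the two statistics concrete, the sketch below computes skewness and excess kurtosis from central moments and sorts a value into the ranges given above. It assumes the simple population (moment) formulas; many statistical packages report bias-adjusted sample versions, so software output can differ slightly from these numbers. The function names and data are hypothetical.

```python
def central_moment(values, k):
    """Average of the k-th power of deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** k for x in values) / len(values)


def skewness(values):
    """Third standardized moment: 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3: 0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


def normality_band(value):
    """Classify a skew or kurtosis value using the ranges in this guide."""
    if abs(value) <= 1:
        return "ideal"
    if abs(value) <= 2:
        return "acceptable"
    return "outside range"


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]  # one high score stretches the right tail

for name, data in (("symmetric", symmetric), ("right_tail", right_tail)):
    s, k = skewness(data), excess_kurtosis(data)
    print(f"{name}: skew = {s:.2f} ({normality_band(s)}), "
          f"kurtosis = {k:.2f} ({normality_band(k)})")
```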

Outliers

Outliers are defined as extreme scores on either the left or right tail of a distribution, and they can influence the overall shape of that distribution. There are a variety of methods for identifying and adjusting for outliers. Once an outlier is detected, the researcher must determine how to handle it. The outlier may represent a data entry error that should be corrected, or the outlier may be a valid extreme score. The outlier can be left alone, deleted, or transformed. Whatever decision is made regarding an outlier, the researcher must be transparent and justify their decision.

APA Focus of the Week: Ethics

Plagiarism is presenting others’ ideas as one’s own (American Psychological Association [APA], 2020). Plagiarism can be intentional, such as when someone buys a

paper from someone who has taken a course previously. Or, it can be unintentional, such as when a learner tries to paraphrase but does not give proper credit. Whether plagiarism is done purposefully or not, it is still unethical, and may have serious

consequences. See your APA 7 Manual, Section 8.2, for more information on avoiding

plagiarism (APA, 2020, pp. 254-256).

Self-plagiarism is presenting one’s own previous work as a new, original piece

(APA, 2020). As strange as it may seem, learners who choose to reuse portions of their own previous assignments must properly cite their own earlier work. In fact, resubmitting the same paper, or portions of the same paper, violates the academic

honesty policy and may be grounds for sanctions. To avoid self-plagiarism, always write

each discussion and assignment from scratch. See your APA 7 Manual, Section 8.3, for more information on avoiding self-plagiarism (APA, 2020, pp. 256-257).

Week 3: Correlation Introduction

Statistics and Ethics

Logic is an important part of statistics, and as a researcher, you are asked to

make a series of judgments as you work through your analyses. For each method of analysis we will explore, there are certain assumptions that must be tested to make

certain the test is appropriate. There are also particular ways to set up our hypotheses

and report our results. Your knowledge of statistics will help to ensure that you work

through these steps in a thoughtful and ethical manner.

Researchers have to note the limitations of their studies, potential sources of bias, and any methodological shortcomings they have identified. It is critical to not overstate results, assume causation, or draw unfounded conclusions. For example, if you found that individuals with high test anxiety were less likely to graduate from

college, you would not suggest that universities screen for test anxiety and not admit learners with high scores. That would be an irresponsible interpretation of the data. What you might instead conclude is that since test anxiety is related to dropping out, it would be wise for universities to provide assistance to those students with test anxiety

to help them to decrease it. As an ethical researcher, you must be thoughtful and

cautious in your interpretations as you proceed through the rest of the course and

beyond.

The first inferential statistic we will focus on is correlation, denoted r, which

estimates the strength of a linear association between two variables.

Interpreting Correlation

Interpreting a correlation requires an understanding of two concepts:

1. Magnitude.

2. Sign (±).

The magnitude refers to the strength of the linear relationship between variable

X and variable Y. The correlation ranges in values from −1.00 to +1.00. To determine

magnitude, ignore the sign of the correlation, and the absolute value of the correlation

indicates how much variable X and variable Y are linearly related. For correlations close to 0, there is little or no linear relationship. As the correlation approaches either −1.00 or +1.00, the magnitude of the correlation increases. For example, |−.65| > |+.25|, so the magnitude of r = −.65 is greater than the magnitude of r = +.25.

In contrast to magnitude, the sign of a non-zero correlation is either negative or positive. These labels are not interpreted as "bad" or "good." Instead, the sign

represents the slope of the linear relationship between X and Y. A scatter plot is used to

visualize the slope of this linear relationship, and it is a two-dimensional graph with dots

representing the combined X, Y score. Interpreting scatter plots is necessary to check

the assumptions of correlation discussed below.

A positive correlation indicates that as values of X increase, the values of Y also

increase (e.g., height and weight). Conversely, a negative correlation indicates that as

values of X increase, the values of Y decrease. Finally, when X and Y values are

randomly distributed on the scatter plot (that is, there is no linear relationship), then r =

0.00.
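Magnitude and sign can both be seen by computing r directly. The sketch below uses the standard Pearson formula (the sum of cross-products divided by the square root of the product of the two sums of squares). The function name pearson_r and the data are hypothetical, and the function assumes neither variable is constant.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(ss_x * ss_y)


hours = [1, 2, 3, 4, 5]
scores_rising = [52, 60, 61, 70, 77]   # Y increases as X increases
scores_falling = [77, 70, 61, 60, 52]  # Y decreases as X increases

r_up = pearson_r(hours, scores_rising)
r_down = pearson_r(hours, scores_falling)
print(f"rising:  r = {r_up:+.2f}")
print(f"falling: r = {r_down:+.2f}")
print("same magnitude:", abs(abs(r_up) - abs(r_down)) < 1e-12)
```

The two correlations have opposite signs but the same magnitude, so they describe equally strong linear relationships with opposite slopes.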

Assumptions of Correlation

All inferential statistics, including correlation, operate under assumptions that are

checked prior to interpreting analyses. Violations of assumptions can lead to erroneous

inferences regarding a null hypothesis. The first assumption of correlation is

independence of observations for X and Y scores. Each participant's X and Y scores should not be influenced by any other participant's scores, by errors in measurement, or by problems in research design.

The second assumption is that, for Pearson's r, X and Y are quantitative and

each variable is normally distributed. Other correlations discussed below do not require

this assumption, but Pearson's r is the most widely used and reported type of correlation. This assumption is checked by a visual inspection of X and Y histograms

and calculations of skew and kurtosis values. If your data are not normally distributed, you can use a non-parametric test like Spearman’s rank order correlation.

The third assumption of correlation is that X and Y scores are linearly related. Correlation does not detect strong curvilinear relationships. This assumption is checked

by a visual inspection of the X, Y scatter plot.

The fourth assumption of correlation is that the X and Y scores should not have

extreme bivariate outliers that influence the magnitude of the correlation. Bivariate

outliers are also detected by a visual examination of a scatter plot. Outliers can

dramatically influence the magnitude of the correlation, which sometimes leads to errors

in null hypothesis testing. Bivariate outliers are particularly problematic when a sample

size is small.

The fifth assumption of correlation is that the variability in Y scores is uniform

across levels of X. This requirement is referred to as the homogeneity of variance

assumption, which is usually difficult to assess in scatter plots with a small sample size. Sometimes a potential violation can be detected, but this assumption is typically

emphasized when checking the homogeneity of variance for a t-test or analysis of variance (ANOVA) studied later in the course.

Hypothesis Testing of Correlation

The null hypothesis for correlation predicts no significant linear relationship

between X and Y, or H0: rXY = 0. A directional alternative hypothesis for correlation is

either an expected significant positive relationship (H1: rXY > 0) or significant negative

relationship (H1: rXY < 0). A nondirectional alternative hypothesis predicts the correlation

is significantly different from 0 but does not stipulate the sign (H1: rXY ≠ 0).

For correlation as well as t-tests and ANOVA studied later in the course, the

standard alpha level for rejecting the null hypothesis is set to .05. Statistical software

output for a correlation showing a p value of less than .05 indicates that the null hypothesis should be rejected; there is a significant relationship between X and Y. A

p-value greater than .05 indicates that the null hypothesis should not be rejected; there

is not a significant relationship between X and Y.

Alternative Correlation Coefficients

The most widely used correlation is referred to as Pearson's r. Pearson's r is

calculated between X and Y variables that are measured on either the interval or ratio

scale of measurement (for example, height and weight). Other types of correlation

depend on other scales of measurement for X and Y. A point biserial correlation is

calculated when one variable is dichotomous (such as “yes/no”) and the other variable

is interval/ratio data (such as weight). If both variables are ranked (ordinal) data, the

correlation is referred to as Spearman's rho or Kendall’s tau. Although the underlying scales of measurement differ from the standard Pearson's r, these other forms of correlation also range from −1.00 to +1.00 and are interpreted similarly.

If both variables are dichotomous, the correlation is referred to as phi (φ). A final test of association is referred to as chi-square. Phi and chi-square are studied in

advanced inferential statistics.

APA Focus of the Week: Format Requirements

The APA 7 Manual outlines many specific format requirements, which help

ensure readability of papers. A paper which uses consistent font, spacing, margins, and

indentation helps the reader to be able to focus on the message of the paper, instead of being distracted by those elements. Here are some of the format requirements for papers in this course:

· Font: APA allows for some flexibility in font choices, provided the font is

accessible to all users, and the same font is used consistently for the entire

paper. We recommend 12-point Times New Roman, but see your APA 7 Manual, Section 2.19, for more information on acceptable fonts (APA, 2020, p. 44).

· Spacing: Papers should be double-spaced throughout, including the title page

and reference page. See your APA 7 Manual, Section 2.21, for more information

on spacing (APA, 2020, p. 45).

· Page margins: Use 1-inch margins on all sides. See your APA 7 Manual, Section 2.22, for more information on margins (APA, 2020, p. 45).

· Paragraph Indentation: Indent the first line of each paragraph one-half inch. See

your APA 7 Manual, Section 2.24, for more information on indentation (APA, 2020, pp. 45-46).

Week 4: Correlation Application

Proper Reporting of Correlations

Reporting a correlation requires an understanding of the following elements: the

statistical notation for a Pearson's correlation (r), the degrees of freedom (df), the

correlation coefficient, and the probability value. For example, you might report:

“There was a statistically significant positive correlation between test anxiety and

test scores, r(98) = .32, p = .01 (two-tailed).”

r, Degrees of Freedom, and Correlation Coefficient

The statistical notation for Pearson's correlation is r, and following it is the

degrees of freedom for this statistical test. The degrees of freedom for Pearson's r is

N − 2. For example, if there were 100 participants in the sample, then the df would be

98 (100 − 2 = 98). Next is the actual correlation coefficient including the sign. After the

correlation coefficient is the probability value (p). So, you might have a correlation of r(98) = +.20, p = .04.
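Statistical software derives the p value from r and the degrees of freedom. The sketch below shows one way this can be done by hand, under assumptions that go beyond this guide: it uses the standard conversion t = r × sqrt(df) / sqrt(1 − r²) with df = N − 2, and it approximates the two-tailed p value by numerically integrating the t distribution. The function names are hypothetical, and software output may differ in the last decimal places.

```python
from math import exp, lgamma, pi, sqrt


def t_density(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coefficient = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coefficient * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value using Simpson's rule on the area between 0 and |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)


def correlation_test(r, n):
    """Return df, t, and the two-tailed p value for a Pearson correlation."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r * r)
    return df, t, two_tailed_p(t, df)


df, t, p = correlation_test(r=0.20, n=100)
print(f"r({df}) = .20, t = {t:.2f}, p = {p:.3f} (two-tailed)")
```

For r = .20 with 100 participants, the degrees of freedom are 98 and the p value falls between .04 and .05, which is in line with the example above.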

Probability Values

Prior to the widespread use of statistical software programs, p values were often

calculated by hand. The convention in reporting p values was to simply state, p < .05 to

reject the null hypothesis and p > .05 to not reject the null hypothesis. However, statistical software provides an exact probability value that should be reported instead. Hypothetical examples would be p = .021 to reject the null hypothesis and p = .514 to not reject the null hypothesis.

The "(two-tailed)" notation after the p value indicates that the researcher was

testing a nondirectional alternative hypothesis (H1: rXY ≠ 0). The researcher did not have any a priori justification to test a directional hypothesis of the relationship between test anxiety and test scores. In terms of alpha level, the region of rejection was therefore 2.5% on the left side of the distribution and 2.5% on the right side of the distribution (2.5% + 2.5% = 5%, or alpha level of .05). A "(one-tailed)" notation indicates a directional alternative hypothesis. In this case, all 5% of the region

of rejection is established on either the left (negative) side (H1: rXY < 0) or the right (positive) side (H1: rXY > 0) of the distribution. A directional hypothesis must be justified

prior to examining the results. In this course, we will always specify a two-tailed

(non-directional) test, which is more conservative relative to a one-tailed test. The

advantage is that a non-directional test detects relationships or differences on either side of the distribution, which is recommended in exploratory research.

APA Focus of the Week: Reporting Standards in APA Format

Here are the most common statistical notations we will be using in this class, based on Table 6.5 in the APA 7 manual (APA, 2020, pp. 183-186).

APA abbreviation or symbol    Definition
M                             Mean
SD                            Standard Deviation
df                            Degrees of Freedom
p                             Probability or Significance value
skewness                      Skewness
kurtosis                      Kurtosis
r                             Pearson’s correlation coefficient
t                             t-Test value
Levene’s F                    Levene’s test of homogeneity of variance
W                             Shapiro-Wilk
F                             ANOVA statistic
ANOVA                         Analysis of Variance

Reporting examples:

Mean and standard deviation: Group A (M = 25.63, SD = 1.77) had similar…

Pearson’s correlation: r(103) = .76, p < .001. There is a significant correlation between…

Levene’s assumption of homogeneity of variance: Levene’s F = .84, p = .67. The assumption of homogeneity of variance is met.

t-Test: t(98) = 2.23, p = .048. There is a significant difference between…

Shapiro-Wilk test of normality: W = .95, p = .03. The assumption of normality is violated.

ANOVA: F(3, 96) = 25.64, p = .032. There is a significant difference in at least one…
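The reporting patterns above can be generated from raw numbers with a few small helpers. This is a sketch, and the function names are hypothetical. It applies two APA conventions: no leading zero for statistics that cannot exceed 1 in absolute value (such as r and p), and p < .001 for very small p values.

```python
def apa_decimal(value, places=2):
    """Format a number, dropping the leading zero when |value| is below 1."""
    text = f"{value:.{places}f}"
    return text.replace("0.", ".", 1) if abs(value) < 1 else text


def apa_p(p):
    """Report an exact p value, or p < .001 for very small values."""
    return "p < .001" if p < 0.001 else f"p = {apa_decimal(p, 3)}"


def report_r(r, n, p):
    """Pearson correlation with df = N - 2."""
    return f"r({n - 2}) = {apa_decimal(r)}, {apa_p(p)}"


def report_t(t, df, p):
    """Independent samples t-test; t can exceed 1, so it keeps its leading digit."""
    return f"t({df}) = {t:.2f}, {apa_p(p)}"


print(report_r(0.76, 105, 0.0004))
print(report_t(2.23, 98, 0.048))
```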

Week 5: t-Test Introduction

Logic of the t-test

Imagine that a researcher compares the test scores of Class A, which is using a

new teaching method, to Class B, which uses the traditional method. The mean test score for Class A is 93 and the mean test score for Class B is 81. Is there a significant difference in test scores between Class A and Class B? While the scores sound

different, we need to run analyses to be sure it is a significant finding.

To answer this question, the researcher conducts an independent samples t-test. The independent samples t-test compares two group means in a between-subjects

(between-S) design. In this between-S design, participants in two independent groups

are measured only once on some outcome variable.

By contrast, a paired samples t-test compares group means in a within-subjects

(within-S) design for a single group. Each participant is measured twice on some

outcome variable, such as a pretest-posttest design. For example, a researcher could

measure self-esteem for a class of learners prior to taking a statistics course (pretest) and then measure self-esteem again after completing the statistics course (posttest). The paired samples t-test determines if there is a significant difference in mean scores

from the pretest to the posttest.

There are two variables in an independent samples t-test: the predictor variable (X) and the outcome variable (Y). The predictor variable must be dichotomous, meaning that it can only have two values or groups (for example, no review session = 1; review session = 2). Group membership must be mutually exclusive. In

non-experimental designs, group membership is based on some naturally occurring

characteristic of a group. In experimental designs, participants are randomly assigned to

one of two group conditions (e.g., treatment group = 1; control group = 2). In contrast to

the dichotomous (nominal) predictor variable, the outcome variable must be continuous

to calculate a group mean (e.g., IQ scores, self-esteem scores).
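The guide does not give the formula for t, so the sketch below assumes the standard pooled-variance form: the difference between the two group means divided by its standard error, with df = n1 + n2 − 2. The scores are made up so that the group means match the Class A (93) and Class B (81) example, and the function name independent_t is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_variance = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, df


class_a = [95, 91, 94, 90, 95]  # new teaching method, M = 93
class_b = [80, 84, 79, 83, 79]  # traditional method, M = 81

t, df = independent_t(class_a, class_b)
print(f"t({df}) = {t:.2f}")
```

The size of t depends on the variability within each class as well as on the 12-point difference in means, which is why the means alone cannot establish significance.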

Assumptions of the t-test

All inferential statistics, including the independent samples t-test, operate under assumptions checked prior to calculating the t-test. Violations of assumptions can lead

to erroneous inferences regarding a null hypothesis. The first assumption is

independence of observations.

For predictor variable X in an independent samples t-test, participants are

assigned to one and only one "condition" or "level," such as a treatment group or control group. This assumption is not statistical in nature; it is controlled by proper research

procedures that maintain independence of observations.

The second assumption is that outcome variable Y is continuous and normally

distributed. This assumption is checked by a visual inspection of the Y histogram and

calculation of skewness and kurtosis values. A researcher may also conduct a

Shapiro-Wilk test to check whether a distribution is significantly different from normal. The null hypothesis of the Shapiro-Wilk test is that the distribution is normal. If the

Shapiro-Wilk test is significant, then the normality assumption is violated. In other words, a researcher hopes the Shapiro-Wilk test will not be significant, or that p > .05. If your data are not normally distributed, you can use a non-parametric alternative like the Mann–Whitney U test.

The third assumption is referred to as the homogeneity of variance assumption. Ideally, the amount of variance in Y scores is approximately equal for Group 1 and

Group 2. This assumption is checked in statistical software with Levene’s test. The null hypothesis of the Levene test is that group variances are equal. If the Levene’s test is

significant, then the homogeneity assumption is violated. In this case, researchers will often report the Welch version of the t-test.
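To show what Levene's test measures, the sketch below computes its statistic as a one-way ANOVA F ratio on each score's absolute deviation from its own group mean. This is an assumption about the variant: some software centers on the group median instead, and the p value (taken from the F distribution) is left to statistical software. The function name levene_w and the data are hypothetical.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's statistic (mean-centered) with its two degrees of freedom."""
    # Step 1: turn each score into its absolute distance from the group mean.
    deviations = [[abs(y - mean(g)) for y in g] for g in groups]
    all_deviations = [z for d in deviations for z in d]
    k, n_total = len(deviations), len(all_deviations)
    grand_mean = mean(all_deviations)
    # Step 2: run a one-way ANOVA on those distances.
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2


group_1 = [10, 12, 11, 13, 14]     # scores cluster tightly
group_2 = [5, 15, 9, 20, 1, 10]    # scores are widely spread

w, df1, df2 = levene_w(group_1, group_2)
print(f"Levene's F({df1}, {df2}) = {w:.2f}")
```

A larger statistic means the groups differ more in how far their scores sit from their own means, which is evidence against equal variances.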

Hypothesis Testing for a t-test

The null hypothesis for a t-test predicts no significant difference in population

means, or H0: μ1 = μ2. A directional alternative hypothesis for a t-test is that the

population means differ in a specific direction, such as H1: μ1 > μ2 or H1: μ1 < μ2. A

non-directional alternative hypothesis simply predicts that the population means differ, but it does not stipulate which population mean is significantly greater (H1: μ1 ≠ μ2). For example, H1 might be that, “There is a significant mean difference in test scores

between Class A and Class B.” For a t-test, the standard alpha level for rejecting the

null hypothesis is set to .05. Statistical software output for a t-test showing a p < .05

indicates that the null hypothesis should be rejected. A p > .05 indicates that the null hypothesis should not be rejected.

APA Focus of the Week: Scholarly Writing

Scholarly writing can be very different from writing for creative purposes. Here

are some of the main points to remember, from section 4 of the APA 7 manual:

· Conciseness and clarity – Write sentences as succinctly as possible, avoiding

ambiguous words. You also want to avoid using passive voice, or feelings like “I believe” (APA, 2020, pp. 113-114).

· Wordiness and Redundancy – Wordiness means using too many words when

fewer words would suffice. For example, instead of writing, “at the present time,” you could shorten it to, “now.” Redundancy involves using repetition, such as

“The reason is because…” The redundancy could be eliminated by simply

stating, “The reason is…” (APA, 2020, pp. 114-115).

· Tone – Present your findings in a clear and direct manner. Avoid slang, idioms, or clearly informal language (APA, 2020, pp. 115-116).

· Contractions – In scholarly writing, avoid contractions. For example, instead of “don’t,” write out “do not” (APA, 2020, p. 116).

Week 6 – t-Test Application

Testing Assumptions: The Levene Test

The homogeneity of variance assumption is tested with the Levene’s test. The

null hypothesis for the Levene test is that group variances are equal. A significant Levene’s test (p < .05) indicates that the homogeneity of variance assumption is

violated. In this case, researchers will run the “Welch” version of the t-test. This version

of the t-test uses a more conservative adjusted degrees of freedom (df) that compensates for the homogeneity violation. The adjusted df can often result in a

decimal number (such as df = 13.4).
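The adjusted degrees of freedom mentioned above can be sketched with the Welch-Satterthwaite approximation, which is the formula commonly used for the Welch t-test. The guide does not state which formula its software applies, so treat this as an assumption; the function name and the variances below are hypothetical.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite adjusted degrees of freedom for two groups."""
    a = var1 / n1  # squared standard error contributed by Group 1
    b = var2 / n2  # squared standard error contributed by Group 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical sample variances with 10 participants per group.
unequal_df = welch_df(4.0, 10, 36.0, 10)  # very different variances
equal_df = welch_df(4.0, 10, 4.0, 10)     # identical variances

print(round(unequal_df, 2))
print(round(equal_df, 2))
```

With equal variances and equal group sizes the adjusted df matches the usual n1 + n2 − 2 = 18. With very unequal variances it drops to a decimal value of about 11, which is why the Welch version is described as more conservative.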

Proper Reporting of the Independent Samples t-test

Reporting a t-test requires an understanding of several elements, including the

statistical notation for an independent samples t-test (t), the degrees of freedom in

parentheses, the t value, and the probability value. To provide context, provide the

means and standard deviations for each group.

For example, you might report:

The mean test score for Class A (M = 93.32, SD = 2.1) was higher than the mean

test score for Class B (M = 81.46, SD = 1.9). This was a statistically significant difference, t(98) = 2.75, p = .013.

t, Degrees of Freedom, and t Value

The statistical notation for an independent samples t-test is t, and following it is

the degrees of freedom for this statistical test. The degrees of freedom for t is n1 + n2 −

2, where n1 equals the number of participants in Group 1 and n2 equals the number of participants in Group 2. In the example above, there are 50 people in each group: N =

100 (n1 = 50; n2 = 50), so the df = 98 [(n1 + n2) − 2]. It is not recommended that the

t-test be conducted with groups of fewer than 10 members. The t value is the

difference in group means divided by the standard error of the difference in sample

means. The t value can be either positive or negative.
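To make the definition of t concrete, here is a minimal sketch of the equal-variance (pooled) independent samples t-test computed by hand. The scores are invented for illustration and are not the Class A and Class B data reported above; the function name is hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def pooled_t(group1, group2):
    """Return (t, df) for an independent samples t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the difference between the two sample means.
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (fmean(group1) - fmean(group2)) / se_diff
    return t, df


# Hypothetical test scores, 10 participants per group.
group_1 = [72, 74, 76, 78, 80, 82, 84, 86, 88, 90]
group_2 = [69, 71, 73, 75, 77, 79, 81, 83, 85, 87]

t_value, df = pooled_t(group_1, group_2)
t_reversed, _ = pooled_t(group_2, group_1)
print(round(t_value, 3), df)
print(round(t_reversed, 3))
```

Swapping the order of the groups flips the sign of t but not its size, which is why the t value can be either positive or negative.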

Probability Value

Statistical software determined the exact p value to be .013. This p value is less

than .05, which indicates that the null hypothesis should be rejected for the alternative

hypothesis—that is, the two groups are significantly different in test scores.

APA Focus of the Week: Grammar and Usage - Verb Tense

When writing up results, use the appropriate verb tense for each section. For example, when writing a proposal, write in future tense, using words such as “This study

will…” When discussing previous research, use past tense, such as “The study found…” Finally, when interpreting results of the current study, use present tense, such as, “These results indicate…” See Table 4.1 in the APA 7 manual for more information

(APA, 2020, p. 118).

Week 7: One-Way ANOVA Introduction

Advantage of ANOVA

Recall that a t-test requires a predictor variable that is dichotomous (i.e., it has

only two levels or groups). The advantage of ANOVA over a t-test is that the categorical predictor variable can have two or more groups. Just like a t-test, the outcome variable

in ANOVA is continuous and requires the calculation of group means.

Logic of a "One-Way" ANOVA

The ANOVA, or F test, relies on predictor variables referred to as factors. A

factor is a categorical (nominal) predictor variable. The term "one-way" is applied to an

ANOVA with only one factor that is defined by two or more mutually exclusive groups. Technically, an ANOVA can be calculated with only two groups, but the t-test is usually

used. The one-way ANOVA is usually calculated with three or more groups, which are

often referred to as levels of the factor.

If the ANOVA includes multiple factors, it is referred to as a factorial ANOVA. An

ANOVA with two factors is referred to as a "two-way" ANOVA; an ANOVA with three

factors is referred to as a "three-way" ANOVA, and so on. Factorial ANOVA is studied in

advanced inferential statistics. In this course, we will focus on the theory and logic of the

one-way ANOVA.

ANOVA is one of the most popular statistics used in social sciences research. In

non-experimental designs, the one-way ANOVA compares group means between

naturally existing groups, such as political affiliation (Democrat, Independent, Republican). In experimental designs, the one-way ANOVA compares group means for participants randomly assigned to different treatment conditions (for example, high

caffeine dose; low caffeine dose; control group).

Avoiding Inflated Type I Error

You may wonder why a one-way ANOVA is necessary. For example, if a factor has four groups (k = 4), why not just run independent sample t-tests for all pairwise

comparisons such as Group A versus Group B, Group A versus Group C, Group B

versus Group C, and so on? A factor with four groups involves six pairwise

comparisons. The issue is that conducting multiple pairwise comparisons with the same

data leads to inflated risk of a Type I error (i.e., incorrectly rejecting a true null hypothesis—getting a false positive). The ANOVA protects the researcher from inflated

Type I error by calculating a single omnibus test that assumes all k population means

are equal.
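The inflation described above can be approximated with a short calculation. The sketch counts the pairwise comparisons for k groups and then estimates the chance of at least one false positive across them. The formula 1 − (1 − α)^m assumes the comparisons are independent, which pairwise tests on the same data are not, so the result is only a rough guide. The function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(k):
    """Number of distinct pairs that can be formed from k groups."""
    return comb(k, 2)


def familywise_error(alpha, m):
    """Approximate chance of at least one Type I error across m tests,
    assuming the tests are independent (a simplifying assumption)."""
    return 1 - (1 - alpha) ** m


n_pairs = pairwise_comparisons(4)
inflated_risk = familywise_error(0.05, n_pairs)
print(n_pairs, round(inflated_risk, 3))
```

For four groups there are six comparisons, and under this approximation the risk of at least one Type I error rises from .05 to roughly .26.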

Although the advantage of the omnibus test is that it helps protect researchers

from inflated Type I error, the limitation is that a significant omnibus test does not specify

exactly which group means differ, just that there is a difference "somewhere" among the

group means. A researcher therefore relies on either (a) planned contrasts of specific

pairwise comparisons determined prior to running the F test or (b) follow-up tests of pairwise comparisons, also referred to as post-hoc tests, to determine exactly which

pairwise comparisons are significant.

Hypothesis Testing in a One-Way ANOVA

The null hypothesis of the omnibus test is that all k (group) population means are

equal, or H0: μ1 = μ2 = … = μk. By contrast, the alternative hypothesis is usually articulated

by stipulating that "at least one" pairwise comparison of population means is unequal. Keep in mind that this prediction does not imply that all groups must significantly differ from one another on the outcome variable.

Assumptions of a One-Way ANOVA

The assumptions of ANOVA reflect assumptions of the t-test. ANOVA assumes

independence of observations. ANOVA assumes that outcome variable Y is normally

distributed. ANOVA assumes that the variance of Y scores is equal across all levels (or groups) of the factor. These ANOVA assumptions are checked in the same process

used to check assumptions for the t-test discussed earlier in the course—using the

Shapiro-Wilk test and the Levene test. If your data is not normally distributed, you can

use a non-parametric alternative like the Kruskal-Wallis test. When the homogeneity

of variance assumption is violated, researchers will often rely on the Welch test.

APA Focus of the Week: Bias-free Language

The APA 7 manual contains general guidelines for reducing bias in scholarly

writing. Some of the suggestions are:

Describe at the appropriate level of specificity (APA, 2020, pp. 132-133). While it is necessary to acknowledge relevant differences, make sure to be as specific as

possible as to what the data show. Refrain from including irrelevant personal feelings.

Be sensitive to labels (APA, 2020, pp. 133-134). Avoid using adjectives to label people or groups of people. Whenever possible, use person-first language such

as “students with learning disabilities,” instead of “learning disabled students.” However, note that some groups prefer identity-first language, such as “an

autistic person.” When possible, use the label preferred by members of the

community.

Avoid negative, stereotyping, and condescending terminology (APA, 2020, p. 137). This means you will need to become aware of the currently accepted

terminology within the field. For example, instead of writing “drug addict,” which

has a negative connotation, when writing scholarly papers, you should write, “person with a substance use disorder.” In addition, instead of writing, “high-school dropouts,” you should write, “those with a grade-school education.” See the APA 7 manual, section 5, for more information.

Week 8: ANOVA Application

Proper Reporting of the One-Way ANOVA

Reporting a one-way ANOVA requires an understanding of several elements. To provide

context for the F test, provide the means and standard deviations for each level of a

given factor. The following elements are included in reporting the F test:

• The statistical notation for a one-way ANOVA (F).

• The degrees of freedom.

• The F value.

• The probability value (p).

If the omnibus F test is significant, follow with a discussion of post-hoc tests. For example, you might report:

“The overall ANOVA was statistically significant, F(3, 24) = 11.94, p < .001. In

addition, all possible pairwise comparisons were made using the Tukey HSD test. Based on this test…”

F, Degrees of Freedom, and F Value

The statistical notation for a one-way ANOVA is F, and following it is the degrees

of freedom for this statistical test, such as (3, 24) reported above. Unlike correlation or a

t-test, there are two degrees of freedom reported for a one-way ANOVA. The first reported df is the between-groups df, or dfbetween, which is the number of groups (or levels) minus one (k − 1). In the example above, the factor consists of k = 4 levels (4 − 1

= 3). The second reported df is the within-groups df, or dfwithin, which is the sample size

minus the number of groups or levels (N − k). In the example above, N = 28, so 28 − 4 =

24. The F value is calculated as a ratio of mean squares, which are both positive. Therefore, any non-zero F value is always positive.
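The two degrees of freedom and the ratio of mean squares can be traced in a small hand calculation. The sketch below is one way to compute a one-way ANOVA from raw scores; the three groups are invented for illustration and do not reproduce the F(3, 24) example above. The function name is hypothetical.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = fmean(all_scores)
    # Between-groups variability: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variability: scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical scores for three levels of one factor.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova(groups)
print(round(f_value, 2), df_between, df_within)
```

Because both mean squares are built from squared deviations, the ratio cannot be negative, which matches the point made above about the sign of F.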

Probability Value

The statistical software program reported p < .001. This p value is less than .05, which indicates that the null hypothesis should be rejected for the alternative

hypothesis—that is, at least one of the four group means is significantly different from

the other group means.

Post-Hoc Tests

When the omnibus F is significant, it does not indicate exactly which pairwise

comparisons are significant. A Tukey test is one of many post-hoc tests used. The

output for the Tukey test indicates which pairwise comparisons are statistically

significant, and this information can be reported in narrative form (i.e., without p values

or other specific statistical notation). If the homogeneity of variance assumption is

violated, researchers running a post hoc test will often report the Games-Howell version of the test.

APA Focus of the Week: In-text Citations

The APA 7 manual has streamlined in-text citations, so it is easy to give proper

credit to sources. The author-date citation system has two formats – parenthetical citations or narrative citations. In general, parenthetical citations should follow these

guidelines:

Work with one author – (Last name, date). Example: (Koehler, 2016)

Work with two authors – (Last name & Last name, date). Example: (Sampson &

Hughes, 2020)

Work with three or more authors – (Last name et al., date). Example: (Pérez et

al., 2014)

For more information on parenthetical and narrative citations, see Table 8.1 in your APA

7 manual (APA, 2020, p. 266).

Week 9: Regression Introduction

Logic of a Simple Linear Regression

Both correlation and regression assess the direction and strength of relationships

between variables. However, the focus of regression is prediction of a continuous

dependent variable (your outcome of interest, Y) by a continuous independent (or predictor) variable (X). It is important to note that regressing X on Y will yield different results than regressing Y on X (whereas in correlation X and Y are interchangeable). Prediction is determined using a regression equation, which defines the best-fitting line

through the data points, called the regression line. The slope of that line is the predicted change in Y for a one-unit increase in X.
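A minimal sketch of how a regression equation can be obtained by ordinary least squares is given below. It also shows that swapping the roles of the two variables produces a different equation. The variable names and data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import fmean


def least_squares_line(predictor, outcome):
    """Return (intercept, slope) of the least-squares line for outcome ~ predictor."""
    mean_p, mean_o = fmean(predictor), fmean(outcome)
    # Sum of cross-products and sum of squares of the predictor.
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predictor, outcome))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    slope = s_po / s_pp
    intercept = mean_o - slope * mean_p
    return intercept, slope


# Hypothetical data: hours studied (X) and quiz score (Y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

intercept_yx, slope_yx = least_squares_line(hours, scores)  # predict Y from X
intercept_xy, slope_xy = least_squares_line(scores, hours)  # predict X from Y
print(round(intercept_yx, 2), round(slope_yx, 2))
print(round(intercept_xy, 2), round(slope_xy, 2))
```

Predicting scores from hours gives a slope of 0.6, while predicting hours from scores gives a slope of 1.0, so the two regressions are not interchangeable.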

There are a number of different types of regression that you can read more

about in a more advanced statistics class. Multiple linear regression allows you to

look at multiple predictors of a continuous outcome simultaneously. Using it you could

explore how income, GRE scores, and hours spent studying impact statistics test scores. Binomial logistic regression allows one to predict membership in a

dichotomous group and is used when the dependent variable is categorical. So it could

be used if you were interested in what factors predict successful completion of graduate

school (graduation: yes/no). Polynomial regression is used to examine curvilinear relationships. The type of regression you conduct depends on the nature of your variables and the questions you are asking.

Hypothesis Testing in Simple Linear Regression

The null hypothesis for linear regression is that X has no predictive value for Y,

written as H0: β1 = 0. By contrast, the alternative hypothesis is H1: β1 ≠ 0, meaning that changes in X are predictive of changes in Y. While linear regression does allow you

to discuss prediction of variables, strict methodological procedures must be in place to

truly speak to causation (e.g., randomized experimental design). Therefore, you still must be cautious in your interpretation.

Assumptions of a Simple Linear Regression

A number of the assumptions of regression are similar to those of correlation. Both the independent and dependent variables must be continuous in linear regression. As with correlations, there must be independence of observations of X and Y scores. This can be tested by the Durbin-Watson statistic, which should be close to two if this

assumption is true. There must also be a linear relationship between variables, which

is tested by visual inspection of the X, Y scatter plot. Also, like correlation, it is important that there are no significant outliers. This is tested by running casewise diagnostics that will highlight any cases that are outliers.
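For readers curious about where the "close to two" guideline for the Durbin-Watson statistic comes from, the sketch below computes it from a list of regression residuals taken in case order. The residuals are invented for illustration, and the function name is hypothetical; statistical software reports this value directly.

```python
def durbin_watson(residuals):
    """Sum of squared differences between successive residuals,
    divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, listed in the order the cases were collected.
no_pattern = [1, -1, -1, 1, 1, -1, -1, 1]  # neighbors are unrelated on average
trending = [-3, -2, -1, 1, 2, 3]           # each residual resembles the previous one

dw_no_pattern = durbin_watson(no_pattern)
dw_trending = durbin_watson(trending)
print(round(dw_no_pattern, 2), round(dw_trending, 2))
```

Residuals with no carry-over from one case to the next give a value of 2, while residuals that drift steadily give a value well below 2, signaling that the observations may not be independent.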

Unique to regression, there must also be homoscedasticity of residuals (i.e., equal error variances). This is tested with a visual inspection of standardized residuals

plotted against the unstandardized predicted values. The residuals (errors) of the

regression line also should be approximately normally distributed. This is tested by

examining the histogram (the mean should be close to zero and the standard deviation

close to one) and the Normal P-P Plot to see if the points are approximately aligned with

the diagonal.

APA Focus of the Week: References

APA uses a specific format for references, to be sure all necessary information is

included to properly credit sources. In general, there are four main parts to each

reference:

Author – Who is responsible for the work

Date – When it was published

Title – What it is called

Source – Where others can find it

In scholarly writing, whenever you mention something which is not common

knowledge, you need to cite that source, and include it in the reference list. That way, if your readers are interested in knowing more, they can find the work and read it themselves. For more information on how to provide references, see Chapters 9 and 10

of your APA 7 manual (APA, 2020, Chapters 9 & 10). See Figure 9.1 of the APA 7

manual for an excellent visual example of where to find these relevant details for a

journal article (APA, 2020, p. 283).

Reference

American Psychological Association. (2020). Publication manual of the American

Psychological Association (7th ed.). https://doi.org/10.1037/0000165-000