research lab


Introduction

Introduction to Populations and Samples

It would take too long and cost too much money to test the quality of every piece of cereal made at a factory. Instead, a small sample of each batch is tested.

Wouldn't it be great if we could ask everyone in the world their opinion on a topic? What if we could have every person take a psychological test of interest so we can assemble the most accurate data? How can we make sure that we include every man, woman, child, race, ethnicity, socioeconomic status, class, religion, occupation, or other demographic of interest in any study we conduct? We want to make sure that the data we collect is as good as we can get under the given circumstances. Because we cannot include everyone of interest in a study, we must make sure our sample, or the group of those who participate in our study, is as close to "looking" like the population, or the entire collection of people of interest, as possible.

Consider this example. You are doing a study on the differences between men and women regarding their ability to follow directions. If you collected data from all males and all females in the world (which would be the entire population, because sex is our main variable of interest), you would get an extremely accurate result. However, it would be unrealistic, time consuming, and costly to collect this data. You could, however, take a sample of males and females and study them. If you choose a good sample, the results of your study can yield an accurate representation of the population.
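As a minimal sketch of this idea, the Python snippet below builds a made-up population of direction-following scores, draws a simple random sample, and compares the sample mean with the population mean. The scores are invented for illustration (they are not real data), and the function name draw_sample is an assumption introduced here.

```python
import random
from statistics import mean

def draw_sample(population, size, seed=0):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return rng.sample(population, size)

# Hypothetical population: 10,000 scores spread evenly from 0 to 99.
population = [i % 100 for i in range(10_000)]
sample = draw_sample(population, size=1_000)

population_mean = mean(population)
sample_mean = mean(sample)
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

With a sample this size, the two means should usually land close together, which is the sense in which a good sample "looks like" the population.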

Collecting a sample that closely resembles the population we are interested in is an important component of conducting research. Much consideration must be given to the individuals you want to choose for your sample and how to ensure that your sample represents the population. By choosing a good sample, we can make certain assumptions about the population, just as if we had selected everyone in that population. This is the focus of sampling: to select an appropriate cross-section of the population that will accurately represent the entire population.

In the following lesson you will learn how to sample a population using a range of sampling methods. Be sure to pay specific attention to the advantages and disadvantages of each method and when each is most useful.

Applying Knowledge of Populations and Samples

Populations and Samples in Ashford Courses

You will need to understand sample and population in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to be able to identify and describe the population of interest, how a sample was obtained, and the sampling methods used. These topics are important in understanding how assessment or test results can be used or interpreted based on population norms, and how to conduct a study that does not suffer from sampling biases or errors. In addition, having knowledge and skills in this area will help you better understand and evaluate the methods, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Populations and Samples in Graduate Research

The topics of sample and population are relevant to every research study. When conducting research it is important to determine and evaluate the relevant population and appropriate sampling methods that will provide a representative sample. By doing this, you will increase the likelihood that the results can be used to accurately describe the entire population. Sampling methods tend to be a source of criticism or the basis for various limitations in research studies, as we can never achieve the perfect sample, either in size or characteristics. Therefore, it is important for you to acknowledge these limitations in any research results.

Populations and Samples in the Professional World

In order to apply research findings in the professional world, we must be able to evaluate the sample from which the results were obtained, how the sample represents the population, and how the sample relates to the individuals we are interested in. In addition, understanding the sample upon which a test (e.g., screening, intelligence) was developed is important, because this informs whether it is legal and/or ethical, for example, to apply the test when making employment decisions.

Introduction

Introduction to Variables and Measurement

A physician collects data on qualitative and quantitative variables as he works with a company to develop a corporate health and wellness plan.

Variables are what you are interested in looking at in a study. A variable is something you plan to observe, manipulate, test, record, or evaluate. As researchers we need to know how to describe variables accurately in our communications with others, and how to understand the impact a study's variables have on the statistics we run and conclusions we come to.

If you are studying the impact of training on the leadership skills of female chief executive officers (CEOs), then your variables would be training and leadership skills, because those are the elements you are most interested in and what you will be manipulating (training) and measuring (leadership skills). The population of interest is female CEOs, and you will be collecting data from a sample of these individuals using a method described in Lesson 1, Populations and Samples. Characteristics of this sample, such as gender and title, are not variables because these traits are shared among all participants in the study (all are female and all are CEOs) and are not being manipulated or measured in any way.

One of the most basic ways to describe variables is in terms of their qualitative or quantitative properties. Qualitative variables classify items by category. For example, sex, ranking of students in a graduating class, race, top five favorite foods, and eye color are variables that fit into categories; they measure the nature of something, and not a specific numerical value.

Quantitative variables classify items using numerical values, with which we can perform a range of mathematical functions. For example, number of items correct or incorrect, weight, time, psychological test scores, and temperature are variables that measure the numerical value of something.

Throughout this lesson you will learn more about how to describe variables, including identifying the scale of measurement that applies to a variable of interest. You may still be wondering, why do we need to learn about scales of measurement, or any other characteristic of a variable for that matter? To reiterate, being able to identify the various characteristics of variables helps to identify the statistics you can calculate and use to interpret your findings. For example, it would not make sense to take a mean or average for the variable of sex, which is defined by one of two categories: male or female. Sex is a qualitative variable, and is an example of a nominal scale of measurement. There is no number by which we can represent this variable; each person falls into one of the two categories. This is only one example of the importance that a variable's characteristics can have on a study. This lesson will continue to explore these topics and help you create the foundation needed to describe variables accurately.

Applying Knowledge of Variables and Measurement

Variables and Measurement in Ashford Courses

As a student at Ashford University, you will need to understand variables and their characteristics in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to identify variables, describe their characteristics, and evaluate the statistical tests used and the results obtained as they relate to those variables. In addition, having knowledge and skills in this area will help you better understand the literature review, hypotheses, methods, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Variables and Measurement in Graduate Research

Variables are the primary focus of a research study. When conducting research it is important to determine and evaluate the variables of interest so you can accurately focus the research study, provide an overview of the variables' characteristics, and determine the appropriate statistical tests to run and results to evaluate. Errors in identifying and describing variables can lead the researcher to use inappropriate statistical tests or results that should not be used in the context of those variables. Errors can also cause the researcher to obtain inaccurate results and draw incorrect conclusions regarding the study's hypotheses.

Variables and Measurement in the Professional World

In the professional world it is important to understand the nature of variables in order to describe and measure them accurately before displaying related data. A variable's characteristics will be of prime importance when determining which mathematical or statistical analyses to perform on the data obtained for that variable. This is relevant in a variety of situations, ranging from determining the impact of a training program on employee performance, to customer satisfaction levels, to depression levels after administering a new therapy program. Knowing the scale of measurement, for example, will lead the professional to perform certain analyses and present the results to others in a manner that makes sense for that variable. It is unethical and unprofessional to misrepresent data using inappropriate analyses or inaccurate variable descriptions.

Tutorial

Introduction to Variables and Data

When researchers conduct a statistical study, they try to discern the relationship between different characteristics, or variables, of the population they are studying. For example, in a study on the incidence of post-traumatic stress disorder (PTSD) among Iraq war veterans, researchers may collect data on each veteran's sex, age, years on active duty, highest educational degree, and current mental well-being. A variable is a value that varies over time, space, or from individual to individual in a study. A constant, on the other hand, is a value that remains the same throughout the study. For example, in this study, whether or not each individual in the study served in Iraq is a constant. The value is yes for every individual in the study. A data point is one variable value from one sampled individual. Data refers to the collection of data points from all of the individuals sampled.
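The distinction between a constant and a variable can be checked mechanically: a field that takes only one value across every sampled individual is a constant. The sketch below illustrates this under assumptions; the three records and the helper name classify_columns are hypothetical and are not part of the study described above.

```python
def classify_columns(records):
    """Label each field as a constant (one distinct value) or a variable."""
    labels = {}
    for field in records[0]:
        distinct = {record[field] for record in records}
        labels[field] = "constant" if len(distinct) == 1 else "variable"
    return labels

# Hypothetical records; each value inside a record is one data point.
veterans = [
    {"served_in_iraq": "yes", "sex": "F", "age": 34},
    {"served_in_iraq": "yes", "sex": "M", "age": 41},
    {"served_in_iraq": "yes", "sex": "M", "age": 29},
]

labels = classify_columns(veterans)
print(labels)
```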

There are two types of variables and data: qualitative and quantitative.

Qualitative Data and Quantitative Data

Qualitative Data

Qualitative data are the result of categorizing or describing attributes of a population that are not counted or measured. Hair color, blood type, ethnic group, the car a person drives, and the street where a person lives are examples of qualitative variables. Qualitative data are generally described by words or letters. For example, hair color might be black, dark brown, light brown, blonde, gray, or red. Blood type might be AB+, O–, or B+. In the case of a study on PTSD among Iraq war veterans, whether or not the veteran has been diagnosed with PTSD would be a qualitative variable. Qualitative data are also known as categorical data.

Quantitative Data

Quantitative data are the result of counting or measuring attributes of a population. Quantitative data are always described using numbers and are usually the data of choice (where applicable) because there are many methods available for analyzing the data. Amount of money, pulse rate, weight, number of people living in a town, and the number of students who take statistics classes are examples of quantitative variables. Age and medication dosage are examples of quantitative data.

Discrete vs. Continuous Variables

There are two types of quantitative data: discrete and continuous. Data that are the result of counting are called discrete data. These data are whole number values. For example, the number of people in a town is an example of discrete data. There can be 1,286 people or 1,287 people, but not 1,286.3 people in the town.

Data that are the result of measuring are called continuous data (assuming that we can measure precisely). A person's weight and the temperature of the air are both examples of continuous data.

Confusing Issues

In some cases it is not immediately apparent whether a variable is qualitative or quantitative. For example, researchers sometimes assign numbers to qualitative values. In a data table you might see a column labeled "race," but instead of descriptions like "white," "black," or "Hispanic," the column would include numbers 1, 2, and 3. The researcher has assigned a number to each race simply to make the data easier to work with. It is important to note, however, that the data are still qualitative. The actual value of each number is meaningless.
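One way to see that such codes carry no numerical meaning is to renumber the categories and watch what changes. The sketch below is illustrative only: the six coded values and the second codebook are hypothetical, and the first codebook simply follows the numbering in the example above.

```python
from collections import Counter
from statistics import mean

# Hypothetical coded column; the codebook follows the example in the text.
codebook = {1: "white", 2: "black", 3: "Hispanic"}
race_codes = [1, 3, 2, 1, 1, 3]

# An equally valid codebook that assigns the numbers differently.
alt_codebook = {3: "white", 1: "black", 2: "Hispanic"}
to_alt = {label: code for code, label in alt_codebook.items()}
alt_codes = [to_alt[codebook[code]] for code in race_codes]

# The "average code" changes with the arbitrary numbering...
print(mean(race_codes), mean(alt_codes))

# ...but the category counts do not.
counts = Counter(codebook[code] for code in race_codes)
alt_counts = Counter(alt_codebook[code] for code in alt_codes)
print(counts == alt_counts)
```

Because the average depends entirely on which number happens to be attached to which category, it says nothing about the sample; the counts, by contrast, survive any renumbering.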

In other cases, a variable can be qualitative or quantitative depending on how precisely it is described. For example, age can be described in years, months, and days, in which case it is quantitative. But it can also be described qualitatively, with relative descriptions like "young," "middle-aged," and "old." Similarly, it is possible to collect data on height using terms like "short," "average," and "tall." As you can see, although these descriptions are not inaccurate, they are not very precise and are quite subjective. Quantitative data, on the other hand, offer specific information. Unlike qualitative data, quantitative data can be analyzed more easily and with more statistical tools (as we will see in other lessons). For example, you can calculate the average age of war veterans if you know their precise ages in years; you cannot calculate the average of "young," "middle-aged," and "old."

When data can be described quantitatively, it should be. Sometimes it is useful to supplement those data with a qualitative description as well.

Independent and Dependent Variables

In some cases, variables are described not by their own intrinsic properties, but instead by how they are used. In many studies, researchers do not try to describe a sample simply by one piece of data (e.g., cholesterol level); rather, they try to analyze how various data points relate to each other (e.g., how diet relates to cholesterol level). A researcher who wants to know how variable x and variable y are related will manipulate one of the variables and not the other. The independent variable is the variable that is manipulated, while the dependent variable is the variable that is observed.

For example, an experimenter might compare how effective four types of antidepressants are at relieving depression. In this case, the independent variable is the type of antidepressant, while the dependent variable is the extent of relief from depression.

In many cases, there is really no "manipulated" variable, but simply two observed variables (for example, height and weight). However, if a researcher is trying to determine the relationship between a person's height and weight, he or she would consider one to be the independent variable and the other to be the dependent variable.

We will go into more detail about independent and dependent variables in Lesson 9.

Scales of Measurement

If you look at a data table or a set of graphs, you may notice that data for different variables are described in different ways. For example, the data table below includes possible data from a study on the incidence of PTSD among Iraq war veterans.

ID number	Home state	Highest educational degree	Body temperature at rest (°F)	Resting pulse rate (beats per minute)	Diagnosed with PTSD? (0 = no diagnosis; 1 = yes)
1	Texas	High school diploma	98.0	54	0
2	New York	MS	98.9	58	1
3	Maryland	BS	98.6	63	1
4	Texas	PhD	97.4	55	0
5	California	High school diploma	99.3	62	1
6	Michigan	GED	98.7	62	0

Different types of variables are measured in different ways and are described using certain scales of measurement (also called levels of measurement).

Nominal Scales

Data on a nominal scale are categorized responses. Many types of qualitative data are described using nominal scales; some examples are gender, handedness, favorite color, and religion. In the table above, home state and PTSD diagnosis are described on nominal scales. Note that although the data on PTSD diagnoses are displayed as numbers, these numbers represent qualitative attributes. They are not meaningful numerical values and are thus still considered to be on a nominal scale.

Limitations of the Nominal Scale

A key characteristic of nominal scales is that they do not imply any ordering among the responses. For example, when classifying veterans according to home state, we would not rank the states. Responses are merely categorized. Because data on a nominal scale are organized in simple categories, it is also not possible to analyze them using many statistical tools. For example, we can't calculate the "average" state of the sample of war veterans (even if we assigned each state a number). We can, however, use the data to calculate frequencies, percentages, and proportions (e.g., the percentage of Iraq war veterans who reside in Maryland).
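The frequencies and percentages mentioned here can be computed directly from the home state column of the data table above. The following is a minimal sketch; the variable names are chosen for illustration.

```python
from collections import Counter

# Home states of the six veterans in the data table above.
home_states = ["Texas", "New York", "Maryland", "Texas", "California", "Michigan"]

counts = Counter(home_states)
total = len(home_states)
percentages = {state: 100 * n / total for state, n in counts.items()}

for state, n in counts.most_common():
    print(f"{state}: {n} of {total} ({percentages[state]:.1f}%)")
```

For this small sample, one of the six veterans resides in Maryland, which is about 16.7 percent.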

Ordinal Scales

A more useful scale for qualitative data, where possible, is an ordinal scale. A psychologist screening people for depression might ask them to specify their feelings about life in general as either "very dissatisfied," "somewhat dissatisfied," "neither satisfied nor dissatisfied," "somewhat satisfied," or "very satisfied." This is an example of an ordinal scale. Like the nominal scale, ordinal scales generally use words rather than numbers, and many qualitative variables are described using ordinal scales. But unlike the nominal scale, the values on an ordinal scale can be ordered (in this case ranging from least to most satisfied). Describing data using an ordinal scale allows a researcher to rank responses rather than simply categorize them.

Other examples of ordinal variables include military ranks and rankings in a race or contest (1st, 2nd, 3rd). In the table above, highest educational degree is given on an ordinal scale.
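Because ordinal values can be ordered, operations that rely only on order, such as sorting responses or finding the middle response, are well defined. The sketch below illustrates this with the satisfaction scale; the five responses are hypothetical, and the use of Python's statistics.median_low is simply one convenient way to pick a middle value without averaging.

```python
from statistics import median_low

# Ordered levels of the satisfaction scale, from least to most satisfied.
LEVELS = [
    "very dissatisfied",
    "somewhat dissatisfied",
    "neither satisfied nor dissatisfied",
    "somewhat satisfied",
    "very satisfied",
]
rank = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical responses from five people.
responses = [
    "somewhat satisfied",
    "very dissatisfied",
    "very satisfied",
    "somewhat dissatisfied",
    "somewhat satisfied",
]

# Ranking the responses uses only their order, not any distance between levels.
ordered = sorted(responses, key=rank.get)

# The middle response is also defined by order alone.
middle = LEVELS[median_low([rank[r] for r in responses])]

print(ordered)
print(middle)
```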

Limitations of the Ordinal Scale

It is important to note that the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels. In our satisfaction scale, for example, the difference between the responses "very dissatisfied" and "somewhat dissatisfied" cannot be compared to the difference between "somewhat dissatisfied" and "somewhat satisfied." Nothing in this measurement procedure allows us to determine whether the two differences reflect the same difference in psychological satisfaction. Similarly, the difference between BS and MS and the difference between MS and PhD are not necessarily the same, and there is no way to indicate those on an ordinal scale. Statisticians express this point by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements. (In our case, the underlying scale is the true feeling of satisfaction, which we are trying to measure.)

What if the researcher had measured satisfaction by asking consumers to indicate their level of satisfaction by choosing a number from 1 to 4? Would the difference between the responses of 1 and 2 necessarily reflect the same difference in satisfaction as the difference between the responses 2 and 3? The answer is no. Changing the response format to numbers does not change the meaning of the scale. We still are in no position to assert that the mental step from 1 to 2, for example, is the same as the mental step from 3 to 4.

As with the nominal scale, there are not many statistical tools we can use to analyze ordinal data. For example, we cannot calculate average satisfaction or average educational degree. But what if those values are on a numerical ordinal scale? Does it make sense to compute the average of numbers measured on an ordinal scale? This is a difficult question, and one that statisticians have debated for decades. The prevailing (but by no means unanimous) opinion is that for almost all practical situations, the average of an ordinal variable is a meaningful statistic. However, there are extreme situations in which computing the average of an ordinal variable can be misleading. You can explore these types of situations in David Lane's measurement simulation in the Practice section of this lesson.

Interval Scales

Interval scales are numerical scales in which equal intervals are interpreted the same throughout. Interval scales are used for a number of types of quantitative data. As an example, consider the Fahrenheit temperature scale, which is expressed in degrees (°F). The interval between 30°F and 40°F represents the same temperature difference as the interval between 80°F and 90°F. This is because each 10-degree interval has the same physical meaning (in terms of the kinetic energy of molecules). Dates are also expressed on an interval scale. The difference between two successive days, for example, is the same regardless of the days chosen.

Limitations of Interval Scales

Interval scales are not perfect, however. In particular, they do not have a true zero point, even if one of the scaled values happens to carry the name "zero." The Fahrenheit temperature scale illustrates this issue. Zero degrees Fahrenheit does not represent the complete absence of temperature (the absence of any molecular kinetic energy). In reality, the label 0°F is applied to this temperature for quite accidental reasons connected to the history of temperature measurement. This is also true for dates: The year "zero" is quite arbitrary and does not represent the absence of time. Similarly, 0° longitude does not represent "no longitude," but rather an arbitrarily chosen north-south reference line on the earth's surface.

Because an interval scale has no true zero point, it does not make sense to compute ratios of values on an interval scale. For example, the ratio of 40°F to 20°F does not carry the same physical meaning as the ratio of 100°F to 50°F, even though both equal 2 numerically; no interesting physical property is preserved across the two ratios. After all, if the "zero" label were applied at the temperature that the Fahrenheit scale happens to label as 10 degrees, the two ratios would instead be 30°F to 10°F and 90°F to 40°F, which are no longer the same! For this reason, it does not make sense to say that 80°F is "twice as hot" as 40°F. Such a claim would depend on an arbitrary decision about where to "start" the temperature scale, namely, what temperature to call "zero" (whereas the claim is intended to make a more fundamental assertion about the underlying physical reality).
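The arithmetic behind this argument can be sketched in a few lines of Python. The helper names shifted and fahrenheit_to_kelvin are introduced here for illustration. The kelvin comparison at the end goes beyond the text above: it is added only to show what the ratio looks like on a temperature scale that does have a true zero.

```python
def shifted(temp_f, new_zero_f):
    """Re-express a Fahrenheit reading on a scale whose zero sits at new_zero_f."""
    return temp_f - new_zero_f

def fahrenheit_to_kelvin(temp_f):
    """Convert to kelvin, a temperature scale with a true zero point."""
    return (temp_f - 32) * 5 / 9 + 273.15

# On the Fahrenheit scale the two ratios look equal.
print(40 / 20, 100 / 50)

# Move the zero label to 10°F: the ratios become 30/10 and 90/40.
ratio_a = shifted(40, 10) / shifted(20, 10)
ratio_b = shifted(100, 10) / shifted(50, 10)
print(ratio_a, ratio_b)

# Differences, unlike ratios, are unaffected by where zero is placed.
print(shifted(40, 10) - shifted(20, 10) == 40 - 20)

# Ratio of 80°F to 40°F once both are expressed in kelvin.
kelvin_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)
print(round(kelvin_ratio, 2))
```

Moving the zero leaves every difference intact but changes the ratios, which is why only differences are meaningful on an interval scale. Measured from absolute zero, the ratio of 80°F to 40°F is roughly 1.08, nowhere near 2.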

Because the data are quantitative and because the distance between values is set and understood, it is possible to perform statistical analyses on data on the interval scale. We can, for example, calculate an average temperature of seawater or the average birth year for a group of people.
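As a small illustration, the body temperature column from the data table earlier in this lesson is on an interval scale, so an average and a difference can both be computed. The variable names below are chosen for illustration.

```python
from statistics import mean

# Body temperatures at rest (°F) from the data table earlier in this lesson.
temperatures_f = [98.0, 98.9, 98.6, 97.4, 99.3, 98.7]

average_f = mean(temperatures_f)

# A difference between two readings is meaningful on an interval scale.
spread_f = max(temperatures_f) - min(temperatures_f)

print(f"average = {average_f:.2f} °F, range = {spread_f:.1f} °F")
```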

Ratio Scales

The ratio scale is the most informative scale of measurement. It is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured. You can think of a ratio scale as the three earlier scales rolled into one. Like a nominal scale, it provides a name or category for each object (the numbers serve as labels). Like an ordinal scale, the objects are ordered (in terms of the ordering of the numbers). Like an interval scale, the difference between two places on the scale has the same meaning regardless of the two points chosen. In addition, the same ratio at two places on the scale also carries the same meaning.

An example of a ratio scale is the amount of money you have in your pocket right now (25 cents, 55 cents, etc.). Money is measured on a ratio scale because, in addition to having the properties of an interval scale, it has a true zero point: If you have zero money, this implies the absence of money. Since money has a true zero point, it makes sense to say that someone with 50 cents has twice as much money as someone with 25 cents, or that Bill Gates has a billion times more money than you do.

As with the interval scale, statistical analyses can be performed on data described on a ratio scale; in addition, ratios of values are meaningful.

Likert Scales

Many questionnaires use a type of scale called a Likert scale to gauge how people feel about particular issues. Typical responses for an item on the Likert scale are:

1. strongly disagree

2. disagree

3. neither agree nor disagree

4. agree

5. strongly agree

Similar rating scales are used frequently in psychological research. For example, experimental subjects may be asked to rate their level of pain, how much they like a consumer product, or their confidence in an answer to a test question.

Are Likert scales ordinal or interval? Researchers disagree, and it depends on the study. Certainly, if no effort has been made to make sure that the difference between any two successive ratings on the scale is constant, then the scale is ordinal and not interval. But sometimes researchers attempt to construct the study such that the differences between ratings are approximately equal. This type of scale is thus sometimes referred to as an "approximately interval" scale. Researchers will perform numerical statistical analyses on these data. It is important, however, to be careful when collecting and interpreting the data. Whether the data should be considered ordinal or interval can be extremely subjective, and it is often inappropriate to consider psychological measurement scales as either interval or ratio.
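The practical consequence of this choice can be sketched as follows. The eight ratings are hypothetical. The counts and the median rely only on the order of the responses, whereas the mean is meaningful only under the assumption that the scale is approximately interval.

```python
from collections import Counter
from statistics import mean, median_low

LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neither agree nor disagree",
    4: "agree",
    5: "strongly agree",
}

# Hypothetical responses to one questionnaire item.
ratings = [4, 5, 3, 4, 2, 4, 5, 1]

# Safe on an ordinal reading of the scale: counts and the median.
counts = Counter(ratings)
for code in sorted(LABELS):
    print(f"{LABELS[code]:<27} {counts[code]}")
middle = median_low(ratings)
print("median rating:", middle, f"({LABELS[middle]})")

# Only meaningful if the scale is treated as approximately interval.
average = mean(ratings)
print("mean rating:", average)
```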

Summary of Measurement Scales

Scale	Description	Type of data	Examples	Pros	Analyze numerically?
Nominal	data described by name only	qualitative	shape, country, gender	allows data to be categorized	in a limited way (frequencies, percentages, proportions)
Ordinal	data that can be ranked in meaningful order	qualitative or quantitative	rank, position in a race	allows ranking of data	in a limited way if data are qualitative (frequencies, percentages, proportions); can sometimes be analyzed more thoroughly if the data are quantitative
Interval	data whose numbers indicate a set fixed difference between intervals, with an arbitrary rather than an absolute zero point	quantitative	temperature, date, sea level, longitude	provides information on the absolute numerical difference between two data points	yes
Ratio	interval data with an absolute zero point	quantitative	age, height, elapsed time	provides information on the ratio of values of two data points; tells where data are in relation to absolute zero measurement	yes

Introduction

Introduction to Charts and Graphs in Statistics

In the 1780s, Scottish economist and engineer William Playfair, founder of graphical methods in statistics, invented the line and bar graphs.

It is said that "a picture is worth a thousand words." We can say the same with graphs, which are pictorial representations of data. By representing a data set in this way, patterns become apparent, or we can begin to see what the data might be telling us. In addition, graphs can help to categorize or group together data points, which can help us draw conclusions regarding the data set.

Summarizing and categorizing data sets is a task aided by various graphing techniques. There are many different ways we can represent data, such as pie charts, bar charts, line graphs, stem plots, or histograms. Choosing the most appropriate graph depends on the information we want to include and how we can display that information most accurately.

When choosing an appropriate graph type, it is important to consider the variables that will be displayed. For categorical variables (which are nominal or ordinal in scale), pie charts and bar charts are more appropriate graphing techniques for displaying frequencies or percentages for each category of that variable. For example, if we wanted to display the breakdown of various racial groups in a sample, we would likely use a pie chart or bar chart to display the percentages or frequencies for these categories. The same could be said regarding other categorical variables, such as sex, age groups, rankings, and multiple-choice question responses.
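To make the idea concrete, the sketch below tallies a categorical variable and draws a plain-text bar chart of the frequencies. The answers are hypothetical, and the helper name text_bar_chart is an assumption introduced here; a plotting library would normally be used instead, but the logic of counting by category is the same.

```python
from collections import Counter

def text_bar_chart(values):
    """Return one line per category: label, a bar of '#' marks, and the count."""
    counts = Counter(values)
    width = max(len(str(label)) for label in counts)
    return [
        f"{str(label):<{width}} | {'#' * n} {n}"
        for label, n in sorted(counts.items())
    ]

# Hypothetical answers to one multiple-choice question.
answers = ["B", "A", "C", "B", "B", "D", "A", "B"]

chart = text_bar_chart(answers)
print("\n".join(chart))
```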

For numerical variables (which are interval or ratio in scale), complex graphing techniques such as line charts or histograms may be more appropriate. For example, if we wanted to display the average weekly quiz scores during a 13-week course, a line chart would likely be a more appropriate way to represent this data, because it shows how the average changes from week to week. Other numerical variables that may call for these types of graphs include time, medication dosages, physical measurements (height, weight, temperature, etc.), and test scores.

In this lesson you will learn how to create various types of graphs and how to determine which type of graph is most appropriate for a particular data set. Pay special attention to what the graph is telling you about the data, whether the graph is being used to demonstrate a bias, and how different graphs provide specific information about the data set.

Using Charts and Graphs in Statistics

Charts and Graphs in Ashford Courses

You will need to understand graphs in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. However, because graphs are commonly used to condense and present data findings, you may find graphs throughout textbooks and readings for various courses. If you are required to present findings from a literature review, or present an argument, you may need to create and/or interpret graphs in order to provide data and conclusions to others. In addition, having knowledge and skills in this area will help you better understand the review, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Charts and Graphs in Graduate Research

Graphs are typically used in some way to present the results of a research study. When conducting research it is important to determine and evaluate the appropriate graphs needed for all of the variables of interest for a particular study in order to provide an overview of the data in a concise and accurate manner. Graphs can range from tables that include the data collected or descriptive statistics of variables, to histograms of study results.

Charts and Graphs in the Professional World

Graphs are commonly used as a means of communicating information in the workplace, because they provide a quick and effective look at data results without requiring extensive expertise in math or statistics. In the professional world it is important to understand how graphs are created, when to use them, how to accurately portray data to others, and how to interpret graphs in the field. In addition, it is important that data is presented in a way to avoid bias or skewing of the results.

Tutorial

Data Tables

Suppose a group of researchers collects data for a study on anxiety among working mothers of infants and toddlers. How should the data be presented so that the researchers can analyze it easily and share it effectively with others? No one can work with a bunch of information scribbled on various pieces of paper. Data must be organized before it can be analyzed.

Researchers should think about how to display data before they begin collecting it. The first thing to do is create a data table. A useful datatable is organized with a header row at the top (with each variable represented in a separate column), and an identifier column at the far left(with each individual in the study represented in a separate row). An example of a data table is shown below.

Table 3.1: Anxiety among Working Mothers—General Data
Sample ID	Age (years)	Marital status (note 1)	Number of children under 1 year of age	Number of children aged 1–3	Average number of hours worked per week (note 2)	Job description	Type of day care (note 3)	Rating on survey item: I get upset easily or feel panicky. (note 4)
1	38.4	M	1	1	60	financial executive	5	4
2	26.2	M	2	0	40	cashier	1	3
3	34.4	S	1	1	40	university professor	4	2
4	42.8	M	1	0	30	physician	2	2
5	35.0	S	N/A	N/A	50	physician	N/A	3

1 Marital Status: M = married; S = single; D = divorced or separated from father

2 Includes work done for the job at home and on weekends

3 Type of day care: 1 = in home with relative; 2 = in home with non-relative; 3 = private home day care; 4 = day care center at workplace; 5 = day care center not at workplace; N/A = control group

4 Anxiety Rating: 1 = a little of the time; 2 = some of the time; 3 = a good part of the time; 4 = most of the time

The table allows us to see the data collected for each person in the sample. Each row provides the data for all variables for a single individual sampled. Each column displays data from all individuals sampled for a single variable. We can easily view data collected for a single individual ("Sample #1 is 38 years old and works 60 hours per week.") or compare values of a particular variable ("The women sampled range in age from 26 to 42.").

Note that in a good table,

· Variables are clearly identified.

· Units are included for each variable (e.g., years).

· Rating systems and abbreviations are included in the headers or as a separate key.

One of the nice things about a table is that you can not only find data easily, but you can also sort it easily. We could, for example, sort the data in the table above by age, marital status, or any other variable. Having the data in a table also allows us to perform statistical calculations on the data fairly easily (the exact methods depend on the program you are using).
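As a rough illustration of sorting and simple calculations on a data table, the sketch below stores a few columns of Table 3.1 as a list of records in Python. The field names (`sample_id`, `hours_per_week`, and so on) are invented for this example, and a spreadsheet or statistics package would work equally well.

```python
# Rows copied from Table 3.1; only a few columns are kept to stay short.
# The field names are hypothetical labels chosen for this sketch.
rows = [
    {"sample_id": 1, "age": 38.4, "marital_status": "M", "hours_per_week": 60},
    {"sample_id": 2, "age": 26.2, "marital_status": "M", "hours_per_week": 40},
    {"sample_id": 3, "age": 34.4, "marital_status": "S", "hours_per_week": 40},
    {"sample_id": 4, "age": 42.8, "marital_status": "M", "hours_per_week": 30},
    {"sample_id": 5, "age": 35.0, "marital_status": "S", "hours_per_week": 50},
]

# Sort by one variable (column) without changing the original table.
by_age = sorted(rows, key=lambda row: row["age"])
ids_by_age = [row["sample_id"] for row in by_age]

# Read one column and run simple calculations on it.
ages = [row["age"] for row in rows]
youngest, oldest = min(ages), max(ages)
mean_hours = sum(row["hours_per_week"] for row in rows) / len(rows)

print(ids_by_age, youngest, oldest, mean_hours)
```

Because each record keeps all of an individual's values together, sorting by any column never separates a value from the person it belongs to.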

Summary Data Tables

A good study requires a sufficiently large sample, which may include hundreds or thousands of individuals. In this case, it can be useful to create a number of additional tables that summarize the data in the original table. One useful summary table is a frequency table. Rather than display every single data point, we display the quantity of each value. Table 3.2 includes data that could have been derived from the data in Table 3.1.

Table 3.2: Anxiety among Working Mothers—Frequency of Anxiety Ratings
Anxiety rating	Frequency (number of responses)	Relative frequency (number of responses ÷ total number of responses)	Percentage (out of total responses)
1	93	0.06	6%
2	453	0.28	28%
3	782	0.49	49%
4	267	0.17	17%
TOTAL	1595	1.00	100%

In the table above, frequency is the number of responses with a particular value. For example, 453 participants in the study reported an anxiety rating of 2. Relative frequency is the number of responses with a particular value (e.g., "anxiety rating 2") divided by the total number of responses. It is a proportion relating the number of participants having a particular variable value to the total number of participants. In this case 453 participants out of 1595 total, or 0.28 of the total, reported an anxiety rating of 2. Percentage is simply relative frequency multiplied by 100. Relative frequency is useful when comparing two sets of data that do not have the same number of total values. For example, if we were to compare this set to another researcher's set of data, it would probably be more informative to say that 6% of participants reported an anxiety rating of 1 than to say that 93 reported an anxiety rating of 1.
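The arithmetic behind Table 3.2 can be sketched in a few lines of Python. This is only an illustration: the function name `summarize` is invented here, and the counts are the frequencies reported in the table.

```python
# Frequencies copied from Table 3.2 (anxiety rating -> number of responses).
frequencies = {1: 93, 2: 453, 3: 782, 4: 267}


def summarize(freqs):
    """Return {value: (frequency, relative frequency, percentage)}."""
    total = sum(freqs.values())
    return {
        value: (count, count / total, 100 * count / total)
        for value, count in freqs.items()
    }


summary = summarize(frequencies)
for rating, (count, relative, percent) in summary.items():
    # Round only for display, as the table does.
    print(f"{rating}\t{count}\t{relative:.2f}\t{percent:.0f}%")
```

Rounding is applied only when printing, so the unrounded relative frequencies still add up to 1.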

For data that are quantitative and continuous, it may be useful to summarize them even further by grouping or binning the values into ranges. For example, if we were to create a frequency table for age (in years), it would probably not be useful to include one row for each age. That would mean a lot of rows, many of which would likely have a low frequency. Instead, we indicate the number of individuals in certain age ranges, called intervals or bins.

Table 3.3: Anxiety among Working Mothers—Age Frequency
Age (in years)	Frequency	Relative Frequency	Percentage
16–20	47	0.03	3%
21–25	190	0.12	12%
26–30	404	0.25	25%
31–35	487	0.31	31%
36–40	358	0.22	22%
41–45	82	0.05	5%
46–50	25	0.02	2%
51–55	2	0.00	0%
Total	1595	1.00	100%
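Binning can be sketched as follows. The example assumes that ages are recorded in completed years (so 35.9 still falls in the 31-35 bin) and that bins are 5 years wide starting at 16, as in Table 3.3. The helper `bin_label` is a hypothetical name, and only the five ages from Table 3.1 are used, since the full data set is not shown.

```python
from collections import Counter


def bin_label(age, start=16, width=5):
    """Return the label of the interval containing age, e.g. '26-30'.

    Assumes age counts completed years, so any fractional part is dropped.
    """
    index = (int(age) - start) // width
    low = start + index * width
    return f"{low}-{low + width - 1}"


# Ages from the five rows of Table 3.1; a real study would have many more.
ages = [38.4, 26.2, 34.4, 42.8, 35.0]
counts = Counter(bin_label(age) for age in ages)

for label in sorted(counts):
    print(label, counts[label])
```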

Charts and Graphs

Once data is organized, it can be analyzed. There are numerous calculations and statistical operations we can make, but the first thing to do is get a general idea of what the data "looks like." We do this by using graphs. Statisticians often graph data first in order to get a picture of the data. Then, they can use more formal tools to interpret the data.

A statistical graph is a tool that helps you learn about the shape or distribution of a sample. A graph can present data more effectively than a simple list of numbers; a graph shows where data are clustered and where they are more scattered. We can also easily see maximum values, minimum values, and outliers (values that are very different from the rest). Graphs also allow us to see trends and compare facts and figures quickly. This can be difficult, if not impossible, using a table with thousands of data points.

Some types of graphs that we can use to summarize and organize data are stem-and-leaf plots (stemplots), bar graphs, pie charts, and histograms.

Stem-and-Leaf Plots (Stemplots)

One simple graph, the stem-and-leaf plot (or stemplot), is useful when the data sets are small. To create the plot, first identify the stem and leaf of each piece of data. The leaf is usually the last digit of the number; the stem is the rest of the number. For example, the number 23 has stem 2 and leaf 3. The number 432 has stem 43 and leaf 2. The number 5,432 has stem 543 and leaf 2. The decimal 9.3 has stem 9 and leaf 3.

Next, write the stems in a vertical line from smallest to largest. Then draw a vertical line to the right of the stems. Finally, write the leaves in increasing order next to their corresponding stem. (The stem and leaves should make sense for the data set. Look at the range of points and see how it best makes sense to divide the stem and leaves, and then group the data.)

For example, the scores for a final exam are as follows:

12; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100

The stemplot looks like this:

The stemplot shows that most scores fell in the 60s, 70s, 80s, and 90s. Eight of the 28 scores, or approximately 29%, were 90 or above; on a typical grading scale this represents a fairly high number of "A"s. Notice that in the stemplot "0" does not indicate a lack of data, but rather a value (an exam score). The lack of data for a particular stem is indicated by no values in the leaf column.

The stemplot is a quick way to graph and gives a succinct picture of the data. You want to look for an overall pattern and any outliers. An outlier, or extreme value, is a piece of data that does not fit well with the rest of the data. When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (e.g., writing 50 instead of 500), while others may indicate something unusual. It takes some background information to explain outliers. In the example above, 12 is an outlier.
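The construction steps described above can be sketched in Python for the exam scores listed earlier. This is a minimal illustration that assumes non-negative whole-number data and takes the last digit as the leaf; the function name `stemplot` is invented for the example.

```python
from collections import defaultdict


def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers.

    The leaf is the last digit; the stem is the rest of the number.
    """
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    lines = []
    # Include every stem between the smallest and largest, even empty ones,
    # so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return lines


scores = [12, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73, 74, 78,
          80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

for line in stemplot(scores):
    print(line)
```

Stems 2 through 4 print with no leaves, which is what makes the score of 12 stand out as an outlier.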

Bar Graphs

Another type of graph that is useful for specific data values is a bar graph. Bar graphs display data in separate bars. Bar graphs can be used for qualitative or quantitative variables, and the bars can be vertical or horizontal. The figure below shows a frequency table with its corresponding graph. Frequencies are represented by the heights of the bars.

Table 3.4: Anxiety among Working Mothers
Anxiety rating	Frequency
1	93
2	453
3	782
4	267

Figure 3.1

The same data could also be presented in terms of relative frequency or percentages.

Figure 3.2

Figure 3.3

Pie Graphs

Another way to display proportions is using a pie graph, or pie chart. In a pie graph, proportions are shown as pieces of a circular "pie." The entire pie is equal to 100%.

Figure 3.4

It is important to note that you can use a pie graph only for proportions, and those proportions must add up to 100%. You cannot use a pie chart to compare two variables, for data that overlap, or for data that don't represent the entire sample or population in a study. For example, say a data set includes annual deaths from malaria and from HIV/AIDS.

Table 3.5: Deaths from Infectious Diseases
Disease	Percent of all deaths
Malaria	2.23
HIV/AIDS	4.87

We could create a pie graph to display these data. Although it does give us a visual idea of the difference in deaths due to the two diseases (HIV/AIDS kills roughly twice as many people as does malaria), the graph is misleading because it suggests that just two diseases account for all deaths. Unless we change the title of the pie graph to "Cause of Death of People Who Die of Either Malaria or HIV/AIDS," the graph is not appropriate for the study.

Figure 3.5

A bar graph, however, would be quite useful and would not be misleading.

Figure 3.6
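One quick check before drawing a pie graph is whether the percentages really cover the whole. The sketch below is illustrative only: the function name `sums_to_whole` and the rounding tolerance of 0.5 percentage points are assumptions made for this example.

```python
def sums_to_whole(percentages, tolerance=0.5):
    """Return True if the percentages add up to 100, allowing for rounding."""
    return abs(sum(percentages) - 100) <= tolerance


anxiety_ratings = [6, 28, 49, 17]  # percentages from Table 3.2
disease_deaths = [2.23, 4.87]      # percentages from Table 3.5

print(sums_to_whole(anxiety_ratings))  # parts of a whole: a pie graph can work
print(sums_to_whole(disease_deaths))   # covers only about 7% of all deaths
```

Passing this check is necessary but not sufficient: the categories must also not overlap.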

Histograms

Bar graphs can be used to represent qualitative or quantitative data. When the data are quantitative and use an interval or ratio scale, they can be displayed in a special type of bar graph called a histogram. A histogram is a bar graph of the frequencies of numerical values of a sampled population.

A histogram consists of contiguous columns (columns without spaces between them). The horizontal axis is labeled with the variable being measured (for instance, distance from your home to school). The vertical axis is labeled either "frequency" or "relative frequency." Again, frequency is just the number of counts, or data points with a particular value. Relative frequency is a proportion (or, multiplied by 100, a percentage) and is equal to the frequency divided by the total number of data points. For example, in the table below, the relative frequency of mothers sampled who were aged 16–20 was 47/1595 = 0.03. Relative frequencies should always add up to 1.0.

Table 3.6: Study of Anxiety in Working Mothers
Age	Frequency (number of mothers)	Relative frequency (number of mothers in each bin divided by total number of mothers)
16–20	47	0.03
21–25	190	0.12
26–30	404	0.25
31–35	487	0.31
36–40	358	0.22
41–45	82	0.05
46–50	25	0.02
51–55	2	0.00
Total	1595	1.00

Two histograms of the data are shown below.

Figure 3.7

Figure 3.8

Notice that the graphs have the same shape whether we plot absolute frequency, relative frequency, or percentage. Absolute frequency is commonly used when the data set is small; relative frequency is used when the data set is large or when we want to compare several distributions. For example, if we had another set of data showing the distribution of ages of non-working mothers in the study, unless the number of mothers in each set was the same, we would probably want to use relative frequency rather than absolute frequency. We can use a histogram to see the shape of the data distribution. As we will see in other lessons, we can also use it to estimate certain statistics, such as the mean, or average.

One advantage of a histogram is that it can readily display large data sets. A rule of thumb is to use a histogram when the data set consists of 100 values or more.

Bin Widths

There is more to be said about the widths of the class intervals, sometimes called bin widths. Your choice of bin width determines the number of class intervals. This decision, along with the choice of starting point for the first interval, affects the shape of the histogram. This is important because the shape affects how much information can be seen from the graph alone and how people interpret the graph.

For example, the graph in Figure 3.9 has a narrow bin width, or class interval, of 1 month. The number of class intervals is 44: one for each month. In this case, the bin width is so narrow that it is hard to see any pattern in the data.

Figure 3.9

In Figure 3.10, however, the same data are plotted in 8 bins, each having a bin width of 6 months. The wider bin width makes the data easier to plot because there are fewer bins to plot. We can also see a pattern in the data that was not apparent in Figure 3.9.

Figure 3.10

If the bins are too wide, however, the data become harder to analyze. In Figure 3.11, the same data are plotted again, but with a bin width of 30 months. As you can see, it is impossible to get an idea of the real distribution of data using this width.

Figure 3.11

The best thing to do is experiment with different widths, and to choose a histogram according to how well it communicates the shape of the distribution.

When choosing a bin width, remember that shifting the intervals can also affect the appearance of the data.
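To experiment with bin widths and starting points, a small counting helper is enough. The sketch below uses a short list of durations invented purely for illustration (it is not the data behind Figures 3.9 to 3.11), and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter


def bin_counts(values, width, start=0):
    """Count values per bin; each bin covers low <= value < low + width."""
    counts = Counter(start + ((v - start) // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical durations in months, invented only for this example.
months = [3, 4, 8, 8, 10, 14, 15, 16, 17, 21, 22, 25, 27, 29, 33, 44]

narrow = bin_counts(months, width=1)              # almost one bin per value
medium = bin_counts(months, width=6)              # a pattern becomes visible
wide = bin_counts(months, width=30)               # nearly everything in one bin
shifted = bin_counts(months, width=6, start=-3)   # same width, shifted start

for name, result in [("narrow", narrow), ("medium", medium),
                     ("wide", wide), ("shifted", shifted)]:
    print(name, result)
```

Comparing `medium` with `shifted` shows that moving the starting point changes the counts even though the bin width is the same.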

Summary of Charts and Graphs

· Data tables are used to collect, display, and sort data.

· Summary data tables include frequencies and percentages that summarize large sets of data.

· Graphs allow a researcher to quickly see trends, clusters, and maximum and minimum data values.

· Stem-and-leaf plots (or stemplots) provide a graphical representation of the frequency of values in small data sets.

· Bar graphs have vertical or horizontal columns to represent values of qualitative or quantitative variables.

· Pie graphs are used to depict parts of a whole.

· Histograms represent the frequencies or relative frequencies of quantitative variables.

· The look of a histogram is highly influenced by bin width. Graphs with bins that are too narrow or too wide can be difficult to interpret.

Tutorial

Mean

The mean of a set of data is also known as the average. It is the sum of the values of all of the data points divided by the number of data points. For example, to calculate the mean selling price of 50 houses, add the 50 prices together and divide by 50:

mean = sum of all values in the sample/number of values in the sample

Example 1

The data below indicate the number of months that 40 AIDS patients live after beginning treatment with a new antibody drug. What is the mean survival time of this sample of patients?

10	37	17	8	13	27	12	24	14	8
15	44	16	40	15	18	21	22	22	25
11	25	33	44	27	17	29	29	31	32
33	4	34	26	35	3	16	26	44	34

Solution

To calculate the mean, we add up all the data points and divide by the number of data points:

sum of data values = 941 months

number of data points = 40 patients

941 months ÷ 40 patients = 23.53 months mean survival time per patient
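The same calculation can be checked with a short Python sketch. The variable names are chosen for this example; the values are the 40 survival times from Example 1.

```python
# Survival times in months for the 40 patients in Example 1.
survival_months = [
    10, 37, 17, 8, 13, 27, 12, 24, 14, 8,
    15, 44, 16, 40, 15, 18, 21, 22, 22, 25,
    11, 25, 33, 44, 27, 17, 29, 29, 31, 32,
    33, 4, 34, 26, 35, 3, 16, 26, 44, 34,
]

total = sum(survival_months)       # sum of all values in the sample
count = len(survival_months)       # number of values in the sample
mean_survival = total / count      # mean = sum / count

print(total, count, mean_survival)
```

The unrounded result is 23.525 months, which the solution above reports as 23.53.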

Median

The median of a set of data is the middle data point. If you put all the data points in order, the median value is the value of the point in the middle. If there are an odd number of data points, the median is the value of the point exactly in the middle. If there are an even number of data points, the median is the average of the two points in the middle. For example, to find the median price of 50 houses, we would put all of the prices in order, from lowest to highest. We would then look at the middle point. In this case, since 50 is an even number, we would take the 25th and 26th prices, and find their average.

median = middle value, or average of the two middle values in the sample

Example 2

What is the median survival time of the sample of AIDS patients described in Example 1?

Solution

To calculate the median, first we need to put the data points in order from fewest to greatest number of months:

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24
25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

Then we split the data into two equal halves to find the middle data point:

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24

25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

In this case, since there are an even number of points (40), there are two middle points, 24 and 25.

The median is the average of these two middle points:

(24 months + 25 months) ÷ 2 = 24.50 months median survival time
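The procedure (sort, then take the middle value or the average of the two middle values) can be sketched as a small Python function. The name `median` is chosen for this example; Python's standard `statistics.median` function does the same job.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of points: the single middle point.
        return ordered[mid]
    # Even number of points: average the two middle points.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Survival times in months for the 40 patients in Example 1.
survival_months = [
    10, 37, 17, 8, 13, 27, 12, 24, 14, 8,
    15, 44, 16, 40, 15, 18, 21, 22, 22, 25,
    11, 25, 33, 44, 27, 17, 29, 29, 31, 32,
    33, 4, 34, 26, 35, 3, 16, 26, 44, 34,
]

print(median(survival_months))
```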

Mode

The mode of a set of data is the value that appears most often in the data set. For example, to find the mode sale price of 50 houses, we would organize the data by price and then see which price is most common.

mode = most common value in the sample

Example 3

What is the mode survival time of the sample of AIDS patients described in Example 1?

Solution

We need to sort the data from lowest to highest, just as when we calculate the median. Then we need to look through the data and find which number occurs most often. The survival time with the greatest frequency is 44 months. Three patients survived for 44 months, so 44 is the mode.

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24
25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

You've probably noted that this is a tedious procedure, and would be nearly impossible to do with the thousands of data points that are included in many studies. As you will see in the next section, it is a lot easier to use graphed data to determine the mode.
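Counting by hand is tedious, but software can tally the values directly. The sketch below uses `collections.Counter` from Python's standard library. The function name `modes` is chosen for this example, and it returns a list because a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


# Survival times in months for the 40 patients in Example 1.
survival_months = [
    10, 37, 17, 8, 13, 27, 12, 24, 14, 8,
    15, 44, 16, 40, 15, 18, 21, 22, 22, 25,
    11, 25, 33, 44, 27, 17, 29, 29, 31, 32,
    33, 4, 34, 26, 35, 3, 16, 26, 44, 34,
]

print(modes(survival_months))
```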

Mean vs. Median vs. Mode

As you can see from the examples above, although mean, median, and mode are all valid measures of central tendency, they are not always the same for the same set of data. Recall that the mean, median, and mode of Example 1 are:

mean = 23.53 months

median = 24.50 months

mode = 44 months

Since these measures are not always the same, you will need to use other information to figure out which value is the best measure of the central tendency. Which is best depends partly on the data set and partly on the purpose of your study.

Using Graphs: Symmetrically Distributed Data

Many studies include hundreds, if not thousands, of data points. The only way to get a good sense of the data is to graph it. A histogram is a graph of the frequency of data values. (For a review of histograms, see Lesson 3.) You can use a histogram to estimate mean, median, and mode, and also to help determine which measure will be most useful.

Symmetrically Distributed Data

Consider the following data set:

4	5	6	6	6	7	7	7
7	7	7	8	8	8	9	10

A histogram of the data is shown below.

Figure 4.1

The histogram displays a symmetrical distribution of data. A distribution is symmetrical if a vertical line can be drawn at some point on the histogram such that the shape to the left and the right of the vertical line are mirror images of each other. In a perfectly symmetrical distribution, where there is only one mode, the mean, the median, and the mode are about the same. And in fact, if we calculate the mean, the median, and the mode for these data, we find that each one has a value of 7.
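The claim that all three measures equal 7 can be verified with Python's standard `statistics` module. This is just a quick check on the 16 values listed above.

```python
import statistics

# The symmetrically distributed data set shown above.
data = [4, 5, 6, 6, 6, 7, 7, 7,
        7, 7, 7, 8, 8, 8, 9, 10]

mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = statistics.mode(data)

print(mean_value, median_value, mode_value)
```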

Using Graphs: Skewed Data

Now consider this data set:

The histogram for the data is shown below.

Figure 4.2

This histogram is not symmetrical. The right-hand side seems "chopped off" compared to the left side. The distribution is said to be skewed to the left because it is pulled out toward the left. (This is also known as a negative skew.) In a skewed distribution, the mean, median, and mode are not the same. In this case, the mode is still 7, but the mean and median are less than 7: The mean is 5.9 and the median is 6. Notice that the mean is less than the median and they are both less than the mode. The mean and the median both reflect the skewing. The mean is the lowest of the three measures of central tendency in a distribution that is skewed to the left (negatively skewed).

Data may also be skewed to the right. For example, the data below appear to be pulled out toward the right, with the left side "chopped off." (This is also known as a positive skew.)

Figure 4.3

In this case, the mode is still 7, but the mean and median are both greater than 7: The mean is 7.7 and the median is 7.5. Notice that the mean is the largest statistic, while the mode is the smallest. Again, the mean reflects the skewing the most. The mean is the highest measure of central tendency in a distribution that is skewed to the right (positively skewed).

To summarize:

· If the distribution of data is symmetrical and there is only one mode, the mean, median, and mode are about the same.

· If the distribution of data is skewed to the left, the mean and median are less than the mode.

· If the distribution of data is skewed to the right, the mean and median are greater than the mode.

Using and Interpreting Mean, Median, and Mode

Imagine that the pharmaceutical company developing the AIDS drug discussed in the examples in Section 4.2 decides to market the drug. The marketing materials state, "The drug provides remarkable results, with a mode life expectancy of 44 months."

The claim is true, but it is misleading. Although 44 months is the most frequent data point, the mean life expectancy is 23.5 months and the median is 24.5 months, significantly less than the mode. It is generally quite easy to calculate mean, median, and mode (and in most cases, people do these calculations with statistics software), but it is not always easy to figure out which measure is most useful or what the measures actually mean.

Using the Mean

Using the Mean with Symmetrical Data

The mean is a useful measure of central tendency when the data are symmetrical and there are no significant outliers. The distribution of data on the survival of AIDS patients taking a new drug is not perfectly symmetrical, but it is fairly symmetrical and is not noticeably skewed. Thus mean is a good measure of central tendency. (Notice that we have binned the data into ranges of 6 months.)

Figure 4.4

Using the Mean with Skewed Data

When the data are not symmetrical, but are skewed to the right or left, however, the mean can be misleading. For example, suppose that in a small town of 50 people, one person earns $5,000,000 per year and the other 49 each earn $30,000. The mean salary of people in the town would be

[49($30,000) + 1($5,000,000)] ÷ 50 people = $129,400 per person

This mean salary is obviously misleadingly high because the one millionaire has skewed the data. In this particular case, the mode or the median is a better measure of central tendency.
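The effect of the single high earner is easy to reproduce. The sketch below builds the 50 salaries from the example and compares the three measures using Python's standard `statistics` module.

```python
import statistics

# 49 residents earning $30,000 and one earning $5,000,000, as in the example.
salaries = [30_000] * 49 + [5_000_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
mode_salary = statistics.mode(salaries)

print(mean_salary, median_salary, mode_salary)
```

The median and mode both stay at $30,000, while the mean is pulled up to $129,400 by one value.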

Figure 4.5

Using the Mean with Qualitative Data

It is important to note that the mean is valid for quantitative data only. It is not possible to determine the mean for qualitative variables such as the sex of people in a pharmaceutical study or the make of cars in a study on automobile safety.

Sometimes researchers assign numerical codes to qualitative variables to make it easier to work with the data (e.g., 00 = male; 01 = female). In this case it is technically possible to calculate a mean, but the value would be meaningless. There is no such thing as "a mean sex of 0.8," for example. With qualitative data, the mode, or most common value, is the appropriate measure of central tendency.

Figure 4.6

Though you might imagine a mean value for political party lying just to the left of Republican in this graph, there is no such thing as a mean political party in the U.S. Senate. The appropriate measure in this case is mode.

Using the Median

Using the Median with Symmetrical Data

Like the mean, the median is a useful measure of central tendency when the data are symmetrical and there are no significant outliers. The data on the survival of AIDS patients taking a new drug are fairly symmetrical and thus median is also a good measure of central tendency.

Figure 4.7

Using the Median with Skewed Data

The median is a particularly useful measure of central tendency when the data are skewed. In this case, the median is much more likely to provide a representative center of data than the mean. In the example of a small town of 50 people in which one person earns $5,000,000 per year and the other 49 each earn $30,000, the median salary is $30,000. The median is obviously a much better measure of central tendency than the mean for this situation.

The median is generally a better measure of the center when there are extreme values or outliers because the median is not affected by the numerical values of the outliers. The median can also be used for symmetrical data, in which case it is often quite close to the mean.

Figure 4.8

Using the Median with Qualitative Data

It is important to note that like the mean, the median is useless for almost all qualitative data. It may be possible to put qualitative values, such as make of car, in some order and find the middle value; but because the order is meaningless, the median value is also meaningless. (The exception may be for data sets with only two values. The median is the middle point, which is likely to be the most common point or mode as well.)

Figure 4.9

If we were to try to calculate the median political party of the U.S. Senate, we would need to order the data first. Order of qualitative data has no meaning. Depending on the order we choose, we come up with a different value for the "median." There is, in fact, no such thing as a median political party.

Using the Mode

Using the Mode with Qualitative Data

Given that mean and median are virtually meaningless measures of central tendency for qualitative data, the mode is particularly useful for defining the center of a set of qualitative data points. For example, Toyota Motor Company reported the following sales for December 2010:

Model	Number Sold
AVALON	2,691
CAMRY	31,223
COROLLA	22,058
PRIUS	15,639
SCION tC	1,594
SCION xB	1,538
SCION xD	824
VENZA	3,996
YARIS	3,422

Figure 4.11

We can see immediately from the graph that the Camry was the most popular model in December 2010. Therefore, the mode value is "Camry." Mode is a useful measure of central tendency for this data set. (Though you might be able to imagine a mean and median somewhere around the Prius, those measures actually have no meaning for this set of data.)

Using the Mode with Quantitative Data

The mode can also be a useful measure for quantitative data, depending on what is being analyzed. For example, suppose the grades of a chemistry exam are as follows:

If we look only at the data points, we see that there is no single mode. Each grade occurs only once. However, if we bin the data and then graph it, we get another, more useful look.

Figure 4.12

There are two modes: 60–69 and 90–99. This is important information. The mean and median grades are 80, but no students actually received a grade of 80. Analyzing the mode is useful in this case because it tells us that students either did very well or very poorly.

When we are analyzing quantitative data, it is often necessary to bin the data into ranges of values in order to come up with a useful mode. If we don't do this, the mode may have little meaning.
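The idea of finding a mode only after binning can be sketched as follows. The grades in this example are hypothetical, invented for illustration; they are not the chemistry exam grades behind Figure 4.12, although they were chosen so that every grade is unique and the mean and median are both 80.

```python
from collections import Counter
import statistics

# Hypothetical exam grades, invented for this sketch; every grade is unique.
grades = [61, 63, 65, 68, 72, 78, 82, 88, 92, 95, 97, 99]


def decade_label(grade):
    """Return the 10-point bin containing the grade, e.g. '60-69'."""
    low = (grade // 10) * 10
    return f"{low}-{low + 9}"


# Unbinned: every value occurs once, so no single mode stands out.
raw_counts = Counter(grades)

# Binned: count grades per 10-point range, then find the most common ranges.
binned_counts = Counter(decade_label(grade) for grade in grades)
highest = max(binned_counts.values())
modal_bins = sorted(label for label, count in binned_counts.items()
                    if count == highest)

print(max(raw_counts.values()), modal_bins)
print(statistics.mean(grades), statistics.median(grades))
```

The two modal bins sit at opposite ends of the scale, which neither the mean nor the median reveals.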

Summary: Using Mean, Median, and Mode

Description of Data	Mean	Median	Mode
Data are quantitative
Data are qualitative
Data are symmetrical
Data are skewed
Data are bimodal
There is only one mode

– Generally a useful measure of central tendency, though there are exceptions

– NEVER an appropriate measure of central tendency

– Generally not a good measure of central tendency, though there are exceptions

Each data set must be evaluated carefully to determine which measure is most appropriate. When evaluating data, be sure to consider each aspect of the data listed in the first column.

Introduction

Introduction to Measures of Variability

One way to describe a data set is to look at its center points by using measures of central tendency. However, center points alone are not enough to provide a thorough description of a data set. It is also important to consider the variability of the data, or how the data points are distributed. Are they compact and close together, or are they widely spread out? We cannot answer this question by looking at measures of central tendency alone.

Recall the example from the module on measures of central tendency, in which you are offered a new job, and you want to know how the salary you are offered compares to other typical offers. The range would tell us the difference between the highest and lowest offers recorded, which would indicate whether the distribution is narrow or wide. You could compare your offer to this range; if your offer falls in the range of recorded offers, then you know it is within the typical range of offers made.

The standard deviation of a data set describes the average distance its data points fall from the mean. By comparing the standard deviation to the mean, we can see whether the data are widely distributed or whether they are more concentrated around the mean. Using the previous example, the standard deviation for a distribution of offers would tell you the average distance an offer tends to be from the mean. This would help identify whether the offers are concentrated around the mean or widely distributed.

If the standard deviation is quite large in comparison to the mean value, then the distribution of offers is likely wide as well. This means that the offers tend to cover a larger range than if the standard deviation were small. Once you calculate how far your offer is from the mean, you can compare that difference to the standard deviation and see whether your offer is closer to or further from the mean than the typical offer.

By the end of this lesson you will have a greater appreciation for the importance of descriptive statistics, and you will have combined your knowledge of measures of variability with the previous lesson's topic on measures of central tendency. Together this information will provide you with the basic tools to describe a particular data set.

Applying Measures of Variability

Measures of Variability in Ashford Courses

You will need to understand measures of variability in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to calculate and interpret these measures in order to better understand and describe the spread or variability of the data points for various distributions presented to you. In addition, having knowledge and skills in this area will help you better understand the results and discussion sections of the research literature you may be asked to evaluate for various courses.

Measures of Variability in Graduate Research

Measures of variability are typically reported for all variables in a research study, similar to measures of central tendency. When conducting research it is important to determine and evaluate the appropriate measures of variability for all variables of interest, to provide an overview of the spread and variability of each variable's distribution. Even when demographic data are collected but not being analyzed as main components or variables, these data are typically summarized using measures of central tendency and variability.

Measures of Variability in the Professional World

In the professional world it is important to understand how measures of variability are calculated, when they should be used, and how to interpret them in order to accurately portray data to others. Providing an overview of a distribution and its center points is a common component of communicating about a certain behavior or outcome of interest. However, it is also important to take into consideration the distribution of data points and how spread out they are. In addition, it is part of one's ethical practice as a professional to understand what one is communicating to others, whether it is summarizing another's research data or one's own findings. Measures of variability are especially important as they are some of the most common methods for describing a particular group of data. We use these measures to show others the range of values, how widely the data set is distributed, and how far a typical data point may be from the mean.

Tutorial

Introduction to Variability

Imagine you are comparing the results of a final exam taken by two groups of students. You decide that rather than look at all the data, you will simply compare their measures of central tendency. You complete the analysis and are amazed to discover that the statistics are the same for both groups. In fact, the values of all three measures (mean, median, and mode) are exactly the same for both groups: 50. You can infer a few things from this fact. The data are probably quite symmetrical and have a single mode. They are certainly centered at 50. But does this mean that the data sets are the same? It seems unlikely, so you graph the data to get a better look:

Figure 5.1

Group A Test Scores

Figure 5.2

Group B Test Scores

The data sets are definitely different, and in this case, we can easily see what the difference is: The data for Group B are more spread out than the data for Group A. As we saw, however, this fact is not captured by the measures of central tendency. We need another group of measures to further help describe the data. These are called measures of variability.

The terms variability, spread, and dispersion are synonyms that simply refer to how "spread out" the values in a data set are. In our example, the scores for Group A are more densely packed and those for Group B are more spread out. The difference in grades among students is greater for Group B than for Group A. There are four frequently used measures of variability: range, interquartile range, variance, and standard deviation.

Range

The range is the simplest measure of variability to calculate, and one you have probably encountered the most. The range is simply the highest score minus the lowest score. In the data sets graphed above, the range of scores for Group A is 4: The highest grade is 52 and the lowest is 48 (52 − 48 = 4). The range for Group B is 20: The highest grade is 60 and the lowest is 40 (60 − 40 = 20). Note that people also use the term "range" to refer to the interval (not just the difference) between the lowest and highest value. For example, we might say that the range of Group B is 40 to 60.

Although the range tells us what the overall spread of the data is, it does not tell us anything about how those data are concentrated. For Group B, for example, we can't tell if the grades are evenly distributed from 40 to 60, if they are concentrated in the middle, or if they are concentrated at one or both extremes just by calculating the range. The range can be misleadingly high if the data set contains outliers.
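The range calculation described above can be sketched in a few lines of Python. This is only an illustration: the lists reproduce the Group A and Group B scores from Tables 5.6 and 5.7, and the function name `value_range` is a hypothetical helper, not part of any library.

```python
# Scores copied from Tables 5.6 (Group A) and 5.7 (Group B).
group_a = [48, 48, 49, 49, 49, 50, 50, 50, 50, 50, 50, 51, 51, 51, 52, 52]
group_b = [40, 43, 46, 47, 47, 48, 48, 49, 49, 49, 50, 50, 50, 50,
           50, 50, 50, 50, 51, 51, 51, 52, 52, 53, 53, 54, 57, 60]


def value_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)


range_a = value_range(group_a)
range_b = value_range(group_b)
print("Group A range:", range_a)
print("Group B range:", range_b)
```

Notice that only the two extreme scores enter the calculation, which is why a single outlier can inflate the range.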

Interquartile Range – Introduction

The interquartile range (or IQR) gives us a different view of the variability of the data. The IQR is the range of the middle 50% of the scores in a distribution. It tells us how much the central 50% of values are dispersed. Before we talk more about IQR, it is important to understand what a quartile is.

Percentiles and Quartiles

You are probably familiar with the term percentile. You may have been told, for example, that you scored in the 95th percentile on an exam, or that your infant's weight is in the 30th percentile. Percentile is a measure of the location of a data point. If your score is in the 95th percentile, it is higher than 95% of the other scores. (Note that this does not necessarily mean that you scored a 95% on the exam.) If your baby girl is in the 30th percentile for weight, she is heavier than 30% of the other girls her age, and equal to or lighter than 70%.

With percentiles, we divide the data into 100 equal parts. If there are 100 data points, then there are 50 values below the 50th percentile and 50 values above the 50th percentile. We can also divide the data into 4 equal parts, which are called quartiles. The first quartile, Q1, is the same as the 25th percentile; the third quartile, Q3, is the same as the 75th percentile. Recall that the median is the middle value in the data set. The median is also the second quartile, Q2, and the 50th percentile.

Recall from the lesson on Measures of Central Tendency that if there are 10 data points, the median is between points 5 and 6. If the values are (1, 2, 3, 4, 5, 6, 7, 8, 9, 10), the median is 5.5. (See Figure 5.3.) To find Q1, find the median of the lower half of the data set. In the case of (1, 2, 3, 4, 5, 6, 7, 8, 9, 10), Q1 = 3. Similarly, to find Q3, find the median of the upper half of the data set. In this case, Q3 = 8. Figure 5.4 shows Q1, Q2, and Q3 for the data set (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12).

Figure 5.3

Quartiles of the data set (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Figure 5.4

Quartiles of the data set (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

Calculating Quartile Values

It is important to note that the method shown above is just one way to calculate quartile values. Different statistics packages calculate the values in slightly different ways. For example, instead of choosing the median of the lower and upper half of the data as the Q1 and Q3 values, some choose the last data point in the first 25% and 75% of the data. For the data set (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), the Q1 value would therefore be 3, not 3.5, and the Q3 value would be 9, not 9.5. This can affect the interquartile range, but usually by only a small amount.
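As a rough illustration of the median-of-halves method used in this lesson, the sketch below computes Q1, Q2, and Q3 for the two example data sets. The function name `quartiles_by_halves` is hypothetical, and the handling of odd-sized data sets (leaving the middle value out of both halves) is an assumption, since the lesson only shows even-sized examples. The sketch also assumes at least two values.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves.

    Assumption: for an odd number of values, the middle value is excluded
    from both halves. Requires at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]    # smallest half of the values
    upper_half = data[-half:]   # largest half of the values
    return median(lower_half), median(data), median(upper_half)


ten_values = quartiles_by_halves(range(1, 11))     # data set 1..10
twelve_values = quartiles_by_halves(range(1, 13))  # data set 1..12
print(ten_values)
print(twelve_values)
```

Other quartile conventions, such as the one described in the paragraph above, can give slightly different results for the same data.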

Interquartile Range (IQR)

The interquartile range indicates the spread of the middle half (or the middle 50%) of the data. It is the difference between the third quartile (Q3) and the first quartile (Q1).

IQR = Q3 – Q1

For example, let's compare the quartiles and interquartile range of the Group A and Group B test scores. We put the values in numerical order, find the median, and then find Q1 and Q3. For Group A, the IQR = 51 – 49 = 2. For Group B, the IQR = 51.5 – 48.5 = 3.

Figure 5.5

Group A Test Scores

Figure 5.6

Group B Test Scores

Although the ranges of the two groups are different, the interquartile ranges are close to each other. In both data sets, the middle 50% of the data are clustered closely together.
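To make the comparison concrete, here is a minimal sketch that computes the IQR for both groups with the median-of-halves method. The score lists are copied from Tables 5.6 and 5.7; the helper name `interquartile_range` is an assumption for illustration, and other quartile conventions may give slightly different values.

```python
from statistics import median

# Scores copied from Tables 5.6 (Group A) and 5.7 (Group B).
group_a = [48, 48, 49, 49, 49, 50, 50, 50, 50, 50, 50, 51, 51, 51, 52, 52]
group_b = [40, 43, 46, 47, 47, 48, 48, 49, 49, 49, 50, 50, 50, 50,
           50, 50, 50, 50, 51, 51, 51, 52, 52, 53, 53, 54, 57, 60]


def interquartile_range(values):
    """Return Q3 - Q1, with quartiles taken as medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return q3 - q1


iqr_a = interquartile_range(group_a)
iqr_b = interquartile_range(group_b)
print("Group A IQR:", iqr_a)
print("Group B IQR:", iqr_b)
```

The two IQRs differ by only one point even though the ranges differ by 16, because the IQR ignores the extreme scores in Group B.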

Like range, the terms quartile and interquartile range can refer to the intervals of data as well as differences between values. You may hear the term "first quartile" in reference to the lower 25% of the data (not simply the value for which 25% of the data are lower). Similarly, "interquartile range" can refer to the entire set of middle data points.

Outliers

We can use the interquartile range to help identify potential outliers. A value is suspected to be a potential outlier if it is more than 1.5 × IQR below the first quartile or more than 1.5 × IQR above the third quartile. For example, we might wonder which grades in Group B might be outliers.

1) Anything less than Q1 – (1.5 × IQR) may be an outlier:

For Group B:

1.5 × IQR = 1.5 × 3 = 4.5

Q1 = 48.5

48.5 – 4.5 = 44

So anything less than 44 in Group B is a possible outlier.

2) Any value greater than Q3 + (1.5 × IQR) may be an outlier:

1.5 × IQR = 1.5 × 3 = 4.5

Q3 = 51.5

51.5 + 4.5 = 56

So anything greater than 56 in Group B is a possible outlier.

Group A, however, has no apparent outliers. All the values fall within 46 [Q1 – (1.5 × IQR)] and 54 [Q3 + (1.5 × IQR)].
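The fence calculations above can be checked with a short Python sketch. The helper names (`outlier_fences`, `possible_outliers`) are hypothetical, the quartiles use the median-of-halves method from this lesson, and the score lists are copied from Tables 5.6 and 5.7.

```python
from statistics import median

group_a = [48, 48, 49, 49, 49, 50, 50, 50, 50, 50, 50, 51, 51, 51, 52, 52]
group_b = [40, 43, 46, 47, 47, 48, 48, 49, 49, 49, 50, 50, 50, 50,
           50, 50, 50, 50, 51, 51, 51, 52, 52, 53, 53, 54, 57, 60]


def outlier_fences(values):
    """Return (lower, upper) cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def possible_outliers(values):
    """Return the values that fall outside the two cutoffs."""
    lower, upper = outlier_fences(values)
    return [x for x in values if x < lower or x > upper]


fences_a = outlier_fences(group_a)
fences_b = outlier_fences(group_b)
flagged_a = possible_outliers(group_a)
flagged_b = possible_outliers(group_b)
print("Group A:", fences_a, flagged_a)
print("Group B:", fences_b, flagged_b)
```

For Group B the flagged scores are 40, 43, 57, and 60; Group A has none.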

Figure 5.7

Group A quartiles and interquartile range

Figure 5.8

Group B quartiles, interquartile range, and outliers

Box Plots

The spread of data, including range and IQR, is often displayed using a box plot, or box-and-whisker plot. The box plot is a good graphical indicator of the concentration of the data, and also shows how far the extreme values are from the rest of the data values.

A box plot is constructed using five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. To construct a box plot, use a number line and a rectangular box. The minimum and maximum data values are represented by the endpoints of the axis. One end of the box represents the first quartile, and the other end of the box represents the third quartile. The middle 50% of the data fall inside the box. The median value is represented by a vertical line within the box. The "whiskers" extend from the ends of the box to the minimum and maximum data values. The box plots below are oriented horizontally, but they can also be oriented vertically. Orienting box plots vertically is particularly common when comparing data sets.

Figure 5.9

Box plot

Let's look again at the Group B test scores.

Table 5.1: Group B Test Scores
40	49	50	52
43	49	50	52
46	49	50	53
47	50	50	53
47	50	51	54
48	50	51	57
48	50	51	60

A box plot of the Group B test scores looks like this:

Figure 5.10

Box plot of Group B test scores

The box plot clearly shows that the data are clustered around the median, and that the maximum and minimum values are far from the majority of the data.
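The five values that a box plot is built from can be computed directly. The sketch below is an illustration only: `five_number_summary` is a hypothetical helper name, the quartiles follow the median-of-halves method from this lesson, and the scores are copied from Table 5.1.

```python
from statistics import median

# Group B scores from Table 5.1.
group_b = [40, 43, 46, 47, 47, 48, 48, 49, 49, 49, 50, 50, 50, 50,
           50, 50, 50, 50, 51, 51, 51, 52, 52, 53, 53, 54, 57, 60]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a data set."""
    data = sorted(values)
    half = len(data) // 2
    return (data[0], median(data[:half]), median(data),
            median(data[-half:]), data[-1])


summary_b = five_number_summary(group_b)
print(summary_b)
```

The box spans only 48.5 to 51.5, while the whiskers reach out to 40 and 60, which matches the long whiskers and narrow box described above.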

In contrast, a box plot of the Group A test scores looks quite different. It is easy to see that the data are all very close to the median value.

Figure 5.11

Box plot of Group A test scores

Box Plots and Outliers

Recall that any data point that is more than 1.5 × IQR below Q1, or more than 1.5 × IQR above Q3, is a possible outlier. Knowing this, you can use the box plot to quickly identify any outliers in the data set. For example, the values in the dotted areas of the box plot below are possible outliers.

Figure 5.12

Box plot showing areas containing possible outliers

Box Plots and Other Data Distributions

The data in Group A and Group B differ in their variability, but both have symmetrical distributions. We can see this fairly easily in the box plots. We can also see other distributions fairly easily in other box plots. Some examples are below.

Figure 5.13

Box plot of symmetrical data

Figure 5.14

Box plot of data skewed to the right

Figure 5.15

Box plot of data skewed to the left

Variance

We can describe the variability of the data in terms of the range and the interquartile range of values, but we can also describe it in terms of how close all of the scores in the distribution are to the middle of the distribution. When we define IQR and create box plots, we use the median as the measure of central tendency. However, we use the mean as the measure of the middle to define variance.

The variance is defined as the average squared difference of the scores from the mean. The difference between a score and the mean is known as its deviation from the mean. For example, suppose we have a set of 3 numbers: 45, 50, and 55. The mean is 50.

Table 5.2: Population Data Set
Score	Deviation from the mean (score – mean)	Squared deviation from the mean
45	–5	25
50	0	0
55	5	25

Sum of squared deviations from the mean	25 + 0 + 25 = 50
Average of squared deviations (sum of squared deviations ÷ number of values)	50 ÷ 3 = 16.67

The variance of this data set, therefore, is 16.67.

For comparison, the variance of the data set (25, 50, 75) is 416.67. The variance of (49, 50, 51) is 0.67. The closer the data points are to the mean, the lower the variance.

You may wonder why we square the deviations. Why not just sum them? Because the positive and negative deviations always add up to zero, so their sum tells us nothing about the data set.

The formula for the variance is written as

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N},$$

where $\sigma^2$ is the variance, $X$ is each value, $\mu$ is the population mean, and $N$ is the number of values. (The $\sum$ symbol means "sum of.")
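The population variance formula can be traced step by step in Python. This is a minimal sketch: `population_variance` is a hypothetical helper name, and the three data sets are the small examples used in this section.

```python
def population_variance(values):
    """Average squared deviation from the mean (divides by N)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)


scores = [45, 50, 55]
mean_score = sum(scores) / len(scores)

# The raw deviations cancel out, which is why they are squared first.
sum_of_deviations = sum(x - mean_score for x in scores)

variance_narrow = population_variance([49, 50, 51])
variance_middle = population_variance(scores)
variance_wide = population_variance([25, 50, 75])
print(sum_of_deviations)
print(round(variance_narrow, 2), round(variance_middle, 2),
      round(variance_wide, 2))
```

The three results (about 0.67, 16.67, and 416.67) grow as the values move farther from the shared mean of 50.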

Variance of a Sample (as Opposed to a Population)

The formula shown above is used when every member of the population is included in the data set. In most cases, however, our data sets will not include a value for each member of the population being studied. Instead, we are only able to gather data from a sample. Because a sample only provides an estimate of the population, applying the previous formula to a sample tends to underestimate the population variance. We use a slightly different formula to calculate the variance of a sample. Instead of dividing the sum of squared differences by the number of values, we divide by one less than the number of values. Mathematically, this results in a slightly greater value for the variance because we are making a more expansive estimation of the variance that takes into account the possibility of error and greater variation than what might have been observed in our limited sample.

The formula for sample variance is written as:

$$s^2 = \frac{\sum (X - M)^2}{n - 1},$$

where $s^2$ is the estimate of the variance, $X$ is each value, $M$ is the sample mean, and $n$ is the number of values in the sample. Note that $M$ is the mean of a sample taken from a population with a mean of $\mu$. Since, in practice, the variance is usually computed in a sample, this formula is most often used.

To see the difference, let's revisit the simple sample data set.

Table 5.3: Sample Data Set
Score	Deviation from the mean (score – mean)	Squared deviation from the mean
45	–5	25
50	0	0
55	5	25

Sum of squared deviations from the mean	25 + 0 + 25 = 50
Average of squared deviations (sum of squared deviations ÷ number of values minus 1)	50 ÷ (3 – 1) = 50 ÷ 2 = 25

If these data include all members of the population, then the variance is (sum of squared deviations) ÷ (number of values) = 50 ÷ 3 = 16.67. But if this is only a sample of the population, the variance is (sum of squared deviations) ÷ (number of values minus 1) = 50 ÷ 2 = 25.
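Python's standard `statistics` module provides both versions of the calculation: `pvariance` divides by the number of values (population) and `variance` divides by one less (sample). The sketch below applies both to the same three scores; the variable names are illustrative only.

```python
from statistics import pvariance, variance

scores = [45, 50, 55]

# Treating the three scores as the entire population: divide by N.
population_result = pvariance(scores)

# Treating the three scores as a sample: divide by n - 1.
sample_result = variance(scores)

print(round(population_result, 2))
print(sample_result)
```

The only difference between the two calls is the denominator, 3 versus 2, applied to the same sum of squared deviations of 50.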

Standard Deviation

The most common measure of variability is standard deviation. Like variance, standard deviation is a number that measures how far data values are from their mean. The standard deviation provides a numerical measure of the overall amount of variation in a data set, and can be used to determine whether a particular data value is close to or far from the mean. The greater the variation of the data, the higher the standard deviation. For example, the graph below shows two distributions of data. The blue distribution has a standard deviation of 5. The red has a standard deviation of 10.

Figure 5.16

Distributions with two different standard deviations

Calculating Standard Deviation

The standard deviation is simply the square root of the variance. The standard deviation of the simple data set with a variance of 16.67 is thus 4.08.

Table 5.4: Population Data Set
Score	Deviation from the mean (score – mean)	Squared deviation from the mean
45	–5	25
50	0	0
55	5	25
Population variance	$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{50}{3} = 16.67$
Population standard deviation (square root of variance)	$\sigma = \sqrt{\sigma^2} = \sqrt{16.67} = 4.08$

The formula for the standard deviation of a data set that includes the whole population is

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}},$$

where $\sigma$ is the population standard deviation, $X$ is each value, $\mu$ is the population mean, and $N$ is the number of values. (The $\sum$ symbol means "sum of" and the √ symbol means "square root.")

The formula for the standard deviation of a sample is:

$$s = \sqrt{\frac{\sum (X - M)^2}{n - 1}},$$

where $s$ is the sample standard deviation, $X$ is each value, $M$ is the sample mean, and $n$ is the number of values in the sample. So if our simple data set reflected a sample, rather than an entire population, the standard deviation would be 5 instead of 4.08.

Table 5.5: Sample Data Set
Score	Deviation from the mean (score − mean)	Squared deviation from the mean
45	–5	25
50	0	0
55	5	25
Sample variance	$s^2 = \frac{\sum (X - M)^2}{n - 1} = \frac{50}{2} = 25$
Sample standard deviation (square root of variance)	$s = \sqrt{s^2} = \sqrt{25} = 5$

Notice that because we divide by (n – 1) for the sample and N for the population, the sample formula always gives a larger standard deviation than the population formula when both are applied to the same data (unless every value is identical, in which case both are 0). This makes sense because we are less certain about how representative the values are for a sample data set. For a population data set, we know they are representative.

Although it is important to know how standard deviation is calculated in order to understand it, we don't usually need to calculate it by hand. Statistics programs and spreadsheet programs like Microsoft Excel have tools for calculating statistics like standard deviation.
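As one example of such a tool, Python's standard `statistics` module offers `pstdev` (population standard deviation) and `stdev` (sample standard deviation). The sketch below is only an illustration using the simple three-score data set; it also confirms that each result is the square root of the matching variance.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance, stdev, variance

scores = [45, 50, 55]

population_sd = pstdev(scores)  # square root of (50 / 3)
sample_sd = stdev(scores)       # square root of (50 / 2)

# Each standard deviation is the square root of the matching variance.
matches_population = isclose(population_sd, sqrt(pvariance(scores)))
matches_sample = isclose(sample_sd, sqrt(variance(scores)))

print(round(population_sd, 2), round(sample_sd, 2))
print(matches_population, matches_sample)
```

Choosing between the two functions is the same decision as choosing between the two formulas: it depends on whether the data are the whole population or only a sample.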

Using Standard Deviation to Analyze a Data Point

Like variance, standard deviation tells us where the data are in relation to the mean overall. The standard deviation is small when the data are concentrated close to the mean, exhibiting little variation or spread. The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation.

For example, the standard deviation for the Group A test scores is 1.17. Roughly speaking, this means that a typical score is about 1.17 test points from the mean. The standard deviation for the Group B test scores, however, is 3.72, which means that a typical score is about 3.72 test points from the mean. This makes sense based on what we already know: The values in Group B are much more spread out than the values in Group A. (Note that we are considering Group A and Group B to be populations, not samples, so we use the formulas for population variance and standard deviation.)

Table 5.6: Group A Test Scores and Parameters
48	49	50	51
48	50	50	51
49	50	50	52
49	50	51	52

Range: 4

IQR: 2

Variance: 1.38

Standard Deviation: 1.17

Table 5.7: Group B Test Scores and Parameters
40	49	50	52
43	49	50	52
46	49	50	53
47	50	50	53
47	50	51	54
48	50	51	57
48	50	51	60

Range: 20

IQR: 3

Variance: 13.86

Standard Deviation: 3.72

We can also use standard deviation to tell us how far a particular data value is from the mean. Researchers often group data points based on how far they are from the mean with respect to standard deviation. A point that is within plus or minus one standard deviation of the mean is said to be "within one standard deviation" of the mean. A point that is within ±2 times the standard deviation is said to be "within two standard deviations" of the mean. For example, for Group B, the mean is 50 and the standard deviation is 3.72. Therefore, any value between 46.28 (50 – 3.72 = 46.28) and 53.72 (50 + 3.72 = 53.72) is within one standard deviation of the mean. Any value between 42.56 (50 – (2 × 3.72) = 50 – 7.44 = 42.56) and 57.44 (50 + (2 × 3.72) = 50 + 7.44 = 57.44) is within two standard deviations of the mean.
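The intervals described above can be reproduced with a short sketch. It treats Group B as a population (as the lesson does), rounds the standard deviation to two decimals to match the text, and uses hypothetical helper names (`interval`, `values_beyond`).

```python
from statistics import mean, pstdev

# Group B scores from Table 5.7, treated as a population.
group_b = [40, 43, 46, 47, 47, 48, 48, 49, 49, 49, 50, 50, 50, 50,
           50, 50, 50, 50, 51, 51, 51, 52, 52, 53, 53, 54, 57, 60]

center = mean(group_b)
spread = round(pstdev(group_b), 2)  # rounded to match the lesson's 3.72


def interval(k):
    """Return the (low, high) bounds lying k standard deviations from the mean."""
    return center - k * spread, center + k * spread


def values_beyond(values, k):
    """Return the values lying more than k standard deviations from the mean."""
    low, high = interval(k)
    return [x for x in values if x < low or x > high]


one_sd = interval(1)
two_sd = interval(2)
beyond_two_sd = values_beyond(group_b, 2)
print(one_sd, two_sd, beyond_two_sd)
```

Only the scores 40 and 60 fall more than two standard deviations from the mean, whereas the 1.5 × IQR rule earlier in the lesson also flags 43 and 57.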

Figure 5.17

Standard deviations and outliers for Group B test scores

What does this indicate? A value that is within one standard deviation from the mean is considered close to the average. A value that is two standard deviations from the mean is on the borderline for what many statisticians consider to be far from the average. Considering data to be far from the mean if it is more than 2 standard deviations away is more a "rule of thumb" than a rigid rule. Using this rule of thumb, the test scores of 40 and 60 are outliers, but 43 and 57 are not necessarily outliers.

Summary of Measures of Variability

Table 5.8: Summary of Measures of Variability
Measure	Range	Interquartile range (IQR)	Variance	Standard deviation
Description	difference between highest and lowest values in the data	difference between the third quartile (75th percentile) and the first quartile (25th percentile)	average of the squared differences between each value and the mean	square root of variance; roughly the average distance a data point falls from the mean
Formula	(maximum value) – (minimum value)	Q3 – Q1	For a population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$. For a sample: $s^2 = \frac{\sum (X - M)^2}{n - 1}$, where $\sigma^2$ = population variance; $s^2$ = sample variance; X = data value; M = sample mean; µ = population mean; N = number of values in the population; n = number of values in the sample.	For a population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$. For a sample: $s = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$, where $\sigma$ = population standard deviation; $s$ = sample standard deviation; X = data value; M = sample mean; µ = population mean; N = number of values in the population; n = number of values in the sample.
What it indicates	· The range is small when all values are clustered together. · The range is high when values are spread apart or when there are extreme outliers.	· The IQR is small when the middle 50% of the values are clustered together. · The IQR is high when the middle 50% of values are spread apart.	· The variance is small when the data are concentrated close to the mean, exhibiting little variation or spread. · The variance is larger when the data values are more spread out from the mean, exhibiting more variation.	· The standard deviation is small when the data are concentrated close to the mean, exhibiting little variation or spread. · The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation.
Pros	· Tells us how spread apart the data set is overall. · Easy to calculate.	· Tells us how variable the middle 50% of the data are. · Relatively easy to calculate.	· Takes every data point into account.	· Takes every data point into account. · Is in the same units as the variable. · Used to calculate z-scores (which are discussed in Lesson 7).
Cons	· Based on only two data points. · Can be misleading if there are outliers. · Does not tell us how the data are distributed or clustered. · Does not tell us how far away the data points are, on average. · Does not tell us what values might be outliers.	· Based on only two values (Q1 and Q3). · Different statisticians calculate it in slightly different ways.	· Hard to calculate by hand, especially with large data sets. · Is in different units from the variable. · Can be misleading when data are highly skewed since it is based on the mean. · Outliers can have a strong influence on the value since it is based on the mean.	· Hard to calculate by hand, especially with large data sets. · Can be misleading when data are highly skewed since it is based on the mean. · Outliers can have a strong influence on the value since it is based on the mean.
When to use it	For any quantitative data set.	For any quantitative data set for which the median is the preferred measure of central tendency, which means when data are not symmetrical and bell-shaped (do not have a normal distribution). However, if the mean and median are the same (distribution is symmetrical), the IQR can be used in addition to the standard deviation and variance.	For any quantitative data set for which the mean is the preferred measure of central tendency, which means when data are symmetrical and bell-shaped (have a normal distribution). Typically variance is only used in the process of obtaining the standard deviation, which is reported and interpreted more commonly.	For any quantitative data set for which the mean is the preferred measure of central tendency, which means when data are symmetrical and bell-shaped (have a normal distribution).

Introduction

Introduction to Probability

Image of dice (Photodisc/Thinkstock): A thorough understanding of probability is the key to a casino's success.

What are the chances that you will pull a heart out of a deck of cards? How likely is it that you will select a 25-year-old male from the general population for a sample you are creating for a research study? How likely is it that you will answer a multiple-choice question correctly? What is the likelihood that you will experience a side effect of a particular drug? What are your chances of winning the lottery? All of these questions illustrate the concept of probability and its application to our everyday lives.

Probability is the chance or likelihood of an event, and it has a wide range of applications. The probability of something occurring is always a number between 0 and 1, inclusive. A probability of 0 means that the likelihood of the event occurring is 0%. A probability of 1 means that the likelihood of the event occurring is 100%. As the probability of an event approaches 0, it becomes less likely to occur. Likewise, as the probability of an event approaches 1, it becomes more likely to occur. A probability of .5 means that there is a 50% chance that the event will occur.

The probability of an event is related to whether that event is independent of or dependent on other events. For example, if we flip a coin, the chances of heads up and of tails up are the same, .5 or 50%, and will always be the same (assuming we have a fair, balanced coin). Even if we flipped a coin 100 times and each time it was heads up, the next flip still has just a .5 or 50% chance of being heads up since each flip is independent of the other flips. An event is dependent if the events leading up to it affect its probability. For example, say we have a hat containing 10 slips of paper, with a different name written on each. We pick names out of the hat one at a time. The chance of drawing a certain name first is 1/10, or .10, or 10%. After we have drawn five names from the hat, the chance of drawing a particular one of the remaining names (presuming that we have not put the five names back into the hat) is now 1/5, or .20, or 20%. Thus the chance of selecting a certain name has increased from 10% to 20%. This is because the probability of the later event is dependent on the results of earlier events.
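The hat example can be written out with exact fractions. This is a small sketch, assuming names are drawn without replacement and that the name we care about is still in the hat; `chance_of_specific_name` is a hypothetical helper.

```python
from fractions import Fraction


def chance_of_specific_name(names_in_hat):
    """Chance of drawing one particular name from the names still in the hat."""
    return Fraction(1, names_in_hat)


first_draw = chance_of_specific_name(10)            # all 10 names present
after_five_drawn = chance_of_specific_name(10 - 5)  # 5 names removed

print(first_draw, float(first_draw))
print(after_five_drawn, float(after_five_drawn))
```

If each drawn name were put back into the hat, the number of names would stay at 10 and the chance would remain 1/10 on every draw, which is the independent case.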

We must be careful to avoid the Gambler's Fallacy, which is the impression that the probability of an independent event increases when it has yet to occur. This is incorrect because the probability, for example, of rolling a 6 on a six-sided die is the same with each roll, even though you may have already rolled it 100 times and never rolled a 6. In events such as this, we should not assume the past events are additive, or create a condition in which the probability has changed when those events are independent of each other. This lesson will cover the basics of probability, including how to calculate and interpret various probabilities.
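A quick simulation can make the Gambler's Fallacy concrete. The sketch below is illustrative only: it simulates a fair six-sided die with Python's `random` module, and the function name, the fixed seed, the number of rolls, and the "drought" length of 10 are all arbitrary assumptions. It estimates how often a 6 appears immediately after at least 10 rolls in a row without a 6.

```python
import random


def six_rate_after_drought(num_rolls, drought_length, seed=0):
    """Estimate the chance of a 6 on rolls that follow a run of non-6 rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rolls_since_six = 0
    qualifying_rolls = 0
    sixes = 0
    for _ in range(num_rolls):
        roll = rng.randint(1, 6)
        if rolls_since_six >= drought_length:
            qualifying_rolls += 1
            if roll == 6:
                sixes += 1
        rolls_since_six = 0 if roll == 6 else rolls_since_six + 1
    return sixes / qualifying_rolls


rate = six_rate_after_drought(num_rolls=200_000, drought_length=10)
print(round(rate, 3), "versus", round(1 / 6, 3))
```

The estimate should land close to 1/6 (about .167), showing that a long run without a 6 does not make a 6 any more likely on the next roll.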

Applying Knowledge of Probability

Probability in Ashford Courses

You will need to understand probability in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to calculate and interpret probability so you may better understand and describe the chances of an event occurring, especially as it relates to making errors in hypothesis testing and drawing adequate samples from a population. In addition, having knowledge and skills in this area will help you better understand the results and discussion sections of the research literature you may be asked to evaluate for various courses.

Probability in Graduate Research

When conducting research it is important to determine and evaluate the probability, or significance level, for all tests conducted. This provides us with information regarding the level of statistical significance or probability of the results being due to chance. We should also consider probability when sampling to make sure the sample is representative of the population, so that individuals have equal or specific probabilities of being selected for the sample.

Probability in the Professional World

In the professional world it is important to understand how probabilities are calculated, when they should be utilized, and how to interpret them in order to accurately portray data to others. Considering the Gambler's Fallacy, the Law of Large Numbers, and other probability principles will aid in interpreting and correcting common misconceptions about probability. In addition, it is part of one's ethical practice as a professional to understand that which one is communicating to others, whether it is summarizing another's research data or one's own findings.

Introduction

Introduction to Normal Distributions

Many attributes, when measured from the population or a large enough sample, end up looking like what we call a "normal" distribution. Weight, height, running speed, heart rate, and other characteristics look like the normal curve, meaning more scores are concentrated around the mean, or middle of the distribution, and fewer and fewer scores are shown as we go further away from the mean. It is common to hear a normal distribution referred to as a bell-shaped curve due to this overall shape.

A normal distribution has many characteristics. First, the mean, median, and mode are all at or close to the same point. In addition, the normal distribution looks like a mound or bell, where the majority of scores fall near the middle and each side slopes down to fewer and fewer scores.

A normal distribution is not skewed in any way and is symmetrical, meaning that if we cut the distribution in half we would have two halves that look like mirror images of each other. When a distribution is normal, we can comfortably use the mean to describe the center of our data, as there is no significant skewness or outliers that will affect the mean.

A normal distribution also has specific rules of thumb relating to a measure of variability: standard deviation. Approximately 68% of the scores in a normal distribution fall within one standard deviation of the mean. Stated differently, the majority of the scores fall close to the mean. This makes sense because in the normal distribution, the majority of scores fall in the middle, where the "average" is located, while the further we move off the mean on either side, there are fewer and fewer scores. Approximately 95% of the scores fall within 2 standard deviations of the mean, and approximately 99.7% of the scores fall within 3 standard deviations of the mean.
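These rules of thumb can be checked against the theoretical normal curve using `NormalDist` from Python's standard `statistics` module. This is a small sketch; `share_within` is a hypothetical helper name, and the standard normal distribution (mean 0, standard deviation 1) stands in for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_one = share_within(1)    # about 68%
within_two = share_within(2)    # about 95%
within_three = share_within(3)  # about 99.7%
print(round(within_one, 4), round(within_two, 4), round(within_three, 4))
```

Because the proportions depend only on the number of standard deviations, the same percentages apply whatever the mean and standard deviation of the normal distribution happen to be.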

We can use standardized scores with a normal distribution: We can convert all the raw numbers in our data set to the same scale for easy comparison. We can only do this when the data have a predictable pattern, such as in the normal distribution. One of the most common standardized scores is the z score, which tells us the distance a raw score is from the mean in terms of standard deviations.

Z scores can be positive or negative. If a z score is positive (+), we know that the score is above or greater than the mean. If a z score is negative (−), we know that the score is below or less than the mean. A z score of 0 means the score is the same as the mean. The z score equals the distance between the raw score and the mean, in standard deviations. If a z score is −1.5, the raw score is 1.5 standard deviations below the mean. If a z score is +0.50, then the raw score is .50 standard deviations above the mean.
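A z score is computed by subtracting the mean from the raw score and dividing by the standard deviation. The sketch below illustrates this with made-up numbers: the mean of 50 and standard deviation of 10 are assumptions chosen only for easy arithmetic, and `z_score` is a hypothetical helper name.

```python
def z_score(raw_score, mean, standard_deviation):
    """Distance of a raw score from the mean, in standard deviations."""
    return (raw_score - mean) / standard_deviation


# Hypothetical test with a mean of 50 and a standard deviation of 10.
below_mean = z_score(35, mean=50, standard_deviation=10)
at_mean = z_score(50, mean=50, standard_deviation=10)
above_mean = z_score(55, mean=50, standard_deviation=10)
print(below_mean, at_mean, above_mean)
```

The sign shows which side of the mean the score is on, and the size shows how many standard deviations away it is.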

A z score is useful because it tells us where a raw score falls in comparison to the mean and other scores. We can more easily see how all the scores relate to each other and the mean. Z scores are often reported when we take standardized tests. We can calculate z scores, too, if we have a group of data that is normally distributed. We can also calculate the mean and standard deviation of the normal distribution.

This lesson will introduce you to the properties of the normal distribution and z scores, and how they are useful in statistical analysis.

Applying Normal Distributions and z Scores

Normal Distributions and z Scores in Ashford Courses

Students will need to understand normal distributions in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to calculate and interpret z scores so you may better understand and describe data, especially as they relate to tests and measurements. You will also need to explain the characteristics and uses of normal distributions. In addition, having knowledge and skills in these areas will help you better understand the methods, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Normal Distributions and z Scores in Graduate Research

When conducting research it is important to determine and evaluate the shape of the data's distribution to determine if it is "normal." With a normal distribution, we can describe the data using certain rules of thumb, and we can more easily calculate standardized scores from the data. We should also consider the importance of normal distributions when evaluating archival data.

Normal Distributions and z Scores in the Professional World

In the professional world it is important to understand how to calculate standardized scores, when to use them, and how to interpret them in order to accurately portray data to others. Knowing the properties of normal distributions and z scores will aid in interpreting and correcting common misconceptions about what is "normal" as it applies to data. In addition, it is part of one's ethical practice as a professional to understand that which one is communicating to others, whether it is summarizing another's research data or one's own findings.

Tutorial

Normal Distributions

You have probably heard of a bell curve: a bell-shaped graph in which values are more concentrated in the middle than at the ends. This is more formally known as a normal distribution, and it appears in almost all disciplines, including psychology, business, economics, the sciences, nursing, and, of course, mathematics. Most IQ scores are normally distributed; real estate prices fit a normal distribution; human heights and weights are normally distributed; standardized assessments may be graded using a normal distribution.

Figure 7.1

Normal distributions are bell-shaped curves.

Importance of Normal Distributions

Normal distributions are extremely important in statistics for a number of reasons. First, as we have stated above, many population variables have a probability distribution that is approximately normal. Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.

A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Fortunately, these tests work very well even if the variable is only approximately normally distributed. Some tests work well even with very wide deviations from the normal distribution.

If a variable is distributed normally, then we can easily calculate a number of probability statistics and use them to make inferences about our data. For example, we can calculate the probability that a value occurred due to chance. If that probability is extremely low, we might need to consider some alternative explanation.

Properties of Normal Distributions

In Lesson 6, we discussed probability distributions, in particular binomial distributions. The binomial distribution is an example of a discrete distribution. That is, it is a frequency graph of a discrete variable: The values are described in whole numbers only (the number of successes can be described in whole numbers only).

A normal distribution looks very similar to a symmetrical binomial distribution, but it is a continuous distribution. It is a probability distribution or frequency graph of continuous variables. The variable could, for example, be a person's height or weight, the score on an exam, or the price of a house.

Normal distributions have a number of defining characteristics:

1. The mean, median, and mode values are the same.

Figure 7.2

2. They are symmetrical about the mean.

3. They are unimodal.

4. The area under a normal distribution curve is equal to a probability of 1. (This makes sense; recall that the sum of the probabilities of all possible outcomes is 1.)

5. The probability of any single particular value is 0. (This may not make sense at first, but it arises from the fact that the variables are continuous; the distribution has an infinite number of values. In a discrete distribution, we can display the probability of each value in a table. This is not possible for a continuous distribution because the number of values is infinite, and the probability of each is zero. Unlike a discrete distribution, where we can define the probability of a single value, we will instead define the probability of a range of values.)

6. The exact shape of the normal distribution is defined by two parameters: the mean and the standard deviation. Several normal distributions are shown below.

Figure 7.3

Notice that they differ in how spread out they are. Because the area under the curve must be equal to 1, a change in the standard deviation causes a change in the shape of the curve. The curve is broad with a large standard deviation, and skinny with a small standard deviation. There is an infinite number of normal probability distributions.

Calculating Probabilities Using the Empirical Rule

As we stated earlier, because the normal distribution is a continuous probability distribution, it is impossible to define the probability of a single value. Instead, we can calculate the probability of a range of values.

In a normal distribution,

· about 68% of the values are within one standard deviation of the mean. Since the curve is symmetrical, this means that 34% are within one standard deviation above the mean and 34% are within one standard deviation below the mean;

· about 95% of the values are within 2 standard deviations of the mean; and

· about 99.7% of the values are within 3 standard deviations of the mean.

These properties are true for all normal distributions and they form the empirical rule, or the standard deviation rule.
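The percentages in the empirical rule can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the helper name `proportion_within` is a hypothetical name chosen for illustration and is not part of the lesson.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k: float) -> float:
    """Proportion of values lying within k standard deviations of the mean."""
    # Area below +k minus area below -k leaves the area between -k and +k.
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.1%}")
```

The printed proportions come out at roughly 68.3%, 95.4%, and 99.7%, which is why the rule is usually quoted as 68%, 95%, and 99.7%.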

Figure 7.4

Standard Normal Distribution and z Scores

Notice that when we discussed the empirical rule, we replaced the x-axis values with numbers indicating distance from the mean in units of standard deviations, with the distribution centered at 0 (0 standard deviations from the mean). This curve is known as a standard normal distribution. The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions can be transformed to standard normal distributions by the following formula:

$$z = \frac{x - \mu}{\sigma}$$

where x is a score from the original normal distribution, μ is the mean of the original normal distribution, and σ is the standard deviation of the original normal distribution. The standard normal distribution is sometimes called the z distribution.

The units along the x-axis of a standard normal distribution are z scores. The z score of a value is simply its value in terms of standard deviations from the mean. For example, if the mean of a normal distribution is 5 and the standard deviation is 2, the value 11 is 3 standard deviations above (or to the right of) the mean. Its z score is 3.

$$z = \frac{x - \mu}{\sigma} = \frac{11 - 5}{2} = \frac{6}{2} = 3$$
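The same calculation can be written as a small function. This is only an illustrative sketch in Python; the function name `z_score` and its parameter names are assumptions made for this example.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# The worked example above: mean 5, standard deviation 2, raw score 11.
print(z_score(11, mean=5, sd=2))  # 3.0
```

A negative result means the raw score is below the mean; for instance, a raw score of 3 in the same distribution would have a z score of −1.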

Standard normal distributions and z scores are important when comparing distributions. Using them, you can quickly understand what a particular value really means. For example, suppose your doctor tells you that your infant son's weight is 8 ounces below the mean weight for baby boys his age. Should you be worried? Even if your doctor tells you that the mean weight is 12 pounds 4 ounces, you still don't really know if your son's weight is "normal" or not. What you need to know is how close it is to the mean in terms of standard deviations. If it is within 1 standard deviation, it would be considered quite close to the mean and probably within the range of healthy infants his age. If it is more than 2 standard deviations lower than the mean, however, you might have reason to worry. If it is more than 3 standard deviations lower, then you know that your son is smaller than more than 99% of other infants his age. We want to know not just your son's weight (the "raw score") and the difference between his weight and the mean weight, but also the z score of his weight.

Converting between Raw Scores and Percentiles

It would actually be quite unlikely for your pediatrician to give you the z score of your infant son's weight. But he or she is likely to give you a related measure: the percentile ranking of the baby's weight. You may be told that the baby is only in the 5th percentile for weight. This means that 95% of infant boys his age weigh more than he does. Similarly, you may have scored in the 98th percentile on a standardized test. This means that you scored better than 98% of the people who took the test. Only 2% scored higher than you.

One advantage of normal distributions is that if the mean and standard deviation of a normal distribution are known, it is easy to convert back and forth between raw scores and percentiles.

Example 1: Calculating percentile rank from a raw score

For example, assume an exam in an Ashford psychology course is normally distributed with a mean of 80 and a standard deviation of 5. What is the percentile rank of a person who received a score of 70 on the exam? We know that a score of 70 is 2 standard deviations below the mean (the mean is 80 and the standard deviation is 5). We learned in the section on the empirical rule that about 95% (more precisely, 95.4%) of the data points lie within 2 standard deviations of the mean. This means that the remaining 4.6% must lie outside this range. Because the curve is symmetrical, 2.3% are below −2 standard deviations and 2.3% are above +2 standard deviations. Thus, in terms of the psychology exam example, a person with a score of 70 would be in the 2.3rd percentile.

Figure 7.5

Example 2: Calculating percentile rank from a raw score

What about a person who scored 75 on the same exam? The proportion of the area under the curve below 75 is the same as the proportion of scores below 75.

Figure 7.6

A score of 75 is one standard deviation below the mean (the mean is 80 and the standard deviation is 5). If you keep the empirical rule in mind, you can calculate the percentile:

· 68% of scores are within one standard deviation of the mean. In other words 68% of scores are between the z scores of −1 and +1.

· Because the distribution is symmetrical, 34% (half of 68%) must be between −1 and the mean.

· We know that 50% of scores are between negative infinity and the mean (i.e., below the mean).

· Therefore, (50% below the mean) − (34% between −1 and the mean) = 16% below −1.

Figure 7.7

Therefore, the proportion of the scores below 75 is about 0.16, so a person with a score of 75 would have a percentile rank of about 16 (more precisely, 15.9).
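Examples 1 and 2 can be cross-checked without the empirical rule. The sketch below assumes Python 3.8 or later and the standard-library `statistics.NormalDist` class; the names `exam` and `percentile_rank` are hypothetical and used only for illustration.

```python
from statistics import NormalDist

# The exam distribution from the examples: mean 80, standard deviation 5.
exam = NormalDist(mu=80, sigma=5)


def percentile_rank(score: float) -> float:
    """Percentage of the distribution that lies below the given raw score."""
    return 100 * exam.cdf(score)


print(round(percentile_rank(70), 1))  # 2.3
print(round(percentile_rank(75), 1))  # 15.9
```

The cumulative distribution function (`cdf`) returns the area under the curve from negative infinity up to the score, which is exactly the proportion the examples estimate by hand.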

Example 3: Calculating percentile rank from a raw score using a z table

If the z score is near a whole number between −3 and +3, then you can use the empirical rule. If not, you'll need another method. Recall from Lesson 6 that you can calculate percentiles by ordering and counting every data point and then figuring out where in the sequence of points your data value falls. Fortunately, there are easier ways. Most people use a z table like the one below, or a z calculator, which you can find online. (Links to z calculators can be found in the Practice section of this lesson.)

Table 7.1
z	Probability (or area under the curve) from negative infinity to z
−3.0	0.0013
−2.5	0.0062
−2.0	0.0228
−1.5	0.0668
−1.0	0.1587
−0.5	0.3085
0.0	0.5000
0.5	0.6915
1.0	0.8413
1.5	0.9332
2.0	0.9772
2.5	0.9938
3.0	0.9987

What is the percentile rank of a person who received a score of 92.5 on the exam?

Figure 7.8

The graph shows that most people scored below 92.5. We can determine that 92.5 is 2.5 standard deviations above the mean:

$$z = \frac{x - \mu}{\sigma} = \frac{92.5 - 80}{5} = \frac{12.5}{5} = 2.5$$

The table indicates that a z score of 2.5 is equivalent to the 99.4th percentile. The proportion of people who scored below 92.5 on the exam, then, is 0.994.
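A z calculator does the same job as the table. As a rough sketch (assuming Python 3.8 or later and the standard-library `statistics.NormalDist` class; `area_below` is a hypothetical helper name), the values in Table 7.1 and the result for a score of 92.5 can be regenerated like this:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, standard deviation 1


def area_below(z: float) -> float:
    """Area under the standard normal curve from negative infinity to z."""
    return standard_normal.cdf(z)


# z values from -3.0 to +3.0 in steps of 0.5, matching the rows of Table 7.1.
for step in range(-6, 7):
    z = step / 2
    print(f"{z:+.1f}\t{area_below(z):.4f}")

# Example 3: a raw score of 92.5 with mean 80 and standard deviation 5.
exam_z = (92.5 - 80) / 5
print(round(area_below(exam_z), 4))  # 0.9938
```

Unlike a printed table, this approach works for any z score, including values that fall between table rows.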

Example 4: Calculating Raw Scores from Percentiles

What score on the psychology exam would be in the 75th percentile? (Remember the test has a mean of 80 and a standard deviation of 5.) The answer is computed by reversing the steps in the previous problems.

First, determine how many standard deviations above the mean equates to the 75th percentile. This can be found by using a z table and finding the value of z associated with 0.75. The value of z for the 75th percentile is 0.674. Thus, one must be 0.674 standard deviations above the mean to be in the 75th percentile.

Second, calculate how far from the mean 0.674 standard deviations is. Because the standard deviation is 5, one must be 3.37 points above the mean.

$$\sigma \times z = 5 \times 0.674 = 3.37$$

Because the mean is 80, you would need a score of 80 + 3.37 = 83.37 to be in the 75th percentile. Rounding off, a score of 83 is needed to be in the 75th percentile.
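The reverse lookup in Example 4 can also be sketched in code. This assumes Python 3.8 or later, where `statistics.NormalDist` provides an `inv_cdf` method that returns the value below which a given proportion of the distribution falls; the variable names are illustrative only.

```python
from statistics import NormalDist

mean, sd = 80, 5

# Step 1: z score whose area below is 0.75 (the 75th percentile).
z_75 = NormalDist().inv_cdf(0.75)

# Step 2: convert the z score back to a raw score: x = mean + z * sd.
score_75 = mean + z_75 * sd

print(round(z_75, 3))      # 0.674
print(round(score_75, 2))  # 83.37
```

Rearranging the z score formula gives the second step directly: since z = (x − μ)/σ, the raw score is x = μ + zσ.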

Figure 7.9

Normal Distribution and z Scores Summary

Concept	Definition	Additional Notes
normal distribution	A continuous probability distribution that is unimodal and symmetrical about the mean, median, and mode.	In a normal distribution, the mean = median = mode · Probability of any particular value = 0 · Total probability = 1
standard normal distribution	A normal distribution that has been rescaled such that the x-axis is in units of standard deviations, with the curve centered at 0, the mean.	The standard normal distribution is sometimes called the z distribution.
empirical rule	The rule that states that for a normal distribution of data, 68% of data points fall within 1 standard deviation of the mean, 95.44% fall within 2 standard deviations of the mean, and 99.74% fall within 3 standard deviations of the mean.	The empirical rule, along with z scores, can be used to estimate probabilities and percentiles for normal distributions.
z score	The value of a data point in terms of number of standard deviations from the mean; the z score is given in terms of standard deviation units.	$z = \frac{x - \mu}{\sigma}$, where x is a score from the original normal distribution, μ is the mean of the original normal distribution, and σ is the standard deviation of the original normal distribution. The x-axis of the standard normal distribution is in z score units or standard deviations from the mean. z scores can be calculated for all types of distributions, not just normal distributions. However, the standard normal distribution tables cannot be used for these data.
proportion	The amount of area (between 0 and 1) under the normal curve below a certain value, above a certain value, or between two values.	The proportion of the area under the curve and the probability that a point falls in that area under the curve are the same.
probability	The chance (between 0 and 1) that a data point has a value above a certain value, below a certain value, or between two values.	The probability is the same as the proportion under the curve. If the distribution is normal, the probability can be calculated using z scores.
percentile	The ranking of a data point with respect to other data points; a data point in the nth percentile is greater than n% of the other data points, and less than (100 − n)% of the other data points.	If the distribution is normal, and the z score is known or the raw data, mean, and standard deviation are known, the percentile can be determined using a z table.

Introduction

Introduction to Hypothesis Testing

To determine whether or not a series of training sessions has an impact on job performance, a researcher would need to compare performance before and after training, and compare performance of those who participated to those who did not participate in the training.

In the lessons leading up to this one, we focused on using descriptive statistics to help describe data in some way. This includes identifying and describing the variable(s) of interest, evaluating the distribution of data, and interpreting measures of central tendency and variability. What if you wanted to look at multiple variables at one time and compare them to each other, or determine if your sample data is representative of the population? Or what if you wanted to determine whether there is a relationship between variables, or whether one variable impacts another variable? All of these questions can be answered using inferential statistics, which are methods for drawing conclusions regarding data. One major tool used in inferential statistics is hypothesis testing.

A hypothesis is a statement of what you expect or hope to find in a study. It provides direction for what you are interested in measuring. A research (or alternative) hypothesis is what you expect to find proof of in your study, which will typically focus on a significant difference, effect, or relationship. The null hypothesis is what you hope to prove wrong with your data results; it is the opposite of the alternative hypothesis, in that it typically states that you will find nothing (hence, "null"). In other words, the null hypothesis states that you will not find a significant difference, effect, or relationship.

Let's say you have the following research question in mind: Does a new employee training program impact job performance? The null hypothesis for this study would be the following: A new employee training program does not impact job performance. You hope to prove this hypothesis wrong in your study, allowing you to accept the research/alternative hypothesis: A new employee training program does impact job performance. You have not stated what the impact will be (whether you expect an increase or decrease in job performance based on exposure to the training program), only that the training program might have an impact on job performance.

An important concept in hypothesis testing is statistical significance, which tells you whether the results are likely to be due to chance or to a real effect, relationship, or difference. However, that does not mean that what you have found has practical significance, which is the usefulness or importance of the results in the "real world." By evaluating the actual "effect size" and using critical thinking, you can determine the practical significance of test results for a particular study, thereby providing another level of analysis of statistical results.

In this lesson you will further explore the concept of hypothesis testing and how to determine statistical and practical significance for test results that can lead to conclusions regarding hypotheses.

Applying Hypothesis Testing

Hypothesis Testing in Ashford Courses

You will need to understand hypothesis testing in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to understand how hypotheses are formulated based on research questions, and how these hypotheses influence the focus of the study and conclusions that can be made. In addition, having knowledge and skills in this area will help you better understand the introduction, methods, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Hypothesis Testing in Graduate Research

When conducting scientific research, a hypothesis typically guides the focus of the study. You will need to know how to formulate solid, clear hypotheses that capture the essence of a study and communicate to the reader what will be assessed in the study. In addition, based on the results of the study, you will need to draw conclusions regarding acceptance or rejection of those hypotheses and discuss those conclusions in terms of both statistical and practical significance.

Hypothesis Testing in the Professional World

In the professional world it is important to understand how to create hypotheses, when to use them, and how to analyze them in order to accurately portray data results to others. Knowing how to evaluate hypotheses for the variables and focus of a study will aid in understanding research studies and their practical significance for application in the "real world." In addition, it is part of one's ethical practice as a professional to understand what one is communicating to others, whether it is summarizing another's research data or one's own findings.

Tutorial

Introduction to Hypothesis Testing: The Betting Dilemma

Suppose you and a friend are betting on the outcome of coin tosses. You know that if the coin is fair, there is a 50% chance it will fall heads up and a 50% chance that it will fall tails up. Now suppose your friend flips the coin 20 times and it falls heads up 14 times and tails up 6 times. You know that ideally, it would land heads up 10 times and tails up 10 times, but you also know that a perfect 50–50 outcome is actually very unlikely with only 20 flips. But still, it seems unlikely that the outcome would be so different from 50–50. Should you attribute the result to random chance? Or should you conclude that it is NOT a result of random chance (and then infer that the coin is biased and that your friend is cheating)?

In this scenario, we will explore the concept of hypothesis testing. We will not use any of the technical terms for now (we'll get to those later), but we will put those terms in curly brackets in case you are already somewhat familiar with hypothesis testing and to make it easier to refer back to them later.

Defining the Question {Null and Alternative Hypotheses}

So how do we go about answering our question: Is the result due to random chance or is it not? Essentially, you will need to determine what the probability is that a fair coin toss would come up 14 heads, and then you will need to determine whether that probability is high enough to accept random chance as a reasonable explanation. Notice that this process involves both a quantitative analysis (using the probabilities) and a qualitative analysis (using your own judgment regarding the probability threshold).

Examining the Probability – Exactly 14 Heads {p Values}

The first thing we would do is look at the probability distribution for an unbiased coin toss. For 20 tosses, it looks like this:

Figure 8.1

This is a probability distribution of outcomes of a coin toss. Note that this is actually a discrete, binomial distribution, but we are displaying it like a continuous normal distribution. With an increasing number of trials, the binomial distribution approximates a normal distribution.

Table 8.1
x: Number of Heads Up in 20 Tosses	Probability that Number of Heads Is Exactly x
0	9.53674E-07
1	1.90735E-05
2	0.000181198
3	0.001087189
4	0.004620552
5	0.014785767
6	0.036964417
7	0.073928833
8	0.120134354
9	0.160179138
10	0.176197052
11	0.160179138
12	0.120134354
13	0.073928833
14	0.036964417
15	0.014785767
16	0.004620552
17	0.001087189
18	0.000181198
19	1.90735E-05
20	9.53674E-07

From Table 8.1, we see that the probability that exactly 14 heads would turn up in 20 tosses is 0.037, or 3.7%.
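The entries in Table 8.1 follow from the binomial formula, P(X = k) = C(n, k) · p^k · (1 − p)^(n − k). The sketch below assumes Python 3.8 or later (for `math.comb`); the function name `binomial_pmf` is a hypothetical choice for this illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k successes (heads) in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 14 heads in 20 tosses of a fair coin.
print(f"{binomial_pmf(14, 20):.9f}")  # 0.036964417

# The probabilities of all 21 possible outcomes (0 to 20 heads) sum to 1.
total = sum(binomial_pmf(k, 20) for k in range(21))
```

Here `comb(20, 14)` counts the 38,760 different orderings of 14 heads among 20 tosses, each of which has probability 0.5 raised to the 20th power.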

Examining the Probability – 14 or More Heads {p Values, One-Tailed Tests}

As we said, the probability that exactly 14 heads would turn up in 20 tosses is 0.037, or 3.7%. However, we are not really interested in the probability that exactly 14 heads turn up, but instead in the probability that 14 or more heads turn up. After all, if the result was 15, 16, 17, 18, 19, or 20, you would be equally, if not more, suspicious. We will need to consult a cumulative probability table like Table 8.2 to figure this out. The cumulative probability is the probability that up to x number of heads turn up, or x or more heads turn up.

Table 8.2
x: Number of Heads Up in 20 Tosses	Probability that Number of Heads Is Exactly x	Probability that Number of Heads Is x or Fewer	Probability that Number of Heads Is x or Greater
0	9.53674E-07	9.53674E-07	1
. . .	. . .	. . .	. . .
13	0.073928833	0.942340851	0.131587982
14	0.036964417	0.979305267	0.057659149
15	0.014785767	0.994091034	0.020694733
. . .	. . .	. . .	. . .
20	9.53674E-07	1	9.53674E-07

Figure 8.2

The area under the curve where x is greater than or equal to 14 is 0.058.

In Table 8.2 we see that the probability that 14 or more heads turn up is 0.058. We can also represent this graphically. The proportion of the area under the curve is 0.058.
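The "x or greater" column is simply the sum of the exact probabilities from x up to 20. A minimal sketch for a fair coin, assuming Python 3.8 or later (for `math.comb`) and using the hypothetical helper name `prob_at_least`:

```python
from math import comb


def prob_at_least(k: int, n: int) -> float:
    """P(X >= k): probability of k or more heads in n tosses of a fair coin."""
    # Each of the 2**n equally likely sequences is counted once.
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2**n


print(f"{prob_at_least(14, 20):.9f}")  # 0.057659149
```

Because the coin is assumed fair, every sequence of 20 tosses is equally likely, so the tail probability is just the number of sequences with at least 14 heads divided by the total number of sequences.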

Examining the Probability – 14 or More Heads OR 6 or Fewer Heads {Two-Tailed Tests}

To take it a step further, you may also want to consider that you would have been equally suspicious of your partner if the result was 6 or fewer heads, that is, 14 or more tails. Now the total probability that the outcome would result in 14 or more heads OR 6 or fewer heads is 0.116, or 11.6% (adding the two rounded tail values, 0.058 + 0.058; the unrounded sum is closer to 0.1153).

Table 8.3
x: Number of Heads Up in 20 Tosses	Probability that Number of Heads Is Exactly x	Probability that Number of Heads Is x or Fewer	Probability that Number of Heads Is x or Greater
0	9.53674E-07	9.53674E-07	1
. . .	. . .	. . .	. . .
6	0.036964417	0.057659149	0.979305267
. . .	. . .	. . .	. . .
14	0.036964417	0.979305267	0.057659149
. . .	. . .	. . .	. . .
20	9.53674E-07	1	9.53674E-07

Figure 8.3

The total area under the curve where x ≤ 6 or x ≥ 14 is 0.116.
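The two-tailed probability adds up every outcome that is at least as far from the expected 10 heads as the observed result, in either direction. The following is a sketch for a fair coin only, assuming Python 3.8 or later; `two_tailed_p` is a hypothetical name.

```python
from math import comb


def two_tailed_p(heads: int, n: int) -> float:
    """Probability of a result at least as far from n/2 as `heads` (fair coin)."""
    distance = abs(heads - n / 2)
    # Count sequences whose number of heads is at least that far from n/2.
    extreme = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= distance)
    return extreme / 2**n


print(f"{two_tailed_p(14, 20):.4f}")  # 0.1153
```

The unrounded result is about 0.1153; adding the two rounded tail values (0.058 + 0.058) gives the 0.116 quoted in the text.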

Making a Judgment {Statistical Significance}

Now you must ask yourself, am I willing to accept that this outcome was due to chance? We've just calculated that over the long run, an outcome of 14 heads or similar should occur 11.6% of the time, or about 1 in 9 times. Most researchers would regard this as well within the range of random chance. Until you gather more data to prove otherwise, you can safely accept that 14 heads up out of 20 was a random result, and not because your friend cheated.

Defining a Probability that Is Too Low to Attribute to Random Chance {α Values, Type I Errors, and Type II Errors}

If 14 heads is considered to be within the realm of random chance, at what point should you become suspicious? That is technically up to you and what you are hoping to accomplish, but in social science research, the cutoff is generally at a probability of 0.05 (that is, results with a probability of less than 0.05 would be suspicious). In some cases, this cutoff probability is 0.01. In our experiment, a probability of 0.05 would correspond to a result of 15 or more heads or 5 or fewer heads (0.0207 + 0.0207 = 0.0414). A cutoff probability of 0.01 would correspond to a result of 17 or more heads or 3 or fewer heads (0.0013 + 0.0013 = 0.0026).

Which cutoff you choose depends on how comfortable you are with being wrong. For example, if you decide that the cutoff is 0.05, and any result of 15 or more or 5 or fewer heads means that your partner is cheating, you need to acknowledge that there is a 4.14%* chance that the result is actually due to random chance. That is, you need to accept the fact that there is a 4.14% chance that you are unfairly accusing your friend of cheating. You may decide that this is just too great a chance (given the consequences of accusing someone of dishonesty) and instead move that threshold down to 0.01. Now, if a result is 17 or more, or 3 or fewer heads, there is only a 0.26% chance that you would come to an erroneous conclusion that the game is unfair.

If you were a casino owner, however, you would probably be less concerned with unfairly accusing someone of cheating and more concerned that someone is stealing money from you. In this case, you might decide to have a higher probability cutoff. As a casino owner, you are more worried about accepting that the result is due to random chance when in fact it is not, than you are about rejecting that it is due to random chance when in fact it is.

* You may be wondering why a cutoff of 0.05 doesn't result in a 5% chance of being wrong. It does, but in this scenario, our variable is discrete: We can only have a whole number of heads up. The 0.05 probability corresponds to outcomes of somewhere between 14 or more and 15 or more, and 5 or fewer and 6 or fewer. The exact probability of 15 or more or 5 or fewer is not 0.05, but is 0.0414.
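The cutoffs discussed in this section (15 or more heads for 0.05, 17 or more heads for 0.01) can be found by searching for the least extreme result whose two-tailed probability does not exceed the chosen cutoff. This is an illustrative sketch for a fair coin, assuming Python 3.8 or later; the names `prob_at_least` and `suspicious_threshold` are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int) -> float:
    """P(X >= k) for n tosses of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


def suspicious_threshold(n: int, cutoff: float):
    """Smallest k above n/2 such that P(k or more heads OR n-k or fewer heads) <= cutoff.

    Returns the pair (k, two-tailed probability at k).
    """
    for k in range(n // 2 + 1, n + 1):
        # For a fair coin the two tails are mirror images, so double one tail.
        two_tailed = 2 * prob_at_least(k, n)
        if two_tailed <= cutoff:
            return k, two_tailed
    raise ValueError("no outcome is extreme enough for this cutoff")


k_05, p_05 = suspicious_threshold(20, 0.05)
k_01, p_01 = suspicious_threshold(20, 0.01)
print(k_05, round(p_05, 4))  # 15 0.0414
print(k_01, round(p_01, 4))  # 17 0.0026
```

Because the number of heads is discrete, the probability at the threshold falls below the nominal cutoff (0.0414 rather than 0.05), which is the point made in the footnote above.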

Defining the Acceptable Range of Results {Confidence Interval}

We can frame the question of what probability is so low that it is suspicious in another way: What is the acceptable range of results? If your cutoff is 0.05, then you have decided to accept as random and not be suspicious of any result between 6 and 14 heads. This is the range in which 95% of the results fall due to random chance. It is also the range for which you are 95% confident that you will not accidentally accuse your friend of cheating. If your cutoff is 0.01, then you are satisfied with any result from 4 to 16 heads.

Figure 8.4

If we want to be 95% certain that we are not falsely accusing anyone of cheating, the shaded region represents the outcomes that we will accept as being due to random chance. Anything outside that region we would not accept as being due to random chance.

Sample Size

Now suppose you continued the bet for 100 tosses and the result is the same in terms of percentage of heads to tails as it was for 20 tosses: 70% heads and 30% tails. Should you still attribute the result to random chance? Almost definitely not. If we look at a binomial table, we will see that the probability of getting 70 or more or 30 or fewer heads is just 0.0000785, or a 0.00785% chance. The larger the sample, the more likely it is that a given proportional difference between your sample and the population (or expected outcome) is statistically significant. For 100 coin flips, we should be suspicious if the result is outside the range of 40 to 60 or so (because about 95% of the outcomes on a fair coin toss should occur in this region).
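The effect of sample size can be seen by running the same two-tailed calculation for 20 and for 100 tosses. The sketch below is for a fair coin only and assumes Python 3.8 or later; the function names are hypothetical and chosen for this illustration.

```python
from math import comb


def two_tailed_p(heads: int, n: int) -> float:
    """Probability of a result at least as far from n/2 as `heads` (fair coin)."""
    distance = abs(heads - n / 2)
    extreme = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= distance)
    return extreme / 2**n


def prob_between(low: int, high: int, n: int) -> float:
    """P(low <= X <= high) for n tosses of a fair coin."""
    return sum(comb(n, i) for i in range(low, high + 1)) / 2**n


p_20 = two_tailed_p(14, 20)    # 70% heads in 20 tosses
p_100 = two_tailed_p(70, 100)  # 70% heads in 100 tosses
middle_100 = prob_between(40, 60, 100)

print(f"20 tosses:  {p_20:.4f}")
print(f"100 tosses: {p_100:.7f}")
print(f"40 to 60 heads in 100 tosses: {middle_100:.3f}")
```

The same 70% share of heads that was unremarkable in 20 tosses becomes extremely improbable in 100 tosses, while the range of 40 to 60 heads covers roughly 95% of the outcomes expected from a fair coin.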

Figure 8.5

Inferential Statistics and Hypothesis Testing

The question Is this result due to chance or is it not? is in the realm of inferential statistics. Whereas descriptive statistics involves describing samples and populations, inferential statistics involves making inferences about populations based on samples, comparing samples, and inferring the relationships between variables. For example, we want to know how likely it is that our sample represents the population or that two samples are different from each other. The Betting Dilemma serves to illustrate many aspects of the realm of hypothesis testing, and we will refer back to it in the following discussion.

The Null and Alternative Hypotheses

In the Betting Dilemma, we were trying to determine whether the outcome of 14 heads was a result of random chance or not. We were testing the hypothesis: The outcome of 14 heads in 20 coin flips was a result of random chance. The whole point of the exercise was to decide whether we should accept this hypothesis or reject it in favor of another hypothesis: The outcome of 14 heads in 20 coin flips was NOT a result of random chance. We refer to the initial hypothesis as the null hypothesis (abbreviated H0) and the rejection of the null hypothesis as the alternative hypothesis (abbreviated Ha). In statistics, the null hypothesis is generally a statement indicating that things are status quo, there is nothing happening, or there is not a difference. The alternative hypothesis is simply the opposite of the null hypothesis: Something is happening or there is a difference.

Table 8.4
Null Hypothesis	Alternative Hypothesis
My sample of 30 wild lions represents the entire population of lions in the wild.	My sample of 30 wild lions does not represent the entire population of lions in the wild.
The mean weight of my sample of 30 wild lions is the same as the mean weight of all lions in the wild.	The mean weight of my sample of 30 wild lions is not the same as the mean weight of all lions in the wild.
There is no statistically significant difference between the weight of lions in the wild and lions in captivity.	There is a statistically significant difference between the weight of lions in the wild and lions in captivity.
There is no correlation between lion diet and lion weight.	There is a statistically significant correlation between lion diet and lion weight.

Note that the null hypothesis is not necessarily what you hope or think. You may hope that your sample of 30 wild lions represents the total population, but you don't know, and you need to test it. You may expect the mean weight of your sample to be almost the same as the mean weight of all lions, but you don't know. You may suspect (strongly) that there IS a correlation between lion diet and lion weight, but you still need to test the null hypothesis. In most cases, the null hypothesis is the opposite of your research hypothesis.

One-Tailed vs. Two-Tailed Tests

In the Betting Dilemma, we started out by calculating the probability that the result of 20 coin tosses would be 14 or more heads. Then we considered the probability that the result could also be 6 or fewer heads. We switched from a one-tailed to a two-tailed test. In a one-tailed test (also called a directional test) you are only concerned about one extreme of the data, or one direction. In a one-tailed test, the hypothesis specifies the direction of the difference between the sample and the population or between the two samples being compared. For example, if you were testing a drug for anxiety and your null hypothesis was the drug does not result in lower levels of anxiety, you would perform a one-tailed test (you are only concerned with whether or not patients are less anxious). If, however, your null hypothesis was the drug does not result in a difference in anxiety levels, you would perform a two-tailed test. (You are concerned with whether or not patients are less or more anxious than they are without the drug.) A two-tailed test (also called a non-directional test) allows you to test for a difference in both extremes or directions.

Statistical Significance

Hypothesis testing involves analysis of the data and statistics to determine whether we should accept or reject the null hypothesis. The process that we went through in the Betting Dilemma was a hypothesis test. In some of the sample hypotheses, we used the term "statistically significant."

In hypothesis testing we evaluate the null hypothesis by evaluating the statistical significance of certain statistics. If you conclude that a result is statistically significant, you are concluding that it is not a result of random chance but is due to a true effect or difference.

Statistical Significance: α Values

In the Betting Dilemma, we were trying to determine whether the result of 14 heads in 20 coin tosses was statistically significant. We decided that if the probability that the result was due to random chance was less than 0.05, we would consider the result to be statistically significant. This "cutoff probability" is sometimes known as the α value (alpha value) or significance level. In social sciences, 0.05 is the default α value, but in many cases 0.01 is used. This level corresponds to the percentage or chance of error we are willing to live with or consider acceptable in the interpretation of significance. A value of 0.05 means we are setting our maximum acceptable level of getting this result by random chance at 5%; thus we want to be at least 95% confident that the effect we find is true and not due to chance.

Statistical Significance: p Values

Once we have decided on an α value, we need to calculate the probability that the result occurred by random chance. This value is known as the p value. In the Betting Dilemma, the p value of 14 or more OR 6 or fewer heads was 0.116. We then compare the p value to the α value. If the p value is less than the α value, we conclude that the result is statistically significant and we reject the null hypothesis. If the p value is greater than the α value, we conclude that the result is NOT statistically significant and we retain the null hypothesis. In the Betting Dilemma, the p value was higher than the α value, so we kept our null hypothesis that a result of 14 heads was simply due to random chance. When we considered a result of 70 heads in 100 tosses, however, our p value was much lower than the α value, so we rejected the null hypothesis and accepted the alternative hypothesis (that a result of 70 heads is not due to random chance when tossing a fair coin 100 times).
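The comparison of a p value to an α value can be sketched in a few lines of Python. This is only an illustration under the assumption of a fair coin; the function names `upper_tail` and `two_tailed` are made up for this example. The exact two-tailed value for 14 heads in 20 tosses comes out near 0.115, which agrees with the 0.116 quoted above up to rounding.

```python
from math import comb

def upper_tail(heads, tosses):
    """P(X >= heads) for a fair coin tossed `tosses` times (one-tailed)."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

def two_tailed(heads, tosses):
    """P(X >= heads or X <= tosses - heads); assumes heads > tosses / 2.

    For a fair coin the two tails are mirror images, so we double one tail.
    """
    return 2 * upper_tail(heads, tosses)

alpha = 0.05                         # chosen before looking at the data
p_one = upper_tail(14, 20)           # 14 or more heads: about 0.0577
p_two = two_tailed(14, 20)           # 14 or more OR 6 or fewer: about 0.1153
p_big = two_tailed(70, 100)          # 70 or more OR 30 or fewer heads

for label, p in [("14 of 20", p_two), ("70 of 100", p_big)]:
    decision = "reject H0" if p < alpha else "retain H0"
    print(f"{label}: p = {p:.4f} -> {decision}")
```

Note that the one-tailed p value for 14 heads is half the two-tailed value, which is why the choice between a one-tailed and a two-tailed test should also be made before the data are examined.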

We should note that in the Betting Dilemma, we calculated the p value before determining what the α value should be. In actual hypothesis testing, it is important to determine the α value first. Otherwise, the choice of α value could be biased toward what we want the result of the hypothesis test to be. For example, if you really like your friend and don't want to conclude that he or she is cheating, you might choose an α value that is much lower than if you were looking for a reason to accuse him or her. An α value of 0.05 is the general standard used in statistical testing in the social and behavioral sciences, though a value of 0.01 may be used if we want to be more stringent in our analysis of the results.

Statistical Significance: Confidence Intervals

Another measure used to determine statistical significance is the confidence interval. Recall that the α value is the probability below which we will consider results NOT to be a result of random chance. The confidence interval, on the other hand, is the interval of values that we WILL accept as due to random chance. In the Betting Dilemma, we decided that any results between 6 and 14 heads inclusive would be considered as due to random chance and not statistically significant. Any results outside this confidence interval would be considered unlikely to be a result of random chance. We would consider results outside this confidence interval as statistically significant.

Recall from Lesson 7 that approximately 95% of data points in a normal distribution fall within about two standard deviations of the mean, or more precisely between data points with z scores of −1.96 and +1.96. If you use the confidence interval for your hypothesis test, you can use the raw score or the z score to determine whether the score is statistically significant or not. A z score of greater than 1.96 or less than −1.96 is outside the 95% confidence interval. A z score of greater than 2.58 or less than −2.58 is outside the 99% confidence interval.
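If you want to check these cutoffs yourself, the standard normal distribution in Python's `statistics` module can reproduce them. The sketch below is illustrative only; the helper names `z_cutoff` and `is_outside_interval` are invented for this example.

```python
from statistics import NormalDist

def z_cutoff(confidence):
    """Two-tailed critical z score for a confidence level such as 0.95."""
    alpha = 1 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

def is_outside_interval(z, confidence=0.95):
    """True if a z score falls outside the chosen confidence interval."""
    return abs(z) > z_cutoff(confidence)

z95 = z_cutoff(0.95)   # about 1.96
z99 = z_cutoff(0.99)   # about 2.58
print(round(z95, 2), round(z99, 2))
print(is_outside_interval(2.1), is_outside_interval(2.1, confidence=0.99))
```

A z score of 2.1 is outside the 95% interval but still inside the 99% interval, which shows how the conclusion depends on the confidence level chosen in advance.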

Type I Errors and Type II Errors

Whether something is statistically significant is somewhat subjective. As we noted in the Betting Dilemma, if we are worried about falsely accusing someone of cheating, we are more likely to choose a smaller α value or a larger confidence interval. But if we are more worried about letting a cheater get away, we would likely choose a larger α value and a smaller confidence interval.

Rejecting the null hypothesis when it is actually true is known as a Type I error (or α). Alternatively, accepting the null hypothesis when it is actually false is a Type II error (or β). In our example, choosing a smaller α value (0.01 instead of 0.05) or a larger confidence interval (4 to 16 heads instead of 6 to 14 heads) decreases our chances of a Type I error. (The α value actually is the probability of making a Type I error.) Choosing a greater α value and a tighter confidence interval decreases our chances of making a Type II error. Which error you are more concerned about depends on the test and the specific situation.

Table 8.5
	In Reality the. . .
Your Analysis Has Led You to . . .	Null Hypothesis, H0, is True	Null Hypothesis, H0, is False
Reject the Null Hypothesis, H0	Type I Error (α)	Correct
Accept the Null Hypothesis, H0	Correct	Type II Error (β)
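The trade-off between the two kinds of error can be made concrete with the coin example. The sketch below compares the 6 to 14 interval with the 4 to 16 interval for 20 tosses. To compute a Type II error rate we have to assume how biased a cheater's coin would be; the value of 0.7 used here is purely hypothetical, as are the function names.

```python
from math import comb

def prob_heads_between(low, high, tosses, p_heads):
    """P(low <= X <= high) when X is the number of heads in `tosses` tosses."""
    return sum(
        comb(tosses, k) * p_heads ** k * (1 - p_heads) ** (tosses - k)
        for k in range(low, high + 1)
    )

def error_rates(low, high, tosses=20, biased_p=0.7):
    """Return (Type I rate, Type II rate) for accepting H0 when low <= heads <= high.

    biased_p is a hypothetical probability of heads for an unfair coin.
    """
    # Type I: the coin is fair (H0 true) but the count lands outside the interval.
    type_1 = 1 - prob_heads_between(low, high, tosses, 0.5)
    # Type II: the coin is biased (H0 false) but the count lands inside the interval.
    type_2 = prob_heads_between(low, high, tosses, biased_p)
    return type_1, type_2

narrow = error_rates(6, 14)   # Type I rate is about 0.041
wide = error_rates(4, 16)     # Type I rate is about 0.003
print("6 to 14 heads:", narrow)
print("4 to 16 heads:", wide)
```

Widening the interval lowers the Type I error rate but raises the Type II error rate, which is the pattern summarized in Table 8.5.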

Statistical Power

We can also look at the accuracy of a hypothesis test in a positive way. Statistical power is the probability of correctly rejecting a null hypothesis (i.e., rejecting it when it is, in fact, false).

· The probability of making a Type II error (β) is thus (1 − statistical power).

· The statistical power is therefore (1 − β).

The goal for any hypothesis test is to maximize the statistical power of the test.

Factors Affecting Statistical Power

Calculating statistical power is beyond the scope of this lesson, but it is important to know that it is influenced by a number of factors, including α value, sample size, effect size (the actual difference between the sample and population or the two samples being compared), and whether the test is one-tailed or two-tailed.

· The greater the α value, the more statistical power the test has.

· The greater the sample size, the more statistical power the test has.

· The greater the effect size, the more statistical power the test has.

· Given the same α value, one-tailed tests are also more powerful than two-tailed tests.

Of all these, the sample size has the greatest effect.
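Although the full calculation is beyond the scope of this lesson, a simplified sketch can show the direction of each effect listed above. The code assumes a one-sample z test with a known standard deviation and a positive true effect expressed in standard-deviation units; the function name `z_test_power` and the baseline numbers (effect size 0.3, n = 30) are hypothetical choices for illustration, not values from this lesson.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def z_test_power(effect_size, n, alpha=0.05, two_tailed=True):
    """Approximate power of a one-sample z test (simplifying assumption).

    effect_size: true difference in standard-deviation units (assumed positive)
    n: sample size
    """
    shift = effect_size * sqrt(n)  # how far the true effect moves the test statistic
    if two_tailed:
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        upper = 1 - STD_NORMAL.cdf(z_crit - shift)   # reject in the upper tail
        lower = STD_NORMAL.cdf(-z_crit - shift)      # reject in the lower tail
        return upper + lower
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - shift)

baseline = z_test_power(0.3, 30)
larger_alpha = z_test_power(0.3, 30, alpha=0.10)
larger_sample = z_test_power(0.3, 60)
larger_effect = z_test_power(0.5, 30)
one_tailed = z_test_power(0.3, 30, two_tailed=False)

for name, power in [("baseline", baseline), ("alpha = 0.10", larger_alpha),
                    ("n = 60", larger_sample), ("effect = 0.5", larger_effect),
                    ("one-tailed", one_tailed)]:
    print(f"{name}: power = {power:.3f}")
```

Each change raises the power above the baseline, in line with the four bullet points above.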

Increasing Statistical Power

It is important to keep in mind that "statistical power" is a measure of the likelihood of rejecting the null hypothesis when it is false. Though it is nice to have a high statistical power, it is also important not to reject the null hypothesis when it is true (i.e., see a difference when there is none or conclude that a relationship exists when one does not).

· Having a large sample and using a one-tailed test where appropriate are the best ways to increase statistical power.

· Increasing α value increases chances of making a Type I error.

(Calculating β and statistical power can be complicated. If you are curious, it would be worth it to view Cengage's Statistical Power tutorial.)

Statistical Significance vs. Practical Significance

Suppose a drug company tests a weight loss drug. One group is given a pill once per day and a control group is given a placebo. The company collects and analyzes the data. They report that the drug significantly increases the rate of weight loss. You are suspicious, so you look closely at their study. The sample size is large: 500 patients in each group. The α value the researchers have chosen is small: 0.01. The p value of mean weight loss of the test group with respect to the control group is even smaller: 0.005. Unless the researchers made up the data, their study and their results seem sound: The study shows that the rate of weight loss is significantly greater for those on the pill than for those on the placebo.

But wait, what do we mean by "significant"? It may be true that the study shows statistical significance. But practically speaking, is the difference significant? To answer this question, you need to evaluate the actual value of the difference. Consider two cases: (A) the mean rate of weight loss for people on the drug is 1.25 pounds per week more than the mean rate for those on the placebo; (B) the difference in the means is 0.1 pound per week. Note that depending on the study, the results of both Case A and Case B could be statistically significant, but Case A has much more practical significance.

This practical significance can be evaluated quantitatively by calculating the effect size. The effect size is the difference between the means of the two groups divided by the standard deviation of the control group. In social sciences, researchers use this general rule of thumb when evaluating effect size:

· < 0.1 is considered a trivial effect, with no practical significance.

· 0.1–0.3 is considered a weak effect.

· 0.3–0.5 is considered a moderate effect.

· > 0.5 is considered a strong effect.
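As a rough illustration, the rule of thumb above can be applied to Case A and Case B. The lesson does not give the standard deviation of the control group, so the value of 2.0 pounds per week below is a made-up assumption, and the handling of values that fall exactly on a boundary (0.3 or 0.5) is also a choice made for this sketch.

```python
def effect_size(mean_difference, sd_control):
    """Difference between group means divided by the control group's standard deviation."""
    return mean_difference / sd_control

def describe_effect(d):
    """Label an effect size using the rule of thumb from this lesson."""
    size = abs(d)
    if size < 0.1:
        return "trivial"
    if size < 0.3:
        return "weak"
    if size <= 0.5:   # assumption: exactly 0.5 is still counted as moderate
        return "moderate"
    return "strong"

SD_CONTROL = 2.0  # hypothetical standard deviation, pounds per week

case_a = effect_size(1.25, SD_CONTROL)   # 0.625
case_b = effect_size(0.1, SD_CONTROL)    # 0.05
print("Case A:", case_a, describe_effect(case_a))
print("Case B:", case_b, describe_effect(case_b))
```

Under this assumed standard deviation, Case A would count as a strong effect and Case B as a trivial one, even though both could be statistically significant with 500 patients per group.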

Practical Significance Is Subjective

Practical significance is very subjective, and depends not only on the effect size, but also on other factors as well. For example, whether you think the weight loss drug has an effect that is practically significant depends on things like the cost of the drug and the side effects of the drug.

Note that the question of statistical versus practical significance can go the other way as well. Consider a hypothetical study on the use of a different type of seatbelt. The study might conclude that although the data show a lower injury and death rate among those involved in accidents with this new seatbelt as opposed to those with a conventional seatbelt, the difference is not statistically significant. But if you look at the effect size and you consider other factors like cost of the seatbelt and the situation (this belt could save your life), you might conclude that although the effect is not statistically significant, it is practically significant for you personally.

The moral of the story is: Statistical significance may or may not actually mean anything in the real world. As a "consumer" of statistical information, you need to be able to analyze the data and statistics to form your own conclusions.

Summary: Steps in Hypothesis Testing

1. Formulate the null hypothesis (H0) and the alternative hypothesis (Ha).

2. Determine whether the alternative hypothesis is directional (a one-tailed test) or non-directional (a two-tailed test).

3. Select the significance level, or α value, for your test.

4. Select a sample and collect data.

5. Calculate the sample statistic that corresponds to the population parameter (or to the statistic of the sample that you are comparing your sample to).

6. Calculate the p value: the probability of obtaining a sample statistic at least this far from the population parameter (or from the statistic of the other sample you are comparing it to) by random chance alone, assuming the null hypothesis is true.

7. Compare the p value to the α value.*

8. Determine the effect size.

9. Draw a conclusion about the null hypothesis: If the p value is greater than the α value, accept the null hypothesis. If the p value is less than the α value, reject the null hypothesis and accept the alternative hypothesis.

10. Report all of the results and all decisions and data that went into coming to your conclusion, including α value, p value, sample size, type of test, and effect size so that others can draw their own conclusions.

*If you are using a confidence interval instead of an α value, you do not need to calculate the p value of your statistic. You just need to determine whether or not the statistic in the null hypothesis falls within the confidence interval. If it does, accept the null hypothesis. If it does not, reject the null hypothesis.

Introduction

Introduction to Correlation and Regression

Source: U.S. NHTSA, DOT HS 810 780, US Dept. of Agriculture

Researchers plot one variable against another to analyze the correlation between the two. A strong correlation, however, does not necessarily mean that there is a cause and effect relationship.

If a student studies more hours, his or her exam grade is likely to increase. If the temperature goes up, then typically so do ice cream sales. A runner who increases training for a marathon is likely going to have a lower run time than his or her previous marathon time. If a sales employee continues to be rude to customers, then his or her sales are likely to decrease. These are examples of correlations where we are observing or expecting a relationship between two variables. In the first example, we expect study hours to be related to an exam grade. We cannot infer causality from correlations, however, because it is unclear what could be contributing to the relationship or which variable necessarily "causes" the other. Studying does not cause someone to have a higher grade, but it does contribute to it along with other factors (e.g., study techniques, previous exposure to material, intelligence, test anxiety levels, etc.). We should, instead, discuss correlations as relationships between variables in which there is a pattern to how they change.

There are three main things to keep in mind when dealing with correlations: magnitude of the relationship, direction of the relationship, and statistical significance of the relationship. The magnitude of the relationship is the actual correlation coefficient or number that represents the correlation, which can range from 0 to ±1. The closer to 0, the weaker the relationship, while the closer to 1 a correlation is (either positive or negative), the stronger the relationship.

The direction of the relationship is designated by the positive or negative sign that accompanies the correlation coefficient. A positive correlation means that the variables are going in the same direction; thus, as one increases, so does the other, and when one decreases, so does the other. A negative correlation means that the variables are going in opposite directions; thus, as one increases, the other decreases, and vice versa. As we can see, the positive and negative signs do not signify "good" or "bad," or anything about the quality of the correlation, only whether the variables change in the same corresponding direction (positive) or opposite directions (negative). In the previous examples provided, the first two are positive correlations due to an increase in the first variable corresponding to an increase in the second variable (i.e., studying and exam grade; temperature and ice cream sales). The last two examples are negative correlations due to an increase in the first variable corresponding to a decrease in the second variable (i.e., training time and race run time; rude behaviors and sales).

The last feature of correlations that is important to note is statistical significance. If a correlation is found to be statistically significant, then that means we are unlikely to have found a relationship of this magnitude and direction by chance and it is indeed a "true" relationship. This relates back to Lesson 8 on hypothesis testing, which can be illustrated with the following example.

Let's say you have the following research question in mind: Is there a relationship between job satisfaction and salary? The null hypothesis for this study would be the following: There is no relationship between job satisfaction and salary. You expect to prove this hypothesis wrong in your study, allowing you to accept your research/alternative hypothesis: There is a relationship between job satisfaction and salary. You have not stated what that relationship is, whether you expect an increase in job satisfaction to correspond with an increase in salary (positive correlation), or a decrease in salary (negative correlation), or whether one causes the other, only that there might be some relationship between the two variables.

In this lesson you will learn more about how to formulate and interpret correlational studies. Pay special attention to the focus in this lesson on a relationship and not on cause and effect.

Applying Correlation and Regression

Correlation and Regression in the Ashford Courses

You will need to understand correlations in a range of graduate courses, including those with a focus on psychological or organizational assessment and testing, measurement, research methods, and statistics. In these courses you will need to understand the focus of a correlational study and how to interpret the results. In addition, having knowledge and skills in this area will help you better understand the methods, results, and discussion sections of the research literature you may be asked to evaluate for various courses.

Correlation and Regression in Graduate Research

Correlations are a common statistical test, thus you may choose to use a correlational analysis to evaluate a relationship between variables of interest. It is important for you to know the focus of correlational studies, how to formulate hypotheses for this type of study, and how to interpret the results without inferring causality.

Correlation and Regression in the Professional World

In the professional world it is important to understand how correlational studies are used, when to use them, and how to analyze them in order to accurately portray data results to others. Because correlational studies are quite common in the business world, it is important that you are aware of the common mistakes made in interpreting correlations as causality. In addition, it is part of one's ethical practice as a professional to understand which of the two (correlation or causation) you are communicating to others, whether you are summarizing another's research data or your own findings.

Tutorial

Relationships between Variables: An Introduction

Even if you don't know the precise statistical meaning of the term, you have probably heard a lot about correlations. News stories about research studies often cite findings of correlations. You may have heard, for example, that researchers have found a correlation between the density of a person's amygdalae (groups of nuclei in the brain) and the number of friends that person has on Facebook, or that there is a correlation between obesity and breast cancer survival. It is important to be able to analyze this information, and to do so properly, you need to have some insight into how correlations are calculated and what they are really telling you.

There are four main questions to ask when evaluating a correlation:

1. How strong is the correlation? (Is there a strong relationship or a weak relationship between obesity and breast cancer survival?)

2. Is the correlation positive or negative? (Are denser amygdalae associated with more Facebook friends or fewer Facebook friends?)

3. Is the correlation statistically significant? (Could the results just be due to random chance or do they really indicate a relationship between the two variables?)

4. How can the correlation be used and what does the correlation really mean? (Can we predict the likelihood of breast cancer survival based on someone's obesity? Can we estimate the density of someone's amygdalae based on the number of Facebook friends they have?)

There are two main ways to quantify how closely related two variables are: calculating the correlation coefficient and performing a linear regression.

Types of Variables

Before we go on, it is useful to revisit the different types of variables from Lesson 2. Recall that variables can be qualitative (descriptive; for example, color or sex) or quantitative (having meaningful numerical values such as density or weight). Quantitative variables can be either discrete (whole numbers, such as number of friends) or continuous (for example, weight).

In descriptive statistics, we usually look at one variable at a time. For example, we might calculate the mean and standard deviation of the weights of the sampled population, or look at the z score of one individual's weight. With correlations, we look at two variables; together they form a coordinate pair. For example, if we wanted to find out if there is a relationship between passenger vehicle weight and fuel efficiency, we would pair the weight and fuel efficiency of every vehicle in the sample.

Table 9.1: Weight and Fuel Efficiency of a Sample of Passenger Vehicles
Weight (lbs)	Fuel Efficiency (miles per gallon)	Coordinate Pair (x, y)
2715	24	(2715, 24)
2570	28	(2570, 28)
2610	29	(2610, 29)
2750	38	(2750, 38)
3000	25	(3000, 25)
3410	22	(3410, 22)
3640	20	(3640, 20)
3700	26	(3700, 26)
3880	21	(3880, 21)
3900	18	(3900, 18)
4060	18	(4060, 18)
4710	15	(4710, 15)

Independent vs. Dependent Variables

When we talk about the relationship between two variables, we refer to each variable as either independent or dependent. The independent variable is the variable we are controlling: the variable we want to relate to something else. In some cases it is the variable that we THINK is the cause of variation in the dependent variable (though as we will see, that is not necessarily the case and should not be assumed). The dependent variable is the variable that we are measuring in relation to the value of the independent variable. In the case above, weight is the independent variable and fuel efficiency is the dependent variable.

It is important to note that we say that the dependent variable depends on the independent variable for the sake of the study being conducted, but keep in mind that this does not necessarily mean that the independent variable controls the dependent variable, or that it is the only factor that influences the dependent variable. When discussing correlations, the independent variable is sometimes referred to as the predictor variable, while the dependent is referred to as the outcome variable. In this lesson, we will only consider correlations between two quantitative variables.

Plotting Points on a Scatter Plot

When trying to determine the statistical relationship between two variables, it is generally useful to begin by plotting points on a scatter plot. This will give you an initial, general idea of how the variables relate to each other. The independent variable is always plotted on the x-axis while the dependent variable is plotted on the y-axis.

Figure 9.1

Scatter plot of Fuel Efficiency vs. Weight

In the scatter plot of vehicle fuel efficiency versus weight above, you can see that there is a general trend.

· The relationship looks linear: The points roughly define a straight line.

· The relationship between weight and fuel efficiency is negative: As weight increases, fuel efficiency decreases.

· One point (2750, 38) looks like it might be an outlier.

So by simply graphing the data, we can get a general idea of the relationship between weight and fuel efficiency, or any other variables we wish to plot.

Figure 9.2

This scatter plot shows a strong, positive relationship: y-values increase as x-values increase.

Figure 9.3

This scatter plot shows a strong, negative relationship: y-values decrease as x-values increase.

Figure 9.4

This scatter plot shows a strong positive, exponential relationship. A straight line would not fit well through these data, but a curved line would.

Figure 9.5

This scatter plot indicates no apparent relationship between the x and y variables.

Making Predictions

One of the main reasons for determining how two variables are related is so that you can use the value of the independent variable to predict the value of the dependent variable. To do this, we need to look at the relationship more quantitatively by determining the correlation coefficient. If the correlation coefficient that we calculate is statistically significant, we can then go on to determine the best fit line. For example, if the correlation coefficient is statistically significant, we can use the best fit line to estimate or predict the fuel efficiency of a vehicle based only on its weight.

Calculating the Correlation Coefficient

Exactly how strong is the correlation between fuel efficiency and weight? To answer this, we need to calculate the correlation coefficient. The correlation coefficient, r, is a numerical measure of the strength of association between the independent variable x and the dependent variable y. There are a number of ways to calculate the correlation coefficient. One formula is shown below:

$$r = \frac{1}{n-1} \cdot \frac{\sum xy - n\,\bar{x}\,\bar{y}}{s_x s_y}$$

where n = the number of data points, s is the standard deviation, Σ is the symbol for "sum of," $\bar{x}$ is the mean of x, and $\bar{y}$ is the mean of y.

If you have already calculated the z scores of your data points, you could use this equation instead:

$$r = \frac{\sum z_x z_y}{n-1}$$

If you think there is a linear relationship between x and y, then r is a measure of how strong the linear relationship is. (Again, these are not equations that you'll need to memorize, but it is useful to see what goes into the calculation.)

Again, let's look at the data on weight versus fuel efficiency. You'll see that we've already calculated means and standard deviations to make the calculation for r simpler.

Table 9.2: Weight and Fuel Efficiency of a Sample of Passenger Vehicles
	x (weight, lbs)	y (fuel efficiency, mi/gal)	xy
	2715	24	65160
	2570	28	71960
	2610	29	75690
	2750	38	104500
	3000	25	75000
	3410	22	75020
	3640	20	72800
	3700	26	96200
	3880	21	81480
	3900	18	70200
	4060	18	73080
	4710	15	70650
Sum	40945	284	931740
Mean	3412.0833	23.6667
Standard Deviation	683.7579	6.1987

We can then plug these numbers into the equation for r:

$$r = \frac{1}{12-1} \cdot \frac{931740 - 12\,(3412.0833)(23.6667)}{(683.7579)(6.1987)} \approx -0.80$$

Thus the correlation coefficient for fuel efficiency versus weight is r = –0.80.
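You can check this result with a short Python sketch that applies both formulas for r to the data in Table 9.2. The function names `correlation_raw` and `correlation_z` are invented for this illustration; `stdev` from the standard library is the sample standard deviation (dividing by n − 1), which matches the values in the table.

```python
from statistics import mean, stdev

# Data from Table 9.2
weights = [2715, 2570, 2610, 2750, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 28, 29, 38, 25, 22, 20, 26, 21, 18, 18, 15]

def correlation_raw(xs, ys):
    """r from the sum of xy, the two means, and the two standard deviations."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - n * mean(xs) * mean(ys)) / ((n - 1) * stdev(xs) * stdev(ys))

def correlation_z(xs, ys):
    """r as the sum of products of z scores, divided by n - 1."""
    n = len(xs)
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

r_raw = correlation_raw(weights, mpg)
r_z = correlation_z(weights, mpg)
print(round(r_raw, 2), round(r_z, 2))  # prints: -0.8 -0.8
```

Both formulas give the same value, which is expected because the z score version is an algebraic rearrangement of the first.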

What does this mean exactly? We need to evaluate the value and the sign of r.

Describing the Correlation: Value and Sign of the Correlation Coefficient

What does it mean that the correlation coefficient is –0.80?

Value

· The value of r is always between –1 and +1: –1 ≤ r ≤ 1.

· The closer the correlation coefficient, r, is to –1 or 1 (and the further it is from 0), the stronger the evidence of a significant linear relationship between x and y.

· The closer r is to –1 or 1, the more closely the data points fall relative to the best fit line.

· Values of r further from 0 indicate a stronger linear relationship between x and y.

· Values of r closer to 0 indicate a weaker linear relationship between x and y.

· If r = 0, there is absolutely no linear relationship between x and y (no correlation).

· If r = 1, there is perfect positive correlation. If r = –1, there is perfect negative correlation.

In both these cases, all of the original data points lie on a straight line. (The example of the cost of a plumbing job and the amount of time the job takes is an example of a perfect positive correlation. In the social sciences, a perfect correlation is extremely unlikely.)

· A correlation of greater than +0.8, or less than –0.8 is considered to be a strong correlation.

· Correlation of between –0.4 and +0.4 is considered to be a weak correlation.

Sign

· A positive value of r means that when x increases, y increases; also, when x decreases, y decreases (positive correlation). Both variables go in the same direction.

· A negative value of r means that when x increases, y decreases; also, when x decreases, y increases (negative correlation). The variables go in opposite directions.

· The sign of r is the same as the sign of the slope, b, of the best fit line.

So an r of –0.80 indicates that the correlation is quite strong and is negative, meaning that as one variable increases, the other decreases.

The Coefficient of Determination

Another statistic that you might encounter is the coefficient of determination, which is simply equal to the square of the correlation coefficient, r². The coefficient of determination is generally stated as the percentage of variation in y that is explained by variation in x. For example, r² for the fuel efficiency example is (–0.80)² = 0.64. We can say that 64% of the variation in fuel efficiency is explained by variation in weight. Or, alternatively, 36% of the variation in fuel efficiency of a vehicle is NOT explained by variation in weight.

Significance of the Correlation — Introduction

Once we calculate the correlation coefficient, we still need to ask ourselves whether the correlation is statistically significant. Recall that our sample is a sample, and does not include every member of the population. We need to determine whether the sample is really indicative of the population. Does a "strong" correlation definitely mean that the independent and dependent variables are related? Can a very weak correlation still provide important and significant information? The answer comes down to the value of the correlation coefficient and the number of individuals in the sample.

· The closer r is to 1 or –1, the more likely it is to be significant.

· The more individuals there are in the sample, the more likely it is to be significant.

Determining the Significance of the Correlation

To determine the significance of the correlation coefficient, we perform a hypothesis test, like we did for other variables in Lesson 8. In this case,

· The Null Hypothesis (H0): The population correlation coefficient IS NOT significantly different from 0. There IS NOT a significant linear relationship (correlation) between x and y in the population.

· Alternate Hypothesis (Ha): The population correlation coefficient IS significantly different from 0. There IS a significant linear relationship (correlation) between x and y in the population.

There are two methods to make the decision (both methods are equivalent and give the same result):

1. Method 1: Use the p-value

Statistics packages will give you the p-value for the correlation coefficient. If the value is less than the significance level you've determined (generally 0.05), then r is considered to be statistically significant.

In the case of fuel efficiency vs. weight, an r of –0.80 has a p-value of 0.0018, which is well below the cutoff.

2. Method 2: Use a table of critical values for the confidence level you choose. Table 9.3 shows critical values for a 95% confidence level, or a significance level of 0.05.

Table 9.3: Critical Values at 95% Confidence
Degrees of Freedom (n − 2)	Critical Values (+ and –) for 95% Confidence
1	0.997
2	0.95
3	0.878
4	0.811
5	0.754
6	0.707
7	0.666
8	0.632
9	0.602
10	0.576
11	0.555
12	0.532
13	0.514
14	0.497
15	0.482
20	0.423
25	0.381
30	0.349
40	0.304
50	0.273
60	0.25
70	0.232
80	0.217
90	0.205
100	0.195

Thus in the case of fuel efficiency vs. weight, n – 2 = 12 – 2 = 10. The critical value for r is 0.576. Any correlation coefficient greater than 0.576 or less than –0.576 is significant for a sample size of 12. Recall that the correlation coefficient we calculated is –0.80. This is less than –0.576, and thus is statistically significant.
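The two methods above can be sketched in Python for the fuel efficiency example. The lesson does not show how a statistics package obtains the p-value; the sketch assumes the standard approach of converting r to a t statistic with n − 2 degrees of freedom, t = r√(n − 2) / √(1 − r²), and then approximates the tail area numerically with Simpson's rule so that no extra libraries are needed. All function names here are invented for illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # area under the density between 0 and |t|
    return 2 * (0.5 - area)         # what remains in both tails

def correlation_t(r, n):
    """t statistic for testing H0: the population correlation is 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

r, n, alpha = -0.80, 12, 0.05

# Method 1: compare the p-value to the significance level.
t_stat = correlation_t(r, n)               # about -4.22
p_value = two_tailed_p(t_stat, n - 2)      # roughly 0.002
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant: {p_value < alpha}")

# Method 2: compare |r| to the critical value from Table 9.3 (10 degrees of freedom).
critical_r = 0.576
print(f"|r| = {abs(r):.2f} vs critical {critical_r}: significant: {abs(r) > critical_r}")
```

Both methods lead to the same decision, as the lesson states; in practice a statistics package would report the p-value directly.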

What Significance Means

What does significance mean?

· If r is statistically significant and the scatter plot shows a reasonable linear trend, then you can go on to perform a linear regression, or find a straight line that best fits the data. If r is significant, the line can be used to predict the value of y for values of x that are within the domain of observed x values.

· If r is not statistically significant OR if the scatter plot does not show a reasonable linear trend, there is no point in performing a linear regression. You would not be able to use the equation of the line to make predictions.
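As a preview of what using the line to make predictions looks like, here is a minimal sketch for the vehicle data. It assumes the ordinary least-squares line, the usual choice for a "best fit" line; the helper names `best_fit_line` and `predict` and the example weight of 3,500 lbs are illustrative only. The prediction helper refuses to extrapolate outside the observed weights, reflecting the restriction to the domain of observed x values.

```python
from statistics import mean

# Data from Table 9.1
weights = [2715, 2570, 2610, 2750, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 28, 29, 38, 25, 22, 20, 26, 21, 18, 18, 15]

def best_fit_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

def predict(x, xs, slope, intercept):
    """Predict y for x, but only inside the range of observed x values."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the observed range; do not extrapolate")
    return slope * x + intercept

slope, intercept = best_fit_line(weights, mpg)
estimate = predict(3500, weights, slope, intercept)
print(f"slope = {slope:.5f} mpg per lb, intercept = {intercept:.2f} mpg")
print(f"predicted fuel efficiency at 3500 lbs: {estimate:.1f} mpg")
```

The slope is negative, which agrees with the sign of r for these data.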

Significant Is Not the Same as Strong

Notice that there is a difference between a significant correlation and a strong correlation. A strong correlation is one for which r is close to 1 or –1, and one for which we can say that a large percentage of variation in the dependent variable is explained by variation in the independent variable.

A significant correlation, on the other hand, is simply one that is statistically significant. It may be significant and still be weak. Look at the table of critical values again. If the sample size is about 100 or more, even a weak correlation of 0.2 is "significant."

Similarly, a strong correlation might not be statistically significant. An r of 0.80 would generally be considered "strong," but it is not significant if there are 6 or fewer individuals in the sample. This distinction is important to keep in mind when reading or listening to reports about "significant" correlations. Don't assume that a correlation is strong just because it is significant, and vice versa.

Correlation Does Not Equal Causation

If the correlation coefficient for x and y is significant, you can conclude that there is a relationship between the two. What you cannot immediately infer is that x causes y or that x is the only factor influencing y. When there is a statistical correlation between two variables, there are a number of possible scenarios.

· x does, in fact, cause or influence y

We know from basic physics, for example, that it is harder to accelerate a more massive object than a less massive one. So it makes sense that weight does, in fact, influence fuel efficiency.

· It is actually y that causes or influences x

For any two variables, we consider one to be the independent variable and the other to be dependent. But our assumption about which is which may be wrong. For example, suppose we found a negative correlation between adult health and annual visits to a doctor's office. Rather than conclude immediately that visiting a doctor causes poor health, we should consider that the opposite is more likely to be the case: Poor health results in more visits to the doctor.

· x affects y and y affects x

In many instances, particularly in science, factors influence each other. Air temperature, for example, influences rainfall, and rainfall influences air temperature. Among animals, the number of calories consumed per day influences weight, and weight influences the number of calories consumed per day.

· Both x and y are influenced by a third factor

For example, suppose we find a significant correlation between ice cream sales and drownings. Does that mean that ice cream sales cause drownings, or that drownings influence ice cream sales? Probably not. It is more likely that both are influenced by a third factor: the weather.

Finding the Best Fit Line — Introduction

Look again at the scatter plot of vehicle fuel efficiency versus weight.

Figure 9.6

You could probably draw a straight line that describes the data fairly well. But to calculate the best fit line, we need to perform what's known as a linear regression.

Equation of a Line

Before we go on, let's review the mathematical equation for a straight line. All straight lines can be defined by an equation in the form of

y = a + bx,

where

· y is the value of the dependent variable,

· x is the value of the independent variable,

· a is the y-intercept, or the value of the dependent variable when x = 0,

· b is the slope of the line, or the change in y for a given change in x.

Figure 9.7

For example, suppose a plumber charges $110 per visit, plus $50 per hour, and we want to figure out the total labor cost of a visit based on the amount of time the job takes.

· The amount of time the job takes is the independent variable, x.

· The total cost of the job is the dependent variable, y.

· As soon as the plumber comes, there is a $110 charge, even if the job takes no time. So when x = 0, y = $110. $110 is a, the y-intercept.

· The cost per hour is $50, so for every change in time of 1 hour, the change in cost is $50. Therefore $50 per hour is b, the slope of the line.

Thus, the cost of a visit can be defined by the equation y = 110 + 50x.

Figure 9.8
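As a quick check of the plumber example, the equation y = 110 + 50x can be written as a small Python function. The function and parameter names are invented for this sketch.

```python
def visit_cost(hours, call_out_fee=110.0, hourly_rate=50.0):
    """Total labor cost y = a + b*x for a job lasting `hours` hours.

    call_out_fee is a (the y-intercept) and hourly_rate is b (the slope).
    """
    return call_out_fee + hourly_rate * hours


for hours in (0, 1, 2.5):
    print(f"{hours} h -> ${visit_cost(hours):.2f}")
```

A job that takes no time still costs the $110 intercept, and each additional hour adds the $50 slope.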

(Note that the equation of a straight line is often given as y = mx + b, where m is the slope and b is the y-intercept. In this lesson, we will stick with y = a + bx.)

The best fit line through any set of data will be defined in the form of y = a + bx.

Finding the Best Fit Line — Linear Regression

The best fit line is also called the least-squares line, and it is calculated by performing a linear regression. This is almost always done using a calculator or computer program, so you don't need to memorize the mathematics involved, but it is important to have a conceptual understanding of it. (See the Appendix for instructions on how to do this in Microsoft Excel.)

Figure 9.9

Look again at the scatter plot of fuel efficiency versus weight above. Unless the data are perfectly correlated, if we draw a straight line through the data set, not all (perhaps none) of the points will actually lie on the line. For each value of x in the data set, there is a difference between the actual value of y and the value of y on the line. We call the value on the line the estimated or predicted value, ŷ (y hat).

The difference between y and ŷ is known as the error, or residual, abbreviated ε. You can calculate the residual for each value of x, square each residual (ε²), and then sum up the squared residuals for the entire data set (Σε²). This gives you the SSE, or sum of squared errors. In a least-squares regression, the computer program uses calculus to determine the line for which the SSE is a minimum.
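For readers who want to see where that minimization leads, the standard closed-form result is sketched below. The subscript i (indexing the data points) and the bars (denoting sample means) are notation introduced here, not taken from the lesson. Setting the partial derivatives of the SSE with respect to a and b to zero gives:

```latex
\begin{aligned}
\mathrm{SSE}(a, b) &= \sum_{i=1}^{n} \varepsilon_i^2
                    = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2 \\
b &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
\end{aligned}
```

You do not need to memorize these formulas. They are simply what the calculator or program evaluates behind the scenes.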

The least-squares line for the sample of vehicles is shown below.

Figure 9.10

Least-squares line (best fit line) for vehicle fuel efficiency versus weight. This is the line for which the sum of the squared differences between the actual values of y and the estimated values of y (ŷ) is at a minimum.

The equation of the best fit line is ŷ = 48.41 + (–0.007x), so

· a, the y-intercept, is 48.41

· b, the slope, is –0.007

As we noticed simply by plotting the points, the negative slope tells us that the direction of the correlation is negative. The value of the slope means that for every pound of weight added, we can expect the fuel efficiency to drop by 0.007 miles per gallon. Notice that the sign of the slope is the same as the sign of the correlation coefficient (r = –0.80). Both the sign of the slope and the sign of the correlation coefficient indicate that the correlation is negative.

Notice again that the y-intercept is 48.41 miles per gallon. Recall that the y-intercept is the value of y when x = 0. So, hypothetically speaking, this means that a car of zero weight would get 48.41 miles per gallon, or that 48.41 is the maximum mileage of a car. This is obviously ridiculous and serves to point out that predictions using a regression line should be made only within the domain of the original data set. That is, we can use the line to predict fuel efficiencies of vehicles between 2570 and 4710 pounds, but not under 2570 pounds or over 4710 pounds.

Outliers

Whenever we perform a linear regression or calculate a correlation coefficient, it is important to consider outliers. Outliers can skew the data and have a strong influence on the regression and other calculations. Recall that values more than two standard deviations above or below the mean are considered possible outliers. When you find a possible outlier, you will need to look at it carefully to figure out whether or not it should be thrown out of the data set.

In the case of fuel efficiency versus weight, the value (2750, 38) is an outlier. Further examination of the data set shows that this particular vehicle is a gas/electric hybrid, while all others are conventional gasoline-powered vehicles. In this case, it would be advisable to throw out this data point. If we do this, the new line of best fit is ŷ = 42.4 – 0.0058x, the r value is –0.885, and r² is 0.78. Excluding the outlier, we can say that 78% of the variation in fuel efficiency is explained by variation in weight.

Figure 9.12
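The figures for the data set without the outlier can be reproduced with a short Python sketch. It uses the usual closed-form least-squares formulas, which are an assumption about how the fit is computed. The lesson itself only says a program does the work. The 11 (weight, fuel efficiency) pairs are taken from the table in the Standard Error of the Estimate section, and the function name is invented for this example.

```python
import math

# Weight (lb) and fuel efficiency (mi/gal) for the 11 conventional vehicles,
# i.e. the data set with the hybrid outlier removed.
weights = [2715, 2570, 2610, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 28, 29, 25, 22, 20, 26, 21, 18, 18, 15]


def least_squares(xs, ys):
    """Return (a, b, r): intercept, slope, and correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


a, b, r = least_squares(weights, mpg)
print(f"y-hat = {a:.1f} + ({b:.4f})x, r = {r:.3f}, r^2 = {r * r:.2f}")
```

Small differences in the last decimal place relative to the values quoted above are due to rounding.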

Using a Linear Regression

If you have determined that a correlation coefficient is statistically significant, you can then use the regression equation to predict values.

For example, based on the analysis of fuel efficiency versus weight, if you decide to buy a car that weighs 3200 pounds, what can you expect its fuel efficiency to be? To answer this question, all you need to do is plug 3200 pounds into the regression equation:

ŷ = 42.4 – 0.0058x

estimated fuel efficiency (mi/gal) = 42.4 – 0.0058 (weight in pounds)

= 42.4 – 0.0058 (3200 pounds)

= 23.84 miles per gallon

= about 24 miles per gallon
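The same calculation can be wrapped in a small Python function. This sketch also builds in the earlier caution about predicting only within the observed range of weights (2570 to 4710 pounds). The constant and function names are invented for the example.

```python
INTERCEPT = 42.4   # a, in miles per gallon
SLOPE = -0.0058    # b, in miles per gallon per pound
MIN_WEIGHT, MAX_WEIGHT = 2570, 4710  # observed range of x, in pounds


def predict_mpg(weight_lb):
    """Estimate fuel efficiency from weight, refusing to extrapolate."""
    if not MIN_WEIGHT <= weight_lb <= MAX_WEIGHT:
        raise ValueError(f"{weight_lb} lb is outside the observed range")
    return INTERCEPT + SLOPE * weight_lb


print(round(predict_mpg(3200), 2))  # 23.84
```

Raising an error for weights outside the data set is one design choice among several. A warning would also work, but a hard stop makes accidental extrapolation harder.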

Standard Error of the Estimate

We used the regression equation to estimate the fuel efficiency of a 3200-pound car. But as we know, the regression line does not necessarily go through any actual data points. If we were to plug the weight of a car for which we know the fuel efficiency into the equation, the estimated efficiency is likely to be different from the actual efficiency. The red squares on the graph below show the estimated values of y, while the blue diamonds show the actual values of y for a given value of x.

Figure 9.13

How different are the estimated y-values from the actual y-values? We can calculate the standard error of the estimate (SEE) to figure this out:

$$S_{y,x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$

where S_{y,x} is the standard error of the estimate, y is the actual y-value, ŷ is the value estimated using the regression equation, and n is the sample size.

For example, to calculate the standard error of the estimate for fuel efficiency of a vehicle based on its weight, first calculate the estimated values of y, the differences between the actual and estimated values, and the squares of the differences, as shown in the table below.

x (weight, lb)	y (fuel efficiency, mi/gal)	ŷ (estimated y) = 42.4 − 0.0058x	y − ŷ	(y − ŷ)²
2715	24	26.67	−2.67	7.1289
2570	28	27.511	0.489	0.239121
2610	29	27.279	1.721	2.961841
3000	25	25.017	−0.017	0.000289
3410	22	22.639	−0.639	0.408321
3640	20	21.305	−1.305	1.703025
3700	26	20.957	5.043	25.431849
3880	21	19.913	1.087	1.181569
3900	18	19.797	−1.797	3.229209
4060	18	18.869	−0.869	0.755161
4710	15	15.099	−0.099	0.009801
			Sum: ∑(y − ŷ)²	43.0491

Then plug the sum into the equation:

$$S_{y,x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{43.0491}{9}} = \sqrt{4.7832} = 2.187$$

The standard error of the estimate is similar to the standard deviation from the mean. It gives us some idea of what sort of variability we can expect in the actual values. For a car with a weight of 3200 pounds, we can say that its fuel efficiency should be 23.84 miles per gallon plus or minus 2.19 miles per gallon (1 SEE) about 68% of the time, 23.84 plus or minus 4.37 miles per gallon (2 SEE) about 95% of the time, or 23.84 plus or minus 6.56 miles per gallon (3 SEE) about 99.7% of the time. The SEE gives us another indication of how dispersed the true values are around the regression line.

The lower the standard error of the estimate, the better the regression line fits the actual data and the better the predictions that can be made using the regression equation.
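The SEE calculation above can be checked with a few lines of Python. This sketch takes the actual and estimated values straight from the table rather than recomputing ŷ, so it reproduces the table's sum of squared differences. The function name is invented for the example.

```python
import math

# Actual fuel efficiency (y) and estimated values (y-hat) from the table above.
actual = [24, 28, 29, 25, 22, 20, 26, 21, 18, 18, 15]
estimated = [26.67, 27.511, 27.279, 25.017, 22.639, 21.305,
             20.957, 19.913, 19.797, 18.869, 15.099]


def standard_error_of_estimate(ys, y_hats):
    """SEE = sqrt(sum of squared residuals / (n - 2))."""
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    return math.sqrt(sse / (len(ys) - 2))


see = standard_error_of_estimate(actual, estimated)
prediction = 23.84  # estimate for a 3200-pound car

# Ranges of 1, 2, and 3 SEE around the prediction.
for k in (1, 2, 3):
    low, high = prediction - k * see, prediction + k * see
    print(f"{k} SEE: {low:.2f} to {high:.2f} mi/gal")
```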

Summary

Steps to determine whether and how two variables are related linearly:

1. Collect data and organize them as (x, y) coordinate pairs.

2. Plot data on a scatter plot.

3. Perform an initial evaluation of the data:

· Is there a trend?

· If so, does it appear to be linear?

· What is the direction of the trend?

· Does it look strong or weak?

· Do there seem to be any outliers?

4. Calculate the correlation coefficient, r, and the coefficient of determination, r².

5. Determine whether or not the correlation is statistically significant.

6. If it is significant and there appears to be a linear trend, perform a linear regression.

7. Determine the regression equation.

8. Plot the regression line.

9. Calculate the standard error of the estimate.

10. Use the regression equation to make predictions.

11. Evaluate the correlation. What does it really mean?

Appendix: Using Microsoft Excel to Perform Linear Regressions and Calculate Correlation Coefficients

All statistics programs will allow you to calculate regression equations and correlation coefficients with ease. If you don't have a statistics program or are not sure how to use it, you can use Excel. There are a number of different ways to do this, but the following is one of the more straightforward. For other methods, go to the help menu and search for "correlation" or "regression."

1. Open an Excel Worksheet.

2. Type or copy your data onto the worksheet. It is easiest if you place the independent variable in the first column and the dependent variable in the column to the right.

3. Select the data that you want to plot.

4. Select Insert > Scatter plot. A scatter plot should appear.

5. Click on one of the points on the graph. Then right click and choose "add trendline."

6. A dialog box will appear. Choose the type of trend line you want (linear in the case of this lesson). Also click on the boxes to display the equation and the r² value on the chart.

7. The regression line, regression equation, and r² value will appear. Take the square root of r² to find the magnitude of r; r has the same sign as the slope of the regression line.

10	37	17	8	13	27	12	24	14	8
15	44	16	40	15	18	21	22	22	25
11	25	33	44	27	17	29	29	31	32
33	4	34	26	35	3	16	26	44	34

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24
25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24

25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24

25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44

3	4	8	8	10	11	12	13	14	15
15	16	16	17	17	18	21	22	22	24
25	25	26	26	27	27	29	29	31	32
33	33	34	34	35	37	40	44	44	44
