Central Tendency and Variability

wjm3774
CONTINUOUSVARIABLESMODULE5.docx

MODULE 5 VISUAL DISPLAYS FOR CONTINUOUS VARIABLES


Visual Displays for Continuous Variables Introduction

Learning Objectives

· Evaluate visual displays of data for continuous variables.

A researcher conducted a study in which she observed students’ scores on an examination. One of the first steps in analyzing a sample of data is to examine the distribution of values for variables in the data set. The distribution of the data tells her about the frequency with which various values are observed. Distributions can be examined in visual displays such as tables and graphs. A good graph or table is informative and allows researchers to identify and communicate important characteristics of the data. Different approaches are taken for visually displaying categorical and continuous variables.

Identifying Continuous Variables


There are a number of ways of classifying variables. This module focuses on continuous variables. Formally, a continuous variable is one that reflects an interval or ratio level of measurement. In addition, between any two values for the variable, there is another possible value. For example, consider scores of 1.0 and 1.1 on a continuous variable. Between 1.0 and 1.1, 1.01 is a possible value. Then between 1.0 and 1.01, 1.001 is a possible value. By using more and more decimal places, you can always find a value between any two values, so the number of possible values is infinite. Physical characteristics like height and weight are good examples of constructs that can be measured using continuous variables.
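The idea that another value always lies between any two values can be illustrated with a short sketch. It assumes Python's standard `fractions` module for exact arithmetic; the helper name `midpoint` is purely illustrative and is not part of the course material.

```python
from fractions import Fraction


def midpoint(a, b):
    """Return the value exactly halfway between a and b."""
    return (a + b) / 2


low, high = Fraction("1.0"), Fraction("1.1")
found = []
for _ in range(5):
    # Each midpoint is a new value strictly between low and the previous high.
    high = midpoint(low, high)
    found.append(high)
    print(float(high))
```

The loop could run indefinitely: each pass produces a value that is still larger than the lower bound, which is what makes the set of possible values infinite.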

Some variables are not continuous according to the formal definition, but are amenable to the visual display methods that are typically used for continuous variables. For example, number of children in a family is not a continuous variable because only whole numbers (e.g., 1, 2, 3) are used to count children. It makes sense, however, to visually display the data for number of children the same way that you would typically display data for continuous variables. Review the definitions of the following terms to see the subtle differences among kinds of variables: continuous, discrete, quantitative, qualitative, and categorical.

Learn by Doing


Consider a study in which the data set contains the following variables: Gender, marital status, years of marriage, and income. Which variables would be appropriately displayed using a method for continuous variables?

Classify each variable as Continuous or Not Continuous:

· Gender

· Marital Status

· Years of Marriage

· Income

In examining the distribution for a continuous variable, the interest is in how frequently various values are observed in the sample. There are a number of ways of presenting the distributions: tables, line graphs, and histograms are quite common. You have a choice with regard to how to visually display the data, but should be guided by the principle of clearly communicating important characteristics of the data.

Histograms, Line Graphs, and Frequency Distributions


To illustrate how you can visually display data for continuous variables, return to the example of students’ exam scores and examine the process of creating a histogram for a set of data.

Exam Grades

Here are the exam grades of 15 students:

88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73

You first need to break the range of values into intervals (also called "bins" or "classes"). In this case, since the data set consists of exam scores, it makes sense to choose intervals that correspond to the typical range of a letter grade, 10 points wide: 40-50, 50-60, ..., 90-100. Counting how many of the 15 observations fall in each interval gives the following table:

Score        Count
[40, 50)     1
[50, 60)     2
[60, 70)     4
[70, 80)     5
[80, 90)     2
[90, 100)    1

Note: The observation 60 was counted in the 60-70 interval. See the first comment in the Comments section below.
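The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure: the function name `count_by_interval` and its parameters are hypothetical, and it assumes each interval includes its lower bound and excludes its upper bound, so a score of 60 lands in the 60-70 interval.

```python
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73]


def count_by_interval(values, start, stop, width):
    """Count values in half-open intervals [lower, lower + width)."""
    counts = {}
    for lower in range(start, stop, width):
        counts[(lower, lower + width)] = 0
    for v in values:
        # Values outside [start, stop) are ignored in this sketch.
        if start <= v < stop:
            lower = start + ((v - start) // width) * width
            counts[(lower, lower + width)] += 1
    return counts


table = count_by_interval(scores, 40, 100, 10)
for (lower, upper), count in table.items():
    print(f"[{lower}, {upper})  {count}")
```

Because each score is assigned to exactly one interval, the counts add up to the 15 observations in the data set.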

To construct the following histogram from the preceding table, write the intervals on the x-axis, and show the number of observations in each interval (frequency of the interval) on the y-axis. The number of observations is represented by the height of a rectangle located above the interval. Also note that the bars touch each other because the Score variable is a continuous variable:

Figure: Histogram of the exam scores. The score intervals are on the x-axis, and the number of observations in each interval (the count, or frequency) is on the y-axis. The number of observations is represented by the height of the rectangle above each interval.

Learn by Doing


Using the student exam score table and histogram above, what percentage of students earned less than a grade of 70 on the exam?

80%

93%

20%

47%

7%

Comments

1. Be sure that each observation is counted in only one interval. For the most part, which interval an observation falls in is clear. In the example, however, the analyst needed to decide whether to include 60 in the interval 50-60 or in the interval 60-70, and it was counted in the latter. This decision is captured by the notation used to write the intervals. If you look back at the table, you'll see that the analyst wrote the intervals in a particular way: [40,50), [50,60), [60,70), and so on. The square bracket means "including" and the parenthesis means "not including." For example, [50,60) is the interval from 50 to 60, including 50 and not including 60; [60,70) is the interval from 60 to 70, including 60 and not including 70. Note that some researchers designate the intervals as 50-59, 60-69, and so on. The exact approach does not matter as long as you are consistent and each score is assigned to only one interval.

2. When data are displayed in a histogram, some information is lost. Note that by looking at the histogram, you can answer: "How many students scored 70 or above?" (5+2+1=8) But you cannot answer: "What was the lowest score?" All you can say is that the lowest score is somewhere between 40 and 50, and therefore, you can approximate that it is around 45.

3. You could have chosen to break the data into intervals differently (for example: 45-50, 50-55, 55-60, etc.). To see how the choice of intervals (i.e., bins or classes) affects the histogram, you can experiment with different intervals and see how the appearance of the histogram changes.

4. Finally, note that the histogram can be replaced by a line graph, as seen below. The line graph uses the same data used in the histogram. Note that two extra intervals are used, one at the beginning and one at the end, so that the graph touches the horizontal axis.

Figure: Line graph with the score intervals on the x-axis and the frequency of each interval (count) on the y-axis. The graph starts at a count of 0 for the interval 30-40, rises to its highest point (a count of 5) at 70-80, and falls back to a count of 0 at 100-110.
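Comments 3 and 4 can be explored with a small sketch. It is only an illustration under stated assumptions: the helper `bin_counts` is a hypothetical name, intervals include their lower bound and exclude their upper bound, and the text bars made of `#` characters stand in for a real plot.

```python
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73]


def bin_counts(values, start, stop, width):
    """Return (lower_bound, count) pairs for half-open intervals of the given width."""
    lowers = list(range(start, stop, width))
    counts = [0] * len(lowers)
    for v in values:
        if start <= v < stop:
            counts[(v - start) // width] += 1
    return list(zip(lowers, counts))


# Comment 3: the same data with 10-point and then 5-point intervals.
for width in (10, 5):
    print(f"Interval width {width}:")
    for lower, count in bin_counts(scores, 40, 100, width):
        print(f"  [{lower}, {lower + width})  {'#' * count}")

# Comment 4: points for the line graph, with one empty interval
# added at each end so the line touches the horizontal axis.
polygon = bin_counts(scores, 30, 110, 10)
print(polygon)
```

With 5-point intervals the bars are shorter and some intervals are empty, which shows how the choice of interval width changes the apparent shape of the distribution even though the data are unchanged.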


Learn More About Histograms

An instructor asked her students how much time (to the nearest hour) they spent studying for the midterm. The data are displayed in the following histogram:

A histogram with frequency on the vertical axis. The following data points are plotted: 0-2, 3; 2-4, 9; 6-8, 6; 8-10, 3; 10-12, 2; 14-16, 1; and 18-20, 1.

What do the numbers on the horizontal axis represent?

The count of students falling in each of the intervals

The values of the number of hours studied

 


Did I Get This


4.

Thirty-two students were asked the number of servings of fruits and vegetables they eat daily. The results are displayed in the histogram below.

Histogram with daily portions of fruits and vegetables, from 0 to 7, on the x-axis, and frequency, from 0-10, on the y-axis.

 


