Central Tendency and Variability
MODULE 5 VISUAL DISPLAYS FOR CONTINUOUS VARIABLES
Visual Displays for Continuous Variables Introduction
Learning Objectives
· Evaluate visual displays of data for continuous variables.
A researcher conducted a study in which she observed students’ scores on an examination. One of the first steps in analyzing a sample of data is to examine the distribution of values for variables in the data set. The distribution of the data tells her about the frequency with which various values are observed. Distributions can be examined in visual displays such as tables and graphs. A good graph or table is informative and allows researchers to identify and communicate important characteristics of the data. Different approaches are taken for visually displaying categorical and continuous variables.
Identifying Continuous Variables
There are a number of ways of classifying variables. This skill builder focuses on continuous variables. Formally, a continuous variable is one that reflects an interval or ratio level of measurement. In addition, between any two values for the variable, there is another possible value. For example, consider scores of 1.0 and 1.1 on a continuous variable. Between 1.0 and 1.1, 1.01 is a possible value. Then between 1.0 and 1.01, 1.001 is a possible value. By using more and more decimal places, you can always find a value between any two values, so the number of possible values is infinite. Physical characteristics like height and weight are good examples of constructs that can be measured using continuous variables.
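The "always another value in between" idea can be checked with a short sketch. It assumes exact fractions from Python's standard library and a hypothetical helper named `value_between`; neither is part of the course material. The midpoint is only one of infinitely many values that could be chosen.

```python
from fractions import Fraction

def value_between(low, high):
    """Return the midpoint, which lies strictly between low and high."""
    return (low + high) / 2

# Start with the two scores from the example and keep narrowing the gap.
low, high = Fraction("1.0"), Fraction("1.1")
found = []
for _ in range(5):
    mid = value_between(low, high)
    found.append(mid)
    high = mid  # look for a new value between low and the one just found

for value in found:
    print(float(value))
```

However small the gap becomes, the loop can keep going, which is what makes the set of possible values infinite.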
Some variables are not continuous according to the formal definition, but are amenable to the visual display methods that are typically used for continuous variables. For example, number of children in a family is not a continuous variable because only whole numbers (e.g., 1, 2, 3) are used to count children. It makes sense, however, to visually display the data for number of children the same way that you would typically display data for continuous variables. Review the definitions of the following terms to see subtle differences in kinds of variables: continuous, discrete, quantitative, qualitative, and categorical.
Learn by Doing
Consider a study in which the data set contains the following variables: Gender, marital status, years of marriage, and income. Which variables would be appropriately displayed using a method for continuous variables?
| Variable | Continuous | Not Continuous |
| Gender | | |
| Marital Status | | |
| Years of Marriage | | |
| Income | | |
In examining the distribution for a continuous variable, the interest is in how frequently various values are observed in the sample. There are a number of ways of presenting distributions: tables, line graphs, and histograms are quite common. You can choose how to display the data visually, but the choice should be guided by the principle of clearly communicating important characteristics of the data.
Histograms, Line Graphs, and Frequency Distributions
To illustrate how you can visually display data for continuous variables, return to the example of students’ exam scores and examine the process of creating a histogram for a set of data.
Exam Grades
Here are the exam grades of 15 students:
88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73
You first need to break the range of values into intervals (also called "bins" or "classes"). In this case, since the data set consists of exam scores, it makes sense to choose intervals that are 10 points wide, which typically corresponds to the range of a letter grade: 40-50, 50-60, ... 90-100. By counting how many of the 15 observations fall in each of the intervals, you get the following table:
| Score | Count |
| [40-50) | 1 |
| [50-60) | 2 |
| [60-70) | 4 |
| [70-80) | 5 |
| [80-90) | 2 |
| [90-100] | 1 |
Note:
The observation 60 was counted in the 60-70 interval. See the first comment in the Comments section below.
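The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure: the function name `count_by_interval` and its parameters are hypothetical. It assumes each interval includes its lower limit and excludes its upper limit, except the last interval, which also includes 100.

```python
WIDTH = 10
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73]

def count_by_interval(values, start=40, stop=100, width=WIDTH):
    """Count values in intervals [a, a + width); the last one also includes stop."""
    lower_limits = list(range(start, stop, width))  # 40, 50, ..., 90
    counts = {lower: 0 for lower in lower_limits}
    for value in values:
        if not start <= value <= stop:
            raise ValueError(f"{value} is outside {start}-{stop}")
        # Round down to the interval's lower limit; a score equal to
        # stop is assigned to the last interval.
        lower = min(start + (value - start) // width * width, lower_limits[-1])
        counts[lower] += 1
    return counts

counts = count_by_interval(scores)
for lower, count in counts.items():
    print(f"{lower}-{lower + WIDTH}: {count}")
```

Because every score is rounded down to exactly one lower limit, no observation can be counted twice, and the score of 60 lands in the 60-70 interval.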
To construct a histogram from the preceding table, write the intervals on the x-axis and show the number of observations in each interval (the frequency of the interval) on the y-axis. The number of observations is represented by the height of a rectangle located above the interval. Note that the bars touch each other because Score is a continuous variable.
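A rough text version of such a histogram can be sketched from the table counts. This is only an approximation of a real chart. The helper name `text_histogram` is hypothetical, and the bars run sideways, so the intervals appear down the page rather than along the x-axis.

```python
# Counts copied from the frequency table for the 15 exam scores.
table = [
    ("40-50", 1),
    ("50-60", 2),
    ("60-70", 4),
    ("70-80", 5),
    ("80-90", 2),
    ("90-100", 1),
]

def text_histogram(rows, symbol="#"):
    """Return one line per interval; the bar length equals the count."""
    label_width = max(len(label) for label, _ in rows)
    return [
        f"{label:<{label_width}} | {symbol * count} ({count})"
        for label, count in rows
    ]

lines = text_histogram(table)
print("\n".join(lines))
```

The longest bar belongs to the 70-80 interval, which matches the largest count in the table.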
Learn by Doing
Using the student exam score table and histogram above, what percentage of students earned less than a grade of 70 on the exam?
80%
93%
20%
47%
7%
Comments
1. Be sure that each observation is counted in only one interval. For the most part, it is clear which interval an observation falls in. However, in the example, the analyst needed to decide whether to include 60 in the interval 50-60 or the interval 60-70, and it was counted in the latter. This decision is captured by the notation used to write the intervals. In the table, the analyst wrote the intervals in a particular way: [40-50), [50-60), [60-70), and so on. The square bracket means "including" and the parenthesis means "not including." For example, [50-60) is the interval from 50 to 60, including 50 and not including 60; [60-70) is the interval from 60 to 70, including 60 and not including 70. Note that some researchers designate the intervals as 50-59, 60-69, and so on. The exact approach does not matter as long as you are consistent and each score is assigned to only one category.
2. When data are displayed in a histogram, some information is lost. By looking at the histogram, you can answer "How many students scored 70 or above?" (5 + 2 + 1 = 8). But you cannot answer "What was the lowest score?" All you can say is that the lowest score is somewhere between 40 and 50, and therefore you can approximate that it is around 45.
3. You could have chosen to break the data into intervals differently (for example: 45-50, 50-55, 55-60, etc.). To see how the choice of intervals (i.e., bins or classes) affects the histogram, you can experiment with different intervals and see how the appearance of the histogram changes.
4. Finally, note that the histogram can be replaced by a line graph that uses the same data. Two extra intervals are added, one at the beginning and one at the end, so that the graph touches the horizontal axis.
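The last two comments can be tried out with a short sketch. The function names `bin_counts` and `polygon_points` are hypothetical. The sketch also assumes that each point of the line graph is plotted at the midpoint of its interval, a common convention that the text does not state.

```python
scores = [88, 48, 60, 51, 57, 85, 69, 75, 97, 72, 71, 79, 65, 63, 73]

def bin_counts(values, start, stop, width):
    """Counts for intervals [a, a + width); the last one also includes stop."""
    n_bins = (stop - start) // width
    counts = [0] * n_bins
    for value in values:
        if not start <= value <= stop:
            raise ValueError(f"{value} is outside {start}-{stop}")
        index = min((value - start) // width, n_bins - 1)
        counts[index] += 1
    return counts

def polygon_points(counts, start, width):
    """(midpoint, count) pairs, padded with one empty interval on each side."""
    padded = [0] + counts + [0]
    first_midpoint = start - width / 2  # midpoint of the extra leading interval
    return [(first_midpoint + i * width, c) for i, c in enumerate(padded)]

wide = bin_counts(scores, 40, 100, 10)   # the 10-point intervals from the table
narrow = bin_counts(scores, 45, 100, 5)  # the 5-point alternative: 45-50, 50-55, ...
points = polygon_points(wide, 40, 10)

print(wide)
print(narrow)
print(points)
```

With the narrower intervals, two of the bins are empty, so the histogram looks more ragged even though the data are identical. The line graph points begin and end with a count of 0, which is why the graph touches the horizontal axis.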
Learn More About Histograms
An instructor asked her students how much time (to the nearest hour) they spent studying for the midterm. The data are displayed in the following histogram:
What do the numbers on the horizontal axis represent?
The count of students falling in each of the intervals
The values of the number of hours studied
Did I Get This
Thirty-two students were asked the number of servings of fruits and vegetables they eat daily. The results are displayed in the histogram below.