WEEK2 DB CJ Stat

© 2014 by Pearson Higher Education, Inc Upper Saddle River, New Jersey 07458 • All Rights Reserved

Fox/Levin/Forde, Elementary Statistics in Criminal Justice Research, 4e

Chapter 2: Organizing the Data

*

*

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

2.1 Create frequency distributions of nominal data

*

2.1

*

Introduction

  • Formulas and statistical techniques are used by researchers to:

Organize raw data

Test hypotheses

  • Raw data is often difficult to synthesize.

  • Frequency tables make raw data easier to understand.

*

2.1

*

Frequency Distributions of Nominal Data

Characteristics of a frequency distribution of nominal data:

  • Title
  • Consists of two columns:
  • Left column: characteristics (e.g., Response of Child)
  • Right column: frequency (f)

Responses of Men to Hypothesized Auto Theft

Response                       f
Physically Confront Thief     10
Verbally Confront Thief       25
Shout for Help                 5
Call the Police               10
Total                     N = 50
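
As a hedged illustration, the tally behind a table like this one can be sketched in Python. The raw list `responses` is hypothetical: it is rebuilt from the counts in the table rather than taken from any original data file, and the function name `frequency_distribution` is an assumption made for this sketch.

```python
from collections import Counter


def frequency_distribution(raw_values):
    """Return (category, f) pairs in first-seen order, plus the total N."""
    counts = Counter(raw_values)
    return list(counts.items()), sum(counts.values())


# Hypothetical raw data, rebuilt from the counts shown in the table.
responses = (
    ["Physically Confront Thief"] * 10
    + ["Verbally Confront Thief"] * 25
    + ["Shout for Help"] * 5
    + ["Call the Police"] * 10
)

table, n = frequency_distribution(responses)

# Left column: characteristic. Right column: frequency (f).
print(f"{'Response':<28}{'f':>3}")
for category, f in table:
    print(f"{category:<28}{f:>3}")
print(f"N = {n}")
```

Because the variable is nominal, the order of the rows carries no meaning; the sketch simply keeps the order in which categories first appear.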

*

2.1

*

Comparing Distributions

Comparing distributions clarifies results and adds information.

Response to Hypothetical Auto Theft by Gender

                                 Gender
Response                     Male   Female
Physically Confront Thief      10        3
Verbally Confront Thief        25       12
Shout for Help                  5       10
Call the Police                10       25
Total                          50       50

*

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

2.2 Calculate proportions, percentages and rates

*

2.2

*

Proportions, Percentages, and Rates

Proportions and percentages allow for a comparison of groups of different sizes.

Proportion – the number of cases in a category compared to the total size of the distribution

Percentage – the frequency of occurrence of a category per 100 cases

Rate – a comparison between the number of actual cases and the number of potential cases
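
The three measures can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function names are assumptions, the rate is expressed per 1,000 potential cases, and the burglary figures are invented purely for the example. The proportion and percentage inputs come from the auto theft table by gender earlier in the deck.

```python
def proportion(f, n):
    """P = f / N: cases in a category relative to the whole distribution."""
    return f / n


def percentage(f, n):
    """% = (100) f / N: frequency of a category per 100 cases."""
    return 100 * f / n


def rate(actual_cases, potential_cases, base=1000):
    """Rate = (base) * actual cases / potential cases."""
    return base * actual_cases / potential_cases


# From the auto theft table: 10 of 50 men and 25 of 50 women
# said they would call the police.
p_men = proportion(10, 50)
pct_men = percentage(10, 50)
pct_women = percentage(25, 50)

# Hypothetical figures, for illustration only:
# 500 reported burglaries in a town of 20,000 residents.
burglary_rate = rate(500, 20_000)

print(f"Proportion of men calling the police: {p_men:.2f}")
print(f"Percentage of men calling the police: {pct_men:.1f}%")
print(f"Percentage of women calling the police: {pct_women:.1f}%")
print(f"Burglaries per 1,000 residents: {burglary_rate:.1f}")
```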

*

Practice

*

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

2.3 Create simple and grouped frequency distributions

*

Simple Frequency Distribution

  • Simple frequency distribution: a basic tally with frequencies and percentages of the values in the distribution
  • Categories of nominal variables do not have to be listed in any particular order
  • Values of ordinal and interval variables represent the degree to which a particular characteristic is present
  • They MUST be listed in order, either lowest to highest or highest to lowest

*

Practice

*

Answer

*

2.3

*

Grouped Frequency Distribution of Interval Data

  • Interval-level variables are sometimes spread over a wide range

This makes a simple frequency distribution difficult to read

A grouped frequency distribution might be used instead

  • Grouping clarifies the presentation of interval-level scores spread over a wide range
  • Each category or group is a class interval

Class intervals are smaller categories or groups containing more than one score

Class interval size is determined by the number of score values it contains

*

Guidelines for Constructing Class Intervals

  • constructing class intervals is a special way of categorizing data
  • must consider the number of categories you wish to use
  • keep in mind the categories are being used to reveal patterns

too many or too few categories may blur the pattern

decision is based on the set of data and personal objectives

  • two basic guidelines
  • size of class interval should be a whole number
  • make the lowest score in a class interval some multiple of its size
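
The two guidelines above can be sketched in Python as follows. This is a rough illustration under stated assumptions: the scores are whole numbers, the data set is invented for the example, and the function name `grouped_frequency` is hypothetical. The choice of interval size is still left to the analyst.

```python
from collections import Counter


def grouped_frequency(scores, size):
    """Group whole-number scores into class intervals of a whole-number size.

    The lowest score in each interval is a multiple of `size`.
    Returns (low, high, f) tuples listed from the highest interval down.
    """
    if not isinstance(size, int) or size < 1:
        raise ValueError("class interval size should be a positive whole number")
    # Lower limit of the class interval that contains each score.
    counts = Counter((score // size) * size for score in scores)
    lows = range(min(counts), max(counts) + size, size)
    # Intervals with no cases are kept, with f = 0, so the scale has no gaps.
    return [(low, low + size - 1, counts[low]) for low in reversed(lows)]


# Hypothetical exam scores, invented for this sketch.
scores = [52, 55, 58, 61, 63, 64, 67, 70, 71, 71, 74, 78, 83, 85, 91]

for low, high, f in grouped_frequency(scores, size=5):
    print(f"{low}-{high}: {f}")
```

Rerunning the sketch with a different `size` (for example 10) is a quick way to see how too many or too few categories can blur the pattern.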

*

The Midpoint

  • The middlemost score value in a class interval

Calculated as the sum of the lowest and highest score values in a class interval divided by two
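
As a worked example, take a hypothetical class interval of 50–54 (an assumption for illustration, not a value from the slides). Its midpoint m would be:

```latex
m = \frac{\text{lowest score value} + \text{highest score value}}{2}
  = \frac{50 + 54}{2}
  = 52
```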

*

2.3

*

Cumulative Distributions

  • Sometimes it helps to present frequencies in a cumulative fashion

Cumulative Frequencies

  • Total number of cases having a given score or a score that is lower

Shown as cf

Obtained by adding the frequency in that category to the frequencies of all lower categories

Cumulative Percentage

  • Percentage of cases having a given score or a score that is lower
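
A minimal Python sketch of cumulative frequencies (cf) and cumulative percentages (c%) follows. The grouped scores are hypothetical, and the function name `cumulative_distribution` is an assumption for this example; the categories must be supplied from lowest to highest for the running total to be meaningful.

```python
def cumulative_distribution(intervals):
    """Add cf and c% to (label, f) pairs listed from lowest to highest.

    cf is the frequency of a category plus the frequencies of all
    lower categories; c% = (100) cf / N.
    """
    n = sum(f for _, f in intervals)
    rows, cf = [], 0
    for label, f in intervals:
        cf += f  # running total up to and including this category
        rows.append((label, f, cf, 100 * cf / n))
    return rows


# Hypothetical grouped scores, listed from the lowest interval up.
intervals = [("50-59", 3), ("60-69", 4), ("70-79", 6), ("80-89", 2), ("90-99", 1)]

rows = cumulative_distribution(intervals)
for label, f, cf, c_pct in rows:
    print(f"{label}: f={f:>2}  cf={cf:>2}  c%={c_pct:.2f}")
```

The cf of the highest category always equals N, and its c% is always 100, which makes a convenient check on the arithmetic.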

*

Flexible Class Intervals

  • frequency distributions won’t always make use of class intervals of equal size

*

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

2.4 Create cross-tabulations

*

  • Researchers often want to do more than just describe the distribution of some variable
  • They want to explain why some individuals fall at one end of the distribution while others fall at the opposite end
  • This means we need to expand tables into two or more dimensions

*

Cross-Tabs

  • A cross-tabulation (cross-tab) presents the distribution (frequencies and percentages) of one variable across the categories of one or more additional variables (usually the independent variable or variables)
  • Cross-tabs can be thought of as a series of frequency distributions attached together in one table
  • A cross-tab can be used to explore the differences between male and female victims in terms of their relationships to their killers (Table 2.15)

*

Marginal Distributions

  • The frequency distribution of each variable separately can be found along the margins of the cross-tab
  • These are the marginal distributions

The right margin of the table provides a frequency and percentage distribution for victim-offender relationship

Because the victim-offender relationship variable is placed along the rows, the frequencies and percentages for this variable form the row totals

  • The marginal distribution for sex is found in the bottom margin of the cross-tab

*

Percentages

  • Adding percentages can give fuller meaning to the cross-tab
  • In Table 2.16 we have total percents (total %)
  • These are obtained by dividing each frequency by the total number of cases

*

  • Row percents – divide the frequencies in each row by the number of cases in that row

These provide the distribution of the column variable for each value of the row variable

They represent the victim sex distribution within each type of victim-offender relationship (Table 2.17)

  • Column percents – divide each frequency by the number of cases in that column
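
The three kinds of percents can be sketched in plain Python. Tables 2.15 through 2.17 are not reproduced in these slides, so this illustration instead reuses the auto theft by gender table from earlier in the deck, treated as a cross-tab with response on the rows and gender on the columns. The function names are assumptions made for this sketch.

```python
# Cross-tab from the auto theft table: rows are responses, columns are gender.
columns = ["Male", "Female"]
crosstab = {
    "Physically Confront Thief": [10, 3],
    "Verbally Confront Thief": [25, 12],
    "Shout for Help": [5, 10],
    "Call the Police": [10, 25],
}

# Marginal totals: overall N and the N for each column.
n_total = sum(sum(row) for row in crosstab.values())
n_column = [sum(row[j] for row in crosstab.values()) for j in range(len(columns))]


def total_percents(row):
    """total% = (100) f / N_total for each cell in the row."""
    return [100 * f / n_total for f in row]


def row_percents(row):
    """row% = (100) f / N_row for each cell in the row."""
    return [100 * f / sum(row) for f in row]


def column_percents(row):
    """column% = (100) f / N_column for each cell in the row."""
    return [100 * f / n for f, n in zip(row, n_column)]


for label, row in crosstab.items():
    print(label)
    print("  total %: ", [round(p, 1) for p in total_percents(row)])
    print("  row %:   ", [round(p, 1) for p in row_percents(row)])
    print("  column %:", [round(p, 1) for p in column_percents(row)])
```

In this table gender sits on the columns, so if gender is taken as the independent variable, the column percents are the ones that compare men and women directly.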

*

2.4

*

Percents Within Cross-Tabulations

The choice comes down to which is more relevant to the purpose of the analysis.

  • If the independent variable is on the rows, use row percents.
  • If the independent variable is on the columns, use column percents.
  • If the independent variable is unclear, use whichever percent is most meaningful.

*

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

2.5 Distinguish between various forms of graphic presentations

*

  • readers often tune out columns of numbers
  • using graphs can sometimes provide a better representation of the data

*

Bar Graphs and Histograms

  • pie charts only allow for a few categories
  • bar graphs and histograms can accommodate any number of categories at any level of measurement

can display the effect of one variable on another

  • used more frequently in CJ research than pie charts
  • These follow a standard arrangement

x axis: score values or categories

y axis: provides the percentages or frequencies

*

  • The terms bar graph and histogram are often used interchangeably
  • Bar graphs – typically display the frequency or percentage distribution of a discrete variable, especially at the nominal level

They include space between the bars because of the lack of continuity between one category and another

  • Histograms – display continuous measurements

The bars are joined to emphasize the continuity of the points along a scale

*

Frequency Polygons

  • Frequency polygons tend to stress continuity along a scale rather than differences between categories
  • useful for depicting ordinal and interval data
  • frequencies (or %) are indicated by a series of points placed over the score values or midpoints of each class interval

*

The Shape of a Frequency Distribution

  • frequency polygons can help us visualize the variety of shapes and forms taken by frequency distributions
  • some are symmetrical

have same number of extreme score values in both directions

  • some are skewed

have more extreme cases in one direction than the other

Distributions can also differ in terms of peakedness (kurtosis)

  • very peaked – leptokurtic
  • very flat – platykurtic
  • neither – mesokurtic
  • normal curve has a mesokurtic symmetrical distribution

distributions can also be negatively (to the left) or positively skewed (to the right)

*

Line Chart

  • frequency polygon is a special type of line chart
  • line charts display changes in a variable or variables between groups or over time
  • the amount or rate of some variable is plotted and then these points are connected by line segments
  • useful for depicting patterns over time

*

Using the General Social Survey (GSS)

*

CHAPTER SUMMARY

2.1 Frequency distributions can be created to help researchers visualize distributions.

2.2 Proportions, percentages, and rates can be calculated as a way to describe data.

2.3 Simple frequency distributions can be created using data at any level of measurement, while interval-level data is needed to create a grouped frequency distribution.

2.4 Cross-tabulations can be created to illustrate the relationship between two variables.

2.5 Several forms of graphs can be used to demonstrate patterns and relationships between variables.

*

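Proportion: $P = \frac{f}{N}$

Percentage: $\% = (100)\frac{f}{N}$

Rate: $\text{Rate} = (1{,}000)\frac{f\ \text{actual cases}}{f\ \text{potential cases}}$

Midpoint: $m = \frac{\text{lowest score value} + \text{highest score value}}{2}$

Cumulative percentage: $c\% = (100)\frac{cf}{N}$

Total Percents: $\text{total}\% = (100)\frac{f}{N_{\text{total}}}$

Row Percents: $\text{row}\% = (100)\frac{f}{N_{\text{row}}}$

Column Percents: $\text{column}\% = (100)\frac{f}{N_{\text{column}}}$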