Cross-Tabulation & Z-test <Statistics>


LEARNING OUTCOMES

Know what descriptive statistics are and why they are used

Create and interpret tabulation tables

Use cross-tabulations to display relationships

Perform basic data transformations

Understand the basics of testing hypotheses using inferential statistics

Z test


The Nature of Descriptive Analysis

  • Descriptive Analysis

The elementary transformation of raw data in a way that describes the basic characteristics such as central tendency, distribution, and variability.

  • Histogram

A graphical way of showing a frequency distribution in which the height of a bar corresponds to the observed frequency of the category.
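As a minimal sketch of descriptive analysis, the Python snippet below summarizes a small set of ratings and prints a text histogram. The ratings are hypothetical values invented for illustration; they do not come from the slides.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Hypothetical 1-5 satisfaction ratings, invented for illustration only.
ratings = [3, 4, 2, 5, 3, 3, 4, 1, 3, 4, 5, 3]

# Central tendency (mean, median, mode) and variability (standard deviation).
summary = {
    "mean": round(mean(ratings), 2),
    "median": median(ratings),
    "mode": mode(ratings),
    "std_dev": round(stdev(ratings), 2),
}
print(summary)

# Frequency distribution drawn as a text histogram:
# the length of each bar equals the observed frequency of that category.
freq = Counter(ratings)
for value in sorted(freq):
    print(f"{value} | {'#' * freq[value]} ({freq[value]})")
```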


EXHIBIT 14.1 Levels of Scale Measurement and Suggested Descriptive Statistics


Cross-Tabulation

  • Cross-Tabulation

Addresses research questions involving relationships among multiple less-than-interval (nominal or ordinal) variables.

Results in a combined frequency table displaying one variable in rows and another variable in columns.

  • Contingency Table

A data matrix that displays the frequency of some combination of responses to multiple variables.

  • Marginals

Row and column totals in a contingency table, which are shown in its margins.


Cross-Tabulation Table

  • Did you watch the movie Into The Woods? Yes No
  • What’s your gender? Male Female

(Observed distribution)


          No   Yes   Total
Male      14     3      17
Female    15    17      32
Total     29    20      49
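One way to build this table from raw responses is sketched below in Python. The counts are the observed ones from the movie example; the variable names are illustrative assumptions.

```python
from collections import Counter

# Observed counts from the movie example, expanded to one
# (gender, watched) pair per respondent.
responses = (
    [("Male", "No")] * 14 + [("Male", "Yes")] * 3
    + [("Female", "No")] * 15 + [("Female", "Yes")] * 17
)

cells = Counter(responses)                     # joint frequencies
row_totals = Counter(g for g, _ in responses)  # marginals for gender
col_totals = Counter(w for _, w in responses)  # marginals for watched

rows, cols = ["Male", "Female"], ["No", "Yes"]
print(f"{'':8}{'No':>6}{'Yes':>6}{'Total':>7}")
for r in rows:
    counts = "".join(f"{cells[(r, c)]:>6}" for c in cols)
    print(f"{r:8}{counts}{row_totals[r]:>7}")
totals = "".join(f"{col_totals[c]:>6}" for c in cols)
print(f"{'Total':8}{totals}{len(responses):>7}")
```

The row and column totals printed in the margins are the marginals defined earlier.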

Cross-tab: Project Assignment

  • Thirty respondents were asked whether they have access to the 4G network and whether they have used mobile banking services.
  • The results showed that 11 people do not have access to 4G and have not used mobile banking, 4 people have access to 4G but have not used mobile banking, 12 people have access to 4G and have used mobile banking, and 3 people do not have access to 4G but have used mobile banking (using a friend’s smartphone).
  • Present the results in a cross-tabulation table in Project Assignment.


Cross-Tabulation Table

  • Convert frequency table to percentage table.

Statistical base – the number of respondents or observations (in a row or column) used as a basis for computing percentages.

  • What was the percentage of males who watched the movie?
  • What was the percentage of moviegoers who were male?


Cross-Tabulation Table

  • % of males who watched the movie.
  • % of moviegoers who were male.


          No    Yes             Total (base)
Male            3/17 = 17.6%
Female
Total

               No    Yes           Total
Male                 3/20 = 15%
Female
Total (base)

Compare these two tables. Which one does a better job of displaying the relationship between gender and moviegoing, i.e., whether a person’s gender affects whether they watched the movie?


Cross-Tabulation Table

  • Percentages are computed in the direction of the “independent” variable, e.g., gender.


          No              Yes              Total
Male      14/29 = 48%     3/20 = 15%
Female    15/29 = 52%     17/20 = 85%
Total     29/29 = 100%    20/20 = 100%

Note that, as you learned in Ch. 9 (Experiments) and the project assignment, gender is NOT a true independent variable, because we cannot alter or manipulate participants’ gender in a marketing experiment. In data analysis, however, we treat gender as the independent variable in the sense that one’s gender has an effect on the DV. In this movie example, it makes sense to say that a person’s gender influences whether that person decides to watch the movie. It does NOT work the other way around: watching a movie cannot affect one’s gender.
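The choice of statistical base can be made concrete with a short Python sketch. It uses the observed counts from the movie example; the function names are illustrative assumptions, not part of any package. With gender totals as the base, 3/17 = 17.6% of males watched the movie; with column totals as the base, 3/20 = 15% of those who watched were male.

```python
# Observed counts from the movie example: counts[gender][watched].
counts = {
    "Male": {"No": 14, "Yes": 3},
    "Female": {"No": 15, "Yes": 17},
}

def row_percentages(table):
    """Percentages with each row total (here, each gender) as the base."""
    result = {}
    for row, cells in table.items():
        base = sum(cells.values())
        result[row] = {col: round(100 * n / base, 1) for col, n in cells.items()}
    return result

def column_percentages(table):
    """Percentages with each column total (here, each answer) as the base."""
    cols = next(iter(table.values())).keys()
    bases = {c: sum(cells[c] for cells in table.values()) for c in cols}
    return {
        row: {c: round(100 * n / bases[c], 1) for c, n in cells.items()}
        for row, cells in table.items()
    }

by_gender = row_percentages(counts)
by_answer = column_percentages(counts)
print(by_gender["Male"]["Yes"])   # share of males who watched the movie
print(by_answer["Male"]["Yes"])   # share of those who watched who were male
```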


  • What would be appropriate “independent” variable and dependent variable?
  • Convert the 4G x Mobile Banking cross-tab into a percentage table.


Cross-tab: Project Assignment

Now, are you ready to convert the 4G x Mobile Banking table to a percentage table? Follow the instructions above.


Data Transformation

  • Data Transformation

Process of changing the data from their original form to a format suitable for performing a data analysis addressing research objectives.

Recoding

Creating summated scales

Collapsing adjacent categories

Creating index numbers, e.g., consumer price index (CPI)


CPI: a measure of changes in the prices paid by urban consumers for a representative basket of goods and services.
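The four transformations listed above might look like this in Python. All data, item names, and category cut-offs here are hypothetical and chosen only to illustrate each step.

```python
# Hypothetical survey record, invented for illustration only.
# q1-q3 are 1-5 agreement items; q3 is assumed to be negatively worded.
respondent = {"q1": 4, "q2": 5, "q3": 2, "age": 37}

# Recoding: reverse-score the negatively worded item on a 1-5 scale.
q3_recoded = 6 - respondent["q3"]

# Summated scale: add the items that measure the same construct.
attitude_score = respondent["q1"] + respondent["q2"] + q3_recoded

# Collapsing adjacent categories: fold exact age into broader groups.
def age_group(age):
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50 and over"

# Index number: express a value relative to a base period set to 100.
def index_number(value, base_value):
    return 100 * value / base_value

basket_prices = {"base_year": 200.0, "this_year": 212.0}  # hypothetical prices
price_index = index_number(basket_prices["this_year"], basket_prices["base_year"])

print(q3_recoded, attitude_score, age_group(respondent["age"]), price_index)
```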

Computer Programs for Analysis

  • Statistical Packages

Spreadsheets

Excel

Statistical software:

SPSS (Statistical Package for the Social Sciences)

PASW (Predictive Analytics Software)

SAS

MINITAB


Hypothesis Testing Using Basic Statistics

  • Univariate Statistical Analysis

Tests of hypotheses involving only one variable.

  • Bivariate Statistical Analysis

Tests of hypotheses involving two variables.

E.g., t-test, ANOVA, correlation

  • Multivariate Statistical Analysis

Statistical analysis involving three or more variables or sets of variables.

E.g., Multiple regression


Hypothesis Testing Procedure

  • The specifically stated hypothesis is derived from the research objectives.
  • A sample is obtained and the relevant variable is measured.
  • The measured sample value is compared to the value either stated explicitly or implied in the hypothesis.

If the value is consistent with the hypothesis, the hypothesis is supported.

If the value is not consistent with the hypothesis, the hypothesis is not supported.


All of these activities center on the hypothesis.

Null Hypothesis vs. Alternative Hypothesis

  • Null hypothesis (H0): A statement about a status quo (asserting that any change from what has been thought to be true will be due entirely to random sampling errors).

E.g., H0: µ = 100

  • Alternative hypothesis (H1): A statement indicating the opposite of the null hypothesis.

E.g., H1: µ ≠ 100


Hypothesis Testing (HT)

  • The purpose of HT is to determine which of the two hypotheses is correct.
  • Significance level: The critical probability in choosing between the null and alternative hypotheses.
  • α (Greek letter alpha) = .05
  • The probability level that is too low to warrant support of the null hypothesis.


Hypothesis Testing (HT)

  • p-value

Probability value, or the observed or computed significance level.

p-values are compared to significance levels to test hypotheses.

If p < .05, the null hypothesis is rejected and the alternative hypothesis is supported.
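A minimal Python sketch of this comparison follows. It assumes a two-tailed test whose test statistic follows the standard normal distribution; the z values in the loop are illustrative, and the function name is an assumption.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a z-statistic at least this extreme if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # significance level
for z in (1.0, 1.96, 3.0):   # illustrative z-statistics
    p = two_tailed_p_value(z)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z:.2f}, p = {p:.4f}: {decision}")
```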


Univariate Hypothesis Testing

  • Is the sample mean significantly different from the hypothesized population mean?

Is the sample a part of the population?

  • Population mean IQ: µ=100
  • Sample mean (e.g., SJSU) IQ: X̄=105
  • Is IQ score 105 statistically significantly different from IQ score 100?

“Well, Are They Satisfied or Not?”

  • Suppose Best Buy wants to know whether its customers were satisfied with their “Black Friday” shopping in Best Buy stores.
  • Unsatisfied 1 2 3 4 5 Satisfied
  • The average score of 225 shoppers is 3.3.
  • Is a satisfaction score of 3.3 good or bad?
  • Need to compare with other scores.


Step 1*: Stating Hypotheses

  • H0: µ=3.0 (customers were neither unsatisfied nor satisfied.)

  • H1: µ≠3.0 (customers were satisfied with their Black Friday shopping.)


Step 2: Deciding on Region of Rejection

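[Figure: Z-distribution centered at 0, drawn over a Z-score axis and a raw-score axis. Critical Z-scores: -1.96 and 1.96. The darkly shaded area in each tail shows the region of rejection (α/2=.025 in each tail, for an overall α=.05).]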
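The critical values ±1.96 can be recovered from the standard normal distribution. The Python sketch below assumes a two-tailed test at α=.05, so each tail holds .025.

```python
from statistics import NormalDist

alpha = 0.05             # significance level
tail_area = alpha / 2    # .025 in each tail for a two-tailed test

# Z-score that leaves tail_area in the upper tail of the standard normal.
upper = NormalDist().inv_cdf(1 - tail_area)
lower = -upper
print(round(lower, 2), round(upper, 2))
```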

HT: Best Buy Example

  • Sample size n=225
  • Sample mean X̄=3.3
  • Sample standard deviation S=1.5


Step 3*: Calculating z-statistic


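z-statistic: $Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{x}}} = \frac{3.3 - 3.0}{0.1} = 3.0$

Standard error of the mean: $S_{\bar{x}} = \frac{S}{\sqrt{n}} = \frac{1.5}{\sqrt{225}} = 0.1$

The standard error of the mean is the standard deviation of the sampling distribution.

(obs = observation, as opposed to the expected critical values)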


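Step 4: Comparing Z-Statistic to Critical Value

[Figure: Z-distribution with Z-scores marked at -1.96, 0, and 1.96. The observed z-statistic, 3.0, lies beyond the critical value 1.96, inside the region of rejection.]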

Step 5*: Making a Decision

  • Zobs=3.0 > Z.05=1.96
  • Therefore, p<.05. This means that, if H0 (µ=3.0) were true, the chance of observing a sample mean at least this far from 3.0 would be less than 5%.
  • Reject H0
  • This suggests that Best Buy customers were satisfied with their “Black Friday” shopping.


Hypothesis Testing Procedure: Z Test

Step 1: State hypotheses.

Null: H0: µ = (hypothesized value)

Alternative: H1: µ ≠ (hypothesized value)

Step 2: Decide on the region of rejection, i.e., find the critical value(s) for the significance level α=.05.

Z-distribution: -1.96 and 1.96

Step 3: Calculate the z-statistic.

Step 4: Compare the z-statistic with the critical values.

Step 5: Make a decision.

If the z-statistic falls within [-1.96, 1.96], fail to reject H0.

If the z-statistic falls outside [-1.96, 1.96], reject H0.
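The procedure above can be sketched as one short Python function, shown here with the Best Buy numbers (n=225, sample mean 3.3, S=1.5, H0: µ=3.0). The function name and signature are illustrative assumptions, and the default critical value assumes a two-tailed test at α=.05.

```python
from math import sqrt

def z_test(sample_mean, hypothesized_mean, sample_sd, n, critical_z=1.96):
    """Two-tailed z test of H0: mu = hypothesized_mean (alpha = .05 by default)."""
    standard_error = sample_sd / sqrt(n)                        # S / sqrt(n)
    z_obs = (sample_mean - hypothesized_mean) / standard_error  # z-statistic
    reject = abs(z_obs) > critical_z                            # outside [-1.96, 1.96]
    return standard_error, z_obs, reject

# Best Buy example from the slides.
se, z_obs, reject = z_test(3.3, 3.0, 1.5, 225)
print(round(se, 2), round(z_obs, 2), "reject H0" if reject else "fail to reject H0")
```

Returning the standard error and z-statistic alongside the decision makes it easy to check each step against a hand calculation.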


Z-test: Project Assignment

Based on the past 5 years of experience, a professor knows that the average number of hours his students spend on the final project is 15 (standard error of the mean = 0.9). To see whether the time his students spend on the project has decreased this semester, he randomly sampled 50 of his students and calculated their average hours as 14.

  • State an appropriate null hypothesis and alternative hypothesis.
  • Find the critical values at significance level α=.05.
  • Calculate the z-statistic.
  • Compare the z-statistic with the critical values.
  • Make a decision.

$Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{x}}}$

$S_{\bar{x}} = \frac{S}{\sqrt{n}}$