Statistical Decision Techniques for Managers
MMG 525 STATISTICAL DECISION TECHNIQUES FOR MANAGERS
Mid-Term Exam Review
Prof. Andrew D. Banasiewicz, Ph.D.
High Level Typology of Business Data
Understanding Individual Variables
Nominal: Strictly labels, e.g., gender
Ordinal: Ordered categories, e.g., educational status
Interval: Ranges on a continuum; no absolute 0, e.g., preference
Ratio: Ranges on a continuum including absolute 0, e.g., age
Key Data Considerations
Reliability: Can the data yield dependable results? Can we rely on the resultant insights?
Validity: Are the available data accurate? Recent? Complete?
Availability: Do the required data exist? Can those data be accessed?
Sampling
❑ Due to practical and theoretical considerations, the vast majority of business analyses are conducted using samples of data
❑ While there are numerous sampling schemas – simple random, stratified, cluster, and others – none can produce a sample that is an exact mirror of the population from which it was drawn
❑ Sampling error approximates the degree of sample-population dissimilarity
❑ Because of sampling error, point estimates may not offer valid and reliable approximations of ‘true’ parameter values
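The effect of sampling error on point estimates can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population (10,000 simulated ages), the sample size of 100, and the seed are all hypothetical choices, not values from the course material. It draws repeated simple random samples and compares the spread of the resulting sample means with the textbook standard error, sigma / sqrt(n).

```python
import random
import statistics

random.seed(525)  # arbitrary seed so the run is repeatable

# Hypothetical population: 10,000 simulated ages (illustrative data only).
population = [random.gauss(40, 12) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def sample_mean(n):
    """Mean of one simple random sample of size n, drawn without replacement."""
    return statistics.fmean(random.sample(population, n))

# Repeated samples of the same size give different point estimates.
estimates = [sample_mean(100) for _ in range(1_000)]

# Spread of the estimates versus the textbook standard error sigma / sqrt(n).
observed_spread = statistics.stdev(estimates)
expected_se = statistics.pstdev(population) / 100 ** 0.5

print(f"population mean:     {true_mean:.2f}")
print(f"smallest estimate:   {min(estimates):.2f}")
print(f"largest estimate:    {max(estimates):.2f}")
print(f"spread of estimates: {observed_spread:.2f}")
print(f"sigma / sqrt(n):     {expected_se:.2f}")
```

No single sample mean is guaranteed to equal the population mean, which is why a point estimate is usually reported together with a measure of its uncertainty.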
The Scientific Method
The scientific method is the process by which scientists, collectively and over time, endeavor to construct an accurate (that is, reliable, consistent and non-arbitrary) representation of the world.
The scientific method encompasses four elements:
1. Observation and description of a phenomenon or group of phenomena.
2. Formulation of a hypothesis to explain the phenomena.
3. Use of the hypothesis to affirm or reject observation-based (#1 above) phenomena.
4. Performance of tests to supply objective bases for #3 above.
Statistical tests, such as the t-test, chi-square test and F-test, are among the mechanisms used in #4 above.
The Scientific Method Process
Business Question → Hypothesis → Test Design → Data Collection → Data Analysis → Interpretation
Experiment: A procedure custom-designed to generate data to test the hypothesis
Observation: The use of already existing data as the basis of the hypothesis test
Statistical inference
The Key Elements of Hypothesis Testing
❑ Reliability assessment: Type I vs. Type II error
Type I (α): Incorrectly concluding that there is a difference
Type II (β): Incorrectly concluding that there is no difference
❑ The fundamental premise:
Null hypothesis H0: X = Y
Alternative hypothesis Ha: X ≠ Y
Quantities being compared (X and Y above) can be either counts (categorical frequencies) or means (averages of continuous measures).
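The two error types can be made concrete with a simulation. The following is a minimal sketch, not a prescribed procedure: it assumes two groups of 30 observations each, a known population standard deviation of 10 (so that a z-test can be used with the standard library alone), and hypothetical group means. When the null hypothesis is true, every rejection is a Type I error; when it is false, every failure to reject is a Type II error.

```python
import random
from statistics import NormalDist, fmean

random.seed(7)   # arbitrary seed so the run is repeatable
ALPHA = 0.05     # chosen significance level
SIGMA = 10.0     # population standard deviation, assumed known
N = 30           # observations per group (hypothetical)

def two_sided_p(x, y):
    """Two-sided z-test p-value for H0: mean(x) = mean(y), with known SIGMA."""
    se = SIGMA * (1 / len(x) + 1 / len(y)) ** 0.5
    z = (fmean(x) - fmean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(mean_x, mean_y, trials=2_000):
    """Share of simulated studies in which H0 is rejected at ALPHA."""
    rejections = 0
    for _ in range(trials):
        x = [random.gauss(mean_x, SIGMA) for _ in range(N)]
        y = [random.gauss(mean_y, SIGMA) for _ in range(N)]
        if two_sided_p(x, y) < ALPHA:
            rejections += 1
    return rejections / trials

# H0 true (equal means): every rejection is a Type I error.
type_i_rate = rejection_rate(50, 50)
# H0 false (means differ by 5): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(50, 55)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The Type I rate should land close to the chosen α, while the Type II rate depends on the size of the true difference, the variability, and the sample size.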
The Key Elements of Hypothesis Testing cont’d
Levels of significance
Using the standard normal distribution (mean = 0 and standard deviation = 1), levels of statistical significance are derived, with α = 0.05 (or 95% confidence level) being the most commonly used one
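As a rough illustration of how a significance level maps onto the standard normal distribution, the sketch below computes the cutoff (critical) z value for a few common α levels. The helper name `critical_z` is hypothetical, and the sketch assumes the rejection region is split evenly between the two ends of the distribution.

```python
from statistics import NormalDist

# Standard normal distribution: mean = 0, standard deviation = 1.
standard_normal = NormalDist(mu=0, sigma=1)

def critical_z(alpha):
    """Cutoff z such that a share alpha of the distribution lies beyond +/- z."""
    return standard_normal.inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    confidence = 1 - alpha
    print(f"alpha = {alpha:.2f} ({confidence:.0%} confidence): |z| > {critical_z(alpha):.3f}")
```

For α = 0.05 the cutoff is approximately 1.96, the familiar value behind the 95% confidence level.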
The Key Elements of Hypothesis Testing cont’d
One- vs. two-tailed test of significance
▪ A two-tailed test of significance is used to test the hypothesis that X ≠ Y. It is akin to saying that X is either larger or smaller than Y.
▪ A one-tailed test of significance is used to test the hypothesis that X > Y (or X < Y). It is akin to saying that X is larger (or smaller) than Y.
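The practical difference between the two kinds of test shows up in the p-value. The sketch below assumes a test statistic that follows the standard normal distribution; the function name `p_values` and the example value z = 1.80 are illustrative choices, not taken from the course material.

```python
from statistics import NormalDist

def p_values(z):
    """p-values for a standard normal test statistic z under each alternative."""
    cdf = NormalDist().cdf
    return {
        "two-tailed (X != Y)": 2 * (1 - cdf(abs(z))),
        "one-tailed (X > Y)": 1 - cdf(z),
        "one-tailed (X < Y)": cdf(z),
    }

results = p_values(1.80)  # hypothetical test statistic
for alternative, p in results.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{alternative}: p = {p:.4f} ({verdict} at alpha = 0.05)")
```

With z = 1.80, the one-tailed test of X > Y is significant at α = 0.05 while the two-tailed test is not, which is why the direction of the hypothesis should be fixed before the data are examined.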
Differences & Associations
Comparisons: Difference
Categorical Variables: Chi-square
Continuous Variables: t-test or F-test
Comparisons: Association
Categorical Variables: Crosstabulation
Continuous Variables: Correlation
Categorical and Continuous Variables: t-test or F-test
Tests of Association and Difference
Chi-square (χ²) test
What does it do? A test to determine if the difference in frequencies of two categorical variables is statistically significant; the test makes no distributional assumptions.
What does the result mean? If the p-value is less than the chosen threshold (for example, α = 0.05, or the 95% confidence level), conclude that the difference in frequencies is material.
t-test
What does it do? A test to determine if the difference between the means of two continuous measures is statistically significant, under the assumption that the data are normally distributed.
What does the result mean? If the p-value is less than the chosen threshold (for example, α = 0.05, or the 95% confidence level), conclude that the difference is material.
F-test
What does it do? A test to determine if the difference between the means of three or more continuous measures is statistically significant, under the assumption that the data are normally distributed.
What does the result mean? If the p-value is less than the chosen threshold (for example, α = 0.05, or the 95% confidence level), conclude that the difference between at least two of the means is material.
Correlation (r)
What does it do? A measure of association between pairs of variables; there are numerous computational approaches, depending on the variables’ measurement scales. Pearson’s r is the most commonly used.
What does the result mean? Values range from -1 (perfect inverse correlation) to 1 (perfect positive correlation). Pearson’s r assumes both variables are continuous, normally distributed and linearly related.
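Two of the statistics above can be worked through by hand with the standard library. This is a minimal sketch under stated assumptions: the 2x2 frequency table and the paired values are hypothetical data, the chi-square statistic is computed without a continuity correction, and the p-value shortcut used here applies only to a 2x2 table (1 degree of freedom). The t-test and F-test are omitted because their p-values need distributions that are not in the Python standard library.

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical counts: rows = two customer segments, columns = bought / did not buy.
stat, p_value = chi_square_2x2([[30, 20], [15, 35]])
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")

# Hypothetical paired measurements on two continuous variables.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"Pearson r = {r:.3f}")
```

Here the chi-square p-value falls below 0.05, so the difference in frequencies would be treated as material, and r of roughly 0.77 indicates a fairly strong positive association in this small illustrative data set.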