statistic 1

Fredablyee
  • 2 years ago
  • 8
files (4)

statisticsAnalyticalExploration.docx

Analytical Exploration

The attached Excel file contains a data set of life expectancies for 50 randomly selected countries. Use this data to answer the following questions.

stataticsD1.xlsx

Sheet1

Country Life Expectancy
Angola 63.24
Australia 83.73
Bangladesh 73.98
Belgium 82.46
Bolivia 68.78
Burkina Faso 60.57
Cameroon 61.92
Canada 82.58
Chile 81.16
China 78.79
Congo 63.26
Croatia 79.4
Denmark 82.58
Djibouti 63.71
Dominican Republic 74.36
Egypt 70.81
Ethiopia 66.65
Fiji 68.45
Grenada 75.49
Iceland 83.02
Iran 76.97
Italy 84.2
Japan 84.59
Jordan 75.02
Latvia 76.06
Madagascar 66.43
Malaysia 76.42
Mali 60.03
Mongolia 72.94
Nepal 70.78
Netherlands 82.58
New Zealand 83.16
Nigeria 53.37
Oman 78.97
Pakistan 67.34
Paraguay 74.1
Peru 76.96
Russia 74.57
Samoa 72.75
Senegal 69.31
Singapore 84.27
South Korea 84.14
Sri Lanka 76.8
Sweden 83.65
Turkey 78.68
Uganda 63.84
United Kingdom 82.31
United States 79.74
Vietnam 74.74
Zimbabwe 61.92
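
As a starting point for the exploration, the sketch below transcribes the 50 values from Sheet1 into a Python list and computes the sample size, sample mean, and range. The variable names are illustrative and not part of the assignment; the values are copied in the order the countries are listed.

```python
from statistics import mean

# Life expectancies transcribed from Sheet1, in listed order (Angola ... Zimbabwe).
life_expectancy = [
    63.24, 83.73, 73.98, 82.46, 68.78, 60.57, 61.92, 82.58, 81.16, 78.79,
    63.26, 79.40, 82.58, 63.71, 74.36, 70.81, 66.65, 68.45, 75.49, 83.02,
    76.97, 84.20, 84.59, 75.02, 76.06, 66.43, 76.42, 60.03, 72.94, 70.78,
    82.58, 83.16, 53.37, 78.97, 67.34, 74.10, 76.96, 74.57, 72.75, 69.31,
    84.27, 84.14, 76.80, 83.65, 78.68, 63.84, 82.31, 79.74, 74.74, 61.92,
]

n = len(life_expectancy)        # sample size
x_bar = mean(life_expectancy)   # sample mean: sum of x divided by n
lowest = min(life_expectancy)   # smallest observation
highest = max(life_expectancy)  # largest observation

print(f"n = {n}, mean = {x_bar:.2f}, range = [{lowest}, {highest}]")
```

The same list can be reused for any of the later calculations (standard deviation, confidence interval for the mean) once the relevant critical value is known.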

StatisticsFormulasandTemplates.pdf

Name | Symbol | Formula | Description

Basic Statistics

Sample Size | $n$ | n/a | The number of data points in a sample.
Sample Proportion | $\hat{p}$ | $\hat{p} = \frac{x}{n}$ | The proportion (percentage) from a sample.
Population Proportion | $p$ | $p = \frac{x}{N}$ | The total proportion (percentage) from a population.
Sample Mean | $\bar{x}$ | $\bar{x} = \frac{\sum x}{n}$ | The arithmetic average of a sample.
Population Mean | $\mu$ | $\mu = \frac{\sum x}{N}$ | The arithmetic average of a population.
Sample Standard Deviation | $s$ | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ | The standard deviation from a sample.
Population Standard Deviation | $\sigma$ | $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ | The standard deviation from a population.
Sample Variance | $s^2$ | $s^2 = \frac{\sum (x - \bar{x})^2}{n}$ | The variance from a sample.
Population Variance | $\sigma^2$ | $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ | The variance from a population.
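
The variance and standard deviation formulas above can be sketched in a few lines of Python. The data set here is hypothetical, and the helper name `variance_div_n` is an assumption made for illustration. The helper divides by the number of values, as the sheet's formulas are written; many textbooks and Python's `statistics.stdev` divide by $n - 1$ for a sample instead, so results may differ slightly from other tools.

```python
from math import sqrt
from statistics import pstdev

def variance_div_n(values, center):
    """Average squared deviation from `center`, dividing by the number of values."""
    return sum((x - center) ** 2 for x in values) / len(values)

sample = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical data, not from the spreadsheet

x_bar = sum(sample) / len(sample)    # mean: sum of x divided by n
s2 = variance_div_n(sample, x_bar)   # variance with divisor n
s = sqrt(s2)                         # standard deviation is the square root of the variance

# statistics.pstdev also divides by n, so it should agree with s.
print(x_bar, s2, s, pstdev(sample))
```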

Critical Values

Critical Values | $z_{\alpha/2}$ | $z$-score found at the $\alpha/2$th percentile of a standard normal distribution | Critical value used for confidence intervals estimating a population proportion or for two-tailed hypothesis tests about a population proportion.
Critical Values | $z_{\alpha}$ | $z$-score found at the $\alpha$th percentile of a standard normal distribution | Critical value used for left-tailed or right-tailed hypothesis tests about a population proportion.
Critical Values | $t_{\alpha/2}$ | $t$-statistic found at the $\alpha/2$th percentile of a Student's $t$ distribution ($n$ degrees of freedom) | Critical value used for confidence intervals estimating a population mean or for two-tailed hypothesis tests about a population mean.
Critical Values | $t_{\alpha}$ | $t$-statistic found at the $\alpha$th percentile of a Student's $t$ distribution ($n$ degrees of freedom) | Critical value used for left-tailed or right-tailed hypothesis tests about a population mean.
Critical Values | $\chi^2_L$ | Chi-squared statistic found at the bottom $\alpha$th percentile of a chi-squared distribution ($n$ degrees of freedom) | Left critical value used in confidence intervals and hypothesis tests about a population standard deviation.
Critical Values | $\chi^2_R$ | Chi-squared statistic found at the top $\alpha$th percentile of a chi-squared distribution ($n$ degrees of freedom) | Right critical value used in confidence intervals and hypothesis tests about a population standard deviation.
Critical Values | $r$ | $r = \frac{t_{\alpha/2}}{\sqrt{(t_{\alpha/2})^2 + (n - 2)}}$ ($n - 2$ degrees of freedom) | Critical values of the Pearson correlation coefficient $r$.
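
For the normal-distribution critical values, Python's standard library is enough. The sketch below is one possible way to look up $z_{\alpha/2}$ and $z_{\alpha}$; the function name `z_critical` is an assumption for illustration. It returns the positive (upper-tail) value; by symmetry the lower-tail value is its negative. The $t$ and chi-squared critical values are not covered here, since they need a table or a third-party library.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Positive z critical value for significance level `alpha`."""
    tail = alpha / 2 if two_tailed else alpha  # area left in the upper tail
    return NormalDist().inv_cdf(1 - tail)      # standard normal quantile

z_two = z_critical(0.05)                    # z_{alpha/2}, two-tailed
z_one = z_critical(0.05, two_tailed=False)  # z_{alpha}, one-tailed

print(f"z_(alpha/2) = {z_two:.3f}, z_alpha = {z_one:.3f}")
```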

Margins of Error

Margin of Error | $E$ | $E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$ | Margin of error for a confidence interval estimating a population proportion.
Margin of Error | $E$ | $E = t_{\alpha/2} \frac{s}{\sqrt{n}}$ | Margin of error for a confidence interval estimating a population mean.

Confidence Intervals

Confidence Interval | n/a | $\hat{p} - E < p < \hat{p} + E$ | Confidence interval for estimating a population proportion.
Confidence Interval | n/a | $\bar{x} - E < \mu < \bar{x} + E$ | Confidence interval for estimating a population mean.
Confidence Interval | n/a | $\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$ | Confidence interval for estimating a population standard deviation.
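
To show how the margin of error feeds into an interval, here is a minimal sketch for the proportion case. The counts (`x=120`, `n=400`) are hypothetical, the function name `proportion_interval` is an assumption, and $\hat{q}$ is taken to be $1 - \hat{p}$.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Return (lower, upper, margin) for a population proportion."""
    p_hat = x / n                                       # sample proportion
    q_hat = 1 - p_hat                                   # assumed complement of p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)                # margin of error E
    return p_hat - margin, p_hat + margin, margin

# Hypothetical sample: 120 successes out of 400 observations.
low, high, E = proportion_interval(x=120, n=400)
print(f"{low:.4f} < p < {high:.4f} (E = {E:.4f})")
```

The interval for a mean follows the same pattern, with $t_{\alpha/2}$ and $s / \sqrt{n}$ in place of the $z$ value and the proportion term.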

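Hypothesis Tests

Hypothesis Test on One Sample - Proportions | n/a | $H_0: p = X$; $H_a: p < X$ | Claim uses “less than”.
Hypothesis Test on One Sample - Proportions | n/a | $H_0: p = X$; $H_a: p > X$ | Claim uses “greater than”.
Hypothesis Test on One Sample - Proportions | n/a | $H_0: p = X$; $H_a: p \neq X$ | Claim uses “equal to” or “not equal to”.
Hypothesis Test on One Sample - Means | n/a | $H_0: \mu = X$; $H_a: \mu < X$ | Claim uses “less than”.
Hypothesis Test on One Sample - Means | n/a | $H_0: \mu = X$; $H_a: \mu > X$ | Claim uses “greater than”.
Hypothesis Test on One Sample - Means | n/a | $H_0: \mu = X$; $H_a: \mu \neq X$ | Claim uses “equal to” or “not equal to”.
Hypothesis Test on One Sample - Standard Deviation | n/a | $H_0: \sigma = X$; $H_a: \sigma < X$ | Claim uses “less than”.
Hypothesis Test on One Sample - Standard Deviation | n/a | $H_0: \sigma = X$; $H_a: \sigma > X$ | Claim uses “greater than”.
Hypothesis Test on One Sample - Standard Deviation | n/a | $H_0: \sigma = X$; $H_a: \sigma \neq X$ | Claim uses “equal to” or “not equal to”.
Hypothesis Test on Two Samples - Proportions | n/a | $H_0: p_1 = p_2$; $H_a: p_1 < p_2$ | Claim uses “less than”.
Hypothesis Test on Two Samples - Proportions | n/a | $H_0: p_1 = p_2$; $H_a: p_1 > p_2$ | Claim uses “greater than”.
Hypothesis Test on Two Samples - Proportions | n/a | $H_0: p_1 = p_2$; $H_a: p_1 \neq p_2$ | Claim uses “equal to” or “not equal to”.
Hypothesis Test on Two Samples - Independent Means | n/a | $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 < \mu_2$ | Claim uses “less than”.
Hypothesis Test on Two Samples - Independent Means | n/a | $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 > \mu_2$ | Claim uses “greater than”.
Hypothesis Test on Two Samples - Independent Means | n/a | $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \neq \mu_2$ | Claim uses “equal to” or “not equal to”.
Hypothesis Test on Two Samples - Dependent Means | n/a | $H_0: \mu_d = X$; $H_a: \mu_d < X$ | Claim uses “less than”.
Hypothesis Test on Two Samples - Dependent Means | n/a | $H_0: \mu_d = X$; $H_a: \mu_d > X$ | Claim uses “greater than”.
Hypothesis Test on Two Samples - Dependent Means | n/a | $H_0: \mu_d = X$; $H_a: \mu_d \neq X$ | Claim uses “equal to” or “not equal to”.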

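Test Statistics

Test Statistic | $z$ | $z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$ | Test statistic used for a hypothesis test about a population proportion.
Test Statistic | $z$ | $z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}}$, where $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$ | Test statistic used for a hypothesis test about two population proportions. $\bar{p}$ is the pooled proportion.
Test Statistic | $t$ | $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ | Test statistic used for a hypothesis test about a population mean.
Test Statistic | $t$ | $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ | Test statistic used for a hypothesis test about two independent population means.
Test Statistic | $t$ | $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$ | Test statistic used for a hypothesis test about two dependent means.
Test Statistic | $\chi^2$ | $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$ | Test statistic used for a hypothesis test about a population standard deviation.
Test Statistic | $\chi^2$ | $\chi^2 = \sum \frac{(O - E)^2}{E}$ | Test statistic used for a goodness of fit test or a test of independence.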
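
As one worked instance of the test statistics above, the sketch below computes the one-sample $z$ statistic for a proportion and a matching p-value from the standard normal distribution. The counts are hypothetical, the function name `one_proportion_z_test` and its `tail` argument are assumptions for illustration, and $q$ is taken to be $1 - p$.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, tail="two"):
    """Return (z, p_value) for H0: p = p0, with tail 'left', 'right', or 'two'."""
    p_hat = x / n                                   # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # z = (p_hat - p) / sqrt(pq / n)
    cdf = NormalDist().cdf(z)                       # area to the left of z
    if tail == "left":
        p_value = cdf
    elif tail == "right":
        p_value = 1 - cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)             # two-tailed: double the smaller tail
    return z, p_value

# Hypothetical claim: the population proportion is not equal to 0.5.
z, p_value = one_proportion_z_test(x=60, n=100, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

With a significance level of 0.05, a p-value below 0.05 would lead to rejecting $H_0$ in this hypothetical example.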

Dependent Samples (Matched Pairs)

Sample Average Difference | $\bar{d}$ | $\bar{d} = \frac{\sum d}{n}$ | The arithmetic average of a set of differences of matched pairs from a sample.
Population Average Difference | $\mu_d$ | $\mu_d = \frac{\sum d}{N}$ | The arithmetic average of a set of differences of matched pairs from a population.
Sample Standard Deviation of Differences | $s_d$ | $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n}}$ | The standard deviation of a set of differences of matched pairs.
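
The matched-pairs quantities can be sketched as follows. The before/after measurements are hypothetical, and the standard deviation divides by $n$ as the sheet's formula is written (some texts use $n - 1$). The resulting $t$ value would then be compared against a critical value from a $t$ table.

```python
from math import sqrt

# Hypothetical matched pairs: one "before" and one "after" value per subject.
before = [10, 12, 9, 11, 14, 13, 15, 10]
after = [12, 16, 13, 15, 19, 18, 22, 19]

d = [a - b for a, b in zip(after, before)]             # difference for each pair
n = len(d)
d_bar = sum(d) / n                                     # sample average difference
s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / n)       # std. dev. of differences, divisor n

mu_d = 0                                               # hypothesized mean difference under H0
t = (d_bar - mu_d) / (s_d / sqrt(n))                   # test statistic for dependent means

print(f"d_bar = {d_bar}, s_d = {s_d}, t = {t:.4f}")
```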

Linear Regression

Regression Equation | n/a | $\hat{y} = b_0 + b_1 x$ | Line of best fit.
Regression Equation | $\hat{y}$ | Substitute the independent variable into the regression equation. | Best predicted value when the regression equation is a good model.
Regression Equation | $\bar{y}$ | $\bar{y} = \frac{\sum y}{n}$ | Average of $y$ values; best predicted value when the regression equation is not a good model.
Regression Equation | $b_0$ | $b_0 = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$ | $y$-intercept
Regression Equation | $b_1$ | $b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$ | slope
Correlation Coefficient | $r$ | $r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2} \sqrt{n(\sum y^2) - (\sum y)^2}}$ | Measures the strength of the linear correlation between two sets of paired variables.
Coefficient of Determination | $r^2$ | Square of the correlation coefficient, $r$ | Measures the ratio of explained variance to total variance between two sets of paired variables.
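
The regression and correlation formulas share the same sums, so they can be computed together. The sketch below uses a small hypothetical data set; the function name `fit_line` is an assumption for illustration.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (b0, b1, r) from the sum-based regression and correlation formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x*y
    sxx = sum(x * x for x in xs)               # sum of x^2
    syy = sum(y * y for y in ys)               # sum of y^2
    denom = n * sxx - sx ** 2                  # shared denominator n*sum(x^2) - (sum x)^2
    b1 = (n * sxy - sx * sy) / denom           # slope
    b0 = (sy * sxx - sx * sxy) / denom         # y-intercept
    r = (n * sxy - sx * sy) / (sqrt(denom) * sqrt(n * syy - sy ** 2))
    return b0, b1, r

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1, r = fit_line(xs, ys)
y_hat = b0 + b1 * 6                            # predicted y for x = 6
print(f"y_hat = {b0:.2f} + {b1:.2f}x, r = {r:.4f}, r^2 = {r * r:.4f}")
```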

StatiticsKeyConceptsandReview.pdf