Reflection Paper

AC21620-Chapter51.pptx

A Practical Approach to Analyzing Healthcare Data, Fourth Edition Chapter 5, Analyzing Continuous Variables

Susan White, PhD, RHIA, CHDA

ahima.org

© 2019 AHIMA

Learning Objectives

Compare and contrast commonly used measures of central tendency

Compare and contrast commonly used measures of variation or spread of values

Illustrate appropriate inferential statistics to use for continuous data

Continuous Variables

Data elements that represent naturally numeric values and can take an infinite number of values

Interval (no true zero)

Ratio

Healthcare Examples

Length of stay

Charge

Systolic blood pressure

Age

Time to code records

Descriptive Statistics Measures of Central Tendency

Mean

Arithmetic average

Sum of values divided by the number of values

Median

Middle value

If even number of values, average of two middle values

Less influenced by extreme values or outliers than the mean

Mode

Most frequent value

Descriptive Statistics Measures of Variation

Range

Maximum value minus minimum value

Interquartile range

Difference between the third and first quartile

Variance

Average squared deviation from the mean

Unit of measure is “squared units”

Standard deviation

Square root of the variance

Unit of measure is same as unit of measure in sample
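
As a minimal sketch of the interquartile range, the snippet below applies Python's standard `statistics.quantiles` to the length of stay sample used in this chapter's worked examples. Quartile conventions differ between textbooks and software; the default "exclusive" method used here is an assumption and is not necessarily the method used in the text.

```python
import statistics

# Length of stay sample (days) from this chapter's worked examples.
los = [2, 4, 6, 3, 1, 2, 5]

# With n=4, quantiles() returns the three quartile cut points (Q1, Q2, Q3).
# The default method is "exclusive"; other conventions can give other values.
q1, q2, q3 = statistics.quantiles(los, n=4)

# Interquartile range: difference between the third and first quartile.
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")  # Q1 = 2.0, Q3 = 5.0, IQR = 3.0
```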

Descriptive Statistics Example

Calculate the mean, median and mode of the following sample length of stay data:

2, 4, 6, 3, 1, 2, 5

Mean: $\bar{x} = (2 + 4 + 6 + 3 + 1 + 2 + 5) / 7 = 23 / 7 \approx 3.3$

Median

Sort values: 1, 2, 2, 3, 4, 5, 6

Median or middle value = 3

Mode

2, since it is the most frequent value

Note: The mode is rarely used for continuous variables that have many unique values and is presented here for demonstration purposes.
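
The three measures of central tendency for this sample can be checked with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are chosen here and do not come from the text.

```python
import statistics

# Length of stay sample (days) from the example above.
los = [2, 4, 6, 3, 1, 2, 5]

mean_los = statistics.mean(los)      # sum of values / number of values = 23 / 7
median_los = statistics.median(los)  # middle value of the sorted data
mode_los = statistics.mode(los)      # most frequent value

print(f"mean = {mean_los:.1f}, median = {median_los}, mode = {mode_los}")
# mean = 3.3, median = 3, mode = 2
```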

Descriptive Statistics Example

Calculate the range, variance and standard deviation of the following sample length of stay data:

2, 4, 6, 3, 1, 2, 5

Range = 6 – 1 = 5

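Sample variance

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{19.43}{6} \approx 3.2$

Standard deviation

$s = \sqrt{s^2} = \sqrt{3.2} \approx 1.8$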
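
The range, sample variance, and standard deviation for this sample can be reproduced with a short sketch using Python's standard `statistics` module. Note that `statistics.variance` and `statistics.stdev` use the sample (n - 1) denominator.

```python
import statistics

# Length of stay sample (days) from the example above.
los = [2, 4, 6, 3, 1, 2, 5]

los_range = max(los) - min(los)         # maximum value minus minimum value
sample_var = statistics.variance(los)   # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(los)       # square root of the sample variance

print(f"range = {los_range}, variance = {sample_var:.1f}, sd = {sample_sd:.1f}")
# range = 5, variance = 3.2, sd = 1.8
```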

Review: Hypothesis Testing Steps

Determine the null and alternative hypotheses

Set the acceptable type I error or alpha level

Select the appropriate test statistic

Compare the test statistic to a critical value based on the alpha level and the distribution of the test statistic

Reject the null hypothesis if the test statistic is more extreme than the critical value. If not, do not reject the null hypothesis.
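
The decision rule in the last step can be sketched as a small helper. The function name `reject_null` and its `tail` argument are hypothetical and introduced only for illustration; the sample calls use the test statistics and critical values from the one-sample and paired examples in this chapter.

```python
def reject_null(test_stat: float, critical_value: float, tail: str = "two-sided") -> bool:
    """Return True when the test statistic is more extreme than the critical value.

    critical_value is the positive cutoff read from a table for the chosen alpha.
    """
    if tail == "two-sided":
        return abs(test_stat) > critical_value
    if tail == "greater":   # alternative hypothesis: parameter > null value
        return test_stat > critical_value
    if tail == "less":      # alternative hypothesis: parameter < null value
        return test_stat < -critical_value
    raise ValueError(f"unknown tail: {tail!r}")


# One-sample LOS example: t = 0.44, one-sided critical value = 1.943.
print(reject_null(0.44, 1.943, tail="greater"))  # False
# Paired coding-time example: t = 1.49, two-sided critical value = 3.25.
print(reject_null(1.49, 3.25))  # False
```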

Inferential Statistics One Sample t-test

One-sample t-test

Used to test if a population mean is different from a standard or benchmark

Test statistic: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Compare to a t-distribution to determine critical value

May be one sided or two sided

Anatomy of test statistic:

Numerator: distance from sample mean to null hypothesis value

Denominator: standard error of the sample mean (SEM)

Inferential Statistics One Sample t-test - Example

Suppose the researcher who collected the length of stay (LOS) data in the previous examples would like to determine whether the population mean LOS is longer than a standard of 3 days.

Step 1: Determine the null and alternative hypotheses

Ho: µ ≤ 3

Ha: µ > 3

Step 2: Set the acceptable type I error rate (also known as the alpha level).

Set α = 0.05

Step 3: Select the appropriate test statistic: t-test

Inferential Statistics One Sample t-test - Example

Step 3 (cont’d)

Recall from previous slides:

$\bar{x} = 3.3$

s = 1.8

n = 7

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{3.3 - 3}{1.8 / \sqrt{7}} = 0.44$

Step 4: Compare test statistic to critical value.

The critical value for the t-test statistic comes from the t-distribution with n - 1 degrees of freedom

The t-distribution is symmetric around zero, much like the standard normal (bell curve); its width is determined by the degrees of freedom (see Figure 5.1 in the text)

Inferential Statistics One Sample t-test - Example

Step 4 (cont’d): t = 0.44; df = n - 1 = 7 - 1 = 6; for a one-sided test at α = 0.05, the critical value is 1.943

Step 5: Reject the null hypothesis if the test statistic is more extreme than the critical value. Because 0.44 is not greater than 1.943, do not reject the null hypothesis; there is not sufficient evidence to conclude that the population LOS is longer than the standard.
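
The one-sample t-test above can be sketched in Python using only the standard library. The critical value 1.943 is taken from the slide rather than computed. Because the code works from unrounded values, it gives t of about 0.42 instead of the 0.44 obtained from the rounded mean (3.3) and standard deviation (1.8); the conclusion is the same.

```python
import math
import statistics

# Length of stay sample (days) and the benchmark from the example.
los = [2, 4, 6, 3, 1, 2, 5]
mu0 = 3           # null hypothesis value (standard of 3 days)
critical = 1.943  # one-sided critical value, df = 6, alpha = 0.05 (from the slide)

n = len(los)
x_bar = statistics.mean(los)
sem = statistics.stdev(los) / math.sqrt(n)  # standard error of the sample mean

# Numerator: distance from sample mean to null value; denominator: SEM.
t_stat = (x_bar - mu0) / sem

# One-sided test (Ha: mu > 3): reject only if t exceeds the critical value.
reject = t_stat > critical

print(f"t = {t_stat:.2f}, df = {n - 1}, reject Ho: {reject}")
# t = 0.42, df = 6, reject Ho: False
```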

Inferential Statistics Confidence Interval for Population Mean

Recall that a confidence interval is a range of values that is likely to cover the true population value with a pre-defined probability or level of confidence

A 100(1 - α)% confidence interval for the population mean is centered at the sample mean and has a width that depends on the confidence level and the standard error of the mean

A higher level of confidence requires a wider interval

A larger sample size results in a narrower interval

The width of the confidence interval is a measure of the precision of the estimate of the population mean

A narrower interval is more precise

Inferential Statistics Confidence Interval for Population Mean

Formulate a 95% confidence interval for the LOS data presented in the previous example:

s = 1.8

n = 7

95% CI, so α = 0.05; α/2 = 0.025

df = 6

Critical value (table 5.1) = 2.447

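95% CI: $\bar{x} \pm t_{\alpha/2,\,df} \times \frac{s}{\sqrt{n}} = 3.3 \pm 2.447 \times \frac{1.8}{\sqrt{7}} = 3.3 \pm 1.7$

(1.6, 5.0)

We are 95% confident that the range 1.6 to 5.0 days includes the true population mean LOS.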
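
The interval can be reproduced with a short standard-library sketch. The critical value 2.447 is taken from the slide rather than computed. Working from unrounded values gives roughly 1.62 to 4.95 days, which agrees with the slide's (1.6, 5.0) up to rounding.

```python
import math
import statistics

# Length of stay sample (days) from the earlier examples.
los = [2, 4, 6, 3, 1, 2, 5]
t_crit = 2.447  # two-sided critical value, df = 6, alpha/2 = 0.025 (from the slide)

n = len(los)
x_bar = statistics.mean(los)
sem = statistics.stdev(los) / math.sqrt(n)  # standard error of the mean

# Interval is centered at the sample mean; half-width = critical value * SEM.
margin = t_crit * sem
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```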

Inferential Statistics Paired t-test

Paired t-test

Used to compare pre/post test population values or matched pairs

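Test statistic: $t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$

Where d = difference between the pre/post values or the pairs, $\bar{d}$ = mean of the differences, $s_d$ = standard deviation of the differences, and $D_0$ = null hypothesis value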

Compare to a t-distribution to determine critical value

May be one sided or two sided

Anatomy of test statistic:

Numerator: distance from sample mean difference to null hypothesis value (usually zero)

Denominator: standard error of the sample mean difference (SEM)

Inferential Statistics Paired t-test – Example

The transition from ICD-9 to ICD-10 is predicted to cause an increase in the amount of time required to code medical records. A pilot study was conducted using a random sample of 10 records to determine if the time required was significantly different. Each record was coded using the two coding systems by one coder. The values are recorded in the table.

Step 1: Determine the null and alternative hypotheses:

Ho: D = 0

Ha: D ≠ 0

Step 2: Set the alpha level: 0.01

ID    ICD-9 Time    ICD-10 Time    d
1     10            15             5
2     11            12             1
3     15            10             -5
4     30            36             6
5     5             7              2
6     10            13             3
7     8             5              -3
8     11            19             8
9     21            19             -2
10    18            23             5

Inferential Statistics Paired t-test – Example

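Step 3: Select the appropriate test statistic: paired t-test

Step 4: Compare the test statistic to the critical value

$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} = \frac{2.0}{4.24 / \sqrt{10}} = 1.49$

Compare to the t-distribution with df = 9 and α/2 = 0.005 (critical value = 3.25)

1.49 is not greater than 3.25

Step 5: Do not reject Ho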
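
The paired t-test for the coding-time data can be sketched with the standard library. The critical value 3.25 is taken from the slide rather than computed, and the differences are taken as ICD-10 time minus ICD-9 time, as in the table.

```python
import math
import statistics

# Coding times for the 10 sampled records, from the table in the example.
icd9 = [10, 11, 15, 30, 5, 10, 8, 11, 21, 18]
icd10 = [15, 12, 10, 36, 7, 13, 5, 19, 19, 23]
critical = 3.25  # two-sided critical value, df = 9, alpha/2 = 0.005 (from the slide)

# d = ICD-10 time minus ICD-9 time for each record.
d = [after - before for before, after in zip(icd9, icd10)]

n = len(d)
d_bar = statistics.mean(d)                   # mean difference
se_d = statistics.stdev(d) / math.sqrt(n)    # standard error of the mean difference

# Null hypothesis value is zero, so the numerator is just the mean difference.
t_stat = (d_bar - 0) / se_d

# Two-sided test: compare the absolute value to the critical value.
reject = abs(t_stat) > critical

print(f"mean d = {d_bar}, t = {t_stat:.2f}, reject Ho: {reject}")
# mean d = 2, t = 1.49, reject Ho: False
```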

Inferential Statistics Two Sample t-test

Used to test if two population means are different

Test statistic is complex

Denominator is the standard error pooled across the two samples

Use statistical software to calculate

Compare to a t-distribution to determine critical value

May be one sided or two sided

Anatomy of test statistic:

Numerator: distance between the two sample means

Denominator: pooled standard error of the difference between the two sample means
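
To make the anatomy of the test statistic concrete, here is a minimal sketch of the equal-variance (pooled) two-sample t statistic. The function name `pooled_t` and the two small samples are hypothetical, made up purely for illustration; they are not the Chapter 5 data. In practice, statistical software would be used, as noted above.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n1 - 1) * statistics.variance(sample1)
        + (n2 - 1) * statistics.variance(sample2)
    ) / df
    # Denominator: pooled standard error of the difference between the means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Numerator: distance between the two sample means.
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    return diff / se_diff, df


# Hypothetical charges (in thousands of dollars) for two groups of patients.
group1 = [12, 15, 11, 14, 13]
group2 = [10, 9, 12, 11, 8]

t_stat, df = pooled_t(group1, group2)
print(f"t = {t_stat:.2f}, df = {df}")  # t = 3.00, df = 8
```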

Inferential Statistics Two Sample t-test - Example

An analyst wanted to find out if the charges for CHF patients admitted through the emergency department (ED) are different from those admitted through other sources. The data for the sample may be found in the Chapter 5 Data file. The summary statistics from the samples appear in the table below:

Step 1: State hypotheses:

Ho: µ1 = µ2

Ha: µ1 ≠ µ2

Step 2: Set the alpha level = 0.05

Step 3: Determine the test statistic: T-test

Inferential Statistics Two Sample t-test – Results from R

Step 4: Compare the test statistic to the critical value

t = 2.2363 with a p-value of 0.03189

Step 5: Reject the null hypothesis if the test statistic is more extreme than the critical value or the p-value is less than alpha

Reject the null hypothesis (0.03 < 0.05) and conclude that mean charges differ between CHF patients admitted through the emergency department and those admitted through other sources

Inferential Statistics ANOVA

Used to test if more than two population means are different

Test statistic: F-test

Best to use software to compute

Compare to an F-distribution to determine critical value

Anatomy of test statistic:

Numerator: variance between comparison groups

Denominator: variance within comparison groups

Inferential Statistics ANOVA

Sum of Squares

Degrees of Freedom

Mean Squares = Sum of Squares / Degrees of Freedom

Test statistic: F = Mean Squares between groups / Mean Squares within groups
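
The ANOVA quantities listed on this slide (sums of squares, degrees of freedom, mean squares, and F) can be sketched for a one-way layout as follows. The function name `one_way_anova` and the three small groups are hypothetical and exist only to illustrate the arithmetic; they are not the facility data from the example.

```python
import statistics


def one_way_anova(groups):
    """Return a dict with the pieces of a one-way ANOVA table."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)

    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between  # mean square = sum of squares / df
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical charges (in thousands of dollars) for three groups.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(table["F"])  # 21.0
```

The F statistic would then be compared with an F-distribution with (k - 1, N - k) degrees of freedom, typically by software.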

Inferential Statistics ANOVA - Example

The Medicare severity-adjusted diagnosis-related group (MS-DRG) system is designed so that the level of resources required to treat a patient, as measured by charges per patient, differs across the no complication or comorbidity, complication or comorbidity (CC), and major complication or comorbidity (MCC) levels within an MS-DRG family. An analyst was asked to test whether that relationship held at her facility. A sample of 80 cases was selected for the three congestive heart failure MS-DRGs: 291 (MCC), 292 (CC), and 293 (no CC or MCC). Since three populations of patients are compared, the analyst used R to generate summary statistics and the ANOVA table below.

Inferential Statistics ANOVA - Example

Step 1: State the hypotheses

Ho: µ291= µ292= µ293

Ha: At least two of the population means are unequal

Step 2: Set the acceptable error level: α=0.05

Step 3: Determine the appropriate test statistic: F-test

Step 4: Compare p-value to α=0.05

Step 5: Reject Ho since p < 0.0001, which is less than α = 0.05
