Statistical Analysis Report
Effect of age on infectious disease treatment methods
Overview
There has been an increase in patients admitted with a particular infectious disease. I believe that the ages of these patients play a critical role in the method used to treat them.
Classification
The dataset contains three variables: client number (patient #), infection disease status, and patient age.
Patient age and client number are recorded as quantitative variables (although client number serves only as an identifier), while infection disease status is a qualitative variable.
Age and client number are expressed as integers; thus, they are discrete. Infection disease status is also a discrete variable.
Age is measured on a ratio scale because it has a true zero, while infection disease status is measured on a nominal scale.
Measures of Central tendency
Mean
•The mean is the average value of all the data points. It uses every data point, which makes it the most informative measure. However, it is sensitive to extreme values, which can pull it toward a misleading figure.
Median
•The median is the middle value when the data values are arranged in ascending or descending order. It is not affected by extreme values, but it is less informative when the dataset is small.
Mode
•The mode is the value that is recorded most often. It is the only measure of central tendency that can be used with nominal data, but it becomes less reliable when the data have several modal values.
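The contrast between the three measures can be sketched in a few lines of Python. The ages below are hypothetical and are not taken from the report's dataset; the sketch only illustrates how one extreme value shifts the mean more than the median, and how several modal values can appear.

```python
from statistics import mean, median, multimode

# Hypothetical ages for illustration only (not the report's data).
ages = [55, 55, 58, 61, 63, 66, 70]
ages_with_outlier = ages + [120]  # add one extreme value

# The mean moves noticeably when the extreme value is added...
mean_before = mean(ages)
mean_after = mean(ages_with_outlier)

# ...while the median barely changes.
median_before = median(ages)
median_after = median(ages_with_outlier)

# multimode returns every value tied for most frequent.
modes_single = multimode(ages)
modes_several = multimode([55, 55, 61, 61, 70])

print(mean_before, mean_after)
print(median_before, median_after)
print(modes_single, modes_several)
```

With these made-up values the mean rises by more than seven years after the outlier is added, while the median moves by one year.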
Measures of Variation
Standard deviation
•The standard deviation measures how much the values deviate from the mean. It is relatively stable from sample to sample, but it is affected by extreme values.
Range
•The range is the difference between the maximum and the minimum value in the data. It is easy to compute, but it depends only on the two extreme values, which makes it less reliable.
Interquartile range
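•The interquartile range is the difference between the third and first quartiles. It is a good choice when the data are skewed, but because it ignores the lowest and highest quarters of the data, it is less informative than the standard deviation.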
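As a rough illustration of the three measures of variation, the sketch below computes each one with Python's standard library. The ages are hypothetical, and the "inclusive" quartile method is an assumption; other quartile conventions give slightly different interquartile ranges.

```python
from statistics import quantiles, stdev

# Hypothetical ages for illustration only (not the report's data).
ages = [41, 55, 55, 58, 61, 63, 66, 70, 84]

# Sample standard deviation (divides by n - 1).
sample_sd = stdev(ages)

# Range: depends only on the two extreme values.
value_range = max(ages) - min(ages)

# Quartiles using the inclusive method (an assumed convention).
q1, q2, q3 = quantiles(ages, n=4, method="inclusive")
iqr = q3 - q1

print(round(sample_sd, 2), value_range, iqr)
```

Here the range is driven entirely by the youngest and oldest patient, while the interquartile range reflects only the middle half of the ages.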
Calculations
| Measure | Formula | Result | Interpretation |
| Mean | =AVERAGE(C2:C66) | 61.7692 | Average age of patients with the infectious disease |
| Median | =MEDIAN(C2:C66) | 61 | Age of the patient in the middle of the ordered group of patients |
| Mode | =MODE(C2:C66) | 55 | Age that appears most often (the data are multimodal) |
| Mid-range | =(C2+C66)/2 | 62.5 | Midpoint between the youngest and oldest patient ages |
| Range | =MAX(C2:C66)-MIN(C2:C66) | 43.0000 | Difference in age between the oldest and youngest patient |
| Variance | =VAR.S(C2:C66) | 99.6490 | Sample variance of the age of patients with the disease |
| Standard Deviation | =STDEV.S(C2:C66) | 9.9824 | Sample standard deviation of age, in years |
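The mid-range formula =(C2+C66)/2 gives the midpoint of the data only if column C is sorted by age, so that C2 and C66 hold the minimum and maximum. A minimal sketch of an order-independent version follows; the function name and the ages are hypothetical.

```python
def mid_range(values):
    """Midpoint between the smallest and largest value, in any order."""
    return (min(values) + max(values)) / 2


# Hypothetical, deliberately unsorted ages (not the report's data).
unsorted_ages = [63, 84, 55, 41, 70]

order_independent = mid_range(unsorted_ages)
first_and_last = (unsorted_ages[0] + unsorted_ages[-1]) / 2

print(order_independent, first_and_last)
```

On unsorted data the two results differ, which is why the minimum and maximum are used explicitly here.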
Confidence Intervals
Confidence intervals
•A confidence interval is a range of values, calculated from the sample data, that is likely to contain the population parameter at a stated level of confidence.
Point estimate
•A point estimate is a single value, calculated from the sample data, that is regarded as the best estimate of a population parameter.
The best point estimate for a population mean
•The best point estimator of the mean is the one with the smallest variance among all the unbiased and consistent estimators. For the population mean, this is the sample mean.
Importance of confidence interval
•It provides the range in which the true parameter value is expected to lie at a stated level of confidence.
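One way to read the confidence level is as a long-run coverage rate: if the sampling were repeated many times, about 95% of the intervals built this way would contain the true mean. The simulation below is a sketch of that idea under assumed, hypothetical population values (normal ages with mean 62 and standard deviation 10); it is not based on the report's data.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and design values (assumptions for illustration).
TRUE_MEAN, TRUE_SD = 62.0, 10.0
SAMPLE_SIZE, TRIALS, Z_CRITICAL = 65, 2000, 1.96

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = mean(sample)
    margin = Z_CRITICAL * stdev(sample) / sqrt(SAMPLE_SIZE)
    # Count the interval if it contains the true population mean.
    if sample_mean - margin <= TRUE_MEAN <= sample_mean + margin:
        covered += 1

coverage = covered / TRIALS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, which is the sense in which a single interval is described as a "95% confidence" interval.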
Critical Value
•The critical value for a 95% confidence level is 1.96 (the z value from the standard normal distribution).
Margin of Error
•The margin of error is calculated as: Margin of error = Critical value × Standard error of the mean, where the standard error is the sample standard deviation divided by the square root of the sample size. The margin of error is 2.4268.
Upper and Lower Bounds
•The lower and upper bounds are calculated as mean ± margin of error. The lower bound of the mean age of patients is 59.3424, while the upper bound is 64.1960.
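The interval can be reproduced from the summary statistics reported in the Calculations table. The sketch below assumes a sample size of 65, inferred from the spreadsheet range C2:C66; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the Calculations table.
sample_mean = 61.7692
sample_sd = 9.9824
n = 65              # assumed from the range C2:C66
z_critical = 1.96   # 95% critical value used in the report

# Standard error of the mean, then the margin of error.
standard_error = sample_sd / sqrt(n)
margin_of_error = z_critical * standard_error

lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error

print(f"SE = {standard_error:.4f}, margin = {margin_of_error:.4f}")
print(f"95% CI: ({lower_bound:.4f}, {upper_bound:.4f})")
```

The results agree with the margin of error and bounds stated above to four decimal places.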
Hypothesis Testing
Null hypothesis: µ = 64
Alternative hypothesis: µ < 64
The alternative is the researcher's claim. The researcher wants to find out if the mean age of patients in the study is less than 64.
This is a left-tailed (one-tailed) test because the alternative hypothesis uses a less-than (<) inequality.
We use a one-sample t-test for this hypothesis because the population standard deviation σ is unknown and must be estimated from the sample.
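Test Statistic
The test statistic is calculated using the formula t = (x̄ − µ)/SE, where x̄ is the sample mean, µ is the hypothesized mean of 64, and SE is the standard error of the mean.
The value of the t statistic from this formula is -1.801668011.
Critical Value
The critical value of the test is 1.669 from the t tables. Because this is a left-tailed test, the rejection region is t < -1.669.
P-value
The p-value of the test is 0.03815397 from the t distribution.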
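The test statistic and p-value can be checked from the reported summary statistics. The sketch below assumes n = 65 (inferred from the range C2:C66), so 64 degrees of freedom, and approximates the t distribution's tail area by numerical integration with Simpson's rule using only the standard library. The helper names are hypothetical, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_lower_tail(t, df, steps=2000):
    """Approximate P(T <= -|t|) using symmetry and Simpson's rule.

    The area between 0 and |t| is integrated numerically and
    subtracted from 0.5. `steps` must be even.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the report; n = 65 is an assumption.
sample_mean, sample_sd, n, mu_0 = 61.7692, 9.9824, 65, 64

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error
p_value = t_lower_tail(t_stat, df=n - 1)

print(f"t = {t_stat:.4f}, one-tailed p = {p_value:.4f}")
```

Small differences from the reported figures in the later decimal places are expected because the inputs here are rounded to four decimals.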
We reject the null hypothesis because the t statistic of -1.80 is less than the negative critical value of -1.669 (equivalently, its absolute value of 1.80 is greater than 1.669).
Using the p-value method, we also reject the null hypothesis because the p-value of 0.038 is less than the significance level of α = 0.05.
In conclusion, there is sufficient evidence at the 5% significance level that the average age of all patients admitted to the hospital with the infectious disease is less than 64 years.
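The two decision rules can be written out explicitly. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the rounded values reported in this section.

```python
def reject_left_tailed(t_stat, t_critical):
    """Critical-value method: reject H0 if t falls below -|critical value|."""
    return t_stat < -abs(t_critical)


def reject_by_p_value(p_value, alpha=0.05):
    """P-value method: reject H0 if the p-value is below alpha."""
    return p_value < alpha


# Rounded values reported in this section.
t_stat, t_critical, p_value = -1.80, 1.669, 0.038

decision_critical = reject_left_tailed(t_stat, t_critical)
decision_p_value = reject_by_p_value(p_value)

print(decision_critical, decision_p_value)
```

Both rules lead to the same decision, as they should when the critical value and the p-value come from the same distribution and significance level.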
Conclusion
The mean age of patients in the sample is approximately 62 years, and the standard deviation of 9.9824 years shows considerable variation in the data.
We are 95% confident that the population mean age is between 59.34 and 64.20 years. The results of the t-test indicate that the mean age of patients is less than 64 years.
From the population, I believe that those who were 64 years old were more likely to have the infectious disease, which informs the method used to treat the patients.
A one-sample t-test was appropriate for this analysis because we are comparing the mean age to a hypothesized value.