assess04

Bella10
  • 2 years ago

PreassessST03.docx

This pre-assessment will make use of the BODY DATA Excel file created in ST3001.

Comparing BMI by smoking status

You are interested in determining if smokers have a BMI that is greater than nonsmokers.

1. Explain which hypothesis test would be appropriate for this situation (assume population variances are not known and sample variances are not equal).

2. Explain whether this is a left-tailed, right-tailed, or two-tailed test and justify your choice.

ST3004_B_Pacheco_project.docx

Hypothesis testing

Part 1- Hypothesis Testing in Research

Whitley, D., & Fuller-Thomson, E. (2017). African-American Solo Grandparents Raising Grandchildren: A Representative Profile of Their Health Status. Journal of Community Health, 42(2), 312–323. https://doi.org/10.1007/s10900-016-0257-8

Hypothesis Statements

Null Hypothesis (H₀): There is no significant relationship between the gender of the child and parenting status among solo African-American grandparents and single African-American parents.

Alternative Hypothesis (H₁): There is a significant relationship between the gender of the child and parenting status among solo African-American grandparents and single African-American parents.

Explanation of Non-significance

The p-value for the relationship between child gender and parenting status is 0.47, which exceeds the alpha level of 0.05. This means the observed difference in child gender (boys 51.2%, girls 48.8% for solo grandparents, and boys 48.1%, girls 51.9% for single parents) is not statistically significant. A p-value of 0.47 indicates that, if the null hypothesis were true, a difference at least this large would arise by chance in roughly 47% of samples, so we fail to reject the null hypothesis.
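The comparison of a p-value with the alpha level can be written as a one-line decision rule. The sketch below is illustrative only; `decide` is a hypothetical helper name, and the two p-values are the ones reported in this write-up.

```python
def decide(p_value, alpha=0.05):
    """Return the hypothesis-test decision for a given p-value and alpha."""
    # Reject H0 only when the p-value is strictly below the significance level.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# p-values quoted in this write-up
print("child gender:", decide(0.47))
print("arthritis:", decide(0.001))
```

Note that "fail to reject H0" is deliberately not worded as "accept H0": a large p-value means the data are compatible with the null hypothesis, not that the null hypothesis has been shown to be true.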

Conclusion Statement

Based on the p-value of 0.47, we conclude that there is no statistically significant relationship between the child's gender and the parenting status of solo African-American grandparents and single African-American parents. Therefore, the null hypothesis is retained: the data do not provide sufficient evidence that parenting status depends on the child's gender.

Type II Error Explanation

A Type II error occurs when we fail to reject the null hypothesis when it is false. In this context, a Type II error would mean that there is indeed a significant relationship between child gender and parenting status, but we mistakenly concluded that no such relationship exists due to our failure to detect it. This could happen if the sample size was too small to reveal a true effect or other factors masked the relationship.

Statistically significant relationship

Null Hypothesis (H₀): There is no significant relationship between arthritis diagnosis and parenting status among solo African-American grandparents and single African-American parents.

Alternative Hypothesis (H₁): There is a significant relationship between arthritis diagnosis and parenting status among solo African-American grandparents and single African-American parents.

Explanation of Significance

The chi-square test of independence is used to detect a possible relationship between two categorical variables. Here, the variables of interest are arthritis diagnosis (Yes/No) and parenting status (solo African-American grandparent or single African-American parent). When conducting statistical hypothesis testing, we begin with a null hypothesis, which typically states that there is no effect or difference, and an alternative hypothesis, which is what we hope to demonstrate (Kelter, 2020). The p-value is the probability of obtaining the observed data, or data even more extreme, if the null hypothesis is correct. In this instance, the association between arthritis diagnosis and parenting status has a p-value of 0.001, which is less than the 0.05 alpha threshold and indicates that the relationship is statistically significant.

The significance level, or alpha level (α), is the cutoff below which a p-value leads to rejection of the null hypothesis. Typically, α is set at 0.05, meaning we accept a 5% probability of rejecting the null hypothesis when it is actually true (a Type I error).

The p-value in this instance is 0.001, which is less than the 0.05 alpha threshold.
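To make the mechanics of the chi-square test of independence concrete, here is a minimal sketch in plain Python. The 2x2 counts are hypothetical and were invented purely for illustration; they are not taken from Whitley and Fuller-Thomson (2017). The function name `chi_square_independence` is also an assumption, and no continuity correction is applied.

```python
import math


def chi_square_independence(table):
    """Return (chi2, df) for a table of observed counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# HYPOTHETICAL counts: rows = parenting status, columns = arthritis (yes, no)
observed = [
    [60, 40],  # solo grandparents (made-up numbers)
    [30, 70],  # single parents (made-up numbers)
]
chi2, df = chi_square_independence(observed)

# For df = 1 the upper-tail probability has the closed form erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None
print(round(chi2, 4), df, p_value)
```

With these invented counts the p-value is far below 0.05, so the decision would be to reject the null hypothesis of independence; the real study's counts would be needed to reproduce its reported p-value.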

Conclusion Statement

Based on the p-value of 0.001, we reject the null hypothesis and conclude that there is a statistically significant relationship between arthritis diagnosis and parenting status. Solo African-American grandparents are significantly more likely to be diagnosed with arthritis compared to single African-American parents.

Type I Error Explanation

A Type I error occurs when we reject the null hypothesis when it is true. In this context, a Type I error would mean that we incorrectly concluded that there is a significant relationship between arthritis diagnosis and parenting status when, in reality, no such relationship exists. This could result in overestimating the health disparities between the two groups.

Part 2- Performing a hypothesis test

Explain which hypothesis test would be appropriate for this situation (assume population variances are unknown and sample variances are not equal).

Because the population variances are unknown and the sample variances are not equal, an independent samples t-test that does not assume equal variances (Welch's t-test, labeled "t-Test: Two-Sample Assuming Unequal Variances" in Excel) is a suitable analysis for this research question.

The research question "Is the BMI of smokers higher than that of nonsmokers?" compares the means of two independent groups (smokers and nonsmokers) on a continuous outcome (BMI).

The independent samples t-test is used to compare the means of a continuous variable, here BMI, between two independent groups, namely smokers and nonsmokers.

This test will determine whether a statistically significant difference exists between the two group means.

The null hypothesis (H₀) for this test states, "There is no difference in mean BMI between smokers and nonsmokers." The alternative hypothesis (H₁) states that smokers have a higher mean BMI than nonsmokers.

The test yields a p-value. If the p-value for the right tail falls below the predetermined significance level (often 0.05), we reject the null hypothesis and infer that smokers have a higher mean BMI than nonsmokers.

Explain if this is a left-tailed, right-tailed, or two-tailed test and justify your choice.

Our research hypothesis aims to ascertain whether smokers have a higher average BMI than nonsmokers. This corresponds to a right-tailed test, as our emphasis is on the upper tail of the distribution, where the BMI of smokers would exceed that of nonsmokers (Keane & Neal, 2023). The justification for the right-tailed hypothesis is the suspicion that smokers may have a higher BMI attributable to certain lifestyle characteristics. This is not a two-tailed test, as we are not examining whether smokers have a BMI that is either higher or lower; our emphasis is solely on the "greater than" direction.

The distinction between the tails of a hypothesis test pertains to the direction of the test being conducted.

A left-tailed test checks for a decrease, that is, a value in the lower (left) side of the distribution. A right-tailed test checks for an increase, that is, a value in the upper (right) side of the distribution. A two-tailed test checks for any significant difference, whether lesser or greater, and so evaluates both directions. A left-tailed test is used when the question is whether the parameter is less than a specified value (Xu et al., 2022). It is called "left-tailed" because the rejection region for the null hypothesis lies in the left tail of the distribution.
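The three cases can be summarized by their rejection regions for a test statistic $t$ at significance level $\alpha$. The notation is introduced here for illustration and does not appear in the original write-up: $\theta$ is the parameter, $\theta_0$ the hypothesized value, and $t_{\alpha}$ the upper-$\alpha$ critical value of a symmetric reference distribution such as Student's t.

```latex
\begin{array}{lll}
\text{Left-tailed:}  & H_1 : \theta < \theta_0,   & \text{reject } H_0 \text{ if } t < -t_{\alpha} \\
\text{Right-tailed:} & H_1 : \theta > \theta_0,   & \text{reject } H_0 \text{ if } t > t_{\alpha} \\
\text{Two-tailed:}   & H_1 : \theta \neq \theta_0, & \text{reject } H_0 \text{ if } |t| > t_{\alpha/2}
\end{array}
```

For the BMI question, $\theta$ would be the difference in mean BMI (smokers minus nonsmokers) and $\theta_0 = 0$, which places it in the right-tailed row.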

t-Test: Two-Sample Assuming Unequal Variances

                               BMI SMOKERS     BMI NON SMOKERS
Mean                           28.10641026     30.34761905
Variance                       28.72008825     73.95511521
Observations                   78              63
Hypothesized Mean Difference   0
df                             99
t Stat                         -1.804789162
P(T<=t) one-tail               0.03707391
t Critical one-tail            1.660391156
P(T<=t) two-tail               0.074147819
t Critical two-tail            1.984216952

Conclusion Statement

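The calculated t-statistic is -1.805. For a right-tailed test at the 5% significance level (α = 0.05), the null hypothesis is rejected only when the t-statistic exceeds the critical value of 1.660. Here the t-statistic is negative, so it lies in the opposite tail and cannot fall in the right-tail rejection region. The one-tail p-value of 0.037 reported by Excel refers to the tail on the side of the observed statistic (the left tail); the p-value for the right-tailed test is 1 − 0.037 = 0.963, which is greater than 0.05. Therefore, we fail to reject the null hypothesis: there is not enough evidence to support the claim that the BMI of smokers is greater than the BMI of nonsmokers.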
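The t Stat and df values in the Excel output can be reproduced from the reported means, variances, and sample sizes. The following is a minimal sketch, assuming the standard Welch formulas for the test statistic and the Welch-Satterthwaite degrees of freedom; `welch_t` is a hypothetical helper name, not an Excel or library function.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    v1 = var1 / n1  # squared standard error of the first group mean
    v2 = var2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics from the table: smokers first, nonsmokers second
t_stat, df = welch_t(28.10641026, 28.72008825, 78,
                     30.34761905, 73.95511521, 63)
print(round(t_stat, 3), round(df, 2))
```

The computed degrees of freedom come out slightly above 99; Excel reports the value rounded to a whole number. The sign of the statistic is negative because the smokers' mean is the smaller of the two.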

Decision in Context of the Research Question

The research question asked whether smokers have a greater BMI than nonsmokers. Since we failed to reject the null hypothesis, the data do not support that claim. In this sample the mean BMI of smokers (28.11) is actually lower than that of nonsmokers (30.35), as the negative t-statistic indicates. However, the test was set up as right-tailed, so it cannot be used to conclude that smokers have a significantly lower BMI; the two-tailed p-value of 0.074 does not reach significance at α = 0.05 either.
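Which tail a p-value is taken from matters when the statistic is negative and the hypothesis is right-tailed. The sketch below computes the Student's t tail areas directly using only the standard library (Simpson's rule applied to the t density). It assumes df = 99 and the t statistic from the table; `t_pdf` and `t_cdf` are hypothetical helper names written for this illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


t_stat, df = -1.804789162, 99
left_tail_p = t_cdf(t_stat, df)        # P(T <= t): evidence for "smokers lower"
right_tail_p = 1 - left_tail_p         # P(T >= t): evidence for "smokers higher"
two_tail_p = 2 * min(left_tail_p, right_tail_p)
print(round(left_tail_p, 4), round(right_tail_p, 4), round(two_tail_p, 4))
```

The left-tail area agrees with the one-tail figure in the Excel output, while the right-tail area, which is the one relevant to "smokers have a greater BMI", is close to 1.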

Explain which hypothesis test would be appropriate for this situation.

Employing a one-way ANOVA test is most suitable for this scenario, as it compares the mean weights across all four regions of the country and assesses whether any differences among them are statistically significant. The null hypothesis (H0) posits that the mean weights of all four regions are equal. The alternative hypothesis (Ha) posits that at least one region has a mean weight that differs from the others.

Anova: Single Factor

SUMMARY

Groups   Count   Sum      Average       Variance
NORTH    32      2494.4   77.95         294.8232258
SOUTH    38      3124.3   82.21842105   379.0647866
EAST     32      2809.9   87.809375     466.7021673
WEST     39      3239.5   83.06410256   383.1786775

ANOVA

Source of Variation   SS            df    MS            F             P-value       F crit
Between Groups        1570.977595   3     523.6591983   1.374526442   0.253217791   2.670686875
Within Groups         52193.47404   137   380.974263
Total                 53764.45163   140

Conclusion Statement

Based on the ANOVA single-factor analysis, the F-value (1.375) is less than the critical F-value (2.671), and the p-value (0.253) is greater than 0.05. This indicates that we fail to reject the null hypothesis. Therefore, there is no statistically significant difference in the average weights across the four regions (NORTH, SOUTH, EAST, and WEST). The data do not provide evidence that the average weights differ between regions.
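The sums of squares and the F-value in the ANOVA table can be rebuilt from the SUMMARY rows alone. This is a minimal sketch assuming the usual one-way ANOVA decomposition; the dictionary layout and variable names are illustrative choices, not part of the Excel output.

```python
# (count, mean, variance) for each region, copied from the SUMMARY table
groups = {
    "NORTH": (32, 77.95, 294.8232258),
    "SOUTH": (38, 82.21842105, 379.0647866),
    "EAST": (32, 87.809375, 466.7021673),
    "WEST": (39, 83.06410256, 383.1786775),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups variation: spread of the group means around the grand mean
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-groups variation: pooled from each group's sample variance
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

f_value = (ss_between / df_between) / (ss_within / df_within)
print(round(ss_between, 2), round(ss_within, 2), round(f_value, 4))
```

Comparing the resulting F-value with the tabulated F crit of 2.671 gives the same decision as the p-value comparison: the statistic falls short of the critical value, so the null hypothesis is not rejected.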

Decision in Context

Given that the research question asks whether all four regions have equal average weights, the ANOVA results do not provide sufficient evidence to suggest a difference. Thus, the decision is that we cannot conclude the average weights differ among the four regions; the data are consistent with the null hypothesis that the regions have equal average weight.

According to question 4, the ANOVA test failed to reject the null hypothesis that the mean weights of all four regions are equal. The box plots are visually consistent with this result. Examining the box plots indicates that the central values reside within a narrow range, suggesting that the mean weights may be comparable.

References

Keane, M., & Neal, T. (2023). Instrument strength in IV estimation and inference: A guide to theory and practice. Journal of Econometrics. https://doi.org/10.1016/j.jeconom.2022.12.009

Kelter, R. (2020). Bayesian alternatives to null hypothesis significance testing in biomedical research: a non-technical introduction to Bayesian inference with JASP. BMC Medical Research Methodology, 20(1). https://doi.org/10.1186/s12874-020-00980-6

Xu, S., Li, S., Huang, W., & Wen, R. (2022). Detecting spatiotemporal traffic events using geosocial media data. Computers, Environment and Urban Systems, 94, 101797. https://doi.org/10.1016/j.compenvurbsys.2022.101797

image1.png

asSTData_B_Pacheco.xlsx