
Using the Gapminder datasets provided (files located at the bottom of this screen), complete the following problems in RStudio.

1. Perform a Cox proportional hazards test to identify risk factors and compare survival curves among three groups: breast cancer deaths per 100,000 women, cervical cancer deaths per 100,000 women, and colon/rectum cancer deaths per 100,000 women. Use the data sets indicator breast female mortality.xlsx, indicator cervix female mortality.xlsx, and indicator colon and rectum female mortality.xlsx (see Chapter 14 in Introductory Statistics with R). Please use the years 2000, 2001, and 2002.

2. Perform a Kaplan-Meier log-rank test to determine the survival curve for the data set indicator_estimated incidence infectious tb per 100000.xlsx. Please use the date range 1998-2007 to calculate the survival curve.

3. Perform a chi-square analysis to compare the observed and expected distributions of the estimated number of new infectious TB cases per 100,000 and the total number of new cases reported. Use the data sets indicator_reported incidence infectious tb.xlsx and indicator_estimated incidence infectious tb per 100000.xlsx. Please use the years 2005-2007 from each data set.

· Present and explain your findings for all required tests (Cox proportional hazards test, Kaplan-Meier log-rank test, and chi-square analysis) in a Word document with an introduction, body, and conclusion.

· Your submission should be as many pages as you need to display your findings.
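
As a starting point, the three analyses could be sketched in R roughly as follows. This is a minimal illustration, not a worked solution. It assumes the survival package is installed (it ships with standard R distributions), and it uses small made-up numbers rather than the Gapminder values. The column names time, status, and group, and the counts in the chi-square table, are hypothetical. The assignment does not say how the country-by-year rates should be turned into a time-to-event layout, so that reshaping step is left to you; one common way to load the .xlsx files is readxl::read_excel.

```r
library(survival)  # provides Surv(), coxph(), survfit(), survdiff()

set.seed(1)

# Toy stand-in data (NOT Gapminder values): one row per hypothetical subject.
# time   = follow-up time, status = 1 if the event occurred, 0 if censored,
# group  = which cancer type the row belongs to (hypothetical layout).
n <- 30
toy <- data.frame(
  time   = c(rexp(n, 0.10), rexp(n, 0.15), rexp(n, 0.20)),
  status = rbinom(3 * n, 1, 0.8),
  group  = factor(rep(c("breast", "cervix", "colon_rectum"), each = n))
)

# Problem 1 (sketch): Cox proportional hazards model with group as the covariate.
# The hazard ratios in the summary are relative to the first factor level.
cox_fit <- coxph(Surv(time, status) ~ group, data = toy)
print(summary(cox_fit))

# Problem 2 (sketch): Kaplan-Meier curves per group, then a log-rank test.
# A log-rank test needs at least two groups; the grouping here is an assumption.
km_fit  <- survfit(Surv(time, status) ~ group, data = toy)
logrank <- survdiff(Surv(time, status) ~ group, data = toy)
print(km_fit)
print(logrank)

# Problem 3 (sketch): chi-square test on a table of counts (made-up numbers).
counts <- matrix(
  c(120, 135, 150,
    100, 110, 125),
  nrow = 2, byrow = TRUE,
  dimnames = list(source = c("estimated", "reported"),
                  year   = c("2005", "2006", "2007"))
)
chi <- chisq.test(counts)
print(chi$observed)  # observed counts
print(chi$expected)  # expected counts under independence
print(chi$p.value)
```

In RStudio, plot(km_fit) draws the Kaplan-Meier curves for the write-up, and chi$observed and chi$expected give the two distributions that Problem 3 asks you to compare.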