Data Visualizations using R: Titanic Dataset
Analyzing & Visualizing Data - Dr. Timothy McGee
Aug 5, 2020
Dataset
The Titanic disaster claimed about 1,500 lives, while about 700 people survived.
Hypothesis: Is survival biased toward any of the following passenger categories: Gender, Relations and/or Class?
The Titanic dataset encodes ticket class in the ‘pclass’ column, where 1st = Upper, 2nd = Middle and 3rd = Lower.
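The ‘pclass’ coding described above can be made readable before plotting. The following is a minimal sketch in base R, assuming a data frame with a numeric `pclass` column; the data frame name `passengers`, the new column `class_label`, and the toy rows are illustrative assumptions, not the deck's actual data.

```r
# Toy rows for illustration only; these are NOT real Titanic records.
passengers <- data.frame(pclass = c(1, 3, 2, 3, 1))

# Map the numeric ticket class to an ordered, labelled factor
# (1 = Upper, 2 = Middle, 3 = Lower, as described in the slide).
passengers$class_label <- factor(
  passengers$pclass,
  levels = c(1, 2, 3),
  labels = c("Upper", "Middle", "Lower"),
  ordered = TRUE
)

# Count passengers per labelled class.
print(table(passengers$class_label))
```

Using a labelled factor means plot axes and legends show "Upper", "Middle" and "Lower" rather than 1, 2 and 3.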
Overall Survival Rate
64% of passengers did not survive, while 36% did.
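An overall survival split like this one can be computed with `table()` and `prop.table()`. The sketch below assumes a 0/1 vector named `survived` (1 = survived); the name and the toy values are assumptions for illustration and do not reproduce the percentages on the slide.

```r
# Toy outcome vector for illustration only (1 = survived, 0 = did not).
survived <- c(0, 1, 0, 0, 1)

# Count each outcome, then convert the counts to proportions.
rates <- prop.table(table(survived))

# Express the proportions as whole percentages.
print(round(100 * rates))
```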
Survival Rate by Ticket Class & Gender
67% of females survived, while only 23% of males did. 62% of 1st class passengers survived, while only 21% of 3rd class passengers did.
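Per-group rates of this kind are the mean of a 0/1 survival flag within each group. Here is a minimal base R sketch, assuming columns named `sex`, `pclass` and `survived`; only ‘pclass’ is named in the deck, so the other column names and all toy rows are assumptions.

```r
# Toy rows for illustration only; these are NOT real Titanic records.
passengers <- data.frame(
  sex      = c("female", "female", "female", "female",
               "male", "male", "male", "male"),
  pclass   = c(1, 2, 3, 3, 1, 2, 3, 3),
  survived = c(1, 1, 1, 0, 1, 0, 0, 0)
)

# The mean of a 0/1 flag within a group is that group's survival rate.
rate_by_sex   <- tapply(passengers$survived, passengers$sex, mean)
rate_by_class <- tapply(passengers$survived, passengers$pclass, mean)

print(round(100 * rate_by_sex))
print(round(100 * rate_by_class))
```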
Survival Rate by Ticket Class & Gender
97% of first-class females survived, while only 3% of third-class males did.
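Combining gender and class requires a two-way breakdown. The sketch below shows one possible way to build it in base R, passing two grouping vectors to `tapply()` and drawing the result with `barplot()`. The column names `sex` and `survived`, the object name `rate`, and the toy rows are assumptions; the chart on the slide may have been produced differently.

```r
# Toy rows for illustration only; these are NOT real Titanic records.
passengers <- data.frame(
  sex      = c("female", "female", "female", "female",
               "male", "male", "male", "male"),
  pclass   = c(1, 2, 3, 3, 1, 2, 3, 3),
  survived = c(1, 1, 1, 0, 1, 0, 0, 0)
)

# Survival rate for every sex-by-class combination:
# rows are sexes, columns are ticket classes.
rate <- tapply(
  passengers$survived,
  list(passengers$sex, passengers$pclass),
  mean
)
print(round(100 * rate))

# Grouped bar chart: one cluster per ticket class, one bar per sex.
barplot(
  100 * rate,
  beside = TRUE,
  legend.text = TRUE,
  xlab = "Ticket class",
  ylab = "Survival rate (%)"
)
```

A grouped layout (`beside = TRUE`) keeps the female and male bars side by side within each class, which makes the gap between groups easy to compare.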
Survival Rate by Age
Survival Rate by Age, Gender & Class
Conclusion
A bias by gender and class is apparent from the graphs, though it may reflect only a correlation.
Further investigation is needed to verify whether there was bias toward particular categories of survivors (Class 3, Middle Aged, Female & Children).
Results from further investigation could show whether there is not only correlation but also causation between the independent variables (Age, Sex and Pclass) and the dependent variable (survival).