Excel Data Analysis Project
Time Management
a statistical analysis by Sean Trippel
American Time Use Survey
The American Time Use Survey (ATUS) measures the amount of time people spend doing various activities such as paid work, childcare, volunteering, and socializing.
Conducted by U.S. Bureau of Labor Statistics annually
11,592 people were interviewed in 2014
24 hours of activity per respondent
Raw data from survey available to download from www.bls.gov
Data Highlights
Respondent Demographics
Age, Gender
Location (State)
Industry
Occupation
Earnings
Children
Public, Private, Self-Employed
Time Spent
Sleeping
Grooming
Household Activities
Caring for Family Members
Working
Exercising
Education
Shopping
Eating
Traveling
Socializing
Volunteering
Data Challenges
Large sample size led to challenges in managing data
11,592 respondents
223,250 activity records
ATUS raw data is numerically coded which requires decoding to gain meaningful data; for example:
TRMJIND1 = “Industry”
Under TRMJIND1, 1 = “Agriculture”, 2 = “Construction”, 3 = “Manufacturing”, etc.
RESULT
Most raw data required filtering and v-lookups in Excel prior to exporting to MiniTab.
Hypothesis to Consider using ATUS data:
Test 1: Does one gender sleep significantly more than the other?
Two-sample analysis
Test 2: Are there industries that work more hours in a day than others?
Analysis of Variance
Test 3: Is there a correlation between sleep hours and work hours?
Linear Regression
Test 4: Is there a correlation between annual earnings, sleep hours, and work hours?
Multiple Linear Regression
Test 1: Gender Sleep Differences?
Null Hypothesis:
There is not a difference in the amount of sleep time between Males and Females.
Alternative Hypothesis:
There is a difference in the amount of sleep time between Males and Females.
Two-sample Analysis
Binary Categorical Variable: Gender
Continuous Variable: Sleep Hours
Test 1: Gender Sleep Differences?
Test 1: Gender Sleep Differences?
Test 1: Gender Sleep Differences?
Males have lower average hours of sleep (both Mean and Median) by 12 – 15 minutes per night (0.20 - 0.25 hours).
Males have a lower quartile that is 5 minutes (0.08 hours) less than Females
Males and Females share the same upper quartile of 10 hours of sleep
Both genders contained respondents with zero sleep (minimum) and some with nearly 23 hours of sleep (maximum)
Test 1: Gender Sleep Differences?
It is estimated that a Female will sleep 13 minutes (0.22 hours) longer than a Male
Small Standard Error implies more accurate estimate
The Confidence Interval does not contain zero – statistically significant
P-value is less than 1%, thus there is sufficient evidence to reject the null hypothesis
CONCLUSION
Reject the null hypothesis that there is not a difference in the amount of sleep time between Males and Females.
Test 2: Industries that work more hours?
Null Hypothesis: There is not a difference in the amount of hours worked in a day amongst industries.
Alternative Hypothesis: There is a difference in the amount of hours worked in a day amongst industries.
Analysis of variance
Categorical Variable: Industry
Continuous Variable: Hours Worked
Note: For analysis, only included employed respondents who worked at least 5 hours in survey period. By excluding low hour respondents, results should reduce impact of employees who did not work a “full” day.
Test 2: Industries that work more hours?
Test 2: Industries that work more hours?
Test 2: Industries that work more hours?
Test 2: Industries that work more hours?
Agriculture has the highest average (both median and mean)
Approx. 10 hours
Leisure and hospitality have the lowest average (both median and mean)
Approx. 8 hours
Agriculture has a notably high upper quartile at 12.4 hours
Minimum hours are all approx. 5 hours due to exclusion of respondents with less than 5 hours
The most respondents came from the Education and Health Services industry
744 respondents
The least respondents came from the Mining, Oil & Gas industry
24 respondents
Test 2: Industries that work more hours?
Interval Plot
16
Test 2: Industries that work more hours?
P-value is less than 1%, thus there is sufficient evidence to conclude that not all industries are identical in respect to hours worked.
Thus, perform Pairwise Comparison (Tukey Simulation).
Test 2: Industries that work more hours?
Conclusion:
Reject the Null Hypothesis that there is not a difference in work hours amongst Industries
Hours worked in Agriculture are significantly different than 8 of the other 12 industries.
Hours worked in Leisure and hospitality are significantly different than 7 of the other 12 industries.
3. No other industry is significantly different than half (6) of the other industries.
Test 3: Does working more result in less sleep?
Regression Analysis
Two Continuous Variables:
Input: Work Hours
Output: Sleep Hours
Null Hypothesis:
Working more hours is not associated with sleeping less hours.
Alternative Hypothesis:
Working more hours is associated with sleeping less hours.
Test 3: Does working more result in less sleep?
Test 3: Does working more result in less sleep?
RESULT
P-value is less than 1%, so the regression is statistically significant.
R-square of 11.97% meaning 11.97% of the variability in sleep hours can be explained by variations in work hours
Regression equation:
Work Hours = 12.149 – 0.4345 Sleep Hours
Test 3: Does working more result in less sleep?
Test 3: Does working more result in less sleep?
When working 12 hours:
Predict between 3.7 and 9.2 hours of sleep
When working 8 hours:
Predict between 4.8 and 11.0 hours of sleep
Result: Negative relationship established between work hours and sleep hours
Relationship is difficult to predict due to small R square with data spread out along regression line.
Test 4: Does working more result in less sleep or more money?
Regression Analysis
Three Continuous Variables:
Input 1: Work Hours
Input 2: Sleep Hours
Output: Annual Earnings
Null Hypothesis:
There is not a correlation between annual earnings, sleep hours, and work hours?
Alternative Hypothesis:
There is a correlation between annual earnings, sleep hours, and work hours?
Test 4: Does working more result in less sleep or more money?
$0 - $35,000
$35,001 - $65,000
$65,001 - $95,000
$95,001 - $125,000
$125,001+
Graphical representation of wage data:
Test 4: Does working more result in less sleep or more money?
All variables have p-values of less than 1%, so regression model is statistically significant.
Thus, use regression model.
Test 4: Does working more result in less sleep or more money?
RESULT
R-square of 3.88% meaning 3.88% of the variability in annual earnings can be explained by variations in work hours and sleep hours
Work hours have a positive relationship with annual earnings
Sleep hours have a negative relationship with annual earnings
Conclusions
Large data sets led to statistically significant findings in data (small p-values)
Recap of data findings:
Females sleep more than Males
Some industries work more hours than other industries
Negative relationship between work hours and sleep hours
Positive relationship between work hours and annual earnings
Relationships returned low R-squared values, so although a relationship is established, the input variables cannot be used as accurate predictors of the output variables.