
What Leads to Happiness? An analysis on the psychological, geographical, and institutional effects

Prepared for Dr. Shobhana Stoyanov

University of California, Berkeley

Group 66 Jacob Channell Emma Zazueta

Calvin Cox Diego Mason Kobe Christo

Zoya Ali

12/12/2019

Exploratory Graphics

Formal Data Analysis

Psychological Impact

Demographic Impact

Calling back to Figure 5 from our exploratory analysis, we can observe what happens when we plot only the average happiness of each continent.

This graph shows the average happiness score for each continent, with the regression line and the average line drawn through it. From the data computed, the average world happiness computed from the average continent scores is 5.944915. These happiness scores have an r of 0.7377331. This implies that there is a strong linear trend between the points, and that happiness and continent are highly correlated.

Hypothesis Test on Whether Continent Affects Happiness

Continent Life Ladder Averages

Continent        Average
Africa           4.560497
Asia             5.259712
Europe           6.297638
North America    6.155330
South America    5.944915
Oceania          7.273640

From the World Population Report, there are only 31 nations that are classified as First World, and after adding the development status to the table, we observed that 30 are present in the report. Furthermore, 22 of the 30 (73.3%) First World countries are located in Europe, while none are located in Africa or South America. We hypothesize that people who are raised in different environments develop different levels of happiness because they are given different opportunities and have varying levels of hardship growing up. This affects overall happiness because people who have a difficult time growing up and being successful are more likely to be unhappy. We suspect that the continent you live in affects your overall happiness; therefore, to investigate this, we performed a hypothesis test on average overall happiness in Europe and Africa compared to the average overall happiness of all nations in the dataset. All values are calculated using RStudio.

Null Hypothesis: Where you live does not affect overall happiness (difference is due to chance variation). x̄ = μ

Alt Hypothesis: Where you live affects overall happiness (difference is real). x̄ > μ for Europe, x̄ < μ for Africa

Population average = μ = 5.502
Population sd = σ = 1.095348

We reject the null hypothesis for both Europe and Africa. It is highly likely that the difference in average overall happiness between each of these continents and all nations in the dataset cannot be explained by chance variation. After initial observations, we hypothesized that people who live in First World and Developing World countries have different average overall happiness compared to all countries. A possible explanation for this difference in average overall happiness is the difference in quality of life between people in First World countries and people in Developing World countries.

Hypothesis Test on Whether Development Status Affects Happiness

Calling back to Figures 4 and 5 from our exploratory analysis, we want to find out whether Development Status does indeed have an effect on Happiness. In Figure 4, we see that there is a notable difference in average happiness depending on whether people live in a First World country or a Developing World country. It is possible that the difference in average overall happiness between First World countries and Developing World countries is due to chance; therefore, to investigate this, we performed a hypothesis test comparing the average overall happiness of people in First World countries to the average overall happiness of people in Developing World countries. All values are calculated using RStudio.

One-sided z-test for difference between means

Null Hypothesis: People who live in Developing World countries are as happy as people in First World countries (difference is due to chance variation). μ(Developing) - μ(First) = 0

Alternative Hypothesis: People who live in Developing World countries are less happy than people in First World countries (difference is real). μ(Developing) - μ(First) < 0

Developing World Average: 5.170529
Developing World SD: 0.9396434*

First World Average: 6.768248
First World SD: 0.7119897*

*Population SD is unknown, so we use the bootstrapping method (use sample SD).

The population distribution is unknown, so we cannot use a t-test; we must use a two-sample z-test for the difference between means.

pnorm(-10.47011) = 5.925261e-26 = approximately 0%

With an extremely low p-value of approximately 0, which is lower than a standard significance level of 5%, we reject our null hypothesis that people who live in Developing World countries are as happy as people who live in First World countries. It is highly likely that the difference between average happiness in Developing World countries and First World countries cannot be explained by chance variation. It is possible that the difference in average happiness is due to the difference in quality of life between countries that are developed and countries that are not.

Institutional Impact

The correlation coefficient between a country’s Freedom and Ladder score is 0.546777, suggesting an association between the freedom one has and how high one scores on the ladder. High freedom scores indicate a higher ladder score. The countries with the lowest Freedom scores tend to have the lowest Ladder scores, indicating that countries with less freedom are likely to be less happy. Conversely, countries with the highest Freedom scores have higher Ladder scores, indicating that countries in which people are more satisfied with their freedom tend to have higher happiness.

As can be seen in the graph on the right, the correlation coefficient between a country’s ladder score and level of corruption is 0.1900709. This correlation is too weak to support any conclusion that the level of corruption has a significant impact on happiness score.

Conclusion

We explored two datasets to determine which variables affect the average happiness of a country’s citizens. We narrowed our focus to demographic, institutional, and psychological variables and their impacts on overall happiness. From our initial data exploration, we saw that the continent in which a country lies may affect the overall happiness of its citizens. We constructed a hypothesis test that led us to reject the null hypothesis that the location of a country has no effect on happiness, and to conclude that continental location does, in fact, have an impact on happiness.

Once this association was supported, we sought to figure out which institutional variables could explain the association between happiness and the continent one inhabits. We believed that government type, the amount of freedom, corruption, and economic inequality would be the major variables that could affect happiness, and that the association was perhaps more institutional than geographical. We created new regressions to see if there was a correlation between these variables. We found statistically significant increases in happiness with institutional factors, such as freedom. We also discovered that the equal distribution of economic resources doesn’t affect the happiness index as much as we initially hypothesized.

These correlations strengthen the claim that where an individual lives, and the institutions they live under, are associated with happiness. Countries that are closer together—on a continental basis—share similar beliefs, social norms, and government types. Consequently, we can conclude that continental region has a great effect on people’s happiness, most specifically through the institutional and social structures that the countries share within specific continents and the implications that come with them.

Based on the results of our data analysis, we have identified that although demographic and institutional factors are associated with an individual’s happiness, there is not much one can do to improve these factors. It is very hard for a family, let alone an individual, to leave where they live in order to seek happiness. We do see this in some scenarios throughout world history, however, such as the gold rush and the immigration to the United States in the late 1800s, where some individuals did move. Instead, we looked at the impact of psychological variables that could help to improve one’s happiness.

The largest correlation we found was r = .82, between an individual’s social circle and their happiness score. The association is that the stronger one’s support group of friends and family, the happier one is likely to be. Given the results, we would recommend that people become aware of the way in which they relate to others and of the social circles that they are a part of, to make sure that they are spending time with the right people: those who have a positive effect on their state of mind. We would further recommend that people change their social circles to improve their happiness, should they not be satisfied with the social interactions that they currently have.

Works Cited

PromptCloud. “World Happiness Report 2019.” Kaggle, 20 Mar. 2019, www.kaggle.com/PromptCloudHQ/world-happiness-report-2019.

“World Happiness Report 2019.” 2019 | The, 20 Mar. 2019, worldhappiness.report/ed/2019/.

“World Map.” World Atlas - Maps, Geography, Travel, 11 Aug. 2015, www.worldatlas.com/cntycont.htm.

First World Countries Population. (2019-10-24). Retrieved 2019-12-12, from http://worldpopulationreview.com/countries/first-world-countries/

Introduction

The World Happiness Report is a survey that gathers data from 156 countries and creates a ranking based on their individual happiness scores. This score is driven by distinct factors, among them the country’s social context and the political institutions that have governed its region. Since the world is a rapidly changing place, the goal of the World Happiness Report is to show how happiness has changed over the past decades.

The purpose of this project is to find the underlying factors of happiness through the relationship that demographic, institutional, and psychological factors have with a country’s happiness score. We aim to draw insights on what leads to happiness through the analysis of the effects of psychological, geographical, and institutional factors. We want to answer general questions such as: Does where you live, GDP, or confidence in the government determine your happiness? What affects happiness the most? How important is social support to happiness?

Exploratory Data Analysis

As we began to look at the two datasets, a series of questions came to mind, and we used these as a guide for our first pass through the data. A few of them were:

1. To what extent is an underdeveloped country’s happiness affected in comparison to a developed country?
2. If a country’s happiness score were translated to a letter grade, would any country achieve excellence?
3. Is happiness more important than school? Or does school increase wealth, which increases future happiness, so that the marginal benefit outweighs the marginal cost?
4. How was the sample taken for the ladder scores, and was there any bias in the responses that could skew the data?
5. Where is corruption most concentrated, and how happy are the most corrupt countries in comparison to the least corrupt?
6. To what extent do demographics affect a country’s happiness score?

Variable Descriptions and Summary Statistics

SCORE_2018: The cumulative numerical value represented by the level of happiness.
GDP: The measure of a country’s economic output that accounts for its number of people.
HEALTHY LIFE EXPECTANCY: The numerical value indicating the health status of the individuals.
LADDER: The ranking of the statistics of the scores of happiness and well-being.
SOCIAL SUPPORT: The national average of the binary responses representing support from relatives and friends.
POSITIVE AFFECT: A numerical value of the average of the binary responses measuring laughter and enjoyment.
NEGATIVE AFFECT: A numerical value of the average of the binary responses measuring worry, sadness and anger.
CORRUPTION: A numerical value of the corruption perception at the national level.
FREEDOM: A numerical value representing the average satisfaction in the amount of freedom people feel they have on a national level.

Summary Statistics for the Variables we used to make our Analysis

Variable           Min      1st Quartile   Median   3rd Quartile   Max       NAs
Score 2018         2.694    4.702          5.49     6.295          7.858     21
GDP                1.000    38.750         76.50    114.250        152.000   4
Life Expectancy    1.000    38.250         75.50    112.750        150.000   6
Ladder             1.000    39.750         78.50    117.250        156.000   0
Social Support     1.000    39.500         78.00    116.500        155.000   1
Positive Affect    1.000    39.500         78.00    116.500        155.000   1
Negative Affect    1.000    39.500         78.00    116.500        155.000   1
Corruption         1.000    37.750         74.50    111.250        148.000   8
Freedom            1.000    39.500         78.00    116.500        155.000   1

Figure 1: By Jacob Channell.

Figure 1 shows the average happiness ladder rank by continent. To simplify the 156 countries, we grouped each country by its United Nations continent classification. The ladder ranks, which averaged 60.09, were then averaged within each of the 6 relevant continents. Because a lower rank means a happier country, Oceania stands out with a much lower (better) average ladder rank than the other continents (9.50). At the other end of the spectrum, Africa had a much higher average ladder rank (122.27). Could this be due to chance variation, or are there other variables in play?
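The grouping step described for Figure 1 can be sketched in R as follows. This is only an illustration: the data frame, its column names (country, continent, ladder), and its values are hypothetical stand-ins, not the report's data.

```r
# Hypothetical toy data: column names and values are illustrative only,
# not taken from the World Happiness Report dataset.
happiness <- data.frame(
  country   = c("A", "B", "C", "D", "E", "F"),
  continent = c("Africa", "Africa", "Europe", "Europe", "Oceania", "Oceania"),
  ladder    = c(120, 130, 20, 30, 8, 11)  # rank: 1 = happiest country
)

# Average ladder rank within each continent.
continent_avg <- aggregate(ladder ~ continent, data = happiness, FUN = mean)

# Sort so the happiest continents (lowest average rank) come first.
continent_avg <- continent_avg[order(continent_avg$ladder), ]
print(continent_avg)
```

Sorting by average rank makes it easy to spot the continents at either extreme before deciding which ones to test formally.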

Figure 2: By Jacob Channell.

Figure 2 shows the average life expectancy at birth per continent. Yet again, when compared to the mean (60.82), we see Oceania with a notably low average rank (12.00) and Africa with a high average rank (122.68). When looking at the original happiness scores by continent, some outliers arose. This raised questions about which variables had more impact on happiness: was it just geography, or perhaps government or other institutional effects?

Figure 3: By Calvin Cox.

Figure 3 is a linear regression model relating average life expectancy per country to the average happiness response per country in 2018. We wanted to calculate the correlation coefficient to see if there was truly an association between the two variables. Our calculations showed that r = .744, a strong correlation between an individual’s life expectancy at birth and their overall happiness. Perhaps this is due to geography, or to the institutions within these countries that cause higher or lower life expectancies. Perhaps it is a variable that an individual can control, such as their social circle or the positive and negative affect they experience.

Figure 4: By Emma Zazueta.

This graph is a scatterplot of the countries in the World Happiness Report with their respective happiness scores reported in 2018, colored according to the development status of each country. First World status is based on the World Population Review’s categorization, and all other countries were marked as Developing. Furthermore, this graph has a line that runs through the average happiness score for each group. This shows that First World countries are likely to be happier than countries in the Developing World, on average.

Figure 5: By Emma Zazueta.

When all the countries are plotted together, colored according to Continent, and a line is run through the average happiness score of 2018, what one can observe is that there appears to be a significant difference in the average happiness for each continent. According to this graph, Oceania has the highest average happiness, with a 7.273640, and Africa has the lowest average happiness at 4.560497. Europe, North America, and South America are close together at 6.297638, 6.155330, and 5.955915 respectively, and Asia falls down to an average happiness score of 5.259712.

Figure 6: By Jacob Channell.

Figure 6 shows a new variable, the social average. It combines three variables that affect the psychology of an individual: negative affect, positive affect, and social support. These scores were averaged to capture the variables that an individual has some control over. To see whether this correlated with happiness, the average ladder ranks of these three variables are compared against the overall happiness ladder rank of each nation. A correlation test was then conducted, resulting in a correlation coefficient of r = .76. This correlation indicates a strong association between the psychological aspects of a person’s life and their happiness. A social circle is comparatively easy to improve; positive and negative affect could be harder to change. An individual has little to no control over the other factors that affect their happiness.
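A minimal sketch of how such a social average could be built and correlated with the ladder rank in R is shown below. The ranks are made up for illustration, and the column names are assumptions rather than the names used in the report's dataset.

```r
# Hypothetical toy ranks (1 = best); not the report's data.
scores <- data.frame(
  social_support  = c(1, 2, 3, 4, 5, 6),
  positive_affect = c(2, 1, 4, 3, 6, 5),
  negative_affect = c(1, 3, 2, 5, 4, 6),
  ladder          = c(1, 2, 3, 4, 6, 5)
)

# Row-wise mean of the three psychological ranks gives the "social average".
scores$social_avg <- rowMeans(
  scores[, c("social_support", "positive_affect", "negative_affect")]
)

# Pearson correlation between the social average and the happiness rank.
r <- cor(scores$social_avg, scores$ladder)
print(r)
```

Averaging the three ranks weights each variable equally; a different weighting would be an equally reasonable design choice and could change r.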

Through our exploratory data analysis, we discovered that many variables are associated with happiness. Using the information from our study, we decided to narrow our search to what individuals could and could not control in order to improve their happiness. We fit a linear regression model to analyze the association between whether individuals could depend on their friends and family (social circles), which is the independent variable, and the happiness score of each country, which is the dependent variable. The happiness score is highly correlated with an individual’s ability to depend on their social circle. We determined that the support an individual receives from friends and family is strongly associated with the happiness score. The correlation coefficient between social circles and happiness was r = .82, indicating a very strong linear correlation between a country’s average social circle and its average happiness score. Given the strong positive correlation, as the overall social score increases, the overall happiness score increases. Because the ladder score is a rank number, a lower social-circle rank for a country means stronger average social support and more overall happiness. This data supported our initial hypothesis that a country’s social circle is associated with a country’s average happiness.

When fitting a linear model in R, a test is performed with the null hypothesis that the slope equals 0. The result of this test is shown in the summary of the regression.

The p-value was .000643, which is significant, so we reject the null hypothesis that the two variables are independent. This means there is a statistically significant association between social circles and happiness.
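As a sketch of where this p-value comes from, the slope test can be read out of a fitted model's summary in R. The data below are invented for illustration, and the column names social and happiness are assumptions, so the numbers will not match the report's.

```r
# Hypothetical toy data; the values and column names are illustrative only.
df <- data.frame(
  social    = c(0.55, 0.62, 0.70, 0.78, 0.85, 0.90, 0.93, 0.95),
  happiness = c(3.9, 4.4, 4.8, 5.6, 6.1, 6.6, 7.0, 7.3)
)

# Fit happiness as a linear function of social support.
fit <- lm(happiness ~ social, data = df)

# The coefficient table holds the estimate, standard error, t value,
# and the p-value for H0: slope = 0.
coefs   <- summary(fit)$coefficients
slope   <- coefs["social", "Estimate"]
p_value <- coefs["social", "Pr(>|t|)"]

# In simple regression, r is the square root of R-squared,
# carrying the sign of the slope.
r <- sign(slope) * sqrt(summary(fit)$r.squared)
```

A small p_value leads to rejecting a zero slope, while r describes how strong the linear association is; the two answer different questions.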

Figure 7: By Jacob Channell.

Figure 8: By Jacob Channell.

With our hypothesized association supported, we decided to compare continents by their social support average in order to determine if geography had an effect on happiness. The box plot to the left (Figure 8) shows the social support ladder average totals by continent. We initially hypothesized that happiness is not dependent upon geography and that it was due to chance instead. However, analysing the data above, we found Africa and Oceania as relatively extreme data points. Africa had the highest average social support score (120.71), meaning they had some of the weakest social support for its citizens. Oceania had the smallest average social support score (6.00), meaning they had some of the strongest social support for its citizens. Because of the association found in Figure 7, as well as some of our exploratory data, a lower social support average correlates with lower ladder average score for happiness. These points are far off the mean of 61.55, and were thus tested to see if it was due to chance variation, or another cause.

Figure 5: By Emma Zazueta. Figure 9: By Emma Zazueta.

For Europe

SE(Europe) = σ/sqrt(n) = 1.095348/sqrt(42) = 0.1690159
z = (obs - exp)/SE(Europe) = (6.297638 - 5.502)/0.1690159 = 4.707474
1 - pnorm(z) = 1 - pnorm(4.707474) = 1.254026e-06 = approximately 0%

With an extremely low p-value of approximately 0, which is lower than a standard significance level of 5%, we reject the null hypothesis that the continent you live in does not affect overall happiness.

For Africa

SE(Africa) = σ/sqrt(n) = 1.095348/sqrt(38) = 0.1776889
z = (obs - exp)/SE(Africa) = (4.560497 - 5.502)/0.1776889 = -5.298603
pnorm(z) = pnorm(-5.298603) = 5.834602e-08 = approximately 0%

With an extremely low p-value of approximately 0, which is lower than a standard significance level of 5%, we reject the null hypothesis that the continent you live in does not affect overall happiness.
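The two continent tests follow the same steps, so they can be sketched as one small R helper. The function name z_test_one_sample and its arguments are hypothetical; the numeric inputs are the values quoted above.

```r
# One-sample z-test sketch. The function name and arguments are hypothetical;
# the numeric inputs below are the values quoted in the report.
z_test_one_sample <- function(obs, mu, sigma, n, lower_tail = FALSE) {
  se <- sigma / sqrt(n)                    # standard error of the mean
  z  <- (obs - mu) / se                    # standardized difference
  p  <- pnorm(z, lower.tail = lower_tail)  # one-sided tail probability
  list(se = se, z = z, p = p)
}

mu    <- 5.502     # population average
sigma <- 1.095348  # population sd

# Europe lies above the population average: upper-tail test.
europe <- z_test_one_sample(6.297638, mu, sigma, n = 42)

# Africa lies below the population average: lower-tail test.
africa <- z_test_one_sample(4.560497, mu, sigma, n = 38, lower_tail = TRUE)

print(europe$z)
print(africa$z)
```

The lower_tail flag selects which tail of the normal curve is used, which is why the Africa test uses pnorm(z) rather than 1 - pnorm(z).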

Figure 10: By Zoya Ali. Figure 11: By Zoya Ali.

Figure 12: By Diego Mason.

In order to make this scatter plot, the World Happiness Report alternate data was used. The plot shows the GINI index of household income, reported by year, against the Life Ladder. The closer a country’s GINI index is to zero, the closer the country is to a perfectly equal distribution of household income and expenses. By plotting the GINI index of a country against its happiness score, this plot suggests that countries that approach perfect equality are likely to have happier inhabitants. There is a negative correlation between the variables of r = -0.3786795. From the plot we can observe a trend that countries near equality have a higher happiness score, whereas countries with varying degrees of inequality tend to have a lower happiness score, but because the correlation is so weak, we cannot conclude that this higher level of happiness is due to the level of equality.

Figure 13: By Kobe Christo.

This interactive scatterplot shows the relationship between happiness scores, healthy life expectancy, and GDP per capita. There is a strong positive correlation between GDP and Happiness. Additionally, points are colored by the Health score, which suggests that Health also tends to have a big impact on happiness. This is important because it shows that GDP and healthy life expectancy are correlated and are important factors that make up Happiness scores for a country. The coloring of the graph indicates that as a country scores lower on the happiness scale, its healthy life expectancy also decreases.

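$$\text{RealSD} = 0.9396434 \times \sqrt{\frac{n-1}{n}} = 0.9396434 \times \sqrt{\frac{126-1}{126}} = 0.9359072$$

$$\mathrm{SE}(\text{Developing}) = \frac{\text{SD}}{\sqrt{n}} = \frac{0.9359072}{\sqrt{126}} = 0.08337724$$

$$\text{RealSD} = 0.7119897 \times \sqrt{\frac{n-1}{n}} = 0.7119897 \times \sqrt{\frac{30-1}{30}} = 0.7000226$$

$$\mathrm{SE}(\text{First}) = \frac{\text{SD}}{\sqrt{n}} = \frac{0.7000226}{\sqrt{30}} = 0.1278061$$

$$z = \frac{\bar{x}(\text{Developing}) - \bar{x}(\text{First})}{\sqrt{\mathrm{SE}(\text{Developing})^2 + \mathrm{SE}(\text{First})^2}} = \frac{5.170529 - 6.768248}{\sqrt{0.08337724^2 + 0.1278061^2}} = -10.47011$$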
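The same arithmetic for the development-status test can be reproduced in R as a check. The variable names are hypothetical; the inputs are the averages, SDs, and sample sizes quoted in the report, and the SD rescaling by sqrt((n - 1)/n) simply mirrors the report's own calculation.

```r
# Values quoted in the report; variable names are hypothetical.
mean_dev   <- 5.170529; sd_dev   <- 0.9396434; n_dev   <- 126
mean_first <- 6.768248; sd_first <- 0.7119897; n_first <- 30

# The report rescales each sample SD by sqrt((n - 1) / n) before use.
real_sd_dev   <- sd_dev   * sqrt((n_dev - 1) / n_dev)
real_sd_first <- sd_first * sqrt((n_first - 1) / n_first)

# Standard error of each group mean.
se_dev   <- real_sd_dev   / sqrt(n_dev)
se_first <- real_sd_first / sqrt(n_first)

# Standard error of the difference, then z and the lower-tail p-value
# for the alternative mu(Developing) - mu(First) < 0.
se_diff <- sqrt(se_dev^2 + se_first^2)
z       <- (mean_dev - mean_first) / se_diff
p_value <- pnorm(z)

print(z)
print(p_value)
```

The numerator is Developing minus First so that it matches the direction of the stated hypotheses; swapping the order would flip the sign of z and require the upper tail instead.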