RStudio Project
Population Development
In this report, we explore the population development indicator for Middle Eastern countries: Saudi Arabia, United Arab Emirates, Bahrain, Yemen, Egypt, Iran, Iraq, Jordan, Kuwait, Lebanon, Oman, Palestine, Qatar, Syria, Turkey, and Cyprus. The population development index depends on many factors; we have identified a few of them. The purpose of this study is to explore how these attributes relate to the population development index of these countries.
The list of attributes is shown below:
· HEALTH - Health Index
· POP - Population Index
· POSC - Peace and Order Index
· ECONSZSC - Size of Economy Score
· AI - Air Index
· LOWBWT - Low Birth Weight
· NTLWTHSC - National Wealth Index Score
· BLDTOT - Total Built Land
· IWI - Inland Water Index
· LANDDSC - Land Diversity Score
· LANDQSC - Land Quality Score
Analysis
The relations among these indicators can be visualized with a pairwise scatterplot matrix (the pairs() call in the R code). From these plots, we hypothesize that the Size of Economy Score and the National Wealth Index Score are related in this region. We expect the two indicators to be highly correlated, on the assumption that national wealth contributes to the size of a country's economy. Plotting one against the other shows the relation more closely.
> View(df)
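As a numeric companion to the scatterplot matrix, the pairwise correlations can be tabulated directly. The following is a minimal sketch on simulated data: the data frame `toy` and all of its values are invented for illustration and are not the report's data. Only three of the indicator names are reused, to keep the example short.

```r
# Simulated stand-in for the report's `df`; every value here is made up.
set.seed(1)
toy <- data.frame(
  NTLWTHSC = rnorm(13),
  HEALTH   = rnorm(13)
)
# Build an economy score that depends on the wealth score, plus noise.
toy$ECONSZSC <- 0.9 * toy$NTLWTHSC + rnorm(13, sd = 0.3)
toy$Country  <- paste("Country", seq_len(13))

# Keep the numeric columns only, then compute pairwise Pearson correlations.
num_cols <- vapply(toy, is.numeric, logical(1))
cor_mat  <- cor(toy[, num_cols], use = "pairwise.complete.obs")
print(round(cor_mat, 2))
```

With the real data, the same two lines applied to `df` would give one coefficient per pair of indicators, which helps decide which pairs deserve a formal test.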
Hypothesis:
The scatterplot suggests that the Size of Economy Score rises with the National Wealth Index Score. This can be investigated further with a statistical test.
The hypotheses are defined as:
H0: There is no correlation between NTLWTHSC (National Wealth Index Score) and ECONSZSC (Size of Economy Score).
H1: There is a significant correlation between NTLWTHSC (National Wealth Index Score) and ECONSZSC (Size of Economy Score).
We used Pearson's product-moment correlation test to analyze the correlation.
Test Results:
t = 8.1924, df = 11, p-value = 0.000005208
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: 0.7683433 0.9782794
sample estimates: cor 0.9269206
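Since the p-value (0.000005208) is far below the significance level alpha = 0.05, we reject the null hypothesis and conclude that there is a significant positive correlation between the National Wealth Index Score and the Size of Economy Score. Note that df = 11 corresponds to 13 complete pairs of observations (df = n - 2), so three of the 16 listed countries did not enter the test.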
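The reported statistics can be cross-checked from the correlation coefficient and the degrees of freedom alone. The sketch below uses the standard formulas for the Pearson test (t = r * sqrt(n - 2) / sqrt(1 - r^2), and a confidence interval from Fisher's z transformation); the variable names are chosen here for illustration.

```r
r    <- 0.9269206   # sample correlation reported in the test results
n_df <- 11          # degrees of freedom reported in the test results
n    <- n_df + 2    # number of complete pairs used by the test

# t statistic and two-sided p-value under H0: true correlation is 0.
t_stat <- r * sqrt(n_df) / sqrt(1 - r^2)
p_val  <- 2 * pt(-abs(t_stat), df = n_df)

# 95% confidence interval via Fisher's z transformation.
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, p = p_val, lower = ci[1], upper = ci[2]))
```

The printed values should agree with the test results above up to rounding, which confirms that the t statistic, p-value, and interval are mutually consistent.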
R Code:
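> library(dplyr)    # loaded in the original session; not used below
> library(reshape)  # loaded in the original session; not used below
> library(ggplot2)
> # world_dataset is assumed to be loaded already, with a Country_Standard column
> middle_east <- c("Bahrain", "Cyprus", "Egypt", "Iran", "Iraq", "Jordan", "Kuwait", "Lebanon", "Oman", "Palestine", "Qatar", "Saudi Arabia", "Syria", "Turkey", "United Arab Emirates", "Yemen")
> indicators <- c("Country_Standard", "HEALTH", "POP", "POSC", "ECONSZSC", "AI", "LOWBWT", "NTLWTHSC", "BLDTOT", "IWI", "LANDDSC", "LANDQSC")
> # Keep only the listed countries; names without an exact match are dropped by merge()
> df <- data.frame(Country_Standard = middle_east)
> df_selected <- merge(world_dataset, df, by = "Country_Standard")
> df_selected <- df_selected[, indicators]
> # Convert the indicators to numeric via character (guards against factor columns)
> df <- data.frame(sapply(df_selected, as.character), stringsAsFactors = FALSE)
> country <- df$Country_Standard
> df <- data.frame(sapply(df[, -1], as.numeric), stringsAsFactors = FALSE)
> df$Country <- country
> # Long-format copy of two indicators (not used in the plots below)
> d1 <- data.frame(Indicator = "HEALTH", VALUE = df$HEALTH)
> d2 <- data.frame(Indicator = "ECONSZSC", VALUE = df$ECONSZSC)
> d <- rbind(d1, d2)
> # Scatterplot matrix of the 11 indicators (column 12 is Country)
> pairs(df[, -12], pch = 21)
> # National wealth score against size of economy score
> ggplot(data = df, mapping = aes(x = NTLWTHSC, y = ECONSZSC)) +
+   geom_point()
> # Pearson correlation test; incomplete pairs are dropped
> cor.test(df$NTLWTHSC, df$ECONSZSC)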
Pearson's product-moment correlation
data: df$NTLWTHSC and df$ECONSZSC
t = 8.1924, df = 11, p-value = 5.208e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.7683433 0.9782794
sample estimates:
cor
0.9269206
> ggplot(data = df, mapping = aes(x= Country, y=NTLWTHSC))+
+ geom_point()
> ggplot(data = df, mapping = aes(x= Country, y=ECONSZSC))+
+ geom_point()
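Because the test ran on 13 pairs while 16 countries were requested, it can be useful to check where rows are lost: either a country name has no exact match in the source table, or one of the two indicators is missing after numeric conversion. The sketch below illustrates both checks on a small invented table; `world_toy`, `wanted`, and all values are hypothetical stand-ins and not part of the report's data.

```r
# Hypothetical stand-in for `world_dataset`; names and values are invented.
world_toy <- data.frame(
  Country_Standard = c("Bahrain", "Cyprus", "Egypt", "Iran"),
  NTLWTHSC = c("1.2", "0.8", NA, "2.1"),
  ECONSZSC = c("3.4", "2.9", "4.0", "5.5"),
  stringsAsFactors = FALSE
)
wanted <- c("Bahrain", "Cyprus", "Egypt", "Palestine")

# Check 1: requested names with no exact match are silently dropped by merge().
unmatched <- setdiff(wanted, world_toy$Country_Standard)

# Check 2: after merging, count rows where both indicators are present.
sel <- merge(world_toy, data.frame(Country_Standard = wanted),
             by = "Country_Standard")
x <- as.numeric(sel$NTLWTHSC)
y <- as.numeric(sel$ECONSZSC)
n_pairs <- sum(complete.cases(x, y))

print(unmatched)
print(n_pairs)
```

Applied to the real session, `setdiff(middle_east, world_dataset$Country_Standard)` and `sum(complete.cases(df$NTLWTHSC, df$ECONSZSC))` would show which of the two causes accounts for the three missing countries.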