Assignment for Kake
Decision Management Systems
Instructor:
Student:
Contact:
Date:
Introduction
Despite being a rising star in the global economy, China still has many issues to address, such as political freedom, health care shortcomings, and environmental crises. Smog has been a major concern in recent years because it is both a health problem and an environmental problem.
On December 8, 2015, Beijing's city government issued its first red alert for pollution. It closed schools and construction sites and restricted the number of cars on the road. Ten days later, a second red alert was issued as air pollution continued to choke the city. The smog problem is not limited to Beijing; it has spread to other major cities across the country.
The data set in my study combines three data sets (Tables 1, 2, and 3) retrieved from the National Bureau of Statistics of China, an official source, which supports the quality of the data. There are no missing values or bad records in the data set. The primary data sets are Ambient Air Quality in Major Cities, Main Pollutant Emission in Waste Gas in Main Cities, and Pollutant Emission in Waste Water in Main Cities.
There are a total of 21 variables in the final data set. City name serves as the index, and year indicates the data collection period. The target variable is “Days of Air Quality Equal to or Above Grade II”. Other variables are also of interest, such as “Volume of Consumption Soot Emission” and “Annual Average Concentration of PM10”. The first section of the data contains quantitative measurements of chemical concentrations and air quality. The second section contains measurements of waste gas emissions, and the third section records waste water emissions. Waste water emissions might seem only loosely related to air quality, but they are part of overall environmental pollution. Exploring the data in Watson Analytics helps to confirm whether such a relationship exists.
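The combination of the three source tables can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which relied on Watson Analytics; it only assumes that each table is keyed by city and year. The short field names (pm10, good_air_days, and so on) and the combine function are hypothetical, and the values are a two-city excerpt from Tables 1, 2, and 3 in the appendix.

```python
# Two-city excerpt from Tables 1-3 in the appendix, keyed by (city, year).
# Field names are shortened, hypothetical labels for the original columns.
air_quality = {
    ("Beijing", 2013): {"pm10": 108, "good_air_days": 167},
    ("Haikou", 2013): {"pm10": 47, "good_air_days": 342},
}
waste_gas = {
    ("Beijing", 2013): {"consumption_soot_t": 28258},
    ("Haikou", 2013): {"consumption_soot_t": 5},
}
waste_water = {
    ("Beijing", 2013): {"living_cod_t": 89868},
    ("Haikou", 2013): {"living_cod_t": 6341},
}


def combine(*tables):
    """Join tables on the (city, year) keys they all share (an inner join)."""
    shared_keys = set(tables[0])
    for table in tables[1:]:
        shared_keys &= set(table)
    combined = {}
    for key in sorted(shared_keys):
        row = {}
        for table in tables:
            row.update(table[key])  # add this table's columns to the row
        combined[key] = row
    return combined


combined = combine(air_quality, waste_gas, waste_water)
for (city, year), row in combined.items():
    print(city, year, row)
```

Keeping only the keys present in all three tables means that a city missing from any one source would be dropped rather than left with empty fields.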
Data exploration
To uncover the meaning of the variables, the data exploration feature of Watson Analytics not only covers all aspects of the data but also visualizes them. Its suggested questions are a good starting point.
The first question examined the volume of consumption soot emission per city. The visualization (Exploration Figure 1) shows that Harbin has the highest volume of this emission, followed by Beijing, the capital of China. Nanjing has the lowest, at a level similar to Kunming and Fuzhou.
The next question visualized the living emissions of the cities (Exploration Figure 2). Shanghai, Chongqing, and Guangzhou are the top three. According to a report from the United Nations, these three cities were among the five most populated cities in China between 2010 and 2015. Their large populations are therefore consistent with their high living emissions.
Next, we explore Days of Air Quality Equal to or Above Grade II by city (Exploration Figure 3). Shijiazhuang occupies the smallest area, which means that it has the fewest days of good air quality. Shijiazhuang is a center of heavy industry, which helps explain its low air quality. Meanwhile, Haikou and Kunming have the best air quality. Both cities also have low living emissions. This is not surprising, since neither Haikou nor Kunming is highly populated, and both are located far from the heavy industrial areas.
To explore the data further, we apply more of the available features. To focus on Days of Air Quality ≥ Grade II, we create a new variable called “Region” that groups the cities into four groups by location. Since the North shows the highest Volume of Consumption Soot Emission and the South the lowest (Exploration Figure 4), we would like to see what other differences appear as we apply more filters.
Region is then applied as a global filter (Exploration Figure 5), restricted to South and North. The cities in the southern region all have more days of good air quality than the northern ones. The calculated total emission in waste gas shows a similar pattern: all the southern cities have less total industrial waste gas than the cities in the north. Applying an additional filter, Annual Average Concentration of PM10 < 101, which indicates that the air is not polluted, leaves five cities, all of them in the southern region.
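The PM10 filter can be reproduced outside Watson Analytics with a short Python sketch. It uses only the ten 2013 rows listed in Table 1 of the appendix, so its result is not necessarily the same set of cities found in the full data set. The names PM10_LIMIT, pm10_2013, and below_limit are hypothetical.

```python
# Annual average PM10 concentration (μg/m3) for 2013, copied from Table 1.
pm10_2013 = {
    "Beijing": 108, "Changchun": 130, "Changsha": 94, "Chengdu": 150,
    "Chongqing": 106, "Fuzhou": 64, "Guangzhou": 72, "Guiyang": 85,
    "Haikou": 47, "Hangzhou": 106,
}

# Threshold taken from the filter described in the text (PM10 < 101).
PM10_LIMIT = 101

# Keep only the cities whose PM10 value is strictly below the threshold.
below_limit = sorted(
    city for city, pm10 in pm10_2013.items() if pm10 < PM10_LIMIT
)
print(below_limit)
```

The comparison is strict, matching the "< 101" condition in the text, so a city with a value of exactly 101 would be excluded.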
Data exploration is very helpful in finding relationships between variables, and it could be used in many settings. In my field, research, principal investigators could use these features to explore key variables and see which ones most affect the outcomes, even before statisticians carry out an in-depth analysis. This would help them make decisions when planning subsequent investigations.
Data refinement
Even though the data set is of good quality, some additional modification is still necessary in order to present meaningful data for data exploration.
Beginning with the data matrix, Days of Air Quality Equal to or Above Grade II (day) is the best variable, with a score of 100%, and the lowest quality score belongs to Common Industrial Solid Wastes Disposed (56%). This is not a surprise, because the data was collected across the country and includes both heavy industrial and light industrial cities. The outliers reflect the nature of the data, so they should be kept. For the same reason, the variable Volume of Industrial sulfur dioxide (ton) should be treated the same way, even though it is of medium quality with a score of 59%.
Because the data was collected from widespread areas that include both heavy industry and light industry cities, a new variable was created to group the cities into North, South, West, and East. It was used with the data exploration tool to examine differences and similarities between regions.
A hierarchy named “Pollution hierarchy” was created from “Annual Average Concentration of PM10”. It was used in conjunction with the new variable “Total emission in waste gas”, which was created by summing the industrial waste gas emissions. Once again, the northern cities showed serious pollution levels, with PM10 values from 150 to 305, followed by the west region.
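The derived total can be sketched in Python as follows. The exact columns summed in Watson Analytics are not stated, so this sketch assumes that the total is the sum of the three industrial columns of Table 2 (sulfur dioxide, nitrogen oxides, and smoke and dust). The variable names are hypothetical, and the values are a three-city excerpt from Table 2.

```python
# Industrial emissions in tons for 2013, copied from Table 2:
# (sulfur dioxide, nitrogen oxides, smoke and dust).
industrial_waste_gas_2013 = {
    "Beijing": (52041, 75927, 27182),
    "Chongqing": (494415, 247905, 179842),
    "Haikou": (1798, 86, 1149),
}

# Assumption: the total is the plain sum of the three industrial columns.
total_waste_gas = {
    city: sum(emissions)
    for city, emissions in industrial_waste_gas_2013.items()
}

# List the cities from the highest total to the lowest.
for city, total in sorted(total_waste_gas.items(), key=lambda item: -item[1]):
    print(f"{city}: {total} tons")
```

In this excerpt Haikou's total is a small fraction of Chongqing's, which illustrates the scale of the differences between cities that the regional comparison relies on.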
Conclusion
By now, we can confidently conclude that air pollution in China is serious. The overall mean PM10 indicates that the air is lightly polluted, but this is misleading unless the large variance is considered: PM10 ranges from 47 to 305. The cities of the north region have the largest variance, followed by those of the west region. From the data exploration we can tell that a large population has an impact on living emissions. However, emissions in waste gas contribute more significantly to air pollution. This is consistent with the distribution of industry in China. Since the data was collected during 2013 and 2014, it only tells an older story. The red alerts issued at the end of 2015 are an encouraging sign that the Chinese government is taking the problem seriously.
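The point that a mean can hide a wide spread can be checked with a small Python sketch. It uses only the ten 2013 PM10 values listed in Table 1 of the appendix, so its figures differ from those of the full data set, whose values reach 305. The variable names are hypothetical.

```python
import statistics

# Annual average PM10 concentration (μg/m3) for 2013, copied from Table 1
# in the order the cities are listed there.
pm10_2013 = [108, 130, 94, 150, 106, 64, 72, 85, 47, 106]

mean_pm10 = statistics.mean(pm10_2013)
lowest, highest = min(pm10_2013), max(pm10_2013)
# Population standard deviation: the ten values are treated as the whole set.
spread = statistics.pstdev(pm10_2013)

print(f"mean = {mean_pm10:.1f}, range = {lowest} to {highest}")
print(f"standard deviation = {spread:.1f}")
```

Reporting the range and standard deviation alongside the mean shows how far individual cities sit from the average, which a single summary figure does not reveal.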
References
Ambient Air Quality in Key Cities of Environmental Protection (2013). Data Retrieved from: http://www.stats.gov.cn/english/Statisticaldata/AnnualData/
Hunt, K., Lu, S. (2015). Smog in China closes schools and construction sites, cuts traffic in Beijing; CNN. Retrieved From: http://www.cnn.com/2015/12/07/asia/china-beijing-pollution-red-alert/
Main Pollutant Emission in Waste Gas in Main Cities (2013). Data Retrieved from: http://www.stats.gov.cn/english/Statisticaldata/AnnualData/
Main Pollutant Emission in Waste Gas in Main Cities (2014). Data Retrieved from: http://www.stats.gov.cn/english/Statisticaldata/AnnualData/
Most populated cities in China. Data Retrieved from: http://www.nationsonline.org/oneworld/china_cities.htm
Rohde, R., Muller, R. (2015) Air Pollution in China: Mapping of Concentrations and Sources. PLoS ONE 10(8): e0135749. doi: 10.1371/journal.pone.0135749; Retrieved From: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0135749
Wong, E. (2015). World Briefing | Asia; China: ‘Red Alert’ on Beijing’s Air. New York. P10. Retrieved From: http://www.nytimes.com/2015/12/18/world/asia/beijing-issues-a-second-red-alert-on-pollution.html
Zhang, Y., Li, M., Bravo, M. A., Jin, L., Nori-Sarma, A., Xu, Y., … Bell, M. L. (2014). Air Quality in Lanzhou, a Major Industrial City in China: Characteristics of Air Pollution and Review of Existing Evidence from Air Pollution and Health Studies. Water, Air, and Soil Pollution, 225(10), 2187. http://doi.org/10.1007/s11270-014-2187-3
Appendix:
Table 1.
| City | Year | Annual Average Concentration of SO2 (μg/m3) | Annual Average Concentration of NO2 (μg/m3) | Annual Average Concentration of PM10 (μg/m3) | 95th Percentile Daily Average Concentration of CO (mg/m3) | 90th Percentile Daily Maximum 8 hours Average Concentration of O3 (μg/m3) | Annual Average Concentration of PM2.5 (μg/m3) | Days of Air Quality Equal to or Above Grade II (day) |
|---|---|---|---|---|---|---|---|---|
| Beijing | 2013 | 26 | 56 | 108 | 3.4 | 188 | 89 | 167 |
| Changchun | 2013 | 44 | 44 | 130 | 2.1 | 127 | 73 | 230 |
| Changsha | 2013 | 33 | 46 | 94 | 2.3 | 134 | 83 | 196 |
| Chengdu | 2013 | 31 | 63 | 150 | 2.6 | 157 | 96 | 139 |
| Chongqing | 2013 | 32 | 38 | 106 | 1.5 | 163 | 70 | 207 |
| Fuzhou | 2013 | 11 | 43 | 64 | 1.2 | 73 | 36 | 343 |
| Guangzhou | 2013 | 20 | 52 | 72 | 1.5 | 156 | 53 | 259 |
| Guiyang | 2013 | 31 | 33 | 85 | 1.3 | 101 | 53 | 278 |
| Haikou | 2013 | 7 | 17 | 47 | 1.0 | 106 | 27 | 342 |
| Hangzhou | 2013 | 28 | 53 | 106 | 1.9 | 155 | 70 | 212 |
Table 2.
| City | Year | Volume of Industrial sulfur dioxide (ton) | Volume of Industrial nitrogen oxides Emission (ton) | Volume of Industrial smoke (powder) dust (ton) | Volume of sulfur dioxide Emission by Consumption (ton) | Volume of nitrogen oxides Emission by Consumption (ton) | Volume of Consumption Soot Emission (ton) |
|---|---|---|---|---|---|---|---|
| Beijing | 2013 | 52041 | 75927 | 27182 | 34967 | 13638 | 28258 |
| Changchun | 2013 | 57246 | 95190 | 72970 | 7344 | 1545 | 7919 |
| Changsha | 2013 | 21173 | 15951 | 19545 | 2366 | 153 | 2946 |
| Chengdu | 2013 | 52040 | 44411 | 21452 | 4891 | 2109 | 661 |
| Chongqing | 2013 | 494415 | 247905 | 179842 | 53261 | 4487 | 4401 |
| Fuzhou | 2013 | 76043 | 72284 | 43483 | 1279 | 169 | 547 |
| Guangzhou | 2013 | 65589 | 57164 | 16660 | 663 | 276 | 214 |
| Guiyang | 2013 | 70603 | 30450 | 24233 | 35493 | 1753 | 5530 |
| Haikou | 2013 | 1798 | 86 | 1149 | 11 | 17 | 5 |
| Hangzhou | 2013 | 82021 | 67283 | 40243 | 633 | 335 | 135 |
Table 3.
| City | Year | Industrial Waste Water Discharged (10,000 tons) | Industrial COD Emission (ton) | Industrial Ammonia Nitrogen (ton) | Urban Living Waste Water Discharged (10,000 tons) | Living COD Emission (ton) | Living Ammonia Nitrogen (ton) |
|---|---|---|---|---|---|---|---|
| Beijing | 2013 | 9486 | 6055 | 330 | 134991 | 89868 | 14189 |
| Changchun | 2013 | 5482 | 11670 | 1384 | 20797 | 32006 | 6970 |
| Changsha | 2013 | 4049 | 13499 | 456 | 43000 | 59931 | 8656 |
| Chengdu | 2013 | 10524 | 12321 | 801 | 99860 | 102595 | 13144 |
| Chongqing | 2013 | 33451 | 51534 | 3266 | 108937 | 218601 | 36211 |
| Fuzhou | 2013 | 4682 | 5190 | 405 | 32268 | 67654 | 9509 |
| Guangzhou | 2013 | 21391 | 22664 | 1389 | 135179 | 106077 | 17442 |
| Guiyang | 2013 | 2262 | 6993 | 293 | 21774 | 26324 | 4490 |
| Haikou | 2013 | 825 | 858 | 51 | 11120 | 6341 | 3799 |
| Hangzhou | 2013 | 39186 | 31947 | 1373 | 53902 | 39650 | 8060 |
Exploration Figure 1.
Exploration Figure 2.
Exploration Figure 3.
Exploration Figure 4.
Exploration Figure 5.