Decision Management Systems

Title

Instructor:

Student:

Contact:

Date:

Introduction

Despite being a rising star in the global economy, China still has many issues to address, such as political freedom, health care shortcomings, and environmental crises. Smog has been a major concern in recent years. It is both a health problem and an environmental problem.

On December 8, 2015, Beijing's city government issued its first red alert for pollution. It closed schools and construction sites and restricted the number of cars on the road. Ten days later, a second red alert was issued as the air pollution continued to choke the city sky. The smog problem is not limited to Beijing; it has spread to other major cities across the country.

The data set in my study combines three data sets (Tables 1, 2 and 3) retrieved from the National Bureau of Statistics of China, a source that supports the quality of the data. There are no missing values or bad records in the data set. The primary data sets are Ambient Air Quality in Major Cities, Main Pollutant Emission in Waste Gas in Main Cities, and Pollutant Emission in Waste Water in Main Cities.
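Combining the three tables amounts to joining them on the shared City and Year columns. The following is a minimal sketch of that join in plain Python, not the procedure actually used in Watson Analytics. The rows are the Beijing and Haikou 2013 values from the appendix tables, and the short column names and the `merge_tables` helper are assumptions made for illustration.

```python
# Join three per-city tables on the shared (City, Year) key.
# Values are the Beijing and Haikou 2013 rows from the appendix tables;
# the short column names are hypothetical labels chosen for this sketch.
air = {
    ("Beijing", 2013): {"pm10": 108, "good_air_days": 167},
    ("Haikou", 2013): {"pm10": 47, "good_air_days": 342},
}
gas = {
    ("Beijing", 2013): {"industrial_so2_t": 52041},
    ("Haikou", 2013): {"industrial_so2_t": 1798},
}
water = {
    ("Beijing", 2013): {"industrial_waste_water_10kt": 9486},
    ("Haikou", 2013): {"industrial_waste_water_10kt": 825},
}


def merge_tables(*tables):
    """Inner-join dict-of-dict tables on the keys present in every table."""
    shared_keys = set(tables[0])
    for table in tables[1:]:
        shared_keys &= set(table)
    merged = {}
    for key in sorted(shared_keys):
        row = {}
        for table in tables:
            row.update(table[key])
        merged[key] = row
    return merged


combined = merge_tables(air, gas, water)
for (city, year), row in combined.items():
    print(city, year, row)
```

Using an inner join means a city missing from any one table would be dropped, which is harmless here because the data set has no missing records.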

There are 21 variables in total in the final data set. City name serves as the index, and year indicates the data collection period. The target variable is “Days of Air Quality Equal to or Above Grade II”. Other variables are also of interest, such as “Volume of Consumption Soot Emission” and “Annual Average Concentration of PM10”. The first section of the data contains quantitative measurements of chemical concentrations and air quality. The second section contains measurements of waste gas emissions, and the third section records waste water emissions. Waste water emissions might seem only loosely related to air quality, but they are part of overall environmental pollution. Exploring the data in Watson Analytics should help confirm whether they matter.

Data exploration

To uncover what the variables mean, the data exploration feature of Watson Analytics covers all aspects of the data and also visualizes them. Some of its suggested questions are a good starting point.

The first question examined the volume of consumption soot emission per city. The visualization (Exploration Figure 1) shows that the city of Harbin has the highest volume of this emission, followed by the capital of China, Beijing. Nanjing has the lowest, at a level similar to Kunming and Fuzhou.

The next question provided a visualization of the living emissions of the cities (Exploration Figure 2). The cities of Shanghai, Chongqing and Guangzhou are the top three. According to the report from the United Nations, these three cities were among the five most populated cities in China between 2010 and 2015. Their large populations are consistent with the high living emissions shown in the data.

Next, we explore Days of Air Quality Equal to or Above Grade II compared by city (Exploration Figure 3). Shijiazhuang occupies the smallest area, which means that this city has the fewest days of good air quality. Shijiazhuang is an area with a high concentration of heavy industry, which likely explains the low air quality. Meanwhile, Haikou and Kunming have the best air quality. Both cities clearly have low living emissions. This is not surprising, since neither Haikou nor Kunming is highly populated. In addition, both are located far from heavy industrial areas.

To explore the data further, we apply the available features to uncover more detail. To focus on Days of Air Quality ≥ Grade II, we create a new variable called “Region” that groups the cities into four groups by location. Since the North shows the highest Volume of Consumption Soot Emission and the South the lowest (Exploration Figure 4), we would like to compare their other differences as we apply more filters.

With Region applied as a global filter (Exploration Figure 5), restricted to South and North, we can see that the southern cities all have more days of good air quality than the northern ones. After we calculated the total emission in waste gas, the outcome was similar: all the southern cities have lower total industrial waste gas emissions than the northern cities. Applying an additional filter, Annual Average Concentration of PM10 < 101 (which means the air is not polluted), leaves five cities, all in the southern region.
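The PM10 filter can be reproduced outside Watson Analytics with a few lines of Python. The sketch below applies the same cutoff to the ten 2013 rows listed in Table 1 of the appendix, which is only a sample of the full data set. The function name `cities_below` is an assumption made for illustration.

```python
# (city, annual average PM10 in μg/m3) for 2013, copied from Table 1.
PM10_2013 = [
    ("Beijing", 108),
    ("Changchun", 130),
    ("Changsha", 94),
    ("Chengdu", 150),
    ("Chongqing", 106),
    ("Fuzhou", 64),
    ("Guangzhou", 72),
    ("Guiyang", 85),
    ("Haikou", 47),
    ("Hangzhou", 106),
]

PM10_THRESHOLD = 101  # cutoff used in the exploration filter


def cities_below(readings, threshold):
    """Return the cities whose PM10 value is strictly below the threshold."""
    return [city for city, pm10 in readings if pm10 < threshold]


clean_cities = cities_below(PM10_2013, PM10_THRESHOLD)
print(clean_cities)
```

On this sample the filter keeps Changsha, Fuzhou, Guangzhou, Guiyang and Haikou.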

Data exploration is very helpful for finding relationships between variables, and it could be used in many settings. In my field, research, principal investigators could use these features to explore key variables and see which ones most affect the outcomes, even before statisticians perform the in-depth analysis. This would help them make decisions when planning subsequent investigations.

Data refinement

Even though the data set is of good quality, some additional modification is still necessary in order to present meaningful data for exploration.

Beginning with the data matrix, Days of Air Quality Equal to or Above Grade II (day) is the best variable, with a quality score of 100%, and the lowest-quality variable is Common Industrial Solid Wastes Disposed (56%). This is not a surprise, because the data was collected across the country, covering both heavy industrial and light industrial cities. The outliers reflect the nature of the data, so they should be kept. For the same reason, the variable Volume of Industrial sulfur dioxide (ton) should be treated the same way, even though it is of medium quality with a score of 59%.

Because the data was collected from widespread areas that include both heavy industrial and light industrial cities, a new variable was created to group the cities into North, South, West and East. It was used with the data exploration tool to examine differences and similarities between regions.
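A derived grouping variable of this kind is essentially a lookup from city to region. The sketch below is one possible way to build it and to compare average good-air days per region. The city-to-region assignments are assumptions made for illustration, since the report does not list the actual grouping. The day counts are the 2013 values from Table 1.

```python
# Hypothetical city-to-region lookup; the report does not list the actual
# grouping, so these assignments are illustrative only.
REGION_BY_CITY = {
    "Beijing": "North",
    "Changchun": "North",
    "Haikou": "South",
    "Guangzhou": "South",
    "Chengdu": "West",
    "Hangzhou": "East",
}

# Days of Air Quality Equal to or Above Grade II in 2013, from Table 1.
GOOD_AIR_DAYS = {
    "Beijing": 167,
    "Changchun": 230,
    "Haikou": 342,
    "Guangzhou": 259,
    "Chengdu": 139,
    "Hangzhou": 212,
}


def mean_days_by_region(days_by_city, region_by_city):
    """Average the good-air days of the cities in each region."""
    grouped = {}
    for city, days in days_by_city.items():
        region = region_by_city.get(city, "Unknown")
        grouped.setdefault(region, []).append(days)
    return {region: sum(vals) / len(vals) for region, vals in grouped.items()}


region_means = mean_days_by_region(GOOD_AIR_DAYS, REGION_BY_CITY)
for region, avg in sorted(region_means.items()):
    print(f"{region}: {avg:.1f} days")
```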

A hierarchy, “Pollution hierarchy”, was created from “Annual Average Concentration of PM10”. It was used in conjunction with the new variable “Total emission in waste gas waste”, which was created by summing the industrial waste gas emissions. Once again, the northern cities showed serious pollution levels, with PM10 values from 150 to 305, followed by the west region.
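The total waste gas variable is a row-wise sum. The sketch below assumes that it adds the three industrial columns of Table 2 (sulfur dioxide, nitrogen oxides, and smoke and dust); the report does not state exactly which columns were included, so this choice is an assumption. The values are 2013 rows from Table 2.

```python
# Industrial waste gas emissions in tons for 2013, copied from Table 2:
# (sulfur dioxide, nitrogen oxides, smoke and dust).
INDUSTRIAL_GAS_2013 = {
    "Beijing": (52041, 75927, 27182),
    "Chongqing": (494415, 247905, 179842),
    "Haikou": (1798, 86, 1149),
}


def total_industrial_gas(emissions_by_city):
    """Sum the industrial emission columns for each city.

    Assumption: the derived variable is the sum of these three columns.
    """
    return {city: sum(values) for city, values in emissions_by_city.items()}


totals = total_industrial_gas(INDUSTRIAL_GAS_2013)
for city, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{city}: {total} tons")
```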

Conclusion

By now, we can confidently conclude that air pollution in China is serious. The overall mean PM10 indicates that the air is lightly polluted, but that is misleading unless the large variances are considered. PM10 ranges from 47 to 305. The cities in the north region have the largest variances, followed by the cities in the west region. From the data exploration we can tell that high population has an impact on living emissions. However, waste gas emissions contribute more significantly to air pollution. This is consistent with the distribution of industry in China. Since the data was collected during 2013 and 2014, it only tells an old story. The pollution red alerts issued at the end of 2015 are an encouraging sign that the Chinese government is taking the problem seriously.
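The point about the mean hiding the spread can be checked with simple summary statistics. The sketch below uses only the ten 2013 PM10 values in Table 1 of the appendix, so its range is narrower than the 47 to 305 reported for the full data set. The name `summary` is an assumption made for illustration.

```python
from statistics import mean, pstdev

# Annual average PM10 (μg/m3) for 2013, copied from Table 1.
PM10_2013 = {
    "Beijing": 108,
    "Changchun": 130,
    "Changsha": 94,
    "Chengdu": 150,
    "Chongqing": 106,
    "Fuzhou": 64,
    "Guangzhou": 72,
    "Guiyang": 85,
    "Haikou": 47,
    "Hangzhou": 106,
}

values = list(PM10_2013.values())
summary = {
    "mean": mean(values),          # central value of the sample
    "min": min(values),            # cleanest city in the sample
    "max": max(values),            # most polluted city in the sample
    "std_dev": pstdev(values),     # population standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Even in this small sample, the mean sits below the 101 cutoff used earlier while the maximum is well above it, which is why the mean should be read together with the range.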

References

Ambient Air Quality in Key Cities of Environmental Protection (2013). Data retrieved from:

http://www.stats.gov.cn/english/Statisticaldata/AnnualData/

Hunt, K., Lu, S. (2015). Smog in China closes schools and construction sites, cuts traffic in Beijing; CNN. Retrieved from: http://www.cnn.com/2015/12/07/asia/china-beijing-pollution-red-alert/

Main Pollutant Emission in Waste Gas in Main Cities (2013). Data retrieved from:

http://www.stats.gov.cn/english/Statisticaldata/AnnualData/

Main Pollutant Emission in Waste Gas in Main Cities (2014). Data retrieved from:

http://www.stats.gov.cn/english/Statisticaldata/AnnualData/

Most populated cities in China. Data retrieved from: http://www.nationsonline.org/oneworld/china_cities.htm

Rohde, R., Muller, R. (2015). Air Pollution in China: Mapping of Concentrations and Sources. PLoS ONE 10(8): e0135749. doi: 10.1371/journal.pone.0135749; Retrieved from: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0135749

Wong, E. (2015). World Briefing | Asia; China: ‘Red Alert’ on Beijing’s Air. New York. P10. Retrieved from: http://www.nytimes.com/2015/12/18/world/asia/beijing-issues-a-second-red-alert-on-pollution.html

Zhang, Y., Li, M., Bravo, M. A., Jin, L., Nori-Sarma, A., Xu, Y., … Bell, M. L. (2014). Air Quality in Lanzhou, a Major Industrial City in China: Characteristics of Air Pollution and Review of Existing Evidence from Air Pollution and Health Studies. Water, Air, and Soil Pollution, 225(10), 2187. http://doi.org/10.1007/s11270-014-2187-3

Appendix:

Table 1.

City | Year | Annual Average Concentration of SO2 (μg/m3) | Annual Average Concentration of NO2 (μg/m3) | Annual Average Concentration of PM10 (μg/m3) | 95th Percentile Daily Average Concentration of CO (mg/m3) | 90th Percentile Daily Maximum 8 hours Average Concentration of O3 (μg/m3) | Annual Average Concentration of PM2.5 (μg/m3) | Days of Air Quality Equal to or Above Grade II (day)
Beijing | 2013 | 26 | 56 | 108 | 3.4 | 188 | 89 | 167
Changchun | 2013 | 44 | 44 | 130 | 2.1 | 127 | 73 | 230
Changsha | 2013 | 33 | 46 | 94 | 2.3 | 134 | 83 | 196
Chengdu | 2013 | 31 | 63 | 150 | 2.6 | 157 | 96 | 139
Chongqing | 2013 | 32 | 38 | 106 | 1.5 | 163 | 70 | 207
Fuzhou | 2013 | 11 | 43 | 64 | 1.2 | 73 | 36 | 343
Guangzhou | 2013 | 20 | 52 | 72 | 1.5 | 156 | 53 | 259
Guiyang | 2013 | 31 | 33 | 85 | 1.3 | 101 | 53 | 278
Haikou | 2013 | 7 | 17 | 47 | 1.0 | 106 | 27 | 342
Hangzhou | 2013 | 28 | 53 | 106 | 1.9 | 155 | 70 | 212

Table 2.

City | Year | Volume of Industrial sulfur dioxide (ton) | Volume of Industrial nitrogen oxides Emission (ton) | Volume of Industrial smoke (powder) dust (ton) | Volume of sulfur dioxide Emission by Consumption (ton) | Volume of nitrogen oxides Emission by Consumption (ton) | Volume of Consumption Soot Emission (ton)
Beijing | 2013 | 52041 | 75927 | 27182 | 34967 | 13638 | 28258
Changchun | 2013 | 57246 | 95190 | 72970 | 7344 | 1545 | 7919
Changsha | 2013 | 21173 | 15951 | 19545 | 2366 | 153 | 2946
Chengdu | 2013 | 52040 | 44411 | 21452 | 4891 | 2109 | 661
Chongqing | 2013 | 494415 | 247905 | 179842 | 53261 | 4487 | 4401
Fuzhou | 2013 | 76043 | 72284 | 43483 | 1279 | 169 | 547
Guangzhou | 2013 | 65589 | 57164 | 16660 | 663 | 276 | 214
Guiyang | 2013 | 70603 | 30450 | 24233 | 35493 | 1753 | 5530
Haikou | 2013 | 1798 | 86 | 1149 | 11 | 17 | 5
Hangzhou | 2013 | 82021 | 67283 | 40243 | 633 | 335 | 135

Table 3.

City | Year | Industrial Waste Water Discharged (10,000 tons) | Industrial COD Emission (ton) | Industrial Ammonia Nitrogen (ton) | Urban Living Waste Water Discharged (10,000 tons) | Living COD Emission (ton) | Living Ammonia Nitrogen (ton)
Beijing | 2013 | 9486 | 6055 | 330 | 134991 | 89868 | 14189
Changchun | 2013 | 5482 | 11670 | 1384 | 20797 | 32006 | 6970
Changsha | 2013 | 4049 | 13499 | 456 | 43000 | 59931 | 8656
Chengdu | 2013 | 10524 | 12321 | 801 | 99860 | 102595 | 13144
Chongqing | 2013 | 33451 | 51534 | 3266 | 108937 | 218601 | 36211
Fuzhou | 2013 | 4682 | 5190 | 405 | 32268 | 67654 | 9509
Guangzhou | 2013 | 21391 | 22664 | 1389 | 135179 | 106077 | 17442
Guiyang | 2013 | 2262 | 6993 | 293 | 21774 | 26324 | 4490
Haikou | 2013 | 825 | 858 | 51 | 11120 | 6341 | 3799
Hangzhou | 2013 | 39186 | 31947 | 1373 | 53902 | 39650 | 8060

Exploration Figure 1.

Exploration Figure 2.

Exploration Figure 3.

Exploration Figure 4.

Exploration Figure 5.