Name: Uguz Ghany    ID number: 16203548

Applied Statistics 161.111

Assignment 1

Due date: Friday 24 April 2020

Total marks: 50

The population data

The population we are considering for this assignment is the 10,000 kuku (New Zealand green-lipped mussels) growing in a mussel farm in the Marlborough Sounds. The variables of interest are the length of the kuku (in millimetres), grade (small, medium or large) and sex (male or female).

Each kuku (mussel) has a unique ID. The population consists of:

· 1948 large kuku with ID numbers from 1 to 1948.

· 4457 medium kuku with ID numbers from 1949 to 6405.

· 3595 small kuku with ID numbers from 6406 to 10000.

You do not have access to the population data. You will collect a sample using the following steps.

· Click on the following link to open a Shiny app in your web-browser and type your ID number into the “Student ID:” box.

· http://shiny.massey.ac.nz/dleader/DataDownload111/

· This will generate an Excel file containing ID, length, grade and sex for each of the kuku in your sample. This is your sample data. The file you have downloaded is unique to you.

· Keep an electronic copy of your sample data for use in Assignment 2.

· Attach a copy of your sample data to the appendix.

· Use your sample data to answer the following questions in the spaces provided. You can re-size the answer spaces.

· Use Excel and incorporate the Excel output into your answers in the document below.

Part A: The Sampling Method [11 marks]

1. Your sample has been generated using the stratified random sampling method, where the kuku grades were used as the strata. Describe the process behind taking the sample with enough detail that someone could use your description to collect a sample themselves. [7 marks]

Stratified random sampling is a sampling method that divides the population into non-overlapping subgroups known as strata. The strata are formed from an attribute or characteristic that the members of each subgroup share.

After the population has been divided into strata, a random sample is selected from each stratum, and these samples together form the sample used in the study.

For example, let the population consist of N kuku in total, divided into k strata, with the ith stratum consisting of N_i kuku. Once the strata have been formed, a random sample of n_i kuku is selected from the ith stratum, and the combined samples are used for the study.

Talk about the actual stratified random sampling method for the kuku, not Ni or whatever you talked about in the second paragraph.
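As an illustration of what a kuku-specific description could look like, here is a minimal Python sketch of a stratified random sample drawn by grade. It is only a sketch under stated assumptions, not the procedure the Shiny app actually uses: the ID ranges come from the population description, the total sample size of 100 comes from the summary table, proportional allocation is inferred from the appendix (19 large, 45 medium and 36 small kuku), and the function names and the seed are hypothetical.

```python
import random

# Stratum ID ranges from the population description (upper bounds inclusive).
STRATA = {
    "large": range(1, 1949),       # IDs 1 to 1948
    "medium": range(1949, 6406),   # IDs 1949 to 6405
    "small": range(6406, 10001),   # IDs 6406 to 10000
}


def proportional_allocation(strata, sample_size):
    """Share the sample across strata in proportion to stratum size.

    Rounding each share can, in general, leave the total one above or
    below sample_size; for these strata and n = 100 it adds up exactly.
    """
    population = sum(len(ids) for ids in strata.values())
    return {
        grade: round(sample_size * len(ids) / population)
        for grade, ids in strata.items()
    }


def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample (without replacement) in each stratum."""
    rng = random.Random(seed)
    allocation = proportional_allocation(strata, sample_size)
    return {
        grade: sorted(rng.sample(strata[grade], allocation[grade]))
        for grade in strata
    }


allocation = proportional_allocation(STRATA, 100)
sample_ids = stratified_sample(STRATA, 100, seed=161111)  # hypothetical seed
print(allocation)
```

The kuku with the selected IDs would then be located and their length, grade and sex recorded. A different seed gives a different, equally valid sample.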

2. Why is stratified random sampling an appropriate method to use with this population? [3 marks]

Stratified random sampling is appropriate because the population divides into three clear, non-overlapping groups (large, medium and small kuku). Kuku within a grade have similar lengths, while lengths differ between grades, so the variation within each stratum is small and the population mean length can be estimated with greater precision. Sampling from every grade also ensures that all three groups are represented in the sample.

3. Paste a copy of your data in the appendix. [1 mark]

Part B: Exploratory analysis on categorical data [9 marks]

1. Use Excel to produce a table of counts for the sex of the kuku in your sample. [2 marks]

Sex of kuku: table of counts

Sex      Count
Male     53
Female   47
Total    100

2. What proportion of your sample are female kuku? [2 marks]

Of the 100 kuku in the sample, 47 are female. Thus, the proportion of female kuku is

47/100 = 0.47
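The same count and proportion can be checked outside Excel. The short Python sketch below rebuilds the sex column from the counts in the table (53 male, 47 female) rather than reading the spreadsheet, so the list `sexes` is a stand-in for the real column.

```python
from collections import Counter

# Stand-in for the "Sex" column of the sample data, built from the counts
# reported in the table of counts (53 male, 47 female).
sexes = ["Male"] * 53 + ["Female"] * 47

counts = Counter(sexes)
total = sum(counts.values())
proportion_female = counts["Female"] / total

print(counts["Male"], counts["Female"], total)
print(f"Proportion female: {proportion_female:.2f}")
```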

3. Use Excel to draw an appropriate graph to display the table you created above. [4 marks]

4. What does your graph tell you? [1 mark]

From the graph, we can see that the sample contains more male kuku (53) than female kuku (47), although the difference is small.

The graph shows counts only, so it tells us nothing about whether male kuku are longer than female kuku.

Because sex is a categorical variable with only two categories (male and female), centre, spread, shape and outliers do not apply to this graph.

Part C: Exploratory analysis on numerical data [15 marks]

1. Use Excel to draw a boxplot of kuku lengths. [3 marks]

2. Use Excel to draw a histogram of kuku lengths. [4 marks]

Paste your graph here

3. Use Excel to calculate the numerical summaries of the kuku lengths. Fill in the table with the values rounded to 2 decimal places. [4 marks]

Kuku lengths (mm)

Statistic            Value
Sample size          100
Mean                 100.41
Standard deviation   47.62
Minimum              11.90
Lower quartile       66.95
Median               92.30
Upper quartile       122.55
Maximum              234.80
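The values in this table can be re-derived from the appendix data. The sketch below uses Python's `statistics` module in place of Excel, with the 100 lengths copied from the appendix. One assumption is worth flagging: the reported quartiles match the "exclusive" interpolation method (the convention also used by Excel's QUARTILE.EXC), so that method is used here; the "inclusive" method would give slightly different quartiles.

```python
import statistics

# Lengths (mm) copied from the appendix, grouped by grade.
large = [216.1, 169.9, 217.8, 202.2, 178.5, 173.7, 155, 150.9, 179, 158.9,
         129.2, 197.6, 212.7, 234.8, 179.9, 187.7, 152.5, 173.2, 148.3]
medium = [82.4, 93.8, 98, 96.6, 111.6, 91, 80.9, 92.3, 101.1, 88, 82.3, 121,
          107.8, 117.3, 122.1, 105.1, 79.6, 129.7, 105.1, 128, 118.2, 105.3,
          63, 107.8, 82, 77.7, 94.1, 116.9, 129.3, 67.4, 104, 96.2, 91.6,
          92.3, 122.7, 72.8, 98.4, 93.9, 75.4, 133.4, 85.2, 135.8, 113.9,
          101.6, 100.7]
small = [48.7, 89.9, 74.2, 47.9, 59.1, 68.9, 83.8, 63.9, 49.1, 61.1, 49.2,
         70.7, 37.6, 12.6, 52.8, 58.1, 11.9, 72.3, 67.6, 35.2, 62.8, 50.1,
         74.3, 66.8, 50.6, 33.4, 51.3, 43, 70.4, 61, 100.2, 87.7, 47, 78.5,
         54.7, 63.4]
lengths = large + medium + small

# "exclusive" interpolates at position p * (n + 1) in the sorted data.
lower_q, median, upper_q = statistics.quantiles(
    lengths, n=4, method="exclusive"
)

summary = {
    "Sample size": len(lengths),
    "Mean": round(statistics.mean(lengths), 2),
    "Standard deviation": round(statistics.stdev(lengths), 2),  # sample SD
    "Minimum": min(lengths),
    "Lower quartile": round(lower_q, 2),
    "Median": round(median, 2),
    "Upper quartile": round(upper_q, 2),
    "Maximum": max(lengths),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```

`statistics.stdev` divides by n - 1, so it is the sample standard deviation rather than the population one.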

4. What do your plots and summary statistics tell you about kuku lengths in your sample? Consider the centre, spread, shape and outliers. [4 marks]

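The lengths of the kuku are centred at a median of 92.30 mm (the mean is 100.41 mm).

The range is 222.90 mm (from 11.90 mm to 234.80 mm), the interquartile range is 55.60 mm and the standard deviation is 47.62 mm, which tells us the lengths are widely spread.

The distribution of the lengths is positively (right) skewed with one peak: most of the lengths are concentrated towards the left, with a long tail towards the larger lengths. The mean being greater than the median is consistent with this.

There are four outliers, all at the upper end: 234.8 mm, 217.8 mm, 216.1 mm and 212.7 mm. Each lies above the upper fence of 122.55 + 1.5 × 55.60 = 205.95 mm.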
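One common way to identify outliers is the 1.5 × IQR rule, which is also a usual convention for boxplot whiskers. The sketch below applies that rule to the quartiles from the summary table; whether the course expects exactly this rule is an assumption. Only the two shortest lengths and the large-grade lengths from the appendix are tested, because every other length in the sample lies between 33.4 mm and 135.8 mm, well inside both fences.

```python
# Quartiles (mm) from the numerical summary table.
lower_q, upper_q = 66.95, 122.55
iqr = upper_q - lower_q

# Values beyond these fences are flagged as outliers (1.5 x IQR rule).
lower_fence = lower_q - 1.5 * iqr
upper_fence = upper_q + 1.5 * iqr

# The two shortest lengths plus every large-grade length from the appendix.
candidates = [11.9, 12.6,
              216.1, 169.9, 217.8, 202.2, 178.5, 173.7, 155, 150.9, 179,
              158.9, 129.2, 197.6, 212.7, 234.8, 179.9, 187.7, 152.5,
              173.2, 148.3]

outliers = sorted(
    x for x in candidates if x < lower_fence or x > upper_fence
)

print(f"Fences: {lower_fence:.2f} mm to {upper_fence:.2f} mm")
print("Outliers:", outliers)
```

Under this rule, the value 217.8 mm (ID 546) is flagged together with 212.7 mm, 216.1 mm and 234.8 mm, and no lengths fall below the lower fence.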

Part D: Confidence interval for the mean [15 marks]

1. Calculate a 95% confidence interval for the mean length of kuku on the Marlborough Sounds mussel farm. To get full marks you must show your working for the following: [7 marks]

Standard error = standard deviation / √(sample size)

= 47.62 / √100

SE = 4.762 mm

Margin of error (found using Excel) = t multiplier × SE, where the t multiplier for 95% confidence with 99 degrees of freedom is 1.984

= 1.984 × 4.762

= 9.45 mm

Lower limit = calculated by subtracting the margin of error from the mean length

= 100.41 – 9.45

= 90.96 mm

Upper limit = calculated by adding the margin of error to the mean length

= 100.41 + 9.45

= 109.86 mm

The 95% confidence interval is from 90.96 mm to 109.86 mm.
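The working can be checked with a few lines of Python. This is a sketch that assumes a t-based interval: the multiplier 1.984 is the 97.5th percentile of the t distribution with 99 degrees of freedom, taken from tables and hard-coded because the standard library has no t quantile function. With that multiplier the margin of error matches the 9.45 found in Excel and the limits match the interval quoted in the interpretation sentence (90.96 to 109.86).

```python
import math

# Values from the numerical summary table.
n = 100
mean = 100.41  # mm
sd = 47.62     # mm, sample standard deviation

# Assumed multiplier: t quantile for 95% confidence, n - 1 = 99 degrees
# of freedom, looked up from tables (about 1.984).
T_MULTIPLIER = 1.984

standard_error = sd / math.sqrt(n)
margin_of_error = T_MULTIPLIER * standard_error
lower_limit = mean - margin_of_error
upper_limit = mean + margin_of_error

print(f"SE = {standard_error:.3f} mm")
print(f"Margin of error = {margin_of_error:.2f} mm")
print(f"95% CI: {lower_limit:.2f} mm to {upper_limit:.2f} mm")
```

Using a rounded multiplier of 2 instead gives the slightly wider interval 90.886 mm to 109.934 mm.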

2. Write a sentence to interpret your confidence interval in context. [4 marks]

We are 95% confident that the interval from 90.96 mm to 109.86 mm captures the true mean length of all kuku on the Marlborough Sounds mussel farm.

3. Are the conditions for this confidence interval met? Explain. [2 marks]

Yes, the conditions are met.

This is because the sample was selected using a random method (stratified random sampling).

Also, the sample size of 100 kuku is larger than 30, so the sampling distribution of the sample mean is approximately normal, even though the lengths themselves are skewed.

4. Before the data was collected the manager of the mussel farm claimed that the average length of the kuku on the farm is 100mm. Does your confidence interval support the manager’s claim? Explain. [2 marks]

Yes, the confidence interval supports the manager’s claim, since the claimed mean of 100 mm lies between the lower and upper limits of the confidence interval. The claim is therefore plausible, although the interval does not prove that the mean is exactly 100 mm.


Appendix:

Paste your data here: [1 mark]

ID     Length   Grade    Sex
496    216.1    Large    Female
883    169.9    Large    Male
546    217.8    Large    Male
843    202.2    Large    Male
1483   178.5    Large    Female
54     173.7    Large    Male
281    155      Large    Female
379    150.9    Large    Female
563    179      Large    Female
582    158.9    Large    Male
1660   129.2    Large    Male
1311   197.6    Large    Female
788    212.7    Large    Female
995    234.8    Large    Female
905    179.9    Large    Male
1312   187.7    Large    Male
1652   152.5    Large    Female
1032   173.2    Large    Male
250    148.3    Large    Male
4466   82.4     Medium   Female
3556   93.8     Medium   Female
2316   98       Medium   Female
5546   96.6     Medium   Female
5567   111.6    Medium   Female
4859   91       Medium   Male
5187   80.9     Medium   Male
5071   92.3     Medium   Male
5077   101.1    Medium   Female
4984   88       Medium   Male
2322   82.3     Medium   Male
4043   121      Medium   Female
3983   107.8    Medium   Male
3312   117.3    Medium   Female
5684   122.1    Medium   Male
4605   105.1    Medium   Female
2243   79.6     Medium   Male
2617   129.7    Medium   Female
3030   105.1    Medium   Male
3431   128      Medium   Male
2410   118.2    Medium   Female
5198   105.3    Medium   Male
6356   63       Medium   Male
4499   107.8    Medium   Female
2982   82       Medium   Male
4885   77.7     Medium   Female
3418   94.1     Medium   Female
2200   116.9    Medium   Male
2382   129.3    Medium   Male
6346   67.4     Medium   Male
2830   104      Medium   Male
5661   96.2     Medium   Female
5480   91.6     Medium   Male
5125   92.3     Medium   Female
2986   122.7    Medium   Female
6374   72.8     Medium   Female
5449   98.4     Medium   Female
2224   93.9     Medium   Male
3264   75.4     Medium   Male
5021   133.4    Medium   Male
5869   85.2     Medium   Female
2715   135.8    Medium   Male
3706   113.9    Medium   Male
4450   101.6    Medium   Female
5085   100.7    Medium   Female
7731   48.7     Small    Female
6929   89.9     Small    Male
7785   74.2     Small    Male
8572   47.9     Small    Female
8120   59.1     Small    Female
7551   68.9     Small    Male
6480   83.8     Small    Female
9417   63.9     Small    Male
9081   49.1     Small    Male
8507   61.1     Small    Male
7406   49.2     Small    Female
8694   70.7     Small    Female
7982   37.6     Small    Female
8003   12.6     Small    Female
8006   52.8     Small    Female
7734   58.1     Small    Male
7687   11.9     Small    Male
8553   72.3     Small    Male
7770   67.6     Small    Male
9740   35.2     Small    Male
8013   62.8     Small    Male
7728   50.1     Small    Female
8552   74.3     Small    Female
9404   66.8     Small    Female
8711   50.6     Small    Female
7314   33.4     Small    Male
8042   51.3     Small    Male
9379   43       Small    Male
8815   70.4     Small    Male
9844   61       Small    Female
6953   100.2    Small    Female
6421   87.7     Small    Male
8660   47       Small    Male
6534   78.5     Small    Male
9290   54.7     Small    Male
8135   63.4     Small    Female

Graph for Part B, question 3: bar chart of counts by sex (Male 53, Female 47).
