Statistical Analysis Subject

SOUTHERN CROSS UNIVERSITY

School of Business and Tourism

MAT10251 Statistical Analysis

PROJECT COVER SHEET

Please complete all of the following details and then make these sheets the first pages of your project – do not send it as a separate document.

Your project must be submitted as a Word document.

PART B

Student Name:

Umair Elahi

Student ID No.:

23039692

Tutor’s name:

Badri Bhattarai

Due date:

13th January 2019

Date submitted:

16th January 2019

Declaration:

I have read and understand the Rules Relating to Awards ( Rule 3 Section 18 – Academic Integrity ) as contained in the SCU Policy Library. I understand the penalties that apply for academic misconduct and agree to be bound by these rules.

The work I am submitting electronically is entirely my own work.

Signed:

(please type your name)

Umair

Date:

16/01/19

STUDENT NAME: Umair Elahi

STUDENT ID NUMBER: 23039692

MAT10251 – Statistical Analysis

Project Part B

Complete the summary table below.

Sample Number (last digit of your student ID number)

2

Fuel

First letter family name A to M – Unleaded 91

First letter family name N to Z – Diesel

E

Confidence Level

95%

Level of Significance

5%

Value: 15%

PLEASE ENSURE YOU KEEP A COPY OF YOUR PROJECT

Self-Marking Sheet for Part A

Reflection/feedback ( approximately 200 words )

From the work done in part A, the representation of data in a graph was well understood and implemented. As showcased, two graphs were constructed using the same data set but different class intervals resulting in two different shapes. In addition, calculation of the descriptive statistics was well executed. The interpretation of the aforementioned statistical values was also done appropriately with deep understanding of what each statistic meant or represented.

However, there were some challenges and mistakes encountered during the tasks. First, the task of introducing data was a challenge. To avoid this in future, taking time to read and fully understand the population from which the sample is derived and also to understand the sample is a step to be taken. By doing so, I will be able to introduce the data before commencing on the calculations. Another challenge was in the choice of the measure of central tendency as the median and mean were close to each other. To avoid this, more background research regarding the same will be done.

From the submission and self-marking of part A, I was able to discover the mistakes and challenges I faced when doing the tasks and think of ways in which I can avoid or rectify such mistakes in the future.

Marking and Feedback Sheet Part B

Comments: Please follow the provided instruction. If you need any help, please see me next time.

Figure 1 (Histogram) Similar to Video

Bins (cents/litre)   Midpoints (cents/litre)   Frequency
134.99               132.50                    6
139.99               137.50                    21
144.99               142.50                    12
149.99               147.50                    16
154.99               152.50                    15
159.99               157.50                    10
164.99               162.50                    0

Figure 2 (Histogram) With New Classes, Bins and Midpoints

Bins (cents/litre)   Midpoints (cents/litre)   Frequency
131.99               131.00                    0
133.99               133.00                    4
135.99               135.00                    9
137.99               137.00                    7
139.99               139.00                    7
141.99               141.00                    4
143.99               143.00                    6
145.99               145.00                    4
147.99               147.00                    6
149.99               149.00                    8
151.99               151.00                    6
153.99               153.00                    7
155.99               155.00                    10
157.99               157.00                    1
159.99               159.00                    1
161.99               161.00                    0

The first and second graphs are constructed from the same data, but because different classes were chosen the shapes are different. The first graph shows a skew to the right, while the second shows a roughly symmetric or uniform distribution. The first graph uses a class width of 5 cents, while the second uses a class width of 2 cents.

The second graph is now described in detail. It represents the NSW Unleaded 91 fuel prices in 80 towns/suburbs, in cents per litre, with prices ranging from a minimum of 132.9 cents/litre to a maximum of 158.9 cents/litre.
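The effect of the class width on the frequencies can be checked directly from the raw prices. The following is a minimal Python sketch, not the Excel procedure used for the project: the list `PRICES` is transcribed from the sample table in Appendix B.2, and the helper `class_frequencies`, the starting value of 130 cents and the number of classes are illustrative assumptions chosen to match the bins listed for Figures 1 and 2.

```python
# Prices in cents per litre, transcribed from the sample table in Appendix B.2
# (40 regional stations followed by 40 Sydney stations).
PRICES = [
    143.8, 150.9, 151.9, 155.9, 147.9, 155.9, 153.9, 153.9, 148.9, 139.9,
    152.9, 152.9, 145.8, 142.9, 155.9, 146.9, 155.9, 156.9, 152.9, 155.9,
    155.9, 158.9, 149.9, 154.9, 138.9, 155.4, 149.9, 149.9, 146.7, 146.7,
    148.9, 144.7, 151.9, 154.9, 150.9, 155.9, 151.0, 153.9, 141.7, 153.9,
    143.9, 136.9, 133.9, 141.9, 135.9, 136.9, 148.4, 135.9, 146.4, 142.9,
    137.9, 135.7, 135.9, 150.0, 149.4, 132.9, 135.8, 135.9, 133.9, 138.9,
    143.9, 137.5, 137.9, 144.9, 136.9, 135.9, 139.9, 148.4, 143.4, 137.7,
    138.9, 139.9, 133.9, 140.9, 138.7, 140.4, 134.7, 145.9, 146.9, 134.5,
]


def class_frequencies(prices, start, width, n_classes):
    """Count prices in classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for price in prices:
        k = int((price - start) // width)  # index of the class holding this price
        counts[k] += 1
    return counts


# Class width of 5 cents (Figure 1) and of 2 cents (Figure 2).
five_cent = class_frequencies(PRICES, start=130, width=5, n_classes=7)
two_cent = class_frequencies(PRICES, start=130, width=2, n_classes=16)

print(five_cent)
print(two_cent)
```

Both lists sum to 80, so the two histograms describe the same sample; only the grouping differs.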

Descriptive Summary

                      Cents Per Litre
Mean                  145.3375
Median                145.85
Mode                  155.9
Minimum               132.9
Maximum               158.9
Range                 26
Variance              55.5586
Standard Deviation    7.4538
Coeff. of Variation   5.13%
Skewness              0.0083
Kurtosis              -1.3280
Count                 80
Standard Error        0.8334

From the above graph we can see that four suburbs have fuel prices in the range 130 to 134 cents/litre, four in the range 140 to 142 cents/litre and four in the range 144 to 146 cents/litre. The most common class is 154 to 156 cents/litre with ten suburbs, followed by 134 to 136 cents/litre with nine, while the ranges 136 to 138 cents/litre and 138 to 140 cents/litre each contain seven suburbs.

Descriptive Statistics

More useful information can be found in the descriptive statistics in the table given above. In particular, the lowest fuel price among the sampled suburbs in NSW is 132.9 cents/litre, while the highest is 158.9 cents/litre. The median, the middle value of the 80 suburbs' fuel prices, is 145.85 cents/litre, i.e. 50 percent of the suburbs have a price at or below this value. The mean, the average and a single-value measure of central tendency, is 145.3375 cents/litre. As the mean and median are nearly the same, we can conclude that the average fuel price among the 80 suburbs is 145.3375 cents/litre. The standard deviation of 7.4538 cents/litre is small relative to the mean (coefficient of variation 5.13%), which shows that most of the fuel prices are close to the mean.
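The coefficient of variation and the standard error in the Descriptive Summary follow from the mean, the standard deviation and the sample size. A short Python sketch of that arithmetic, using the reported values as inputs (the variable names are illustrative):

```python
import math

# Values reported in the Descriptive Summary (cents per litre).
mean = 145.3375
std_dev = 7.4538
n = 80

# Coefficient of variation: standard deviation as a percentage of the mean.
coeff_variation = std_dev / mean * 100

# Standard error of the mean: standard deviation divided by sqrt(n).
std_error = std_dev / math.sqrt(n)

print(f"Coefficient of variation: {coeff_variation:.2f}%")  # 5.13%
print(f"Standard error: {std_error:.4f}")                   # 0.8334
```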

Five-Number Summary

Minimum          132.90
First quartile   137.90
Median           145.85
Third quartile   151.90
Maximum          158.90

Finally, the five-number summary given above divides the sample into quarters: 25% of the prices in the sample lie below the first quartile, 137.90 cents/litre, and another 25% lie above the third quartile, 151.90 cents/litre.
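The five-number summary also gives the interquartile range, which a boxplot uses to flag possible outliers. The sketch below assumes the common 1.5 × IQR convention for the fences; that convention is an assumption here and is not stated in the project.

```python
# Five-number summary reported above (cents per litre).
minimum, q1, median, q3, maximum = 132.90, 137.90, 145.85, 151.90, 158.90

# Interquartile range: spread of the middle 50% of the prices.
iqr = q3 - q1

# Fences under the assumed 1.5 * IQR boxplot convention.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A value outside the fences would be drawn as an outlier.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"IQR: {iqr:.2f}")
print(f"Fences: {lower_fence:.2f} to {upper_fence:.2f}")
print(f"Outliers present: {has_outliers}")
```

Under this convention the minimum and maximum both lie inside the fences, so no price would be marked as an outlier.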

Figure 3 Boxplot

Written Answer Part B Components of a longer report

Both questions in part B deal with whether or not motorists view the price of fuel as expensive, though from different perspectives.

Question 1 in particular answers the question of whether the price of the fuel is expensive from the perspective of the population mean. The sample mean was estimated to be 145.3375 cents.

The results are as follows:

The interval was found to be [143.7074 , 146.9709] cents

Since the interval does not include the value $1.50 or 150 cents, the null hypothesis is rejected.

Comment by Badri Bhattarai: ????? please display your excel output.

Question 2 on the other hand answers the question of whether or not the fuel price is expensive from the perspective of a subset (more than 25% of petrol stations) of the sample having the fuel price at least $1.50 per litre.

Calculations were done and the results are as follows:

It was found that the price of fuel in 24 out of 80 petrol stations in the state was higher than $1.50. This translates to 30%.

B.1 Average Price Unleaded 91/Diesel Price

No, the average price of fuel on that day in the state specified was not expensive. According to the interval test that was carried out (see the appendix), the null hypothesis, which states that the fuel was expensive, was rejected.

B.2 Unleaded 91/Diesel Price Expensive

Yes, the price of fuel was at least $1.50 per litre in more than 25% of petrol stations in the state specified by the sample.

From the foregoing, we conclude that, using the criterion that motorists perceive fuel to be expensive when the price is at least $1.50 at more than 25% of petrol stations in a state, the price of fuel was expensive on the day in the state specified.

Comment by Badri Bhattarai: Support your answers with your excel outputs

Appendices Part B

Appendix B.1 – Statistical answer for Question 1

The random variables were defined as follows:

· $\bar{X}$ is a random variable representing the sample mean.

· $\sigma$ represents the standard deviation of the data from the mean.

· N represents the number of entries or petrol stations in the sample.

The following assumptions were made in the calculation and inference of the data:

$X \sim N(\bar{X}, \sigma^2)$, i.e. X follows a normal distribution with mean $\bar{X}$ and variance $\sigma^2$.

The interval test was chosen in this case because, with the descriptive statistics already calculated, the interval method was easier and faster to use. The interval method also requires little recalculation if the average price at which fuel is considered expensive changes from $1.50: all that is needed is to check whether the new value falls inside the interval and make a decision.

Hypothesis testing

Null hypothesis: The price of fuel is expensive. In other words, the average price is at least $1.50.

Alternative hypothesis: The price of fuel is not expensive. In other words, the average price is less than $1.50.

To test the above hypothesis, a confidence interval was constructed as shown below:

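$\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{N}$, where $\bar{X}$ = 145.3375 cents is the sample mean, $\sigma$ = 7.4538 cents is the standard deviation, N = 80 is the sample size and $z_{\alpha/2}$ is the critical value for the chosen confidence level.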

The 95% confidence interval was found to be [143.7074, 146.9709] cents.

When comparing the value 150 cents to the interval, it can be seen that the value falls outside the interval on the upper limit. Therefore, the null hypothesis is rejected.
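The interval can be recomputed from the reported mean, standard deviation and sample size. The sketch below assumes a standard normal critical value (about 1.96 for 95% confidence); a t critical value with 79 degrees of freedom would give a slightly wider interval. With these inputs the limits come out at roughly 143.70 and 146.97 cents, so the recomputed lower limit may differ slightly from the reported one in the later decimals.

```python
import math
from statistics import NormalDist

# Values reported in the Descriptive Summary (cents per litre).
mean = 145.3375
std_dev = 7.4538
n = 80
confidence = 0.95

# Two-sided critical value from the standard normal distribution (assumed).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error and interval limits.
margin = z * std_dev / math.sqrt(n)
lower = mean - margin
upper = mean + margin

# Benchmark price of $1.50 per litre, expressed in cents.
threshold = 150.0
threshold_inside = lower <= threshold <= upper

print(f"Interval: [{lower:.4f}, {upper:.4f}] cents")
print(f"150 cents inside interval: {threshold_inside}")
```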

For question 1, the Excel output used was that of the descriptive statistics that are needed in the calculation of the interval.

Descriptive Summary

                      Cents Per Litre
Mean                  145.3375
Median                145.85
Mode                  155.9
Minimum               132.9
Maximum               158.9
Standard Deviation    7.4538

Interpretation of results: since we have rejected the null hypothesis, we conclude that the price of fuel is not expensive as per the criterion used in question 1.
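As a complementary check that is not part of the original analysis, the hypotheses stated in this appendix (null: average price at least 150 cents; alternative: less than 150 cents) can be examined with a one-sided test statistic. The sketch assumes a normal approximation and treats the sample standard deviation as known; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values reported in the Descriptive Summary (cents per litre).
mean = 145.3375
std_dev = 7.4538
n = 80
mu0 = 150.0      # benchmark average price under the null hypothesis
alpha = 0.05     # level of significance

# Test statistic: distance of the sample mean from the benchmark,
# measured in standard errors.
std_error = std_dev / math.sqrt(n)
z_stat = (mean - mu0) / std_error

# Lower-tail p-value, because the alternative is "less than 150 cents".
p_value = NormalDist().cdf(z_stat)
reject_null = p_value < alpha

print(f"z = {z_stat:.4f}")
print(f"Reject null hypothesis: {reject_null}")
```

Under these assumptions the statistic is far below the one-sided critical value of about -1.645, which agrees with the decision reached from the interval.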

Appendix B.2 – Statistical answer for Question 2

For question two, only one random variable was defined. X represents the individual price of fuel at each petrol station in the state.

The following logical function was used in Excel: =IF(C2:C81>150,1,0), where column C contained the price of fuel at each petrol station in cents. The column created by this logical function was then summed to find the total number of stations with a fuel price higher than 150 cents.

Hypothesis testing

Null hypothesis: the percentage of petrol stations with fuel price higher than 150 cents is greater than 25%, hence fuel price is expensive.

Alternative hypothesis: the percentage of petrol stations with fuel price higher than 150 cents is not greater than 25%, hence fuel price is not expensive.

Comment by Badri Bhattarai: ???

It was found that 24 petrol stations had a fuel price higher than 150 cents. This translates to 30%. Therefore, we fail to reject the null hypothesis.
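The decision can be illustrated with a one-sample test statistic for a proportion. This is a hedged sketch rather than the method used in the project: it assumes the null hypothesis is read as "the proportion is at least 25%", uses the normal approximation to the binomial, and takes the count of 24 out of 80 as given.

```python
import math
from statistics import NormalDist

# Sample result: 24 of 80 stations priced above 150 cents.
successes = 24
n = 80
p0 = 0.25        # benchmark proportion from the criterion
alpha = 0.05     # level of significance

# Sample proportion and its standard error under the benchmark.
p_hat = successes / n
std_error = math.sqrt(p0 * (1 - p0) / n)

# Test statistic and lower-tail p-value (alternative: proportion below 25%).
z_stat = (p_hat - p0) / std_error
p_value = NormalDist().cdf(z_stat)
reject_null = p_value < alpha

print(f"Sample proportion: {p_hat:.2%}")
print(f"z = {z_stat:.4f}")
print(f"Reject null hypothesis: {reject_null}")
```

Because the sample proportion lies above 25%, the lower-tail p-value is large and the null hypothesis is not rejected under this reading.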

Interpretation of results: since we have failed to reject the null hypothesis, we conclude that the price of fuel is expensive as per the criterion used in question 2. The Excel output is as shown below:

Town/Suburb         Location           Unleaded 91 (Cents per Litre)   logic
Albury              Regional           143.8                           0.0
Bathurst            Regional           150.9                           1.0
Bermagui            Regional           151.9                           1.0
Bourke              Regional           155.9                           1.0
Broken Hill         Regional           147.9                           0.0
Casino              Regional           155.9                           1.0
Coffs Harbour       Regional           153.9                           1.0
Coonabarabran       Regional           153.9                           1.0
Dorrigo             Regional           148.9                           0.0
Drake               Regional           139.9                           0.0
Evans Head          Regional           152.9                           1.0
Glen Innes          Regional           152.9                           1.0
Goulburn            Regional           145.8                           0.0
Gunnedah            Regional           142.9                           0.0
Halfway Creek       Regional           155.9                           1.0
Kempsey             Regional           146.9                           0.0
Lismore             Regional           155.9                           1.0
Manilla             Regional           156.9                           1.0
Moree               Regional           152.9                           1.0
Mudgee              Regional           155.9                           1.0
Mungindi            Regional           155.9                           1.0
Muswellbrook        Regional           158.9                           1.0
Narrabri            Regional           149.9                           0.0
Newcastle West      Regional           154.9                           1.0
Port Kembla         Regional           138.9                           0.0
Port Macquarie      Regional           155.4                           1.0
Queanbeyan          Regional           149.9                           0.0
Tamworth            Regional           149.9                           0.0
Tenterfield         Regional           146.7                           0.0
Tenterfield         Regional           146.7                           0.0
Tweed Heads         Regional           148.9                           0.0
Ulladulla           Regional           144.7                           0.0
Uralla              Regional           151.9                           1.0
Waga Waga           Regional           154.9                           1.0
Walgett             Regional           150.9                           1.0
Wauchope            Regional           155.9                           1.0
West Armidale       Regional           151.0                           1.0
Woolgoolga          Regional           153.9                           1.0
Wyong               Regional           141.7                           0.0
Yamba               Regional           153.9                           1.0
Alexandria          Capital - Sydney   143.9                           0.0
Arncliffe           Capital - Sydney   136.9                           0.0
Bankstown           Capital - Sydney   133.9                           0.0
Baulkham Hills      Capital - Sydney   141.9                           0.0
Bexley North        Capital - Sydney   135.9                           0.0
Blacktown           Capital - Sydney   136.9                           0.0
Bondi Junction      Capital - Sydney   148.4                           0.0
Brighton Le Sands   Capital - Sydney   135.9                           0.0
Brookvale           Capital - Sydney   146.4                           0.0
Cabramatta          Capital - Sydney   142.9                           0.0
Casula              Capital - Sydney   137.9                           0.0
Croydon Park        Capital - Sydney   135.7                           0.0
Fairfield           Capital - Sydney   135.9                           0.0
Five Dock           Capital - Sydney   150.0                           0.0
Forestville         Capital - Sydney   149.4                           0.0
Granville           Capital - Sydney   132.9                           0.0
Homebush            Capital - Sydney   135.8                           0.0
Leppington          Capital - Sydney   135.9                           0.0
Lewisham            Capital - Sydney   133.9                           0.0
Lidcombe            Capital - Sydney   138.9                           0.0
Maroubra            Capital - Sydney   143.9                           0.0
Marrickville        Capital - Sydney   137.5                           0.0
Miranda             Capital - Sydney   137.9                           0.0
Mona Vale           Capital - Sydney   144.9                           0.0
Mortdale            Capital - Sydney   136.9                           0.0
North Ryde          Capital - Sydney   135.9                           0.0
Northwood           Capital - Sydney   139.9                           0.0
Pagewood            Capital - Sydney   148.4                           0.0
Pennant Hills       Capital - Sydney   143.4                           0.0
Petersham           Capital - Sydney   137.7                           0.0
Punchbowl           Capital - Sydney   138.9                           0.0
Quakers Hill        Capital - Sydney   139.9                           0.0
Revesby             Capital - Sydney   133.9                           0.0
Ryde                Capital - Sydney   140.9                           0.0
Sydney              Capital - Sydney   138.7                           0.0
Tarren Point        Capital - Sydney   140.4                           0.0
Villawood           Capital - Sydney   134.7                           0.0
West Hoxton         Capital - Sydney   145.9                           0.0
Woolloomooloo       Capital - Sydney   146.9                           0.0
Yagoona             Capital - Sydney   134.5                           0.0
                                                                       24.0
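The indicator column and its total can be reproduced outside Excel. The following is a minimal Python sketch, assuming the 80 prices are transcribed from the table above in the same order; the variable names are illustrative. It also counts stations priced at 150 cents or more, since "higher than 150 cents" and "at least 150 cents" differ for a station priced at exactly 150.0 cents.

```python
# Prices in cents per litre, in the order of the table above.
PRICES = [
    143.8, 150.9, 151.9, 155.9, 147.9, 155.9, 153.9, 153.9, 148.9, 139.9,
    152.9, 152.9, 145.8, 142.9, 155.9, 146.9, 155.9, 156.9, 152.9, 155.9,
    155.9, 158.9, 149.9, 154.9, 138.9, 155.4, 149.9, 149.9, 146.7, 146.7,
    148.9, 144.7, 151.9, 154.9, 150.9, 155.9, 151.0, 153.9, 141.7, 153.9,
    143.9, 136.9, 133.9, 141.9, 135.9, 136.9, 148.4, 135.9, 146.4, 142.9,
    137.9, 135.7, 135.9, 150.0, 149.4, 132.9, 135.8, 135.9, 133.9, 138.9,
    143.9, 137.5, 137.9, 144.9, 136.9, 135.9, 139.9, 148.4, 143.4, 137.7,
    138.9, 139.9, 133.9, 140.9, 138.7, 140.4, 134.7, 145.9, 146.9, 134.5,
]

THRESHOLD = 150.0  # $1.50 per litre, in cents

# Same rule as the Excel formula: 1 if the price is above the threshold, else 0.
indicator = [1 if price > THRESHOLD else 0 for price in PRICES]
count_above = sum(indicator)
proportion_above = count_above / len(PRICES)

# Variant of the rule that also counts a price of exactly 150.0 cents.
count_at_least = sum(1 for price in PRICES if price >= THRESHOLD)

# Cross-check against the reported sample mean.
sample_mean = sum(PRICES) / len(PRICES)

print(f"Stations above 150 cents: {count_above} ({proportion_above:.0%})")
print(f"Stations at 150 cents or more: {count_at_least}")
print(f"Sample mean: {sample_mean:.4f}")
```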

Please do not deduct marks for late submission. My unit assessor approved an extension, but I was not able to upload my assignment again until the extension date, so she reset my submission link. A copy of the email is attached below. Thanks.


                                                  Max Marks   Recommended Marks
Cover sheet or sample incorrect                   -2.0
Format incorrect, including name                  -2.0
Statistical Calculations
Graph (Frequency Histogram or Polygon)            4.0         4.0
Descriptive Statistics                            4.0         4.0
Total Descriptive Statistics                      8.0         8.0
Written Answer (Component of a business report)
Introduction and data                             2.0         0.0
Comments on graph                                 3.0         3.0
Comments on descriptive statistics                4.0         3.0
Difference in measures of central tendency        1.0         1.0
Structure, grammar and spelling                   2.0         2.0
Total Report                                      12.0        9.0
Total                                             20.0        17.0


                                                          Max Marks   Mark
Cover sheet or sample incorrect                           -2
Format incorrect, including file name                     -2
Self-Marking and Reflection Part A (5 marks)
Self-Marking Part A                                       2           2.0
Reflection                                                3           2.0
Part B Statistical Inference Tasks (19 marks)
Statistical Inference Question 1
Choice of technique, assumptions & other required steps   4           1.0
Calculation (Excel output)                                3           0.0
Conclusion                                                2           0.0
Statistical Inference Question 2
Choice of technique, assumptions & other required steps   5           0.0
Calculation (Excel output)                                3           0.0
Decision and conclusion                                   2           0.0
Written task - Discussion and results (6 marks)
Question 1                                                2           1.0
Question 2                                                2           0.0
Structure, grammar and spelling                           2           1.0
Total Part B                                              30          7.0
