report

profileSnug
Workshop_07.pdf

BUS105

Business Information

Systems

Workshop Week 7

Structured Data Management

(Introductory Analytics) Life Cycle

Workshop (Excel)22

Copyright Notice

COPYRIGHT COMMONWEALTH OF AUSTRALIA

Copyright Regulations 1969 WARNING

This material has been reproduced and communicated to you by or on behalf of Kaplan Higher

Education pursuant to Part VB of the Copyright Act 1968 (the Act). The material in

this communication may be subject to copyright under the Act. Any further reproduction

or communication of this material by you may be the subject of copyright protection under the Act.

Do not remove this notice

2

Lesson Learning Outcomes

1 Learn about the data analytics project

lifecycle

2 Do a hands-on exercise in excel with

reference to LO1

3 Interpret results as required

Excel Workshop Week 7 Vehicle Cost Analysis

Commons.wikipedia.org

Business Question: How much does it cost

to run a bus service?

Intechen.com

Today’s Tasks

• Please download today’s data file now

BUS105_ProximityBus_for_week_7.xlsx

• You will be doing a hands-on Microsoft Excel cost analysis

exercise in order to answer the business question:

How much does it cost to run a bus service?

• General Excel instructions will be followed by your specific

instructions.

• At the same time we will be learning about the data

analytics lifecycle and referring to it every now and then.

Data Analytics Lifecycle

Business Understanding

Data Understanding

Data Preparation

Data Modelling

Evaluation

Deployment

Kelleher, JD, MacNamee, B & D’Arcy A 2015, Fundamentals of machine learning for

predictive analytics, The MIT Press, Cambridge Massachusetts, p12-15.

Data

Kelleher, JD, MacNamee, B & D’Arcy A 2015, Fundamentals of machine learning for

predictive analytics, The MIT Press, Cambridge Massachusetts, p12-15.

Stage 1: Business Understanding

This is stage 1 of the data analytics lifecycle.

Some questions you should answer during this stage:

• What are your objectives/aims?

e.g. Is our bus company making a profit?

• What resources do you need to start the project?

e.g. Do we need an analyst? What software do we need?

• What are your business success criteria?

e.g. How can we maintain a bus good service and keep

costs below a certain level?

• In this workshop we will work with vehicle mileage and

cost data, draw charts and perform cost calculations

using excel in-built functions.

Opening the Excel Data File

Double click on the

BUS105_ProximityB

us_for_week_7.xlsx

file icon to open the

file in Excel.

Data Understanding

• Questions to ask at this stage:

• What data have you got and is it complete?

e.g. Bus ID, cost per km, km driven...

• What was the source? e.g. Maintenance department

• What other data would be useful,

• e.g. Bus ticket prices, number of passengers per day, …..

• Do you have a description of the data

e.g. (Data dictionary or encyclopedia)

This Photo by Unknown Author is

licensed under CC BY

Instructions

Clicking on a cell

makes it active

• Use the mouse

OR

• Use the arrow keys

to move around

How to Select a Cell

Cell is active when a heavy

border surrounds it.

© 2017 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except

for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected

website for classroom use.

Ischool.utexus.edu

Cell A1

Instructions To enter worksheet titles, numbers or text

– Open the file in Microsoft Excel

– Click on a cell to make the cell active

– Type desired text

– Click the ENTER button to complete the entry

– Move to the next cell of interest and repeat

Additional information:

(To cut and paste, use Ctrl C and Ctrl V as in Word)

Your instructions on next page…

How to Enter Items

© 2017 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part,

except for use as permitted in a license distributed with a certain product or service or otherwise on a password-

protected website for classroom use.

Now Enter Text

You will notice that some information is missing.

Your Instructions

• Click on cell B3 & enter “Cost per Km”

• Move to cell C6 and enter the missing value

14949.00

• Move to cell C7 and enter the missing value

14905.00

• Move to cell E3 and replace “M cost” with

“Mileage Cost”

Your File Should Look Like This

Are the first four columns complete?

Formulae With Simple Operators

Instructions

Using simple operators and relative cell reference

• Recall all formulae start with an = sign

• Simple operators for addition, subtraction, multiplication,

division and nth power are +, -, *, / and ^n where n is the

power, e.g. 5 squared is =5^2

• If we drag the cursor along, the cell addresses are changed

relative to position. This is called relative cell referencing.

Your Instructions

• Obtain mileage cost: Go to cell E4 in the Mileage column

of your worksheet, type “= B4*C4” ENTER

Fill Handle

Instructions :

How to copy a cell calculation to adjacent cells in a col/row

• With the cell containing the contents (e.g. E4), to fill down

the column, point to the fill handle to activate it (i.e. click

on the lower right hand corner of the active cell and a plus

sign should appear “+”)

• Hold on to the corner and drag the handle down the

column as required (i.e. to cell E12)

Your Instructions

• Copy the formula (using the fill handle) down the rest of

column E to cell E12

Mileage

Costs

Your File Should Look Like This

Using Simple Formulae Cont.

Your Instructions

• Obtain total cost: Go to cell F4 and type in “=D4 + E4”

ENTER

• Copy the formula (by dragging the cursor) down the rest

of column F to cell F12

• Obtain total cost per Km: Go to cell G4 and type in

“=F4/C4” ENTER

• Copy the formula (using the fill handle) down the rest of

column G to cell G12

Your File Should Look Like This

Finding Totals Using ‘SUM’

Instructions

To sum a column of numbers

– Click the first empty cell below the column of numbers to sum

– Click the AutoSum button on the HOME tab to display a formula

in the formula bar and in the active cell, for example

=SUM(B4:B12)

Your Instructions

• Highlight G4 to G12 and click on the decrease decimal places button

in the Number menu. Reduce the decimals to 2 places.

• Sum the columns using the sum formula in the “totals” row (row 13)

Instructions

To sum a column of numbers

– Click the first empty cell below the column of numbers to sum

– Click the AutoSum button on the HOME tab to display a formula in

the formula bar and in the active cell, for example, =SUM(B4:B12)

Your Instructions

• Find column totals using the SUM formula in the “Totals” row (row 13)

• Highlight G4 to G12 and click on the decrease decimal places button

in the Number menu. Reduce the decimals to 1 place.

Finding Totals Using ‘SUM’

Your File Should Look Like This

Data Preparation

• We have carried out a small amount of data preparation.

Some of the questions you should answer during this

stage:

• Have you considered your data storage and

maintenance capacity?

e.g. Do you need new software, cloud warehousing or

just a PC?

• Do you need to transform (data wrangling) or integrate

the data in any way?

e.g. Finding total cost, reducing numbers to one decimal

Stage 4: Modelling

Questions for the modelling phase:

• What models will you use?

e.g. Descriptive, predictive analytics or AI techniques

• How will you train/test and assess the models?

e.g. You will need a training data set if you are going to

use machine learning

Let’s look at some basic summary statistics, average, max

and min.

https://www.sv-europe.com/crisp-dm-methodology//

Absolute Versus Relative

Addressing

Table 3-6 Examples of Absolute, Relative, and Mixed Cell References

Cell Reference Type of Reference Meaning

$B$4 Absolute cell reference Both column and row references remain the same

when you copy this cell, because the cell references

are absolute

B4 Relative cell reference Both column and row references are relative. When

copied to another cell, both the column and row in the

cell reference are adjusted to reflect the new location

B$4 Mixed reference This cell reference is mixed. The column reference

changes when you copy this cell to another column

because it is relative. The row reference does not

change because it is absolute

$B4 Mixed reference This cell reference is mixed. The column reference

does not change because it is absolute. The row

reference changes when you copy this cell reference to

another row because it is relative

Absolute Versus Relative Address

Instructions

To enter a formula containing absolute cell references

– Given a selected cell, enter the formula and then press the F4 key

to change the most recently typed cell reference from a relative cell

reference to an absolute cell reference

Your Instructions

– Go to cell A17 in your spreadsheet and type in “9”

– Calculate the average using each total divided by 9 using absolute

referencing: Go to cell B14 and type in “=B13/$A$17”

– Apply this to cells C14 to G14 using the fill handle and adjust the

results to 2 decimal places

Find Max and Min

Instructions

To find the maximum or minimum of a range of cells type in

=max(start cell:end cell) for maximum

=min(start cell:end cell) for minimum

Your Instructions

• Fill in the “Highest” and “Lowest” column values in row 15

and 16, using =max(B4:B12) and =min(B4:B12)

• If required, change all values so that 2 decimal places are

displayed

Your File Should Look Like This

Stage 5: Evaluation

Questions regarding the evaluation and deployment

phases:

• How will you assess the results in terms of business

success criteria?

e.g. How are these results going to help the bus company?

• Have you reviewed all the modelling so far?

e.g. What other preliminary models we can learn from?

See evaluation activity on the next page

Activity 1:Evaluation

• Answer the following questions:

1. Which bus costs the most to run per km?

2. Which bus has (lowest mileage) driven the

least number of kilometres?

3. What is the lowest maintenance cost?

Stage 5: Deployment

Questions regarding the deployment phases:

• Next steps? Do you need to gather more data, carry out

another data mining project, or start deployment?

e.g. let’s try filtering and sorting values of interest

(see next page)

• How will you implement your findings?

e.g. find a way to reduce the cost of bus 701

Filtering Instructions

Filtering

• The editing menu has sort and filter commands

• To filter items based on a particular column: click on the column to be filtered

• Move your mouse to the editing menu and select filter a small box with an arrow will appear at the top of the column

• Clicking on the arrow reveal the items in the list

• Unticking individual boxes hides (filters out those items) and the filter box changes shape

• This command is good for removing BLANKS in data sets

• To display (unfilter) the list click on “select all” in the list of filter boxes, and your original data should be displayed

Filtering

We want to filter out the costs per Km less than

1.80

Your Instructions

1. Click on the top of column B of your spreadsheet

2. Take the cursor to the editing menu and select filter

3. Click on the filter box in column B to reveal the details

of data in column B

4. Untick the boxes with values lower than 1.80 to hide

them

Notice that the row

with the minimum

is now hidden too

Filtered column

Your File Should Look Like This

We want to sort the mileage costs while keeping the other

row information consistent with those costs

Your Instructions

1. First unfilter column B by ticking the “select all” box in the

filter options

2. Copy just the values of data block from cell A3 to G12 to

Sheet 1 by highlighting the data and using control C

3. Click on cell A1 in sheet 1, right click on your mouse and

select paste special, click on the values option and OK

4. Select column E, go to the sort menu and select “sort

smallest to largest”

The “Expand selection” box will appear, make sure the

expand selection option is checked and then press SORT

Sorting

Sorting

Your Worksheet Sheet 1 should look like this

Total Cost per bus as a Percentage of

the Entire Total Cost

B701 15%

B702 12%

B703 11%

B704 11%

B705 7%

B706 10%

B707 11%

B708 11%

B709 12%

TOTAL COST PER BUS AS A PERCENTAGE OF SUM TOTAL COST

An alternative representation of total costs

Activity 2: Interpretation

Answer these questions:

1. Which bus has the greatest mileage cost?

2. What is the maintenance cost of the bus of

interest in question 1?

3. Is it easier to interpret the table or pie chart?

Why?