Data Visualization for Business With Tableau

Visualisation for Business

ANL 201

The Science of Data Visualisation

Study Unit 2

January 2020

Data Visualisation

The big idea – Concepts

‣ Data — facts and statistics used for reference or analysis

‣ Data Visualisation — the graphical representation of data

Overwhelming amount of data available today

Pre-computing era visualisation

https://en.wikipedia.org/wiki/1854_Broad_Street_cholera_outbreak

During the cholera epidemic of 1849-1854 in London, John Snow showed a relationship between water wells and the severity of the outbreak amongst households

Benefits

‣ Provides the ability to comprehend huge amounts of data

‣ Allows the perception of emergent properties that are not anticipated

‣ Often enables problems with data to become immediately apparent

‣ Facilitates the understanding of both large-scale and small-scale features of the data

‣ Facilitates hypothesis formation

Note: why is the data wrong?

The four stages of the data visualisation process

1. Data Collection and Storage: the collection and storage of data

2. Data Pre-processing: the pre-processing of data to transform it into something one can understand

3. Graphics Engine: the display hardware and the graphics algorithms to produce data visualisation on screen

4. Human Visual and Cognitive Processing: human perceptual and cognitive systems that are involved in interpreting the visualised data

Data visualisation in everyday life

https://www.nationalgeographic.com/what-the-world-eats/

Semiotics of Data Visualisation

The big idea – Concepts

‣ Semiotics is the study of symbols and how they convey meaning

‣ This discipline originated in the United States with C. S. Peirce, and was later developed in Europe by the Swiss linguist Ferdinand de Saussure

‣ Saussure defines a principle of arbitrariness, and applies it to the relationship between a symbol and the thing it signifies

‣ A symbol that is meaningful in one culture may be nonsense in another

Note: even in the same context, the same symbol can be interpreted differently. Acronyms such as PIE and TPE are understood only in Singapore.

Properties of sensory and arbitrary representation

‣ Sensory refers to symbols and aspects of representation that use the perceptual processing power of the brain without training

‣ Arbitrary refers to aspects of representation that have no perceptual basis, so users must be trained to interpret them

‣ Sensory representation can be understood without training, is processed rapidly and in parallel, tends to be stable across individuals, cultures and time, and is resistant to instructional bias. Conversely, arbitrary representation is capable of rapid change and derives its power from culture. It can vary with culture and application

Properties of Sensory Representation

▪ Understanding without Training.

▪ Resistance to Instructional Bias.

▪ Sensory Immediacy.

▪ Cross-cultural Validity.

Properties of Arbitrary Representation

▪ Hard to Learn.

▪ Easy to Forget.

▪ Embedded in Culture and Applications.

▪ Formally Powerful.

▪ Capable of Rapid Change.

• Sensory vs arbitrary representation

Note: examples of arbitrary representation are mathematics (different notations for signs) and language.

The perceptual processing model

Note: how humans act and/or react to something.

Understanding Data

The two fundamental forms of data – entities and relationships

‣ Entities are generally objects of interest. A group of objects can be considered as a single entity by data visualisation designers

‣ Relationships form the structures that relate entities to one another

Note (entities): an entity can be a single object or a category of objects combined together.
Note (relationships): examples include the components of a car, social relationships such as good and bad friends, supervisory relationships, and professor-student relationships. A relationship can be structural or social.

Data attributes

‣ Both entities and relationships can have attributes. In general, something should be called an attribute when it is a property of some entity and cannot be thought of independently

‣ Defining what should be an entity and what should be an attribute is not always straightforward. For example, the price of a laptop could be thought of as an attribute of the laptop, but we can also think of that amount of money as an entity in itself. In this case we have to define the relationship between the laptop entity and the amount-of-money entity

Note: attributes are used to describe relationships.
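
The laptop example can be sketched in code. This is only an illustration of the two modelling choices, not a prescribed design; the class names, field names and values below are hypothetical and do not come from the course material.

```python
from dataclasses import dataclass


# Choice 1: price is an attribute of the laptop entity.
@dataclass
class LaptopWithPrice:
    model: str
    price: float  # has no meaning without the laptop it belongs to


# Choice 2: the amount of money is an entity in its own right,
# so a relationship must be defined to link it to the laptop.
@dataclass
class Laptop:
    model: str


@dataclass
class AmountOfMoney:
    value: float
    currency: str


@dataclass
class Costs:
    """Relationship between a laptop entity and an amount-of-money entity."""
    laptop: Laptop
    amount: AmountOfMoney
    as_of: str  # relationships can carry attributes too


# Hypothetical values, for illustration only.
as_attribute = LaptopWithPrice(model="Example 13", price=1500.0)
as_entity = Costs(
    laptop=Laptop(model="Example 13"),
    amount=AmountOfMoney(value=1500.0, currency="SGD"),
    as_of="2020-01",
)
```

Both versions hold the same facts. The second is more verbose, but it lets the amount of money be described and related to other entities independently of the laptop.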

The four measurement levels of data quality attribute

1. Nominal measurements classify items by label, category or other qualitative classification, with no implied order

2. Ordinal measurements arise from the operation of rank ordering

3. Interval measurements allow us to measure the degree of difference between items, but not the ratio between them

4. Ratio measurements estimate the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind

Note (nominal): categories cannot be compared as better or worse.
Note (ordinal): the difference between consecutive ranks may not be the same (the intervals are unequal), so a ranking does not tell us the size of the difference.
Note (ratio): similar to interval, but with a true zero point, so ratios between values are meaningful.
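
One way to see the practical difference between the four levels is to ask which summary makes sense at each one. The sketch below uses made-up values (hypothetical, not from any course dataset) and Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical example values, one list per measurement level.
shirt_colours = ["red", "blue", "red", "green"]  # nominal: labels only
satisfaction = [1, 2, 2, 4, 5]                   # ordinal: 1 = worst, 5 = best
temps_celsius = [10.0, 20.0, 30.0]               # interval: zero is arbitrary
weights_kg = [2.0, 4.0, 8.0]                     # ratio: zero means "none"

# Nominal: we can count categories and find the most common one.
most_common_colour = mode(shirt_colours)

# Ordinal: order is meaningful, so the median (middle rank) makes sense.
middle_rating = median(satisfaction)

# Interval: differences and means are meaningful, ratios are not
# (30 degrees Celsius is not "three times as hot" as 10 degrees Celsius).
temp_difference = temps_celsius[2] - temps_celsius[0]
mean_temp = mean(temps_celsius)

# Ratio: ratios are meaningful (8 kg is four times as heavy as 2 kg).
weight_ratio = weights_kg[2] / weights_kg[0]
```

Each level supports the operations of the levels before it, so a ratio attribute can also be ranked, differenced and counted.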

Metadata

‣ Metadata is structured information that explains, describes or locates the original data (also known as primary data), or otherwise makes using the original data more efficient

Preparing data with data visualisation applications

Discussion

The four measurement levels of data quality attribute

• What are some examples for the four levels of measurement that you can identify in your company, or any other organisations you are familiar with?

Tableau (Class Activity)

‣ Sit in your GBA groups

‣ Ensure that you have a working copy of Tableau Desktop installed on your computer

‣ Ensure that you have the following datasets downloaded onto your computer:

1. global_superstore_2016.xlsx

2. Sales 2016.xlsx

3. Products 2016.csv

4. Coffee Chain.xlsx

5. Office City.xlsx

Table Join
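
Outside Tableau, the idea behind a table join can be sketched in a few lines of plain Python. The tables, column names and values below are hypothetical stand-ins (they are not taken from the class datasets), and the helper `join` is written only for this illustration. The same logic applies to a cross-database join, where the two tables simply come from different files or connections.

```python
# Hypothetical "left" table: one row per order line.
sales = [
    {"order_id": 1, "product_id": "P1", "quantity": 2},
    {"order_id": 2, "product_id": "P2", "quantity": 1},
    {"order_id": 3, "product_id": "P9", "quantity": 5},  # no matching product
]

# Hypothetical "right" table: one row per product.
products = [
    {"product_id": "P1", "name": "Stapler"},
    {"product_id": "P2", "name": "Notebook"},
]


def join(left, right, key, how="inner"):
    """Join two lists of row dicts on `key`.

    Assumes `key` is unique in `right`. Supports "inner" and "left".
    """
    lookup = {row[key]: row for row in right}
    right_columns = [column for column in right[0] if column != key]
    joined = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            joined.append({**row, **match})
        elif how == "left":
            # Keep the unmatched left row and fill right-hand columns with None.
            joined.append({**row, **{column: None for column in right_columns}})
    return joined


inner = join(sales, products, "product_id", how="inner")
left = join(sales, products, "product_id", how="left")
```

Here `inner` keeps only the two orders whose product exists in both tables, while `left` keeps all three orders and leaves `name` empty for the unmatched one.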

Cross-database Join

Note: add connections.

Data Blending

More info:

https://help.tableau.com/current/pro/desktop/en-us/multiple_connections.htm

Note: new data source.

Discuss to identify:

• Primary and Secondary data sources

• linking field(s)
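
To make the roles of the primary source, secondary source and linking field concrete, here is a rough sketch. It assumes that a blend behaves like a left join from the primary source, performed after the secondary source has been aggregated on the linking field; that is a simplification rather than a description of Tableau's internals. All table contents and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical primary source: actual sales, one row per region.
primary = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": 80},
    {"region": "North", "sales": 50},
]

# Hypothetical secondary source: targets, several rows per region.
secondary = [
    {"region": "East", "target": 60},
    {"region": "East", "target": 30},
    {"region": "West", "target": 90},
]

LINKING_FIELD = "region"

# Step 1: aggregate the secondary source to the level of the linking field.
target_by_region = defaultdict(int)
for row in secondary:
    target_by_region[row[LINKING_FIELD]] += row["target"]

# Step 2: attach the aggregated values to every row of the primary source.
# Regions missing from the secondary source get None (shown as null).
blended = [
    {**row, "target": target_by_region.get(row[LINKING_FIELD])}
    for row in primary
]
```

Every primary row survives in `blended`, which is why the choice of primary source matters: swapping the two sources here would drop the "North" region entirely.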

Pivot Data from Columns to Rows

Pivot from wide format to long format

More info:

https://help.tableau.com/current/pro/desktop/en-us/pivot.htm

Pivot

To long format:
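
The wide-to-long reshaping can be sketched in plain Python. The table below is hypothetical (it is not one of the class datasets), and `pivot_longer` is a helper written only for this illustration: every column other than the identifier becomes one row holding the old column name and its value.

```python
# Hypothetical wide table: one column per year.
wide = [
    {"product": "Coffee", "2015": 10, "2016": 12},
    {"product": "Tea", "2015": 7, "2016": 9},
]


def pivot_longer(rows, id_column, names_to, values_to):
    """Turn every non-identifier column into its own (name, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], names_to: column, values_to: value}
            )
    return long_rows


long_format = pivot_longer(wide, id_column="product", names_to="year", values_to="sales")
```

Two wide rows with two year columns become four long rows, each with `product`, `year` and `sales`. The long format is usually easier to chart because the year becomes a single field that can be placed on an axis or used as a filter.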

Split

Split “Employee” column:
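
As an illustration of what a split does, the sketch below assumes the "Employee" values are a first name and a last name separated by a single space. That format and the sample names are assumptions, since the slide does not show the actual data, and `split_column` is a helper written only for this example.

```python
# Hypothetical values for the "Employee" column.
employees = ["Alice Tan", "Bob Lim", "Cher"]


def split_column(value, delimiter=" ", parts=2):
    """Split `value` into exactly `parts` pieces, padding with None."""
    pieces = value.split(delimiter, parts - 1)
    return pieces + [None] * (parts - len(pieces))


split_rows = [
    dict(zip(["first_name", "last_name"], split_column(name)))
    for name in employees
]
```

A value without the delimiter, such as the third one here, keeps its text in the first new column and leaves the second empty, so it is worth checking the split result for missing values before building a view on it.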

suss.edu.sg

Course Homepage https://canvas.suss.edu.sg/courses/21575

Study Guide https://ibookstore.suss.edu.sg/

Tableau Desktop https://www.tableau.com/products/trial

Tableau Tutorials https://www.tableau.com/learn/get-started/creator

Academic Calendar https://www.suss.edu.sg/docs/default-source/contentdoc/cel/ft-2020acadcalendar.pdf