Self reflective report 2

toygikiller
ManagingData.pptx

23/08/2019


Housekeeping

Assignment 1 Due 11:59pm Monday 26/08/2019 AEST

Assignment 2 Project Update due by 11:59pm AEST Sunday 1st September 2019 via email to tutor (Subject: Team # Project Update)

Discussion Boards

– Assignment threads will be monitored

– Weekly online activity threads are intended to be student-student

The INF30018 Journey thus far: (1) Complexity, (2) Strategy, (3) Data, (4) Enterprise Architecture, (5) Emerging Data, (6) Benefits Realisation & Management, (7) Governance, (8) Service Delivery, Easter Break


Information Systems Management

Managing Data

CRICOS 00111D

TOID 3069


Manage THIS! #censusfail

Census Night 2016

How NOT to manage data.

What did the ABS do wrong?


http://www.itnews.com.au/news/experts-cast-doubt-on-abs-census-dos-claims-433926

http://www.abc.net.au/news/2016-08-10/census-night-how-the-shambles-unfolded/7712964


and … Wisdom


Assumed prior knowledge

This lecture assumes YOU are familiar with the general concepts of

DATA

– IBIS / BI / INF10002 Database Analysis and Design

– INF20004 Database Concepts and Modelling, etc.

– Object identities and schemata are not covered in this unit


Takeaways from today’s lecture

Repetition to a minimum

– Assumed knowledge, Data, Information/Knowledge

The Data Deluge

Dealing with data / what you need to know

– Measuring data

– Business case for understanding data

– Securing data

Big Data / Drivers

– Growth of unstructured data

Metadata / Managing metadata

Master Data Management (MDM)

Data Centres / Management

Data Visualisations

Data Technologies


Measuring data

Can you manage if you cannot measure?

Who really cares how big a Yottabyte is?

In 2010, it was estimated that storing a Yottabyte (YB) on terabyte-size disk drives would require one million city block-size data centres

Knowing that stuff is only good for answering questions at trivia nights, right?

MAYBE
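A minimal sketch of the arithmetic behind the Yottabyte question, assuming decimal (SI) units where 1 TB = 10^12 bytes and 1 YB = 10^24 bytes. The helper names (drives_needed, humanise) are hypothetical and only for illustration.

```python
# Decimal (SI) storage units, in bytes.
UNITS = {
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "PB": 10**15, "EB": 10**18, "ZB": 10**21, "YB": 10**24,
}


def drives_needed(total_bytes: int, drive_bytes: int) -> int:
    """Number of whole drives needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // drive_bytes)


def humanise(num_bytes: int) -> str:
    """Express a byte count in the largest SI unit that keeps the value >= 1."""
    for name, size in sorted(UNITS.items(), key=lambda kv: kv[1], reverse=True):
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {name}"
    return f"{num_bytes} B"


# One Yottabyte spread over 1 TB drives: 10**24 / 10**12 = 10**12 drives.
print(drives_needed(UNITS["YB"], UNITS["TB"]))
print(humanise(42 * 10**13))
```

Being able to do this conversion is what lets an IS Manager turn "we hold about 420 TB" into a count of drives, racks and dollars.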


Data Deluge / ... Increasing complexity for YOU

Many metaphors of data deluge


What percent of data stored by organisations is duplicate data?

As much as 42% of data stored by organisations is duplicate data

- data stored on a file share, backed up on a desktop, attached in an email file or copied to a mobile device.

Some other issues for YOU as IS Manager

Slows processing time, greater chance of error in ID of customers and suppliers, creates need for data cleansing.
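One common way to find exact duplicates is to compare content hashes rather than file names. The sketch below assumes the stored items are already available as an in-memory mapping of location to bytes; the locations and contents are made up for illustration, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicates(items: dict[str, bytes]) -> list[list[str]]:
    """Group locations whose content is byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for location, content in items.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(location)
    return [locations for locations in by_digest.values() if len(locations) > 1]


def duplicate_share(items: dict[str, bytes]) -> float:
    """Fraction of stored bytes that are redundant copies (one copy is kept)."""
    total = sum(len(content) for content in items.values())
    redundant = sum(
        len(items[location])
        for group in find_duplicates(items)
        for location in group[1:]
    )
    return redundant / total if total else 0.0


# Illustrative data only: the same report held in three places.
stored = {
    "fileshare/report.docx": b"Q3 sales report",
    "desktop/backup/report.docx": b"Q3 sales report",
    "email/attachment-17": b"Q3 sales report",
    "mobile/notes.txt": b"call supplier",
}

print(find_duplicates(stored))
print(f"{duplicate_share(stored):.0%} of stored bytes are duplicates")
```

This only catches exact copies. Near-duplicates (an edited version, a re-saved image) need fuzzier matching and usually some data cleansing.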


Organisations acquire data (purchase data sets, mergers & acquisitions).

The data you deal with as IS Managers needs to be

UNDERSTOOD

How much of it does your organisation deal with?

What’s generating it?

Is YOUR organisation (primarily) a consumer or producer of data?

How is it being used (collected, manipulated)?

Where is it being stored?

Why does it need protection?

What dimensions of cost ($, Reputation, …) are involved if YOU lose it?


Organisations still struggling to secure data

70 percent of organisations lost important business information due to human error, hardware failure, software failure and lost or stolen mobile devices

70 percent of organisations exposed confidential information

30 percent of organisations had compliance failures

A consideration for YOU as IS Manager

Data loss can occur on any device that stores data. Although any loss of data, even a simple misplacement, is by definition technically a loss, what we are primarily concerned with is the permanent loss of data that is important to your business' ongoing success.

Boston Consulting Group

The business case for understanding the importance of data


What constitutes and is driving BIG data?

Turns out it's more than just size. You also have to look at the type of data involved (structured, unstructured or semi-structured) as well as latency and complexity.

Five big data drivers:

1. Financial transactions;

2. Email;

3. Imaging data;

4. Web logs; and

5. Internet text and documents

These are all data sources common to every industry.

https://www.youtube.com/watch?v=FSIxMKGfpvM

Data, Information and Knowledge (both tacit and explicit) are the foundations of enhanced decision making

Big Data in a way just means “all data” (in the context of your organisation’s ecosystem).


The BIG data challenge is REAL

Information Week survey: “The big data challenge is real, but only a third of the businesses we surveyed differentiate ‘big data’ from traditional data, and use distinct tools and management approaches to deal with the higher volume, complexity and dynamics… and nearly 90% of respondents are still using conventional databases as the primary means of handling data”.

- Generally the term structured data (SD) is applied to databases (DB) and unstructured data (UD) applies to everything else.

- The terms themselves aren’t terribly meaningful as all computer data is structured. Example: “,.;”


Metadata


The growth in unstructured data

Humans like data

Steven Lohr, The age of big data

Lohr’s assessment is correct: “Despite the caveats, there seems to be no turning back. Data is in the driver’s seat. It’s there, it’s useful and it’s valuable, even hip.”


Challenges in managing metadata

Some questions for YOU as an IS Manager:

- How do you ensure that you are exploiting the metadata you are collecting to the fullest possible extent?

- How do you make sure that your metadata is easily accessible and effectively used across your organisation?

- How do you ensure that it is kept up-to-date so that new metadata about new data is incorporated?


Metadata

Often described as “information about data.”

More precisely, metadata is the description of the data itself, its purpose, how it is used, and the systems used to manage it.

Metadata not only consists of technical information but also includes information that makes business users aware of the data’s purpose and use. Good coders write // comments!

It is not related only to data warehousing and business intelligence; it comprises all four categories of the entire enterprise architecture:

- Business

- Application

- Data

- Technology
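As a rough illustration of metadata that carries both technical and business information, a metadata record could be sketched as below. The field names, the example table and the one-year review window are all assumptions made for this sketch, not a standard.

```python
from dataclasses import dataclass, field
from datetime import date

EA_LAYERS = {"Business", "Application", "Data", "Technology"}


@dataclass
class DatasetMetadata:
    name: str
    description: str      # what the data is
    purpose: str          # why the business keeps it and how it is used
    owner: str            # who to ask about it
    source_system: str    # system used to manage it
    ea_layer: str         # one of the four enterprise architecture categories
    last_reviewed: date
    columns: dict[str, str] = field(default_factory=dict)  # column -> meaning

    def __post_init__(self) -> None:
        if self.ea_layer not in EA_LAYERS:
            raise ValueError(f"unknown EA layer: {self.ea_layer}")

    def is_stale(self, today: date, max_age_days: int = 365) -> bool:
        """True if the record has not been reviewed within max_age_days."""
        return (today - self.last_reviewed).days > max_age_days


# Hypothetical example record.
customer_table = DatasetMetadata(
    name="customer",
    description="One row per customer account",
    purpose="Billing and support contact",
    owner="Sales Operations",
    source_system="CRM",
    ea_layer="Data",
    last_reviewed=date(2018, 1, 15),
    columns={"cust_id": "Unique account number", "email": "Billing contact"},
)

print(customer_table.is_stale(today=date(2019, 8, 23)))
```

The is_stale check is one simple way to flag metadata that has not been kept up to date.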


Bringing it all together: Master Data Management (MDM)

Variance can/does occur between systems in terms of rules/details/concepts (RDCs):

- Understanding what is a Customer?

- Understanding what is a Product?

… These RDCs are often in the Applications themselves. The differences permeate from application to application, Dept. to Dept. and Organisation to Organisation.

MDM couples the RDCs to the data itself

Conceptually, think of RDCs as a wrapper for the data
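A toy sketch of the "wrapper" idea: the rule for what counts as the same Customer lives in one shared place (match_key) instead of inside each application. The rule chosen here (same normalised email means same customer), the record fields and the system names are assumptions for illustration; real MDM products use much richer matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    source: str   # which application supplied the record
    name: str
    email: str


def match_key(record: CustomerRecord) -> str:
    """Shared rule: records with the same normalised email are one customer."""
    return record.email.strip().lower()


def build_master(records: list[CustomerRecord]) -> dict[str, dict]:
    """Merge per-application records into one master entry per customer."""
    master: dict[str, dict] = {}
    for rec in records:
        entry = master.setdefault(
            match_key(rec), {"name": rec.name.strip(), "sources": []}
        )
        entry["sources"].append(rec.source)
    return master


# Hypothetical records from two departments' systems.
records = [
    CustomerRecord("crm", "Ann Lee", "Ann.Lee@Example.com "),
    CustomerRecord("billing", "A. Lee", "ann.lee@example.com"),
    CustomerRecord("crm", "Raj Patel", "raj@example.com"),
]

for key, entry in build_master(records).items():
    print(key, entry)
```

Because every system goes through the same rule, "Ann Lee" in the CRM and "A. Lee" in billing resolve to one customer rather than two.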

- The 60-Second Scoop: SAS Master Data Management

- https://www.youtube.com/watch?v=ndVqGY7BMqc

Challenges in managing metadata /2

Some considerations for YOU as an IS Manager:

- Data architects and DBAs must effectively communicate with all of the internal stakeholders who have access to, or are using data.

- As is evidenced in many different instances, if information about how to use data is hard to find or hard to use, it is likely that the data will be either misused or replicated with different standards and in a different format.

- Out-of-control application growth (feral systems) is at the root of data redundancy and inaccuracy. For this reason, clear communication is vital to leveraging metadata.


Data centre drivers & challenges

The rapid adoption of internet-enabled devices, coupled with the shift from consumer-side to SaaS and Cloud-based systems is accelerating the growth of large-scale data centres (DCs)

- http://static.googleusercontent.com/media/www.google.com/en//about/datacenters/efficiency/internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf

One of the most complex challenges is power management.

- http://static.googleusercontent.com/media/www.google.com/en//about/datacenters/efficiency/internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf

Data centres have for years been known to be excessive consumers of power, consuming up to 3% of all global electricity production, and roughly ten times more per square metre than the average office. Previously, energy efficiency wouldn’t necessarily be at the top of an information technology (IT) organisation’s priority list, but rising power costs, an ongoing need for more hardware and equipment, and booming data consumption are changing the way data centre operators plan and run their facilities.


Where does data ‘live’? Where’s its ‘home’?

These colorful pipes send and receive water for cooling our facility. Also pictured is a G-Bike, the vehicle of choice for team members to get around outside our data centers. Google’s Douglas County, Georgia, USA Data Centre.


Data in the ‘Cloud’

Some considerations for YOU as an IS Manager:

- Someone else likely has the (your) data

- Though cloud data is not shared, the facilities it is housed in are

- Outages

- Government intrusion

- Server location: since privacy laws vary wildly all across the world, it pays to make sure the country you’re storing your data in meets all your requirements. If you can also find a location that gives you a good connection speed, that is a major bonus.


Sullivan, F 2017, Top Ten Major Risks Associated With Cloud Storage, available <https://www.cloudwards.net/top-ten-major-risks-associated-with-cloud-storage/>, last accessed 21 July, 2017.


Data visualisation

Representing results in an easily consumable form is critical. What good is all that data if you can't understand what the interpreters (human or software) have concluded from their analysis?

Data visualisation design theory isn't new but, like many things that involve deep understanding of the range and vagaries of human cognition, it's hard to do well.


Comprehending your data / Visualisations

https://www.youtube.com/watch?v=ENWVRcMGDoU


Radian6’s Social Analytics Platform includes tools that enable:

- Listening to the community: identify and monitor all conversations in the social web on a particular topic or brand

- Learning who is in the community: learn customer demographics to foster a closer relationship with the community

- Engaging people in the community: communicate directly with customers on social platforms such as FB, YouTube, LinkedIn, Twitter using a single App

- Analysing what is being said: sentiment analysis


Social analytic platform example

Business & social data analytic tools

Business analytic tools

At the core of business analytics are the tools

- Statistical analysis software “Why is this happening?”

- Forecasting/Extrapolation “What if these trends continue?”

- Predictive Modelling “What will happen next?”

- Optimisation “What is the best that can happen?”
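To make the forecasting/extrapolation question ("What if these trends continue?") concrete, here is a minimal sketch that fits a straight line to past values by least squares and extends it. The monthly figures are invented, and a straight-line trend is only one simple assumption; real forecasting tools offer many other models.

```python
def fit_line(ys: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(ys: list[float], periods_ahead: int) -> list[float]:
    """Project the fitted trend forward by periods_ahead steps."""
    slope, intercept = fit_line(ys)
    n = len(ys)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


# Invented example: storage used (TB) over four months.
usage_tb = [10, 12, 14, 16]
print(extrapolate(usage_tb, periods_ahead=2))
```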

Social data analytic tools

A class of tools called social analytics was created in response to rising interest in using social IT, on the condition that there was some measure of the value gained from the invested time and resources.

- How to analyse conversations, tweets, blogs, and other social IT to create meaningful, actionable facts.

- Relatively easy to measure ‘hits’ or ‘click throughs’ but what does that information really tell YOU as the manager?


Social graphs


Social graphs

Ever wondered who your connections are connected to?

A social graph is a pictorial representation of relationships

Individuals/entities are represented as nodes and lines between the nodes indicate some type of relationship

Relationships can be ‘strong’ (close friend) or ‘weak’ (acquaintance)

Why YOU may need to know the Graphs

- If you want to effect change you will need to know the influencers in the network

- If you need to find expertise that is outside your network then perhaps the extended social graph of your connections has such a person
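The two uses above can be sketched with a small graph structure. This is a minimal illustration, assuming "influencer" simply means the person with the most ties and "extended graph" means connections of your connections; the names and the SocialGraph class are hypothetical.

```python
from collections import defaultdict


class SocialGraph:
    """People are nodes; each tie is an undirected edge labelled strong or weak."""

    def __init__(self) -> None:
        self.ties: dict[str, dict[str, str]] = defaultdict(dict)

    def connect(self, a: str, b: str, strength: str = "weak") -> None:
        self.ties[a][b] = strength
        self.ties[b][a] = strength

    def influencers(self, top: int = 1) -> list[str]:
        """Rank people by number of ties (ties broken alphabetically)."""
        ranked = sorted(self.ties, key=lambda p: (-len(self.ties[p]), p))
        return ranked[:top]

    def extended_network(self, person: str) -> set[str]:
        """People reachable through a connection but not directly connected."""
        direct = set(self.ties.get(person, {}))
        second: set[str] = set()
        for friend in direct:
            second.update(self.ties[friend])
        return second - direct - {person}


g = SocialGraph()
g.connect("Ana", "Ben", "strong")   # close friend
g.connect("Ana", "Chi")             # acquaintance
g.connect("Ana", "Eli")
g.connect("Ben", "Dev")

print(g.influencers())              # who has the most ties
print(g.extended_network("Ana"))    # who Ana could reach via her connections
```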


The economics of ‘things’ & the economics of data/information, some T/F

Every business is in the information business (Evans & Wurster, 2000)

Things wear out / Data doesn’t wear out but can become obsolete or untrue

Things are replicated at the expense of the manufacturer / Data is replicated at almost zero-cost without limit

Things exist in a tangible location / Data does not physically exist

When things are sold, possession changes / When data is sold the seller may still possess and sell again

The price of things is based on production costs / The price of data is based on value to customer

Mashed Up data

Information is data endowed with relevance and purpose (Drucker, 1988)

A mashup is the term used for applications that combine data from different sources to create a new application on the web.

Web Application Hybrid

A mashup of location data and housing prices adds something beyond what the data provides individually.
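A minimal sketch of the location-plus-prices mashup, assuming two independent sources keyed by suburb name. The suburb names, coordinates and prices are placeholders invented for illustration; a real mashup would pull them from separate web services.

```python
# Source 1 (placeholder data): suburb -> (latitude, longitude)
locations = {
    "Suburb A": (-37.80, 145.00),
    "Suburb B": (-37.85, 145.05),
    "Suburb C": (-37.90, 145.10),
}

# Source 2 (placeholder data): suburb -> median house price in dollars
prices = {
    "Suburb A": 900_000,
    "Suburb B": 750_000,
    "Suburb D": 600_000,
}


def mashup(locations: dict, prices: dict) -> list[dict]:
    """Combine both sources for every suburb that appears in each of them."""
    combined = []
    for suburb in sorted(locations.keys() & prices.keys()):
        lat, lon = locations[suburb]
        combined.append(
            {"suburb": suburb, "lat": lat, "lon": lon,
             "median_price": prices[suburb]}
        )
    return combined


for row in mashup(locations, prices):
    print(row)
```

Neither source alone can answer "where are the cheaper suburbs on the map?"; the combined rows can be plotted to show it.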

Improved comprehension, added layers of data, into information, knowledge


References

Boddy, D, Boonstra, A & Kennedy, G 2002, Managing information systems: an organisational perspective, Pearson, Harlow.

Davenport, TH 1997, Information ecology, Oxford University Press, New York, pp. 9-10.

Drucker, P 1988, The coming of the new organization, Harvard Business Review (January-February), pp. 45-53.

Evans, P & Wurster, T 2000, Blown to bits, Harvard Business School, Boston.

Polanyi, M 1966, The tacit dimension, Peter Smith, Magnolia, MA, p. 4.

Sullivan, F 2017, Top ten major risks associated with cloud storage, available <https://www.cloudwards.net/top-ten-major-risks-associated-with-cloud-storage/>, last accessed 21 July, 2017.


(Some) takeaways for managing data

Deal with the usual suspects first

- Text to data mismatches: Which is ‘MORE’ correct? IBM, International Business Machines or Big Blue?

- Data Quality: A general challenge when automatically integrating data from autonomous sources. Even more of a challenge when performing ‘reasoning’ (automatically inferring new data from existing data)

Focus on the data, transform it to information

Do not focus on the device or data centre

Gain a complete understanding

Be efficient

Set consistent policies

Stay agile