Self reflective report 2
23/08/2019
Housekeeping
Assignment 1 Due 11:59pm Monday 26/08/2019 AEST
Assignment 2 Project Update due by 11:59pm Sunday 1st September 2019 AEST via email to tutor (Subject: Team # Project Update)
Discussion Boards
– Assignment threads will be monitored
– Weekly online activity threads are intended to be student-student
The INF30018 Journey thus far: (1) Complexity, (2) Strategy, (3) Data, (4) Enterprise Architecture, (5) Emerging Data, (6) Benefits Realisation & Management, (7) Governance, (8) Service Delivery, Easter Break
Information Systems Management
Managing Data
CRICOS 00111D
TOID 3069
Manage THIS! #censusfail
Census Night 2016
How NOT to manage data.
What did the ABS do wrong?
http://www.itnews.com.au/news/experts-cast-doubt-on-abs-census-dos-claims-433926
http://www.abc.net.au/news/2016-08-10/census-night-how-the-shambles-unfolded/7712964
and … Wisdom
Assumed prior knowledge
This lecture assumes YOU are familiar with the general concepts of
DATA
– IBIS / BI / INF10002 Database Analysis and Design
– INF20004 Database Concepts and Modelling, etc.
– Object identities and schemata are not covered in this unit
Takeaways from today’s lecture
Repetition to a minimum
– Assumed knowledge, Data, Information/Knowledge
The Data Deluge
Dealing with data / what you need to know
– Measuring data
– Business case for understanding data
– Securing data
Big Data / Drivers
– Growth of unstructured data
Metadata / Managing metadata
Master Data Management (MDM)
Data Centres / Management
Data Visualisations
Data Technologies
Measuring data
Can you manage if you cannot measure?
Who really cares how big a Yottabyte is?
In 2010, it was estimated that storing a Yottabyte (YB) on terabyte-size disk drives would require one billion city block-size data-centers
Knowing that stuff is only good for answering questions at trivia nights, right?
MAYBE
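To make the scale concrete, the short Python sketch below converts between storage units and counts how many one-terabyte drives a single yottabyte would need. It assumes decimal (SI) prefixes, so 1 TB = 10^12 bytes and 1 YB = 10^24 bytes; the names `UNITS` and `convert` are illustrative only.

```python
# Decimal (SI) storage units expressed in bytes. Assumption: SI prefixes,
# not the binary (1024-based) units some operating systems report.
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
    "YB": 10**24,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount of data from one unit to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


# How many 1 TB drives would hold a single yottabyte?
drives_needed = UNITS["YB"] // UNITS["TB"]
print(f"{drives_needed:,} one-terabyte drives")
```

Being able to do this arithmetic quickly is what lets a manager sanity-check a storage quote or a growth forecast.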
Data Deluge / ... Increasing complexity for YOU
Many metaphors of data deluge
What percent of data stored by organisations is duplicate data?
As much as 42% of data stored by organisations is duplicate data: data stored on a file share, backed up on a desktop, attached to an email or copied to a mobile device.
Some other issues for YOU as IS Manager
Slows processing time, greater chance of error in ID of customers and suppliers, creates need for data cleansing.
Organisations acquire data (purchase data sets, Mergers & Acquisitions).
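One common way to find exact duplicates is to compare content hashes rather than file names. The sketch below is a minimal illustration under stated assumptions: the documents are held in memory, the location names are made up, and only byte-for-byte identical copies are detected (near-duplicates would need a different technique).

```python
import hashlib
from collections import defaultdict


def find_duplicates(documents):
    """Group document names whose contents are byte-for-byte identical.

    `documents` maps a (hypothetical) location name to its raw bytes.
    """
    by_digest = defaultdict(list)
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Keep only groups holding more than one copy.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Illustrative data only: the same report held in three places.
documents = {
    "file_share/report.docx": b"Q2 sales report",
    "desktop_backup/report.docx": b"Q2 sales report",
    "email_attachment/report.docx": b"Q2 sales report",
    "file_share/budget.xlsx": b"FY20 budget",
}

duplicates = find_duplicates(documents)
print(duplicates)
```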
The data you deal with as IS Managers needs to be
UNDERSTOOD
How much of it does your organisation deal with?
What’s generating it?
Is YOUR organisation (primarily) a consumer or producer of data?
How is it being used (collected, manipulated)?
Where is it being stored?
Why does it need protection?
What dimensions of cost ($, Reputation, …) are involved if YOU lose it?
Organisations still struggling to secure data
70 percent of organisations lost important business information due to human error, hardware failure, software failure and lost or stolen mobile devices
70 percent of organisations exposed confidential information
30 percent of organisations had compliance failures
A consideration for YOU as IS Manager
Data loss can occur on any device that stores data. Although any loss of data, even a simple misplacement, is by definition technically a loss, what we are primarily concerned with is the permanent loss of data that is important to your business' ongoing success.
Boston Consulting Group
The business case for understanding the importance of data
What constitutes and is driving BIG data?
Turns out it's more than just size. You also have to look at the type of data involved (structured, unstructured or semi-structured) as well as latency and complexity.
Five big data drivers:
1. Financial transactions;
2. Email;
3. Imaging data;
4. Web logs; and
5. Internet text and documents
These are all data sources common to every industry.
https://www.youtube.com/watch?v=FSIxMKGfpvM
Data, Information and Knowledge (both tacit and explicit) are the foundations of enhanced decision making
Big Data in a way just means “all data” (in the context of your organisation’s ecosystem).
The BIG data challenge is REAL
Information Week survey: “The big data challenge is real, but only a third of the businesses we surveyed differentiate ‘big data’ from traditional data, and use distinct tools and management approaches to deal with the higher volume, complexity and dynamics… and nearly 90% of respondents are still using conventional databases as the primary means of handling data”.
- Generally the term structured data (SD) is applied to databases (DB) and unstructured data (UD) applies to everything else.
- The terms themselves aren’t terribly meaningful as all computer data is structured. Example: “,.;”
Metadata
The growth in unstructured data
Humans like data
Steven Lohr, The age of big data
Lohr’s assessment is correct: “Despite the caveats, there seems to be no turning back. Data is in the driver’s seat. It’s there, it’s useful and it’s valuable, even hip.”
Challenges in managing metadata
Some questions for YOU as an IS Manager:
- How do you ensure that you are exploiting the metadata you are collecting to the fullest possible extent?
- How do you make sure that your metadata is easily accessible and effectively used across your organisation?
- How do you ensure that it is kept up-to-date so that new metadata about new data is incorporated?
Metadata
Often described as “information about data.”
More precisely, metadata is the description of the data itself, its purpose, how it is used, and the systems used to manage it.
Metadata not only consists of technical information but also includes information that makes business users aware of the data’s purpose and use. Good coders write // comments!
It is not related only to data warehousing and business intelligence; it comprises all four categories of the entire enterprise architecture:
- Business
- Application
- Data
- Technology
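As a rough illustration of metadata that carries both technical and business information, the sketch below describes one data set as a Python dataclass. The class name, every field name and all sample values are hypothetical; a real metadata repository would define its own schema.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Describes a data set; every field name here is a hypothetical example."""
    name: str
    # Technical metadata
    source_system: str
    data_format: str
    # Business metadata
    purpose: str
    owner: str
    # Which enterprise architecture categories the description touches
    architecture_categories: list = field(default_factory=list)


customer_table = DatasetMetadata(
    name="customer",
    source_system="CRM",
    data_format="relational table",
    purpose="Single list of active customers for billing",
    owner="Sales Operations",
    architecture_categories=["Business", "Application", "Data", "Technology"],
)

print(customer_table.purpose)
```

The `purpose` and `owner` fields are the part a business user reads; the rest is what the technical team needs.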
Bringing it all together: Master Data Management (MDM)
Variance can/does occur between systems in terms of rules/details/concepts (RDCs):
- Understanding what is a Customer?
- Understanding what is a Product?
… These RDCs are often in the Applications themselves. The differences permeate from application to application, Dept. to Dept. and Organisation to Organisation.
MDM couples the RDCs to the data itself
Conceptually, think of RDCs as a wrapper for the data
- The 60-Second Scoop: SAS Master Data Management
- https://www.youtube.com/watch?v=ndVqGY7BMqc
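The "wrapper" idea can be sketched in a few lines of Python: the rules that define an entity travel with the entity itself instead of living inside each application. This is a conceptual sketch only, not how any particular MDM product works; the class `MasterEntity` and the two rules for "what is a Customer?" are made up for illustration.

```python
class MasterEntity:
    """Couples a data record type with the rules that define it.

    The rule set below is a made-up illustration of 'what is a Customer?'.
    """

    def __init__(self, name, rules):
        self.name = name
        self.rules = rules  # mapping: rule description -> check function

    def violations(self, record):
        """Return the descriptions of every rule the record breaks."""
        return [desc for desc, check in self.rules.items() if not check(record)]


customer = MasterEntity(
    "Customer",
    {
        "has a customer id": lambda r: bool(r.get("customer_id")),
        "has placed at least one order": lambda r: r.get("orders", 0) >= 1,
    },
)

# The same definition is applied to records arriving from any application.
from_billing = {"customer_id": "C001", "orders": 3}
from_marketing = {"customer_id": "C002", "orders": 0}  # a prospect, not a customer

print(customer.violations(from_billing))
print(customer.violations(from_marketing))
```

Because both departments check their records against one shared definition, a "customer" in billing and a "customer" in marketing can no longer quietly mean different things.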
Challenges in managing metadata /2
Some considerations for YOU as an IS Manager:
- Data architects and DBAs must effectively communicate with all of the internal stakeholders who have access to, or are using data.
- As is evidenced in many different instances, if information about how to use data is hard to find or hard to use, it is likely that the data will be either misused or replicated with different standards and in a different format.
- Out-of-control application growth is at the root of data redundancy and inaccuracy. Feral systems. For this reason, clear communication is vital to leveraging metadata.
Data centre drivers & challenges
The rapid adoption of internet-enabled devices, coupled with the shift from consumer-side to SaaS and Cloud-based systems, is accelerating the growth of large-scale data centres (DCs).
- http://static.googleusercontent.com/media/www.google.com/en//about/datacenters/efficiency/internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf
One of the most complex challenges is power management.
- http://static.googleusercontent.com/media/www.google.com/en//about/datacenters/efficiency/internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf
Data centres have for years been known to be excessive consumers of power, consuming up to 3% of all global electricity production, and roughly ten times more per square metre than the average office. Previously, energy efficiency wouldn’t necessarily be at the top of an information technology (IT) organisation’s priority list, but rising power costs, an ongoing need for more hardware and equipment, and booming data consumption are changing the way data centre operators plan and run their facilities.
Where does data ‘live’? Where’s its ‘home’?
These colorful pipes send and receive water for cooling our facility. Also pictured is a G-Bike, the vehicle of choice for team members to get around outside our data centers. Google’s Douglas County, Georgia, USA Data Centre.
Data in the ‘Cloud’
Some considerations for YOU as an IS Manager:
- Someone else likely has the (your) data
- Though cloud data is not shared, the facilities it is housed in are
- Outages
- Government intrusion
- Server location: since privacy laws vary wildly all across the world, it pays to make sure the country you’re storing your data in meets all your requirements. If you can also find a location that gives you a good connection speed, that is a major bonus.
Sullivan, F 2017, Top Ten Major Risks Associated With Cloud Storage, available <https://www.cloudwards.net/top-ten-major-risks-associated-with-cloud-storage/>, last accessed 21 July, 2017.
Data visualisation
Representing results in an easily consumable form is critical. What good is all that data if you can't understand what the interpreters (human or software) have concluded from their analysis?
Data visualisation design theory isn't new but, like many things that involve deep understanding of the range and vagaries of human cognition, it's hard to do well.
Comprehending your data / Visualisations
https://www.youtube.com/watch?v=ENWVRcMGDoU
Radian6’s Social Analytics Platform includes tools that enable:
- Listening to the community: identify and monitor all conversations in the social web on a particular topic or brand
- Learning who is in the community: learn customer demographics to foster a closer relationship with the community
- Engaging people in the community: communicate directly with customers on social platforms such as FB, YouTube, LinkedIn, Twitter using a single App
- Analysing what is being said: sentiment analysis
Social analytic platform example
Business & social data analytic tools
Business analytic tools
At the core of business analytics are the tools
- Statistical analysis software “Why is this happening?”
- Forecasting/Extrapolation “What if these trends continue?”
- Predictive Modelling “What will happen next?”
- Optimisation “What is the best that can happen?”
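The "What if these trends continue?" question can be illustrated with the simplest possible forecast: fit a straight line to past values and project it forward. The sketch below uses ordinary least squares in plain Python; the sales figures are made up, and real forecasting tools handle seasonality and uncertainty that this ignores.

```python
def linear_trend(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def extrapolate(values, periods_ahead):
    """Project the fitted trend forward by the given number of periods."""
    intercept, slope = linear_trend(values)
    next_x = len(values) - 1 + periods_ahead
    return intercept + slope * next_x


# Made-up monthly sales figures that grow by 10 each month.
monthly_sales = [100, 110, 120, 130]
print(extrapolate(monthly_sales, 2))
```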
Social data analytic tools
A class of tools called social analytics was created because interest in social IT was rising, but only as long as there was some measure of the value gained from the time and resources invested.
- How to analyse conversations, tweets, blogs, and other social IT to create meaningful, actionable facts.
- Relatively easy to measure ‘hits’ or ‘click throughs’ but what does that information really tell YOU as the manager?
Social graphs
Ever wondered who your connections are connected to?
A social graph is a pictorial representation of relationships
Individuals/entities are represented as nodes and lines between the nodes indicate some type of relationship
Relationships can be ‘strong’ (close friend) or ‘weak’ (acquaintance)
Why YOU may need to know the Graphs
- If you want to effect change you will need to know the influencers in the network
- If you need to find expertise that is outside your network then perhaps the extended social graph of your connections has such a person
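Both uses above map onto simple graph operations. The sketch below stores people as nodes and relationships as labelled edges, then ranks "influencers" by number of direct connections and lists the connections of your connections. The class, the names and the choice of connection count as a proxy for influence are all assumptions made for illustration; real tools use richer measures.

```python
from collections import defaultdict


class SocialGraph:
    """Undirected graph: people are nodes, relationships are labelled edges."""

    def __init__(self):
        self.edges = defaultdict(dict)  # person -> {neighbour: "strong" | "weak"}

    def connect(self, a, b, strength="weak"):
        self.edges[a][b] = strength
        self.edges[b][a] = strength

    def influencers(self, top=1):
        """Rank people by number of direct connections (a simple proxy)."""
        ranked = sorted(self.edges, key=lambda p: (-len(self.edges[p]), p))
        return ranked[:top]

    def extended_network(self, person):
        """People connected to your connections, but not directly to you."""
        direct = set(self.edges[person])
        second = set()
        for friend in direct:
            second.update(self.edges[friend])
        return sorted(second - direct - {person})


graph = SocialGraph()
graph.connect("Ana", "Ben", "strong")
graph.connect("Ana", "Cai", "weak")
graph.connect("Ben", "Dee", "weak")
graph.connect("Ben", "Eli", "strong")

print(graph.influencers())
print(graph.extended_network("Ana"))
```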
The economics of ‘things’ & the economics of data/information, some T/F
Every business is in the information business (Evans & Wurster, 2000)
Things wear out / Data doesn’t wear out but can become obsolete or untrue
Things are replicated at the expense of the manufacturer / Data is replicated at almost zero-cost without limit
Things exist in a tangible location / Data does not physically exist
When things are sold, possession changes / When data is sold the seller may still possess and sell again
The price of things is based on production costs / The price of data is based on value to customer
Mashed Up data
Information is data endowed with relevance and purpose (Drucker, 1988)
A mashup is the term used for applications that combine data from different sources to create a new application on the web.
Web Application Hybrid
A mashup of location data and housing prices adds something beyond what the data provides individually.
Improved comprehension, added layers of data, into information, knowledge
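The location-plus-housing-prices mashup can be sketched as a join of two independent sources on a shared key. Everything in the example is a placeholder: the suburb names, coordinates and prices are invented, and a real mashup would pull each source from a separate web service.

```python
# Two independent, made-up data sources keyed by suburb name.
locations = {
    "Suburb A": {"lat": -37.80, "lon": 145.00},
    "Suburb B": {"lat": -37.90, "lon": 145.10},
}
median_prices = {
    "Suburb A": 900_000,
    "Suburb B": 750_000,
    "Suburb C": 800_000,  # no location data available for this one
}


def mash_up(locations, prices):
    """Combine both sources for every suburb present in each of them."""
    return {
        suburb: {**locations[suburb], "median_price": prices[suburb]}
        for suburb in locations
        if suburb in prices
    }


map_points = mash_up(locations, median_prices)
for suburb, point in sorted(map_points.items()):
    print(suburb, point["lat"], point["lon"], point["median_price"])
```

Neither source alone can place a price on a map; the combined record can, which is the added layer the slide describes.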
References
Boddy, D, Boonstra, A & Kennedy, G 2002, Managing Information Systems. An organisational perspective, Pearson, Harlow.
Davenport, TH 1997, Information ecology, Oxford University Press, New York, pp. 9-10.
Drucker, P 1988, The coming of the new organization, Harvard Business Review (January-February), pp. 45-53.
Evans, P & Wurster, T 2000, Blown to bits, Harvard Business School, Boston.
Polanyi, M 1966, The tacit dimension, Peter Smith, Magnolia, MA, p. 4.
Sullivan, F 2017, Top ten major risks associated with cloud storage, available <https://www.cloudwards.net/top-ten-major-risks-associated-with-cloud-storage/>, last accessed 21 July, 2017.
(Some) takeaways for managing data
Deal with the usual suspects first
- Text to data mismatches: Which is ‘MORE’ correct? IBM, International Business Machines or Big Blue
- Data Quality: A general challenge when automatically integrating data from autonomous sources. Even more of a challenge when performing ‘reasoning’ (automatically inferring new data from existing data)
Focus on the data, transform it to information
Do not focus on the device or data centre
Gain a complete understanding
Be efficient
Set consistent policies
Stay agile
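The first of the "usual suspects" above, text-to-data mismatches, is often handled by mapping every known spelling of a name to one agreed canonical form. The sketch below shows the idea with a hypothetical alias table; which form counts as canonical is a policy decision for the organisation, and the choice made here is arbitrary.

```python
# Hypothetical alias table: every known spelling maps to one canonical name.
ALIASES = {
    "ibm": "International Business Machines",
    "international business machines": "International Business Machines",
    "big blue": "International Business Machines",
}


def canonical_name(raw):
    """Return the canonical form of a name, or the cleaned input if unknown."""
    cleaned = " ".join(raw.split())  # collapse stray whitespace
    return ALIASES.get(cleaned.lower(), cleaned)


records = ["IBM", "Big  Blue", "international business machines", "Acme Pty Ltd"]
print([canonical_name(r) for r in records])
```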