Research T1

profilevivi_5488
ITKM548-11111.pptx

Enterprise Intelligence and Decision Support Systems

The Components of BI Architecture

The Components of BI Architecture

2

Ease of use, minimal training, uses Standard Reinsurance business terms

Fast performance

For those metrics which cover Operations detail, Ad-hoc reports can be easily created

Executive Ad-hoc reports can be easily created

Scaleable and insulated from data sources

Data Sources

Extract,

Transform

& Load

(ETL)

Data from

source

systems

Evolve

First Data

Mart:

Create

specialized,

web-

enabled

interfaces

Aggregate

at group &

business

unit

levels

Distribute

to business

units and

Group

Source 3

Source 4

Source 2

Source 1

Etc.

External

Market

Data

User Benefits:

Standard operational reports not covered in warehouse because of lower “grain” or different refresh schedule.

Near “real time” access

Transaction Systems

Start with

One Key

Performance

Indicator

(KPI)

Create a

single data

mart for all

audiences

Data Warehouse

Transaction System Reporting

Either provided with the applications or enhanced with direct connect to transaction systems

User Benefits:

Future State:

A Data Warehouse With Comprehensive Business Intelligence:

User Benefits:

Extend data marts, create new data marts for additional KPI’s

Add data marts specialized for “problem” transactional reporting

Extend intranet-based viewing to include web-based querying

Provide seamless drill-down from high-level summary to transaction detail

Integrate information from multiple sources

Extends benefits from the first data mart

A Sample Comprehensive BI Architecture

3

3

OLTP = Online Transaction Processing = Source Systems, Transaction Processing Systems or Enterprise Resource Planning (ERP) Systems

Source: Howson

4

The “Perfect World” of ERP Conceptual Architecture

This would be great…

Prof’s experience confirms ERP systems’

reporting still need work

The Reality…

Multiple, disparate source systems

Redundancy in systems so must answer:

which source is the relative source?

Same data content, different identifiers

The data is incomplete or “dirty:” poor quality

Important term: edits

The data sources are at different grains, or

levels of detail

Data fields with the same name have different

business rules/definitions

Source data comes from multiple platforms

Depending on the age of the source platform,

documentation and expertise may not available

The schedule of source data refresh may not be

appropriate

Source system management may not alert

“downstream” system owners to changes

Partial loads with no notification or ETA

5

Source: Howson

5

Key Concepts Fighting the Uphill Battle

Data Profiling: Looking at source system data to affirm what’s really in there and how bad it is

Metadata: Data about data

Data Lineage: Answers where the data you’re looking at come from (a form of metadata)

Master Data Management: Central control for cross reference, standard business rules and glossaries, etc.

Data Governance: Ensuring your information assets remain of highest quality

Configuration Management: Application component management that answers if a change occurs here,

what else is impacted?

Project and Operations Management: Organized delivery of new features and change management

Quality Assurance: Testing the results with business users to ensure the quality data governance seeks

 Also: Tell the class something interesting about your teammates !

6

Key BI Components

7

Source: Howson

Data Marts

&

Detail Data

Executive Dashboards, Management KPIs

Source Systems

Production & Operation

Self Service Environment

Self Service & Ad-hoc Query and Reporting

Dashboard, Scorecards, Scenario Analysis Tools

Cubes

Production & Operation Reports and Tools

Predefined Reports, Queries & Tools

Sample Business Intelligence (BI) Conceptual Solution Architecture

Detail data to support business analytic & reporting requirements

Data Foundation

Analytical

Workspaces,

Planning Tools

Production, Profitability & Plan Cubes

Source: Approximates one of the Professor’s Professional Environments

8

8

Metrics

Current and primary focus is delivery of key financial metrics at the office level, instill confidence in users –pending, notice counts, CWA, CWP, paid loss

One portal into reporting from any sources

Data in single source, integration of data, standardization

Transactional systems are not built for reporting:

Designed to get data in quickly and easily

Not designed to get information out quickly and easily

Another name for transactional systems is Online Transaction Processing or OLTP

Analytical systems are designed to get data out for reporting

Another name for analytical systems is Online Analytical Processing or OLAP

Transaction (OLTP) system data

Is up-to-the-minute

Contains details of the transaction system

Is preferred by functional managers and staff

Analytical (OLAP) system data

Refreshes on a real-time, daily or weekly schedule

Information aggregated at a higher detail for faster analysis

Most often preferred by upper level and executive management

“OLTP” and “OLAP” are fundamental BI Concepts

OLTP and OLAP

9

= OLTP

= OLAP

10

Source: Howson

A centralized, cleansed, tested, verified extract of data formatted for easy presentation and use

What is a Data Lake, a.k.a. a Data Warehouse?

11

Data marts present subsets of summarized, or aggregated, information found in a data warehouse for a specific business function

What is a Data Mart?

12

Source

1

Source

3

Source

2

Operational Data

Store/

Data Warehouse

Data Mart

1

Data Mart

2

Data Mart

3

Data Mart

4

Data Mart

n

Source

4

Enterprise

Drill Down

(Note: Each of these is

A data mart “star”)

From Data Warehouses to Data Marts

Staging Area

For data integration

and cleansing

13

13

Normalized Data = OLTP = Quick Loads and Lots of Transactions and Modifications

14

Source: Howson

Sample Transactional/Detail Data Model

Sample Data Mart (Aggregated) Data Model

Vs.

Aggregated data models are easy to use and maintain and provide fast response time:

Customized to fit your business model vs. a vendor’s approach

More stable because you own it – not a vendor

Enables source data integration: makes disparate systems talk

Why Build a BI Application?

15

Denormalized Data = OLAP = Extract and Serve the Data for Speedy Access and Presentation

16

Source: Howson

Facts – Numerical Measures saved at the lowest “grain”

Dimensions – Business Descriptors one wants to report on and see numbers aggregated on

Grain – Equals “granularity” of data, I.e., the lowest aggregation stored for a measure

Arranged in a “star” (or “star schema” in tech. terms), because the structure looks like a star:

If designed properly, dimensional models

Speed queries – the paths to the data are shorter, and aggregation is pre-built for each dimension

Easy for business people to understand

Another term for a dimensionally modeled record is ”flattened” because each record is physically long and “flat”

FACTS

Business

Client Company

Time

Etc.

Earned Premium

Written Premium

Paid Loss

IBNR

Etc.

For any star, a

Rule of thumb is

12 – 14 dimensions

Accident Day

Accident Month

Accident Year

Underwriting Year

Etc.

Account Number

Main Class of Business

Class of Business

SubClass of Business

Portfolio Segment

US Department

Etc.

Name

Currency

ID

Etc.

DIMENSION:

DIMENSION:

DIMENSION:

DIMENSION:

Data Mart Architecture

17

Key Concept: SQL

SQL (Pronounced See-Quel) = Structured Query Language

Background

Originally designed as an easy to use access language to go against relational database management systems (RDBMS)

Two things happened:

It became the de facto access language for RDBMS’ so it moved from end user query tools and

became embedded in complex code

End users found it more difficult to use than anticipated, reinforcing the first bullet

“Code generators” came along to insulate end users from having to generate the SQL. Most BI tools insulate

their end users by providing a “semantic layer” that uses business terms while their “back end” generates

SQL code

How is it Used? This question should be posed, “How isn’t it Used?”

The data structures just discussed are generated by “data design language” (DDL)

Data analysts use SQL to extract and profile the data in source systems to understand it

Quality assurance specialists use SQL to extract data to test their results

Database Administrators (DBAs) use it to assess and tune their system responsiveness (discussed two slides hence)

It is often still used to build complex extract, transformation and load (ETL)

18

A Closer Look at SQL

19

Introducing www.w3schools.com/sql.

ETL: SQL vs. a Graphical User Interface (GUI)

SQL Statements

SELECT {Table names}

WHERE {Conditions}

GROUP BY {Aggregation rules}

Informatica Power Center Mapping

Easier to maintain (a picture is worth 1,000 words)

Can access stored procedures that contain SQL code

(be careful, the Prof has seen DBA’s use the GUI as a

“shell” for a group of SQL statements, removing the

benefit of the tool)

DBAs prefer this because it provides

direct control

In the past (may have changed), some

native functionality was not optimized

in the GUI tools

20

Cyriac (2012)

BI in the Cloud

21

Source: Howson

SAS Studio, SAS On Demand for Academics

And SAS Viya are all examples of

BI in the Cloud!

Why the Cloud?

These days, cloud providers include leaders like Amazon, Google and salesforce.com

If you have a solid vendor management contract and are working with a reputable player, BI in the cloud makes sense:

Your IT staff has many priorities. Why not outsource your data and infrastructure to someone whose core competency is to do it right?

Scalability – you can expand or contract at will with no adverse impact

Security – it may seem counter intuitive but since cloud providers’ core competency is ensuring your data and infrastructure is safe, they employ the most advanced security measures

22

“Flavors” of Cloud Offerings

Infrastructure as a Service (IaaS): virtual servers, storage and networks

This is pay as you go

You retain ownership of tools and deployment

Platform as a Service (Paas): IaaS plus and an added layer of software clients use to build applications. The software is owned and managed by the provider

Software as a Service (SaaS): Vendors provide the solution, you consume it (e.g. salesforce.com)

23

Source: Howson

Data Marts

&

Detail Data

Executive Dashboards, Management KPIs

Source Systems

Production & Operation

Self Service Environment

Self Service & Ad-hoc Query and Reporting

Dashboard, Scorecards, Scenario Analysis Tools

Cubes

Production & Operation Reports and Tools

Predefined Reports, Queries & Tools

Where does Big Data Fit?

Detail data to support business analytic & reporting requirements

Data Foundation

Analytical

Workspaces,

Planning Tools

Production, Profitability & Plan Cubes

Source: Approximates one of the Professor’s Professional Environments

Big Data

Expands the

Foundation…

24

24

Metrics

Current and primary focus is delivery of key financial metrics at the office level, instill confidence in users –pending, notice counts, CWA, CWP, paid loss

One portal into reporting from any sources

Data in single source, integration of data, standardization

Placing Big Data in Context

Like the cloud, big data is getting beyond the hype and becoming indispensable

Why?

Before the term “big data” was coined, combining externally facing data with internally facing data for competitive understanding was just as crucial as it is now

The “holy grail” was managing unstructured data. For example, a memo field containing customer notes was difficult to parse to find a name. We always said, “Man, if we could just do that with something better than Soundex, it’d be great”

Web 2.0 technology became a primary means of communication and digitized video, social comments, pictures, etc.) enabled is now captured. There is a goldmine of feedback out there that we now must obtain…

 This isn’t a revolution, its a progression. Market data is still available and now we can also work with unstructured data. When you combine all relevant internal and data, its the killer BI app!!

25

Big Data Technologies

Hadoop and NoSQL handle unstructured data far better than RDBMS

Unlike the DBMS’ discussed until now, NoSQL is not relational, columnar or proprietary. Instead it is flexible and open source and can work with “loosely described” data

Hadoop is named after its creator’s child’s toy elephant

It acts as an “umbrella” for a number of open source projects

Uses “pointers” to relate files

26

Source: Howson

Reference List

Cyriac, J. (2012 6 15). Dynamically change your target flat file using Informatica mapping. [Web

log entry]. Retrieved from http://www.disoln.org/2012/06/dynamically-change-your-flat-file- name.html

Howson, C. (2014). Successful business intelligence: Unlock the value

of BI and big data. New York. McGraw Hill Education.

ISBN: 9780071809184

27