Research T1
Enterprise Intelligence and Decision Support Systems
The Components of BI Architecture
The Components of BI Architecture
2
Ease of use, minimal training, uses Standard Reinsurance business terms
Fast performance
For those metrics which cover Operations detail, Ad-hoc reports can be easily created
Executive Ad-hoc reports can be easily created
Scaleable and insulated from data sources
Data Sources
Extract,
Transform
& Load
(ETL)
Data from
source
systems
Evolve
First Data
Mart:
Create
specialized,
web-
enabled
interfaces
Aggregate
at group &
business
unit
levels
Distribute
to business
units and
Group
Source 3
Source 4
Source 2
Source 1
Etc.
External
Market
Data
User Benefits:
Standard operational reports not covered in warehouse because of lower “grain” or different refresh schedule.
Near “real time” access
Transaction Systems
Start with
One Key
Performance
Indicator
(KPI)
Create a
single data
mart for all
audiences
Data Warehouse
Transaction System Reporting
Either provided with the applications or enhanced with direct connect to transaction systems
User Benefits:
Future State:
A Data Warehouse With Comprehensive Business Intelligence:
User Benefits:
Extend data marts, create new data marts for additional KPI’s
Add data marts specialized for “problem” transactional reporting
Extend intranet-based viewing to include web-based querying
Provide seamless drill-down from high-level summary to transaction detail
Integrate information from multiple sources
Extends benefits from the first data mart
A Sample Comprehensive BI Architecture
3
3
OLTP = Online Transaction Processing = Source Systems, Transaction Processing Systems or Enterprise Resource Planning (ERP) Systems
Source: Howson
4
The “Perfect World” of ERP Conceptual Architecture
This would be great…
Prof’s experience confirms ERP systems’
reporting still need work
The Reality…
Multiple, disparate source systems
Redundancy in systems so must answer:
which source is the relative source?
Same data content, different identifiers
The data is incomplete or “dirty:” poor quality
Important term: edits
The data sources are at different grains, or
levels of detail
Data fields with the same name have different
business rules/definitions
Source data comes from multiple platforms
Depending on the age of the source platform,
documentation and expertise may not available
The schedule of source data refresh may not be
appropriate
Source system management may not alert
“downstream” system owners to changes
Partial loads with no notification or ETA
5
Source: Howson
5
Key Concepts Fighting the Uphill Battle
Data Profiling: Looking at source system data to affirm what’s really in there and how bad it is
Metadata: Data about data
Data Lineage: Answers where the data you’re looking at come from (a form of metadata)
Master Data Management: Central control for cross reference, standard business rules and glossaries, etc.
Data Governance: Ensuring your information assets remain of highest quality
Configuration Management: Application component management that answers if a change occurs here,
what else is impacted?
Project and Operations Management: Organized delivery of new features and change management
Quality Assurance: Testing the results with business users to ensure the quality data governance seeks
Also: Tell the class something interesting about your teammates !
6
Key BI Components
7
Source: Howson
Data Marts
&
Detail Data
Executive Dashboards, Management KPIs
Source Systems
Production & Operation
Self Service Environment
Self Service & Ad-hoc Query and Reporting
Dashboard, Scorecards, Scenario Analysis Tools
Cubes
Production & Operation Reports and Tools
Predefined Reports, Queries & Tools
Sample Business Intelligence (BI) Conceptual Solution Architecture
Detail data to support business analytic & reporting requirements
Data Foundation
Analytical
Workspaces,
Planning Tools
Production, Profitability & Plan Cubes
Source: Approximates one of the Professor’s Professional Environments
8
8
Metrics
Current and primary focus is delivery of key financial metrics at the office level, instill confidence in users –pending, notice counts, CWA, CWP, paid loss
One portal into reporting from any sources
Data in single source, integration of data, standardization
Transactional systems are not built for reporting:
Designed to get data in quickly and easily
Not designed to get information out quickly and easily
Another name for transactional systems is Online Transaction Processing or OLTP
Analytical systems are designed to get data out for reporting
Another name for analytical systems is Online Analytical Processing or OLAP
Transaction (OLTP) system data
Is up-to-the-minute
Contains details of the transaction system
Is preferred by functional managers and staff
Analytical (OLAP) system data
Refreshes on a real-time, daily or weekly schedule
Information aggregated at a higher detail for faster analysis
Most often preferred by upper level and executive management
“OLTP” and “OLAP” are fundamental BI Concepts
OLTP and OLAP
9
= OLTP
= OLAP
10
Source: Howson
A centralized, cleansed, tested, verified extract of data formatted for easy presentation and use
What is a Data Lake, a.k.a. a Data Warehouse?
11
Data marts present subsets of summarized, or aggregated, information found in a data warehouse for a specific business function
What is a Data Mart?
12
Source
1
Source
3
Source
2
Operational Data
Store/
Data Warehouse
Data Mart
1
Data Mart
2
Data Mart
3
Data Mart
4
Data Mart
n
Source
4
Enterprise
Drill Down
(Note: Each of these is
A data mart “star”)
From Data Warehouses to Data Marts
Staging Area
For data integration
and cleansing
13
13
Normalized Data = OLTP = Quick Loads and Lots of Transactions and Modifications
14
Source: Howson
Sample Transactional/Detail Data Model
Sample Data Mart (Aggregated) Data Model
Vs.
Aggregated data models are easy to use and maintain and provide fast response time:
Customized to fit your business model vs. a vendor’s approach
More stable because you own it – not a vendor
Enables source data integration: makes disparate systems talk
Why Build a BI Application?
15
Denormalized Data = OLAP = Extract and Serve the Data for Speedy Access and Presentation
16
Source: Howson
Facts – Numerical Measures saved at the lowest “grain”
Dimensions – Business Descriptors one wants to report on and see numbers aggregated on
Grain – Equals “granularity” of data, I.e., the lowest aggregation stored for a measure
Arranged in a “star” (or “star schema” in tech. terms), because the structure looks like a star:
If designed properly, dimensional models
Speed queries – the paths to the data are shorter, and aggregation is pre-built for each dimension
Easy for business people to understand
Another term for a dimensionally modeled record is ”flattened” because each record is physically long and “flat”
FACTS
Business
Client Company
Time
Etc.
Earned Premium
Written Premium
Paid Loss
IBNR
Etc.
For any star, a
Rule of thumb is
12 – 14 dimensions
Accident Day
Accident Month
Accident Year
Underwriting Year
Etc.
Account Number
Main Class of Business
Class of Business
SubClass of Business
Portfolio Segment
US Department
Etc.
Name
Currency
ID
Etc.
DIMENSION:
DIMENSION:
DIMENSION:
DIMENSION:
Data Mart Architecture
17
Key Concept: SQL
SQL (Pronounced See-Quel) = Structured Query Language
Background
Originally designed as an easy to use access language to go against relational database management systems (RDBMS)
Two things happened:
It became the de facto access language for RDBMS’ so it moved from end user query tools and
became embedded in complex code
End users found it more difficult to use than anticipated, reinforcing the first bullet
“Code generators” came along to insulate end users from having to generate the SQL. Most BI tools insulate
their end users by providing a “semantic layer” that uses business terms while their “back end” generates
SQL code
How is it Used? This question should be posed, “How isn’t it Used?”
The data structures just discussed are generated by “data design language” (DDL)
Data analysts use SQL to extract and profile the data in source systems to understand it
Quality assurance specialists use SQL to extract data to test their results
Database Administrators (DBAs) use it to assess and tune their system responsiveness (discussed two slides hence)
It is often still used to build complex extract, transformation and load (ETL)
18
ETL: SQL vs. a Graphical User Interface (GUI)
SQL Statements
SELECT {Table names}
WHERE {Conditions}
GROUP BY {Aggregation rules}
Informatica Power Center Mapping
Easier to maintain (a picture is worth 1,000 words)
Can access stored procedures that contain SQL code
(be careful, the Prof has seen DBA’s use the GUI as a
“shell” for a group of SQL statements, removing the
benefit of the tool)
DBAs prefer this because it provides
direct control
In the past (may have changed), some
native functionality was not optimized
in the GUI tools
20
Cyriac (2012)
BI in the Cloud
21
Source: Howson
SAS Studio, SAS On Demand for Academics
And SAS Viya are all examples of
BI in the Cloud!
Why the Cloud?
These days, cloud providers include leaders like Amazon, Google and salesforce.com
If you have a solid vendor management contract and are working with a reputable player, BI in the cloud makes sense:
Your IT staff has many priorities. Why not outsource your data and infrastructure to someone whose core competency is to do it right?
Scalability – you can expand or contract at will with no adverse impact
Security – it may seem counter intuitive but since cloud providers’ core competency is ensuring your data and infrastructure is safe, they employ the most advanced security measures
22
“Flavors” of Cloud Offerings
Infrastructure as a Service (IaaS): virtual servers, storage and networks
This is pay as you go
You retain ownership of tools and deployment
Platform as a Service (Paas): IaaS plus and an added layer of software clients use to build applications. The software is owned and managed by the provider
Software as a Service (SaaS): Vendors provide the solution, you consume it (e.g. salesforce.com)
23
Source: Howson
Data Marts
&
Detail Data
Executive Dashboards, Management KPIs
Source Systems
Production & Operation
Self Service Environment
Self Service & Ad-hoc Query and Reporting
Dashboard, Scorecards, Scenario Analysis Tools
Cubes
Production & Operation Reports and Tools
Predefined Reports, Queries & Tools
Where does Big Data Fit?
Detail data to support business analytic & reporting requirements
Data Foundation
Analytical
Workspaces,
Planning Tools
Production, Profitability & Plan Cubes
Source: Approximates one of the Professor’s Professional Environments
Big Data
Expands the
Foundation…
24
24
Metrics
Current and primary focus is delivery of key financial metrics at the office level, instill confidence in users –pending, notice counts, CWA, CWP, paid loss
One portal into reporting from any sources
Data in single source, integration of data, standardization
Placing Big Data in Context
Like the cloud, big data is getting beyond the hype and becoming indispensable
Why?
Before the term “big data” was coined, combining externally facing data with internally facing data for competitive understanding was just as crucial as it is now
The “holy grail” was managing unstructured data. For example, a memo field containing customer notes was difficult to parse to find a name. We always said, “Man, if we could just do that with something better than Soundex, it’d be great”
Web 2.0 technology became a primary means of communication and digitized video, social comments, pictures, etc.) enabled is now captured. There is a goldmine of feedback out there that we now must obtain…
This isn’t a revolution, its a progression. Market data is still available and now we can also work with unstructured data. When you combine all relevant internal and data, its the killer BI app!!
25
Big Data Technologies
Hadoop and NoSQL handle unstructured data far better than RDBMS
Unlike the DBMS’ discussed until now, NoSQL is not relational, columnar or proprietary. Instead it is flexible and open source and can work with “loosely described” data
Hadoop is named after its creator’s child’s toy elephant
It acts as an “umbrella” for a number of open source projects
Uses “pointers” to relate files
26
Source: Howson
Reference List
Cyriac, J. (2012 6 15). Dynamically change your target flat file using Informatica mapping. [Web
log entry]. Retrieved from http://www.disoln.org/2012/06/dynamically-change-your-flat-file- name.html
Howson, C. (2014). Successful business intelligence: Unlock the value
of BI and big data. New York. McGraw Hill Education.
ISBN: 9780071809184
27