ANALYTICS, DATA SCIENCE, & ARTIFICIAL INTELLIGENCE

SYSTEMS FOR DECISION SUPPORT

ELEVENTH EDITION

Ramesh Sharda Oklahoma State University

Dursun Delen Oklahoma State University

Efraim Turban University of Hawaii

Microsoft and/or its respective suppliers make no representations about the suitability of the information contained in the documents and related graphics published as part of the services for any purpose. All such documents and related graphics are provided “as is” without warranty of any kind. Microsoft and/or its respective suppliers hereby disclaim all warranties and conditions with regard to this information, including all warranties and conditions of merchantability, whether express, implied or statutory, fitness for a particular purpose, title and non-infringement. In no event shall Microsoft and/or its respective suppliers be liable for any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of information available from the services. The documents and related graphics contained herein could include technical inaccuracies or typographical errors. Changes are periodically added to the information herein. Microsoft and/or its respective suppliers may make improvements and/or changes in the product(s) and/or the program(s) described herein at any time. Partial screen shots may be viewed in full within the software version specified. Microsoft® Windows® and Microsoft Office® are registered trademarks of Microsoft Corporation in the U.S.A. and other countries. This book is not sponsored or endorsed by or affiliated with the Microsoft Corporation.

Vice President of Courseware Portfolio Management: Andrew Gilfillan

Executive Portfolio Manager: Samantha Lewis

Team Lead, Content Production: Laura Burgess

Content Producer: Faraz Sharique Ali

Portfolio Management Assistant: Bridget Daly

Director of Product Marketing: Brad Parkins

Director of Field Marketing: Jonathan Cottrell

Product Marketing Manager: Heather Taylor

Field Marketing Manager: Bob Nisbet

Product Marketing Assistant: Liz Bennett

Field Marketing Assistant: Derrica Moser

Senior Operations Specialist: Diane Peirano

Senior Art Director: Mary Seiner

Interior and Cover Design: Pearson CSC

Cover Photo: Phonlamai Photo/Shutterstock

Senior Product Model Manager: Eric Hakanson

Manager, Digital Studio: Heather Darby

Course Producer, MyLab MIS: Jaimie Noy

Digital Studio Producer: Tanika Henderson

Full-Service Project Manager: Gowthaman Sadhanandham

Full Service Vendor: Integra Software Service Pvt. Ltd.

Manufacturing Buyer: LSC Communications, Maura Zaldivar-Garcia

Text Printer/Bindery: LSC Communications

Cover Printer: Phoenix Color

ISBN 10: 0-13-519201-3 ISBN 13: 978-0-13-519201-6

Copyright © 2020, 2015, 2011 by Pearson Education, Inc. 221 River Street, Hoboken, NJ 07030. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearsoned.com/permissions. Acknowledgments of third-party content appear on the appropriate page within the text, which constitutes an extension of this copyright page. Unless otherwise indicated herein, any third-party trademarks that may appear in this work are the property of their respective owners and any references to third-party trademarks, logos or other trade dress are for demonstrative or descriptive purposes only. Such references are not intended to imply any sponsorship, endorsement, authorization, or promotion of Pearson’s products by the owners of such marks, or any relationship between the owner and Pearson Education, Inc. or its affiliates, authors, licensees or distributors.

Library of Congress Cataloging-in-Publication Data

Library of Congress Cataloging in Publication Control Number: 2018051774

BRIEF CONTENTS

Preface xxv

About the Authors xxxiv

PART I Introduction to Analytics and AI 1

Chapter 1 Overview of Business Intelligence, Analytics, Data Science, and Artificial Intelligence: Systems for Decision Support 2

Chapter 2 Artificial Intelligence: Concepts, Drivers, Major Technologies, and Business Applications 73

Chapter 3 Nature of Data, Statistical Modeling, and Visualization 117

PART II Predictive Analytics/Machine Learning 193

Chapter 4 Data Mining Process, Methods, and Algorithms 194

Chapter 5 Machine-Learning Techniques for Predictive Analytics 251

Chapter 6 Deep Learning and Cognitive Computing 315

Chapter 7 Text Mining, Sentiment Analysis, and Social Analytics 388

PART III Prescriptive Analytics and Big Data 459

Chapter 8 Prescriptive Analytics: Optimization and Simulation 460

Chapter 9 Big Data, Cloud Computing, and Location Analytics: Concepts and Tools 509

PART IV Robotics, Social Networks, AI and IoT 579

Chapter 10 Robotics: Industrial and Consumer Applications 580

Chapter 11 Group Decision Making, Collaborative Systems, and AI Support 610

Chapter 12 Knowledge Systems: Expert Systems, Recommenders, Chatbots, Virtual Personal Assistants, and Robo Advisors 648

Chapter 13 The Internet of Things as a Platform for Intelligent Applications 687

PART V Caveats of Analytics and AI 725

Chapter 14 Implementation Issues: From Ethics and Privacy to Organizational and Societal Impacts 726

Glossary 770

Index 785

CONTENTS

Preface xxv

About the Authors xxxiv

PART I Introduction to Analytics and AI 1

Chapter 1 Overview of Business Intelligence, Analytics, Data Science, and Artificial Intelligence: Systems for Decision Support 2

1.1 Opening Vignette: How Intelligent Systems Work for KONE Elevators and Escalators Company 3

1.2 Changing Business Environments and Evolving Needs for Decision Support and Analytics 5

Decision-Making Process 6

The Influence of the External and Internal Environments on the Process 6

Data and Its Analysis in Decision Making 7

Technologies for Data Analysis and Decision Support 7

1.3 Decision-Making Processes and Computerized Decision Support Framework 9

Simon’s Process: Intelligence, Design, and Choice 9

The Intelligence Phase: Problem (or Opportunity) Identification 10

APPLICATION CASE 1.1 Making Elevators Go Faster! 11

The Design Phase 12

The Choice Phase 13

The Implementation Phase 13

The Classical Decision Support System Framework 14

A DSS Application 16

Components of a Decision Support System 18

The Data Management Subsystem 18

The Model Management Subsystem 19

APPLICATION CASE 1.2 SNAP DSS Helps OneNet Make Telecommunications Rate Decisions 20

The User Interface Subsystem 20

The Knowledge-Based Management Subsystem 21

1.4 Evolution of Computerized Decision Support to Business Intelligence/Analytics/Data Science 22

A Framework for Business Intelligence 25

The Architecture of BI 25

The Origins and Drivers of BI 26

Data Warehouse as a Foundation for Business Intelligence 27

Transaction Processing versus Analytic Processing 27

A Multimedia Exercise in Business Intelligence 28

1.5 Analytics Overview 30

Descriptive Analytics 32

APPLICATION CASE 1.3 Silvaris Increases Business with Visual Analysis and Real-Time Reporting Capabilities 32

APPLICATION CASE 1.4 Siemens Reduces Cost with the Use of Data Visualization 33

Predictive Analytics 33

APPLICATION CASE 1.5 Analyzing Athletic Injuries 34

Prescriptive Analytics 34

APPLICATION CASE 1.6 A Specialty Steel Bar Company Uses Analytics to Determine Available-to-Promise Dates 35

1.6 Analytics Examples in Selected Domains 38

Sports Analytics—An Exciting Frontier for Learning and Understanding Applications of Analytics 38

Analytics Applications in Healthcare—Humana Examples 43

APPLICATION CASE 1.7 Image Analysis Helps Estimate Plant Cover 50

1.7 Artificial Intelligence Overview 52

What Is Artificial Intelligence? 52

The Major Benefits of AI 52

The Landscape of AI 52

APPLICATION CASE 1.8 AI Increases Passengers’ Comfort and Security in Airports and Borders 54

The Three Flavors of AI Decisions 55

Autonomous AI 55

Societal Impacts 56

APPLICATION CASE 1.9 Robots Took the Job of Camel-Racing Jockeys for Societal Benefits 58

1.8 Convergence of Analytics and AI 59

Major Differences between Analytics and AI 59

Why Combine Intelligent Systems? 60

How Convergence Can Help? 60

Big Data Is Empowering AI Technologies 60

The Convergence of AI and the IoT 61

The Convergence with Blockchain and Other Technologies 62

APPLICATION CASE 1.10 Amazon Go Is Open for Business 62

IBM and Microsoft Support for Intelligent Systems Convergence 63

1.9 Overview of the Analytics Ecosystem 63

1.10 Plan of the Book 65

1.11 Resources, Links, and the Teradata University Network Connection 66

Resources and Links 66

Vendors, Products, and Demos 66

Periodicals 67

The Teradata University Network Connection 67

The Book’s Web Site 67

Chapter Highlights 67 • Key Terms 68

Questions for Discussion 68 • Exercises 69

References 70

Chapter 2 Artificial Intelligence: Concepts, Drivers, Major Technologies, and Business Applications 73

2.1 Opening Vignette: INRIX Solves Transportation Problems 74

2.2 Introduction to Artificial Intelligence 76

Definitions 76

Major Characteristics of AI Machines 77

Major Elements of AI 77

AI Applications 78

Major Goals of AI 78

Drivers of AI 79

Benefits of AI 79

Some Limitations of AI Machines 81

Three Flavors of AI Decisions 81

Artificial Brain 82

2.3 Human and Computer Intelligence 83

What Is Intelligence? 83

How Intelligent Is AI? 84

Measuring AI 85

APPLICATION CASE 2.1 How Smart Can a Vacuum Cleaner Be? 86

2.4 Major AI Technologies and Some Derivatives 87

Intelligent Agents 87

Machine Learning 88

APPLICATION CASE 2.2 How Machine Learning Is Improving Work in Business 89

Machine and Computer Vision 90

Robotic Systems 91

Natural Language Processing 92

Knowledge and Expert Systems and Recommenders 93

Chatbots 94

Emerging AI Technologies 94

2.5 AI Support for Decision Making 95

Some Issues and Factors in Using AI in Decision Making 96

AI Support of the Decision-Making Process 96

Automated Decision Making 97

APPLICATION CASE 2.3 How Companies Solve Real-World Problems Using Google’s Machine-Learning Tools 97

Conclusion 98

2.6 AI Applications in Accounting 99

AI in Accounting: An Overview 99

AI in Big Accounting Companies 100

Accounting Applications in Small Firms 100

APPLICATION CASE 2.4 How EY, Deloitte, and PwC Are Using AI 100

Job of Accountants 101

2.7 AI Applications in Financial Services 101

AI Activities in Financial Services 101

AI in Banking: An Overview 101

Illustrative AI Applications in Banking 102

Insurance Services 103

APPLICATION CASE 2.5 US Bank Customer Recognition and Services 104

2.8 AI in Human Resource Management (HRM) 105

AI in HRM: An Overview 105

AI in Onboarding 105

APPLICATION CASE 2.6 How Alexander Mann Solutions (AMS) Is Using AI to Support the Recruiting Process 106

Introducing AI to HRM Operations 106

2.9 AI in Marketing, Advertising, and CRM 107

Overview of Major Applications 107

AI Marketing Assistants in Action 108

Customer Experiences and CRM 108

APPLICATION CASE 2.7 Kraft Foods Uses AI for Marketing and CRM 109

Other Uses of AI in Marketing 110

2.10 AI Applications in Production-Operation Management (POM) 110

AI in Manufacturing 110

Implementation Model 111

Intelligent Factories 111

Logistics and Transportation 112

Chapter Highlights 112 • Key Terms 113

Questions for Discussion 113 • Exercises 114

References 114

Chapter 3 Nature of Data, Statistical Modeling, and Visualization 117

3.1 Opening Vignette: SiriusXM Attracts and Engages a New Generation of Radio Consumers with Data-Driven Marketing 118

3.2 Nature of Data 121

3.3 Simple Taxonomy of Data 125

APPLICATION CASE 3.1 Verizon Answers the Call for Innovation: The Nation’s Largest Network Provider uses Advanced Analytics to Bring the Future to its Customers 127

3.4 Art and Science of Data Preprocessing 129

APPLICATION CASE 3.2 Improving Student Retention with Data-Driven Analytics 133

3.5 Statistical Modeling for Business Analytics 139

Descriptive Statistics for Descriptive Analytics 140

Measures of Central Tendency (Also Called Measures of Location or Centrality) 140

Arithmetic Mean 140

Median 141

Mode 141

Measures of Dispersion (Also Called Measures of Spread or Decentrality) 142

Range 142

Variance 142

Standard Deviation 143

Mean Absolute Deviation 143

Quartiles and Interquartile Range 143

Box-and-Whiskers Plot 143

Shape of a Distribution 145

APPLICATION CASE 3.3 Town of Cary Uses Analytics to Analyze Data from Sensors, Assess Demand, and Detect Problems 150

3.6 Regression Modeling for Inferential Statistics 151

How Do We Develop the Linear Regression Model? 152

How Do We Know If the Model Is Good Enough? 153

What Are the Most Important Assumptions in Linear Regression? 154

Logistic Regression 155

Time-Series Forecasting 156

APPLICATION CASE 3.4 Predicting NCAA Bowl Game Outcomes 157

3.7 Business Reporting 163

APPLICATION CASE 3.5 Flood of Paper Ends at FEMA 165

3.8 Data Visualization 166

Brief History of Data Visualization 167

APPLICATION CASE 3.6 Macfarlan Smith Improves Operational Performance Insight with Tableau Online 169

3.9 Different Types of Charts and Graphs 171

Basic Charts and Graphs 171

Specialized Charts and Graphs 172

Which Chart or Graph Should You Use? 174

3.10 Emergence of Visual Analytics 176

Visual Analytics 178

High-Powered Visual Analytics Environments 180

3.11 Information Dashboards 182

APPLICATION CASE 3.7 Dallas Cowboys Score Big with Tableau and Teknion 184

Dashboard Design 184

APPLICATION CASE 3.8 Visual Analytics Helps Energy Supplier Make Better Connections 185

What to Look for in a Dashboard 186

Best Practices in Dashboard Design 187

Benchmark Key Performance Indicators with Industry Standards 187

Wrap the Dashboard Metrics with Contextual Metadata 187

Validate the Dashboard Design by a Usability Specialist 187

Prioritize and Rank Alerts/Exceptions Streamed to the Dashboard 188

Enrich the Dashboard with Business-User Comments 188

Present Information in Three Different Levels 188

Pick the Right Visual Construct Using Dashboard Design Principles 188

Provide for Guided Analytics 188

Chapter Highlights 188 • Key Terms 189

Questions for Discussion 190 • Exercises 190

References 192

PART II Predictive Analytics/Machine Learning 193

Chapter 4 Data Mining Process, Methods, and Algorithms 194

4.1 Opening Vignette: Miami-Dade Police Department Is Using Predictive Analytics to Foresee and Fight Crime 195

4.2 Data Mining Concepts 198

APPLICATION CASE 4.1 Visa Is Enhancing the Customer Experience while Reducing Fraud with Predictive Analytics and Data Mining 199

Definitions, Characteristics, and Benefits 201

How Data Mining Works 202

APPLICATION CASE 4.2 American Honda Uses Advanced Analytics to Improve Warranty Claims 203

Data Mining Versus Statistics 208

4.3 Data Mining Applications 208

APPLICATION CASE 4.3 Predictive Analytic and Data Mining Help Stop Terrorist Funding 210

4.4 Data Mining Process 211

Step 1: Business Understanding 212

Step 2: Data Understanding 212

Step 3: Data Preparation 213

Step 4: Model Building 214

APPLICATION CASE 4.4 Data Mining Helps in Cancer Research 214

Step 5: Testing and Evaluation 217

Step 6: Deployment 217

Other Data Mining Standardized Processes and Methodologies 217

4.5 Data Mining Methods 220

Classification 220

Estimating the True Accuracy of Classification Models 221

Estimating the Relative Importance of Predictor Variables 224

Cluster Analysis for Data Mining 228

APPLICATION CASE 4.5 Influence Health Uses Advanced Predictive Analytics to Focus on the Factors That Really Influence People’s Healthcare Decisions 229

Association Rule Mining 232

4.6 Data Mining Software Tools 236

APPLICATION CASE 4.6 Data Mining goes to Hollywood: Predicting Financial Success of Movies 239

4.7 Data Mining Privacy Issues, Myths, and Blunders 242

APPLICATION CASE 4.7 Predicting Customer Buying Patterns—The Target Story 243

Data Mining Myths and Blunders 244

Chapter Highlights 246 • Key Terms 247

Questions for Discussion 247 • Exercises 248

References 250

Chapter 5 Machine-Learning Techniques for Predictive Analytics 251

5.1 Opening Vignette: Predictive Modeling Helps Better Understand and Manage Complex Medical Procedures 252

5.2 Basic Concepts of Neural Networks 255

Biological versus Artificial Neural Networks 256

APPLICATION CASE 5.1 Neural Networks are Helping to Save Lives in the Mining Industry 258

5.3 Neural Network Architectures 259

Kohonen’s Self-Organizing Feature Maps 259

Hopfield Networks 260

APPLICATION CASE 5.2 Predictive Modeling Is Powering the Power Generators 261

5.4 Support Vector Machines 263

APPLICATION CASE 5.3 Identifying Injury Severity Risk Factors in Vehicle Crashes with Predictive Analytics 264

Mathematical Formulation of SVM 269

Primal Form 269

Dual Form 269

Soft Margin 270

Nonlinear Classification 270

Kernel Trick 271

5.5 Process-Based Approach to the Use of SVM 271

Support Vector Machines versus Artificial Neural Networks 273

5.6 Nearest Neighbor Method for Prediction 274

Similarity Measure: The Distance Metric 275

Parameter Selection 275

APPLICATION CASE 5.4 Efficient Image Recognition and Categorization with knn 277

5.7 Naïve Bayes Method for Classification 278

Bayes Theorem 279

Naïve Bayes Classifier 279

Process of Developing a Naïve Bayes Classifier 280

Testing Phase 281

APPLICATION CASE 5.5 Predicting Disease Progress in Crohn’s Disease Patients: A Comparison of Analytics Methods 282

5.8 Bayesian Networks 287

How Does BN Work? 287

How Can BN Be Constructed? 288

5.9 Ensemble Modeling 293

Motivation—Why Do We Need to Use Ensembles? 293

Different Types of Ensembles 295

Bagging 296

Boosting 298

Variants of Bagging and Boosting 299

Stacking 300

Information Fusion 300

Summary—Ensembles are not Perfect! 301

APPLICATION CASE 5.6 To Imprison or Not to Imprison: A Predictive Analytics-Based Decision Support System for Drug Courts 304

Chapter Highlights 306 • Key Terms 308

Questions for Discussion 308 • Exercises 309

Internet Exercises 312 • References 313

Chapter 6 Deep Learning and Cognitive Computing 315

6.1 Opening Vignette: Fighting Fraud with Deep Learning and Artificial Intelligence 316

6.2 Introduction to Deep Learning 320

APPLICATION CASE 6.1 Finding the Next Football Star with Artificial Intelligence 323

6.3 Basics of “Shallow” Neural Networks 325

APPLICATION CASE 6.2 Gaming Companies Use Data Analytics to Score Points with Players 328

APPLICATION CASE 6.3 Artificial Intelligence Helps Protect Animals from Extinction 333

6.4 Process of Developing Neural Network–Based Systems 334

Learning Process in ANN 335

Backpropagation for ANN Training 336

6.5 Illuminating the Black Box of ANN 340

APPLICATION CASE 6.4 Sensitivity Analysis Reveals Injury Severity Factors in Traffic Accidents 341

6.6 Deep Neural Networks 343

Feedforward Multilayer Perceptron (MLP)-Type Deep Networks 343

Impact of Random Weights in Deep MLP 344

More Hidden Layers versus More Neurons? 345

APPLICATION CASE 6.5 Georgia DOT Variable Speed Limit Analytics Help Solve Traffic Congestions 346

6.7 Convolutional Neural Networks 349

Convolution Function 349

Pooling 352

Image Processing Using Convolutional Networks 353

APPLICATION CASE 6.6 From Image Recognition to Face Recognition 356

Text Processing Using Convolutional Networks 357

6.8 Recurrent Networks and Long Short-Term Memory Networks 360

APPLICATION CASE 6.7 Deliver Innovation by Understanding Customer Sentiments 363

LSTM Networks Applications 365

6.9 Computer Frameworks for Implementation of Deep Learning 368

Torch 368

Caffe 368

TensorFlow 369

Theano 369

Keras: An Application Programming Interface 370

6.10 Cognitive Computing 370

How Does Cognitive Computing Work? 371

How Does Cognitive Computing Differ from AI? 372

Cognitive Search 374

IBM Watson: Analytics at Its Best 375

APPLICATION CASE 6.8 IBM Watson Competes against the Best at Jeopardy! 376

How Does Watson Do It? 377

What Is the Future for Watson? 377

Chapter Highlights 381 • Key Terms 383

Questions for Discussion 383 • Exercises 384

References 385

Chapter 7 Text Mining, Sentiment Analysis, and Social Analytics 388

7.1 Opening Vignette: Amadori Group Converts Consumer Sentiments into Near-Real-Time Sales 389

7.2 Text Analytics and Text Mining Overview 392

APPLICATION CASE 7.1 Netflix: Using Big Data to Drive Big Engagement: Unlocking the Power of Analytics to Drive Content and Consumer Insight 395

7.3 Natural Language Processing (NLP) 397

APPLICATION CASE 7.2 AMC Networks Is Using Analytics to Capture New Viewers, Predict Ratings, and Add Value for Advertisers in a Multichannel World 399

7.4 Text Mining Applications 402

Marketing Applications 403

Security Applications 403

Biomedical Applications 404

APPLICATION CASE 7.3 Mining for Lies 404

Academic Applications 407

APPLICATION CASE 7.4 The Magic Behind the Magic: Instant Access to Information Helps the Orlando Magic Up their Game and the Fan’s Experience 408

7.5 Text Mining Process 410

Task 1: Establish the Corpus 410

Task 2: Create the Term–Document Matrix 411

Task 3: Extract the Knowledge 413

APPLICATION CASE 7.5 Research Literature Survey with Text Mining 415

7.6 Sentiment Analysis 418

APPLICATION CASE 7.6 Creating a Unique Digital Experience to Capture Moments That Matter at Wimbledon 419

Sentiment Analysis Applications 422

Sentiment Analysis Process 424

Methods for Polarity Identification 426

Using a Lexicon 426

Using a Collection of Training Documents 427

Identifying Semantic Orientation of Sentences and Phrases 428

Identifying Semantic Orientation of Documents 428

7.7 Web Mining Overview 429

Web Content and Web Structure Mining 431

7.8 Search Engines 433

Anatomy of a Search Engine 434

1. Development Cycle 434

2. Response Cycle 435

Search Engine Optimization 436

Methods for Search Engine Optimization 437

APPLICATION CASE 7.7 Delivering Individualized Content and Driving Digital Engagement: How Barbour Collected More Than 49,000 New Leads in One Month with Teradata Interactive 439

7.9 Web Usage Mining (Web Analytics) 441

Web Analytics Technologies 441

Web Analytics Metrics 442

Web Site Usability 442

Traffic Sources 443

Visitor Profiles 444

Conversion Statistics 444

7.10 Social Analytics 446

Social Network Analysis 446

Social Network Analysis Metrics 447

APPLICATION CASE 7.8 Tito’s Vodka Establishes Brand Loyalty with an Authentic Social Strategy 447

Connections 450

Distributions 450

Segmentation 451

Social Media Analytics 451

How Do People Use Social Media? 452

Measuring the Social Media Impact 453

Best Practices in Social Media Analytics 453

Chapter Highlights 455 • Key Terms 456

Questions for Discussion 456 • Exercises 456

References 457

PART III Prescriptive Analytics and Big Data 459

Chapter 8 Prescriptive Analytics: Optimization and Simulation 460

8.1 Opening Vignette: School District of Philadelphia Uses Prescriptive Analytics to Find Optimal Solution for Awarding Bus Route Contracts 461

8.2 Model-Based Decision Making 462

APPLICATION CASE 8.1 Canadian Football League Optimizes Game Schedule 463

Prescriptive Analytics Model Examples 465

Identification of the Problem and Environmental Analysis 465

APPLICATION CASE 8.2 Ingram Micro Uses Business Intelligence Applications to Make Pricing Decisions 466

Model Categories 467

8.3 Structure of Mathematical Models for Decision Support 469

The Components of Decision Support Mathematical Models 469

The Structure of Mathematical Models 470

8.4 Certainty, Uncertainty, and Risk 471

Decision Making under Certainty 471

Decision Making under Uncertainty 472

Decision Making under Risk (Risk Analysis) 472

APPLICATION CASE 8.3 American Airlines Uses Should-Cost Modeling to Assess the Uncertainty of Bids for Shipment Routes 472

8.5 Decision Modeling with Spreadsheets 473

APPLICATION CASE 8.4 Pennsylvania Adoption Exchange Uses Spreadsheet Model to Better Match Children with Families 474

APPLICATION CASE 8.5 Metro Meals on Wheels Treasure Valley Uses Excel to Find Optimal Delivery Routes 475

8.6 Mathematical Programming Optimization 477

APPLICATION CASE 8.6 Mixed-Integer Programming Model Helps the University of Tennessee Medical Center with Scheduling Physicians 478

Linear Programming Model 479

Modeling in LP: An Example 480

Implementation 484

8.7 Multiple Goals, Sensitivity Analysis, What-If Analysis, and Goal Seeking 486

Multiple Goals 486

Sensitivity Analysis 487

What-If Analysis 488

Goal Seeking 489

8.8 Decision Analysis with Decision Tables and Decision Trees 490

Decision Tables 490

Decision Trees 492

8.9 Introduction to Simulation 493

Major Characteristics of Simulation 493

APPLICATION CASE 8.7 Steel Tubing Manufacturer Uses a Simulation-Based Production Scheduling System 493

Advantages of Simulation 494

Disadvantages of Simulation 495

The Methodology of Simulation 495

Simulation Types 496

Monte Carlo Simulation 497

Discrete Event Simulation 498

APPLICATION CASE 8.8 Cosan Improves Its Renewable Energy Supply Chain Using Simulation 498

8.10 Visual Interactive Simulation 500

Conventional Simulation Inadequacies 500

Visual Interactive Simulation 500

Visual Interactive Models and DSS 500

Simulation Software 501

APPLICATION CASE 8.9 Improving Job-Shop Scheduling Decisions through RFID: A Simulation-Based Assessment 501

Chapter Highlights 505 • Key Terms 505

Questions for Discussion 505 • Exercises 506

References 508

Chapter 9 Big Data, Cloud Computing, and Location Analytics: Concepts and Tools 509

9.1 Opening Vignette: Analyzing Customer Churn in a Telecom Company Using Big Data Methods 510

9.2 Definition of Big Data 513

The “V”s That Define Big Data 514

APPLICATION CASE 9.1 Alternative Data for Market Analysis or Forecasts 517

9.3 Fundamentals of Big Data Analytics 519

Business Problems Addressed by Big Data Analytics 521

APPLICATION CASE 9.2 Overstock.com Combines Multiple Datasets to Understand Customer Journeys 522

9.4 Big Data Technologies 523

MapReduce 523

Why Use MapReduce? 523

Hadoop 524

How Does Hadoop Work? 525

Hadoop Technical Components 525

Hadoop: The Pros and Cons 527

NoSQL 528

APPLICATION CASE 9.3 eBay’s Big Data Solution 529

APPLICATION CASE 9.4 Understanding Quality and Reliability of Healthcare Support Information on Twitter 531

9.5 Big Data and Data Warehousing 532

Use Cases for Hadoop 533

Use Cases for Data Warehousing 534

The Gray Areas (Any One of the Two Would Do the Job) 535

Coexistence of Hadoop and Data Warehouse 536

9.6 In-Memory Analytics and Apache Spark™ 537

APPLICATION CASE 9.5 Using Natural Language Processing to analyze customer feedback in TripAdvisor reviews 538

Architecture of Apache Spark™ 538

Getting Started with Apache Spark™ 539

9.7 Big Data and Stream Analytics 543

Stream Analytics versus Perpetual Analytics 544

Critical Event Processing 545

Data Stream Mining 546

Applications of Stream Analytics 546

e-Commerce 546

Telecommunications 546

APPLICATION CASE 9.6 Salesforce Is Using Streaming Data to Enhance Customer Value 547

Law Enforcement and Cybersecurity 547

Power Industry 548

Financial Services 548

Health Sciences 548

Government 548

9.8 Big Data Vendors and Platforms 549

Infrastructure Services Providers 550

Analytics Solution Providers 550

Business Intelligence Providers Incorporating Big Data 551

APPLICATION CASE 9.7 Using Social Media for Nowcasting Flu Activity 551

APPLICATION CASE 9.8 Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse 554

9.9 Cloud Computing and Business Analytics 557

Data as a Service (DaaS) 558

Software as a Service (SaaS) 559

Platform as a Service (PaaS) 559

Infrastructure as a Service (IaaS) 559

Essential Technologies for Cloud Computing 560

APPLICATION CASE 9.9 Major West Coast Utility Uses Cloud-Mobile Technology to Provide Real-Time Incident Reporting 561

Cloud Deployment Models 563

Major Cloud Platform Providers in Analytics 563

Analytics as a Service (AaaS) 564

Representative Analytics as a Service Offerings 564

Illustrative Analytics Applications Employing the Cloud Infrastructure 565

Using Azure IOT, Stream Analytics, and Machine Learning to Improve Mobile Health Care Services 565

Gulf Air Uses Big Data to Get Deeper Customer Insight 566

Chime Enhances Customer Experience Using Snowflake 566

9.10 Location-Based Analytics for Organizations 567

Geospatial Analytics 567

APPLICATION CASE 9.10 Great Clips Employs Spatial Analytics to Shave Time in Location Decisions 570

APPLICATION CASE 9.11 Starbucks Exploits GIS and Analytics to Grow Worldwide 570

Real-Time Location Intelligence 572

Analytics Applications for Consumers 573

Chapter Highlights 574 • Key Terms 575

Questions for Discussion 575 • Exercises 575

References 576

PART IV Robotics, Social Networks, AI and IoT 579

Chapter 10 Robotics: Industrial and Consumer Applications 580

10.1 Opening Vignette: Robots Provide Emotional Support to Patients and Children 581

10.2 Overview of Robotics 584

10.3 History of Robotics 584

10.4 Illustrative Applications of Robotics 586

Changing Precision Technology 586

Adidas 586

BMW Employs Collaborative Robots 587

Tega 587

San Francisco Burger Eatery 588

Spyce 588

Mahindra & Mahindra Ltd. 589

Robots in the Defense Industry 589

Pepper 590

Da Vinci Surgical System 592

Snoo – A Robotic Crib 593

MEDi 593

Care-E Robot 593

AGROBOT 594

10.5 Components of Robots 595

10.6 Various Categories of Robots 596

10.7 Autonomous Cars: Robots in Motion 597

Autonomous Vehicle Development 598

Issues with Self-Driving Cars 599

10.8 Impact of Robots on Current and Future Jobs 600

10.9 Legal Implications of Robots and Artificial Intelligence 603

Tort Liability 603

Patents 603

Property 604

Taxation 604

Practice of Law 604

Constitutional Law 605

Professional Certification 605

Law Enforcement 605

Chapter Highlights 606 • Key Terms 606

Questions for Discussion 606 • Exercises 607

References 607

Chapter 11 Group Decision Making, Collaborative Systems, and AI Support 610

11.1 Opening Vignette: Hendrick Motorsports Excels with Collaborative Teams 611

11.2 Making Decisions in Groups: Characteristics, Process, Benefits, and Dysfunctions 613

Characteristics of Group Work 613

Types of Decisions Made by Groups 614

Group Decision-Making Process 614

Benefits and Limitations of Group Work 615

11.3 Supporting Group Work and Team Collaboration with Computerized Systems 616

Overview of Group Support Systems (GSS) 617

Time/Place Framework 617

Group Collaboration for Decision Support 618

11.4 Electronic Support for Group Communication and Collaboration 619

Groupware for Group Collaboration 619

Synchronous versus Asynchronous Products 619

Virtual Meeting Systems 620

Collaborative Networks and Hubs 622

Collaborative Hubs 622

Social Collaboration 622

Sample of Popular Collaboration Software 623

11.5 Direct Computerized Support for Group Decision Making 623

Group Decision Support Systems (GDSS) 624

Characteristics of GDSS 625

Supporting the Entire Decision-Making Process 625

Brainstorming for Idea Generation and Problem Solving 627

Group Support Systems 628

11.6 Collective Intelligence and Collaborative Intelligence 629

Definitions and Benefits 629

Computerized Support to Collective Intelligence 629

APPLICATION CASE 11.1 Collaborative Modeling for Optimal Water Management: The Oregon State University Project 630

How Collective Intelligence May Change Work and Life 631

Collaborative Intelligence 632

How to Create Business Value from Collaboration: The IBM Study 632


11.7 Crowdsourcing as a Method for Decision Support 633

The Essentials of Crowdsourcing 633

Crowdsourcing for Problem-Solving and Decision Support 634

Implementing Crowdsourcing for Problem Solving 635

APPLICATION CASE 11.2 How InnoCentive Helped GSK Solve a Difficult Problem 636

11.8 Artificial Intelligence and Swarm AI Support of Team Collaboration and Group Decision Making 636

AI Support of Group Decision Making 637

AI Support of Team Collaboration 637

Swarm Intelligence and Swarm AI 639

APPLICATION CASE 11.3 XPRIZE Optimizes Visioneering 639

11.9 Human–Machine Collaboration and Teams of Robots 640

Human–Machine Collaboration in Cognitive Jobs 641

Robots as Coworkers: Opportunities and Challenges 641

Teams of Collaborating Robots 642

Chapter Highlights 644 • Key Terms 645

Questions for Discussion 645 • Exercises 645

References 646

Chapter 12 Knowledge Systems: Expert Systems, Recommenders, Chatbots, Virtual Personal Assistants, and Robo Advisors 648

12.1 Opening Vignette: Sephora Excels with Chatbots 649

12.2 Expert Systems and Recommenders 650

Basic Concepts of Expert Systems (ES) 650

Characteristics and Benefits of ES 652

Typical Areas for ES Applications 653

Structure and Process of ES 653

APPLICATION CASE 12.1 ES Aid in Identification of Chemical, Biological, and Radiological Agents 655

Why the Classical Type of ES Is Disappearing 655

APPLICATION CASE 12.2 VisiRule 656

Recommendation Systems 657

APPLICATION CASE 12.3 Netflix Recommender: A Critical Success Factor 658

12.3 Concepts, Drivers, and Benefits of Chatbots 660

What Is a Chatbot? 660

Chatbot Evolution 660

Components of Chatbots and the Process of Their Use 662

Drivers and Benefits 663

Representative Chatbots from Around the World 663

12.4 Enterprise Chatbots 664

The Interest of Enterprises in Chatbots 664


Enterprise Chatbots: Marketing and Customer Experience 665

APPLICATION CASE 12.4 WeChat’s Super Chatbot 666

APPLICATION CASE 12.5 How Vera Gold Mark Uses Chatbots to Increase Sales 667

Enterprise Chatbots: Financial Services 668

Enterprise Chatbots: Service Industries 668

Chatbot Platforms 669

APPLICATION CASE 12.6 Transavia Airlines Uses Bots for Communication and Customer Care Delivery 669

Knowledge for Enterprise Chatbots 671

12.5 Virtual Personal Assistants 672

Assistant for Information Search 672

If You Were Mark Zuckerberg, Facebook CEO 672

Amazon’s Alexa and Echo 672

Apple’s Siri 675

Google Assistant 675

Other Personal Assistants 675

Competition Among Large Tech Companies 675

Knowledge for Virtual Personal Assistants 675

12.6 Chatbots as Professional Advisors (Robo Advisors) 676

Robo Financial Advisors 676

Evolution of Financial Robo Advisors 676

Robo Advisors 2.0: Adding the Human Touch 676

APPLICATION CASE 12.7 Betterment, the Pioneer of Financial Robo Advisors 677

Managing Mutual Funds Using AI 678

Other Professional Advisors 678

IBM Watson 680

12.7 Implementation Issues 680

Technology Issues 680

Disadvantages and Limitations of Bots 681

Quality of Chatbots 681

Setting Up Alexa’s Smart Home System 682

Constructing Bots 682

Chapter Highlights 683 • Key Terms 683

Questions for Discussion 684 • Exercises 684

References 685

Chapter 13 The Internet of Things as a Platform for Intelligent Applications 687

13.1 Opening Vignette: CNH Industrial Uses the Internet of Things to Excel 688

13.2 Essentials of IoT 689

Definitions and Characteristics 690


The IoT Ecosystem 691

Structure of IoT Systems 691

13.3 Major Benefits and Drivers of IoT 694

Major Benefits of IoT 694

Major Drivers of IoT 695

Opportunities 695

13.4 How IoT Works 696

IoT and Decision Support 696

13.5 Sensors and Their Role in IoT 697

Brief Introduction to Sensor Technology 697

APPLICATION CASE 13.1 Using Sensors, IoT, and AI for Environmental Control at the Athens, Greece, International Airport 697

How Sensors Work with IoT 698

APPLICATION CASE 13.2 Rockwell Automation Monitors Expensive Oil and Gas Exploration Assets to Predict Failures 698

Sensor Applications and Radio-Frequency Identification (RFID) Sensors 699

13.6 Selected IoT Applications 701

A Large-scale IoT in Action 701

Examples of Other Existing Applications 701

13.7 Smart Homes and Appliances 703

Typical Components of Smart Homes 703

Smart Appliances 704

A Smart Home Is Where the Bot Is 706

Barriers to Smart Home Adoption 707

13.8 Smart Cities and Factories 707

APPLICATION CASE 13.3 Amsterdam on the Road to Become a Smart City 708

Smart Buildings: From Automated to Cognitive Buildings 709

Smart Components in Smart Cities and Smart Factories 709

APPLICATION CASE 13.4 How IBM Is Making Cities Smarter Worldwide 711

Improving Transportation in the Smart City 712

Combining Analytics and IoT in Smart City Initiatives 713

Bill Gates’ Futuristic Smart City 713

Technology Support for Smart Cities 713

13.9 Autonomous (Self-Driving) Vehicles 714

The Developments of Smart Vehicles 714

APPLICATION CASE 13.5 Waymo and Autonomous Vehicles 715

Flying Cars 717

Implementation Issues in Autonomous Vehicles 717


13.10 Implementing IoT and Managerial Considerations 717

Major Implementation Issues 718

Strategy for Turning Industrial IoT into Competitive Advantage 719

The Future of the IoT 720

Chapter Highlights 721 • Key Terms 721

Questions for Discussion 722 • Exercises 722

References 722

PART V Caveats of Analytics and AI 725

Chapter 14 Implementation Issues: From Ethics and Privacy to Organizational and Societal Impacts 726

14.1 Opening Vignette: Why Did Uber Pay $245 Million to Waymo? 727

14.2 Implementing Intelligent Systems: An Overview 729

The Intelligent Systems Implementation Process 729

The Impacts of Intelligent Systems 730

14.3 Legal, Privacy, and Ethical Issues 731

Legal Issues 731

Privacy Issues 732

Who Owns Our Private Data? 735

Ethics Issues 735

Ethical Issues of Intelligent Systems 736

Other Topics in Intelligent Systems Ethics 736

14.4 Successful Deployment of Intelligent Systems 737

Top Management and Implementation 738

System Development Implementation Issues 738

Connectivity and Integration 739

Security Protection 739

Leveraging Intelligent Systems in Business 739

Intelligent System Adoption 740

14.5 Impacts of Intelligent Systems on Organizations 740

New Organizational Units and Their Management 741

Transforming Businesses and Increasing Competitive Advantage 741

APPLICATION CASE 14.1 How 1-800-Flowers.com Uses Intelligent Systems for Competitive Advantage 742

Redesign of an Organization Through the Use of Analytics 743

Intelligent Systems’ Impact on Managers’ Activities, Performance, and Job Satisfaction 744

Impact on Decision Making 745

Industrial Restructuring 746


14.6 Impacts on Jobs and Work 747

An Overview 747

Are Intelligent Systems Going to Take Jobs—My Job? 747

AI Puts Many Jobs at Risk 748

APPLICATION CASE 14.2 White-Collar Jobs That Robots Have Already Taken 748

Which Jobs Are Most in Danger? Which Ones Are Safe? 749

Intelligent Systems May Actually Add Jobs 750

Jobs and the Nature of Work Will Change 751

Conclusion: Let’s Be Optimistic! 752

14.7 Potential Dangers of Robots, AI, and Analytical Modeling 753

Position of AI Dystopia 753

The AI Utopia’s Position 753

The Open AI Project and the Friendly AI 754

The O’Neil Claim of Potential Analytics’ Dangers 755

14.8 Relevant Technology Trends 756

Gartner’s Top Strategic Technology Trends for 2018 and 2019 756

Other Predictions Regarding Technology Trends 757

Summary: Impact on AI and Analytics 758

Ambient Computing (Intelligence) 758

14.9 Future of Intelligent Systems 760

What Are the Major U.S. High-Tech Companies Doing in the Intelligent Technologies Field? 760

AI Research Activities in China 761

APPLICATION CASE 14.3 How Alibaba.com Is Conducting AI 762

The U.S.–China Competition: Who Will Control AI? 764

The Largest Opportunity in Business 764

Conclusion 764

Chapter Highlights 765 • Key Terms 766

Questions for Discussion 766 • Exercises 766

References 767

Glossary 770

Index 785


PREFACE

Analytics has become the technology driver of this decade. Companies such as IBM, Oracle, Microsoft, and others are creating new organizational units focused on analytics that help businesses become more effective and efficient in their operations. Decision makers are using data and computerized tools to make better decisions. Even consumers are using analytics tools directly or indirectly to make decisions on routine activities such as shopping, health care, and entertainment. The field of business analytics (BA)/data science (DS)/decision support systems (DSS)/business intelligence (BI) is evolving rapidly to become more focused on innovative methods and applications to utilize data streams that were not even captured some time back, much less analyzed in any significant way. New applications emerge daily in customer relationship management, banking and finance, health care and medicine, sports and entertainment, manufacturing and supply chain management, utilities and energy, and virtually every industry imaginable.

The theme of this revised edition is analytics, data science, and AI for enterprise decision support. In addition to traditional decision support applications, this edition expands the reader’s understanding of the various types of analytics by providing examples, products, services, and exercises, and by introducing AI, machine learning, robotics, chatbots, IoT, and Web/Internet-related enablers throughout the text. We highlight these technologies as emerging components of modern-day business analytics systems. AI technologies have a major impact on decision making by enabling autonomous decisions and by supporting steps in the process of making decisions. AI and analytics support each other by creating a synergy that assists decision making.

The purpose of this book is to introduce the reader to the technologies that are generally and collectively called analytics (or business analytics) but have been known by other names such as decision support systems, executive information systems, and business intelligence, among others. We use these terms interchangeably. This book presents the fundamentals of the methods, methodologies, and techniques used to design and develop these systems. In addition, we introduce the essentials of AI both as it relates to analytics and as a standalone discipline for decision support.

We follow an EEE approach to introducing these topics: Exposure, Experience, and Explore. The book primarily provides exposure to various analytics techniques and their applications. The idea is that a student will be inspired to learn from how other organizations have employed analytics to make decisions or to gain a competitive edge. We believe that such exposure to what is being done with analytics and how it can be achieved is the key component of learning about analytics. In describing the techniques, we also introduce specific software tools that can be used for developing such applications. The book is not limited to any one software tool, so the students can experience these techniques using any number of available software tools. Specific suggestions are given in each chapter, but the student and the professor are able to use this book with many different software tools. Our book’s companion Web site will include specific software guides, but students can gain experience with these techniques in many different ways. Finally, we hope that this exposure and experience enable and motivate readers to explore the potential of these techniques in their own domain. To facilitate such exploration, we include exercises that direct them to Teradata University Network and other sites that include team-oriented exercises where appropriate. In our own teaching experience, projects undertaken in the class facilitate such exploration after the students have been exposed to the myriad of applications and concepts in the book and have experienced specific software introduced by the professor.


This edition of the book can be used to offer a one-semester overview course on analytics, which covers most or all of the topics/chapters included in the book. It can also be used to teach two consecutive courses. For example, one course could focus on the overall analytics coverage. It could cover selected sections of Chapters 1 and 3–9. A second course could then focus on artificial intelligence and emerging technologies as the enablers of modern-day analytics. This second course could cover portions of Chapters 1, 2, 6, 9, and 10–14. The book can be used to offer managerial-level exposure to applications and techniques as noted in the previous paragraph, but it also includes sufficient technical details in selected chapters to allow an instructor to focus on some technical methods and hands-on exercises.

Most of the specific improvements made in this eleventh edition concentrate on three areas: reorganization, content update/upgrade (including AI, machine learning, chatbots, and robotics as enablers of analytics), and a sharper focus. Despite the many changes, we have preserved the comprehensiveness and user friendliness that have made the textbook a market leader in the last several decades. We have also optimized the book’s size and content by eliminating older and redundant material and by adding and combining material that parallels current trends and is also demanded by many professors. Finally, we present accurate and updated material that is not available in any other text. We next describe the changes in the eleventh edition.

The book is supported by a Web site (pearsonhighered.com/sharda). We provide links to additional learning materials and software tutorials through a special section of the book Web site.

WHAT’S NEW IN THE ELEVENTH EDITION?

With the goal of improving the text and making it current with the evolving technology trends, this edition marks a major reorganization to better reflect the current focus on analytics and its enabling technologies. The last three editions transformed the book from the traditional DSS to BI and then from BI to BA and fostered a tight linkage with the Teradata University Network (TUN). This edition is enhanced with new materials paralleling the latest trends in analytics including AI, machine learning, deep learning, robotics, IoT, and smart/robo-collaborative assisting systems and applications. The following summarizes the major changes made to this edition.

• New organization. The book is now organized around two main themes: (1) presentation of motivations, concepts, methods, and methodologies for different types of analytics (focusing heavily on predictive and prescriptive analytics), and (2) introduction and due coverage of new technology trends as the enablers of modern-day analytics, such as AI, machine learning, deep learning, robotics, IoT, and smart/robo-collaborative assisting systems. Chapter 1 provides an introduction to the journey of decision support and enabling technologies. It begins with a brief overview of classical decision making and decision support systems. Then it moves to business intelligence, followed by an introduction to analytics, Big Data, and AI. We follow that with a deeper introduction to artificial intelligence in Chapter 2. Because data is fundamental to any analysis, Chapter 3 introduces data issues as well as descriptive analytics, including statistical concepts and visualization. An online chapter covers data warehousing processes and fundamentals for those who like to dig deeper into these issues. The next section covers predictive analytics and machine learning. Chapter 4 provides an introduction to data mining applications and the data mining process. Chapter 5 introduces many of the common data mining techniques: classification, clustering, association mining, and so forth. Chapter 6 includes coverage of deep learning and cognitive computing. Chapter 7 focuses on text mining applications as well as Web analytics, including social media analytics, sentiment analysis, and other related topics. The following section takes the “data science” angle to further depth. Chapter 8 covers prescriptive analytics, including optimization and simulation. Chapter 9 includes more details of Big Data analytics. It also includes an introduction to cloud-based analytics as well as location analytics. The next section covers robotics, social networks, AI, and the Internet of Things (IoT). Chapter 10 introduces robots in business and consumer applications and also studies the future impact of such devices on society. Chapter 11 focuses on collaboration systems, crowdsourcing, and social networks. Chapter 12 reviews personal assistants, chatbots, and the exciting developments in this space. Chapter 13 studies IoT and its potential in decision support and a smarter society. The ubiquity of wireless and GPS devices and other sensors is resulting in the creation of massive new databases and unique applications. Finally, Chapter 14 concludes with a brief discussion of security, privacy, and societal dimensions of analytics and AI.

We should note that several chapters included in this edition have been available in the following companion book: Business Intelligence, Analytics, and Data Science: A Managerial Perspective, 4th Edition, Pearson (2018) (hereafter referred to as BI4e). The structure and contents of these chapters have been updated somewhat before inclusion in this edition of the book, but the changes are more significant in the chapters marked as new. Of course, several of the chapters that came from BI4e were not included in previous editions of this book.

• New chapters. The following chapters have been added:

Chapter 2, “Artificial Intelligence: Concepts, Drivers, Major Technologies, and Business Applications” This chapter covers the essentials of AI, outlines its benefits, compares it with humans’ intelligence, and describes the content of the field. Example applications in accounting, finance, human resource management, marketing and CRM, and production-operation management illustrate the benefits to business (100% new material).

Chapter 6, “Deep Learning and Cognitive Computing” This chapter covers deep learning, a generation of machine-learning techniques, as well as the increasingly popular AI topic of cognitive computing. It is an almost entirely new chapter (90% new material).

Chapter 10, “Robotics: Industrial and Consumer Applications” This chapter introduces many robotics applications in industry and for consumers and concludes with impacts of such advances on jobs and some legal ramifications (100% new material).

Chapter 12, “Knowledge Systems: Expert Systems, Recommenders, Chatbots, Virtual Personal Assistants, and Robo Advisors” This new chapter concentrates on different types of knowledge systems. Specifically, we cover new generations of expert systems and recommenders, chatbots, enterprise chatbots, virtual personal assistants, and robo-advisors (95% new).

Chapter 13, “The Internet of Things as a Platform for Intelligent Applications” This new chapter introduces IoT as an enabler to analytics and AI applications. The following technologies are described in detail: smart homes and appliances, smart cities (including factories), and autonomous vehicles (100% new).

Chapter 14, “Implementation Issues: From Ethics and Privacy to Organizational and Societal Impacts” This mostly new chapter deals with implementation issues of intelligent systems (including analytics). The major issues covered are protection of privacy, intellectual property, ethics, technical issues (e.g., integration and security), and administrative issues. We also cover the impact of these technologies on organizations and people and specifically deal with the impact on work and jobs. Special attention is given to possible unintended impacts of analytics and AI (robots). Then we look at relevant technology trends and conclude with an assessment of the future of analytics and AI (85% new).

• Streamlined coverage. We have optimized the book size and content by adding a lot of new material to cover new and cutting-edge analytics and AI trends and technologies while eliminating most of the older, less-used material. We use a dedicated Web site for the textbook to provide some of the older material as well as updated content and links.

• Revised and updated content. Several chapters have new opening vignettes that are based on recent stories and events. In addition, application cases throughout the book are new or have been updated to include recent examples of applications of a specific technique/model. These application case stories now include suggested questions for discussion to encourage class discussion as well as further exploration of the specific case and related materials. New Web site links have been added throughout the book. We also deleted many older product links and references. Finally, most chapters have new exercises, Internet assignments, and discussion questions throughout. The specific changes made to each chapter are as follows: Chapters 1, 3–5, and 7–9 borrow material from BI4e to a significant degree.

Chapter 1, “Overview of Business Intelligence, Analytics, Data Science, and Artificial Intelligence: Systems for Decision Support” This chapter includes some material from DSS10e Chapters 1 and 2, but includes several new application cases, entirely new material on AI, and of course, a new plan for the book (about 50% new material).

Chapter 3, “Nature of Data, Statistical Modeling, and Visualization”
• 75% new content.
• Most of the content related to nature of data and statistical analysis is new.
• New opening case.
• Mostly new cases throughout.

Chapter 4, “Data Mining Process, Methods, and Algorithms”
• 25% of the material is new.
• Some of the application cases are new.

Chapter 5, “Machine Learning Techniques for Predictive Analytics”
• 40% of the material is new.
• New machine-learning methods: naïve Bayes, Bayesian networks, and ensemble modeling.
• Most of the cases are new.

Chapter 7, “Text Mining, Sentiment Analysis, and Social Analytics”
• 25% of the material is new.
• Some of the cases are new.

Chapter 8, “Prescriptive Analytics: Optimization and Simulation”
• Several new optimization application exercises are included.
• A new application case is included.
• 20% of the material is new.

Chapter 9, “Big Data, Cloud Computing, and Location Analytics: Concepts and Tools” The material in this chapter has been updated substantially to include greater coverage of stream analytics. It also updates material from Chapters 7 and 8 of BI4e (50% new material).

Chapter 11, “Group Decision Making, Collaborative Systems, and AI Support” The chapter is completely revised, regrouping group decision support. New topics include collective and collaborative intelligence, crowdsourcing, swarm AI, and AI support of all related activities (80% new material).

We have retained many of the enhancements made in the last editions and updated the content. These are summarized next:

• Links to Teradata University Network (TUN). Most chapters include new links to TUN (teradatauniversitynetwork.com). We encourage the instructors to register and join teradatauniversitynetwork.com and explore the various content available through the site. The cases, white papers, and software exercises available through TUN will keep your class fresh and timely.

• Book title. As is already evident, the book’s title and focus have changed.

• Software support. The TUN Web site provides software support at no charge. It also provides links to free data mining and other software. In addition, the site provides exercises in the use of such software.

THE SUPPLEMENT PACKAGE: PEARSONHIGHERED.COM/SHARDA

A comprehensive and flexible technology-support package is available to enhance the teaching and learning experience. The following instructor and student supplements are available on the book’s Web site, pearsonhighered.com/sharda:

• Instructor’s Manual. The Instructor’s Manual includes learning objectives for the entire course and for each chapter, answers to the questions and exercises at the end of each chapter, and teaching suggestions (including instructions for projects). The Instructor’s Manual is available on the secure faculty section of pearsonhighered.com/sharda.

• Test Item File and TestGen Software. The Test Item File is a comprehensive collection of true/false, multiple-choice, fill-in-the-blank, and essay questions. The questions are rated by difficulty level, and the answers are referenced by book page number. The Test Item File is available in Microsoft Word and in TestGen. Pearson Education’s test-generating software is available from www.pearsonhighered.com/irc. The software is PC/MAC compatible and preloaded with all of the Test Item File questions. You can manually or randomly view test questions and drag-and-drop to create a test. You can add or modify test-bank questions as needed. Our TestGens are converted for use in BlackBoard, WebCT, Moodle, D2L, and Angel. These conversions can be found on pearsonhighered.com/sharda. The TestGen is also available in Respondus and can be found on www.respondus.com.

• PowerPoint slides. PowerPoint slides are available that illuminate and build on key concepts in the text. Faculty can download the PowerPoint slides from pearsonhighered.com/sharda.

ACKNOWLEDGMENTS

Many individuals have provided suggestions and criticisms since the publication of the first edition of this book. Dozens of students participated in class testing of various chapters, software, and problems and assisted in collecting material. It is not possible to name everyone who participated in this project, but our thanks go to all of them. Certain individuals made significant contributions, and they deserve special recognition.

First, we appreciate the efforts of those individuals who provided formal reviews of the first through eleventh editions (school affiliations as of the date of review):

Robert Blanning, Vanderbilt University
Ranjit Bose, University of New Mexico
Warren Briggs, Suffolk University
Lee Roy Bronner, Morgan State University
Charles Butler, Colorado State University
Sohail S. Chaudry, University of Wisconsin–La Crosse
Kathy Chudoba, Florida State University
Wingyan Chung, University of Texas
Woo Young Chung, University of Memphis
Paul “Buddy” Clark, South Carolina State University
Pi’Sheng Deng, California State University–Stanislaus
Joyce Elam, Florida International University
Kurt Engemann, Iona College
Gary Farrar, Jacksonville University
George Federman, Santa Clara City College
Jerry Fjermestad, New Jersey Institute of Technology
Joey George, Florida State University
Paul Gray, Claremont Graduate School
Orv Greynholds, Capital College (Laurel, Maryland)
Martin Grossman, Bridgewater State College
Ray Jacobs, Ashland University
Leonard Jessup, Indiana University
Jeffrey Johnson, Utah State University
Jahangir Karimi, University of Colorado Denver
Saul Kassicieh, University of New Mexico
Anand S. Kunnathur, University of Toledo
Shao-ju Lee, California State University at Northridge
Yair Levy, Nova Southeastern University
Hank Lucas, New York University
Jane Mackay, Texas Christian University
George M. Marakas, University of Maryland
Dick Mason, Southern Methodist University
Nick McGaughey, San Jose State University
Ido Millet, Pennsylvania State University–Erie
Benjamin Mittman, Northwestern University
Larry Moore, Virginia Polytechnic Institute and State University
Simitra Mukherjee, Nova Southeastern University
Marianne Murphy, Northeastern University
Peter Mykytyn, Southern Illinois University
Natalie Nazarenko, SUNY College at Fredonia
David Olson, University of Nebraska
Souren Paul, Southern Illinois University
Joshua Pauli, Dakota State University
Roger Alan Pick, University of Missouri–St. Louis
Saeed Piri, University of Oregon
W. “RP” Raghupaphi, California State University–Chico
Loren Rees, Virginia Polytechnic Institute and State University
David Russell, Western New England College
Steve Ruth, George Mason University
Vartan Safarian, Winona State University
Glenn Shephard, San Jose State University
Jung P. Shim, Mississippi State University
Meenu Singh, Murray State University
Randy Smith, University of Virginia
James T. C. Teng, University of South Carolina
John VanGigch, California State University at Sacramento
David Van Over, University of Idaho
Paul J. A. van Vliet, University of Nebraska at Omaha
B. S. Vijayaraman, University of Akron
Howard Charles Walton, Gettysburg College
Diane B. Walz, University of Texas at San Antonio
Paul R. Watkins, University of Southern California
Randy S. Weinberg, Saint Cloud State University
Jennifer Williams, University of Southern Indiana
Selim Zaim, Sehir University
Steve Zanakis, Florida International University
Fan Zhao, Florida Gulf Coast University
Hamed Majidi Zolbanin, Ball State University

Several individuals contributed material to the text or the supporting material. For this new edition, assistance from the following students and colleagues is gratefully acknowledged: Behrooz Davazdahemami, Bhavana Baheti, Varnika Gottipati, and Chakradhar Pathi (all of Oklahoma State University). Prof. Rick Wilson contributed some examples and new exercise questions for Chapter 8. Prof. Pankush Kalgotra (Auburn University) contributed the new streaming analytics tutorial in Chapter 9. Other contributors of materials for specific application stories are identified as sources in the respective sections. Susan Baskin, Imad Birouty, Sri Raghavan, and Yenny Yang of Teradata provided special help in identifying new TUN content for the book and arranging permissions for the same.

Many other colleagues and students have assisted us in developing previous editions or the recent edition of the companion book from which some of the content has been adapted in this revision. Some of that content is still included this edition. Their assistance and contributions are acknowledged as well in chronological order. Dr. Dave Schrader contributed the sports examples used in Chapter 1. These will provide a great introduc- tion to analytics. We also thank INFORMS for their permission to highlight content from Interfaces. We also recognize the following individuals for their assistance in develop- ing Previous edition of the book: Pankush Kalgotra, Prasoon Mathur, Rupesh Agarwal, Shubham Singh, Nan Liang, Jacob Pearson, Kinsey Clemmer, and Evan Murlette (all of Oklahoma State University). Their help for BI 4e is gratefully acknowledged. The Tera- data Aster team, especially Mark Ott, provided the material for the opening vignette for Chapter 9. Dr. Brian LeClaire, CIO of Humana Corporation led with contributions of sev- eral real-life healthcare case studies developed by his team at Humana. Abhishek Rathi of vCreaTek contributed his vision of analytics in the retail industry. In addition, the follow- ing former PhD students and research colleagues of ours have provided content or advice and support for the book in many direct and indirect ways: Asil Oztekin, University of Massachusetts-Lowell; Enes Eryarsoy, Sehir University; Hamed Majidi Zolbanin, Ball State University; Amir Hassan Zadeh, Wright State University; Supavich (Fone) Pengnate, North Dakota State University; Christie Fuller, Boise State University; Daniel Asamoah, Wright State University; Selim Zaim, Istanbul Technical University; and Nihat Kasap, Sabanci Uni- versity. Peter Horner, editor of OR/MS Today, allowed us to summarize new application stories from OR/MS Today and Analytics Magazine. We also thank INFORMS for their permission to highlight content from Interfaces. 
Assistance from Natraj Ponna, Daniel Asamoah, Amir Hassan-Zadeh, Kartik Dasika, and Angie Jungermann (all of Oklahoma State University) is gratefully acknowledged for DSS 10th edition. We also acknowledge Jongswas Chongwatpol (NIDA, Thailand) for the material on SIMIO software, and Kazim Topuz (University of Tulsa) for his contributions to the Bayesian networks section in Chapter 5. For other previous editions, we acknowledge the contributions of Dave King (a technology consultant and former executive at JDA Software Group, Inc.) and Jerry Wagner (University of Nebraska–Omaha). Major contributors for earlier editions include Mike Goul (Arizona State University) and Leila A. Halawi (Bethune-Cookman College), who provided material for the chapter on data warehousing; Christy Cheung (Hong Kong Baptist University), who contributed to the chapter on knowledge management; Linda Lai (Macau Polytechnic University of China); Lou Frenzel, an independent consultant whose books Crash Course in Artificial Intelligence and Expert Systems and Understanding of Expert Systems (both published by Howard W. Sams, New York, 1987) provided material for the early editions; Larry Medsker (American University), who contributed substantial material on neural networks; and Richard V. McCarthy (Quinnipiac University), who performed major revisions in the seventh edition.

Previous editions of the book have also benefited greatly from the efforts of many individuals who contributed advice and interesting material (such as problems), gave feedback on material, or helped with class testing. These include Warren Briggs (Suffolk University), Frank DeBalough (University of Southern California), Mei-Ting Cheung (University of Hong Kong), Alan Dennis (Indiana University), George Easton (San Diego State University), Janet Fisher (California State University, Los Angeles), David Friend (Pilot Software, Inc.), the late Paul Gray (Claremont Graduate School), Mike Henry (OSU), Dustin Huntington (Exsys, Inc.), Subramanian Rama Iyer (Oklahoma State University), Elena Karahanna (The University of Georgia), Mike McAulliffe (The University of Georgia), Chad Peterson (The University of Georgia), Neil Rabjohn (York University), Jim Ragusa (University of Central Florida), Alan Rowe (University of Southern California), Steve Ruth (George Mason University), Linus Schrage (University of Chicago), Antonie Stam (University of Missouri), the late Ron Swift (NCR Corp.), Merril Warkentin (then at Northeastern University), Paul Watkins (The University of Southern California), Ben Mortagy (Claremont Graduate School of Management), Dan Walsh (Bellcore), Richard Watson (The University of Georgia), and the many other instructors and students who have provided feedback.

Several vendors cooperated by providing development and/or demonstration software: Dan Fylstra of Frontline Systems, Gregory Piatetsky-Shapiro of KDNuggets.com, Logic Programming Associates (UK), Gary Lynn of NeuroDimension Inc. (Gainesville, Florida), Palisade Software (Newfield, New York), Jerry Wagner of Planners Lab (Omaha, Nebraska), Promised Land Technologies (New Haven, Connecticut), Salford Systems (La Jolla, California), Gary Miner of StatSoft, Inc. (Tulsa, Oklahoma), Ward Systems Group, Inc. (Frederick, Maryland), Idea Fisher Systems, Inc. (Irving, California), and Wordtech Systems (Orinda, California).

Special thanks to the Teradata University Network and especially to Hugh Watson, Michael Goul, and Susan Baskin, Program Director, for their encouragement to tie this book with TUN and for providing useful material for the book.

Many individuals helped us with administrative matters and editing, proofreading, and preparation. The project began with Jack Repcheck (a former Macmillan editor), who initiated this project with the support of Hank Lucas (New York University). Jon Outland assisted with the supplements.

Finally, the Pearson team is to be commended: Executive Editor Samantha Lewis, who orchestrated this project; the copyeditors; and the production team, Faraz Sharique Ali at Pearson, and Gowthaman and staff at Integra Software Services, who transformed the manuscript into a book.

We would like to thank all these individuals and corporations. Without their help, the creation of this book would not have been possible. We want to specifically acknowledge the contributions of previous coauthors Janine Aronson, David King, and T. P. Liang, whose original contributions constitute significant components of the book.

R.S.

D.D.

E.T.

Note that Web site URLs are dynamic. As this book went to press, we verified that all the cited Web sites were active and valid. Web sites to which we refer in the text sometimes change or are discontinued because companies change names, are bought or sold, merge, or fail. Sometimes Web sites are down for maintenance, repair, or redesign. Most organizations have dropped the initial “www” designation for their sites, but some still use it. If you have a problem connecting to a Web site that we mention, please be patient and simply run a Web search to try to identify the new site. Most times, the new site can be found quickly. Some sites also require a free registration before allowing you to see the content. We apologize in advance for this inconvenience.

ABOUT THE AUTHORS

Ramesh Sharda (M.B.A., Ph.D., University of Wisconsin–Madison) is the Vice Dean for Research and Graduate Programs, Watson/ConocoPhillips Chair and a Regents Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University. His research has been published in major journals in management science and information systems including Management Science, Operations Research, Information Systems Research, Decision Support Systems, Decision Sciences Journal, EJIS, JMIS, Interfaces, INFORMS Journal on Computing, ACM Data Base, and many others. He is a member of the editorial boards of journals such as Decision Support Systems, Decision Sciences, and ACM Database. He has worked on many sponsored research projects with government and industry, and has also served as a consultant to many organizations. He also serves as the Faculty Director of Teradata University Network. He received the 2013 INFORMS Computing Society HG Lifetime Service Award, and was inducted into the Oklahoma Higher Education Hall of Fame in 2016. He is a Fellow of INFORMS.

Dursun Delen (Ph.D., Oklahoma State University) holds the Spears and Patterson Chairs in Business Analytics and is Director of Research for the Center for Health Systems Innovation and Regents Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University (OSU). Prior to his academic career, he worked for a privately owned research and consultancy company, Knowledge Based Systems Inc., in College Station, Texas, as a research scientist for five years, during which he led a number of decision support and other information systems–related research projects funded by federal agencies such as DoD, NASA, NIST, and DOE. Dr. Delen’s research has appeared in major journals including Decision Sciences, Decision Support Systems, Communications of the ACM, Computers and Operations Research, Computers in Industry, Journal of Production Operations Management, Journal of American Medical Informatics Association, Artificial Intelligence in Medicine, and Expert Systems with Applications, among others. He has published eight books/textbooks and more than 100 peer-reviewed journal articles. He is often invited to national and international conferences for keynote addresses on topics related to business analytics, Big Data, data/text mining, business intelligence, decision support systems, and knowledge management. He served as the general co-chair for the 4th International Conference on Network Computing and Advanced Information Management (September 2–4, 2008, in Seoul, South Korea) and regularly serves as chair on tracks and mini-tracks at various business analytics and information systems conferences. He is the co-editor-in-chief for the Journal of Business Analytics, the area editor for Big Data and Business Analytics on the Journal of Business Research, and also serves as chief editor, senior editor, associate editor, and editorial board member on more than a dozen other journals. His consultancy, research, and teaching interests are in business analytics, data and text mining, health analytics, decision support systems, knowledge management, systems analysis and design, and enterprise modeling.

Efraim Turban (M.B.A., Ph.D., University of California, Berkeley) is a visiting scholar at the Pacific Institute for Information System Management, University of Hawaii. Prior to this, he was on the staff of several universities, including City University of Hong Kong; Lehigh University; Florida International University; California State University, Long Beach; Eastern Illinois University; and the University of Southern California. Dr. Turban is the author of more than 110 refereed papers published in leading journals, such as Management Science, MIS Quarterly, and Decision Support Systems. He is also the author of 22 books, including Electronic Commerce: A Managerial Perspective and Information Technology for Management. He is also a consultant to major corporations worldwide. Dr. Turban’s current areas of interest are Web-based decision support systems, digital commerce, and applied artificial intelligence.

PART I

Introduction to Analytics and AI

CHAPTER 1

Overview of Business Intelligence, Analytics, Data Science, and Artificial Intelligence: Systems for Decision Support

LEARNING OBJECTIVES

■■ Understand the need for computerized support of managerial decision making

■■ Understand the development of systems for providing decision-making support

■■ Recognize the evolution of such computerized support to the current state of analytics/data science and artificial intelligence

■■ Describe the business intelligence (BI) methodology and concepts

■■ Understand the different types of analytics and review selected applications

■■ Understand the basic concepts of artificial intelligence (AI) and see selected applications

■■ Understand the analytics ecosystem to identify various key players and career opportunities

The business environment (climate) is constantly changing, and it is becoming more and more complex. Organizations, both private and public, are under pressures that force them to respond quickly to changing conditions and to be innovative in the way they operate. Such activities require organizations to be agile and to make frequent and quick strategic, tactical, and operational decisions, some of which are very complex. Making such decisions may require considerable amounts of relevant data, information, and knowledge. Processing these in the framework of the needed decisions must be done quickly, frequently in real time, and usually requires some computerized support. As technologies are evolving, many decisions are being automated, leading to a major impact on knowledge work and workers in many ways.

This book is about using business analytics and artificial intelligence (AI) as a computerized support portfolio for managerial decision making. It concentrates on the theoretical and conceptual foundations of decision support as well as on the commercial tools and techniques that are available. The book presents the fundamentals of the techniques and the manner in which these systems are constructed and used. We follow an EEE (exposure, experience, and exploration) approach to introducing these topics. The book primarily provides exposure to various analytics/AI techniques and their applications. The idea is that students will be inspired to learn from how various organizations have employed these technologies to make decisions or to gain a competitive edge. We believe that such exposure to what is being accomplished with analytics and how it can be achieved is the key component of learning about analytics. In describing the techniques, we also give examples of specific software tools that can be used for developing such applications. However, the book is not limited to any one software tool, so students can experience these techniques using any number of available software tools. We hope that this exposure and experience enable and motivate readers to explore the potential of these techniques in their own domain. To facilitate such exploration, we include exercises that direct the reader to Teradata University Network (TUN) and other sites that include team-oriented exercises where appropriate. In our own teaching experience, projects undertaken in the class facilitate such exploration after students have been exposed to the myriad of applications and concepts in the book and they have experienced specific software introduced by the professor.

This introductory chapter introduces analytics and artificial intelligence and gives an overview of the book. The chapter has the following sections:

1.1 Opening Vignette: How Intelligent Systems Work for KONE Elevators and Escalators Company
1.2 Changing Business Environments and Evolving Needs for Decision Support and Analytics
1.3 Decision-Making Processes and Computerized Decision Support Framework
1.4 Evolution of Computerized Decision Support to Business Intelligence/Analytics/Data Science
1.5 Analytics Overview
1.6 Analytics Examples in Selected Domains
1.7 Artificial Intelligence Overview
1.8 Convergence of Analytics and AI
1.9 Overview of the Analytics Ecosystem
1.10 Plan of the Book
1.11 Resources, Links, and the Teradata University Network Connection

1.1 OPENING VIGNETTE: How Intelligent Systems Work for KONE Elevators and Escalators Company

KONE is a global industrial company (based in Finland) that manufactures mostly elevators and escalators and also services over 1.1 million elevators, escalators, and related equipment in several countries. The company employs over 50,000 people.

THE PROBLEM

Over 1 billion people use the elevators and escalators manufactured and serviced by KONE every day. If equipment does not work properly, people may be late to work, may not get home in time, and may miss important meetings and events. So, KONE’s objective is to minimize the downtime and users’ suffering.

4 Part I • Introduction to Analytics and AI

The company has over 20,000 technicians who are dispatched to deal with the elevators anytime a problem occurs. As buildings are getting taller (the trend in many places), more people are using elevators, and there is more pressure on elevators to handle the growing amount of traffic. KONE faced the responsibility to serve users smoothly and safely.

THE SOLUTION

KONE decided to use the IBM Watson IoT Cloud platform. As we will see in Chapter 6, IBM installed cognitive abilities in buildings that make it possible to recognize situations and behavior of both people and equipment. The Internet of Things (IoT), as we will see in Chapter 13, is a platform that can connect millions of “things” together and to a central command that can manipulate the connected things. Also, the IoT connects sensors that are attached to KONE’s elevators and escalators. The sensors collect information and data about the elevators (such as noise level) and other equipment in real time. Then, the IoT transfers the collected data to information centers via the “cloud.” There, analytic systems (IBM Advanced Analytic Engine) and AI process the collected data and predict things such as potential failures. The systems also identify the likely causes of problems and suggest potential remedies. Note the predictive power of IBM Watson Analytics (using machine learning, an AI technology described in Chapters 4–6) for finding problems before they occur.

The KONE system collects a significant amount of data that are analyzed for other purposes so that future design of equipment can be improved. This is because Watson Analytics offers a convenient environment for communication of and collaboration around the data. In addition, the analysis suggests how to optimize buildings and equipment operations. Finally, KONE and its customers can get insights regarding the financial aspects of managing the elevators.

KONE also integrates the Watson capabilities with Salesforce’s service tools (Service Cloud Lightning and Field Service Lightning). This combination helps KONE to respond to emergencies or soon-to-occur failures as quickly as possible, dispatching some of its 20,000 technicians to the problems’ sites. Salesforce also provides superb customer relationship management (CRM). The people–machine communication, query, and collaboration in the system are in a natural language (an AI capability of Watson Analytics; see Chapter 6). Note that IBM Watson Analytics includes two types of analytics: predictive, which predicts when failures may occur, and prescriptive, which recommends actions (e.g., preventive maintenance).
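To make the distinction between predictive and prescriptive analytics concrete, the following minimal Python sketch mimics the flow described above: sensor readings come in, a simple rule predicts which unit is at risk, and a second rule recommends an action. It is only an illustration under our own assumptions. The threshold, the averaging window, the elevator names, and the readings are all invented, and the simple averaging rule stands in for the machine-learning models that a platform such as IBM Watson would actually use.

```python
from statistics import mean

# Hypothetical values chosen only for illustration; not KONE or IBM settings.
NOISE_LIMIT_DB = 70.0   # assumed noise level above which a unit is considered at risk
WINDOW = 3              # number of most recent readings to average


def predict_failure_risk(noise_readings_db):
    """Predictive step: flag a unit whose recent average noise exceeds the limit."""
    recent = noise_readings_db[-WINDOW:]
    return mean(recent) > NOISE_LIMIT_DB


def prescribe_action(at_risk):
    """Prescriptive step: turn the prediction into a recommended action."""
    return "schedule preventive maintenance" if at_risk else "no action needed"


# Simulated sensor streams (decibels) for two hypothetical elevators.
readings = {
    "elevator_A": [61.0, 62.5, 60.8, 61.9],
    "elevator_B": [64.0, 69.5, 72.0, 75.5],
}

recommendations = {
    unit: prescribe_action(predict_failure_risk(values))
    for unit, values in readings.items()
}

for unit, action in recommendations.items():
    print(unit, "->", action)
```

In this sketch the prediction and the recommendation are deliberately kept in separate functions, mirroring the way the vignette separates predicting a failure from deciding what to do about it.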

THE RESULTS

KONE has minimized downtime and shortened the repair time. Obviously, elevator/escalator users are much happier if they do not have problems because of equipment downtime, so they enjoy trouble-free rides. The prediction of “soon-to-happen” failures can spare the equipment owners many problems. The owners can also optimize the schedule of their own employees (e.g., cleaners and maintenance workers). All in all, the decision makers at both KONE and the buildings can make informed and better decisions. Some day in the future, robots may perform maintenance and repairs of elevators and escalators.

Note: This case is a sample of IBM Watson’s success using its cognitive buildings capability. To learn more, we suggest you view the following YouTube videos: (1) youtube.com/watch?v=6UPJHyiJft0 (1:31 min.) (2017); (2) youtube.com/watch?v=EVbd3ejEXus (2:49 min.) (2017).

Sources: Compiled from J. Fernandez. (2017, April). “A Billion People a Day. Millions of Elevators. No Room for Downtime.” IBM developerWorks Blog. developer.ibm.com/dwblog/2017/kone-watson-video/ (accessed September 2018); H. Srikanthan. “KONE Improves ‘People Flow’ in 1.1 Million Elevators with IBM Watson IoT.” Generis. https://generisgp.com/2018/01/08/ibm-case-study-kone-corp/ (accessed September 2018); L. Slowey. (2017, February 16). “Look Who’s Talking: KONE Makes Elevator Services Truly Intelligent with Watson IoT.” IBM Internet of Things Blog. ibm.com/blogs/internet-of-things/kone/ (accessed September 2018).

QUESTIONS FOR THE OPENING VIGNETTE

1. It is said that KONE is embedding intelligence across its supply chain and enables smarter buildings. Explain.
2. Describe the role of IoT in this case.
3. What makes IBM Watson a necessity in this case?
4. Check IBM Advanced Analytics. What tools were included that relate to this case?
5. Check IBM cognitive buildings. How do they relate to this case?

WHAT CAN WE LEARN FROM THIS VIGNETTE?

Today, intelligent technologies can embark on large-scale complex projects when they include AI combined with IoT. The capabilities of integrated intelligent platforms, such as IBM Watson, make it possible to solve problems that were economically and technologically unsolvable just a few years ago. The case introduces the reader to several of the technologies, including advanced analytics, sensors, IoT, and AI that are covered in this book. The case also points to the use of “cloud.” The cloud is used to centrally process large amounts of information using analytics and AI algorithms, involving “things” in different locations. This vignette also introduces us to two major types of analytics: predictive analytics (Chapters 4–6) and prescriptive analytics (Chapter 8).

Several AI technologies are discussed: machine learning, natural language processing, computer vision, and prescriptive analysis.

The case is an example of augmented intelligence in which people and machines work together. The case illustrates the benefits to the vendor, the implementing companies, and their employees and to the users of the elevators and escalators.

1.2 CHANGING BUSINESS ENVIRONMENTS AND EVOLVING NEEDS FOR DECISION SUPPORT AND ANALYTICS

Decision making is one of the most important activities in organizations of all kinds, and probably the most important one. Decision making leads to the success or failure of organizations and determines how well they perform. Making decisions is getting more difficult due to internal and external factors. The rewards of making appropriate decisions can be very high, and so can the losses from inappropriate ones.

Unfortunately, it is not simple to make decisions. To begin with, there are several types of decisions, each of which requires a different decision-making approach. For example, De Smet et al. (2017) of McKinsey & Company management consultants classify organizational decisions into the following four groups:

• Big-bet, high-risk decisions.
• Cross-cutting decisions, which are repetitive but high risk and require group work (Chapter 11).
• Ad hoc decisions that arise episodically.
• Delegated decisions to individuals or small groups.

Therefore, it is necessary first to understand the nature of decision making. For a comprehensive discussion, see De Smet et al. (2017).

Modern business is full of uncertainties and rapid changes. To deal with these, organizational decision makers need to deal with ever-increasing and changing data. This book is about the technologies that can assist decision makers in their jobs.

Decision-Making Process

For years, managers considered decision making purely an art: a talent acquired over a long period through experience (i.e., learning by trial and error) and by using intuition. Management was considered an art because a variety of individual styles could be used in approaching and successfully solving the same types of managerial problems. These styles were often based on creativity, judgment, intuition, and experience rather than on systematic quantitative methods grounded in a scientific approach. However, recent research suggests that companies with top managers who are more focused on persistent work tend to outperform those with leaders whose main strengths are interpersonal communication skills. It is more important to emphasize methodical, thoughtful, analytical decision making rather than flashiness and interpersonal communication skills.

Managers usually make decisions by following a four-step process (we learn more about these in the next section):

1. Define the problem (i.e., a decision situation that may deal with some difficulty or with an opportunity).
2. Construct a model that describes the real-world problem.
3. Identify possible solutions to the modeled problem and evaluate the solutions.
4. Compare, choose, and recommend a potential solution to the problem.

A more detailed process is offered by Quain (2018), who suggests the following steps:

1. Understand the decision you have to make.
2. Collect all the information.
3. Identify the alternatives.
4. Evaluate the pros and cons.
5. Select the best alternative.
6. Make the decision.
7. Evaluate the impact of your decision.
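The middle steps of such a process (identify the alternatives, evaluate them, and select the best one) can be sketched with a simple weighted scoring model. The Python example below is one common way to do this, not a method prescribed by Quain (2018); the criteria, weights, alternatives, and scores are all hypothetical and serve only to show the mechanics.

```python
# Hypothetical criteria weights (they sum to 1.0); higher scores are better.
weights = {"cost": 0.5, "speed": 0.3, "risk": 0.2}

# Identify the alternatives: each is scored 1-10 on every criterion (invented data).
alternatives = {
    "build in-house": {"cost": 4, "speed": 3, "risk": 6},
    "buy a package": {"cost": 6, "speed": 8, "risk": 7},
    "outsource": {"cost": 7, "speed": 6, "risk": 4},
}


def weighted_score(scores, weights):
    """Evaluate one alternative: sum of (criterion weight x criterion score)."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Evaluate every alternative, then select the one with the highest total.
totals = {name: weighted_score(scores, weights) for name, scores in alternatives.items()}
best = max(totals, key=totals.get)

for name, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {total:.1f}")
print("Selected alternative:", best)
```

Changing the weights changes the ranking, which is why the earlier steps (understanding the decision and collecting the information) matter as much as the arithmetic.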

We will return to this process in Section 1.3.

The Influence of the External and Internal Environments on the Process

To follow these decision-making processes, one must make sure that sufficient alternative solutions, including good ones, are being considered, that the consequences of using these alternatives can be reasonably predicted, and that comparisons are done properly. However, rapid changes in internal and external environments make such an evaluation process difficult for the following reasons:

• Technology, information systems, advanced search engines, and globalization result in more and more alternatives from which to choose.

• Government regulations and the need for compliance, political instability and terrorism, competition, and changing consumer demands produce more uncertainty, making it more difficult to predict consequences and the future.

• Political factors. Major decisions may be influenced by both external and internal politics. An example is the 2018 trade war on tariffs.

• Economic factors. These range from competition to the general state of the economy. These factors, both in the short and long run, need to be considered.

• Sociological and psychological factors regarding employees and customers. These need to be considered when changes are being made.

• Environment factors. The impact on the physical environment must be assessed in many decision-making situations.

Other factors include the need to make rapid decisions, the frequent and unpredictable changes that make trial-and-error learning difficult, and the potentially large costs of making mistakes.

These environments are growing more complex every day. Therefore, making decisions today is indeed a complex task. For further discussion, see Charles (2018). For how to make effective decisions under uncertainty and pressure, see Zane (2016).

Because of these trends and changes, it is nearly impossible to rely on a trial-and-error approach to management. Managers must be more sophisticated; they must use the new tools and techniques of their fields. Most of those tools and techniques are discussed in this book. Using them to support decision making can be extremely rewarding in making effective decisions. Further, many evolving tools affect the very existence of several decision-making tasks, which are being automated. This affects future demand for knowledge workers and raises many legal and societal questions.

Data and Its Analysis in Decision Making

We will see several times in this book how an entire industry can employ analytics to develop reports on what is happening, predict what is likely to happen, and then make decisions to make the best use of the situation at hand. These steps require an organization to collect and analyze vast stores of data. In general, the amount of data doubles every two years. From traditional uses in payroll and bookkeeping functions, computerized systems are now used for complex managerial areas ranging from the design and management of automated factories to the application of analytical methods for the evaluation of proposed mergers and acquisitions. Nearly all executives know that information technology is vital to their business and extensively use these technologies.
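To get a feel for what “doubles every two years” implies, the short Python sketch below computes the growth factor 2^(years / 2). It assumes, purely for illustration, that the doubling continues at a steady rate; the function name and the chosen time horizons are our own.

```python
def growth_factor(years, doubling_period_years=2.0):
    """Return how many times larger a quantity is after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)


# Illustrative horizons only; real data growth need not be this regular.
for years in (2, 4, 10):
    print(f"after {years} years: {growth_factor(years):.0f}x the starting volume")
```

Under this assumption, a data store grows 32-fold in a decade, which helps explain why the storage and analysis technologies discussed next have become so important.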

Computer applications have moved from transaction-processing and monitoring activities to problem analysis and solution applications, and much of the activity is done with cloud-based technologies, in many cases accessed through mobile devices. Analytics and BI tools such as data warehousing, data mining, online analytical processing (OLAP), dashboards, and the use of cloud-based systems for decision support are the cornerstones of today’s modern management. Managers must have high-speed, networked information systems (wired or wireless) to assist them with their most important task: making decisions. In many cases, such decisions are routinely being fully automated (see Chapter 2), eliminating the need for any managerial intervention.

Technologies for Data Analysis and Decision Support

Besides the obvious growth in hardware, software, and network capacities, some developments have clearly contributed to facilitating the growth of decision support and analytics technologies in a number of ways:

• Group communication and collaboration. Many decisions are made today by groups whose members may be in different locations. Groups can collaborate and communicate readily by using collaboration tools as well as the ubiquitous smartphones. Collaboration is especially important along the supply chain, where partners (all the way from vendors to customers) must share information. Assembling a group of decision makers, especially experts, in one place can be costly. Information systems can improve the collaboration process of a group and enable its members to be at different locations (saving travel costs). More critically, such supply chain collaboration permits manufacturers to know about the changing patterns of demand in near real time and thus react to marketplace changes faster. For comprehensive coverage and the impact of AI, see Chapters 2, 10, and 14.

• Improved data management. Many decisions involve complex computations. Data for these can be stored in different databases anywhere in the organization and even possibly outside the organization. The data may include text, sound, graphics, and video, and these can be in different languages. Many times it is necessary to transmit data quickly from distant locations. Systems today can search, store, and transmit needed data quickly, economically, securely, and transparently. See Chapters 3 and 9 and the online chapter for details.

• Managing giant data warehouses and Big Data. Large data warehouses (DWs), like the ones operated by Walmart, contain huge amounts of data. Special methods, including parallel computing and Hadoop/Spark, are available to organize, search, and mine the data. The costs related to data storage and mining are declining rapidly. Technologies that fall under the broad category of Big Data have made it possible to handle massive data coming from a variety of sources and in many different forms, which allows a very different view of organizational performance that was not possible in the past. See Chapter 9 for details.

• Analytical support. With more data and analysis technologies, more alternatives can be evaluated, forecasts can be improved, risk analysis can be performed quickly, and the views of experts (some of whom may be in remote locations) can be collected quickly and at a reduced cost. Expertise can even be derived directly from analytical systems. With such tools, decision makers can perform complex simulations, check many possible scenarios, and assess diverse impacts quickly and economically. This, of course, is the focus of several chapters in the book. See Chapters 4–7.

• Overcoming cognitive limits in processing and storing information. The human mind has only a limited ability to process and store information. People sometimes find it difficult to recall and use information in an error-free fashion due to their cognitive limits. The term cognitive limits indicates that an individual’s problem-solving capability is limited when a wide range of diverse information and knowledge is required. Computerized systems enable people to overcome their cognitive limits by quickly accessing and processing vast amounts of stored information. One way to overcome humans’ cognitive limitations is to use AI support. For coverage of cognitive aspects, see Chapter 6.

• Knowledge management. Organizations have gathered vast stores of information about their own operations, customers, internal procedures, employee interactions, and so forth through the unstructured and structured communications taking place among various stakeholders. Knowledge management systems (KMS) have become sources of formal and informal support for decision making to managers, although sometimes they may not even be called KMS. Technologies such as text analytics and IBM Watson are making it possible to generate value from such knowledge stores. (See Chapters 6 and 12 for details.)

• Anywhere, anytime support. Using wireless technology, managers can access information anytime and from any place, analyze and interpret it, and communicate with those using it. This perhaps is the biggest change that has occurred in the last few years. The speed at which information needs to be processed and converted into decisions has truly changed expectations for both consumers and businesses. These and other capabilities have been driving the use of computerized decision support since the late 1960s, especially since the mid-1990s. The growth of mobile technologies, social media platforms, and analytical tools has enabled a different level of information systems (IS) to support managers. This growth in providing data-driven support for any decision extends not just to managers but also to consumers. We will first study an overview of technologies that have been broadly referred to as BI. From there we will broaden our horizons to introduce various types of analytics.

• Innovation and artificial intelligence. Because of the complexities in the decision-making process discussed earlier and the environment surrounding the process, a more innovative approach is frequently needed. A major facilitation of innovation is provided by AI. Almost every step in the decision-making process can be influenced by AI. AI is also integrated with analytics, creating synergy in making decisions (Section 1.8).

SECTION 1.2 REVIEW QUESTIONS

1. Why is it difficult to make organizational decisions?

2. Describe the major steps in the decision-making process.

3. Describe the major external environments that can impact decision making.

4. What are some of the key system-oriented trends that have fostered IS-supported decision making to a new level?

5. List some capabilities of information technologies that can facilitate managerial decision making.

1.3 DECISION-MAKING PROCESSES AND COMPUTERIZED DECISION SUPPORT FRAMEWORK

In this section, we focus on some classical decision-making fundamentals and in more detail on the decision-making process. These two concepts will help us ground much of what we will learn in terms of analytics, data science, and artificial intelligence.

Decision making is a process of choosing among two or more alternative courses of action for the purpose of attaining one or more goals. According to Simon (1977), managerial decision making is synonymous with the entire management process. Consider the important managerial function of planning. Planning involves a series of decisions: What should be done? When? Where? Why? How? By whom? Managers set goals, or plan; hence, planning implies decision making. Other managerial functions, such as organizing and controlling, also involve decision making.

Simon’s Process: Intelligence, Design, and Choice

It is advisable to follow a systematic decision-making process. Simon (1977) said that this involves three major phases: intelligence, design, and choice. He later added a fourth phase: implementation. Monitoring can be considered a fifth phase, a form of feedback. However, we view monitoring as the intelligence phase applied to the implementation phase. Simon’s model is the most concise and yet complete characterization of rational decision making. A conceptual picture of the decision-making process is shown in Figure 1.1. It is also illustrated as a decision support approach using modeling.

There is a continuous flow of activity from intelligence to design to choice (see the solid lines in Figure 1.1), but at any phase, there may be a return to a previous phase (feedback). Modeling is an essential part of this process. The seemingly chaotic nature of following a haphazard path from problem discovery to solution via decision making can be explained by these feedback loops.

The decision-making process starts with the intelligence phase; in this phase, the decision maker examines reality and identifies and defines the problem. Problem ownership is established as well. In the design phase, a model that represents the system is constructed. This is done by making assumptions that simplify reality and by writing down the relationships among all the variables. The model is then validated, and criteria are determined in a principle of choice for evaluation of the alternative courses of action that are identified. Often, the process of model development identifies alternative solutions and vice versa.

The choice phase includes the selection of a proposed solution to the model (not necessarily to the problem it represents). This solution is tested to determine its viability. When the proposed solution seems reasonable, we are ready for the last phase: implementation of the decision (not necessarily of a system). Successful implementation results in solving the real problem. Failure leads to a return to an earlier phase of the process. In fact, we can return to an earlier phase during any of the latter three phases. The decision-making situations described in the opening vignette follow Simon’s four-phase model, as do almost all other decision-making situations.
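The forward flow and the feedback loops described above can be pictured as a small state machine. The following Python sketch is only an illustration: the names Phase, next_phase, and trace, and the idea of a single fixed fallback phase (return_to), are assumptions made for this example and are not part of Simon's model.

```python
from enum import Enum


class Phase(Enum):
    INTELLIGENCE = 1
    DESIGN = 2
    CHOICE = 3
    IMPLEMENTATION = 4


def next_phase(current, succeeded, return_to=Phase.INTELLIGENCE):
    """Return the phase that follows `current`, or None when the process ends.

    succeeded: whether the current phase produced an acceptable result.
    return_to: hypothetical fallback phase used when a phase fails.
    """
    if not succeeded:
        # Feedback loop: never jump forward, only back (or stay in place).
        return min(return_to, current, key=lambda p: p.value)
    if current is Phase.IMPLEMENTATION:
        return None  # successful implementation solves the real problem
    return Phase(current.value + 1)


def trace(outcomes, return_to=Phase.INTELLIGENCE):
    """List the phases visited for a sequence of success/failure outcomes."""
    phase = Phase.INTELLIGENCE
    visited = [phase]
    for ok in outcomes:
        phase = next_phase(phase, ok, return_to)
        if phase is None:
            break
        visited.append(phase)
    return visited


# The proposed solution fails its test in the choice phase once,
# so the process returns to design before moving forward again.
path = trace([True, True, False, True, True, True], return_to=Phase.DESIGN)
print([p.name for p in path])
```

In practice the phase to return to depends on what went wrong, so a fixed return_to is a simplification.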

The Intelligence Phase: Problem (or Opportunity) Identification

The intelligence phase begins with the identification of organizational goals and objectives related to an issue of concern (e.g., inventory management, job selection, lack of or incorrect Web presence) and determination of whether they are being met. Problems occur because of dissatisfaction with the status quo. Dissatisfaction is the result of a difference between what people desire (or expect) and what is occurring. In this first phase, a decision maker attempts to determine whether a problem exists, identify its symptoms, determine its magnitude, and explicitly define it. Often, what is described as a problem (e.g., excessive costs) may be only a symptom (i.e., measure) of a problem (e.g., improper inventory levels). Because real-world problems are usually complicated by many interrelated factors, it is sometimes difficult to distinguish between the symptoms and the real problem. New opportunities and problems certainly may be uncovered while investigating the causes of symptoms.

FIGURE 1.1 The Decision-Making/Modeling Process. [The diagram links Reality to three phases and their activities. Intelligence: organization objectives, search and scanning procedures, data collection, problem identification, problem ownership, problem classification, problem statement. Design: formulate a model, set criteria for choice, search for alternatives, predict and measure outcomes. Choice: solution to the model, sensitivity analysis, selection of the best (good) alternative(s), plan for implementation. Implementation of the solution ends in success or failure. Connecting labels: simplification and assumptions; problem statement; alternatives; validation of the model; verification and testing of the proposed solution.]

The existence of a problem can be determined by monitoring and analyzing the organization’s productivity level. The measurement of productivity and the construction of a model are based on real data. The collection of data and the estimation of future data are among the most difficult steps in the analysis.

ISSUES IN DATA COLLECTION The following are some issues that may arise during data collection and estimation and thus plague decision makers:

• Data are not available. As a result, the model is made with and relies on potentially inaccurate estimates.

• Obtaining data may be expensive.

• Data may not be accurate or precise enough.

• Data estimation is often subjective.

• Data may be insecure.

• Important data that influence the results may be qualitative (soft).

• There may be too many data (i.e., information overload).

• Outcomes (or results) may occur over an extended period. As a result, revenues, expenses, and profits will be recorded at different points in time. To overcome this difficulty, a present-value approach can be used if the results are quantifiable.

• It is assumed that future data will be similar to historical data. If this is not the case, the nature of the change has to be predicted and included in the analysis.
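The present-value approach mentioned in the list above can be sketched in a few lines of Python. The cash flows and the 10 percent discount rate are made-up numbers chosen only for illustration, and the function name present_value is an assumption of this example.

```python
def present_value(cash_flows, rate):
    """Discount (period, amount) pairs back to period 0 at a constant rate."""
    return sum(amount / (1 + rate) ** period for period, amount in cash_flows)


# Hypothetical project: spend 1,000 now, receive 600 at the end of
# each of the next two periods, discounted at 10 percent per period.
flows = [(0, -1000.0), (1, 600.0), (2, 600.0)]
npv = present_value(flows, rate=0.10)
print(round(npv, 2))
```

Because all amounts are expressed in period-0 terms, revenues and expenses recorded at different points in time become directly comparable.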

When the preliminary investigation is completed, it is possible to determine whether a problem really exists, where it is located, and how significant it is. A key issue is whether an information system is reporting a problem or only the symptoms of a problem. For example, if reports indicate that sales are down, there is a problem, but the situation, no doubt, is symptomatic of the problem. It is critical to know the real problem. Sometimes it may be a problem of perception, incentive mismatch, or organizational processes rather than a poor decision model.

To illustrate why it is important to identify the problem correctly, we provide a classical example in Application Case 1.1.

Application Case 1.1 Making Elevators Go Faster!

This story has been reported in numerous places and has almost become a classic example to explain the need for problem identification. Ackoff (as cited in Larson, 1987) described the problem of managing complaints about slow elevators in a tall hotel tower. After trying many solutions for reducing the complaints (staggering elevators to go to different floors, adding operators, and so on), the management determined that the real problem was not about the actual waiting time but rather the perceived waiting time. So the solution was to install full-length mirrors on elevator doors on each floor. As Hesse and Woolsey (1975) put it, “The women would look at themselves in the mirrors and make adjustments, while the men would look at the women, and before they knew it, the elevator was there.” Once the perceived waiting time was reduced, the problem went away. Baker and Cameron (1996) give several other examples of distractions, including lighting and displays, that organizations use to reduce perceived waiting time. If the real problem is identified as perceived waiting time, it can make a big difference in the proposed solutions and their costs. For example, full-length mirrors probably cost a whole lot less than adding an elevator!

Sources: Based on J. Baker and M. Cameron. (1996, September). “The Effects of the Service Environment on Affect and Consumer Perception of Waiting Time: An Integrative Review and Research Propositions,” Journal of the Academy of Marketing Science, 24, pp. 338–349; R. Hesse and G. Woolsey (1975). Applied Management Science: A Quick and Dirty Approach. Chicago, IL: SRA Inc; R. C. Larson. (1987, November/December). “Perspectives on Queues: Social Justice and the Psychology of Queuing.” Operations Research, 35(6), pp. 895–905.

Questions for Case 1.1

1. Why is this example relevant to decision making?

2. Relate this situation to the intelligence phase of decision making.

PROBLEM CLASSIFICATION Problem classification is the conceptualization of a problem in an attempt to place it in a definable category, possibly leading to a standard solution approach. An important approach classifies problems according to the degree of structuredness evident in them. This ranges from totally structured (i.e., programmed) to totally unstructured (i.e., unprogrammed).

PROBLEM DECOMPOSITION Many complex problems can be divided into subproblems. Solving the simpler subproblems may help in solving a complex problem. Also, seemingly poorly structured problems sometimes have highly structured subproblems. Just as a semistructured problem results when some phases of decision making are structured and others are unstructured, a decision-making problem is itself semistructured when some of its subproblems are structured and others are unstructured. As a decision support system is developed and the decision maker and development staff learn more about the problem, it gains structure.

PROBLEM OWNERSHIP In the intelligence phase, it is important to establish problem ownership. A problem exists in an organization only if someone or some group takes the responsibility for attacking it and if the organization has the ability to solve it. The assignment of authority to solve the problem is called problem ownership. For example, a manager may feel that he or she has a problem because interest rates are too high. Because interest rate levels are determined at the national and international levels and most managers can do nothing about them, high interest rates are the problem of the government, not a problem for a specific company to solve. The problem that companies actually face is how to operate in a high interest-rate environment. For an individual company, the interest rate level should be handled as an uncontrollable (environmental) factor to be predicted.

When problem ownership is not established, either someone is not doing his or her job or the problem at hand has yet to be identified as belonging to anyone. It is then important for someone to either volunteer to own it or assign it to someone.

The intelligence phase ends with a formal problem statement.

The Design Phase

The design phase involves finding or developing and analyzing possible courses of action. These include understanding the problem and testing solutions for feasibility. A model of the decision-making problem is constructed, tested, and validated. Let us first define a model.

MODELS A major characteristic of computerized decision support and many BI tools (notably those of business analytics) is the inclusion of at least one model. The basic idea is to perform the analysis on a model of reality rather than on the real system. A model is a simplified representation or abstraction of reality. It is usually simplified because reality is too complex to describe exactly and because much of the complexity is actually irrelevant in solving a specific problem.

Modeling involves conceptualizing a problem and abstracting it to quantitative and/or qualitative form. For a mathematical model, the variables are identified and their mutual relationships are established. Simplifications are made, whenever necessary, through assumptions. For example, a relationship between two variables may be assumed to be linear even though in reality there may be some nonlinear effects. A proper balance between the level of model simplification and the representation of reality must be obtained because of the cost–benefit trade-off. A simpler model leads to lower development costs, easier manipulation, and a faster solution but is less representative of the real problem and can produce inaccurate results. However, a simpler model generally requires fewer data, or the data are aggregated and easier to obtain.
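To make the trade-off concrete, the following Python sketch fits a straight line to data that actually follow a quadratic relationship and reports how far the simplified model is from reality. The data, the quadratic "true" relationship, and the helper name fit_line are invented for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical "reality": the response grows with the square of x.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

# Simplifying assumption: treat the relationship as linear.
slope, intercept = fit_line(xs, ys)
errors = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
worst_error = max(abs(e) for e in errors)
print(slope, intercept, worst_error)
```

The linear model needs only two parameters and is easy to manipulate, but its error is the price paid for that simplicity.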

The Choice Phase

Choice is the critical act of decision making. The choice phase is the one in which the actual decision and the commitment to follow a certain course of action are made. The boundary between the design and choice phases is often unclear because certain activi- ties can be performed during both of them and because the decision maker can return frequently from choice activities to design activities (e.g., generate new alternatives while performing an evaluation of existing ones). The choice phase includes the search for, evaluation of, and recommendation of an appropriate solution to a model. A solution to a model is a specific set of values for the decision variables in a selected alternative. Choices can be evaluated as to their viability and profitability.

Each alternative must be evaluated. If an alternative has multiple goals, they must all be examined and balanced against each other. Sensitivity analysis is used to determine the robustness of any given alternative; slight changes in the parameters should ideally lead to slight or no changes in the alternative chosen. What-if analysis is used to explore major changes in the parameters. Goal seeking helps a manager determine values of the decision variables to meet a specific objective. These topics are addressed in Chapter 8.
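The three techniques can be illustrated on a deliberately simple profit model. Everything in the following Python sketch is an assumption made for the example: the linear profit formula, the parameter values, and the function names sensitivity and goal_seek (the latter uses bisection and assumes the model output increases with the decision variable).

```python
def profit(units, price=25.0, unit_cost=15.0, fixed_cost=40_000.0):
    """Toy model: contribution margin times volume, minus fixed cost."""
    return (price - unit_cost) * units - fixed_cost


def sensitivity(model, base_kwargs, name, pct=0.01):
    """Relative change in the result for a small relative change in one parameter."""
    base_result = model(**base_kwargs)
    bumped = dict(base_kwargs, **{name: base_kwargs[name] * (1 + pct)})
    return (model(**bumped) - base_result) / abs(base_result)


def goal_seek(f, target, lo, hi, tol=1e-6):
    """Bisection search for x in [lo, hi] with f(x) == target (f increasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


base = dict(units=5000, price=25.0, unit_cost=15.0, fixed_cost=40_000.0)

# Sensitivity analysis: a 1 percent price change and its effect on profit.
price_sensitivity = sensitivity(profit, base, "price")

# What-if analysis: a major change, here demand falling to 3,500 units.
what_if_profit = profit(**dict(base, units=3500))

# Goal seeking: how many units must be sold to break even (profit of 0)?
break_even_units = goal_seek(lambda u: profit(u), target=0.0, lo=0.0, hi=10_000.0)

print(price_sensitivity, what_if_profit, round(break_even_units))
```

In this toy model a 1 percent price change moves profit by 12.5 percent, which suggests that a decision based on it would not be robust to pricing errors.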

The Implementation Phase

In The Prince, Machiavelli astutely noted some 500 years ago that there was “nothing more difficult to carry out, nor more doubtful of success, nor more dangerous to handle, than to initiate a new order of things.” The implementation of a proposed solution to a problem is, in effect, the initiation of a new order of things or the introduction of change. And change must be managed. User expectations must be managed as part of change management.

The definition of implementation is somewhat complicated because implementation is a long, involved process with vague boundaries. Simplistically, the implementation phase involves putting a recommended solution to work, not necessarily implementing a computer system. Many generic implementation issues, such as resistance to change, degree of support of top management, and user training, are important in dealing with information system–supported decision making. Indeed, many previous technology-related waves (e.g., business process reengineering [BPR] and knowledge management) have faced mixed results mainly because of change management challenges and issues. Management of change is almost an entire discipline in itself, so we recognize its importance and encourage readers to focus on it independently. Implementation also includes a thorough understanding of project management. The importance of project management goes far beyond analytics, so the last few years have witnessed a major growth in certification programs for project managers. A very popular certification now is the Project Management Professional (PMP). See pmi.org for more details.

Implementation must also involve collecting and analyzing data to learn from the previous decisions and improve the next decision. Although analysis of data is usually conducted to identify the problem and/or the solution, analytics should also be employed in the feedback process. This is especially true for any public policy decisions. We need to be sure that the data being used for problem identification is valid. Sometimes people find this out only after the implementation phase.

The decision-making process, though conducted by people, can be improved with computer support, which is introduced next.

The Classical Decision Support System Framework

The early definitions of decision support system (DSS) identified it as a system intended to support managerial decision makers in semistructured and unstructured decision situations. DSS was meant to be an adjunct to decision makers, extending their capabilities but not replacing their judgment. DSS was aimed at decisions that required judgment or at decisions that could not be completely supported by algorithms. Not specifically stated but implied in the early definitions was the notion that the system would be computer based, would operate interactively online, and preferably would have graphical output capabilities, now simplified via browsers and mobile devices.

An early framework for computerized decision support includes several major concepts that are used in forthcoming sections and chapters of this book. Gorry and Scott-Morton created and used this framework in the early 1970s, and the framework then evolved into a new technology called DSS.

Gorry and Scott-Morton (1971) proposed a framework that is a 3-by-3 matrix, as shown in Figure 1.2. The two dimensions are the degree of structuredness and the types of control.

DEGREE OF STRUCTUREDNESS The left side of Figure 1.2 is based on Simon’s (1977) idea that decision-making processes fall along a continuum that ranges from highly structured (sometimes called programmed) to highly unstructured (i.e., non-programmed) decisions. Structured processes are routine and typically repetitive problems for which standard solution methods exist. Unstructured processes are fuzzy, complex problems for which there are no cut-and-dried solution methods.

An unstructured problem is one where the articulation of the problem or the solution approach may be unstructured in itself. In a structured problem, the procedures for obtaining the best (or at least a good enough) solution are known. Whether the problem involves finding an appropriate inventory level or choosing an optimal investment strategy, the objectives are clearly defined. Common objectives are cost minimization and profit maximization.

Semistructured problems fall between structured and unstructured problems, having some structured elements and some unstructured elements. Keen and Scott-Morton (1978) mentioned trading bonds, setting marketing budgets for consumer products, and performing capital acquisition analysis as semistructured problems.

TYPES OF CONTROL The second half of the Gorry and Scott-Morton (1971) framework (refer to Figure 1.2) is based on Anthony’s (1965) taxonomy, which defines three broad categories that encompass all managerial activities: strategic planning, which involves defining long-range goals and policies for resource allocation; management control, the acquisition and efficient use of resources in the accomplishment of organizational goals; and operational control, the efficient and effective execution of specific tasks.

THE DECISION SUPPORT MATRIX Anthony’s (1965) and Simon’s (1977) taxonomies are combined in the nine-cell decision support matrix shown in Figure 1.2. The initial purpose of this matrix was to suggest different types of computerized support to different cells in the matrix. Gorry and Scott-Morton (1971) suggested, for example, that for making semistructured decisions and unstructured decisions, conventional management information systems (MIS) and management science (MS) tools are insufficient. Human intellect and a different approach to computer technologies are necessary. They proposed the use of a supportive information system, which they called a DSS.

Note that the more structured and operational control-oriented tasks (such as those in cells 1, 2, and 4 of Figure 1.2) are usually performed by lower-level managers, whereas the tasks in cells 6, 8, and 9 are the responsibility of top executives or highly trained specialists.
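The nine-cell matrix can be represented as a small lookup structure. The Python sketch below assumes that cells are numbered row by row (structured decisions first, operational control first), which is how the numbers appear to be laid out in Figure 1.2; the names cell_number and typical_owner are hypothetical.

```python
DECISION_TYPES = ("structured", "semistructured", "unstructured")
CONTROL_TYPES = ("operational control", "managerial control", "strategic planning")


def cell_number(decision_type, control_type):
    """Cell number (1 to 9), assuming row-major numbering of the matrix."""
    row = DECISION_TYPES.index(decision_type)
    col = CONTROL_TYPES.index(control_type)
    return 3 * row + col + 1


def typical_owner(cell):
    """Who usually handles the tasks in a cell, per the note above."""
    if cell in {1, 2, 4}:
        return "lower-level managers"
    if cell in {6, 8, 9}:
        return "top executives or highly trained specialists"
    return "not specified in the text"


cell = cell_number("semistructured", "strategic planning")
print(cell, typical_owner(cell))
```

Cells 3, 5, and 7 lie on the diagonal between the two groups, and the text does not assign them to a managerial level.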

COMPUTER SUPPORT FOR STRUCTURED DECISIONS Since the 1960s, computers have supported structured and some semistructured decisions, especially those that involve operational and managerial control. Operational and managerial control decisions are made in all functional areas, especially in finance and production (i.e., operations) management.

[Figure 1.2 is a matrix that crosses type of decision (rows) with type of control (columns). Cell numbers are shown in parentheses.]

Structured decisions
• Operational control (1): monitoring accounts receivable, monitoring accounts payable, placing order entries
• Managerial control (2): analyzing budget, forecasting short-term, reporting on personnel, making or buying
• Strategic planning (3): managing finances, monitoring investment portfolio, locating warehouse, monitoring distribution systems

Semistructured decisions
• Operational control (4): scheduling production, controlling inventory
• Managerial control (5): evaluating credit, preparing budget, laying out plant, scheduling project, designing reward system, categorizing inventory
• Strategic planning (6): building a new plant, planning mergers and acquisitions, planning new products, planning compensation, providing quality assurance, establishing human resources policies, planning inventory

Unstructured decisions
• Operational control (7): buying software, approving loans, operating a help desk, selecting a cover for a magazine
• Managerial control (8): negotiating, recruiting an executive, buying hardware, lobbying
• Strategic planning (9): planning research and development, developing new technologies, planning social responsibility

FIGURE 1.2 Decision Support Frameworks.

Structured problems, which are encountered repeatedly, have a high level of structure, as their name suggests. It is therefore possible to abstract, analyze, and classify them into specific categories. For example, a make-or-buy decision is one category. Other examples of categories are capital budgeting, allocation of resources, distribution, procurement, planning, and inventory control decisions. For each category of decision, an easy-to-apply prescribed model and solution approach have been developed, generally as quantitative formulas. Therefore, it is possible to use a scientific approach for automating portions of managerial decision making. Solutions to many structured problems can be fully automated (see Chapters 2 and 12).
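As one illustration of a prescribed quantitative formula for a structured decision, consider the classic economic order quantity (EOQ) model for inventory control. EOQ is not discussed in this section; it is brought in here only as a familiar example, and the demand and cost figures below are made up.

```python
import math


def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ: the order size that minimizes ordering plus holding cost.

    annual_demand: units required per year
    order_cost:    fixed cost of placing one order
    holding_cost:  cost of holding one unit in stock for one year
    """
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


# Hypothetical item: 1,200 units per year, 50 per order, 3 per unit-year.
eoq = economic_order_quantity(annual_demand=1200, order_cost=50.0, holding_cost=3.0)
orders_per_year = 1200 / eoq
print(eoq, orders_per_year)
```

Because the inputs and the objective are fully specified, a rule such as this can be embedded in a system and applied automatically each time the decision recurs.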

COMPUTER SUPPORT FOR UNSTRUCTURED DECISIONS Unstructured problems can be only partially supported by standard computerized quantitative methods. It is usually necessary to develop customized solutions. However, such solutions may benefit from data and information generated from corporate or external data sources. Intuition and judgment may play a large role in these types of decisions, as may computerized communication and collaboration technologies, as well as cognitive computing (Chapter 6) and deep learning (Chapter 5).

COMPUTER SUPPORT FOR SEMISTRUCTURED PROBLEMS Solving semistructured problems may involve a combination of standard solution procedures and human judgment. Management science can provide models for the portion of a decision-making problem that is structured. For the unstructured portion, a DSS can improve the quality of the information on which the decision is based by providing, for example, not only a single solution, but also a range of alternative solutions along with their potential impacts. These capabilities help managers to better understand the nature of problems and, thus, to make better decisions.

DECISION SUPPORT SYSTEM: CAPABILITIES The early definitions of DSS identified it as a system intended to support managerial decision makers in semistructured and unstructured decision situations. DSS was meant to be an adjunct to decision makers, extending their capabilities but not replacing their judgment. It was aimed at decisions that required judgment or at decisions that could not be completely supported by algorithms. Not specifically stated but implied in the early definitions was the notion that the system would be computer based, would operate interactively online, and preferably would have graphical output capabilities, now simplified via browsers and mobile devices.

A DSS Application

A DSS is typically built to support the solution of a certain problem or to evaluate an opportunity. This is a key difference between DSS and BI applications. In a very strict sense, business intelligence (BI) systems monitor situations and identify problems and/or opportunities using analytic methods. Reporting plays a major role in BI; the user generally must identify whether a particular situation warrants attention and then can apply analytical methods. Again, although models and data access (generally through a data warehouse) are included in BI, a DSS may have its own databases and is developed to solve a specific problem or set of problems. Such systems are therefore called DSS applications.

Formally, a DSS is an approach (or methodology) for supporting decision making. It uses an interactive, flexible, adaptable computer-based information system (CBIS) especially developed for supporting the solution to a specific unstructured management problem. It uses data, provides an easy user interface, and can incorporate the decision maker’s own insights. In addition, a DSS includes models and is developed (possibly by end users) through an interactive and iterative process. It can support all phases of decision making and may include a knowledge component. Finally, a DSS can be used by a single user or can be Web based for use by many people at several locations.

THE CHARACTERISTICS AND CAPABILITIES OF DSS Because there is no consensus on exactly what a DSS is, there is obviously no agreement on the standard characteristics and capabilities of DSS. The capabilities in Figure 1.3 constitute an ideal set, some members of which are described in the definitions of DSS and illustrated in the application cases.

The key characteristics and capabilities of DSS (as shown in Figure 1.3) are as follows:

1. Supports decision makers, mainly in semistructured and unstructured situations, by bringing together human judgment and computerized information. Such problems cannot be solved (or cannot be solved conveniently) by other computerized systems or through use of standard quantitative methods or tools. Generally, these problems gain structure as the DSS is developed. Even some structured problems have been solved by DSS.

2. Supports all managerial levels, ranging from top executives to line managers.

3. Supports individuals as well as groups. Less-structured problems often require the involvement of individuals from different departments and organizational levels or even from different organizations. DSS supports virtual teams through collaborative Web tools. DSS has been developed to support individual and group work as well as to support individual decision making and groups of decision makers working somewhat independently.

4. Supports interdependent and/or sequential decisions. The decisions may be made once, several times, or repeatedly.

5. Supports all phases of the decision-making process: intelligence, design, choice, and implementation.

6. Supports a variety of decision-making processes and styles.

7. Is flexible, so users can add, delete, combine, change, or rearrange basic elements. The decision maker should be reactive, able to confront changing conditions quickly, and able to adapt the DSS to meet these changes. It is also flexible in that it can be readily modified to solve other, similar problems.

8. Is user-friendly, with strong graphical capabilities and a natural language interactive human-machine interface, all of which can greatly increase the effectiveness of DSS. Most new DSS applications use Web-based interfaces or mobile platform interfaces.

9. Improves the effectiveness of decision making (e.g., accuracy, timeliness, quality) rather than its efficiency (e.g., the cost of making decisions). When DSS is deployed, decision making often takes longer, but the decisions are better.

10. Provides complete control by the decision maker over all steps of the decision-making process in solving a problem. A DSS specifically aims to support, not to replace, the decision maker.

11. Enables end users to develop and modify simple systems by themselves. Larger systems can be built with assistance from IS specialists. Spreadsheet packages have been utilized in developing simpler systems. OLAP and data mining software in conjunction with data warehouses enable users to build fairly large, complex DSS.

12. Provides models that are generally utilized to analyze decision-making situations. The modeling capability enables experimentation with different strategies under different configurations.

13. Provides access to a variety of data sources, formats, and types, including GIS, multimedia, and object-oriented data.

14. Can be employed as a stand-alone tool used by an individual decision maker in one location or distributed throughout an organization and in several organizations along the supply chain. It can be integrated with other DSS and/or applications, and it can be distributed internally and externally, using networking and Web technologies.

FIGURE 1.3 Key Characteristics and Capabilities of DSS. [The diagram arranges these 14 characteristics around a central Decision Support System (DSS) node.]

These key DSS characteristics and capabilities allow decision makers to make better, more consistent decisions in a timely manner, and they are provided by the major DSS components.

Components of a Decision Support System

A DSS application can be composed of a data management subsystem, a model management subsystem, a user interface subsystem, and a knowledge-based management subsystem. We show these in Figure 1.4.

The Data Management Subsystem

The data management subsystem includes a database that contains relevant data for the situation and is managed by software called the database management system (DBMS). DBMS is used as both singular and plural (system and systems) terms, as are many other acronyms in this text. The data management subsystem can be interconnected with the corporate data warehouse, a repository of corporate data relevant to decision making.


Usually, the data are stored or accessed via a database Web server. The data management subsystem is composed of the following elements:

• DSS database
• Database management system
• Data directory
• Query facility

Many of the BI or descriptive analytics applications derive their strength from the data management side of the subsystems.
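The four elements of the data management subsystem can be pictured with a minimal Python sketch. It assumes an in-memory SQLite database standing in for the DSS database, with SQLite itself playing the role of the DBMS. The `sales` table, its columns, the sample rows, and the function names are all hypothetical and serve only as illustration.

```python
import sqlite3

# DSS database: an in-memory SQLite database stands in for it here.
# The DBMS role is played by SQLite itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [  # hypothetical sample data
        ("East", "Q1", 120.0),
        ("East", "Q2", 150.0),
        ("West", "Q1", 90.0),
        ("West", "Q2", 110.0),
    ],
)


def data_directory(connection):
    """Data directory: a catalog of the tables and columns that are available."""
    catalog = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        catalog[table] = [col[1] for col in columns]  # col[1] is the column name
    return catalog


def revenue_by_region(connection):
    """Query facility: answer a decision maker's request with a SQL query."""
    rows = connection.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


print(data_directory(conn))
print(revenue_by_region(conn))
```

In a production DSS the database would more likely sit behind a database Web server or be fed from the corporate data warehouse; the sketch only shows how the four elements relate to one another.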

The Model Management Subsystem

The model management subsystem is the component that includes financial, statistical, management science, or other quantitative models that provide the system’s analytical capabilities and appropriate software management. Modeling languages for building custom models are also included. This software is often called a model base management system (MBMS). This component can be connected to corporate or external storage of models. Model solution methods and management systems are implemented in Web development systems (such as Java) to run on application servers. The model management subsystem of a DSS is composed of the following elements:

• Model base
• MBMS
• Modeling language
• Model directory
• Model execution, integration, and command processor

FIGURE 1.4 Schematic View of DSS.

Because DSS deals with semistructured or unstructured problems, it is often necessary to customize models, using programming tools and languages. Some examples of these are .NET Framework languages, C++, and Java. OLAP software may also be used to work with models in data analysis. Even languages for simulations such as Arena and statistical packages such as those of SPSS offer modeling tools developed through the use of a proprietary programming language. For small- and medium-sized DSS or for less complex ones, a spreadsheet (e.g., Excel) is usually used. We use Excel for several examples in this book. Application Case 1.2 describes a spreadsheet-based DSS.
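As a rough illustration of how a model base, a model directory, and model execution fit together, the following Python sketch registers one simple quantitative model and runs a what-if experiment with it. The class and function names, the break-even model, and the cost figures are hypothetical; a real MBMS would offer far more (integration, a command processor, a modeling language).

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Model:
    name: str
    description: str
    run: Callable[..., float]


class ModelBase:
    """A toy model base with a directory and an execution entry point."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def register(self, model: Model) -> None:
        self._models[model.name] = model

    def directory(self) -> Dict[str, str]:
        """Model directory: list the available models and what they do."""
        return {name: m.description for name, m in self._models.items()}

    def execute(self, name: str, **params: float) -> float:
        """Model execution: look up a model by name and run it."""
        return self._models[name].run(**params)


def break_even_units(fixed_cost: float, price: float, unit_cost: float) -> float:
    """Units that must be sold for revenue to cover fixed cost."""
    return fixed_cost / (price - unit_cost)


base = ModelBase()
base.register(
    Model("break_even", "Units needed to cover fixed cost", break_even_units)
)

# What-if experiment: how does the break-even volume change with price?
results = {
    price: base.execute(
        "break_even", fixed_cost=10_000.0, price=price, unit_cost=15.0
    )
    for price in (20.0, 25.0)
}
print(base.directory())
print(results)
```

Keeping models behind a common `execute` call is one possible design; it lets the user interface run any registered model with different parameters without knowing how the model is implemented.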

The User Interface Subsystem

The user communicates with and commands the DSS through the user interface subsystem. The user is considered part of the system. Researchers assert that some of the unique contributions of DSS are derived from the intensive interaction between the computer and the decision maker. A difficult user interface is one of the major reasons that managers do not use computers and quantitative analyses as much as they could, given the availability of these technologies. The Web browser provided a familiar, consistent GUI structure for many DSS in the 2000s. For locally used DSS, a spreadsheet also provides a familiar user interface. The Web browser has been recognized as an effective DSS GUI because it is flexible, user-friendly, and a gateway to almost all sources of necessary information and data. Essentially, Web browsers have led to the development of portals and dashboards, which serve as the front end for many DSS.

Application Case 1.2 SNAP DSS Helps OneNet Make Telecommunications Rate Decisions

Telecommunications network services to educational institutions and government entities are typically provided by a mix of private and public organizations. Many states in the United States have one or more state agencies that are responsible for providing network services to schools, colleges, and other state agencies. One example of such an agency is OneNet in Oklahoma. OneNet is a division of the Oklahoma State Regents for Higher Education and is operated in cooperation with the Office of State Finance.

Usually agencies such as OneNet operate as an enterprise-type fund. They must recover their costs through billing their clients and/or by justifying appropriations directly from the state legislatures. This cost recovery should occur through a pricing mechanism that is efficient, simple to implement, and equitable. This pricing model typically needs to recognize many factors: convergence of voice, data, and video traffic on the same infrastructure; diversity of user base in terms of educational institutions and state agencies; diversity of applications in use by state clients from e-mail to videoconferences, IP telephoning, and distance learning; recovery of current costs as well as planning for upgrades and future developments; and leverage of the shared infrastructure to enable further economic development and collaborative work across the state that leads to innovative uses of OneNet.

These considerations led to the development of a spreadsheet-based model. The system, SNAP-DSS, or Service Network Application and Pricing (SNAP)-based DSS, was developed in Microsoft Excel 2007 and used the VBA programming language.

The SNAP-DSS offers OneNet the ability to select the rate card options that best fit the preferred pricing strategies by providing a real-time, user-friendly, graphical user interface (GUI). In addition, the SNAP-DSS not only illustrates the influence of the changes in the pricing factors on each rate card option but also allows the user to analyze various rate card options in different scenarios using different parameters. This model has been used by OneNet financial planners to gain insights into their customers and analyze many what-if scenarios of different rate plan options.

Source: Based on J. Chongwatpol and R. Sharda. (2010, December). “SNAP: A DSS to Analyze Network Service Pricing for State Networks.” Decision Support Systems, 50(1), pp. 347–359.

Explosive growth in portable devices, including smartphones and tablets, has changed the DSS user interfaces as well. These devices allow either handwritten input or typed input from internal or external keyboards. Some DSS user interfaces utilize natural language input (i.e., text in a human language) so that the users can easily express themselves in a meaningful way. Cell phone inputs through short message service (SMS) or chatbots are becoming more common for at least some consumer DSS-type applications. For example, one can send an SMS request for search on any topic to GOOGL (46645). Such capabilities are most useful in locating nearby businesses, addresses, or phone numbers, but they can also be used for many other decision support tasks. For example, users can find definitions of words by entering the word “define” followed by a word, such as “define extenuate.” Some of the other capabilities include

• Price lookups: “Price 64GB iPhone X.”
• Currency conversions: “10 US dollars in euros.”
• Sports scores and game times: Just enter the name of a team (“NYC Giants”), and Google SMS will send the most recent game’s score and the date and time of the next match.

This type of SMS-based search capability is also available for other search engines such as Microsoft’s search engine Bing.

With the emergence of smartphones such as Apple’s iPhone and Android smartphones from many vendors, many companies are developing apps to provide purchasing-decision support. For example, Amazon’s app allows a user to take a picture of any item in a store (or wherever) and send it to Amazon.com. Amazon.com’s graphics-understanding algorithm tries to match the image to a real product in its databases and sends the user a page similar to Amazon.com’s product info pages, allowing users to perform price comparisons in real time. Millions of other apps have been developed that provide consumers support for decision making on finding and selecting stores/restaurants/service providers on the basis of location, recommendations from others, and especially from your own social circles. Search activities noted in the previous paragraph are also largely accomplished now through apps provided by each search provider.

Voice input for these devices and the new smart speakers such as Amazon Echo (Alexa) and Google Home is common and fairly accurate (but not perfect). When voice input with accompanying speech-recognition software (and readily available text-to-speech software) is used, verbal instructions with accompanying actions and outputs can be invoked. These are readily available for DSS and are incorporated into the portable devices described earlier. Examples of voice input that can be used for a general-purpose DSS are Apple’s Siri application and Google’s Google Now service. For example, a user can give her or his zip code and say “pizza delivery.” These devices provide the search results and can even place a call to a business.

The Knowledge-Based Management Subsystem

Many of the user interface developments are closely tied to the major new advances in their knowledge-based systems. The knowledge-based management subsystem can support any of the other subsystems or act as an independent component. It provides intelligence to augment the decision maker’s own or to help understand a user’s query so as to provide a consistent answer. It can be interconnected with the organization’s knowledge repository (part of a KMS), which is sometimes called the organizational knowledge base, or connect to thousands of external knowledge sources. Many artificial intelligence methods have been implemented in the current generation of learning systems and are easy to integrate into the other DSS components. One of the most widely publicized knowledge-based DSS is IBM’s Watson, which was introduced in the opening vignette and will be described in more detail later.

This section has covered the history and progression of Decision Support Systems in brief. In the next section we discuss evolution of this support to business intelligence, analytics, and data science.


SECTION 1.3 REVIEW QUESTIONS

1. List and briefly describe Simon’s four phases of decision making.
2. What is the difference between a problem and its symptoms?
3. Why is it important to classify a problem?
4. Define implementation.
5. What are structured, unstructured, and semistructured decisions? Provide two examples of each.
6. Define operational control, managerial control, and strategic planning. Provide two examples of each.
7. What are the nine cells of the decision framework? Explain what each is for.
8. How can computers provide support for making structured decisions?
9. How can computers provide support for making semistructured and unstructured decisions?

1.4 EVOLUTION OF COMPUTERIZED DECISION SUPPORT TO BUSINESS INTELLIGENCE/ANALYTICS/DATA SCIENCE

The timeline in Figure 1.5 shows the terminology used to describe analytics since the 1970s. During the 1970s, information systems support for decision making focused primarily on providing structured, periodic reports that a manager could use for decision making (or ignore). Businesses began to create routine reports to inform decision makers (managers) about what had happened in the previous period (e.g., day, week, month, quarter). Although it was useful to know what had happened in the past, managers needed more than this: They needed a variety of reports at different levels of granularity to better understand and address changing needs and challenges of the business. These were usually called management information systems (MIS). In the early 1970s, Scott-Morton first articulated the major concepts of DSS. He defined DSS as “interactive computer-based systems, which help decision makers utilize data and models to solve unstructured problems” (Gorry and Scott-Morton, 1971). The following is another classic DSS definition provided by Keen and Scott-Morton (1978):

Decision support systems couple the intellectual resources of individuals with the capabilities of the computer to improve the quality of decisions. It is a computer-based support system for management decision makers who deal with semistructured problems.

FIGURE 1.5 Evolution of Decision Support, Business Intelligence, Analytics, and AI.


Note that the term decision support system, like management information system and several other terms in the field of IT, is a content-free expression (i.e., it means different things to different people). Therefore, there is no universally accepted definition of DSS.

During the early days of analytics, data were often obtained from the domain experts using manual processes (i.e., interviews and surveys) to build mathematical or knowledge-based models to solve constrained optimization problems. The idea was to do the best with limited resources. Such decision support models were typically called operations research (OR). The problems that were too complex to solve optimally (using linear or nonlinear mathematical programming techniques) were tackled using heuristic methods such as simulation models. (We will introduce these as prescriptive analytics later in this chapter.)
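To make "doing the best with limited resources" concrete, the Python sketch below solves a tiny product-mix problem. The profit coefficients and resource limits are hypothetical, and the solution method is plain exhaustive search over integer plans, which is feasible only because the problem is so small; it is not one of the mathematical programming techniques mentioned above.

```python
from itertools import product


# Hypothetical product-mix problem: choose integer quantities x and y
# to maximize profit subject to three resource limits.
def profit(x, y):
    return 3 * x + 5 * y


def feasible(x, y):
    """All three resource constraints must hold at the same time."""
    return x <= 4 and 2 * y <= 12 and 3 * x + 2 * y <= 18


def best_plan(max_units=10):
    """Enumerate every integer plan up to max_units and keep the most profitable."""
    candidates = (
        (x, y)
        for x, y in product(range(max_units + 1), repeat=2)
        if feasible(x, y)
    )
    return max(candidates, key=lambda plan: profit(*plan))


plan = best_plan()
print(plan, profit(*plan))
```

For realistically sized problems, enumeration becomes impractical, which is why dedicated linear and nonlinear programming solvers, or heuristics when even those fall short, are used instead.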

In the late 1970s and early 1980s, in addition to the mature OR models that were being used in many industries and government systems, a new and exciting line of models had emerged: rule-based expert systems (ESs). These systems promised to capture experts’ knowledge in a format that computers could process (via a collection of if-then-else rules or heuristics) so that these could be used for consultation much the same way that one would use domain experts to identify a structured problem and to prescribe the most probable solution. ESs allowed scarce expertise to be made available where and when needed, using an “intelligent” DSS.
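The idea of capturing expertise as a collection of if-then rules can be sketched in a few lines of Python. The rules and fact names below are hypothetical, and forward chaining is only one of several inference strategies an ES might use; it is chosen here because it is the simplest to show.

```python
# Each rule: (set of conditions that must all hold, conclusion to add).
# The rules are hypothetical and stand in for knowledge elicited from an expert.
RULES = [
    ({"payment_overdue", "no_contact"}, "high_risk"),
    ({"high_risk"}, "refer_to_collections"),
    ({"payment_overdue", "contact_made"}, "offer_payment_plan"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"payment_overdue", "no_contact"}, RULES)
print(sorted(derived))
```

Note that the second rule fires only because the first rule has already added `high_risk`; chaining conclusions in this way is what lets a small rule base reach a recommendation from raw facts.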

The 1980s saw a significant change in the way organizations captured business-related data. The old practice had been to have multiple disjointed information systems tailored to capture transactional data of different organizational units or functions (e.g., accounting, marketing and sales, finance, manufacturing). In the 1980s, these systems were integrated as enterprise-level information systems that we now commonly call enterprise resource planning (ERP) systems. The old mostly sequential and nonstandardized data representation schemas were replaced by relational database management (RDBM) systems. These systems made it possible to improve the capture and storage of data as well as the relationships between organizational data fields while significantly reducing the replication of information. The need for RDBM and ERP systems emerged when data integrity and consistency became an issue, significantly hindering the effectiveness of business practices. With ERP, all the data from every corner of the enterprise is collected and integrated into a consistent schema so that every part of the organization has access to the single version of the truth when and where needed. In addition to the emergence of ERP systems, or perhaps because of these systems, business reporting became an on-demand, as-needed business practice. Decision makers could decide when they needed to or wanted to create specialized reports to investigate organizational problems and opportunities.

In the 1990s, the need for more versatile reporting led to the development of executive information systems (EISs; DSS designed and developed specifically for executives and their decision-making needs). These systems were designed as graphical dashboards and scorecards so that they could serve as visually appealing displays while focusing on the most important factors for decision makers to keep track of the key performance indicators. To make this highly versatile reporting possible while keeping the transactional integrity of the business information systems intact, it was necessary to create a middle data tier known as a DW as a repository to specifically support business reporting and decision making. In a very short time, most large- to medium-sized businesses adopted data warehousing as their platform for enterprise-wide decision making. The dashboards and scorecards got their data from a DW, and by doing so, they were not hindering the efficiency of the business transaction systems mostly referred to as ERP systems.

In the 2000s, the DW-driven DSS began to be called BI systems. As the amount of longitudinal data accumulated in the DWs increased, so did the capabilities of hardware and software to keep up with the rapidly changing and evolving needs of the decision makers. Because of the globalized competitive marketplace, decision makers needed current information in a very digestible format to address business problems and to take advantage of market opportunities in a timely manner. Because the data in a DW are updated periodically, they do not reflect the latest information. To alleviate this information latency problem, DW vendors developed a system to update the data more frequently, which led to the terms real-time data warehousing and, more realistically, right-time data warehousing, which differs from the former by adopting a data-refreshing policy based on the needed freshness of the data items (i.e., not all data items need to be refreshed in real time). DWs are very large and feature rich, and it became necessary to “mine” the corporate data to “discover” new and useful knowledge nuggets to improve business processes and practices, hence, the terms data mining and text mining. With the increasing volumes and varieties of data, the need for more storage and more processing power emerged. Although large corporations had the means to tackle this problem, small- to medium-sized companies needed more financially manageable business models. This need led to service-oriented architecture and software and infrastructure-as-a-service analytics business models. Smaller companies, therefore, gained access to analytics capabilities on an as-needed basis and paid only for what they used, as opposed to investing in financially prohibitive hardware and software resources.
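The right-time idea, refreshing each data item only as often as its use requires, can be illustrated with a short Python sketch. The data item names, the staleness limits, and the timestamps are hypothetical; an actual DW product would express such a policy in its own scheduling or configuration facilities.

```python
from datetime import datetime, timedelta

# Hypothetical freshness requirements: how stale each data item may become.
MAX_AGE = {
    "inventory_levels": timedelta(minutes=5),
    "daily_sales": timedelta(hours=24),
    "customer_segments": timedelta(days=7),
}


def items_to_refresh(last_loaded, now):
    """Return the items whose age exceeds their allowed staleness."""
    return sorted(
        item
        for item, loaded_at in last_loaded.items()
        if now - loaded_at > MAX_AGE[item]
    )


now = datetime(2024, 1, 8, 12, 0)
last_loaded = {
    "inventory_levels": now - timedelta(minutes=30),   # too old for a 5-minute limit
    "daily_sales": now - timedelta(hours=3),           # still fresh enough
    "customer_segments": now - timedelta(days=10),     # older than 7 days
}
print(items_to_refresh(last_loaded, now))
```

Only the items that have outlived their limit are reloaded, so near-real-time effort is spent on the few items that need it rather than on the whole warehouse.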

In the 2010s, we are seeing yet another paradigm shift in the way that data are captured and used. Largely because of the widespread use of the Internet, new data generation media have emerged. Of all the new data sources (e.g., radio-frequency identification [RFID] tags, digital energy meters, clickstream Web logs, smart home devices, wearable health monitoring equipment), perhaps the most interesting and challenging is social networking/social media. These unstructured data are rich in information content, but analysis of such data sources poses significant challenges to computational systems from both software and hardware perspectives. Recently, the term Big Data has been coined to highlight the challenges that these new data streams have brought upon us. Many advancements in both hardware (e.g., massively parallel processing with very large computational memory and highly parallel multiprocessor computing systems) and software/algorithms (e.g., Hadoop with MapReduce and NoSQL, Spark) have been developed to address the challenges of Big Data.

The last few years and the upcoming decade are bringing massive growth in many exciting dimensions. For example, streaming analytics and the sensor technologies have enabled the IoT. Artificial Intelligence is changing the shape of BI by enabling new ways of analyzing images through deep learning, not just traditional visualization of data. Deep learning and AI are also helping grow voice recognition and speech synthesis, leading to new interfaces in interacting with technologies. Almost half of U.S. households already have a smart speaker such as Amazon Echo or Google Home and have begun to interact with data and systems using voice interfaces. Growth in video interfaces will eventually enable gesture-based interaction with systems. All of these are being enabled by massive cloud-based data storage and amazingly fast processing capabilities. And more is yet to come.

It is hard to predict what the next decade will bring and what the new analytics-related terms will be. The time between new paradigm shifts in information systems and particularly in analytics has been shrinking, and this trend will continue for the foreseeable future. Even though analytics is not new, the explosion in its popularity is very new. Thanks to the recent explosion in Big Data, ways to collect and store these data, and intuitive software tools, data-driven insights are more accessible to business professionals than ever before. Therefore, in the midst of global competition, there is a huge opportunity to make better managerial decisions by using data and analytics to increase revenue while decreasing costs by building better products, improving customer experience, catching fraud before it happens, improving customer engagement through targeting and customization, and developing entirely new lines of business, all with the power of analytics and data. More and more companies are now preparing their employees with the know-how of business analytics to drive effectiveness and efficiency in their day-to-day decision-making processes.

The next section focuses on a framework for BI. Although most people would agree that BI has evolved into analytics and data science, many vendors and researchers still use that term. So the next few paragraphs pay homage to that history by specifically focusing on what has been called BI. Following the next section, we introduce analytics and use that as the label for classifying all related concepts.

A Framework for Business Intelligence

The decision support concepts presented in Sections 1.2 and 1.3 have been implemented incrementally, under different names, by many vendors that have created tools and methodologies for decision support. As noted in Section 1.2, as the enterprise-wide systems grew, managers were able to access user-friendly reports that enabled them to make decisions quickly. These systems, which were generally called EISs, then began to offer additional visualization, alerts, and performance measurement capabilities. By 2006, the major commercial products and services appeared under the term business intelligence (BI).

DEFINITIONS OF BI Business intelligence (BI) is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies. It is, like DSS, a content-free expression, so it means different things to different people. Part of the confusion about BI lies in the flurry of acronyms and buzzwords that are associated with it (e.g., business performance management [BPM]). BI’s major objective is to enable interactive access (sometimes in real time) to data, to enable manipulation of data, and to give business managers and analysts the ability to conduct appropriate analyses. By analyzing historical and current data, situations, and performances, decision makers get valuable insights that enable them to make more informed and better decisions. The process of BI is based on the transformation of data to information, then to decisions, and finally to actions.

A BRIEF HISTORY OF BI The term BI was coined by the Gartner Group in the mid-1990s. However, as the history in the previous section points out, the concept is much older; it has its roots in the MIS reporting systems of the 1970s. During that period, reporting systems were static, were two dimensional, and had no analytical capabilities. In the early 1980s, the concept of EISs emerged. This concept expanded the computerized support to top-level managers and executives. Some of the capabilities introduced were dynamic multidimensional (ad hoc or on-demand) reporting, forecasting and prediction, trend analysis, drill-down to details, status access, and critical success factors. These features appeared in dozens of commercial products until the mid-1990s. Then the same capabilities and some new ones appeared under the name BI. Today, a good BI-based enterprise information system contains all the information that executives need. So, the original concept of EIS was transformed into BI. By 2005, BI systems started to include artificial intelligence capabilities as well as powerful analytical capabilities. Figure 1.6 illustrates the various tools and techniques that may be included in a BI system. It illustrates the evolution of BI as well. The tools shown in Figure 1.6 provide the capabilities of BI. The most sophisticated BI products include most of these capabilities; others specialize in only some of them.

The Architecture of BI

A BI system has four major components: a DW, with its source data; business analytics, a collection of tools for manipulating, mining, and analyzing the data in the DW; BPM for monitoring and analyzing performance; and a user interface (e.g., a dashboard). The relationship among these components is illustrated in Figure 1.7.


The Origins and Drivers of BI

Where did modern approaches to DW and BI come from? What are their roots, and how do those roots affect the way organizations are managing these initiatives today? Today’s investments in information technology are under increased scrutiny in terms of their bottom-line impact and potential. The same is true of DW and the BI applications that make these initiatives possible.

FIGURE 1.6 Evolution of Business Intelligence (BI).

FIGURE 1.7 A High-Level Architecture of BI. Source: Based on W. Eckerson. (2003). Smart Companies in the 21st Century: The Secrets of Creating Successful Business Intelligent Solutions. Seattle, WA: The Data Warehousing Institute, p. 32, Illustration 5.


Organizations are being compelled to capture, understand, and harness their data to support decision making to improve business operations. Legislation and regulation (e.g., the Sarbanes-Oxley Act of 2002) now require business leaders to document their business processes and to sign off on the legitimacy of the information they rely on and report to stakeholders. Moreover, business cycle times are now extremely compressed; faster, more informed, and better decision making is, therefore, a competitive imperative. Managers need the right information at the right time and in the right place. This is the mantra for modern approaches to BI.

Organizations have to work smart. Paying careful attention to the management of BI initiatives is a necessary aspect of doing business. It is no surprise, then, that organizations are increasingly championing BI and its new incarnation, analytics.

Data Warehouse as a Foundation for Business Intelligence

BI systems rely on a DW as the information source for creating insight and supporting managerial decisions. A multitude of organizational and external data is captured, transformed, and stored in a DW to support timely and accurate decisions through enriched business insight. In simple terms, a DW is a pool of data produced to support decision making; it is also a repository of current and historical data of potential interest to managers throughout the organization. Data are usually structured to be available in a form ready for analytical processing activities (i.e., OLAP, data mining, querying, reporting, and other decision support applications). A DW is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.

Whereas a DW is a repository of data, data warehousing is literally the entire process. Data warehousing is a discipline that results in applications that provide decision support capability, allow ready access to business information, and create business insight. The three main types of data warehouses are data marts (DMs), operational data stores (ODS), and enterprise data warehouses (EDW). Whereas a DW combines databases across an entire enterprise, a DM is usually smaller and focuses on a particular subject or department. A DM is a subset of a data warehouse, typically consisting of a single subject area (e.g., marketing, operations). An operational data store (ODS) provides a fairly recent form of customer information file. This type of database is often used as an interim staging area for a DW. Unlike the static contents of a DW, the contents of an ODS are updated throughout the course of business operations. An EDW is a large-scale data warehouse that is used across the enterprise for decision support. The large-scale nature of an EDW provides integration of data from many sources into a standard format for effective BI and decision support applications. EDWs are used to provide data for many types of DSS, including CRM, supply chain management (SCM), BPM, business activity monitoring, product lifecycle management, revenue management, and sometimes even KMS.

In Figure 1.8, we show the DW concept. Data from many different sources can be extracted, transformed, and loaded into a DW for further access and analytics for decision support. Further details of DW are available in an online chapter on the book’s Web site.
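The extract, transform, and load flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular DW product: the two source feeds, their field layouts, and the `sales_fact` table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical extracts from two operational sources with different layouts.
pos_rows = [
    {"sku": "A1", "qty": "2", "unit_price": "3.50", "sold_on": "2018-09-01"},
    {"sku": "B2", "qty": "1", "unit_price": "10.00", "sold_on": "2018-09-01"},
]
erp_rows = [
    ("a1", 5, 17.5, "01/09/2018"),  # (item, units, revenue, DD/MM/YYYY)
]

def transform_pos(row):
    """Convert a POS record to the common warehouse layout."""
    qty = int(row["qty"])
    revenue = round(qty * float(row["unit_price"]), 2)
    return (row["sku"].upper(), qty, revenue, row["sold_on"], "POS")

def transform_erp(row):
    """Convert an ERP record: normalize the item code and the date format."""
    item, units, revenue, date = row
    day, month, year = date.split("/")
    return (item.upper(), units, float(revenue), f"{year}-{month}-{day}", "ERP")

def load(conn, rows):
    """Load the integrated rows into a single (assumed) fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(sku TEXT, units INTEGER, revenue REAL, sale_date TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

def run_etl(conn):
    rows = [transform_pos(r) for r in pos_rows] + [transform_erp(r) for r in erp_rows]
    load(conn, rows)
    return len(rows)

warehouse = sqlite3.connect(":memory:")  # stand-in for the DW
loaded = run_etl(warehouse)
totals = warehouse.execute(
    "SELECT sku, SUM(units), SUM(revenue) FROM sales_fact GROUP BY sku ORDER BY sku"
).fetchall()
print(loaded, totals)
```

Once both sources share one layout, a single query can answer a question that neither source could answer alone, which is the point of the integration step.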

Transaction Processing versus Analytic Processing

To illustrate the major characteristics of BI, first we will show what BI is not: transaction processing. We are all familiar with the information systems that support our transactions, like ATM withdrawals, bank deposits, and cash register scans at the grocery store. These transaction processing systems are constantly involved in handling updates to what we might call operational databases. For example, in an ATM withdrawal transaction, we need to reduce our bank balance accordingly; a bank deposit adds to an account; and a grocery store purchase is likely reflected in the store’s calculation of total sales for the day, and it should reflect an appropriate reduction in the store’s inventory for the items we bought, and so on. These online transaction processing (OLTP) systems handle a company’s routine ongoing business. In contrast, a DW is typically a distinct system that provides storage for data that will be used for analysis. The intent of that analysis is to give management the ability to scour data for information about the business, and it can be used to provide tactical or operational decision support whereby, for example, line personnel can make quicker and/or more informed decisions. DWs are intended to work with informational data used for online analytical processing (OLAP) systems.

Most operational data in ERP systems—and in their complementary siblings like SCM or CRM—are stored in an OLTP system, which is a type of computer processing where the computer responds immediately to user requests. Each request is considered to be a transaction, which is a computerized record of a discrete event, such as the receipt of inventory or a customer order. In other words, a transaction requires a set of two or more database updates that must be completed in an all-or-nothing fashion.
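The all-or-nothing property can be illustrated with a short Python sketch. It assumes a hypothetical `accounts` table in an in-memory SQLite database; the `transfer` function and the account names are invented for illustration. Using the connection as a context manager commits both updates together or rolls both back.

```python
import sqlite3

def transfer(conn, from_acct, to_acct, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_acct),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (from_acct,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_acct),
            )
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("checking", 100), ("savings", 0)]
)
conn.commit()

ok = transfer(conn, "checking", "savings", 60)        # succeeds
rejected = transfer(conn, "checking", "savings", 60)  # would overdraw, so it is undone
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, rejected, balances)
```

The second transfer debits the first account before discovering the overdraft, yet the rollback leaves no trace of that partial update.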

The very design that makes an OLTP system efficient for transaction processing makes it inefficient for end-user ad hoc reports, queries, and analysis. In the 1980s, many business users referred to their mainframes as “black holes” because all the information went into them, but none ever came back. All requests for reports had to be programmed by the IT staff, only “precanned” reports could be generated on a scheduled basis, and ad hoc real-time querying was virtually impossible. Although the client/server-based ERP systems of the 1990s were somewhat more report friendly, they were still a far cry from the usability that regular, nontechnical end users wanted for things such as operational reporting and interactive analysis. To resolve these issues, the notions of DW and BI were created.

DWs contain a wide variety of data that present a coherent picture of business conditions at a single point in time. The idea was to create a database infrastructure that was always online and contained all the information from the OLTP systems, including historical data, but reorganized and structured in such a way that it was fast and efficient for querying, analysis, and decision support. Separating the OLTP from analysis and decision support enables the benefits of BI that were described earlier.
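To make the contrast with transaction processing concrete, the following Python sketch shows the kind of summarizing query an analytic system is organized to answer. The fact rows, the dimension names, and the `roll_up` helper are hypothetical; a real OLAP tool would do this over far larger, precomputed structures.

```python
from collections import defaultdict

# Hypothetical historical fact rows: (month, region, product, revenue)
facts = [
    ("2018-07", "East", "lumber", 120.0),
    ("2018-07", "West", "lumber", 80.0),
    ("2018-08", "East", "lumber", 150.0),
    ("2018-08", "East", "panels", 50.0),
]
DIMENSIONS = ("month", "region", "product")

def roll_up(rows, by):
    """Total revenue grouped by the chosen dimensions (an OLAP-style roll-up)."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # revenue is the last field
    return dict(totals)

by_month = roll_up(facts, ["month"])
by_month_region = roll_up(facts, ["month", "region"])
print(by_month)
print(by_month_region)
```

Changing the `by` list moves between coarser and finer views of the same history, whereas an OLTP system is tuned to read and update one record at a time.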

A Multimedia Exercise in Business Intelligence

TUN includes videos (similar to the television show CSI) to illustrate concepts of analytics in different industries. These are called “BSI Videos (Business Scenario Investigations).” Not only are these entertaining, but they also provide the class with some questions for discussion. For starters, please go to https://www.teradatauniversitynetwork.com/Library/Items/BSI-The-Case-of-the-Misconnecting-Passengers/ or www.youtube.com/watch?v=NXEL5F4_aKA. Watch the video that appears on YouTube. Essentially, you have to assume the role of a customer service center professional. An incoming flight is running late, and several passengers are likely to miss their connecting flights. There are seats on one outgoing flight that can accommodate two of the four passengers. Which two passengers should be given priority? You are given information about customers’ profiles and relationships with the airline. Your decisions might change as you learn more about those customers’ profiles.

FIGURE 1.8 Data Warehouse Framework and Views. (Elements shown: Data Sources (ERP, Legacy, POS, Other OLTP/Web, External Data); ETL Process (Select, Extract, Transform, Integrate, Load); Enterprise Data Warehouse with Metadata; Replication to Data Marts (Marketing, Operations, Finance, ...) or No Data Mart Options; API/Middleware; Applications (Routine Business Reporting; OLAP, Dashboard, Web; Data/Text Mining; Custom-Built Applications; Visualization).)

Watch the video, pause it as appropriate, and answer the questions on which passengers should be given priority. Then resume the video to get more information. After the video is complete, you can see the slides related to this video and how the analysis was prepared on a slide set at www.slideshare.net/teradata/bsi-how-we-did-it-the-case-of-the-misconnecting-passengers.

This multimedia excursion provides an example of how additional available information through an enterprise DW can assist in decision making.

Although some people equate DSS with BI, these systems are not, at present, the same. It is interesting to note that some people believe that DSS is a part of BI, one of its analytical tools. Others think that BI is a special case of DSS that deals mostly with reporting, communication, and collaboration (a form of data-oriented DSS). Another explanation (Watson, 2005) is that BI is a result of a continuous revolution, and as such, DSS is one of BI’s original elements. Further, as noted in the next section onward, in many circles, BI has been subsumed by the new terms analytics or data science.

APPROPRIATE PLANNING AND ALIGNMENT WITH THE BUSINESS STRATEGY First and foremost, the fundamental reasons for investing in BI must be aligned with the company’s business strategy. BI cannot simply be a technical exercise for the information systems department. It has to serve as a way to change the manner in which the company conducts business by improving its business processes and transforming decision-making processes to be more data driven. Many BI consultants and practitioners involved in successful BI initiatives advise that a framework for planning is a necessary precondition. One framework, proposed by Gartner, Inc. (2004), decomposed planning and execution into business, organization, functionality, and infrastructure components. At the business and organizational levels, strategic and operational objectives must be defined while considering the available organizational skills to achieve those objectives. Issues of organizational culture surrounding BI initiatives, building enthusiasm for those initiatives, and procedures for the intra-organizational sharing of BI best practices must be considered by upper management, with plans in place to prepare the organization for change. One of the first steps in that process is to assess the IS organization, the skill sets of the potential classes of users, and whether the culture is amenable to change. From this assessment, and assuming there is justification and a need to move ahead, a company can prepare a detailed action plan. Another critical issue for BI implementation success is the integration of several BI projects (most enterprises use several BI projects) among themselves and with the other IT systems in the organization and its business partners.

Gartner and many other analytics consulting organizations promoted the concept of a BI competence center that would serve the following functions:

• A center can demonstrate how BI is clearly linked to strategy and execution of strategy.
• A center can serve to encourage interaction between the potential business user communities and the IS organization.
• A center can serve as a repository and disseminator of best BI practices between and among the different lines of business.
• Standards of excellence in BI practices can be advocated and encouraged throughout the company.
• The IS organization can learn a great deal through interaction with the user communities, such as knowledge about the variety of types of analytical tools that are needed.
• The business user community and IS organization can better understand why the DW platform must be flexible enough to provide for changing business requirements.
• The center can help important stakeholders like high-level executives see how BI can play an important role.

Over the last 10 years, the idea of a BI competence center has been abandoned because many advanced technologies covered in this book have reduced the need for a central group to organize many of these functions. Basic BI has now evolved to a point where much of it can be done in “self-service” mode by the end users. For example, many data visualizations are easily accomplished by end users using the latest visualization packages (Chapter 3 will introduce some of these). As noted by Duncan (2016), the BI team would now be more focused on producing curated data sets to enable self-service BI. Because analytics is now permeating the whole organization, the BI competence center could evolve into an analytics community of excellence to promote best practices and ensure overall alignment of analytics initiatives with organizational strategy.

BI tools sometimes needed to be integrated among themselves, creating synergy. The need for integration pushed software vendors to continuously add capabilities to their products. Customers who buy an all-in-one software package deal with only one vendor and do not have to deal with system connectivity. But they may lose the advantage of creating systems composed from the “best-of-breed” components. This led to major chaos in the BI market space. Many of the software tools that rode the BI wave (e.g., Savvion, Vitria, Tibco, MicroStrategy, Hyperion) have either been acquired by other companies or have expanded their offerings to take advantage of six key trends that have emerged since the initial surge in business intelligence:

• Big Data.
• Focus on customer experience as opposed to just operational efficiency.
• Mobile and even newer user interfaces (visual, voice, mobile).
• Predictive and prescriptive analytics, machine learning, artificial intelligence.
• Migration to cloud.
• Much greater focus on security and privacy protection.

This book covers many of these topics in significant detail by giving examples of how the technologies are evolving and being applied, and the managerial implications.

SECTION 1.4 REVIEW QUESTIONS

1. List three of the terms that have been predecessors of analytics.
2. What was the primary difference between the systems called MIS, DSS, and Executive Information Systems?
3. Did DSS evolve into BI or vice versa?
4. Define BI.
5. List and describe the major components of BI.
6. Define OLTP.
7. Define OLAP.
8. List some of the implementation topics addressed by Gartner’s report.
9. List some other success factors of BI.

1.5 ANALYTICS OVERVIEW

The word analytics has largely replaced the previous individual components of computerized decision support technologies that have been available under various labels in the past. Indeed, many practitioners and academics now use the word analytics in place of BI. Although many authors and consultants have defined it slightly differently, one can view analytics as the process of developing actionable decisions or recommendations for actions based on insights generated from historical data. According to the Institute for Operations Research and the Management Sciences (INFORMS), analytics represents the combination of computer technology, management science techniques, and statistics to solve real problems. Of course, many other organizations have proposed their own interpretations and motivations for analytics. For example, SAS Institute Inc. proposed eight levels of analytics that begin with standardized reports from a computer system. These reports essentially provide a sense of what is happening with an organization. Additional technologies have enabled us to create more customized reports that can be generated on an ad hoc basis. The next extension of reporting takes us to OLAP-type queries that allow a user to dig deeper and determine specific sources of concern or opportunities. Technologies available today can also automatically issue alerts for a decision maker when performance warrants such alerts. At a consumer level, we see such alerts for weather or other issues. But similar alerts can also be generated in specific settings when sales rise above or fall below a certain level within a certain time period or when the inventory for a specific product is running low. All of these applications are made possible through analysis and queries of data being collected by an organization. The next level of analysis might entail statistical analysis to better understand patterns. These can then be taken a step further to develop forecasts or models for predicting how customers might respond to a specific marketing campaign or ongoing service/product offerings. When an organization has a good view of what is happening and what is likely to happen, it can also employ other techniques to make the best decisions under the circumstances.
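The alerting level mentioned above amounts to comparing monitored values with thresholds. A minimal Python sketch follows; the function name, the threshold values, and the sample data are all assumptions made for illustration, not part of any vendor's product.

```python
def check_alerts(daily_sales, low, high, stock, reorder_point):
    """Return alert messages for out-of-range sales and low inventory."""
    alerts = []
    for day, amount in daily_sales.items():
        if amount < low:
            alerts.append(f"{day}: sales {amount} below {low}")
        elif amount > high:
            alerts.append(f"{day}: sales {amount} above {high}")
    for item, units in stock.items():
        if units <= reorder_point:
            alerts.append(f"{item}: inventory low ({units} units)")
    return alerts

# Hypothetical monitored values
alerts = check_alerts(
    daily_sales={"Mon": 90, "Tue": 500, "Wed": 1200},
    low=100,
    high=1000,
    stock={"widget": 3, "gadget": 40},
    reorder_point=5,
)
for message in alerts:
    print(message)
```

In practice the thresholds themselves would come from the reporting and statistical levels, for example from typical sales ranges observed in historical data.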

This idea of looking at all the data to understand what is happening, what will happen, and how to make the best of it has also been encapsulated by INFORMS in proposing three levels of analytics. These three levels are identified as descriptive, predictive, and prescriptive. Figure 1.9 presents a graphical view of these three levels of analytics. It suggests that these three are somewhat independent steps and that one type of analytics application leads to another. It also suggests that there is actually some overlap across these three types of analytics. In either case, the interconnected nature of different types of analytics applications is evident. We next introduce these three levels of analytics.

FIGURE 1.9 Three Types of Analytics.

Business Analytics
Descriptive. Questions: What happened? What is happening? Enablers: Business reporting, Dashboards, Scorecards, Data warehousing. Outcomes: Well-defined business problems and opportunities.
Predictive. Questions: What will happen? Why will it happen? Enablers: Data mining, Text mining, Web/media mining, Forecasting. Outcomes: Accurate projections of future events and outcomes.
Prescriptive. Questions: What should I do? Why should I do it? Enablers: Optimization, Simulation, Decision modeling, Expert systems. Outcomes: Best possible business decisions and actions.

Descriptive Analytics

Descriptive (or reporting) analytics refers to knowing what is happening in the organization and understanding some underlying trends and causes of such occurrences. First, this involves the consolidation of data sources and availability of all relevant data in a form that enables appropriate reporting and analysis. Usually, the development of this data infrastructure is part of DWs. From this data infrastructure, we can develop appropriate reports, queries, alerts, and trends using various reporting tools and techniques.

A significant technology that has become a key player in this area is visualization. Using the latest visualization tools in the marketplace, we can now develop powerful insights into the operations of our organization. Application Cases 1.3 and 1.4 highlight some such applications.

Application Case 1.3 Silvaris Increases Business with Visual Analysis and Real-Time Reporting Capabilities

Silvaris Corporation was founded in 2000 by a team of forest industry professionals to provide technological advancement in the lumber and building material sector. Silvaris is the first e-commerce platform in the United States specifically for forest products and is headquartered in Seattle, Washington. It is a leading wholesale provider of industrial wood products and surplus building materials.

Silvaris sells its products and provides international logistics services to more than 3,500 customers. To manage various processes that are involved in a transaction, the company created a proprietary online trading platform to track information flow related to transactions between traders, accounting, credit, and logistics. This allowed Silvaris to share its real-time information with its customers and partners. But due to the rapidly changing prices of materials, it became necessary for Silvaris to get a real-time view of data without moving them into a separate reporting format.

Silvaris started using Tableau because of its ability to connect with and visualize live data. With dashboards created by Tableau that are easy to understand and explain, Silvaris started using it for reporting purposes. This helped Silvaris in pulling out information quickly from the data and identifying issues that impact its business. Silvaris succeeded in managing online versus offline orders with the help of reports generated by Tableau. Now, Silvaris keeps track of online orders placed by customers and knows when to send renew pushes to which customers to keep them purchasing online. Also, analysts of Silvaris can save time by generating dashboards instead of writing hundreds of pages of reports by using Tableau.

Sources: Tableau.com. “Silvaris Augments Proprietary Technology Platform with Tableau’s Real-Time Reporting Capabilities.” http://www.tableau.com/sites/default/files/case-studies/silvaris-business-dashboards_0.pdf (accessed September 2018); Silvaris.com. http://www.silvaris.com (accessed September 2018).

Questions for Case 1.3

1. What was the challenge faced by Silvaris?

2. How did Silvaris solve its problem using data visualization with Tableau?

What We Can Learn from This Application Case

Many industries need to analyze data in real time. Real-time analysis enables the analysts to identify issues that impact their business. Visualization is sometimes the best way to begin analyzing the live data streams. Tableau is one such data visualization tool that has the capability to analyze live data without bringing live data into a separate reporting format.

Predictive Analytics

Predictive analytics aims to determine what is likely to happen in the future. This analysis is based on statistical techniques as well as other more recently developed techniques that fall under the general category of data mining. The goal of these techniques is to be able to predict whether the customer is likely to switch to a competitor (“churn”), what and how much the customer would likely buy next, what promotions the customer would respond to, whether the customer is a creditworthy risk, and so forth. A number of techniques are used in developing predictive analytical applications, including various classification algorithms. For example, as described in Chapters 4 and 5, we can use classification techniques such as logistic regression, decision tree models, and neural networks to predict how well a motion picture will do at the box office. We can also use clustering algorithms for segmenting customers into different clusters to be able to target specific promotions to them. Finally, we can use association mining techniques (Chapters 4 and 5) to estimate relationships between different purchasing behaviors. That is, if a customer buys one product, what else is the customer likely to purchase? Such analysis can assist a retailer in recommending or promoting related products. For example, any product search on Amazon.com results in the retailer also suggesting similar products that a customer may be interested in. We will study these techniques and their applications in Chapters 3 through 6. Application Case 1.5 illustrates one such application in sports.
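The association mining question ("if a customer buys one product, what else is the customer likely to purchase?") is usually answered with two standard measures, support and confidence. The Python sketch below computes both on a handful of made-up baskets; the basket contents and function names are hypothetical, and real tools search many candidate rules rather than scoring one chosen by hand.

```python
# Hypothetical purchase baskets, one set of items per transaction
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= basket for basket in transactions)

def support(itemset, transactions):
    """Share of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions with the antecedent, the share that also has the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here three of the five baskets contain both items (support 0.60), and three of the four bread baskets also contain butter (confidence 0.75), so butter would be a reasonable suggestion to a shopper who has bread in the cart.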

Application Case 1.4 Siemens Reduces Cost with the Use of Data Visualization

Siemens is a German company headquartered in Berlin, Germany. It is one of the world’s largest companies focusing on the areas of electrification, automation, and digitalization. It has an annual revenue of 76 billion euros.

The visual analytics group of Siemens is tasked with end-to-end reporting solutions and consulting for all of Siemens internal BI needs. This group was facing the challenge of providing reporting solutions to the entire Siemens organization across different departments while maintaining a balance between governance and self-service capabilities. Siemens needed a platform that could analyze its multiple cases of customer satisfaction surveys, logistic processes, and financial reporting. This platform should be easy to use for their employees so that they could use these data for analysis and decision making. In addition, the platform should be easily integrated with existing Siemens systems and give employees a seamless user experience.

Siemens started using Dundas BI, a leading global provider of BI and data visualization solutions. It allowed Siemens to create highly interactive dashboards that enabled it to detect issues early and thus save a significant amount of money. The dashboards developed by Dundas BI helped Siemens global logistics organization answer questions like how different supply rates at different locations affect the operation, thus helping the company reduce cycle time by 12 percent and scrap cost by 25 percent.

Questions for Case 1.4

1. What challenges were faced by Siemens visual analytics group?

2. How did the data visualization tool Dundas BI help Siemens in reducing cost?

What We Can Learn from This Application Case

Many organizations want tools that can be used to analyze data from multiple divisions. These tools can help them improve performance and make data discovery transparent to their users so that they can identify issues within the business easily.

Sources: Dundas.com. “How Siemens Drastically Reduced Cost with Managed BI Applications.” https://www.dundas.com/Content/pdf/siemens-case-study.pdf (accessed September 2018); Wikipedia.org. “SIEMENS.” https://en.wikipedia.org/wiki/Siemens (accessed September 2018); Siemens.com. “About Siemens.” http://www.siemens.com/about/en/ (accessed September 2018).

Application Case 1.5 Analyzing Athletic Injuries

Any athletic activity is prone to injuries. If the injuries are not handled properly, then the team suffers. Using analytics to understand injuries can help in deriving valuable insights that would enable coaches and team doctors to manage the team composition, understand player profiles, and ultimately aid in better decision making concerning which players might be available to play at any given time.

In an exploratory study, Oklahoma State University analyzed U.S. football-related sports injuries by using reporting and predictive analytics. The project followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology (to be described in Chapter 4) to understand the problem of making recommendations on managing injuries, understanding the various data elements collected about injuries, cleaning the data, developing visualizations to draw various inferences, building predictive models to analyze the injury healing time period, and drawing sequence rules to predict the relationships among the injuries and the various body parts afflicted with injuries.

The injury data set consisted of more than 560 football injury records, which were categorized into injury-specific variables (body part/site/laterality, action taken, severity, injury type, injury start and healing dates) and player/sport-specific variables (player ID, position played, activity, onset, and game location). Healing time was calculated for each record, which was classified into different sets of time periods: 0–1 month, 1–2 months, 2–4 months, 4–6 months, and 6–24 months.

Various visualizations were built to draw inferences from injury data set information depicting the healing time period associated with players’ positions, severity of injuries and the healing time period, treatment offered and the associated healing time period, major injuries afflicting body parts, and so forth.

Neural network models were built to predict each of the healing categories using IBM SPSS Modeler. Some of the predictor variables were current status of injury, severity, body part, body site, type of injury, activity, event location, action taken, and position played. The success of classifying the healing category was quite good: Accuracy was 79.6 percent. Based on the analysis, many recommendations were suggested, including employing more specialists’ input from injury onset instead of letting the training room staff screen the injured players; training players at defensive positions to avoid being injured; and holding practice to thoroughly safety-check mechanisms.

Sources: Sharda, R., Asamoah, D., & Ponna, N. (2013). “Research and Pedagogy in Business Analytics: Opportunities and Illustrative Examples.” Journal of Computing and Information Technology, 21(3), pp. 171–182.

Questions for Case 1.5

1. What types of analytics are applied in the injury analysis?

2. How do visualizations aid in understanding the data and delivering insights into the data?

3. What is a classification problem?

4. What can be derived by performing sequence analysis?

What We Can Learn from This Application Case

For any analytics project, it is always important to understand the business domain and the current state of the business problem through extensive analysis of the only resource: historical data. Visualizations often provide a great tool for gaining the initial insights into data, which can be further refined based on expert opinions to identify the relative importance of the data elements related to the problem. Visualizations also aid in generating ideas for obscure problems, which can be pursued in building predictive models (PMs) that could help organizations in decision making.
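The data preparation step in Application Case 1.5, in which healing time is computed from the injury start and healing dates and then assigned to one of five classes, can be sketched as follows. This is an illustration only, not the study's actual procedure: the case does not say how a month was measured or which class owns a boundary value, so the average month length of 30.44 days and the inclusive upper bounds are assumptions.

```python
from datetime import date

# Upper bound in months (assumed inclusive) and the class label from the case
HEALING_CLASSES = [
    (1, "0-1 month"),
    (2, "1-2 months"),
    (4, "2-4 months"),
    (6, "4-6 months"),
    (24, "6-24 months"),
]
DAYS_PER_MONTH = 30.44  # assumption: average calendar month length

def healing_class(injury_start, healed_on):
    """Map an injury's start and healing dates to a healing-time class."""
    months = (healed_on - injury_start).days / DAYS_PER_MONTH
    if months < 0:
        raise ValueError("healing date precedes injury date")
    for upper_bound, label in HEALING_CLASSES:
        if months <= upper_bound:
            return label
    return None  # longer than the ranges listed in the case

short = healing_class(date(2018, 1, 1), date(2018, 1, 20))
medium = healing_class(date(2018, 1, 1), date(2018, 4, 15))
long_term = healing_class(date(2018, 1, 1), date(2019, 1, 1))
print(short, medium, long_term)
```

Turning a continuous healing time into a small set of classes is what allows the problem to be treated as classification, as the neural network models in the case do.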

Prescriptive Analytics

The third category of analytics is termed prescriptive analytics. The goal of prescriptive analytics is to recognize what is going on as well as the likely forecast and make decisions to achieve the best performance possible. This group of techniques has historically been studied under the umbrella of OR or management sciences and is generally aimed at optimizing the performance of a system. The goal here is to provide a decision or a recommendation for a specific action. These recommendations can be in the form of a specific yes/no decision for a problem, a specific amount (say, price for a specific item or airfare to charge), or a complete set of production plans. The decisions may be presented to a decision maker in a report or may be used directly in an automated decision rules system (e.g., in airline pricing systems). Thus, these types of analytics can also be termed decision or normative analytics. Application Case 1.6 gives an example of such prescriptive analytic applications. We will learn about some aspects of prescriptive analytics in Chapter 8.
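A toy version of such a recommendation can be written in Python. The sketch below decides which orders to accept given limited raw material, in the spirit of the inventory allocation problem in Application Case 1.6, but it is not the model used there: the orders, tonnages, and profits are invented, and it simply enumerates every combination, which only works for a handful of orders. Realistic problems rely on optimization techniques such as mixed-integer programming.

```python
from itertools import combinations

# Hypothetical orders: name -> (tons of raw material required, profit)
orders = {
    "order_a": (40, 900),
    "order_b": (30, 800),
    "order_c": (50, 1000),
}
available_tons = 80

def best_plan(candidates, capacity):
    """Try every subset of orders; keep the most profitable one that fits."""
    best, best_profit = (), 0
    names = sorted(candidates)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            tons = sum(candidates[name][0] for name in subset)
            profit = sum(candidates[name][1] for name in subset)
            if tons <= capacity and profit > best_profit:
                best, best_profit = subset, profit
    return best, best_profit

plan, profit = best_plan(orders, available_tons)
print(plan, profit)
```

The output is a specific recommended action (which orders to accept) rather than a description or a forecast, which is what distinguishes prescriptive analytics from the other two levels.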

ANALYTICS APPLIED TO DIFFERENT DOMAINS Applications of analytics in various in- dustry sectors have spawned many related areas or at least buzzwords. It is almost fashionable to attach the word analytics to any specific industry or type of data. Besides the general category of text analytics—aimed at getting value out of text (to be studied in Chapter 7)—or Web analytics—analyzing Web data streams (also in

This application case is based on a project that involved one of the coauthors A company that does not wish to disclose its name (or even its precise industry) was facing a major problem of making decisions on which inventory of raw materials to use to satisfy which customers. This company sup- plies custom configured steel bars to its customers. These bars may be cut into specific shapes or sizes and may have unique material and finishing require- ments. The company procures raw materials from around the world and stores them in its warehouse. When a prospective customer calls the company to request a quote for the specialty bars meeting spe- cific material requirements (composition, origin of the metal, quality, shapes, sizes, etc.), the salesper- son usually has just a little bit of time to submit such a quote including the date when the product can be delivered and, of course, prices, and so on. It must make available-to-promise (ATP) decisions, which determine in real time the dates when the salesper- son can promise delivery of products that customers requested during the quotation stage. Previously, a salesperson had to make such decisions by analyz- ing reports on available inventory of raw materials. Some of the available raw material may have already been committed to another customer’s order. Thus, the inventory in stock might not really be inven- tory available. On the other hand, there may be raw material that is expected to be delivered in the near future that could also be used for satisfying the order

from this prospective customer. Finally, there might even be an opportunity to charge a premium for a new order by repurposing previously committed inventory to satisfy this new order while delaying an already committed order. Of course, such deci- sions should be based on the cost–benefit analyses of delaying a previous order. The system should thus be able to pull real-time data about inventory, committed orders, incoming raw material, produc- tion constraints, and so on.

To support these ATP decisions, a real-time DSS was developed to find an optimal assignment of the available inventory and to support additional what-if analysis. The DSS uses a suite of mixed- integer pro- gramming models that are solved using commercial software. The company has incorporated the DSS into its enterprise resource planning system to seam- lessly facilitate its use of business analytics.

Questions for Case 1.6

1. Why would reallocation of inventory from one customer to another be a major issue for discussion?

2. How could a DSS help make these decisions?

Source: M. Pajouh Foad, D. Xing, S. Hariharan, Y. Zhou, B. Balasundaram, T. Liu, & R. Sharda, R. (2013). “Available-to-Promise in Practice: An Application of Analytics in the Specialty Steel Bar Products Industry.” Interfaces, 43(6), pp. 503–517. http://dx.doi. org/10.1287/inte.2013.0693 (accessed September 2018).

Application Case 1.6 A Specialty Steel Bar Company Uses Analytics to Determine Available-to-Promise Dates



Chapter 7)—many industry- or problem-specific analytics professions/streams have been developed. Examples of such areas are marketing analytics, retail analytics, fraud analytics, transportation analytics, health analytics, sports analytics, talent analytics, behavioral analytics, and so forth. For example, we will soon see several applications in sports analytics. Application Case 1.5 could also be termed a case study in health analytics. The next section will introduce health analytics and market analytics broadly. Literally, any systematic analysis of data in a specific sector is being labeled as “(fill-in-blanks)” analytics. Although this may result in overselling the concept of analytics, the benefit is that more people in specific industries are aware of the power and potential of analytics. It also provides a focus to professionals developing and applying the concepts of analytics in a vertical sector. Although many of the techniques to develop analytics applications may be common, there are unique issues within each vertical segment that influence how the data may be collected, processed, analyzed, and the applications implemented. Thus, the differentiation of analytics based on a vertical focus is good for the overall growth of the discipline.

ANALYTICS OR DATA SCIENCE? Even as the concept of analytics is receiving more attention in industry and academic circles, another term has already been introduced and is becoming popular. The new term is data science. Thus, the practitioners of data science are data scientists. D. J. Patil of LinkedIn is sometimes credited with creating the term data science. There have been some attempts to describe the differences between data analysts and data scientists (e.g., see “Data Science Revealed,” 2018) (emc.com/collateral/about/news/emc-data-science-study-wp.pdf). One view is that data analyst is just another term for professionals who were doing BI in the form of data compilation, cleaning, reporting, and perhaps some visualization. Their skill sets included Excel use, some SQL knowledge, and reporting. You would recognize those capabilities as descriptive or reporting analytics. In contrast, data scientists are responsible for predictive analysis, statistical analysis, and use of more advanced analytical tools and algorithms. They may have a deeper knowledge of algorithms and may recognize them under various labels—data mining, knowledge discovery, or machine learning. Some of these professionals may also need deeper programming knowledge to be able to write code for data cleaning/analysis in current Web-oriented languages such as Java or Python and statistical languages such as R. Many analytics professionals also need to build significant expertise in statistical modeling, experimentation, and analysis. Again, our readers should recognize that these fall under the predictive and prescriptive analytics umbrella. However, prescriptive analytics also includes more significant expertise in OR including optimization, simulation, and decision analysis. Those who cover these fields are more likely to be called data scientists than analytics professionals.

Our view is that the distinction between analytics professional and data scientist is more of a degree of technical knowledge and skill sets than functions. It may also be more of a distinction across disciplines. Computer science, statistics, and applied mathematics programs appear to prefer the data science label, reserving the analytics label for more business-oriented professionals. As another example of this, applied physics professionals have proposed using network science as the term for describing analytics that relate to groups of people—social networks, supply chain networks, and so forth. See http://barabasi.com/networksciencebook/ for an evolving textbook on this topic.

Aside from a clear difference in the skill sets of professionals who only have to do descriptive/reporting analytics versus those who engage in all three types of analytics, the distinction between the two labels is fuzzy at best. We observe that graduates of our analytics programs tend to be responsible for tasks that are more in line with data science professionals (as defined by some circles) than just reporting analytics. This book is clearly aimed at introducing the capabilities and functionality of all analytics (which include data science), not just reporting analytics. From now on, we will use these terms interchangeably.

WHAT IS BIG DATA? Any book on analytics and data science has to include significant coverage of what is called Big Data analytics. We cover it in Chapter 9 but here is a very brief introduction. Our brains work extremely quickly and efficiently and are versatile in processing large amounts of all kinds of data: images, text, sounds, smells, and video. We process all different forms of data relatively easily. Computers, on the other hand, are still finding it hard to keep up with the pace at which data are generated, let alone analyze them quickly. This is why we have the problem of Big Data. So, what is Big Data? Simply put, Big Data refers to data that cannot be stored in a single storage unit. Big Data typically refers to data that come in many different forms: structured, unstructured, in a stream, and so forth. Major sources of such data are clickstreams from Web sites, postings on social media sites such as Facebook, and data from traffic, sensors, or weather. A Web search engine such as Google needs to search and index billions of Web pages to give you relevant search results in a fraction of a second. Although this is not done in real time, generating an index of all the Web pages on the Internet is not an easy task. Luckily for Google, it was able to solve this problem. Among other tools, it has employed Big Data analytical techniques.

There are two aspects to managing data on this scale: storing and processing. If we could purchase an extremely expensive storage solution to store all this at one place on one unit, making this unit fault tolerant would involve a major expense. An ingenious solution was proposed that involved storing these data in chunks on different machines connected by a network—putting a copy or two of this chunk in different locations on the network, both logically and physically. It was originally used at Google (then called the Google File System) and later developed and released by an Apache project as the Hadoop Distributed File System (HDFS).
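The storage idea can be sketched in a few lines of Python. This is only an illustration of chunking and replication under simple assumptions: the node names and chunk size are hypothetical, and the round-robin placement shown here is far simpler than the placement policy of a real distributed file system such as HDFS.

```python
def place_chunks(data, chunk_size, nodes, replicas=2):
    """Split data into chunks and assign each chunk to `replicas` distinct nodes."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for start in range(0, len(data), chunk_size):
        chunk_id = start // chunk_size
        # Round-robin over consecutive nodes, so copies never share a machine.
        holders = [nodes[(chunk_id + r) % len(nodes)] for r in range(replicas)]
        placement[chunk_id] = (data[start:start + chunk_size], holders)
    return placement


nodes = ["n1", "n2", "n3"]  # hypothetical machine names
placement = place_chunks(b"abcdefghij", chunk_size=4, nodes=nodes)
for chunk_id, (chunk, holders) in placement.items():
    print(chunk_id, chunk, holders)
```

Because every chunk lives on two different machines, the loss of any single machine still leaves one copy of each chunk available.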

However, storing these data is only half of the problem. Data are worthless if they do not provide business value, and for them to provide business value, they must be analyzed. How can such vast amounts of data be analyzed? Passing all computation to one powerful computer does not work; this scale would create a huge overhead on such a powerful computer. Another ingenious solution was proposed: Push computation to the data instead of pushing data to a computing node. This was a new paradigm and gave rise to a whole new way of processing data. This is what we know today as the MapReduce programming paradigm, which made processing Big Data a reality. MapReduce was originally developed at Google, and a subsequent version was released by the Apache project called Hadoop MapReduce.
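The map and reduce steps can be illustrated with the classic word-count example. The sketch below runs on one machine in plain Python and does not use the Hadoop API; the function names and the two sample partitions are assumptions made for illustration. In a real cluster, the map step would run on the machines that hold each partition.

```python
from collections import defaultdict


def map_phase(partition):
    """Runs where the data lives: emit (word, 1) for each word in a partition."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical partitions, standing in for chunks stored on two machines.
partitions = [["big data big value"], ["data at scale"]]
mapped = [pair for part in partitions for pair in map_phase(part)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Only the small (word, count) pairs move between steps; the raw text never has to be gathered in one place, which is the point of pushing computation to the data.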

Today, when we talk about storing, processing, or analyzing Big Data, HDFS and MapReduce are involved at some level. Other relevant standards and software solutions have been proposed. Although the major toolkit is available as an open source, several companies have been launched to provide training or specialized analytical hardware or software services in this space. Some examples are HortonWorks, Cloudera, and Teradata Aster.

Over the past few years, what was called Big Data changed more and more as Big Data applications appeared. The need to process data coming in at a rapid rate added velocity to the equation. An example of fast data processing is algorithmic trading. This uses electronic platforms based on algorithms for trading shares on the financial market, which operates in microseconds. The need to process different kinds of data added variety to the equation. Another example of a wide variety of data is sentiment analysis, which uses various forms of data from social media platforms and customer responses to gauge sentiments. Today, Big Data is associated with almost any kind of large data that have the characteristics of volume, velocity, and variety. As noted before, these are evolving quickly to encompass stream analytics, IoT, cloud computing, and deep learning–enabled AI. We will study these in various chapters in the book.

SECTION 1.5 REVIEW QUESTIONS

1. Define analytics.
2. What is descriptive analytics? What are the various tools that are employed in descriptive analytics?
3. How is descriptive analytics different from traditional reporting?
4. What is a DW? How can DW technology help enable analytics?
5. What is predictive analytics? How can organizations employ predictive analytics?
6. What is prescriptive analytics? What kinds of problems can be solved by prescriptive analytics?
7. Define modeling from the analytics perspective.
8. Is it a good idea to follow a hierarchy of descriptive and predictive analytics before applying prescriptive analytics?
9. How can analytics aid in objective decision making?
10. What is Big Data analytics?
11. What are the sources of Big Data?
12. What are the characteristics of Big Data?
13. What processing technique is applied to process Big Data?

1.6 ANALYTICS EXAMPLES IN SELECTED DOMAINS

You will see examples of analytics applications throughout various chapters. That is one of the primary approaches (exposure) of this book. In this section, we highlight three application areas—sports, healthcare, and retail—where there have been the most reported applications and successes.

Sports Analytics—An Exciting Frontier for Learning and Understanding Applications of Analytics

The application of analytics to business problems is a key skill, one that you will learn in this book. Many of these techniques are now being applied to improve decision making in all aspects of sports, a very hot area called sports analytics. It is the art and science of gathering data about athletes and teams to create insights that improve sports decisions, such as deciding which players to recruit, how much to pay them, who to play, how to train them, how to keep them healthy, and when they should be traded or retired. For teams, it involves business decisions such as ticket pricing as well as roster decisions, analysis of each competitor’s strengths and weaknesses, and many game-day decisions.

Indeed, sports analytics is becoming a specialty within analytics. It is an important area because sport is a big business, generating about $145 billion in revenues each year plus an additional $100 billion in legal and $300 billion in illegal gambling, according to Price Waterhouse (“Changing the Game: Outlook for the Global Sports Market to 2015” (2015)). In 2014, only $125 million was spent on analytics (less than 0.1 percent of revenues). This is expected to grow at a healthy rate to $4.7 billion by 2021 (“Sports Analytics Market Worth $4.7B by 2021” (2015)).

The use of analytics for sports was popularized by the Moneyball book by Michael Lewis in 2003 and the movie starring Brad Pitt in 2011. It showcased Oakland A’s general manager Billy Beane and his use of data and analytics to turn a losing team into a winner. In particular, he hired an analyst who used analytics to draft players who were able to get on base as opposed to players who excelled at traditional measures like runs batted in or stolen bases. These insights allowed the team to draft prospects overlooked by other teams at reasonable starting salaries. It worked—the team made it to the playoffs in 2002 and 2003.

Now analytics are being used in all parts of sports. The analytics can be divided between the front office and back office. A good description with 30 examples appears in Tom Davenport’s survey article (). Front-office business analytics include analyzing fan behavior ranging from predictive models for season ticket renewals and regular ticket sales to scoring tweets by fans regarding the team, athletes, coaches, and owners. This is very similar to traditional CRM. Financial analysis is also a key area such as when salary cap (for pros) or scholarship (for colleges) limits are part of the equation.

Back-office uses include analysis of both individual athletes and team play. For individual players, there is a focus on recruitment models and scouting analytics, analytics for strength and fitness as well as development, and PMs for avoiding overtraining and injuries. Concussion research is a hot field. Team analytics include strategies and tactics, competitive assessments, and optimal roster choices under various on-field or on-court situations.

The following representative examples illustrate how two sports organizations use data and analytics to improve sports operations in the same way that analytics have improved traditional industry decision making.

Example 1: The Business Office

Dave Ward works as a business analyst for a major pro baseball team, focusing on revenue. He analyzes ticket sales, both from season ticket holders and single-ticket buyers. Sample questions in his area of responsibility include why season ticket holders renew (or do not renew) their tickets as well as what factors drive last-minute individual seat ticket purchases. Another question is how to price the tickets.

Some of the analytical techniques Dave uses include simple statistics on fan behavior such as overall attendance and answers to survey questions about likelihood to purchase again. However, what fans say versus what they do can be different. Dave runs a survey of fans by ticket seat location (“tier”) and asks about their likelihood of renewing their season tickets. But when he compares what they say versus what they do, he discovers big differences. (See Figure 1.10.) He found that 69 percent of fans in Tier 1 seats who said on the survey that they would “probably not renew” actually did. This is useful insight that leads to action—customers in the green cells are the most likely to renew tickets and so require fewer marketing touches and dollars to convert compared to customers in the blue cells.

Tier   Highly Likely   Likely   Maybe   Probably Not   Certainly Not
1      92              88       75      69             45
2      88              81       70      65             38
3      80              76       68      55             36
4      77              72       65      45             25
5      75              70       60      35             25

(Values are the percentage of fans in each tier and survey response who actually renewed.)

FIGURE 1.10 Season Ticket Renewals—Survey Scores.

However, many factors influence fan ticket purchase behavior, especially price, which drives more sophisticated statistics and data analysis. For both areas, but especially single-game tickets, Dave is driving the use of dynamic pricing—moving the business from simple static pricing by seat location tier to day-by-day up-and-down pricing of individual seats. This is a rich research area for many sports teams and has huge upside potential for revenue enhancement. For example, his pricing takes into account the team’s record, who they are playing, game dates and times, which star athletes play for each team, each fan’s history of renewing season tickets or buying single tickets, and factors such as seat location, number of seats, and real-time information like traffic congestion historically at game time and even the weather. See Figure 1.11.

Which of these factors are important and by how much? Given his extensive statistics background, Dave builds regression models to pick out key factors driving these historic behaviors and create PMs to identify how to spend marketing resources to drive revenues. He builds churn models for season ticket holders to create segments of customers who will renew, will not renew, or are fence-sitters, which then drives more refined marketing campaigns.
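As a minimal sketch of the kind of regression model described above, the following Python code fits a one-variable least-squares line. The data are invented for illustration (home wins in the past 10 games versus single-game tickets sold, in thousands) and are not from Dave's team; a real pricing model would include many more factors and would normally be built with a statistics package.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations, one per game.
wins_last_10 = [2, 4, 5, 7, 8]
tickets_sold = [11.2, 14.7, 17.3, 20.8, 23.0]  # thousands of tickets

slope, intercept = fit_line(wins_last_10, tickets_sold)
print(f"Each extra recent win is associated with about {slope:.2f}k more tickets")
```

The slope answers the "by how much" question for a single factor: it estimates how ticket sales change when recent team performance changes by one win, under the strong assumption that nothing else varies.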

In addition, Dave does sentiment scoring on fan comments such as tweets that help him segment fans into different loyalty segments. Other studies about single-game attendance drivers help the marketing department understand the impact of giveaways like bobble-heads or T-shirts or suggestions on where to make spot TV ad buys.

Beyond revenues, there are many other analytical areas that Dave’s team works on, including merchandising, TV and radio broadcast revenues, inputs to the general manager on salary negotiations, draft analytics especially given salary caps, promotion effectiveness including advertising channels, and brand awareness, as well as partner analytics. He’s a very busy guy!

[Figure 1.11 lists the factors considered in dynamic pricing: Seat Location; Team Performance; Time-Related Variables; Game Start Time; Part of the Season; Days before the Game; Home Team Performance in Past 10 Games; Opponent Made Playoffs Previous Year; Individual Player Reputations; Which Pitcher? What’s His Earned Run Average?; Number of All Stars on Opponent’s Roster; Opponent from Same Division.]

FIGURE 1.11 Dynamic Pricing Previous Work—Major League Baseball. Source: Based on C. Kemper and C. Breuer, “How Efficient is Dynamic Pricing for Sports Events? Designing a Dynamic Pricing Model for Bayern Munich”, Intl. Journal of Sports Finance, 11, pp. 4–25, 2016.


Example 2: The Coach

Bob Breedlove is the football coach for a major college team. For him, everything is about winning games. His areas of focus include recruiting the best high school players, developing them to fit his offense and defense systems, and getting maximum effort from them on game days. Sample questions in his area of responsibility include: Whom do we recruit? What drills help develop their skills? How hard do I push our athletes? Where are opponents strong or weak, and how do we figure out their play tendencies?

Fortunately, his team has hired a new team operations expert, Dar Beranek, who specializes in helping the coaches make tactical decisions. She is working with a team of student interns who are creating opponent analytics. They used the coach’s annotated game film to build a cascaded decision tree model (Figure 1.12) to predict whether the next play will be a running play or passing play. For the defensive coordinator, they have built heat maps (Figure 1.13) of each opponent’s passing offense, illustrating their tendencies to throw left or right and into which defensive coverage zones. Finally, they built some time-series analytics (Figure 1.14) on explosive plays (defined as a gain of more than 16 yards for a passing play or more than 12 yards for a run play). For each play, they compare the outcome with their own defensive formations and the other team’s offensive formations, which help Coach Breedlove react more quickly to formation shifts during a game. We explain the analytical techniques that generated these figures in much more depth in Chapters 3–6 and Chapter 9.
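To see why a decision tree split is informative, consider the top of the tree in Figure 1.12, where 540 plays (46.48 percent runs) are divided into nodes of 155 plays (79.35 percent runs) and 385 plays (33.25 percent runs). The Python sketch below checks that the two child nodes are consistent with the root and measures how much the split reduces impurity. Gini impurity is used here as an assumption for illustration; the text does not say which splitting criterion the interns' tool used.

```python
def gini(run_share):
    """Gini impurity of a node with two classes (run, pass)."""
    return 1.0 - run_share**2 - (1.0 - run_share) ** 2


# Node sizes and run shares as reported in Figure 1.12.
root = (540, 0.4648)
children = [(155, 0.7935), (385, 0.3325)]

# The children's run shares, weighted by node size, should reproduce the root's.
weighted_run = sum(n * p for n, p in children) / root[0]

# Impurity reduction achieved by the split (larger means a more useful split).
weighted_gini = sum(n * gini(p) for n, p in children) / root[0]
reduction = gini(root[1]) - weighted_gini

print(round(weighted_run, 4), round(reduction, 4))
```

The weighted run share comes back to about 46.48 percent, matching the root node, and the impurity falls by roughly 0.087. In other words, knowing the offensive personnel grouping makes the run-or-pass call noticeably more predictable than the overall average.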

New work that Dar is fostering involves building better high school athlete recruiting models. For example, each year the team gives scholarships to three students who are wide receiver recruits. For Dar, picking out the best players goes beyond simple measures like how fast athletes run, how high they jump, or how long their arms are to newer criteria like how quickly they can rotate their heads to catch a pass, what kinds of reaction times they exhibit to multiple stimuli, and how accurately they run pass routes. Some of her ideas illustrating these concepts appear on the TUN Web site; look for the Business Scenario Investigation (2015) “The Case of Precision Football.”

[Figure 1.12 shows a decision tree with the following nodes:

Total # of Plays   Percentage of Run   Percentage of Pass
540                46.48%              53.52%
155                79.35%              20.65%
385                33.25%              66.75%
294                38.78%              61.22%
91                 15.38%              84.62%
162                50.62%              49.38%
132                24.24%              75.76%
25                 44.00%              56.00%
66                 4.55%               95.45%

Branch conditions: If Off_Pers is “12, 21, 30, 31, 32” or “10, 11, 20, 22, or Missing”; if it is 1st or 2nd Down or 3rd or 4th Down; if we are behind or we are leading or it is a tie; if the distance to achieve the next down is more than 5 yards or less than 5 yards.]

FIGURE 1.12 Cascaded Decision Tree for Run or Pass Plays. Source: Contributed by Dr. Dave Schrader, who retired after 24 years in advanced development and marketing at Teradata. He has remained on the Board of Advisors of the Teradata University Network, where he spends his retirement helping students and faculty learn more about sports analytics. Graphics by Peter Liang and Jacob Pearson, graduate students at Oklahoma State University, as part of a student project in the spring of 2016 in Prof. Ramesh Sharda’s class under Dr. Dave Schrader’s coaching.

WHAT CAN WE LEARN FROM THESE EXAMPLES? Beyond the front-office business analysts, the coaches, trainers, and performance experts, there are many other people in sports who use data, ranging from golf groundskeepers who measure soil and turf conditions for PGA tournaments to baseball and basketball referees who are rated on the correct and incorrect calls they make. In fact, it is hard to find an area of sports that is not being impacted by the availability of more data, especially from sensors.

Skills you will learn in this book for business analytics will apply to sports. If you want to dig deeper into this area, we encourage you to look at the Sports Analytics section of the TUN, a free resource for students and faculty. On its Web site, you will find descriptions of what to read to find out more about sports analytics, compilations of places where you can find publicly available data sets for analysis, as well as examples of student projects in sports analytics and interviews of sports professionals who use data and analytics to do their jobs. Good luck learning analytics!

[Figure 1.13 shows passing results by zone, arranged around the Line of Scrimmage between Defense and Offense:

Zone   Complete   Total   Completion %   Explosive
A      35         46      76.08%         4
1      25         35      71.4%          1
B      6          8       75.00%         5
2      12         24      50%            0
3      14         28      50%            0
4      8          14      57.14%         0
6      7          10      70%            2
7      13         21      61.9%          9
8      7          10      70%            6
9      15         27      55.55%         8
5      25         44      56.81%         1
C      22         27      81.48%         2
X      1          13      7.69%          1
Y      7          18      38.88%         7
Z      5          15      33.33%         6]

FIGURE 1.13 Heat Map Zone Analysis for Passing Plays. Source: Contributed by Dr. Dave Schrader, who retired after 24 years in advanced development and marketing at Teradata. He has remained on the Board of Advisors of the Teradata University Network, where he spends his retirement helping students and faculty learn more about sports analytics. Graphics by Peter Liang and Jacob Pearson, graduate students at Oklahoma State University, as part of a student project in the spring of 2016 in Prof. Ramesh Sharda’s class under Dr. Dave Schrader’s coaching.

Analytics Applications in Healthcare—Humana Examples

Although healthcare analytics span a wide variety of applications from prevention to diagnosis to efficient operations and fraud prevention, we focus on some applications that have been developed at a major health insurance company in the United States, Humana. According to its Web site, “The company’s strategy integrates care delivery, the member experience, and clinical and consumer insights to encourage engagement, behavior change, proactive clinical outreach and wellness. . . .” Achieving these strategic goals includes significant investments in information technology in general and analytics in particular. Brian LeClaire is senior vice president and CIO of Humana. He has a PhD in MIS from Oklahoma State University. He has championed analytics as a competitive differentiator at Humana—including cosponsoring the creation of a center for excellence in analytics. He described the following projects as examples of Humana’s analytics initiatives, led by Humana’s chief clinical analytics officer, Vipin Gopal.

Humana Example 1: Preventing Falls in a Senior Population—An Analytic Approach

Accidental falls are a major health risk for adults age 65 years and older with one-third experiencing a fall every year.1 The costs of falls pose a significant strain on the U.S. healthcare system; the direct costs of falls were estimated at $34 billion in 2013 alone.1 With the percent of seniors in the U.S. population on the rise, falls and associated costs are anticipated to increase. According to the Centers for Disease Control and Prevention (CDC), “Falls are a public health problem that is largely preventable” (www.cdc.gov/homeandrecreationalsafety/falls/adultfalls.html).1 Falls are also the leading factor for both fatal and nonfatal injuries in older adults with injurious falls increasing the risk of disability by up to 50 percent (Gill et al., 2013).2 Humana is the nation’s second-largest provider of Medicare Advantage benefits with approximately 3.2 million members, most of whom are seniors. Keeping its senior members well and helping them live safely at their homes is a key business objective of which prevention of falls is an important component. However, no rigorous methodology was available to identify individuals most likely to fall, for whom falls prevention efforts would be beneficial. Unlike chronic medical conditions such as diabetes and cancer, a fall is not a well-defined medical condition. In addition, falls are usually underreported in claims data as physicians typically tend to code the consequence of a fall such as fractures and dislocations. Although many clinically administered assessments to identify fallers exist, they have limited reach and lack sufficient predictive power (Gates et al., 2008).3 As such, there is a need for a prospective and accurate method to identify individuals at greatest risk of falling so that they can be proactively managed for fall prevention. The Humana analytics team undertook the development of a Falls Predictive Model in this context. It is the first comprehensive PM reported that utilizes administrative medical and pharmacy claims, clinical data, temporal clinical patterns, consumer information, and other data to identify individuals at high risk of falling over a time horizon.

[Figure 1.14 is a chart with legend entries Expl_ and Expl_Y. Categories for ud_d_cov: 0, 2, 3, 4, 4 JAM, 6, BUZZ, FALL, FIRES, FLAME, HANDS, HARD, HERO, HOT, LEVELS, MIX, ROBBER, ROLL, SKY, SMOKE, SPARK, SQUAT, STATE, WALL. Categories for ud_d_off_pers: 10, 11, 12, 20, 21, 22, 23, 30, 31, 32.]

FIGURE 1.14 Time-Series Analysis of Explosive Plays.

Today, the Falls PM is central to Humana’s ability to identify seniors who could benefit from fall mitigation interventions. An initial proof-of-concept with Humana consumers, representing the top 2 percent of those at the highest risk of falling, demonstrated that the consumers had increased utilization of physical therapy services, indicating consumers are taking active steps to reduce their risk for falls. A second initiative utilizes the Falls PM to identify high-risk individuals for remote monitoring programs. Using the PM, Humana was able to identify 20,000 consumers at a high risk of falling who benefited from this program. Identified consumers wear a device that detects falls and alerts a 24/7 service for immediate assistance.
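Once a predictive model has assigned each member a risk score, selecting the top 2 percent is a simple ranking step. The Python sketch below shows that step only. The member IDs and scores are made up for illustration, and nothing here reflects how Humana's Falls PM actually computes or uses its scores.

```python
import math


def top_risk_members(scores, fraction=0.02):
    """Return the member IDs in the highest-scoring `fraction` of the population."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical risk scores in [0, 1) for 200 members, keyed by member ID.
scores = {f"member_{i:03d}": (i * 37 % 200) / 200 for i in range(200)}

selected = top_risk_members(scores)
print(selected)
```

With 200 members and a 2 percent cutoff, four members are selected for outreach. Changing `fraction` lets an analyst weigh the cost of an intervention program against how many at-risk members it reaches.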

This work was recognized with the Analytics Leadership Award from the Indiana University Kelley School of Business in 2015 for innovative adoption of analytics in a business environment.

Contributors: Harpreet Singh, PhD; Vipin Gopal, PhD; Philip Painter, MD.

Humana Example 2: Humana’s Bold Goal—Application of Analytics to Define the Right Metrics

In 2014, Humana, Inc. announced its organization’s Bold Goal to improve the health of the communities it serves by 20 percent by 2020 by making it easy for people to achieve their best health. The communities that Humana serves can be defined in many ways, including geographically (state, city, neighborhood), by product (Medicare Advantage, employer-based plans, individually purchased), or by clinical profile (priority conditions including diabetes, hypertension, congestive heart failure [CHF], coronary artery disease [CAD], chronic obstructive pulmonary disease [COPD], or depression). Understanding the health of these communities and how they track over time is critical not only for the evaluation of the goal, but also in crafting strategies to improve the health of the entire membership.

A challenge before the analytics organization was to identify a metric that captures the essence of the Bold Goal. Objectively measured traditional health insurance metrics such as hospital admissions or emergency room visits per 1,000 persons would not capture the spirit of this new mission. The goal was to identify a metric that captures health and its improvement in a community and was relevant to Humana as a business. Through rigorous analytic evaluations, Humana eventually selected “Healthy Days,” a four-question, quality-of-life questionnaire originally developed by the CDC, to track and measure Humana’s overall progress toward the Bold Goal.

It was critical to make sure that the selected metric was highly correlated to health and business metrics so that any improvement in Healthy Days resulted in improved health and better business results. Some examples of how “Healthy Days” is correlated to metrics of interest include the following:

• Individuals with more unhealthy days (UHDs) exhibit higher utilization and cost patterns. For a five-day increase in UHDs, there are (1) an $82 increase in average monthly medical and pharmacy costs, (2) an increase of 52 inpatient admits per 1,000 patients, and (3) a 0.28-day increase in average length of stay (Havens, Peña, Slabaugh, Cordier, Renda, & Gopal, 2015).1

• Individuals who exhibit healthy behaviors and have their chronic conditions well managed have fewer UHDs. For example, when we look at individuals with diabetes, UHDs are lower if they obtained an LDL screening (-4.3 UHDs) or a diabetic eye exam (-2.3 UHDs). Likewise, if they have controlled blood sugar levels measured by HbA1C (-1.8 UHDs) or LDL levels (-1.3 UHDs) (Havens, Slabaugh, Peña, Haugh, & Gopal 2015).2

• Individuals with chronic conditions have more UHDs than those who do not have (1) CHF (16.9 UHDs), (2) CAD (14.4 UHDs), (3) hypertension (13.3 UHDs), (4) diabetes (14.7 UHDs), (5) COPD (17.4 UHDs), or (6) depression (22.4 UHDs) (Havens, Peña, Slabaugh et al., 2015; Chiguluri, Guthikonda, Slabaugh, Havens, Peña, & Cordier, 2015; Cordier et al., 2015).

Humana has since adopted Healthy Days as its metric for measuring progress toward the Bold Goal (Humana, http://populationhealth.humana.com/wp-content/uploads/2016/05/BoldGoal2016ProgressReport_1.pdf).

Contributors: Tristan Cordier, MPH; Gil Haugh, MS; Jonathan Peña, MS; Eriv Havens, MS; Vipin Gopal, PhD.

Humana Example 3: Predictive Models to Identify the Highest Risk Membership in a Health Insurer

The 80/20 rule generally applies in healthcare; that is, roughly 20 percent of consumers account for 80 percent of healthcare resources due to their deteriorating health and chronic conditions. Health insurers like Humana have typically enrolled the highest-risk enrollees in clinical and disease management programs to help manage the chronic conditions the members have.

Identification of the correct members is critical for this exercise, and in recent years, PMs have been developed to identify enrollees with high future risk. Many of these PMs were developed with heavy reliance on medical claims data, which result from the medical services that the enrollees use. Because of the lag that exists in submitting and processing claims data, there is a corresponding lag in identification of high-risk members for clinical program enrollment. This issue is especially relevant when new members join a health insurer, as they would not have a claims history with that insurer. A claims-based PM could take an average of 9–12 months after enrollment of new members to identify them for referral to clinical programs.

In the early part of this decade, Humana attracted large numbers of new members in its Medicare Advantage products and needed a better way to clinically manage this membership. As such, it became extremely important that a different analytic approach be developed to rapidly and accurately identify high-risk new members for clinical management, to keep this group healthy and costs down.

Humana’s Clinical Analytics team developed the New Member Predictive Model (NMPM) that would quickly identify at-risk individuals soon after their new plan enrollments with Humana rather than waiting for sufficient claim history to become available for compiling clinical profiles and predicting future health risk. Designed to address the unique challenges associated with new members, NMPM developed a novel approach that leveraged and integrated broader data sets beyond medical claims data such as self-reported health risk assessment data and early indicators from pharmacy data, employed advanced data mining techniques for pattern discovery, and scored every Medicare Advantage (MA, a specific insurance plan) consumer daily based on the most recent data Humana has to date. The model was deployed with a cross-functional team of analytics, IT, and operations to ensure seamless operational and business integration.

Since NMPM was implemented in January 2013, it has been rapidly identifying high-risk new members for enrollment in Humana’s clinical programs. The positive outcomes achieved through this model have been highlighted in multiple senior leader communications from Humana. In the first quarter 2013 earnings release presentation to investors, Bruce Broussard, CEO of Humana, stated the significance of “improvement in new member PMs and clinical assessment processes,” which resulted in 31,000 new members enrolled in clinical programs, compared to 4,000 in the same period a year earlier, a 675 percent increase. In addition to the increased volume of clinical program enrollments, outcome studies showed that the newly enrolled consumers identified by NMPM were also referred to clinical programs sooner, with over 50 percent of the referrals identified within the first three months after new MA plan enrollments. The consumers identified also participated at a higher rate and had longer tenure in the programs.
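The 675 percent figure can be checked with a short calculation. The sketch below is only an illustration of the arithmetic; the function name `percent_increase` is ours and is not part of any Humana system.

```python
def percent_increase(before, after):
    """Return the percentage increase from `before` to `after`."""
    if before == 0:
        raise ValueError("percentage increase is undefined when the baseline is zero")
    return (after - before) / before * 100


# Enrollment figures quoted in the text: 4,000 a year earlier versus 31,000.
print(percent_increase(4_000, 31_000))  # 675.0
```

Note that a 675 percent increase means enrollments were 7.75 times the earlier level, not 6.75 times.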

Contributors: Sandy Chiu, MS; Vipin Gopal, PhD.

These examples illustrate how an organization explores and implements analytics applications to meet its strategic goals. You will see several other examples of healthcare applications throughout various chapters in the book.

ANALYTICS IN THE RETAIL VALUE CHAIN The retail sector is where you would perhaps see the most applications of analytics. This is the domain where the volumes are large but the margins are usually thin. Customers’ tastes and preferences change frequently. Physical and online stores face many challenges to succeed. And market dominance at one time does not guarantee continued success. So investing in learning about your suppliers, customers, employees, and all the stakeholders that enable a retail value chain to succeed and using that information to make better decisions has been a goal of the analytics industry for a long time. Even casual readers of analytics probably know about Amazon’s enormous investments in analytics to power their value chain. Similarly, Walmart, Target, and other major retailers have invested millions of dollars in analytics for their supply chains. Most of the analytics technology and service providers have a major presence in retail analytics. Coverage of even a small portion of those applications to achieve our exposure goal could fill a whole book. So this section highlights just a few potential applications. Most of these have been fielded by many retailers and are available through many technology providers, so in this section, we will take a more general view rather than point to specific cases. This general view has been proposed by Abhishek Rathi, CEO of vCreaTek.com. vCreaTek, LLC is a boutique analytics software and service company that has offices in India, the United States, the United Arab Emirates (UAE), and Belgium. The company develops applications in multiple domains, but retail analytics is one of its key focus areas.


Figure 1.15 highlights selected components of a retail value chain. It starts with suppliers and concludes with customers but illustrates many intermediate strategic and operational planning decision points where analytics (descriptive, predictive, or prescriptive) can play a role in making better data-driven decisions. Table 1.1 also illustrates some of the important areas of analytics applications, examples of key questions that can be answered through analytics, and of course, the potential business value derived from fielding such analytics. Some examples are discussed next.

An online retail site usually knows its customer as soon as the customer signs in, and thus it can offer customized pages/offerings to enhance the experience. For any retail store, knowing its customer at the store entrance is still a huge challenge. By combining video analytics with the information/badge issued through its loyalty program, the store may be able to identify the customer at the entrance itself and thus create an extra opportunity for cross-selling or up-selling. Moreover, a personalized shopping experience can be provided with more customized engagement during the customer’s time in the store.

Store retailers invest lots of money in attractive window displays, promotional events, customized graphics, store decorations, printed ads, and banners. To discern the effectiveness of these marketing methods, the team can use shopper analytics by observing closed-circuit television (CCTV) images to figure out the demographic details of the in-store foot traffic. The CCTV images can be analyzed using advanced algorithms to derive demographic details such as age, gender, and mood of the person browsing through the store.

Further, the customer’s in-store movement data, when combined with the shelf layout and planogram, can give the store manager more insight into the hot-selling/profitable areas within the store. Moreover, the store manager can also use this information to plan the workforce allocation for those areas for peak periods.

FIGURE 1.15 Example of Analytics Applications in a Retail Value Chain. Source: Contributed by Abhishek Rathi, CEO, vCreaTek.com.

[Figure: Retail Value Chain, showing critical needs at every touch point. Stages shown: Vendors, Planning, Merchandizing, Buying, Warehouse & Logistics, Multichannel Operations, Customers. Groups of needs listed in the figure:]

• Shelf-space optimization; location analysis; shelf and floor planning; promotions and markdown optimization

• Supply chain management; inventory cost optimization; inventory shortage and excess management; less unwanted costs

• Targeted promotions; customized inventory; promotions and price optimization; customized shopping experience

• On-time product availability at low costs; order fulfillment and clubbing; reduced transportation costs

• Trend analysis; category management; predicting trigger events for sales; better forecasts of demand

• Deliver seamless customer experience; understand relative performance of channels; optimize marketing strategies

• Building retention and satisfaction; understanding the needs of the customer better; serving high LTV customers better


TABLE 1.1 Examples of Analytics Applications in the Retail Value Chain

Analytic Application: Inventory Optimization
Business Questions:
1. Which products have high demand?
2. Which products are slow moving or becoming obsolete?
Business Value:
1. Forecast the consumption of fast-moving products and order them with sufficient inventory to avoid a stock-out scenario.
2. Perform fast inventory turnover of slow-moving products by combining them with one in high demand.

Analytic Application: Price Elasticity
Business Questions:
1. How much net margin do I have on the product?
2. How much discount can I give on this product?
Business Value:
1. Markdown prices for each product can be optimized to reduce the margin dollar loss.
2. Optimized price for the bundle of products is identified to save the margin dollar.

Analytic Application: Market-Basket Analysis
Business Questions:
1. What products should I combine to create a bundle offer?
2. Should I combine products based on slow-moving and fast-moving characteristics?
3. Should I create a bundle from the same category or a different category line?
Business Value:
1. The affinity analysis identifies the hidden correlations between the products, which can help in the following ways:
   a. Strategize the product bundle offering based on a focus on inventory or margin.
   b. Increase cross-selling or up-selling by creating bundles from different categories or the same categories, respectively.

Analytic Application: Shopper Insight
Business Questions:
1. Which customer is buying what product at what location?
Business Value:
1. By customer segmentation, the business owner can create personalized offers resulting in better customer experience and retention of the customer.

Analytic Application: Customer Churn Analysis
Business Questions:
1. Who are the customers who will not return?
2. How much business will I lose?
3. How can I retain the customers?
4. What demography of customer is my loyal customer?
Business Value:
1. Businesses can identify the customer and product relationships that are not working and show high churn. Thus, they can have better focus on product quality and the reason for that churn.
2. Based on the customer lifetime value (LTV), the business can do targeted marketing resulting in retention of the customer.

Analytic Application: Channel Analysis
Business Questions:
1. Which channel has lower customer acquisition cost?
2. Which channel has better customer retention?
3. Which channel is more profitable?
Business Value:
1. Marketing budget can be optimized based on insight for better return on investment.

Analytic Application: New Store Analysis
Business Questions:
1. What location should I open?
2. What and how much opening inventory should I keep?
Business Value:
1. Best practices of other locations and channels can be used to get a jump-start.
2. Comparison with competitor data can help to create a differentiator to attract new customers.

Analytic Application: Store Layout
Business Questions:
1. How should I do store layout for better topline?
2. How can I increase my in-store customer experience?
Business Value:
1. Understand the association of products to decide store layout and better alignment with customer needs.
2. Workforce deployment can be planned for better customer interactivity and thus a satisfying customer experience.

Analytic Application: Video Analytics
Business Questions:
1. What demography is entering the store during the peak period of sales?
2. How can I identify a customer with high LTV at the store entrance so that a better personalized experience can be provided to this customer?
Business Value:
1. In-store promotions and events can be planned based on the demography of incoming traffic.
2. Targeted customer engagement and instant discounts enhance the customer experience, resulting in higher retention.


Market-basket analysis has commonly been used by category managers to push the sale of slow-moving stock keeping units (SKUs). By using advanced analytics on the available data, product affinity can be identified at the lowest level of SKU to drive better returns on investment (ROIs) on bundle offers. Moreover, by using price elasticity techniques, the markdown or optimum price of the bundle offer can also be deduced, thus reducing any loss in the profit margin.
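To make the idea of product affinity concrete, the following minimal sketch computes three commonly used affinity measures (support, confidence, and lift) for every pair of items that appear together in a basket. The transactions and item names are invented for illustration, and the function name `pair_affinity` is ours; a production system would typically rely on a dedicated association-rule mining library and far larger data.

```python
from collections import Counter
from itertools import combinations


def pair_affinity(transactions):
    """Compute support, confidence, and lift for each co-occurring item pair.

    `transactions` is a list of baskets, each basket a list of item names.
    Pairs are keyed as (a, b) with a < b alphabetically.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicate items in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    results = {}
    for (a, b), together in pair_counts.items():
        support = together / n               # share of baskets containing both
        confidence = together / item_counts[a]  # P(b in basket | a in basket)
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        results[(a, b)] = {
            "support": support,
            "confidence_a_to_b": confidence,
            "lift": lift,
        }
    return results


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["milk", "jam"],
]

affinity = pair_affinity(baskets)
for pair, stats in sorted(affinity.items(), key=lambda kv: -kv[1]["lift"]):
    print(pair, stats)
```

A lift above 1 suggests that two items are bought together more often than their individual popularity would predict, which is the kind of signal a category manager might use when pairing a slow-moving SKU with a fast-moving one.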

Thus, by using data analytics, a retailer can not only get information on its current operations but can also get further insight to increase the revenue and decrease the operational cost for higher profit. A fairly comprehensive list of current and potential retail analytics applications that a major retailer such as Amazon could use is proposed by a blogger at Data Science Central. That list is available at www.datasciencecentral.com/profiles/blogs/20-data-science-systems-used-by-amazon-to-operate-its-business. As noted earlier, there are too many examples of these opportunities to list here, but you will see many examples of such applications throughout the book.

IMAGE ANALYTICS As seen in this section, analytics techniques are being applied to many diverse industries and data. An area of particular growth has been analysis of visual images. Advances in image capturing through high-resolution cameras, storage capabilities, and deep learning algorithms have enabled very interesting analyses. Satellite data have often proven their utility in many different fields. The benefits of satellite data at high resolution and in different forms of imagery including multi-spectral are significant to scientists who need to regularly monitor global change, land usage, and weather. In fact, by combining the satellite imagery and other data including information on social media, government filings, and so on, one can surmise business planning activities, traffic patterns, and changes in parking lots or open spaces. Companies, government agencies, and nongovernmental organizations (NGOs) have invested in satellites to try to image the whole globe every day so that daily changes can be tracked at any location and the information can be used for forecasting. In the last few months, many interesting examples of more reliable and advanced forecasts have been reported. Indeed, this activity is being led by different industries across the globe, and has added a term to Big Data called Alternative Data. Here are a few examples from Tartar et al. (2018). We will see more in Chapter 9 when we study Big Data.

• World Bank researchers used satellite data to propose strategic recommendations for urban planners and officials from developing nations. This analysis arose due to the recent natural disaster where at least 400 people died in Freetown, Sierra Leone. Researchers clearly demonstrated that Freetown and some other developing cities lacked systematic planning of their infrastructure that resulted in the loss of life. The bank researchers are using satellite imagery now to make critical decisions regarding risk-prone urban areas.

• EarthCast provides accurate weather updates for a large commercial U.S. airline based on the data it pulls from a constellation of 60 government-operated satellites combined with ground and aircraft-based sensors, tracking almost anything from lightning to turbulence. It has even developed the capability to map out conditions along a flight path and provides customized forecasts for everything from hot air balloons to drones.

• Imazon started using satellite data to develop a near-real-time picture of Amazon deforestation. It uses advanced optical and infrared imagery that has led to identifying illegal sawmills. Imazon is now focused more on getting data to local governments through its “green municipalities” program that trains officials to identify and curb deforestation.


• The Indonesian government teamed up with international nonprofit Global Fishing Watch, which processes satellite-extracted information on ship movement to spot where and when vessels are fishing illegally (Emmert, 2018). This initiative delivered instant results: Government revenue from fishing went up by 129 percent in 2017 compared to 2014. It is expected that by the next decade, the organization will track vessels that are responsible for 75 percent of the world’s catch.

These examples illustrate just a sample of ways that satellite data can be combined with analytics to generate new insights. In anticipation of the coming era of abundant earth observations from satellites, scientists and communities must put some thought into recognizing key applications and key scientific issues for the betterment of society. Although such concerns will eventually be resolved by policymakers, what is clear is that new and interesting ways of combining satellite data and many other data sources are spawning a new crop of analytics companies.

Such image analysis is not limited to satellite images. Cameras mounted on drones, traffic lights, and every conceivable pole in buildings and streets provide the ability to capture images from just a few feet high. Analysis of these images coupled with facial recognition technologies is enabling all kinds of new applications, from customer recognition to governments’ ability to track all subjects of interest. See Yue (2017) as an example. Applications of this type are leading to much discussion on privacy issues. In Application Case 1.7, we learn about a more benevolent application of image analytics where the images are captured by a phone and a mobile application provides immediate value to the user of the app.

Analytics/data science initiatives are quickly embracing and even merging with new developments in artificial intelligence. The next section provides an overview of artificial intelligence followed by a brief discussion of convergence of the two.

SECTION 1.6 REVIEW QUESTIONS

1. What are three factors that might be part of a PM for season ticket renewals?

2. What are two techniques that football teams can use to do opponent analysis?

3. What other analytics uses can you envision in sports?

4. Why would a health insurance company invest in analytics beyond fraud detection? Why is it in its best interest to predict the likelihood of falls by patients?

5. What other applications similar to prediction of falls can you envision?

6. How would you convince a new health insurance customer to adopt healthier lifestyles (Humana Example 3)?

7. Identify at least three other opportunities for applying analytics in the retail value chain beyond those covered in this section.

8. Which retail stores that you know of employ some of the analytics applications identified in this section?

9. What is a common thread in the examples discussed in image analytics?

10. Can you think of other applications using satellite data along the lines presented in this section?

Application Case 1.7 Image Analysis Helps Estimate Plant Cover

Estimating how much ground is covered by green vegetation is important in analysis of a forest or even a farm. In the case of a forest, such analysis helps users understand how the forest is evolving, its impact on surrounding areas, and even climate. For a farm, similar analysis can help understand likely plant growth and help estimate future crop yields. It is obviously impossible to measure all forest cover manually, and doing so is challenging even for a farm. The common method is to record images of a forest/farm and then analyze these images to estimate the ground cover. Such analysis is expensive to perform visually and is also error prone. Different experts looking at the ground cover might estimate the percentage of ground covered differently. Thus, automated methods to analyze these images and estimate the percentage of ground covered by vegetation are being developed. One such method and an app to make it practical through a mobile phone have been developed at Oklahoma State University by researchers in the Department of Plant and Soil Sciences in partnership with the university’s App Center and the Information Technology group within the Division of Agricultural Sciences and Natural Resources.

Canopeo is a free desktop or mobile app that estimates green canopy cover in near real-time from images taken with a smartphone or digital camera. In experiments in corn, wheat, canola, and other crops, Canopeo calculated the percentage of canopy cover dozens to thousands of times faster than existing software without sacrificing accuracy. And unlike other programs, the app can acquire and analyze video images, says Oklahoma State University (OSU) soil physicist Tyson Ochsner, a feature that should reduce the sampling error associated with canopy cover estimates. “We know that plant cover, plant canopies, can be quite variable in space,” says Ochsner, who led the app’s development with former doctoral student Andres Patrignani, now a faculty member at Kansas State University. “With Canopeo, you can just turn on your [video] device, start walking across a portion of a field, and get results for every frame of video that you’re recording.” By using a smartphone or tablet’s digital camera, Canopeo users in the field can take photos or videos of green plants, including crops, forages, and turf, and import them to the app, which analyzes each image pixel and classifies it based on its red-green-blue (RGB) color values. Canopeo analyzes pixels based on the ratios of red to green and blue to green as well as an excess green index. The result is an image where color pixels are converted into black and white, with white pixels corresponding to green canopy and black pixels representing the background. Comparison tests showed that Canopeo analyzes images more quickly and just as accurately as two other available software packages.
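The pixel-level logic described in this case can be sketched in a few lines. The code below is not Canopeo's source code; it is a minimal illustration that assumes a pixel counts as green canopy when its red-to-green and blue-to-green ratios fall below a cutoff and its excess green index (2G - R - B) exceeds a minimum. The threshold values, function names, and sample pixels are assumptions chosen for illustration.

```python
def is_green_canopy(r, g, b, ratio_max=0.95, excess_green_min=20):
    """Classify one RGB pixel (0-255 channels) as canopy (True) or background.

    The thresholds are illustrative assumptions, not values taken from the text.
    """
    if g == 0:
        return False  # no green signal at all; also avoids division by zero
    red_to_green = r / g
    blue_to_green = b / g
    excess_green = 2 * g - r - b
    return (
        red_to_green < ratio_max
        and blue_to_green < ratio_max
        and excess_green > excess_green_min
    )


def to_binary_mask(image):
    """Convert rows of (r, g, b) pixels into rows of 1 (canopy) / 0 (background)."""
    return [[1 if is_green_canopy(*pixel) else 0 for pixel in row] for row in image]


def fractional_green_cover(image):
    """Return the share of pixels classified as green canopy."""
    mask = to_binary_mask(image)
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


# A tiny hypothetical 2 x 2 image: leaf, soil, gray stone, leaf.
sample = [
    [(40, 160, 30), (120, 100, 80)],
    [(200, 200, 200), (60, 140, 70)],
]
print(to_binary_mask(sample))          # [[1, 0], [0, 1]]
print(fractional_green_cover(sample))  # 0.5
```

The binary mask corresponds to the black-and-white image mentioned above, and averaging it gives the fraction of ground covered. Applying the same function to each frame of a video would give one estimate per frame.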

Developers of Canopeo expect the app to help producers judge when to remove grazing cattle from winter wheat in “dual-purpose” systems where wheat is also harvested for grain. Research by others at OSU found that taking cattle off fields when at least 60 percent green canopy cover remained ensured a good grain yield. “So, Canopeo would be useful for that decision,” Patrignani says. He and Ochsner also think the app could find use in turfgrass management; in assessments of crop damage from weather or herbicide drift; as a surrogate for the Normalized Difference Vegetation Index (NDVI) in fertilizer recommendations; and even in UAV-based photos of forests or aquatic systems.

Analysis of images is a growing application area for deep learning as well as many other AI techniques. Chapter 9 includes several examples of image analysis that have spawned another term: alternative data. Applications of alternative data are emerging in many fields. Chapter 6 also highlights some applications. Imagining innovative applications by being exposed to others’ ideas is one of the main goals of this book!

Questions for Discussion

1. What is the purpose of knowing how much ground is covered by green foliage on a farm? In a forest?

2. Why would image analysis of foliage through an app be better than a visual check?

3. Explore research papers to understand the underlying algorithmic logic of image analysis. What did you learn?

4. What other applications of image analysis can you think of?

Source: Compiled from A. Patrignani and T. E. Ochsner. (2015). “Canopeo: A Powerful New Tool for Measuring Fractional Green Canopy Cover.” Agronomy Journal, 107(6), pp. 2312–2320; R. Lollato, A. Patrignani, T. E. Ochsner, A. Rocatelli, P. Tomlinson, & J. T. Edwards. (2015). Improving Grazing Management Using a Smartphone App. www.bookstore.ksre.ksu.edu/pubs/MF3304.pdf (accessed October 2018); http://canopeoapp.com/ (accessed October 2018); Oklahoma State University press releases.


1.7 ARTIFICIAL INTELLIGENCE OVERVIEW

On September 1, 2017, the first day of the school year in Russia, Vladimir Putin, the Russian President, lectured to over 1,000,000 school children in what is called in Russia the National Open Lesson Day. The televised speech was titled “Russia Focused on the Future.” In this presentation, the viewers saw what Russian scientists are achieving in several fields. But what everyone remembers from this presentation is one sentence: “The country that takes the lead in the sphere of computer-based artificial intelligence will become the ruler of the world.”

Putin is not the only one who knows the value of AI. Governments and corporations are spending billions of dollars in a race to become a leader in AI. For example, in July 2017, China unveiled a plan to create an AI industry worth $150 billion to the Chinese economy by 2030 (Metz, 2018). China’s Baidu Company today employs over 5,000 AI engineers. The Chinese government facilitates research and applications as a national top priority. The accounting firm PricewaterhouseCoopers calculated that AI will add $15.7 trillion to the global economy by 2030 (about 14 percent; see Liberto, 2017). Thus, it is no wonder that AI was clearly the most talked about technology topic in 2018.

What Is Artificial Intelligence?

There are several definitions of AI (Chapter 2). The reason is that AI is based on theories from several scientific fields, and it encompasses a wide collection of technologies and applications. So, it may be beneficial to look at some of the characteristics of AI in order to understand what it is. The major goal of AI is to create intelligent machines that can do tasks currently done by people. Ideally, these tasks include reasoning, thinking, learning, and problem solving. AI studies human thought processes to understand what intelligence is, so that AI scientists can duplicate those processes in machines. eMarketer (2017) provides a comprehensive report, describing AI as

• Technology that can learn to do things better over time.

• Technology that can understand human language.

• Technology that can answer questions.

The Major Benefits of AI

Since AI appears in many shapes, it has many benefits. They are listed in Chapter 2. The major benefits are as follows:

• Significant reduction in the cost of performing work. This reduction continues over time while the cost of doing the same work manually increases with time.

• Work can be performed much faster.

• Work is consistent in general, more consistent than human work.

• Increased productivity and profitability as well as a competitive advantage are the major drivers of AI.

The Landscape of AI

There are many parts in the landscape (or ecosystem) of AI. We decided to organize them into five groups as illustrated in Figure 1.16. Four of the groups constitute the basis for the fifth one, which is the AI applications. The groups are as follows:

MAJOR TECHNOLOGIES Here we elected to include machine learning (Chapter 5), deep learning (Chapter 6), and intelligent agents (Chapter 2).


KNOWLEDGE-BASED TECHNOLOGIES (all covered in Chapter 12) Topics covered are expert systems, recommendation engines, chatbots, virtual personal assistants, and robo-advisors.

BIOMETRIC-RELATED TECHNOLOGIES This includes natural language processing (understanding and generation), machine vision and scene and image recognition, and voice and other biometric recognition (Chapter 6).

SUPPORT THEORIES, TOOLS, AND PLATFORMS Academic disciplines include computer science, cognitive science, control theory, linguistics, mathematics, neuroscience, philosophy, psychology, and statistics.

Devices and methods include sensors, augmented reality, context awareness, logic, gestural computing, collaborative filtering, content recognition, neural networks, data mining, humanoid theories, case-based reasoning, predictive application programming interfaces (APIs), knowledge management, fuzzy logic, genetic algorithms, bin data, and much more.

TOOLS AND PLATFORMS These are available from IBM, Microsoft, Nvidia, and several hundred vendors specializing in the various aspects of AI.

AI APPLICATIONS There are several hundred or maybe thousands of them. We provide here only a sample:

Smart cities, smart homes, autonomous vehicles (Chapter 13), automatic decisions (Chapter 2), language translation, robotics (Chapter 10), fraud detection, security protection, content screening, prediction, personalized services, and more. Applications are in all business areas (Chapter 2), and in almost any other area ranging from medicine and healthcare to transportation and education.

Note: Lists of all these are available at Faggela (2018) and Jacquet (2017). Also see Wikipedia, “Outline of artificial intelligence,” and a list of “AI projects” (several hundred items).

In Application Case 1.8, we describe how several of these technologies are combined in improving security and in expediting the processing of passengers in airports.

FIGURE 1.16 The Landscape (Ecosystem) of AI. Source: Drawn by E. Turban.

[Figure: four groups (Major AI Technologies; Knowledge-Based Technologies; Biometric-Based Technologies; Support Theories, Tools, Platforms, Mechanisms) shown together with AI Applications.]


NARROW (WEAK) VERSUS GENERAL (STRONG) AI The AI field can be divided into two major categories of applications: narrow (or weak) and general (or strong).

A Narrow AI Field Focuses on One Narrow Field (Domain). Well-known examples of this are SIRI and Alexa (Chapter 12) that, at least in their early years of life, operated in limited, predefined areas. As time has passed, they have become more general, acquiring additional knowledge. Most expert systems (Chapter 12) were operating in fairly narrow domains. If you notice, when you converse with an automated call center, the computer

We may not like the security lines at airports or the idea that terrorists may board our plane or enter our country. Several AI devices are designed to minimize these possibilities.

1. Facial recognition at airports. Jet Blue is experimenting with facial-recognition technology (a kind of machine vision) to match travelers’ faces against prestored photos, such as passport or driver’s license photos. This will eliminate the need for boarding passes and increase security. The match is of high quality. The technology pioneered by British Airways is used by Delta, KLM, and other airlines, which use similar technologies for self-checking of bags. Similar technology is used by the U.S. Immigration and Customs Enforcement agency, where people’s photos taken at arrivals are matched against the database of photos and other documents.

2. China’s system. The major airports in China are using a system similar to that of Jet Blue, us- ing facial recognition for verifying the identity of passengers. The idea is to eliminate board- ing passes and expedite the flow of boarding. The system is also used to recognize airport employees entering restricted areas.

3. Using bots. Several airports (e.g., New York, Beijing) offer conversational bots (Chapter 12) to provide travelers with airport guidance. Bots provide also information about customs and immigration services.

4. Spotting liars at airport. This application is emerging to help immigration services to vet passengers at airports and land entry borders. With increased security, both immigration and airline personal may need to query passengers. Here is the solution that can be economically used to query all passengers at high speed, so

there will be short waiting lines. This emerging system is called Automated Virtual Agent for Truth Assessments in Real Time (AVATAR). The essentials of the system are as follows:

a. AVATAR is a bot in which you first scan your passport.

b. AVATAR asks you a few questions. Several AI technologies are used in this project, such as AI, Big Data analytics, the “Cloud,” robotics, machine learning, machine vision, and bots.

c. You answer the questions. d. AVATAR’s sensors and other AI technolo-

gies collect data from your body, such as voice variability, facial expression (e.g., muscle engagement), eyes’ position and movements, mouth movements, and body posture. Researchers feel that it takes less effort to tell the truth than to die, so re- searchers compared the answers to routine questions.

The machine then will flag suspects for fur- ther investigation. The machine is already in use by immigration agents in several countries.

Sources: Condensed from Thibodeaux, W. (2017, June 29). “This Artificial Intelligence Kiosk Is Designed to Spot Liars at Airports.” Inc.com.; Silk, R. (2017, November). “Biometrics: Facial Recognition Tech Coming to an Airport Near You.” Travel Weekly, 21.

Questions for Case 1.8

1. List the benefits of AI devices to travelers.

2. List the benefits to governments and airline companies.

3. Relate this case to machine vision and other AI tools that deal with people’s biometrics.

Application Case 1.8 AI Increases Passengers’ Comfort and Security in Airports and Borders

http://Inc.com

Chapter 1 • Overview of Business Intelligence, Analytics, Data Science, and Artificial Intelligence 55

(which is usually based on some AI technology) is not too intelligent. But, it is get- ting “smarter” with time. Speech recognition allows computers to convert sound to text with great accuracy. Similarly, computer vision is improving, recognizes objects, classifies them, and even understands their movements. In sum, there are millions of narrow AI ap- plications, and the technology is improving every day. However, AI is not strong enough yet because it does not exhibit the true capabilities of human intelligence (Chapter 2).

GENERAL (STRONG) AI To exhibit real intelligence, machines need to perform the full range of human cognitive capabilities. Computers can have some cognitive capabilities (e.g., some reasoning and problem solving) as will be shown in Chapter 6 on cognitive computing.

The difference between the two classes of AI is getting smaller as AI is getting smarter. Ideally, strong AI will be able to replicate humans. But true intelligence is happening only in narrow domains, such as game playing, medical diagnosis, and equipment failure diagnosis.

Some feel that we never will be able to build a truly strong AI machine. Others think differently; see the debate in Section 14.9. The following is an example of a strong AI bot in a narrow domain.

Example 3: AI Makes Coke Vending Machine Smarter

If you live in Australia or New Zealand and you are near a Coca-Cola vending machine, you can order a can or a bottle of the soft drink using your smartphone. The machines are cloud connected, which means you can order the Coke from any place in the world, not only for yourself but also for any friend who is near a vending machine in Australia or New Zealand. See Olshansky (2017).

In addition, the company can adjust pricing remotely, offer promotions, and collect inventory data so that restocking can be scheduled. Converting an existing machine to an AI-enabled one takes about 1 hour.

Wait a minute, what if something goes wrong? No problem, you can chat with Coca-Cola’s bot via Facebook Messenger (Chapter 12).

The Three Flavors of AI Decisions

Staff (2017) divided the capabilities of AI systems into three levels: assisted, autonomous, and augmented.

ASSISTED INTELLIGENCE This is equivalent mostly to the weak AI, which works only in narrow domains. It requires clearly defined inputs and outputs. Examples are some monitoring systems and low-level virtual personal assistants (Chapter 12). Our cars are full of such monitoring systems that give us alerts. Similarly, there are many healthcare applications (monitoring, diagnosis).

AUTONOMOUS AI These systems are in the realm of strong AI but in a very narrow domain. Eventually, the computer will take over. Machines will act as experts and have absolute decision-making power. Pure robo-advisors (Chapter 12) are examples of such machines. Autonomous vehicles and robots that can fix themselves are also good examples.

AUGMENTED INTELLIGENCE Most of the existing AI applications are between assisted and autonomous and are referred to as augmented intelligence (or intelligence augmentation). The technology focuses on augmenting computer abilities to extend human cognitive abilities (see Chapter 6 on cognitive computing), resulting in high performance as described in Technology Insights 1.1.


TECHNOLOGY INSIGHTS 1.1 Augmented Intelligence

The idea of combining the performance of people and machines is not new. Here we combine (augment) human capabilities with powerful machine intelligence. That is, the aim is not to replace people, which is what autonomous AI does, but to extend human cognitive abilities. The result is the ability to solve complex human problems, as in the opening vignette to this chapter, where computers enabled people to solve problems that were unsolved before. Padmanabhan (2018) identifies the following differences between traditional and augmented AI:

1. Augmented machines extend rather than replace human decision making, and they facilitate creativity.

2. Augmentation excels in solving complex human and industry problems in specific domains in contrast with strong, general AI.

3. In contrast with a “black box” model of some AI and analytics, augmented intelligence provides insights and recommendations, including explanations.

4. In addition, augmentation technology can offer new solutions by combining existing and discovered information in contrast with assisted AI, which identifies problems or symptoms and suggests predetermined solutions.

Padmanabhan (2018) and many others believe that at the moment, augmented AI is the best option to move toward the transformation of the AI world.

In contrast with autonomous AI, which describes machines with a wide range of cognitive abilities (e.g., driverless vehicles), augmented intelligence has only a few cognitive abilities.

Examples of Augmented Intelligence Staff (2017) provides the following examples:

• Cybercrime fighting. AI can identify forthcoming attacks and suggest solutions.

• e-Commerce decisions. Marketing tools make testing 100 times faster and adapt the layout and response functions of a Web site to users. The machines make recommendations and the marketers can accept or reject them.

• High-frequency stock market trading. This is done either completely autonomously or in some cases with control and calibration by humans.
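The difference among the assisted, augmented, and autonomous levels can be illustrated with a short Python sketch built on the e-commerce example above. This is only an illustration under stated assumptions: the names `Recommendation`, `recommend_layout`, and `decide`, as well as the click rates, are hypothetical and do not come from Staff (2017) or any real marketing tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recommendation:
    action: str       # what the machine suggests (hypothetical layout name)
    explanation: str  # why it suggests it


def recommend_layout(click_rate_a: float, click_rate_b: float) -> Recommendation:
    """Toy 'machine' step: suggest the Web page layout with the higher click rate."""
    if click_rate_a >= click_rate_b:
        return Recommendation(
            "layout_A", f"A has a click rate of {click_rate_a:.1%} vs. {click_rate_b:.1%}")
    return Recommendation(
        "layout_B", f"B has a click rate of {click_rate_b:.1%} vs. {click_rate_a:.1%}")


def decide(mode: str, rec: Recommendation,
           human_review: Optional[Callable[[Recommendation], bool]] = None) -> Optional[str]:
    """Return the action that is actually carried out, or None if none is taken."""
    if mode == "assisted":
        # The system only informs; any action is left entirely to the person.
        print(f"ALERT: {rec.explanation}")
        return None
    if mode == "augmented":
        # The machine recommends and explains; a person accepts or rejects.
        if human_review is None:
            raise ValueError("augmented mode needs a human reviewer")
        return rec.action if human_review(rec) else None
    if mode == "autonomous":
        # The machine acts on its own recommendation with no human in the loop.
        return rec.action
    raise ValueError(f"unknown mode: {mode}")


rec = recommend_layout(0.031, 0.046)
accepted = decide("augmented", rec, human_review=lambda r: True)
rejected = decide("augmented", rec, human_review=lambda r: False)
automatic = decide("autonomous", rec)
```

In the augmented case the marketer's accept/reject function determines the outcome, whereas in the autonomous case the same recommendation is executed without review.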

DISCUSSION QUESTIONS

1. What is the basic premise of augmented intelligence?

2. List the major differences between augmented intelligence and traditional AI.

3. What are some benefits of augmented intelligence?

4. How does this technology relate to cognitive computing?

Societal Impacts

Much of the talk about AI centers on productivity, speed, and cost reduction. At a national conference hosted by Gartner, the well-known IT consulting company, nearly half of the 3,000 participating U.S. CIOs reported plans to deploy AI now (Weldon, 2018). Industry cannot ignore the potential benefits of AI, especially its productivity gains, cost reduction, and improved quality and speed. Conference participants talked about strategy and implementation (Chapter 14). It seems that every company is at least involved in piloting and experimenting with AI. However, in all this excitement, we should not neglect the societal impacts. Many of these are positive, some are negative, and most are unknown. A comprehensive discussion is provided in Chapter 14. Here we provide three examples of AI impacts.


IMPACT ON AGRICULTURE A major impact of AI will be on agriculture. One major anticipated result is to provide more food, especially in third world countries. Here are a few examples:

• According to Lopez (2017), using AI and robots can help farmers produce 70 percent more food by 2050. This increase is a result of higher productivity of farm equipment boosted by IoT (see opening vignette to Chapter 13) and a reduced cost of producing food. (Today only 10 percent of a family’s budget is spent on food versus 17.5 percent in 1960.)

• Machine vision helps in improved planting and harvesting. Also, AI helps to pick good kernels of grain.

• AI will help to improve the nutrition of food.

• AI will reduce the cost of food processing.

• Driverless tractors are already being experimented with.

• Robots that know how to pick fruits and to plant vegetables can solve the shortage of farm workers.

• Crop yields are continuously increasing in India and other countries.

• Pest control improves. For example, AI can predict pest attacks, facilitating planning.

• Weather conditions are monitored by satellites. AI algorithms tell farmers when to plant and/or harvest.

The list can go on and on. For countries such as India and Bangladesh, these activities will critically improve the life of farmers. All in all, AI will help farmers make better decisions. For a Bangladesh case, see PE Report (2017). See also news.microsoft.com/en-in/features/ai-agriculture-icrisat-upl-india/.

Note: AI can help hungry pets too. A food and water dispenser, called Catspad, is available in the United Kingdom for about US $470. You need to put an ID tag on your pet (only cats and small dogs). The dispenser knows which pet comes to the food and dispenses the appropriate type and amount of food. In addition, sensors (Chapter 13) can tell you how much food each pet ate. You will also be notified if water needs to be added. Interested? See Deahl (2018) for details.

INTELLIGENT SYSTEMS CONTRIBUTION TO HEALTH AND MEDICAL CARE Intelligent systems provide a major contribution to our health and medical care. Innovations arrive almost every day somewhere in the world (from governments, research institutions, and corporation-sponsored medical AI research). Here are some interesting innovations.

• AI excels in disease prediction (e.g., predicting the occurrence of infective diseases one week in advance).

• AI can detect brain bleeds.

• AI can track medication intake, send medical alerts, order medicine refills, and improve prescription compliance.

• Mobile telepresence robots remotely connect physicians and patients.

• NVIDIA’s medical imaging supercomputer helps diagnosticians and facilitates cures of diseases.

• Robotics and AI can redesign pharmaceutical supply chains.

• AI predicts cardiovascular risks from retinal images.

• Cancer predictions are made with deep learning, and machine learning performs melanoma diagnosis.

• A virtual personal assistant can assess a patient’s mood and feeling by cues provided (e.g., speech gesture or inflection).

• Many portals provide medical information to patients and even surgeons. Adoptive spine IT is an example.


• An AI center for aging-related research on people who are elderly operates in the United States. Similar activities exist in Japan.

• The use of bionic hands and legs is improving with AI.

• Healthcare IT News (2017) describes how AI is solving healthcare problems by using virtual assistants (Chapter 12).

The list can go on and on. Norman (2018) describes the scenario of replacing doctors with intelligent systems.

Note: AI in medicine is recognized as a scientific field with national and international annual conferences. For a comprehensive book on the subject, see Agah (2017).

OTHER SOCIETAL APPLICATIONS There are many AI applications in transportation, utilities, education, social services, and other fields. Some are covered under the topic of smart cities (Chapter 13). AI is used by social media and others to control content including fake news. Finally, how about using technology to eradicate child slavery in the Middle East? See Application Case 1.9.

Application Case 1.9 Robots Took the Job of Camel-Racing Jockeys for Societal Benefits

In several Middle Eastern countries, notably Jordan, Abu Dhabi, and other Gulf nations, racing camels has been a popular activity for generations. The owners of the winning camels can make a huge bonus (up to $1,000,000 for first place). Also, the events are considered cultural and social.

The Problem

For a long time, the racing camels were guided by human jockeys. The lighter the weight of the rider, the better the chance to win. So the owners of the camels trained children (as young as seven) to be jockeys. Young male children were bought (or kidnapped) from poor families in Sudan, India, Bangladesh, and other poor countries and were trained as child jockeys. In fact, this practice was using child slave labor to race the camels. This practice was used for generations until it was banned in all Middle Eastern countries during 2005–2010. A major factor that resulted in the banning was the utilization of robots.

The Robots’ Solution

Racing camels was a tradition for many generations and became a lucrative sport. So, no one wanted to discontinue it. According to Opfer (2016), there was a humanistic reason for using robots to race camels: to save the children. Today, all camel race tracks in the Middle East employ only robots. The robots are tied to the hump of the camels, looking like small jockeys, and are remote controlled from cars that drive parallel to the racing camels. The owners can command the camels by voice, and they can also operate a mechanical whip to beat the animals so they will run faster, much like human jockeys do. Note that camels would not run unless they hear the voice of a human or see something that looks like a human on their humps.

The Technology

A video camera shows the people in the cars driving alongside the camels what is going on in real time. The owner can provide voice commands to the camel from the car. A mechanical whip attached to the hump of the camel can be remotely operated to induce the animal.

The Results

The results are astonishing. Not only was the child slavery practice eliminated, but also the speed obtained by the camels increased. After all, the robots used weigh only 6 pounds and do not get tired. To see how this works, watch the video at youtube.com/watch?v=GVeVhWXB7sk (2:47 min.). To view a complete race, see youtube.com/watch?v=xFCRhk4GYds (9:08 min.). You may have a chance to see the royal family when you go to the track. Finally, you can see more details in youtube.com/watch?v=C1uYAXJIbYg (8:08 min.).

Sources: Compiled from C. Chung. (2016, April 4). “Dubai Camel Race Ride-Along.” YouTube.com. youtube.com/watch?v=xFCRhk4GYds (accessed September 2018); P. Boddington. (2017, January 3). “Case Study: Robot Camel Jockeys. Yes, really.” Ethics for Artificial Intelligence; and L. Slade. (2017, December 21). “Meet the Jordanian Camel Races Using Robot Jockeys.” Sbs.com.au.

Discussion Questions

1. It is said that the robots eradicated the child slavery. Explain.

2. Why do the owners need to drive by their camels while they are racing?

3. Why not duplicate the technology for horse racing?

4. Summarize ethical aspects of this case (Read Boddington, 2017). Do this exercise after you have read about ethics in Chapter 14.

SECTION 1.7 REVIEW QUESTIONS

1. What are the major characteristics of AI?

2. List the major benefits of AI.

3. What are the major groups in the ecosystem of AI? List the major contents of each.

4. Why is machine learning so important?

5. Differentiate between narrow and general AI.

6. Some say that no AI application is strong. Why?

7. Define assisted intelligence, augmented intelligence, and autonomous intelligence.

8. What is the difference between traditional AI and augmented intelligence?

9. Relate types of AI to cognitive computing.

10. List five major AI applications for increasing the food supply.

11. List five contributions of AI in medical care.

1.8 CONVERGENCE OF ANALYTICS AND AI

Until now we have presented analytics and AI as two independent entities. But, as illustrated in the opening vignette, these technologies can be combined in solving complex problems. In this section, we discuss the convergence of these techniques and how they complement each other. We also describe the possible addition of other technologies, especially IoT, that enable the solutions to very complex problems.

Major Differences between Analytics and AI

As you recall from Section 1.4, analytics processes historical data using statistical, management science, and other computational tools to describe situations (descriptive analytics), to predict results including forecasting (predictive analytics), and to propose recommendations for solutions to problems (prescriptive analytics). The emphasis is on the statistical, management science, and other computational tools that help analyze historical data.

AI, on the other hand, also uses different tools, but its major objective is to mimic the manner in which people think, learn, reason, make decisions, and solve problems. The emphasis here is on knowledge and intelligence as major tools for solving problems rather than relying on computation, as we do in analytics. Furthermore, AI also deals with cognitive computing. In reality, this difference is not so clear because in advanced analytic applications, there are situations of using machine learning (an AI technology), supporting analytics in both prediction and prescription. In this section, we describe the convergence of intelligent technologies.
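The three types of analytics recalled above (descriptive, predictive, and prescriptive) can be contrasted with a minimal Python sketch. Everything in it is an assumption made for illustration: the sales figures, the function names, the use of a least-squares trend line as the predictive model, and the simple reorder rule as the prescriptive step are not taken from the text.

```python
from statistics import mean

# Hypothetical monthly unit sales (illustrative numbers only).
monthly_sales = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Descriptive analytics: summarize what has happened."""
    return {"average": mean(values), "lowest": min(values), "highest": max(values)}


def forecast_next(values):
    """Predictive analytics: extend a least-squares trend line by one period."""
    n = len(values)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return intercept + slope * n  # value of the trend line at the next period


def recommend_order(forecast, on_hand, safety_stock=20):
    """Prescriptive analytics: propose an action based on the prediction."""
    return max(0, round(forecast + safety_stock - on_hand))


summary = describe(monthly_sales)
forecast = forecast_next(monthly_sales)
order_quantity = recommend_order(forecast, on_hand=50)
```

Each step builds on the previous one: the summary describes the history, the forecast extends it, and the order quantity turns the forecast into a recommended decision.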

Why Combine Intelligent Systems?

Both analytics and AI and their different technologies are making useful contributions to many organizations when each is applied by itself. But each does have limitations. According to a Gartner study, the chance that business analytics initiatives will not meet the enterprise objectives is 70–80 percent. Namely, at least 70 percent of corporate needs are not fulfilled. In other words, there is only a small chance that business intelligence initiatives will result in organizational excellence. There are several reasons for this situation including:

• Predictive models have unintended effects (see Chapter 14).

• Models must be used ethically, responsibly, and mindfully (Chapter 14). They may not be used this way.

• The results of analytics may be very good for some applications but not for others.

• Models are as good as their input data and assumptions (garbage-in, garbage-out).

• Data could be incomplete. Changing environments can make data obsolete very quickly. Models may be unable to adapt.

• Data that come from people may not be accurate.

• Data collected from different sources can vary in format and quality.

Additional reasons for combining intelligent systems are generic to IT projects, and they are discussed in Section 14.2.

The failure rate of AI initiatives is also high. Some of the reasons are similar to those for analytics. However, a major reason is that some AI technologies need a large amount of data, sometimes Big Data. For example, many millions of data items are fed to Alexa every day to increase its knowledge. Without a continuous flow of data, there would not be good learning in AI.

The question is whether AI and analytics (and other intelligent systems) can be combined in such a way that there will be synergy for better results.

How Can Convergence Help?

According to Nadav (2017), business intelligence and its analytics answer most of the why and what questions regarding the sufficiency of problem solving. Adding prescriptive analytics will add more cost but not necessarily better performance. Therefore, the next generation of business intelligence platforms will use AI to automatically locate, visualize, and narrate important things. This can also be used to create automatic alerts and notifications. In addition, machine learning and deep learning can support analytics by conducting pattern recognition and making more accurate predictions. AI will help to compare actual performance with the predicted one (see Section 14.6). Machine learning and other AI technologies also support a strategy of constant improvement. Nadav also suggested adding expert opinions via collective intelligence, as presented in Chapter 11.

In the remaining part of this section, we present detailed aspects of convergence of some intelligent systems.

Big Data Is Empowering AI Technologies

Big Data is characterized by its volume, variety, and velocity that exceed the reach of commonly used hardware environments and/or the capabilities of software tools to process data. However, today there are technologies and methods that enable capturing, cleaning, and analyzing Big Data. These technologies and methods enable companies to make real-time decisions. The convergence with AI and machine learning is a major force in this direction. The availability of new Big Data analytics enables new capabilities in AI technologies that were not possible until recently. According to Bean (2017), Big Data can empower AI due to:

• The new capabilities of processing Big Data at a much reduced cost.

• The availability of large data sets online.

• The scale up of algorithms, including deep learning, is enabling powerful AI capabilities.

MetLife Example: Convergence of AI and Big Data

MetLife is a U.S.-based global insurance company that is known for its use of IT to smooth its operation and increase customer satisfaction. To get the most from technology, the company uses AI that has been enabled by Big Data analysis as follows:

• Tracking incidents and their outcomes has been improved by speech recognition.

• Machine learning indicates pending failures. In addition, handwritten reports made by doctors about people who were injured or sick, and claims paid by the insurance company, are analyzed in seconds by the system.

• Expediting the execution of underwriting policies in property and casualty insurance is done by using both AI and analytics.

• The back-office side of claim processing includes much unstructured data that is incorporated in claims. Part of the analysis includes patients’ health data. Machine learning is used to recognize anomalies in reports very quickly.

For more about AI and the insurance business, see Chapter 2. For more on the convergence of Big Data and AI in general and at MetLife, see Bean (2017).

The Convergence of AI and the IoT

The opening vignette illustrated to us how AI technologies when combined with IoT can provide solutions to complex problems. IoT collects a large amount of data from sensors and other “things.” These data need to be processed for decision support. Later we will see how Microsoft’s Cortana does this. Butner (2018) describes how combining AI and IoT can lead to the “next-level solutions and experiences.” The emphasis in such combination is on learning more about customers and their needs. This integration also can facilitate competitive analysis and business operation (see the opening vignette). The combined pair of AI and IoT, especially when combined with Big Data, can help facilitate the discovery of new products, business processes, and opportunities. The full potential of IoT can be leveraged with AI technologies. In addition, the only way to make sense of the data streamed from the “things” via IoT and to obtain the insight from them is to subject them to AI analysis. Faggela (2017) provides the following three examples of combining AI and IoT:

1. The smart thermostat of Nest Labs (see smart homes in Chapter 13).

2. Automated vacuum cleaners, like iRobot Roomba (see Chapter 2, intelligent vacuums).

3. Self-driving vehicles (see Chapter 13).

The IoT can become very intelligent when combined with IBM Watson Analytics that includes machine learning. Examples are presented in the opening vignette and the opening vignette to Chapter 13.


The Convergence with Blockchain and Other Technologies

Several experts raise the possibility of the convergence of AI, analytics, and blockchain (e.g., Corea, 2017; Kranz, 2017). The idea is that such convergence may contribute to the design or redesign of paradigms and technologies. Blockchain technology can add security to data shared by all parties in a distributed network, where transaction data can be recorded. Kranz believes that the convergence with blockchain will power new solutions to complex problems. Such a convergence should include the IoT. Kranz also sees a role for fog computing (Chapter 9). Such a combination can be very useful in complex applications such as autonomous vehicles and in Amazon’s Go (Application Case 1.10).

Application Case 1.10 Amazon Go Is Open for Business

In early 2018, Amazon.com opened its first fully automated convenience store in downtown Seattle. The company had had success with this type of store during 2017, experimenting with only the company’s employees.

Shoppers enter the store, pick up products, and go home. Their accounts are charged later on. Sounds great! No more waiting in line for the packing of your goods and paying for them: no cashiers, no hassle.

In some sense, shoppers are going through a process similar to what they do online: find desired products/services, buy them, and wait for the monthly electronic charge.

The Shopping Process

To participate, you need a special free app on your smartphone. You need to connect it to your regular Amazon.com account. Here is what you do next:

1. Open your app.

2. Wave your smartphone at a gate to the store. It will work with a QR code there.

3. Enter the store.

4. Start shopping. All products are prepacked. You put them in a shopping bag (yours or one borrowed at the store). The minute you pick an item from the shelf, it is recorded in a virtual shopping cart. This activity is done by sensors/cameras. Your account is debited. If you change your mind and return an item, the system will credit your account instantly. The sensors also track your movements in the store. (This is an issue of digital privacy; see Chapter 14, Section 14.3). The sensors are of RFID type (Chapter 13).

5. Finished shopping? Just leave the store (make sure your app is open for the gate to let you leave). The system knows that you have left and what products you took, and your shopping trip is finished. The system will total your cost, which you can check anytime on your smartphone.

6. Amazon.com records your shopping habits (again, a privacy issue), which will help your future shopping experience and will help Amazon to build recommendations for you (Chapter 2). The objective of Go is to guide you to healthy food! (Amazon sells its meal kits of healthy food there.)

Note: Today, only a few people work in the store! Employees stock shelves and assist you otherwise. The company plans to open several additional stores in 2018.

The Technology Used

Amazon disclosed some of the technologies used. These are deep learning algorithms, computer vision, and sensor fusion. Other technologies were not disclosed. See the video at youtube.com/watch?v=NrmMk1Myrxc (1:50 min.).

Sources: Condensed from C. Jarrett. (2018). “Amazon Set to Open Doors on AI-Powered Grocery Store.” Venturebeat.com. venturebeat.com/2018/01/21/amazon-set-to-open-doors-on-ai-powered-grocery-store/ (accessed September 2018); D. Reisinger. (2018, February 22). “Here Are the Next Cities to Get Amazon Go Cashier-Less Stores.” Fortune.

Questions for Case 1.10

1. Watch the video. What did you like in it, and what did you dislike?

2. Compare the process described here to a self-check available today in many supermarkets and “big box” stores (Home Depot, etc.).

3. The store was opened in downtown Seattle. Why was the downtown location selected?

4. What are the benefits to customers? To Amazon?

5. Will customers be ready to trade privacy for convenience? Discuss.
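The virtual shopping cart described in Application Case 1.10 (an item is debited when it is picked up and credited when it is put back) can be sketched in Python as follows. This is a toy model under stated assumptions, not Amazon's actual system: the class `VirtualCart`, the product names, and the prices are hypothetical, and the sensor and camera events are simply simulated by method calls.

```python
from collections import Counter


class VirtualCart:
    """Toy model of a cart that is updated by (simulated) shelf-sensor events."""

    def __init__(self, prices_in_cents):
        self.prices = prices_in_cents  # hypothetical price list: product -> cents
        self.items = Counter()         # product -> quantity currently held

    def pick(self, product):
        """Shopper takes an item off the shelf: it is added to the cart."""
        self.items[product] += 1

    def put_back(self, product):
        """Shopper returns an item to the shelf: it is removed from the cart."""
        if self.items[product] == 0:
            raise ValueError(f"{product} is not in the cart")
        self.items[product] -= 1

    def total(self):
        """Amount to charge, in cents, when the shopper leaves the store."""
        return sum(self.prices[p] * qty for p, qty in self.items.items())


cart = VirtualCart({"sandwich": 599, "juice": 299})
cart.pick("sandwich")
cart.pick("juice")
cart.pick("juice")
cart.put_back("juice")       # the shopper changes their mind
amount_due = cart.total()    # charged to the account on leaving the store
```

Prices are kept as integer cents so that the total is exact; a real system would also need to decide which shopper each sensor event belongs to, which this sketch does not attempt.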


For a comprehensive report regarding convergence of intelligent technologies, see reportbuyer.com/product/5023639/.

In addition to blockchain, one can include IoT and Big Data, as suggested earlier, as well as more intelligent technologies (e.g., machine vision, voice technologies). These may have enrichment effects. In general, the more technologies are used (presumably properly), the more complex problems may be solved, and the more efficient the performance of the convergence systems (e.g., speed, accuracy) will be. For a discussion, see i-scoop.eu/convergence-ai-iot-big-data-analytics/.

IBM and Microsoft Support for Intelligent Systems Convergence

Many companies provide tools or platforms for supporting intelligent systems convergence. Two examples follow.

IBM IBM is combining two of its platforms to support the convergence of AI and analytics. Power AI is a distribution platform for AI and machine learning. This is a way to support the IBM analytics platform called Data Science Experience (cloud enabled). The combination of the two enables improvements in the data analytics process. It also enables data scientists to facilitate the training of complex AI models and neural networks. Researchers can use the combined system for deep learning projects. All in all, this combination provides better insight into problem solving. For details, see FinTech Futures (2017).

As you may recall from the opening vignette, IBM Watson is also combining analytics, AI, and IoT in cognitive buildings projects.

MICROSOFT’S CORTANA INTELLIGENCE SUITE Microsoft offers from its AZURE cloud (Chapter 13) a combination of advanced analytics, traditional BI, and Big Data analytics. The suite enables users to transform data into intelligent actions.

Using Cortana, one can transform data from several sources, including from IoT sensors, and apply both advanced analytics (e.g., data mining) and AI (e.g., machine learning) and extract insights and actionable recommendations, which are delivered to decision makers, to apps, or to fully automated systems. For the details of the system and the architecture of Cortana, see mssqltips.com/sqlservertip/4360/introduction-to-microsoft-cortana-intelligence-suite/.
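The general flow described above (sensor data in, analytics applied, an actionable recommendation out) can be illustrated with a small Python sketch. It does not use or represent any Cortana or Azure API. The function names, the sensor ID, the temperature readings, and the choice of a simple z-score rule as a stand-in for the analytics and machine learning steps are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(readings, threshold=2.0):
    """Analytics step: return the positions of readings that lie more than
    `threshold` sample standard deviations away from the mean."""
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [i for i, r in enumerate(readings) if abs(r - mu) / sigma > threshold]


def recommend(sensor_id, readings):
    """Insight step: turn detected anomalies into an actionable message."""
    anomalies = detect_anomalies(readings)
    if not anomalies:
        return f"{sensor_id}: no action needed"
    return f"{sensor_id}: inspect equipment, unusual readings at positions {anomalies}"


# Hypothetical room temperatures (degrees Celsius) streamed from one IoT sensor.
room_temperatures = [21.0, 21.5, 20.8, 21.2, 21.1, 35.0, 21.3, 20.9]
message = recommend("sensor-42", room_temperatures)
```

The resulting message could be shown to a decision maker, passed to an app, or used to trigger an automated response, which mirrors the three delivery targets named in the text.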

SECTION 1.8 REVIEW QUESTIONS

1. What are the major benefits of intelligent systems convergence?

2. Why did analytics initiatives fail at such a high rate in the past?

3. What synergy can be created by combining AI and analytics?

4. Why is Big Data preparation essential for AI initiatives?

5. What are the benefits of adding IoT to intelligent technology applications?

6. Why is it recommended to use blockchain in support of intelligent applications?

1.9 OVERVIEW OF THE ANALYTICS ECOSYSTEM

So you are excited about the potential of analytics, data science, and AI and want to join this growing industry. Who are the current players, and what do they do? Where might you fit in? The objective of this section is to identify various sectors of the analytics industry, provide a classification of different types of industry participants, and illustrate the types of opportunities that exist for analytics professionals. Eleven different types of players are identified in an analytics ecosystem. An understanding of the ecosystem also gives the reader a broader view of how the various players come together. A secondary purpose of understanding the analytics ecosystem for a professional is to be aware of organizations and new offerings and opportunities in sectors allied with analytics.

Although some researchers have distinguished business analytics professionals from data scientists (Davenport and Patil, 2012), as pointed out previously, for the purpose of understanding the overall analytics ecosystem, we treat them as one broad profession. Clearly, skill needs can vary from a strong mathematician to a programmer to a modeler to a communicator, and we believe this issue is resolved at a more micro/individual level rather than at a macro level of understanding the opportunity pool. We also take the widest definition of analytics to include all three types as defined by INFORMS: descriptive/reporting/visualization, predictive, and prescriptive, as described earlier. We also include AI within this same pool.
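To make the three types of analytics concrete, the following minimal Python sketch applies each one to the same small data set. Everything in it is an assumption made for illustration: the monthly sales figures, the straight-line forecast, the candidate order quantities, and the cost and price values are hypothetical and do not come from INFORMS or from any case in this book.

```python
from statistics import mean

# Hypothetical monthly unit sales (illustrative numbers only).
sales = [100, 110, 120, 130, 140, 150]

# Descriptive analytics: summarize what has already happened.
average_sales = mean(sales)


def linear_forecast(history):
    """Predictive analytics: fit a least-squares line to the history
    and extrapolate it one period ahead."""
    n = len(history)
    xs = list(range(n))
    x_bar = mean(xs)
    y_bar = mean(history)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def best_order_quantity(demand, options, unit_cost, unit_price):
    """Prescriptive analytics: choose the order quantity with the
    highest profit, given the predicted demand."""
    def profit(quantity):
        return unit_price * min(quantity, demand) - unit_cost * quantity

    return max(options, key=profit)


forecast = linear_forecast(sales)
order = best_order_quantity(
    forecast, options=[120, 140, 160, 180], unit_cost=6.0, unit_price=10.0
)
print(average_sales, forecast, order)
```

The same data feed all three steps: the descriptive summary reports the past, the predictive step estimates next month's demand, and the prescriptive step turns that estimate into a recommended action.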

Figure 1.17 illustrates one view of the analytics ecosystem. The components of the ecosystem are represented by the petals of an analytics flower. Eleven key sectors or clusters in the analytics space are identified. The components of the analytics ecosystem are grouped into three categories represented by the inner petals, outer petals, and the seed (middle part) of the flower. The outer six petals can be broadly termed technology providers. Their primary revenue comes from providing technology, solutions, and training to analytics user organizations so they can employ these technologies in the most effective and efficient manner. The inner petals can be generally defined as the analytics accelerators. The accelerators work with both technology providers and users. Finally, the core of the ecosystem comprises the analytics user organizations. This is the most important component as every analytics industry cluster is driven by the user organizations.

The metaphor of a flower is well suited for the analytics ecosystem as multiple components overlap each other. Similar to a living organism like a flower, all these petals grow and wither together. Many companies play in multiple sectors within the analytics industry and thus offer opportunities for movement within the field both horizontally and vertically.

[Figure: flower diagram with the following labeled components]

• Data Generation Infrastructure Providers

• Data Management Infrastructure Providers

• Data Warehouse Providers

• Middleware Providers

• Data Service Providers

• Analytics-Focused Software Developers

• Application Developers: Industry Specific or General

• Analytics Industry Analysts and Influencers

• Academic Institutions and Certification Agencies

• Regulators and Policy Makers

• Analytics User Organization

FIGURE 1.17 Analytics Ecosystem.


More details on the analytics ecosystem are included in our shorter book (Sharda, Delen, and Turban, 2017) as well as in Sharda and Kalgotra (2018). Matt Turck, a venture capitalist with FirstMark, has also developed and regularly updates a view of the analytics ecosystem focused on Big Data. His goal is to keep track of new and established players in various segments of the Big Data industry. A visual image of his interpretation of the ecosystem and a comprehensive listing of companies are available through his Web site: http://mattturck.com/2016/02/01/big-data-landscape/ (accessed September 2018).

1.10 PLAN OF THE BOOK

The previous sections have given you an understanding of the need for information technology in decision making and of the evolution of BI, analytics, data science, and artificial intelligence. In the last several sections, we have seen an overview of various types of analytics and their applications. Now we are ready for a more detailed managerial excursion into these topics along with some deep hands-on experience in some of the technical topics. Figure 1.18 presents a plan of the rest of the book.

[Figure: Analytics, Data Science, and Artificial Intelligence: Systems for Decision Support (11th Edition), organized into five parts]

Introduction to Analytics and AI

• Chapter 1 An Overview of BA, DSS, BI, DS and AI

• Chapter 2 Artificial Intelligence Concepts, Drivers, Major Technologies, and Business Applications

• Chapter 3 Nature of Data, Statistical Modeling and Visualization

Predictive Analytics/Machine Learning

• Chapter 4 Algorithms rather than Applications in the title

• Chapter 5 Machine-Learning Techniques for Predictive Analytics

• Chapter 6 Deep Learning and Cognitive Computing

• Chapter 7 Text Mining, Sentiment Analysis, and Social Analytics

Prescriptive Analytics and Big Data

• Chapter 8 Optimization and Simulation

• Chapter 9 Big Data, Data Centers, and Cloud Computing

Robotics, Social Networks, AI and IoT

• Chapter 10 Robotics; Industrial and Consumer Applications

• Chapter 11 Group Decision Making, Collaborative Systems, and AI Support

• Chapter 12 Knowledge Systems: Expert Systems, Recommenders, Chatbots, Virtual Personal Assistants, and Robo Advisors

• Chapter 13 The Internet of Things as a Platform for Intelligent Applications

Caveats of Analytics and AI

• Chapter 14 Implementation Issues: From Ethics and Privacy to Organizational and Societal Impacts

FIGURE 1.18 Plan of the Book.


In this chapter, we have provided an introduction, definitions, and overview of DSS, BI, and analytics, including Big Data analytics and data science. We also gave you an overview of the analytics ecosystem to help you appreciate the breadth and depth of the industry. Chapters 2 and 3 cover descriptive analytics and data issues. Data clearly form the foundation for any analytics application. Thus, we cover an introduction to data warehousing issues, applications, and technologies. This chapter also covers business reporting and visualization technologies and applications.

We follow the current chapter with a deeper introduction to artificial intelligence in Chapter 2. Because data are fundamental to any analysis, Chapter 3 introduces data issues as well as descriptive analytics, including statistical concepts and visualization. An online chapter covers data warehousing processes and fundamentals for those who like to dig more deeply into these issues.

The next section of the book covers predictive analytics and machine learning. Chapter 4 provides an introduction to data mining applications and the data mining process. Chapter 5 introduces many of the common data mining techniques: classification, clustering, association mining, and so forth. Chapter 6 includes coverage of deep learning and cognitive computing. Chapter 7 focuses on text mining applications as well as Web analytics, including social media analytics, sentiment analysis, and other related topics.

The following section brings the “data science” angle into further depth. Chapter 8 covers prescriptive analytics. Chapter 9 includes more details of Big Data analytics. It also includes an introduction to cloud-based analytics as well as location analytics.

The next section covers robotics, social networks, AI, and IoT. Chapter 10 introduces robots in business and consumer applications and discusses the future impact of such devices on society. Chapter 11 focuses on collaboration systems, crowdsourcing, and social networks. Chapter 12 reviews personal assistants, chatbots, and the exciting developments in this space. Chapter 13 studies IoT and its potential in decision support and a smarter society. The ubiquity of wireless and GPS devices and other sensors is resulting in the creation of massive new databases and unique applications. A new breed of analytics companies is emerging to analyze these new databases and create a much better and deeper understanding of customers’ behaviors and movements. It is leading to the automation of analytics and has spawned a new area called the “Internet of Things.”

Finally, Chapter 14 concludes with a brief discussion of security, privacy, and societal dimensions of analytics/AI.

1.11 RESOURCES, LINKS, AND THE TERADATA UNIVERSITY NETWORK CONNECTION

The use of this chapter and most other chapters in this book can be enhanced by the tools described in the following sections.

Resources and Links

We recommend the following major organizational resources and links:

• The Data Warehousing Institute (tdwi.org).

• Data Science Central (datasciencecentral.com).

• DSS Resources (dssresources.com).

• Microsoft Enterprise Consortium (enterprise.waltoncollege.uark.edu/mec.asp).

Vendors, Products, and Demos

Most vendors provide software demos of their products and applications. Information about products, architecture, and software is available at dssresources.com.


Periodicals

We recommend the following periodicals:

• Decision Support Systems (www.journals.elsevier.com/decision-support-systems).

• CIO Insight (www.cioinsight.com).

The Teradata University Network Connection

This book is tightly connected with the free resources provided by TUN (see www.teradatauniversitynetwork.com). The TUN portal is divided into two major parts: one for students and one for faculty. This book is connected to the TUN portal via a special section at the end of each chapter. That section includes appropriate links for the specific chapter, pointing to relevant resources. In addition, we provide hands-on exercises using software and other materials (e.g., cases) available at TUN.

The Book’s Web Site

This book’s Web site, pearsonhighered.com/sharda, contains supplemental textual material organized as Web chapters that correspond to the printed book’s chapters. The topics of these chapters are listed in the online chapter table of contents.

As this book went to press, we verified that all cited Web sites were active and valid. However, URLs are dynamic. Web sites to which we refer in the text sometimes change or are discontinued because companies change names, are bought or sold, merge, or fail. Sometimes Web sites are down for maintenance, repair, or redesign. Many organizations have dropped the initial “www” designation for their sites, but some still use it. If you have a problem connecting to a Web site that we mention, please be patient and simply run a Web search to try to identify a possible new site. Most times, you can quickly find the new site through one of the popular search engines. We apologize in advance for this inconvenience.

Chapter Highlights

• The business environment is becoming more complex and is rapidly changing, making decision making more difficult.

• Businesses must respond and adapt to the changing environment rapidly by making faster and better decisions.

• A model is a simplified representation or abstraction of reality.

• Decision making involves four major phases: intelligence, design, choice, and implementation.

• In the intelligence phase, the problem (opportunity) is identified, classified, and decomposed (if needed), and problem ownership is established.

• In the design phase, a model of the system is built, criteria for selection are agreed on, alternatives are generated, results are predicted, and a decision methodology is created.

• In the choice phase, alternatives are compared, and a search for the best (or a good-enough) solution is launched. Many search techniques are available.

• In implementing alternatives, a decision maker should consider multiple goals and sensitivity-analysis issues.

• The time frame for making decisions is shrinking, whereas the global nature of decision making is expanding, necessitating the development and use of computerized DSS.

• An early decision support framework divides decision situations into nine categories, depending on the degree of structuredness and managerial activities. Each category is supported differently.

• Structured repetitive decisions are supported by standard quantitative analysis methods, such as MS, MIS, and rule-based automated decision support.

• DSS use data, models, and sometimes knowledge management to find solutions for semistructured and some unstructured problems.


• The major components of a DSS are a database and its management, a model base and its management, and a user-friendly interface. An intelligent (knowledge-based) component can also be included. The user is also considered to be a component of a DSS.

• BI methods utilize a central repository called a DW that enables efficient data mining, OLAP, BPM, and data visualization.

• BI architecture includes a DW, business analytics tools used by end users, and a user interface (such as a dashboard).

• Many organizations employ descriptive analytics to replace their traditional flat reporting with interactive reporting that provides insights, trends, and patterns in the transactional data.

• Predictive analytics enables organizations to establish predictive rules that drive the business outcomes through historical data analysis of the existing behavior of the customers.

• Prescriptive analytics helps in building models that involve forecasting and optimization techniques based on the principles of OR and management science to help organizations to make better decisions.

• Big Data analytics focuses on unstructured, large data sets that may also include vastly different types of data for analysis.

• Analytics as a field is also known by industry-specific application names, such as sports analytics. It is also known by other related names such as data science or network science.

• Healthcare and retail chains are two areas where analytics applications abound, with much more to come.

• Image analytics is a rapidly evolving field leading to many applications of deep learning.

• The analytics ecosystem can be first viewed as a collection of providers, users, and facilitators. It can be broken into 11 clusters.
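The design and choice phases summarized in the highlights above can be illustrated with a short Python sketch. It is only one possible reading of those phases, under stated assumptions: the weighted-sum scoring rule, the criteria names, the weights, the three alternatives, and the 7.0 threshold are all hypothetical values chosen for this example and are not taken from the chapter.

```python
# Design phase (assumed for illustration): agreed-on criteria and weights,
# plus generated alternatives with predicted scores on a 0-10 scale.
weights = {"cost": 0.5, "reliability": 0.3, "speed": 0.2}

alternatives = [
    {"name": "A", "scores": {"cost": 6, "reliability": 8, "speed": 7}},
    {"name": "B", "scores": {"cost": 9, "reliability": 6, "speed": 5}},
    {"name": "C", "scores": {"cost": 7, "reliability": 9, "speed": 8}},
]


def weighted_score(alternative, weights):
    """Combine an alternative's criterion scores into one number."""
    return sum(weights[c] * alternative["scores"][c] for c in weights)


def choose_best(alternatives, weights):
    """Choice phase, optimizing: compare every alternative and keep
    the one with the highest weighted score."""
    return max(alternatives, key=lambda a: weighted_score(a, weights))


def choose_good_enough(alternatives, weights, threshold):
    """Choice phase, satisficing: stop at the first alternative whose
    score meets the threshold; return None if none qualifies."""
    for alternative in alternatives:
        if weighted_score(alternative, weights) >= threshold:
            return alternative
    return None


best = choose_best(alternatives, weights)
good_enough = choose_good_enough(alternatives, weights, threshold=7.0)
print(best["name"], good_enough["name"])
```

The two functions can return different alternatives for the same data, which is the practical difference between searching for the best solution and accepting a good-enough one. Rerunning the sketch with different weights is a simple form of the sensitivity analysis mentioned for the implementation phase.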

Key Terms

analytics
analytics ecosystem
artificial intelligence
augmented intelligence
Big Data analytics
business intelligence (BI)
choice phase
dashboard
data mining
decision or normative analytics
descriptive (or reporting) analytics
design phase
implementation phase
intelligence phase
online analytical processing (OLAP)
online transaction processing (OLTP)
predictive analytics
prescriptive analytics

Questions for Discussion

1. Survey the literature from the past six months to find one application each for DSS, BI, and analytics. Summarize the applications on one page, and submit it with the exact sources.

2. Your company is considering opening a branch in China. List typical activities in each phase of the decision (intelligence, design, choice, and implementation) regarding whether to open a branch.

3. You are about to buy a car. Using Simon’s (1977) four-phase model, describe your activities at each step in making the decision.

4. Explain, through an example, the support given to decision makers by computers in each phase of the decision process.

5. Comment on Simon’s (1977) philosophy that managerial decision making is synonymous with the whole process of management. Does this make sense? Explain. Use a real-world example in your explanation.

6. Review the major characteristics and capabilities of DSS. How does each of them relate to the major components of DSS?

7. List some internal data and external data that could be found in a DSS for a university’s admissions office.

8. Distinguish BI from DSS.

9. Compare and contrast predictive analytics with prescriptive and descriptive analytics. Use examples.

10. Discuss the major issues in implementing BI.


Exercises

Teradata University Network and Other Hands-On Exercises

1. Go to the TUN site teradatauniversitynetwork.com. Using the site password your instructor provides, register for the site if you have not already registered. Log on and learn the content of the site. You will receive assignments related to this site. Prepare a list of 20 items on the site that you think could be beneficial to you.

2. Go to. Explore the Sports Analytics page, and summarize at least two applications of analytics in any sport of your choice.

3. Go to the TUN site, and select “Cases, Projects, and Assignments.” Then select the case study “Harrah’s High Payoff from Customer Information.” Answer the following questions about this case:

a. What information does the data mining generate?

b. How is this information helpful to management in decision making? (Be specific.)

c. List the types of data that are mined.

d. Is this a DSS or BI application? Why?

4. Go to teradatauniversitynetwork.com and find the paper titled “Data Warehousing Supports Corporate Strategy at First American Corporation” (by Watson, Wixom, and Goodhue). Read the paper, and answer the following questions:

a. What were the drivers for the DW/BI project in the company?

b. What strategic advantages were realized?

c. What operational and tactical advantages were achieved?

d. What were the critical success factors for the implementation?

5. Go to http://analytics-magazine.org/issues/digital-editions and find the January/February 2012 edition titled “Special Issue: The Future of Healthcare.” Read the article “Predictive Analytics—Saving Lives and Lowering Medical Bills.” Answer the following questions:

a. What problem is being addressed by applying predictive analytics?

b. What is the FICO Medication Adherence Score?

c. How is a prediction model trained to predict the FICO Medication Adherence Score? Did the prediction model classify the FICO Medication Adherence Score?

d. Zoom in on Figure 4, and explain what technique is applied to the generated results.

e. List some of the actionable decisions that were based on the prediction results.

6. Go to http://analytics-magazine.org/issues/digital-editions, and find the January/February 2013 edition titled “Work Social.” Read the article “Big Data, Analytics and Elections,” and answer the following questions:

a. What kinds of Big Data were analyzed in the article? Comment on some of the sources of Big Data.

b. Explain the term integrated system. What is the other technical term that suits an integrated system?

c. What data analysis techniques are employed in the project? Comment on some initiatives that resulted from data analysis.

d. What are the different prediction problems answered by the models?

e. List some of the actionable decisions taken that were based on the prediction results.

f. Identify two applications of Big Data analytics that are not listed in the article.

7. Search the Internet for material regarding the work of managers and the role analytics plays in it. What kinds of references to consulting firms, academic departments, and programs do you find? What major areas are represented? Select five sites that cover one area, and report your findings.

8. Explore the public areas of dssresources.com. Prepare a list of its major available resources. You might want to refer to this site as you work through the book.

9. Go to microstrategy.com. Find information on the five styles of BI. Prepare a summary table for each style.

10. Go to oracle.com, and click the Hyperion link under Applications. Determine what the company’s major products are. Relate these to the support technologies cited in this chapter.

11. Go to the TUN questions site. Look for BSI videos. Review the video of “Case of Retail Tweeters.” Prepare a one-page summary of the problem, proposed solution, and the reported results. You can also find associated slides on slideshare.net.

12. Review the Analytics Ecosystem section. Identify at least two additional companies in at least five of the industry clusters noted in the discussion.

13. The discussion for the analytics ecosystem also included several typical job titles for graduates of analytics and data science programs. Research Web sites such as datasciencecentral.com and tdwi.org to locate at least three similar job titles that you may find interesting for your career.

14. Go to Brainspace at MIT lab brainspace.com. View the video about “Augmented Human Intelligence.” Find the activities that deal with the enabling of meaningful combination of people and machines. Write a report.

15. Find information about IBM Watson’s activities in the healthcare field. Write a report.

16. Examine Daniel Power’s DSS Resources site at dssresources.com. Take the Decision Support Systems Web Tour (dssresources.com/tour/index.html). Explore other areas of the Web site. List at least three recent resources related to analytics. What topics do these cover?


References

http://canopeoapp.com/ (accessed October 2018).

http://imazon.org.br/en/imprensa/mapping-change-in-the-amazon-how-satellite-images-are-halting-deforestation/ (accessed October 2018).

http://www.earthcastdemo.com/2018/07/bloomberg-earthcast-customizing-weather/ (accessed October 2018).

https://www.worldbank.org/en/news/press-release/2018/02/22/world-bank-supports-sierra-leones-efforts-in-landslide-recovery (accessed October 2018).

Siemens.com. About Siemens. siemens.com/about/en/ (accessed September 2018).

Silvaris.com. Silvaris overview. silvaris.com (accessed September 2018).

Agah, A. (2017). Medical Applications of Artificial Intelligence. Boca Raton, FL: CRC Press.

Anthony, R. N. (1965). Planning and Control Systems: A Framework for Analysis. Cambridge, MA: Harvard University Graduate School of Business.

Baker, J., and M. Cameron. (1996, September). “The Effects of the Service Environment on Affect and Consumer Perception of Waiting Time: An Integrative Review and Research Propositions.” Journal of the Academy of Marketing Science, 24, pp. 338–349.

Bean, R. (2017, May 8). “How Big Data Is Empowering AI and Machine Learning at Scale.” MIT Sloan Management Review.

Boddington, P. (2017, January 3). “Case Study: Robot Camel Jockeys. Yes, really.” Ethics for Artificial Intelligence.

Brainspace. (2016, June 13). “Augmenting Human Intelligence.” MIT Technology Review Insights.

Butner, K. (2018, January 8). “Combining Artificial Intelligence with the Internet of Things Could Make Your Business Smarter.” IBM Consulting Blog.

CDC.gov. (2015, September 21). “Important Facts about Falls.” cdc.gov/homeandrecreationalsafety/falls/adultfalls.html (accessed September 2018).

Charles, T. (2018, May 21). “Influence of the External Environment on Strategic Decision.” Azcentral. yourbusiness.azcentral.com/influence-external-environment-strategic-decisions-17628.html/ (accessed October 2018).

Chiguluri, V., Guthikonda, K., Slabaugh, S., Havens, E., Peña, J., & Cordier, T. (2015, June). Relationship Between Diabetes Complications and Health Related Quality of Life Among an Elderly Population in the United States. Poster presentation at the American Diabetes Association Seventy-Fifth Annual Scientific Sessions. Boston, MA.

Chongwatpol, J., & R. Sharda. (2010, December). “SNAP: A DSS to Analyze Network Service Pricing for State Networks.” Decision Support Systems, 50(1), pp. 347–359.

Chung, C. (2016). “Dubai Camel Race Ride-Along.” YouTube.com. youtube.com/watch?v=xFCRhk4GYds (accessed September 2018).

Cordier, T., Slabaugh, L., Haugh, G., Gopal, V., Cusano, D., Andrews, G., & Renda, A. (2015, September). Quality of Life Changes with Progressing Congestive Heart Failure. Poster presentation at the Nineteenth Annual Scientific Meeting of the Heart Failure Society of America, Washington, DC.

Corea, F. (2017, December 1). “The Convergence of AI and Blockchain: What’s the Deal?” Medium.com. medium.com/@Francesco_AI/the-convergence-of-ai-and-blockchain-whats-the-deal-60c618e3accc (accessed September 2018).

Davenport, T., & SAS Institute Inc. (2014, February). “Analytics in Sports: The New Science of Winning.” sas.com/content/dam/SAS/en_us/doc/whitepaper2/iia-analytics-in-sports-106993.pdf (accessed September 2018).

Davenport, T. H., & Patil, D. J. (2012). “Data Scientist.” Harvard Business Review, 90, 70–76.

Deahl, D. (2018, January 7). “This Automatic Feeder Can Tell the Difference Between Your Pets.” The Verge.

De Smet, A., et al. (2017, June). “Untangling Your Organization’s Decision Making.” McKinsey Quarterly.

Duncan, A. (2016). “The BICC Is Dead.” https://blogs.gartner.com/alan-duncan/2016/03/11/the-bicc-is-dead/ (accessed October 2018).

Dundas.com. “How Siemens Drastically Reduced Cost with Managed BI Applications.” www.dundas.com/Content/pdf/siemens-case-study.pdf (accessed September 2018).

Eckerson, W. (2003). Smart Companies in the 21st Century: The Secrets of Creating Successful Business Intelligent Solutions. Seattle, WA: The Data Warehousing Institute.

eMarketer. (2017, May). “Artificial Intelligence: What’s Now, What’s New and What’s Next?” EMarketer Inc.

Emc.com. (n.d.). “Data Science Revealed: A Data-Driven Glimpse into the Burgeoning New Field.” emc.com/collateral/about/news/emc-data-science-study-wp.pdf (accessed September 2018).

Emmert, Samantha. (2018, March 19). “Fighting Illegal Fishing.” Global Fishing Watch. globalfishingwatch.org/research/fighting-illegal-fishing/ (accessed October 2018).

Faggela, D. (2017, August 24). “Artificial Intelligence Plus the Internet of Things (IoT): 3 Examples Worth Learning From.” TechEmergence.

Faggela, D. (2018, March 29). “Artificial Intelligence Industry: An Overview by Segment.” TechEmergence.

Fernandez, J. (2017, April). “A Billion People a Day. Millions of Elevators. No Room for Downtime.” IBM developerWorks Blog. developer.ibm.com/dwblog/2017/kone-watson-video/ (accessed September 2018).


FinTech Futures. (2017, October 11). “IBM Combining Data Science and AI for Analytics Advance.” BankingTech.com.

Gartner, Inc. (2004). Using Business Intelligence to Gain a Competitive Edge. A special report.

Gates, S., Smith, L. A., Fisher, J. D., et al. (2008). Systematic Review of Accuracy of Screening Instruments for Predicting Fall Risk Among Independently Living Older Adults. Journal of Rehabilitation Research and Development, 45(8), pp. 1105–1116.

Gill, T. M., Murphy, T. E., Gahbauer, E. A., et al. (2013). “Association of Injurious Falls with Disability Outcomes and Nursing Home Admissions in Community Living Older Persons.” American Journal of Epidemiology, 178(3), pp. 418–425.

Gorry, G. A., & Scott-Morton, M. S. (1971). “A Framework for Management Information Systems.” Sloan Management Review, 13(1), pp. 55–70.

Havens, E., Peña, J., Slabaugh, S., Cordier, T., Renda, A., & Gopal, V. (2015, October). Exploring the Relationship Between Health-Related Quality of Life and Health Conditions, Costs, Resource Utilization, and Quality Measures. Podium presentation at the ISOQOL Twenty-Seventh Annual Conference, Vancouver, Canada.

Havens, E., Slabaugh, L., Peña, J., Haugh, G., & Gopal, V. (2015, February). Are There Differences in Healthy Days Based on Compliance to Preventive Health Screening Measures? Poster presentation at Preventive Medicine 2015, Atlanta, GA.

Healthcare IT News. (2017, November 9). “How AI Is Transforming Healthcare and Solving Problems in 2017.” Slideshow. healthcareitnews.com/slideshow/how-ai-transforming-healthcare-and-solving-problems-2017?page=4/ (accessed September 2018).

Hesse, R., & G. Woolsey. (1975). Applied Management Science: A Quick and Dirty Approach. Chicago, IL: SRA Inc.

Humana. 2016 Progress Report. populationhealth.humana.com/wp-content/uploads/2016/05/BoldGoal2016ProgressReport_1.pdf (accessed September 2018).

INFORMS. Analytics Section Overview. informs.org/Community/Analytics (accessed September 2018).

Jacquet, F. (2017, July 4). “Exploring the Artificial Intelligence Ecosystem: AI, Machine Learning, and Deep Learning.” DZone.

Jarrett, C. (2018, January 21). “Amazon Set to Open Doors on AI-Powered Grocery Store.” Venturebeat.com. venturebeat.com/2018/01/21/amazon-set-to-open-doors-on-ai-powered-grocery-store/ (accessed September 2018).

Keen, P. G. W., & M. S. Scott-Morton. (1978). Decision Support Systems: An Organizational Perspective. Reading, MA: Addison-Wesley.

Kemper, C., and C. Breuer. (2016). “How Efficient Is Dynamic Pricing for Sports Events? Designing a Dynamic Pricing Model for Bayern Munich.” International Journal of Sports Finance, 11, pp. 4–25.

Kranz, M. (2017, December 27). “In 2018, Get Ready for the Convergence of IoT, AI, Fog, and Blockchain.” Insights.

Liberto, D. (2017, June 29). “Artificial Intelligence Will Add Trillion to the Global Economy: PwC.” Investopedia.

Lollato, R., Patrignani, A., Ochsner, T. E., Rocatelli, A., Tomlinson, P., & Edwards, J. T. (2015). “Improving Grazing Management Using a Smartphone App.” www.bookstore.ksre.ksu.edu/pubs/MF3304.pdf (accessed October 2018).

Lopez, J. (2017, August 11). “Smart Farm Equipment Helps Feed the World.” IQintel.com.

Metz, C. (2018, February 12). “As China Marches Forward on A.I., the White House Is Silent.” The New York Times.

Nadav, S. (2017, August 9). “Business Intelligence Is Failing; Here Is What Is Coming Next.” Huffington Post.

Norman, A. (2018, January 31). “Your Future Doctor May Not Be Human. This Is the Rise of AI in Medicine.” Futurism.com.

Olshansky, C. (2017, August 24). “Coca-Cola Is Bringing Artificial Intelligence to Vending Machines.” Food & Wine.

Opfer, C. (2016, June 22). “There’s One Terrific Reason to Race Camels Using Robot Jockeys.” Howstuffworks.com.

Padmanabhan, G. (2018, January 4). “Industry-Specific Augmented Intelligence: A Catalyst for AI in the Enterprise.” Forbes.

Pajouh Foad, M., Xing, D., Hariharan, S., Zhou, Y., Balasundaram, B., Liu, T., & Sharda, R. (2013). Available-to-Promise in Practice: An Application of Analytics in the Specialty Steel Bar Products Industry. Interfaces, 43(6), pp. 503–517. dx.doi.org/10.1287/inte.2013.0693 (accessed September 2018).

Patrignani, A., & Ochsner, T. E. (2015). Canopeo: A Powerful New Tool for Measuring Fractional Green Canopy Cover. Agronomy Journal, 107(6), pp. 2312–2320.

PE Report. (2017, July 29). “Satellite-Based Advance Can Help Raise Farm Output by 20 Percent Experts.” Financial Express.

PricewaterhouseCoopers Report. (2011, December). “Changing the Game: Outlook for the Global Sports Market to 2015.” pwc.com/gx/en/hospitality-leisure/pdf/changing-the-game-outlook-for-the-global-sports-market-to-2015.pdf (accessed September 2018).

Quain, S. (2018, June 29). “The Decision-Making Process in an Organization.” Small Business Chron.

Reisinger, D. (2018, February 22). “Here Are the Next Cities to Get Amazon Go Cashier-Less Stores.” Fortune.

Sharda, R., Asamoah, D., & Ponna, N. (2013). “Research and Pedagogy in Business Analytics: Opportunities and Illustrative Examples.” Journal of Computing and Information Technology, 21(3), pp. 171–182.

Sharda, R., Delen, D., & Turban, E. (2016). Business Intelligence, Analytics, and Data Science: A Managerial Perspective on Analytics. 4th ed. NJ: Pearson.

Sharda, R., & P. Kalgotra. (2018). “The Blossoming Analytics Talent Pool: An Overview of the Analytics Ecosystem.” In James J. Cochran (ed.). INFORMS Analytics Body of Knowledge. John Wiley, Hoboken, NJ.

Silk, R. (2017, November). “Biometrics: Facial Recognition Tech Coming to an Airport Near You.” Travel Weekly, 21.

Simon, H. (1977). The New Science of Management Decision. Englewood Cliffs, NJ: Prentice Hall.

Slade, L. (2017, December 21). “Meet the Jordanian Camel Races Using Robot Jockeys.” Sbs.com.au.

Slowey, L. (2017, February 16). “Look Who’s Talking: KONE Makes Elevator Services Truly Intelligent with Watson IoT.” IBM Internet of Things Blog. ibm.com/blogs/internet-of-things/kone/ (accessed September 2018).

“Sports Analytics Market Worth by 2021.” (2015, June 25). Wintergreen Research Press Release. Covered by PR Newswire at http://www.prnewswire.com/news-releases/sports-analytics-market-worth-47-billion-by-2021-509869871.html.

Srikanthan, H. (2018, January 8). “KONE Improves ‘People Flow’ in 1.1 Million Elevators with IBM Watson IoT.” Generis. https://generisgp.com/2018/01/08/ibm-case-study-kone-corp/ (accessed September 2018).

Staff. “Assisted, Augmented and Autonomous: The 3 Flavours of AI Decisions.” (2017, June 28). Software and Technology.

Tableau.com. Silvaris Augments Proprietary Technology Platform with Tableau’s Real-Time Reporting Capabilities. tableau.com/sites/default/files/case-studies/silvaris-business-dashboards_0.pdf (accessed September 2018).

Tartar, Andre, et al. (2018, 26 July). “All the Things Satellites Can Now See from Space.” Bloomberg.com. www.bloomberg.com/news/features/2018-07-26/all-the-things-satellites-can-now-see-from-space (accessed October 2018).

TeradataUniversityNetwork.com. (2015, Fall). “BSI: Sports Analytics—Precision Football” (video). teradatauniversitynetwork.com/About-Us/Whats-New/BSI–Sports-Analytics-Precision-Football/ (accessed September 2018).

Thibodeaux, W. (2017, June 29). “This Artificial Intelligence Kiosk Is Designed to Spot Liars at Airports.” Inc.com.

Turck, Matt. “Is Big Data Still a Thing? (The 2016 Big Data Landscape).” http://mattturck.com/2016/02/01/big-data-landscape/ (accessed September 2018).

Watson, H. (2005, Winter). Sorting Out What’s New in Decision Support. Business Intelligence Journal.

Weldon, D. (2018, March 6). “Nearly Half of CIOs Now Plan to Deploy Artificial Intelligence.” Information Management.

Wikipedia.org. On-base Percentage. wikipedia.org/wiki/On_base_percentage (accessed September 2018).

Wikipedia.org. Sabermetrics. wikipedia.org/wiki/Sabermetrics (accessed September 2018).

Wikipedia.org. SIEMENS. wikipedia.org/wiki/Siemens (accessed September 2018).

YouTube.com. (2013, December 17). CenterPoint Energy Talks Real Time Big Data Analytics. youtube.com/watch?v=s7CzeSlIEfI (accessed September 2018).

Yue, P. (2017, August 24). “Baidu, Beijing Airport Launch Facial Recognition for Passenger Check-In.” China Money Network. https://www.chinamoneynetwork.com/2017/08/24/baidu-capital-airport-launch-facial-recognition-system-airport (accessed October 2018).

Zane, E. B. (2016). Effective Decision-Making: How to Make Better Decisions Under Uncertainty And Pressure. Kindle ed. Seattle, WA: Amazon Digital Services.

CHAPTER 2

Artificial Intelligence Concepts, Drivers, Major Technologies, and Business Applications

LEARNING OBJECTIVES

■■ Understand the concepts of artificial intelligence (AI)

■■ Become familiar with the drivers, capabilities, and benefits of AI

■■ Describe human and machine intelligence

■■ Describe the major AI technologies and some derivatives

■■ Discuss the manner in which AI supports decision making

■■ Describe AI applications in accounting

■■ Describe AI applications in banking and financial services

■■ Describe AI in human resource management

■■ Describe AI in marketing

■■ Describe AI in production-operation management

Artificial intelligence (AI), which was a curiosity for generations, is rapidly developing into a major applied technology with many applications in a variety of fields. The mission statement of OpenAI (an AI research institution described in Chapter 14) states that AI will be the most significant technology ever created by humans. AI appears in several shapes and has several definitions. Put crudely, AI’s aim is to make machines exhibit intelligence as close as possible to what people exhibit, ideally for the benefit of humans. The latest developments in computing technologies drive AI to new levels and achievements. For example, IDC Spending Guide (March 22, 2018) forecast that worldwide spending on AI would reach $19.1 billion in 2018. It also predicted annual double-digit spending growth for the near future. According to Sharma (2017), China expects to be the world leader in AI, with spending of $60 billion in 2025. For the business value of AI, see Greig (2018).

In this chapter, we provide the essentials of AI, its major technologies, its support for decision making, and a sample of its applications in the major business functional areas.

The chapter has the following sections:

2.1 Opening Vignette: INRIX Solves Transportation Problems

2.2 Introduction to Artificial Intelligence

2.3 Human and Computer Intelligence

2.4 Major AI Technologies and Some Derivatives

2.5 AI Support for Decision Making

2.6 AI Applications in Accounting

2.7 AI Applications in Financial Services

2.8 AI in Human Resource Management (HRM)

2.9 AI in Marketing, Advertising, and CRM

2.10 AI Applications in Production-Operation Management (POM)

2.1 OPENING VIGNETTE: INRIX Solves Transportation Problems

THE PROBLEM

Traffic congestion is an ever-increasing problem in many large metropolitan areas. Drivers may spend several hours on the roads each day. In addition, air pollution is increasing, and more accidents are occurring.

THE SOLUTION

INRIX corporation (inrix.com) enables drivers to get real-time traffic information. They can download the INRIX-XD Traffic App for iOS and Android. The information provided is generated by a predictive analysis of massive data obtained from consumers and the environment (e.g., road construction, accidents). Information sources include:

• Traffic data collected by helicopters, drones, and so on, which include real-time traffic flow and accident information.

• Information provided by participating delivery companies and over 100 million anonymous volunteer drivers, who have GPS-enabled smartphones, all reporting in real time.

• Information provided by traffic congestion reports (e.g., delays due to road maintenance).

INRIX processes the collected information with proprietary analytical tools and formulas, some of which are AI-based. The processed information is used to generate traffic predictions. For example, it creates a picture of anticipated traffic flows and delays for the next 15 to 20 minutes, the next few hours, and the next few days for many locations. These predictions enable drivers to plan their optimal routes. As of 2018, INRIX offered global coverage in 45 countries and in many major cities, and the company analyzed traffic information from over 100 sources. This service is combined with digital maps. In Seattle, for example, traffic information is disseminated via smartphones and color codes on billboards along the freeways. Smartphones also display estimated times for the roads to be either clear or jammed. As of 2018, the company covered over 5,000,000 miles of highways worldwide, delivering upon request the best recommended routes to use, all in real time.
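INRIX’s tools are proprietary, so the following is only a minimal sketch of the general idea of turning crowdsourced speed reports into a route recommendation. The segment names, speeds, lengths, and the use of a median are all illustrative assumptions, not INRIX’s method, and the sketch estimates current travel times only; it does no forecasting.

```python
from statistics import median

# Hypothetical speed reports (km/h) from volunteer drivers, keyed by road segment.
reports = {
    "A": [62, 58, 65, 12, 60],  # includes one outlier, e.g., a car pulling over
    "B": [25, 22, 28, 24],
    "C": [80, 78, 83],
}

# Hypothetical segment lengths in kilometers.
length_km = {"A": 6.0, "B": 2.0, "C": 10.0}


def segment_minutes(segment):
    """Estimate minutes to traverse a segment at the median reported speed."""
    speed = median(reports[segment])  # the median resists single outliers
    return 60.0 * length_km[segment] / speed


def route_minutes(route):
    """Total estimated minutes for a route given as a list of segments."""
    return sum(segment_minutes(s) for s in route)


def best_route(routes):
    """Pick the route with the smallest estimated travel time."""
    return min(routes, key=route_minutes)


routes = [["A", "B"], ["C"]]
print(best_route(routes), round(route_minutes(best_route(routes)), 1))
```

The median is used here instead of the mean so that a single unusual report does not distort a segment’s estimate; a production system would also weight reports by recency and combine them with the other sources listed above.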

The INRIX system provides information (or recommendations) for decisions such as:

• Optional routes for delivery vehicles and other travelers to take

• The best time to go to work or to other places from a given location

• Information for rerouting a trip to avoid encountering a traffic jam that just occurred

• Fees to be paid on highways, which are based on traffic conditions and time of the day

The technologies used to collect data are:

• Closed-circuit TV cameras and radar that monitor traffic conditions

• Public safety reports and traffic information

• Information about freeway access and departure flows

• Technologies that measure toll collection queues

• Magnetic sensing detectors embedded under the road surface (expensive)

• Smartphones and other data collection devices that gather data for INRIX

The information is processed by several AI techniques, such as expert systems (see Chapter 12), and by different analytical models (such as simulation).

Several of the sources of information are connected to the company via the Internet of Things (IoT) (Chapter 13). According to its Web site, INRIX has partnered with Clear Channel Radio to broadcast real-time traffic data directly to vehicles via in-car or portable navigation systems, broadcast media, and wireless and Internet-based services. Clear Channel’s Total Traffic Network is available in more than 125 metropolitan areas in four countries (inrix.com/press-releases/2654/). In 2018, the system was installed in over 275 million cars and data collection devices. The system collects real-time traffic information from these devices.

THE RESULTS

In addition to being used by individual drivers, the processed information is shared by organizations and city planners for making planning decisions. Also, less traffic congestion has been recorded in participating cities, which results in less pollution, fewer road accidents, and increased productivity by happier employees who spend less time commuting.

The INRIX Traffic App (available for download at inrix.com/mobile-apps) is suitable for all smartphones; it supports 10 languages, including English, French, and Spanish. For the free INRIX traffic features, see inrixtraffic.com/features. For interesting case studies, see inrix.com/case-studies.

As of 2016, INRIX had released an improved traffic app that uses both AI and crowdsourcing (Chapter 11) to support drivers’ decisions as to the best route to take (Korosec, 2016). The AI technology analyzes drivers’ historical activities to infer their future activities.

Note: Popular smartphone apps, such as Waze and Moovit, provide navigation and data collection similar to INRIX.

Sources: Based on inrix.com, Gitlin (2016), Korosec (2016), and inrix.com/mobile-apps (all accessed June 2018).

QUESTIONS FOR THE OPENING VIGNETTE

1. Explain why traffic may be down while congestion is up (see the London case at inrix.com/uk-highways-agency/).

2. How does this case relate to decision support?

3. Identify the AI elements in this system.

4. Identify developments related to AI by viewing the company’s press releases from the most recent four months at inrix.com/press-releases. Write a report.

5. According to Gitlin (2016), INRIX’s new mobile traffic app is a threat to Waze. Explain why.

6. Go to sitezeus.com/data/inrix and describe the relationship between INRIX and Zeus. View the 2:07 min. video at sitezeus.com/data/inrix/. Why is the system in the video called a “decision helper”?

WHAT WE CAN LEARN FROM THE VIGNETTE

The INRIX case illustrates how the collection and analysis of a very large amount of information (Big Data) can improve vehicles’ mobility in large cities. Specifically, by collecting information from drivers and other sources instead of only from expensive sensors, INRIX has been able to optimize mobility. This has been achieved by supporting decisions made by drivers and by analyzing traffic flows. INRIX is also using applications from the IoT to connect vehicles and devices with its computing system. This application is one of the building blocks of smart cities (see Chapter 13). The analysis of the collected data is done by using powerful algorithms, some of which are applications of AI.

2.2 INTRODUCTION TO ARTIFICIAL INTELLIGENCE

We would all like to see computerized decision making become simpler, easier to use, more intuitive, and less threatening. And indeed, efforts have been made over time to simplify and automate several tasks in the decision-making process. Just think of the day when refrigerators will be able to measure and evaluate their contents and place orders for goods that need replenishment. Such a day is not too far in the future, and the task will be supported by AI.

CIO Insight projected that by 2035, intelligent computer technologies will result in $5–$8.3 trillion in economic value (see cioinsight.com/blogs/how-ai-will-impact-the-global-economy.html). Among the technologies listed as intelligent ones are the IoT, advanced robotics, and self-driven vehicles, all described in this book. Gartner, a leading technology consulting firm, listed the following in its 2016 and 2017 Hype Cycles for Emerging Technologies: expert advisors, natural language questions and answering, commercial drones, smart workspaces, IoT platforms, smart data discovery, general-purpose machine intelligence, and virtual personal assistants. Most are described or cited in this book (see also Greengard, 2016). For the history of AI, see Zarkadakis (2016) and en.wikipedia.org/wiki/History_of_artificial_intelligence.

Definitions

Artificial intelligence has several definitions (for an overview see Marr 2018); however, many experts agree that AI is concerned with two basic ideas: (1) the study of human thought processes (to understand what intelligence is) and (2) the representation and duplication of those thought processes in machines (e.g., computers, robots). That is, the machines are expected to have humanlike thought processes.

One well-publicized definition of AI is “the capabilities of a machine to imitate intelligent human behavior” (per Merriam-Webster Dictionary). The theoretical background of AI is based on logic, which is also used in several computer science innovations. Therefore, AI is considered a subfield of computer science. For the relationship between AI and logic, see plato.stanford.edu/entries/logic-ai.

A well-known early application of artificial intelligence was the chess program hosted on IBM’s supercomputer (Deep Blue). The system beat the famous world champion, Grand Master Garry Kasparov.

AI is an umbrella term for many techniques that share similar capabilities and characteristics. For a list of 50 unique AI technologies, see Steffi (2017). For 33 types of AI, see simplicable.com/new/types-of-artificial-intelligence.

Major Characteristics of AI Machines

There is an increasing trend to make computers “smarter.” For example, Web 3.0 is supposed to enable computerized systems that exhibit significantly more intelligence than Web 2.0. Several applications are already based on multiple AI techniques. For example, the area of machine translation of languages is helping people who speak different languages to collaborate as well as to buy online products that are advertised in languages they do not speak. Similarly, machine translation can help people who know only their own language to converse with people speaking other languages and to make decisions jointly in real time.

Major Elements of AI

As described in Chapter 1, the landscape of AI is huge, including hundreds or more components. We illustrate the foundation and the major technologies in Figure 2.1. Notice that we divide them into two groups: Foundations, and Technologies and Applications. The major technologies will be defined later in this chapter and described throughout this book.

[Figure 2.1 image not reproduced. The figure, labeled “The AI Tree,” arranges its labels in two groups, Foundations and Technologies and Applications. Labels shown: Intelligent Tutoring, Autonomous Vehicles, Speech Understanding, Automatic Programming, Game Playing, Expert Systems, Computer Vision, Intelligent Agents, Natural Language Processing, Neural Networks, Voice Recognition, Genetic Algorithms, Deep Learning, Engineering, Philosophy, Pattern Recognition, Neurology, M2M, Logic, Sociology, IoT, Human Cognition, Statistics, Information Systems, Linguistics, Augmented Reality, Machine Learning, Robo Advisors, Smart Cities, Smart Homes, Human Behavior, Psychology, Biology, Fuzzy Logic, Management Science, Robotics, Computer Science, Mathematics, Personal Assistant, Smart Factories.]

FIGURE 2.1 The Functionalities and Applications of Artificial Intelligence.

AI Applications

The technologies of AI are used in the creation of a large number of applications. In Sections 2.6–2.10, we provide a sampler of applications in the major functional areas of business.

Example

Smart or intelligent applications include those that can help machines to answer customers’ questions asked in natural languages. Another area is that of knowledge-based systems, which can provide advice, assist people to make decisions, and even make decisions on their own. For example, such systems can approve or reject buyers’ requests to purchase online (if the buyers are not preapproved or do not have an open line of credit). Other examples include the automatic generating of online purchasing orders and arranging fulfillment of orders placed online. Both Google and Facebook are experimenting with projects that attempt to teach machines how to learn and support or even make autonomous decisions. For smart applications in enterprises, see Dodge (2016), Finlay (2017), McPherson (2017), and Reinharz (2017). For how AI solutions are used to facilitate government services, see BrandStudio (2017).

AI-based systems are also important for innovation and are related to the areas of analytics and Big Data processing. One of the most advanced projects in this area is IBM Watson Analytics (see Chapter 6). For comprehensive coverage of AI, including definitions and its history, frontiers, and future, see Kaplan (2016).

Note: In January 2016, Mark Zuckerberg, the CEO of Facebook, announced publicly that his goal for 2016 was to build an AI-based assistant to help with his personal and business activities and decisions. Zuckerberg was teaching a machine to understand his voice and follow his basic commands as well as to recognize the faces of his friends and business partners. Personal assistants are used today by millions of people (see Chapter 12).

Example: Pitney Bowes Is Getting Smarter with AI

Pitney Bowes Inc. is a U.S.-based global business solutions provider in areas such as product shipments, location intelligence, customer engagement, and customer information management. The company powers billions of physical and digital transactions annually across the connected and borderless world of commerce.

Today, at Pitney Bowes, shipping prices are determined automatically based on the dimensions, weight, and packaging of each package. The fee calculations create data that are fed into AI algorithms. The more data that are processed, the more accurate the calculations become (a machine-learning characteristic). The company estimates that its algorithms have improved the calculations by 25 percent. This gives Pitney Bowes an accurate base for pricing, better customer satisfaction, and improved competitive advantage.
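The machine-learning characteristic mentioned above (estimates improve as more data are processed) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the fee is modeled as a base charge plus a per-kilogram rate with random noise, the data are synthetic, and ordinary least squares stands in for whatever algorithms Pitney Bowes actually uses.

```python
import random

# Made-up "true" pricing rule used only to generate synthetic data.
TRUE_BASE, TRUE_RATE = 3.0, 1.25


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def simulate(n, rng):
    """Generate n synthetic parcels: weight in kg and a noisy observed fee."""
    weights = [rng.uniform(0.5, 20.0) for _ in range(n)]
    fees = [TRUE_BASE + TRUE_RATE * w + rng.gauss(0.0, 2.0) for w in weights]
    return weights, fees


rng = random.Random(0)
for n in (10, 100, 10_000):
    base, rate = fit_line(*simulate(n, rng))
    # The error in the estimated rate typically shrinks as n grows.
    print(n, round(abs(rate - TRUE_RATE), 4))
```

With only a handful of parcels, the noise in individual fees can pull the estimated rate well away from the true value; with thousands of parcels, the noise largely averages out.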

Major Goals of AI

The overall goal of AI is to create intelligent machines that are capable of executing a variety of tasks currently done by people. Ideally, AI machines should be able to reason, think abstractly, plan, solve problems, and learn.

Some specific goals are to:

• Perceive and properly react to changes in the environment that influence specific business processes and operations.

• Introduce creativity in business processes and decision making.

Drivers of AI

The use of AI has been driven by the following forces:

• People’s interest in smart machines and artificial brains

• The low cost of AI applications versus the high cost of manual labor (doing the same work)

• The desire of large tech companies to capture competitive advantage and market share of the AI market and their willingness to invest billions of dollars in AI

• The pressure on management to increase productivity and speed

• The availability of quality data contributing to the progress of AI

• The increasing functionalities and reduced cost of computers in general

• The development of new technologies, particularly cloud computing

Benefits of AI

The major benefits of AI are as follows:

• AI has the ability to complete certain tasks much faster than humans.

• The consistency of the completed AI work can be much better than that of humans. AI machines do not make mistakes.

• AI systems allow for continuous improvement projects.

• AI can be used for predictive analysis via its capability of pattern recognition.

• AI can manage delays and blockages in business processes.

• AI machines do not stop to rest or sleep.

• AI machines can work autonomously or be assistants to humans.

• The functionalities of AI machines are ever increasing.

• AI machines can learn and improve their performance.

• AI machines can work in environments that are hazardous to people.

• AI machines can facilitate innovations by humans (i.e., support research and development [R&D]).

• No emotional barriers interfere with AI work.

• AI excels in fraud detection and in security facilitations.

• AI improves industrial operations.

• AI optimizes knowledge work.

• AI increases speed and enables scale.

• AI helps with the integration and consolidation of business operations.

• AI applications can reduce risk.

• AI can free employees to work on more complex and productive jobs.

• AI improves customer care.

• AI can solve difficult problems that previously were unsolved (Kharpal, 2017).

• AI increases collaboration and speeds up learning.

These benefits facilitate competitive advantage as reported by Agrawal (2018).

Note: Not all AI systems deliver all these benefits. Specific systems may deliver only some of them.

The capability of reducing costs and increasing productivity may result in large increases in profit (Violino, 2017). In addition to benefiting individual companies, AI can dramatically increase a country’s economic growth, as it is doing in Singapore.

EXAMPLES OF AI BENEFITS The following are typical benefits of AI in various areas of applications:

1. The International Swaps and Derivatives Association (ISDA) uses AI to eliminate tedious activities in contract procedures. For example, by using optical character recognition (OCR) integrated with AI, ISDA digitizes contracts and then defines, extracts, and archives the contracts.

2. AI is starting to revolutionize business recruitment by (1) conducting more efficient and fairer candidate screening, (2) making better matches of candidates to jobs, and (3) helping safeguard future talent pipelines for organizations. For details, see SMBWorld Asia Editors (2017) and Section 2.8.

3. AI is redefining management. According to Kolbjørnsrud et al. (2016), the following five practices result from the use of AI:

• It can perform routine administrative tasks.

• Managers can focus on the judgment portions of work.

• Intelligent machines are treated as colleagues (i.e., managers trust the advice generated by AI). In addition, there is people–machine collaboration (see Chapter 11).

• Managers concentrate on creative abilities that can be supported by AI machines.

• Managers are developing social skills, which are needed for better collaboration, leadership, and coaching.

4. Accenture Inc. developed AI-powered solutions using natural language processing (NLP) and image recognition to help blind people in India improve the way that they can experience the world around them. This enables them to have a better life, and those who work can work better and faster and can do jobs that are more challenging.

5. Ford Motor Credit uses machine learning to spot overlooked borrowers. In addition, it uses machine learning to help its underwriters better understand loan applicants. The program helps the productivity of both underwriters and overlooked applicants. Finally, the system predicts potential borrowers’ creditworthiness, thus minimizing losses for Ford.

6. Alastair Cole uses data collected from several sources with IBM Watson to predict what customers are expecting from the company. The generated data are used for supporting more efficient business decisions.

7. Companies are building businesses around AI. There are many examples of start-ups or existing companies that are attempting to create new businesses.

[Figure 2.2 image not reproduced. Axes: Cost ($) versus Time (2000 to 2030). Curves: Human Work and AI/Robotics.]

FIGURE 2.2 Cost of Human Work versus the Cost of AI Work.

Two areas in which large benefits have already been reaped are customer experience and enjoyment. According to a global survey reported by CMO Innovation Editors (2017), 91 percent of top-performing companies deployed AI solutions to support customer experience.

Some Limitations of AI Machines

The following are the major limitations of AI machines:

• Lack human touch and feel

• Lack attention to non-task surroundings

• Can lead people to rely on AI machines too much (e.g., people may stop thinking on their own)

• Can be programmed to create destruction (see discussion in Chapter 14)

• Can cause many people to lose their jobs (see Chapter 14)

• Can start to think by themselves, causing significant damage (see Chapter 14)

Some of the limitations are diminishing with time. However, risks exist. Therefore, it is necessary to properly manage the development of AI and try to minimize risks.

WHAT AI CAN AND CANNOT DO The limitations just identified constrain the capabilities of commercial AI. For example, it could cost too much to be commercially used. Ng (2016) provides an assessment of what AI was able to do by 2016. This is important for two reasons: (1) executives need to know what AI can do economically and how companies can use it to benefit their business and (2) executives need to know what AI cannot economically do.

AI is already transforming Web search, retailing and banking services, logistics, online commerce, entertainment, and more. Hundreds of millions of people use AI on their smartphones and in other ways. However, according to Ng (2016), applications in these areas are based on how simple input is converted to simple output as a response; for example, in automatic loan approval, the input is the profile of the applicant and the output will be an approval or rejection.
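The loan example can be made concrete with a short sketch of a simple-input to simple-output mapping. The profile fields, the thresholds, and the hand-written rule are hypothetical and chosen only for illustration; a deployed system would more likely learn the mapping from historical data than hard-code it.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    """Hypothetical applicant profile (the simple input)."""
    annual_income: float
    monthly_debt: float
    credit_score: int


def decide(applicant, min_score=650, max_debt_ratio=0.40):
    """Map a profile to "approve" or "reject" (the simple output).

    The thresholds are illustrative assumptions, not lending guidance.
    """
    if applicant.annual_income <= 0:
        return "reject"
    monthly_income = applicant.annual_income / 12
    debt_ratio = applicant.monthly_debt / monthly_income
    if applicant.credit_score >= min_score and debt_ratio <= max_debt_ratio:
        return "approve"
    return "reject"


print(decide(Applicant(annual_income=60_000, monthly_debt=1_500, credit_score=700)))
```

The point of the sketch is the shape of the task: one well-defined input record produces one well-defined response, which is why such tasks lend themselves to full automation.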

Applications in these areas are normally fully automated. Automated tasks are usually repetitive and done by people with short periods of training. AI machines depend on data that may be difficult to get (e.g., belong to someone else) or inaccurate. A second barrier is the need for AI experts, who are difficult to find and/or expensive to hire. For other barriers, see Chapter 14.

Three Flavors of AI Decisions

Staff (2017) divided the capabilities of AI systems into three levels: assisted, autonomous, and augmented.

ASSISTED INTELLIGENCE This is equivalent mostly to weak AI, which works only in narrow domains. It requires clearly defined inputs and outputs. Examples are some monitoring systems and low-level virtual personal assistants (Chapter 12). Such systems and assistants are used in our vehicles for giving us alerts. Similar systems can be used in many healthcare applications (e.g., monitoring, diagnosing).

AUTONOMOUS AI These systems are in the realm of the strong AI but in a very narrow domain. Eventually, a computer will take over many tasks, automating them completely. Machines act as experts and have absolute decision-making power. Pure robo-advisors (Chapter 12) are examples of such machines. Autonomous vehicles and robots that can fix themselves are also good examples.


AUGMENTED INTELLIGENCE Most of the existing AI applications, which are between assisted and autonomous, are referred to as augmented intelligence (or intelligence augmentation). Their technology can augment computer tasks to extend human cognitive abilities (see Chapter 6 on cognitive computing), resulting in high performance, as described in Technology Insight 2.1.
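The difference among the three levels can be sketched in a few lines of Python, using the loan approval setting mentioned earlier. This is only an illustration under stated assumptions: the `recommend` scoring rule, its 0.4 threshold, the `cautious_officer` reviewer, and all function names are hypothetical and do not come from Staff (2017).

```python
from typing import Callable

def recommend(applicant: dict) -> str:
    """Hypothetical model: suggest 'approve' or 'reject' from a simple profile."""
    ratio = applicant["debt"] / applicant["income"]
    return "approve" if ratio < 0.4 else "reject"

def assisted(applicant: dict) -> str:
    # Assisted: the machine only raises an alert; the person decides everything.
    return "alert: high debt ratio" if recommend(applicant) == "reject" else "no alert"

def autonomous(applicant: dict) -> str:
    # Autonomous: the machine's output is the final decision.
    return recommend(applicant)

def augmented(applicant: dict, human_review: Callable[[dict, str], str]) -> str:
    # Augmented: the machine recommends, and a person accepts or overrides.
    suggestion = recommend(applicant)
    return human_review(applicant, suggestion)

def cautious_officer(applicant: dict, suggestion: str) -> str:
    # Hypothetical human rule: override approvals for very new customers.
    if suggestion == "approve" and applicant["years_as_customer"] < 1:
        return "reject"
    return suggestion

applicant = {"income": 50_000, "debt": 10_000, "years_as_customer": 0}
print(autonomous(applicant))                    # machine decides alone
print(augmented(applicant, cautious_officer))   # person has the last word
```

In the augmented variant, the person keeps the final say, which mirrors the e-commerce example in which marketers can accept or reject machine recommendations.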

TECHNOLOGY INSIGHT 2.1 Augmented Intelligence

The idea of combining the performance of people and machines is not new. In this section, we discuss combining (augmenting) human abilities with powerful machine intelligence: not replacing people, which autonomous AI does, but extending human cognitive abilities. The result is the ability of humans to solve more complex problems, as in the opening vignette to Chapter 1. Computers have provided data to help people solve problems for which no solution had been available. Padmanabhan (2018) specifies the following differences between traditional and augmented AI:

1. Augmented machines extend human thinking capabilities rather than replace human decision making. These machines facilitate creativity.

2. Augmentation excels in solving complex human and industry problems in specific domains in contrast with strong, general AI machines, which are still in development.

3. In contrast with a “black box” model of some AI and analytics, the augmented intelligence provides insights and recommendations, including explanations.

4. In addition, augmented technology can offer new solutions by combining existing and discovered information in contrast to assisted AI that identifies problems or symptoms and suggests predetermined known solutions.

Padmanabhan (2018) and many others believe that at the moment, augmented AI is the best option to deal with practical problems and transform organizations to be “smarter.”

In contrast with autonomous AI, which describes machines with a wide range of cognitive abilities (e.g., driverless cars), augmented intelligence has only a few cognitive abilities.

Examples of Augmented Intelligence Staff (2017) provides the following areas for which AI is useful:

• Cybercrime fighting. For example, AI can identify forthcoming attacks and suggest solutions.

• E-commerce decisions. AI marketing tools can make testing results 100 times faster, and adapt the layout and response functions of a Web site to users. Machines also make recommendations, and marketers can accept or reject them.

• High-frequency stock market trading. This process can be done either completely autonomously or in some cases with human control and calibration.

Questions for Discussion

1. What is the basic premise of augmented intelligence?

2. List the major differences between augmented intelligence and assisted AI applications.

3. What are some benefits of augmented intelligence?

4. How does the technology relate to cognitive computing?

Artificial Brain

The artificial brain is a people-made machine that is desired to be as intelligent, creative, and self-aware as humans. To date, no one has been able to create such a machine; see artificialbrains.com. A leader in this area is IBM. IBM and the U.S. Air Force have built a system equivalent to 64 million artificial neurons that aims to reach 10 billion neurons by 2020. Note that a human brain contains about 100 billion neurons. The system tries to imitate a biological brain and be energy efficient. IBM’s project is called TrueNorth or BlueBrain, and it learns from humans’ brains. Many believe that it will be a long and slow process for AI machines to be as creative as people (e.g., Dormehl, 2017).

SECTION 2.2 REVIEW QUESTIONS

1. Define AI.

2. What are the major aims and goals of AI?

3. List some characteristics of AI.

4. List some AI drivers.

5. List some benefits of AI applications.

6. List some AI limitations.

7. Describe the artificial brain.

8. List the three flavors of AI and describe augmentation.

2.3 HUMAN AND COMPUTER INTELLIGENCE

AI usage is growing rapidly due to its increased capabilities. To understand AI, we need to first explore the meaning of intelligence.

What Is Intelligence?

Intelligence can be considered to be an umbrella term and is usually measured by an IQ test. However, some claim that there are several types of intelligence. For example, Dr. Howard Gardner of Harvard University proposed the following types of intelligence:

• Linguistic and verbal

• Logical

• Spatial

• Body/movement

• Musical

• Interpersonal

• Intrapersonal

• Naturalist

Thus, intelligence is not a simple concept.

CONTENT OF INTELLIGENCE Intelligence is composed of reasoning, learning, logic, problem-solving ability, perception, and linguistic ability.

Obviously, the concept of intelligence is not simple.

CAPABILITIES OF INTELLIGENCE To understand what artificial intelligence is, it is useful to first examine those abilities that are considered signs of human intelligence:

• Learning or understanding from experience

• Making sense out of ambiguous, incomplete, or even contradictory messages and information

• Responding quickly and successfully to a new situation (i.e., using the most correct responses)

• Understanding and inferring in a rational way, solving problems, and directing conduct effectively

• Applying knowledge to manipulate environments and situations

• Recognizing and judging the relative importance of different elements in a situation

AI attempts to provide some, hopefully all, of these capabilities, but in general, it is still not capable of matching human intelligence.

How Intelligent Is AI?

AI machines have demonstrated superiority over humans in playing complex games such as chess (beating the world champion), Jeopardy! (beating the best players), and Go (a complex Chinese game), whose top players were beaten by AlphaGo, the well-known program from Google’s DeepMind (see Hughes, 2016). Despite these remarkable demonstrations (whose cost is extremely high), many AI applications still show significantly less intelligence than humans.

COMPARING HUMAN INTELLIGENCE WITH AI Several attempts have been made to compare human intelligence with AI. There is difficulty in doing so because it is a multidimensional situation. A comparison is presented in Table 2.1.

TABLE 2.1 Artificial Intelligence versus Human Intelligence

Area | AI | Human
Execution | Very fast | Can be slow
Emotions | Not yet | Can be positive or negative
Computation speed | Very fast | Slow, may have trouble
Imagination | Only what is programmed for | Can expand existing knowledge
Answers to questions | What is in the program | Can be innovative
Flexibility | Rigid | Large, flexible
Foundation | A binary code | Five senses
Consistency | High | Variable, can be poor
Process | As modeled | Cognitive
Form | Numbers | Signals
Memory | Built in, or accessed in the cloud | Use of content and scheme memory
Brain | Independent | Connected to a body
Creativity | Uninspired | Truly creative
Durability | Permanent, but can get obsolete if not updated | Perishable, but can be updated
Duplication, documentation, and dissemination | Easy | Difficult
Cost | Usually low and declining | May be high and increasing
Consistency | Stable | Erratic at times
Reasoning process | Clear, visible | Difficult to trace at times
Perception | By rules and data | By patterns
Figure missing data | Usually cannot | Frequently can


For additional comparisons and who had the advantage in which area, see www.dennisgorelik.com/ai/ComputerintelligenceVsHumanIntelligence.htm.

Measuring AI

The Turing Test is a well-known attempt to measure the intelligence level of AI machines.

TURING TEST: THE CLASSICAL MEASURE OF MACHINE INTELLIGENCE Alan Turing designed a test known as the Turing Test to determine whether a computer exhibits intelligent behavior. According to this test, a computer can be considered smart only when a human interviewer asking the same questions to both an unseen human and an unseen computer cannot determine which is which (see Figure 2.3). Note that this test is limited to a question-and-answer (Q&A) mode.

To pass the Turing Test, a computer needs to be able to understand a human language (NLP), to possess human intelligence (e.g., have a knowledge base), to reason using its stored knowledge, and to be able to learn from its experiences (machine learning).

Note: The $100,000 Loebner Prize is waiting for the person or persons who develop software that is truly intelligent (i.e., passing the Turing Test).
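The question-and-answer protocol of the test can be simulated in a short Python sketch. Everything here is a simplifying assumption made for illustration: the respondents are toy functions, the `judge` uses a single crude cue, and "passing" is read as the judge doing no better than chance over many trials.

```python
import random

def run_trials(judge, human, machine, questions, n_trials=200, seed=0):
    """Return the fraction of trials in which the judge spots the machine."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        # Hide the two respondents behind the labels "A" and "B" in random order.
        pair = [("human", human), ("machine", machine)]
        rng.shuffle(pair)
        transcripts = {
            label: [respond(q) for q in questions]
            for label, (_, respond) in zip("AB", pair)
        }
        guess = judge(transcripts, rng)  # label the judge believes is the machine
        truth = "A" if pair[0][0] == "machine" else "B"
        correct += guess == truth
    return correct / n_trials

def judge(transcripts, rng):
    # Toy judge: a canned "do not understand" reply gives the machine away;
    # without that cue the judge can only guess.
    for label, answers in transcripts.items():
        if any("do not understand" in a for a in answers):
            return label
    return rng.choice(["A", "B"])

def human(question):
    return "Let me think about " + question.lower()

def canned_machine(question):
    return "I do not understand."

questions = ["What did you have for breakfast?", "Why is the sky blue?"]
weak_rate = run_trials(judge, human, canned_machine, questions)
mimic_rate = run_trials(judge, human, human, questions)  # machine answers like the human
```

With the canned machine, the judge is right every time; with a machine whose answers are indistinguishable from the human's, the judge's hit rate stays close to 50 percent, which is the situation the Turing Test treats as a pass.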

OTHER TESTS Over the years, there have been several other proposals of how to measure machine intelligence. For example, improvements in the Turing Test appear in several variants. Major U.S. universities (e.g., University of Illinois, Massachusetts Institute of Technology [MIT], Stanford University) are engaged in studying the IQ of AI. In addition, there are several other measuring tests. Let’s examine one test in Application Case 2.1.

In conclusion, it is difficult to measure the level of intelligence of humans as well as that of machines. Doing so depends on the circumstances and the metrics used.

FIGURE 2.3 A Pictorial Representation of the Turing Test (a human interviewer sends questions through a screen to an unseen human and an unseen computer)


Application Case 2.1 How Smart Can a Vacuum Cleaner Be?

You may not know it, but vacuum cleaners can be smart. Some of you may use the Roomba from iRobot. This vacuum cleaner can be left alone to clean floors, and it exhibits some intelligence.

However, in smart homes (Chapter 13), we expect to see even smarter vacuum cleaners. One is Roboking Turbo Plus from LG in Korea. Researchers at South Korea’s Seoul National University Robotics and Intelligent System Lab studied the Roboking and verified that its deep-learning algorithm makes it as intelligent as a six- or seven-year-old child. If we have self-driving cars, why can’t we have a self-driving vacuum cleaner, which is much simpler than a car? The cleaner needs only to move around an entire room. To do so, the machine needs to “see” its location in a room and identify obstacles in front of it. Then the cleaner’s knowledge base needs to find the best thing to do (given what worked in the past). This is basically what many AI machines’ sensors, knowledge bases, and rules do. In addition, the AI machine needs to learn from its past experience (e.g., what it should not do because it did not work in the past).

Roboking is equipped with LG’s DeepThinQ™ AI program, which enables the vacuum cleaner to figure out the nature of an encountered obstacle. The program tells it to go around furniture, wait for a dog to move, or stop. So, how intelligent is the machine? To answer this question, the Korean researchers developed 100 metrics and tested vacuum cleaners that were advertised as autonomous. The tested cleaners were grouped into three levels based on their performance on the 100 metrics. The levels were as intelligent as a dolphin, as intelligent as an ape, and as intelligent as a six-to-seven-year-old child. The study confirmed that Roboking performed tasks at the upper level of machine intelligence.

Sources: Compiled from Fuller (2017) and webwire.com/ViewPressRel.asp?aId=211017 news dated July 18, 2017.

Questions for Case 2.1

1. How did the Korean researchers determine the performance of the vacuum cleaners?

2. If you own (or have seen) the Roomba, how intelligent do you think it is?

3. What capability can be generated by the deep learning feature? (You need to do some research.)

4. Find recent information about LG’s Roboking. Specifically, what are the newest improvements to the product?
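The cycle described in the case (identify the obstacle, consult a knowledge base of rules, and avoid what did not work in the past) can be sketched as a tiny rule-based agent. This is a hypothetical illustration, not LG's program: the `RULES` table, the obstacle names, and the failure-counting scheme are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical rule base: obstacle type -> candidate actions, best first.
RULES = {
    "furniture": ["go_around", "stop"],
    "pet": ["wait", "go_around", "stop"],
    "unknown": ["stop"],
}

class VacuumAgent:
    def __init__(self):
        # Number of past failures for each (obstacle, action) pair.
        self.failures = defaultdict(int)

    def decide(self, obstacle: str) -> str:
        options = RULES.get(obstacle, RULES["unknown"])
        # Prefer the action that has failed least often; ties keep the rule order.
        return min(options, key=lambda action: self.failures[(obstacle, action)])

    def learn(self, obstacle: str, action: str, worked: bool) -> None:
        # Remember what did not work so that it is avoided next time.
        if not worked:
            self.failures[(obstacle, action)] += 1

agent = VacuumAgent()
first = agent.decide("pet")               # the rule list starts with "wait"
agent.learn("pet", first, worked=False)   # the dog did not move
second = agent.decide("pet")              # the agent now tries another action
```

A deep-learning system would learn to recognize obstacles from camera images rather than receive their type as a label; the sketch only covers the decision and learning steps.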

Regardless of the determination of how intelligent a machine is, AI exhibits a large number of benefits as described earlier.

It is important to note that the capabilities of AI are increasing with time. For example, an experiment at Stanford University (Pham, 2018) found that AI programs at Microsoft and Alibaba Co. have scored higher than hundreds of individual people at reading comprehension tests. (Of course, these are very expensive AI programs.) For a discussion of AI versus human intelligence, see Carney (2018).

SECTION 2.3 REVIEW QUESTIONS

1. What is intelligence?

2. What are the major capabilities of human intelligence? Which are superior to those of AI machines?

3. How intelligent is AI?

4. How can we measure AI’s intelligence?

5. What is the Turing Test and what are its limitations?

6. How can one measure the intelligence level of a vacuum cleaner?


2.4 MAJOR AI TECHNOLOGIES AND SOME DERIVATIVES

The AI field is very broad; we can find AI technologies and applications in hundreds of disciplines ranging from medicine to sports. Press (2017) lists 10 top AI technologies similar to what is covered in this book. Press also provides the status of the life cycle (ecosystem phase) of the technologies. In this section, we present some major AI technologies and their derivatives as related to business. The selected list is illustrated in Figure 2.4.

Intelligent Agents

An intelligent agent (IA) is an autonomous, relatively small computer software program that observes and acts upon changes in its environment by running specific tasks autonomously. An IA directs its activities to achieve specific goals related to the changes in the surrounding environment. Intelligent agents may have the ability to learn by using and expanding the knowledge embedded in them. Intelligent agents are effective tools for overcoming the most critical burden of the Internet, information overload, and for making computers more viable decision support tools. Interest in using intelligent agents for business and e-commerce started in the academic world in the mid-1990s. However, only since 2014, when the capabilities of IA increased remarkably, have we started to see powerful applications in many areas of business, economics, government, and services.

Initially, intelligent agents were used mainly to support routine activities such as searching for products, getting recommendations, determining products’ pricing, planning marketing, improving computer security, managing auctions, facilitating payments, and improving inventory management. However, these applications were very simple, using a low level of intelligence. Their major benefits were increasing speed, reducing costs, reducing errors, and improving customer service. Today’s applications, as we will see throughout this chapter, are much more sophisticated.

FIGURE 2.4 The Major AI Technologies (figure labels: Artificial Intelligence, Machine Learning, Neural Network, Deep Learning, Intelligent Agents, Natural Language Processing, Chatbots, Knowledge Systems, Robotics, Autonomous Vehicle, Machine and Computer Vision, Video Analysis, Image Generation, Cognitive Computing, Machine Translation of Languages, Speech and Voice Understanding and Generation, Degenerated Reality)


Example 1: Virus Detection Program

A simple example of an intelligent software agent is a virus detection program. It resides in a computer, scans all incoming data, and removes found viruses automatically while learning to detect new virus types and detection methods.
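A minimal sketch of such an agent is shown below. It is a deliberately simplified assumption, not how commercial antivirus products work: files are matched only by their SHA-256 fingerprint, and "learning" means adding the fingerprint of a newly reported malicious sample. The class and method names are hypothetical.

```python
import hashlib

class VirusScanAgent:
    def __init__(self, known_signatures=()):
        self.signatures = set(known_signatures)  # fingerprints of known viruses
        self.quarantine = []                     # removed (quarantined) payloads

    @staticmethod
    def fingerprint(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def scan(self, data: bytes) -> bool:
        """Return True if the data is clean; otherwise quarantine it."""
        if self.fingerprint(data) in self.signatures:
            self.quarantine.append(data)
            return False
        return True

    def learn(self, malicious_sample: bytes) -> None:
        """Add a newly reported malicious sample to the signature set."""
        self.signatures.add(self.fingerprint(malicious_sample))

agent = VirusScanAgent()
payload = b"suspicious attachment"
before = agent.scan(payload)   # unknown sample passes the first scan
agent.learn(payload)           # the sample is later reported as malicious
after = agent.scan(payload)    # the same sample is now caught and quarantined
```

The agent runs without supervision, reacts to each piece of incoming data, and changes its own behavior as its knowledge grows, which are the three traits named in the definition above.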

Example 2

Allstate Business Insurance is using an intelligent agent to reduce call center traffic and provide help to human insurance agents during the rate-quoting process with business customers. In these cases, rate quotes can be fairly complicated. Using this system, agents can quickly answer questions posted by corporate customers, even if the agents are not fully familiar with the related issue.

Intelligent agents are also utilized in e-mail servers, news filtering and distribution, appointment handling, and automated information gathering.

Machine Learning

At this time, AI systems do not have the same learning capabilities that humans have; rather, they have simplistic (but improving) machine learning (modeled after human learning methods). Machine-learning scientists try to teach computers to identify patterns and make connections by showing the machines a large volume of examples and related data. Machine learning also allows computer systems to monitor and sense their environmental activities so the machines can adjust their behavior to deal with changes in the environment. The technology can also be used to predict performance, to reconfigure programs based on changing conditions, and much more. Technically speaking, machine learning is a scientific discipline concerned with the design and development of algorithms that allow computers to learn based on data coming from sensors, databases, and other sources. This learning is then used for making predictions, recognizing patterns, and supporting decision makers. For an overview, see Alpaydin (2016) and Theobald (2017).

Machine-learning algorithms (see Chapter 5 for description and discussion) are used today by many companies. For an executive guide to machine learning, see Pyle and San Jose (2015).

The process of machine learning involves computer programs that learn as they face new situations. Such programs collect data and analyze them and then “train” themselves to arrive at conclusions. For example, by showing examples of situations to a machine-learning program, the program can find elements not easily visible without it. A well-known example is that of computers detecting credit card fraud.
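To make "learning from examples" concrete, the sketch below trains a very simple classifier (nearest centroid) on a handful of labeled transactions and then labels new ones. The data, the two features, and the function names are made up for illustration; real fraud-detection systems use far more data, scaled features, and more capable algorithms (see Chapter 5).

```python
from statistics import mean

def train(examples):
    """examples: list of (features, label). Returns one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    # The centroid is the column-wise average of each label's examples.
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: distance(centroids[label]))

# Made-up history: (amount in dollars, distance from home in km) -> label
history = [
    ((25, 2), "legit"), ((40, 5), "legit"), ((60, 1), "legit"),
    ((900, 800), "fraud"), ((1200, 950), "fraud"),
]
model = train(history)
print(predict(model, (30, 3)))       # a small purchase near home
print(predict(model, (1000, 700)))   # a large purchase far from home
```

Nothing in the code states what fraud looks like; the rule is derived entirely from the labeled examples, which is the sense in which the program "trains" itself.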

Application Case 2.2 illustrates how machine learning can improve companies’ business processes.

According to Taylor (2016), the “increased computing power, coupled with other improvements including better algorithms and deep neural networks for image processing, and ultra-fast in-memory databases like SAP HANA, are the reasons why machine learning is one of the hottest areas of development in enterprise software today.” Machine-learning applications are also expanding due to the availability of Big Data sources, especially those provided by the IoT (Chapter 13). Machine learning is basically learning from data.

There are several methods of machine learning. They range from neural networks to case-based reasoning. The major ones are presented in Chapter 5.

Application Case 2.2 How Machine Learning Is Improving Work in Business

The following examples of using machine learning are provided by Wellers, et al. (2017), who stated that “today’s leading organizations are using machine learning-based tools to automate decision processes. . . .”

1. Improving customer loyalty and retention. Companies mine customers’ activities, transactions, and social interactions and sentiments to predict customer loyalty and retention. Companies can use machine learning, for example, to predict people’s desire to change jobs and then employers can make attractive offers to keep the existing employees or to lure potential employees who work elsewhere to move to new employers.

2. Hiring the right people. Given an average of 250 applicants for a good job in certain companies, an AI-based program can analyze applicants’ resumes and find qualified candidates who did not apply but placed their resume online.

3. Automating finance. Incomplete financial transactions that lack some data (e.g., order numbers) require special attention. Machine-learning systems can learn how to detect and correct such situations, very quickly and at minimal cost. The AI program can take the necessary corrective action automatically.

4. Detecting fraud. Machine-learning algorithms use pattern recognition to detect fraud in real time. The program is looking for anomalies, and then it makes inferences regarding the type of detected activities to look for fraud. Financial institutions are the major users of this program.

5. Providing predictive maintenance. Machine learning can find anomalies in the operation of equipment before it fails. Thus, corrective actions are done immediately at a fraction of a cost to repair equipment after it fails. In addition, optimal preventive maintenance can be done (see Opening Vignette Chapter 1).

6. Providing retail shelf analysis. Machine learning combined with machine vision can analyze displays in physical stores to find whether items are in proper locations on the shelves, whether the shelves are properly stocked, and whether the product labels (including prices) are properly shown.

7. Making other predictions. Machine learning has been used for making many types of predictions ranging in areas from medicine to investments. An example is Google Flights, which predicts delays that have not been flagged yet by the airlines.

Source: Compiled from Wellers, et al. (2017) and Theobald (2017).

Questions for Case 2.2

1. Discuss the benefits of combining machine learning with other AI technologies.

2. How can machine learning improve marketing?

3. Discuss the opportunities of improving human resource management.

4. Discuss the benefits for customer service.

DEEP LEARNING One subset, or refinement, of machine learning is called deep learning. This technology, which is discussed in Chapter 6, tries to mimic how the human brain works. Deep learning uses artificial neural technology and plays a major role in dealing with complex applications that regular machine learning and other AI technologies cannot handle. Deep learning (DL) delivers systems that not only think but also keep learning, enabling self-direction based on fresh data that flow in. DL can tackle previously unsolvable problems using its powerful learning algorithms.

For example, DL is a key technology in autonomous vehicles by helping to interpret road signs and road obstacles. DL is also playing critical roles in smartphones, robotics, tablets, smart homes, and smart cities (Chapter 13). For a discussion of these and other applications, see Mittal (2017). DL is mostly useful in real-time interactive applications in the areas of machine vision, scene recognition, robotics, and speech and voice process- ing. The key is continuous learning. As long as new data arrive, learning occurs.


Example

Cargill Corp. offers conventional and DL-based analytics that help farmers to do more profitable work. For example, farmers can produce better shrimp at lower cost. DL is also used extensively in stock market analysis and predictions. For details, see Smith (2017) and Chapter 6.

Machine and Computer Vision

The definitions of machine vision vary because several different computer vision systems include different hardware and software as well as other components. Generally speaking, the classical definition is that the term machine vision includes “the technology and methods used to provide imaging-based automated inspection and analysis for applications such as robot guidance, process control, autonomous vehicles, and inspection.” Machine vision is an important tool for the optimization of production and robotic processes. A major part of machine vision is the industrial camera, which captures, stores, and archives visual information. This information is then presented to users or computer programs for analysis and eventually for automatic decision making or for support of human decision making. Machine vision can be confused with computer vision because sometimes the two are used as synonyms, but some users and researchers treat them as different entities. Machine vision is treated more as an engineering subfield, while computer vision belongs to the computer science area.

COMPUTER VISION Computer vision, according to Wikipedia, “is an interdisciplinary field that deals with how computers can be made for gaining high-level understanding from digital images or videos. From the perspective of engineering, it seeks to automate tasks that the human visual system can do.” Computer vision acquires or processes, analyzes, and interprets digital images and produces meaningful information for making decisions. Image data can take several formats, such as photos or videos, and they can come from multidimensional sources (e.g., medical scanners). Scene and item recognition are important elements in computer vision. The computer vision field plays a vital role in the domains of safety, security, health, and entertainment. Computer vision is considered a technology of AI, which enables robots and autonomous vehicles to see (refer to the description in Chapter 6). Both computer vision and machine vision automate many human tasks (e.g., inspection). These tasks can deal with one image or a sequence of images. The major benefit of both technologies is lowering the costs of performing tasks, especially those that are repetitive and make the human eyes tired. The two technologies are also combined with image processing that facilitates complex applications, such as in visual quality control. Another view shows them as being interrelated based on image processing and sharing a variety of contributing fields.

An applied area of machine vision is scene recognition, which is performed by computer vision. Scene recognition enables recognition and interpretation of objects, scenery, and photos.

Example of Application

Significant illegal logging exists in many countries. To comply with the laws in the United States, Europe, and other countries, it is necessary to examine wood in the field. This requires expertise. According to the U.S. Department of Agriculture, “the urgent need for such field expertise, training and deploying humans to identify processed wood in the field [i.e., at ports, border crossings, weigh-stations, airports, and other points of entry for commerce] is prohibitively expensive and difficult logistically. The machine vision wood identification project (MV) has developed a prototype machine vision system for wood identification.” Similarly, AI computer vision combined with deep learning is used to identify illegal poachers of animals (see USC, 2018).

Another example of this application is facial recognition in several security applications, such as those used by the Chinese police that employ smart glasses to identify (via facial recognition) potential suspects. In 2018, the Chinese police identified a suspect who attended a pop concert. There were 60,000 people in the crowd. The person was recognized at the entrance gate where a camera took his picture; see the video at youtube.com/watch?v=Fq1SEqNT-7c. In 2018, US Citizenship and Immigration Services identified people that used false passports in the same manner.

VIDEO ANALYTICS Applying computer vision techniques to videos enables the recognition of patterns (e.g., for detecting fraud) and identifying events. This is a derivative application of computer vision. Another example is one in which, by letting computers view TV shows, it is possible to train the computers to make predictions regarding human interactions and the success of advertising.

Robotic Systems

Sensory systems, such as those for scene recognition and signal processing, when combined with other AI technologies, define a broad category of integrated, possibly complex, systems, generally called robotics (Chapter 10). There are several definitions of robots, and they are changing over time. A classical definition is this: “A robot is an electromechanical device that is guided by a computer program to perform manual and/or mental tasks.” The Robotics Institute of America formally defines a robot as “a programmable multifunctional manipulator designed to move materials, parts, tools, or specialized devices through variable programmed motions for the performance of a variety of tasks.” This definition ignores the many mental tasks done by today’s robots.

An “intelligent” robot has some kind of sensory apparatus, such as a camera, that collects information about the robot’s surroundings and its operations. The collected data are interpreted by the robot’s “brain,” allowing it to respond to the changes in the environment.

Robots can be fully autonomous (programmed to do tasks completely on their own, even repair themselves), or can be remotely controlled by a human. Some robots known as androids resemble humans, but most industrial robots are not this type. Autonomous robots are equipped with AI intelligent agents. The more advanced smart robots are not only autonomous but also can learn from their environment, building their capabilities. Some robots today can learn complex tasks by watching what humans do. This leads to better human–robot collaboration. The Interactive Group at MIT is experimenting with this capability by teaching robots to make complex decisions. For details, see Shah (2016). For an overview of the robot revolution, see Waxer (2016).

Example: Walmart Is Using Robots to Properly Stock Shelves

The efficiency of Walmart stores depends on appropriately stocking their shelves. Using manual labor for checking what is going on is expensive and may be inaccurate. As of late 2017, robots were supporting the company’s stocking decisions.

At Walmart, the 2-foot-tall robots use a camera/sensor to scan the shelves to look for misplaced, missing, or mispriced items. These self-moving robots both collect the information and interpret the problems. The results are transmitted to humans, who take corrective actions. The robots carry out their tasks faster and frequently more accurately than humans. The company experimented with this in 50 stores in 2018.


Preliminary results are significantly positive and are also expected to increase customer satisfaction. The robots will not cause employees to lose their jobs.
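The interpretation step of the shelf-scanning example can be sketched as a comparison between what the robot sees and what the store plan expects. The sketch assumes that the vision system has already turned the camera image into a product code and price for each shelf slot; the `planogram` layout, the slot names, and the `audit_shelf` function are hypothetical and are not Walmart's system.

```python
def audit_shelf(planogram, scanned):
    """Compare scanned shelf contents with the expected store plan.

    planogram: {slot: (sku, price)} expected on the shelf
    scanned:   {slot: (sku, price)}, with None for an empty slot
    """
    issues = []
    for slot, (sku, price) in planogram.items():
        seen = scanned.get(slot)
        if seen is None:
            issues.append((slot, "missing"))
        elif seen[0] != sku:
            issues.append((slot, "misplaced"))
        elif seen[1] != price:
            issues.append((slot, "mispriced"))
    return issues

planogram = {"A1": ("soap", 2.99), "A2": ("rice", 4.50), "A3": ("tea", 3.25)}
scanned = {"A1": ("soap", 3.49), "A2": ("tea", 3.25), "A3": None}

# The list of issues is what would be transmitted to employees for correction.
report = audit_shelf(planogram, scanned)
```

The robot only reports the problems; as in the example, people take the corrective actions.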

Robots are used extensively in e-commerce warehouses (e.g., tens of thousands are used by Amazon.com). They also are used in make-to-order manufacturing as well as in mass production (e.g., of cars and, lately, of self-driven vehicles). A new generation of robots is designed to work as advisors, as described in Chapter 12. These robots are already advising on topics such as investments, travel, healthcare, and legal issues. Robots can serve as front desk receptionists and even can be used as teachers and trainers.

Robots can help with online shopping by collecting shopping information, matching buyers and products, and conducting price and capability comparisons. These are known as shopbots (e.g., see igi-global.com/dictionary/shopbot/26826). Robots can carry goods for shoppers in open air markets. Walmart is experimenting now with robotic shopping carts (Knight, 2016). For a video (4:41 min.), see businessinsider.com/personal-robots-for-shopping-and-e-commerce-2016-9?IR=T. The Japanese company SoftBank opened a cellphone store in Tokyo entirely staffed by robots, each named Pepper. Each robot is mobile (on wheels) and can approach customers. Initially, communication with customers was done by entering information into a tablet attached to each Pepper. A major issue with robots is their tendency to take human jobs. For a discussion of this topic, see Section 14.6.

Natural Language Processing

Natural language processing (NLP) is a technology that gives users the ability to communicate with a computer in their native language. The communication can be in written text and/or in voice (speech). This technology allows for a conversational type of interface in contrast with using a programming language that consists of computer jargon, syntax, and commands. NLP includes two subfields:

• Natural language understanding that investigates methods of enabling computers to comprehend instructions or queries provided in ordinary English or other human languages.

• Natural language generation that strives to have computers produce ordinary spoken language so that people can understand the computers more easily. For details and the history of NLP, see en.wikipedia.org/wiki/Natural_language_processing and Chapter 6.

NLP is related to voice-generated data as well as text and other communication forms.

SPEECH (VOICE) UNDERSTANDING Speech (voice) understanding is the recognition and understanding of spoken languages by a computer. Applications of this technology have become more popular. For instance, many companies have adopted this technology in their automated call centers. For an interesting application, see cs.cmu.edu/~./listen.

Related to NLP is machine translation of languages, which is done by both written text (e.g., Web content) and voice conversation.

MACHINE TRANSLATION OF LANGUAGES Machine translation uses computer programs to translate words and sentences from one language to another. For example, Babel Fish Translation, available at babelfish.com, offers more than 25 different combinations of language translations. Similarly, Google’s Translate (translate.google.com) can translate dozens of different languages. Finally, users can post their status on Facebook in several languages.


Example: Sogou’s Travel Translator

This Chinese company introduced, in 2018, an AI-powered portable travel device. Chinese people are now traveling to other countries in increasing numbers (200 million expected in 2020 versus 122 million in 2016). The objective of the device is to enable Chinese tourists to plan trips (so they can read Web sites like Trip Advisor, available in English). The AI-powered portable travel device enables tourists to read menus and street signs and to communicate with native speakers. The device, which uses NLP and image recognition, is connected to Sogou search (a search engine). In contrast to regular Chinese-English dictionaries, this device is structured specifically for travelers and their needs.

Knowledge and Expert Systems and Recommenders

These systems, which are presented in Chapter 12, are computer programs that store knowledge, which their applications use to generate expert advice and/or perform problem solving. Knowledge-based expert systems also help people to verify information and make certain types of automated routine decisions.

Recommendation systems (Chapter 12) are knowledge-based systems that make shopping and other recommendations to people. Another knowledge system is chatbots (see Chapter 12).

KNOWLEDGE SOURCES AND ACQUISITION FOR INTELLIGENT SYSTEMS For many intelligent systems to work, it is necessary for them to have knowledge. The process of acquiring this knowledge is referred to as knowledge acquisition. This activity can be complex because it is necessary to determine what knowledge is needed; it must fit the desired system. In addition, the sources of the knowledge need to be identified to ensure the feasibility of acquiring the knowledge. The specific methods of acquiring the knowledge need to be identified, and if experts are the source of knowledge, their cooperation must be ensured. In addition, the method of knowledge representation and reasoning from the collected knowledge must be taken into account, and knowledge must be validated and be consistent.

Given this information, it is easy to see that the process of knowledge acquisition (see Figure 2.5) can be very complex. It includes extracting and structuring knowledge. It has several methods (e.g., observing, interviewing, scenario building, and discussing), so specially trained knowledge engineers may be needed for knowledge acquisition and system building. In many cases, teams of experts with different skills are created for knowledge acquisition. Knowledge can be generated from data, and then experts may be used to verify it. The acquired knowledge needs to be organized in an activity referred to as knowledge representation.

KNOWLEDGE REPRESENTATION Acquired knowledge needs to be organized and stored. There are several methods of doing this, depending on what the knowledge will be used for, how the reasoning from this knowledge will be done, how users will interact with the knowledge, and more. A simple way to represent knowledge is in the form of questions and matching answers (Q&A).
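To make the Q&A form of representation concrete, the following is a minimal sketch in Python. It assumes a small, hypothetical set of question-and-answer pairs and a simple word-overlap (Jaccard) match; the names `QA_PAIRS`, `tokenize`, and `answer` and the `min_overlap` cutoff are illustrative assumptions, not part of any system described in this chapter.

```python
import re

# Hypothetical question-and-answer pairs; the content is illustrative only.
QA_PAIRS = [
    ("What are the opening hours of the branch?",
     "The branch is open 9:00 to 17:00 on weekdays."),
    ("How do I reset my online banking password?",
     "Use the 'Forgot password' link on the login page."),
    ("What is the fee for an international transfer?",
     "International transfers cost a flat fee per transaction."),
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer(query, qa_pairs=QA_PAIRS, min_overlap=0.3):
    """Return the answer whose stored question best overlaps the query.

    Overlap is the Jaccard similarity of the two word sets. If no stored
    question is similar enough, return None so that the query can be
    passed to a person instead.
    """
    query_words = tokenize(query)
    best_score, best_answer = 0.0, None
    for question, stored_answer in qa_pairs:
        question_words = tokenize(question)
        union = query_words | question_words
        score = len(query_words & question_words) / len(union) if union else 0.0
        if score > best_score:
            best_score, best_answer = score, stored_answer
    return best_answer if best_score >= min_overlap else None
```

Calling `answer("How can I reset my password for online banking?")` retrieves the stored password answer, whereas an unrelated query returns `None`. Real systems would typically use NLP rather than raw word overlap; the sketch only shows how stored questions and matching answers can serve as a knowledge representation.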

REASONING FROM KNOWLEDGE Perhaps the most important component in an intelligent system is its reasoning feature. This feature processes users’ requests and provides answers (e.g., solutions, recommendations) to the user. The major difference among the various types of the intelligent technologies is the type of reasoning they use.


Chatbots

Robots come in several shapes and types. One type that has become popular in recent years is the chatbot. A chatbot, which will be presented in Chapter 12, is a conversational robot that is used for chatting with people. (A “bot” is short for “robot.”) Depending on the purpose of the chat, which can be done in writing or by voice, bots can be in the form of intelligent agents that retrieve information or personal assistants that provide advice. In either case, chatbots are usually equipped with NLP that enables conversations in natural human languages rather than in a programmed computer language. Note that Google has rolled out six different voices for its Google Assistant.

Emerging AI Technologies

Several new AI technologies are emerging. Here are a few examples:

• Affective computing. These technologies detect the emotional conditions of people and suggest how to deal with discovered problems.

• Biometric analysis. These technologies can verify an identity based on unique biological traits that are compared to stored ones (e.g., facial recognition).

COGNITIVE COMPUTING Cognitive computing is the application of knowledge derived from cognitive science (the study of the human brain) and computer science theories in order to simulate the human thought processes (an AI objective) so that computers can exhibit and/or support decision-making and problem-solving capabilities (see Chapter 6). To do so, computers must be able to use self-learning algorithms, pattern recognition, NLP, machine vision, and other AI technologies. IBM is a major proponent of the concept by developing technologies (e.g., Watson) that support people in making complex decisions. Cognitive computing systems learn to reason with purpose, and interact with people naturally. For details, see Chapter 6 and Marr (2016).

FIGURE 2.5 Automated Decision-Making Process [Diagram labels: Sources of Knowledge (Documented Knowledge, Data, Information); Knowledge Acquisition, Validation, Verification; Knowledge Organization and Representation; Knowledge Repository; Knowledge Refining; System Brain (Search, Inferencing, Reasoning); Problem Analysis, Identification; Response Generation; Explanation, Justification; User Interface (Natural Language Understanding, Natural Language Generation, Q&A); Knowledge Users]


AUGMENTED REALITY Augmented reality (AR) refers to the integration of digital information with the user environment in real time (mostly vision and sound). The technology provides people with a real-world interactive experience with the environment. Therefore, information may change the way people work, learn, play, buy, and connect. Sophisticated AI programs may include machine vision, scene recognition, and gesture recognition. AR is available on iPhones as ARKit. (Also see Metz, 2017.)

These AR systems use data captured by sensors (e.g., vision, sound, temperature) to augment and supplement real-world environments. For example, if you take a photo of a house with your cellphone, you can immediately get the publicly available information about its configuration, ownership, and tax liabilities on your cellphone.

SECTION 2.4 REVIEW QUESTIONS

1. Define intelligent agents and list some of their capabilities.
2. Prepare a list of applications of intelligent agents.
3. What is machine learning? How can it be used in business?
4. Define deep learning.
5. Define robotics and explain its importance for manufacturing and transportation.
6. What is NLP? What are its two major formats?
7. Describe machine translation of languages. Why is it important in business?
8. What are knowledge systems?
9. What is cognitive computing?
10. What is augmented reality?

2.5 AI SUPPORT FOR DECISION MAKING

Almost since the inception of AI, researchers have recognized the opportunity of using it for supporting the decision-making process and for completely automating decision making. Jeff Bezos, the CEO of Amazon.com, said in May 2017 that AI is in a golden age, and it is solving problems that were once in the realm of science fiction (Kharpal, 2017). Bezos also said that Amazon.com is using AI in literally hundreds of applications, and AI is really of amazing assistance. Amazon.com has been using AI, for example, for product recommendations for over 20 years. The company also uses AI for product pricing, and as Bezos said, to solve many difficult problems. And indeed, since its inception, AI has been related to problem solving and decision making. AI technologies allow people to make better decisions. The fact is that AI can:

• Solve complex problems that people have not been able to solve. (Note that solving problems frequently involves making decisions.)

• Make much faster decisions. For example, Amazon makes millions of pricing and recommendation decisions, each in a split second.

• Find relevant information, even in large data sources, very fast.

• Make complex calculations rapidly.

• Conduct complex comparisons and evaluations in real time.

In a nutshell, AI can drive some types of decisions many times faster and more consistently than humans can. For details, watch the video at youtube.com/watch?v=Dr9jeRy9whQ/. The nature of decision making, especially for nonroutine decisions, as noted in Chapter 1, is complex. We discussed in Chapter 1 the fact that there are several types of decisions and several managerial levels of making them, and we looked at the typical process of making decisions. Making decisions, many of which are used for problem solving, requires intelligence and expertise. AI’s aim is to provide both. As a result, it is clear that using AI to facilitate decision making involves many opportunities, benefits, and variations. For example, AI can successfully support certain types of decision making and fully automate others.

In this section, we discuss some general issues of AI decision support. The section also distinguishes between support of decision making and fully automating decision making.

Some Issues and Factors in Using AI in Decision Making

Several issues determine the justification of using AI and its chance of success. These include:

• The nature of the decision. For example, routine decisions are more likely to be fully automated, especially if they are simple.

• The method of support, that is, which technologies are used. Initially, automated decision support was rule-based. In practice, expert systems were created to generate solutions to specific decision situations in well-defined domains. Another popular technology mentioned earlier was the “recommender,” which appeared with e-commerce in the 1990s. Today, there is an increased use of machine learning and deep learning. A related technology is that of pattern recognition. Today, attention is also given to biometric types of recognition.

For example, research continues to develop an AI machine that will interview people at airports, asking one or two questions, and then determining whether they are telling the truth. Similar algorithms can be used to vet refugees and other types of immigrants.

• Cost-benefit and risk analyses. These are necessary for making large-scale decisions, but computing these values may not be simple with AI models due to difficulties in measuring costs, risks, and benefits. For example, as we cited earlier, researchers used 100 metrics to measure the intelligence level of vacuum cleaners.

• Using business rules. Many AI systems are based on business or other types of rules. The quality of automated decisions depends on the quality of these rules. Advanced AI systems can learn and improve business rules.

• AI algorithms. There is an explosion in the number of AI algorithms that are the basis for automated decisions and decision support. The quality of the decisions depends on the input of the algorithms, which may be affected by changes in the business environment.

• Speed. Decision automation is also dependent on the speed within which decisions need to be made. Some decisions cannot be automated because it takes too much time to get all the relevant input data. On the other hand, manual decisions may be too slow for certain circumstances.

AI Support of the Decision-Making Process

Much AI support can be applied today to the various steps of the decision-making process. Fully automated decisions are common in routine situations and will be discussed in the next section. Here we follow the steps in the decision-making process described in Chapter 1.

PROBLEM IDENTIFICATION AI systems are used extensively in problem identification, typically in diagnosing equipment malfunction and medical problems, finding security breaches, estimating financial health, and so on. Several technologies are used. For example, sensor-collected data are used by AI algorithms. Performance levels of machines are compared to standards, and trend analysis can point to opportunities or troubles.
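The two checks mentioned above, comparison to a standard and trend analysis, can be sketched in a few lines of Python. The sensor readings, the standard of 72.0, and the tolerance of 5.0 are hypothetical values chosen for illustration, and the function names are assumptions rather than part of any particular product.

```python
from statistics import mean


def exceeds_standard(readings, standard, tolerance):
    """Return the indices of readings that deviate from the standard by more than tolerance."""
    return [i for i, r in enumerate(readings) if abs(r - standard) > tolerance]


def trend_slope(readings):
    """Least-squares slope of the readings against their index (units per time step)."""
    n = len(readings)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(readings)
    numerator = sum((i - x_mean) * (r - y_mean) for i, r in enumerate(readings))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical bearing temperatures in degrees Celsius, one reading per hour.
temperatures = [70.0, 71.0, 72.0, 73.0, 74.0, 79.0]

out_of_range = exceeds_standard(temperatures, standard=72.0, tolerance=5.0)
slope = trend_slope(temperatures)
```

Here `out_of_range` lists the readings that break the standard, and a positive `slope` signals a rising trend that may deserve attention before the standard is breached. Production systems would typically replace these simple checks with learned models, but the logic of comparing performance to a standard is the same.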


GENERATING OR FINDING ALTERNATIVE SOLUTIONS Several AI technologies offer alternative solutions by matching problem characteristics with best practices or proven solutions stored in databases. Both expert systems and chatbots employ this approach. They can generate recommended solutions or provide several options from which to choose. AI tools such as case-based reasoning and neural computing are used for this purpose.
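The retrieval step of case-based reasoning, matching the characteristics of a new problem against stored cases, can be illustrated with a short Python sketch. The case library, its attributes, and the similarity measure (the fraction of matching characteristics) are hypothetical simplifications introduced here for illustration.

```python
# Hypothetical case library: each case pairs problem characteristics
# with a solution that worked in the past.
CASE_LIBRARY = [
    ({"symptom": "overheating", "machine": "pump", "age": "old"}, "Replace the bearing"),
    ({"symptom": "overheating", "machine": "motor", "age": "new"}, "Clean the air filter"),
    ({"symptom": "vibration", "machine": "pump", "age": "old"}, "Realign the shaft"),
]


def similarity(problem, case_features):
    """Fraction of the problem's characteristics that the stored case matches."""
    if not problem:
        return 0.0
    matches = sum(1 for key, value in problem.items() if case_features.get(key) == value)
    return matches / len(problem)


def retrieve(problem, library=CASE_LIBRARY, top_n=2):
    """Return up to top_n (score, solution) pairs, best match first."""
    scored = [(similarity(problem, features), solution) for features, solution in library]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

For an old pump that is overheating, `retrieve` ranks "Replace the bearing" first and offers "Realign the shaft" as a second option, which mirrors the idea of providing several options from which to choose. Full case-based reasoning also adapts and stores new cases; only retrieval is shown here.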

SELECTING A SOLUTION AI models are used to evaluate proposed solutions, for example, by predicting their future impact (predictive analysis), assessing their chance of success, or predicting a company’s reply to action taken by a competitor.

IMPLEMENTING THE SOLUTIONS AI can be used to support the implementation of complex solutions. For example, it can be used to demonstrate the superiority of proposals and to assess resistance to changes.

Applying AI to one or more of the decision-making processes and steps enables companies to solve complex real-world problems, as shown in Application Case 2.3.

Automated Decision Making

As the power of AI technologies increases, so does their ability to fully automate more and more complex decision-making situations.

Application Case 2.3 How Companies Solve Real-World Problems Using Google’s Machine-Learning Tools

The following examples were extracted from Forrest (2017):

Google’s Cloud Machine Learning Engine and TensorFlow allow unique access to machine learning tools without the need for PhD-educated data scientists.

The following companies use Google’s tools to solve the listed problems.

1. Axa International. This global insurance company uses machine learning to predict which drivers would be more likely to cause major accidents. The analysis provides prediction accuracy of 78 percent. This prediction is used to determine appropriate insurance premiums.

2. Airbus Defense & Space. Detecting clouds in satellite imagery was done manually for decades. Using machine learning, the process has been expedited by 40 percent, and the error rate has been reduced from 11 percent to 3 percent.

3. Preventing overfishing globally. A government agency previously monitored only small sample regions globally to find fishing violators. Now, using satellite AIS positioning, the agency can watch the entire ocean. Using machine learning, the agency can track all fishing vessels to find violators.

4. Detecting credit card fraud in Japan. SMFG, a Japanese financial services company, uses Google’s machine learning (a deep learning application) to monitor fraud related to credit card use, with an 80–90 percent accuracy of detection. The detection generates an alarm for taking actions.

5. Kewpie Food of Japan. This company detected defective potato cubes manually using a slow and expensive process. Using Google AI tools enables it to automatically monitor video feeds and alert inspectors to remove defective potatoes.

Source: Condensed and compiled from Forrest (2017).

Questions for Case 2.3

1. Why use machine learning for predictions?

2. Why use machine learning for detections?

3. What specific decisions were supported in the five cases?


INTELLIGENT AND AUTOMATED DECISION SUPPORT As early as 1970, there were attempts to automate decision making. These attempts were typically done with the use of rule-based expert systems that provided recommended solutions to repetitive managerial problems. Examples of decisions made automatically include the following:

• Small loan approvals

• Initial screening of job applicants

• Simple restocking

• Prices of products and services (when and how to change them)

• Product recommendation (e.g., at Amazon.com)
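As an illustration of a rule-based automated decision, the following Python sketch handles the first item in the list, small loan approvals. The rules, the thresholds (a loan ceiling of 10,000, credit score cutoffs of 550 and 680, and a 20 percent income ratio), and the names `LoanApplication` and `decide` are all hypothetical; they are not taken from any lender's actual policy. Referring unsettled cases to a person is likewise an assumption about how such a system might treat complex situations.

```python
from dataclasses import dataclass


@dataclass
class LoanApplication:
    amount: float          # requested loan amount
    annual_income: float   # applicant's yearly income
    credit_score: int      # score on a hypothetical 300-850 scale


def decide(application):
    """Apply simple business rules; return 'approve', 'reject', or 'refer to human'."""
    # Rule 1: only small loans are handled automatically.
    if application.amount > 10_000:
        return "refer to human"
    # Rule 2: a very low credit score is rejected outright.
    if application.credit_score < 550:
        return "reject"
    # Rule 3: a good score and a loan of at most 20% of income is approved.
    if (application.credit_score >= 680
            and application.amount <= 0.2 * application.annual_income):
        return "approve"
    # Anything the rules do not settle clearly goes to a person.
    return "refer to human"
```

For example, `decide(LoanApplication(5000, 60000, 720))` returns `"approve"`, while a request for 50,000 is referred to a human regardless of the score. The quality of such automated decisions depends entirely on the quality of the rules, which is why the thresholds here should be read as placeholders.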

The process of automated decision making is illustrated in Figure 2.5. The process starts with knowledge acquisition and creation of a knowledge repository. Users submit questions to the system brain, which generates a response and submits it to the users. In addition, the solutions are evaluated so that the knowledge repository and the reasoning from it can be improved. Complex situations are forwarded to humans’ attention. This process is especially used in knowledge-based systems. Note that the process in Figure 2.5 for knowledge acquisition illustrates automatic decision making as well. Companies use automated decision making for both their external operations (e.g., sales) and internal operations (e.g., resource allocation, inventory management). An example follows.

Example: Supporting Nurses’ Diagnosis Decisions

A study conducted in a Taiwanese hospital (Liao, et al., 2015) investigated the use of AI to generate nursing diagnoses and compared them to diagnoses generated by humans. Diagnoses required comprehensive knowledge, clinical experience, and instinct. The researchers used several AI tools, including machine learning, to conduct data mining and analysis to predict the probable success of automated nursing diagnoses based on patient characteristics. The results indicated an 87 percent agreement between the AI and human diagnosis decisions.

Such technology can be used in places that have no human nursing staff as well as by nursing staff who want to verify the accuracy of their own diagnostic predictions. The system can facilitate the training of nursing staff as well.
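A figure such as the agreement rate reported in the study can be understood as simple percent agreement: the share of cases in which two sets of diagnoses coincide. The Python sketch below shows that calculation on hypothetical labels; it is not the study's data or necessarily its exact measure, and the function name is an assumption.

```python
def percent_agreement(ai_labels, human_labels):
    """Percentage of cases in which the two label lists agree."""
    if len(ai_labels) != len(human_labels):
        raise ValueError("Both lists must cover the same cases.")
    if not ai_labels:
        raise ValueError("At least one case is required.")
    matches = sum(a == h for a, h in zip(ai_labels, human_labels))
    return 100.0 * matches / len(ai_labels)


# Hypothetical diagnoses for eight patients (not data from the study).
ai_diagnoses = ["pain", "anxiety", "pain", "infection risk",
                "pain", "fatigue", "anxiety", "pain"]
nurse_diagnoses = ["pain", "anxiety", "pain", "infection risk",
                   "pain", "fatigue", "fatigue", "pain"]

agreement = percent_agreement(ai_diagnoses, nurse_diagnoses)
```

With seven matching cases out of eight, `agreement` is 87.5. Percent agreement does not correct for agreement that would occur by chance, so evaluations of diagnostic systems often report additional statistics alongside it.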

Automated decisions can take several forms, as illustrated in Technology Insight 2.2.

Conclusion

There is little doubt that AI can change the decision-making process for businesses; for an example, see Sincavage (2017). The nature of the change varies based on the circumstances. But, in general, we expect AI to have a major impact for making better, faster, and more efficient decisions. Note that, in some cases, an AI watchdog is needed to regulate the process (see Sample, 2017, for details).

SECTION 2.5 REVIEW QUESTIONS

1. Distinguish between fully automated and supported decision making.
2. List the benefits of AI for decision support.
3. What factors influence the use of AI for decision support?
4. Relate AI to the steps in the classical decision-making process.
5. What are the necessary conditions for AI to be able to automate decision making?
6. Describe Schrage’s four models.


TECHNOLOGY INSIGHT 2.2 Schrage’s Models for Using AI to Make Decisions

Schrage (2017) of MIT’s Sloan School has proposed the following four models for AI to make autonomous business decisions:

1. The Autonomous Advisor. This is a data-driven management model that uses AI algorithms to generate best strategies and instructions on what to do and makes specific recommendations. However, only humans can approve the recommendations (e.g., proposed solutions).

Schrage provided an example in which an American retailing company replaced an entire merchandising department with an AI machine, ordering employees to obey directives from it. Obviously, resistance and resentment followed. To ensure compliance, the company had to install monitoring and auditing software.

2. The Autonomous Outsource. Here, the traditional business process outsourcing model is changed to a business process algorithm. To automate this activity, it is necessary to create crystal-clear rules and instructions. It is a complex scenario since it involves resource allocation. Correct predictability and reliability are essential.

3. People–Machine Collaboration. Assuming that algorithms can generate optimal decisions in this model, humans need to collaborate with the brilliant, but constrained, fully automated machines. To ensure such collaboration, it is necessary to train people to work with the AI machines (see the discussion in Chapter 14). This model is used by tech giants such as Netflix, Alibaba, and Google.

4. Complete Machine Autonomy. In this model, organizations fully automate entire processes. Management needs to completely trust AI models, a process that may take years. Schrage provides an example of a hedge fund that trades very frequently based on a machine’s recommendations. The company uses machine learning to train the trading algorithms.

Implementing these four models requires appropriate management leadership and collaboration with data scientists. For suggestions of how to do so, consult Schrage (2017), who has written several related books. Kiron (2017) discusses why managers should consider AI for decision support.

An interesting note is that some competition among companies will actually occur among data-driven autonomous algorithms and related business models.

Questions for Discussion

1. Differentiate between the autonomous advisor and the people–machine collaboration models.

2. In all four models, there are some degrees of people–machine interaction. Discuss.

3. Why is it easier to use model 4 for investment decisions than, for example, marketing strategies?

4. Why is it important for data scientists to work with top management in autonomous AI machines?

2.6 AI APPLICATIONS IN ACCOUNTING

Throughout this book, we provide many examples of AI applications in business, services, and government. In the following five sections, we provide additional applications in the traditional areas of business: accounting; finance; human resource management; marketing, advertising, and CRM; and production-operation management.

AI in Accounting: An Overview

The CEO of SlickPie Accounting Software for small businesses, Chandi (2017), noticed trends among professional accountants: their use of AI, including bots in professional routines, increased. Chandi observed that the major drivers for this are perceived savings in time and money and increased accuracy and productivity. The adoption has been rapid and it has been followed by significant improvements. An example is the execution of compliance procedures, where, for instance, Ernst & Young (EY) is using machine learning for detecting anomalous data (e.g., fraudulent invoices).
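The idea of flagging anomalous data can be illustrated with a simple statistical check. The Python sketch below is not EY's method and is not machine learning; it swaps in a robust outlier rule (the modified z-score based on the median absolute deviation) applied to hypothetical invoice amounts. The function name, the invoice figures, and the conventional cutoff of 3.5 are assumptions made for illustration.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), which are less distorted by extreme values than
    the mean and the standard deviation. The list must not be empty.
    """
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, a in enumerate(amounts)
            if 0.6745 * abs(a - center) / mad > threshold]


# Hypothetical invoice amounts from one supplier.
invoice_amounts = [120.0, 135.0, 128.0, 131.0, 125.0, 980.0, 129.0]

suspicious = flag_anomalies(invoice_amounts)
```

In this example only the invoice for 980.0 is flagged for review. A flagged invoice is not proof of fraud; it is a candidate for a human auditor to examine, which is consistent with the supporting role of AI described in this section.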

AI in Big Accounting Companies

Major users of AI are the big tax and accounting companies as illustrated in Application Case 2.4.

Accounting Applications in Small Firms

Small accounting firms also use AI. For example, Crowe Horwath of Chicago is using AI to solve complex billing problems in the healthcare industry. This helps its clients to deal with claims processing and reimbursements. The firm can now solve difficult problems that had previously resisted solutions. Many other applications are used with the support of AI, ranging from analyzing real estate contracts to risk analysis. It is only a question of time before even smaller firms will be able to utilize AI as well.

Application Case 2.4 How EY, Deloitte, and PwC Are Using AI

The big accounting companies use AI to replace or support human activities in tasks such as tax preparation, auditing, strategy consulting, and accountancy services. They mostly use NLP, robotic process automation, text mining, and machine learning. However, they use different strategies as described by Zhou (2017):

• EY attempts to show quick, positive return on investment (ROI) on a small scale. The strategy concentrates on business value. EY uses AI, for example, to review legal documents related to leasing (e.g., to meet new government regulations).

• PricewaterhouseCoopers (PwC) favors small projects that can be completely functioning in four weeks. The objective is to demonstrate the value of AI to client companies. Once demonstrated to clients, the projects are refined. PwC demonstrates 70 to 80 such projects annually.

• Deloitte Touche Tohmatsu Limited, commonly referred to as Deloitte, builds cases that guide AI-based projects for both clients and internal use. The objective is to facilitate innovation. One successful area is the use of NLP for review of large contracts that may include hundreds of thousands of legal documents. The company reduced such review time from six months to less than a month, and it reduced the number of employees who had performed the review by more than 70 percent. Deloitte, like its competitors, is using AI to evaluate potential procurement synergies for merger and acquisition decisions. Such evaluation is a time-consuming task since it is necessary to check huge quantities of data (sometimes millions of data lines). As a result, Deloitte can finish such evaluation in a week compared to the four to five months required earlier. Deloitte said that with AI, it is viewing data in ways never even contemplated before (Ovaska-Few, 2017).

All big accounting companies use AI to assist in generating reports and to conduct many other routine, high-volume tasks. AI has produced high-quality work, and its accuracy has become better and better with time.

Sources: Compiled from Chandi (2017), Zhou (2017), and Ovaska-Few (2017).

Questions for Case 2.4

1. What are the characteristics of the tasks for which AI is used?

2. Why do the big accounting firms use different implementation strategies?


COMPREHENSIVE STUDY OF AI USE IN ACCOUNTING The ICAEW information technology (IT) faculty provides a free comprehensive study, “AI and the Future of Accountancy.” This report (ICAEW, 2017) provides an assessment of AI use in accounting today and in the future. The report sees the advantage of AI by:

• Providing cheaper and better data to support decision making and solve accounting problems

• Generating insight from data analysis

• Freeing time of accountants to concentrate on problem solving and decision making

The report points to the use of the following:

• Machine learning for detecting fraud and predicting fraudulent activities

• Machine-learning and knowledge-based systems for verifying accounting tasks

• Deep learning to analyze unstructured data, such as in contracts and e-mails

Job of Accountants

AI and analytics will automate many routine tasks done today by accountants (see discussion in Chapter 14), many of whom may lose their jobs. On the other hand, accountants will need to manage AI-based accounting systems. Finally, accountants need to drive AI innovation in order to succeed or even survive (see Warawa, 2017).

SECTION 2.6 REVIEW QUESTIONS

1. What are the major reasons for using AI in accounting?
2. List some applications big accounting firms use.
3. Why do big accounting firms lead the use of applied AI?
4. What are some of the advantages of using AI cited by the ICAEW report?
5. How may the job of the accountant be impacted by AI?

2.7 AI APPLICATIONS IN FINANCIAL SERVICES

Financial services are highly diversified, and so is AI usage in the area. One way to organize the AI activities is by major segments of services. In this section, we discuss only two segments: banking and insurance.

AI Activities in Financial Services

Singh (2017) observed the following activities that may be found across various types of financial services:

• Extreme personalization (e.g., using chatbots, personal assistants, and robo investment advisors) (Chapter 12)

• Shifting customer behavior both online and in brick-and-mortar branches

• Facilitating trust in digital identity

• Revolutionizing payments

• Sharing economic activities (e.g., person-to-person loans)

• Offering financial services 24/7 and globally (connecting the world)

AI in Banking: An Overview

Consultancy.uk (2017) provides an overview of how AI is transforming the banking industry. It found AI applications mostly in IT, finance and accounting, marketing and sales, human resource management (HRM), customer service, and operations. A comprehensive survey on AI in banking was conducted in 2017, and a report is available for purchase (see Tiwan, 2017).

The key findings of this report are as follows:

• AI technologies in banking include all those listed in Section 2.7 and several other analytical tools (Chapters 3 to 11 of this book).

• These technologies help banks improve both their front-office and back-office operations.

• Major activities are the use of chatbots to improve customer service and communicating with customers (see Chapter 12), and robo advising is used by some financial institutions (see Chapter 12).

• Facial recognition is used for safer online banking.

• Advanced analytics helps customers with investment decisions. For examples of this help, see Nordrum (2017), E. V. Staff (2017), and Agrawal (2018).

• AI algorithms help banks identify and block fraudulent activities including money laundering.

• AI algorithms can help in assessing the creditworthiness of loan applicants. (For a case study of an application of AI in credit screening, see ai-toolkit.blogspot.com/2017/01/case-study-artificial-intelligence-in.html.)

Illustrative AI Applications in Banking

The following are banking institutions that use AI:

• Banks are using AI machines, such as IBM Watson, to step up employee surveillance. This is important in preventing illegal activities such as those that occurred at Wells Fargo, the financial services and banking company. For details, see information-management.com/articles/banks-using-algorithms-to-step-up-employee-surveillance.

• Banks use applications for tax preparation. H&R Block is using IBM Watson to review tax returns. The program makes sure that individuals pay only what they owe. Using interactive conversations, the machine attempts to lower people’s tax bills.

• Answering many queries in real time. For example, Rainbird Co. (rainbird.ai/) is an AI vendor that trains machines to answer customers’ queries. Millions of customers’ questions keep bank employees busy. Bots assist staff members to quickly find the appropriate answers to queries. This is especially important in banks where employee turnover is high. Also, knowledge degrades over time due to frequent changes in policies and regulations.

Rainbird is integrated with IBM Watson, which is using AI capabilities and cognitive reasoning to understand the nature of the queries and provide solutions. The program–employee conversations are done via chatbots, which are deployed to all U.K. branches of the banks served by Rainbird.

• At Capital One and several other banks, customers can talk with Amazon’s Alexa to pay credit card bills and check their accounts.

• TD Bank and others (see Yurcan, 2017) experiment with Alexa, which provides machine learning and augmented reality capabilities for answering queries.

• Bank Danamon uses machine learning for fraud detection and anti–money-laundering activities. It also improves the customer experience.

• At HSBC, customers can converse with the virtual banking assistant, Olivia, to find information about their accounts and even learn about security. Olivia can learn from its experiences and become more useful.


• Santander Bank employs a virtual assistant (called Nina) that can transfer money, pay bills, and do more. Nina can also authenticate its customers via an AI-based voice recognition system. Luvo of RBS is a customer service and customer relationship management (CRM) bot that answers customers’ queries.

• At Accenture, Collette is a virtual mortgage advisor that provides personalized advice.

• A robot named NaO can analyze the facial expressions and behavior of customers who enter the branches of certain banks and determine their nationality. Then the machine selects a matching language (Japanese, Chinese, or English) to interact with the customer.

IBM Watson can provide banks many other services ranging from crime fighting to regulatory compliance as illustrated next.

Example: How Watson Helps Banks Manage Compliance and Supports Decision Making

Government regulations place a burden on banks and other financial institutions. To comply with regulations, banks must spend a considerable amount of time examining huge amounts of data generated daily.

Working with Promontory Financial Group (an IBM subsidiary), IBM Watson (Chapter 6) developed a set of tools to deal with the compliance problem. The set of tools was trained by using the knowledge of former regulators and examining data from over 200 different sources. All in all, the program is based on over 60,000 regulatory citations. It includes three sets of cognitive tools that deal with regulatory compliance. One of the tools deals with financial crimes, flagging potentially suspicious transactions and possible fraud. The second tool monitors compliance, and the third one deals with the large volume of data. Watson is acting as a banking financial consultant for these and other banking issues.

IBM’s tools are designed to assist financial institutions to justify important decisions. The AI algorithms examine the data inputs and outputs in managerial decision making. For example, when the program spots suspicious activity, it will notify the appropriate manager, who then will take the necessary action. For details, see Clozel (2017).
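The flag-and-notify flow described above can be pictured with a minimal sketch. It is not how Watson works internally: the z-score rule, the `Transaction` fields, and the `flag_suspicious` and `notify_manager` functions are hypothetical names chosen for illustration, and a real system would rely on trained models rather than one fixed rule.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Transaction:
    account: str
    amount: float


def flag_suspicious(history, new_tx, z_threshold=3.0):
    """Return True if new_tx.amount is unusually far from the account's past amounts.

    Assumption: a simple z-score rule stands in for the AI model.
    """
    amounts = [t.amount for t in history if t.account == new_tx.account]
    if len(amounts) < 2:
        return False  # not enough history to judge
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        return new_tx.amount != mu
    return abs(new_tx.amount - mu) / sigma > z_threshold


def notify_manager(tx):
    """Hypothetical notification step; a human manager decides what to do next."""
    return f"Review needed: account {tx.account}, amount {tx.amount:.2f}"
```

In this sketch the program only raises the flag; the decision about what action to take stays with the manager, as in the description above.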

Application Case 2.5 illustrates US Bank’s use of AI to improve customer service.

Insurance Services

Advancements in AI are improving several areas in the insurance industry, mostly in issuing policies and handling claims.

According to Hauari (2017), the major objectives of the AI support are to improve analysis results and enhance customer experience. Incoming claims are analyzed by AI, and, depending on their nature, are sent to appropriate available adjusters. The technologies used are NLP and text recognition (Chapters 6 and 7). The AI software can help in data collection and analysis and in data mining old claims.

Agents previously spent considerable time asking routine questions of people submitting insurance claims. AI machines, according to Beauchamp (2016), provide speed, accuracy, and efficiency in performing this process. Then AI can facilitate the underwriting process.

Similarly, claims processing is streamlined with the help of AI. It reduces processing time (by up to 90 percent) and improves accuracy. Capabilities of machine-learning and other AI programs can be shared in seconds in multi-office configurations, including global settings.


Insurers, like other adopters of AI, will have to go through a transformation and adapt to change. Companies and individual agents can learn from early adopters. For how this is done at MetLife, see Blog (2017).

Example: Metromile Uses AI in Claim Processing

Metromile is an innovator in vehicle insurance, using the pay-per-mile model. It operates in seven U.S. states. In mid-2017, it started using AI-based programs to automate accident data, process accident claims, and pay customer claims. The automated platform, according to Santana (2017), is powered by a smart claim bot called AVA. It processes images forwarded by customers, extracting the pertinent telematic data. The AI bot simulates the accidents’ major points and makes a verification based on decision rules; successful verification leads to authorization for payment. The process takes minutes. Only complex cases are sent to investigation by human processors. Customers are delighted since they can get fast resolutions. While at the moment AVA is limited to certain types of claims, its range of suitability is increasing with the learning capabilities of machine learning and the advances in AI algorithms.
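The rule-based verification step described above can be sketched as follows. This is an illustration only, not AVA's actual logic: the `Claim` fields, the set of automatically handled claim types, and the payment limit are all hypothetical values chosen to show how decision rules can separate simple claims from those sent to human processors.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    claim_type: str          # e.g., "glass" or "collision"
    amount: float            # amount requested by the customer
    telematics_match: bool   # does the trip data agree with the reported accident?


# Hypothetical rule settings, chosen only for illustration.
AUTO_TYPES = {"glass", "towing"}
AUTO_LIMIT = 1000.0


def process_claim(claim):
    """Return 'pay' for claims that pass every rule, else 'human_review'."""
    if claim.claim_type not in AUTO_TYPES:
        return "human_review"   # the bot handles only certain claim types
    if not claim.telematics_match:
        return "human_review"   # verification against the telematic data failed
    if claim.amount > AUTO_LIMIT:
        return "human_review"   # treated here as a complex case
    return "pay"
```

Widening `AUTO_TYPES` over time would correspond to the growing range of claims that such a bot can handle.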

Note: A 2015 start-up, Lemonade (lemonade.com) provides an AI-based platform for insurance that includes bots and machine learning. For details, see Gagliordi (2017).

Application Case 2.5 US Bank Customer Recognition and Services

As of July 2017, US Bank has been able to automatically identify military service members and veterans when they call or enter one of its branches. This is not a simple task. The service members are recognized by Einstein, an AI-based CRM service from Salesforce Inc. (see Section 2.9).

What US Bank is trying to do is to recognize customers and understand their needs. Einstein helps the bank gain a competitive advantage in doing so. Knowledge provided is important not only for marketing and providing targeted professional financial services but also for greeting customers on their birthdays or thanking them for using the bank’s services.

The bank now has considerable information about customers available to human agents in real time. Such information helps customers when online and when at one of the bank’s actual locations.

The AI application tells the rep all about the customer so the rep can offer appropriate services. For example, if the customer needs insurance, the AI will detect this need and the rep will offer a good alternative. It also offers information to an online customer: “Hello, Mary; I see you are checking your mortgage payments. I have good news for you. . . .”

Source: Compiled from Crosman (2017) and Carey (2017).

Questions for Case 2.5

1. What are Einstein’s advantages to US Bank?

2. What are its advantages to customers?

3. What are the benefits of voice communication?

SECTION 2.7 REVIEW QUESTIONS

1. What are the new ways that banks interact with customers by using AI?

2. It is said that financial services are more personalized with AI support. Explain.

3. What back-office activities in banks are facilitated by AI?

4. How can AI contribute to security and safety?

5. What is the role of chatbots and virtual assistants in financial services?

6. How can IBM Watson help banking services?

7. Relate Salesforce Einstein to CRM in financial services.

8. How can AI help in processing insurance claims?

2.8 AI IN HUMAN RESOURCE MANAGEMENT (HRM)

As in other business functional areas, the use of AI technologies is spreading rapidly in HRM. And as in other areas, the AI services reduce cost and increase productivity, consistency, and speed of execution.

AI in HRM: An Overview

Savar (2017) points to the following reasons for AI to transform HRM, especially in recruiting: (1) reducing human bias, (2) increasing efficiency, productivity, and insight in evaluating candidates, and (3) improving relationships with current employees.

Wislow (2017) sees the use of AI as a continuation of automation that supports HRM and keeps changing it. Wislow suggests that such automation changes how HRM employ- ees work and are engaged. This change also strengthens teamwork. Wislow divided the impact of AI into the following areas:

RECRUITMENT (TALENT ACQUISITION) One of the cumbersome tasks in HRM, especially in large organizations, is recruiting new employees. The fact is that many job positions are unfilled due to difficulties in finding the right employees. At the same time, many qualified people cannot find the right jobs.

AI improves the recruiting process as illustrated in Application Case 2.6. The use of chatbots to facilitate recruitment is also described by Meister (2017). Companies that help recruiters and job seekers, especially LinkedIn, are using AI algorithms to suggest matches to both recruiters and job seekers. Haines (2017) describes the process, noting that a key benefit of this process is the removal of unconscious biases and prejudices of humans.
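One simple way to picture such matching is to score each candidate against a desired profile, criterion by criterion, and then rank candidates by their overall fit. The sketch below is an illustration under stated assumptions, not LinkedIn's or any vendor's algorithm: the criteria, the weights, and the `fit_scores` and `rank_candidates` functions are hypothetical, and every desired value in the profile is assumed to be greater than zero.

```python
def fit_scores(candidate, profile):
    """Per-criterion fit in [0, 1]: the candidate's value relative to the desired value."""
    return {
        criterion: min(candidate.get(criterion, 0) / desired, 1.0)
        for criterion, desired in profile.items()
    }


def rank_candidates(candidates, profile, weights):
    """Sort candidates by weighted average fit, best first."""
    total_weight = sum(weights.values())
    ranked = []
    for name, attrs in candidates.items():
        scores = fit_scores(attrs, profile)
        overall = sum(scores[c] * weights[c] for c in profile) / total_weight
        ranked.append((name, round(overall, 3), scores))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked
```

Returning the per-criterion scores alongside the overall ranking lets a human recruiter see why a candidate was ranked where it was before making the final decision.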

AI FACILITATES TRAINING The rapid technological developments make it necessary to train and retrain employees. AI methods can be used to facilitate learning. For example, chatbots can be used as a source of knowledge to answer learners’ queries. Online courses are popular with employees, and AI can be used, for example, to test their progress. In addition, AI can be used to personalize online teaching for individuals and to design group lectures.

AI SUPPORTS PERFORMANCE ANALYSIS (EVALUATION) AI tools enable HR management to conduct performance analysis by breaking work into many small components and by measuring the performance of each employee and team on each component. The performance is compared to objectives, which are provided to employees and teams. AI also can track changes and progress by combining AI with analytical tools.

AI USE IN RETENTION AND ATTRITION DETECTION In order to keep employees from leaving, it is necessary for businesses to analyze and predict how to make workers happy. Machine learning can be used to detect reasons why employees leave companies by identifying influencing patterns.

AI in Onboarding

Once new employees are hired, the HR department needs help introducing them to the organizational culture and operating processes. Some new employees require much attention. AI helps HRM prepare customized onboarding paths that are best for the newcomers. Results showed that those employees supported by AI-based plans tend to stay longer in organizations (Wislow, 2017).

USING CHATBOTS FOR SUPPORTING HRM The use of chatbots in HRM is increasing rapidly. Their ability to provide current information to employees anytime is a major reason. Dickson (2017) refers to the following chatbots: Mya, a recruiting assistant, and Job Bot, which supports the recruitment of hourly workers. This bot is also used as a plug-in to Craigslist. Another chatbot mentioned earlier is Olivia; see olivia.paradox.ai/.

Introducing AI to HRM Operations

Introducing AI to HRM operations is similar to introducing AI to other functional areas. Meister (2017) suggests the following activities:

1. Experiment with a variety of chatbots

2. Develop a team approach involving other functional areas

3. Properly plan a technology roadmap for both the short and long term, including shared vision with other functional areas

4. Identify new job roles and modifications in existing job roles in the transformed environment

5. Train and educate the HRM team to understand AI and gain expertise in it

Application Case 2.6 How Alexander Mann Solutions (AMS) Is Using AI to Support the Recruiting Process

Alexander Mann is a Chicago-based company that offers AI solutions to support the employee recruitment process. The major objective is to help companies solve HRM problems and challenges. The AI is used to:

1. Help companies evaluate applicants and their resumes by using machine learning. The result is the decision regarding which applicants to invite for an interview.

2. Help companies evaluate resumes that are posted on the Web. The AI software can use key words for the search related to the background of employees (e.g., training, years of experience).

3. Evaluate the resumes of the best employees who currently work in a company and create, accordingly, desired profiles to be used when vacancies occur. These profiles are then compared to those of applying candidates, and the top ones are ranked by their fit to each job opening. In addition to the ranking, the AI program shows the fit with each desired criterion. At this stage, the human recruiter can make a final selection decision. This way, the selection process is faster and much more accurate.

The accuracy of the process solves the candidate volume problem, ensuring that qualified people are not missed and poorly fit applicants are not selected.

Alexander Mann is also helping its clients to install chatbots that can provide candidates with answers to questions related to the jobs and the working conditions at the employing company. (For the recruiting chatbot, see Dickson, 2017.)

Sources: Compiled from Huang (2017), Dickson (2017), and alexandermannsolutions.com, accessed June 2018.

Questions for Case 2.6

1. What types of decisions are supported?

2. Comment on the human–machine collaboration.

3. What are the benefits to recruiters? To applicants?

4. Which tasks in the recruiting process are fully automated?

5. What are the benefits of such automation?

For additional information and discussion, see Essex (2017).

SECTION 2.8 REVIEW QUESTIONS

1. List the activities in recruiting and explain the support provided by AI to each.

2. What are the benefits rewarded to recruiters by AI?

3. What are the benefits to job seekers?

4. How does AI facilitate training?

5. How is performance evaluation of employees improved by AI?

6. How can companies increase retention and reduce attrition with AI?

7. Describe the role of chatbots in supporting HRM.

2.9 AI IN MARKETING, ADVERTISING, AND CRM

There are probably more applications of AI in marketing and advertising than in other business areas. For example, AI-based product recommendations have been in use by Amazon.com and other e-commerce companies for more than 20 years. Due to the large number of applications, we provide only a few examples here.

Overview of Major Applications

Davis (2016) provides 15 examples of AI in marketing as listed with explanations by the authors of this book and from Martin (2017). Also see Pennington (2018).

1. Product and personal recommendations. From Amazon.com’s book recommendations to Netflix’s movie recommendations, AI-based technologies are used extensively for personalized recommendations (e.g., see Martin, 2017).

2. Smart search engines. Google is using RankBrain’s AI system to interpret users’ queries. Using NLP helps in understanding the products or services for which online users are searching. This includes the use of voice communication.

3. Fraud and data breaches detection. Application for this has covered credit/debit card use for many years, protecting Visa and other card issuers. Similar technologies protect retailers (such as Target and Neiman Marcus) from hackers’ attacks.

4. Social semantics. Using AI-based technologies, such as sentiment analysis and image and voice recognitions, retailers can learn about customers’ needs and provide targeted advertisements and product recommendations directly (e.g., via e-mail) and through social media.

5. Web site design. Using AI methods, marketers are able to design attractive Web sites.

6. Producer pricing. AI algorithms help retailers price products and services in a dynamic fashion based on the competition, customers’ requirements, and more. For example, AI provides predictive analysis to forecast the impact of different price levels.

7. Predictive customer service. Similar to predicting the impact of pricing, AI can help in predicting the impact of different customer service options.

8. Ad targeting. Similar to product recommendations, which are based on user profiles, marketers can tailor ads to individual customers. The AI machines attempt to match different ads with individuals.

9. Speech recognition. As the trend to use voice in human–machine interaction is increasing, the use of bots by marketers to provide product information and prices accelerates. Customers prefer to talk to bots rather than to key in dialogue.

10. Language translation. AI enables conversations between people who speak different languages. Also, customers can buy from Web sites written in languages they do not speak by using Google Translate, for example.


11. Customer segmentation. Marketers are segmenting customers into groups and then tailoring ads to each group. While less effective than targeting individuals, this is more effective than mass advertising. AI can use data and text mining to help marketers identify the characteristics of specific segments (e.g., by mining historical files) as well as help tailor the best ads for each segment.

12. Sales forecasting. Marketers’ strategy and planning are based on sales forecasting. Such forecasting may be very difficult for certain products. Uncertainties may exist in many situations such as in customer need assessment. Predictive analytics and other AI tools can provide better forecasting than traditional statistical tools.

13. Image recognition. This can be useful in market research (e.g., for identifying consumer preferences of one company’s products versus those of its competition). It can also be used for detecting defects in producing and/or packaging products.

14. Content generation. Marketers continuously create ads and product information. AI can expedite this task and make sure that it is consistent and complies with regulations. Also, AI can help in generating targeted content to both individuals and segments of consumers.

15. Using bots, assistants, and robo advisors. In Chapter 12, we describe how bots, personal assistants, and robo advisors help consumers of products and services. Also, these AI machines excel in facilitating customer experience and strengthen customer relationship management. Some experts call bots and virtual personal assistants the “face of marketing.”

Another list is provided at en.wikipedia.org/wiki/Marketing_and_artificial_intelligence.
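To make the first item in the list more concrete, the sketch below shows one very simple form of recommendation: suggesting items that often appear together with what a customer already bought. It is an assumption-based illustration, not the method used by Amazon.com or Netflix; the function names and the co-occurrence counting rule are chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same customer's history."""
    pairs = Counter()
    for items in baskets.values():
        for a, b in combinations(sorted(set(items)), 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    return pairs


def recommend(user, baskets, top_n=2):
    """Suggest items that co-occur most often with what the user already has."""
    owned = set(baskets[user])
    pairs = build_cooccurrence(baskets)
    scores = Counter()
    for (a, b), count in pairs.items():
        if a in owned and b not in owned:
            scores[b] += count
    # Sort by score (descending), then by name, so ties are resolved predictably.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]
```

The more purchase histories are available, the more reliable the co-occurrence counts become, which is one reason such recommendations tend to improve as a system collects more data.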

AI Marketing Assistants in Action

There are many ways that AI can be used in marketing. One way is illustrated in Application Case 2.7 about Kraft Foods.

Customer Experiences and CRM

As described earlier, a major impact of AI technologies is changing customer experiences. A notable example is the use of conversational bots. Bots (e.g., Alexa) can provide information about products and companies and can provide advice and guidance (e.g., robo advisors for investment; see Chapter 12). Gangwani (2016) lists the following ways to improve customers’ experiences:

1. Use NLP for generating user documentation. This capability also improves the customer–machine dialogue.

2. Use visual categorization to organize images (for example, see IBM’s Visual Recognition and Clarifai).

3. Provide personalized and segmented services by analyzing customer data. This includes improving shopping experience and CRM.

A well-known example of AI in CRM is Salesforce’s Einstein.

Example: Salesforce’s AI Einstein

Salesforce Einstein is an AI set of technologies (e.g., Einstein Vision for image recognition) that is used for enhancing customer interactions and supporting sales. For example, the system delivers dynamic sales dashboards to sales reps. It also tracks performance and manages teamwork by using sales analytics. The AI product also can provide predictions and recommendations. It supports Salesforce Customer Success Platform and other Salesforce products.

Einstein’s automatically prioritized sales leads make sales reps more productive when dealing with sales leads and potential opportunities. The sales reps also get insights about customers’ sentiments, competitors’ involvement, and other information.

For information and a demo, see salesforce.com/products/einstein/overview/. For features and description of the product, see zdnet.com/article/salesforces-einstein-ai-platform-what-you-need-to-know/. For additional features, see salesforce.com/products/einstein/features/.

Application Case 2.7 Kraft Foods Uses AI for Marketing and CRM

The number of mobile users is growing rapidly as is the number of mobile shoppers. Kraft Foods took notice of that. The company is adapting its advertising and sales to this trend. Mobile customers are looking for brands and interacting with Kraft brands. Kraft Foods wanted to make it easy for customers to interact with the company whenever and wherever they want. To achieve this interaction goal, Kraft Foods created a “Food Assistant,” also known as Kraft Food Assistant.

The Kraft Food Assistant

Kraft’s Food Assistant is an app for smartphones that allows customers to access more than 700 recipes. Thus, the consumer can browse easily for ideas. Customers enter a virtual store and open the “recipe of the day.” The app tells the user all the ingredients needed for that recipe or for any desired recipe. The Food Assistant also posts all the relevant coupons available for the ingredients on users’ smartphones. Users need only to take the smartphone to a supermarket, scan the coupons, and save on the ingredients. The recipe of the day is also demonstrated on a video. Unique to this app is the inclusion of an AI algorithm that learns from users’ orders and can infer, for example, the users’ family size. The more the AI learns about users, the more suggestions it makes. For example, it tells users what to do with their leftover ingredients. In addition, the more the Food Assistant learns about users, the more useful suggestions for recipes and cooking it can offer. It is like the Netflix recommender. The more Kraft products that users buy (the ingredients), the more advice they get. The Food Assistant also directs users to the nearest store that has the recipes’ ingredients. Users can get assistance on how to prepare food in 20 minutes and on many cooking-related topics.

The AI is tracking consumers’ behavior. Information is stored on each user’s loyalty card. The system makes inferences about what consumers like and targets related promotions to them. This process is called behavioral pattern recognition, and is based on AI techniques such as “collaborative filtering.” (See Chapter 12.)

AI assistants also can tweak messages to users, and they know if users are interested in their topics. The assistant also knows whether customers are responding positively and whether they are or are not motivated to try a new product or purchase more of what they previously purchased. The Kraft AI Food Assistant actually is trying to influence and sometimes to modify consumer behavior. Like other vendors, Kraft is using the information collected by the AI assistant to forge and execute mobile and regular commerce strategies.

Using the information collected, Kraft and similar vendors can expand their mobile marketing programs both online and in physical stores.

Note: Users can interact with the system with voice powered by Nuance Communication. The system is based on natural language processing.

Sources: Compiled from Celentano (2016), press releases at nuance.com, and kraftrecipes.com/media/iphoneassistant.aspx/, accessed March 2018.

Questions for Case 2.7

1. Identify all AI technologies used in the Food Assistant.

2. List the benefits to the customers.

3. List the benefits to Kraft Foods.

4. How is advertising done?

5. What role is “behavioral pattern recognition” playing?

6. Compare Kraft’s Food Assistant to Amazon.com and Netflix recommendation systems.

Other Uses of AI in Marketing

The following show the diversity of AI technologies used in marketing:

• It is used to mimic the expertise of in-store salespeople. In many physical stores, humans are not readily available to help customers who do not want to wait very long. Thus, shopping is made easier when bots provide guidance. A Japanese store already provides all services in a physical store by speaking robots.

• It provides lead generation. As seen in the case of Einstein, AI can help generate sales leads by analyzing customers’ data. The program can generate predictions. Insights can be generated by intelligent analytics.

• It can increase customer loyalty using personalization. For example, some AI techniques can recognize regular customers (e.g., in banks). IBM Watson can learn about people from their tweets.

• Salesforce.com provides a free e-book, “Everything You Need to Know about AI for CRM” (salesforce.com/form/pdf/ai-for-crm.jsp).

• It can improve the sales pipeline. Narayan (2018) provides a process of how companies can use AI and robots to do this. Specifically, robots convert unknown visitors into customers. Robots use three stages: (1) prepare a list of target customers in the database, (2) send information, ads, videos, and so on to prospects on the list created earlier, and (3) provide the company sales department with a list of leads that successfully convert potential customers to buyers.

SECTION 2.9 REVIEW QUESTIONS

1. List five of the 15 applications of Davis (2016). Comment on each.

2. Which of the 15 applications relate to sales?

3. Which of the 15 applications relate to advertising?

4. Which of the 15 applications relate to customer service and CRM?

5. For what are the prediction capabilities of AI used?

6. What is the Salesforce’s Einstein?

7. How can AI be used to improve CRM?

2.10 AI APPLICATIONS IN PRODUCTION-OPERATION MANAGEMENT (POM)

The field of POM is much diversified, and its use of AI is evident today in many areas. To describe all of them, we would need more than a whole book. In the remaining chapters, we provide dozens of examples about AI applications in POM. Here, we provide only a brief discussion regarding two related application areas: manufacturing and logistics.

AI in Manufacturing

To handle ever-increasing labor costs, changes in customers’ requirements, increased global competition, and government regulations (Chapter 1), manufacturing companies are using elevated levels of automation and digitization. According to Bollard et al. (2017), companies need to be more agile and to react more quickly and more effectively. They also need to be more efficient and improve customers’ (organizations’ and individuals’) experiences. Companies are pressured to cut costs and increase quality and transparency. To achieve these goals, they need to automate processes and make use of AI and other cutting-edge technologies.

Implementation Model

Bollard et al. (2017) proposed a five-component model for manufacturing companies to use intelligent technologies. This model includes:

• Streamlining processes, including minimizing waste, redesigning processes, and using business process management (BPM)

• Outsourcing certain business processes, including going offshore

• Using intelligence in decision making by deploying AI and analytics

• Replacing human tasks with intelligent automation

• Digitizing customers’ experiences

Companies have used this model for a long time. In fact, robots have been used since around 1960 (e.g., Unimate in General Motors). However, the robots were “dumb,” each usually doing one simple task. Today, companies use intelligent robots for complex tasks, enabling make-to-order products and mass customization. In other words, many mental and cognitive tasks are being automated. These developments, involving AI and sensors, allow supporting or even automating production decisions in real time.

Example

When a sensor detects a defective product or a malfunction, the data are processed by an AI algorithm. An action then takes place instantly and automatically. For example, a defective item can be removed or replaced. AI can even make predictions about equipment failures before they occur (see the opening vignette in Chapter 1). This real-time action saves a huge amount of money for manufacturers. (This process may involve the IoT; see Chapter 13.)
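The sense-decide-act loop in this example can be sketched in a few lines. The sketch is a simplification under stated assumptions: a rolling-average rule stands in for the AI algorithm, and the `SensorMonitor` class, its window size, and its tolerance are hypothetical choices made only for illustration.

```python
from collections import deque


class SensorMonitor:
    """Flag readings that drift too far from the recent average.

    Assumption: a rolling-average rule stands in for the AI algorithm.
    """

    def __init__(self, window=5, tolerance=0.2):
        self.readings = deque(maxlen=window)
        self.tolerance = tolerance  # allowed relative deviation, e.g., 0.2 = 20%

    def check(self, value):
        """Return the action for this reading: 'ok' or 'remove_item'."""
        action = "ok"
        if len(self.readings) == self.readings.maxlen:
            baseline = sum(self.readings) / len(self.readings)
            if baseline != 0 and abs(value - baseline) / abs(baseline) > self.tolerance:
                action = "remove_item"
        if action == "ok":
            self.readings.append(value)  # only normal readings update the baseline
        return action
```

Each reading is judged as soon as it arrives, which mirrors the instant, automatic action described above; predicting failures before they occur would require a trained model rather than this fixed rule.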

Intelligent Factories

Ultimately, companies will use smart or intelligent factories (see Chapter 13). These factories use complex software and sensors. An example of a lead supplier is General Electric, which provides software such as OEE Performance Analyzer and Production Execution Supervisor. The software is maintained in the “cloud” and it is provided as a “software-as-a-service.” GE partners with Cisco and FTC to provide security, connectivity, and special analytics.

In addition to GE, well-known companies such as Siemens and Hitachi provide comprehensive solutions. For an example, see Hitachi AI Technology’s Report (social-innovation.hitachi/ph/solutions/ai/pdf/ai_en_170310.pdf).

Many small vendors are specializing in different aspects of AI for manufacturing. For example, BellHawk Systems Corporation, which provides services to small companies, specializes in real-time operations tracking (see Green, 2016).

Early successes were recorded by large companies such as Procter & Gamble and Toyota.

However, as time passes, medium-size and small companies can also afford AI services. For additional information, see bellhawk.com.


Logistics and Transportation

AI and intelligent robots are used extensively in corporate logistics and internal and external transportation, as well as in supply chain management. For example, Amazon.com is using over 50,000 robots to move items in its distribution centers (other e-commerce companies are doing the same). Soon, we will see driverless trucks and other autonomous vehicles all over the world (see Chapter 13).

Example: DHL Supply Chain

DHL is a global delivery company (competing with FedEx and UPS). It has a supply chain division that works with many business partners. AI and IoT are changing the manner by which the company, its partners, and even its competitors operate. DHL is developing innovative logistics and transportation business models, mostly with AI, IoT, and machine learning. These models also help DHL’s customers gain a competitive advantage (and this is why the company cannot provide details in its reports).

Several of the IoT projects are linked to machine learning, specifically in the areas of sensors, communication, device management, security, and analysis. Machine learning in such cases assists in tailoring solutions to specific requirements.

Overall, DHL concentrates on the areas of supply chains (e.g., identifying inventories and controlling them along the supply chain) and warehouse management. Machine learning and other AI algorithms enable more accurate procurement, production planning, and work coordination. Tagging items with Radio Frequency Identification (RFID) and Quick Response (QR) codes allows them to be tracked along the supply chain. Finally, AI facilitates predictive analytics, scheduling, and resource planning. For details, see Coward (2017).

SECTION 2.10 REVIEW QUESTIONS

1. Describe the role of robots in manufacturing.

2. Why use AI in manufacturing?

3. Describe the Bollard et al. implementation model.

4. What is an intelligent factory?

5. How are a company’s internal and external logistics supported by AI technologies?

Chapter Highlights

• The aim of artificial intelligence is to make machines perform tasks intelligently, possibly like people do.

• A major reason for using AI is to make work and decision making easier to perform. AI can be more capable (enable new applications and business models), more intuitive, and less threatening than other decision support applications.

• A major reason to use AI is to reduce cost and/or increase productivity.

• AI systems can work autonomously, saving time and money, and perform work consistently. They can also work in rural and remote areas where human expertise is rare or not available.

• AI can be used to improve all decision-making steps.

• Intelligent virtual systems can act as assistants to humans.

• AI systems are computer systems that exhibit low (but increasing) levels of intelligence.

• AI has several definitions and derivatives, and its importance is growing rapidly. The U.S. government postulated that AI will be a “critical driver of the U.S. economy” (Gaudin 2016).

• The major technologies of AI are intelligent agents, machine learning, robotic systems, NLP and speech recognition, computer vision, and knowledge systems.


• Expert systems, recommendation systems, chatbots, and robo advisors are all based on knowledge transferred to machines.

• The major limitations of AI are the lack of human touch and feel, the fear that it will take jobs from people, and the possibility that it could be destructive.

• AI is no match for humans in many cognitive tasks, but it can perform many manual tasks more quickly and at a lower cost.

• There are several types of intelligence, so it is difficult to measure AI’s capacity.

• In general, human intelligence is superior to that of machines. However, machines can beat people in complex games.

• Machine learning is currently the most useful AI technology. It attempts to learn from its experience to improve operations.

• Deep learning enables AI technologies to learn from each other, creating synergy in learning.

• Intelligent agents excel in performing simple tasks considerably faster and more consistently than humans (e.g., detecting viruses in computers).

• The major power of machine learning is a result of the machine’s ability to learn from data and its manipulation.

• Deep learning can solve many difficult problems.

• Computer vision can provide understandings from images, including from videos.

• Robots are electromechanical computerized systems that can perform physical and mental tasks. When provided with sensory devices, they can become intelligent.

• Computers can understand human languages and can generate text or voice in human languages.

• Cognitive computing simulates the human thought process for solving problems and making decisions.

• Computers can be fully automated in simple manual and mental tasks using AI.

• Several types of decision making are fully automated using AI; other types can be supported.

• AI is used extensively in all functional business departments, reducing cost and increasing productivity, accuracy, and consistency. There is a tendency to increase the use of chatbots. They all support decision making well.

• AI is used extensively in accounting, automating simple transactions, helping deal with Big Data, finding fraudulent transactions, increasing security, and assisting in auditing and compliance.

• AI is used extensively in financial services to improve customer service, provide investment advice, increase security, and facilitate payments, among other tasks. Notable applications are in banking and insurance.

• HRM is using AI to facilitate recruitment, enhance training, help onboarding, and streamline operations.

• There is considerable use of AI in marketing, sales, and advertising. AI is used to support product recommendation, help in search of products and services, facilitate Web site design, support pricing decisions, provide language translation in global trade, assist in forecasting and predictions, and use chatbots for many marketing and customer service activities.

• AI has been used in manufacturing for decades. Now it is applied to support planning, supply chain coordination, logistics and transportation, and operation of intelligent factories.

Key Terms

artificial brain
artificial intelligence (AI)
augmented intelligence
chatbots
computer vision
deep learning
intelligent agent
machine learning
machine vision
natural language processing (NLP)
robot
scene recognition
shopbot
speech (voice) understanding
Turing Test

Questions for Discussion

1. Discuss the difficulties in measuring the intelligence of machines.

2. Discuss the process that generates the power of AI.

3. Discuss the differences between machine learning and deep learning.


4. Describe the difference between machine vision and computer vision.

5. How can a vacuum cleaner be as intelligent as a six-year-old child?

6. Why are NLP and machine vision so prevalent in industry?

7. Why are chatbots becoming very popular?

8. Discuss the advantages and disadvantages of the Turing Test.

9. Why is augmented reality related to AI?

10. Discuss the support that AI can provide to decision makers.

11. Discuss the benefits of automatic and autonomous decision making.

12. Why is general (strong) AI considered to be “the most significant technology ever created by humans”?

13. Why is the cost of labor increasing, whereas the cost of AI is declining?

14. If an artificial brain someday contains as many neurons as the human brain, will it be as smart as a human brain? (Students need to do extra research.)

15. Distinguish between dumb robots and intelligent ones.

16. Discuss why applications of natural language processing and computer vision are popular and have many uses.

Exercises

1. Go to itunes.apple.com/us/app/public-transit-app-moovit/id498477945?mt=8. Compare Moovit operations to the operation of INRIX.

2. Go to sitezeus.com and view the 2:07 min. video. Explain how the technology works as a decision helper.

3. Go to Investopedia and learn about investors’ risk tolerance. Then find out how AI can be used to contain this risk, and write a report.

4. In 2017, McKinsey & Company created a five-part video titled “Ask the AI Experts: What Advice Would You Give to Executives About AI?” View the video and summarize the advice given to the major issues discussed. (Note: This is a class project.)

5. Watch the McKinsey & Company video (3:06 min.) on today’s drivers of AI at youtube.com/watch?v=yv0IG1D-OdU and identify the major AI drivers. Write a report.

6. Go to the Web site of the Association for the Advancement of Artificial Intelligence (aaai.org/home.html) and describe its content. Compare it to that of ai.sri.com and csail.mit.edu/.

7. Go to crosschx.com and find information about Olive. Explain how it works, what its limitations and advantages are, and which types of decisions it automates and which it only supports.

8. Go to waze.com and moovitapp.com and find their capabilities. Summarize the help they can provide users.

9. Go to sentient.ai. Find its products that facilitate e-commerce. Write a report.

10. Go to artificialbrain.org and report the latest progress there.

11. Find recent information on research that is aimed to measure artificial intelligence. Write a report.

12. Go to salesforce.com and find recent developments on AI Einstein. Why is it so popular?

13. Find the latest information on IBM Watson’s advising activities. Write a report.

14. Find information on the use of AI in iPhones. Explore the role of Edge AI. Write a report.

15. Explore the AI-related products and services of Nuance Inc. (nuance.com). Explore the Dragon voice recognition product.

16. Go to the Netradyne report at cs_netradyne.com/ and read about the use of its product for road safety. Write a report.

17. Go to salesforce.com and investigate the capabilities of Gecko HRM. Relate it to Salesforce Einstein. Provide examples of two applications.

18. Go to McKinsey & Company and find in its Five Fifty series “The Value AI Can Bring to Your Business” (mckinsey.com/featured-insights/artificial-intelligence/five-fifty-real-world-ai). Then look for “Real-World AI.” Find the banking section and dive more deeply into its content.

19. Find material on the impact of AI on advertising. Write a report.

20. Go to strategicsourceror.com/2018/03/giant-scale-supply-chains-can-make.html. Summarize the use of AI.

References

Agrawal, V. “How Successful Investors Are Using AI to Stay Ahead of the Competition.” ValueWalk, January 28, 2018.

Alpaydin, E. Machine Learning: The New AI (The MIT Press Essential Knowledge Series). Boston, MA: MIT Press, 2016.

Beauchamp, P. “Artificial Intelligence and the Insurance Industry: What You Need to Know.” The Huffington Post, October 27, 2016.

Blog. “Welcome to the Future: How AI Is Transforming Insurance.” Blog.metlife.com, October 1, 2017.

Bollard, A., et al. “The next-generation operating model for the digital world.” McKinsey & Company, March 2017.

BrandStudio. “Future-Proof: How Today’s Artificial Intelligence Solutions Are Taking Government Services to the Next Frontier.” Washington Post, August 22, 2017.


Carey, S. “US Bank Doubles Its Conversion Rate for Wealth Customers Using Salesforce Einstein.” Computerworld UK, November 10, 2017.

Carney, P. “Pat Carney: Artificial Intelligence versus Human Intelligence.” Vancouver Sun, April 7, 2018.

Celentano, D. “Kraft Foods iPhone Assistant Appeals to Time Starved Consumers.” The Balance, September 18, 2016.

Chandi, N. “How AI is Reshaping the Accounting Industry.” Forbes.com, July 20, 2017.

Clozel, L. “IBM Unveils New Watson tools to Help Banks Manage Compliance, AML.” American Banker, June 14, 2017.

Consultancy.uk. “How Artificial Intelligence Is Transforming the Banking Industry.” September 28, 2017. consultancy.uk/news/14017/how-artificial-intelligence-is-transforming-the-banking-industry/ (accessed June 2018).

Coward, J. “Artificial Intelligence Is Unshackling DHL’s Supply Chain Potential.” IoT Institute, April 18, 2017. ioti.com/industrial-iot/artificial-intelligence-unshackling-dhls-supply-chain-potential (accessed June 2018).

Crosman, P. “U.S. Bank Bets AI Can Finally Deliver 360-Degree View.” American Banker, July 20, 2017.

Davis, B. “15 Examples of Artificial Intelligence in Marketing.” Econsultancy, April 19, 2016.

Dickson, B. “How Artificial Intelligence Optimizes Recruitment.” The Next Web, June 3, 2017.

Dodge, J. “Artificial Intelligence in the Enterprise: It’s On.” Computerworld, February 10, 2016.

Dormehl, L. Thinking Machines: The Quest for Artificial Intelligence—and Where It’s Taking Us Next. New York, NY: Tarcher-Perigee, 2017.

Essex, D. “AI in HR: Artificial Intelligence to Bring Out the Best in People.” TechTarget Essential Guide, April 2017.

E. V. Staff. “Artificial Intelligence Used to Predict Short-Term Share Price Movements.” The Economic Voice, June 22, 2017.

Finlay, S. Artificial Intelligence and Machine Learning for Business: A No-Nonsense Guide to Data Driven Technologies. 2nd ed. Seattle, WA: Relativistic, 2017.

Forrest, C. “7 Companies That Used Machine Learning to Solve Real Business Problems.” Tech Republic, March 8, 2017.

Fuller, D. “LG Claims Its Roboking Vacuum Is As Smart As a Child.” Androidheadlines.com, July 18, 2017.

Gagliordi, N. “Softbank Leads $120M Investment in AI-Based Insurance Startup Lemonade.” ZDNET, December 19, 2017.

Gangwani, T. “3 Ways to Improve Customer Experience Using A.I.” CIO Contributor Network, October 12, 2016.

Gaudin, S. “White House: A.I. Will Be Critical Driver of U.S. Economy.” Computerworld, October 12, 2016.

Gitlin, J. M. “Watch Out, Waze: INRIX’s New Traffic App Is Coming for You.” Ars Technica, March 30, 2016. arstechnica.com/cars/2016/watch-out-waze-inrixs-new-traffic-app-is-coming-for-you/ (accessed June 2018).

Greengard, S. “Delving into Gartner’s 2016 Hype Cycle.” Baseline, September 7, 2016.

Greig, J. “Gartner: AI Business Value Up 70% in 2018, and These Industries Will Benefit the Most.” Tech Republic, April 25, 2018.

Haines, D. “Is Artificial Intelligence Making It Easier and Quicker to Get a New Job?” Huffington Post UK, December 4, 2017.

Hauari, G. “Insurers Leverage AI to Unlock Legacy Claims Data.” Information Management, July 3, 2017.

Huang, G. “Why AI Doesn’t Mean Taking the ‘Human’ Out of Human Resources.” Forbes.com, September 27, 2017.

Hughes, T. “Google DeepMind’s Program Beat Human at Go.” USA Today, January 27, 2016.

ICAEW. “Artificial Intelligence and the Future of Accountancy.” artificial-intelligence-report.ashx/, 2017.

Kaplan, J. Artificial Intelligence: What Everyone Needs to Know. London, UK: Oxford University Press, 2016.

Kharpal, A. “A.I. Is in a ‘Golden Age’ and Solving Problems That Were Once in the Realm of Sci-Fi, Jeff Bezos Says.” CNBC News, May 8, 2017.

Kiron, D. “What Managers Need to Know About Artificial Intelligence?” MIT Sloan Management Review, January 25, 2017.

Knight, W. “Walmart’s Robotic Shopping Carts Are the Latest Sign That Automation Is Eating Commerce.” Technology Review, June 15, 2016.

Kolbjørnsrud, V., R. Amico, and R. J. Thomas. “How Artificial Intelligence Will Redefine Management.” Harvard Business Review, November 2, 2016.

Korosec, K. “Inrix Updates Traffic App to Learn Your Daily Habits.” Fortune Tech, March 30, 2016.

Liao, P-H., et al. “Applying Artificial Intelligence Technology to Support Decision-Making in Nursing: A Case Study in Taiwan.” Health Informatics Journal, June 2015.

Marr, B. “The Key Definitions of Artificial Intelligence That Explain Its Importance.” Forbes, February 14, 2018.

Marr, B. “What Everyone Should Know About Cognitive Computing.” Forbes.com, March 23, 2016.

Martin, J. “10 Things Marketers Need to Know about AI.” CIO.com, February 13, 2017.

McPherson, S.S. Artificial Intelligence: Building Smarter Machines. Breckenridge, CO: Twenty-First Century Books, 2017.

Meister, J. “The Future of Work: How Artificial Intelligence Will Transform the Employee Experience.” Forbes.com, November 9, 2017.

Metz, C. “Facebook’s Augmented Reality Engine Brings AI Right to Your Phone.” Wired, April 19, 2017.

Mittal, V. “Top 15 Deep Learning Applications That Will Rule the World in 2018 and Beyond.” Medium.com, October 3, 2017.

Narayan, K. “Leverage Artificial Intelligence to Build your Sales Pipeline.” LinkedIn, February 14, 2018.

Ng, A. “What Artificial Intelligence Can and Can’t Do Right Now.” Harvard Business Review, November 9, 2016.

Nordrum, A. “Hedge Funds Look to Machine Learning, Crowdsourcing for Competitive Advantage.” IEEE Spectrum, June 28, 2017.

Ovaska-Few, S. “How Artificial Intelligence Is Changing Accounting.” Journal of Accountancy, October 9, 2017.


Padmanabhan, G. “Industry-Specific Augmented Intelligence: A Catalyst for AI in the Enterprise.” Forbes, January 4, 2018.

Pennington, R. “Artificial Intelligence: The New Tool for Accomplishing an Old Goal in Marketing.” Huffington Post, January 16, 2018.

Press, G. “Top 10 Hot Artificial Intelligence (AI) Technologies.” Forbes, January 23, 2017.

Pyle, D., and C. San José. “An Executive’s Guide to Machine Learning.” McKinsey & Company, June 2015.

Reinharz, S. An Introduction to Artificial Intelligence: Professional Edition: An Introductory Guide to the Evolution of Artificial Intelligence. Kindle Edition. Seattle, WA: Simultaneous Device Usage (Amazon Digital Service), 2017.

Sample, I. “AI Watchdog Needed to Regulate Automated Decision-Making, Say Experts.” The Guardian, January 27, 2017.

Santana, D. “Metromile Launches AI Claims Platform.” Digital Insurance, July 25, 2017.

Savar, A. “3 Ways That A.I. Is Transforming HR and Recruiting.” INC.com, June 26, 2017.

Schrage, M. “4 Models for Using AI to Make Decisions.” Harvard Business Review, January 27, 2017.

Shah, J. “Robots Are Learning Complex Tasks Just by Watching Humans Do Them.” Harvard Business Review, June 21, 2016.

Sharma, G. “China Unveils Multi-Billion Dollar Artificial Intelligence Plan.” International Business Times, July 20, 2017. ibtimes.co.uk/china-unveils-multi-billion-dollar-artificial-intelligence-plan-1631171/ (accessed January 2018).

Sincavage, D. “How Artificial Intelligence Will Change Decision-Making for Businesses.” Business 2 Community, August 24, 2017.

Singh, H. “How Artificial Intelligence Will Transform Financial Services.” Information Management, June 6, 2017.

SMBWorld Asia Editors. “Hays: Artificial Intelligence Set to Revolutionize Recruitment.” Enterprise Innovation, August 30, 2017.

Smith, J. Machine Learning: Machine Learning for Beginners. Can Machines Really Learn Like Humans? All About Artificial Intelligence (AI), Deep Learning and Digital Neural Networks. Kindle Edition. Seattle, WA: Amazon Digital Service, 2017.

Staff. “Assisted, Augmented and Autonomous: The 3 Flavours of AI Decisions.” Software and Technology, June 28, 2017. tgdaily.com/technology/assisted-augmented-and-autonomous-the-3-flavours-of-ai-decisions

Steffi, S. “List of 50 Unique AI Technologies.” HackerNoon.com, October 18, 2017.

Taylor, P. “Welcome to the Machine – Learning.” Forbes BrandVoice, June 3, 2016. forbes.com/sites/sap/2016/06/03/welcome-to-the-machine-learning/#3175d50940fe (accessed June 2017).

Theobald, O. Machine Learning for Absolute Beginners: A Plain English Introduction. Kindle Edition. Seattle, WA, 2017.

Tiwan, R. “Artificial Intelligence (AI) in Banking Case Study Report 2017.” iCrowd Newswire, July 7, 2017.

USC. “AI Computer Vision Breakthrough IDs Poachers in Less Than Half a Second.” Press Release, February 8, 2018.

Violino, B. “Most Firms Expect Rapid Returns on Artificial Intelligence Investments.” Information Management, November 1, 2017.

Warawa, J. “Here’s Why Accountants (Yes, YOU!) Should Be Driving AI Innovation.” CPA Practice Advisor, November 1, 2017.

Waxer, C. “Get Ready for the BOT Revolution.” Computerworld, October 17, 2016.

Wellers, D., et al. “8 Ways Machine Learning Is Improving Companies’ Work Processes.” Harvard Business Review, May 31, 2017.

Wislow, E. “5 Ways to Use Artificial Intelligence (AI) in Human Resources.” Big Data Made Simple, October 24, 2017. bigdata-madesimple.com/5-ways-to-use-artificial-intelligence-ai-in-human-resources/.

Yurcan, B. “TD’s Innovation Agenda: Experiments with Alexa, AI and Augmented Reality.” Information Management, December 27, 2017.

Zarkadakis, G. In Our Own Image: Savior or Destroyer? The History and Future of Artificial Intelligence. New York, NY: Pegasus Books, 2016.

Zhou, A. “EY, Deloitte and PwC Embrace Artificial Intelligence for Tax and Accounting.” Forbes.com, November 14, 2017.


CHAPTER 3

Nature of Data, Statistical Modeling, and Visualization

LEARNING OBJECTIVES

• Understand the nature of data as they relate to business intelligence (BI) and analytics

• Learn the methods used to make real-world data analytics ready

• Describe statistical modeling and its relationship to business analytics

• Learn about descriptive and inferential statistics

• Define business reporting and understand its historical evolution

• Understand the importance of data/information visualization

• Learn different types of visualization techniques

• Appreciate the value that visual analytics brings to business analytics

• Know the capabilities and limitations of dashboards

In the age of Big Data and business analytics in which we are living, the importance of data is undeniable. Newly coined phrases such as “data are the oil,” “data are the new bacon,” “data are the new currency,” and “data are the king” are further stressing the renewed importance of data. But the type of data we are talking about is obviously not just any data. The “garbage in garbage out—GIGO” concept/principle applies to today’s Big Data phenomenon more so than any data definition that we have had in the past. To live up to their promise, value proposition, and ability to turn into insight, data have to be carefully created/identified, collected, integrated, cleaned, transformed, and properly contextualized for use in accurate and timely decision making.

Data are the main theme of this chapter. Accordingly, the chapter starts with a description of the nature of data: what they are, what different types and forms they can come in, and how they can be preprocessed and made ready for analytics. The first few sections of the chapter are dedicated to a deep yet necessary understanding and processing of data. The next few sections describe the statistical methods used to prepare data as input to produce both descriptive and inferential measures. Following the statistics sections are sections on reporting and visualization. A report is a communication artifact prepared with the specific intention of converting data into information and knowledge and relaying that information in an easily understandable/digestible format. Today, these reports are visually oriented, often using colors and graphical icons that collectively look like a dashboard to enhance the information content. Therefore, the latter part of the chapter is dedicated to subsections that present the design, implementation, and best practices regarding information visualization, storytelling, and information dashboards.

This chapter has the following sections:

3.1 Opening Vignette: SiriusXM Attracts and Engages a New Generation of Radio Consumers with Data-Driven Marketing
3.2 Nature of Data
3.3 Simple Taxonomy of Data
3.4 Art and Science of Data Preprocessing
3.5 Statistical Modeling for Business Analytics
3.6 Regression Modeling for Inferential Statistics
3.7 Business Reporting
3.8 Data Visualization
3.9 Different Types of Charts and Graphs
3.10 Emergence of Visual Analytics
3.11 Information Dashboards

3.1 OPENING VIGNETTE: SiriusXM Attracts and Engages a New Generation of Radio Consumers with Data-Driven Marketing

SiriusXM Radio is a satellite radio powerhouse, the largest radio company in the world with $3.8 billion in annual revenues and a wide range of hugely popular music, sports, news, talk, and entertainment stations. The company, which began broadcasting in 2001 with 50,000 subscribers, had 18.8 million subscribers in 2009, and today has nearly 29 million.

Much of SiriusXM’s growth to date is rooted in creative arrangements with automobile manufacturers; today, nearly 70 percent of new cars are SiriusXM enabled. Yet the company’s reach extends far beyond car radios in the United States to a worldwide presence on the Internet, on smartphones, and through other services and distribution channels, including SONOS, JetBlue, and Dish.

BUSINESS CHALLENGE

Despite these remarkable successes, changes in customer demographics, technology, and the competitive landscape over the past few years have posed a new series of business challenges and opportunities for SiriusXM. Here are some notable ones:

• As its market penetration among new cars increased, the demographics of its buy- ers changed, skewing toward younger people with less discretionary income. How could SiriusXM reach this new demographic?

• As new cars become used cars and change hands, how could SiriusXM identify, engage, and convert second owners to paying customers?

• With its acquisition of the connected vehicle business from Agero—the leading provider of telematics in the U.S. car market—SiriusXM gained the ability to deliver its service via satellite and wireless networks. How could it successfully use this acquisition to capture new revenue streams?

Chapter 3 • Nature of Data, Statistical Modeling, and Visualization 119

PROPOSED SOLUTION: SHIFTING THE VISION TOWARD DATA-DRIVEN MARKETING

SiriusXM recognized that to address these challenges, it would need to become a high-performance, data-driven marketing organization. The company began making that shift by establishing three fundamental tenets. First, personalized interactions—not mass marketing—would rule the day. The company quickly understood that to conduct more personalized marketing, it would have to draw on past history and interactions as well as on a keen understanding of the consumer’s place in the subscription life cycle.

Second, to gain that understanding, information technology (IT) and its external technology partners would need the ability to deliver integrated data, advanced analytics, integrated marketing platforms, and multichannel delivery systems.

And third, the company could not achieve its business goals without an integrated and consistent point of view across the company. Most important, the technology and business sides of SiriusXM would have to become true partners to best address the challenges involved in becoming a high-performance marketing organization that draws on data-driven insights to speak directly with consumers in strikingly relevant ways.

Those data-driven insights, for example, would enable the company to differentiate between consumers, owners, drivers, listeners, and account holders. The insights would help SiriusXM to understand what other vehicles and services are part of each household and create new opportunities for engagement. In addition, by constructing a coherent and reliable 360-degree view of all its consumers, SiriusXM could ensure that all messaging in all campaigns and interactions would be tailored, relevant, and consistent across all channels. An important bonus is that more tailored and effective marketing is typically more cost-efficient.

IMPLEMENTATION: CREATING AND FOLLOWING THE PATH TO HIGH-PERFORMANCE MARKETING

At the time of its decision to become a high-performance marketing company, SiriusXM was working with a third-party marketing platform that did not have the capacity to support SiriusXM’s ambitions. The company then made an important, forward-thinking decision to bring its marketing capabilities in-house—and then carefully plotted what it would need to do to make the transition successfully.

1. Improve data cleanliness through improved master data management and governance. Although the company was understandably impatient to put ideas into action, data hygiene was a necessary first step to create a reliable window into consumer behavior.

2. Bring marketing analytics in-house and expand the data warehouse to enable scale and fully support integrated marketing analytics.

3. Develop new segmentation and scoring models to run in databases, eliminating latency and data duplication.

4. Extend the integrated data warehouse to include marketing data and scoring, leveraging in-database analytics.

5. Adopt a marketing platform for campaign development.

6. Bring all of its capability together to deliver real-time offer management across all marketing channels: call center, mobile, Web, and in-app.
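To make the data hygiene and scoring steps in the list above more concrete, here is a minimal Python sketch. The field names, weights, and segment cutoffs are entirely hypothetical (the vignette does not describe SiriusXM's actual models), and the vignette's models run inside the database rather than in application code. The sketch first normalizes and deduplicates subscriber records, then assigns each record a simple score and segment.

```python
def clean_records(records):
    """Normalize e-mail keys and keep one record per subscriber (last one wins)."""
    unique = {}
    for rec in records:
        email = rec.get("email", "").strip().lower()
        if not email:          # drop records without a usable key
            continue
        unique[email] = {**rec, "email": email}
    return list(unique.values())

def score(rec):
    """Toy propensity score in [0, 1]; the weights are illustrative assumptions."""
    hours = min(rec.get("listening_hours", 0), 40) / 40    # cap at 40 hours
    tenure = min(rec.get("months_subscribed", 0), 24) / 24  # cap at 24 months
    return round(0.6 * hours + 0.4 * tenure, 2)

def segment(s):
    """Map a score to a hypothetical marketing segment."""
    if s >= 0.7:
        return "retain"
    if s >= 0.3:
        return "nurture"
    return "win-back"

raw = [
    {"email": " Ann@Example.com ", "listening_hours": 30, "months_subscribed": 24},
    {"email": "ann@example.com", "listening_hours": 40, "months_subscribed": 24},
    {"email": "bob@example.com", "listening_hours": 4, "months_subscribed": 3},
    {"email": "", "listening_hours": 10, "months_subscribed": 1},
]
cleaned = clean_records(raw)
results = {r["email"]: segment(score(r)) for r in cleaned}
```

The point of the ordering is the one the vignette makes: scoring duplicate or keyless records would produce misleading segments, so cleaning comes first.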

Completing those steps meant finding the right technology partner. SiriusXM chose Teradata because its strengths were a powerful match for the project and company. Teradata offered the ability to:

• Consolidate data sources with an integrated data warehouse (IDW), advanced analytics, and powerful marketing applications.

• Solve data-latency issues.

120 Part I • Introduction to Analytics and AI

• Significantly reduce data movement across multiple databases and applications.

• Seamlessly interact with applications and modules for all marketing areas.

• Scale and perform at very high levels for running campaigns and analytics in-database.

• Conduct real-time communications with customers.

• Provide operational support, either via the cloud or on premise.

This partnership has enabled SiriusXM to move smoothly and swiftly along its road map, and the company is now in the midst of a transformational, five-year process. After establishing its strong data governance process, SiriusXM began by implementing its IDW, which allowed the company to quickly and reliably operationalize insights throughout the organization.

Next, the company implemented Customer Interaction Manager—part of the Teradata Integrated Marketing Cloud, which enables real-time, dialog-based customer interaction across the full spectrum of digital and traditional communication channels. SiriusXM also will incorporate the Teradata Digital Messaging Center.

Together, the suite of capabilities allows SiriusXM to handle direct communications across multiple channels. This evolution will enable real-time offers, marketing messages, and recommendations based on previous behavior.

In addition to streamlining the way it executes and optimizes outbound marketing activities, SiriusXM is also taking control of its internal marketing operations with the implementation of Marketing Resource Management, also part of the Teradata Integrated Marketing Cloud. The solution will allow SiriusXM to streamline workflow, optimize marketing resources, and drive efficiency through every penny of its marketing budget.

RESULTS: REAPING THE BENEFITS

As SiriusXM continues its evolution into a high-performance marketing organization, it already is benefiting from its thoughtfully executed strategy. Household-level consumer insights and a complete view of marketing touch strategy with each consumer enable SiriusXM to create more targeted offers at the household, consumer, and device levels. By bringing the data and marketing analytics capabilities in-house, SiriusXM achieved the following:

• Campaign results in near real time rather than four days, resulting in massive reductions in cycle times for campaigns and the analysts who support them.

• Closed-loop visibility, allowing the analysts to support multistage dialogs and in-campaign modifications to increase campaign effectiveness.

• Real-time modeling and scoring to increase marketing intelligence and sharpen campaign offers and responses at the speed of their business.

Finally, SiriusXM’s experience has reinforced the idea that high-performance marketing is a constantly evolving concept. The company has implemented both processes and the technology that give it the capacity for continued and flexible growth.

QUESTIONS FOR THE OPENING VIGNETTE

1. What does SiriusXM do? In what type of market does it conduct its business?

2. What were its challenges? Comment on both technology and data-related challenges.

3. What were the proposed solutions?

4. How did the company implement the proposed solutions? Did it face any implementation challenges?

5. What were the results and benefits? Were they worth the effort/investment?

6. Can you think of other companies facing similar challenges that can potentially benefit from similar data-driven marketing solutions?

WHAT WE CAN LEARN FROM THIS VIGNETTE

Striving to thrive in a fast-changing competitive industry, SiriusXM realized the need for a new and improved marketing infrastructure (one that relies on data and analytics) to effectively communicate its value proposition to its existing and potential customers. As is the case in any industry, success or mere survival in entertainment depends on intelligently sensing the changing trends (likes and dislikes) and putting together the right messages and policies to win new customers while retaining the existing ones. The key is to create and manage successful marketing campaigns that resonate with the target population of customers and have a close feedback loop to adjust and modify the message to optimize the outcome. In the end, it was all about the precision in the way that SiriusXM conducted business: being proactive about the changing nature of the clientele and creating and transmitting the right products and services in a timely manner using a fact-based/data-driven holistic marketing strategy. Source identification, source creation, access and collection, integration, cleaning, transformation, storage, and processing of relevant data played a critical role in SiriusXM’s success in designing and implementing a marketing analytics strategy, as is the case in any analytically savvy successful company today, regardless of the industry in which it participates.

Sources: C. Quinn, “Data-Driven Marketing at SiriusXM,” Teradata Articles & News, 2016. http://bigdata.teradata.com/US/Articles-News/Data-Driven-Marketing-At-SiriusXM/ (accessed August 2016); “SiriusXM Attracts and Engages a New Generation of Radio Consumers.” http://assets.teradata.com/resourceCenter/downloads/CaseStudies/EB8597.pdf?processed=1.

3.2 NATURE OF DATA

Data are the main ingredient for any BI, data science, and business analytics initiative. In fact, they can be viewed as the raw material for what popular decision technologies produce: information, insight, and knowledge. Without data, none of these technologies could exist and be popularized. Traditionally, analytics models were built using expert knowledge and experience coupled with very little or no data at all; those were the old days, however, and now data are of the essence. Once perceived as a big challenge to collect, store, and manage, data today are widely considered among the most valuable assets of an organization, with the potential to create invaluable insight to better understand customers, competitors, and the business processes.

Data can be small or very large. They can be structured (nicely organized for computers to process), or they can be unstructured (e.g., text that is created for humans and hence not readily understandable/consumable by computers). Data can come in small batches continuously or can pour in all at once as a large batch. These are some of the characteristics that define the inherent nature of today’s data, which we often call Big Data. Even though these characteristics of data make them more challenging to process and consume, they also make the data more valuable because the characteristics enrich them beyond their conventional limits, allowing for the discovery of new and novel knowledge. Traditional ways to manually collect data (via either surveys or human-entered business transactions) have mostly given way to modern-day data collection mechanisms that use Internet and/or sensor/radio frequency identification (RFID)–based computerized networks. These automated data collection systems are not only enabling us to collect larger volumes of data but also enhancing the data quality and integrity. Figure 3.1 illustrates a typical analytics continuum: data to analytics to actionable information.

Although their value proposition is undeniable, to live up to their promise, data must comply with some basic usability and quality metrics. Not all data are useful for all tasks, obviously. That is, data must match with (have the coverage of the specifics for) the task for which they are intended to be used. Even for a specific task, the relevant data on hand need to comply with the quality and quantity requirements. Essentially, data have to be analytics ready. So what does it mean to make data analytics ready? In addition to being relevant to the problem at hand and meeting the quality/quantity requirements, the data must also have a certain structure in place, with key fields/variables that have properly normalized values. Furthermore, there must be an organization-wide agreed-on definition for common variables and subject matters (sometimes also called master data management), such as how to define a customer (what characteristics of customers are used to produce a holistic enough representation for analytics) and where in the business process the customer-related information is captured, validated, stored, and updated.

FIGURE 3.1 A Data to Knowledge Continuum. (Diagram labels: Business Process (ERP, CRM, SCM); Internet/Social Media; Machines/Internet of Things; Cloud Storage and Computing (Data Storage, Analytics, Data Protection); Patterns, Trends, Knowledge; Applications; End Users.)

Sometimes the representation of the data depends on the type of analytics being employed. Predictive algorithms generally require a flat file with a target variable, so making data analytics ready for prediction means that data sets must be transformed into a flat-file format and made ready for ingestion into those predictive algorithms. It is also imperative to match the data to the needs and wants of a specific predictive algorithm and/or a software tool. For instance, neural network algorithms require all input variables to be numerically represented (even the nominal variables need to be converted into pseudo binary numeric variables), whereas decision tree algorithms do not require such numerical transformation; they can easily and natively handle a mix of nominal and numeric variables.
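As a rough illustration of what a "flat file with a target variable" can look like, the following sketch collapses transaction-level records into one row per customer and attaches a target column. The field names, the data, and the `to_flat_file` helper are all hypothetical and only meant to show the shape of the result, not a prescribed procedure.

```python
from collections import defaultdict

# Hypothetical transaction-level records (many rows per customer).
transactions = [
    {"customer_id": "C1", "amount": 40.0},
    {"customer_id": "C1", "amount": 60.0},
    {"customer_id": "C2", "amount": 15.0},
]

# Hypothetical target variable: 1 if the customer churned, 0 otherwise.
churned = {"C1": 0, "C2": 1}


def to_flat_file(transactions, target):
    """Collapse many rows per customer into one row per customer."""
    grouped = defaultdict(list)
    for t in transactions:
        grouped[t["customer_id"]].append(t["amount"])

    rows = []
    for cid in sorted(grouped):
        amounts = grouped[cid]
        rows.append({
            "customer_id": cid,
            "n_purchases": len(amounts),
            "total_amount": sum(amounts),
            "avg_amount": sum(amounts) / len(amounts),
            "target_churned": target[cid],  # the column a predictive model would learn
        })
    return rows


flat = to_flat_file(transactions, churned)
for row in flat:
    print(row)
```

Each output row now describes one case (a customer) with input variables and a single target, which is the layout most predictive algorithms expect.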

Analytics projects that overlook data-related tasks (some of the most critical steps) often end up with the wrong answer for the right problem, and these unintentionally created, seemingly good answers could lead to inaccurate and untimely decisions. Following are some of the characteristics (metrics) that define the readiness level of data for an analytics study (Delen, 2015; Kock, McQueen, & Corner, 1997).

• Data source reliability. This term refers to the originality and appropriateness of the storage medium where the data are obtained—answering the question of “Do we have the right confidence and belief in this data source?” If at all possible, one should always look for the original source/creator of the data to eliminate/mitigate the possibilities of data misrepresentation and data transformation caused by the mishandling of the data as they moved from the source to destination through one or more steps and stops along the way. Every move of the data creates a chance to unintentionally drop or reformat data items, which limits the integrity and perhaps true accuracy of the data set.

• Data content accuracy. This means that data are correct and are a good match for the analytics problem—answering the question of “Do we have the right data for the job?” The data should represent what was intended or defined by the original source of the data. For example, the customer’s contact information recorded within a database should be the same as what the customer said it was. Data accuracy will be covered in more detail in the following subsection.

• Data accessibility. This term means that the data are easily and readily obtainable, answering the question of “Can we easily get to the data when we need to?” Access to data can be tricky, especially if they are stored in more than one location and storage medium and need to be merged/transformed while accessing and obtaining them. As the traditional relational database management systems give way to (or coexist with) a new generation of data storage mediums such as data lakes and Hadoop infrastructure, the importance/criticality of data accessibility is also increasing.

• Data security and data privacy. Data security means that the data are secured to allow only those people who have the authority and the need to access them and to prevent anyone else from reaching them. Increasing popularity in educational degrees and certificate programs for Information Assurance is evidence of the criticality and the increasing urgency of this data quality metric. Any organization that maintains health records for individual patients must have systems in place that not only safeguard the data from unauthorized access (which is mandated by federal laws such as the Health Insurance Portability and Accountability Act [HIPAA]) but also accurately identify each patient to allow proper and timely access to records by authorized users (Annas, 2003).

• Data richness. This means that all required data elements are included in the data set. In essence, richness (or comprehensiveness) means that the available variables portray a rich enough dimensionality of the underlying subject matter for an accurate and worthy analytics study. It also means that the information content is complete (or near complete) to build a predictive and/or prescriptive analytics model.

• Data consistency. This means that the data are accurately collected and combined/merged. Consistent data represent the dimensional information (variables of interest) coming from potentially disparate sources but pertaining to the same subject. If the data integration/merging is not done properly, some of the variables of different subjects could appear in the same record; for instance, two different patient records could be mixed up while merging the demographic and clinical test result data records.

• Data currency/data timeliness. This means that the data should be up-to-date (or as recent/new as they need to be) for a given analytics model. It also means that the data are recorded at or near the time of the event or observation so that the time delay–related misrepresentation (incorrectly remembering and encoding) of the data is prevented. Because accurate analytics relies on accurate and timely data, an essential characteristic of analytics-ready data is the timeliness of the creation and access to data elements.

• Data granularity. This requires that the variables and data values be defined at the lowest (or as low as required) level of detail for the intended use of the data. If the data are aggregated, they might not contain the level of detail needed for an analytics algorithm to learn how to discern different records/cases from one another. For example, in a medical setting, numerical values for laboratory results should be recorded to the appropriate decimal place as required for the meaningful interpretation of test results and proper use of those values within an analytics algorithm. Similarly, in the collection of demographic data, data elements should be defined at a granular level to determine the differences in outcomes of care among various subpopulations. One thing to remember is that data that have been aggregated cannot be disaggregated (without access to the original source), whereas granular data can easily be aggregated.

• Data validity. This is the term used to describe a match/mismatch between the actual and expected data values of a given variable. As part of data definition, the acceptable values or value ranges for each data element must be defined. For example, a valid data definition related to gender would include three values: male, female, and unknown.

• Data relevancy. This means that the variables in the data set are all relevant to the study being conducted. Relevancy is not a dichotomous measure (whether a variable is relevant or not); rather, it has a spectrum of relevancy from least relevant to most relevant. Based on the analytics algorithms being used, one can choose to include only the most relevant information (i.e., variables) or, if the algorithm is capable enough to sort them out, can choose to include all the relevant ones regardless of their levels. One thing that analytics studies should avoid is including totally irrelevant data into the model building because this could contaminate the information for the algorithm, resulting in inaccurate and misleading results.

The above-listed characteristics are perhaps the most prevailing metrics to keep up with; true data quality and excellent analytics readiness for a specific application domain would require different levels of emphasis to be placed on these metric dimensions and perhaps the addition of more specific ones to this collection. The following section will delve into the nature of data from a taxonomical perspective to list and define different data types as they relate to different analytics projects.
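Several of the metrics above can be turned into simple, automatable checks. The sketch below is one possible way to score a small record set on completeness (a crude proxy for richness), validity, and currency. The field names, the allowed-value definition, the one-year freshness threshold, and the `readiness_report` helper are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical data definition: acceptable values for each coded field.
ALLOWED = {"gender": {"male", "female", "unknown"}}
# Hypothetical list of fields every record is expected to carry.
REQUIRED = ("patient_id", "gender", "lab_value", "recorded_on")

records = [
    {"patient_id": 1, "gender": "female", "lab_value": 4.25, "recorded_on": date(2024, 1, 10)},
    {"patient_id": 2, "gender": "F", "lab_value": 3.90, "recorded_on": date(2024, 1, 12)},
    {"patient_id": 3, "gender": "male", "lab_value": None, "recorded_on": date(2019, 6, 1)},
]


def readiness_report(records, as_of, max_age_days):
    """Return the share of records that pass each readiness check."""
    n = len(records)
    # Completeness: every required field is present and not None.
    complete = sum(all(r.get(f) is not None for f in REQUIRED) for r in records)
    # Validity: coded fields only use values from the data definition.
    valid = sum(all(r.get(f) in ok for f, ok in ALLOWED.items()) for r in records)
    # Currency: the record is no older than the chosen threshold.
    current = sum((as_of - r["recorded_on"]).days <= max_age_days for r in records)
    return {
        "completeness": complete / n,
        "validity": valid / n,
        "currency": current / n,
    }


report = readiness_report(records, as_of=date(2024, 2, 1), max_age_days=365)
print(report)
```

In this toy data set, one record fails each check (a missing lab value, a gender code outside the definition, and a stale timestamp), so each score comes out at two thirds. Which thresholds are acceptable depends on the application domain.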

SECTION 3.2 REVIEW QUESTIONS

1. How do you describe the importance of data in analytics? Can we think of analytics without data?

2. Considering the new and broad definition of business analytics, what are the main inputs and outputs to the analytics continuum?

3. Where do the data for business analytics come from?

4. In your opinion, what are the top three data-related challenges for better analytics?

5. What are the most common metrics that make for analytics-ready data?

3.3 SIMPLE TAXONOMY OF DATA

The term data (datum in singular form) refers to a collection of facts usually obtained as the result of experiments, observations, transactions, or experiences. Data can consist of numbers, letters, words, images, voice recordings, and so on, as measurements of a set of variables (characteristics of the subject or event that we are interested in studying). Data are often viewed as the lowest level of abstraction from which information and then knowledge is derived.

At the highest level of abstraction, one can classify data as structured and unstructured (or semistructured). Unstructured data/semistructured data are composed of any combination of textual, imagery, voice, and Web content. Unstructured/semistructured data will be covered in more detail in the text mining and Web mining chapter. Structured data are what data mining algorithms use and can be classified as categorical or numeric. The categorical data can be subdivided into nominal or ordinal data, whereas numeric data can be subdivided into interval or ratio data. Figure 3.2 shows a simple data taxonomy.

• Categorical data. These represent the labels of multiple classes used to divide a variable into specific groups. Examples of categorical variables include race, sex, age group, and educational level. Although the latter two variables can also be considered in a numerical manner by using exact values for age and highest grade completed, for example, it is often more informative to categorize such variables into a relatively small number of ordered classes. The categorical data can also be called discrete data, implying that they represent a finite number of values with no continuum between them. Even if the values used for the categorical (or discrete) variables are numeric, these numbers are nothing more than symbols and do not imply the possibility of calculating fractional values.

• Nominal data. These contain simple codes assigned to objects as labels, which are not measurements. For example, the variable marital status can be generally categorized as (1) single, (2) married, and (3) divorced. Nominal data can be represented with binomial values having two possible values (e.g., yes/no, true/false, good/bad) or multinomial values having three or more possible values (e.g., brown/green/blue, white/black/Latino/Asian, single/married/divorced).

FIGURE 3.2 A Simple Taxonomy of Data. (Hierarchy: Data in Analytics divides into Structured Data and Unstructured or Semi-Structured Data. Structured Data: Categorical (Nominal, Ordinal) and Numerical (Interval, Ratio). Unstructured or Semi-Structured Data: Textual, Multimedia (Image, Audio, Video), and XML/JSON.)

• Ordinal data. These contain codes assigned to objects or events as labels that also represent the rank order among them. For example, the variable credit score can be generally categorized as (1) low, (2) medium, or (3) high. Similar ordered relationships can be seen in variables such as age group (i.e., child, young, middle-aged, elderly) and educational level (i.e., high school, college, graduate school). Some predictive analytic algorithms, such as ordinal multiple logistic regression, take into account this additional rank-order information to build a better classification model.

• Numeric data. These represent the numeric values of specific variables. Examples of numerically valued variables include age, number of children, total household income (in U.S. dollars), travel distance (in miles), and temperature (in Fahrenheit degrees). Numeric values representing a variable can be integers (only whole numbers) or real (also fractional numbers). The numeric data can also be called continuous data, implying that the variable contains continuous measures on a specific scale that allows insertion of interim values. Unlike a discrete variable, which represents finite, countable data, a continuous variable represents scalable measurements, and it is possible for the data to contain an infinite number of fractional values.

• Interval data. These are variables that can be measured on interval scales. A common example of interval scale measurement is temperature on the Celsius scale. In this particular scale, the unit of measurement is 1/100 of the difference between the melting temperature and the boiling temperature of water at atmospheric pressure; the zero point is arbitrary, so the scale has no absolute zero value.

• Ratio data. These include measurement variables commonly found in the physical sciences and engineering. Mass, length, time, plane angle, energy, and electric charge are examples of physical measures that are ratio scales. The scale type takes its name from the fact that measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind. Informally, the distinguishing feature of a ratio scale is the possession of a nonarbitrary zero value. For example, the Kelvin temperature scale has a nonarbitrary zero point of absolute zero, which is equal to –273.15 degrees Celsius. This zero point is nonarbitrary because the particles that comprise matter at this temperature have zero kinetic energy.
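The practical difference between interval and ratio scales can be seen with a small calculation. The sketch below, which uses two arbitrary example temperatures, shows that differences are meaningful on the Celsius (interval) scale, while ratios only become meaningful after converting to Kelvin (a ratio scale).

```python
def celsius_to_kelvin(c):
    """Shift a Celsius reading so that zero means absolute zero."""
    return c + 273.15


t1_c, t2_c = 10.0, 20.0  # arbitrary example readings in degrees Celsius

# Differences are meaningful on an interval scale.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios are not: 20 C is not "twice as hot" as 10 C.
naive_ratio = t2_c / t1_c
# On the Kelvin scale the ratio reflects the physical quantity.
true_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(f"difference: {diff_c} C, {diff_k:.2f} K")
print(f"naive Celsius ratio: {naive_ratio}")
print(f"Kelvin ratio: {true_ratio:.4f}")
```

The temperature difference is the same on both scales, but the Celsius ratio of 2 shrinks to roughly 1.04 once the arbitrary zero point is removed.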

Other data types, including textual, spatial, imagery, video, and voice, need to be converted into some form of categorical or numeric representation before they can be processed by analytics methods (data mining algorithms; Delen, 2015). Data can also be classified as static or dynamic (i.e., temporal or time series).

Some predictive analytics (i.e., data mining) methods and machine-learning algorithms are very selective about the type of data that they can handle. Providing them with incompatible data types can lead to incorrect models or (more often) halt the model development process. For example, some data mining methods need all the variables (both input and output) represented as numerically valued variables (e.g., neural networks, support vector machines, logistic regression). The nominal or ordinal variables are converted into numeric representations using some type of 1-of-N pseudo variables (e.g., a categorical variable with three unique values can be transformed into three pseudo variables with binary values: 1 or 0). Because this process could increase the number of variables, one should be cautious about the effect of such representations, especially for the categorical variables that have large numbers of unique values.
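A minimal sketch of the 1-of-N representation described above follows, using the marital status example from earlier in this section. The `one_of_n` helper and the `is_` column prefix are illustrative choices, not part of any particular tool.

```python
def one_of_n(values):
    """Turn a list of category labels into 1-of-N binary pseudo variables."""
    categories = sorted(set(values))
    rows = [{f"is_{c}": int(v == c) for c in categories} for v in values]
    return categories, rows


marital_status = ["single", "married", "divorced", "married"]
categories, encoded = one_of_n(marital_status)

print(categories)
for original, row in zip(marital_status, encoded):
    print(original, row)
```

One variable with three unique values becomes three binary columns, and exactly one of them is 1 in each row. A variable with hundreds of unique values would add hundreds of columns, which is the growth in dimensionality the text cautions about.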

Similarly, some predictive analytics methods, such as ID3 (a classic decision tree algorithm) and rough sets (a relatively new rule induction algorithm), need all the variables represented as categorically valued variables. Early versions of these methods required the user to discretize numeric variables into categorical representations before they could be processed by the algorithm. The good news is that most implementations of these algorithms in widely available software tools accept a mix of numeric and nominal variables and internally make the necessary conversions before processing the data.
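Discretization of a numeric variable can be sketched in a few lines. The example below bins ages into the ordered age groups mentioned earlier in this section; the cut points (13, 40, 65) are hypothetical and would normally be chosen from domain knowledge or from the data.

```python
from bisect import bisect_right


def discretize(value, cut_points, labels):
    """Map a numeric value to a label; labels needs len(cut_points) + 1 entries."""
    return labels[bisect_right(cut_points, value)]


cut_points = [13, 40, 65]  # hypothetical boundaries between age groups
labels = ["child", "young", "middle-aged", "elderly"]

ages = [7, 19, 45, 70]
age_groups = [discretize(a, cut_points, labels) for a in ages]
print(list(zip(ages, age_groups)))
```

With `bisect_right`, a value that equals a cut point falls into the higher bin, so an age of exactly 13 is labeled "young" here. Many software tools apply an equivalent conversion internally.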

Data come in many different variable types and representation schemas. Business analytics tools are continuously improving in their ability to help data scientists in the daunting task of data transformation and data representation so that the data requirements of specific predictive models and algorithms can be properly met. Application Case 3.1 illustrates a business scenario in which one of the largest telecommunication companies streamlined and used a wide variety of rich data sources to generate customer insight to prevent churn and to create new revenue sources.

Application Case 3.1 Verizon Answers the Call for Innovation: The Nation’s Largest Network Provider Uses Advanced Analytics to Bring the Future to Its Customers

The Problem

In the ultra-competitive telecommunications industry, staying relevant to consumers while finding new sources of revenue is critical, especially since current revenue sources are in decline.

For Fortune 13 powerhouse Verizon, the secret weapon that catapulted the company into the nation’s largest and most reliable network provider is also guiding the business toward future success (see the following figure for some numbers about Verizon). The secret weapon? Data and analytics. Because telecommunication companies are typically rich in data, having the right analytics solution and personnel in place can uncover critical insights that benefit every area of the business.

Verizon by the Numbers

The top ranked wireless carrier in the U.S. has:

• $131.6B in revenue

• 177K employees

• 1,700 retail locations

• 112.1M retail connections

• 106.5M postpaid customers

• 13M TV and internet subscribers

The Backbone of the Company

Since its inception in 2000, Verizon has partnered with Teradata to create a data and analytics architecture that drives innovation and science-based decision making. The goal is to stay relevant to customers while also identifying new business opportunities and making adjustments that result in more cost-effective operations.

“With business intelligence, we help the business identify new business opportunities or make course corrections to operate the business in a more cost-effective way,” said Grace Hwang, executive director of Financial Performance & Analytics, BI, for Verizon. “We support decision makers with the most relevant information to improve the competitive advantage of Verizon.”

By leveraging data and analytics, Verizon is able to offer a reliable network, ensure customer satisfaction, and develop products and services that consumers want to buy.

“Our incubator of new products and services will help bring the future to our customers,” Hwang said. “We’re using our network to make breakthroughs in interactive entertainment, digital media, the Internet of Things, and broadband services.”

Data Insights across Three Business Units

Verizon relies on advanced analytics that are executed on the Teradata® Unified Data Architecture™ to support its business units. The analytics enable Verizon to deliver on its promise to help customers innovate their lifestyles and provide key insights to support these three areas:

• Identify new revenue sources. Research and development teams use data, analytics, and strategic partnerships to test and develop with the Internet of Things (IoT). The new frontier in data is IoT, which will lead to new revenues that in turn generate opportunities for top-line growth. Smart cars, smart agriculture, and smart IoT will all be part of this new growth.

• Predict churn in the core mobile business. Verizon has multiple use cases that demonstrate how its advanced analytics enable laser-accurate churn prediction—within a one to two percent margin—in the mobile space. For a $131 billion company, predicting churn with such precision is significant. By recognizing specific patterns in tablet data usage, Verizon can identify which customers most often access their tablets, then engage those who do not.

• Forecast mobile phone plans. Customer behavioral analytics allow finance to better predict earnings in fast-changing market conditions. The U.S. wireless industry is moving from monthly payments for both the phone and the service to paying for the phone independently. This opens up a new opportunity for Verizon to gain business. The analytic environment helps Verizon better predict churn with new plans and forecast the impact of changes to pricing plans.

The analytics deliver what Verizon refers to as “honest data” that inform various business units. “Our mission is to be the honest voice and the independent third-party opinion on the success or opportunities for improvement to the business,” Hwang explains. “So my unit is viewed as the golden source of information, and we come across with the honest voice, and a lot of the business decisions are through various rungs of course correction.”

Hwang adds that oftentimes, what forces a company to react is competitors affecting change in the marketplace, rather than the company making the wrong decisions. “So we try to guide the business through the best course of correction, wherever applicable, timely, so that we can continue to deliver record-breaking results year after year,” she said. “I have no doubt that the business intelligence had led to such success in the past.”

Disrupt and Innovate

Verizon leverages advanced analytics to optimize marketing by sending the most relevant offers to customers. At the same time, the company relies on analytics to ensure they have the financial acumen to stay number one in the U.S. mobile market. By continuing to disrupt the industry with innovative products and solutions, Verizon is positioned to remain the wireless standard for the industry.

“We need the marketing vision and the sales rigor to produce the most relevant offer to our customers, and then at the same time we need to have the finance rigor to ensure that whatever we offer to the customer is also profitable to the business so that we’re responsible to our shareholders,” Hwang says.

In Summary—Executing the Seven Ps of Modern Marketing

Telecommunications giant Verizon uses seven Ps to drive its modern-day marketing efforts. The Ps, when used in unison, help Verizon penetrate the market in the way it predicted.

1. People: Understanding customers and their needs to create the product.

2. Place: Where customers shop. 3. Product: The item that’s been manufactured

and is for sale.

Application Case 3.1 (Continued)

Chapter 3 • Nature of Data, Statistical Modeling, and Visualization 129

u SECTION 3.3 REVIEW QUESTIONS

1. What are data? How do data differ from information and knowledge? 2. What are the main categories of data? What types of data can we use for BI and

analytics?

3. Can we use the same data representation for all analytics models? Why, or why not?

4. What is a 1-of-N data representation? Why and where is it used in analytics?

3.4 ART AND SCIENCE OF DATA PREPROCESSING

Data in their original form (i.e., the real-world data) are not usually ready to be used in analytics tasks. They are often dirty, misaligned, overly complex, and inaccurate. A te- dious and time-demanding process (so-called data preprocessing) is necessary to con- vert the raw real-world data into a well-refined form for analytics algorithms (Kotsiantis, Kanellopoulos, & Pintelas, 2006). Many analytics professionals would testify that the time spent on data preprocessing (which is perhaps the least enjoyable phase in the whole process) is significantly longer than the time spent on the rest of the analytics tasks (the fun of analytics model building and assessment). Figure 3.3 shows the main steps in the data preprocessing endeavor.

In the first step of data preprocessing, the relevant data are collected from the iden- tified sources, the necessary records and variables are selected (based on an intimate understanding of the data, the unnecessary information is filtered out), and the records coming from multiple data sources are integrated/merged (again, using the intimate un- derstanding of the data, the synonyms and homonyms are able to be handled properly).

In the second step of data preprocessing, the data are cleaned (this step is also known as data scrubbing). Data in their original/raw/real-world form are usually dirty (Hernández & Stolfo, 1998; Kim et al., 2003). In this phase, the values in the data set are identified and dealt with. In some cases, missing values are an anomaly in the data set, in which case they need to be imputed (filled with a most probable value) or ignored; in other cases, the missing values are a natural part of the data set

4. Process: How customers get to the shop or place to buy the product.

5. Pricing: Working with promotions to get cus- tomers’ attention.

6. Promo: Working with pricing to get customers’ attention.

7. Physical evidence: The business intelligence that gives insights.

“The Aster and Hadoop environment allows us to explore things we suspect could be the reasons for breakdown in the seven Ps,” says Grace Hwang, executive director of Financial Performance & Analytics, BI, for Verizon. “This goes back to providing the business value to our decision-makers. With each step in the seven Ps, we ought to be able to tell them where there are opportunities for improvement.”

Questions for Case 3.1

1. What was the challenge Verizon was facing?

2. What was the data-driven solution proposed for Verizon’s business units?

3. What were the results?

Source: Teradata Case Study “Verizon Answers the Call for Innovation” https://www.teradata.com/Resources/Case-Studies/Verizon-answers-the-call-for-innovation (accessed July 2018).

(e.g., the household income field is often left unanswered by people who are in the top income tier). In this step, the analyst should also identify noisy values in the data (i.e., the outliers) and smooth them out. In addition, inconsistencies (unusual values within a variable) in the data should be handled using domain knowledge and/or expert opinion.
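The two cleaning operations described above, imputing missing values and smoothing outliers, can be illustrated with a short Python sketch. The income figures are made up, and the choices of median imputation and clipping at two standard deviations are assumptions for illustration only. As the text notes, some missing values are a natural part of the data, so whether to impute at all remains the analyst's decision.

```python
from statistics import mean, median, stdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=2.0):
    """Smooth outliers by clipping to mean +/- k sample standard deviations."""
    mu, sigma = mean(values), stdev(values)
    low, high = mu - k * sigma, mu + k * sigma
    return [min(max(v, low), high) for v in values]


# Hypothetical household incomes (in $1,000s); None marks a missing answer.
incomes = [42, 55, None, 61, 48, None, 950, 53]

filled = impute_median(incomes)    # missing answers replaced by the median
smoothed = clip_outliers(filled)   # the extreme value 950 is pulled inward

print(filled)
print(smoothed)
```

Clipping is only one way to smooth noisy values; binning, regression, and simple averages are alternatives.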

In the third step of data preprocessing, the data are transformed for better processing. For instance, in many cases, the data are normalized between a certain minimum and maximum for all variables to mitigate the potential bias of one variable having

FIGURE 3.3 Data Preprocessing Steps. [Figure: raw data sources (OLTP, legacy DB, Web data, social data) pass through four steps, with feedback between them, to produce well-formed data: Data Consolidation (collect data, select data, integrate data); Data Cleaning (impute values, reduce noise, eliminate duplicates); Data Transformation (normalize data, discretize data, create attributes); Data Reduction (reduce dimension, reduce volume, balance data).]

large numeric values (such as household income) dominating other variables (such as number of dependents or years in service, which could be more important) having smaller values. Another transformation that takes place is discretization and/or aggregation. In some cases, the numeric variables are converted to categorical values (e.g., low, medium, high); in other cases, a nominal variable’s unique value range is reduced to a smaller set using concept hierarchies (e.g., as opposed to using the individual states with 50 different values, one could choose to use several regions for a variable that shows location) to have a data set that is more amenable to computer processing. In still other cases, one might choose to create new variables based on the existing ones to magnify the information found in a collection of variables in the data set. For instance, in an organ transplantation data set, one might choose to use a single variable showing the blood-type match (1: match, 0: no match) as opposed to separate multinominal values for the blood type of both the donor and the recipient. Such simplification could increase the information content while reducing the complexity of the relationships in the data.
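The three transformations just described (normalization, discretization, and attribute construction) can be sketched in Python as follows. The cut points, the sample values, and the function names are hypothetical. Treating a blood-type "match" as exact equality of the two types is a simplifying assumption made here; the text does not define the matching rule.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so they span [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    if span == 0:
        # A constant variable carries no spread; map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / span
    return [new_min + (v - old_min) * scale for v in values]


def discretize(value, low_cut, high_cut):
    """Convert a numeric value into one of three ordered categories."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


def blood_type_match(donor, recipient):
    """Derived attribute: 1 if the two blood types are identical, else 0."""
    return 1 if donor == recipient else 0


# Made-up values on very different scales.
incomes = [30_000, 60_000, 120_000]
dependents = [0, 2, 4]

# After normalization both variables share the 0-to-1 range, so income
# no longer dominates simply because its raw numbers are larger.
income_scaled = min_max_normalize(incomes)
dependents_scaled = min_max_normalize(dependents)

income_levels = [discretize(v, 50_000, 100_000) for v in incomes]
match_flag = blood_type_match("A", "A")

print(income_scaled, dependents_scaled, income_levels, match_flag)
```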

The final phase of data preprocessing is data reduction. Even though data scientists (i.e., analytics professionals) like to have large data sets, too much data can also be a problem. In the simplest sense, one can visualize the data commonly used in predictive analytics projects as a flat file consisting of two dimensions: variables (the number of columns) and cases/records (the number of rows). In some cases (e.g., image processing and genome projects with complex microarray data), the number of variables can be rather large, and the analyst must reduce the number to a manageable size. Because the variables are treated as different dimensions that describe the phenomenon from different perspectives, in predictive analytics and data mining, this process is commonly called dimensional reduction (or variable selection). Even though there is not a single best way to accomplish this task, one can use the findings from previously published literature; consult domain experts; apply appropriate statistical techniques (e.g., principal component analysis or independent component analysis); and, preferably, use a combination of these techniques to reduce the dimensions in the data to a more manageable and more relevant subset.
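As one simple illustration of variable selection, the sketch below uses correlation analysis (one of the methods listed in Table 3.1) rather than principal component analysis: a variable is dropped when it is almost perfectly correlated with one already kept. The data, the 0.95 threshold, and the function names are assumptions chosen for the example, not a recommended setting.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant variable has no defined correlation
    return cov / (spread_x * spread_y)


def drop_redundant(columns, threshold=0.95):
    """Keep a variable only if |r| with every kept variable is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up data: height is recorded twice in different units.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34, 25, 41, 29, 38],
}

selected = drop_redundant(data)
print(selected)
```

The result depends on the order in which variables are examined, which is one reason the text recommends combining statistical techniques with domain expertise.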

With respect to the other dimension (i.e., the number of cases), some data sets can include millions or billions of records. Even though computing power is increasing exponentially, processing such a large number of records may not be practical or feasible. In such cases, one might need to sample a subset of the data for analysis. The underlying assumption of sampling is that the subset of the data will contain all relevant patterns of the complete data set. In a homogeneous data set, such an assumption could hold well, but real-world data are hardly ever homogeneous. The analyst should be extremely careful in selecting a subset of the data that reflects the essence of the complete data set and is not specific to a subgroup or subcategory. The data are usually sorted on some variable, and taking a section of the data from the top or bottom could lead to a data set biased toward specific values of the indexed variable; therefore, always try to select the records for the sample set randomly. For skewed data, straightforward random sampling might not be sufficient, and stratified sampling (in which the different subgroups in the data are represented proportionally in the sample data set) might be required. Speaking of skewed data, it is a good practice to balance highly skewed data by either oversampling the less represented classes or undersampling the more represented ones. Research has shown that balanced data sets tend to produce better prediction models than unbalanced ones (Thammasiri et al., 2014).
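Stratified sampling and oversampling can be sketched with Python's standard library as shown below. The records, the class labels, the 20 percent sampling fraction, and the fixed random seed are all hypothetical choices for illustration; this version of oversampling simply duplicates minority records drawn at random with replacement.

```python
import random
from collections import defaultdict


def group_by(records, key):
    """Group records into lists keyed by key(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return groups


def stratified_sample(records, key, fraction, seed=0):
    """Randomly sample the same fraction from every subgroup."""
    rng = random.Random(seed)
    sample = []
    for members in group_by(records, key).values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def oversample_minority(records, key, seed=0):
    """Duplicate random records of smaller classes until all classes are equal."""
    rng = random.Random(seed)
    groups = group_by(records, key)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def get_label(record):
    return record["label"]


# Made-up skewed data set: 10 "drop" records and 90 "stay" records.
records = [{"id": i, "label": "drop" if i < 10 else "stay"} for i in range(100)]

sample = stratified_sample(records, get_label, fraction=0.2)
balanced = oversample_minority(records, get_label)

print(len(sample), len(balanced))
```

The stratified sample preserves the original 1-to-9 class proportion, whereas the oversampled set deliberately changes it to 1-to-1.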

The essence of data preprocessing is summarized in Table 3.1, which maps the main phases (along with their problem descriptions) to a representative list of tasks and algorithms.

TABLE 3.1 A Summary of Data Preprocessing Tasks and Potential Methods

Main Task | Subtasks | Popular Methods

Data consolidation | Access and collect the data | SQL queries, software agents, Web services.

Data consolidation | Select and filter the data | Domain expertise, SQL queries, statistical tests.

Data consolidation | Integrate and unify the data | SQL queries, domain expertise, ontology-driven data mapping.

Data cleaning | Handle missing values in the data | Fill in missing values (imputations) with most appropriate values (mean, median, min/max, mode, etc.); recode the missing values with a constant such as “ML”; remove the record of the missing value; do nothing.

Data cleaning | Identify and reduce noise in the data | Identify the outliers in data with simple statistical techniques (such as averages and standard deviations) or with cluster analysis; once identified, either remove the outliers or smooth them by using binning, regression, or simple averages.

Data cleaning | Find and eliminate erroneous data | Identify the erroneous values in data (other than outliers), such as odd values, inconsistent class labels, odd distributions; once identified, use domain expertise to correct the values or remove the records holding the erroneous values.

Data transformation | Normalize the data | Reduce the range of values in each numerically valued variable to a standard range (e.g., 0 to 1 or -1 to +1) by using a variety of normalization or scaling techniques.

Data transformation | Discretize or aggregate the data | If needed, convert the numeric variables into discrete representations using range- or frequency-based binning techniques; for categorical variables, reduce the number of values by applying proper concept hierarchies.

Data transformation | Construct new attributes | Derive new and more informative variables from the existing ones using a wide range of mathematical functions (as simple as addition and multiplication or as complex as a hybrid combination of log transformations).

Data reduction | Reduce number of attributes | Use principal component analysis, independent component analysis, chi-square testing, correlation analysis, and decision tree induction.

Data reduction | Reduce number of records | Perform random sampling, stratified sampling, expert-knowledge-driven purposeful sampling.

Data reduction | Balance skewed data | Oversample the less represented or undersample the more represented classes.

It is almost impossible to overestimate the value proposition of data preprocessing. It is one of those time-demanding activities in which investment of time and effort pays off without a perceivable limit for diminishing returns. That is, the more resources you invest in it, the more you will gain at the end. Application Case 3.2 illustrates an interesting study that used raw, readily available academic data within an educational organization to develop predictive models to better understand attrition and improve freshman student retention in a large higher education institution. As the application case clearly states, each and every data preprocessing task described in Table 3.1 was critical to a successful execution of the underlying analytics project, especially the task that related to the balancing of the data set.

Student attrition has become one of the most challenging problems for decision makers in academic institutions. Despite all the programs and services that are put in place to help retain students, according to the U.S. Department of Education’s Center for Educational Statistics (nces.ed.gov), only about half of those who enter higher education actually earn a bachelor’s degree. Enrollment management and the retention of students have become a top priority for administrators of colleges and universities in the United States and other countries around the world. H