Data analytics

INSD5120CourseIntroduction-fall2018.pdf

Overview
• Why are we here?
• Analytics skill set
• Advanced Data Analytics program
• Course outline
• Analytics process

Salary Benchmarks

• “Data science is an interdisciplinary field about scientific methods, processes, and systems to extract insights from structured and unstructured data”

• More generally, data analytics is a problem-solving mindset and a particular set of skills, adapted to computing and data collection advances, to support evidence-based decision making

Skill set diagram (labels): Domain Expertise, Data Engineering, Analytics, Data Analytics, Statistical Research, Data Processing, Machine Learning

UNT Advanced Data Analytics

Deep understanding of analytic methods with the ability to apply, adapt and develop sophisticated analysis in a variety of settings to derive actionable business insights.

Curriculum areas: Analysis, Computing, Domain Expertise

Courses
• INSD 5120 Introduction to Data Analytics
• ADTA 5130 Data Analytics 1
• ADTA 5240 Harvesting, Storing and Retrieving Data
• ADTA 5230 Data Analytics 2
• ADTA 5340 Learning with Big Data
• ADTA 5250 Large Data Visualization
• ADTA 5940 Capstone

Areas shown on the slide: Healthcare Analytics, Sports Analytics, Statistics, Management, Digital Merchandising

• Gain insight into the practice of data analytics by doing analytics
• Case studies approach
  • Higher education, retail location analytics, market research, healthcare analytics
• Preview analytic methods explored in other ADTA courses
• Emphasis on developing a problem-solving mindset

Developing Your Skill Set

My Background and Skill Set
• Undergraduate in business, MBA in HR, PhD in Applied Technology and Performance Improvement with a minor in Management Science
• 15 years at Xerox; early adopter of “manage by fact”
• Role at Xerox included process reengineering
• UNT full-time faculty and program advisor

Doing Analytics in Diverse Settings
• Diverse organizations
  • U.S. Navy, Institute for Defense Analysis, U.S. Army, IBM, Bank of America, America’s Cash Express, Fox TV, Electronic Arts, FEMA, Argo Data Resources, ABC TV
• Diverse applications
  • National security, Internet targeting, portfolio optimization, risk analysis, sensor modeling, quality control, inventory management, market research, forecasting, workforce management

Course Outline
• Warm-up
  • EDA, short analysis reports
• Trends at research universities
  • EDA, indices, variable reduction
• Retail location analysis
  • Data cleaning, multivariate regression
• Market research
  • Survey design, sampling, factor analysis, cluster analysis, research paper on survey design
• Medical diagnosis
  • Decision trees
• Wrap-up and final projects

There is no magic to turning your data into gold. It’s about:
• How the data was collected
• How you mine it
• How you interpret it
• How you draw insights and deploy them
• How you refresh the models and enhance the data
All with clear objectives in mind.

Dual Tracks of Analytics in Business

Product Analytics
• Measure need for product/feature
• Prototype data-driven solutions
• Design valid experiments to test hypotheses
• Understand how users value opportunities
• Analyze iterations until product is ready for delivery
• Present findings to executive leadership

Analytic Products
• Deploy data analytics to solve core problems
• Determine tradeoffs between accuracy and cost to develop/operate/support
• Monitor production models, determining when they are stale
• Work to scale models as they grow

Measuring product value across customer impact
• Acquisition: How do you attract customers?
• Engagement: What do customers do with the service?
• Retention: Do customers continue to use the service?
• Monetization: How do you make money from customers?
• Scale: What do customers tell others?
• Performance: How is the customer evaluating how the product serves their needs?
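
As a rough illustration of how the stage questions above might become numbers, the sketch below counts distinct users per stage from an event log. The event names, the stage-to-event mapping, and the sample events are all hypothetical and not from the course material; the Performance stage is left out because it usually relies on customer feedback rather than logged events.

```python
from collections import defaultdict

# Hypothetical mapping from the slide's stages to event names in an event log.
STAGE_EVENTS = {
    "Acquisition": "signed_up",
    "Engagement": "used_feature",
    "Retention": "returned",
    "Monetization": "purchased",
    "Scale": "referred_friend",
}

def stage_counts(events):
    """Count distinct users who triggered each stage's event at least once."""
    users_by_event = defaultdict(set)
    for user_id, event_name in events:
        users_by_event[event_name].add(user_id)
    return {stage: len(users_by_event[name]) for stage, name in STAGE_EVENTS.items()}

def stage_rates(counts):
    """Express each stage as a share of acquired users (0.0 if nobody was acquired)."""
    acquired = counts["Acquisition"]
    return {stage: (n / acquired if acquired else 0.0) for stage, n in counts.items()}

# Made-up events, for illustration only.
events = [
    ("u1", "signed_up"), ("u2", "signed_up"), ("u3", "signed_up"), ("u4", "signed_up"),
    ("u1", "used_feature"), ("u2", "used_feature"), ("u3", "used_feature"),
    ("u1", "returned"), ("u2", "returned"),
    ("u1", "purchased"),
    ("u1", "referred_friend"),
]

counts = stage_counts(events)
rates = stage_rates(counts)
for stage in STAGE_EVENTS:
    print(f"{stage:<13} users={counts[stage]}  share of acquired={rates[stage]:.2f}")
```

Counting distinct users rather than raw events keeps one very active customer from inflating a stage; other definitions (for example, time-windowed retention) would be equally reasonable.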

Analytics cycle (diagram): Problem Definition → Construct Specification → Data Acquisition & Variable Selection → Exploratory Data Analysis → Modeling & Validation → Presentation or Build

Problem Definition

• Most important part of the cycle. If you don’t know where you are going, then the rest is a mess

• Simply stated, “What question or problem are you trying to solve?”

Construct Specification

• A construct is an informed idea developed or generated to describe or explain behavior

• For example: intelligence, buying behavior, consumer preferences, sales forecasts

Data Acquisition & Variable Selection

• Collecting raw data, leveraging stored data, or using third-party data
• Data validation and cleaning
• Variable selection is an iterative process of identifying variables that define constructs and/or explain construct variation
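
The validation and variable selection steps above can be sketched in a few lines. This is a minimal example under stated assumptions: the field names (`population`, `parking_spaces`, `sales`) and the records are invented, and ranking candidates by Pearson correlation with the target is just one simple screening technique, not a method prescribed by the slides.

```python
import math

def clean(records, numeric_fields):
    """Keep only records where every listed field is a non-negative number."""
    kept = []
    for rec in records:
        values = [rec.get(field) for field in numeric_fields]
        if all(isinstance(v, (int, float)) and v >= 0 for v in values):
            kept.append(rec)
    return kept

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up store records, for illustration only.
raw = [
    {"store": "A", "population": 10, "parking_spaces": 5, "sales": 20},
    {"store": "B", "population": 20, "parking_spaces": 3, "sales": 41},
    {"store": "C", "population": None, "parking_spaces": 4, "sales": 30},  # missing value
    {"store": "D", "population": 30, "parking_spaces": 6, "sales": 59},
    {"store": "E", "population": 40, "parking_spaces": -1, "sales": 80},   # invalid value
    {"store": "F", "population": 50, "parking_spaces": 2, "sales": 102},
]

candidates = ["population", "parking_spaces"]
cleaned = clean(raw, candidates + ["sales"])
sales = [rec["sales"] for rec in cleaned]

# Screen each candidate variable by the strength of its linear relationship with sales.
correlations = {name: pearson([rec[name] for rec in cleaned], sales) for name in candidates}
ranked = sorted(candidates, key=lambda name: abs(correlations[name]), reverse=True)

print(f"kept {len(cleaned)} of {len(raw)} records")
for name in ranked:
    print(f"{name}: r = {correlations[name]:.3f}")
```

Because selection is iterative, a screen like this would normally be repeated as variables are added, transformed, or dropped.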

Exploratory Data Analysis

• Crucial step in gaining an intuitive understanding of the data

• EDA is an approach that postpones model building by first allowing the data to reveal its underlying structure through descriptive statistics and visualizations
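
A minimal sketch of what "descriptive statistics and visualizations" can look like in practice, using only the Python standard library. The `sales` values are made up, and the text histogram stands in for a proper plot; real coursework would more likely use a statistics package or plotting library.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

def text_histogram(values, bin_width):
    """Print one row per non-empty bin, with one '#' per observation; return the bin counts."""
    bins = {}
    for v in values:
        start = (v // bin_width) * bin_width
        bins[start] = bins.get(start, 0) + 1
    for start in sorted(bins):
        print(f"{start:>5}-{start + bin_width - 1:<5} {'#' * bins[start]}")
    return bins

# Made-up weekly sales figures, for illustration only.
sales = [12, 15, 14, 18, 21, 19, 16, 15, 48, 17]

summary = describe(sales)
for name, value in summary.items():
    print(f"{name:>6}: {value}")

bins = text_histogram(sales, bin_width=10)
```

In this toy data the mean sits above the median and one bin is far from the rest, which is the kind of structure (skew, a possible outlier) that EDA is meant to surface before any model is chosen.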

Modeling & Validation

• Modeling includes all analytical methods, both traditional and machine learning techniques

• This is the stage that explores techniques for answering “The Question”

• Validation is reviewing and testing your analysis and modeling
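
One common way to test a model is to hold out part of the data and compare the model's error on it with a naive baseline. The sketch below does this for a one-variable least-squares line. The data is synthetic (generated from an assumed true relationship plus noise), and the 75/25 split and RMSE metric are illustrative choices rather than anything specified in the slides.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic data: assumed relationship y = 3x + 5 plus Gaussian noise.
rng = random.Random(0)
xs = list(range(1, 41))
ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]

# Shuffle indices and hold out the last 25% as a test set.
indices = list(range(len(xs)))
rng.shuffle(indices)
split = int(0.75 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

train_x = [xs[i] for i in train_idx]
train_y = [ys[i] for i in train_idx]
test_x = [xs[i] for i in test_idx]
test_y = [ys[i] for i in test_idx]

# Fit on training data only, then score on data the model has not seen.
slope, intercept = fit_line(train_x, train_y)
model_rmse = rmse(test_y, [slope * x + intercept for x in test_x])

# Baseline: always predict the training mean.
train_mean = sum(train_y) / len(train_y)
baseline_rmse = rmse(test_y, [train_mean] * len(test_y))

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"held-out RMSE: model={model_rmse:.2f} baseline={baseline_rmse:.2f}")
```

A model that cannot beat the baseline on held-out data is a signal to revisit earlier stages of the cycle.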

Presentation or Build

• Presenting insights to decision makers

• Build and implement the analytic model with diagnostics in place

Creating a Data Analysis Plan
• Problem statement that succinctly describes what problems you will solve
• Review of prior work and “best-practice” solutions in the problem domain that contains your question
• Evaluated list of the data available or that you will need to harvest
• Analytical strategy aimed at predicting target outcomes
• Outline of EDA
• Modeling and analysis techniques to be applied
• Plausible or expected solutions to the problem given data availability, constraints, and the limitations of your modeling/analytical methodologies

Data Assessment
• What data is needed?
• Where is it?
• Do you have access?
• How is this data structured and stored?
  • Relational data tables, flat files, published reports, unstructured text
• How is this data organized?
• Does metadata exist to explain what variables mean, how they were collected, how often the data is refreshed?
• What errors are associated with the data?
• How large are the data files?
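
Some of the assessment questions above (how the data is organized, what is missing, how large it is) can be answered mechanically for a flat file. The sketch below profiles a small CSV held in memory; the column names, the sample rows, and the set of strings treated as "missing" are assumptions made for illustration.

```python
import csv
import io

def assess_csv(text, missing_markers=("", "NA", "N/A")):
    """Summarize a CSV table: column names, row count, missing values per column, size in bytes."""
    reader = csv.DictReader(io.StringIO(text))
    columns = list(reader.fieldnames or [])
    missing = {col: 0 for col in columns}
    rows = 0
    for row in reader:
        rows += 1
        for col in columns:
            value = row.get(col)
            # Short rows yield None; blank cells and marker strings also count as missing.
            if value is None or value.strip() in missing_markers:
                missing[col] += 1
    return {
        "columns": columns,
        "rows": rows,
        "missing": missing,
        "size_bytes": len(text.encode("utf-8")),
    }

# Made-up flat file contents, for illustration only.
sample = "store_id,city,sales\n1,Denton,100\n2,,250\n3,Frisco,NA\n"

report = assess_csv(sample)
print("columns:", report["columns"])
print("rows:", report["rows"])
print("missing per column:", report["missing"])
print("size (bytes):", report["size_bytes"])
```

Questions about access, metadata, and how the data was collected still need a human answer; a profile like this only covers what the file itself can reveal.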

Warm-up
• We’ll begin with a set of warm-up problems based on real data from three very different domains
• Apply exploratory data analysis (EDA) methods to gain insight into the data, which will inform later, more sophisticated analysis
• EDA, in its own right, is a crucial step in understanding the data
• EDA methods are often “sophisticated enough” to meet business objectives
• EDA is essential to correctly applying and deploying statistical modeling and machine learning methods

  • Data Analytics [Engine of the Information Economy]
  • Overview
  • Sample of Currently Open Positions [in nearly every business sector]
  • So what the heck is Data Science, Big Data, Deep Learning, [insert buzzword] and how can I get in on it?
  • Data Science/Analytics
  • A Particular Set of Skills
  • Core Competencies
  • Advanced Methods
  • ADA Curriculum
  • Introduction to Data Analytics
  • My Background and Skill Set
  • Doing Analytics in Diverse Settings
  • Course Outline
  • “Artificial intelligence, however you want to define it, that's everything. There will be more changes in the next five to seven years than we've seen in the last 30. It will impact every business. Data is the new gold.” (Mark Cuban)
  • However…
  • Dual Tracks of Analytics in Business
  • Analytics Cycle
  • Product Analytics
  • Product Analytics Cycle
  • Analytics Products/Insights Cycle
  • Creating a Data Analysis Plan
  • Data Assessment
  • Warm-up