Business_Intelligence_week5
Chapter 8:
Web Analytics, Web Mining, and Social Analytics
Business Intelligence and Analytics: Systems for Decision Support
(10th Edition)
Copyright © 2014 Pearson Education, Inc.
Learning Objectives
Define Web mining and understand its taxonomy and its application areas
Differentiate between Web content mining and Web structure mining
Understand the internals of Web search engines
Learn the details about search engine optimization
Define Web usage mining and learn its business application
(Continued…)
Learning Objectives
Describe the Web analytics maturity model and its use cases
Understand social networks and social analytics and their practical applications
Define social network analysis and become familiar with its application areas
Understand social media analytics and its use for better customer engagement
Opening Vignette…
Security First Insurance Deepens Connection with Policyholders
Situation
Problem
Solution
Results
Answer & discuss the case questions.
Questions for the Opening Vignette
What does Security First do?
What were the main challenges Security First was facing?
What was the proposed solution approach? What types of analytics were integrated in the solution?
Based on what you learn from the vignette, what do you think are the relationships between Web analytics, text mining, and sentiment analysis?
What were the results Security First obtained? Were any surprising benefits realized?
Web Mining Overview
Web is the largest repository of data
Data is in HTML, XML, text format
Challenges (of processing Web data)
The Web is too big for effective data mining
The Web is too complex
The Web is too dynamic
The Web is not specific to a domain
The Web has everything
Opportunities and challenges are great!
Web Mining
Web mining (or Web data mining) is the process of discovering intrinsic relationships from Web data (textual, linkage, or usage)
Is it the same as data mining on data generated on the Internet?
Web data?
Content, Link, Log, …
Web Mining versus Web Analytics
Look at the simple taxonomy on the next slide
Web Mining
Web Content/Structure Mining
Mining the textual content on the Web
Data collection via Web Crawlers/Spiders
Web pages include hyperlinks
Authoritative pages
Hubs
Hyperlink-induced topic search (HITS) algorithm
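The hub and authority idea can be illustrated with a minimal sketch of HITS-style scoring. The page names in `toy_links`, the fixed iteration count, and the L2 normalization are assumptions made for illustration, not details taken from the slides.

```python
def hits(links, iterations=50):
    """Return (hub, authority) scores for a link graph.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: add up the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub update: add up the authority scores of the pages linked to.
        hub = {p: sum(auth[dst] for dst in links.get(p, [])) for p in pages}
        # Normalize both score vectors so they do not grow without bound.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical toy Web: "docs" is linked to by every other linking page.
toy_links = {
    "portal": ["news", "docs", "shop"],
    "blog": ["news", "docs"],
    "news": ["docs"],
}
hub, auth = hits(toy_links)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

In this toy graph the page with the most incoming links from good hubs ends up as the top authority, and the page that points to the most authorities ends up as the top hub.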
Application Case 8.1
Identifying Extremist Groups with Web Link and Content Analysis
Questions for Discussion
How can Web link/content analysis be used to identify extremist groups?
What do you think are the challenges and the potential solution to such intelligence gathering activities?
Search Engines
Google, Bing, Yahoo, …
For what reason do you use search engines?
A search engine is a software program that searches for documents (Internet sites or files) based on the keywords (individual words, multi-word terms, or a complete sentence) that users provide to describe the subject of their inquiry
They are the workhorses of the Internet
Structure of a Typical Internet Search Engine
Anatomy of a Search Engine
Development Cycle
Web Crawler
Document Indexer
Steps
Step 1 – Pre-Processing the Documents
Collecting, organizing, and storing
Step 2 – Parsing the Documents
Step 3 – Creating the Term-by-Document Matrix
How to represent the values (numeric, binary, …)
Term Frequency / Inverse Document Frequency
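Step 3 can be sketched as follows. This is a minimal illustration that assumes whitespace tokenization, raw counts for term frequency, and log(N / df) for inverse document frequency; production indexers use more elaborate parsing and weighting, and `toy_docs` is invented for the example.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a term-by-document matrix of TF-IDF weights.

    Returns a dict mapping each term to a list with one weight per document.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(t for tokens in tokenized for t in set(tokens))
    matrix = {}
    for term in vocab:
        idf = math.log(n_docs / df[term])
        # One row per term, one column per document.
        matrix[term] = [tokens.count(term) * idf for tokens in tokenized]
    return matrix


# Hypothetical three-document collection.
toy_docs = [
    "web mining finds patterns in web data",
    "search engines index web pages",
    "social media analytics",
]
weights = tfidf_matrix(toy_docs)
```

A term that occurs in fewer documents (here "mining") receives a larger weight than a term spread across more documents (here "web"), even when the latter occurs more often in a single document.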
Anatomy of a Search Engine
Response Cycle
Query Analyzer
Document Matcher/Ranker
How does Google do it?
Googlebot
Google indexer
Google Query Processor
Technology Insights 8.1 PageRank Algorithm
PageRank is a link analysis algorithm
Larry Page
Outcome of a research project at Stanford University in 1996
The “secret sauce” in Google
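As a rough illustration of link analysis, the sketch below computes PageRank-style scores by power iteration. The damping factor of 0.85, the even redistribution of rank from pages without outgoing links, and the `toy_web` graph are assumptions for this example; it is not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Return a dict of page -> rank for a link graph.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumed handling of pages with no outgoing links:
                # spread their rank evenly over all pages.
                share = damping * rank[p] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

The ranks always sum to 1, so each value can be read as the share of attention a page receives from the link structure alone.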
Application Case 8.2
IGN Increases Search Traffic by 1500 Percent with SEO
Questions for Discussion
How did IGN dramatically increase search traffic to its Web portals?
What were the challenges, the proposed solution, and the obtained results?
Search Engine Optimization (SEO)
It is the intentional activity of affecting the visibility of an e-commerce site or a Web site in a search engine’s natural (unpaid or organic) search results
Part of an Internet marketing strategy
Based on knowing how a search engine works
Content, HTML, keywords, external links, …
Indexing based on …
Webmaster submission of URL
Proactively and continuously crawling the Web
Top 15 Most Popular Search Engines (by eBizMBA, March 2013)
Methods for Search Engine Optimization
Search engine recommended techniques (White-Hat SEO)
Producing results based on good site design, accurate content (for users, not engines)
Search engine disapproved techniques (Black-Hat SEO)
Spamdexing? (search spam, search engine spam, or search engine poisoning)
Deception (what is shown to human visitors differs from what is shown to the machine/spider)
Application Case 8.3
Understanding Why Customers Abandon Shopping Carts Results in $10 Million Sales Increase
Situation
Problem
Solution
Results
Web Usage Mining
Web Analytics!
Extraction of information from data generated through Web page visits and transactions…
data stored in server access logs, referrer logs, agent logs, and client-side cookies
user characteristics and usage profiles
metadata, such as page attributes, content attributes, and usage data
Clickstream data, clickstream analysis
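Clickstream analysis usually starts by grouping raw page hits into sessions. The sketch below assumes each log record is a (visitor_id, unix_timestamp, url) tuple and that 30 minutes of inactivity ends a session; both the record layout and the timeout are illustrative assumptions, not values from the slides.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout


def sessionize(hits, gap=SESSION_GAP_SECONDS):
    """Group page hits into per-visitor sessions (click paths).

    hits is an iterable of (visitor_id, unix_timestamp, url) tuples.
    Returns a list of (visitor_id, [url, ...]) pairs, one per session.
    """
    by_visitor = defaultdict(list)
    for visitor, ts, url in hits:
        by_visitor[visitor].append((ts, url))
    sessions = []
    for visitor, events in by_visitor.items():
        events.sort()  # order each visitor's hits by time
        current = [events[0]]
        for prev, event in zip(events, events[1:]):
            if event[0] - prev[0] > gap:
                # A long pause closes the current session.
                sessions.append((visitor, [u for _, u in current]))
                current = []
            current.append(event)
        sessions.append((visitor, [u for _, u in current]))
    return sessions


# Hypothetical access-log records.
toy_log = [
    ("v1", 0, "/home"),
    ("v1", 60, "/products"),
    ("v2", 100, "/home"),
    ("v1", 5000, "/home"),
    ("v1", 5100, "/cart"),
]
click_paths = sessionize(toy_log)
```

Each resulting click path is the ordered sequence of pages within one visit, which is the unit that usage patterns and visit profiles are built from.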
Web Usage Mining
Web usage mining applications
Determine the lifetime value of clients
Design cross-marketing strategies across products
Evaluate promotional campaigns
Target electronic ads and coupons at user groups based on user access patterns
Predict user behavior based on previously learned rules and users' profiles
Present dynamic information to users based on their interests and profiles
…
Web Usage Mining (Clickstream Analysis)
Application Case 8.4
Allegro Boosts Online Click-Thru Rates by 500 Percent with Web Analysis
Questions for Discussion
How did Allegro significantly improve clickthrough rates with Web analytics?
What were the challenges, the proposed solution, and the obtained results?
Web Analytics Metrics
Provides near-real-time data to deliver invaluable information to …
Improve site usability
Manage marketing efforts
Better document ROI, …
Web analytics metric categories:
Web site usability: How were they using my Web site?
Traffic sources: Where did they come from?
Visitor profiles: What do my visitors look like?
Conversion statistics: What does all this mean for the business?
Web Analytics Metrics - Web Site Usability and Traffic Sources
Web Site Usability
Page views
Time on site
Downloads
Click map
Click paths
Traffic Source
Referral Web sites
Search engines
Direct
Offline campaigns
Online campaigns
Web Analytics Metrics - Visitor Profiles and Conversion Statistics
Visitor Profiles
Keywords
Content groupings
Geography
Time of day
Landing page
Conversion Statistics
New visitors
Returning visitors
Leads
Sales/conversions
Abandonment rates
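Several of the conversion statistics above are simple ratios. The sketch below uses common working definitions (conversions divided by visits, and one minus completed orders divided by carts started); exact definitions vary by analytics tool, and the counts in `toy_counts` are invented for illustration.

```python
def conversion_rate(conversions, visits):
    """Share of visits that ended in a conversion."""
    return conversions / visits if visits else 0.0


def abandonment_rate(carts_started, orders_completed):
    """Share of started shopping carts that did not become orders."""
    if carts_started == 0:
        return 0.0
    return 1.0 - orders_completed / carts_started


def new_visitor_share(new_visitors, visits):
    """Share of visits that came from first-time visitors."""
    return new_visitors / visits if visits else 0.0


# Hypothetical monthly counts for one Web site.
toy_counts = {
    "visits": 2000,
    "new_visitors": 1400,
    "carts_started": 250,
    "orders_completed": 50,
}
cr = conversion_rate(toy_counts["orders_completed"], toy_counts["visits"])
ar = abandonment_rate(toy_counts["carts_started"], toy_counts["orders_completed"])
nv = new_visitor_share(toy_counts["new_visitors"], toy_counts["visits"])
```

With these counts, 2.5% of visits convert, 80% of carts are abandoned, and 70% of visits come from new visitors.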
A Web Analytics Dashboard
Web Analytics Maturity Model
Maturity: the degree of proficiency, formality, and optimization of business models
Business Intelligence Maturity Model (TDWI)
Management Reporting ➔ Spreadmarts ➔ Data Marts ➔ Data Warehouse ➔ Enterprise Data Warehouse ➔ BI Services
Business Analytics Maturity Model (INFORMS)
Descriptive Analytics ➔ Predictive Analytics ➔ Prescriptive Analytics
Web analytics maturity model next slide…
Web Analytics Maturity Model
Web Analytics Tools
Plenty of them exist, and numbers are increasing (Web-based versus downloadable)
Google Web Analytics (google.com/analytics)
Yahoo! Web Analytics (web.analytics.yahoo.com)
Open Web Analytics (openwebanalytics.com)
Piwik (piwik.org)
FireStats (firestats.cc)
Site Meter (sitemeter.com)
Woopra (woopra.com)
AWStats (awstats.org)
Snoop (reinvigorate.net) …
Putting It All Together—A Web Site Optimization Ecosystem
Two-Dimensional View of the Inputs for Web Site Optimization
Goal:
Customer Experience Management (CEM)
Voice of Customer (VOC)
Web Mining Success Stories
Amazon.com, Ask.com, Scholastic.com, …
A Process View of the Web Site Optimization Ecosystem
Voice of the Customer Strategy Framework (Attensity.com)
Social Analytics: Social Network Analysis
Social Network - social structure composed of individuals linked to each other
Analysis of social dynamics
Interdisciplinary field
Social psychology
Sociology
Statistics
Graph theory
Social Analytics: Social Network Analysis
Social Networks help study relationships between individuals, groups, organizations, societies
Self-organizing
Emergent
Complex
Typical social network types
Communication networks, community networks, criminal networks, innovation networks, …
Application Case 8.5
Social Network Analysis Helps Telecommunication Firms (TELCOs)
Questions for Discussion
How can social network analysis be used in the telecommunications industry?
What do you think are the key challenges, potential solution, and probable results in applying SNA in telecommunications firms?
Social Analytics: Social Network Analysis Metrics
Connections
Homophily
Multiplexity
Network closure
Propinquity
Segmentation
Cliques and social circles
Clustering coefficient
Cohesion
Distribution
Bridge
Centrality
Density
Structural holes
Tie strength
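Three of the metrics listed above (density, centrality, and clustering coefficient) can be computed directly from an edge list. The sketch below assumes an undirected, unweighted network and uses degree centrality as the centrality measure; the names in `toy_edges` are invented for illustration.

```python
from itertools import combinations


def build_graph(edges):
    """Turn an undirected edge list into a dict of node -> set of neighbors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def density(graph):
    """Existing ties divided by the number of possible ties."""
    n = len(graph)
    m = sum(len(nbrs) for nbrs in graph.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly tied to."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are tied to each other."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * linked / (k * (k - 1))


# Hypothetical five-person network.
toy_edges = [
    ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
    ("cat", "dan"), ("dan", "eve"),
]
g = build_graph(toy_edges)
centrality = degree_centrality(g)
```

In this toy network "cat" has the highest degree centrality, while "ann" has a clustering coefficient of 1.0 because both of her contacts know each other.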
Social Media Definitions and Concepts
Enabling technologies of social interactions among people
Relies on enabling technologies of Web 2.0
Takes on many different forms
Internet forums, Web logs, social blogs, microblogging, wikis, social networks, podcasts, pictures, video, and product reviews
Different types of social media
Based on media research and social process
Different Types of Social Media
Collaborative projects (e.g., Wikipedia)
Blogs and microblogs (e.g., Twitter)
Content communities (e.g., YouTube)
Social networking sites (e.g., Facebook)
Virtual game worlds (e.g., World of Warcraft), and
Virtual social worlds (e.g., Second Life)
--Kaplan and Haenlein (2010)
Social versus Industrial Media
Web-based social media are different from traditional/industrial media, such as newspapers, television, and film
Differentiating characteristics
Quality
Reach
Frequency
Accessibility
Usability
Immediacy
Updatability
How Do People Use Social Media?
Different engagement levels
Application Case 8.6
Measuring the Impact of Social Media at Lollapalooza
Questions for Discussion
How did C3 Presents use social media analytics to improve its business?
What were the challenges, the proposed solution, and the obtained results?
Social Media Analytics
It is the systematic and scientific way to consume the vast amount of content created by Web-based social media outlets, tools, and techniques for the betterment of an organization’s competitiveness
Fastest growing movement in analytics
Social Media
Twitter
LinkedIn
…
Insights
Solutions
Courses of Action
…
Social Media Analytics
HBR Analytic Services survey (HBR, 2010)
75% of companies did not know where their customers were talking about them
31% do not measure effectiveness of social media
Only 23% are using social media analytics tools
7% are able to integrate social media into marketing
Measuring the Social Media Impact
Descriptive analytics – simple counts/statistics
Social network analysis
Advanced analytics – predictive analytics, text mining
Best Practices in Social Media Analytics
Think of measurement as a guidance system, not a rating system
Track the elusive sentiment
Continuously improve the accuracy of text analysis
Look at the ripple effect
Look beyond the brand
Identify your most powerful influencers
Look closely at the accuracy of your analytic tool
Incorporate social media intelligence into planning
Application Case 8.7
eHarmony Uses Social Media to Help Take the Mystery Out of Online Dating
Questions for Discussion
How did eHarmony use social media to enhance online dating?
What were the challenges, the proposed solution, and the obtained results?
Social Media Analytics Tools and Vendors
Attensity360
Radian6/Salesforce Cloud
Sysomos
Collective Intellect
Webtrends
Crimson Hexagon
Converseon
SproutSocial …
YouTube
Flickr
…
Social Media Analytics
End of the Chapter
Questions, comments
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
Figure text (chapter topics): 360 Customer View, Log Analysis, Marketing Attribution, Customer Analytics, Social Media Analytics, Search Engines Optimization, Page Rank, Information Retrieval, Search Engines, Social Network Analysis, Clickstream Analysis, Social Analytics, Semantic Webs, Web Analytics, Graph Mining, Sentiment Analysis
Figure: A simple taxonomy of Web mining
WEB MINING (draws on Data Mining and Text Mining)
Web Content Mining. Source: unstructured textual content of the Web pages (usually in HTML format)
Web Structure Mining. Source: the uniform resource locator (URL) links contained in the Web pages
Web Usage Mining. Source: the detailed description of a Web site’s visits (sequence of clicks by sessions)
Figure: Structure of a typical Internet search engine
Components: User, Query Analyzer, Document Matcher/Ranker, Cached / Indexed Documents DB, Metadata, Index, Document Indexer, Web Crawler, Scheduler, World Wide Web
Cycles: Responding Cycle, Development Cycle
Flow labels: Search Query, Processed Query, List of URLs to Crawl, Crawling the Web, Unprocessed Web Pages, Processed Pages, List of Matched Pages, Ranked-Ordered Pages
Figure: Web usage mining (clickstream analysis)
Inputs: Website, Weblogs
Pre-Process Data: collecting, merging, cleaning, structuring (identify users, identify sessions, identify page views, identify visits)
Extract Knowledge: usage patterns, user profiles, page profiles, visit profiles, customer value
Outcomes: how to better the data, how to improve the Web site, how to increase the customer value
Figure labels (Web site optimization ecosystem): User / Customer, Web Analytics, Voice of Customer, Customer Experience Management, Customer Interaction on the Web, Analysis of Interactions, Knowledge about the Holistic View of the Customer
Figure: Levels of social media engagement
Engagement levels: Creators, Critics, Joiners, Collectors, Spectators, Inactives
Axes: Time, Level of Social Media Engagement