Why IT Fumbles Analytics
Business Intelligence, Analytics, and Data Science: A Managerial Perspective
Fourth Edition
Chapter 5
Predictive Analytics II: Text, Web, and Social Media Analytics …
Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved
Learning Objectives (1 of 2)
5.1 Describe text mining and understand the need for text mining
5.2 Differentiate among text analytics, text mining, and data mining
5.3 Understand the different application areas for text mining
5.4 Know the process of carrying out a text mining project
5.5 Appreciate the different methods to introduce structure to text-based data
Learning Objectives (2 of 2)
5.6 Describe sentiment analysis
5.7 Develop familiarity with popular applications of sentiment analysis
5.8 Learn the common methods for sentiment analysis
5.9 Become familiar with speech analytics as it relates to sentiment analysis
OPENING VIGNETTE Machine versus Men on Jeopardy!: The Story of Watson (1 of 3)
IBM Watson going head-to-head with the best of the best in Jeopardy!
OPENING VIGNETTE Machine versus Men on Jeopardy!: The Story of Watson (2 of 3)
IBM Watson – How does it do it?
OPENING VIGNETTE Machine versus Men on Jeopardy!: The Story of Watson (3 of 3)
Discussion Questions for the Opening Vignette
What is Watson? What is special about it?
What technologies were used in building Watson (both hardware and software)?
What are the innovative characteristics of DeepQA architecture that made Watson superior?
Why did IBM spend all that time and money to build Watson? Where is the return on investment (ROI)?
Text Analytics and Text Mining
Text Analytics versus Text Mining
Text Analytics =
Information Retrieval +
Information Extraction +
Data Mining +
Web Mining
or simply
Text Analytics = Information Retrieval + Text Mining
Text Analytics and Text Mining
FIGURE 5.2 Text Analytics, Related Application Areas, and Enabling Disciplines
Text Mining Concepts
85-90 percent of all corporate data is in some kind of unstructured form (e.g., text)
Unstructured corporate data is doubling in size every 18 months
Tapping into these information sources is not optional; it is a necessity for staying competitive
Answer: text mining
A semi-automated process of extracting knowledge from unstructured data sources
a.k.a. text data mining or knowledge discovery in textual databases
Data Mining versus Text Mining
Both seek novel and useful patterns
Both are semi-automated processes
Difference is the nature of the data:
Structured versus unstructured data
Structured data: in databases
Unstructured data: Word documents, PDF files, text excerpts, XML files, and so on
To perform text mining – first, impose structure on the data, then mine the structured data
Text Mining Concepts
The benefits of text mining are most evident in text-rich data environments
e.g., law (court orders), academic research (research articles), finance (quarterly reports), medicine (discharge summaries), biology (molecular interactions), technology (patent files), marketing (customer comments), etc.
Electronic communication records (e.g., e-mail)
Spam filtering
E-mail prioritization and categorization
Automatic response generation
Text Mining Application Areas
Information extraction
Topic tracking
Summarization
Categorization
Clustering
Concept linking
Question answering
Text Mining Terminology
Unstructured or semistructured data
Corpus (and corpora)
Terms
Concepts
Stemming
Stop words (and include words)
Synonyms (and polysemes)
Tokenizing
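Three of the terms above (tokenizing, stop words, stemming) can be illustrated with a minimal sketch. The stop-word list and the suffix-stripping rule below are assumptions made for illustration; they are not the textbook's method, and the stemmer is deliberately cruder than a real one such as the Porter stemmer.

```python
import re

# Hypothetical stop-word list; real projects use a larger, domain-tuned list.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in"}

def tokenize(text):
    """Tokenizing: split raw text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def naive_stem(token):
    """Stemming: crude suffix stripping, for illustration only."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]

terms = preprocess("The customers are reporting failed payments")
print(terms)
```

The resulting list of stemmed terms is the kind of input from which a term dictionary and word frequencies are built.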
Text Mining Terminology
Term dictionary
Word frequency
Part-of-speech tagging
Morphology
Term-by-document matrix
Occurrence matrix
Singular value decomposition
Latent semantic indexing
Application Case 5.1 Insurance Group Strengthens Risk Management with Text Mining Solution
Questions for Discussion
How can text analytics and mining be used to keep up with changing business needs of insurance companies?
What were the challenges, the proposed solution, and the obtained results?
Can you think of other uses of text analytics and text mining for insurance companies?
Natural Language Processing (NLP)
Structuring a collection of text
Old approach: bag-of-words
New approach: natural language processing
NLP is …
a very important concept in text mining
a subfield of artificial intelligence and computational linguistics
the study of "understanding" natural human language
Syntax versus semantics-based text mining
Natural Language Processing (NLP)
What is “Understanding”?
Humans understand; what about computers?
Natural language is vague and context driven
True understanding requires extensive knowledge of a topic
Can/will computers ever understand natural language as accurately as we do?
Natural Language Processing (NLP)
Challenges in NLP
Part-of-speech tagging
Text segmentation
Word sense disambiguation
Syntax ambiguity
Imperfect or irregular input
Speech acts
Dream of AI community
to have algorithms that are capable of automatically reading and obtaining knowledge from text
Natural Language Processing (NLP)
WordNet
A laboriously hand-coded database of English words, their definitions, sets of synonyms, and various semantic relations between synonym sets
A major resource for NLP
Needs automation to be completed
Sentiment Analysis
A technique used to detect favorable and unfavorable opinions toward specific products and services
SentiWordNet
Application Case 5.2 AMC Networks Is Using Analytics to Capture New Viewers, Predict Ratings, and Add Value for Advertisers in a Multichannel World (1 of 2)
A Web-Based Dashboard Used by AMC Networks [Source: AMC Networks]
Application Case 5.2 AMC Networks Is Using Analytics to Capture New Viewers, Predict Ratings, and Add Value for Advertisers in a Multichannel World (2 of 2)
Questions for Discussion
What are the common challenges broadcasting companies are facing nowadays? How can analytics help to alleviate these challenges?
How did AMC leverage analytics to enhance their business performance?
What were the types of text analytics and text mining solutions developed by AMC Networks? Can you think of other potential uses of text mining applications in the broadcasting industry?
NLP Task Categories
Question answering
Automatic summarization
Natural language generation & understanding
Machine translation
Foreign language reading & writing
Speech recognition
Text proofing
Optical character recognition
Text Mining Applications
Marketing applications
Enables better CRM
Security applications
ECHELON, OASIS
Deception detection (…)
Medicine and biology
Literature-based gene identification (…)
Academic applications
Research stream analysis
Application Case 5.3 Mining for Lies (1 of 4)
Deception detection
A difficult problem
If detection is limited to only text, then the problem is even more difficult
The study
analyzed text-based testimonies of persons of interest at military bases
used only text-based features (cues)
Application Case 5.3 Mining for Lies (2 of 4)
FIGURE 5.3 Text-Based Deception-Detection Process
Application Case 5.3 Mining for Lies (3 of 4)
Table 5.1 Categories and Examples of Linguistic Features Used in Deception Detection
Application Case 5.3 Mining for Lies (4 of 4)
371 usable statements are generated
31 features are used
Different feature selection methods used
10-fold cross validation is used
Results (overall % accuracy)
Logistic regression 67.28
Decision trees 71.60
Neural networks 73.46
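The 10-fold cross validation mentioned in this case can be sketched as follows. This is a generic illustration of how 371 statements could be split into ten train/test partitions; the function name is hypothetical, and the study's actual splitting procedure (for example, whether the data were shuffled or stratified) is not stated in the slides.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 371 statements, as in the case; every statement is tested exactly once.
folds = list(k_fold_indices(371, k=10))
for i, (train, test) in enumerate(folds, start=1):
    print(f"fold {i}: train={len(train)} test={len(test)}")
```

Each model is trained on the training indices of a fold and scored on that fold's test indices; the reported accuracy is typically the average over the ten folds. In practice the indices are usually shuffled before splitting.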
Text Mining Applications (Gene/Protein Interaction Identification)
Application Case 5.4 Bringing the Customer into the Quality Equation: Lenovo Uses Analytics to Rethink Its Redesign
Questions for Discussion
How did Lenovo use text analytics and text mining to improve quality and design of their products and ultimately improve customer satisfaction?
What were the challenges, the proposed solution, and the obtained results?
Text Mining Process
A Context Diagram for Text Mining Process
Text Mining Process
FIGURE 5.6 The Three-Step/Task Text Mining Process
Text Mining Process
Step 1: Establish the corpus
Collect all relevant unstructured data (e.g., textual documents, XML files, e-mails, Web pages, short notes, voice recordings…)
Digitize, standardize the collection (e.g., all in ASCII text files)
Place the collection in a common place (e.g., in a flat file, or in a directory as separate files)
Text Mining Process
Step 2: Create the Term–by–Document Matrix
Text Mining Process
Step 2: Create the Term–by–Document Matrix (TDM) (Cont.)
Should all terms be included?
Stop words, include words
Synonyms, homonyms
Stemming
What is the best representation of the indices (values in cells)?
Raw counts; binary frequencies; log frequencies;
Inverse document frequency
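The cell-value choices listed above can be compared on a tiny example. The corpus below is hypothetical, and the formulas (log frequency as 1 + ln(tf), inverse document frequency as ln(N / df)) are common conventions assumed here; other variants exist.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; each document is already tokenized.
docs = {
    "doc1": ["investment", "risk", "risk"],
    "doc2": ["project", "management", "risk"],
    "doc3": ["software", "development", "project"],
}

terms = sorted({t for tokens in docs.values() for t in tokens})
counts = {name: Counter(tokens) for name, tokens in docs.items()}

def raw(term, doc):
    """Raw count of the term in the document."""
    return counts[doc][term]

def binary(term, doc):
    """1 if the term occurs in the document, else 0."""
    return 1 if counts[doc][term] > 0 else 0

def log_freq(term, doc):
    """Dampened count: 1 + ln(tf), or 0 when the term is absent."""
    tf = counts[doc][term]
    return 1 + math.log(tf) if tf > 0 else 0.0

def idf(term):
    """Inverse document frequency: ln(N / number of docs containing term)."""
    df = sum(1 for doc in docs if counts[doc][term] > 0)
    return math.log(len(docs) / df)

# Term-by-document matrix with raw counts: rows are terms, columns documents.
tdm = {t: [raw(t, d) for d in docs] for t in terms}
for t, row in tdm.items():
    print(f"{t:12s} {row}  idf={idf(t):.3f}")
```

Terms that occur in many documents (here "risk" and "project") receive a lower inverse document frequency than terms confined to a single document, which is why this weighting is often multiplied into the raw or log frequencies.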
Text Mining Process
Step 2: Create the Term–by–Document Matrix (TDM) (Cont.)
TDM is a sparse matrix. How can we reduce the dimensionality of the TDM?
Manual - a domain expert goes through it
Eliminate terms with very few occurrences in very few documents (?)
Transform the matrix using singular value decomposition (SVD)
SVD is similar to principal component analysis
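As a brief sketch of how SVD reduces the dimensionality of the TDM, the standard factorization is shown below. The symbols (A, U, Σ, V, k) are notation introduced here for illustration and do not appear in the slides.

```latex
% A: m-by-n term-by-document matrix (m terms, n documents), rank r
A = U \Sigma V^{\mathsf{T}}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0

% Rank-k approximation keeps only the k largest singular values (k < r)
A \approx A_k = U_k \Sigma_k V_k^{\mathsf{T}}

% Each document (a column of A) is then described by k values instead of m
D_k = \Sigma_k V_k^{\mathsf{T}} \in \mathbb{R}^{k \times n}
```

Choosing a small k replaces thousands of sparse term rows with a handful of dense dimensions, which is the idea behind latent semantic indexing.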
Text Mining Process
Step 3: Extract patterns/knowledge
Classification (text categorization)
Clustering (natural groupings of text)
Improve search recall
Improve search precision
Scatter/gather
Query-specific clustering
Association
Trend Analysis (…)
Application Case 5.5 Research Literature Survey with Text Mining (1 of 4)
Mining the published IS literature
MIS Quarterly (MISQ)
Journal of MIS (JMIS)
Information Systems Research (ISR)
Covers 12-year period (1994-2005)
901 papers are included in the study
Only the paper abstracts are used
9 clusters are generated for further analysis
Application Case 5.5 Research Literature Survey with Text Mining (2 of 4)
Application Case 5.5 Research Literature Survey with Text Mining (3 of 4)
Application Case 5.5 Research Literature Survey with Text Mining (4 of 4)
Sentiment Analysis
Sentiment: belief, view, opinion, and conviction
Sentiment analysis tries to answer the question “What do people feel about a certain topic?”
It does so by analyzing data related to the opinions of many, using a variety of automated tools
Used in a variety of domains, but its applications in CRM are especially noteworthy (since they relate to customers’/consumers’ opinions)
Sentiment Analysis Applications
Voice of the customer (VOC)
Voice of the Market (VOM)
Voice of the Employee (VOE)
Brand Management
Financial Markets
Politics
Government Intelligence
… others
Sentiment Analysis Process
Sentiment Analysis Process
Step 1 – Sentiment Detection
Comes right after the retrieval and preparation of the text documents
It is also called detection of objectivity
Fact [= objectivity] versus Opinion [= subjectivity]
Step 2 – N-P Polarity Classification
Given an opinionated piece of text, the goal is to classify the opinion as falling under one of two opposing sentiment polarities
N [= negative] versus P [= positive]
Sentiment Analysis Process
Step 3 – Target Identification
The goal of this step is to accurately identify the target of the expressed sentiment (e.g., a person, a product, an event, etc.)
Level of difficulty depends on the application domain
Step 4 – Collection and Aggregation
Once the sentiments of all text data points in the document are identified and calculated, they are to be aggregated
Word → Statement → Paragraph → Document
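Steps 1, 2, and 4 of the process can be sketched with a lexicon-based scorer. The lexicon, its scores, and the averaging rule are assumptions made for illustration; Step 3 (target identification) is not implemented here, and production systems use far richer lexicons or trained classifiers.

```python
# Hypothetical polarity lexicon: word -> score in [-1, 1].
LEXICON = {"great": 1.0, "love": 0.8, "good": 0.5,
           "slow": -0.4, "bad": -0.6, "terrible": -1.0}

def statement_polarity(statement):
    """Return None for an objective statement, else its mean lexicon score."""
    words = statement.lower().replace(".", " ").replace(",", " ").split()
    scores = [LEXICON[w] for w in words if w in LEXICON]
    if not scores:
        # Step 1: no opinion words found, so treat the statement as a fact.
        return None
    # Step 2: negative values mean N polarity, positive values mean P.
    return sum(scores) / len(scores)

def document_polarity(statements):
    """Step 4: aggregate statement-level polarities to the document level."""
    scored = [p for p in map(statement_polarity, statements) if p is not None]
    return sum(scored) / len(scored) if scored else None

review = ["The laptop shipped on Monday.",
          "The screen is great.",
          "The battery is bad, and charging is slow."]
print([statement_polarity(s) for s in review])
print(document_polarity(review))
```

The first statement is filtered out as objective, and the remaining two are averaged, mirroring the aggregation from statement level up to document level.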
P-N Polarity and S-O Polarity
Web Mining Overview
Web is the largest repository of data
Data is in HTML, XML, text format
Challenges (of processing Web data)
The Web is too big for effective data mining
The Web is too complex
The Web is too dynamic
The Web is not specific to a domain
The Web has everything
Opportunities and challenges are great!
Web Mining
Web mining (or Web data mining) is the process of discovering intrinsic relationships from Web data (textual, linkage, or usage)
Web Content/Structure Mining
Mining the textual content on the Web
Data collection via Web crawlers
Web pages include hyperlinks
Authoritative pages
Hubs
Hyperlink-induced topic search (HITS) algorithm
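The relationship between hubs and authoritative pages can be sketched with the basic HITS iteration. The link graph and page names below are hypothetical, and the fixed iteration count is an assumption; this is a minimal illustration rather than a production implementation.

```python
import math

def hits(links, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[dst] for dst in links.get(p, [])) for p in pages}
        # Normalize both score vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical link graph: two portals and a blog link to content pages.
web = {
    "portal1": ["news", "shop"],
    "portal2": ["news", "shop"],
    "blog": ["news"],
}
hub, auth = hits(web)
print("best authority:", max(auth, key=auth.get))
print("best hub:", max(sorted(hub), key=hub.get))
```

A page becomes a good authority when good hubs point to it, and a good hub when it points to good authorities; the two scores reinforce each other over the iterations.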
Web Usage Mining
Extraction of information from data generated through Web page visits and transactions…
data stored in server access logs, referrer logs, agent logs, and client-side cookies
user characteristics and usage profiles
metadata, such as page attributes, content attributes, and usage data
Clickstream data
Clickstream analysis
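A common first step in clickstream analysis is grouping raw page requests into sessions. The sketch below assumes a hypothetical log format of (user, timestamp, page) and a 30-minute inactivity timeout; both are illustrative choices, not something the slides prescribe.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60  # assumed inactivity timeout, in seconds

# Hypothetical log records: (user_id, unix_timestamp, page)
log = [
    ("u1", 0, "/home"), ("u1", 120, "/products"), ("u1", 4000, "/home"),
    ("u2", 50, "/home"), ("u2", 90, "/cart"), ("u2", 200, "/checkout"),
]

def sessionize(records, gap=SESSION_GAP):
    """Group each user's page views into sessions split by inactivity."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))
    sessions = []
    for user, hits in by_user.items():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, page) in zip(hits, hits[1:]):
            if ts - prev_ts > gap:
                # Long pause: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(page)
        sessions.append((user, current))
    return sessions

sessions = sessionize(log)
for user, path in sessions:
    print(user, " -> ".join(path))
```

Each resulting click path can then be analyzed for usage patterns, such as the route visitors take toward checkout.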
Web Usage Mining
Web usage mining applications
Determine the lifetime value of clients
Design cross-marketing strategies across products.
Evaluate promotional campaigns
Target electronic ads and coupons at user groups based on user access patterns
Predict user behavior based on previously learned rules and users' profiles
Present dynamic information to users based on their interests and profiles
…
Search Engines
Google, Bing, Yahoo, …
For what reason do you use search engines?
A search engine is a software program that searches for documents (Internet sites or files) based on keywords (individual words, multi-word terms, or a complete sentence) that users provide about the subject of their inquiry
They are the workhorses of the Internet
Structure of a Typical Internet Search Engine
Anatomy of a Search Engine
Development Cycle
Web Crawler
Document Indexer
Response Cycle
Query Analyzer
Document Matcher/Ranker
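The two cycles can be sketched in a few lines: the document indexer builds an inverted index during the development cycle, and the matcher/ranker uses it during the response cycle. The pages, URLs, and the scoring rule (summed term frequency) are hypothetical simplifications; real engines add query analysis, link-based ranking, and much more.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> text
pages = {
    "a.example/1": "text mining extracts knowledge from text",
    "a.example/2": "web mining uses web usage data",
    "a.example/3": "sentiment analysis of customer text",
}

def build_index(pages):
    """Document indexer: map each word to {url: term frequency}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Matcher/ranker: score pages by summed frequency of query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, tf in index.get(word, {}).items():
            scores[url] += tf
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

index = build_index(pages)
results = search(index, "text mining")
print(results)
```

Because the index is built ahead of time, answering a query only requires looking up the query words rather than rescanning every page.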
Search Engine Optimization
It is the intentional activity of affecting the visibility of an e-commerce site or a Web site in a search engine’s natural (unpaid or organic) search results
Part of an Internet marketing strategy
Based on knowing how a search engine works
Content, HTML, keywords, external links, …
Indexing based on …
Webmaster submission of URL
Proactively and continuously crawling the Web
Top 15 Most Popular Search Engines (by eBizMBA, August 2016)
Web Usage Mining (Clickstream Analysis)
Web Analytics Metrics
Web site usability
How were the visitors using my Web site?
Traffic sources
Where did they come from?
Visitor profiles
What do my visitors look like?
Conversion statistics
What does it all mean for the business?
Web Analytics Metrics
Web Site Usability
Page views
Time on site
Downloads
Click map
Click paths
Traffic Source
Referral Web sites
Search engines
Direct
Offline campaigns
Online campaigns
Web Analytics Metrics
Visitor Profiles
Keywords
Content groupings
Geography
Time of day
Landing page profiles
Conversion Statistics
New visitors
Returning visitors
Leads
Sales/conversions
Abandonment/exit rate
A Sample Web Analytics Dashboard
Social Analytics: Social Network Analysis
Social Network - social structure composed of individuals linking to each other
Analysis of social dynamics
Interdisciplinary field
Social psychology
Sociology
Statistics
Graph theory
Social Analytics: Social Network Analysis
Social Networks help study relationships between individuals, groups, organizations, societies
Self organizing
Emergent
Complex
Typical social network types
Communication networks, community networks, criminal networks, innovation networks, …
Application Case 5.8 Tito’s Vodka Establishes Brand Loyalty with an Authentic Social Strategy
Discussion Questions
How can social media analytics be used in the consumer products industry?
What do you think are the key challenges, potential solutions, and probable results in applying social media analytics in consumer products and services firms?
Social Analytics: Social Network Analysis Metrics
Connections
Homophily
Multiplexity
Mutuality/reciprocity
Network closure
Propinquity
Segmentation
Cliques and social circles
Clustering coefficient
Cohesion
Distribution
Bridge
Centrality
Density
Distance
Structural holes
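Three of the metrics listed above (density, centrality, and clustering coefficient) can be computed directly from an adjacency list. The network below is hypothetical, and degree centrality is used as one simple form of centrality; other definitions (betweenness, closeness) exist.

```python
from itertools import combinations

# Hypothetical undirected network stored as adjacency sets.
graph = {
    "Ann": {"Bob", "Cat", "Dan"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Ann", "Bob"},
    "Dan": {"Ann", "Eve"},
    "Eve": {"Dan"},
}

def density(g):
    """Share of all possible ties that are actually present."""
    n = len(g)
    edges = sum(len(nbrs) for nbrs in g.values()) / 2
    return edges / (n * (n - 1) / 2)

def degree_centrality(g, node):
    """Fraction of the other nodes that this node is directly tied to."""
    return len(g[node]) / (len(g) - 1)

def clustering_coefficient(g, node):
    """Fraction of a node's neighbor pairs that are tied to each other."""
    nbrs = g[node]
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in g[a])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return linked / possible

print("density:", density(graph))
for person in graph:
    print(person, degree_centrality(graph, person),
          round(clustering_coefficient(graph, person), 3))
```

In this example Ann has the highest degree centrality, while Bob and Cat sit in a fully connected triangle with her and therefore have a clustering coefficient of 1.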
Social Media Definitions and Concepts
Enabling technologies of social interactions among people
Relies on enabling technologies of Web 2.0
Takes on many different forms
Internet forums, Web logs, social blogs, microblogging, wikis, social networks, podcasts, pictures, video, and product reviews
Different types of social media
Based on media research and social process
Social versus Industrial Media
Web-based social media are different from traditional/industrial media, such as newspapers, television, and film
Differentiating characteristics
Quality
Reach
Frequency
Accessibility
Usability
Immediacy
Updatability
How Do People Use Social Media?
Different engagement levels
Social Media Analytics
Social media analytics refers to the systematic and scientific ways to consume the vast amount of content created by Web-based social media outlets, tools, and techniques, for the betterment of an organization’s competitiveness
Tools to measure social media impact:
Descriptive analytics
Social network analysis
Advanced analytics
Best Practices in Social Media Analytics
Think of measurement as a guidance system, not a rating system
Track the elusive sentiment
Continuously improve the accuracy of text analysis
Look at the ripple effect
Look beyond the brand
Identify your most powerful influencers
Look closely at the accuracy of your analytic tool
Incorporate social media intelligence into planning
End of Chapter 5
Questions / Comments
FIGURE 5.3 Text-Based Deception-Detection Process (box labels recovered from the diagram)
Statements Transcribed for Processing
Statements Labeled as Truthful or Deceptive by Law Enforcement
Cues Extracted & Selected
Text Processing Software Identified Cues in Statements
Text Processing Software Generated Quantified Cues
Classification Models Trained and Tested on Quantified Cues
Table 5.1 Categories and Examples of Linguistic Features Used in Deception Detection
Category        Example Cues
Quantity        Verb count, noun-phrase count, ...
Complexity      Avg. no. of clauses, sentence length, ...
Uncertainty     Modifiers, modal verbs, ...
Nonimmediacy    Passive voice, objectification, ...
Expressivity    Emotiveness
Diversity       Lexical diversity, redundancy, ...
Informality     Typographical error ratio
Specificity     Spatiotemporal, perceptual information, ...
Affect          Positive affect, negative affect, etc.
Gene/Protein Interaction Identification (labels recovered from the figure)
Annotation layers: Gene/Protein, Ontology, Word, POS, Shallow Parse
Example sentence: ...expression of Bcl-2 is correlated with insufficient white blood cell death and activation of p53.
FIGURE 5.6 The Three-Step/Task Text Mining Process (text recovered from the diagram)
Inputs: a variety of relevant unstructured (and semistructured) data sources such as text, XML, HTML, etc.
Task 1, Establish the Corpus: Collect and organize the domain-specific unstructured data
Output of Task 1: a collection of documents in some digitized format for computer processing
Task 2, Create the Term-Document Matrix: Introduce structure to the corpus
Output of Task 2: a flat file called the term-document matrix, where the cells are populated with the term frequencies
Task 3, Extract Knowledge: Discover novel patterns from the T-D matrix
Output of Task 3: a number of problem-specific classification, association, and clustering models and visualizations
Other labels: Feedback (two arrows), Knowledge, Data, Text
Example Term-by-Document Matrix (labels recovered from the figure)
Terms: investment risk, project management, software engineering, development, SAP, ...
Documents: Document 1, Document 2, Document 3, Document 4, Document 5, Document 6, ...
The cells hold term frequencies; the positions of the individual values were not recoverable from the extraction.
Application Case 5.5 sample records (table recovered from the slide; abstracts are truncated in the source)

Journal: MISQ
Year: 2005
Author(s): A. Malhotra, S. Gosain and O. A. El Sawy
Title: Absorptive capacity configurations in supply chains: Gearing for partner-enabled market knowledge creation
Vol/No: 29/1
Pages: 145-187
Keywords: knowledge management; supply chain; absorptive capacity; interorganizational information systems; configuration approaches
Abstract: The need for continual value innovation is driving supply chains to evolve from a pure transactional focus to leveraging interorganizational partnerships for sharing …

Journal: ISR
Year: 1999
Author(s): D. Robey and M. C. Boudreau
Title: Accounting for the contradictory organizational consequences of information technology: Theoretical directions and methodological implications
Vol/No: 10/2
Pages: 167-185
Keywords: organizational transformation; impacts of technology; organization theory; research methodology; intraorganizational power; electronic communication; MIS implementation; culture; systems
Abstract: Although much contemporary thought considers advanced information technologies as either determinants or enablers of radical organizational change, empirical studies have revealed inconsistent findings to support the deterministic logic implicit in such arguments. This paper reviews the contradictory …

Journal: JMIS
Year: 2001
Author(s): R. Aron and E. K. Clemons
Title: Achieving the optimal balance between investment in quality and investment in self-promotion for information products
Vol/No: 18/2
Pages: 65-88
Keywords: information products; internet advertising; product positioning; signaling; signaling games
Abstract: When producers of goods (or services) are confronted by a situation in which their offerings no longer perfectly match consumer preferences, they must determine the extent to which the advertised features of …

…
Sentiment Analysis Process (text recovered from the diagram)
Input: Textual Data (a statement)
Step 1: Calculate the O-S Polarity (uses a Lexicon; yields the O-S polarity measure)
Decision: Is there a sentiment? (Yes / No)
Step 2: Calculate the N-P Polarity of the sentiment (uses a Lexicon; yields the N-P Polarity)
Step 3: Identify the target for the sentiment (yields the Target)
Record the Polarity, Strength, and the Target of the sentiment
Step 4: Tabulate & aggregate the sentiment analysis results
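P-N Polarity and S-O Polarity (labels recovered from the figure)
Horizontal axis: P-N Polarity, with ends labeled Positive (P) (+) and Negative (N) (-)
Vertical axis: S-O Polarity, with ends labeled Subjective (S) and Objective (O)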
Web Mining taxonomy (text recovered from the diagram)
Foundations: Data Mining, Text Mining
Web Mining branches:
Web Content Mining. Source: unstructured textual content of the Web pages (usually in HTML format)
Web Structure Mining. Source: the uniform resource locator (URL) links contained in the Web pages
Web Usage Mining. Source: the detailed description of a Web site’s visits (sequence of clicks by sessions)
Related application areas: Search Engines, Sentiment Analysis, Semantic Webs, Web Analytics, Page Rank, Information Retrieval, Graph Mining, Social Analytics, Clickstream Analysis, Search Engine Optimization, Social Network Analysis, Social Media Analytics, Weblog Analysis, Marketing Attribution, Customer Analytics, 360 Customer View, Voice of the Customer
Structure of a Typical Internet Search Engine (text recovered from the diagram)
Components: User, World Wide Web, Scheduler, Web Crawler, Document Indexer, Cached / Indexed Documents DB, Metadata, Index, Query Analyzer, Document Matcher/Ranker
Development Cycle flows: List of URLs to Crawl, Crawling the Web, Unprocessed Web Pages, Processed Pages
Responding Cycle flows: Search Query, Processed Query, List of Matched Pages, Rank-Ordered Pages
Web Usage Mining (Clickstream Analysis) (text recovered from the diagram)
Sources: Website, Weblogs, User / Customer
Pre-Process Data: Collecting, Merging, Cleaning, Structuring (identify users, identify sessions, identify page views, identify visits)
Extract Knowledge: Usage patterns, User profiles, Page profiles, Visit profiles, Customer value
Feedback: How to better the data; How to improve the Web site; How to increase the customer value
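How Do People Use Social Media? (labels recovered from the figure)
Engagement levels: Creators, Critics, Joiners, Collectors, Spectators, Inactives
Vertical axis: Level of Social Media Engagement
Horizontal axis: Time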