Business Intelligence 3


Chapter 4 • Data Mining Process, Methods, and Algorithms 243

then subsequently augmented the passenger data with additional information such as family sizes and Social Security numbers, information purchased from the data broker Acxiom. The consolidated personal database was intended to be used for a data mining project to develop potential terrorist profiles. All of this was done without notification or consent of passengers. When news of the activities got out, however, dozens of privacy lawsuits were filed against JetBlue, Torch, and Acxiom, and several U.S. senators called for an investigation into the incident (Wald, 2004). Similar, but not as dramatic, privacy-related news was reported in the recent past about popular social network companies that allegedly were selling customer-specific data to other companies for personalized target marketing.

Another peculiar story about privacy concerns made it to the headlines in 2012. In this instance, the company, Target, did not even use any private and/or personal data. Legally speaking, there was no violation of any laws. The story is summarized in Application Case 4.7.

In early 2012, an infamous story appeared concerning Target’s practice of predictive analytics. The story was about a teenage girl who was being sent advertising flyers and coupons by Target for the kinds of things that a mother-to-be would buy from a store like Target. The story goes like this: An angry man went into a Target outside of Minneapolis, demanding to talk to a manager: “My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?” The manager had no idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture, and pictures of smiling infants. The manager apologized and then called a few days later to apologize again. On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

As it turns out, Target figured out a teen girl was pregnant before her father did! Here is how the company did it. Target assigns every customer a Guest ID number (tied to his or her credit card, name, or e-mail address) that becomes a placeholder that keeps a history of everything the person has bought. Target augments these data with any demographic information that it has collected from the customer or has bought from other information sources. Using this information, Target looked at historical buying data for all the women who had signed up for Target baby registries in the past. Its analysts examined the data from all directions, and soon enough, some useful patterns emerged. For example, lotions and special vitamins were among the products with interesting purchase patterns. Lots of people buy lotion, but what an analyst noticed was that women on the baby registry were buying larger quantities of unscented lotion around the beginning of their second trimester. Another analyst noted that sometime in the first 20 weeks, pregnant women loaded up on supplements like calcium, magnesium, and zinc. Many shoppers purchase soap and cotton balls, but when someone suddenly starts buying lots of scent-free soap and extra-large bags of cotton balls, in addition to hand sanitizers and washcloths, it signals that she could be getting close to her delivery date. In the end, the analysts were able to identify about 25 products that, when analyzed together, allowed them to assign each shopper a “pregnancy prediction” score. More important, they could also estimate a woman’s due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.
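The idea of combining several purchase signals into a single score can be illustrated with a minimal sketch. The case does not describe Target's actual model, so everything numeric here is an assumption made for illustration: the weights, the baseline value, the product keys, and the choice of a logistic function are all hypothetical. Only the product categories come from the story above.

```python
import math

# Hypothetical weights: the case names these products but reports no numbers.
SIGNAL_WEIGHTS = {
    "unscented_lotion": 1.2,
    "calcium_supplement": 0.8,
    "magnesium_supplement": 0.8,
    "zinc_supplement": 0.8,
    "scent_free_soap": 1.0,
    "cotton_balls_xl": 0.9,
    "hand_sanitizer": 0.5,
    "washcloths": 0.5,
}

# Hypothetical baseline so that a shopper with no signals scores low.
BIAS = -4.0


def pregnancy_score(basket):
    """Map a set of recently purchased product keys to a score in (0, 1).

    Sums the weights of the signal products present in the basket and
    squashes the total with a logistic function.
    """
    z = BIAS + sum(w for item, w in SIGNAL_WEIGHTS.items() if item in basket)
    return 1.0 / (1.0 + math.exp(-z))


# Two illustrative baskets: one with only common items, one with many signals.
typical = {"hand_sanitizer", "cotton_balls_xl"}
flagged = {
    "unscented_lotion",
    "calcium_supplement",
    "magnesium_supplement",
    "zinc_supplement",
    "scent_free_soap",
    "cotton_balls_xl",
}

print(f"typical shopper: {pregnancy_score(typical):.3f}")
print(f"flagged shopper: {pregnancy_score(flagged):.3f}")
```

No single product moves the score much on its own; it is the combination of several signals in one basket that pushes a shopper toward the high end, which mirrors the point that the products had to be "analyzed together." In practice, such weights would be estimated from labeled historical data (here, the baby registry sign-ups) rather than set by hand.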

If you look at this practice from a legal perspective, you would conclude that Target did not use any information that violates customer privacy; rather, they used transactional data that almost every other retail chain is collecting and storing (and perhaps analyzing) about their customers. What was disturbing in this scenario was perhaps the targeted concept: pregnancy. Certain events or concepts should

Application Case 4.7 Predicting Customer Buying Patterns—The Target Story

(Continued)

244 Part II • Predictive Analytics/Machine Learning

Data Mining Myths and Blunders

Data mining is a powerful analytical tool that enables business executives to advance from describing the nature of the past (looking in a rearview mirror) to predicting the future (looking ahead) to better manage their business operations (making accurate and timely decisions). Data mining helps marketers find patterns that unlock the mysteries of customer behavior. The results of data mining can be used to increase revenue and reduce cost by identifying fraud and discovering business opportunities, offering a whole new realm of competitive advantage. As an evolving and maturing field, data mining is often associated with a number of myths, including those listed in Table 4.6 (Delen, 2014; Zaima, 2003).

Data mining visionaries have gained enormous competitive advantage by understanding that these myths are just that: myths.

Although the value proposition of data mining, and therefore its necessity, is obvious to most practitioners, those who carry out data mining projects (from novice to seasoned data scientist) sometimes make mistakes that result in projects with less-than-desirable outcomes. The following 16 data mining mistakes (also called blunders, pitfalls, or bloopers) are often made in practice (Nisbet et al., 2009; Shultz, 2004; Skalak, 2001), and data scientists should be aware of them and, to the extent possible, do their best to avoid them:

1. Selecting the wrong problem for data mining. Not every business problem can be solved with data mining (i.e., the magic bullet syndrome). When there are no representative data (large and feature rich), there cannot be a practicable data mining project.

2. Ignoring what your sponsor thinks data mining is and what it really can and cannot do. Expectation management is the key for successful data mining projects.

TABLE 4.6 Data Mining Myths

Myth: Data mining provides instant, crystal-ball-like predictions.
Reality: Data mining is a multistep process that requires deliberate, proactive design and use.

Myth: Data mining is not yet viable for mainstream business applications.
Reality: The current state of the art is ready for almost any business type and/or size.

Myth: Data mining requires a separate, dedicated database.
Reality: Because of the advances in database technology, a dedicated database is not required.

Myth: Only those with advanced degrees can do data mining.
Reality: Newer Web-based tools enable managers of all educational levels to do data mining.

Myth: Data mining is only for large firms that have lots of customer data.
Reality: If the data accurately reflect the business or its customers, any company can use data mining.

Application Case 4.7 (Continued)

be off limits or treated extremely cautiously, such as terminal disease, divorce, and bankruptcy.

Questions for Case 4.7

1. What do you think about data mining and its implication for privacy? What is the threshold between discovery of knowledge and infringement of privacy?

2. Did Target go too far? Did it do anything illegal? What do you think Target should have done? What do you think Target should do next (quit these types of practices)?

Sources: K. Hill, “How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did,” Forbes, February 16, 2012; R. Nolan, “Behind the Cover Story: How Much Does Target Know?” NYTimes.com, February 21, 2012.