

2. OPPORTUNITIES STILL UNCAPTURED

This chapter revisits the five domains we highlighted in our 2011 report to evaluate their progress toward capturing the potential value that data and analytics can deliver. In each of these areas, our previous research outlined dozens of avenues for boosting productivity, expanding into new markets, and improving decision making. Below we examine how much of that value is actually being captured, combining quantitative data with input from industry experts. The numerical estimates of progress in this chapter provide an indication of which areas have the greatest momentum and where barriers have proven to be more formidable, although we acknowledge they are directional rather than precise.

We see the greatest progress in location-based services and in retail, where competition from digital native firms has pushed other players toward adoption (Exhibit 1). The thin margins facing retailers (especially in the grocery sector) and pressure from competitors such as Amazon and leading big-box retailers such as Walmart create a strong incentive to evolve. In contrast, manufacturing, the public sector, and health care have captured less than a third of the value opportunities that data and analytics presented five years ago. Overall, many of the opportunities described in our 2011 report are still on the table. In the meantime, the potential for value creation has grown even bigger.

Exhibit 1

There has been uneven progress in capturing value from data and analytics

Location-based data
▪ Potential impact (2011 research): $100 billion+ revenues for service providers; up to $700 billion value to end users
▪ Value captured: 50–60%
▪ Major barriers: Penetration of GPS-enabled smartphones globally

US retail¹
▪ Potential impact (2011 research): 60%+ increase in net margin; 0.5–1.0% annual productivity growth
▪ Value captured: 30–40%
▪ Major barriers: Lack of analytical talent; siloed data within companies

Manufacturing²
▪ Potential impact (2011 research): Up to 50% lower product development cost; up to 25% lower operating cost; up to 30% gross margin increase
▪ Value captured: 20–30%
▪ Major barriers: Siloed data in legacy IT systems; leadership skeptical of impact

EU public sector³
▪ Potential impact (2011 research): ~€250 billion value per year; ~0.5% annual productivity growth
▪ Value captured: 10–20%
▪ Major barriers: Lack of analytical talent; siloed data within different agencies

US health care
▪ Potential impact (2011 research): $300 billion value per year; ~0.7% annual productivity growth
▪ Value captured: 10–20%
▪ Major barriers: Need to demonstrate clinical utility to gain acceptance; interoperability and data sharing

¹ Similar observations hold true for the EU retail sector.
² Manufacturing levers divided by functional application.
³ Similar observations hold true for other high-income country governments.

SOURCE: Expert interviews; McKinsey Global Institute analysis

30 McKinsey Global Institute 2. Opportunities still uncaptured

Most business leaders recognize the size of the opportunities and feel the pressure to evolve. Recent research has found that investing in data and analytics capabilities has high returns, on average: firms can use these capabilities to achieve productivity gains of 6 to 8 percent, which translates into returns roughly doubling their investment within a decade. This is a higher rate of return than other recent technologies have yielded, surpassing even the computer investment cycle in the 1980s.32
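As a rough check on the "doubling within a decade" figure, the implied compound annual return can be worked out directly. The sketch below is illustrative only: the function name and the assumption of smooth annual compounding are ours, not the cited study's.

```python
def implied_annual_return(multiple: float, years: float) -> float:
    """Compound annual rate at which an investment grows to `multiple` times its size."""
    return multiple ** (1.0 / years) - 1.0


# Assumption: "roughly doubling within a decade" is read as 2x over exactly 10 years.
rate = implied_annual_return(2.0, 10)
print(f"Implied compound annual return: {rate:.1%}")
```

Under that reading, doubling over ten years corresponds to a compound return of a little over 7 percent a year.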

However, these high returns are largely driven by only a few successful organizations. Early adopters are posting faster growth in operating profits, which in turn enables them to continue investing in data assets and analytics capabilities, solidifying their advantages. Facebook, in particular, has created a platform capable of gathering remarkably detailed data on billions of individual users. But not all of the leaders are digital natives. Walmart, GE, Ferrari F1, and Union Pacific are examples of companies in traditional industries whose investments in data and analytics have paid significant dividends on both the revenue and cost sides.

Many other companies are lagging behind in multiple dimensions of data and analytics transformation—and the barriers are mostly organizational issues. The first challenge is incorporating data and analytics into a core strategic vision. The next step is developing the right business processes and building capabilities (including both data infrastructure and talent); it is not enough to simply layer powerful technology systems on top of existing business operations. All these aspects of transformation need to come together to realize the full potential of data and analytics—and the challenges incumbents face in pulling this off are precisely why much of the value we highlighted in 2011 is still unclaimed.

LOCATION-BASED SERVICES HAVE MADE THE MOST SIGNIFICANT PROGRESS SINCE 2011

Location-based services, which exist across multiple industries, use GPS and other data to pinpoint where a person (or a vehicle or device) is situated in real time. This domain has made the greatest strides since our 2011 report, thanks in large part to the widespread adoption of GPS-enabled smartphones. The role of digital giants such as Google and Apple in driving these applications forward for billions of smartphone users is hard to overstate.

We estimate that some 50 to 60 percent of the potential value anticipated in our 2011 research from location-based services has already been captured. We looked separately at the revenue generated by service providers and the value created for consumers. Our 2011 report estimated that service providers had roughly $96 billion to $106 billion in revenue at stake from three major sources: GPS navigation devices and services, mobile phone location-based service applications, and geo-targeted mobile advertising services. Today’s market has already reached 60 percent of that revenue estimate, with particularly strong growth in the use of GPS navigation. Industries and consumers alike have embraced real-time navigation, which is now embedded in a host of services that monetize this technology in new ways. Uber and Lyft use location data for their car dispatching algorithms, and online real estate platforms such as Redfin embed street views and neighborhood navigation in their mobile apps to aid home buyers.
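To make the revenue figures above concrete, the following sketch converts the 60 percent share into dollar terms. It simply applies the share quoted in the text to both ends of the 2011 range; the variable names are ours and the result is a back-of-the-envelope illustration, not a figure from the report.

```python
# Figures from the text, in billions of US dollars.
revenue_at_stake_busd = (96.0, 106.0)
captured_share = 0.60  # share of the 2011 revenue estimate reached by today's market

# Apply the share to the low and high ends of the range.
captured = tuple(round(r * captured_share, 1) for r in revenue_at_stake_busd)
print(f"Implied service-provider revenue: ${captured[0]} to ${captured[1]} billion")
```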

Our 2011 analysis estimated that end consumers would capture the equivalent of more than $600 billion in value, which is the lion’s share of the benefits these services create. The world is at the tipping point at which smartphones account for most mobile phone subscriptions (although most people in the world still do not own mobile phones).33 The share is much higher in advanced economies and is rising rapidly worldwide. This trend puts mapping technology in the pockets of billions of consumers. Much of the value comes, as expected in 2011, in the form of time and fuel savings as they use GPS navigation while driving and adopt many other mobile location-based services that deliver new types of conveniences.

32 Jacques Bughin, “Ten lessons learned from big data analytics,” Journal of Applied Marketing Analytics, forthcoming.

33 451 Research data. See also Ericsson mobility report: On the pulse of the networked society, Ericsson, June 2016.

31 McKinsey Global Institute The age of analytics: Competing in a data-driven world

Yet the opportunities have grown beyond what we envisioned in our 2011 report. Today there are new and growing opportunities for businesses in any industry to use geospatial data to track assets, teams, and customers in dispersed locations in order to generate new insights and improve efficiency. These opportunities are significant and, while still in the very early stages, could turn out to be even larger than the ones discussed above.

LEADERS IN THE RETAIL SECTOR HAVE ADOPTED ANALYTICS, BUT MARGINS REMAIN THIN AS MUCH OF THE VALUE GOES TO CONSUMER SURPLUS

Analytics has tremendous relevance for retailers, since they can mine a trove of transaction-based and behavioral data from their customers. In 2011, we estimated that some retailers could increase net margins by more than 60 percent, and that the US sector as a whole could boost annual productivity growth by 0.5 to 1.0 percent. Five years later, US retailers have captured 30 to 40 percent of this potential.

While our 2011 analysis focused on the US sector alone, these opportunities clearly exist in other high-income countries as well, especially in Europe. Moreover, the incentives for adoption are there; major retailers worldwide have been early adopters of analytics as they seek to respond to the competitive pressures created by e-commerce. But despite the improvements made possible by analytics, overall margins have remained thin (the earnings before interest, taxes, and amortization, or EBITA, margins held steady around 7 to 8 percent from 2011 to 2016).34 This is because a great deal of the value has gone to consumers, who have been the major beneficiaries of intense competition in the retail world.

Capabilities are also uneven across the industry. Big-box retailers such as Target, Best Buy, and Costco have invested in creating an end-to-end view of their entire value chain, from suppliers to warehouses to stores to customers. Real-time information from its stores allowed Globus, a department store chain in Switzerland, to update its product mix and respond quickly to customer demand.35 In addition, certain subsectors have made faster progress. The grocery sector has led the way, while smaller retailers specializing in clothing, furnishings, and accessories have lagged behind. Organizational hurdles, including the difficulty of finding data scientists and breaking down information silos across large companies, have kept many companies from realizing the full potential payoff.

Our 2011 report focused on integrating analytics into five key functions: marketing, merchandising, operations, supply-chain management, and new business models. We have seen fairly even progress across all of these, with most large retailers adopting at least basic analytics to optimize their operations and supply chains.

In marketing and sales, the biggest emphasis has been on improved cross-selling, including “next product to buy” recommendations. While Amazon pioneered this use of technology, many other retailers (including Nordstrom, Macy’s, and Gilt) now make recommendations based on user data. In addition, retailers have tested everything from location-based ads to social media analysis; Target, for instance, is piloting the use of beacons that transmit targeted in-store ads depending on a shopper’s precise position.36 Within merchandising, retailers have made strides in optimizing their pricing (especially online) and assortment, but they have not brought as much technology to bear on placement. Amazon is mining other sellers on its marketplace for possible additions to its own assortment. Among the operations and supply-chain levers, there has been widespread adoption for uses such as making performance more transparent, optimizing staffing levels, improving sourcing, and streamlining shipping and transport.

34 Based on a sample of a dozen large global retail companies from their annual reports.

35 Real-time enterprise stories, case studies from Bloomberg Businessweek Research Services and Forbes Insights, SAP, October 2014.

36 Sarah Perez, “Target launches beacon test in 50 stores, will expand nationwide later this year,” TechCrunch, August 5, 2015.

PROGRESS IN MANUFACTURING HAS LARGELY BEEN LIMITED TO A SMALL GROUP OF INDUSTRY LEADERS

Manufacturing industries have captured only about 20 to 30 percent of the potential value we estimated in our 2011 research, and most of that has gone to a handful of industry-leading companies. Those that made decisive investments in analytics capabilities have often generated impact in line with our estimates. The sector’s main barrier seems to be the perception of many companies that the complexity and cost of analytics could outweigh the potential gains, particularly if the companies have difficulty identifying the right technology and talent. There is no single integrated system that is a clear choice for every company. Many available systems will not solve the full problem of data being cordoned off in silos across an organization, and installing replacement systems is a difficult undertaking.

Our 2011 report highlighted opportunities for the global manufacturing sector to realize value from data and analytics within R&D, supply-chain management, production, and after-sales support. Within R&D, design-to-value (the use of customer and supplier data to refine existing designs and feed into new product development) has had the greatest uptick in adoption, particularly among carmakers. While adoption of advanced demand forecasting and supply planning has been limited, there are some individual success stories. One stamping parts producer was able to save approximately 15 percent on product costs by using these types of insights to optimize its production footprint.

Within the actual production process, the greatest advances have been in developing digital models of the entire production process. Industry leaders such as Siemens, GE, and Schneider Electric have used these “digital factories” to optimize operations and shop floor layout, though this technique often focuses on designing new facilities. Furthermore, throughout ongoing production processes, many early adopters are using sensor data to reduce operating costs by some 5 to 15 percent. Data-driven feedback in after-sales has been most heavily applied within servicing offers, especially in aerospace or large installations in business-to-business transactions. After-sales servicing offers once relied on simple monitoring, but now they are beginning to be based on real-time surveillance and predictive maintenance.

MUCH OF THE POTENTIAL FOR COST SAVINGS, EFFICIENCY, AND TRANSPARENCY IN THE PUBLIC SECTOR REMAINS UNREALIZED

Our 2011 report analyzed the public sector in the European Union (EU), where we outlined some €250 billion worth of annual savings that could be achieved by making government services more efficient, reducing fraud and errors in transfer payments, and improving tax collection. But little of this has materialized, as EU agencies have captured only about 10 to 20 percent of this potential.

In terms of operations, some government entities have moved more interactions online, and many (particularly tax agencies) have adopted more pre-filled forms. There is also a movement to improve data sharing across agencies, exemplified in the EU initiative called “Tell It Once.” On a country-specific level, the Netherlands has moved most tax and social welfare functions online, while France saved the equivalent of 0.4 percent of GDP by reducing requests from agencies to citizens for the same type of information from 12 to one.

Adoption of algorithms to detect fraud and errors in transfer payments has been limited. Analytics have been used to improve the rate of tax collection, mainly by targeting tax audits more effectively and running algorithms on submitted tax forms. France has automated cross-checks between agencies to improve the accuracy of their reviews of tax forms, while the United Kingdom’s payment phase segmentation leading to targeted tax audits recovered some £2 billion in its first year.

While our 2011 report focused on the EU public sector, these observations regarding government adoption are applicable across all high-income economies. Adoption and capabilities generally vary greatly from country to country, and even among agencies (with tax agencies typically being the most advanced within a given country). The main barriers holding back progress have been organizational issues, the deeply entrenched nature and complexity of existing agency systems, and the difficulty of attracting scarce analytics talent with public-sector salaries.

US HEALTH-CARE PLAYERS HAVE ONLY BEGUN TO DEVELOP ANALYTICS CAPABILITIES AND TRANSFORM THE DELIVERY OF PATIENT CARE

Our 2011 report outlined $300 billion worth of value that big data analytics could unlock in the US health-care sector. To date, only 10 to 20 percent of this value has been realized. Making a major shift in how data is used is no easy task in a sector that is not only highly regulated but often lacks strong incentives to promote increased usage of analytics. A range of barriers (including a lack of process and organizational change, a shortage of technical talent, data-sharing challenges, and regulations) have combined to limit the impact of data and analytics throughout the sector and constrain many of the changes we envisioned.

The opportunities we highlighted were split among five categories: clinical operations, accounting and pricing, R&D, new business models, and public health. Within clinical operations, the major success has been the rapid expansion of electronic medical records (EMRs), which accounted for 15.6 percent of all records in 2010 but more than 75 percent by 2014, aided by the incentives for providers in the Affordable Care Act.37 This has enabled basic analytics but little has been done to unlock and fully utilize the vast stores of data actually contained within EMRs. A few providers have pushed this further, including Sutter Health, whose new EMR system processes reports 40 times faster and achieves an 87 percent increase in predicting readmissions compared with its previous system, by centralizing the data and analytics and pushing toward prospective analyses.38

Payers have also been slow to capitalize on big data for accounting and pricing, but a few encouraging trends have emerged. Transparency in health-care pricing has improved thanks to steps taken by the Centers for Medicare and Medicaid Services at the national level, while more than 30 states have established all-payer claims databases to serve as large-scale repositories of pricing information. A few insurers have made gains. Optum within UnitedHealth saves employers money by combing claim records for overprescriptions.

Greater progress has been made in the pharmaceutical industry, where many companies have adopted analytics to assist their R&D, although they are still in the early stages of putting the full capabilities to work. Most pharma companies now use predictive modeling to optimize dosing as they move from animal testing to phase I clinical trials, but analytics have not yet been used as widely in later trials to determine questions such as the proper efficacy window and patient exclusion criteria. Data are being used in R&D to identify the right target population for drug development, which can reduce the time and cost of clinical trials by 10 to 15 percent. Contract research organizations, which are used more widely today than even five years ago, generally use statistical tools to improve the administration of clinical trials, and they are now using their scale to draw broader conclusions from their data. A few leading players have been aggressive in using clinical trial data to pursue label expansion opportunities (that is, finding additional uses for the same drug). Meanwhile, the Food and Drug Administration has introduced the Sentinel program to scan the records of 178 million patients for signs of adverse drug effects.

37 Dustin Charles, Meghan Gabriel, and Talisha Searcy, “Adoption of electronic health record systems among US non-federal acute care hospitals: 2008–2014,” Office of the National Coordinator for Health Information Technology, data brief number 23, April 2015.

38 “Healthcare providers unlock value of on-demand patient data with SAP HANA,” SAP press release, May 4, 2016.

Some of the new models we highlighted in 2011 are in fact taking root. There is now a large and growing industry that aggregates and synthesizes clinical records. Explorys, for example, a data aggregator with some 40 million EMRs, was recently acquired by IBM to support the development of Watson. Online platforms and communities (such as PatientsLikeMe) had strong initial success as key data sources, but other sources have since appeared. The use of analytics in public health surveillance has assumed new importance, given recent outbreaks of the Ebola and Zika viruses.

The health-care sector may have a long way to go toward integrating data and analytics. But in the meantime, the possibilities have grown much bigger than what we envisioned just five short years ago. Cutting-edge technology will take time to diffuse throughout the health-care system, but the use of machine learning to assist in diagnosis and clinical decision making has the potential to reshape patient care. Advances in deep learning in the near future, especially in natural language and vision, could help to automate many activities in the medical field, leading to significant labor cost savings. With labor constituting 60 to 70 percent of total hospital costs, this presents significant opportunities in the future.

The biggest frontier for data analytics in health care is the potential to launch a new era of truly personalized medicine (see Chapter 4 for a deeper discussion of what this could entail). New technologies have continued to push down the costs of genome sequencing from $10,000 in 2011 to approximately $1,000 today.39 Combining this with the advent of proteomics (the study of proteins) has created a huge amount of new biological data. To date, the focus has been largely within oncology, as genomics has enabled characterization of the microsegments of each type of cancer.
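The cost decline quoted above can be expressed as an average annual rate. The sketch below is a simple illustration; it assumes that "today" means 2016, so the decline spans five years, and the function name is ours.

```python
def annualized_change(start: float, end: float, years: float) -> float:
    """Constant yearly rate of change that takes `start` to `end` over `years`."""
    return (end / start) ** (1.0 / years) - 1.0


# Costs from the text; the five-year span (2011 to 2016) is an assumption.
change = annualized_change(10_000, 1_000, 5)
print(f"Average annual change in sequencing cost: {change:.1%}")
```

On these assumptions, a tenfold drop over five years is equivalent to costs falling by a little more than a third each year.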

A NUMBER OF BARRIERS STILL NEED TO BE OVERCOME

What explains the relatively slow pace of adoption and value capture in these domains and many others? Below we look at some of the internal and external barriers organizations face as they try to shift to a more data-driven way of doing business.

Adopting analytics is a multistep process within an organization

The relatively slow pace of progress in many domains points to some hurdles that most organizations encounter as they try to integrate analytics into their decision making.

Many organizations have responded to competitive pressure by making large technology investments—without adopting the necessary organizational changes to make the most of them.

An effective transformation strategy can be broken down into several elements (Exhibit 2). The first is stepping back to ask some fundamental questions that can shape the strategic vision: What will data and analytics be used for? How will the insights drive value? How will the value be measured? The second component is building out the underlying data architecture as well as data collection or generation capabilities. Many incumbents struggle with switching from legacy data systems to a more nimble and flexible architecture to store and harness big data; they may also need to complete the process of fully digitizing transactions and processes in order to collect all the data that could be useful. The third element is acquiring the analytics capabilities needed to derive insights from data; organizations may choose to add in-house capabilities or to outsource to specialists (see below for more on the talent shortage). The fourth element is another potential stumbling block: changing business processes to incorporate data insights into the actual workflow. This requires getting the right data insights into the hands of the right personnel within the organization. Finally, organizations need to build the capabilities of executives and mid-level managers to understand how to use data-driven insights, and to begin to rely on them as the basis for making decisions.

39 Data from the NHGRI Genome Sequencing Program, National Human Genome Research Institute, available at https://www.genome.gov/sequencingcostsdata/.

Failing to execute these steps well can limit the potential value. Digital native companies have a huge natural advantage in these areas. It is harder for traditional companies to overhaul or change existing systems, but hesitating to get started can leave them vulnerable to being disrupted. And while it may be a difficult transition, some long-established companies— including GE, Mercedes-Benz, Ferrari F1, and Union Pacific—have managed to pull it off. (See Box 1, “Identifying the most critical internal barriers for organizations.”)

Exhibit 2

Successful data and analytics transformation requires focusing on five elements

USE CASES/SOURCES OF VALUE
▪ Clearly articulating the business need and projected impact
▪ Outlining a clear vision of how the business would use the solution

DATA ECOSYSTEM (internal; external)
▪ Gathering data from internal systems and external sources
▪ Appending key external data
▪ Creating an analytic “sandbox”
▪ Enhancing data (deriving new predictor variables)

MODELING INSIGHTS (data modeling “black box”; heuristic insights “smart box”)
▪ Applying linear and nonlinear modeling to derive new insights
▪ Codifying and testing heuristics across the organization (informing predictor variables)

WORKFLOW INTEGRATION (process redesign; tech enablement)
▪ Redesigning processes
▪ Developing an intuitive user interface that is integrated into day-to-day workflow
▪ Automating workflows

ADOPTION (capability building; change management)
▪ Building frontline and management capabilities
▪ Proactively managing change and tracking adoption with performance indicators

SOURCE: McKinsey Analytics; McKinsey Global Institute analysis


Box 1. Identifying the most critical internal barriers for organizations

McKinsey & Company recently conducted a survey of C-suite executives and senior managers regarding the use of data and analytics across a variety of industries.1 The participants were asked which challenges were most pervasive in their company. Since respondents were asked to rank their top three challenges, the results are not directly comparable between industries; they show only the relative ranking of difficulties within each industry. However, the survey is useful for highlighting the barriers perceived by those in industries where progress has been slower (Exhibit 3). The barriers discussed in the survey can be broken into three categories: strategy, leadership, and talent; organizational structure and processes; and technology infrastructure.

1 “The need to lead in data and analytics,” McKinsey & Company survey, McKinsey.com, April 2016, available at http://www.mckinsey.com/business-functions. The online survey, conducted in September 2015, garnered responses from more than 500 executives across a variety of regions, industries, and company sizes.

Exhibit 3

Survey respondents report that strategic, leadership, and organizational hurdles often determine the degree to which they can use data and analytics effectively

Survey question: Which of these have been among the TOP 3 most significant challenges to your organization's pursuit of its data and analytics objectives?

Columns: Overall (%); High tech and telecom; Retail; Manufacturing; Public sector; Health care. Each cell is rated High, Moderate, or Low.

Strategy, leadership, and talent
▪ Constructing a strategy
▪ Ensuring senior management involvement
▪ Securing internal leadership for data and analytics projects
▪ Attracting and/or retaining appropriate talent (both functional and technical)

Organizational structure and processes
▪ Tracking the business impact of data and analytics activities
▪ Designing an appropriate organizational structure to support data and analytics activities
▪ Creating flexibility in existing processes to take advantage of data-driven insights

IT infrastructure
▪ Providing business functions with access to support
▪ Investing at scale
▪ Designing effective data architecture and technology infrastructure

[Chart values for the Overall (%) column, in extraction order and not matched to rows: 23, 45, 13, 14, 17, 36, 30, 42, 33, 21. Per-industry ratings are not recoverable.]

SOURCE: McKinsey Global Institute analysis


Box 1. Identifying the most critical internal barriers for organizations (continued)

Setting the right vision and strategy for data and analytics use was the top-rated hurdle among all participants, with more than 45 percent listing it as one of their top three concerns. This is an issue, since personal buy-in from senior management has a direct impact on internal projects. Moving to a model of more data-driven decision making is not as simple as buying a new IT system; it requires leadership to bring about lasting organizational change and usher in a new way of doing business. Executives who reported that their company has made effective use of analytics most often ranked senior management involvement as the factor that has contributed most to their success.

The talent needed to execute the leadership vision is in high demand. In fact, approximately half of executives across geographies and industries reported greater difficulty recruiting analytical talent than any other kind of talent. Forty percent say retention is also an issue. “Business translators” who can bridge between analytics and other functions were reported to be the most difficult to find, followed by data scientists and engineers. Talent scarcity is a major concern that we discuss in greater detail later in this chapter.

A company’s IT infrastructure is the backbone through which it can access its data, integrate new data, and perform relevant analyses. Respondents in location-based data services, the area highlighted in our 2011 research that has made the greatest progress in capturing value, reported relatively low barriers in IT infrastructure, an area where heavy investment has paid off. But firms in other industries have struggled to move on from legacy systems that trap data in silos. Even when their companies have allocated investment dollars to upgrading, many executives worry about their ability to choose the most effective systems for their needs. Finally, business units and functions need support and tailoring to use analytics systems effectively; this was a recurring theme from respondents.

Respondents who said their companies had made ineffective use of analytics noted that their biggest challenge was designing the right organizational structure to support it. This needs to include tracking the business impact and making existing processes flexible enough to respond to new data-driven insights. Firms may be comfortable with using analytics in certain areas, but for many, those changes have not filtered through the entire organization.


Talent remains a critical constraint

Human capital has proven to be one of the biggest barriers standing in the way of realizing the full potential of data and analytics. There are four broad types of roles to consider: the data architects who design data systems and related processes; the data engineers who scale data solutions and build products; the data scientists who analyze data with increasingly sophisticated techniques to develop insights; and “business translators” who have both technical and domain- or function-specific business knowledge, enabling them to turn analytical insights into profit and loss impact. In addition to these four categories, data visualization is an important skill set, vital to the last-mile challenge of discovering value. It may be performed by data scientists and business translators, or it can be a stand-alone role, and it is particularly powerful when combined with creative visual design skills as well as experience in creating effective user interfaces and user experiences.

Our 2011 report hypothesized that the demand for data scientists in the United States alone could far exceed the availability of workers with these valuable skills.40 Since then, the labor market has borne out this hypothesis. As a result of soaring demand for data scientists, their average wages rose by approximately 16 percent per year from 2012 to 2014.41 This far outstrips the less than 2 percent increase in the nominal average salary across all occupations in Bureau of Labor Statistics data. Top performers with a very scarce skill set, such as deep learning, can command very high salaries. Glassdoor.com lists “data scientist” as the best job in 2016 based on number of job openings, salary, and career opportunities.42 LinkedIn reports that the ability to do statistical analysis and data mining is one of the most sought-after skills of 2016.43

Roles for data scientists are becoming more specialized. On one end of the spectrum are data scientists who research and advance the most cutting-edge algorithms themselves; this elite group likely numbers fewer than 1,000 people globally. At the other end are data scientists working closer to business uses and developing more practical firm-specific insights and applications.

The scarcity of elite data scientists has even become a factor in some acquisitions. Google, for example, acquired DeepMind Technologies in 2014, at an estimated price of $500 million. With approximately 75 DeepMind employees at the time of the deal, the price tag was nearly $7 million per employee.44 This is in line with other estimates by experts, who say that “acqui-hires” of cutting-edge AI startups cost around $5 million to $10 million per employee. In this case, the DeepMind acquisition resulted in the development of AlphaGo, which became the first AI program to defeat a human professional player in the game of Go.45 It also reportedly enabled Google to reduce the cooling costs for its vast data centers by 40 percent, saving several hundred million dollars per year.46 The DeepMind acquisition could pay off for Google from this one application alone.
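The per-employee figure above is simple division, and a minimal sketch makes the arithmetic explicit. The inputs are the estimates quoted in the text; the function name is illustrative only.

```python
# Figures quoted in the text: an estimated $500 million price
# and approximately 75 employees at the time of the deal.
DEAL_PRICE_USD = 500_000_000
EMPLOYEES = 75


def price_per_employee(price_usd: float, headcount: int) -> float:
    """Return the acquisition price divided evenly across employees."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return price_usd / headcount


per_head = price_per_employee(DEAL_PRICE_USD, EMPLOYEES)
print(f"${per_head / 1e6:.1f} million per employee")  # prints "$6.7 million per employee"
```

The result sits inside the $5 million to $10 million range that experts cite for comparable deals.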

The supply side has been responding to the growing demand for analytics talent. In the United States, students are flocking to programs emphasizing data and analytics. The number of graduates with degrees of all levels in these fields grew by 7.5 percent from 2010 to 2015, compared with 2.4 percent growth in all other areas of study. Universities are also launching programs specifically targeted to data analytics or data science, and business analytics. Currently there are more than 120 master’s programs for data analytics or science and more than 100 for business analytics. In addition to formal education tracks, other avenues have opened up to help more people acquire data science skills, including boot camps, MOOCs (massive open online courses), and certificates. However, the scalability and hiring success of these alternative models for data science training remain to be seen. Because data science is highly specialized, the jury is still out on whether employers are willing to hire from non-traditional sources.

40 In our 2011 analysis, this role was referred to as “deep analytical talent.”

41 Beyond the talent shortage: How tech candidates search for jobs, Indeed.com, September 2015.

42 “25 best jobs in America,” Glassdoor.com blog, available at https://www.glassdoor.com/List/Best-Jobs-in-America-LST_KQ0,20.htm

43 “The 25 skills that can get you hired in 2016,” LinkedIn official blog, January 2016, available at https://blog.linkedin.com/2016/01/12/the-25-skills-that-can-get-you-hired-in-2016.

44 Catherine Shu, “Google acquires artificial intelligence startup DeepMind for more than $500 million,” TechCrunch, January 26, 2014.

45 See, for example, Christof Koch, “How the computer beat the Go master,” Scientific American, March 19, 2016. This achievement was widely regarded as a seminal moment in advancing artificial intelligence.

46 See DeepMind corporate blog at https://deepmind.com/applied/deepmind-for-google/.

In the short run, however, even this robust growth in supply is likely to leave some companies scrambling. It would be insufficient to meet the 12 percent annual growth in demand that could result in the most aggressive case that we modeled (Exhibit 4). This scenario would produce a shortfall of roughly 250,000 data scientists. As a result, we expect to see salaries for data scientists continue to grow. However, one trend could mitigate demand in the medium term: the possibility that some part of the activities performed by data scientists may become automated. More than 50 percent of the average data scientist’s work is data preparation, including cleaning and structuring data. As data tools improve, they could perform a significant portion of these activities, potentially helping to ease the demand for data scientists within ten years.
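The high-case shortfall can be roughly reproduced by compounding demand forward for ten years and subtracting projected supply. The sketch below is illustrative only. The 12 percent growth rate comes from the text, but the base-year and 2024 supply values (235 thousand and 483 thousand) are assumptions read from the data labels in Exhibit 4, and starting demand at the 2014 supply level relies on the exhibit's note that the 2014 job market is treated as being in equilibrium.

```python
def project(base: float, annual_growth: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1 + annual_growth) ** years


# All quantities are in thousands of data scientists.
SUPPLY_2014 = 235        # assumption: 2014 supply, taken equal to 2014 demand
SUPPLY_2024 = 483        # assumption: projected 2024 supply
HIGH_CASE_GROWTH = 0.12  # from the text: 12 percent annual growth in demand
YEARS = 10               # 2014 to 2024

demand_2024 = project(SUPPLY_2014, HIGH_CASE_GROWTH, YEARS)
shortfall = demand_2024 - SUPPLY_2024

print(f"2024 high-case demand: {demand_2024:.0f} thousand")
print(f"Shortfall versus supply: {shortfall:.0f} thousand")
```

Under these assumptions the gap comes out a little under 250 thousand, which is consistent with the shortfall of roughly 250,000 cited above. It is not a reconstruction of the model behind the exhibit.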

On a broader scale, multiple initiatives at the state, national, and international levels aim to develop analytical talent. Examples include the Open Data Institute and the Alan Turing Institute in the United Kingdom; the latter functions also as an incubator for data-driven startups. The European Commission launched a big data strategy in 2014. The United States published its own federal big data R&D strategic plan in 2016, focusing on the creation of a big data ecosystem that features collaboration among government agencies, universities, companies, and non-profits.

Exhibit 4

The expected number of trained data scientists would not be sufficient to meet demand in a high-case scenario

Supply and demand of data scientists in the United States1
Thousand

[Bar chart. Bars: 2014 supply2; New graduates3; 2024 supply4; 2024 demand (Low and High cases). Data labels as printed: 248, 403, 483, 235, 66, 736, 314.]

1 The calculation is across all Standard Occupational Classifications except the ones that are clear false positive hits.

2 2014 fraction per occupation times the 2014 BLS EP employment per occupation, assuming the job market is in equilibrium and supply equals demand. Including 355 occupations.

3 Graduates from US universities during ten years who are estimated to have the skill set required to be data scientists. Includes removing the retirement of 2.1% from current supply.

4 For each industry we calculate the share individually for the top five US companies based on their market capitalization. Professional services is an exception; there we use consulting companies irrespective of their home country.

NOTE: Numbers may not sum due to rounding.

SOURCE: US Bureau of Labor Statistics; Burning Glass; McKinsey Global Institute analysis

Many organizations focus on the need for data scientists, assuming their presence alone constitutes an analytics transformation. But another equally vital role is that of the business translator who can serve as the link between analytical talent and practical applications to business questions. In some ways, this role determines where the investment ultimately pays off since it is focused on converting analytics into insights and actionable steps. In addition to being data savvy, business translators need to have deep organizational knowledge and industry or functional expertise. This enables them to ask the data science team the right questions and to derive the right insights from their findings. It may be possible to outsource analytics activities, but business translator roles need to be deeply embedded into the organization since they require proprietary knowledge. Many organizations are building these capabilities from within.

The ratio of business translators to data scientists needed in a given company depends heavily on how the organization is set up and the number and complexity of the uses the company envisions. But averaging across various contexts, we estimate there will be demand for approximately two million to four million business translators over the next decade. Given that about 9.5 million STEM and business graduates are expected in the United States over this period, approximately 20 to 40 percent of these graduates would need to go into business translator roles to meet demand (though people from other fields can also become business translators). That seems quite aspirational given that today, only some 10 percent of STEM/business graduates go into business translator roles. Two trends could bring supply in line with potential future demand: wages for business translators may have to increase, or more companies will need to implement their own training programs. Some are already doing so, since this role requires a combination of skill sets that is extremely difficult for most companies to find in external hires.47
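The share of graduates implied by this estimate follows directly from the figures in the paragraph above. A minimal sketch of the arithmetic, using only the numbers quoted in the text (the variable and function names are illustrative):

```python
GRADUATES = 9_500_000    # STEM and business graduates expected over the decade
DEMAND_LOW = 2_000_000   # low end of projected business translator demand
DEMAND_HIGH = 4_000_000  # high end of projected business translator demand
CURRENT_SHARE = 0.10     # share of graduates entering translator roles today


def required_share(demand: int, graduates: int) -> float:
    """Fraction of graduates needed to fill the given demand."""
    return demand / graduates


low = required_share(DEMAND_LOW, GRADUATES)
high = required_share(DEMAND_HIGH, GRADUATES)
supply_at_current_share = CURRENT_SHARE * GRADUATES

print(f"Required share of graduates: {low:.0%} to {high:.0%}")
print(f"Translators supplied at today's share: {supply_at_current_share:,.0f}")
print(f"Multiple of today's share needed: {low / CURRENT_SHARE:.1f}x to {high / CURRENT_SHARE:.1f}x")
```

At today's share of about 10 percent, the graduate pipeline would supply fewer than one million translators. This is why the text describes the required 20 to 40 percent as aspirational.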

Visualization is an important step in turning data into insight and value, and we estimate that demand for this skill has grown roughly 50 percent annually from 2010 to 2015.48 Since this is a fairly new development, demand does not always manifest in a specific role. In many instances today, organizations are seeking data scientist or business translator candidates who can also execute data visualization. But we expect that medium-size and large organizations, as well as analytics service providers, will increasingly create specialized positions.
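The growth estimate rests on a normalized measure: the share of all job postings that mention at least one visualization skill, tracked over time. The sketch below shows one way such a share and the implied compounding could be computed. The skill keywords come from the report's footnote on method, but the data layout, function names, and toy postings are assumptions for illustration. They do not reflect the actual Burning Glass database or its interface.

```python
# Skill keywords named in the report's method note, lowercased for matching.
VIS_SKILLS = {"data visualization", "tableau", "qlikview", "spotfire"}


def visualization_share(postings: list[set[str]]) -> float:
    """Fraction of postings that list at least one visualization skill."""
    if not postings:
        return 0.0
    hits = sum(1 for skills in postings if VIS_SKILLS & {s.lower() for s in skills})
    return hits / len(postings)


def annual_growth(start: float, end: float, years: int) -> float:
    """Constant annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Hypothetical toy postings, each represented as a set of skill strings.
toy_postings = [{"SQL"}, {"Tableau", "SQL"}, {"Excel"}, {"Spotfire"}]
share = visualization_share(toy_postings)

# Growth of roughly 50 percent per year, compounded over 2010 to 2015.
multiple = (1 + 0.50) ** 5

print(f"Toy share of postings with a visualization skill: {share:.0%}")
print(f"Five years at 50% annual growth multiplies the share by {multiple:.1f}x")
```

Normalizing by total postings separates real growth in demand for the skill from growth in the overall volume of online job advertisements.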

Three trends are driving demand for data visualization skills. First, as data becomes increasingly complex, distilling it is all the more critical to help make the results of data analyses digestible for decision makers. Second, real-time and near-real-time data are becoming more prevalent, and organizations and teams need dynamic dashboards rather than reports. Third, data is increasingly required for decision making through all parts of an organization, and good visualization supports that goal, bringing the information to life in a way that can be understood by those who are new to analytics. New software enables users to make clear and intuitive visualizations from simpler data. But more complex dashboards and data-driven products can require specialized designers. Those who combine a strong understanding of data with user interface/user experience and graphic design skills can play a valuable role in most organizations.

47 Sam Ransbotham, David Kiron, and Pamela Kirk Prentice, “The talent dividend: Analytics talent is driving competitive advantage at data-oriented companies,” MIT Sloan Management Review, April 25, 2015.

48 Based on using the Burning Glass job postings database to search for postings including any of the following skills: data visualization, Tableau, Qlikview, and Spotfire. Normalized with the total number of job postings.


Access to data has improved, but sharing is not always seamless

Overall, industries are less restricted in their ability to access meaningful data than ever before. Data aggregators have found ways to monetize information, governments have opened public data sets and created open data frameworks, and new data sources have continued to proliferate.49 Firms such as Truven and Explorys are monetizing medical claims data, utilities are sharing detailed information on energy consumption, and sensors embedded in the urban environment are generating valuable data to help manage traffic and complex infrastructure systems. Meanwhile, data policies and regulations have evolved, but barriers remain in some highly regulated sectors, especially health care and the public sector.

Lack of interoperability is a major problem in some industries—notably in health care, where patient information is not always accessible across different EMR systems, and in manufacturing, where various plants and suppliers cannot always seamlessly share information. This is a critical enabler: previous MGI research found that 40 percent of all the potential value associated with the internet of things required interoperability among IoT systems, for example.50

In some cases, there are still disincentives to share data. In health care, for example, providers and pharmaceutical companies could stand to lose from greater data sharing with payers. Perhaps the largest hurdle in data access is the need for new data sources to rapidly demonstrate their profit-making potential. Many new data sets are being created in personal health, such as those captured by wearable sensors, but these data sets have yet to demonstrate clinical utility. Given industry dynamics and reimbursement policies, they may experience slow usage and uptake.

Three major concerns continue to challenge the private sector as well as policy makers: privacy, cybersecurity, and liability. All of these can discourage the use of analytics. Privacy issues have been front and center in the European Union, where “right to be forgotten” legislation has required internet companies to take extra steps to clean their records, and citizens have a constitutional right to access data about themselves, even when held by private companies. Meanwhile, privacy concerns have been heightened by repeated cybersecurity breaches. Widely publicized breaches have had major ramifications for companies’ relationships with their customers. Many people remain wary about “big brother”–style surveillance by both companies and governments. Customers have reacted negatively to retailers tracking their movements in stores, for example.51 And lastly, liability frameworks surrounding the use of data and analytics still need to be clarified. In health care, for example, clinical validation can be a lengthy process, and deviating from established guidelines can put physicians or companies at risk of a lawsuit. These concerns will only grow as complicated algorithms play a larger role in decision making, from autonomous driving to deciding where law enforcement resources should be deployed.

•••

Beyond the impact within individual sectors, the soaring demand for data and analytics services has created complex ecosystems. Data may take a long and complex journey from their initial collection to their ultimate business use—and many players are finding ways to monetize data and add value at points along the way. The next chapter examines these ecosystems in greater detail to identify some of these opportunities in this rapidly evolving landscape.

49 Open data: Unlocking innovation and performance with liquid information, McKinsey Global Institute, October 2013.

50 The internet of things: Mapping the value beyond the hype, McKinsey Global Institute, June 2015.

51 Stephanie Clifford and Quentin Hardy, “Attention, shoppers: Store is tracking your cell,” The New York Times, July 14, 2013.