BIG DATA, DATA SCIENCE, AND THE U.S. DEPARTMENT OF DEFENSE (DOD)

by

Roy Lancaster

GAYLE GRANT, DM, Faculty Mentor and Chair

MICHELLE PREIKSAITIS, JD, PhD, Committee Member

BRUCE WINSTON, PhD, Committee Member

Tonia Teasley, JD, Interim Dean

School of Business and Technology

A Dissertation Presented in Partial Fulfillment

Of the Requirements for the Degree

Doctor of Business Administration

Capella University

January 2019

ProQuest Number: 13805367

Published by ProQuest LLC (2019). Copyright of the Dissertation is held by the Author.

All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.

© Roy Lancaster, 2019

Abstract

This qualitative case study of a de-identified DOD organization, Bravo Zulu Center (BZC)

(pseudonym), explored how U.S. Department of Defense (DOD) personnel glean actionable

information from big data sets. This research sought to help analyze and define the skills used by

DOD analysts, in order to better understand the application of data science to the DOD. While

the technology for producing data has grown tremendously, DOD personnel lack the required

data analysis skills and tools. Eleven DOD analysts answered individual interview questions,

eight managers participated in a focus group, and the DOD provided documents to assist with

investigating two research questions: How does the Bravo Zulu Center glean actionable

information from big data sets? How mature are the data science analytical skills, processes, and

software tools used by Bravo Zulu Center analysts? Qualitative analysis using the NVivo-11®

Pro software on the results of the interviews, focus group, and documents showed that

overarching themes of access to quality data, training, data science skills, domain understanding,

management, infrastructure and legacy systems, organization structure and culture, and

competition for analytical talent appear as concerns for improving big data analysis in the DOD.

The Bravo Zulu Center is experiencing the same large data growth as other organizations

described in scholarly research and is struggling to create actionable information from large

data sets to meet mission requirements, a struggle compounded by immature data science skills.

Dedication

The study is dedicated to my wife of thirty years, Laurie Lancaster. Your love, continued

encouragement, and desire for life-long learning have always provided me strength to continue. I

love you and thank you! I also dedicate this work to our children Sarah, TJ, and Wesley and to

our grandbabies Nora and Jameson! A special thank you to my mom Kathryn for “grounding”

me in the early years and teaching me the value of education and for your foundational love and

support! Special thank you to my sisters Shari and Amy and to all my extended family and

friends. I love you all!

Acknowledgments

I wholeheartedly thank my mentor and chair, Dr. Gayle Grant, for her expert guidance

throughout this project and for getting me to the finish line. Thank you! I extend gratitude to my

committee, Dr. Michelle Preiksaitis and Dr. Bruce Winston, for their expert reviews and

guidance. A special thank you to Dr. Linda Haynes for her outstanding reviews and, most

importantly, her love and inspiration. Thanks, Aunt Linda! Thank you to the Bravo Zulu Center

(pseudonym) for opening your doors to me; this study would not have been possible without

your generosity. Thank you to the men and women who wear the uniform of the United States

military!

Table of Contents

Dedication .............................................................................................................. iii

Acknowledgments.................................................................................................. iv

List of Tables ....................................................................................................... viii

List of Figures ..........................................................................................................x

CHAPTER 1. INTRODUCTION ........................................................................................1

Introduction ..............................................................................................................1

Background ..............................................................................................................2

Business Problem .....................................................................................................4

Research Purpose .....................................................................................................5

Research Questions ..................................................................................................6

Rationale ..................................................................................................................7

Conceptual Framework ............................................................................................8

Significance..............................................................................................................9

Definition of Terms................................................................................................10

Assumptions and Limitations ................................................................................10

Organization for Remainder of Study ....................................................................11

CHAPTER 2. LITERATURE REVIEW ...........................................................................13

Conceptual Framework and Research Design .......................................................14

Big Data Defined ...................................................................................................19

DOD and Big Data .................................................................................................25

Data Sciences .........................................................................................................31

Data Sciences Skills ...............................................................................................34

Federal Job Series and DOD Data Scientists .........................................................45

Management Implications ......................................................................................48

Summary ................................................................................................................52

CHAPTER 3. METHODOLOGY .....................................................................................53

Introduction ............................................................................................................53

Research Questions ................................................................................................53

Design and Methodology .......................................................................................54

Participants .............................................................................................................56

Setting ....................................................................................................................60

Analysis of Research Questions.............................................................................61

Credibility and Dependability ................................................................................65

Data Collection ......................................................................................................67

Data Analysis .........................................................................................................69

Ethical Considerations ...........................................................................................75

CHAPTER 4. RESULTS ...................................................................................................76

Introduction ............................................................................................................76

Data Collection Results..........................................................................................78

Data Analysis and Results .....................................................................................84

Summary ..............................................................................................................141

CHAPTER 5. DISCUSSION, IMPLICATIONS, RECOMMENDATIONS ..................143

Introduction ..........................................................................................................143

Evaluation of Research Questions .......................................................................147

Fulfillment of Research Purpose ..........................................................................149

Contribution to Business Problem .......................................................................152

Recommendations for Further Research ..............................................................153

Conclusions ..........................................................................................................155

REFERENCES ................................................................................................................157

Statement of Original Work and Signature ......................................................................169

APPENDIX A. INTERVIEW GUIDE ............................................................................170

List of Tables

Table 1. Seven traits/small to big data comparison ..........................................................23

Table 2. Harris and Mehrotra’s analysts and data scientists comparison .........................37

Table 3. Harris, Murphy and Vaisman list of data science skills .....................................38

Table 4. Federal 1500 job series occupations…………………………………………… 48

Table 5. BZC participant criteria…………………………………………………………60

Table 6. Instruments and data collection methods ………………………………………62

Table 7. Initial codes …………………………………………………………………….71

Table 8. Interviewee experience levels…..……………………………………………….80

Table 9. Management focus group experience….…..……………………………………81

Table 10. BZC collected documents…………………….………………………………..83

Table 11. Initial codes (restated)……………………………………………………….....85

Table 12. Analysts’ responses to questions about big data..........................................…...89

Table 13. Analysts’ responses to data usage questions…………………………………...90

Table 14. Analysts’ responses to questions regarding data analysis challenges………….92

Table 15. Analysts’ responses further exploring access to quality data…………………..93

Table 16. Analysts’ responses to data usage and data analysis questions………………...95

Table 17. Additional responses to analysis challenges questions…………………………97

Table 18. Additional analysts’ responses to challenges questions………………………..99

Table 19. Analysts’ responses to data science skills questions…………………………..101

Table 20. Analysts’ responses to data science skills and analysis software questions…...104

Table 21. Analysts’ responses to training related questions……………………………...106

Table 22. Analysts’ responses to data scientists scarcity questions……………………..108

Table 23. Analysts’ responses to data scientist skills and roles questions………………110

Table 24. Managers’ responses to questions about big data…………………………….114

Table 25. Managers’ responses to data usage questions………………………………...116

Table 26. Managers’ responses to questions regarding data analysis challenges……….117

Table 27. Managers’ additional responses to data analysis challenges…………….……119

Table 28. Managers’ responses to data usage and data analysis questions……………...120

Table 29. Managers’ responses to analysis challenges………………………………….122

Table 30. Managers’ responses to data science skills questions………………………...124

Table 31. Managers’ responses to data science skills and analysis software questions....125

Table 32. Managers’ responses to training related questions……………………….…...127

Table 33. Managers’ responses to data scientists scarcity questions……………….........128

Table 34. Managers’ responses to data scientists’ skills and roles questions…………....130

Table 35. Data scientist and BZC Supply Analyst skills comparison……………….…..134

Table 36. Data scientist and BZC Program Management Analyst skills comparison…...136

Table 37. Data scientist and BZC Operations Research Analyst skills comparison……..138

Table 38. Data scientist and BZC Computer Scientist skills comparison……………......140

List of Figures

Figure 1. Analysis of big data scholarship .........................................................................16

Figure 2. Cleveland’s data science taxonomy....................................................................32

Figure 3. Adaptation of Cleveland’s data science taxonomy ............................................63

Figure 4. BZC case study triangulation .............................................................................67

Figure 5. BZC case study data analysis process ................................................................72

Figure 6. BZC potential analyst participants………………………………..……………79

Figure 7. Final hierarchical coding structure……..………………………………………86

Figure 8. Initial analysts interviews word frequency diagram……………………………87

Figure 9. Refined analyst interviews word frequency diagram…………………………..88

Figure 10. Initial management focus group interview word frequency diagram………..112

Figure 11. Refined management focus group interview word frequency diagram.……..113

Figure 12. BZC strategic document word frequency diagram……………………..…….131

Figure 13. BZC job announcements word frequency diagram……………………….….133

Figure 14. Cleveland’s data science taxonomy (restated)……………….……….…........144

Figure 15. Final hierarchical coding structure (restated)……………………….…..……146

Figure 16. Domain and data science assessment model………………………….…...….151

CHAPTER 1. INTRODUCTION

Introduction

A seemingly infinite amount of data (big data) has emerged, and its effects are profound

on modern-day corporations and the United States military as they continue to progress through

the information technology age (Ransbotham, Kiron, & Prentice, 2015). The ability to connect

and analyze continuously growing digital data is now essential to competitiveness in most

sectors of the United States economy (Iansiti & Lakhani, 2014). George, Haas, and Pentland

(2014) suggested that although there is evidence demonstrating the significant growth in data and

its importance for sustainability, there is a gap in published management scholarship providing

theory and practices for management. Additionally, growing evidence supports the notion that

the skills required to manage and analyze the exponentially growing size of data are inadequate

and in short supply with bleak predictions for the future (Harris & Mehrotra, 2014). If there is

truly a new occupation emerging (data scientist) in the commercial sector because of the

exceptional data growth, then determining how United States Department of Defense (DOD)

organizations currently analyze large data sets will help determine if data scientists are warranted

in their organizations. Chapter 1 of this study demonstrates a business problem for both

commercial organizations and the DOD. The general business problem is the lack of effective

analysis in organizations operating in the modern-day big data environment (Harris & Mehrotra,

2014). The specific business problem is that DOD organizations may be struggling with gleaning

actionable information from large data sets compounded by immature data science skills of DOD

analysts (Harris, Murphy, & Vaisman, 2013). This chapter describes the conceptual framework

that supports this study and the rationale, purpose, and significance of the study. The overall

significance of this study is that it helps close the gap in DOD-related scholarly research on

big data and data science and seeks to contribute value to scholars and practitioners working on

this important business problem.

Gang-Hoon, Trimi, and Ji-Hyong (2014) expressed a level of skepticism about the United

States military’s ability to adopt the new technologies and philosophies required to leverage

meaningful information from large data sets. The research explored big data and data science

associated with the challenges brought on by the enormous data growth being observed in nearly

all organizations. The DOD is an extremely large organization, and effecting massive change

across it is well beyond the ability of one dissertation. This research was supported by a comprehensive literature

review of big data and data science application in corporate America as well as the DOD and

seeks to provide actionable insights into the requirements of the analysts in modern-day

organizations and serve as a catalyst for additional research.

Background

Managing data presents both problems and opportunities, with distinct advantages to

organizations that can manage and analyze data (McAfee & Brynjolfsson, 2012). This research

investigated how organizational leaders and analysts manage and probe data to make better-

informed decisions, offer new insights, and automate business processes thereby adding value

throughout the value chain and creating sustainable competitive advantages (Berner, Graupner,

& Maedche, 2014). Watson and Marjanovic (2013) advocated that although executives are aware

of big data and know of some specific uses, they are often unsure how big data can be used in

their organizations and what is required to be successful. Additionally, Edwards (2014) found that the

DOD is experiencing similar data growth, which presents similar problems and opportunities for

DOD leaders.

Watson and Marjanovic (2013) suggested big data and data science may not represent

something new but are simply the next stage of business analysis as organizations continue to

progress through the information technology age. The fields of business intelligence (BI) and

business analytics (BA) are not new, having existed in business for decades, and were examined

in this research. Scholarly researchers agree it is important to understand the

desired connection between raw data and actionable information through the evolution of

business intelligence and business analytics (Chen, Chiang, & Storey, 2012). The term

intelligence has been used in scientific research since the early 1950s. In the 1970s,

computing technology began providing actionable information to the business world and

companies began utilizing systems to generate information from raw data for management

(Ortiz, 2010). In her seminal book, In the Age of the Smart Machine: The Future of Work and

Power, Zuboff (1988) predicted that information systems would not only automate business

processes but also produce valuable information in a unique manner. The field of business

intelligence became popular in the business and information technology (IT) communities, and

the idea of business analytics became popular in the 2000s as the key analytics component of

business intelligence (Chen et al., 2012). The unquestioned benefit of business intelligence and

business analytics is the ability to capture trends, gain insights, and draw conclusions from the

data generated in support of the business or to gain advantages over the competition and create

sustainable growth (Rouhani, Ashrafi, Zare Ravasan, & Afshari, 2016). Berner et al. (2014)

suggested that with data generation on a sharp incline there are significant gaps in the abilities of

modern-day organizations to leverage big data, and without mitigation, this gap will continue to

grow. The concept of business intelligence means organizations understand their business and

the environment it operates in, thus creating the ability for smarter decisions. Big data stands to

be a key enabler for business intelligence success (Swain, 2016).

Business Problem

Organizations face rapid data growth, requiring deliberate and strategic action by

leadership to remain competitive and ensure sustainability (Gabel & Tokarski, 2014). For

example, the data-rich, highly-competitive airline industry gives a clear advantage to airline

corporations that use big data to drive their strategies and decisions, while punishing those that

do not (Akerkar, 2014). Additionally, corporations such as Amazon are leading the way utilizing

high-powered big data analytics to alter the retail industry (Watson & Marjanovic, 2013). The

airline and retail industries are just two examples of industries that are being reshaped due to

their ability or inability to analyze large data sets and may provide actionable insights for the

DOD.

Ransbotham, Kiron, and Prentice (2015) published a significant research study in the

MIT Sloan Management Review that surveyed 2,719 participants in 2014. The participants of the

study indicated that combining high-level analytical skills with existing business knowledge is

creating competitive advantages. Phillips-Wren and Hoskisson (2015) suggested big data is

stimulating innovation and altering foundational aspects of many business models. Additionally,

both of these sources indicate the analysis of big data is proving difficult as companies struggle

to create actionable analytical products and to integrate new analysis into existing

decision venues. Ransbotham et al. (2015) proposed that a key constraint preventing analysts from

producing actionable information from large data sets is the lack of analytical skills.

The general business problem is the lack of effective analysis in organizations operating

in the modern-day big data environment (Harris & Mehrotra, 2014). The specific business

problem is that DOD organizations may be struggling with gleaning actionable information from

large data sets compounded by immature data science skills of DOD analysts (Harris, Murphy, &

Vaisman, 2013). Symon and Tarapore (2015) proposed the fast-paced evolution of analysis

capabilities in commercial organizations represents great opportunity to address this business

problem for the DOD. Hamilton and Kreuzer (2018) suggested the amount of data collected by

DOD organizations continues to outpace the ability to process and interpret the data, and that the

ability to glean actionable information from large data sets is crucial for DOD mission success.

Research Purpose

The purpose of this qualitative case study was to explore how DOD employees conduct

data analysis with the influx of big data. An unidentified U.S. Air Force command was selected

by the researcher as the case study organization to support this study. The Bravo Zulu Center

(BZC) pseudonym was applied throughout this research to conceal the identity of the case study

organization. This research explored the emerging commercial data scientist occupation and the

skills required of data scientists to help determine if data science is applicable to the DOD. This

research sought to further define the skills required of data scientists to help enable their

effectiveness in modern organizations with specific emphasis aimed at the DOD. The targeted

population consisted of analysts, managers, or executives working within the Bravo Zulu Center

(BZC). The implication for positive social change includes the potential to identify needed

adaptations in the skills and abilities of analysts and managers working within DOD

organizations that are required to glean actionable information from big data sets. This research

explored data science and the implications associated with the big data phenomenon by

conducting qualitative research with a representative case study organization. This dissertation

explored important skill sets, attitudes, and perceptions of the analysts working big data issues

for the BZC, along with the skill sets, attitudes, and perceptions of management within the same

organization. Big data innovations are happening throughout commercial industries and are

transforming foundational aspects of many business models and placing greater demands for

fast-paced innovation (Parmar, Cohn, & Marshall, 2014). This fast-paced evolution of analysis

capabilities in commercial organizations represents great opportunity for the DOD. This research

builds upon several big data and data science constructs documented in contemporary scholarly

literature (Symon & Tarapore, 2015). First, big data represents both potential and liability, with

the ability to manage and analyze big data sets likely required for business sustainability

(Gobble, 2013). Second, harvesting actionable information from big data sets

requires deliberate change in many aspects of organization design and management of human

resources (Gabel & Tokarski, 2014).

A qualitative research methodology is appropriate for understanding human behavior and

is common in social and behavioral sciences and by scholar practitioners who seek to understand

a phenomenon (Cooper & Schindler, 2013). This type of research involves collecting data

typically in the participants’ settings and inductively analyzing the collected information looking

for themes to provide insight and understanding (Cooper & Schindler, 2013). This research is an

exploration of how big data analysis is accomplished within the DOD and why the rise of large

data sets may generate the need to increase the analytical skills of DOD employees, making a

qualitative research methodology most appropriate.

Research Questions

The objective of this research was to develop an understanding of how DOD analysts

respond to, probe, and assimilate data in big data environments to help determine if a data science

occupation is justified and warranted in the DOD. The following research questions guided the

study:

Primary Research Question 1: How does the Bravo Zulu Center glean actionable

information from big data sets?

Primary Research Question 2: How mature are the data science analytical skills,

processes, and software tools used by Bravo Zulu Center analysts?

Rationale

The principal rationale for furthering the knowledge on the big data phenomenon and

data science through a qualitative case study is the need to view big data analysis

through the humanist lens instead of an information system technological lens (McAfee &

Brynjolfsson, 2012). Managing big data requires senior decision makers to embrace data-driven

decisions, and this will require a cultural change in many organizations (Gabel & Tokarski,

2014). Even though there are researchers who stress the importance of big data capability, there is

no consensus on how best to re-align and organize modern-day organizational models to support

big data efforts (Grossman & Siegel, 2014). Additionally, McAfee and Brynjolfsson (2012)

suggested there is a lack of understanding by all levels of management regarding the value of big

data and the changes required to harness the power of big data. Management may need to invest

in data scientists who can manage and manipulate large data sets and turn this raw data into

meaningful information. Unfortunately, organizations and academia may be struggling with

defining the skill sets of these so-called data scientists (Harris et al., 2013). Gabel and Tokarski

(2014) asserted that data capture and usage are on a sharp increase and that businesses and organizations

would like to realize the competitive advantages contained in the use of the tremendous amount of

data. Digital data is driving foundational changes in personal lives, business, academia, and

functions of government. The analysis of big data promises to reshape everything from

government and international development to how we conduct basic science (Gobble, 2013).

DOD organizations are generating massive amounts of information from activities along their

value chains. A dramatic increase in sensors embedded in modern-day weapon

systems is compounding the data growth (Hamilton & Kreuzer, 2018).

Moorthy et al. (2015) suggested there is potential in nearly all industries regarding the

impact of turning vast amounts of raw data into meaningful information. Additionally, turning

large raw data sets into meaningful information will require deliberate and strategic action

(Galbraith, 2014). Warehousing data is problematic, expensive, and time consuming and creates

alignment difficulties in modern organizations (Gabel & Tokarski, 2014). Davenport and Patil

(2012) submitted that the skills required to turn large amounts of raw data into meaningful

information are in high demand and are in short supply. The technology for producing data has

evolved greatly but the skills and software tools required to analyze large data sets have been

lagging (Gobble, 2013). Additionally, the DOD has declared it has a scarcity of data

scientists. According to the Deputy Assistant Secretary for Defense Research, data scientists are

in short supply and are becoming the most in-demand job for the U.S. military (Hoffman, 2013).

Experts suggest there is a data analysis skills shortfall, especially for analysts who

have the talent to create predictive analytical products utilizing statistics, artificial intelligence,

and machine learning (Davenport & Patil, 2012).

Conceptual Framework

The conceptual framework serves as the foundational knowledge to support the research

study. This framework serves to guide the research by relying on formal theory, which supports

the researcher’s thinking on how to understand and plan to research the topic (Grant & Osanloo,

2014). William S. Cleveland (2001) coined the term data science in the context of enlarging the

major areas of technical work in the field of statistics. Cleveland’s seminal work described the

requirement of an “action plan to enlarge the technical areas of statistics focuses of the data

analyst” (Cleveland, 2001, p. 1). Cleveland described altering the analysis occupation so substantially

that a new field would emerge, to be called “data science” (Cleveland, 2001, p. 1).

The six technical areas that Cleveland described as encompassing the field of data science

are multidisciplinary investigations, models and methods for data, computing with data,

pedagogy, tool evaluation, and theory. Cleveland intended the six

technical areas to act as a guideline for the percentage of the overall effort a university or

governing organization should apply to each technical area as it begins to define curriculum for the

development of future data scientists; this plan was adapted to support this research (Cleveland, 2001).
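
The idea of allocating a percentage of overall effort to each of the six technical areas can be illustrated with a short sketch. The percentage weights are not stated in this chapter; the values below are assumed from Cleveland's (2001) published plan and should be verified against that source. The function name and the 400-hour budget are hypothetical and serve only as an example.

```python
# Illustrative sketch only. The weights are assumed from Cleveland (2001)
# and are not stated in this chapter; verify them before relying on them.
CLEVELAND_ALLOCATION = {
    "multidisciplinary investigations": 25,
    "models and methods for data": 20,
    "computing with data": 15,
    "pedagogy": 15,
    "tool evaluation": 5,
    "theory": 20,
}


def allocate_effort(total_hours, allocation=CLEVELAND_ALLOCATION):
    """Split a total effort budget across the technical areas by percentage."""
    if sum(allocation.values()) != 100:
        raise ValueError("allocation percentages must sum to 100")
    return {area: total_hours * pct / 100 for area, pct in allocation.items()}


if __name__ == "__main__":
    # Hypothetical budget of 400 curriculum hours.
    for area, hours in allocate_effort(400).items():
        print(f"{area}: {hours:.0f} hours")
```

An organization adapting the taxonomy, as this study does, could substitute its own weights, provided they still sum to 100.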

Significance

DISA (2015) suggested the capability to leverage meaningful information from big data

is important to the DOD. However, researchers also suggest there are significant

shortfalls in the abilities of complex organizations to fully employ business intelligence

techniques on extremely large data sets (Harris & Mehrotra, 2014). In June 2014, the Office of

Naval Research published a request to commercial and DOD industries for white papers and full

proposals on how to use big data for real insight (McCaney, 2014). The overall objective was to

achieve unprecedented access to data with deeper insights by examining the data in new and

innovative ways (McCaney, 2014). Additionally, in March of 2015 the Defense Information

Systems Agency (DISA) published a request for information regarding infrastructure

development to support potential big data and governance solutions. This request specifically

sought examples of commercially developed solutions that are more efficient than current DOD

solutions (DISA, 2015). The desired significance of this research was to develop an

understanding of the skills required by modern-day analysts and help determine if a data scientist

is justified and warranted in the DOD.

Definition of Terms

Big Data is characterized as “datasets that are too large for traditional data processing

systems and that therefore require new technologies” (Provost & Fawcett, 2013, p. 54).

Big Data is characterized by “extremely high volume, velocity, and variety (commonly

referred to as the “3 Vs”). It also exceeds the capabilities of most relational database

management systems and has spawned a host of new technologies, platforms, and approaches”

(Watson & Marjanovic, 2013, p. 5).

Big Data Analytics: “Analytical techniques in applications that are so large (from

terabytes to exabytes) and complex (from sensor to social media data) that they require advanced

and unique data storage, management, analysis, and visualization technologies” (Chen et al.,

2012, p. 1165).

Data Scientist Definition #1 is a seasoned professional with the training, skills, and

curiosity to discover new insights in the era of big data (Davenport & Patil, 2012).

Data Scientist Definition #2 is someone who is better at programming than a statistician and

better at statistics than a computer scientist (Baskarada & Koronios, 2017).

Assumptions and Limitations

The goal of this qualitative case study was to explore how DOD employees conduct data

analysis with the influx of big data. This research explored the emerging commercial data

scientist occupation and the skills required of data scientists to help determine if data science is

applicable to the DOD. The ability to generalize conclusions to a larger population is a potential

limitation of qualitative research (Cooper & Schindler, 2013). A potential limitation of this study

is the ability to draw conclusions on an organization as large and complex as the DOD. The

following were the assumptions and limitations within this study.

Assumptions

The sample in this study was limited to a small number of DOD analysts and managers

within one organization. The research findings are not meant to be representative of the entire

population of DOD analysts and managers but are meant to be a catalyst for additional

quantitative research and analysis. Responses from the analysts and the managers were based

upon their own experiences and perceptions and are not meant to be representative of the entire DOD

population.

Limitations

There were some limitations to qualitative data collection, primarily because of the

subjectivity and biases inherent to each participant and the researcher (Cooper & Schindler,

2013). The researcher purposively selected an organization within the DOD that is responsible for

large data sets and is experiencing the big data phenomenon as the source of supporting documents,

research literature, and the case study. A potential limitation was the researcher’s bias due to his long DOD

career. The researcher is a career U.S. Navy employee and purposively avoided U.S. Navy

organizations to prevent bias. All the data collected in support of this research will be retained

for seven years and then destroyed personally by the researcher via a crosscut shredder for

documents and via an approved data destruction program for digital recordings.

Organization for Remainder of Study

This study is organized into five chapters and the basis of Chapter 1 was to identify the

purpose, reasoning, and intent of this doctoral research. The research in support of Chapter 1

demonstrated a clear business problem regarding the challenges associated with the big data

phenomenon and lack of defining skills for DOD analysts and proposed the DOD is suffering

from this business problem (Gobble, 2013). Chapter 2 contains a literature review with

explanations on how this study differs from previous research. Chapter 3 describes the

methodology and research design employed in this study. Additionally, the data collection

method(s) are described to include the data analysis, credibility, dependability, and ethical

considerations (Moustakas, 1994). Chapter 4 presents the data analysis and findings and Chapter

5 presents a discussion of the results, conclusions, and recommendations for further research.

CHAPTER 2. LITERATURE REVIEW

The evidence is clear; forward-acting leaders manage and harness insights from data to

gain sustainable competitive advantages (Iansiti & Lakhani, 2014). Additionally, there is clear

evidence that there are big data problems emerging due to the disproportionate growth between

collected data and the abilities of most organizations to analyze the data (Géczy, 2015). The

general business problem is the lack of effective analysis in organizations operating in the

modern-day big data environment (Harris & Mehrotra, 2014). The specific business problem is

that DOD organizations may be struggling with gleaning actionable information from large data

sets compounded by immature data science skills of DOD analysts (Harris et al., 2013).

Additionally, the amount of data being collected and requiring analysis is on a sharp increase for

the DOD. Porche III, Wilson, Johnson, Erin-Elizabeth, and Tierney (2014) commented that as

little as 5% of all data collected in the U.S. Navy and Air Force’s intelligence, surveillance, and

reconnaissance mission received analytical interpretation: the U.S. military data analysts are

overwhelmed. Additionally, substantial research is underway to determine how big data volumes

can create value for individuals, community organizations and governments (Gobble, 2013). In

response to concern regarding extreme data growth and its impact on modern day businesses and

society, several scholarly journals have been created just in the past few years which are bringing

scholars and practitioners together to research and report on the growing big data business

problem and data sciences (Frizzo-Barker, Chow-White, Mozafari & Dung, 2016). For example,

the journals Big Data Analytics, Big Data & Society, and EPJ Data Science have all been

founded since 2012.

The objective of this research was to develop an understanding of how DOD analysts

respond to, probe and assimilate data in big data environments to help determine if a data science

occupation is justified and warranted in the DOD. The following research questions guided the

study:

Primary Research Question 1: How does the Bravo Zulu Center glean actionable

information from big data sets?

Primary Research Question 2: How mature are the data science analytical skills,

processes, and software tools used by Bravo Zulu Center analysts?

This chapter describes the processes used to explore big data and data sciences and

identifies and describes research studies that have been completed regarding this important

business problem in commercial business as well as the DOD. This chapter is the result of a

comprehensive review of the pertinent scholarly and practitioner literature surrounding big data

and data sciences and is foundational for a qualitative methodology and case study research

design.

Conceptual Framework and Research Design

The conceptual framework that serves as the foundational knowledge to support this

research study is the work of William S. Cleveland (2001). This seminal research introduced the

term data science in the context of “expanding the technical areas of the field of statistics.” This

seminal work described the requirement of an “action plan to enlarge the technical areas of

statistics focuses of the data analyst” (Cleveland, 2001, p. 1). Cleveland described a major

altering of the analyst occupation to the point that a new field shall emerge called “data science”

(Cleveland, 2001, p. 1). Cleveland’s data science taxonomy directed universities to develop six

technical areas, allocate resources appropriately to research, and develop curriculum within these

technical areas. Additionally, Cleveland recommended a data science action plan that could be

adapted for research by government and corporate organizations. Since Cleveland (2001) there

have been many researchers advancing the field of data science through theories and methods.

However, no widely accepted academic definition of data science yet exists that includes

the skills required of data scientists and how best to employ data scientists in modern big

data environments (Viaene, 2013). Conversely, there are scholars conducting scientific research

further defining the data science occupation and there are universities that have developed

curriculum to educate data scientists (Cotter, 2014). The lack of a definition regarding data

science and the potential shortage of these professionals coupled with the rapid data growth in

DOD data systems presents a key issue for the DOD.

As described by Moustakas (1994), qualitative research is an approach to explore how

groups or individuals perceive a specific phenomenon or problem. This type of research involves

collecting data typically in the participants’ settings and inductively conducting analysis of the

collected information looking for themes to provide insight and understanding (Moustakas,

1994). A qualitative research design utilizing a single embedded case study organization is

appropriate for this research and the Bravo Zulu Center agreed to participate as the case study

organization.

Gap in Literature

Although a tremendous amount of literature investigates the

implications of big data sets and data science, there is a gap in published scholarly literature

regarding big data and data sciences related specifically to the DOD. Frizzo-Barker et al. (2016)

conducted a systematic review of the big data business scholarship published between the years

2009-2014. These researchers analyzed 219 papers from 152 relevant academic journals and

concluded big data research and theory is fragmented and in “early state of domain of research in

terms of theoretical grounding, methodological diversity, and empirical evidence” (Frizzo-

Barker et al., 2016, p. 1). Frizzo-Barker et al. (2016) examined key elements as to the types and

sheer volume of published big data research as well as to the aspects of big data problems and

opportunities examined in contemporary big data research. Frizzo-Barker et al. (2016) examined

the types of industries and organizations being analyzed through big data research and concluded

most research can be categorized as either business in general or financial and management.

These researchers categorized any research regarding big data and the DOD into the law and

governance category, which made up 17% of the total big data research submitted, suggesting a

significant gap exists in big data research associated with the DOD, as seen in Figure 1.

Figure 1. Analysis of Big Data Scholarship. Adapted from “An Empirical Study of the Rise of

Big Data in Business Scholarship,” by J. Frizzo-Barker, P. Chow-White, M. Mozafari, and H.

Dung, 2016, International Journal of Information Management, 36(3), p. 410. Copyright 2016 by

Elsevier. Reprinted with permission.

Additionally, there is an abundance of contemporary big data research regarding the

technological advances enabling the big data phenomenon and much less surrounding the human

and data science implications associated with big data. In fact, there appears to be a gap in

published scholarly literature that tackles the human implications associated with big data and

data sciences and this gap is the focus of this research. There appear to be many opportunities to

explore new theories and practices that may evolve regarding the management of big data and

the evolution and application of data science (George et al., 2014).

The Big Data and Data Science Buzz

Without question the term big data and associated literature experienced a sharp increase

over the past decade. In a dissertation regarding big data and healthcare, Young (2014)

cited a 2013 Google search on the term big data that yielded 9.1 million hits. I executed the

same Google search in December 2017, which returned 343 million hits, and again in

August 2018, which returned 824 million hits.

Additionally, there is a plethora of both scholarly and secondary literature surrounding big data

and data science and this literature review was the product of the examination of hundreds of

writings regarding these topics. This literature review focused on the perceived benefits and

liabilities of big data and the implications for analysts in modern organizations responsible for

capturing meaningful information from the data. Specifically, are there actions and emerging

requirements of the people responsible for analyzing data because of the arrival of large amounts

of data, and secondly is the notion of a data scientist warranted? Additionally, this literature

review focused on supported evidence of successful big data application by commercial

organizations to aid the DOD regarding their initiatives to harness big data.

A continually growing interest from mainstream media and research firms is

contributing to the message regarding data sciences. The research firm Glassdoor is an

organization that ranks occupations based upon current job openings, salaries, career

opportunities, and job satisfaction. This organization ranked data scientist as the top job in the

United States for 2016, 2017, and 2018 and indicated a data scientist could expect to earn an

annual salary of $110,000 (Columbus, 2018). In this example, a major research firm on job

occupations in the United States declared data scientist as the top profession and yet, as this

literature review highlights, the DOD has not determined whether and how data scientists are needed.

Additionally, a frequently cited report by Manyika et al. (2011) suggested a shortfall of analytical

and managerial talent in the United States in the range of 140,000 to 190,000 people by 2018.

The well-published big data researchers Thomas Davenport and D.J. Patil not only agreed to the

shortfall but also labeled data scientist as the “sexiest” job in the 21st century (Davenport & Patil,

2012, p. 1). Conversely, Fox and Do (2013) advocated there may be too much hype regarding

big data and its potential impacts. These researchers indicated the term big data is too vague and

this vagueness is causing prioritization problems for organizations. These researchers suggest

that increasing data both in size and complexity has been on-going since the mid-1990s and it

does not represent a new problem (Fox & Do, 2013). Comparing literature between researchers

such as Davenport and Patil (2012) who claimed big data and data science are having profound

effects on most industries and researchers such as Fox and Do (2013) who proposed that big data

is not new demonstrates this is an on-going debate that requires further research.

The term data scientist gained significant notoriety and momentum in 2008, when D. J.

Patil and Jeff Hammerbacher were leading the analytical efforts at LinkedIn and Facebook, respectively

(Davenport & Patil, 2012). Data scientists are professionals at gleaning actionable information

from large amounts of data. Data scientists use traditional math, science, and statistical techniques

along with modern analysis software to glean actionable information from large data sets

(Davenport & Patil, 2012). Furthermore, the term data scientist received a great amount of

popular press when D. J. Patil went on to be appointed by President Obama as the first Chief

Data Scientist at the White House (Smith, 2015). D.J. Patil served in this capacity under

President Obama from 2015-2017. The following comprehensive review of the existing scholarly

and practitioner literature explores the potential and effects of big data and seeks to document the

implications and requirements of today’s business leaders and understand the growing

importance of data science.

Big Data Defined

There is clear evidence that a big data phenomenon is underway, but the full ramifications

and significance of big data, and how prepared the human element is, are less

clear. There are scholarly researchers suggesting the

arrival of big data includes cultural, technological, and scholarly impacts (George, Haas, &

Pentland, 2014). Conversely, there are some influential researchers, such as Watson and

Marjanovic (2013), that indicate big data may not represent something new but is simply the next

phase of digitization as societies continue to progress through the information age. Beer’s (2016)

theoretical framework suggested there is very little understanding of the concept of big data,

such as where the term came from, how it is used, and how it lends authority; answering these

questions would further conceptualize the big data phenomenon and allow for actionable research and theory.

Schneider, Lyle, and Murphy (2015) indicated the growing conversation of big data is a very

relevant conversation to the DOD due to the extreme data growth and data capture by DOD

activities coupled with indications the data growth trends will continue for the near future.

Big data has become a ubiquitous term with no single unified definition. A commonly

cited explanation describes big data “as the collection of data sets so large and complex that it

becomes difficult to process using traditional relational database tools and traditional data

processing applications” (Moorthy et al., 2015, p. 76). The origin of the term big data is

debatable; however, this term has been around since at least the 1990s. Several authors give

some credit to John Mashey, who in the 1990s was a chief scientist working at Silicon Graphics

Inc., responsible for developing methods for the management of large amounts of computer

graphics. Mashey gave hundreds of presentations to small groups in the 1990s to explain the

concept of an extremely large amount of data capture coming quickly with profound impacts

(Lohr, 2013).

Several researchers, such as Watson and Marjanovic (2013), placed big data on an

evolutionary scale and depict the big data phenomenon as the fourth generation in the

information age. Decision support systems (DSS) were the first generation, born in

the early 1970s. Second, the 1990s brought in the era of enterprise data warehousing, in

which businesses aggregated their data from many disparate data sources and field locations into

a single warehouse or warehouses. The third generation arrived in the early 2000s in which

senior leaders and managers were gaining near and real-time access into these data warehouses

and invested heavily into the business intelligence layers built on top of these data sets to gain

powerful and competitively attractive insights into their value chains. Finally, the big data era is

creating a fourth generation that promises to be a catalyst for major change and innovation in

nearly all industries (Watson & Marjanovic, 2013).

The Size of Big Data

The amount of data collection globally is growing rapidly and modern organizations are

capturing massive amounts of data on activities up and down their value chains. Additionally,

millions of networked sensors are being embedded into machines creating a hugely data rich

environment. This exponential growth in data is underway in nearly all sectors of the U.S.

economy and businesses are simply collecting more data than they can manage (McAfee &

Brynjolfsson, 2012). There are several researchers and organizations studying the amount of data

generated and providing predictions of massive growth in the decade ahead. One common

resource cited in modern literature surrounding big data is the Digital Universe research project

sponsored by the EMC Corporation (Turner, Reinsel, Gantz & Minton, 2014). This project seeks

to define how big the big data expansion is today and provides predictions of data growth into

the next decade. According to the Digital Universe, data generation and collection will double

every two years and by 2020, the size of stored digital data will reach 44 trillion gigabytes. To

help put this into context, if this amount of data were stored on tablet computers, such as

the iPad™, the tablets would form 6.6 stacks, each reaching from the Earth to the Moon

(Turner et al., 2014).
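
To make the arithmetic behind the Digital Universe projection concrete, the following minimal Python sketch works backward from the two figures cited above (a doubling every two years and 44 trillion gigabytes by 2020). It assumes smooth exponential growth and decimal units (1 zettabyte = 1 trillion gigabytes); the function and constant names are illustrative and do not come from Turner et al. (2014).

```python
# Illustrative sketch only: assumes smooth exponential growth anchored to the
# 2020 projection cited in the text (44 trillion GB, doubling every two years).

ANCHOR_YEAR = 2020
ANCHOR_SIZE_GB = 44e12        # 44 trillion gigabytes
DOUBLING_PERIOD_YEARS = 2
GB_PER_ZETTABYTE = 1e12       # decimal units: 1 ZB = 10**21 bytes = 10**12 GB


def projected_size_gb(year):
    """Return the implied size of stored data (in GB) for a given year."""
    doublings = (year - ANCHOR_YEAR) / DOUBLING_PERIOD_YEARS
    return ANCHOR_SIZE_GB * 2 ** doublings


def gb_to_zettabytes(size_gb):
    """Convert gigabytes to zettabytes using decimal (SI) units."""
    return size_gb / GB_PER_ZETTABYTE


if __name__ == "__main__":
    for year in (2014, 2016, 2018, 2020):
        size_gb = projected_size_gb(year)
        print(f"{year}: {gb_to_zettabytes(size_gb):.1f} ZB")
```

Under these assumptions, each two-year step back from 2020 halves the total, so the implied 2018 figure is 22 trillion gigabytes.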

The Three V’s Revised

There are many assumptions and perplexities regarding big data definitions. If all

organizations generate data, what constitutes big data? Additionally, because big data is a term

with different meanings it creates difficulties when determining solution paths regarding big data

efforts (Watson & Marjanovic, 2013). Attempting to define a taxonomy on which to conduct big

data research is a common theme in contemporary big data literature (Beer, 2016). In 2001,

Douglas Laney of META group authored what is now considered a foundational white paper

regarding data management and provided a context upon which the big data phenomenon could

be described. Even though there is no consensus on the amount of data that constitutes big data,

the impact of big data could be described through the constructs of volume, velocity, and variety

(Phillips-Wren & Hoskisson, 2015). Although an exact and wide-spread definition of big data

has not been commonly agreed to, examining the data growth through Laney’s definition is very

commonly cited in the literature. Laney described the three V’s in the context of the amount and

size of the data (volume), the rate at which data is produced (velocity), and the range of different

formats in which data is generated and delivered (variety) (Phillips-Wren & Hoskisson, 2015).

Kitchin and McArdle (2016) suggested Laney’s traditional view of big data using the

three V’s lacks ontological clarity. Ontological clarity would define the concepts, categories and

properties of big data and the relationships between them (Kitchin & McArdle, 2016). The use of

the three V’s to describe big data is a useful entry point but only describes a broad set of issues

associated with big data, rather than providing further definition and practicality of big data (Kitchin &

McArdle, 2016). Additionally, Kitchin and McArdle (2016) aggregated and submitted several

important and new qualities and attributes of big data, suggested by several contemporary big

data researchers, to include the following:

• “Exhaustivity. The entire system is captured, n=all, rather than being sampled.

• Fine-grained. Resolution and uniquely indexical (in identification).

• Relationality. Data contains common fields that enable the conjoining of different

datasets.

• Extensionality. Data is added and changed easily.

• Scaleability. The ability for data to expand in size rapidly.

• Veracity. Data can be messy, noisy and contain uncertainty and error.

• Value. Data provides many insights can be extracted and the data repurposed.

• Variability. Data can be constantly shifting in relation to the context in which they are

generated” (Kitchin & McArdle, 2016, p. 1).

Kitchin and McArdle (2016) explored ontological characteristics of 26 datasets to

provide a more actionable definition of big data. These researchers developed a taxonomy of

seven big data traits and then applied these traits against 26 data sets that were considered to

meet current definitions of big data. Kitchin and McArdle (2016) significantly added to Laney’s

foundational definition of big data and demonstrated big data is qualitatively different from

traditional small data sets along seven axes, as seen in Table 1.

Table 1

Kitchin and McArdle’s Seven Traits and Small to Big Data Comparison

Trait                             Small Data                            Big Data
Volume                            Small or limited to large             Very large
Velocity                          Slow, freeze-framed or bundled        Fast, continuous
Variety                           Limited in scope to wide ranging      Wide
Exhaustivity                      Samples                               Entire populations
Resolution and indexicality       Coarse and weak to strong and tight   Tight and strong
Relationality                     Weak to strong                        Strong
Extensionality and scalability    Low to middling                       High

Note. Adapted from “What makes big data, big data? Exploring the ontological characteristics of 26

datasets,” by R. Kitchin and G. McArdle, 2016. Big Data & Society, 3 (1). CC 2016 by Sage Publishing.
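
One way to see how the seven traits in Table 1 could be applied to a candidate data set is to record, for each trait, whether the data set matches the Big Data column. The Python sketch below is only an illustration of that idea; the class, field names, and both example data sets are hypothetical and are not taken from Kitchin and McArdle (2016), who compared 26 real data sets.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TraitProfile:
    """One flag per row of Table 1; True means the data set matches the
    'Big Data' column for that trait. Field names are hypothetical."""
    very_large_volume: bool
    fast_continuous_velocity: bool
    wide_variety: bool
    exhaustive_population: bool
    tight_resolution_indexicality: bool
    strong_relationality: bool
    high_extensionality_scalability: bool


def matched_traits(profile):
    """Return the names of the traits on which the data set matches Big Data."""
    return [f.name for f in fields(profile) if getattr(profile, f.name)]


# Hypothetical data sets, invented purely for illustration.
sensor_feed = TraitProfile(True, True, False, True, True, True, True)
survey_sample = TraitProfile(False, False, False, False, True, False, False)

if __name__ == "__main__":
    for name, profile in (("sensor_feed", sensor_feed),
                          ("survey_sample", survey_sample)):
        matches = matched_traits(profile)
        print(f"{name}: {len(matches)} of 7 traits match -> {matches}")
```

The sketch deliberately avoids a pass or fail threshold: listing the matched traits simply makes explicit where a data set departs from the Big Data column of Table 1.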

Big Data Benefits

The traditional analytics environment that exists in most organizations today includes

transactional systems that generate data and data warehouses that store the data. Data warehouses

are thus collections of federated data marts. The environment also includes a set of business

intelligence and analytics tools that aid decision-making through queries, data mining, and dashboards. Typical dashboards drill from

top-level key performance indicators down through a wide range of supporting metrics and

detailed data (Davenport, Barth, & Bean, 2012).

Almeida (2017) suggested the primary purpose of big data analysis is to improve

business processes through greater insights and better decision making. Understanding how to

leverage increasing amounts of data is crucial for business success in the modern environment.

This researcher conducted an in-depth literature review of published works between the years

2012-2017 and determined that big data analysis is a growing theme of importance in big data

research (Almeida, 2017). Additionally, research published in the Harvard Business Review by

McAfee and Brynjolfsson (2012) was a study encompassing 330 large North American

companies and consisted of structured interviews with executives spread across these

organizations. The researchers gathered information in interviews about the companies’

organizational management and technology strategies and collected information from annual

reports and independent sources. The primary purpose of McAfee and Brynjolfsson’s study was

to investigate if exploiting vast new flows of information in the era of big data could radically

improve performance. The researchers suggested the era of big data is a revolution because

companies can measure and therefore manage more precisely activities up and down their value

streams unlike any time in the past. McAfee and Brynjolfsson concluded that top performing

companies that are using data-driven decision-making supported by analytical software were on

average “5% more productive and 6% more profitable” suggesting companies can and do build

competitive advantages through big data analysis (McAfee & Brynjolfsson, 2012, p. 64).

Additionally, according to Davenport and Dyché (2013) the analysis of data to provide insight

into the organizations’ value chain is not a new concept. However, most businesses are just now

starting to strategize the potential benefits of big data analysis and how best to implement big

data analysis into their traditional business intelligence architectures. Corporations such as

Yahoo, Google, Wal-Mart, and Amazon are clearly leading the way regarding big data

management and analysis. However, for most companies the ability to manage large data sets to

the extent of these leading corporations requires strategic planning and action (Watson &

Marjanovic, 2013). Prominent researchers such as Davenport and McAfee clearly demonstrate

there is value to companies that can analyze big data sets and may provide actionable theory for

the DOD. Hoffman (2013) suggested that although the DOD has been warehousing and

analyzing data for several decades, they, too, require strategic change to leverage information in

the era of big data. Leveraging big data through analysis is a high priority for the U.S. military;

however, researchers suggest the DOD’s ability to analyze its data is not keeping

pace with the amount of data being collected (Hoffman, 2013). Much of the expectation involved

in big data analysis is the continued desire by companies and the DOD to move from reactionary

metrics based on historical data to predictive and prescriptive metrics that may be possible with

big data analysis. Research on big data and data science suggests the ability to locate hidden

facts, indicators, and relationships immersed in big data sets not yet explored (Chen et al., 2012).

DOD and Big Data

The amount of data collection across the DOD has been increasing at a fast pace and the

demands from the warfighters to make well-informed decisions from massive amounts of data

are critical (Hamilton & Kreuzer, 2018). Edwards (2014) suggested big data insights are now an

essential requirement for modern warfare and military organizations need to use advanced

analytics to take advantage of their massive amounts of data and avoid over saturation from the

data. The notion the DOD is aware of its growing data challenge is well documented. However,

it is less clear just how large the data growth in DOD information systems is and how

prepared the DOD is to handle big data. The purpose of this exploratory qualitative case study

was to explore how DOD employees conduct data analysis with the influx of big data. This

research explored the emerging commercial data scientist occupation and the skills required

of data scientists to help determine if data science is applicable to the DOD. A

comprehensive literature review of the perceptions of big data and data science offers

potential benefits to the DOD.

DOD Big Data Initiatives

Although Frizzo-Barker et al. (2016) suggested there is a gap in big data literature for

U.S. government organizations, the U.S. defense industry appears energized by the potential of

big data and big data analysis. The DOD is reaching out to commercial industries for assistance

and advice (Konkel, 2015). Cyber defense and situation awareness initiatives appear to be in the

forefront of the department’s initiatives. Many of the big data projects underway within the DOD

are aimed at advancing intelligence, surveillance, and reconnaissance (ISR) systems (Costlow,

2014). Porche et al. (2014) accumulated several formal research projects requested by the U.S.

Navy to investigate the huge data growth and provide any potential ways forward. The amount of

ISR data collected by the U.S. Navy has become overwhelming with no end in sight. These

researchers explained the U.S. Navy is only able to analyze approximately five percent of the

data it collects from its ISR platforms (Porche et al., 2014). Additionally, several researchers

from the Naval Postgraduate School collaborated on Big Data and Deep Learning for

Understanding DOD Data (2015) further expounding on the big data problem for the DOD with

specific research to help determine if big data and data science are really something new or just

the next progression in information technology analysis. These researchers explained that

applications including traditional numerical analysis, statistics, machine learning, data mining,

business intelligence, and artificial intelligence are migrating into a common term called big data

analytics (Zhao, MacKinnon, & Gallup, 2015).

The U.S. Air Force (USAF) is also struggling with the demands for ISR data collection

and analysis as the requirement for these types of missions continues to increase. In Data Science

and the USAF ISR Enterprise (2016), the USAF Deputy Chief of Staff for Intelligence,

Surveillance and Reconnaissance released a publicly available white paper that placed

strong emphasis on the U.S. Air Force’s big data growth and the opportunities for data

science. The U.S. Air Force is experiencing exponential data growth and increasing demands on

analysts. Data science is a key element in order to unlock big data for the U.S. Air Force ISR

community (USAF, 2016). This white paper described three specific conditions that exist today

that are indications of lacking big data analysis. First, even though there is exponential growth in

data, only a limited set of data is analyzed due to the lack of integration and connectedness.

Second, data cannot be dynamically correlated and cross-referenced

vertically through organizations and horizontally across mission areas. Last, there is a shortage of

streamlined processes to coordinate, combine, and disseminate data to other participating

organizations (USAF, 2016). In this writing, the U.S. Air Force clearly acknowledged a big data

and data science problem and is requesting additional research to understand the impacts of

leveraging data scientists. This research suggested big data specialists should take the lead of

researching and comprehending data science methods and approaches that would be instrumental

in advancing the field of data sciences across the U.S. Air Force (USAF, 2016).

Another recent big data and data science initiative suggests the DOD is strategically

making efforts to analyze big data streams aimed at improving personnel readiness.

Strengthening Data Science Methods for Department of Defense Personnel and Readiness

Missions (2017) is a publicly available and comprehensive report sponsored by the DOD. The

report requests the National Academies of Sciences, Engineering, and Medicine to collaborate on

and provide recommendations on how the Office of the Under Secretary of Defense (Personnel

& Readiness) could use the field of data science to improve the effectiveness and efficiency of

their critical mission. Specifically, the request was to develop an implementation plan for the

integration of data analytics into the DOD decision-making processes. A major theme in this

report is to further the development of advanced analytics and the strengthening of data science

education. A skilled workforce that can apply contemporary advances in data science

methodologies is critical. Furthermore, this research study concluded that based upon similar

research conducted in other mature organizations this portion of the DOD’s depth, skills, and

overall resources in data analytics is insufficient. Having small pockets of data science expertise

is not sufficient and the DOD should seek to raise the overall general level of awareness and

skills to become more effective. Simply stated, new data science skills are critically needed in

the DOD workforce (National Academies Press, 2017). The U.S. Army also has several big data

initiatives underway, with declarations that big data analysis has arrived and is here to stay. The

Commander’s Risk Reduction Dashboard (CRRD) is an initiative that integrates a variety of

personnel data from several data sources. The CRRD relies on big data analysis to inform local

commanders and higher echelon commands of personnel who might be at higher risk of suicide

(Schneider et al., 2015). Current and publicly available literature from the U.S.

Navy, U.S. Air Force, and U.S. Army shows that distinct big data and data science projects are

ongoing. Many of the projects are championed by senior officers who have expressed concern

regarding the abilities of DOD organizations to analyze big data sets. Additionally, it is also clear

the DOD is interested in examining the big data and data science practices of commercial

organizations and in leveraging these advances across DOD organizations to support national

defense strategies.

Big Data Challenges

According to Watson and Marjanovic (2013), the challenge with harnessing the power of

big data includes identifying which sectors of data to exploit, getting data into an appropriate

platform and integrating across several platforms, providing governance, and getting the people

with the correct skill sets to make sense of the data. There is evidence this fundamental problem

resides within the DOD as well. Analyzing big data within the DOD requires

many data sources to be fed from hundreds of organizations, which in turn requires defining data-sharing

legal, policy, oversight, and compliance standards (Edwards, 2014). Making

effective use of big data within the DOD requires an investment of time and money as well as

finding the correct talent to do the analysis. Locating the people within DOD as well as bringing

in analysts from outside the DOD to successfully conduct big data analysis is a major challenge

(Edwards, 2014). Schneider, Lyle, and Murphy (2015) categorized the primary challenges

associated with big data specifically for the DOD and listed the ability to analyze and interpret

the data as a primary concern. Furthermore, these researchers noted that incentivizing

analysts to remain loyal to the DOD may be one of the biggest challenges the DOD will face

with big data analysis.

White House Big Data Strategy

Another example that the U.S. Government is acting on big data and data science is the

White House’s big data strategy. In March 2012, the Obama administration published the Big

Data Research and Development Initiative with specific implications for six federal departments

or agencies including the DOD. The intent of the initiative is to build an innovation ecosystem to

enhance the ability to analyze, extract and make decisions from large and diverse data sets. The

intent is for Federal agencies to better support the entire nation based upon data (White House,

2012). One of the specific initiatives was to expand the workforce needed across federal agencies

to develop and use big data technologies. The DOD portion of the big data initiative focuses on

three areas: data to decisions, autonomy, and human systems. The data-to-decisions aspect of this

initiative is to develop computational techniques and software tools for analyzing large amounts of

data (White House, 2012). Stemming from the White House big data initiative, the Federal Big

Data Research and Development Strategic Plan (2016) was promulgated. The Big Data Steering

Group reports to the Subcommittee on Networking and Information Technology Research and

Development (NITRD) and published its report through the direction of the Executive Office

of the President, National Science and Technology Council. There are seven detailed strategies

promulgated in this plan, with strategy number six directly related to the business problem and

research questions that chartered this research with the BZC.

Strategy 1: “Create next generation capabilities by leveraging emerging Big Data foundations,

techniques, and technologies” (White House, 2016, p. 6).

Strategy 2: Support R & D to explore and understand…

Strategy 3: Build and enhance research cyber infrastructure…

Strategy 4: Increase the value of data through policies that promote sharing…

Strategy 5: Understand big data collection, sharing, regarding …

Strategy 6: “Improve the national landscape for big data education and training to fulfill

increasing demand for both deep analytical talent and analytical capacity for the broader

workforce” (White House, 2016, p. 29).

• Continue growing the cadre of data scientists

• Expand the community of data-empowered domain experts

• Broaden the data-capable workforce

• Improve the public’s data literacy

Strategy 7: “Create and enhance connections in the national big data innovation ecosystem”

(White House, 2016, p. 34).

The NITRD’s supplement to the fiscal year 2018 President’s budget indicates the Federal Big

Data Research and Development Strategic Plan (2016) is still an active plan under President

Trump (White House, 2018).

Data Sciences

Similar to using a search engine to search the term big data, a review of both scholarly and

gray literature regarding data sciences and data scientists returns a plethora of literature. There is

evidence suggesting the term data science has been around for decades. However, many scholars

credit William S. Cleveland (2001) with introducing the term data science in the context of

enlarging the major areas of technical work in the field of statistics. This seminal work described

the requirement of an “action plan to enlarge the technical areas of statistics focuses of the data

analyst” (Cleveland, 2001, p. 1). Cleveland described how, due to the increasing collections of data, a

major altering of the analysis occupation would occur, to the point that a new field would emerge and be called

“data science” (Cleveland, 2001, p. 1). The plan of six technical areas that encompass the field of

data science includes multidisciplinary investigations, models and methods for data, computing

with data, pedagogy, tool evaluation, and theory (see Figure 2). The primary catalyst for Cleveland’s

declaration of the six technical areas was to act as a guideline for the percentage of the overall

effort a university or governing organization should apply to each technical area to begin to

define curriculum for the development of future data scientists (Cleveland, 2001). The focal

point of this research is to understand and document the current environment surrounding the

required skills for big data analysis. Additionally, this research aims to explore the call for data science as described

by Cleveland and further the body of knowledge regarding the progression of the data science

occupation with specific emphasis on the DOD.

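[Figure 2: Data Sciences, comprising six technical areas: Multidisciplinary Investigation, Models & Methods, Computing with Data, Pedagogy, Tool Evaluation, and Theory.]

Figure 2. Cleveland’s Data Science Taxonomy. Adapted from “Data Science: An action plan for

expanding the technical areas of the field of statistics,” by W. Cleveland (2001), International

Statistical Review, 69(1), 21-26.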

Scholarly Views of the Data Scientist Role

Zhu and Xiong (2015) explained there is a new discipline emerging called data science

and there are distinct differences between the established sciences, data technologies, and big

data. The formation and the further development of data science extends much further than

computer science. Although data scientists use similar methods and techniques there are

profound differences and data science requires fundamental theories and new techniques (Zhu &

Xiong, 2015). In an attempt to further define data science and the data scientist, Harris, Murphy, and

Vaisman (2013) provided the results of the survey they conducted in mid-2012 of working

analysts across multiple industries. These researchers surveyed analysts to understand their

experiences and perceptions of their skills. This research provided a quantitative methodology

that researchers and DOD organizations could leverage to understand how to evolve their

existing analysts into data scientists. Harris, Murphy and Vaisman (2013) furthered the notion of

the T-shaped data analysts. These are analysts that have broad expertise (top of the T) coupled

with in-depth knowledge of a particular skill or business domain (stem of the T). The vertical

stem of the T represents deep and foundational business domain understanding and the

horizontal bar represents a wide range of skills necessary across the organization (Harris et al.,

2013). Additionally, scholars such as Vincent Granville, Ph.D. have now published detailed

descriptions of data scientists with specific skill requirements. In his foundational book

Developing Analytic Talent: Becoming a Data Scientist (2014), Granville explained that data

science is a new role emerging across industries and government organizations. The data

scientist role is different from the traditional roles of statisticians, business analysts, and data

engineers. Data science is a combination of business engineering and business domain expertise,

data mining, statistics, and computer science, along with advanced predictive capabilities such as

machine learning. Data science is bringing a number of processes, techniques, and

methodologies together with a business vision to drive actionable insights (Granville, 2014).

Business Intelligence and Business Analytics

Although scholars such as Zhu and Xiong (2015) and Harris, Murphy, and

Vaisman (2013) proposed that data science is an emerging occupation with distinct skill

requirements beyond those of traditional data analysts, other scholarly researchers suggest data

science is the next logical progression of business intelligence (BI) and business analytics (BA),

generating ongoing debate. Provost and Fawcett (2013) suggested companies have realized the

benefits of hiring data scientists and academic institutions are creating data science curriculums

and contemporary literature is documenting advocacy for a new data science occupation.

However, there is disagreement about what constitutes data science, and without further

definition, the concept may diffuse into a meaningless term. These researchers argue data science

has been difficult to define because it is intermingled with other data driven decision making

concepts such as business analytics, business intelligence, and big data. The relationships

between these concepts and data science require further exploration, and the underlying

principles of data science need to emerge to fully understand the potential of data science

(Provost & Fawcett, 2013).

The research conducted by Chen, Chiang, and Storey (2012) described a clear evolution

of business intelligence and business analytics starting in the 1990s and determined big data

analytics is a similar field offering new opportunities. They described big data and big data

analytics as terms used to describe the “data sets and analytical techniques that have become

large and complex and typically require unique and advanced storage” (p. 1165). Additionally,

big data sets may require specialized management, analysis and visualization technologies, and

techniques. The big data era has quietly moved into many public, private, and corporate

organizations, and these researchers described significant improvements in market intelligence,

government, politics, science and technology, healthcare, security, and public safety through big

data analysis. These researchers expressed that the analysis of big data is a related but separate

field to business intelligence and business analytics (Chen et al., 2012).

Data Sciences Skills

The literature suggests before modern-day organizations, including the DOD, can benefit

from the rapid data growth and access to real time information, data scientists are going to be

required and will need to be embedded into the decision processes (Galbraith, 2014). Research

published in the Harvard Business Review by Shah, Horne, and Capellá (2012) suggested that even

though companies are investing heavily in deriving insights from data streaming from their

customers and suppliers, there are still significant gaps in skills and abilities of individuals and

organizations to conduct the analysis. In 2012, these researchers surveyed 5,000 employees from

22 global companies and determined fewer than 40% of employees have sufficiently mature skills

to succeed in a big data environment (Shah, Horne, & Capellá, 2012). Fundamentally,

most organizations are able to analyze only a small subset of their collected data, and that analysis is

constrained by the analytics and algorithms of desktop software solutions with modest capability

(Shah et al., 2012).

Determining whether a data scientist is different from traditional

quantitative analysts requires an investigation of the current abilities of data scientists in relation

to their requirements to generate information and the ability of the data scientists to use

modern tool sets (Harris & Mehrotra, 2014). Many questions still exist, such as: What is the level

of education needed? Do data scientists need to have a terminal degree or is data science an

applied role? Do all data scientists need to be experts in machine learning and unstructured data

analysis? Additionally, there is evidence suggesting a rise in mistaken assumptions regarding

the meaningfulness of correlations in the era of big data. For example, big data sets often

produce statistically significant findings even though the results are false and potentially based

on inappropriate analytical methods, suggesting a required modification of analytical skills (Shah

et al., 2012). The arrival of big data suggests the typical statistical approach of relying on p values

to establish significance and correlation is unlikely to be sufficient in a world of immense data in

which almost everything is significant. Simply put, when utilizing traditional statistical

tools to analyze big data, it is common to arrive at false correlations (George et al., 2014).
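
The concern about p values in very large data sets can be illustrated with a short calculation. The following is a minimal sketch and is not drawn from the cited studies. It assumes a fixed, very weak sample correlation (r = 0.01, which explains roughly 0.01% of the variance) and applies the standard t statistic for a correlation coefficient, with a large-sample normal approximation for the two-sided p value. The function name, the chosen value of r, and the sample sizes are illustrative assumptions.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p value for the null hypothesis of no correlation.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard normal,
    which is a reasonable approximation when n is roughly 100 or more.
    """
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    # For a standard normal Z, P(|Z| > t) equals erfc(|t| / sqrt(2)).
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# Illustrative assumption: a correlation too weak to matter in practice.
WEAK_R = 0.01

for sample_size in (100, 10_000, 100_000, 1_000_000):
    p = correlation_p_value(WEAK_R, sample_size)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {sample_size:>9,}  p = {p:.3g}  ({verdict} at 0.05)")
```

Under these assumptions, the same negligible correlation does not reach the conventional 0.05 threshold at small sample sizes, yet falls far below it at one million observations. Statistical significance alone therefore says little about practical importance in large data sets.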

Harris and Mehrotra (2014) expressed that in their research the organizations that create

the most value from data science are the ones that allow their data scientists to discover insights

from “open-ended questions that matter the most to the business” (p. 16). These researchers also

suggested there are distinguishable differences between data scientists when compared to

traditional quantitative analysts and there are many implications on how to define the roles of

data scientists as well as how to attract and train these experts and how to get the most value

from this emerging discipline. In 2014, these researchers surveyed more than 300 analytical

professionals from many different companies and from several industries to learn how these

analysts perceived their work and role in the organization. In their research they concluded about

one-third of the analysts describe themselves as data scientists with the remaining identifying

themselves as analysts with distinguishable characteristics. For example, more data scientists

than analysts consider their work critical to favorable business outcomes. Additionally,

94% of the data scientists surveyed indicated analytical abilities are a key element of their

companies’ strategies and business model as compared to 65% of the traditional analysts who

believe their work is tied directly to business models and strategies (Harris & Mehrotra, 2014).

According to Harris and Mehrotra (2014), data scientist skills differ from those of traditional analysts, and

the most typical distinctions are provided in Table 2.

Table 2

Harris and Mehrotra’s Analysts and Data Scientists Comparisons

|  | Traditional Analysts | Data Scientists |
|---|---|---|
| Types of Data | Structured or semi-structured, relational and typically numeric data | All types, including unstructured, numeric, and non-numeric data (such as images, sound, text) |
| Preferred Tools | Statistical and modeling tools, usually contained in a data repository | Mathematical languages (such as R and Python®), machine learning, natural language processing and open-source tools |
| Nature of work | Report, predict, prescribe and optimize | Explore, discover, investigate and visualize |
| Typical educational background | Operations research, statistics, applied mathematics, predictive analytics | Computer science, data science, symbolic systems, cognitive science |
| Mind-set: percentage who say they are entrepreneurial | 69% | 96% |
| Mind-set: percentage who say they explore new ideas | 58% | 85% |
| Mind-set: percentage who say they gain insights outside of formal projects | 54% | 89% |

Note. Adapted from “Getting value from your data scientists,” by J. Harris and V. Mehrotra, (2014). MIT

Sloan Management Review, 56(1), 15-18. Copyright 2014 by Massachusetts Institute of Technology.

Adapted with permission.

The research concluded data scientists are highly skilled specialists who tackle the most

significant and complex business challenges (Harris & Mehrotra, 2014). Common themes

regarding the skills required of data scientists include advanced and, in many cases, open-source

statistical software such as R and Python. These applications lend themselves to another common

characteristic of the perceived data scientist and that is they will serve the organization best if

they can explore open-ended questions (Davenport & Dyché, 2013).

Harris, Murphy and Vaisman (2013) conducted quantitative research in 2012 that

surveyed analysts across several industries to further the knowledge of data science skills and the

role of data scientists. The researchers developed a list of 22 generic data science skills and then

asked the respondents of their survey to categorize the skills and to self-identify their perceived

roles against the list of data science skills. The list of perceived data science skills as described

by these researchers was adapted to analyze the perceived skills and roles of the analysts at the

Bravo Zulu Center as seen in Table 3.

Table 3

Harris, Murphy and Vaisman Data Science Skills

| Perceived Category | Data Science Skills |
|---|---|
| Business | Product development; Business |
| Machine Learning/Big Data | Unstructured data; Structured data; Machine learning; Big and distributed data |
| Math & Operations research | Optimization; Math; Graphical models; Bayesian/Monte Carlo statistics; Algorithms; Simulation |
| Programming | System administration; Back end programming; Front end programming |
| Statistics | Visualization; Temporal statistics; Surveys and marketing; Spatial statistics; Science; Data manipulation; Classical statistics |

Note. Adapted from “Analyzing the Analyzers: An introspective survey of data scientists and their work,”

by H. Harris, D. Murphy, and M. Vaisman, (2013). Sebastopol, CA: O’Reilly Media. Copyright 2013 by the authors. Adapted with permission.

Defining the occupation of the data scientist is an evolutionary process currently

underway. Viaene (2013) explains that data science is not yet a defined academic discipline or

established profession. There appears to be a group of occupations such as scientists, analysts,

technologists, engineers, and statisticians working together to carve out the role for the data scientist.

This researcher also agrees with other data science research underway that big data analysis

requires a multi-skilled team in which the data scientist is a member. Big data sets combined

with advanced analytical capability are creating a breed of analysts that are going to be able to

uncover hidden patterns and unknown correlations (Santaferraro, 2013).

Data Science and Business Domain Connection

A common theme in data science research suggests that for data scientists to generate

business value they will need to work closely with domain experts in the organization (Viaene,

2013). To create business value and prevent runaway data projects, this researcher proposes a

benefits realization process through a circular series of steps. This process can create

collaboration between the business domain experts and the data scientists and should be a

foundational requirement before starting a data science project. Viaene’s benefits realization

process steps are briefly described below:

• Modeling the business: modeling represents using data to create improvements in the

business.

• Discovering data: discovery takes place in the model domain.

• Operationalizing insights: insights are transferred from the model domain to

the business domain, or operationalized.

• Cultivating knowledge: promotes the best practices for the use of data and data science

to maximize the investment.

Three Types of Analysts

Viaene (2013) describes the roles of traditional analysts as falling into three categories: data

analysts, business intelligence analysts, and business analysts. First, data analysts are

professionals who understand where data comes from and how to make data available for

business decisions. These analysts typically focus on the extraction, cleansing, and

transformation of raw data into actionable information, and most data analysts have computer

science training and solid backgrounds in math and statistics. Second, business intelligence (BI)

analysts are effective once the data have been moved into data marts and data warehouses, where

they perform the next level of data preparation. Third, business analysts

are the group within the organization that can transform the information

collected into actionable insights on where to influence the business. The abilities of moving,

handling and analyzing data make these traditional analysts ideal data scientist candidates.

Evolving these traditional analysts into data scientists will require proficiencies in

parallel computing, the petabyte-sized unstructured analysis capability of NoSQL databases,

machine learning, and advanced statistics (Santaferraro, 2013). To gain these data scientists,

Santaferraro suggested the creation of internal programs that provide the opportunity for

existing data analysts, BI analysts, and business analysts to acquire the skills they need to become big

data scientists and recommended building these programs around five primary tasks.

Santaferraro (2013) breaks the skills required of the emerging data scientists into a few distinct

descriptions and provides a five-point plan for filling the demand for data scientists.

Santaferraro’s five-point plan is summarized below:

Task 1 – Canvass existing analysts and identify those with the background, talent and

desire to increase their skills and create education opportunities for these individuals.

Task 2 – Provide incentives for participants and reward them for reaching milestones.

Incentivizing data scientists’ loyalty will be important due to the shortage of data scientists.

Task 3 – Organize analysis structure to support big data success. Avoid tying data

scientists only to business units or only creating an enterprise pool of data scientists. A hybrid of

these two approaches is warranted.

Task 4 – Deploy the infrastructure to support big data analytics. Create an infrastructure

to support unconstrained analytics. These systems should contain embedded analytics, agile

extensions, rapid iterations, real-time access, and extreme flexibility.

Task 5 – Foster a culture of analytics that supports data driven decisions. Big data

analysis can eliminate emotions, gut feelings, and egos from decision-making.

Training and Certification of Data Scientists

Henry and Venkatraman (2015) claim the average American university and its degree

programs are unprepared to provide the analytical skills required by corporations in the modern

big data environment. Conversely, the literature suggests there are many colleges, universities,

trade-schools, research organizations, software providers and government organizations that are

modifying their curriculums to include advanced analytics and data science (Miller, 2014). The

literature regarding data science suggests there are no widely agreed upon standards and

certification requirements for data science and data scientists. Essentially anyone can label

themselves a data scientist. Considerations such as the educational level and the core skill

requirements are still widely debated, making it difficult to define data science skills and

curriculums. However, there are many educational institutions now providing their interpretation

(Cotter, 2014).

In Cotter’s (2014) dissertation, Analytics by Degree: The Dilemmas of Big Data Analytics

in Lasting University/Corporate Partnerships, this researcher conducted an in-depth investigation

about how corporations and universities should partner to ensure the readiness of graduates to fill

key analysis roles in the era of big data. Cotter conducted a phenomenological study and

interviewed four business analytical groups: business leaders, faculty, recent graduates, and

supervisors of recent graduates to determine the readiness of the recent graduates and the

perceived overall effectiveness of the university education. This research concluded that most

business analytics graduates are initially lacking in real-world preparation. Additionally, Cotter

concluded the ever-changing business world is creating a need for analytical capability that may

have been previously satisfied with the T-shaped analysts (Cotter, 2014). Cotter’s research

amplifies the research questions posed in this dissertation regarding how prepared the

analysts within the DOD are to glean actionable information from big data sets. Fundamentally,

determining how the curriculums offered today at universities and DOD learning institutions

may need to change to provide data scientists to the workforce is of high interest to DOD leaders

(Edwards, 2014).

Defining data scientists’ skills, training and certification requirements is problematic

because of the broad implications and overlapping language with business intelligence, data

analysis, and business analytics. Cotter (2014) also conducted a comprehensive review of the

current degrees and certifications offered at the undergraduate and graduate levels in the United

States and abroad and concluded there are several learning institutions with many undergraduate

degrees and certifications available. Determining whether data scientists

are different from traditional quantitative analysts requires an investigation of the current

abilities of data scientists in relation to their requirements to generate information and the ability

of the data scientists to use modern tool sets. There is evidence suggesting not only a skills

gap, but also that the analysis tools are outpacing the ability of the analysts, suggesting a gap in human

talent to harness big data (Halper, 2016). Watson and Marjanovic (2013) suggested already

embedded business analysts can upgrade their skills through university courses, which should

include Java, R, SAS Enterprise Miner, IBM SPSS Modeler, Hadoop, and MapReduce.

Commercial Certification

Another available option for the DOD to examine their data science abilities is through

the use of certification from agencies outside of the DOD and academia. Modest research for

options available for certification of data scientists today suggests there are several companies

and trade organizations providing training and certification. The Institute for Operations

Research and Management Science (INFORMS) is an international organization comprised of

over 12,500 members supporting the fields of operations research and analytics. INFORMS

describes in their charter a desire to promote practices that create advances in operations research

and analytics for the betterment of decision-making and the optimization of business processes

(INFORMS, 2017). This organization claims to be a leading organization in the formalization of

a certification process for analytics focused on moving organizations from descriptive to

predictive and prescriptive analytics (Sharda, Asamoah, & Ponna, 2013). INFORMS sets an

eligibility requirement for experience and skills and then through a set of high standards and

rigorous examinations certifies analytical professionals with the Certified Analytics Professional (CAP) certification (INFORMS,

2017).

Halper (2016) provided the results of a snapshot survey from an audience at The Data

Warehouse Institute Chicago 2016. This research aimed at furthering the understanding of

the confidence of software providers to automate analysis of big data sets and address the skills

gap. There is a push by software and hardware technology providers to ease the skills required of

data scientists by advancing analytical software to continually move through large data sets

while also providing high level and effective statistical analysis and training. Halper’s modest

research supports the notion that organizations are still trying to determine what skills are

required for their analysts, where the analysts are going to come from, and are uncertain as to the

overall effectiveness of software solutions (Halper, 2016).

Vendor Training and Certification

There are several major corporations such as Microsoft, IBM, Teradata, and SAS that

are quickly developing professional analytical and data science programs. Microsoft is

recognizing the growing need for professional expertise in data science through their

professional development program focusing on data science theory, hands-on training, on-line

course curriculum coupled with a final project prior to certification (Davis, 2016). The SAS

Institute is another organization offering a data science certification. This company was founded

in 1976 and has been consistently growing ever since. SAS suggests that companies successfully

harnessing information from big data are augmenting their existing analytical staffs with data

scientists. Data scientists possess higher levels of IT capability and specialized training and skills

with emphasis on big data technologies (SAS, 2017). SAS has developed an Academy for Data

Sciences that offers a blend of classroom and on-line courses and uses a case study approach

to provide hands-on experience. Additionally, the SAS training curriculum offers training in several

of the sought-after big data and data science applications such as R, Python, Pig, Hive, and

Hadoop (SAS, 2017). This research study explored the commercial availability of data science

training and explored how analysts are trained at the BZC to help determine if further

exploration of commercial data science training is appropriate for DOD organizations.

Shortfall Preparation

The literature suggests there is a significant shortfall of analytical professionals within the

commercial sector and the DOD and this shortfall is expected to grow (Géczy, 2015). As this

literature review demonstrates, researchers are calling for action. Miller (2014) suggested that

big data and data science are such a significant problem that a national consortium is warranted.

Academia, industry, and the U.S. Government should work together to continue the growth of a

big data and data science national consortium to address the big data analytical skills gap (Miller,

2014). This consortium would do the following:

• Create formal definitions for occupations to include data scientist

• Establish curriculums and standards for accreditation for data and analytics

• Engage industries, government, and academia through shared communities of

interest

• Partner with industry consortiums and organizations to establish strong internship

programs and increase the collaboration between academia and business

• Stimulate the creation of courseware skills and literacy at all levels of education

• Establish working groups to govern data policy issues

Federal Job Series and DOD Data Scientists

George, Haas, and Pentland (2014) suggested equally important to the methods for

collecting the data are the methodologies to analyze the data. Finding and maintaining analysts

who are capable of gleaning actionable information and significance of big data intelligence is a

challenge confronting our military and these experts are in short supply (Edwards, 2014). The

development and continuous maintenance of data analysis skills in the era of big data typically

requires large investments in time and dollars. Additionally, each class of DOD worker (enlisted,

officer, civilian, contractor) may benefit uniquely from big data analysis but also may bring

unique challenges (Schneider et al., 2015). Attempting to analyze the current state of skills and

potential shortfalls of the entire class of workers in the DOD is beyond the scope of this

dissertation. However, this research focused on the primary analysts responsible for conducting

big data analysis at Bravo Zulu Center, the DOD civilians. Additionally, because the definitions,

skill requirements, and occupational roles of data scientists are still emerging in commercial

industries and academia, this fundamentally supports the importance of exploring this problem

for the DOD. Several researchers suggest the most likely avenue for organizations to develop

analytical talents will come from innovating new talent from existing analytical groups

(Davenport & Dyché, 2013). To gain insights as to the DOD’s current talent to conduct big data

analysis this research investigated the current occupational roles of the persons assigned within

the federal civilian workforce and the analysts assigned to the case study organization

responsible for conducting data analysis.

Office of Personnel Management

The United States Office of Personnel Management (OPM) is an independent agency of

the U.S. Federal Government that manages the civil service labor force. According to OPM,

“their mission is to recruit and hire the best talent; to train and motivate employees to achieve

their greatest potential; and to constantly promote an inclusive workforce defined by diverse

perspectives” (OPM, 2014, p. 1). OPM maintains a detailed classification and qualifications

section of their website and publicly available manual that promulgates the federal position

classifications, job grading, and qualifications information that is used to determine the

classifications and qualifications requirements for most work within the Federal Government

(OPM, 2014).

Classification and Qualification Standards

OPM classification standards are assigned for all federal positions and provide uniformity

and equity in the classification of positions by providing a common reference across federal

organizations, locations, and agencies. OPM classification usually includes a description of the

duties, criteria, official titles and grades. Simply, by classifying federal jobs OPM determines the

appropriate occupational series title, pay grade and pay system. Qualifications are the specific

knowledge, skills and abilities required of each position (OPM, 2009). OPM categorizes all

federal positions by white-collar jobs or trades and labor occupations. Examining the federal

positions classified for data analysis and the qualifications required of these positions provided

insights into the DOD’s current labor force associated with conducting big data analysis.

Researching the current federal job classifications suggests there are no current job classification

series for data scientists and the terms business intelligence and business analytics are not

requirements listed in the OPM’s classification and qualifications guidance. However, within the

1500 OPM job series there are several job classifications that encompass analysis, mathematics,

statistics, operations research, and computer science. The 1500 job series appears to be the

federal job classification most closely related to the emerging field of data science (OPM, 2005).

A description of the 1500 job series is paraphrased below:

Federal 1500 Job Series - This group includes all classes of positions, the duties of which

are to advise on, administer, supervise, or perform research or other professional and scientific work.

This group also performs related clerical work in basic mathematical principles, methods,

procedures, or relationships, including the development and application of mathematical methods for

the investigation and solution of problems. Additionally, the group covers the development and application of

statistical theory in the selection, collection, classification, adjustment, analysis, and interpretation of

data, and the development and application of mathematical, statistical, and financial principles to

programs or problems involving life and property risks (OPM, 2005, pp. 14-16).

By further examining the 1500 federal job classification guidance there are several

occupational series that encompass, at least in part, many qualifications requirements of

traditional analysis as seen in Table 4. This research explored the 1500 series federal occupations

and other federal analyst occupations within the DOD workforce to determine if they provide

the necessary skills for useful big data analysis and how aligned these federal occupations are to

those of the perceived data scientist.

Table 4

Federal 1500 Job Series Occupations

1501 - General Mathematics & Statistics     1520 - Mathematics

1510 - Actuarial Science                    1529 - Mathematical Statistics

1515 - Operations Research                  1530 - Statistics

Note. Adapted from “Professional Work in the Mathematical Sciences Group 1500,” by U.S. Office of

Personnel Management.

According to the research published by the U.S. Air Force, a distinctive data science career field

does not currently exist and the operations research analyst (1515) is the federal occupation that

most closely relates to the perceived data scientist occupation (USAF, 2016). The employment of

the 1500 job series analysts and other active analyst occupations was explored with the BZC

case study.

Management Implications

The arrival of a vast amount of data along with the continuing evolution of information

systems presents a paradigm that requires a change in the management of the organization.

Combining big data with advanced analytics will allow managers to gain deep insights about

their business and translate data analysis into improved performance (Brynjolfsson & McAfee,

2012). The Manyika et al. (2011) research that indicated a large shortfall of data scientists by

2018 also forecasted a significant shortfall of managers with the expertise to leverage big data

analysis to make effective decisions. In a big data era where one comment from a trusted social

media source can result in losses or profits of billions of dollars and chain reactions in the news

media, there is no argument remaining regarding a management impact to modern-day business

(George et al., 2014). Additionally, there is little doubt businesses are prioritizing the inclusion of big

data in their strategic plans. In a recent survey, six hundred global business leaders identified

their organizations as data driven, and ninety percent of those organizations recognized

information as a key resource for success (Gobble, 2013). However, there is evidence that

suggests many organizations do not fully trust the technologies, the data and ultimately the data

scientists and “neither the data scientists nor managers are effective at speaking each other’s

language” (Harris & Mehrotra, 2014, p. 16). In the research conducted by Harris and Mehrotra

(2014) they proposed there are five key management challenges to address in the era of big data:

• Talent Management

• Leadership

• Decision Making

• Technology

• Company Culture

Although a comprehensive investigation as to the management implications associated with all

five key management challenges is beyond the scope of this dissertation, researching key

implications for managers and their perceptions of big data and data sciences is warranted.

Additionally, the approach of investigating the perceptions of the analysts as well as conducting

a focus group interview with executives or managers within the Bravo Zulu Center will help

ensure deep investigation. The investigation with the management team at the Bravo Zulu Center

will explore their perceptions regarding the differences between data scientists and traditional

analysts and several other important questions. Again, Harris and Mehrotra’s (2014) research

included a survey of more than 300 analysts and suggested that because there was a much higher

direct management involvement of data scientists over traditional data analysts into the most

critical projects, management understands how effective creative data scientists can be when it

comes to solving complex problems. Additionally, as part of their research Harris and Mehrotra

conducted a focus group interview session with a group of managers and executives to gain their

perspectives on big data and data science. This approach was repeated in my case study

research with the Bravo Zulu Center.

According to Brynjolfsson and McAfee (2012) the managerial challenges associated with

building data driven organizations from big data are even greater than the technological

challenges. In general, the technologies are outpacing adoption, and there is work to be done to

construct the policies that ensure the leveraging of big data. In previous decades, data and

metrics were limited and essentially rolled into aggregated key performance indicators and

presented to executives. Much of the decision making and direction of the firm was placed in the hands

of the executives who relied heavily on their experiences and intuition. The ability to analyze big

data stands to completely change this business model but requires a significant investment in the

culture of the organization (Brynjolfsson & McAfee, 2012). Additionally, even though the term

big data has now been accepted as a common business term, there is very little published

management scholarly literature that tackles the management challenges associated with big data,

which provides great promise and opportunity for new theories and practices (George et al., 2014).

Companies may need to train incumbent managers to be more numerate and data literate as well

as hire new managers who already possess the skills to lead in the era of big data (Harris &

Mehrotra, 2014).

Kiron (2013) from MIT Sloan Management Review provides analysis of a 2012 survey of

50 senior executives from the financial and insurance industries that investigated their

perceptions of big data. Several key themes emerged from this analysis.

• These leaders believed in the promise of better informed decisions with the analysis

of big data sets. Eighty-five percent of the surveyed leaders indicated they have big

data initiatives either planned or in-work.

• These leaders were more concerned about the variety of data and less concerned

about the volume. Most of the firms had initiatives for managing the volume of data

but were not satisfied with the integration of the dispersed data sources.

• Very few leaders, only 3%, were concerned about the analysis of social media

information.

• Organizational alignment is a critical factor to ensuring success. The alignment of big

data initiatives across the business and information technology units is crucial.

• The leaders recognized the lack of available analytical talent.

Harris and Mehrotra (2014) suggested senior management will need to learn how to best

employ and manage data scientists. Many large organizations are now creating a core hub of data

scientists to foster an environment of sharing information and technology. Additionally, because

data scientists are a scarce commodity, many organizations are embedding data scientists with

existing data analysis groups within the organization. Creating teams that combine business

analysts, visualization experts, modeling experts, and data scientists from different disciplines

and functional areas may provide the most effective strategy for employment (Harris &

Mehrotra, 2014).

Summary

This literature review provided evidence that U.S. companies are experiencing massive

data growth, and companies that can harness information from big data create competitive

advantages. Similarly, the DOD is experiencing big data growth and the ability of the U.S.

military to analyze large data sets is becoming a crucial element of mission accomplishment

(Hamilton & Kreuzer, 2018). The terms big data and data science have rapidly grown in their

relative importance in business and DOD scholarship; however, there remains opportunity to

further advance theory for practical application. The desired ability to conduct meaningful

analysis from big data sets is a strong theme in contemporary scholarly literature and the further

emergence of the data science occupation is quickly gaining merit. The evidence

suggests there will continue to be a shortage of data scientists for the near future, and the DOD

is faced with a significant challenge.

CHAPTER 3. METHODOLOGY

Introduction

The purpose of this qualitative case study was to explore how DOD employees conduct

data analysis with the influx of big data. This research explored the emerging data scientist

occupation and the skills required of data scientists to help determine if data science is applicable

to the DOD. This research aimed to discover if there are fundamental differences between DOD

analysts and data scientists by exploring the professional experiences of analysts and managers

from a key organization within the DOD. Géczy (2015) proposed that a common big data problem exists in

organizations because of the inability of most organizations to manage and analyze big data

sets. Berner et al. (2014) suggested organizations are capturing more data than at any time in

history, with clear advantages to organizations that glean insight from the data. Although there is

a tremendous amount of literature investigating the implications with big data sets and data

science, there appears to be a gap in published scholarly literature regarding big data and data

sciences related specifically to the DOD (Frizzo-Barker et al., 2016). The general business

problem is the lack of effective analysis in organizations operating in the modern-day big data

environment (Harris & Mehrotra, 2014). The specific business problem is that DOD

organizations may be struggling with gleaning actionable information from large data sets

compounded by immature data science skills of DOD analysts (Harris et al., 2013). This chapter

is organized into sections to explain the methodology, design, setting, and proposed participants.

Additionally, this chapter explains how the data was collected and analyzed in support of the two

research questions and how ethical considerations were handled.

Research Questions

The objective of this research was to develop an understanding of how DOD analysts

respond to, probe and assimilate data in big data environments to help determine if a data science

occupation is justified and warranted in the DOD. The following research questions guided the

study:

Primary Research Question 1: How does the Bravo Zulu Center glean actionable

information from big data sets?

Primary Research Question 2: How mature are the data science analytical skills,

processes, and software tools used by Bravo Zulu Center analysts?

These research questions framed the research and were used to generate data through semi-

structured personal interviews and a single focus group interview from professionals living the

big data phenomenon within the DOD. Additionally, analysis of documents served as a third data

source from the sponsoring case study organization.

The remainder of Chapter 3 provides details on the research design and methodology, the

sponsoring organization and participants and the questions of inquiry to include how the data

was collected and analyzed. Additionally, this chapter discusses the credibility and dependability

of the research and ethical considerations.

Design and Methodology

A research design provides the logic that connects the collected data to the overall

questions posed in the study (Yin, 2009). Creswell (2009) described three components of

research: the researcher’s philosophical assumptions, the methodology, and the strategy of

inquiry. The researcher used an exploratory research design to gather the perceptions of the

participants through personal interviews and employed a qualitative strategy to explore and

analyze the collected data from a single embedded case study organization.

Methodological Approach

Qualitative research stems from a variety of disciplines such as “anthropology, sociology,

psychology, linguistics, communication, economics, and semiotics” (Cooper & Schindler, 2013,

p. 145). Qualitative research is an approach for exploring and understanding the meaning

individuals or groups may ascribe to a specific problem or phenomenon. This type of research

involves collecting data typically in the participants’ settings and inductively conducting analysis

of the collected information looking for themes to provide insight and understanding (Cooper &

Schindler, 2013). Additionally, Creswell (2009) explained, although there may still be

deliberation on the fine elements of qualitative research, generally there is common agreement

on several core and defining characteristics as seen below:

• Qualitative researchers collect data where the participants are experiencing the

phenomenon or problem under investigation.

• The researcher serves as the key instrument and is the means by which the data are

collected. Qualitative researchers may collect the data through interviewing

participants, observing behavior or examining documents.

• Qualitative researchers gather multiple forms of data rather than relying on a single source.

• Qualitative researchers build patterns, categories, and themes from the data from the

bottom up utilizing inductive and deductive data analysis techniques.

• Qualitative researchers maintain a focus on learning the meaning that the participants

of the study uphold regarding the problem or issue under investigation.

• Qualitative researchers are open to emergent designs, and understand questions may

change, and data collection methods may shift as the researcher learns about the

problem or issue to be studied.

• Qualitative researchers understand their role in the study and how their personal

backgrounds have potential for shaping interpretations.

• Qualitative researchers strive to develop a complete account of the research problem.

A qualitative research methodology is appropriate for understanding human behavior and

is common in social and behavioral sciences and by scholar practitioners who seek to understand

a phenomenon (Cooper & Schindler, 2013). In this case, the research was furthering the body of

knowledge as it relates to big data and data science and how or if DOD analysts should be

behaving differently due to the growth of information into big data.

Research Design

A case study is a qualitative research design used to obtain multiple perspectives from a single

organization and is appropriate when questions are being posed to understand a contemporary

phenomenon (Yin, 2009). Case study research is an inquiry about a contemporary phenomenon

that is set within the real-world context when there is a desire to provide an up-close and in-

depth understanding from a single or small number of cases (Yin, 2012). This effective approach

is the rationale for selecting one organization within the DOD with the intent to help determine if

data scientists are warranted in DOD organizations. Triangulation is a method used to improve

the overall accuracy of research by combining data collection methods and differing types of data

(Gronhaug & Ghauri, 2010). Triangulation for this research was executed by collecting data

through semi-structured personal interviews, a single focus group interview and document

analysis. Triangulation was accomplished by analyzing the data from the three data sources with

the assistance of the NVivo-11® software.

Participants

Yin (2009) suggested that a single case study is appropriate under several circumstances.

First, a case study is appropriate when a single case meets all the conditions for testing the theory

and can confirm, challenge, or extend the theory. Second, it is appropriate when a single case represents an

extreme or unique case. Lastly, it is appropriate when a single case is representative of a typical case. The

Bravo Zulu Center represents a typical case as described by Yin (2009). By examining this

representative case-study organization within the DOD directly responsible for large data sets,

this research can provide actionable knowledge and serve as a road map for the DOD and similar

large complex organizations to execute further research. There are several means of data

collection available to the qualitative researcher (Creswell, 2009). The researcher collected data

through semi-structured interviews, document analysis, and a single focus group interview, which

are discussed further in the data collection section of this chapter.

The researcher contacted senior officials from the DOD working in the Pentagon to help

identify organizations that are responsible for analyzing large data sets thus making them

candidate organizations to participate in this research. Additionally, the researcher’s extensive

experience in the DOD helped to guide the selection of the Bravo Zulu Center (BZC) as the case

study organization to support this research. The BZC is a large complex organization with big

data and data science challenges and is representative of many DOD organizations facing very

similar challenges. Because the DOD is an extremely large organization with understandably

tight controls on releasing information, creating actionable research is difficult, but not

impossible. A letter for sponsorship was provided by the Office of the Secretary of Defense

Prepublication and Security Review that granted approval of this research within any DOD

organization with two conditions. First, DOD specific literature supporting the literature review

portion of this study would need to be already released literature regarding the DOD. In other

words, the researcher was not permitted to use his DOD computer and network access to extract

DOD related information that had not yet been released for public dissemination. Secondly, the

organizations and the individuals who participated in the research would do so on a volunteer

basis and the participants could end their involvement with the researcher at any time without

repercussion. Additionally, the sponsoring organization and the participants were not

compensated.

Selecting the participants in qualitative research requires deliberate planning and an

effective sampling strategy. Participants of the research study are generally not chosen because

their opinions represent the dominant opinion but because their experiences and attitudes will

reflect the entire scope of the research problem (Gronhaug & Ghauri, 2010). The basic premise

for sampling in scientific research is “by selecting some of the elements in the population,

conclusions can be drawn regarding the entire population” (Cooper & Schindler, 2013, p. 338).

The population for this study represents thousands of managers and analysts from the DOD.

Additionally, the initial review of available literature regarding the BZC and its mission

supported its selection as the representative organization to support this study.

Harris and Mehrotra (2014) conducted a scientific research project that in 2012 surveyed

more than three hundred analysts and conducted a focus group interview with managers and

executives that investigated how organizations can get value from data scientists. Their research

findings suggested hiring data scientists alone is not enough and managers in modern

organizations must learn how to employ data scientists effectively. Their data collection strategy

was to solicit participants from two distinct groups, analysts and managers, and served as a

foundational strategy for this research and was repeated in this case study research with the BZC.

To gain understanding within specific functional groups within the DOD a purposive

sampling method was used. Purposive sampling is a type of nonprobability sampling where the

researcher arbitrarily selects participants for their “unique characteristics or their experiences,

attitudes, or perceptions” and is most effective when one needs to study a certain cultural domain

with knowledgeable experts within the organization (Cooper & Schindler, 2013, p. 663). The

ideal target population was determined to be senior managers or executives from the BZC

directly responsible for or influenced by large data sets as well as the analysts, or perceived data

scientists, supporting management within the BZC. Each of the participants of this study met the

initial inclusion criteria because they are employed by the BZC working as either an analyst or

manager/executive within the organization. Additionally, the purposive sampling strategy

allowed the researcher to exercise his expert judgment on additional inclusion and exclusion of

participants that ultimately increased the precision and accuracy of the research. The researcher

applied a minimum seniority and experience level to both participant groups and excluded DOD

contractors.

Although there is no specific requirement on the number of participants to include in a

qualitative research study, qualitative case study research typically ranges from 3 to 10

participants (Creswell, 2009). Additionally, saturation in qualitative research suggests the

researcher should keep sampling if the breadth and depth of knowledge is expanding and stop

collecting data when redundancy appears or no new insights occur from the collected data

(Walker, 2012). To ensure saturation is met the researcher pre-determined a minimum of 10

analysts would participate in the personal interviews and a minimum of 6 managers or executives

would participate in the focus group interview. Additionally, participation by all the analysts and managers

in this research was voluntary, no compensation was provided, and the participants

were informed they could leave at any time without repercussion. The details of the BZC

participant criteria are summarized in Table 5.

Table 5

BZC Participant Criteria

                        Managers or Executives      Analysts

Pay Grade or Rank       Civilian GS-14 or above     Civilian GS-07 or above
                        Military O-5 or above       Military E-5 or above

Overall DOD Experience  10 years                    5 years

BZC Experience          2 years                     2 years

Data Collection         Focus Group                 Interviews

Participants            6-8                         10 minimum
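
The inclusion criteria in Table 5 can be read as a simple screening rule. The following is a minimal sketch of that rule, not the procedure the researcher actually used. The names Candidate, meets_grade, and is_eligible are hypothetical, and the sketch makes two simplifying assumptions that the table does not state: a military analyst must hold an enlisted (E) grade, and grades are written in the form "GS-14", "O-5", or "E-5". Contractors are excluded, as described above.

```python
from dataclasses import dataclass

# Thresholds transcribed from Table 5; the dictionary layout is an
# illustrative assumption, not part of the study.
MIN_GRADE = {
    "manager": {"civilian": ("GS", 14), "military": ("O", 5)},
    "analyst": {"civilian": ("GS", 7), "military": ("E", 5)},
}
MIN_DOD_YEARS = {"manager": 10, "analyst": 5}
MIN_BZC_YEARS = 2


@dataclass
class Candidate:
    role: str         # "manager" or "analyst"
    category: str     # "civilian", "military", or "contractor"
    grade: str        # e.g. "GS-14", "O-5", "E-6" (assumed format)
    dod_years: float  # overall DOD experience in years
    bzc_years: float  # experience at the case study organization in years


def meets_grade(candidate: Candidate) -> bool:
    """Check the pay grade or rank threshold for the candidate's group."""
    required = MIN_GRADE.get(candidate.role, {}).get(candidate.category)
    if required is None:
        # Unknown roles and contractors have no qualifying grade.
        return False
    prefix, _, level = candidate.grade.partition("-")
    if prefix != required[0] or not level.isdigit():
        return False
    return int(level) >= required[1]


def is_eligible(candidate: Candidate) -> bool:
    """Apply the grade and both experience thresholds from Table 5."""
    return (
        meets_grade(candidate)
        and candidate.dod_years >= MIN_DOD_YEARS[candidate.role]
        and candidate.bzc_years >= MIN_BZC_YEARS
    )
```

Under these assumptions, a screening pass over a pool of volunteers would keep only the candidates for whom is_eligible returns True. The researcher's expert judgment, which purposive sampling also relies on, is not captured by such a rule.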

Setting

Several factors were used to determine the DOD organization to participate in this

research and a potential conflict of interest was addressed. A conflict of interest is any condition

in which the researcher has an existing relationship with a participant or the sponsoring

organization that could compromise the validity and the findings of the research (Seidman,

2013). Naval aviation related DOD organizations were omitted as possibilities to avoid any

potential conflict of interest due to the researcher’s active employment with Naval Air Systems

Command (NAVAIR) and the potential of his 32-year naval career creating bias in the research.

Secondly, using secondary information, such as DOD organizational charts as well as

consultations with current senior civil service members at the Office of the Secretary of Defense,

several organizations were targeted for possible inclusion. Lastly, any DOD organization that

was selected would need to be experiencing a large growth in data and required to provide

actionable information about their big data sets.

The Bravo Zulu Center (BZC) was selected by the researcher as the single case study

organization. The BZC is a large complex organization with big data and analysis requirements

to support its mission and is representative of many DOD organizations facing very similar

challenges. The BZC’s big data and analysis requirement supports the selection of the BZC as

the representative organization to support this case study research. Because of the geographical

distance between the researcher and the BZC and the scheduling complexities created by the

number of participants, the data was not collected in person. Instead, the data was collected by

telephone, as addressed further in the data collection section of this dissertation.

The BZC published and made publicly available a strategic document that provided

insights into data and analysis challenges within their organization. According to this report, the

U.S. Air Force has only started to realize the full potential of an integrated logistics and

sustainment enterprise and the ability to access and analyze data will play a key role. This

strategic plan for the BZC categorizes the actions to achieve the vision into nine distinct

attributes. Attribute #1 sets a vision for the BZC to build and analyze their data more effectively.

This strategic vision, along with other BZC documents, was explored as part of this research and

further detail is provided in Chapter 4. This research provided value to the DOD practitioners

working within the BZC and similar DOD organizations required to analyze big data sets. To

ensure confidentiality of the case study organization, the title and citation of the BZC strategic

document are not provided in this research.

Analysis of Research Questions

In qualitative research, findings result from a process of data collection, interpretative or

analytical processing, and reporting (Cooper & Schindler, 2013). Organizations are made up of

human beings with different skills, attitudes, beliefs, values, motivations, prejudices, hopes,


worries, political beliefs, and other characteristics that affect the performance of the organization

(Swanson & Holton, 2005). In support of the two research questions chartering this study, the

role of the researcher was to explore how the BZC gleans actionable information from large data

sets to help determine if the data scientist occupation is warranted in DOD organizations.

Questions posed to professionals working within the BZC yielded responses that revealed patterns

regarding big data and data sciences and generated themes for actionable conclusions and the

support of further research. Three instruments and three data collection methods were used in

this study, as seen in Table 6.

Table 6

Instruments and Data Collection Methods

Instrument                 Data Collection Method(s)
The researcher             Interviews, focus group, document analysis
Audio recorder/Telephone   Interviews
Audio recorder/Telephone   Focus group

William S. Cleveland (2001) introduced the term data science in the context of enlarging

the major areas of technical work in the field of statistics and provides the conceptual framework

that supports this study. Cleveland’s seminal work described the requirement of an “action plan

to enlarge the technical areas of statistics focuses of the data analyst” (Cleveland, 2001, p. 1).

Cleveland argued that, because of the increasing collections of data, the analysis occupation

would change to the point that a new field would emerge and be called “data science” (Cleveland,

2001, p. 1). Cleveland proposed six technical areas that encompass the field of data science:

multidisciplinary investigations, models and methods for data, computing with data,

pedagogy, tool evaluation, and theory. This taxonomy was adapted, with permission from a

senior executive within the BZC, to collect and analyze the data, as seen in Figure 3.

Figure 3. Cleveland’s Data Science Taxonomy. Adapted from “Data Science: An Action Plan for

Expanding the Technical Areas of the Field of Statistics,” by W. S. Cleveland, 2001, International

Statistical Review, 69(1), 21-26.

1. Multidisciplinary Investigation – Investigate BZC data analysis collaborations.

2. Models and Methods – Investigate the analysis capabilities and the statistical models

and methods used by the BZC analysts.

3. Computing with Data – Investigate BZC hardware and software capability available to

conduct big data analysis.

4. Pedagogy – Investigate the skills of the BZC analysts and the educational and training

requirements and opportunities available to BZC analysts.

5. Tool Evaluation – Investigate the BZC software tools used in big data analysis.

Semi-Structured Interviews and Focus Group Interview Questions

The interview questions should seek to describe the essence of the experience and be

unquestionably linked to the research problem under investigation (Creswell, 2009). In support

of the two primary research questions chartering this research, the researcher prepared several

interview questions to gain specific insights regarding big data and data sciences experiences at

the BZC. The number of interview questions was limited to between five and eight, and the

questions were prepared carefully to provide insights into the research problem without limiting

the views of the participants. A template was developed to ensure a clear understanding of the questions and

to ensure identical initial questions were posed to the managers or executives and the analysts

within the BZC. Additionally, the participants were given the questions at least one week prior to

the scheduled interviews to ensure adequate time to develop in-depth responses.

Interview Questions

1. How is data used in your organization to meet mission requirements? What are some

areas in your organization that are dependent on data?

2. How do you define big data? What increases of digital data (big data) have you

witnessed and how has it impacted the business of the BZC?

3. What are some knowledge, skills, and abilities needed to be an effective data

scientist?

4. What are some of the significant challenges associated with conducting data analysis

in your organization?

5. What are the data science skills that are used by the BZC analysts?

6. What additional skills are needed by analysts to be effective in the modern big data

environment?

7. What else can you tell me regarding big data and data science?

Semi-Structured Interview Protocol

A semi-structured interview protocol was selected as the best means to collect data from

the analysts who participated in the research. Semi-structured interviews are individual depth


interviews that generally start with a few broader questions, to put the respondents at ease and to

gain general insight into the business problem, and then migrate into increasingly more specific

questions to draw out detail (Cooper & Schindler, 2013). Interviews used in qualitative research

can vary depending on the “number of people involved, the level of structure, the proximity of

the interviewer to the participants, and the number of interviews conducted” (Cooper &

Schindler, 2013, p. 152). Effective use of semi-structured interviews relies on developing a

dialog between the interviewers and the respondents and requires more interviewer creativity.

Additionally, the interviewer’s experience and skills should be used to achieve a greater clarity

and elaboration of the answers (Cooper & Schindler, 2013). As a 32-year DOD veteran, both as

an active-duty sailor and as a federal civilian, the researcher relied heavily on his experience

managing information technology projects and data

analysis initiatives for the DOD. The telephone was used as the data collection instrument to

conduct the interviews with the analysts.

Focus Group

A focus group is a panel that typically consists of 6 to 8 participants that is led by a

trained moderator. Focus group interviews typically last between ninety minutes and two hours

(Cooper & Schindler, 2013). The researcher moderated a focus group interview that consisted of

8 managers or executives from the BZC to gain insights, ideas, feelings, and experiences about

big data and data sciences in their organization. A recorded telephone conference was used as the

data collection instrument to conduct the focus group interview after which the recorded audio

was transcribed and analyzed by the researcher to determine patterns and themes.

Credibility and Dependability

Internal validity or credibility addresses how the research findings match reality.


Qualitative researchers need to address the extent the findings will make sense and be considered

credible (Swanson & Holton, 2005). To ensure the consistency of the findings and dependability

in the research, the researcher used a field testing technique. The interview questions that were

developed by the researcher were field tested with five doctoral-level business professors who

possessed the experience and skills to participate in this study and who helped determine if the

questions posed by the researcher were interpreted as intended. These field tests were conducted

by telephone to simulate the conditions of the actual interviews, and modifications were made to

the interview template based upon the feedback received. The field test confirmed the credibility

and dependability of the semi-structured interview guide and the focus group interview guide

used for this study. Creswell (2009) suggested that member checking is a process used by researchers

to ensure the accuracy of qualitative findings. Through the ongoing dialogue between the

researcher and the participants, the researcher continually described his interpretation of the

dialogue to ensure it aligned with the participants’ perceptions. Additionally, the researcher submitted

a copy of the transcripts to each participant for review to ensure the researcher had accurately

transcribed the dialogue.

Triangulation is a method to improve the accuracy of qualitative research by combining

data collection methods and different types of data to support the research. Triangulation in

research assists in the production of a more complete, holistic, and contextual portrait of the

research problem and is particularly important in case study research (Gronhaug & Ghauri,

2010). Triangulation for this research was achieved by utilizing three data collection methods

appropriate for qualitative research as seen in Figure 4.


Figure 4. BZC case study triangulation.

In conjunction with the analyst interviews and the management focus group interview, the

researcher collected documents to support the research questions posed in this study. Documents

included job descriptions of analysts working at the BZC and a strategy document regarding data

analysis at the BZC.

Data Collection

Qualitative research combines explorative and intuitive analysis and relies on the

experience and the skills of the researcher to conduct analysis of the collected data (Gronhaug &

Ghauri, 2010). As with many scientific studies, business research studies generally require the

collection of primary data to answer their research questions (Gronhaug & Ghauri, 2010). The

data collection decisions in this research set the boundaries for the study on how the data would

be collected and documented for later analysis (Creswell, 2009). Creswell (2009) suggested that,

when conducting qualitative inquiry, the researcher has several means of data collection

available:

• A qualitative observation seeks to obtain information through the use of field notes on
  the behaviors and activities of the individuals at the research sites.

• Qualitative interviews are direct interaction events in which the researcher meets with
  the participants and through the use of semi-structured interviews elicits views and
  opinions.

• Qualitative documents are public documents (e.g., newspapers, meeting minutes,
  official reports).

• Qualitative audio and visual materials such as audio recordings, photographs, video,
  website main pages, and e-mail.

Upon approval from the Capella University Institutional Review Board (IRB), the

researcher began to collect data. The recruitment strategy was to email a description of the study

and its purpose, along with the interview questions, which illustrated the nature of the study, to the list

of proposed analysts and managers who met the researcher’s selection criteria. After receiving

responses from several potential participants, the researcher began to formalize a relationship

with each participant. The participants confirmed they read the informed consent form provided

by the researcher and acknowledged they were willing to disclose information during the

interview process and agreed to allow the interviews to be recorded by the researcher. The

researcher allotted himself six weeks to conduct the individual interviews and the single focus

group interview. BZC documents were collected throughout the entire data collection period. To

minimize fatigue, the semi-structured interviews of the analysts were limited to sixty minutes and

the focus group interview was limited to ninety minutes. Because of the geographical distance

between the researcher and the analysts participating in the study, the interviews of the analysts

were conducted via telephone. Additionally, the single focus group interview was conducted via

a telephone conference that allowed participants to dial in from different locations.

Before any of the interviews were conducted, each of the participants provided the researcher

with verbal consent that met the standards of the Capella University IRB, and the researcher

confirmed each participant understood their rights. Anonymity was provided by assigning a

numerical value for each participant in the study and no participant names were disclosed at any

point in the research. The data in support of this research was collected solely by the researcher

and the digital recordings and transcripts have been locked in a cabinet in the researcher’s home

and will be destroyed by the researcher after seven years via the use of a crosscut shredder for

documents and via an approved data destruction program for the digital recordings.

Document analysis is a process for systematically reviewing and evaluating documents in

support of qualitative research. Similar to other analytical methods, document analysis requires

the researcher to deeply explore the collected data to elicit meaning and develop a deeper

understanding in support of the research problem. Documents may include “both printed and

electronic material” and include items such as advertisements, agendas, meeting minutes,

manuals, white papers, books, letters, diaries, and journals (Bowen, 2009, p. 27). In support of this

research, the BZC provided the researcher releasable documents regarding the job descriptions of

analysts working at the BZC and strategic documents regarding data and analysis at the BZC.

Additionally, to ensure the relevancy of the documents provided by the BZC, the researcher

collected only documents published by the BZC between January 1, 2012, and July 31, 2018. These

documents were fully reviewed and the synthesized information was categorized into major

themes for analysis in support of the two research questions posed in this study.
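The publication window used to screen documents can be expressed as a simple date check. The following Python sketch is illustrative only and was not part of the study; the function name and the placeholder records are hypothetical, and the boundary dates are assumed to be inclusive.

```python
from datetime import date

# Publication window stated in the text; boundaries are assumed to be inclusive.
WINDOW_START = date(2012, 1, 1)
WINDOW_END = date(2018, 7, 31)


def in_collection_window(published):
    """Return True when a publication date falls inside the collection window."""
    return WINDOW_START <= published <= WINDOW_END


# Placeholder records invented for illustration; not actual BZC documents.
candidates = [
    ("Document A", date(2011, 12, 31)),
    ("Document B", date(2015, 6, 1)),
    ("Document C", date(2018, 7, 31)),
]
kept = [title for title, published in candidates if in_collection_window(published)]
print(kept)  # ['Document B', 'Document C']
```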

Data Analysis

The process of qualitative data analysis is making sense out of the data and ultimately

discovering themes from seemingly random information (Swanson & Holton, 2005). The use of

two distinct groups was aimed at learning about the lived experiences of

people responsible for setting goals and policies (managers) as well as the lived

experiences of people responsible for gleaning information from large data sets (analysts).

Specifically, the researcher sought to locate themes from managers and analysts currently

working within the big data phenomenon to create an accurate understanding of the two research

questions proposed in this study.

Coding Structure

The process of coding “involves the assignment of numbers or symbols to responses

generated from the interviews so the information can be grouped into a limited number of

categories” (Cooper & Schindler, 2013, p. 652). Creating a coding structure gives the researcher

the ability to take large amounts of raw information acquired from the interviews and categorize

the collected responses into a more manageable scheme for processing and analysis (Cooper &

Schindler, 2013). In qualitative research, coding happens both in preparation for

the data collection process and after the data are collected, as a means to efficiently analyze the

data (Cooper & Schindler, 2013). Additionally, it is common in qualitative research for the initial

categorizations and codes to change and evolve during the research process (Gronhaug &

Ghauri, 2010). A coding structure was developed and served as guidance to the researcher to

ensure linkages between the conceptual framework, the research questions, and the data

collection process. In preparation for the semi-structured interviews with the analysts and the

focus group interview with the managers or executives, the following initial coding structure was

used, as seen in Table 7. This coding structure was modified as the researcher progressed through

the data collection and data analysis phases.


Table 7

Initial Codes

Code   Theme                             Description
MI     Multidisciplinary investigation   BZC data analysis collaborations
MM     Models and methods                BZC analysis capabilities and the statistical models or methods
CD     Computing with data               BZC hardware and software capability
P      Pedagogy                          BZC analysts’ skills, training, education
TE     Tool evaluation                   BZC software tools used
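The initial coding structure in Table 7 can be thought of as a lookup from code to theme, against which transcript excerpts are tagged and grouped. The following Python sketch is illustrative only; the study used NVivo-11® for coding, and the function name and placeholder excerpts below are hypothetical and do not reproduce participant data.

```python
from collections import defaultdict

# Codes and themes transcribed from Table 7.
INITIAL_CODES = {
    "MI": "Multidisciplinary investigation",
    "MM": "Models and methods",
    "CD": "Computing with data",
    "P": "Pedagogy",
    "TE": "Tool evaluation",
}


def group_by_code(coded_excerpts):
    """Group (code, excerpt) pairs by code, rejecting codes outside the structure."""
    grouped = defaultdict(list)
    for code, excerpt in coded_excerpts:
        if code not in INITIAL_CODES:
            raise ValueError(f"unknown code: {code}")
        grouped[code].append(excerpt)
    return dict(grouped)


# Placeholder excerpts invented for illustration; not participant data.
sample = [
    ("P", "placeholder excerpt about training"),
    ("TE", "placeholder excerpt about software tools"),
    ("P", "placeholder excerpt about analyst skills"),
]
grouped = group_by_code(sample)
for code, excerpts in grouped.items():
    print(code, INITIAL_CODES[code], len(excerpts))
```

Rejecting unknown codes keeps every excerpt linked to the conceptual framework. A structure that evolves during analysis, as described above, would add entries to the lookup rather than bypass it.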

Cooper and Schindler (2013) suggested qualitative researchers use an array of

interpretive techniques to describe the phenomena, decode and translate the information drawn

from personal experiences to achieve an in-depth understanding that tells the researcher how and

why things happen. Swanson and Holton (2005) described four levels required for qualitative

data analysis as the following:

• Data organization and preparation – getting the collected data into a form that is easy
  to work with, which will require the transcription of the collected data.

• Familiarization – the researcher will become deeply immersed in the collected data.

• Data reduction (coding) – the researcher will begin to organize the information
  into meaningful categories.

• Generating meaning – the researcher will begin to offer their own interpretation.

The following process was applied to conduct the analysis of the qualitative data collected from

the BZC, as seen in Figure 5.


Figure 5. BZC case study data analysis process.

[Flowchart: Questions of Inquiry; data sources (BZC Management Focus Group, BZC Analysts Interviews, Document Analysis); Data Organization & Preparation (transcribe data from audio recordings/documents, verification of transcripts, organization of the data); Familiarization (read all of the data); Data Reduction (data coding in NVivo®, themes, descriptions, recode the data using the identified themes and sub-themes); Analysis and Interpretation (interrelating themes and descriptions, interpretation of the meaning of themes and descriptions).]


Data Organization and Preparation

All audio files generated from the interviews with the analysts and the focus group with

the managers were transcribed by the researcher. The interviews were transcribed into a

Microsoft Word® document, and this document was imported for use in NVivo-11®.

Additionally, the researcher typed up his field notes and observations recorded during the

interviews and these were also imported into NVivo-11® for qualitative inductive analysis and

thematic identification. All recordings, transcriptions, scans, and outputs from NVivo-11® will

be kept in an unidentified, password-protected location for seven years and subsequently

destroyed.

Familiarization

During the familiarization process the researcher is actively engaged in the data by

asking questions of the data and making comments (Swanson & Holton, 2005). The researcher

immersed himself in the data by listening to the audio several times and reading and rereading

the data while taking notes and synthesizing meaning from the data. The familiarization process

allowed the researcher to gain a general sense of the collected information and then to note and

understand important aspects that later aided in the analysis portion of the research.

Data Reduction

A large share of the work involved in qualitative analysis is driven by the act of

categorizing and coding. The goal is to begin to identify themes of the collected data and use

codes to represent those emergent concepts (Swanson & Holton, 2005). Several steps are

required in the data reduction process. First, the researcher looks for tones, impressions, and

credibility of the collected data while always keeping in the forefront how the collected data

might relate to the research questions proposed in the study (Swanson & Holton, 2005).


Secondly, the process of coding gives the researcher the ability to reduce or simplify the data by

creating categories and gives the researcher the ability to start conceptualizing the collected data.

A code is a tag or label for assigning units of meaning to the collected data, and data-driven codes

are the most fundamental and most widely used method of coding in qualitative research

(Swanson & Holton, 2005). With continual reading and synthesizing of the collected data,

recurring topics and patterns began to emerge from the data that were then categorized and

properly coded. This process was completed separately for every semi-structured interview

transcript and the transcript of the focus group interview. Additionally, these two sets of outputs

were combined and analyzed together. The last step in the data reduction phase is to start the

generation of themes from the analyzed data. As the researcher examined and reflected on the categories and

themes of each interview and the focus group, overall themes began to emerge (Swanson & Holton,

2005).

Analysis and Interpretation

The final phase of the process is the analysis and interpretation of the data. In this phase,

the researcher brings all the generated themes together for formal conclusions and presentation

(Cooper & Schindler, 2013). Through the process of coding and analysis of the collected data,

interpretation and understanding began to emerge for the researcher. In this stage the qualitative

researcher attempts to offer their own interpretation of the phenomenon (Swanson & Holton,

2005). This is done by exploring the codes and categories and asking, how do the themes fit

together? What happens with some combining or splitting of the categories? What patterns

emerge across the themes? What contrasts, paradoxes, or irregularities may surface? The

themes that resulted from the data collection and analysis are described in Chapter 4 of this

dissertation.


Ethical Considerations

The researcher obtained approval from the Capella University Institutional Review Board

(IRB) prior to collecting research data from any of the participants. Additionally, the researcher

successfully completed the Collaborative Institutional Training Initiative (CITI) that provided the

general acceptable ethical standards for academic human research. After completing this

training, the researcher determined the core ethical principles to address in this research included

informed consent, privacy, confidentiality, and researcher bias. Additionally, the researcher

obtained approvals from the U.S. Air Force Survey Office, U.S. Air Force Human Rights

Protection Office, and the union that represents a portion of the workforce at the BZC.

DOD information security considerations were mitigated by working closely with the

Secretary of Defense Prepublication and Security Review office that is responsible for providing

security reviews of publications regarding DOD information. Additionally, an ethical

consideration of conflict of interest was examined. A conflict of interest is any condition in

which the researcher has an existing relationship with a participant or the sponsoring

organization that could compromise the validity and the findings of the research (Seidman,

2013). The researcher in this study has a long history with the DOD due to his employment with

the Naval Air Systems Command. This conflict was mitigated by not including any naval

aviation organizations in the research. The researcher alone collected the data in support of this

research. The digital recordings and transcripts will be locked in a cabinet in the researcher’s

home and will be destroyed by the researcher after seven years via the use of a crosscut shredder

for documents and via an approved data destruction program for the digital recordings.


CHAPTER 4. RESULTS

Introduction

The purpose of this qualitative case study was to explore how DOD employees conduct

data analysis with the influx of big data. The general business problem is the lack of effective

analysis in organizations operating in the modern-day big data environment (Harris & Mehrotra,

2014). The specific business problem is that DOD organizations may be struggling with gleaning

actionable information from large data sets compounded by immature data science skills of DOD

analysts (Harris, Murphy, & Vaisman, 2013). This research explored the emerging data scientist

occupation and the skills required of data scientists to help determine if data science is applicable

to the DOD. This research aimed to discover if there are fundamental differences between DOD

analysts and data scientists by exploring the professional experiences of analysts and managers

from a critical organization within the DOD. Géczy (2015) suggested that a typical big data problem

arises in organizations because most organizations are unable to manage and analyze big data

sets. This chapter is organized into sections to explain the data collection results, data analysis

results, summary, and how the collected and analyzed data supported the two research questions

in this study.

The following research questions guided the study:

Primary Research Question 1: How does the Bravo Zulu Center glean actionable

information from big data sets?

Primary Research Question 2: How mature are the data science analytical skills,

processes, and software tools used by Bravo Zulu Center analysts?

The remainder of Chapter 4 is organized to provide details of the participants in the research,

documents that were collected and analyzed, and the themes and patterns that resulted from the


qualitative data analysis of the collected data.

Evaluation of Design and Methodology

Qualitative research stems from a variety of disciplines such as “anthropology, sociology,

psychology, linguistics, communication, economics, and semiotics” (Cooper & Schindler, 2013,

p. 145). As described by Moustakas (1994), qualitative research is an approach to explore how

groups or individuals perceive a specific phenomenon or problem. This type of research involves

collecting data typically in the participants’ settings and inductively conducting analysis of the

collected information looking for themes to provide insight and understanding (Moustakas,

1994). A case study is a qualitative research design to obtain multiple perspectives from a single

organization and is appropriate when questions are being posed to understand a contemporary

phenomenon (Yin, 2009). Case study research is an inquiry about a contemporary phenomenon

that is set within the real-world context when there is a desire to provide an up-close and in-

depth understanding from a single or small number of cases (Yin, 2012). This effective approach

was the rationale for selecting the BZC to help determine if data scientists are warranted in DOD

organizations. The data collected and analyzed from the management focus group, the analysts’

interviews, and the BZC documents supported an exploratory case study approach for this

research. Additionally, the BZC is a complex organization that collects large amounts of data and

is struggling with the analysis of this data to support their mission requirements making them an

ideal representative case study organization for this research. The data was collected by three

means to support this research. First, semi-structured interviews were conducted with analysts

working within the BZC. Second, a single focus group interview was conducted with managers

within the BZC. Third, job announcements used to hire BZC analysts were collected and

analyzed and a recent BZC strategic planning document was collected and analyzed. The


research design and methodology, participant criteria, setting, data collection and analysis

methods were executed as proposed in Chapter 3. One more analyst than the proposed minimum

was interviewed to ensure saturation.

Data Collection Results

Participants of the research study are generally not chosen because their opinions

represent the dominant opinion but because their experiences and attitudes will reflect the entire

scope of the research problem (Gronhaug & Ghauri, 2010). The researcher used the purposive

sampling method and defined participant criteria based upon minimum seniority and experience

level to include senior managers or executives from the BZC directly responsible for or influenced

by large data sets as well as analysts supporting management within the BZC. The research

complied with the policies of the Institutional Review Board (IRB) at Capella University, the

U.S. Air Force Survey Office, and the U.S. Air Force Human Rights Protection Office, and all

the participants met the inclusion criteria. Triangulation is a method used to improve the overall

accuracy of research by combining data collection methods and different types of data (Gronhaug

& Ghauri, 2010). Triangulation for this research was executed by collecting data through semi-

structured personal interviews, a single focus group interview, and document analysis.

Triangulation was accomplished by analyzing the data from the three data sources using the

NVivo-11® software, which aided in the identification of patterns and themes.

Interviews

A list of the email addresses of potential participants that met the participant criteria was

provided to the researcher by the BZC personnel office. The researcher then solicited participants

via email that included a description of the research, the adult informed consent form, and the

interview questions. Potential participants consisted of personnel working at any of the BZC


locations with a job title of analyst who met the minimum seniority and experience criteria.

Demographic analysis was conducted on the initial composition of potential participants as seen

in Figure 6.

Figure 6. BZC potential analyst participants.

Unexpectedly, the demographic analysis of the potential participant data revealed that far

more program management analysts are assigned to analyst positions than analysts from the other OPM

occupations at the BZC. Eleven semi-structured interviews with analysts were conducted,

one more than originally planned, to ensure saturation. The analysts

that agreed to participate spanned three different OPM job occupations and ranged significantly

in overall DOD and BZC experience. The most senior analyst that participated had forty-five

years DOD experience and the most junior analyst had nine years DOD experience. The

participant with the most BZC experience had fourteen years of experience, and two

participants had just completed two years working at the BZC. The analyst participants were

assigned a numeric value to ensure their anonymity as seen in Table 8.


Table 8

Interviewee Experience Levels

Pseudonym    OPM Code/Occupation          DOD Experience   BZC Experience
Analyst 1    2003/Supply Analyst          17 Years         8 Years
Analyst 2    1515/Ops Research Analyst    18 Years         2+ Years
Analyst 3    2003/Supply Analyst          35 Years         8 Years
Analyst 4    1515/Ops Research Analyst    17 Years         2+ Years
Analyst 5    1515/Ops Research Analyst    33 Years         6 Years
Analyst 6    0343/Program Analyst         16 Years         6 Years
Analyst 7    0343/Program Analyst         45 Years         13 Years
Analyst 8    1515/Ops Research Analyst    13 Years         5 Years
Analyst 9    0343/Program Analyst         19 Years         14 Years
Analyst 10   1515/Ops Research Analyst    9 Years          9 Years
Analyst 11   0343/Program Analyst         41 Years         6 Years

The researcher shared the purpose of the exploratory research with each participant, and

the researcher read the adult informed consent form out loud and received verbal consent from

each participant before conducting the interviews. The open-ended interview questions were

aligned with the conceptual framework, grouped within the initial coding structure, and

supported the two research questions. The analysts’ interviews were recorded using a

smartphone application. The recordings were then downloaded onto the researcher’s personal

computer, and the audio files were imported into the NVivo-11® software. Each audio

interview was transcribed by the researcher, and the document files were imported into

NVivo-11® to aid the thematic analysis.

Focus Group

A list of the email addresses of potential focus group participants that met the participant

criteria was provided to the researcher by the BZC personnel office. The researcher then solicited

participants via email that included a description of the research, the adult informed consent

form, and the interview questions. Potential participants consisted of managers or executives

working at any of the BZC locations that met the minimum seniority and experience criteria.

Seven managers and one executive participated in the focus group, and each participant was

assigned a generic manager title and a numeric identifier to ensure anonymity (see Table 9).

Table 9

Management Focus Group Experience

Pseudonym    DOD Experience (Years)    BZC Experience (Years)
Manager 1    35                        2
Manager 2    32                        24
Manager 3    30                        10
Manager 4    19                        3
Manager 5    16                        14
Manager 6    20                        15
Manager 7    34                        24
Manager 8    17                        12

The researcher shared the purpose of the exploratory research with each participant, and the

researcher read the adult informed consent form out loud and received verbal consent from each

participant prior to conducting the focus group interview. The researcher confirmed with each

participant that they met the seniority and minimum experience participant criteria. The

researcher asked the management focus group the same initial open-ended questions that were

asked of the analysts. The interview questions were aligned with the conceptual framework,

grouped within the initial coding structure, and supported the two research questions. The

focus group interview lasted eighty-six minutes and was recorded using a smartphone

application. The recording was then downloaded onto the researcher’s personal computer,

transcribed by the researcher, and imported into NVivo-11® to aid the thematic analysis.

Document Analysis

Two different types of documents were collected and analyzed in support of this research.

Job announcements were collected to explore the skills required of newly hired analysts and to

help determine whether the BZC is hiring personnel with data science skills. Additionally, a

strategic planning document that sets out a vision for data and analysis at the BZC was

collected and analyzed. The documents collected to support this study are listed in Table 10.

To ensure the confidentiality of the case study organization, the titles and citations of the

BZC’s job announcements and strategic document are not disclosed in this research.

Table 10

BZC Collected Documents

Document Type       Document
Job Announcement    Program Management Analyst
Job Announcement    Operations Research Analyst
Job Announcement    Computer Scientist
Job Announcement    Supply Systems Analysts
Strategic           BZC Strategic Planning Document

This research explored whether the federal occupations within the BZC workforce provide the

necessary skills for big data analysis and how closely these federal occupations align with

the perceived data scientist role. The data collection and analysis supported the two research

questions in this study: how the BZC gleans actionable information from big data sets, and how

mature the data science skills of analysts, processes, and software tools used within the BZC

are. The BZC personnel office provided job announcements for analyst and computer science

occupations. These announcements were imported into NVivo-11®, and the duties and skills

requirements were coded using the initial coding structure aligned with the conceptual

framework. This coding provided insights into the BZC’s requirements for analytical talent,

and the results are provided later in this chapter.

To explore how the BZC uses data and to explore how the BZC gleans actionable

information from big data sets, a BZC strategic planning document was collected and analyzed.

This publicly available BZC strategic document suggests the U.S. Air Force has only started to

realize the full potential of an integrated logistics and sustainment enterprise, and that the

ability to access and analyze data will play a key role. This strategic plan for the BZC

categorizes the actions to achieve the vision into nine distinct attributes. Attribute #1 sets

a vision for the BZC to build and analyze its data more effectively. This document was

imported into NVivo-11®, and the content of attribute #1 was coded in alignment with the

initial coding structure and conceptual framework. The results are provided later in this

chapter.

Data Analysis and Results

In qualitative research, findings result from a process of data collection, interpretative or

analytical processing, and reporting (Cooper & Schindler, 2013). In support of the two research

questions guiding this study, the role of the researcher was to explore how the BZC gleans

actionable information from large data sets and how mature the data science skills of analysts,

processes, and software tools are at the BZC, to help determine if the data scientist

occupation is warranted in DOD organizations. Questions posed to professionals working within

the BZC yielded responses with patterns regarding big data and data science, and themes were

generated to support actionable conclusions and further research.

The process of coding “involves the assignment of numbers or symbols to responses

generated from the interviews so the information can be grouped into a limited number of

categories” (Cooper & Schindler, 2013, p. 652). Creating a coding structure gives the researcher

the ability to take large amounts of raw information acquired from the interviews and categorize

the collected responses into a more manageable scheme for processing and analysis (Cooper &

Schindler, 2013). The researcher developed the research questions and ensured alignment with

the initial coding structure and conceptual framework. The interview questions were open-ended

which enabled semi-structured conversations about how the BZC gleans actionable information

from big data sets and how evolved the data science skills, processes, and software tools are at

the BZC. The coding and analysis of the interviews with the analysts served as the baseline for

the enhanced coding structure and were then used in the coding and analysis of the focus group

interview and the BZC documents. The initial coding structure is restated in Table 11 for

convenience.

Table 11

Initial Codes

Code    Theme                              Description
MI      Multidisciplinary investigation    BZC data analysis collaborations
MM      Models and methods                 BZC analysis capabilities and the statistical models or methods
CD      Computing with data                BZC hardware and software capability
P       Pedagogy                           BZC analysts’ skills, training, education
TE      Tool evaluation                    BZC software tools used

Several iterations of reading and coding were required in the data reduction process. The

researcher looked for tones, impressions, and credibility in the collected data while keeping

in the forefront how the collected data related to the research questions in this study. With

continual reading and synthesizing of the collected data, recurring topics and patterns

emerged. The coding structure was refined as the transcripts of the analyst and focus group

interviews were coded and analyzed, resulting in the final coding structure (see Figure 7).

Figure 7. Final hierarchical coding structure. Shaded codes represent the initial coding structure.
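
For readers who wish to reproduce the bookkeeping behind a hierarchical coding structure outside NVivo-11®, the following minimal Python sketch may be useful. It is not the procedure used in this study. The parent codes come from Table 11. The child themes are the theme names reported later in this chapter, and their placement under parent codes is an assumption based on which initial code each theme was coded through. The function name tally_by_parent and the assignment of the sample excerpts to codes are hypothetical.

```python
from collections import Counter

# Parent codes from Table 11 (initial coding structure).
INITIAL_CODES = {
    "MI": "Multidisciplinary investigation",
    "MM": "Models and methods",
    "CD": "Computing with data",
    "P": "Pedagogy",
    "TE": "Tool evaluation",
}

# Child themes grouped under parent codes.
# Assumption: this grouping is illustrative; Figure 7 holds the actual hierarchy.
CHILD_THEMES = {
    "MI": ["Big data", "Metrics", "Access to quality data"],
    "CD": ["Infrastructure: legacy and disparate systems"],
    "MM": ["Data analysis processes"],
}


def tally_by_parent(coded_excerpts):
    """Count coded excerpts per parent code; reject codes outside the structure."""
    counts = Counter()
    for _participant, code, _excerpt in coded_excerpts:
        if code not in INITIAL_CODES:
            raise ValueError(f"Unknown code: {code}")
        counts[code] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical coding of three excerpts quoted in this chapter.
    excerpts = [
        ("Analyst 4", "MI", "The reliability of our data is poor."),
        ("Analyst 5", "CD", "We're still running on many dozens of legacy data systems"),
        ("Analyst 1", "MM", "We spend a lot of time now, just pulling from different sources"),
    ]
    for code, count in sorted(tally_by_parent(excerpts).items()):
        print(code, INITIAL_CODES[code], count)
```

Keeping the parent codes in one mapping and rejecting unknown codes mirrors the discipline of coding every excerpt against a fixed structure before refining it.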

Semi-Structured Interviews Analysis and Results

The transcriptions of the 11 analysts’ interviews were loaded into NVivo-11®, and each

interview was coded to the initial parent codes aligned with the conceptual framework. After

the initial coding and analysis of the transcribed interviews, a word frequency query was run

in NVivo-11® to generate Figure 8. The word “data” was excluded from all word frequency

queries because it was used overwhelmingly often.

Figure 8. Initial analyst interviews word frequency diagram.

The initial analysis of the semi-structured interviews with the analysts suggests early themes of

analysts’ skills, analysis, training, organizations, and information systems as seen in Figure 8.

The word frequency query was then modified to display only the fifteen words most used by the

analysts to further identify the early themes. This refined query still showed the early

themes of analysts’ skills, analysis, training, organizations, and information systems, but

additional themes of programs, scientist, engineers, research, problem, pull, and management

emerged, as seen in Figure 9.

Figure 9. Refined analyst interviews word frequency diagram.
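
The word frequency queries described above were run in NVivo-11®. As a rough illustration of the underlying idea only, the following Python sketch counts words across transcripts, excludes the word "data" as the study did, and returns the most frequent terms. The list of common function words, the three-character minimum word length, and the name top_words are assumptions made for this example and do not describe NVivo's internal behavior.

```python
import re
from collections import Counter

# The study states only that the word "data" was excluded from the queries.
EXCLUDED_WORDS = {"data"}
# Assumption: a small list of function words, added so the example output is readable.
COMMON_WORDS = {"the", "and", "that", "for", "are", "was", "with", "you", "our", "lot"}


def top_words(transcripts, n=15, min_length=3):
    """Return the n most frequent words across transcripts as (word, count) pairs."""
    counts = Counter()
    for text in transcripts:
        # Lowercase the text and keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if len(word) < min_length:
                continue
            if word in EXCLUDED_WORDS or word in COMMON_WORDS:
                continue
            counts[word] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical transcript fragments, for illustration only.
    sample = [
        "We pull the data and populate a metric.",
        "Most of it is pulling raw data and building metrics for leadership.",
    ]
    for word, count in top_words(sample, n=5):
        print(word, count)
```

Setting n=15 corresponds to the refined query that displayed only the fifteen most used words.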

Several open-ended interview questions were posed to the eleven analysts who

participated in the research to further explore the research questions on how the BZC gleans

actionable information from big data sets and how mature the data science skills, processes,

and software used by BZC analysts are. The interview questions were designed to gain a deeper

understanding of how BZC analysts conduct analysis, their perceptions of big data, challenges

associated with conducting data analysis, the software tools used to conduct data analysis,

training options for analysts, and their perceptions of data science. Several themes emerged

from the analysis of the collected data, which helped to answer the research questions posed

in this study.

Research Question #1: How does the Bravo Zulu Center glean actionable information

from big data sets?

The analysts were asked initial open-ended questions investigating whether the BZC is

experiencing the big data phenomenon, the perceived benefits and liabilities of big data, and

their conceptions of the term big data. The responses provided insights about the concept of

big data, data growth, and the ability of the BZC to analyze large data sets. The participants’

responses are provided in Table 12.

Interview questions posed regarding big data:

How do you define big data? What increases of digital data (big data) have you witnessed and

how has it impacted the business of the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 12

Analysts’ Responses to Questions about Big Data

Participant Comment

Analyst 2 I think at least the fundamental concept of big data is integrating multiple data

sources so that you’ve got a better picture of your overall output or just trends.

This is something where we should be working toward. There is very little that

we are doing with big data.

Analyst 3 It’s so big you haven’t figured out either the way to do it or the time to do it, to tie

things together in a meaningful way that is what I think our situation is.

Analyst 4 Yes, it has grown exponentially from the 80s. However, many of our systems for

data collection rely on the compliance of human beings.

Analyst 5 There are vast amounts of sensor data on new weapon systems that are available.

I believe big data is anything bigger than a standard desktop application can

handle, it is going to involve data formats above and beyond structured tables and

lists. It is going to include things like scanned images, and we’ve got information

systems that involve scanned images, it could be audio, it could be video, it could

be free form text, we’ve got lots of forms with check boxes and then free form

boxes for somebody to write something in there. Big data is going to be a huge

volume and it may be coming at you at a very rapid rate.

Analyst 6 I haven’t noticed an increase in the data, I have noticed a trend to try and

modernize how the data is being gathered, maintained and shared.

Analyst 8 I think to someone who comes from a statistics background, who has been in the

field of statistics for a long time, their version of what constitutes big data is

totally different than someone who is a computer scientist or programmer. I

would say big data in today’s day and age. Big data is millions of records if not

billions and trillions. I don’t know that we are capturing more data, per se, in the

BZC, although I think there is a push to want to capture more than what we

already are. I think big data and big data analytics is a trend, but it is a trend that

is here to stay and I think the Air Force needs to jump on the bandwagon.

Analyst 11 We have so much data and you’re right it is growing exponentially and it’s really

kind of overwhelming for the average employee.

Big data theme. Coding and analyzing the transcripts from the analysts’ interviews

through the (MI) initial code regarding big data revealed thematic elements common in the

literature review. The BZC is a complex organization with many disparate data systems

generating large data sets, and it is struggling to glean actionable information from those

data sets. The BZC’s experience supports the Moorthy et al. (2015) definition of big data “as

the collection of data sets so large and complex that it becomes difficult to process using

traditional relational database tools and traditional data processing applications” (p. 76).

The analysts were posed questions that further explored how the BZC gleans actionable

information from big data sets. The participants were asked to explain how data is used within

the BZC to meet mission requirements. The participants were also posed an open-ended question

that explored any dependencies on data. The participants’ responses are provided in Table 13.

Interview questions posed regarding data usage:

How is data used in your organization to meet mission requirements? What are some areas in

your organization that are dependent on data?

The complete list of initial interview questions is provided in Appendix A.

Table 13

Analysts’ Responses to Data Usage Questions

Participant Comment

Analyst 2 But they haven’t been able to tell people how they perform historically and we’ve

had to go back and develop all that for them as far as metrics and other things like

that and a Pareto chart. Then we did some follow up DSCM work after that and

developed metrics and goals.

Analyst 3 So, we are big on metrics, number 1 so we pull down a lot of data just to satisfy

populating metrics but there’s not, the majority of the metrics there’s not a lot of

analytical things that go along with it, its just we pull down the data and you

populate a metric and then you’re done. There’s other that we do to where we pull

the data populate a metric and maybe based on being in or out of tolerance that

warrants doing some analysis and so any time you do analysis, when then you

have to start pulling down the raw data that facilitates doing that

Analyst 5 So, we do the planning and we generate metrics from all sorts of data to access

how well the supply chain is performing. One of the big things that we look at is

metrics, how well are we doing and there are different definitions of the metrics

depending on which organization you are talking to. But even within the BZC

there is going to be different definitions of the metrics.

Analyst 7 One of the things they have a measurement for output per man day. So a man day

would be let’s say people on an 80 hour pay period on a two week period.

Analyst 9 I go and evaluate an organization they’ve really have never tracked it before, like

on a spreadsheet or database or anything because it was never really evaluated as

something that was important, there is other metrics that they are looking at.

Analyst 11 People do a lot of the gathering of the data and metrics, reporting and that sort of

thing.

Metrics theme. In previous decades, data and metrics were limited and essentially rolled

into aggregated key performance indicators and presented to executives. Many of the decisions

and much of the direction of the firm were placed in the hands of executives, who relied

heavily on their experience and intuition. The ability to analyze big data stands to

completely change this business model but requires a significant investment in the culture of

the organization (Brynjolfsson & McAfee, 2012). The responses to the interview questions posed

to the analysts regarding how data is used within the BZC were coded using the (MI) initial

code aligned with the conceptual framework. The analysis of the collected data suggests a theme

of metrics: the BZC places emphasis on managing its business through the analysis of metrics.

The analysts reported that they spend a significant amount of time pulling data together and

creating metrics for their leadership.

The analysts were asked initial open-ended questions that continued to explore how the

BZC gleans actionable information from big data sets and associated challenges. The participants

were asked to explain the challenges in gleaning actionable information from big data sets. The

participants’ responses are provided in Table 14.

Interview question posed regarding big data analysis challenges:

What are some of the significant challenges associated with conducting data analysis in your

organization?

The complete list of initial interview questions is provided in Appendix A.

Table 14

Analysts’ Responses to Questions Regarding Data Analysis Challenges

Participant Comment

Analyst 1 We definitely have problems with data quality and I think as the data increases

the challenges increase.

Analyst 2 They have so little actionable big data; we lack the infrastructure and the

knowledge to really bring it all together.

Analyst 3 We do have plenty of data. The data warehouse that I use mostly, there hasn’t

been an increase in the data that’s been collected, however there’s been a change

of what’s been exposed to us.

Analyst 4 The reliability of our data is poor.

Analyst 5 You made an allusion to a data pool or data warehouse. It’s not out there, there is

an immense amount of time and effort that has to be applied to knowing where

the data is at and then going out to fetch it.

Analyst 6 The biggest challenge is getting appropriate access to those systems to extract the

information. It seems that we are still very protective of letting other air force

employees get into systems and pull what needs to be pulled. That’s a challenge

that I experience on a daily basis. Who owns the data, people allowing you to see

their data, you could have better decision support if you have access to certain

data, but getting that access is often difficult from the person who controls it so

that is a challenge.

Analyst 7 I don’t believe that there is a problem with collecting data and really even some

cases the way they report. I think it is probably just not as accurate as it should

be.

Analyst 10 So IT alone aside from software is another issue, but sometimes the lack of data

or missing information.

Access to quality data theme. By coding and analyzing the transcripts from the analysts’

interviews through the (MI) initial code, access to quality data emerged as a theme. The analysts

indicated infrastructure and policies are constraining access to data. Additionally, the data that is

accessible lacks accuracy and completeness. Watson and Marjanovic (2013) suggested a

challenge with harnessing the power of big data includes accessing data through appropriate

platforms and providing data governance. A BZC data governance strategy that includes how

analysts get access to quality data to support mission requirements is warranted.

As the dialog continued between the researcher and the analysts regarding the challenges

associated with conducting data analysis at the BZC, additional sub-questions were posed to

each participant to further explore the factors constraining access to quality data within the BZC.

The participants’ responses are provided in Table 15.

Interview questions posed regarding big data analysis challenges:

What are some of the significant challenges associated with conducting data analysis in your

organization? What are some factors limiting access to quality data?

The complete list of initial interview questions is provided in Appendix A.

Table 15

Analysts’ Responses Further Exploring Access to Quality Data

Participant Comment

Analyst 2 We have so little actionable big data; we lack the infrastructure and the

knowledge to really bring it all together. As far as advanced analytics, the

infrastructure hasn’t been established, a couple of people have tinkered with it.

We desperately need the infrastructure and the hardware and the software to get

started, management needs to understand that when they set up big data, it’s a lot

like owning a boat, you are going to pour in a lot of money and we may not see a

real viable return on investment for 3-5 years.

Analyst 4 So you can imagine if the Air Force or DOD decided to go to a cloud

based system the millions upon millions of records that we have that would have

to be scrubbed. Most of them could be done automatically.

Analyst 5 We’re still running on many dozens of legacy data systems that have their roots

decades ago and we are still using those legacy systems to do our planning.

Analyst 6 How do we transition this information from legacy systems that are

piecemealed into a larger common database that we can actually do things with and

make informed decisions and connect the dots where we know we haven’t been

able to in the past? How do we merge everything together to where we can really

start tackling some of these big problems instead of just wringing our hands over

it?

Analyst 10 But there are now doing a lot of data mining, getting all of this information from

these program offices and putting them into databases where, they are web-based

databases where anybody can go in and get this information and I think it’s very

important and now they are talking about going to the cloud and having a lot of

the information available in the cloud, although the air force is behind in that.

Analyst 11 There is a lot of things we don’t know, we’ve got the data out there but it is in so

many disparate forms and so many disparate systems that it is virtually

impossible for us to know what we truly have and what we can do. So I have been

trying to get us pushed into that direction.

Infrastructure: Legacy and disparate systems theme. Edward (2014) suggested that

analyzing big data within the DOD requires aggregating many data sources from hundreds of

organizations, which in turn requires defining legal, policy, oversight, and compliance

standards for data sharing. According to Watson and Marjanovic (2013), the challenge with

harnessing the power of big data includes identifying which sectors of data to exploit,

getting data into an appropriate platform and integrating across several platforms, providing

governance, and getting the people with the correct skill sets to make sense of the data.

Interview questions were posed to the participants regarding the challenges and opportunities

they faced in conducting big data analysis, and the responses related to information systems

were coded using the (CD) initial code aligned with the conceptual framework. The

analysis of the collected data suggests that sections of the BZC’s business have modern

computer infrastructure and analysis capabilities, but the BZC is also constrained in its

ability to conduct enterprise big data analysis, partially due to outdated or legacy information

systems, infrastructure, and many disparate systems.

To further explore the research question of how the BZC gleans actionable information

from big data sets, the analysts were asked how data is used within the BZC to meet mission

requirements and how BZC employees conduct data analysis. Additionally, sub-questions were

posed to the participants to determine how evolved the BZC is in its ability to build

predictive and prescriptive metrics and models. The participants’ responses are provided in

Table 16.

Interview questions posed:

How is data used in your organization to meet mission requirements? How do BZC analysts

glean actionable information from big data sets?

The complete list of initial interview questions is provided in Appendix A.

Table 16

Analysts’ Responses to Data Usage and Data Analysis Questions

Participant Comment

Analyst 1 We spend a lot of time now, just pulling from different sources and then putting it

all together then trying to analyze it.

Analyst 2 Really most of the things they are doing are elementary data pulls where they

compile the data and it’s just count data. Very simple elementary computations,

they for the most part make sure that the data is valid and they compile it and it’s

like, here are your top 10. We populate it and run a couple of queries and stuff

numbers into PowerPoints. Most of the data is not aggregated, some of the guys

have taken a class dealing with neural networks, but we haven’t really played

with that very much. As far predictive modeling, that’s not really how the BZC

people view it, they only look at count data. I mean you’re talking

predictive capability is stuff that we have, it would in less than 1%.

Analyst 3 The majority is pulling raw data, there are a few pre-defined. But most of it is

pulling down raw data that, you kinda of, either manipulate inside of the system,

right calculation or things like that inside of the system to produce an answer you

are looking for or your other option is to export into excel.

Analyst 4 We actually built a simulation model using Arena, we are still in the very

beginning of text analysis.

Analyst 5 You made an allusion to a data pool or data warehouse. It’s not out there, there is

an immense amount of time and effort that has to be applied to knowing where

the data is at and then going to fetch it.

Analyst 10 We try to get enough data where we can find trends to try to mitigate any issues.

We have access to a system and we pull it up and we see what has failed and what

hasn’t and it’s a very old system and we export it into excel, unfortunately it

duplicates some things and so we have to literally go through, take out duplicates

and then make charts, pivot tables and what not to analyze the data. I mean, we

have useful things that we have predicted for certain aircraft parts or even for an

aircraft itself and most of them are well past their useful life.

Analyst 11 So, right I don’t have access to most systems, I have very few systems that I

actually access, I typically will contact other people if I need a data pull from a

system. For example, I have gone to DP, to personnel and told them I want a list

of every mechanic by skill and by shop, where they work so I can try and do

some analysis on how many sheet metal mechanics it take for different weapon

systems, so that when I have a new weapon system come on board maybe I can

be better informed on how many mechanics I will need for that.

Data analysis processes theme. Data science is bringing many processes, techniques,

and methodologies together with a business vision to drive actionable insights (Granville, 2014).

Much of the expectation surrounding big data analysis stems from the continued desire of company and

DOD leaders to move from reactive metrics based on historical data to predictive and

prescriptive metrics that may be possible with big data analysis. Research on big data and data

science suggests it is possible to locate hidden facts, indicators, and relationships immersed in big

data sets not yet explored (Chen et al., 2012). The interviews were coded and analyzed through

the (MM) initial code aligned with the conceptual framework. The analysis of the collected data

suggests the BZC is mostly building and analyzing reactive metrics on historical data with small

pockets of predictive analytical capability. Additionally, many of the data analysis processes are

manual processes reliant upon pulling data from many disparate data warehouses and analyzing

the data in basic analysis software.
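
The manual workflow the analysts describe (exporting records from a legacy system, removing duplicate rows, and tabulating the result) can be sketched in a few lines of Python. The sketch below is illustrative only: the column names, part identifiers, and records are invented, and the functions load_unique_rows and failures_by_part are hypothetical names, not tools used at the BZC.

```python
import csv
import io
from collections import Counter

# Hypothetical export from a legacy system; all columns and values are invented.
EXPORT = """part_id,aircraft,event_date,status
P-100,A1,2018-01-05,failed
P-100,A1,2018-01-05,failed
P-200,A1,2018-02-11,passed
P-100,A2,2018-03-02,failed
"""


def load_unique_rows(csv_text):
    """Parse the export and drop exact duplicate rows, keeping first occurrences in order."""
    seen = set()
    unique = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def failures_by_part(rows):
    """Count failed events per part, similar to a one-dimensional pivot table."""
    return Counter(row["part_id"] for row in rows if row["status"] == "failed")


if __name__ == "__main__":
    rows = load_unique_rows(EXPORT)
    for part_id, count in failures_by_part(rows).most_common():
        print(part_id, count)
```

Automating these steps would not by itself make the resulting metrics predictive; it would only reduce the time spent preparing reactive counts of the kind the analysts describe.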

Further exploring how the BZC gleans actionable information from big data sets and the

challenges associated with conducting big data analysis, the participants provided input

regarding organizational structure and the culture within the BZC. The participants’ responses

are provided in Table 17.

Interview questions posed:

What are some of the significant challenges associated with conducting data analysis in your

organization? How are analysts employed and aligned in your organization?

The complete list of initial interview questions is provided in Appendix A.

Table 17

Additional Responses to Analysis Challenges Questions

Participant Comment

Analyst 2 It’s kind of a mixed model, if you will. They’ve got it centralized in some of it,

where we’ve got an entire flight, I think of about a dozen analysts, including

interns. Real world problems are not going to be exactly like the book. They lack

creativity, we have people that are so used to the military model, where everything

is provided in some kind of Reg or SOP, or TOP or something like that.

Analyst 3 Right and because of that, I think that’s why at least in the BZC, that’s why we

have the volume of analytics being done by contractors. So, they are not actually

government employees, it’s just a contractor that’s doing it.

Analyst 4 We don’t cross talk well. There is still a lot of protectionism about data and about

systems. We don’t have enough data scientist, people to go collect the data. We

need more data scientist folks to go out and collect the data and feed it to us. I

think as an organization we’re going to have to have a deliberate plan to mature

the analysis capabilities and the ability of the organization to consume those

products. Recognizing it is going to take several years but they are trying to bring

in 1515s. We have very few 1515s in the center, very few.

Analyst 5 I think as an organization we’re going to have to have a deliberate plan to mature

the analysis capabilities and the ability of the organization to consume those

products.

Analyst 6 I do think there is a problem within our command air force materiel command

that I’m aligned to we are trying to address it even within the center through CSF,

they’re called center senior functionals. Recognizing it is going to take several

years but they are trying to bring in 1515s. We have very few 1515s in the center,

very few.

Analyst 10 There’s never been a data analysts ever that have worked for the quality

department before so this is brand new. Sometime I also find that the willingness

of people to work with you and communicate. There is a lot of people that don’t

like to communicate. I don’t know about the other branches but the air force is so

far behind and I fear that it is making it difficult and I think it is deterring a lot of

analysts away. We work overtime every week and have a huge back log of things.

I think sometimes, people don’t understand what we are doing and why we are

doing it.

Analyst 11 People are shorthanded so they don’t have the time to do the analysis. So we

don’t have very many people with that skill set I think that if we grew that skill

set so that, for example I don’t think we have, we don’t have a 1515 in LGX or I

believe in LGA. I’m not sure how far down in the organization maybe each

division would have to have at least one data scientist and maybe make it at the

13 level or even the 12 target 13, something like that.

So I think the answer to that is you bring in some data scientists to train the

functional specialists on how to do the thing.

Organizational structure and culture theme. Gabel and Tokarski (2014) suggested that for

organizations to harvest actionable information from big data sets, they must deliberately alter

many facets of organization design and human resource management. Harris and Mehrotra

(2014) proposed senior management will need to learn how best to employ and manage data

scientists. Many large organizations are now creating a core hub of data scientists to foster an

environment of sharing information and technology. Additionally, because data scientists are a

scarce commodity, many organizations are embedding data scientists with existing data analysis

groups within the organization. Creating teams that combine business analysts, visualization

experts, modeling experts, and data scientists from different disciplines and functional areas may

provide the most effective strategy for employment (Harris & Mehrotra, 2014). In the participants’

discussion of challenges associated with conducting big data analysis within the BZC, a theme of

organizational structure and culture was apparent, and determining how best to employ data

scientists and how to create a culture that shares data and information is warranted at the BZC.

To further investigate how the BZC gleans actionable information from big data sets and

the challenges associated with conducting data analysis, the participants provided additional

insights. The participants’ responses are provided in Table 18.

Interview question posed:

What are some of the significant challenges associated with conducting data analysis in your

organizations?

The complete list of initial interview questions is provided in Appendix A.

Table 18

Additional Analysts’ Responses to Challenges Questions

Participant Comment

Analyst 2 We are just getting there with leadership, they continue to do the same thing, yet

expect different results.

Analyst 3 We are big on metrics, number 1 so we pull down a lot of data just to satisfy

populating metrics but there’s not, the majority of the metrics there’s not a lot of

analytical things that go along with it, its just we pull down the data and you

populate a metric and then you’re done.

Analyst 4 There is a disconnect sometimes with leadership on how long it takes to actually

build both the models whether it’s simulation models on some other type of data

model.

Analyst 5 Within BZC there is going to be different definitions of the metrics. So you use a

different method or a different data set to calculate something you are going to

get a different result and they are never going to agree. We’ve got to bring our

managers along and as we rotate senior managers we’ve got to make sure they’ve

got that capability to consume those products.

Analyst 7 My biggest issue on rotating the leaders is that from an organizational

development perspective if you look at team development principles, you keep

your team in a constant storming stage versus getting to the norming and

performing stages.

Analyst 8 Management is wrapped up in taskers and the bureaucracy of how things are and

what their leadership wants them to do that we never get to do anything advanced

here.

Analyst 9 I see the newer leadership coming up are moving up into the leadership positions

do not know what to do with data. We are not educating our senior leaders to

think methodically and to really to use data and when I say use it is, ok

understand it, that’s a piece, can you interpret it, because that’s the other piece,

you have to understand it and to able to interpret it so that way you can speak to

it.

Analyst 10 Another one that you kind of had touched on that I made a note to is relevance.

When I came into this office there were people putting information down and

trying to put stuff together that really didn’t make any sense.

Management theme. Harris and Mehrotra (2014) proposed leadership is a top

management challenge in the era of big data. Companies may need to train incumbent managers

to be more numerate and data literate as well as hire new managers who already possess the

skills to lead in the era of big data. Participants provided statements regarding how leadership

consumes analysis information and difficulties with determining what metrics to use to measure

the success of the BZC. The BZC is a military organization that rotates its military leaders often,

and the participants suggested this creates challenges for BZC analysis.

Research Question 2: How mature are the data science analytical skills, processes, and

software tools used by Bravo Zulu Center analysts?

The participating analysts were asked open-ended questions investigating the maturity

level of the analytical skills, processes, and software used within the BZC. The initial open-ended

questions were designed by the researcher to explore the skills required to be an effective

analyst within the BZC as perceived by the participants. The initial open-ended questions also

investigated whether BZC analysts use the perceived data science skills and the

maturity of those skills. The participants’ responses are provided in Table 19.

Interview questions posed:

What are some knowledge, skills, and abilities needed to be an effective data scientist? What are

the data science skills that are used by BZC analysts? How evolved are the data science skills

within the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 19

Analysts’ Responses to Data Science Skills Questions

Participant Comment

Analyst 1 You definitely need to know how to manipulate data in excel and even to

manipulate data in; we have a system called LIMS-EV BOB J, business objects.

Being able to write scripts to pull data, different types of data that you need to do

your analysis. So you definitely need some computer skills. Yes, some basic

programming skills, because that is exactly what you are doing when we are

using LIMS. You don’t have to be a math scientist to do it, but you do have to be

able to count. I think you definitely have to be able to do critical thinking,

thinking out of the box, to be a good analyst and a lot of it comes with time, the

more experience you get the more things you know you need to look for, you

know the right questions to ask. You have to be inquisitive. It’s hard to find

people that have all of those skills, and it takes a long time to get skills on both of

those domains. You have to have people that are self-motivated

Analyst 2 We are just barely scratching the surface. Very, little, most of the data is not

aggregated, some of the guys have taken a class dealing with neural networks, but

we haven’t really played with that very much. I’ve built some elementary

predictive models looking at the relationship between different variables and how

it effects asset availability and I mapped most of those so people can understand

the interactions between those. As far predictive modeling, that’s not really how

the BZC people view it, they only look at count data. We are in supply and this is

a virgin canvas, nobody has touched it, they haven’t sprinkled science on any of

this stuff. I mean we look like freaking rock stars helping this people and we are

not even getting into the really cool or interesting tools yet. I’m thinking this

is a fantastic field and warranted.

Analyst 3 We use the word analyst quite often but there really are no true analysts in our

organization. There’s probably I think six others that fill, quote, unquote, analyst

role and none of us are true analysts, we just are people that kind of know the

supply chain, know how to pull down data, know how to make heads or tails of it,

know how to spin it, know how write a few internal, in the system, internal

calculations or variables, things like that, and so we pull down the data and we

kind of of come up with some, ya know basic results that’s why we have the

volume of analytics being done by contractors. In my eyes there could be so

much more achieved if, if the knowledge base or the skill set were to grow.

Analyst 4 Data science, I believe there is a specific need at least on an interim basis as we

transition from all they siloed data systems to data lakes, cloud based. Getting the

skills to be able to that and to actually do it is time consuming. We are still in the

so very beginning of text analysis.

Analyst 5 Now we do have one analyst that was able to add a simulation package.

So we will be able to build some simulation models there and use those. So bit in

pieces we lurch forward. Visualization of findings, right now we are slapping

together slide decks, sometime with 200 slides in them.

Analyst 10 I don’t really know the difference between what they would consider a data

scientist or an operational research analyst. To me they are doing the same thing,

you are diving for data, you are looking for data, you are trying to analyze it or

use it to analyze in order to make impact decisions for problems or for systems. I

don’t that the title really makes much of a difference other than the fact that

operations research analyst are predominately in financial.

Analyst 11 We have the 1515 job series, operations research analysts, so those people are

very valuable they are really, when you talk about a data scientist that kind of

what I think of that person would be so we don’t have very many people with that

skill set I think that if we grew that skill set so that, for example I don’t think we

have, we don’t have a 1515 in LGX or I believe in LGA. Yes, and I also believe

that we have a lot of analysts that could easily be trained with those additional

skill sets, and I would argue that they would make the better one because they’ve

got the experience in that area, whatever that area is. I would say we do need

some more analysts and probably a data scientist at a low enough level that they

can train others to increase their skills sets would be really good.

Data science skills theme. Davenport and Patil (2012) proposed data scientists are

experts at gleaning actionable information from massive amounts of data. Data scientists use

traditional science, math, and statistics coupled with modern software and analysis techniques to

turn raw data into actionable information. Data science is a combination of business engineering

and business domain expertise, data mining, statistics, and computer science along with

advanced predictive capabilities such as machine learning (Granville, 2014). The participants

agreed with scholarly views of the perceived data science skills and unanimously agreed that the

perceived data science skills are immature within the BZC. Six analysts agreed that data science is a

unique role beyond that of a traditional analyst, two analysts suggested the role of the data

scientist does not have to be unique, and three analysts were unsure. Additionally, the participants

acknowledged that there is no data science occupation within the Federal OPM job structure, and

they expressed that there are very few analysts within the BZC with the complete range of the

perceived data science skills. Several analysts indicated that the operations research analyst is

the occupation most closely related to a data scientist, and several participants submitted that

growing data scientists from the existing analytical workforce would be the most effective

approach. Additionally, four sub-themes emerged from the data collection and analysis: access to

software, access to training, competition for talent, and domains.

Additional open-ended interview questions were posed to the BZC analysts to continue

exploring the maturity of data science skills and the utilization of software tools to support data

analysis within the organization. Further questions explored the use of

common data science software tools to gain insights into the accessibility and utilization of these

tools by BZC analysts. The participants’ responses are provided in Table 20.

Interview questions posed:

What are the data science skills that are used by BZC analysts? How evolved are the data science

skills within the BZC? Are BZC analysts able to access and use mathematical languages and

open-source tools such as R and Python®?

The complete list of initial interview questions is provided in Appendix A.

Table 20

Analysts’ Responses to Data Science Skills and Analysis Software Questions

Participant Comment

Analyst 1 We have some tools that are out there, for example a thing called LIMS-EV.

Analyst 2 We use Access and Excel. I’m also using Minitab and it’s only because that what

we have licenses for, for something that’s a real stats program and has of lot of

these built in functions. At this point we have R installed, we don’t have R studio,

I’m not much of a programmer and everything that I’m looking there seems to be

nothing that’s GUI based

Analyst 4 We have access to R, but not the most current version. Either the licenses aren’t

renewed in the case of Arena or there is something else better that comes along.

So we end up losing our skills.

Analyst 8 We have base R but we are not allowed to install any of the packages that people

create for it. Access to software is one of the biggest things.

Analyst 10 Now we’ve been trying to also get Tableau, because right now all we have is

excel and we don’t even have the analysis took pack, so everything is hand done.

They took it out and I called and ask them to put it back in because we were

trying to run regression on something and they said no. It was no longer allowed,

it caused a security issue and we couldn’t have it, and that’s all we were told. So

that has probably been one of our biggest issues for the air force all together is IT

constraints and we did a huge study on IT constraints and how much that impacts

our day to day. IT is definitely our biggest issue and it’s not just the software but

IT alone. We can purchase a software license but by the time things go through

contracting the one that we are trying to purchase will be outdated and then we

have to go through and it’s so challenging to get it through and we’ve tried to go

through different avenues to get a quicker process but it’s been an ongoing issue.

I’m having to do a cost comparison for my own position, to contract it out to

MERC for them to do analysis because the air force will not provide me with the

software to do it myself. It becomes concerning, because then where am I going

to go, what am I going to do, I know thought for a fact the marine corps and the

army are in dire need of analysts.

Analyst 11 I will use excel and do my analysis based on that and using my 41 years of

experience with maintenance and most of it has been in maintenance although I

have worked supply chain and program offices as well. The fact that there is a ton

more data available and other tools that they could use to do better analysis, they

are either not trained in it, they don’t know how to do it, their bosses don’t

request that or require it so we lose out on a lot of opportunity.

Access to software theme. Common themes regarding the skills required of data

scientists include proficiency with advanced and, in many cases, open-source statistical software

such as R and Python®. These applications lend themselves to another common characteristic of

the perceived data scientist, which is that they will serve the organization best if they can explore

open-ended questions (Davenport & Dyché, 2013). Fundamentally, personnel in most

organizations are able to analyze only a small subset of their collected data, constrained by the

analytics and algorithms of desktop software solutions with modest capability (Shah et al.,

2012). The analysts’ responses to the interview questions were coded using the (TE) initial code

aligned with the conceptual framework. The analysis of the collected data suggests there are

some sections of the BZC leveraging advanced analytical software. However, the collected data

suggest the BZC has limited advanced analytical software available to most analysts.

Information technology policies appeared as a significant constraint preventing access to modern

analytical software.

Several interview questions were posed to the participants to explore the role of data

science at the BZC, the data science skills that are used by the BZC, and the data science training

available to BZC analysts to answer the research question on how evolved the data science skills,

processes, and software tools are at the BZC. Questions were posed to explore how participants

receive training and the maturity of this training as compared to the perceived data science skill

requirements. The participants’ responses are provided in Table 21.

Interview questions posed:

How evolved are the data science skills within the BZC? Do analysts receive data science

training? How do analysts get trained within the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 21

Analysts’ Responses to Training Related Questions

Participant Comment

Analyst 2 There’s no formalized training, they’ve been having people go through the Army

ORSAMAC School, but that’s just an introduction. They have occasional classes

that, most are AFIT classes, which is what the Air Force calls it. Most of those

require you to be a resident to do that, they have occasional training classes that

we’ve seen with the local colleges or something else. A lot of things that we do

are self-study.

Analyst 3 There is no training to do any kind of analytics. A lot of it is just assume, because

we do a lot of promotion within and so we just assume they are capable of doing

what the job is asking for. No, No. Now don’t get me wrong I think if we wanted

that, if somebody, if I wanted to pursue that, I think my organization would be in

support of it and they would concur with that and approve it, but it’s just not

something we sought to do.

Analyst 4 We sort of feed on each other, it’s not a formalized training program.

Analyst 8 We have base R but we are not allowed to install any of the packages that people

create for it. Access to software is one of the biggest things.

Analyst 10 So there aren’t just a lot of training opportunities that are given to us, I’m not on

an APDP coded position anymore.

Analyst 11 The fact that there is a ton more data available and other tools that they could use

to do better analysis, they are either not trained in it, they don’t know how to do

it, their bosses don’t request that or require it so we lose out on a lot of

opportunity. The truthful answer is, we don’t get any

Access to training theme. The responses were coded using the (P) initial code aligned

with the conceptual framework. The analysis of the collected data suggests the data science skills

of civilian analysts are immature at the BZC. The participants expressed there are very few

analyst training opportunities and even fewer training opportunities related to the perceived data

science skills. Some of the participants explained that they are fully qualified and meeting their

OPM job series requirement but acknowledged their OPM occupational requirements do not

require data science skills training. Additionally, several analysts indicated they have been able

to complete modest levels of data science training through web-based instruction. One analyst

stationed at Wright-Patterson Air Force Base indicated analysts who are stationed at this location

have access to the Air Force Institute of Technology (AFIT) and could acquire data science-related

training without tuition cost to the individual. The participants submitted that the BZC has

successfully sent analysts to other services to receive data science-related training and that there is a

significant amount of self-study taking place using common websites such as YouTube and

Google.

A thematic element in the scholarly literature that supported this research suggests the

DOD will have to compete for scarce data science talent (Géczy, 2015). BZC participants were

posed questions to further investigate the maturity of data science and the perceived shortfall and

competition for analytical talent. The participants’ responses are provided in Table 22.

Interview questions posed:

How evolved are the data science skills within the BZC? Do you have to compete for data science

talent? Do you have enough data scientists?

The complete list of initial interview questions is provided in Appendix A.

Table 22

Analysts’ Responses to Data Scientists Scarcity Questions

Participant Comment

Analyst 1 It’s hard to find people that have all of those skills.

Analyst 5 Our interns are getting emails from headhunters looking for analysts and the

starting salaries are twice or better than what we are paying them, those double

salary packages are going to be very attractive as soon as their obligation periods

are over.

Analyst 6 We can’t hire people fast enough.

Analyst 7 The whole issues of getting people hired into the government is typically slow

and all those other things that compounds this whole problem.

Analyst 10 I don’t know about the other branches but the air force is so far behind and I fear

that it is making it difficult to and I think it is deterring a lot of analysts away. It is

impossible for us to do the work, so they are like giving us busy work and we’re

not able to actually do what were trained to do, what went to school to do, and

what we want to do. I mean honestly I’ve really considered going out into

industry and see what’s out there, only because we are so constrained it makes it

almost impossible to do our jobs and to support how much we should be

supporting and its unfortunate we can’t get the air force to see that. It’s a huge

growing industry and we need a lot more people with the experience, I think that

is one of the problems that we’ve had here is finding people that meet the criteria

and have the right education and experience to fill the positions to help us with

these problems that we are having but I think training and trying to get out the

message that analysts and ops research analysts are a way to go forward to help

with our DOD.

Analyst 11 So I think if you try to bring them in from the outside with those skills, yes it’s

hard to keep them, I think that if we, I think we try to develop these particular

skills in the people that we currently have, maybe, I can think of people in my

different organizations that were really good at analyzing with the simple tools

that they had and if they were given some additional training and classes how

awesome they could be. I think we need more analysts.

Competing for talent theme. Géczy (2015) suggested there is a significant shortfall of

analytical professionals within the commercial sector and the DOD, and this shortfall is expected

to grow. Finding and retaining analysts who are capable of gleaning actionable information

from big data intelligence is a challenge confronting our military, and these experts are in short

supply (Edwards, 2014). Schneider, Lyle, and Murphy (2015) suggested that incentivizing analysts to

remain loyal to the DOD may be one of the most significant challenges the DOD will face with

big data analysis. Davenport and Dyché (2013) suggested the most likely avenue for

organizations to develop analytical talent is to cultivate new talent from existing

analytical groups. The analysts’ responses to the interview questions were coded using the (P)

initial code aligned with the conceptual framework. The results of the exploration suggest the

BZC has experienced some success in attracting analysts in some locations but is also

experiencing difficulties in attracting this talent. The participants expressed concern that their

people are being sought after by competing industries and that the process for bringing new hires

into their organization is too slow.

BZC participants were posed questions to further investigate the maturity of data science,

the perceived skills required, and the roles of a data scientist. The researcher explained scholarly

definitions of data scientists and solicited responses from the analysts. The participants’

responses are provided in Table 23.

Interview questions posed:

How evolved are the data science skills within the BZC? What skills are required of BZC analysts?

Are data scientists people with distinct skill requirements beyond traditional analysts?

The complete list of initial interview questions is provided in Appendix A.

Table 23

Analysts’ Responses to Data Scientists Skills and Roles

Participant Comment

Analyst 1 You have to be able to check the data that you are pulling and that comes from

experience as well if something doesn’t look right it’s probably not right so you

have to be able to do the math, is the program actually giving you the correct

numbers, sometimes you have to do that. The ideal candidate has that experience

in the supply chain and also has critical thinking and analysis skills.

Analyst 2 A lot of the guys they are right out of school, they don’t know how to apply a

theoretical model, they don’t realize that real world the data is not as clear cut. I

think mentoring would be something. We need a lot of people who are trained as

just analysts, I’m mean you can learn the rest of the stuff, you can find someone

to program or something, but you need someone who can go an solve problems

and track it to ground and get some actual viable movement so they can see that

there is a change.

Analyst 3 We’ve got the one person in our organization, he’s kinda like the most dangerous

guy, because not only does he understand the data, he understands how it all

works and he knows how to program and he has a degree in statistics. There’s

probably I think six others that fill, quote, unquote, analyst role and none of us

are true analysts, we just are people that kind of know the supply chain, know

how to pull down data, know how to make heads or tails of it, know how to spin

it, know how write a few internal, in the system, internal calculations or variables,

things like that, and so we pull down the data and we kind of come up with basic

results.

Analyst 5 The long term vision is they’ll extract the data and they will hand it over to an

operations research analyst that is specially trained in analysis techniques as

opposed to data science techniques. We need more data scientist folks to go out

and collect that data and feed it to us.

Analyst 11 I would think as LG we should definitely have like one per division and we are

supposed to be integrating everything for the entire BZC and yet we don’t have

some 1515s to help us with our analysis because what will happen I’ll spend, I

might spend 5 days analyzing data to come up with some results or whatever that

because I don’t have the skills that a 1515 has they might be able to do the same

thing in four or five hours that’s taking me four or five days and so we lose a lot

in that and could just even be that maybe we just have some small training

sessions, here is how you do pivot tables. Yes, and I also believe that we have a

lot of analysts that could easily be trained with those additional skill sets, and I

would argue that they would make the better one because they’ve got the

experience in that area, whatever that area. So I would say we do need some more

analysts and probably a data scientist at a low enough level that they can train

others to increase their skills sets would be really good.

Domains theme. A common theme in data science research suggests that for data

scientists to generate business value, they will need to work closely with domain experts in the

organization. Creating collaboration between the business domain experts and the data scientists

should be a foundational requirement before starting a data science project (Viaene, 2013).

Granville (2014) suggested data science is a combination of business engineering and business

domain expertise, data mining, statistics, and computer science, and advanced predictive

capabilities such as machine learning. Data science is bringing many processes, techniques, and

methodologies together with a business vision to drive actionable insights (Granville, 2014). The

responses to the interview questions were coded using the (P) initial code aligned with the

conceptual framework. The participants offered their perceptions regarding the data science role

within DOD organizations and the importance of data science and business domain connections.

Some participants proposed that data scientists should be proficient in the business domain, while

other participants suggested data scientists could serve the business best by conducting the

advanced analysis and then providing the results to a business domain analyst.

Focus Group Interview Analysis and Results

The transcribed focus group interview was loaded into NVivo-11® and was coded to the

initial parent nodes aligned with the conceptual framework. After the initial coding and analysis

of the transcribed focus group interview a word frequency query was used in NVivo-11® to

generate Figure 10. The word data was removed from all word frequency queries because it was

overwhelmingly used.

Figure 10. Initial management focus group interview word frequency diagram.

The initial analysis of the focus group interview suggests early themes of metrics,

analysts’ skills, tools, information systems, and performing as seen in Figure 10. The word

frequency query was then modified to display only the fifteen most used words by the managers

to identify early themes. This additional query still demonstrated early themes of metrics,

analysts’ skills, tools, information systems, and performing but additional early themes of

predictive, analysts, processes, computers, and business emerged as seen in Figure 11.

Figure 11. Refined management focus group interview word frequency diagram.

The same initial open-ended interview questions that were posed to the analysts were

asked of the focus group participants to further explore the research questions on how the BZC

gleans actionable information from big data sets and how mature the data science skills,

processes, and software tools used by BZC analysts are. The interview questions were designed to

gain a deeper understanding of how BZC analysts conduct analysis, the participants’ perceptions

of big data, challenges associated with conducting data analysis, the software tools used to

conduct data analysis, training options for analysts, and their perceptions of data science. All of

the themes that were generated from interviews with the BZC analysts were also supported by

the focus group participants with the exception of the management theme. The collected data

from the management focus group did not present a theme of management as a constraining

factor to big data analysis.

Research Question #1: How does the Bravo Zulu Center glean actionable information

from big data sets?

The management focus group participants were asked initial open-ended questions

investigating if the BZC is experiencing the big data phenomenon, the perceived benefits and

liabilities of big data, and their conceptions about the term big data. The responses provided

insights about the concept of big data, data growth and the ability of the BZC to analyze large

data sets. The participants’ responses are provided in Table 24.

Interview questions posed regarding big data:

How do you define big data? What increases of digital data (big data) have you witnessed and

how has it impacted the business of the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 24

Managers’ Responses to Questions about Big Data

Focus Group Comment

Participant The term big data by itself I think has a lot of different meanings depending on

who you talk to, if you connect it with something it takes on a new meaning like

big data analytics, but big data my understanding of it, it’s these large data sets of

structured data or unstructured data but again back to the volume of it, it’s so big

maybe traditional tools that you have don’t allow you to take advantage of all that

information that is there, available to you.

Participant So while we recognize that we’ve had big data it has always been from a different

aperture or different perspective and which we have applied the analytics. I think

that we are maturing our conceptualization of big data and with at least the

logistic space we are recognizing that is an enterprise asset and we are moving the

kind of corporation in that direction at least from a logistics perspective.

Participant There is a realm of methods used for the predictive we are sitting on a significant

volume of data that I would call big data in the sense it is from different sources,

different types, structured, un-structured ect., that we could use to do relational

analysis and form the basis for predictive and potentially prescriptive.

Participant So we actually collect that data, I would love to say it is in big data warehouses

but that implies a much more elegant solution that I think we currently have in the

BZC. We are looking upgrading many of those systems but to date many of them

are old systems written in COBOL, that sort of language, but they collect the

data, they are standard ways to analyze it, standard ways it is presented to

material managers and shop planners.

Participant So I think big data, to blunt and honest is kind of a buzzword right now, that we

have been doing some of that for years, we just haven’t given it this fancy title,

but we have been predicting what we are going to need years in advance for as

long as I have been in the air force.

Big data theme. As expected, the interviews with the BZC managers provided insights

about data growth and the ability of the BZC to collect and analyze large data sets. The

open-ended interview questions were designed to explore if the BZC is experiencing a big data

phenomenon, the perceived benefits and liabilities of big data, and their conceptions about big

data. By coding and analyzing the transcripts from the focus group interview through the (MI)

initial code regarding big data, thematic elements common in the literature review were revealed.

The BZC is a complex organization with many disparate data systems generating large data sets.

The managers recognized benefits and challenges with analyzing their big data sets and one

participant described big data as a buzzword.

The managers that participated in the focus group were posed questions that further

explored how the BZC gleans actionable information from big data sets. The participants were

asked to explain how data is used within the BZC to meet mission requirements. The participants

were also posed open-ended questions that explored any dependencies on data. The

participants’ responses are provided in Table 25.

Interview questions posed regarding data usage:

How is data used in your organization to meet mission requirements? What are some areas in

your organization that are dependent on data?

The complete list of initial interview questions is provided in Appendix A.

Table 25

Managers’ Responses to Data Usage Questions

Focus Group Comment

Participant His division is really the keeper in the BZC for performance metrics and how we

apply standard metrics. We use those metrics to access performance and then we

use them in planning as well.

Participant I’ll say corporate business processes we have these metrics as well, so they are

throughout the complex.

Participant Let me just add, so we also use the warfighter metrics too, we use operational

performance of how our systems are performing. We have a whole series of

readiness metrics just like you guys use in the navy, which are outcome metrics

but those drive our planning processes too, so it’s operational metrics, it’s our

supply chain performance metrics, it’s our operations and production

management metrics, there is a whole series, training metrics, you name it, we

use that data to measure our performance and understand where problems are,

that’s what metrics do, they tell you story and help you reveal where you have

gaps and shortfalls that you need to address.

Participant We are talking requirement type metrics, our systems actually track it through the

base supply system, we track how often it is ordered, we compare that to the

flying hour program and then we determine how often that item is used per flying

hour and then how many flying hours we are projected to fly.

Participant One of things , we have a whole host of data solutions to kind of piggy back on

what Mr… is saying, we have one that is kind of business intelligence and an

enterprise data warehouse the pulls raw data and then applies business rules the

cleanse that data and do a presentation layer so that people can have standard

performance metrics in near real time or as the data projects but in the case of

Mr…. operation you get large data sets that are pulled from legacy systems and

then analyzed to present the metrics on performance.

Participant We are about to get started with looking at some commercial platforms that are

available, for example looking at some of our outcome metrics and even some of

the, all of the outcome metrics are lagging, some are less lagging than others and

looking for patterns within that to enable us to have some of the leading health

indicator constructs, that’s going to be a couple of six month projects that are

going to kick off in the next month.

Participant That can be something fundamental in understanding data, there is a tendency to

reach out for a single metric and when in fact it’s typically a sequence of events.

Metrics theme. The responses to the interview questions posed to the managers

regarding how data is used within the BZC were coded using the (MI) initial code aligned with

the conceptual framework. The managers that participated in the focus group interview

expressed the importance of gleaning actionable information from large data sets. The managers

provided several examples of how BZC managers use data and metrics throughout the

organization to make crucial business decisions. The managers expressed that metrics are a key

output from the data analysts within the BZC and an important aspect of managing the business

of the BZC.
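
The requirement metric one manager described in Table 25 (how often an item is ordered, compared to the flying hour program, and then applied to projected flying hours) can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the function names and all figures are hypothetical, and the sketch does not represent the logic of any BZC or Air Force system.

```python
def usage_rate_per_flying_hour(units_ordered, flying_hours_flown):
    """Historical demand rate: units ordered per flying hour flown."""
    if flying_hours_flown <= 0:
        raise ValueError("flying_hours_flown must be positive")
    return units_ordered / flying_hours_flown


def projected_requirement(units_ordered, flying_hours_flown, projected_flying_hours):
    """Projected demand: historical usage rate times projected flying hours."""
    rate = usage_rate_per_flying_hour(units_ordered, flying_hours_flown)
    return rate * projected_flying_hours


# Hypothetical figures for illustration only (not study data).
estimate = projected_requirement(
    units_ordered=40,
    flying_hours_flown=10_000,
    projected_flying_hours=12_000,
)
print(f"Projected units required: {estimate:.1f}")
```

A calculation of this kind extrapolates past usage forward at a constant rate, which fits the managers' description of metrics built on historical data.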

The management participants were asked initial open-ended questions that continued to

explore how the BZC gleans actionable information from big data sets and associated challenges.

The participants were asked to explain the challenges in gleaning actionable information from

big data sets. The participants’ responses are provided in Table 26.

Interview question posed regarding big data analysis challenges:

What are some of the significant challenges associated with conducting data analysis in your

organization?

The complete list of initial interview questions is provided in Appendix A.

Table 26

Managers’ Responses to Questions Regarding Data Analysis Challenges

Focus Group Comment

Participant One of the major challenges Roy will be perhaps as we move into big data, right

now we have had a lot of segmented data that we mentioned before and so how

do we integrate that and how to we keep the integrity of that data so that we when

we start to do the big data analytics we’re doing it from a clear and concise

enterprise perspective that has data integrity from inception all the way through

the analysis phase. I think that is one of the big challenges that we are going to

have, because we have such segmented data, because we have so many legacy

systems that produce that data.

Participant We also have a challenge just in data creation, a lot of our systems are relying on

that airman typically a mechanic out in the field who has to put in what he did to

fix the part so we can create our models. The integrity piece is a continuous

challenge and will be no matter what analytical tool you apply.

Participant Access is another one ran into has a problem, getting access to the data, you know

it goes back to what Mr. xxx said about who owns the data, people allowing you

to see their data, you could have better decision support if you have access to

certain data, but getting that access is often difficult from the person who controls

it so that is a challenge.

Access to quality data theme. Watson and Marjanovic (2013) suggested that the challenges

of capitalizing on big data include accessing data through appropriate platforms and providing

data governance. By coding and analyzing the transcripts from the focus group interview through

the (MI) initial code and asking open-ended questions regarding how the BZC gleans actionable

information from big data sets, access to quality data emerged as a theme. The management

participants shared the concerns expressed by the analyst participants, identifying access to

quality data as a theme that is currently constraining big data analytics at the BZC.

To further explore the challenges associated with conducting big data analysis within

the BZC, the researcher asked the focus group participants to further expound on constraints to big

data analysis. The management focus group participants provided additional responses as seen in

Table 27.

Interview questions posed:

What are some of the significant challenges associated with conducting data analysis in your

organization?

The complete list of initial interview questions is provided in Appendix A.

Table 27

Managers’ Additional Responses to Data Analysis Challenges

Focus Group Comment

Participant I would love to say it is in big data warehouses but that implies a much more

elegant solution that I think we currently have in the BZC. We are looking

upgrading many of those systems but to date many of them are old systems

written in COBOL, that sort of language, but they collect the data, we don’t have

big enterprise data warehouse for logistics, I think that we are moving into that

space as some of the previous comments stated for the most part it is de-

centralized and it’s kind of adhoc based on the mission needs of the organization

that is applying those systems.

Participant We have had a lot of segmented data that we mentioned before and so how do we

integrate that and how to we keep the integrity of that data so that we when we

start to do the big data analytics we’re doing it from a clear and concise enterprise

perspective that has data integrity from inception all the way through the analysis

phase.

Participant I think that is one of the big challenges that we are going to have, because we

have such segmented data, because we have so many legacy systems that produce

that data.

Participant If we can truly take advantage of the capacity and processing that potentially exist

in a cloud environment I think that would be huge and it might allow us to

actually use some of the tools that maybe are better fit in that environment then

the single site license for an individual computer, we have an air force license that

allows us to truly do analysis in the cloud.

Participant We don’t have a big enterprise data warehouse for logistics. For the most part it is

de-centralized and it’s kind of adhoc based on the mission needs of the

organization that is applying those systems.

Participant Warehousing data and we keep hearing like migration to a cloud environment and

so in my little world here from our perspective if we ever get to a true cloud

environment where all the data is available to everyone.

Infrastructure: Legacy and disparate systems theme. Edward (2014) suggested that

analyzing big data within the DOD requires the aggregation of many data sources

from hundreds of organizations, which in turn requires defining data-sharing legal, policy, oversight, and

compliance standards. The focus group responses were coded using the (CD)

initial code aligned with the conceptual framework. The participants of the management focus

group expressed opinions very similar to those of the analysts. The BZC has sections of its business

with modern computer infrastructure and analysis capabilities, but its ability to conduct

enterprise big data analysis is constrained by the availability of

information systems and infrastructure and by its many disparate systems.

To further explore the research question of how the BZC gleans actionable information

from big data sets, the management participants were posed questions exploring how

data is used within the BZC to meet mission requirements and how BZC employees conduct

data analysis. Additionally, sub-questions were posed to the participants to determine how

evolved the BZC is in their ability to build predictive and prescriptive metrics and models. The

participants’ responses are provided in Table 28.

Interview questions posed:

How is data used in your organization to meet mission requirements? How do BZC analysts

glean actionable information from big data sets?

The complete list of initial interview questions is provided in Appendix A.

Table 28

Managers’ Responses to Data Usage and Data Analysis Questions

Focus Group Comment

Participant I‘ll start with the stubby pencil because we still have some of the manual

calculations where we are pulling data from requirements from a simple data call

all the way into systems that we are trying to implement tools that are available

now that can do some of what you are getting at, the big data analytics to actually

automatically set some business intelligence rules up so that we take the human

out of the loop. We really need AI to help us probe that in a faster manner to find

those patterns so that we can do more exception based management, train the

software to really speed up our decision process. I’ve seen that continuum as part

of the data science maturity getting from like to said from reactive to predictive to

prescriptive effectivity, I think we are probably pretty good at the reactive piece.

Participant There is a realm of methods used for the predictive we are sitting on a significant

volume of data that I would call big data in the sense it is from different sources,

different types, structured, un-structured ect., that we could use to do relational

analysis and form the basis for predictive and potentially prescriptive

Participant Vendors are out there who are putting together some views for us that will allow

us to be, write algorithms that will help us to be more predictive but we are really

just tipping our toe in that space right now, as you know if you have been

researching there is a variety of companies who have different levels of maturity

and abilities to do these and make these relationships to tell you and actually

allow you to be predictive and prescriptive.

Participant The Air Force in the past year has embraced the strategy of predictive

maintenance even though we have had policy for a number of years where we are

taking our data from our authoritative maintenance sources, we are using the data,

performance data that we are pulling off aircraft or other weapon systems and we

are using both sets to help us understand performance and manage the health of

the systems so that we can get more predictive and understanding failure and be

able to have parts available ahead of time.

Participant I wouldn’t say they are particularly predictive in really takes humans

understanding and interpreting the data and trying to make decisions, we haven’t

gotten into the machine learning stages yet, where those patterns build and then

we can program certain views and certain I’ll call them vignettes that allow us to

try and get ahead of trends that we believe are going to happen.

Participant I think to some extent we sale ourselves short as an air force, big data they always

tell me they can predict something, I would tell you or D200 system has looked at

the past history of our usage and we predict two years out what they air force is

going to need and prepare our depot shops to repair that, whether it’s a great

prediction or not it’s probably about as good as any you will find in industry

Data analysis processes theme. Much of the expectation involved in big data analysis is

the continued desire by company and DOD leaders to move from reactionary metrics based on

historical data to predictive and prescriptive metrics that may be possible with big data analysis.

Research on big data and data science suggests the ability to locate hidden facts, indicators, and

relationships immersed in big data sets not yet explored (Chen et al., 2012). Interview questions

were posed to the management participants regarding what processes and methods are used by

BZC analysts to glean actionable information from big data sets. The questions explored how

mature and effective the analytical processes are in their organization and the maturity of their

predictive analytical capabilities. The responses were coded and analyzed through the (MM)

initial code aligned with the conceptual framework. The analysis of the collected data suggests

the BZC is mostly building and analyzing reactive metrics on historical data with small pockets

of predictive analytical capability. Additionally, many of the data analysis processes are manual

processes reliant upon pulling data from many disparate data warehouses and analyzing the data

in basic analysis software.

Further exploring how BZC gleans actionable information from big data sets and the

challenges associated with conducting big data analysis the management participants provided

input regarding organizational structure and the culture within the BZC. The participants’

responses are provided in Table 29.

Interview questions posed:

What are some of the significant challenges associated with conducting data analysis in your

organization? How are analysts employed and aligned in your organization?

The complete list of initial interview questions is provided in Appendix A.

Table 29

Managers’ Responses to Analysis Challenges

Focus Group Comment

Participant If we had data scientists and they could do these big Uber computations on big

data and we had kind of the infrastructure I guess the fundamental question is

where would they reside to give the most value to the enterprise whatever that

enterprise is defined as, and what is the hierarchal structure, the relationships with

all the corresponding analysis that goes down all the way to, kind of the squadron

level, so I think fundamentally we have to organize ourselves to effectively utilize

data not just have the capacity to analyze and collect data.

Organizational structure and culture theme. Similar to the responses provided by the

analysts who participated in the research, a theme of BZC organization and culture was apparent

within the focus group responses. Gabel and Tokarski (2014) suggested that for organizations to

harvest actionable information from big data sets, they must deliberately alter many facets

of organization design and human resource management. Harris and Mehrotra (2014)

advocated that senior management will need to learn how best to employ and manage data scientists.

Research Question #2: How mature are the data science analytical skills, processes, and

software tools used by Bravo Zulu Center analysts?

The managers that participated were posed open-ended questions investigating the

maturity level of analytical skills, processes, and software that are used within the BZC. The

initial open-ended questions were designed by the researcher to explore the skills required to be

an effective analyst within the BZC as perceived by the participants. The initial open-ended

questions investigated if there are perceived data science skills being used by BZC analysts and

the maturity of those skills. The participants’ responses are provided in Table 30.

Interview questions posed:

What are some knowledge, skills, and abilities needed to be an effective data scientist? What are

the data science skills that are used by BZC analysts? How evolved are the data science skills

within the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 30

Managers’ Responses to Data Science Skills Questions

Focus Group Comment

Participant This is Mr… I guess if you use the definition that you used where the person is

skill in all those areas as well as knowledgeable in the data they are handling

that’s a hard thing to groom or to grow if you are talking the technical aspect of

it, I think you almost back to the computer scientist, the 1550 type folks, so I

don’t know if you use the definition that you put to us earlier, that would be a

hard one, even if you had it I don’t know if you would even find qualified

candidates to fill it. That broad of a skill set that they need.

Participant Our data scientist if you will, we found him from the software group here, but I

agree with your definition the data scientist also has to understand the data and

we are probably in the same boat as every other organization where we rely on

SMEs but we have found some online tools like pluralsite and data camp and

where it is almost like a youtube type training so we can get real time training or

honestly people google things, I want to write script to do this and we google it

and we find an example of code like that and then we incorporate that code so a

lot of ours is truly learning on the fly or as a need presents itself figuring out who

else has done it and just kind of borrow from them, out.

Participant I have a group of analysts, operations research analysts that work for me, they are

very skilled in the model and very skilled in the math and to be honest they are

very well booked learned but they have no idea what the data is presenting to

them unless we have a senior logistician or someone who has been out on a flight

line or in a depot shop tell them what it means, they are good people and they

will learn it over time, but my particular shop is quite young they have all of

those skills but they don’t have any background on how to interpret the results.

Data science skills theme. Data scientists use traditional science, math, and statistics

coupled with modern software and analysis techniques to turn raw data into actionable

information. Data science is a combination of business engineering and business domain

expertise, data mining, statistics, and computer science along with advanced predictive

capabilities such as machine learning (Granville, 2014). The focus group participants

acknowledged the growing data science occupation in the commercial sector and the importance

of maturing the data science skills within the BZC. The participants agreed with the scholarly

definitions of a data scientist and that data science is a unique role beyond that of a BZC

traditional analyst. One focus group participant stressed that data science includes business

domain understanding.

Open-ended interview questions were posed to the BZC managers that continued to

explore the maturity of data science skills and the utilization of software tools to support data

analysis within the organization. Additional questions were posed that explored the use of

common data science software tools to gain insights into the accessibility and utilization of these

tools by BZC analysts. The participants’ responses are provided in Table 31.

Interview questions posed:

What are the data science skills that are used by BZC analysts? How evolved are the data science

skills within the BZC? Are BZC analysts able to access and use mathematical languages and open

source tools such as R and Python®?

The complete list of initial interview questions is provided in Appendix A.

Table 31

Managers’ Responses to Data Science Skills and Analysis Software Questions

Focus Group Comment

Participant There is a spectrum here, there is dashboards, there’s tools that we have that have

an automated presentation layer that I can go and pull up certain metrics and it

will tell me status, particular readiness status, parts status, we are trying to get

into the space. We are finding those is a lot of those tools that they are being

taught on are not usage within the DOD environment because we can’t get them

inside the fence.

Participant So we are using older versions of the tools or we are not even able to access those

tools so we are still doing things, these students basically have to go learn how to

use Access, because Access is not being taught anymore in school, we are past

that point and Access has such a limited space constraint to it that we have to do

iterative type analysis to actually compile the data and make it usable.

Participant Those are the things that I was alluding to where we have R but it is five versions

removed or Python we are still trying to crack the code on how to get it and the

libraries that are needed to actually make it usable. How do I get some of this

software loaded and behind the firewall without taking 24 months?

Access to software theme. Common themes regarding the skills required of data

scientists include the use of advanced and, in many cases, open source statistical software such as R and

Python®. These applications lend themselves to another common characteristic of the perceived

data scientist, and that is they will serve the organization best if they can explore open-ended

questions (Davenport & Dyché, 2013). The responses provided by the management focus group

regarding data science skills and analysis software were coded using the (TE) initial code aligned

with the conceptual framework. The analysis of the collected data suggests that some sections

of the BZC are leveraging advanced analytical software. However, the collected data also suggest that

most BZC analysts have limited access to advanced analytical software. Information

technology policies appeared as a significant constraint preventing access to modern analytical

software.

Several interview questions were posed to the managers to explore the role of data

science at the BZC, the data science skills that are used by the BZC, and the data science training

available to BZC analysts to answer the research question on how evolved the data science skills,

processes, and software tools are at the BZC. Questions were posed to explore how participants

receive training and the maturity of this training as compared to the perceived data science skill

requirements. The participants’ responses are provided in Table 32.

Interview questions posed:

How evolved are the data science skills within the BZC? Do analysts receive data science

training? How do analysts get trained within the BZC?

The complete list of initial interview questions is provided in Appendix A.

Table 32

Managers’ Responses to Training Related Questions

Focus Group Comment

Participant Our data scientist if you will we found him from the software group here, but I

agree with your definition the data scientist also has to understand the data and

we are probably in the same boat as every other organization where we rely on

SMEs but we have found some online tools like pluralsite and data camp and

where it is almost like a youtube type training so we can get real time training or

honestly people google things, I want to write script to do this and we google it

and we find an example of code like that and then we incorporate that code so a

lot of ours is truly learning on the fly or as a need presents itself figuring out who

else has done it and just kind of borrow from them.

Participant So this is Mr….again and Mr… you can correct me 100% but so some of the

workforce series employees, I mean a 1515 I believe is the series for an analyst

but again if I was to want a 346 who is a logistician and I need them to

understand because they are doing supply chain work what the data is telling

them I don’t as part of their development we don’t deliberately train them that

way, again there are courses out there that we, if you are dealing with that in your

day to day job that you can take, we are also looking at DAU, but this is the

challenge for career field development that we need to start moving towards

changing the competencies that we expect our SMEs to have so that it would

include these skills.

Access to training theme. The management participants supported the theme expressed

by the analysts: the BZC has limited access to data science-related training. There are very few

formal analyst training opportunities and even fewer training opportunities related to the

perceived data science skills. However, the BZC has pursued making some online training

venues available to analysts.

A thematic element in the scholarly literature that supported this research suggests the

DOD will have to compete for scarce data science talent (Géczy, 2015). BZC managers were

posed questions to further investigate the maturity of data science and the perceived shortfall and

competition for analytical talent. The participants’ responses are provided in Table 33.

Interview question posed:

How evolved are the data science skills with the BZC? Do you have to compete for data science

talent? Do you have enough data scientists?

The complete list of initial interview questions is provided in Appendix A.

Table 33

Managers’ Responses to Data Scientists Scarcity Questions

Focus Group Comment

Participant It is location specific in industry at right, I know the challenges we had when we

were trying to stand up that office it was the oil industry. I am never validated this

with any research but we could generally look at the price of a barrel of oil, if it

steadily stayed below $55 a barrel then the length of the cert got better but that is

purely my observation I didn’t write everything down, when was oil was high the

certs and the qualified applicants that I would receive to evaluate I would say was

slim pickings, over.

Participant I would agree with that in fact it’s probably harder I even have folks that have

already figured out that they can make more money even within the Department

of Defense if they go to either coast, so getting analysts to move here to BZC is a

challenge in itself, my fear is that we are going to groom these folks here and then

they are going to see they can go and become a GS14 analysts and make $20,000

dollars more, now granted there is a cost of living side to that as well but just

from a true numbers perspective the higher salaries are on the coasts they are not

out here in the middle of the country, or they are competing with the oil industry

who is paying a higher salary for those types of people.

Participant We just hired two ops research analysts and we had to go outside to do it and use

to DHA because it is a hard to fill occupation but we were able to find them here

maybe because we don’t have the oil industry and people don’t want to live on

the east coast but we were able to do it so I don’t think the pinch is quite so hard

here if you can find skill sets you can hire them but it is finding the skill sets that

is more of the problem. I would say one reason that we try to grab the interns and

bring them on and our EN office has done a really good job of that, let the folks

come in and get a flavor of it, we have several, I will say at least one that I know

that I brought in that helps with retention, they get experience out of it they get a

taste and it helps. The challenge is using them so that they have meaningful work,

there is a tendency at times for folks to say well that’s an intern let me give them

the grunt work, but if I really want that skill set it is giving them value added

work and the hard stuff so one they can know they are contributing and two it

gives them a taste of what is to come, over.

Competing for talent theme. Géczy (2015) suggested there is a significant shortfall of

analytical professionals within the commercial sector and the DOD, and this shortfall is expected

to grow. Finding and maintaining analysts who are capable of gleaning actionable information

from big data intelligence is a challenge confronting our military, and these experts are in short

supply (Edwards, 2014). Several interview questions were posed to the focus group participants

to gain their perspectives on the anticipated shortfall of analytical talent, and the responses were

coded using the (P) initial code aligned with the conceptual framework. The results of the

exploration suggest the BZC has experienced some success in attracting analysts in some

locations but is experiencing difficulties in attracting this talent in others. The participants

expressed concern about their people being sought after by competing industries and that the

process to bring new hires into their organization is too slow.

BZC managers were posed questions to further investigate the maturity of data science,

the perceived skills required, and the roles of a data scientist. The researcher explained scholarly

definitions of data scientists and solicited responses from the analysts. The participants’

responses are provided in Table 34.

Interview question posed:

How evolved are the data science skills with the BZC? What skills are required of BZC analysts?

Are data scientists people with distinct skill requirements beyond traditional analysts?

The complete list of initial interview questions is provided in Appendix A.

Table 34

Managers’ Responses to Data Scientists Skills and Roles Questions

Focus Group Comment

Participant Gone are those days where we had air force level institutions that kind of fostered

domain centric analysis capabilities and it seems now to be pushed down to the

organizational level that needs and consumes that data and makes the business

decisions for their particular business process. It is interesting to looking at the air

force in terms of, ya if we had data scientists and they could do these big Uber

computations on big data and we had kind of the infrastructure I guess the

fundamental question is where would they reside to give the most value to the

enterprise whatever that enterprise is defined as, and what is the hierarchal

structure, the relationships with all the corresponding analysis that goes down all

the way to, kind of the squadron level, so I think fundamentally we have to

organize ourselves to effectively utilize data not just have the capacity to analyze

and collect data.

Participant Have a SME who is able to do what I think eventually we want to get is where the

SME has those competencies that will make them good analysts but that is really

the future state so how do we bridge that, perhaps with data scientist and

computer scientist who are working with our SMEs using the tools that are

available.

Domains theme. A common theme in data science research suggests that for data

scientists to generate business value, they will need to work closely with domain experts in the

organization (Granville, 2014). Creating collaboration between the business domain experts and

the data scientists should be a foundational requirement before starting a data science project

(Viaene, 2013). The management participants offered their perceptions regarding the data

science role within DOD organizations and the importance of data science and business domain

connections. The management focus group offered opinions similar to those of the analysts

regarding the distinctions between data scientist skills and business domain knowledge, which

supports the domains theme. All of the responses were coded using the (P) initial code aligned with the conceptual

framework.

Bravo Zulu Center Document Analysis and Results

The BZC strategic planning document that was collected by the researcher was imported

into NVivo-11® for analysis. The content of the BZC’s strategic plan attribute #1 regarding data

accessibility was coded in alignment with the initial coding structure and conceptual framework. A

word frequency query was generated to gain a general sense of the information provided in the

BZC’s strategic plan as seen in Figure 12.

Figure 12. BZC strategic document word frequency diagram.

The analysis of the BZC’s strategic plan indicates the BZC has placed emphasis on digital, time,

agility, integration, tools, and analysis. The coding and further analysis of the BZC’s strategic

document revealed there is a BZC strategic objective to enable complete data integration and

data availability across the BZC. Within the data availability attribute of this strategic plan, there

are specific goals to make data 100% accessible and accurate by providing all required data at

the point of entry via a single entry point and by dynamically linking and integrating systems.

The strategic plan also describes the employment of the necessary tools, models, and predictive

analysis capabilities to turn raw data into useful information. Triangulation analysis of the data

collected from analyst interviews, the focus group interview, and this BZC strategic document

suggests the organization is suffering from significant data accessibility and data quality issues.

However, the review and analysis of the BZC’s strategic document suggest the organization is

aware of these shortfalls and is actively engaged in mitigating these issues.

BZC Job Announcements Document Analysis and Results

Harris and Mehrotra (2014) asserted there are distinguishable differences between

data scientists and traditional quantitative analysts, with many implications for how to

define the roles of data scientists, how to attract and train these experts, and how to get

the most value from this emerging discipline. To explore the maturity of data science skills

at the BZC, several recent job announcements were collected and analyzed.

The BZC personnel center provided recent supply analyst, program management analyst,

operations research analyst, and computer science job announcements. These job announcements

were imported into NVivo-11® and the skills and duties required of these positions were coded to

the (P) initial code and aligned with the conceptual framework. A word frequency query was

executed combining the data from all four job announcements. Words that are generic to all job

descriptions were omitted from the query. The result indicates the presence of data science skills

such as mathematics, statistics, and computer science, as seen in Figure 13.

Figure 13. BZC Job announcements word frequency diagram.
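
The word frequency query described above can be approximated outside of NVivo-11®. The following is a minimal sketch in Python and is not the NVivo procedure used in this research. The function name word_frequency, the stop-word list GENERIC_WORDS, and the sample text are hypothetical and were chosen only for illustration; they are not the actual BZC job announcements.

```python
import re
from collections import Counter

# Hypothetical stop list standing in for "words that are generic to all
# job descriptions"; the real list used in the research is not known.
GENERIC_WORDS = {"the", "and", "of", "to", "in", "a",
                 "position", "duties", "required"}


def word_frequency(documents, stop_words=GENERIC_WORDS, min_length=3):
    """Count word occurrences across all documents, omitting stop words."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep alphabetic runs only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(
            w for w in words
            if len(w) >= min_length and w not in stop_words
        )
    return counts


# Illustrative text only, not the collected job announcements.
sample_announcements = [
    "Duties: apply statistics and mathematics to supply data.",
    "Required: knowledge of computer science, mathematics, and statistics.",
]

frequencies = word_frequency(sample_announcements)
for word, count in frequencies.most_common(5):
    print(word, count)
```

Combining all documents into one counter mirrors the single query executed across the four job announcements; the stop list plays the role of the generic words that were omitted.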

To further explore the maturity of data science skills of newly hired BZC personnel, the

skills and duties sections of the recent job announcements were compared to scholarly views of

data science skills. Comparisons of the data science skills proposed by Harris, Murphy, and

Vasinman (2013) along with the specific data science software suggested by Harris and Mehrotra

(2014) to the skills required of BZC analysts and computer scientists described in the recent BZC

job announcements are provided in Tables 35 through 38.

The comparison of the skills required of the supply analyst job announcement to

scholarly views of data science skills is provided in Table 35. According to the recent supply

analyst job announcement, BZC supply analysts require the basic abilities to analyze statistical

data and apply arithmetical computations with graphical representation. There are no specific

analysis software tool, computer science, or programming requirements. A supply analyst

who is hired into the BZC requires specific supply chain domain knowledge but few specific

data science-related skills.

Table 35

Data Scientist and BZC Supply Analyst Required Skills Comparison

Columns compared: Data Scientist; Supply Analyst

Types of Data (CD)
Data Scientist: Big data, all types, including unstructured, numeric, and non-numeric data (Harris, Murphy & Vasinman, 2013)
Supply Analyst: Current statistical data

Preferred Tools (TE)
Data Scientist: Mathematical languages (such as R and Python®), machine learning, natural language processing and open-source tools (Harris & Mehrotra, 2014)
Supply Analyst: No specific software or tools

Nature of work (MI)
Data Scientist: Explore, discover, investigate and visualize (Harris, Murphy & Vasinman, 2013)
Supply Analyst: Analyze, develop, evaluate using statistical data

Methods (MM)
Data Scientist: Optimization/Visualization; Graphical models; Classical, Bayesian, Temporal, Spatial statistics; Monte Carlo Simulation; Data manipulation (Harris, Murphy & Vasinman, 2013)
Supply Analyst: Arithmetical computations; meaningful statistical data for graphic representation

Computer Science Skills (P)
Data Scientist: Programming; System administration; Back-end programming; Front-end programming (Harris & Mehrotra, 2014)
Supply Analyst: No specific computer science requirements

Typical degree
Data Scientist: Computer science, data science, symbolic systems, cognitive science.
Supply Analyst: Degree not required

The program management analyst job announcement collected and analyzed in support of

this research requires candidate employees to have specific understanding of command

operations, products, services, and knowledge of the goals of the command. This occupation at

the BZC requires knowledge and skills in applying analytical and evaluation techniques to

identify and apply analytical processes to resolve problems. The program management occupation

at the BZC serves as a broad announcement with few specific analytical requirements. Two

analysts who participated in this research explained that BZC job descriptions are not sufficiently

detailed to support the hiring of candidates with data science skills, and this is apparent in the

program management analyst job announcement collected and analyzed in support of this

research.

The comparison of the skills required of the program management analyst job

announcement to scholarly views of data science skills is provided in Table 36. According to

the recent program management analyst job announcement, BZC program management analysts

require basic skills in program management, planning, and coordinating. There are no specific

data analysis, mathematics, statistics, computer science, or programming requirements. There are

no specific analysis software requirements and a college degree is not required. According to the

list of personnel currently working as analysts at the BZC, which was used to support this research,

program management analysts make up 54% of the total analyst workforce, indicating that the

majority of the BZC analysts have no specific data science skills requirements.

Table 36

Data Scientist and BZC Program Management Analyst Required Skills Comparison

Columns compared: Data Scientist; Program Management Analyst

Types of Data (CD)
Data Scientist: Big data, all types, including unstructured, numeric, and non-numeric data (Harris, Murphy & Vasinman, 2013)
Program Management Analyst: No specific data analysis requirements

Preferred Tools (TE)
Data Scientist: Mathematical languages (such as R and Python®), machine learning, natural language processing and open-source tools (Harris & Mehrotra, 2014)
Program Management Analyst: Familiar with total quality management tools

Nature of work (MI)
Data Scientist: Explore, discover, investigate and visualize (Harris, Murphy & Vasinman, 2013)
Program Management Analyst: Develops plans and coordinates

Methods (MM)
Data Scientist: Optimization/Visualization; Graphical models; Classical, Bayesian, Temporal, Spatial statistics; Monte Carlo Simulation; Data manipulation (Harris, Murphy & Vasinman, 2013)
Program Management Analyst: No specific math or statistics requirements

Computer Science Skills (P)
Data Scientist: Programming; System administration; Back-end programming; Front-end programming (Harris & Mehrotra, 2014)
Program Management Analyst: No specific computer science requirements

Typical degree
Data Scientist: Computer science, data science, symbolic systems, cognitive science.
Program Management Analyst: Degree not required

The operations research analyst job announcement collected and analyzed in support of

this research requires candidate employees to possess the ability to conduct scientific work. BZC

analysts are required to possess the ability to design, develop and adapt mathematical, statistical,

econometric, and other methods to recommend courses of action for complex problems. This

occupation at the BZC requires knowledge and skills in applying analytical and evaluation

techniques to identify and apply analytical processes to resolve problems. According to the job

announcements, operations research analysts working at the BZC are required to work

independently on small projects and to work with other analysts on large, complex

projects. The operations research analyst occupation requires a 4-year degree from an accredited

college or university in operations research or a similar course of study with at least three to

twenty-four semester hours in calculus. The operations research analyst position announcement

analyzed in support of this research described that operations research analysts will be paired

with subject matter experts in the organization. This distinction supports the notion that the

BZC is stressing the importance of creating teams composed of domain experts and advanced

analysts.

The comparison of the skills required of the operations research analyst job

announcement to scholarly views of data science skills is provided in Table 37. According to

the recent operations research analyst job announcement, BZC operations research analysts are

required to have skills in data collection and a wide range of methods to conduct data analysis

and skills in applied mathematics. There are no specific analysis software tool, computer

science, or programming requirements. Several participants in this research expressed that the

operations research analyst occupation possesses the skills most closely related to a data scientist.

Table 37

Data Scientist and BZC Operations Research Analyst Required Skills Comparison

Columns compared: Data Scientist; Ops Research Analyst

Types of Data (CD)
Data Scientist: Big data, all types, including unstructured, numeric, and non-numeric data (Harris, Murphy & Vasinman, 2013)
Ops Research Analyst: Data collection

Preferred Tools (TE)
Data Scientist: Mathematical languages (such as R and Python®), machine learning, natural language processing and open-source tools (Harris & Mehrotra, 2014)
Ops Research Analyst: No specific software or tools

Nature of work (MI)
Data Scientist: Explore, discover, investigate and visualize (Harris, Murphy & Vasinman, 2013)
Ops Research Analyst: Wide range of methods and techniques to perform analysis

Methods (MM)
Data Scientist: Optimization/Visualization; Graphical models; Classical, Bayesian, Temporal, Spatial statistics; Monte Carlo Simulation; Data manipulation (Harris, Murphy & Vasinman, 2013)
Ops Research Analyst: Applied mathematics and statistics, no specific statistical methods

Computer Science Skills (P)
Data Scientist: Programming; System administration; Back-end programming; Front-end programming (Harris & Mehrotra, 2014)
Ops Research Analyst: No specific computer science requirements

Typical degree
Data Scientist: Computer science, data science, symbolic systems, cognitive science.
Ops Research Analyst: Ops Research or similar with specific math requirements

The computer scientist job announcement collected and analyzed in support of this

research requires candidate employees to possess expert knowledge of theories, concepts,

principles, practices, standards, methods, techniques, and materials of professional computer

science. Candidates are required to have knowledge of other technical disciplines to apply

advanced computer software, software systems, hardware architectural theories, principles of

concepts for new application development and experimental theories.

The comparison of the skills required of the computer scientist job announcement to

scholarly views of data science skills is provided in Table 38. According to the recent computer

scientist job announcement, BZC computer scientists are required to have skills in theories and

concepts of computer science to include the mathematics requirements encompassed in a

computer science bachelor’s degree. This occupation requires thirty semester hours of combined

mathematics, statistics, and computer science and a minimum of fifteen hours combining

statistics and calculus. There are no specific analysis software tool or programming

requirements. The job announcement and the collected interview data from the

BZC indicate that computer scientists are employed in many different capacities throughout the

organization.

Table 38

Data Scientist and BZC Computer Scientist Required Skills Comparison

Columns compared: Data Scientist; Computer Scientist

Types of Data (CD)
Data Scientist: Big data, all types, including unstructured, numeric, and non-numeric data (Harris, Murphy & Vasinman, 2013)
Computer Scientist: No specific data analysis requirements

Preferred Tools (TE)
Data Scientist: Mathematical languages (such as R and Python®), machine learning, natural language processing and open-source tools (Harris & Mehrotra, 2014)
Computer Scientist: No specific software or tools

Nature of work (MI)
Data Scientist: Explore, discover, investigate and visualize (Harris, Murphy & Vasinman, 2013)
Computer Scientist: Apply theories and concepts of computer science

Methods (MM)
Data Scientist: Optimization/Visualization; Graphical models; Classical, Bayesian, Temporal, Spatial statistics; Monte Carlo Simulation; Data manipulation (Harris, Murphy & Vasinman, 2013)
Computer Scientist: No specific math or statistics requirements

Computer Science Skills (P)
Data Scientist: Programming; System administration; Back-end programming; Front-end programming (Harris & Mehrotra, 2014)
Computer Scientist: Apply theories and concepts of computer science

Typical degree
Data Scientist: Computer science, data science, symbolic systems, cognitive science.
Computer Scientist: Computer science or similar with specific math requirements

The comparative analysis of the BZC job announcements to the scholarly views of data

science suggests the BZC can hire analysts with significant math, statistics, operations research,

and computer science skills through a combination of OPM occupations. This suggests there is no

single OPM occupation that encompasses data science and that a teaming approach for data science

enablement is appropriate. Neither the analyst occupations nor the computer science

occupation required any specific software knowledge.
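
The teaming argument above can be illustrated as a simple set-coverage check. The sketch below is illustrative only. The skill labels, the dictionary OCCUPATION_SKILLS, and the functions team_coverage and missing_skills are hypothetical, and the skill sets are a simplified, assumed encoding of Tables 35 through 38 and the accompanying narrative, not data produced by this research.

```python
# Assumed, simplified skill areas associated with a data scientist in this study.
DATA_SCIENCE_SKILLS = {
    "mathematics",
    "statistics",
    "computer science",
    "specific software tools",
    "business domain knowledge",
}

# Assumed encoding of what each job announcement requires (hypothetical labels).
OCCUPATION_SKILLS = {
    "supply analyst": {"statistics", "business domain knowledge"},
    "program management analyst": {"business domain knowledge"},
    "operations research analyst": {"mathematics", "statistics"},
    "computer scientist": {"mathematics", "statistics", "computer science"},
}


def team_coverage(occupations):
    """Return the data science skill areas covered by a team of occupations."""
    covered = set()
    for occupation in occupations:
        covered |= OCCUPATION_SKILLS[occupation]
    return covered & DATA_SCIENCE_SKILLS


def missing_skills(occupations):
    """Return the data science skill areas the team does not cover."""
    return DATA_SCIENCE_SKILLS - team_coverage(occupations)


# No single occupation covers every skill area under this encoding.
for name in OCCUPATION_SKILLS:
    print(name, "is missing:", sorted(missing_skills([name])))

# A team drawn from all four occupations still lacks a software requirement.
print("full team is missing:", sorted(missing_skills(OCCUPATION_SKILLS)))
```

Under these assumptions, each occupation alone leaves at least two skill areas uncovered, while a combined team leaves only the specific software tools area uncovered, which is consistent with the observation that none of the announcements required specific software knowledge.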

Summary

The BZC is an organization that is generating big data sets and has varying levels of

analysis capability throughout their business units. The results of the research were triangulated

from semi-structured interviews with analysts, a focus group interview with management, and

document analysis of a BZC strategic document and recent BZC job announcements. Several

themes emerged as limitations in the BZC’s ability to analyze large data sets and were shown

throughout this research. Access to quality data, metrics, management, organization structure,

culture, infrastructure, data analysis processes, data science skills, and training emerged from the

research as themes important to big data analysis within the BZC.

All of the participants in the research recognized the benefits of developing data science

skills within BZC. Six of the eleven analysts agreed that data science is a role beyond that of a

traditional analyst, two analysts suggested existing analysts could evolve their skills to the level

of a data scientist, and three analysts were unsure. The focus group participants agreed with the

scholarly definitions of a data scientist and that data science is a unique role beyond that of a

BZC traditional analyst. The focus group stressed that if data science includes business domain

understanding it is going to be difficult for their organization to attract, train, and retain this level

of talent. There were common themes on the limitations of the skills of current analysts due to

occupational standards, access to training, access to software, and competition for talent. There

was a significant theme of how to train, certify, and employ data scientists within the BZC.

CHAPTER 5. DISCUSSION, IMPLICATIONS, RECOMMENDATIONS

Introduction

Rapid data growth is having profound effects on modern-day corporations and the United

States military as they continue to progress through the information technology age

(Ransbotham, Kiron, & Prentice, 2015). Harris and Mehrotra (2014) suggested the skills required

to manage and analyze the exponentially growing size of data are inadequate and in short supply

with bleak predictions for the future. This research explored the emerging commercial data

scientist occupation and the skills required of data scientists to help determine if data science

applies to the DOD. This research sought to define further the skills required of data scientists to

help enable their effectiveness in modern organizations with specific emphasis aimed at the

DOD. The targeted population consisted of analysts, managers, or executives working within the

Bravo Zulu Center (BZC). This research explored data science and the implications associated

with the big data phenomenon by conducting qualitative research with a representative case

study organization. This research explored essential skill sets, attitudes, and perceptions of the

analysts working big data issues for the BZC, along with the skill sets, attitudes, and perceptions

of management within the same organization. A BZC strategic planning document and recent

BZC job announcements were collected and analyzed, which ensured triangulation from three

collection methods to improve the overall accuracy of the research (Gronhaug & Ghauri, 2010).

This chapter discusses the findings of the research compared to the research questions

and the supporting literature review to ensure fulfillment of the research purpose. The chapter

evaluates how the research contributed knowledge toward understanding and resolving the

business problem posed in this study and provides multiple recommendations for further

research.

Conceptual Framework Final Implications

The conceptual framework served as the foundational knowledge to support this research

study. This framework guided the research by relying on formal theory, which supported the

researcher’s thinking on how to understand and plan to research the topic (Grant & Osanloo,

2014). William S. Cleveland (2001) coined the term data science in the context of enlarging the

major areas of technical work in the field of statistics. Cleveland’s seminal work described the

requirement of an “action plan to enlarge the technical areas of statistics focuses of the data

analyst” (Cleveland, 2001, p. 1). Cleveland described a major altering of the analysis occupation

to the point a new field shall emerge and will be called “data science” (Cleveland, 2001, p. 1).

Cleveland’s proposal of six technical areas that encompass the field of data science

includes multidisciplinary investigations, models and methods for data, computing with data,

pedagogy, tool evaluation, and theory as seen in Figure 14. This taxonomy was adapted and used

by the researcher to conceptualize the business problem, formulate a plan to collect and analyze

data and provide actionable conclusions.

Figure 14. Cleveland’s Data Science Taxonomy. Adapted from “Data Science: An action plan

for expanding the technical areas of the field of statistics.” by W. Cleveland (2001) International

statistical review, 69(1), 21-26.

[Figure 14 diagram labels: Data Sciences; Multidisciplinary Investigation; Models & Methods; Computing with Data; Pedagogy; Tool Evaluation; Theory]

The coding and analysis of the data that was collected from interviews with BZC analysts

served as the baseline for the enhanced coding structure and were then used in the coding and

analysis of the focus group interview and the BZC collected documents. After continual reading

and synthesizing of the triangulated collected data, recurring topics and patterns emerged and

resulted in the final coding structure (see Figure 15). The themes that emerged from the

analysis of the collected information from the BZC formed the conclusions of this

research. The adaptation of Cleveland’s data science taxonomy was effective in this research

study and could be used to support future data science research.

Figure 15. Final hierarchical coding structure.

Evaluation of Research Questions

Two primary research questions guided this study. How does the Bravo Zulu Center

glean actionable information from big data sets? How mature are the data science analytical

skills, processes, and software tools used by Bravo Zulu Center analysts?

Research Question 1

The purpose of exploring how the BZC gleans actionable information from large data

sets was to understand if their organization is experiencing exponential data growth and how

effective their organization is at analyzing large data sets to help determine if the data science

occupation is warranted in DOD organizations. Findings from the case study revealed the BZC is

a large organization collecting an overwhelming amount of data from a large number of disparate

systems and the organization is not taking full advantage of the data that is available. The BZC’s

methods of gleaning actionable information from data sets range from manual processes

of collecting and analyzing data to a mature level of analysis through effective business

intelligence systems. The most common method for analyzing data within the BZC is to pull raw

data from many different data warehouses, compile the data on local computers and then analyze

the data in Microsoft Excel or Access and provide the results in PowerPoint. When asked about

their knowledge of the term big data, the participants indicated the BZC is operating in a big data

environment and most often equated their definition of big data to when an organization reaches

a data saturation point and is not able to effectively analyze their collected data. The

collected data from the BZC was initially analyzed through the use of word frequency

queries and early themes of analysis skills, training, and information systems were identified.

Continual coding and analysis of the collected data revealed access to quality data, organization

structure, culture, infrastructure, and disparate systems as areas that are constraining the BZC’s

ability to glean actionable information from large data sets.

Research Question 2

The purpose of exploring how mature the data science skills of analysts, processes, and

software tools are at the BZC was to understand if the current BZC and DOD occupational job

series and the skills required of those job series encompass the scholarly views of data science

skills to ultimately help determine if the data science occupation is warranted in DOD

organizations. Six of the analysts and the focus group participants agreed that data science skills

are skills beyond that of traditional analysts, two analysts suggested the role of the data scientist

does not have to be unique, and three analysts were unsure. All the participants agreed that data

science skills are lacking at the BZC. All the participants indicated there are very few data

scientists within the organization and a large portion of their advanced analytical work is

contracted to outside companies. When asked about how evolved their analytical processes and

products were in relation to the perceived data scientists’ abilities, the participants indicated they

are in the beginning stages of building advanced analytical products with limited predictive

analytical capability. Additionally, comparing the skills and duties required of analysts and

computer scientists as described in recent BZC job announcements to scholarly views of

data science skills revealed there are components of data science skills spread across several

analysts’ occupations and the computer science occupation. Harris and Mehrotra (2014)

proposed that creating teams that combine business analysts, visualization experts, modeling experts,

and data scientists from different disciplines and functional areas may provide the most effective

strategy for employment. Currently, the BZC cannot hire a government data scientist in a single

occupation, so creating teams that encompass the data science skills is warranted.

Harris and Mehrotra (2014) suggested common desktop applications limit the analysis

capabilities in many organizations. Data scientists are well versed in common advanced

statistical software with access to open source libraries to conduct advanced analysis. The

results of the research revealed that BZC analysts are constrained in their ability to conduct data

science because of their limited access to modern data science tools such as R and Python® as well as

modern visualization software such as Tableau and others. The research revealed there is a mix

of statistical and business intelligence software that is available but there is not a standardized

plan for analysis software across the BZC. The participants expressed frustration with

information technology policy constraints that are preventing access to modern analytical

software and inhibiting the BZC data science evolution.

There is evidence the BZC is actively engaged in advancing data science skills in their

organization. The BZC has recently created a small data science team that is focused on utilizing

data science to bring actionable insights into one specific business unit within their command.

Additionally, the BZC’s strategic document that was collected and analyzed revealed there is a

strategic objective to enable complete data integration and data availability across the BZC with

a goal to increase their analytical capability. The BZC analysts’ data science skills, processes,

and analysis software are immature. The BZC analysts and managers understand their data science

limitations and are actively working to bring these skills into their business.

Fulfillment of Research Purpose

The Chapter 2 literature review provided a foundation of scholarly research that

expressed the critical importance of big data analysis in both commercial and DOD sectors. The

literature review also described the emergence of the data science occupation, its critical role

in big data analysis in modern environments, and the short supply

of these skills (Edwards, 2014). The research sought to define further

data science skills and whether and how these skills could be employed in DOD organizations by

examining the skills and abilities of federal civilians working as analysts within the BZC. The

research revealed that the scholarly views of data science skills are inherent to several federal

OPM occupations of personnel working within the BZC. Chapter 4 revealed that the BZC is

experiencing extreme data growth and has immature data science skills and processes, and it provided

several implications for how best to employ data scientists within the organization. These

findings directly related to the specific business problem that the DOD may be struggling with

gleaning actionable information from large data sets compounded by immature data science

skills. The research provided evidence that there are skills differences between data scientists and

the traditional analyst that are available to DOD organizations through the current Federal OPM

occupations. This research identified access to quality data, organization structure and culture,

infrastructure and legacy systems, access to training, competition for talent, and access to

software as themes that are preventing the BZC from fully leveraging data science capabilities,

and these limitations may be affecting other DOD organizations. Additional themes of big data,

metrics, management, data analysis processes, data science skills, and domains resulted from this

research and supported the conclusion that data science skills, processes, and software are

immature at the BZC.

The results of this research suggest DOD organizations will accelerate their ability to

glean actionable information from large data sets by maturing data science skills within their

workforce. The results of this research indicate that several limitations are inhibiting the

development of a DOD data science workforce. Harris and Mehrotra (2014) suggested creating

teams that combine business analysts, visualization experts, modeling experts, and data scientists

as an effective strategy. Because there is no formalized data science occupation within the DOD

workforce and because the DOD is competing for scarce data science talent, creating data

analysis teams that comprise the breadth of data science and domain understanding is a

reasonable approach. DOD organizations should evaluate the abilities of their existing analysts in

domain understanding and data science skills to support an action plan to further mature data

science within their organizations. By creating a visualization that plots the

assessments of their analysts on domain knowledge and data science skills, DOD organizations

can explore the maturity of their overall analysis capability, as seen in Figure 16. Finally,

DOD organizations should influence the skill requirements sections of job announcements for

incoming analysts to bring in more data science skills, and should evaluate all policies and infrastructure

limitations that are prohibiting the use of modern data science analytical software.

Figure 16. Domain and data science assessment model.
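
The assessment that Figure 16 depicts can be approximated in a few lines of code. The sketch below is one possible way to organize such an assessment and does not reproduce the model behind the figure. The 0 to 10 scoring scale, the midpoint used to separate "high" from "low," the quadrant labels, and the sample scores are all assumptions introduced for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical 0-10 scoring scale and midpoint; the source specifies neither.
SCALE_MAX = 10
MIDPOINT = SCALE_MAX / 2


@dataclass(frozen=True)
class Assessment:
    analyst: str         # de-identified label, e.g. "A1"
    domain: float        # domain-knowledge score, 0..SCALE_MAX
    data_science: float  # data-science-skill score, 0..SCALE_MAX


def quadrant(a: Assessment) -> str:
    """Place one assessment in a quadrant of the domain/data-science plane."""
    high_domain = a.domain >= MIDPOINT
    high_ds = a.data_science >= MIDPOINT
    if high_domain and high_ds:
        return "high domain / high data science"
    if high_domain:
        return "high domain / low data science"
    if high_ds:
        return "low domain / high data science"
    return "low domain / low data science"


def summarize(assessments):
    """Count how many analysts fall in each quadrant."""
    return Counter(quadrant(a) for a in assessments)


# Illustrative, made-up scores; these are not data from the study.
sample = [
    Assessment("A1", domain=8, data_science=3),
    Assessment("A2", domain=7, data_science=2),
    Assessment("A3", domain=4, data_science=7),
    Assessment("A4", domain=9, data_science=6),
]

summary = summarize(sample)
for name, count in summary.most_common():
    print(f"{name}: {count}")
```

The same (domain, data_science) pairs could be passed to any plotting tool to draw a scatter plot. The quadrant counts simply give a text summary of where an organization's analysts cluster.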

Contribution to Business Problem

Gabel and Tokarski (2014) suggested organizations face rapid data growth and require

deliberate action by leadership to ensure sustainability. The DOD is generating massive amounts

of data and is facing similar challenges (Hamilton & Kreuzer, 2018). The general business

problem is the lack of effective analysis in organizations operating in the modern-day big data

environment (Harris & Mehrotra, 2014). The specific business problem is that DOD

organizations may be struggling with gleaning actionable information from large data sets

compounded by immature data science skills of DOD analysts (Harris et al., 2013).

This qualitative case study analyzed the perceptions and experiences of analysts working

big data analysis issues in a representative organization along with the perceptions and

experiences of management within the same organization. The research provided actionable

information on how DOD organizations are currently analyzing large data sets. This research

provided insights regarding the current skills of analysts within the case study organization and

how evolved these skills are when compared to the scholarly views of data science skills. This

research uncovered vital limitations regarding the data science skills of existing DOD analysts

and new analysts coming into the federal OPM occupations when compared to scholarly views

of data science skills. The findings are that the personnel assigned as analysts within the case

study organization have detailed business domain understanding but do not have data science

specific skill requirements and training. The relatively small number of analysts that do have

partial requirements for data science-related skills are spread across several OPM occupations

and the job announcements used to hire analysts only partially include the breadth of data

science-related skills. Additionally, DOD analysts are constrained in their ability to leverage

modern analytical software. The BZC analysts who participated in this research are providing

valuable products to management throughout the organization. However, before BZC analysts

can build advanced analytical products on their large data sets, the organization will need to

further assess the skills of existing analysts and the policies that are constraining data science

maturity and subsequent analytical innovation.

Recommended Actions for DOD Organizations

This research investigated how the BZC gleans actionable information from big data sets

and identified access to quality data, organization structure and culture, infrastructure and legacy

systems, access to training, competition for talent, and access to software as constraints to data

science adoption. The research concluded the data science skills and processes of analysts

working at the BZC are immature and all of the participants in this research agreed that

advancing data science is critical to the BZC’s mission effectiveness. The research suggests DOD

organizations should develop an action plan to mature data science that includes the following actions:

• Evaluate existing analysts on business and data science knowledge.

• Create data science teams by combining data science-related federal occupations.

• Influence job announcements to include data science skills.

• Remove policies constraining access to modern analytical software.

• Remove policies constraining access to data science training.

• Develop strategies to integrate and share quality data.

Recommendations for Further Research

Further research recommendations were derived from the limitations posed in Chapter 1

as well as the findings and themes from the analysis of the collected data in Chapter 4. Cooper

and Schindler (2013) suggested a limitation of qualitative research is the ability to generalize

conclusions to a larger population. The findings of this research suggest DOD organizations are

experiencing big data growth and are struggling to glean actionable information from large

data sets, compounded by immature data science skills. The following recommendations for

further research may help quantify the shortage of DOD data scientists, provide further details on

data science software and training barriers, and clarify the organizational and cultural implications of data

science adoption:

• A quantitative study of a large population of DOD analysts that statistically

compares the skills used by DOD analysts with data science skills could

quantify the shortage of analytical talent. This research would help to further define

gaps of current DOD analysts and help support any restructuring of Federal OPM

occupational standards and how DOD organizations acquire and employ data

scientists.

• A quantitative study that examines the constraints that are limiting DOD analysts’ access to

software tools required for data science analysis. A researcher could survey DOD

operational units and information technology policy organizations regarding the

accessibility of software and the potential barriers that need addressing. Access to

modern software was a significant theme in this research, and access to analytical

software may be a common DOD problem.

• An exploratory qualitative study or a quantitative study that examines access to data

science or advanced analytical training within the DOD workforce. The participants

in this study presented a theme regarding the lack of data science training and

certification. Several options for data science training and certification are available

from commercial vendors, academia, and within the DOD. Additional research that

explores or quantifies the significance of access to data science training and

certification may help DOD organizations internally grow data scientists and is

warranted.

• An exploratory case study that examines the organizational and cultural changes

required in commercial or DOD organizations because of massive

data growth and the requirement for better analytics. Gabel and Tokarski (2014)

suggested large data sets are complicated, time-consuming, and expensive and create

strategic alignment problems in modern organizations. How to align the organization

and how and where to insert data scientists was a theme from this research with the

BZC, and further research is warranted.

• A qualitative case study that explores the management implications associated with

the arrival of big data and data science in modern organizations.

Conclusions

This study was intended to further define big data and data science and explore their

applicability to DOD organizations and expand the body of knowledge regarding big data and

data science. The primary findings of this study suggest the BZC is experiencing large data

growth and concurs with scholarly definitions of big data, data science, and the importance of the

further development of a data science workforce to meet mission requirements. The study

revealed that the BZC is a large, complex organization generating large amounts of data and has

varying levels of ability to glean actionable information from large data sets, with several limitations.

The study revealed that data science skills and processes are immature within the BZC. The

personnel assigned as analysts within the case study organization have detailed business domain

understanding but do not have data science specific skill requirements and training. The

relatively small number of analysts that do have partial requirements for data science-related

skills are encompassed in several OPM occupations and the job announcements used to hire

analysts only partially include the breadth of data science-related skills. Several themes emerged

as constraints to data science expansion within the BZC. This research identified access to quality

data, organization structure and culture, infrastructure and legacy systems, access to training,

competition for talent, and access to software as themes that are preventing the BZC from fully

leveraging data science capabilities, and these limitations may be affecting other DOD

organizations. Additional themes of big data, metrics, management, data analysis processes, data

science skills, and domains resulted from this research and supported the conclusion that data

science skills and processes are immature at the BZC. The study revealed that the BZC has strategic

actions underway to manage and integrate data for better accessibility, and it underscored the importance of

modern analytical software for BZC analysts and of continuing to develop the skills of

those analysts in order to glean actionable information from big data sets that will directly

contribute to mission effectiveness.

REFERENCES

Akerkar, R. (2014). Analytics on big aviation data: Turning data into insights. International

Journal of Computer Science and Applications, 11(3), 116-127. Retrieved from

https://pdfs.semanticscholar.org/820f/a4268e73d6de5beed8486dfa8b8d8ecc42de.pdf

Almeida, F. (2017). Benefits, challenges and tools of big data management. Journal of Systems

Integration, 8(4), 12-20. doi:10.20470/jsi.v8i4.311

Baskarada, S., & Koronios, A. (2017). Unicorn data scientist: The rarest of breeds. Program,

51(1), 65-74. doi:10.1108/PROG-07-2016-0053

Beer, D. (2016). How should we do the history of Big Data? Big Data & Society, 3(1), 1-10.

doi:10.1177/2053951716646135

Berner, M., Graupner, E., & Maedche, A. (2014). The information panopticon in the big data era.

Journal of Organization Design, 3(1), 14-19. doi:10.7146/jod.3.1.9736

Bowen, G. A. (2009). Document analysis as a qualitative research method. Qualitative Research,

9(2), 27-40. doi:10.3316/QRJ0902027

Brynjolfsson, E., & McAfee, A. (2012). Big data: The management revolution. Harvard

Business Review, 90(10), 60-68. Retrieved from http://tarjomefa.com/wp-

content/uploads/2017/04/6539-English-TarjomeFa-1.pdf

Chen, H., Chiang, R., & Storey, V. (2012). Business intelligence and analytics: From big data to

big impact. MIS Quarterly (0276-7783), 36(4), 1165. Retrieved from

https://www.jstor.org/stable/41703503

Cleveland, W. S. (2001). Data science: An action plan for expanding the technical areas of the

field of statistics. International Statistical Review, 69(1), 21-26.

doi:10.1111/j.1751-5823.2001.tb00477.x

Columbus, L. (2018, January 29). Data scientist is the best job in America according to

Glassdoor’s 2018 rankings. Forbes. Retrieved from

https://www.forbes.com/sites/louiscolumbus/2018/01/29/data-scientist-is-the-best-job-in-

america-according-glassdoors-2018-rankings/#33eaef7c5535

Cooper, D., & Schindler, P. (2013). Business research methods, 12th Edition. McGraw-Hill

Learning Solutions, 2013-03-05. VitalBook file.

Costlow, T. (2014). How big data is paying off for DOD. Defense Systems. October 24 2014.

Retrieved from https://defensesystems.com/articles/2014/10/24/feature-big-data-for-

defense.aspx

Cotter, P. (2014). Analytics by degree: The dilemmas of big data analytics in lasting

university/corporate partnerships (Doctoral dissertation). Retrieved from ProQuest UMI

Dissertation, UMI Number 3635733

Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed methods

approaches (3 ed.). Thousand Oaks, CA: Sage.

Davenport, T. H., Barth, P., & Bean, R. (2012). How big data is different. MIT Sloan

Management Review, 54(1), 43-46. Retrieved from

https://pdfs.semanticscholar.org/eb3d/ece257cca2e8ce6eaf73fd98c1fdcbdc5522.pdf

Davenport, T., & Dyché, J. (2013). Big data in big companies. SAS Institute. Retrieved from

https://www.sas.com/en_us/whitepapers/bigdata-bigcompanies-106461.html

Davenport, T., & Patil D. (2012). Data scientist: The sexiest job of the 21st Century. Harvard

Business Review 90(10), 70-76. Retrieved from https://hbr.org/

Davis, J. (2016, July 15). Microsoft launches online data science program. Informationweek.

Retrieved from http://www.informationweek.com/big-data/big-data-analytics/microsoft-

launches-online-data-science-degree-program/d/d-id/1326276

DISA, (2015). Defense Information Systems Agency request for information, Big Data Solution

and Governance Capabilities. March, 2015. Retrieved from

https://govtribe.com/project/defense-information-systems-agency-disa

Edwards, J. (2014). Big data takes a strategic turn at DOD. Defense News, November 21 2014.

Retrieved from https://www.c4isrnet.com/it-networks/2014/11/20/big-data-takes-a-

strategic-turn-at-dod/

Fox, S., & Do, T. (2013). Getting real about big data: Applying critical realism to analyse big

data hype. International Journal of Managing Projects in Business, 6(4), 739-760.

doi:10.1108/IJMPB-08-2012-0049

Frizzo-Barker, J., Chow-White, P., Mozafari, M., & Dung, H. (2016). An empirical study of the

rise of big data in business scholarship. International Journal of Information Management,

36(3), 403-413. doi:10.1016/j.ijinfomgt.2016.01.006

Gabel, T. J., & Tokarski, C. (2014). Big Data and organization design. Journal of Organization

Design, 3(1), 37-45. doi:10.7146/jod.3.1.9753

Galbraith, J. (2014). Organization design challenges resulting from big data. Journal of

Organization Design, 3(1), 2-13. doi:10.7146/jod.3.1.8856

Gang-Hoon, K., Trimi, S., & Ji-Hyong, C. (2014). Big-data applications in the government

sector. Communications of the ACM, 57(3), 78-85. doi:10.1145/2500873

Géczy, P. (2015). Big data management: Relational framework. Review of Business & Finance

Studies, 6 (3), 21-30. Retrieved from

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2656427

George, G., Haas, M. R., & Pentland, A. (2014). Big data and management. Academy of

Management Journal. 57(2), 321-326. doi:10.5465/amj.2014.4002

Gobble, M. M. (2013). Big data: The next big thing in innovation. Research Technology

Management, 56(1), 64-66. doi:10.5437/08956308X5601005

Grant, C., & Osanloo, A. (2014). Understanding, selecting, and integrating a theoretical

framework in dissertation research: Creating the blueprint for your house. Administrative

Issues Journal: Education, Practice, and Research, 4(2), 12-26. doi:10.5929/2014.4.2.9

Granville, V. (2014). Developing analytical talent: Becoming a data scientist. Indianapolis, In.

John Wiley & Sons, Incorporated.

Gronhaug, P., & Ghauri, K. (2010). Research methods in business studies XML Vitalsource

ebook for Capella, 4th Edition. Pearson Learning Solutions. VitalBook file.

Grossman, R., & Siegel, K. (2014). Organizational models for big data and analytics. Journal of

Organization Design, 3(1), 20-25. doi:10.7146/jod.3.1.979

Halper, F. (2016). The citizen data scientist-coming to your organization? Business Intelligence

Journal, 21, 55-56. Retrieved from https://tdwi.org

Hamilton, S. P., & Kreuzer, M. P. (2018). The big data imperative. Air & Space Power

Journal, 32(1), 4-20. Retrieved from

https://www.airuniversity.af.edu/Portals/10/ASPJ_Spanish/Journals/Volume-30_Issue-

2/2018_2_11_hamilton_s_eng.pdf

Harris, H. D., Murphy, S. P., & Vaisman, M. (2013). Analyzing the analyzers: An introspective

survey of data scientists and their work. Sebastopol, CA: O’Reilly Media.

Harris, J. G., & Mehrotra, V. (2014). Getting value from your data scientists. MIT Sloan

Management Review, 56(1), 15-18. Retrieved from https://sloanreview.mit.edu

Henry, R., & Venkatraman, S. (2015). Big data analytics the next big learning opportunity.

Journal of Management Information and Decision Sciences, 18(2), 17-29. Retrieved from

https://www.abacademies.org/journals/journal-of-management-information-and-decision-

sciences-home.html

Hoffman, M. (2013). Big data poses big problem for pentagon. Defense Tech, 03(20). Retrieved

from https://www.military.com/defensetech/2013/02/20/big-data-poses-big-problem-for-

pentagon

INFORMS (2017). Certified Analytics Professional Handbook. Retrieved from

https://www.certifiedanalytics.org/

Kitchin, R., & McArdle, G. (2016). What makes big data, big data? Exploring the ontological

characteristics of 26 datasets. Big Data & Society, 3(1). doi:10.1177/2053951716631130

Kiron, D. (2013). Organizational alignment is key to big data success. MIT Sloan Management

Review, 54(3), 1-n/a. Retrieved from https://sloanreview.mit.edu/

Konkel, F. (2015). Pentagon to Silicon Valley: Teach us big data. NextGov, June 18 2015.

Retrieved from http://www.nextgov.com/analytics-data/2015/06/pentagon-silicon-valley-

teach-us-big-data/115717/

Iansiti, M., & Lakhani, K. R. (2014). Digital ubiquity: How connections, sensors, and data are

revolutionizing business. Harvard Business Review, 92(11), 90-99. Retrieved from

https://hbr.org/

Lohr, S. (2013, February 04). Searching for origins of the term 'big data'. The New York Times.

Retrieved from http://www.nytimes.com

Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011).

Big Data: The Next Frontier for Innovation, Competition & Productivity, 1-143.

Retrieved from https://www.mckinsey.com/business-functions/digital-mckinsey/our-

insights/big-data-the-next-frontier-for-innovation

McAfee, A., & Brynjolfsson, E. (2012). Big data: The management revolution. (cover story).

Harvard Business Review, 90(10), 60-68. Retrieved from https://hbr.org/

McCaney, K. (2014). Navy wants to take big data into battle. Defense News, June 24, 2014.

Retrieved from https://defensesystems.com/articles/2014/06/24/navy-onr-big-data-

ecosystem.aspx

Miller, S. (2014). Collaborative approaches needed to close the big data skills gap. Journal of

Organization Design, 3(1), 26-30. doi:10.7146/jod.3.1.9823

Moorthy, J., Lahiri, R., Biswas, N., Sanyal, D., Ranjan, J., Nanath, K., & Ghosh, P. (2015). Big

data: Prospects and challenges. The Journal for Decision Makers, 1(40), 74-96.

doi:10.1177/0256090915575450

Moustakas, C. (1994). Phenomenological research methods. Thousand Oaks, CA: Sage

Publications.

National Academies Press (2017). Strengthening data science methods for department of defense

personnel and readiness missions. Washington D.C. Retrieved from

https://www.nap.edu/catalog/23670/strengthening-data-science-methods-for-department-

of-defense-personnel-and-readiness-missions

OPM, (2005). U.S. Office of Personnel Management. Professional Work in the Mathematical

Sciences Group, 1500. Retrieved from https://www.opm.gov/policy-data-

oversight/classification-qualifications/classifying-general-schedule-

positions/standards/1500/gs1500p.pdf

OPM, (2009). U.S. Office of Personnel Management. Handbook of Occupational Groups and

Families. Retrieved from https://www.opm.gov/policy-data-oversight/classification-

qualifications/classifying-general-schedule-positions/occupationalhandbook.pdf

OPM, (2014). U. S. Office of Personnel Management strategic plan FY2014-2018. Retrieved

from https://www.opm.gov/about-us/budget-performance/strategic-plans/2014-2018-

strategic-plan.pdf

Ortiz Jr., S. (2010). Taking Business Intelligence to the Masses. Computer IEEE, 0018-

9162/July-10), 12-15. Retrieved from

https://www.computer.org/cms/Computer.org/ComputingNow/homepage/news/CO_0710

_FeatStory_BusinessIntelligenceToMasses.pdf

Parmar, R., Cohn, D., & Marshall, A. (2014). Driving Innovation through data. IBM Institute for

Business Value. Accessed December 27, 2015:

www.935.ibm.com/services.us/gbs/thoughleadership/innovation-through-data

Phillips-Wren, G., & Hoskisson, A. (2015). An analytical journey towards big data. Journal of

Decision Systems, 24(1), 87-102. doi:10.1080/12460125.2015.994333

Piatetsky, G. (2017, January 10). Data scientist-best job in America, again. KDnuggets.

Retrieved from http://www.kdnuggets.com/2017/01/glassdoor-data-scientist-best-job-

america.html.

Porche, III, I., Wilson, B., Johnson, E., & Tierney, S. (2014). Data Flood: Helping the Navy

Address the Rising Tide of Sensor Information. RAND Corporation, 2014.

Provost, F., & Fawcett, T. (2013). Data science and its relationship to big data and data-driven

decision making. Big Data, 1(1), 51-59. doi:10.1089/big.2013.1508

Ransbotham, S., Kiron, D., & Prentice, P. K. (2015). The talent dividend: Analytics is driving

competitive advantage at data-oriented companies. MIT Sloan Management Review, 56(4),

1-12. Retrieved from https://sloanreview.mit.edu/

Rouhani, S., Ashrafi, A., Zare Ravasan, A., & Afshari, S. (2016). The impact model of business

intelligence on decision support and organizational benefits. Journal of Enterprise

Information Management, 29(1), 19-50. doi:10.1108/JEIM-12-2014-0126

Santaferraro, J. (2013). Filling the demand for data scientists: A five-point plan. Business

Intelligence Journal, 18, 13-18. Retrieved from https://tdwi.org/research/list/tdwi-

business-intelligence-journal.aspx

SAS, (2017). SAS academy for data science. Retrieved from

https://www.sas.com/en_us/learn/academy-data-science.html

Schneider, K. F., Lyle, D. S., & Murphy, F. X. (2015). Framing the big data ethics debate for the

military. Joint Force Quarterly : JFQ, (77), 16-23. Retrieved from

https://pdfs.semanticscholar.org/8cbc/6b28d0e1bca2bcf09cb6c5d389ec086c7748.pdf

Seidman, I. (2013). Interviewing as qualitative research: a guide for researchers in education

and the social sciences. Teachers college press.

Shah, S., Horne, A., & Capellá J. (2012). Good data won't guarantee good decisions. Harvard

Business Review, 90(4), 23-25. Retrieved from https://hbr.org/

Sharda, R., Adomako Asamoah, D., & Ponna, N. (2013). Research and pedagogy in business

analytics: Opportunities and illustrative examples. Journal of Computing & Information

Technology, 21(3), 171-183. doi:10.2498/cit.1002194

Smith, M. (2015, February 18). The White House names Dr. DJ Patil as the first chief data

scientist. Retrieved from https://obamawhitehouse.archives.gov/blog/2015/02/18/white-

house-names-dr-dj-patil-first-us-chief-data-scientist

Swain, A. (2016). Big data analytics: An expert interview with Bipin Chadha, data scientist for

united services automobile association (USAA). Journal of Information Technology Case

and Application Research, 18(3), 181-185. doi:10.1080/15228053.2016.1223497

Swanson, R., & Holton, E. (2005). Research in Organizations, Foundations and Methods of

Inquiry. San Francisco, CA: Berrett-Koehler Publishers, Inc.

Symon, P. B., & Tarapore, A. (2015). Defense intelligence analysis in the age of big data. Joint

Force Quarterly : JFQ, (79), 4-11. Retrieved from

http://ndupress.ndu.edu/Media/News/Article/621113/defense-intelligence-analysis-in-the-

age-of-big-data/

Turner, V., Reinsel, D., Gantz, J., & Minton, S. (2014). The digital universe of opportunities:

Rich data and the increasing value of the internet of things. EMC Corporation. Retrieved

from https://www.emc.com/leadership/digital-universe/2014iview/index.htm

U.S. Air Force (2016). Data science and the USAF ISR enterprise. Retrieved from

http://www.defenseinnovationmarketplace.mil/resources/Data_Science_and_the_USAF_IS

R_Enterprise%20_White_Paper.PDF

Viaene, S. (2013). Data scientists aren't domain experts. IT Professional, 15(6), 12-17. Retrieved

from https://ieeexplore.ieee.org/document/6674007

Walker, J. (2012). The use of saturation in qualitative research. Canadian Journal of

Cardiovascular Nursing, 22(2), 37-41. Retrieved from

https://www.ncbi.nlm.nih.gov/pubmed/22803288

Watson, H. J., & Marjanovic, O. (2013). Big data: The fourth data management generation.

Business Intelligence Journal, 18, 4-8. Retrieved from https://tdwi.org/research/list/tdwi-

business-intelligence-journal.aspx

White House (2012). The big data research and development initiative. Washington D.C.

Retrieved from https://obamawhitehouse.archives.gov/blog/2012/03/29/big-data-big-deal

White House (2016). The federal big data research and development strategic plan. Washington

D.C. Retrieved from https://www.nitrd.gov/PUBS/bigdatardstrategicplan.pdf

White House (2018). The networking and information technology research and development

program supplement to the President’s FY18 budget. Washington D.C. Retrieved from

https://www.nitrd.gov/pubs/2018supplement/FY2018NITRDSupplement.pdf

Yin, R. (2009). Case study research: Design and methods (Applied Social Research Methods

Series, 5, 4th ed.). Thousand Oaks, CA: Sage Publications.

Yin, R. (2012). Applications of Case Study Research. Thousand Oaks, CA: Sage Publications.

Young, J. (2014). An epidemiology of big data (Doctoral dissertation). Retrieved from ProQuest

UMI Dissertation, UMI Number 3620515.

Zhao, Y., MacKinnon, D., & Gallup, G. (2015). Big data and deep learning for understanding

DOD data. CrossTalk. July/August 2015. Retrieved from

http://www.crosstalkonline.org/issues/julyaugust-2015.html

Zhu, Y., & Xiong, Y. (2015). Towards Data Science. Data Science Journal, 14, 8.

doi:10.5334/dsj-2015-008

Zuboff, S. (1988). In the Age of the Smart Machine: The Future of Work and Power.

Basic Books, New York, NY.

STATEMENT OF ORIGINAL WORK

Academic Honesty Policy

Capella University’s Academic Honesty Policy (3.01.01) holds learners accountable for the

integrity of work they submit, which includes but is not limited to discussion postings,

assignments, comprehensive exams, and the dissertation or capstone project.

Established in the Policy are the expectations for original work, rationale for the policy,

definition of terms that pertain to academic honesty and original work, and disciplinary

consequences of academic dishonesty. Also stated in the Policy is the expectation that learners

will follow APA rules for citing another person’s ideas or works.

The following standards for original work and definition of plagiarism are discussed in the

Policy:

Learners are expected to be the sole authors of their work and to acknowledge the

authorship of others’ work through proper citation and reference. Use of another person’s

ideas, including another learner’s, without proper reference or citation constitutes

plagiarism and academic dishonesty and is prohibited conduct. (p. 1)

Plagiarism is one example of academic dishonesty. Plagiarism is presenting someone

else’s ideas or work as your own. Plagiarism also includes copying verbatim or

rephrasing ideas without properly acknowledging the source by author, date, and

publication medium. (p. 2)

Capella University’s Research Misconduct Policy (3.03.06) holds learners accountable for research

integrity. What constitutes research misconduct is discussed in the Policy:

Research misconduct includes but is not limited to falsification, fabrication, plagiarism,

misappropriation, or other practices that seriously deviate from those that are commonly

accepted within the academic community for proposing, conducting, or reviewing

research, or in reporting research results. (p. 1)

Learners failing to abide by these policies are subject to consequences, including but not limited to

dismissal or revocation of the degree.

Statement of Original Work and Signature

I have read, understood, and abided by Capella University’s Academic Honesty Policy (3.01.01)

and Research Misconduct Policy (3.03.06), including Policy Statements, Rationale, and

Definitions.

I attest that this dissertation or capstone project is my own work. Where I have used the ideas or

words of others, I have paraphrased, summarized, or used direct quotes following the guidelines

set forth in the APA Publication Manual.

Learner name and date: Roy Lancaster, 11/11/2018

APPENDIX A. INTERVIEW GUIDE

Interview Guide designed and created by Lancaster, 2018.

Purpose: The interviews with analysts and the focus group with managers are being

conducted to help senior leadership in the Bravo Zulu Center (BZC) understand how the analysis

of big data impacts the organization’s mission effectiveness. We would like your opinion and

perception of what you consider important knowledge, skills, and abilities necessary for both the

analysts and management team working big data issues for the BZC. Your feedback on big data,

data science, and how our organization relies on this data to conduct daily business in the BZC is

valuable to helping us understand how and where we can focus our efforts to improve the BZC

organization. Thank you for taking the time to participate.

Rationale: The principal rationale for furthering knowledge of the big data

phenomenon and a potentially emerging data science occupation is that creating the ability to

manage and analyze large amounts of data is more of a human problem and less of an

information system technological problem (McAfee & Brynjolfsson, 2012).

Interview Guide Questions

| Code | Question |
| --- | --- |
| MI - Multidisciplinary investigation | How is data used in your organization to meet mission requirements? What are some areas in your organization that are dependent on data? |
| TH - Theory | How do you define big data? What increases in digital data (big data) have you witnessed, and how have they impacted the business of the BZC? |
| P - Pedagogy | What are some knowledge, skills, and abilities needed to be an effective data scientist? |
| TE - Tool evaluation; MM - Models and methods | What are some of the significant challenges associated with conducting data analysis in your organization? |
| TH - Theory | What are the data science skills that are used by the BZC analysts? |
| MI - Multidisciplinary investigation | What additional skills are needed by analysts to be effective in the modern big data environment? |
| MI - Multidisciplinary investigation | What else can you tell me regarding big data and data science? |

We cannot provide confidentiality regarding comments involving criminal activity or behavior, or statements

that pose a threat to yourself or others. Do NOT discuss or comment on classified or operationally sensitive information.