QBRDM #5
Qualitative Coding in Software: Principles
and Processes
In: Using Software in Qualitative Research: A Step-by-Step
Guide
By: Christina Silver & Ann Lewins
Pub. Date: 2017
Access Date: October 29, 2019
Publishing Company: SAGE Publications Ltd
City: 55 City Road
Print ISBN: 9781446249734
Online ISBN: 9781473906907
DOI: https://dx.doi.org/10.4135/9781473906907
Print pages: 158-185
© 2014 SAGE Publications Ltd All Rights Reserved.
This PDF has been generated from SAGE Research Methods. Please note that the pagination of the
online version will vary from the pagination of the print book.
Qualitative Coding in Software: Principles and Processes
This chapter discusses principles and processes in coding textual and multimedia data using qualitative
software. We illustrate what qualitative coding is and how it works in software, discussing methodological
and practical approaches and the possibilities software provides in supporting and integrating them. Code
and retrieve capabilities underpin the development of CAQDAS programs (Chapter 1), but software does not
dictate whether, how or why to generate or apply codes. While specific coding functionality varies, packages
allow a similar degree of flexibility and a range of different ways to apply and combine coding techniques.
Chapters 8 and 9 build on the discussions presented here, and Chapter 10 can usefully be read in conjunction
as it discusses the ways writing tools can be used to document processes and further analytic thinking as you
proceed.
What is qualitative coding?
Qualitative coding is the process by which segments of data are identified as relating to, or being an example
of, a more general idea, instance, theme or category. Data segments from across the whole dataset are
placed together, or ‘tagged’, in order to be retrieved together at a later stage. In so doing you build up a
coding system to organise data and your ideas about them (Chapter 9). Coding therefore contributes to the
management and ordering of data (Figure 2.1; p. 45). It enables easier searching for similarities, differences,
anomalies, patterns and relationships. As such, coding is often an integral part of the analytic process, but it
is not analysis in itself.
How coding works in qualitative software
When a code is applied to a data segment in a CAQDAS package, a link is created between the segment
and the code. It is useful to think about CAQDAS packages as comprising two elements of a database
system. One stores the data files, the other houses the codes. Creating the link is what enables quick retrieval of
coded material. This is shown in Figure 7.1, illustrated in MAXQDA and NVivo. The right-hand part of the
MAXQDA screen displays the Document Browser, which in this example was initially an empty document in
which we made notes (or developed ‘critiques’) about a literature file (Chapter 5). The left-hand side shows
the Code System, which lists codes (grouped hierarchically according to type) being used to manage critiques
of materials included in the literature review.
The code ‘un-substantiated claims’ is highlighted in the Code System, as is one data segment to which it has
been applied in the Document Browser. The technical process of coding links positions in the Code System
to selected data segments within different documents. The principle is the same in all CAQDAS packages.
Any number of codes can be applied to a single data segment of any size and to overlapping or embedded
segments. Codes can be defined and analytic annotations/memos attached (Chapters 6 and 11). Coded data
can be retrieved in different ways (Chapter 8); interrogation based on the position of codes as applied to
data and combinations of coding and factual data characteristics (e.g. socio-demographics) are discussed in
Chapter 13.
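The linking principle described above can be sketched in a few lines of Python. This is a minimal illustration, not how MAXQDA, NVivo or any other package is implemented: the `Project` and `Segment` names, the character-offset representation of segments and the sample text are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A stretch of one data file, identified by character offsets."""
    document: str
    start: int
    end: int  # exclusive


class Project:
    """Hypothetical project: one store for data files, one for codes."""

    def __init__(self):
        self.documents = {}             # document name -> full text
        self.links = defaultdict(list)  # code name -> linked segments

    def add_document(self, name, text):
        self.documents[name] = text

    def apply_code(self, code, document, start, end):
        """Coding creates a link between a code and a data segment."""
        segment = Segment(document, start, end)
        if segment not in self.links[code]:
            self.links[code].append(segment)
        return segment

    def retrieve(self, code):
        """Return the text of every segment linked to a code."""
        return [self.documents[s.document][s.start:s.end]
                for s in self.links[code]]

    def codes_at(self, document, position):
        """List the codes whose segments cover a given position."""
        return sorted(
            code for code, segments in self.links.items()
            if any(s.document == document and s.start <= position < s.end
                   for s in segments)
        )


project = Project()
text = "The authors claim large effects. No sample size is reported."
project.add_document("critique_01", text)
# Two codes applied to overlapping segments of the same (invented) document.
project.apply_code("un-substantiated claims", "critique_01", 0, 32)
project.apply_code("methods", "critique_01", 12, 60)

print(project.retrieve("un-substantiated claims"))
print(project.codes_at("critique_01", 20))
```

Because codes and data are stored separately and connected only by links, the same passage can carry any number of codes, and overlapping or embedded segments need no special handling.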
Figure 7.1 Principles of coding processes and code margins (MAXQDA and NVivo)
Approaches to coding
Coding is often seen to be central to the ‘qualitative method’. Some approaches, however, resist organising
and categorising data through coding (Chapters 1 and 6). CAQDAS packages are not methods of analysis
but provide a range of tools which can be used to facilitate various analytic processes. Although some provide
assistance in coding (see discussion below), decisions about coding always rest with the researcher. The range of
tools continues to grow as software develops (Chapters 1 and 14). We encourage you to take a critical view of
software and make informed decisions as to whether particular tools within a package are appropriate to the
overarching methodology and the specific analytic needs.
Induction, deduction, abduction: logics of reaching explanations
In developing a strategy for coding data – whether using CAQDAS packages or not – many researchers draw
upon distinctions between inductive and deductive approaches to analysis, which resonate with ways of going
about coding. These terms refer to two contrasting logics of explanation. As described by Gibbs (2007: 4–5),
the first works up from the data level, the second down from the theoretical:
Induction is the generation and justification of a general explanation based on the accumulation
of lots of particular, but similar, circumstances. … Deductive explanation moves in the opposite
direction, in that a particular situation is explained by deduction from a general statement about the
circumstances.
It is often assumed that because quantitative research primarily employs deductive processes, qualitative
research must therefore be inductive, yet such distinctions are simplistic. As Gibbs (2007: 5) continues:
Much quantitative research is deductive in approach. A hypothesis is deduced from a general law
and this is tested against reality by looking for circumstances which confirm or disconfirm it. A lot of
qualitative research explicitly tries to generate new theory and new explanations. In that sense the
underlying logic is inductive. Rather than starting with some theories and concepts that are to be
tested or examined, such research favours an approach in which they are developed in tandem with
data collection in order to produce and justify new generalisations and thus create new knowledge
and understanding. Some writers reject the imposition of any a priori theoretical frameworks at
the outset. However, it is very hard for analysts to eliminate completely all prior frameworks.
Inevitably qualitative analysis is guided and framed by pre-existing ideas and concepts. Often what
researchers are doing is checking hunches; that is, they are deducing particular explanations from
general theories and seeing if the circumstances they observe actually correspond.
Similarly, Sibert and Shelly (1995: 115) distinguish between conceptual and mechanical tasks involved in
conducting analysis, illustrating that inductive and deductive processes are inherent in both:
Conceptual tasks are those tasks by which the researcher generates the products of the analysis
process. Through reading, questioning, categorizing, inferring by induction and generalizing, the
researcher generates coding categories, relationships, generalisations and perhaps theories. These
products are generated for the purpose of conceptualizing constructs in the data at higher levels of
abstraction.
Mechanical tasks are those tasks by which the researcher manipulates the products of the analysis
process. The researcher stores, organizes and retrieves data by using coding categories. The
researcher deduces (that is, makes deductive inferences about) the validity of relationships,
generalizations or theories by re-examining the data. These products are manipulated for the
purpose of organizing and reorganizing the data which is the basis, the grounding, of the conceptual
tasks.
Many other authors also discuss the nature of qualitative (and mixed methods) analysis in terms of a
combination (or dialectic) of deductive and inductive processes and reasoning (see below). Indeed, the
premise of this book, and inherent to the way we view the value of customised CAQDAS packages, is
that different analytic processes, ways of thinking and possibilities for cutting across datasets accordingly,
are significantly enhanced through the systematic and creative use of software tools. We discuss this in
relation to data retrieval and moving on from early coding tasks in Chapters 8–13. In specific terms of
computer-assisted coding, we conceptualise inductive and deductive approaches as existing on a continuum.
In practice, researchers may find they employ both inductive and deductive approaches iteratively, throughout
the whole process of the research project.
However, we think it useful to distinguish explicitly between how software can support inherently inductive
and inherently deductive approaches to coding before illustrating how they are combined in practice (Figures
7.2; p. 162 and 7.3; p. 171). Our experience of working with researchers from various sectors, disciplines and
methodologies, and in teaching students and analysts with different levels of technical and analytic expertise,
illustrates the range of approaches that are adopted. Some researchers have very clear ideas about their
need to work entirely (or predominantly) in one direction. As the reach of CAQDAS packages stretches beyond the
academic social science disciplines in which they largely developed, this has become more obvious. We
therefore discuss inductive and deductive approaches to coding, before moving on to illustrate how combined
or ‘abductive’ approaches may proceed. First, we make some comments about coding terminology.
Coding terminology
There is much literature about the principles and applications of qualitative coding – both with and without
the support of customised software. Reflecting different nuances in approach, authors tend to use a range of
terms to refer to different types or purposes of coding, stages at which they occur within analysis and technical
mechanisms for their use within software. Sometimes, the same or similar terms are used to refer to quite
different processes. We list some here:
• descriptive, interpretive and pattern coding (Miles and Huberman, 1994);
• provisional, core and satellite codes (Layder, 1998);
• objectivist and heuristic codes (Seidel, 1998);
• open, axial and selective coding (Strauss and Corbin, 1997);
• literal, interpretive and reflexive indexing (Mason, 2002);
• descriptive, topic and analytical coding (Richards, 2009).
Figure 7.2 Early code creation processes (Qualrus and Transana)
It is not within the scope of this book to summarise, describe or reflect on these different
conceptualisations or uses of coding. However, we do want to highlight the variety in the area, and in so doing
to encourage you to become familiar with relevant literature and to develop as clear an idea as possible about
how you intend to proceed before you begin in earnest with coding your own raw data within any software
package.
Inductive approaches to coding
The general principle underlying inductive approaches to coding is a desire to prevent existing theoretical
concepts from over-defining the analysis and obscuring the possibility of identifying and developing new
concepts and theories. As Abrahamson (1983: 286) states: ‘an inductive approach begins with the
researchers “immersing” themselves in the documents (that is, the various messages) in order to identify the
dimensions or themes that seem meaningful to the producers of each message’.
Table 7.1 Coding in software: some suggestions in the context of Case study A, Young People’s Perceptions

Tasks from PHASE THREE: First wave coding: deductive, broad-brush coding across the dataset

Creation of empty codes which reflect theoretical focus
• Create a basic coding scheme structure reflecting the theoretical context as identified by the literature review
• Define codes precisely, including reference to where in the literature they come from
• Name or otherwise mark codes to indicate they are literature-derived

Content-based auto-coding for keywords and phrases
• Identify keywords/phrases that are indicators of potential themes, auto-code on that basis across the dataset

Topic-based auto-coding for repeated structures
• Broadly auto-code primary data by section

Tasks from PHASE FOUR: Inductive recoding of broad-brush codes

Identify broad-brush codes that require recoding
• Read through memos and coded data
• Identify and prioritise codes relevant to each research question, add to memos notes about current thinking
• Create short-cut groupings for codes on the basis of aspects relevant to each research question

Inductively recode broad-brush codes
• Retrieve data coded at broad-brush codes
• Inductively recode into more detailed and analytic aspects, in relation to the research questions
• Name and define new codes in specific terms
• Use hierarchical positioning or pre-fixing to indicate

Build on analytic and process writing
• Make notes about what you are doing as you proceed
○ Comment on how what you are identifying through recoding relates to the literature and research questions
The well-known and frequently discussed ‘grounded theory’ (originated by Glaser and Strauss, 1967)
comprises a methodological approach to qualitative research rather than simply being an analytic or coding
strategy. It is not our purpose to describe or discuss grounded theory in detail (see Strauss and Corbin,
2008, for a detailed discussion). However, proponents suggest that grounded coding is an iterative process,
frequently distinguishing between open, axial and selective coding procedures. In our experience many
researchers work in grounded ways, without necessarily strictly adhering to the processes of grounded theory
as they have been described. It is nevertheless useful to briefly discuss these procedures and indicate ways
software tools can facilitate them.
• Open coding refers to the first coding phase in which small segments of data (perhaps a word,
line, sentence or paragraph) are considered in detail and compared with one another. This process
usually generates large numbers of codes from the data level, which encapsulate what is seen to be
‘going on’. These codes may be descriptive or more conceptual in nature. They may be very precise,
or more generally specified. Often terminology found in data is used as code labels, termed in vivo
coding. Open coding fragments the data, ‘opening’ them up into all the possible ways in which they
can be understood.
• Axial coding is a more abstract process. It refers to the second pass through the data when the
codes generated by open coding are reconsidered. Code labels and the data linked to them are
rethought in terms of similarity and difference. Similar codes may be grouped together, merged into
higher-level categories, or subdivided into more detailed ones. Data are revisited and compared
continually as the way codes represent the data is examined. Axial coding thus brings back together
the fragmented data segments identified in the open coding phase by exploring the relationships
identified between the codes which represent them.
• Selective coding refers to a third stage of coding when the researcher again revisits the data and the
codes. Instances in the data which most pertinently illustrate themes, concepts, relationships, etc.
are identified. Conclusions are validated by illustrating instances represented by and grounded in the
data. Identified patterns are tested and core categories in the developing theory are illustrated. This
process will lead to segments of data being chosen to quote and discuss in the final written product
of the research project.
Whether following the principles of grounded theory explicitly, or using elements in informing the design of a
bespoke analytic strategy, adopting an inductive approach allows for naturally occurring elements within data
to be identified and interrogated, in terms of how they reflect substantive areas of interest. In Case Study
C, Coca-Cola Commercials, this included the devices used to promote the product, the subtle and explicit
representations of gendered relationships and how the commercials reflect their time more generally. The
other case-study examples also included inductive coding processes, but their positioning within the broader
analytic design was different (see Tables 2.1; p. 39, 7.1; p. 163, 7.2; p. 169 and 7.3; p. 173). For instance, in
Case study A, Young People’s Perceptions, we illustrate a process of inductively recoding areas of interest
initially identified through a deductive coding process (see below). Conversely, in Case Study B, The Financial
Downturn, the three types of data are coded sequentially, such that the analysis builds incrementally, with
inductive coding of focus-group data being sandwiched by deductive coding of open-ended survey data and
deductive coding of media materials.
Working inductively is characterised by careful and detailed inspection of data on a number of levels (Chapter
1). This ‘bottom-up’ approach starts at the detailed level and moves through recoding, regrouping, rethinking,
towards a higher level of abstraction. The aim may often be to generate theory from the data, although this is
not a pre-requisite for adopting an inductive approach.
Box 7.1 analytic notes
Considerations when coding inductively
It can be tempting to introduce data to a software project and begin coding immediately.
However, exploring data is important when adopting inductive analytic approaches
because these processes serve to focus subsequent coding. You may have generated
an initial list of codes as part of the process of data familiarisation (Chapter 6). If not,
you will at least have made notes about areas of interest and possibly also annotated
some data segments. Refer to the notes already made when coding. Also keep in mind
the overall aim of the analysis by continually reflecting on the research questions which
underlie your work. Inductive code development and assignation is instigated by what is
‘seen’ in data, but codes should not be created without purpose. The research questions
provide that overarching purpose; the processes of data familiarisation help focus the
detail of code development. Reflect, as you create codes, about how they might later help
you to interpret data and answer the research questions. Always define codes upon their
creation.
Box 7.2 functionality notes
Software tasks supporting inductive approaches to coding
All the packages discussed here provide flexible means by which to generate codes and
analyse qualitative data in an inductive way. These include the following specific tasks:
• creating codes grounded in the data (open coding) or based on language used in
the data (in vivo coding);
• retrieving data segments based on how they have been coded;
• grouping similar codes together and viewing the data coded at them together
(within or outside the software);
• defining codes, printing lists of codes, renaming codes;
• increasing and decreasing the amount of data coded;
• uncoding data;
• recoding data;
• commenting upon and writing about what is seen.
It is important to view these tasks as occurring in explicit relation to the processes of
data familiarisation and exploration that will have taken place earlier (Chapter 6) and as
providing the basis upon which analytic work will continue (Chapters 7–13).
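Several of the tasks listed in this box (renaming, grouping or merging, uncoding and recoding) amount to simple operations on the links between codes and data segments. The following Python sketch illustrates the idea only; the function names, code labels and segment identifiers are hypothetical and do not correspond to commands in any particular package.

```python
# Hypothetical coding state: each code maps to the ids of the segments linked to it.
coding = {
    "feeling ignored": {"s1", "s4"},
    "not listened to": {"s2"},
    "school rules": {"s3", "s5"},
}


def rename_code(coding, old, new):
    """Relabel a code; its links to data are unchanged."""
    coding[new] = coding.pop(old)


def merge_codes(coding, sources, target):
    """Merge similar codes into one higher-level code."""
    merged = set(coding.get(target, set()))
    for name in sources:
        merged |= coding.pop(name)
    coding[target] = merged


def uncode(coding, code, segment):
    """Remove one link between a code and a segment."""
    coding[code].discard(segment)


def recode(coding, source, segment, target):
    """Also link a segment retrieved at one code to a more specific code."""
    if segment not in coding[source]:
        raise ValueError(f"{segment} is not coded at {source}")
    coding.setdefault(target, set()).add(segment)


merge_codes(coding, ["feeling ignored", "not listened to"], "lack of voice")
rename_code(coding, "school rules", "institutional rules")
recode(coding, "institutional rules", "s5", "uniform policy")
uncode(coding, "lack of voice", "s4")
print(coding)
```

None of these operations alters the data themselves; only the links change, which is why codes can be rethought repeatedly as inductive work proceeds.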
Box 7.3 analytic notes
Questions to ask yourself when coding inductively
• How do the codes you have created differ? Are some more broadly specified than
others?
• To what extent does first-wave coding help you to think about the data differently
from how you had been thinking after initial data familiarisation?
• How did your initial conceptualisation of the research questions affect the way you
coded? Has your thinking about the research questions changed as a result of
initial inductive coding?
• Have you been defining codes as you create them? How has this helped with your
thinking about potential themes?
Deductive approaches to coding
Deductive approaches to coding are more explicit at the outset about the themes or categories to be
considered. There may be many reasons for taking such an approach, for example, where the intention is to
test an existing theory or hypothesis on newly generated data or to investigate its transferability to a different
social context; or due to perceived time constraints or for other pragmatic reasons. The design of Case study
A, Young People’s Perceptions, is theoretically deductive in the sense that the initial coding of primary data
is explicitly directed by the prior review of the literature (Box 7.4). You may not be testing a theory or a
hypothesis, but may simply know what you are looking for. This is often the approach taken in non-academic
research settings, where the focus may be a more applied and practical understanding for a specific and fairly
immediate objective, or a set of specifically identified outcomes. Case Study B, The Financial Downturn, for
example, includes elements of this type of approach, in that the first phase of analysis constitutes content-
based deductive coding of a sample of survey respondents’ open-ended responses. In academic settings, as
Berg notes, it is typical that a theoretical framework guides code development and application (Chapter 9):
‘In a deductive approach, researchers use some categorical scheme suggested by a theoretical perspective,
and the documents provide the means for assessing the hypothesis’ (Berg, 2001: 6).
Miles and Huberman (1994) describe a deductive method of coding. They suggest that a variety of factors
(e.g. the conceptual framework, research questions, hypotheses, problem areas) inform the generation of a
provisional list of codes prior to commencing fieldwork. They illustrate how a segment of text can be read on
different levels, suggesting that (re)consideration of data in the following terms leads to the identification and
explanation of themes and patterns:
• Descriptive codes are fairly objective and self-explanatory in nature; they are used at the outset in
the coding process when considering a segment of text for the first time. They allow the organisation
of data according to what it is descriptively about. They are based on predefined areas of interest,
whether factual, thematic or theoretical in nature (Figure 7.3; p. 171).
• Interpretive codes are subsequently used to add a more detailed layer of meaning to the data coded
descriptively. Coded data are revisited in relation to the broad areas of interest and considered
in more detail. Similar aspects may be recoded where they exemplify a meaningful concept or
relationship. Existing concepts or themes may be deconstructed into more detailed aspects.
Elements of a particular theme may be seen as relating to other aspects of different themes, and
perhaps linked to one another.
• Pattern codes are used in the third stage, which moves to a more inferential and explanatory
level. It involves considering how the themes, concepts, behaviours or processes identified through
descriptive and interpretive coding occur within or are relevant across the dataset. This could
be within an individual account. It could also be across subsets of data, for example amongst
respondents with certain similar characteristics. Similarity, difference, contradiction, etc. are
investigated, the aim being to identify meaningful and illustrative patterns in the data.
In many ways, therefore, deductive approaches are similar to inductive ones. They are also iterative and
cyclical, involving close and repeated consideration of the data. The main difference is that the process starts
with at least some predefined, higher-level areas of interest which are explicitly looked for in the data.
There are many different reasons for adopting a deductive approach. Here we briefly describe just two:
theoretical and question-based coding.
Box 7.4 case notes
Considerations when coding deductively (Case study A, Young People’s Perceptions)
Deductive coding across all forms of data concurrently serves to integrate existing theory
with primary data analysis from the outset and to explicitly and directly consider empirical
data alongside literature throughout the analytic process. The literature review (Chapter
5) identified the theoretical context, and these themes direct the deductive, broad-brush
coding. Work in the software begins with the creation of an empty coding scheme that
descriptively represents the theoretical context. Subsequently, there are two distinct ways
of approaching the coding: the traditional ‘human-driven’ deductive approach in which the
researcher looks for data that correspond to or contradict the theory; and the ‘computer-
driven’ deductive approach in which the researcher uses the power of content searching
(for key words and phrases) to identify instances of the codes. The two can be powerfully
combined in order to benefit from the advantages and mitigate the disadvantages of
each. We present the combined approach here. Where data are inherently structured (e.g.
according to the questions asked, or other repeated sections) these may form the basis
of another layer of broad-brush deductive coding. The theoretical and topic-based coding
can later be combined using matrix-type co-occurrence queries (Chapter 13).
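As a rough illustration of the ‘computer-driven’ route and of a matrix-type co-occurrence query, the Python sketch below auto-codes units of text on the basis of keywords and then counts how often each keyword-derived code coincides with a topic-based code. The codes, keywords and data are invented for the example, and whole-word matching is an assumption; CAQDAS packages offer their own, richer search options.

```python
import re
from collections import Counter

# Hypothetical literature-derived codes and the keywords that indicate them.
keyword_codes = {
    "peer influence": ["friends", "mates"],
    "school provision": ["lesson", "teacher"],
}

# Hypothetical data units, each already coded by interview topic (structure).
units = [
    {"topic": "sources of information",
     "text": "My friends told me more than any teacher did."},
    {"topic": "sources of information",
     "text": "Mostly magazines, to be honest."},
    {"topic": "experiences at school",
     "text": "The lesson was rushed and the teacher looked embarrassed."},
]


def auto_code(text, keyword_codes):
    """Return the codes whose keywords occur as whole words in the text."""
    found = set()
    for code, keywords in keyword_codes.items():
        for keyword in keywords:
            pattern = rf"\b{re.escape(keyword)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(code)
                break
    return found


# Matrix-type query: count units where a topic code and a keyword code co-occur.
matrix = Counter()
for unit in units:
    unit["codes"] = auto_code(unit["text"], keyword_codes)
    for code in unit["codes"]:
        matrix[(unit["topic"], code)] += 1

for (topic, code), count in sorted(matrix.items()):
    print(f"{topic} x {code}: {count}")
```

Exact matching misses variants such as ‘teachers’ and cannot judge context, which is one reason to combine keyword searching with the ‘human-driven’ reading described above.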
Theoretical coding
In projects which directly use or apply existing theoretical ideas, the coding process will be inherently
deductive. This might happen as described in relation to Case study A, Young People’s Perceptions, in which
the literature review defines the theoretical context (Box 7.4). In this example, there are large volumes of
previous research into the specific topic of school-based sex education provision, and the related broader
issues of young people’s introduction to issues relating to sex(uality) and relationships, their transition to
adulthood more generally and the policy context and political problematisation of teenage pregnancy in
Western societies. The literature review identified gaps in scientific knowledge relating to the influence
of cultural, social and political factors on the historical provision and effectiveness of school-based sex
education. For example, although ‘social attitudes’ and socio-political factors had been identified as important
impacts (Jones et al., 1986; Vilar, 1994; Rademakers, 1997; Thomson, 1994), neither the nature of these
influences, their historical roots nor the nature of the relationship between these ‘structural’ factors and policy
development, provision in practice or individual behaviour had been explored in detail (Silver, 2002). Thus the
theoretical context guided the formulation of the research questions, the design of the analytic strategy and
the context in which new theory was developed. However, as discussed below, theoretical deductive coding
constituted just one part of the process, which was followed by an inductive recoding process.
Box 7.5 functionality notes
Software tasks supporting deductive approaches to coding
In all software packages, codes can be generated independently of data – perhaps even
before data are collected, transcribed and incorporated within the software project.
• Codes can be generated at any point in the process, independently of data.
• A mature and complex coding schema which represents an existing theory or
hypothesis may already exist and be applied to data about a different topic.
• Broadly specified codes can be revisited in order to consider, for example, patterns
in the way respondents talk about an issue, or in which a theoretical idea manifests
itself.
• Deductive approaches tend to start off coding in a fairly descriptive way, capturing
instances relating to a set of general theoretical ideas.
Question-based coding
Where repeated structure exists in data, such as in structured interviews or open-ended survey questions,
analysis may be based around respondents’ answers to particular questions. This is often the case
in applied research settings. In such projects it may be useful to code all the answers to each question
across the dataset separately in order to view and analyse the answers in isolation. This can usually be
achieved semi-automatically if data are formatted in a particular way (Chapters 4, 6 and 12). Case Study B,
The Financial Downturn, contains two forms of data which are inherently structured (open-ended responses
to survey questions and focus-group discussions), and they were thus approached, in part, in this way. The
open-ended responses to the survey were coded first in the context of theoretically derived ‘sensitising
concepts’ according to the questions asked, and through use of content-based searching tools. The focus
groups were auto-coded according to speaker sections (Chapters 4 and 12, Box 4.3). Case study A, Young People’s
Perceptions, also contains various structural elements, but this time based upon broad topics in the interview
guide, the vignette scenarios and the photos shown as prompts to elucidate attitudes. Chapter 12 discusses
processes for auto-coding on the basis of such structures and Chapter 13 ways in which these can be used
as the basis of revealing interrogations.
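To show why consistent formatting matters for this kind of semi-automatic coding, the Python sketch below groups a transcript by speaker section. It assumes, purely for illustration, that every turn begins with an upper-case speaker label followed by a colon; the transcript is invented, and the formatting rules each package actually expects are covered in the chapters cited above.

```python
import re
from collections import defaultdict

# Hypothetical focus-group transcript; each turn starts with a speaker label.
transcript = """\
MODERATOR: How has the downturn affected you?
ANNA: I lost overtime hours last year.
We had to cut back on holidays.
BEN: Not much changed for me.
ANNA: Prices went up though."""

# Assumed convention: an upper-case label, a colon, then the spoken text.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z0-9 ]*):\s*(.*)$")


def auto_code_by_speaker(text):
    """Group turns by speaker; unlabelled lines continue the current turn."""
    sections = defaultdict(list)
    speaker = None
    for line in text.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            speaker = match.group(1)
            sections[speaker].append(match.group(2))
        elif speaker is not None and line.strip():
            sections[speaker][-1] += " " + line.strip()
    return dict(sections)


coded = auto_code_by_speaker(transcript)
for turn in coded["ANNA"]:
    print(turn)
```

The same approach applies to question headings in structured interviews: if the repeated structure is marked consistently, each answer can be gathered under a code for its question without manual selection.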
Table 7.2 Coding in software: some suggestions in the context of Case Study B, The Financial Downturn

Tasks from PHASE THREE: Deductive analysis of open-ended survey questions

Broad-brush coding
• Code survey responses as whole units of context according to high-frequency occurrences of keywords and phrases which indicate salient aspects in relation to research questions and ‘sensitising concepts’
• Define codes in specific terms
• Name or otherwise mark codes to indicate they are based on textual content

Tasks from PHASE FOUR: Inductive analysis of focus-group data

Initial inductive coding
• Pilot code two data files, experimenting with different units of coding context, to build a coding strategy that will achieve analytic aims
• Refer to research questions and previously written memos to inform the process
• Generate detailed ‘open’ codes to capture data which appears relevant
• Name codes in specific terms and define them precisely
• After basic code retrieval and coding scheme refinement (see Chapters 8 & 9) proceed with coding remaining focus-group data

Tasks from PHASE FIVE: Deductive analysis of media content

Deductive coding in light of previous coding phases
• Identify keywords/phrases that are indicators of themes identified in survey and focus-group data, auto-code on that basis across the media materials
• Integrate media coding with primary data coding by merging codes or using short-cut code groupings
• Build on existing memos to comment on what is seen in the media content and how that relates to the content of the primary data
Figure 7.3 illustrates the potential of recoding descriptive (perhaps question-based) coding in
more detail. Another aspect of the analysis was more explorative, cutting across the question structure to
consider, for example, the enablers of and barriers to creativity and innovation in the workplace. Question-
based analysis is often required by commissioned research in various applied settings, for example, public
consultations, service evaluations and some forms of government research. Sometimes, though, coding only
in this way can restrict your flexibility and ability to think outside the question structure.
Combining approaches: the practice of abductive coding strategies
using software
Discussions concerning inductive and deductive approaches to coding are necessarily simplistic, and the two
approaches should not be viewed as dichotomously opposed or mutually exclusive. We present them separately in order to
illustrate ways software tools can support a range of different approaches. As Gibbs (2002: 59) states, ‘you
do not have to do either one or the other or even one and then the other’. The dialectic, combination or
relationship between inductive and deductive practices is sometimes referred to as ‘abduction’ (Blaikie, 2000;
Guba and Lincoln, 1994). Some authors argue that abduction is the ‘guiding principle of empirically-based
theory construction’ (Timmermans and Tavory, 2012). Whatever the preferred terminology, there is always an
interplay between ideas and data, as emphasised by Dey (1993: 7):
We cannot analyse the data without ideas, but our ideas must be shaped and tested by the data we
are analysing. In my view this dialectic informs qualitative analysis from the outset, making debates
about whether to base analysis primarily on ideas (through deduction) or on the data (through
induction) rather sterile.
Similarly, Mason emphasises the sense in which ‘qualitative research design and research practice are
imbued with theory throughout’ (2002: 179), referring to inductive/deductive approaches by characterising the
stage at which theory comes into play – first, last, or simultaneously with data generation and analysis (2002:
180). Indeed, many authors have developed particular approaches to qualitative research and analysis which
formally advocate a combination of approaches to coding. Layder’s (1998) ‘adaptive theory’, for example, is
a multi-strategy approach to the whole process of analysis in which he argues that particular aspects cannot
be viewed in isolation. In coding data this approach takes account of both existing theoretical ideas and those
which develop directly from the data under consideration.
Case Study A, Young People’s Perceptions and Case Study B, The Financial Downturn provide examples
of this type of dialectic between theory and data (Table 2.1; p. 39). The use of software facilitates the
interplay and allows the process of integration to be made transparent. You do not have to code everything. In
Case Study A we initially employed deductive, broad-brush coding across the dataset. This was important in
establishing a relationship between theory evidenced in the literature review and the analysis of primary data.
Data collected at the broad codes were subsequently recoded inductively in order to begin integrating the
analysis of different types of data. Software-supported topic- and content-based coding quickly gathers related
material together. Working first with the literature and critical appraisals of it ensures the analysis is driven
by theory. Inductively recoding bodies of data that were previously coded deductively allows in-depth
analytical work to be carried out without other data confusing thinking. This is important with Case Study A
because there is much material contained in the literature and the primary data which does not specifically
relate to the present research questions. This is illustrated using NVivo in Figure 7.1.
The code ‘Moral contexts’ contains a large number of references (167) from a range of data sources (21).
The sub-codes hanging beneath have been created through inductive recoding. Some are quite specific
with as yet relatively little data recoded at them (e.g. ‘Controversies re teaching in a moral framework’,
‘Interconnectivity between parental and school morals’ and ‘Sexual explicitness’). Others are more broadly
specified and are already relatively prolific (e.g. ‘Religious beliefs’ and ‘Embarrassment’). Primary data coded
at ‘Moral contexts’ derived from the photo-prompt material were later recoded to a range of sub-codes –
including ‘Too young to be parents’, ‘Gendered experiences of morality’, ‘Blame’ and ‘Impact of religion’.
Working with various visualisations within software offers the researcher flexibility to consider data in different
ways. Figure 7.3 illustrates a similar recoding process in Case Study B.
Figure 7.3 Inductively recoding descriptive or broad-brush codes in context (ATLAS.ti)
The left-hand part of Figure 7.1 lists data sources coded at ‘Moral contexts’, showing data sources by type
(e.g. literature and primary data), indicators of frequency of application (Reference column) and volume of
data coded as a percentage of the whole data source (Coverage column). Such ‘quantitisation’ of qualitative
coding contributes to the prioritisation of themes as part of analytic reflection and theoretical refinement.
However, this should not be construed as ‘mixed methods’ or ‘quantitative analysis’; it is simply a means
to see which ‘piles’ are bigger (Chapters 8 and 9). Other visualisations when recoding are also possible,
including focusing on data of different types (e.g. interviews, vignettes and photo prompts) contributed
by one respondent in order to consider whether data collection tools impact on thematic content of material;
and sorting coded material according to factual data characteristics, such as socio-demographic variables
(Chapter 12), in order to facilitate the visualisation of comparisons on that basis (Chapter 13).
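To make the Reference and Coverage figures concrete, here is a minimal Python sketch of how such counts could be derived for one code in one data source. It assumes coded segments are stored as character offsets and that overlapping segments are merged before coverage is calculated; the function name and the figures are invented, and this is not a description of how NVivo itself computes these columns.

```python
def coverage_summary(source_length, segments):
    """Return (references, coverage_percent) for one code in one data source.

    segments: list of (start, end) character offsets, with end exclusive.
    Overlapping segments are merged so that no text is counted twice
    (an assumption of this sketch).
    """
    references = len(segments)
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(segments):
        if current_end is None or start > current_end:
            # A new, separate stretch of text: bank the previous one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current stretch: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    coverage = 100.0 * covered / source_length if source_length else 0.0
    return references, coverage


# Three applications of a code to a 2,000-character source; two overlap.
refs, cov = coverage_summary(2000, [(0, 100), (50, 200), (1000, 1300)])
print(refs, round(cov, 1))
```

Note that the reference count records how many times the code was applied, not how many distinct instances of the theme occur in the data, which is why these figures only indicate which ‘piles’ are bigger.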
In Case Study B, the main idea framing the design of the analytic strategy was to work inductively with the
primary data in order to enable a participant-grounded analysis. Given the currency of the substantive topic,
however, and the intention also to include an analysis of media content, it was important to be explicit about
the general theoretical backdrop which both framed our thinking about the data, and the design of the survey
and the focus-group discussions. That said, at the time of designing the project and collecting the data, there
was little published research into the implications of the global financial crisis to draw upon. However, the
research questions informed the survey design, and four of the following five questions are open-ended
and therefore elicit qualitative responses.
1. How secure do you feel in your job? (scale of 1–5)
2. How have you been personally affected by the financial crisis?
3. Please indicate who you feel are most at fault for the financial crisis. (Government, banks,
Eurozone – pick up to 2 choices)
a. Please provide comments about your answer to question 3.
4. Are you planning a holiday over the next 6 months?
a. Please provide comments about your answer to question 4.
5. Please comment on any special non-routine purchases or expenditure you have made in the
last 6 months (of over £400 in value).
Perceptions around job security and potential impacts on lifestyle of losing one’s job were the contextualising
topics, and general literature concerning unemployment and poverty provided the theoretical context. We
were explicit about the role of these theories and our own preconceptions and expectations in formulating the
research problem, designing the data collection and developing the analytic strategy. Four types of code were
generated through the different phases of work; a few examples are listed in Table 7.3. The survey data were
collected first, with the intention to conduct a preliminary analysis of them before conducting focus groups
in order to investigate key themes identified in the survey in more detail. Some survey respondents were
invited to participate in the focus groups in order to compare responses to the different data types.
Table 7.3 Code development in a ‘theory-informed’ abductive approach (Case Study B, The Financial Downturn)
Name and type: Code definition/explanation
SENSITISING CONCEPTS (derived from literature and included as deductive codes at the outset of the coding process)
• Stigma of unemployment: from the literature about poverty/unemployment – unemployment stigma exists and can lead to a hiring bias against the unemployed
• Reduction in self-esteem: from the literature: crises of confidence in the view of self – caused by low expectations and reducing circumstances
• Alienation: not quite the (Marxist) theory of, but in the context of being alienated from the area of work – not part of that world – forcibly prevented from affirming self-worth – related to self-esteem
CONTENT-BASED CODES (generated by keyword/phrase searching of open-ended survey responses/media content)
• Spending behaviour: captured by words – bought, buy, spend, purchase, afford, essential, avoid, credit. Initially captures discussion of spending in any context – but in terms of equivalence – needs to be broken down in terms of whether talk is about: the need to spend; the attempt to curb spending; purchases made and feelings about them; changes in what can be afforded; perceptions of what is ‘essential’ spending and how that is conceptualised
• Impact on income: captured by words – money, income, pay, balance, bills, pinch
• Indicators of transition: captured by words – change, prospect, plans, pension, affected. This body of coding occurred during PROCESS FOUR (inductive coding of focus-group data). Allows the consideration of transition in all spheres of life – and changes that have already happened and those that are envisaged in the future
BROAD-BRUSH CODES (derived from the main topics thrown up by the focus-group discussions)
• Concern for younger: explicit but general concern about younger members of family re security, employment, mortgages etc – for re-analysis later
• Affects on social-formal: impacts on social organisations – for re-analysis later
• Affects on social-informal: all mentions of day to day impacts on social-life, meeting places – for re-analysis later
• Security: any reference to personal sense of security of employment, financial, in future etc etc – for re-analysis later
• Political statements: collection of any views about the ‘politics’ of the downturn – to be examined and re-analysed later
• Economy statements: collection of any views about the ‘economics’ of the downturn – to be examined and re-analysed later
EMERGING CODES (generated inductively from focus-group transcripts)
• upsetting exit: redundancy not handled well, exacerbating negative aspects of losing job
• ‘you’re an ex’: in vivo code – no longer part of organisation
• compulsion to get job: the sense that you do not have a choice about getting work, nothing else to fall back on
• poor morale: any mention of how, where, when is this happening
• bad timing: life-stage – bad moment to be in difficulties
• back at home: having to depend on parents again
• manner of restructuring: how news was broken, the mechanics of negotiation
• vulnerability of weakest: any mention of the impact on the vulnerable members of society
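The content-based codes in Table 7.3 can be illustrated with a short Python sketch that codes whole survey responses according to the keyword lists given in the table. The matching rule (a keyword matches the start of a word, so that ‘spend’ also finds ‘spending’), the function names and the sample responses are assumptions made for illustration; text-search tools in CAQDAS packages each have their own matching options.

```python
import re

# Keyword lists copied from the content-based codes in Table 7.3.
KEYWORDS = {
    "Spending behaviour": ["bought", "buy", "spend", "purchase", "afford",
                           "essential", "avoid", "credit"],
    "Impact on income": ["money", "income", "pay", "balance", "bills", "pinch"],
    "Indicators of transition": ["change", "prospect", "plans", "pension",
                                 "affected"],
}


def build_pattern(words):
    # Assumption: match each keyword at the start of a word,
    # case-insensitively, so "spend" also finds "Spending".
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"\b(?:" + alternatives + r")\w*", re.IGNORECASE)


PATTERNS = {code: build_pattern(words) for code, words in KEYWORDS.items()}


def auto_code(responses):
    """Map each code to the indices of responses containing any keyword.

    Each response is coded as a whole unit of context.
    """
    coded = {code: [] for code in PATTERNS}
    for index, text in enumerate(responses):
        for code, pattern in PATTERNS.items():
            if pattern.search(text):
                coded[code].append(index)
    return coded


# Invented sample responses, not taken from the case study.
responses = [
    "We bought a new boiler but avoided anything else.",
    "My pension plans have changed completely.",
    "No real difference so far.",
    "Bills are going up and we are feeling the pinch.",
]
coded = auto_code(responses)
for code, hits in coded.items():
    print(code, hits)
```

As the definition of ‘Spending behaviour’ makes clear, a keyword hit only gathers candidate material: the retrieved responses still need to be read and broken down, because the same word can appear in very different contexts.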
The flexibility of combining approaches
Contrasting the suggested analytic processes for the three case-study examples illustrates that approaches
are flexible, designed in relation to the specific needs of each study. Table 2.1 lists the analytic processes
in summary format and Tables 7.1, 7.2 and 7.4 provide more detail for each Case Study so that you can
compare the ordering of processes. Many researchers who use software combine grounded approaches
to coding with more deductive processes. Even those following a prescribed method, or working within a
particular paradigm, often want to be able to incorporate an element of flexibility for working in other ways. For
example, where a project is commissioned and the brief specifies certain outcomes, a fairly mature coding
schema may be identified at an early stage. In such projects, however, researchers usually also want to allow
for the identification and analysis of ‘surprising’ or contradictory aspects. You might be looking for something
specific now – but this does not preclude the use of your data to answer additional, perhaps unrelated
questions, or to be considered from a different perspective later. CAQDAS packages support this very well
(see also Chapter 9 on developing and managing coding schemes).
Coding visual data: ‘indirect’ and ‘direct’ approaches
Discussion in this chapter so far has largely concerned the coding of textual forms of data, but many research
projects also utilise visual materials. CAQDAS packages vary in how they handle visual data, particularly in
terms of whether they require moving images to be associated with written transcripts in order to be coded.
Chapters 4 and 6 introduced this distinction in terms of whether visual data are accessed and analysed
‘directly’ (i.e. without the need for an associated written transcription) or ‘indirectly’ (i.e. where some sort of
written transcript is required in order to code and analyse). In this chapter we discuss both, although it is
important to note that your chosen software may not support both (Chapter 3).
Coding visual data ‘indirectly’ via synchronised transcripts
Written transcripts can often be generated within CAQDAS packages and associated with the corresponding
media. If the process of generating written transcripts has been undertaken independently, then coding forms
a secondary stage. In such an approach, transcript development will have constituted an analytic act in its
own right. The resultant written representation might usefully be explored in ways similar to those discussed
in Chapter 6, but also as a means of checking accuracy and to quickly locate particular areas of interest.
Figure 7.4 Synchronised playback of single transcript (ATLAS.ti)
Working inductively, with limited ‘external’ or ‘theoretical’ direction concerning code development, it is useful
to be as familiar as possible with data before coding commences. This might not always be possible, however,
when working with extensive datasets. The transcription process itself can be important in this respect, as
decisions involved during that stage will necessarily direct analytic focus. Having developed the transcripts
and associated them with the corresponding video, playing them back in synchronised mode one after the
other will constitute another moment of data familiarisation. In the context of Case Study C, Coca-Cola
Commercials, for example, watching each commercial at the same time as reading through the corresponding
transcript offers a multidimensional representation that would be lacking if only one or the other were available
(Silver and Patashnick, 2011). This is illustrated in Figure 7.4 in ATLAS.ti. The written transcript (P9 on the
left) and the corresponding video (P10 on the right) are opened side by side. When in Synchro Mode they
can be played back together – the red dots visible in the transcript are the anchors which synchronise the
two documents – the text in the transcript is highlighted in blue as the video plays. Also visible in Figure
7.4 is a memo – entitled ‘Olympic Dream – Analytic Notes’ – which was initially created during transcription.
Having this open during synchronised playback and adding to it as additional insights are developed is
fundamental in developing detailed and incremental ‘audit trails’ of analytic and process notes (Chapter 10).
In Figure 4.3, three different textual transcripts are synchronised with one video in Transana; in Figure 6.2 the
Transana illustration shows that the researcher has made use of ‘snapshots’ of the video inserted in two of
the transcripts, which may be useful when the transcript is exported.
It can be tempting to start the process of inductive coding as soon as transcripts have been created, and in
some situations this can work well. However, ensuring coding is as purposeful as possible from the outset is
important when codes are generated in a data-driven way. This is especially true when working with visual data because
of their multidimensional nature. Reflect upon the elements within data which seem to be relevant to the type
and level of analysis you want to achieve. This may include considerations as to the content and interaction
as well as the form and structure.
Working in tandem with memos (see Figure 7.4; p. 175) during the various moments at which you engage
with visual data before coding processes (e.g. initial data familiarisation, transcription, synchronised playback)
will ensure that inductive coding remains analytically focused and grounded. Adopting an ‘indirect inductive
coding’ approach as described here (Table 7.4; p. 177) values the contrasting perspectives provided by
textual, audio and visual representations. This type of work is all about creating codes that closely reflect your
interpretations of the meaning of what is being seen in the data. Having already gone through the process of
generating transcripts, it is valuable to make use of this earlier work in coding data and generating themes.
Although the bulk of the work relating to the generation of transcripts might have been conducted as an initial,
earlier, stage, it should nevertheless be remembered that transcription of visual material – when the intention
is to work with both representations simultaneously – is a dynamic and iterative process. Within the software,
transcripts can be revisited and edited incrementally, even after coding has begun. Coding may even occur
in tandem with transcript development.
Table 7.4 Coding in software: some suggestions in the context of Case Study C, Coca-Cola Commercials
Tasks from PHASE TWO: Exploration and identification of analytic focus
Familiarise with commercials
• Watch each commercial several times
• Create a memo for each commercial and make notes about first impressions generally, and in relation to each research question
• Informed by the capabilities of your chosen software, decide whether to work directly or indirectly
Working directly
• Rationalise and justify the decision not to generate transcripts; what might you gain or lose in adopting the direct approach?
• Experiment with creating clips and annotating them (where this is possible to the exclusion of coding)
Working indirectly – transcript development
• Use software transcription tools to develop synchronised transcripts for each commercial (experimenting with developing multiple transcripts where this is possible)
• Consider the importance of the placement of timestamps (time codes) in terms of subsequent qualitative retrieval and quantitisation (these implications differ depending on the software)
• Be clear in your mind about the different role of annotations and transcripts in terms of your methodology and the capabilities of the software you are using
Tasks from PHASE SEVEN: Integration and analysis of publicity materials
Focused reading, exploration and coding
• In light of the write-up of the preliminary primary data analysis (Phase Five), explore publicity materials
• Use text searching tools to quickly locate passages likely indicative of the themes you previously generated
• In memos write about the connections seen between the commercials and the publicity materials
Coding visual data ‘directly’, without an associated transcript
Several software packages enable visual data – both still and moving images – to be handled without
the need for an associated or synchronised textual transcript. We call this ‘direct’ analysis (Silver and
Patashnick, 2011). This technological possibility raises issues concerning the direction of the relationship
between technology and methodology (Silver and Lewins, forthcoming). In relation to developing strategies
which result in relevant, targeted, or ‘good quality’ analyses, an awareness of the implications of technical
subtleties such as this is important. It might feel like the possibilities for direct analysis obviate the need for
a transcript, or offer a ‘short-cut’ to analysis. Be cautious about proceeding in this way, however, unless there
are good methodological reasons for working directly with visual records.
The issue is quite different depending on whether you are working with still or moving images. In the former,
there may be many good reasons for working directly. With respect to the latter, however, it is rarely the case
that time will actually be saved: what you save at the transcription stage, you use up at the analysis stage,
as without any form of written representation, you have to work with moving images in ‘real time’ (Silver and
Patashnick, 2011).
There are, however, instances when working directly has analytic benefits. Figure 7.5 illustrates how a
moving image file might be directly coded using ATLAS.ti. In this example, clips have been created based
on changes in action, with clips of varying length appearing as thick brackets in the margin view. The clips
have been coded according to several action- and interaction-based codes. Different types of code appear
in different colours in the margin view, enabling a visual overview of patterns sequentially (vertically) through
the commercial. Individual and collections of clips coded in a particular way can be retrieved, played back,
recoded, etc. at any point (Chapter 8). This particular commercial provides a good example of when the
direct coding of moving images can be particularly useful as there is very little verbal dialogue. The dialogue
which does occur takes place at the beginning and the end of the commercial and is very repetitive (‘It’s
eleven-thirty’ and ‘Diet Coke break’ at the beginning as the women gather to watch the ‘performance’, and
a short exchange at the end where two of them reflect on what they have seen: ‘oh that was great’, ‘see
you tomorrow’, ‘eleven-thirty’). This is reflected by repetition in non-verbal actions – principally in the middle
section, where a series of shots showing the male building-site worker removing his shirt and drinking a can
of Diet Coke are interspersed with shots of the group of female office workers looking at him through the
window. Creating clips (or quotations, as ATLAS.ti calls them) for each section of action in this way (which
corresponds closely to camera shots in this example) serves to indicate patterns in the speed of interactions
and the length of time a certain type of action is displayed. In Figure 7.5 this is indicated visually by the relative
length of the quotations as represented by margin brackets. In Chapter 8 we discuss both qualitative and
quantitative retrieval options. Length of clip is an example of the latter type which can be retrieved in precise,
quantifiable terms.
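The following Python sketch illustrates the kind of quantitative retrieval that clip length allows. It assumes that directly coded clips can be listed as start and end times with the codes applied to them; the data structure, the timings and the code names are hypothetical, loosely modelled on the commercial described above, and do not represent any package's export format.

```python
from collections import defaultdict

# Hypothetical clips: (start in seconds, end in seconds, codes applied).
clips = [
    (0.0, 6.5, ["women gather"]),
    (6.5, 9.0, ["man removes shirt"]),
    (9.0, 11.0, ["women watching"]),
    (11.0, 15.5, ["man drinks"]),
    (15.5, 17.0, ["women watching"]),
    (17.0, 24.0, ["women reflect"]),
]


def clip_lengths_by_code(clips):
    """Return {code: (number of clips, total seconds, mean seconds)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for start, end, codes in clips:
        for code in codes:
            totals[code][0] += 1
            totals[code][1] += end - start
    return {code: (n, total, total / n) for code, (n, total) in totals.items()}


summary = clip_lengths_by_code(clips)
for code, (n, total, mean) in summary.items():
    print(f"{code}: {n} clip(s), {total:.1f}s in total, {mean:.2f}s on average")
```

Short mean clip lengths for a code suggest rapid cutting between shots of that type, which is the pattern the margin brackets in Figure 7.5 show visually.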
Figure 7.5 Direct coding of video (ATLAS.ti)
With respect to still images there are similar issues, but some added possibilities in Transana. ‘Snapshots’
can be taken from video sources and saved as stills, which can be embedded within transcripts for illustrative
purposes (see Figure 6.2; p. 142). In addition, these can be visually annotated, which is a form of coding
(Figure 7.6; below). Codes (called keywords in Transana) can be assigned a unique combination of coding
shape (rectangle, ellipse, line, arrow), colour, line width and line style (solid, dash, dot, and dot-dash).
This offers distinctive means of handling visual data and, particularly, of indicating emphasis in non-verbal
interactions (see also Figure 6.3; p. 144).
Figure 7.6 Visual annotation and coding of still images (Transana)
For certain types of visual data, then, there may be several benefits of working directly. But in developing a
strategy you will also need to be aware of what might be lost or compromised by the analytic choices you
make. However, the decision as to whether to work directly or indirectly is not necessarily a binary choice –
it might be most appropriate, for example, to combine the two – and some packages allow visual data to be
handled both indirectly and directly. Where this is the case, it is particularly important to develop a strategy
for when and why each technique would be used.
Box 7.6 analytic notes
Questions to ask yourself when coding visual materials
• When coding indirectly (via the transcripts) how has your prior placement of
timestamps affected the units you chose to code? (Note that software packages
differ with respect to whether timestamp placement defines ‘minimum codable
segment’.)
• If coding directly, how does the content of the media and what you are interested
in analytically affect the size of data segment you create?
Coding in software, whatever the approach
The use of software enables the combination of approaches to coding in ways which are more dynamic
than is possible when working on paper (e.g. using highlighter pens on hard-copy transcripts) or with non-
bespoke software packages (e.g. word-processor or spreadsheet applications). Whatever the approach,
using software encourages the cyclical and iterative nature of qualitative research. The structure and
functionality of CAQDAS packages do not promote in themselves a linear progression of tasks. Coding
qualitative data is part of a flexible process, and coding using software can also be cyclical and iterative in
nature, regardless of the approach(es) employed.
There are two main issues to think about in approaching coding when using software:
• the basis upon which codes are generated;
• how different types of codes and coding techniques help at different times in the analysis.
Whilst no qualitative software program will, on its own, solve either issue, such programs can support different
approaches to both.
The overriding aim of coding is to facilitate developing a detailed understanding of the phenomena which the
data are seen as representing. This may involve gaining an insight into the underlying meaning respondents
attribute to a social situation or particular experience, identifying patterns in attitudes, or investigating
processes of social interaction. Employing a systematic coding strategy will allow you to revisit significant
instances, to think about them again and to produce further insights. Be clear how the codes you use are
helping you make sense of the data.
Bases for generating codes
The codes you develop may be influenced by a number of factors, including:
• theoretical context
• study aims
• research questions
• methodology and analytic approach
• amount, kinds and sources of data
• level and depth of analysis
• constraints
• research audience.
A code may represent a deeply theoretical or analytical concept; it could be completely practical or
descriptive; or it could simply represent ‘interesting stuff’ or ‘data I need to think about more’ (Chapter 6). A
project will usually consist of different types of codes. As analysis proceeds, the purpose and use of codes
will usually change and you may collect them together in different ways. Above all, codes provide access to
those parts of the data which allow you to think about the phenomena you want to examine (see Chapter 9
for more on making the most of coding schemes).
As well as having multiple functions, codes may be generated in a number of ways. For example, they may
be based on:
• ideas or concepts – derived from existing literature in the research area and/or developed from close
reading and thinking about the data.
• themes or topics – identified within data through close reading and thinking.
• language or terminology used in the data – whether words or phrases used by respondents or found
in documentary evidence, or (un)conventional structures in discourse or narrative.
Labels and definitions need to be meaningful in the sense that they indicate the nature of data grouped at that
code in some way. This may have a descriptive or more analytic purpose, which in turn may vary according
to the approach to and stage of analysis.
Box 7.7 analytic notes
Limits and cautions when using software coding tools
What do you gain/lose in your approach to coding? It is important to think carefully about
the analytic and practical implications of different ways of working. Among the issues that
can cause problems are:
• Seeing coding as analysis in and of itself
• Generating too many codes
• Not knowing when to stop coding
• Not keeping on top of code definitions
• Misunderstanding the difference between code application counts and counts of
the occurrence of an aspect contained within data (Chapter 8)
• Misunderstanding the purpose of coding schema structures (Chapter 9).
Concluding remarks: using software to support your approach to
coding
There are many ways in which to organise qualitative data through coding. The processes and sequences you
go through may be influenced by a range of factors. However codes are generated and applied, their purpose
is to enable you to revisit data and to carry on thinking about them. As such, codes function as ‘heuristic
devices for discovery’ (Seidel and Kelle, 1995: 58). Coding is not about perfectly capturing an instance or
concept. Codes act as signposts to remind you to go back and think about an issue and the data linked to
it again. Using software offers flexible ways to code and supports discrete and combined strategies. This
flexibility, however, requires being clear about how and why different codes are generated, applied and used.
There are various ways of achieving this, including consistent and meaningful code definition (Chapter 7), the
use of integrated memo tools (Chapter 9) and modelling ideas and relationships (Chapter 10).
We are not advocating adherence to a particular methodology, process or strategy. Rather, we would
argue that different coding strategies are suitable in different situations and that there is no single ‘right or
wrong’ way of coding. It may be appropriate to follow different procedures and processes for coding different
types of data within the same project as well as in different projects.
It is important to be aware of the ways your chosen software handles coding processes and to develop an
efficient coding strategy of your own within your project. A balance needs to be found between your analytic
needs vis-à-vis coding and the ways in which your chosen software will support you during the various coding
phases. The following chapters have been written to help you reach such a balance.
Chapter Exercises
The following exercises are designed to help you become familiar with the coding functionality within your
chosen software and to think about the suitability of different approaches to the generation and use of codes
such that you can develop a technologically informed strategy for the use of coding tools in your own work.
As stressed throughout the main body of this chapter, although software coding tools can open up the way we
think about the descriptive, thematic and conceptual indexing of data, working with them in a methodological
vacuum is likely to lead to more, rather than less, confusion.
We therefore assume that you will have considered the broader literature concerning qualitative coding
before embarking in earnest on coding your own raw data. That said, early experimentation with coding
tools in the ways listed in the exercises below, and contained within the software-specific instructions on
the companion website, can usefully contribute to the development of an effective and transparent coding
strategy. In the experimentations you undertake, we encourage you to think explicitly about the relationship
between qualitative (and mixed methods) methodology and the possibilities technology affords. Whether
your end result will be an undergraduate dissertation, postgraduate thesis or journal publication, reflective
discussion about the nature of software-supported coding and the analysis which it supports will always be of
value (see Silver and Lewins, 2014, for more in-depth discussion of these issues).
Generating codes
1. Create a priori codes. Any key issues or themes you know you will be interested in can be
generated independently of data at any stage. They may be fairly broadly specified and/or
identified from existing literature and theory.
a. Generate some empty groups of codes for any deductively derived themes and
name them appropriately.
b. Create deductive codes for each case-study example.
c. Create sub-codes as appropriate to the case study and software you are
using.
2. Create codes grounded in the data. Whatever their type or purpose, new codes can always be
created whilst reading/looking through data, and linked immediately to the precise segment
which prompted that new idea, concept or category. Generate codes for interesting aspects
you identify inductively in the data which seem to be relevant to your thinking about the
research questions, and name them appropriately.
3. Create in vivo codes. If you are particularly interested in the language used in the data, or if a
term is identified which neatly encapsulates an idea or theme, you can create codes in vivo.
Many packages allow words and phrases used in the data to be quickly and automatically
turned into a code label. Be aware though that if you use this tool widely, it has methodological
implications. It may seem very useful as a short-cut to code generation, but if working
interpretively you should use it critically, being aware that it only has a temporary role to play
in the whole process of producing more analytic ideas about the data.
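The three routes to generating codes listed above can be sketched as a small data structure. This is an illustrative sketch only: the `Code` class, the `origin` labels, the helper functions and the example text are all hypothetical and do not correspond to any specific package.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Code:
    name: str
    origin: str                      # "a priori", "inductive" or "in vivo"
    parent: Optional[str] = None     # name of the parent code, if a sub-code
    segments: list = field(default_factory=list)  # (document, start, end)

codebook = {}

def create_code(name, origin, parent=None):
    """Add a code to the codebook if it is not there yet, and return it."""
    if name not in codebook:
        codebook[name] = Code(name, origin, parent)
    return codebook[name]

def code_in_vivo(document, text, start, end):
    """Use the selected words themselves as the code label and link the segment."""
    code = create_code(text[start:end], "in vivo")
    code.segments.append((document, start, end))
    return code

# 1. A priori codes exist before any data are linked to them.
create_code("Barriers", "a priori")
create_code("Cost", "a priori", parent="Barriers")

# 2. A code grounded in the data is linked at once to the prompting segment.
text = "Honestly, it felt like walking on eggshells around my manager."
grounded = create_code("Managing relationships", "inductive")
grounded.segments.append(("interview_01", text.index("it felt"), len(text)))

# 3. An in vivo code takes its label from the respondent's own words.
phrase = "walking on eggshells"
start = text.index(phrase)
in_vivo = code_in_vivo("interview_01", text, start, start + len(phrase))
```

Recording the origin of each code is one possible way to keep track of which labels are temporary in vivo devices and which have been deliberately named.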
Apply existing codes to data
1. Define the amount of text to be coded. Whether it is the word, phrase, sentence, paragraph or
whole document, the researcher specifies the relevant unit of context.
2. Apply codes as relevant. Apply as many codes to a text segment or to overlapping segments
as relevant, if this seems methodologically appropriate. Most software packages allow this sort
of deductive coding (i.e. where you apply existing (probably broadly specified) codes to data
segments as you identify them) in several different ways. Experiment with them all and think
about which best suit your preferred way of working.
3. View codes appearing in the margin. Viewing codes this way is useful because it allows you
to see co-occurring codes and visually identify interesting patterns and relationships as you
proceed. Make notes in memos or annotations for interesting early patterns you are seeing.
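What the margin view shows visually, codes applied to the same or overlapping segments, can also be expressed as a simple overlap check. The sketch below assumes a hypothetical list of codings as (document, start, end, code) tuples with character offsets, where `end` is exclusive; real packages store and report this differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical codings: (document, start_char, end_char, code).
codings = [
    ("interview_01", 0, 120, "workload"),
    ("interview_01", 80, 200, "stress"),
    ("interview_01", 150, 260, "coping"),
    ("interview_02", 0, 50, "workload"),
    ("interview_02", 60, 100, "stress"),
]

def overlaps(a, b):
    """True if two codings are in the same document and share some text."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

# Count each pair of different codes whose segments overlap.
co_occurrence = Counter()
for a, b in combinations(codings, 2):
    if a[3] != b[3] and overlaps(a, b):
        co_occurrence[tuple(sorted((a[3], b[3])))] += 1

for pair, count in co_occurrence.items():
    print(pair, count)
```

In this example "workload" and "stress" co-occur in interview_01 but not in interview_02, where the two segments are adjacent rather than overlapping. Whether adjacency should count as co-occurrence is an analytic decision, not a technical one.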
Define and list codes
1. Define the meaning, scope and intended application of codes. It makes sense, where possible,
to do this as codes are generated. Therefore, consider the function and impetus for generating
the code and revisit and refine definitions. Date any changes in definition so you can track
developments in thinking. Small leaps in reasoning are important, and increasing the
transparency of your work in this way will add to the potential quality of your work. Experiment
with defining codes using memo, comment or description tools.
a. Working deductively, it usually makes sense to add code definitions at the time
of code creation. This is often particularly important in team projects when a
high level of agreement about the meaning and intended application of codes
is needed in order to achieve as much consistency as possible.
b. Working inductively, it is often the case that the defining of codes does not
happen immediately upon their creation. This can be because code
names themselves are often more specific, but also because researchers want
to ‘get on’ with the task of coding without being distracted or held up by
generating carefully worded definitions. Where either is the case, you can go
back at any point to add code definitions. Some packages provide easily visible
ways of seeing which codes have definitions.
c. See Chapter 9 for discussion about the role of code definitions in rationalising
coding schema structures, moving on from descriptive or thematic indexing of
data through coding and ensuring you do not end up with an unwieldy,
inefficient or too prolific coding schema.
2. Generate code reports. Listing codes and their definitions is useful for analytic and practical
reasons. Generating a code list can help at various stages, for example when thinking about
grouping codes, generating higher-level categories, or reorganising the coding schema.
Retaining these reports can provide a useful ‘snapshot’ of the various stages of the analytic
process.
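Dating changes to definitions and listing codes in a report can be sketched as follows. The function names, the example definitions and the dates are hypothetical; in practice the memo, comment or description tools of your chosen package would hold this information.

```python
from datetime import date

# Code name -> list of (date, definition), oldest first.
definitions = {}

def define(code, text, when):
    """Record a new or revised definition without discarding earlier ones."""
    definitions.setdefault(code, []).append((when, text))

def code_report(codes):
    """List codes alphabetically with their latest dated definition."""
    lines = []
    for code in sorted(codes):
        history = definitions.get(code)
        if history:
            when, text = history[-1]
            lines.append(f"{code} [{when.isoformat()}]: {text}")
        else:
            lines.append(f"{code}: (no definition yet)")
    return "\n".join(lines)

define("stress", "Any mention of feeling under pressure.", date(2024, 1, 10))
define("stress", "Pressure attributed to work; excludes family-related pressure.",
       date(2024, 2, 3))

report = code_report(["workload", "stress"])
print(report)
```

Keeping the earlier definition alongside the revised one preserves the trail of reasoning, and saving each report gives a dated snapshot of the coding schema at that stage.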
Refine coding
1. Increase or decrease the amount of text coded. Refining the coding in this way may be useful
when considering themes or concepts in more detail and moving on beyond the first pass
through the data. Experiment with both increasing and decreasing the amount of data
assigned to particular codes, and notice how the changes appear in the margin view.
2. Unlink a code from a point in the data. It is just as easy to uncode data as to code it if you
change your mind about the need for a code at a point in the data, or make an error. There will
usually be several ways to uncode data, so become familiar with these. In packages where
marked data segments are independent objects in their own right, notice that uncoding does
not automatically unmark the segment; it can still be considered as a segment of interest (and
annotated, linked, etc.).
3. Experiment with colour coding tools. Most packages allow codes to be assigned colour, which
is usually reflected in other windows as well as in the code margin view (although you may not
be able to retrieve on that basis). Consistent application of colour can add an additional
dimension to coding, for example to differentiate between types of or purposes for coding, so
think carefully about how colouring codes can have an analytic function in your project (this is
discussed in more detail in Chapter 8).
4. Experiment with assigning weight to individual code assignations (where possible). Some
packages allow individual coded data segments to additionally be assigned ‘weight’. This
might have several purposes, including to specify how strongly a particular attitude is being
expressed (this is discussed in more detail in Chapter 8).
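The refinement operations in this exercise can be sketched for the case where marked segments are independent objects. The `Segment` class, its method names and the weight scale are assumptions for illustration; packages differ in whether segments survive uncoding and in how weights are expressed.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    document: str
    start: int
    end: int
    codes: dict = field(default_factory=dict)  # code name -> weight (or None)
    note: str = ""

    def resize(self, start, end):
        """Increase or decrease the amount of text the segment covers."""
        if not 0 <= start < end:
            raise ValueError("a segment must cover at least one character")
        self.start, self.end = start, end

    def apply(self, code, weight=None):
        """Link a code to this segment, optionally with a weight."""
        self.codes[code] = weight

    def uncode(self, code):
        """Unlink a code; the segment itself remains marked."""
        self.codes.pop(code, None)

seg = Segment("interview_01", 100, 180)
seg.apply("stress", weight=70)   # hypothetical scale: strength of expression
seg.apply("workload")
seg.resize(90, 240)              # widen the segment to include more context
seg.uncode("workload")           # changed our mind about this code
seg.note = "Still of interest: compare with the later passage on deadlines."
```

After these steps the segment still exists, still carries its annotation and its remaining code, and covers a wider span of text than when it was first marked.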
http://dx.doi.org/10.4135/9781473906907.n8