Coding Qualitative Data - How to Analyse User Interviews

We've spent a lot of time recently thinking about how user interviews and other forms of qualitative data get coded, and whether there are ways to get more value from that element of our work. To explain the conclusion we've come to, let's unpack what coding is and why we do it for any newcomers, and then talk briefly about how we think the product & UX industry could get more value out of its codes.

What is Qualitative Data Coding?

There are two different elements to qualitative research - listening to what people say via interviews, and watching what people do in given contexts. The pair of these form the basis of most modern ethnographic research. The challenges we face often arrive as we create accurate descriptions of the populations studied; gestalts (organised collections) formed out of individual experiences which have to represent the essence of the individual experiences of those we've interacted with. If it were art, we'd be painting a picture with hundreds of subtle hues, not simply mixing every paint together to end up with a giant, toneless splodge of data.

That means at the beginning, we have to think about how we are going to collect our data, and how we're going to analyse it. There's ongoing debate around constructivist^[1] vs critical realist^[2] positions with regard to evidence, and also the merits of methodologies like grounded theory compared to less prescriptive tools like template analysis or action research, but regardless of the method, we have to codify and analyse our data. Without conducting some form of thematic analysis on the information collected, there wouldn't be much to learn or present.

As a result, the codes which sit around our evidence, whether that's transcripts, data from analytics tools, surveys, or anything else, form the foundation of almost every form of qualitative data analysis. Coding is the tool behind notating observed behaviours, finding commonalities, and bringing a human dimension to evidence. It creates our framework for creating user archetypes, user needs and journey maps. We use codes to mine and link information, to be able to back up research findings with specific examples of behaviour, and to demonstrate truth in research.

As a result, coding is *much* more than simply annotation before the real work starts. It's an early stage analysis step in itself, which informs data collection. It's also why auto-transcription tools, whilst useful, don't save that much time in the long run - the time spent is in the analysis, which automated tools are still a long way from replacing.

The problem we often observed when watching researchers codify research though is that our current crop of digital tools are just reflections of how we've historically coded research - on paper with highlighters and scissors. There's nothing wrong with that, but digital tools allow us to interact with information in other ways. In the same way as the iPhone fundamentally reinvented how we think about portable personal computing by reflecting the fundamental constraints and benefits of the mobile form factor, our tools should reflect the medium they exist in. When those tools become digital, we should reflect that, and not simply try and adapt paper-based ideas of work.

So how should we code data, if we're going to get the greatest value out of it?

How to Code Qualitative Data

Let's start by thinking about what a code is, in terms of its function. A code is a commonly used abstracted word or phrase, representing an idea or concept, relating to a passage of text. Different, related passages get tagged with the same code, to build up a corpus of text with common concepts and/or themes. We do this over five stages, which progress in a helical fashion, circling around whilst we make progress, rather than linearly one after another. The stages at their most basic are:

Familiarisation: Reading the data, learning what it contains, and making notes about common themes and elements which seem to crop up as you go through it.
Initial Coding: Creating the list of codes, noting types of information they pertain to, and describing each.
Re-Coding & Collation: Defining relationships between codes and refining the coding structure. This can be in terms of a hierarchy, or a flat, facet-type structure, or some form of hybrid.
Reflection: Looking back at the data and comparing it to the codes and relationships to see if they reflect an accurate picture of the information, and if anything is missing or errant, fixing it.
Definition: Going through the finalised list of codes and relationships, and creating a definition for each, noting how they relate to each other, the story the data tells, and how it relates to the whole dataset.

The aim in the early stages is to code everything, whether or not you think it'll be used later. In your first few passes as you create your initial codes and then re-code, you're only aiming to make the information as easy to integrate as possible, not to define the final codes or analyse it. The analysis will come as you start your reflection processes.

All the above processes can either be tackled asynchronously alongside data collection, so that the collection format can be informed by earlier data, or after the fact. If we want to refine our future data collection to narrow down on a particular area, we'll need to run this process whilst we're collecting early data, to see what turns up. If we want to simply get a broad sampling to analyse, or we're working with pre-collected data or re-analysing older information, then we can do the whole process separately to the collection.

Types of Codes

When we get into finding and creating our code sets, there are different approaches, different types of codes, and different levels of code. Let's take a short look at all the different ways we can think about codes, starting with the three basic forms of code generation.

Deductive, Provisional and A Priori Codes: Use a pre-built set of codes to look at the data. That might be from research you did beforehand, from themes found in a literature review, or from other researchers' reports on similar areas. Once the codes have been defined, you then go through the text and code it based on the lens you've chosen to view it through. Particularly useful when there's a particular challenge you're trying to address, and/or you're specifically attempting to avoid bringing biases the team may have into the analysis.
Inductive and A Posteriori Codes: Beginning without any codes, you build up a bank of codes as you go through your data and see what turns up. The codes created are fluid, and can be dropped, added to or changed as your understanding of your dataset grows. Useful when there's no prior knowledge about what might turn up, or you're looking for initial ideas for how to refine experiences and don't yet have any hypothesis.
Hypothesis or Assumptive Codes: A sub-type of a priori codes, using a pre-built set specifically designed to evaluate evidence in relation to a hypothesis and its assumptions.

Most actual studies tend to be something between the two, starting off with a basic lens to view the information through, but moulding and expanding it as you learn. What you use depends on what you're analysing, how much you know about the dataset you're gathering/have gathered, and what you're aiming to get as a result of it.

In either case, your codes should reflect and represent your data accurately by the time you're done. Now let's think about the way we come up with the words we use for our codes. Whilst this isn't an exhaustive list, as there's hundreds of different ways to code, here's a few of the more common generic ones:

In Vivo Codes: Latin for within the living. Creating codes using words the participant actually used to try and both stay close to the intent of what they meant, and to be able to use language which reflects the participant's modes of speech. This is particularly useful in environments where the language can start to consist of a lot of area-specific acronyms or jargon. Also helps create a summary of the whole piece by pulling out the most important elements.
In Vitro or Descriptive Codes: Latin for within the glass. Creating codes which reflect what the user is describing, without using their language. This is useful when you have something commonly described in multiple ways, which you want to consolidate in one single, clear, concise code.
Process Codes: Gerund-based codes ("ing" words like doing, running, writing, changing, becoming, ending...) used to describe observable or conceptual activity. These also tend to be temporal in nature, and imply a delta, as something will be changed as a result of them.
Conceptual Codes: High-level codes which describe what something is about more broadly. For example, someone might be running or swimming, but in either case they might be doing something relating to either fitness or survival. Generally used to impute a higher level of reason behind an action. Generally ties to process codes.
Values Codes: Codes which relate to the personal or social values either held or described by the participant. Commonly used to unpack attitudes, biases and beliefs.
Open Codes: Early-stage, tentatively defined and worded codes, which are expected to change and evolve as the analysis continues. Generally created as part of a posteriori coding.

Not all codes are generic in nature though - we also have certain types of codes which are highly defined and encapsulate a specific context. Some examples of those are:

Emotional Codes: Assigning one of a set of emotional codes to the participant and/or their behaviours. Generally also uses the idea of protocol coding, where a pre-set list of codes is used to describe the emotions observed. Common emotive protocols are those by Plutchik^[3], Aristotle^[4], and more recently the GoEmotions taxonomy^[5].
Consanguineous and Versus Codes: Codes which describe relationships between groups within a study. Consanguineous groups are sub-groups of a total population with a relationship of any form to others, whilst versus codes are specifically antagonistic relationships between (generally) two populations from a total group. For example, teachers and students could be described either as consanguineous or versus, depending on context, and the context inferred by related process codes co-occurring with it.
Domain and Taxonomic Codes: Codes which sit as part of a structure of semantically related codes, generally either in a tree or graph form. Generally created during the re-coding and collation, and reflection stages as themes and domains emerge. These are not just generic categories - they relate specifically to behaviours and language specific to the participants and their related backgrounds. For example, if surveying programmers, you might have UI, Back-end, Front-end and so on as codes, which could sit in an Areas of Responsibility domain.
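As a rough illustration of a domain structure, the programmer example above could be sketched as a small tree in Python. The class name `CodeNode` and its `add` and `path_to` helpers are assumptions made purely for this sketch; they don't come from any particular research tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CodeNode:
    """A code (or domain) which may contain more specific child codes."""
    name: str
    children: List["CodeNode"] = field(default_factory=list)

    def add(self, name: str) -> "CodeNode":
        """Attach a child code beneath this one and return it."""
        child = CodeNode(name)
        self.children.append(child)
        return child

    def path_to(self, target: str) -> Optional[List[str]]:
        """Return the names from this node down to `target`, or None."""
        if self.name == target:
            return [self.name]
        for child in self.children:
            sub_path = child.path_to(target)
            if sub_path is not None:
                return [self.name] + sub_path
        return None


# Hypothetical domain built from the programmer example.
areas = CodeNode("Areas of Responsibility")
for code in ("UI", "Back-end", "Front-end"):
    areas.add(code)

print(areas.path_to("Back-end"))
```

A tree is the simplest case. If a code needs to sit in more than one domain, a graph (where a code can have several parents) would be the closer fit.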

The Value of Coding Data

This process of generating codes and applying them to the data allows us to get a handle on our data, so that we can understand and interrogate it. It helps us see what people are trying to do, the goals they're attempting to achieve, the challenges they encounter and successes they win, and the hacks they use to get there. It lets us see the story behind quantitative data, so that later we can find ways to improve user experience, through needs-driven research.

It also allows us to integrate and re-organise research data. Information gathered from interviews and observational studies will never run in the order you want to think about and present it in. Codes let you bring order to that particular mess, by giving you tools to break down and re-build the data into a more usable structure.

At the most basic level, we have individual passages of text, with enough context to make sense. These are our *atomic units* of information.

Moving up a level, we assemble sets of individual passages under common codes, creating *molecules* of information. This lets us understand and compare how frequent any particular element is, and whether a code relates to outlying information or a subset of the population of users. It's through this step that our qualitative data can be measured quantitatively, by starting to put some measures around it.
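To make the atoms-and-molecules idea a little more concrete, here's a minimal Python sketch. It groups coded passages by code, then works out what share of participants each code touches. The `Passage` class, the function names, the participant IDs and the sample quotes are all invented for illustration, and counting distinct participants is just one of several measures you might choose.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Passage:
    """An atomic unit: a passage of text, who said it, and its codes."""
    participant: str
    text: str
    codes: FrozenSet[str]


def build_molecules(passages: List[Passage]) -> Dict[str, List[Passage]]:
    """Group passages under each code they have been tagged with."""
    molecules: Dict[str, List[Passage]] = defaultdict(list)
    for passage in passages:
        for code in passage.codes:
            molecules[code].append(passage)
    return dict(molecules)


def participant_share(
    molecules: Dict[str, List[Passage]], total_participants: int
) -> Dict[str, float]:
    """Fraction of all participants with at least one passage per code."""
    return {
        code: len({p.participant for p in tagged}) / total_participants
        for code, tagged in molecules.items()
    }


# Made-up sample data, purely for illustration.
passages = [
    Passage("P1", "I export everything to a spreadsheet first.",
            frozenset({"workaround", "exporting"})),
    Passage("P2", "The export button is hard to find.",
            frozenset({"exporting"})),
    Passage("P3", "I just ask a colleague to do it for me.",
            frozenset({"workaround"})),
    Passage("P1", "Exporting takes me most of a morning.",
            frozenset({"exporting"})),
    Passage("P4", "Nobody showed me where anything was.",
            frozenset({"onboarding"})),
]

molecules = build_molecules(passages)
share = participant_share(molecules, total_participants=4)

for code, fraction in sorted(share.items()):
    print(f"{code}: {len(molecules[code])} passages, "
          f"{fraction:.0%} of participants")
```

Counting participants rather than passages stops one talkative interviewee from making a code look more widespread than it is. A code with a low share may point to an outlier, or to a specific sub-group worth a closer look.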

These molecules are then themselves formed into larger *structures*, creating a narrative which can be used to inform your organisation about the data collected from several codes. It illuminates different relationships between codes, and the nature of the codes themselves in a larger context. This is generally the level that everyone else outside your department will engage with the knowledge you create, and then be able to dig back down where required for their work.

There's various methods to creating structures. Some of the more common methods and results include:

Focused, Pattern, Selective and Axial Coding: Various slightly different ways of going back through generated codes and merging, removing, grouping and categorising them, based on their semantics and logical relationships to each other. These are designed to refine the coding scheme used to just include those which accurately and completely (as a whole) describe the data.
Theoretical Codes: Meta-codes which form the structure uncovered by the above processes.
Longitudinal or Temporal Codes: The organisation of codes and structures to show progression of some construct or set of behaviours over time.
Spatial Codes: The organisation of codes and structures to show progression of some construct or set of behaviours through distance or some form of territory.

Finally, structures are brought to form *vaults*, combinations of narrative which reinforce each other, and define a larger story. We bring structures together to guide strategy and inform the higher levels of an organisation about what we've learned in a way which conveys enough detail and references to specific examples to be robust and convincing, but at a high enough level to allow longer-term, organisational planning decisions to be made from it.

The second reason we code is that the process itself helps us find issues in the ways we think. By creating codes as a group, with people who have a diverse set of thought processes, you create a broad state space of thought with a narrow total overlap. This allows the different researchers and analysts looking at the central body of work to find biases and errors in thinking which other people on the team may have had. This creates an environment where the verisimilitude (how like the truth something is) of the coding structure increases, as it gets challenged and improved. This is, incidentally, the same fundamental principle behind how hypothesis-driven design works.

How It Should Evolve

The way we code today, thanks to tools like Dovetail and EnjoyHQ, is still fundamentally the same as it has been before, but is becoming more collaborative and open. This is vital because, as researchers, we need to work to understand how our axiologies (values and beliefs) may influence our research, from source choices, interview structure or question selection, and populations studied, to the interpretation and analysis we create.

This creates an obvious challenge - how can we ensure that our analysis is reliable, and not influenced by biases that we can bring with us? After all, it's entirely possible for two researchers to view the same event and come up with totally different interpretations around it.
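One simple way to surface those differing interpretations is to compare the codes two researchers applied to the same passages, and flag where they diverge. The sketch below uses Jaccard overlap (shared codes divided by all codes used) as the comparison. That's our illustrative choice rather than an established standard for this, and the function names, passage IDs and codes are all hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of codes: 1.0 is identical, 0.0 is disjoint."""
    if not a and not b:
        return 1.0  # Both coders left the passage uncoded, so they agree.
    return len(a & b) / len(a | b)


def disagreements(
    coder_a: Dict[str, Set[str]],
    coder_b: Dict[str, Set[str]],
    threshold: float = 1.0,
) -> Dict[str, float]:
    """Return passages coded by both researchers whose overlap is below threshold."""
    flagged = {}
    for passage_id in sorted(coder_a.keys() & coder_b.keys()):
        score = jaccard(coder_a[passage_id], coder_b[passage_id])
        if score < threshold:
            flagged[passage_id] = score
    return flagged


# Made-up codings of the same three passages by two researchers.
coder_a = {
    "p1": {"exporting"},
    "p2": {"workaround", "frustration"},
    "p3": {"onboarding"},
}
coder_b = {
    "p1": {"exporting"},
    "p2": {"workaround"},
    "p3": {"trust"},
}

print(disagreements(coder_a, coder_b))  # {'p2': 0.5, 'p3': 0.0}
```

The scores aren't a verdict on who is right. They're a prompt for the team to talk through the flagged passages, and to work out whether the code definitions or the interpretations need to change.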

There's also a second challenge here, because ultimately the aim of our research isn't to present a report; it's to find new knowledge, to inspire hypotheses to test, and ultimately to generate actions which improve the organisation in some way. When we think about it in that light, there's a few ways we can potentially improve our current ways of working, and bring extra utility to the whole coding process.

Ensuring Reliability

If we think about reliability, the challenge we're facing is that we're trying to ensure that the potential avenues for inaccurate interpretation of our data are as narrow as possible. That means ensuring everything is clearly defined, clearly understandable, and that the research is repeatable. To get there, there's four things we need to do.

Firstly, we need to get a diversity of minds conducting and reviewing the research. Different people, with different backgrounds and perspectives, with different ways of thinking. The more dissimilar the group conducting the analysis, the more likely it is that biases and errors in thinking or interpretation may be picked up, as there's an ability to challenge the codes created (or ones which may be missing). That doesn't just mean ethnic or gender differences, but also differences in areas of responsibility and expertise. Someone with training in copywriting and language analysis will pick up on different things in an interview than someone with training in anthropology or behavioural economics, for example.

On a related note, have participants (or if you can't re-contact the original people, as similar a group as possible) evaluate your interpretation, to see whether the analysis accurately reflects what was said. This is about making sure the trees don't get lost for the forest, and you preserve the dignity and importance of individual experience, whilst also reflecting the whole.

Thirdly, aim for plurality in your explanations of the data. As much as possible you want to generate various different competing hypotheses, so you can test them against each other to see which don't stand up. That way, you look for what might be, rather than finding an initial hypothesis and stopping there, and you also allow for more creativity in looking for what actions to take as a result of your findings.

Finally, verify your research by looking for more data. This is also known as triangulation, as it allows you to think of evidence multi-dimensionally. In the same way as we try and falsify hypotheses with a null hypothesis, you should try and falsify your findings with alternative sources of evidence, to check your conclusions are robust.

Expanding Utility

To bring added utility, as we move into an age where tools like Hirundin and Gleanly exist, we need to think more broadly about our a priori codes. I'd posit that as Charmaz and others have said, it's impossible to come to research with no pre-existing biases, expectations, desires or views. Instead, it's better to try and articulate those so you can be aware of them, and conduct your research in the most effective way possible.

That means having a suite of sets of codes which outlines higher-level concepts under investigation before research is started, and which cut across different studies and areas of an organisation. By creating these codebooks, we can define what matters to an organisation more broadly, what has been found in prior research, the goals of the current research focus, and potential activities which may be informed by the study.

This is something which has traditionally been tricky, as research has sat apart, and once conducted tends to be reduced down to a report which may or may not be re-used, or lost to time in an archive somewhere. However, by beginning to take our findings and embed them in a broader context, we can improve the discoverability of our findings, and create suites of codes which allow us to truly begin to demonstrate value, tie business results to our work, and prove the worth of researchers to stakeholders.

[1] As outlined by Kathy Charmaz in Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis
[2] As outlined by Roy Bhaskar in The Possibility of Naturalism
[3] Fear; anger; sadness; joy; disgust; surprise; trust; and anticipation
[4] Anger & calmness; friendship; fear & courage; shame & confidence; kindness & cruelty; pity; indignation; envy & jealousy; and love
[5] Admiration; amusement; anger; annoyance; approval; caring; confusion; curiosity; desire; disappointment; disapproval; disgust; embarrassment; excitement; fear; gratitude; grief; joy; love; nervousness; optimism; pride; realization; relief; remorse; sadness; and surprise