The risk of AI deepfake in qualitative analysis is a reality

Last month I had the pleasure of being a main speaker at the Social Research Association (SRA) annual conference. I invited Steve Wright to join me in a conversation about recent developments in the field of CAQDAS resulting from the availability of large language models (LLMs).

Our talk was recorded and you can watch it here, where you can also download the slides and access the references.

The research we did and the discussions we had in preparing the talk broadened my thinking beyond my previous reflections, illustrating starkly that what I've been calling the 'risk of qualitative deepfake' is already a reality.

If you’re interested in discussing these issues and exploring AI tools for QDA in action, join me at one of the following forthcoming sessions:

August 20th & 21st. Using AI for Qualitative Analysis throughout the Research Cycle. (Live and On-Demand 2 day workshop hosted by Instats)
September 10th. The implications of generative-ai on Creative Research practice: appropriate, ethical and transparent uses (2 hour workshop as part of the International Creative Research Methods Conference)
September 20th. AI-Assisted Qualitative Data Analysis (Live online 1 day workshop hosted by the SRA)

What is "Qualitative Deepfake"?

In our talk we mapped five core Generative-AI capabilities onto typical phases of the qualitative research process and discussed them in terms of:

whether and how they might increase quality in analysis,
their time-saving potential in terms of what they might free up for humans to focus on, and
the costs that their development and use have (of which there are many; see a previous post for some reflections on this, and keep your eyes out for more to come).

I love a good visual, and felt a Sankey diagram was a neat way to show how the Gen-AI activities (right axis) are being used across the typical phases of a qualitative research project (left axis).

Figure 1. Qualitative Research Process and Gen-AI Activities. First shown by Silver & Wright, SRA Conference 6th June 2024

What this emphasises is the possibility of doing a whole qualitative project, from start to finish, entirely using Generative-AI tools, with no, or minimal, human input.

This is what I call "qualitative deepfake".

Let’s break this down a little bit from the perspective of some typical phases of a qualitative research project. How is Generative-AI being used in the qualitative research workflow?

Research Design

Some researchers are using chatbots for ideation: asking them to generate ideas for research topics and formulate research questions around them, or using chatbot functionality within CAQDAS packages to converse with published research reports and identify areas for further research.

In one of the researcher presentations at the Symposium on AI in Qualitative Analysis, which the CAQDAS Networking Project organised with the SRA in November/December 2023, Heidi Hasbrouck PhD and Deana Kotiga, researchers from the Ethnography Centre of Excellence at Ipsos UK, recounted using their in-house platform for these purposes and described how this contributes to their design process.

Literature Review

The literature review process is an area in which the capabilities of Generative-AI have exploded. Many researchers are excited about the possibilities, especially those looking to streamline the process. There are many new AI-driven tools designed to facilitate reviewing literature, and their capabilities are being integrated into existing tools.

As I'm often heard to say in our workshops, reviewing literature is a form of qualitative data analysis, because literature is a form of qualitative data and reviewing is a form of analysis. Therefore all the Gen-AI activities mapped onto the qualitative research process in the Sankey diagram are being used to do literature reviews.

However, of particular note are tools that generate lists of articles to read on a topic. These are now being integrated into established CAQDAS packages: for example, the recent “Paper Search” Beta addition to ATLAS.ti enables researchers to search for, cite, and import research papers, journal articles, and studies directly into the analytic workspace.

These sorts of developments may bring a step-change in how we view CAQDAS packages: as containers for, and connectors to, all aspects of the qualitative research process. In addition, the Gen-AI activities of summarise, converse and label are being used to facilitate literature reviews in ways similar to those described below.

Several years ago I wrote a post about the importance of ensuring analytic strategies (what you plan to do) drive software tactics (how you plan to do it) using an example of a participant in one of my workshops who’d tried to do a literature review using a CAQDAS-package without reading any of the articles (they failed). I described this as a 'horror story'. The capabilities researchers have at their fingertips now, if not used appropriately, are even more of a horror-story for the craft of literature reviewing.

Data collection

When I first started to realise the potential implications of Gen-AI for the qualitative process, my focus (as it usually is) was on analysis. But what's happening in the space of qualitative data collection is equally profound and, in some circumstances, more worrying:

similar to how Chatbots are being used to suggest research topics and questions, they are being used to generate suggestions for participant groups from which to recruit and possible sampling strategies, and to generate interview questions or focus-group guides based on the research topic.
but they're also being used to actually generate data: for example, bots can now interview participants using messenger services such as WhatsApp, and speaking bots can interview participants in real time.
finally, and most concerningly, they’re being used to create transcripts of what ‘typical’ participants with certain characteristics are likely to say. Such AI-generated content is referred to as “synthetic” or “silicon” data. There have even been articles published suggesting that this is a legitimate thing to do.
I have yet to fathom any circumstance in which generating fake data is legitimate. Let's call a spade a spade, as my colleague and great friend Ann Lewins would say: doing this generates fake data. Cut the 'silicon/synthetic' rubbish; it's fake.
If you’re interested in this topic I’d recommend listening to Episode 28 of the Mystery AI Hype Theatre 3000 podcast in which Emily M. Bender and Alex Hanna discuss social science research papers that are essentially calls to fabricate data.

Transcription

When gathering qualitative materials with participants in customary ways (i.e. by audio/video recording of in-person or online interactions), automated speech-to-text tools are now much more accurate than they used to be, although not yet for all languages and dialects.

This is one area where recent technological developments are having a significantly positive impact, especially where the developers of transcription tools understand the importance of data privacy and security for qualitative research. See, for example, the options provided by Transana, designed to suit different research needs, including an option to have qualitative data automatically transcribed without any data leaving the researcher’s computer.

I have mixed feelings about this one. On the one hand, the speed, accuracy and security with which automated transcription can now happen is undoubtedly amazing, not only for saving time but also for opening up access to different forms of data and to qualitative analysis beyond the professions.
And it’s not just the transcription of research data themselves. When I did my PhD I used to drive home after conducting an interview or focus-group, instead of catching the train, so I could record myself reflecting on the encounter using my (then old-fashioned) Dictaphone. I wanted to capture those initial reflections whilst they were fresh in my mind, and speaking them out was a great way to do that. I would listen back to them and incorporate them into the analysis, but I didn’t transcribe them because of the time it would have taken. Nowadays that wouldn’t be an issue, and I’d be able to fully integrate these reflections into an analysis. Daniel Turner makes related points in his blogpost and also in a webinar he ran for us at the CAQDAS Networking Project.
On the other hand, I still stand by the sentiments I wrote about in 2016 in this post on transcription as a moment of contact with data. We do lose something by not transcribing ourselves, and this is something we really should consider carefully. Automated transcription essentially only captures content. It can’t capture emotional tone, non-verbal interaction and so on. I learnt long ago from David Woods the extent to which transcription is an analytic act, and that hasn’t changed just because we have new technology. Time-saving is seductive and useful in many circumstances, but there is also a cost. What might it do to the essence of the qualitative research process if the craft of transcription is completely lost in the face of these hugely powerful technological developments?

Familiarise

Exploring qualitative materials to become thoroughly familiar with them is a core initial analytic activity in many established analytic methods. Many Gen-AI activities can be used for this purpose, but let’s focus on one: AI-generated summaries. Some tools will summarise transcripts and other textual materials automatically as part of a project set-up, and others enable humans to specify the level at which to generate summaries (e.g. each transcript, or selected sections within them).

I've always been an advocate of using tools to facilitate familiarisation with qualitative materials. For example, I believe there is a place in many circumstances for using high-level, quick and reliable tools (word frequency, phrase finder, concept modelling and similar) to contribute to the familiarisation process, in combination with low-level, time-consuming and interpretive familiarisation in the form of reading, thinking and analytic note-taking. From that standpoint, I do believe the capabilities of Gen-AI have a place in qualitative analysis when used to contribute to content-based exploration and familiarisation. Taking the 'bird's-eye' view gives a different perspective from a slower, more considered reading. Neither is necessarily better or worse, they're just different; eagles with their quick overviews and tortoises with their slow, careful consideration of the detail both have a place in the world. The problem comes when summarising qualitative data is taken as a replacement for careful reading, critical thinking and analytic note-taking. More on this in a future post.

Categorise

Core to how researchers have undertaken qualitative analysis up to this point is categorising what is interesting and meaningful within and across materials. This happens in two key ways: the conceptual tagging of segments (aka qualitative coding), and grouping materials according to their features, for example the demographics of participants or the metadata characteristics of documents.

There are some, including Susanne Friese of Qeludra, who believe coding is no longer necessary because of the new capabilities Gen-AI brings. Others are harnessing Gen-AI to do the coding for them: for example, ATLAS.ti offers two options, AI Coding, which will generate and apply codes with no human direction, and Intentional AI Coding, in which humans input research questions and analytic intentions that the AI uses to focus the resulting codes. Other programs, for example MAXQDA, restrict the input of Gen-AI to suggesting codes rather than doing the coding, and DiscoverText has for more than a decade been balancing the power of computers with the interpretive power of humans through its machine-learning (not Gen-AI) tools.

See also this post for other options

Speeding up the time-consuming process of coding is attractive, but the results these tools currently provide are more usefully thought of as another way to inform familiarisation than as being capable of accomplishing the qualitative coding task for us.

What’s the risk?

What does all this mean for qualitative research practice? The risk of qualitative deepfake is that we end up regurgitating garbage. If AI is used to generate research ideas and designs, to collect and/or transcribe data, to summarise, label and produce answers to our questions about it, and then to write up qualitative projects, these “findings” will be fed back into the repository of scientific “knowledge” and then fabricated studies will be drawn upon for subsequent projects.

This puts our profession at huge risk. The credibility of outputs rightly comes into question. How can research outputs be used to inform policy when they are based on fabrications, when they are regurgitating fake data? What are our values as social researchers and how do they fit with these new capabilities? And how do research participants feel about all this? How can our responsibilities to represent their views and experiences be met if we’re letting the AI do the analysis?

These are unresolved questions that we, as the collective community of practice of qualitative researchers, must address. The workshops, webinars, podcasts and blogs that my colleagues and I share are platforms for discussing the issues.

The risk of qualitative deepfake is a reality
