Step-by-step guide for educational research

LLM-Assisted
Content Analysis

A systematic procedure for integrating large language models into content analysis while maintaining methodological rigour and the researcher's interpretive responsibility.

Author

Javier Vidal

Institution

EVORI · Universidad de León

Version May 2026

Document information

LLM-Assisted Content Analysis

Step-by-step guide for educational research

Author

Javier Vidal · Grupo EVORI · Universidad de León

Version

May 2026 · Digital edition for academic distribution

Project

AiRISES · Application of AI in the analysis of informal social networks for guidance in Higher Education. PID2021-125405NB-I00 · Spanish Ministry of Science and Innovation

Team

PIs: Javier Vidal · María José Vieira; Camino Ferreira · Agustín Rodríguez-Esteban · Diego González-Rodríguez · Alba González-Moreira · Estela Mayor-Alonso · Yaiza Viñuela · María Álvarez-Godos · Ainhoa Martínez-Rodríguez · Héctor González-Mayorga

Cite as Vidal, J. (2026). LLM-Assisted Content Analysis: A step-by-step guide for educational research [Análisis de contenido asistido por LLM: Guía paso a paso para la investigación educativa]. University of León.

ISBN 979-13-87583-71-2 · Available at https://evori.net

CC BY-NC-ND 4.0 © 2026 Javier Vidal (University of León). This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. You may copy and redistribute it with attribution; commercial use is not permitted, nor the creation of derivative works.

About the author

Javier Vidal

Full Professor of Research Methods in Education
University of León · EVORI Group

Full Professor of Research Methods in Education at the University of León (Spain) and director of the EVORI research group. His academic work has always been linked to the study of higher education, university quality, and the relationship between education, innovation and society, combining research and teaching with institutional responsibilities in both the university and the Spanish educational administration.

He has collaborated with national and international organisations — including the European Commission, ANECA, the OEI, and the World Bank — on projects focused on educational evaluation, university guidance, and higher education policy. In recent years, his work has turned particularly towards the impact of artificial intelligence and digital technologies on education and research.

Author of numerous scientific and popular publications, he maintains an interdisciplinary and humanist vision of education, attentive both to current technological challenges and to the people and contexts at the heart of learning and social transformation.

Preface

✎ Translator's note

This is an English adaptation of the original Spanish guide Análisis de contenido asistido por LLM. Guía paso a paso para la investigación educativa by Javier Vidal (EVORI Group, University of León). The translation was produced with LLM assistance and reviewed by the author in accordance with Principle P6. The AiRISES case study examples draw on data from Spanish educational forums; they are left in their original form as authentic illustrations of the methodology, and readers from other national contexts are encouraged to substitute equivalent examples from their own systems. Table 7 (the lexical dictionary) has been adapted to a generic anglophone educational framework; a note in the table explains this adaptation. The warm, direct tone of the original has been preserved intentionally — it may feel less formal than typical academic writing.

Translator's declaration of LLM use

The English translation of this guide was produced with LLM assistance (Claude Sonnet, May 2026), used for initial drafting, reformulation, and terminology search. All translation decisions — including the adaptation of educational terminology to anglophone contexts, the handling of culturally specific examples, and the editorial tone — were taken and reviewed by the author. No LLM has replaced the author's judgement. The translation function has been instrumental, similar to an advanced editing tool, under the author's supervision and responsibility at all times.

After much consideration, and given the mixture of enthusiasm and exhaustion that academic research tends to produce, I decided to write this guide with the aim of reducing the exhaustion without diminishing the enthusiasm. I thought it might help you to have a document that accompanies you step by step through LLM-assisted content analysis (what people generally call "the ChatGPT thing"). I have written it in a direct, practical style — more working handbook than academic treatise. If you need academic references on content analysis, there are hundreds of them, both textbooks and research articles that apply the method. You will also find that this analytical approach is used across many scientific fields — it is worth looking at examples from areas close to your own research topic.

Although I have been doing content analysis since my doctoral thesis (1995), this guide is the product of my experience over the past three years in the AiRISES project and everything I have been learning — sometimes in front of a screen, sometimes over coffee with colleagues. Through trials, errors, meetings and occasional moments of inspiration (rare, but glorious), we found a way of working with LLMs that combines the rigour of analysis with the possibilities these new tools offer. This is only the beginning. I think I should put a date on this document — one never knows when a new LLM version will come along that renders half of it obsolete. Whether I manage to keep it updated is another matter (I doubt it).

My intention is that this guide should accompany you step by step through the content analysis process, remind you that the process has an internal logic — even when it does not seem like it — and allow you to move forward confidently while maintaining your own judgement and research perspective. The procedure described must be adapted to your type of study, the size of your corpus and your research questions, and always requires your decision-making to adjust, review and reinterpret each phase. In this sense, the guide proposes a flexible framework that will support your analysis without replacing your methodological reflection or your interpretive responsibility — let alone the analysis of implications and improvement actions, which are indispensable in educational research.

I hope you find here not only the steps to follow, but also the feeling that this is doable, and moreover, that you can do it very well.

If it is useful, let me know. And of course, please send me any suggestions for improvements that would benefit others who might use it. I have already incorporated some.

In Part V, Section 12.4, I say that the use of LLMs must be declared. In keeping with my own recommendations and as an example of good practice, I include this declaration.

Declaration of LLM use (Principle P6)

In preparing this guide I used support from several LLMs — specifically ChatGPT, NotebookLM, Gemini and Claude (in their most up-to-date versions as of March 2026). They were used as assistance tools at various stages of the process. Their use was limited to drafting, textual reformulation, searching for alternative expressions, generating examples, supporting content structuring and presentation formatting. All conceptual, methodological and editorial decisions were made by me, and I critically reviewed each proposal generated. The model did not intervene in the essential content and, in no case, did it replace my judgement as a researcher. Its function was instrumental, similar to an advanced editing tool, always under my supervision and responsibility.

A note on language. In the original Spanish guide I deliberately combined three registers depending on context. In the English version, the direct, personal tone is preserved throughout. In definitions, principles and academic formulations I use generic neutral language ("the researcher"). When addressing you directly with instructions or warnings, I use the second person ("you"), which in English carries no gender marking. This is a deliberate editorial choice intended to make the guide feel like a working companion rather than a textbook.

Good luck.

Javier Vidal

Figure 1. General map of the LLM-assisted content analysis process

Part I

Foundations

Content analysis and the responsible use of LLMs in educational research.

Chapter 1

Introduction

Some context. Content analysis assisted by large language models (LLMs) has become a tool of growing interest in educational research. The availability of systems such as ChatGPT, Gemini or Claude — capable of processing large volumes of text and generating summaries, categories and explanations — makes it possible to optimise processes that are traditionally laborious and demand considerable time for reading, coding and comprehension. From the outset, however, it is important to make clear that the use of LLMs is a methodological support tool that does not replace the researcher's responsibility. This guide takes a combined approach: (a) conducting a traditional content analysis grounded in methodological rigour, transparency and critical examination of results, while (b) incorporating the LLM as technical support to accelerate tasks and organise information — without, I insist, replacing the researcher's judgement or interpretation.

The analytical procedure presented in this document is the direct result of work developed in the project Application of Artificial Intelligence in the Analysis of Informal Social Networks for Guidance in Higher Education (AiRISES), funded by the Spanish Ministry of Science and Innovation (PID2021-125405NB-I00), of which María José Vieira and I were principal investigators. Its development is grounded in more than three years of continuous research, combining automated analysis techniques and expert review to address rigorously a broad and complex corpus of messages from informal forums. This procedure is the final synthesis of the knowledge accumulated by everyone involved in the project, both the research team — Camino Ferreira, Agustín Rodríguez-Esteban, Diego González-Rodríguez and Alba González-Moreira — and the working team — Estela Mayor-Alonso, Yaiza Viñuela, María Álvarez-Godos, Ainhoa Martínez-Rodríguez and Héctor González-Mayorga. The final version presented here would not have been possible without their contributions, which have constantly enriched the approach, the understanding of the object of study and the robustness of the method.

Purpose of the guide

The main purpose of this guide is to offer you a detailed, step-by-step procedure for conducting LLM-assisted content analysis in educational research. It combines conceptual foundations, practical guidance, examples and ready-to-use instructions, so that you can adapt the process to your own study, whether you are working with a large corpus of hundreds of documents or a brief survey with only a few responses.

The ultimate aim is for you to be able to conduct a rigorous, reproducible and ethically informed content analysis, integrating the LLM as an expert assistant that speeds up the technical work, but without in any case replacing your own judgement or interpretive responsibility (I will be somewhat insistent about this).

Who it is for

The guide is intended for people like you — early-career researchers, educators conducting research or innovation studies, and teams wanting to integrate LLMs into their workflows. It assumes no programming knowledge or prior experience with LLMs. It does, however, assume familiarity with the fundamental concepts of content analysis. It may be useful to consult a textbook or introductory document before continuing.

AiRISES case study

Following the AiRISES case study: Imagine that your challenge is not to analyse isolated messages, but to understand an entire complex educational ecosystem. In AiRISES we faced 7,408 messages. At first, all we saw were "queries", but thanks to assisted analysis we discovered three major worlds: the transition from secondary education (where academic performance worries 64% of students), the distinct identity of vocational education (VET, representing 25% of total queries), and a structural fear of the labour market. We were able to identify these messages as an indicator of the concerns of thousands of students.

Scope and limits of LLMs in content analysis

LLMs can be powerful allies in supporting content analysis, but their usefulness depends on controlled use accompanied by the researcher's own judgement. They are particularly good at summarising, organising, grouping, rewriting and suggesting, but they do not replace the contextual, theoretical and ethical understanding of the person conducting the analysis. They may introduce biases inherited from their training data, over-interpret information or invent details. They therefore require precise instructions and constant supervision.

It is important to understand their capabilities and limitations clearly.

Table 1. Capabilities and limits of LLMs

Dimension	Key elements
What an LLM does well	Summarising large volumes of text; identifying preliminary patterns or tentative themes; suggesting initial category structures; coding texts into organised matrices; drafting syntheses and reports.
What must not be delegated to the LLM	Final interpretation of findings; attribution of deep or contextual meaning; autonomous construction of categories without human supervision; theoretical judgement and conceptual relevance; ethical responsibility in data handling.
Critical aspects to monitor	Circular or tautological reasoning; over-interpretations or unjustified inferences; excessive generalisations; inconsistencies in coding between similar fragments; lack of documentation that compromises the transparency of the process.

In summary, LLM-assisted analysis is always an interactive process: the model accelerates repetitive and structural tasks; the researcher interprets, validates and decides.

Rationale for using LLMs in educational research

In educational research, the data collected tends to be abundant, varied and rich in nuance: open-ended questionnaires, interviews, classroom diaries, student reflections, tutor reports, portfolios, focus group transcripts, teaching plans, or even virtual forum messages. Analysing all of this manually is time-consuming and can slow down the exploratory phases.

This is where LLMs add clear value: they allow processes to be accelerated without sacrificing rigour, as long as interpretive control remains in the researcher's hands.

Nevertheless, it is essential to remember that models can make mistakes, over-summarise or introduce inferences not present in the data. LLMs do not understand the educational, institutional or social context beyond what you provide to them, so their contribution always requires critical verification.

Remember that, even if the LLM hands you an impeccable table, the one signing the article is you. If the model makes something up and you do not catch it, you will be the one in trouble when the reviewers ask, and the machine will not bat an eyelid. Worse still, it will tell you: "I'm sorry, you're right. I was wrong." The LLM accelerates the analysis process; the researcher interprets. That is the central principle. Here are a few examples.

Example 1. Open-ended student response (questionnaire)

Original data (Year 10 student):

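“Me gusta cuando trabajamos en grupos pequeños porque así puedo preguntar sin vergüenza. En clase grande me pierdo enseguida.” [I like it when we work in small groups because that way I can ask questions without feeling embarrassed. In a large class I get lost straight away.]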

What the LLM can do here:

identify preliminary patterns (comfort, participation, emotional climate),
suggest tentative categories: participation, peer support, barriers in large-group settings,
extract relevant verbatim quotations without rewriting.

What it must not do:

interpret that the student “has low self-esteem”,
suggest causes not mentioned,
infer personal characteristics.

Example 2. Comments in an institutional forum (university)

Student message on Moodle:

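“A veces no entiendo bien qué pide cada práctica porque las instrucciones están en sitios distintos. Me ayudaría tener todo unificado.” [Sometimes I do not really understand what each practical assignment asks for because the instructions are in different places. It would help me to have everything in one place.]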

What the LLM can do:

identify needs for clarity, materials management, and student experience,
suggest a category such as: demands for teaching organisation,
suggest representative quotations.

What it must not do:

deduce that the teacher “does not plan well”,
assert that this affects academic performance.

Formal aspects of the guide

First, throughout this guide I use the term LLM (Large Language Model) in preference to the broader term AI (Artificial Intelligence). The reason is that the assisted content analysis described here is specifically based on generative language models — such as ChatGPT, Claude or Gemini — capable of processing text, generating responses and applying complex analytical criteria. These models are part of the broader field of artificial intelligence, but the term AI is excessively generic and encompasses very different technologies (computer vision, expert systems, robotics, predictive analytics). By contrast, LLM precisely identifies the specific tool involved in content analysis, makes it easier to describe the process in technical terms, and avoids conceptual confusion. For this reason, and in order to maintain terminological rigour and methodological clarity, this guide uses the term LLM to refer to the assistant agent throughout all phases of the analysis.

Second, this guide was developed with reference to the capabilities of language models available in December 2025. Although LLMs evolve rapidly, the procedure described here is intended to be useful and durable, because it is grounded in the classical principles of content analysis and in structured interaction with the model. It is foreseeable that future improvements to LLMs (greater stability in applying analytical criteria, reduction of errors and more robust reasoning capabilities) will increase the validity and reliability of results, gradually reducing the need for direct human supervision at each step. Nevertheless, the researcher's interpretive judgement will remain indispensable, and this guide provides a methodological foundation both for current capabilities and for those that will develop in the future.

Third, this guide includes a series of instructions designed to guide LLM use during content analysis. To distinguish them from the explanatory text, these indications appear highlighted in single-line boxes, easy to identify and to copy directly when needed. Each box contains a specific action you can ask the model to perform, or a rule the model must follow. I prefer to call them instructions, although everyone refers to them as prompts. A prompt can be "good morning" — that is, something that triggers a response from the LLM. Here I focus only on the instructions we want the model to follow. I will therefore use both terms interchangeably, with the same meaning. They will take the format of this example:

Instruction for the LLM

If you detect inconsistencies, biases or impossible tasks, flag the problem before continuing.

Let me take this opportunity to clarify that these are examples, proposals. The leading experts in this area say that prompting is as much an art as a science. Experimenting to find what works best is therefore almost a necessity.

Bear in mind that what I explain here can be used in its entirety or partially, depending on the research approach and methodological decisions you make. I have structured the guide so that each chapter can be read independently, without omitting warnings I consider essential (which means you will encounter some of them repeated).

Cross-cutting operative principles (mandatory throughout)

These principles apply to all phases (construction, refinement, coding, synthesis and reporting). Do not forget them. I state them here to help you calibrate your expectations about what an LLM can and cannot do to assist you. As I noted above, I will revisit them in each section.

P1. The LLM proposes; the researcher decides and takes responsibility. The model does not "validate" results: it supports, suggests and structures; the final methodological decision is yours.

P2. No inventing or inferring beyond the text. The analysis is limited to the corpus provided. If something cannot be determined, it must be stated explicitly.

P3. Traceability and versioning. All relevant outputs must be saved (category system, definitions, tables, prompts and decisions), with version and date, ensuring complete process traceability. This means being able to replicate and verify the process.

P4. Continuous quality control. Manual review of samples, comparison between iterations, and detection of inconsistencies/biases are required; if errors appear, instructions are corrected or the system is revised.

P5. Privacy and data protection. Before using external platforms, data must be anonymised, sensitive information minimised and terms of use reviewed; where restrictions exist, secure alternatives must be used.

P6. Documentation and reproducibility. You must record decisions (merges, deletions, criterion changes, examples) and describe your use of the LLM in the final report.
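If you or a colleague are comfortable with a little Python, P3 and P6 could be supported with a simple log kept next to your data. What follows is only a minimal sketch under stated assumptions: the function names (`make_log_entry`, `append_to_log`), the field names, the file name and the one-record-per-line JSON format are all illustrative choices, not part of the AiRISES procedure. A spreadsheet with the same columns would serve equally well.

```python
import hashlib
import json
from datetime import date


def make_log_entry(step, prompt, output, system_version, when=None):
    """Build one traceability record (field names are hypothetical)."""
    return {
        "date": (when or date.today()).isoformat(),
        "step": step,
        "category_system_version": system_version,
        "prompt": prompt,
        # Short fingerprint: lets you check later that a saved output was not altered.
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest()[:12],
        "output": output,
    }


def append_to_log(entry, path="analysis_log.jsonl"):
    """Append the record as one JSON line; earlier entries are never overwritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")


# Illustrative record; the step name, version label and date are made up.
entry = make_log_entry(
    step="category refinement",
    prompt="If you detect inconsistencies, biases or impossible tasks, "
           "flag the problem before continuing.",
    output="(paste the model's response here)",
    system_version="v0.2",
    when=date(2026, 5, 1),
)
```

Calling `append_to_log(entry)` after each relevant exchange with the model would leave a dated, versioned trail that you can cite when describing your use of the LLM in the final report.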

Chapter 2

Foundations of content analysis

This chapter presents the conceptual principles underpinning the LLM-assisted content analysis procedure. Although I assume you already have a basic understanding of what content analysis is, I offer a summary to situate you in the methodological framework that will be applied in the following chapters.

Definitions and key concepts

Content analysis

Content analysis is a systematic analytical method for identifying, organising and interpreting meanings present in textual data. In educational research it is used to analyse:

open-ended questionnaire responses,
interviews,
educational forums, emails or chats,
portfolios or diaries,
learning evidence,
other types of documents containing open text.

The aim is to transform textual data into organised information and, from there, into grounded interpretation. The data (corpus) only acquire that status when they are interpreted within a coherent and stable framework.

Units of analysis

The first decision is which unit of analysis to use — that is, the textual unit that will be analysed: complete responses to a survey question, individual sentences, paragraphs, or semantic units defined by the researcher (e.g., complete ideas). Example of a forum message (reproduced literally, spelling uncorrected):

ID03216 - Hola quisiera saber donde es en terminos relativos , ingenieria tecnica industrial, menos dura, numero de egresados donde puede verse de cada universidad o alguna informacion en relacion al tema

Categories

Each unit of analysis may contain one or more categories: labels or concepts representing patterns, ideas or themes identified in the text. Examples: academic performance, vocational family, university access, employment…

Coding

To identify categories in units of analysis, we use a process called coding. There are two levels of application in open texts: (a) internal coding by segment (sentences, semantic units, paragraphs) and (b) document-level coding when a category functions as an attribute or variable of the case (for example, classifying the document by the educational level of the respondent, profile, or other analytical variables). Examples:

Coding a message segment: identifying the phrase 'I'm not sure whether to do the Higher VET course' and assigning it the code Higher VET.

Coding a document: analysing a complete message to determine, using linguistic indicators, whether the author is male, female or neutral (user gender).
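The two levels of coding can be pictured as a small data structure. This is a sketch only: the class names `Segment` and `Document`, the identifier `ID00001`, the attribute key `user_gender` and the value assigned to it are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    text: str
    codes: list = field(default_factory=list)       # segment-level codes


@dataclass
class Document:
    doc_id: str
    segments: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)  # document-level variables


def codes_in(document):
    """Return the distinct codes used in a document, in order of first appearance."""
    seen = []
    for segment in document.segments:
        for code in segment.codes:
            if code not in seen:
                seen.append(code)
    return seen


# Hypothetical message: one coded segment plus one document-level attribute.
msg = Document(doc_id="ID00001")
msg.segments.append(
    Segment("I'm not sure whether to do the Higher VET course", ["Higher VET"])
)
msg.attributes["user_gender"] = "neutral"
```

Keeping segment codes and document attributes in separate fields makes it harder to confuse "what this phrase is about" with "who wrote this message".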

Themes (or dimensions)

Categories may share common elements. In that case, we create groupings that integrate several categories and express a central idea or significant pattern. Categories should be the final level of a hierarchical structure that may have more than two levels of grouping, which we can call themes or dimensions. Examples:

Temporal dimension: the theme "Stages", grouping hierarchically lower categories such as: Pre-university, University entrance exam, University and Employment.

Content dimension: the theme "Attitudinal aspects", integrating the categories Emotions (fear, hope), Intentions and Perceived difficulty.
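The hierarchy in these two examples can be written down explicitly, which helps when you later share the category system with the model or with co-researchers. A minimal sketch, assuming a three-level structure (dimension, theme, category); the names `CATEGORY_SYSTEM` and `path_to` are illustrative.

```python
# Dimension -> theme -> categories, using only the examples given in the text.
CATEGORY_SYSTEM = {
    "Temporal dimension": {
        "Stages": [
            "Pre-university",
            "University entrance exam",
            "University",
            "Employment",
        ],
    },
    "Content dimension": {
        "Attitudinal aspects": ["Emotions", "Intentions", "Perceived difficulty"],
    },
}


def path_to(category, system=CATEGORY_SYSTEM):
    """Return (dimension, theme, category), or None if the category is not defined."""
    for dimension, themes in system.items():
        for theme, categories in themes.items():
            if category in categories:
                return (dimension, theme, category)
    return None
```

A lookup that returns `None` for an unknown label is a cheap way to spot codes the model has invented outside the agreed system.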

Types of analysis: inductive, deductive and mixed

Figure 2. Decision tree for selecting the type of analysis

Inductive

In the coding process we can use an inductive strategy, in which we create categories from what we find in the text.

This strategy is useful when exploring new or little-studied phenomena, when there is no prior theoretical framework, or when the aim is to capture the content of the text without imposing prior interpretations.

An LLM is useful here for suggesting initial patterns.

Deductive

We can also use a deductive strategy, in which we apply prior categories derived from theories, existing research, conceptual frameworks, rubrics or previous questionnaires.

This strategy is useful when the aim is to test, compare or apply prior theories.

An LLM acts as an assistant for applying the predefined category system to the units of analysis.

Mixed

The mixed approach combines both strategies: it applies predefined categories while allowing new ones to emerge from the data.

This is the most common approach in complex studies, and an LLM can help adjust and enrich the system.

Risks

In the inductive approach, the main risk when using LLMs is over-inflating initial proposals and accepting poorly grounded categories; in the deductive approach, the danger is forcing data to fit pre-established theoretical frameworks; in mixed approaches, both risks coexist and require especially careful vigilance.

Here is a summary table.

Table 2. Content analysis approaches: inductive, deductive and mixed

Criterion	Inductive	Deductive	Mixed
Starting point	The data	Prior theoretical framework	Theory and data
Category system	Emerge progressively from the corpus	Defined before the analysis	Initial categories that are revised and adjusted
Relationship with theory	Theory is constructed from the data	Theory guides the analysis	Theory orientates, but is revised in light of the data
Degree of openness	Very high	Low	Medium–high
Main advantages	Discovers unforeseen topics; high sensitivity to context	Greater conceptual control; facilitates comparability	Balance between rigour and openness; high adaptability
Risks or limits	Initial dispersion; unstable categories	Forcing the data; losing relevant information	Requires greater methodological control
Typical use in educational research	Exploratory studies, analysis of student experiences	Programme evaluation, application of theoretical models	Applied educational research (very common)
Role of the LLM	Supports identification of preliminary patterns, thematic exploration and initial category proposals, always under human review	Systematically applies a previously defined category system and facilitates consistent corpus coding	Supports both initial exploration and the refinement and adjustment of categories, facilitating rapid iterations between theory and data

Units of analysis and text segmentation

The quality of the analysis depends largely on the precise definition of the unit of analysis.

Table 3. Most common types of units of analysis

Type	Description	When to use it?
Complete response	Each response is coded as a whole	Short or very direct questionnaires
Sentence	Segment delimited by punctuation	Long but structured responses
Semantic unit	Complete idea regardless of length	Interviews, complex narrative
Paragraph	Extensive textual blocks	Institutional documents or long reflections
Turn of speech	Participant intervention	Focus groups
Diary entry	Text of a day or diary entry	Diaries

LLMs work well at any of these levels, but require explicit instructions for each case. It is also useful to distinguish whether you will be:

coding within each document (segments/meaning units),
coding the complete document, when the category is used as an attribute/variable of the case (e.g., presence of a theme, document type, respondent profile, etc.).

Examples of instructions:

Instruction for the LLM

Use the sentence as the unit of analysis and do not divide into smaller units.

Identify meaning units within each response.

Treat each complete response as a case and assign variables/attributes at document level.
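If you prefer to segment the text yourself before sending it to the model, sentence-level units can be prepared in advance. The sketch below is deliberately naive and rests on an assumption: a sentence ends at a full stop, exclamation mark or question mark followed by a space. Abbreviations, ellipses and unpunctuated forum messages will break it, so check the result by eye. The function names and the ID format (`R01-S1`) are hypothetical, and the sample text is an English rendering of the response in Example 1.

```python
import re


def split_sentences(response):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def to_units(doc_id, response):
    """Give every sentence a traceable ID such as 'R01-S2'."""
    sentences = split_sentences(response)
    return [(f"{doc_id}-S{i}", s) for i, s in enumerate(sentences, start=1)]


units = to_units(
    "R01",
    "I like working in small groups. In a large class I get lost straight away.",
)
```

Numbering the units before coding means every code the model assigns can be traced back to an exact segment, which supports P3.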

The role of the researcher in interpretation

Although LLMs can suggest categories, summarise, code and synthesise, they do not understand the educational context in the deep sense required for rigorous analysis. The researcher contributes:

sensitivity to institutional and socio-cultural context,
theoretical knowledge,
judgement for assessing relevance,
reflective capacity for interpretation,
critical review of the model's work.

The principle to follow at all times is P1: the LLM proposes and you decide.

Differences between manual and LLM-assisted analysis

Manual and assisted analysis share methodological principles but differ in pace and resources. Compared to manual analysis, LLM-assisted analysis presents the following:

Advantages

Reduction of initial reading time.
Rapid identification of preliminary patterns.
Automatic generation of category proposals.
Orderly coding in tables.
Assistance in drafting summaries and results reports.
Consistency in handling large volumes of text.

Limitations

Risk of over-interpretation.
Categories that are too general if not adjusted.
Mechanical coding without nuance.
Lack of sensitivity to contextual details.
Requires constant supervision and cross-checking.

Recommendation: LLM-assisted analysis does not replace manual analysis; it optimises it and makes it more efficient. Best practice consists of using the LLM to accelerate tasks, provide alternatives and generate structure, while you retain interpretive control.
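The supervision and cross-checking mentioned above can be made concrete with two small helpers: one that draws a reproducible random sample of units for manual review, and one that compares your codes with the model's. This is a sketch under assumptions: one code per unit, hypothetical unit IDs and codes, and simple percent agreement as the measure. Percent agreement does not correct for agreement by chance, so treat it as a first alarm rather than a reliability coefficient.

```python
import random


def review_sample(unit_ids, k, seed=2026):
    """Draw a reproducible random sample of unit IDs for manual review."""
    ids = list(unit_ids)
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn later
    return sorted(rng.sample(ids, min(k, len(ids))))


def percent_agreement(llm_codes, human_codes):
    """Compare codes on the units both coded; return (agreement, units to re-check)."""
    shared = [u for u in human_codes if u in llm_codes]
    if not shared:
        return 0.0, []
    disagreements = [u for u in shared if llm_codes[u] != human_codes[u]]
    return 1 - len(disagreements) / len(shared), disagreements


# Made-up codes for four units, for illustration only.
llm = {"U1": "participation", "U2": "peer support", "U3": "participation", "U4": "barriers"}
human = {"U1": "participation", "U2": "peer support", "U3": "barriers", "U4": "barriers"}
agreement, to_recheck = percent_agreement(llm, human)
```

The list of disagreeing units is often more useful than the percentage itself: it tells you which definitions or instructions need to be revised.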

Here is a summary table.

Table 4. Differences between manual content analysis and LLM-assisted analysis

Dimension	Manual analysis	LLM-assisted analysis
Speed	Slow with large corpora	High even with large volumes
Scalability	Limited by human time	High
Consistency	May vary between sessions	High if the instructions are stable
Risk of biases	Explicit human biases	Human + model biases
Transparency	High if well documented	Requires explicit documentation
Researcher's role	Codes and interprets	Interprets, validates and decides
Type of tasks	Interpretive and technical	Accelerated technical tasks + human supervision

You now have the conceptual map. You know what content analysis is, how an LLM functions in that process, and what type of analysis you will be conducting. Now it is time to get practical: before asking the model anything, you need to prepare your data. If this phase is done well, everything else flows. If done poorly, sooner or later you will have to come back here. On to Part II.

Part II

Preparation

Preparing the data and configuring the LLM before starting the analysis.

Chapter 3

Data preparation

Define the research questions and study objectives. → Ch. 1.1
Decide the analysis approach: inductive, deductive or mixed. → Ch. 2.2
Identify the type of corpus to be analysed and the appropriate unit of analysis. → Ch. 2.3

The quality of LLM-assisted analysis depends initially on the quality of the data the model receives. This chapter explains how to collect, process and organise texts before asking an LLM to analyse them. Adequate preparation reduces errors, improves consistency and allows better use of the model's capabilities.

It is important to avoid discovering data errors during the analysis process. If they arise, you will typically need to go back, correct them and redo part of the work — costing you time and, very likely, your composure. Invest time in cleaning your texts thoroughly at the outset.

Data collection: common sources in educational research

In educational research it is common to work with different types of text. Before starting the analysis, it is worth reviewing their relevance, their relationship to the research questions, and considering whether contextual information should be incorporated.

LLMs can work with any type of text, but remember that in educational research the most common materials are:

Open-ended questionnaire responses (frequent in studies of perceptions, competencies, satisfaction…).
Semi-structured interviews or focus groups (transcribed).
Learning diaries and student reflections.
Institutional reports and documents (e.g., school plans, innovation reports).
Messages on educational platforms (forums, Moodle, Teams, chats…).
Student written work (essays, portfolios, comments).

Each data type requires specific decisions:

What will the unit of analysis be?
Will it be segmented into sentences, paragraphs, semantic units?
Are additional contextual data needed?
Does it need to be anonymised?

Cleaning and anonymisation

Figure 3. Ethical data safety traffic light before uploading to an LLM

Before using an LLM it is essential to:

A. Remove personal or sensitive information.

names of students or teachers,
educational institutions,
phone numbers, addresses,
direct identifiers,
health data, and
other specially protected information.
If the presence of certain data is analytically relevant (for example, a person's role), it is advisable to replace that data with labels, such as Teacher A, Student 3 or School X.
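
The label substitution described above can be partly automated before you review the texts by hand. The following is a minimal sketch, not a complete anonymisation tool: the `LABELS` mapping, the sample names and the two regular expressions are hypothetical illustrations, and simple patterns like these will miss many identifiers, so a manual check remains necessary.

```python
import re

# Hypothetical mapping prepared by the researcher: real name -> neutral label.
# Keep this mapping offline; it should never be uploaded with the corpus.
LABELS = {
    "María López": "Teacher A",
    "Colegio San Juan": "School X",
}

# Deliberately simple patterns (assumptions): they will not catch every format.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def pseudonymise(text: str, labels: dict[str, str] = LABELS) -> str:
    """Replace known names with labels and mask e-mail addresses and phone numbers."""
    # Replace longer names first so that shorter names contained in them do not break them.
    for name in sorted(labels, key=len, reverse=True):
        text = re.sub(re.escape(name), labels[name], text, flags=re.IGNORECASE)
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text
```

For example, `pseudonymise("María López (maria@uni.es, 987 654 321) works at Colegio San Juan.")` returns `"Teacher A ([EMAIL], [PHONE]) works at School X."`. Indirect identifiers (a unique role, a recognisable anecdote) cannot be detected this way and still require your judgement.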

B. Correct errors that hinder the model's reading

It is not necessary to correct everything, but you should correct what might impede comprehension, such as duplicated text, incomplete lines, page breaks or irrelevant formatting.

C. Standardise the format

While there is no single standard here, these are some format recommendations: plain text or simple table, without colours or indentation, using standard characters.

Your Ethical Data Safety Traffic Light.

Before uploading any data to the platform, take a moment to check which colour applies to you. Remember that privacy is a mandatory principle (P5):

🟢 GREEN (Go ahead): Fully anonymised data (e.g., "Participant 1"), public school documents, or texts from which you have removed every personal trace.

🟡 AMBER (Caution): You have replaced names with roles (e.g., "Teacher A", "Mother B"). This is safe, but make sure the context of the response does not allow the person to be identified indirectly.

🔴 RED (Stop!): The text contains real names, email addresses, phone numbers, or unprocessed sensitive data about health or minors. Under no circumstances should you upload it in this form to external platforms.

AiRISES case study

Following the AiRISES case study: to build our database we carried out a massive scrape of 8 full years (2016–2023). We started from an initial "noise" of 13,328 records that required cleaning. Applying our ethical safety traffic light, we eliminated 754 duplicates and messages of fewer than 25 words that added no analytical value. An additional ethical challenge was gender identification: as a researcher, you must decide whether this variable is relevant for your study. In our case, we discovered that messages identified as male tended to focus on technical areas, while those identified as female concentrated on care-related areas, which obliged us to anonymise with double caution to avoid gender bias in subsequent analysis.

Recommended formats

Models work well with numbered lists, delimited text blocks (commas, tabs, etc.), simple tables and clear paragraphs. It is preferable to avoid complex formats and to prioritise well-structured plain text. Numbering responses (R1, R2, R3…) facilitates traceability during coding.

Example of a data block:

ID	RESPONSE
R1	I think digital skills are important for motivating students.
R2	I use ICT, but I feel I need more training.
R3	The school does not have sufficient technological resources.
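
If your responses are stored in a spreadsheet or list, a block like the one above can be generated rather than typed. This is a minimal sketch under the assumption that responses are already cleaned and anonymised; the function name `format_block` is hypothetical.

```python
def format_block(responses: list[str], start: int = 1) -> str:
    """Return a tab-delimited block with a header row and stable IDs (R1, R2, ...)."""
    lines = ["ID\tRESPONSE"]
    for offset, response in enumerate(responses):
        # Collapse line breaks and tabs so that each response stays on a single row.
        clean = " ".join(response.split())
        lines.append(f"R{start + offset}\t{clean}")
    return "\n".join(lines)
```

The `start` parameter lets you continue the numbering in a later block (for example, `start=101`), so that each ID refers to one response throughout the whole analysis.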

Block-based analysis management

Most models work better when excessive loads are avoided. Rather than sending the entire corpus at once, you can work in blocks (for example, 50–200 short responses or one complete interview). Use clear names for each block. For example:

Block A: perceptions of digital resources
Block B: teacher training

If you load too much data at once, the model may become saturated, lose context, generate overly general categories or forget previous instructions. Working in parts, the LLM better maintains context, reduces length-related issues and allows the category system to be refined incrementally.
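
Splitting the corpus into blocks can also be done programmatically. The sketch below is one possible approach, not a prescribed procedure: the function name and the default of 100 responses per block are assumptions (chosen within the 50–200 range mentioned above), and blocks are named by number rather than by topic.

```python
def split_into_blocks(responses: list[str], block_size: int = 100) -> dict[str, list[tuple[str, str]]]:
    """Group (ID, response) pairs into named blocks of at most block_size items."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    # IDs are assigned once, over the whole corpus, so they stay stable across blocks.
    numbered = [(f"R{i}", text) for i, text in enumerate(responses, start=1)]
    blocks = {}
    for n, start in enumerate(range(0, len(numbered), block_size), start=1):
        blocks[f"Block {n}"] = numbered[start:start + block_size]
    return blocks
```

With five responses and `block_size=2`, this yields "Block 1", "Block 2" and "Block 3", the last containing only `("R5", ...)`. If you prefer thematic blocks, as in the example above, you would group the responses by topic before numbering them.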

Special cases: short responses, noise and multilingual data

Very short responses, noise, multilingual data and poor spelling may require specific decisions: marking them as non-codeable, asking the model to translate or to correct only spelling without altering the content. These decisions must be documented, as they affect interpretation.

A. Very short responses

Example: "Yes", "No" or "It depends" may be insufficient or irrelevant responses for the research. In that case, one possible solution is:

Instruction for the LLM

Detect non-codeable or insufficient responses (such as "Yes", "No", "It depends") and group them into a general category called "Minimal responses", without assigning them any thematic interpretation.

B. Noise or irrelevant texts

Isolated phrases, blank responses, duplicates, etc. may appear. In these cases, remove them before analysis or ask the model to flag them as non-analysable.
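
If you prefer to flag these cases yourself before involving the model, a simple pre-screening step is enough. The following is a minimal sketch: the label names and the three-word threshold are assumptions you should adapt and document, not fixed rules.

```python
def flag_non_codeable(responses: dict[str, str], min_words: int = 3) -> dict[str, str]:
    """Label each response ID as 'blank', 'minimal', 'duplicate' or 'codeable'."""
    flags: dict[str, str] = {}
    seen: set[str] = set()
    for rid, text in responses.items():
        # Compare responses ignoring case and extra whitespace.
        key = " ".join(text.lower().split())
        if not key:
            flags[rid] = "blank"
        elif len(key.split()) < min_words:
            flags[rid] = "minimal"  # e.g. "Yes", "No", "It depends"
        elif key in seen:
            flags[rid] = "duplicate"
        else:
            flags[rid] = "codeable"
        seen.add(key)
    return flags
```

Flagging rather than deleting keeps every response ID in your records, so the decision about what was excluded, and why, remains traceable.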

C. Multilingual data

If this is the case (not very common), note that the LLM can process them, but it is advisable to indicate this explicitly with instructions such as:

Instruction for the LLM

This dataset contains responses in English, French and Welsh.

Instruction for the LLM

Do not translate anything and analyse each response in its original language.

Instruction for the LLM

First translate the entire corpus into English literally and without interpreting. Do not alter the content.

D. Poor spelling

Models usually handle this well, but if it affects comprehension you should give instructions such as:

Instruction for the LLM

Rewrite these responses correcting only spelling and punctuation.

Chapter 4

Configuring the LLM for content analysis

Collect, clean and anonymise the data. → Ch. 3
Apply the ethical safety traffic light before loading data into external platforms. → Ch. 3.2
Organise the corpus in numbered blocks in a format readable by the LLM. → Ch. 3.3

Correctly configuring a language model is one of the most critical steps in the process. The quality of the final analysis depends largely on how the model is prepared, what instructions it receives, and what conceptual restrictions are established from the outset.

This chapter provides a set of principles and instructions for configuring ChatGPT, Gemini, Claude or other LLMs in a way orientated to educational research.

The "analytical role" of the LLM

LLMs do not possess intentionality or deep understanding; they function through statistical prediction (though it may not seem so). For the model to act as a methodological ally, it must be assigned a role — in this case, that of expert assistant in content analysis in educational research, careful with interpretation, respectful of the data, and attentive to requesting clarifications when faced with ambiguity.

The role must include:

competence in content analysis,
attention to methodological instructions,
emphasis on not inventing,
obligation to request clarification when faced with ambiguity,
respect for the language of the corpus, and criteria of transparency and traceability.

The role can be repeated at the start of each session or recalled periodically for greater consistency.

Master prompt: what to include

It is advisable to provide context and action instructions. The master prompt defines the working framework: approach (inductive, deductive or mixed), response style, restrictions (do not invent, do not generalise beyond the corpus) and protocols for flagging inconsistencies. This instruction should be used at the start of each session or phase of the analysis.

The master prompt is the initial configuration that orientates the entire conversation. It can be used with ChatGPT, Gemini and Claude, for example, without significant changes.

Tell the model:

A. The role (as seen in 4.1)

Instruction for the LLM

Act as an expert assistant in content analysis in educational research. Your role is to help me explore, categorise and synthesise textual data, following criteria of academic rigour. Do not invent information. Request clarifications when necessary.

B. The methodological framework

Instruction for the LLM

We will work through an inductive/deductive/mixed content analysis.

Follow the exploration, categorisation and synthesis steps that I will indicate.

C. The response style

Instruction for the LLM

The response style should be clear/structured/academic but not excessively technical/technical/with examples/with direct quotations/…

D. Restrictions

Instruction for the LLM

Do not make inferences that are not grounded in the data.

If something cannot be determined, state it.

Instruction for the LLM

Do not generalise beyond the corpus provided.

E. Error control

Instruction for the LLM

If you detect inconsistencies, biases or impossible tasks, flag the problem before continuing.
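
Because the master prompt is reused at the start of every session, it can help to keep its five components (A to E) in one place and assemble them consistently. The sketch below shows one way to do this; the function name and the default style are assumptions, and the sentence for component B is slightly reworded so that any of the three approaches fits grammatically.

```python
APPROACHES = ("inductive", "deductive", "mixed")


def build_master_prompt(approach: str, style: str = "clear, structured and academic") -> str:
    """Join the five components A-E into a single reusable master prompt."""
    if approach not in APPROACHES:
        raise ValueError(f"approach must be one of {APPROACHES}")
    sections = [
        # A. Role
        "Act as an expert assistant in content analysis in educational research. "
        "Your role is to help me explore, categorise and synthesise textual data, "
        "following criteria of academic rigour. Do not invent information. "
        "Request clarifications when necessary.",
        # B. Methodological framework (reworded so any approach fits the sentence)
        f"We will work through content analysis using the {approach} approach. "
        "Follow the exploration, categorisation and synthesis steps that I will indicate.",
        # C. Response style
        f"The response style should be {style}.",
        # D. Restrictions
        "Do not make inferences that are not grounded in the data. "
        "If something cannot be determined, state it. "
        "Do not generalise beyond the corpus provided.",
        # E. Error control
        "If you detect inconsistencies, biases or impossible tasks, "
        "flag the problem before continuing.",
    ]
    return "\n\n".join(sections)
```

Calling `build_master_prompt("mixed")` returns a five-paragraph text that you can paste at the start of each session and copy verbatim into your records.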

Version management and conversational memory.

Figure 4. LLM session configuration and management throughout the analysis

LLMs function through threads or sessions. It is advisable to separate sessions by stage (exploration, categories, coding, synthesis), repeat the master prompt when opening a new thread, and always note which version of the category system is being used. Saving key outputs (category versions, tables) is essential for traceability.

For example, create separate sessions by phase:

Session 1: initial exploration
Session 2: developing categories
Session 3: coding
Session 4: synthesis and reporting

Rules for avoiding loss of context:

Repeat the master prompt at the start of each session.
State the version of the category system.
Load only the data needed for each task.

Instruction for the LLM

We will work with the category system version 2.0 [attach].

Save key outputs manually:

category list,
coding tables,
previous summaries.

Maintain organised folders and file names:

Categorias_v1.docx
Categorias_v2.docx
Codificación_BloqueA.xlsx

Advice from one researcher to another: your research log.

To comply with the traceability principle (P3), do not rely solely on the chat memory. I suggest opening a simple document where you note:

Table 5. Research log

Field	Explanation	Usage example
Date and Model	Record the exact day of the session and the specific LLM version. This is crucial because model capabilities evolve rapidly.	14 March, ChatGPT-5.
Session description	Note the context of the conversation "thread". It is important to know whether the model has memory of previous steps or whether you have started "from scratch", to avoid accumulated biases.	Continuing the analysis from where I left off. / Starting the analysis of document ID23 from scratch.
Key instructions	Document any change or nuance you have introduced in your master prompt. If you paste the exact instruction here, you will be able to replicate the result months later.	I modified the instruction by adding "identify duplicated ideas".
Observations	Note qualitative impressions about the model's performance or conceptual blocks. This will help you interpret the data without desperately trying to remember what happened in that session.	Today the model is being particularly creative. / It seems to have got stuck on the TEACHING category.
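
A word-processor document is perfectly adequate for this log. If you would rather keep it as a structured file, the four fields of the table can be appended to a CSV after each session. This is a minimal sketch: the file layout, column names and function name are assumptions, not part of the procedure itself.

```python
import csv
from datetime import date
from pathlib import Path

# Column names are an assumption; they mirror the fields of the research log table.
FIELDS = ["date", "model", "session_description", "key_instructions", "observations"]


def append_log_entry(path, model, session_description, key_instructions,
                     observations="", day=None):
    """Append one session record to a CSV log, writing the header on first use."""
    log_path = Path(path)
    is_new = not log_path.exists() or log_path.stat().st_size == 0
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (day or date.today()).isoformat(),
            "model": model,
            "session_description": session_description,
            "key_instructions": key_instructions,
            "observations": observations,
        })
```

Each call adds one row, so the file accumulates a dated history of models, prompts and impressions that you can open in any spreadsheet program.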

Sources of error and how to mitigate them

Typical errors include over-interpretation, creation of excessively general categories, loss of instructions and inconsistency between responses. To mitigate them, clear restrictions are formulated, rules are recalled periodically, and the model is asked to review its own coherence.

Main problems and suggestions for mitigating them.

A. Over-interpretation or "hallucination". Models can generate unjustified inferences.

Instruction for the LLM

Do not make inferences that are not grounded in the data. If you cannot determine something, state it.

B. Overly general categories. Instructions: request inclusion/exclusion criteria and textual examples.

C. Loss of instructions. Ask the model to repeat the master prompt periodically.

D. Inconsistency between responses. Instructions: ask the model to review its own coherence.

Instruction for the LLM

Check whether any categories have been applied inconsistently.

Identify contradictions and propose adjustments.

E. Ambiguity in units of analysis. Instructions: explicitly indicate which unit to use.

F. Mixed languages. Instructions: give clear guidance on whether to translate or not.

Here is a summary table.

Table 6. Sources of error in LLM use and mitigation strategies

Source of error	Description	Risks	Mitigation strategy
Ambiguous instructions	Unclear or incomplete prompts	Inconsistent coding	Draft explicit, reusable instructions
Over-interpretation	Inferences not present in the data	Loss of validity	Prohibit external interpretations
Hallucinations	Introduction of non-existent content	Invalid results	Require exact verbatim quotations
Vague categories	Unclear definitions	Overlaps	Refine definitions and criteria
Lack of human control	Accepting results without review	Systematic errors	Review samples and versions
Model variability	Variation between runs	Low reproducibility	Document prompts and versions

Using projects or persistent workspaces (ChatGPT Projects, NotebookLM, etc.)

Specific working environments exist within LLMs themselves, such as ChatGPT Projects or NotebookLM, which allow you to organise documents, notes and conversations in one place permanently.

In the context of LLM-assisted content analysis, these spaces can function almost like a digital research notebook, where you can bring together:

the corpus (open-ended responses, interviews, documents);
methodological documents (design, analysis protocol, inclusion/exclusion criteria, etc.);
the various versions of the category system;
notes of analytical decisions, records of changes and reflections;
exported coding tables or interim summaries.

Their use can be very helpful when the analysis is complex, extended over time, or involves different types of documents. However, they also introduce risks and limitations that are important to understand and make explicit in your research. Your decision whether to use them or not should be based on an analysis of these advantages and disadvantages for your specific project.

Advantages

a) Centralising study information

A project allows you to bring together in a single environment:

the corpus (for example, all open-ended questionnaire responses),
reference documents (key articles, theoretical framework, analysis criteria),
the methodological guide being followed,
previous reviews and versions of the category system.

This reduces the typical dispersion of having many loose files (Word, PDF, spreadsheets) and makes it easier for the model to "see" the set of relevant documents when it interacts with the researcher.

b) Maintaining context between sessions

In a "normal" conversation with the model, context is lost or fragmented when the session is closed or the message limit is exceeded. Projects or notebooks, by contrast, maintain:

the interaction history,
attached documents,
and often a persistent summary of key content.

This makes it possible to resume the analysis days or weeks later without having to explain everything from scratch, which is very practical for those who have to combine research with teaching or other tasks.

c) Reducing repetition of instructions

When working without projects, you must repeat in each session:

the study context,
the research questions,
the type of analysis (inductive, deductive, mixed),
the model's role (assistant in content analysis),
the "do not invent" warnings, etc.

In a project, much of this information can be fixed at the outset, and the model has it available as a recurring reference. This saves time and reduces the risk of forgetting an important instruction.

d) Digital "research notebook" function

A project can be used as a methodological record space, where the following are stored:

decisions about category merges and splits;
versions (v1.0, v2.0, v3.0) of the category system;
justifications for changes;
discarded explorations, but documented.

Although this does not replace the formal record in your own document (for example, a research diary or the method section in your research report), it can complement it and serve as a draft or working repository.

e) Semantic search within documents

Some environments allow you to search for relevant fragments within uploaded documents, not only by keyword, but by meaning. This can help to:

quickly locate verbatim quotations that exemplify a category;
check whether a topic actually appears in the corpus or only in the summary;
review how a specific concept is used in different parts of the material.

In content analysis terms, this function can speed up the collection of evidence to justify findings.

f) Support for iterative and complex analyses

In extensive studies (for example, several waves of data collection, or several groups of participants), the project can contain logical subfolders (by block, phase, group), which makes it easier to maintain a global view of the analysis and allows the LLM to navigate between related materials without the researcher having to load them one by one in each message.

Disadvantages and risks

a) Privacy and data protection risks

Uploading real educational corpora (student responses, teacher data, institutional data…) to an external platform carries significant risks:

possible processing of data by the provider (OpenAI, Google, etc.);
storage on external servers;
potential breach of data protection regulations, if appropriate measures are not taken.

This requires:

rigorously anonymising data before uploading;
reviewing the platform's privacy policy and terms of use;
following the ethical and legal standards of the institution and the research project.

If the sensitivity level is high (for example, data about minors, health, or vulnerable situations), it may be more prudent not to use these types of environments, or to restrict use to already anonymised and highly aggregated materials.

b) False sense of "intelligent tracking"

Having a "Project" with many documents can create the impression that the model has a kind of deep, stable memory of the analysis, when in reality:

it still operates through predictive text generation (it will always give you a response),
it may forget relevant details,
and it may not always respect the logical sequence of analytical decisions.

You could end up placing too much trust in the idea that "the project already knows everything" and stop verifying methodological coherence. It is important to remember that the project is an organisational aid, not a substitute for your reasoning.

c) Technological dependency and access problems

If the entire analysis (decisions, versions, evidence) is concentrated in a single online project, any of the following can seriously affect continuity of work:

a technical incident,
a change in service terms,
loss of access to the account.

It is therefore crucial to periodically export key information (category systems, tables, syntheses) to local documents under the researcher's control. In general, it is advisable, necessary and, I insist, essential to make daily backups of everything related to your research (or your work, or your personal files). I suppose there is no need to labour the point, but ask anyone who makes these backups because they learned the hard way (myself included). Learn it the easy way.

d) Reproducibility limitations

In academic terms, reproducibility is affected because:

other researchers will not be able to replicate exactly the same project environment;
models may change version;
LLM results may vary over time.

This does not invalidate their use, but requires greater care in documenting the procedure, instructions, model versions and the core content of the project.

e) Risk of a methodological "black box"

If much of the reasoning and refinement takes place within the project environment and is not explicitly recorded in other documents, part of the analytical process may become opaque (or wholly dependent on screenshots, internal logs, etc.).

This runs counter to the principles of transparency and verifiability in research, so the project must not replace the research diary or the formal method section in your report.

Recommendation

Projects or persistent workspaces can be very useful as an organisational support tool when the analysis requires centralising the corpus, saving versions of the category system, maintaining coherence between sessions, and enabling rapid searches within uploaded materials. Their use is especially advisable when work extends over long periods, involves multiple documents, or requires easy retrieval of previous decisions.

However, these environments should not be used as the primary repository for the analysis or as a substitute for the formal methodological record. You must maintain the given recommendations, such as periodically exporting key analytical products (categories, tables, summaries) and recording in writing, outside the project environment, the most important methodological decisions. Persistent workspaces can complement the researcher's work, but cannot replace the documentary records required by content analysis.

And, very importantly, remember that before using a persistent workspace it is worth asking three questions:

Is the corpus completely anonymised?
Will the research extend over a long period with multiple documents?
Does the institution authorise the use of these platforms?

If any answer is negative, it is preferable to limit their use or work with other options.

With the data prepared and the model configured, you are ready to begin the analysis proper. From here, the work becomes more iterative: you will explore, propose categories, refine them and start again as many times as necessary. This is normal. This is how good content analysis works, with or without an LLM. Part III accompanies you through each of those cycles.

Part III

Analysis

Exploration, building the category system, coding and validation.

Chapter 5

Initial exploratory analysis

Configure the LLM with the master prompt and assign it the analytical role. → Ch. 4
Prepare the data divided into blocks so that they can be sent progressively. → Ch. 3.4
Note: At this stage you do not yet need a category system. You are just exploring.

Exploratory analysis is the first systematic engagement with your data — the corpus. Its aim is not to build final categories but to understand the terrain, identify possible patterns and guide the categorisation phase. This phase benefits particularly from the speed and synthesis that LLMs offer.

How to present data to the LLM

The way data are introduced affects the quality of the analysis. Data blocks should be clearly delimited, using stable numbering and start/end markers. It is useful to briefly indicate what the block is about and what is expected of the model (for example, "I want an exploratory analysis, with no definitive categories yet").

Some recommendations:

Use clear, clean blocks

R1: [response]

R2: [response]

R3: [response]

Include a stable, consistent header

Instruction for the LLM

Below is a set of responses about [topic].
Analyse only this block.

Delimit the text. Using delimiters such as triple angle brackets or triple quotation marks helps the model identify the corpus:

<<< R1 [response] / R2 [response] >>>

Avoid multiple or ambiguous instructions. For example, do not ask for summaries, categories and coding simultaneously.
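
The recommendations above (clean numbered lines, a stable header, explicit delimiters) can be combined in a small helper so that every block reaches the model in the same shape. This is only a sketch: the function name is hypothetical, the header is an English rendering of the instruction shown above, and the `<<<` and `>>>` markers are one delimiter choice among several.

```python
def wrap_block(topic: str, responses: dict[str, str]) -> str:
    """Build a message with a stable header and a clearly delimited data block."""
    header = (
        f"Below is a set of responses about {topic}.\n"
        "Analyse only this block."
    )
    # One response per line, each prefixed with its stable ID.
    body = "\n".join(f"{rid}: {text}" for rid, text in responses.items())
    return f"{header}\n<<<\n{body}\n>>>"
```

The task itself (summary, patterns, and so on) is deliberately left out of the helper, so that you add a single, unambiguous request to each message.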

Requesting summaries, patterns and possible themes

In the exploratory phase, the model is asked to synthesise the main ideas, identify between five and ten preliminary patterns (depending on the type of data) and provide representative quotations. This gives a first overview of the corpus without yet committing to a stable category system.

In this phase it is recommended to begin with simple descriptive tasks:

Instruction for the LLM

Here is a set of responses about [topic]. For a preliminary exploratory analysis, I want a general summary of the main ideas (5–7 lines).

[Wait for the response]

Instruction for the LLM

Now I want between 5 and 10 preliminary themes or patterns, with brief verbatim examples that represent each pattern. Do not generate definitive categories, only patterns.

AiRISES case study

Following the AiRISES case study: when and what information do students actually seek? When we asked the LLM to identify temporal patterns, we discovered that August is a critical month for forum activity. Why? Because institutions are closed and students feel "orphaned" of official guidance. In addition, the exploratory analysis revealed an unexpected pattern: vocational education and training (FP, Formación Profesional) is not always seen as a "gateway" to university. In fact, more than 50% of the messages about university do not mention the entrance examination, suggesting that many FP students see their qualification as a pathway with its own identity and not merely a formality. As a researcher, these exploratory ideas are the ones that will guide your definitive questions.

Quick identification of interpretive angles

In addition to the summary and patterns, it is useful to ask the model to identify:

recurring concerns,
positive/negative attitudes,
facilitating factors and barriers,
unexpected elements, questions that the data raise.

Instruction for the LLM

Identify:
- facilitators,
- barriers,
- dominant emotions,
- expressed needs,
- minority but relevant ideas.

This allows the analysis to be orientated towards interpretive dimensions without fixing categories yet.

Possible errors at this stage

LLMs can make errors at this stage if they do not receive adequate guidance.

First, the model commonly confuses emerging patterns with definitive categories. This happens because LLMs tend to organise information in a structured way, even when that structure has not yet been validated. To avoid this, it is important to specify explicitly that preliminary observations are sought, not consolidated categories.

A second problem is the tendency to over-generalise. Models, having worked with large training corpora, tend to propose broad or abstract assertions. The most effective way to control this risk is always to require that any assertion be accompanied by exact verbatim quotations from the corpus, so that the analysis remains anchored in real data.

Third, externally sourced interpretations may appear that are not justified by the data. LLMs, by their generative nature, may incorporate inferences based on common associations or habitual linguistic patterns that are not present in the corpus being analysed. This problem can be mitigated by explicitly indicating that the model must not interpret beyond what is clearly expressed in the responses.

Fourth, themes or analytical dimensions of different scales may be mixed. For example, the model may combine individual emotions with institutional barriers, generating patterns that lack conceptual coherence. To prevent this type of confusion, it is advisable to ask that patterns be organised at a clearly defined analytical level (for example: personal, institutional, pedagogical, etc.).

Finally, LLMs tend to suggest causal relationships even when these are not supported by the data. These inferences tend to arise because models are trained to complete plausible narratives that do not necessarily reflect the content of the corpus. To avoid this problem, it should be expressly indicated that unjustified causal inferences are prohibited and that only elements appearing explicitly in the analysed texts should be described.
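
Requiring exact verbatim quotations, as recommended for the second problem above, is only useful if you check them. The sketch below shows one possible check, under the assumption that you have the corpus as a mapping from response ID to text; the function name is hypothetical. It tolerates differences in whitespace and in straight versus curly quotation marks, and nothing else.

```python
def unverified_quotations(quotations: list[str], corpus: dict[str, str]) -> list[str]:
    """Return the quotations that do not appear verbatim in any corpus response."""

    def normalise(text: str) -> str:
        # Ignore only whitespace differences and straight/curly quotation marks.
        for curly, straight in (("“", '"'), ("”", '"'), ("’", "'")):
            text = text.replace(curly, straight)
        return " ".join(text.split())

    haystack = [normalise(text) for text in corpus.values()]
    missing = []
    for quotation in quotations:
        needle = normalise(quotation)
        # An empty quotation is never accepted as verified.
        if not needle or not any(needle in text for text in haystack):
            missing.append(quotation)
    return missing
```

Any quotation returned by this function was paraphrased, merged from several responses or invented by the model, and the pattern it supports should be re-examined against the original texts.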

Before closing your laptop each day (and getting some rest), so that you can pick up the work tomorrow without surprises:

Backup. Have you exported the latest table to a local Excel or Word file?
Chat link. Save the link or PDF of the current conversation.
Review your research log. Have you noted all your decisions?
You can now end your work session with peace of mind.

The error: treating preliminary patterns as definitive categories.

Why this happens: LLMs tend to structure information even when asked only to explore.

How to avoid it: explicitly state in the prompt that you do not yet want definitive categories. Treat the proposals as provisional hypotheses.

Chapter 6

Building the category system

Complete the exploratory analysis and prepare a list of preliminary patterns or themes. → Ch. 5
Confirm the analysis approach (inductive, deductive or mixed). → Ch. 2.2
If your analysis is deductive: have the theoretical framework or prior category system ready to provide to the LLM.

The category system is the heart of content analysis. At this stage, preliminary patterns are transformed into an organised analytical structure useful for coding and interpretation. LLMs can accelerate this process, but the researcher must exercise rigorous control to ensure clarity, relevance and coherence.

Academic criteria for a good category system

With or without an LLM, a category system must meet a series of criteria that guarantee its validity, analytical utility and internal and external coherence.

Relevance to the research objective.

The system must be directly linked to the study's questions and purposes. Categories should not be generic but relevant to what is to be understood or explained.

Exhaustiveness.

The set of categories must allow all significant topics present in the data to be classified. An exhaustive system avoids gaps and ensures the analysis captures the diversity of the corpus.

Exclusiveness.

Categories must be conceptually distinct and mutually exclusive, so that each represents a clearly differentiated meaning. This does not imply that a complete response can be linked to only one category when coding is carried out by segments: a single response may contain several segments, and each segment may correspond to a different category.

Similarly, when categories are used as document- or case-level attributes or variables (for example, presence or absence of a topic, response type, analytical profile of the respondent), exclusiveness applies to the definition of the categories, not to the number of attributes a single document may present. The same case may have several attributes, provided these are clearly defined and do not overlap conceptually.

In both uses, the fundamental point is that categories must not compete with one another for the same type of content. A well-constructed category system avoids redundancies, reduces dispersion and facilitates consistent and comparable interpretations.

Conceptual clarity.

Each category must have a precise name, a clear definition and criteria that guide its application. The elements included in each category must share that definition and function as a cohesive concept, not a heterogeneous collection of ideas. This clarity facilitates reproducibility and reduces ambiguities in interpretation.

Grounding in textual examples.

Using representative corpus fragments as membership examples helps to delimit what does and does not belong to each category. Examples strengthen the comprehension, transparency and validity of the system.

Inductive: generating categories from the data

From an inductive approach, the LLM can propose emerging categories from the corpus. The researcher reviews those proposals, clarifies ambiguous terms, unifies duplicates and adapts the system to the theoretical and contextual reality of the study.

LLMs can help generate preliminary lists of categories, groupings, meaning summaries or operational descriptors.

Instruction for the LLM

From the following block of data, generate an initial category system.
For each category include:
- a short name,
- a clear description,
- inclusion criteria,
- exclusion criteria,
- 2 or 3 representative verbatim quotations.
Do not generate higher-order themes yet.

Deductive: applying pre-existing categories

When working with prior conceptual frameworks, the model is used to apply already-defined categories. It is provided with the category list and asked to classify each response, always with explicit justification based on the text. In deductive analysis, categories come from prior literature, theoretical models, rubrics, questionnaires, etc.

Instruction for the LLM

These are the predefined categories.
Review them briefly and confirm that you understand them.
[CATEGORIES HERE]
Then classify each response into one or more categories, always with textual justification.

The LLM must learn the system, not create it.

A special case of deductive coding: using lexical dictionaries (controlled lexicon)

Figure 5. Comparison between controlled lexicon and semantic coding

Within the deductive approach, a particular situation may arise in which category assignment does not require complex semantic interpretation, but can be supported by the explicit presence of certain terms or expressions. These are cases in which certain lexical elements act as unambiguous indicators of a previously defined category, which allows the implementation of deductive coding based on ad hoc dictionaries or lexicons.

This procedure constitutes a special case of the deductive approach, since it departs from a closed, theoretically validated category system defined with clear operational criteria. Unlike inductive coding, it does not seek to identify emerging themes or expand the category system, but rather to operationalise existing categories through explicit decision rules.

Within LLM-assisted content analysis, the use of lexical dictionaries should not be understood as a purely lexical analysis or as a substitute for semantic interpretation, but as a complementary mechanism that is especially suited to certain structural dimensions of the content. In the AiRISES project this approach was applied, for example, to dimensions such as educational stages or levels, where the explicit mention of terms such as “universidad”, “bachillerato”, “Educación Secundaria” or “Formación Profesional de Grado Superior” allows direct and robust assignment of the corresponding category, with minimal interpretive ambiguity. Think of it as a "safe shortcut": if "Bachillerato" appears in the text, the category is "Bachillerato", with no further deliberation.

The lexicon is built from the already-defined category system, associating with each category a finite set of terms, variants and equivalent expressions. This resource acts as a formalised deductive rule, so that the appearance of one of the dictionary terms activates the coding of the corresponding category. The selection of terms is based on domain expert knowledge, corpus review, and, where appropriate, previous iterations of LLM-assisted analysis.

From a methodological standpoint, this strategy fulfils a dual function. On one hand, it increases coding precision for categories where indicators are explicit and normative, reducing the risk of false negatives. On the other, it provides a contrast criterion for evaluating coding carried out by LLMs or human coders, facilitating the detection of inconsistencies, systematic omissions or problems in operational definitions.

It is worth emphasising that this special case of deductive coding must not be applied indiscriminately. Its use is appropriate only when there is a clear and stable correspondence between terms and categories. For more interpretive dimensions — such as attitudes, emotions, valuations or intentions — LLM-assisted semantic coding is more appropriate. Consequently, the approach proposed in this guide advocates a hybrid model, in which lexicon-based deductive coding and AI-assisted semantic interpretation are combined strategically, according to the nature of each analytical dimension.

The following table presents a simplified example of a lexicon used for deductive coding of the dimension educational stage or level. In this case, each category is associated with a small set of terms and expressions whose presence in the text is considered an unambiguous indicator of the corresponding category. The lexicon is built from the previously defined category system (educational levels) and acts as a formal decision rule, so that the appearance of any of the listed terms automatically activates category coding, without the need for additional semantic interpretation.

Table 7. Lexical dictionary example (adapted to anglophone educational context)

Category / dimension	Operational definition	Lexicon terms and expressions (non-exhaustive)
Lower Secondary	Explicit references to lower secondary education (e.g., Key Stage 3/4 in England, Years 7–11), regardless of the narrative context of the message.	secondary school, lower secondary, Year 7, Year 8, Year 9, Year 10, Year 11, KS3, KS4, middle school, junior high
Upper Secondary / Sixth Form	Direct mentions of upper secondary education or the equivalent pre-university stage.	sixth form, A levels, AS levels, Year 12, Year 13, upper secondary, senior school, Highers (Scotland), advanced higher
VET / Technical education	Explicit references to vocational education and training or technical qualifications, without specifying level.	VET, vocational, technical college, further education, FE college, BTec, T level, apprenticeship, vocational qualification
Intermediate VET	Direct mentions of intermediate-level vocational qualifications or programmes.	Level 2 VET, intermediate apprenticeship, BTec First, Foundation apprenticeship, vocational Level 2
Higher VET	Direct mentions of higher-level vocational qualifications or programmes.	Level 3 VET, Level 4/5, higher apprenticeship, BTec National, T level, HNC, HND, degree apprenticeship
University / Higher Education	Explicit references to university or higher education, regardless of qualification type.	university, uni, college, higher education, HE, campus, degree course, undergraduate
Bachelor's degree	Direct mentions of undergraduate degree-level study.	bachelor's, bachelor degree, BA, BSc, BEng, LLB, honours degree, undergraduate degree, first degree
Postgraduate	Explicit references to postgraduate study.	master's, MA, MSc, MBA, PhD, doctorate, postgrad, postgraduate, graduate school
	Note: This table is an adaptation of the original Spanish lexicon, which was built around regulated Spanish educational levels (ESO, Bachillerato, FP, etc.). Researchers should construct their own lexicon based on the educational system relevant to their study context.

The presence of any term included in the lexicon automatically activates coding of the corresponding category. This procedure is applied only to dimensions in which terms function as unambiguous indicators, and does not exclude the possibility of a single message being coded in multiple categories.
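The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in AiRISES: the `LEXICON` excerpt, the function name `code_with_lexicon` and the sample message are all hypothetical, and the matching rule (whole terms, case-insensitive) is an assumption that you may need to tighten for short abbreviations such as "HE".

```python
import re

# Hypothetical excerpt of a lexicon in the spirit of Table 7 (illustrative terms only).
LEXICON = {
    "Upper Secondary / Sixth Form": ["sixth form", "A levels", "Year 12"],
    "University / Higher Education": ["university", "uni", "higher education"],
    "Postgraduate": ["master's", "MSc", "PhD"],
}

def code_with_lexicon(text, lexicon):
    """Return {category: [terms found]} for every category with at least one hit."""
    hits = {}
    for category, terms in lexicon.items():
        found = []
        for term in terms:
            # Whole-term, case-insensitive match: "uni" must not fire inside "united".
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.append(term)
        if found:
            hits[category] = found
    return hits

message = "I finished sixth form and I am now at uni, thinking about a PhD."
result = code_with_lexicon(message, LEXICON)
# A single message can activate several categories, as the rule allows.
```

Returning the detected terms alongside the category keeps the assignment auditable: each code can be traced back to the word that triggered it.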

If you want the LLM to apply this dictionary automatically and rigidly, you can use the following instruction, which will save you a great deal of manual review time:

Instruction for the LLM

Act as a technical assistant. I am providing you with a lexicon (dictionary of terms) linked to a deductive category system.
Your decision rule is simple: if any of the lexicon terms appears in the text, automatically assign the corresponding category. Do not carry out deep semantic interpretation; stick to the explicit presence of the words.
[INSERT LEXICON TABLE HERE, like the example in Table 7]
Return the results in a coding table showing: response ID, terms detected and category assigned.

A piece of advice: before applying the lexicon to the full corpus, test it with 10 or 20 responses. If you see the model getting confused (for example, someone says "my brother is at university" but they are actually talking about their own VET experience), adjust the prompt so that it distinguishes the subject's context.

Applying these categories might give rise to a table like the following, where 1 indicates the presence of the category in the unit of analysis (for example, in a message).

Table 8. Example document coding table

ID	Lower Secondary	Upper Secondary	VET	Intermediate VET	Higher VET	University	Bachelor's degree	Postgraduate
DOC1	1	0	1	0	1	1	1	1
DOC2	1	1	1	0	1	1	1	1
DOC3	0	0	1	0	1	1	1	1
DOC4	0	0	1	0	0	1	1	1
DOC5	1	0	1	1	1	1	1	1
DOC6	1	0	1	1	0	1	1	1
DOC7	1	1	1	1	1	1	1	1
DOC8	1	1	1	1	0	1	1	1
DOC9	1	1	1	0	1	1	1	1
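A presence/absence table of this kind can also be produced outside the LLM. The sketch below assumes a hypothetical two-category lexicon and three invented documents; the helper names `has_term` and `presence_matrix` are illustrative and are not taken from the guide.

```python
import csv
import io
import re

# Hypothetical mini-lexicon and documents, for illustration only.
LEXICON = {
    "VET": ["vocational", "apprenticeship"],
    "University": ["university", "campus"],
}
DOCS = {
    "DOC1": "My apprenticeship ended and I moved on to university.",
    "DOC2": "Life on campus is expensive.",
    "DOC3": "I have not decided anything yet.",
}

def has_term(text, terms):
    """True if any lexicon term appears as a whole word or phrase (case-insensitive)."""
    return any(
        re.search(r"(?<!\w)" + re.escape(t) + r"(?!\w)", text, re.IGNORECASE)
        for t in terms
    )

def presence_matrix(docs, lexicon):
    """One header row, then one row per document: its ID followed by 1/0 per category."""
    categories = list(lexicon)
    rows = [["ID"] + categories]
    for doc_id, text in docs.items():
        rows.append([doc_id] + [int(has_term(text, lexicon[c])) for c in categories])
    return rows

matrix = presence_matrix(DOCS, LEXICON)

# Serialise as tab-separated text, ready to paste into a spreadsheet.
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(matrix)
tsv_text = buffer.getvalue()
```

A document with no hits (DOC3 here) still receives a row of zeros, which keeps the matrix aligned with the full list of cases.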

AiRISES case study

Following the AiRISES case study: to analyse vocational education (VET), we could not let the LLM decide freely. We created a controlled lexicon based on the 26 vocational families in the national catalogue. This allowed us to classify 67% of the messages automatically into precise categories such as "Computing and Communications" (the most queried, at 41%) or "Healthcare" (12%). However, to understand soft skills or criticism of the educational system, we had to abandon the dictionary and move to inductive coding, allowing the model to detect the tension graduates feel between their theoretical training and the actual demands of employers. Remember that this is a hybrid model. The lexicon is excellent for structural aspects (who is speaking, from which level), but to analyse attitudes, emotions or valuations you must return to the assisted semantic coding explained in Chapter 8, because that is where your interpretive judgement and the LLM's ability to understand context are irreplaceable.

Mixed: template + emerging categories

In many studies, pre-existing categories are combined with new ones emerging from the data. The model can add categories when it detects content that does not fit the template, explaining why and proposing definitions. The mixed approach combines the best of both strategies.

Instruction for the LLM

Use this category system as the base template.
[CATEGORIES HERE]
Add new categories if ideas appear that do not fit.
For each added category:
- explain why it does not fit the prior categories,
- provide its definition,
- include a textual example.

This approach is common in complex studies or those with multiple data sources. In any case, it is advisable to also use the inductive strategy to ensure that the proposed system is exhaustive or that the corpus suggests modifications to existing frameworks. Both situations are methodologically relevant.

How to ask the LLM for clear descriptions, inclusions, exclusions and examples

To avoid vague categories, the LLM should be asked to provide for each category a precise description, inclusion and exclusion criteria and textual examples. The total number of categories is also controlled to keep them manageable. Models tend to generate overly general categories if not guided.

Instruction for the LLM

Reformulate each category so that it meets:
1) Clear and unambiguous name.
2) Precise description with a single central idea.
3) Inclusion criteria.
4) Exclusion criteria.
5) Real textual examples from the corpus.
If any category is not consistent, propose adjustments.

Useful tips:

Ask it to avoid vague terms such as “motivation”, “attitude” or “resources” used without further specification.
Request explicit exclusions: "Includes X but not Y".
Restrict the number of categories (e.g., between 6 and 12).
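Before accepting a reformulated system, a quick structural check can confirm that every category carries the elements requested above. The following is a minimal sketch under stated assumptions: the `Category` dataclass, the `check_codebook` function and the two draft categories are hypothetical, and the 6 to 12 range simply mirrors the tip above. The check covers completeness only, not conceptual quality.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str
    description: str = ""
    inclusion: str = ""
    exclusion: str = ""
    examples: list = field(default_factory=list)

def check_codebook(categories, min_n=6, max_n=12):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if not min_n <= len(categories) <= max_n:
        problems.append(f"{len(categories)} categories (expected {min_n}-{max_n})")
    names = [c.name.strip().lower() for c in categories]
    for name in sorted(set(n for n in names if names.count(n) > 1)):
        problems.append(f"duplicate name: {name}")
    for c in categories:
        for attr in ("description", "inclusion", "exclusion"):
            if not getattr(c, attr).strip():
                problems.append(f"{c.name}: missing {attr}")
        if not c.examples:
            problems.append(f"{c.name}: no textual examples")
    return problems

# Hypothetical draft: the first category is incomplete on purpose.
draft = [
    Category("Motivation", description="Willingness to learn digital tools"),
    Category("Resources", "Access to devices", "Mentions of hardware",
             "Excludes training", ["We only have two laptops"]),
]
issues = check_codebook(draft)
```

The list of issues can be pasted back into the conversation with the LLM as a concrete request for the missing criteria.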

Comparison between manually and LLM-generated systems

AI-generated systems tend to be more general; those constructed by researchers tend towards greater specificity. Combining both approaches allows the LLM's processing and synthesis capability to be exploited without sacrificing conceptual depth.

The LLM tends to:

generate more general categories,
group under broad concepts,
propose more neutral names.

The researcher tends to:

generate more specific categories,
qualify conceptual differences,
adjust the system to the theoretical framework.

Suggested process:

The LLM generates a first version (v1.0).

It produces a category draft from the initial corpus.

The researcher reviews and adjusts.

Refines the system: removes redundancies, merges categories, clarifies definitions and adds criteria.

The LLM generates the revised version (v2.0).

Integrates the modifications and reorganises the system according to the instructions.

The researcher validates the functioning.

Verifies that the categories are clear, distinct and applicable to the corpus.

A second LLM carries out a critical review (recommended).

Offers advantages, problems and improvements of the category system, acting as an independent reviewer.

The error: accepting overly general categories without exclusion criteria.

Why this happens: LLMs propose broad terms without specifying what is excluded.

How to avoid it: demand inclusion and exclusion criteria, as well as real textual examples from the corpus.

Chapter 7

Refining and validating the category system

Generate a category system version 1.0 (with LLM, manually or combining both). → Ch. 6
Verify that each category has at least a name and provisional description.
⚠ Warning: do not code the full corpus until you have finished this chapter. System validation comes before mass coding.

Once the initial version of the category system has been constructed, it is essential to review and improve it before using it to code the full corpus. This phase, often underestimated, is where the quality, coherence and analytical utility of the system are ensured. Do not skip it. We need to seek external coherence (i.e., that one category is not so similar to another that you end up flipping a coin to decide which to use). LLMs can help detect inconsistencies and propose improvements, but the final validation must always be yours.

Merging, splitting and level of abstraction

During refinement, overly broad, redundant or ambiguous categories are detected. The decision is made whether they should be split, merged or renamed. The aim is to achieve a consistent and useful category system.

This is the category system, version 1.0.

Instruction for the LLM

Analyse the internal and external coherence and identify:
1. Categories that are too broad.
2. Redundant or overlapping categories.
3. Ambiguous names.
4. Differences in conceptual level.
5. Unnecessary or irrelevant categories.
Provide concrete proposals for improvement.

Internal and external coherence

The category system must function as a logical conceptual map.

Internal coherence - Each category must have a clear central idea, well-defined criteria, precise boundaries and consistent textual examples.

External coherence - Categories must be clearly differentiated from one another, must not overlap or contradict each other, and must have a comparable level of generality.

Instruction for the LLM

Review the category system and answer:
- Which categories are poorly delimited?
- Which categories are too similar?
- Which categories should be grouped under a higher-order theme?
- Which categories lack clear examples?
- Suggest justified improvements.

LLM assistance for detecting ambiguities

The LLM can be asked to act as a methodological critic, identifying conceptual weaknesses and proposing alternative groupings. It is also possible to carry out stress tests by applying the system to particularly complex fragments. Models can detect problems that escape the researcher due to their proximity to and familiarity with the data.

Request critical analysis.

Instruction for the LLM

Act as a methodological critic.
What conceptual weaknesses do you find in this category system?
Provide specific observations based on definitions, criteria and examples.

Request conceptual alternatives.

Instruction for the LLM

Propose 2 or 3 alternative ways of grouping these categories.

Request stress tests.

Instruction for the LLM

Evaluate this system by applying it to the following 5 segments.
Detect where the definitions fail and justify why.

These tests are used to check whether the categories really work.

Example evolution: version 1.0 → 2.0 → 3.0

Figure 6. Category system construction and refinement cycle (v1.0 → v3.0)

Version 1.0 (initial inductive AI)

Lack of training
Scarcity of resources
Motivation to learn
Use of technology
Institutional support

Problems detected:

Category 4 too broad.
Categories 1 and 3 very similar but different.
Lack of precision in descriptors.

Version 2.0 (after human review + LLM)

Personal training gaps
Insufficient technological resources
Individual disposition towards digital learning
Institutional limitations for integrating technology
Self-training strategies and peer support

Improvements in v2.0:

More specific categories.
Clear differentiation between personal and institutional level.

Version 3.0 (final refined system)

Personal level

Personal training gaps
Individual disposition towards learning
Self-training strategies

Institutional level

Insufficient technological resources
Organisational limitations for integrating technology

Advantages of v3.0:

Consistent conceptual level.
Clarity in the boundaries between categories.
Greater utility for coding and interpretation.

Documenting methodological decisions

It is advisable to generate a decision record that captures how the system has evolved, what changes have been made and why. The LLM can help draft this record from the various versions. Traceability is essential for ensuring the credibility of the analysis.

It is recommended to record:

initial version of the system (v1.0),
LLM critiques,
human review,
improved version (v2.0),
reasons for merges or splits,
final criteria,
final version used (v3.0).

You can ask the LLM:

Instruction for the LLM

Generate a decision record based on these revisions.

Include versions, changes and methodological justifications.

This record will be useful for:

the research method,
appendices,
peer review,
triangulation with other analysts.
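If you prefer to keep the decision record in a structured file rather than only in prose, a small sketch such as the following may help. The `Decision` fields, the helper `changes_for` and the two entries are hypothetical illustrations loosely inspired by the v1.0 to v3.0 example; they are not a prescribed format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Decision:
    version: str   # version produced by the change, e.g. "2.0"
    change: str    # what was merged, split, renamed or removed
    reason: str    # methodological justification
    source: str    # who proposed it: "LLM critique" or "human review"

# Illustrative entries only; replace them with your own decisions.
log = [
    Decision("2.0", "Reworked 'Use of technology' into more specific categories",
             "Category too broad", "LLM critique"),
    Decision("3.0", "Grouped categories under personal and institutional levels",
             "Uneven conceptual level", "human review"),
]

def changes_for(log, version):
    """List the changes that led to a given version of the system."""
    return [d.change for d in log if d.version == version]

# JSON text that can be stored alongside the coding tables or added to an appendix.
log_json = json.dumps([asdict(d) for d in log], indent=2, ensure_ascii=False)
```

Keeping the record as data makes it easy to filter by version or by source when writing the method section.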

The error: skipping the refinement and coding directly with version 1.0.

Why this happens: the initial system almost always has overlaps and imprecise definitions.

How to avoid it: apply the system to 5–10 real segments before using it at scale.

Chapter 8

Assisted coding

Validate and refine the category system to version 2.0 or above. → Ch. 7
Verify that each category has a name, definition, inclusion criteria, exclusion criteria and textual examples.
Organise the data into numbered, anonymised blocks. → Ch. 3.4
Prepare the master prompt for the coding session. → Ch. 4.2

We arrive at the part that usually gives us the most headaches: coding. But do not worry — this is where the LLM will genuinely save you those hours of staring at a table until the words lose their meaning. Coding is the process by which categories are assigned to units of analysis. This normally involves coding text segments within each document, but in some studies the complete document is also coded when certain categories operate as case attributes/variables (for example, presence/absence of a topic, document type, or analytical characteristics of the respondent).

With the help of an LLM, this process can be organised much more quickly and in an orderly fashion, provided the category system is well-defined and the instructions are very precise. Part II explains how to prepare the data — you should review this now. Before coding, texts are organised into numbered blocks and the model is reminded which version of the category system to use. Blocks should not be excessively large, to avoid saturating the LLM's context window.

This section explains how to request table-format coding, how to validate coding quality and how to avoid frequent errors.

How to request clear, justified coding tables

Figure 7. Complete assisted coding flow and quality control

Result tables typically include columns for text, category/categories and justification. The researcher specifies the desired format and whether assigning multiple categories to the same fragment is allowed. Emphasis is placed on justification being based on exact quotations.

Instruction for the LLM

Code each response according to category system version 2.0.
Return the output as a table with these columns:
1. Text (brief summary of the response, or the complete response)
2. Assigned category or categories
3. Textual justification based on segments of the corpus
Do not invent quotations. If a response does not fit any category, say so.
[Optional: require a single category]
Assign only the most relevant category to each response.
[Optional: allow multiple categories]
A response may include several categories if this is justified.

If, instead of coding segments, I need to code document-level attributes/variables, return a case-by-case table (or matrix) in which each row is a response/document and the columns are attributes (for example: presence of X = 1/0), briefly indicating the criterion for marking 1.

How to manage large volumes of data

Remember that, in studies with many cases, the work is done in blocks (see Part II, section 3.4) and the partial tables are saved. Each block should be saved as a separate file.
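Splitting the corpus and saving one file per block can be scripted. The sketch below is only one possible arrangement: the file naming (`block_001.json`), the JSON format and the seven sample responses are assumptions made for illustration, and a temporary directory is used so the example leaves no files behind.

```python
import json
import tempfile
from pathlib import Path

def split_into_blocks(responses, block_size):
    """Yield (block_number, list_of_responses), numbering blocks from 1."""
    for start in range(0, len(responses), block_size):
        yield start // block_size + 1, responses[start:start + block_size]

def save_blocks(responses, block_size, out_dir):
    """Write each block to its own JSON file and return the paths created."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, block in split_into_blocks(responses, block_size):
        path = out_dir / f"block_{number:03d}.json"
        path.write_text(json.dumps(block, ensure_ascii=False, indent=2), encoding="utf-8")
        paths.append(path)
    return paths

# Hypothetical anonymised responses R1..R7.
responses = [{"id": f"R{i}", "text": f"response text {i}"} for i in range(1, 8)]
with tempfile.TemporaryDirectory() as tmp:
    saved = save_blocks(responses, block_size=3, out_dir=tmp)
    file_names = [p.name for p in saved]
    last_block = json.loads(saved[-1].read_text(encoding="utf-8"))
```

In a real project you would pass a permanent folder instead of the temporary one and choose a block size that fits comfortably within the model's context window.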

It is important to maintain a stable methodological context by recalling key instructions and avoiding changing criteria mid-process. To this end, a permanent context must be established. It is suggested to always repeat:

Master prompt
Category system (current version)
Unit of analysis
Output format

Periodically emphasise:

Do not generate new categories.
Do not infer meanings not present in the data.
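One way to guarantee that this permanent context is repeated identically for every block is to assemble the prompt with a small function. This is a sketch under assumptions: the function name `build_block_prompt`, the section labels and all sample texts are placeholders, and the result is a plain string to paste into or send to whichever LLM you use.

```python
def build_block_prompt(master_prompt, category_system, unit_of_analysis,
                       output_format, block_text, system_version):
    """Assemble the fixed context plus one data block into a single prompt string."""
    reminders = [
        "Do not generate new categories.",
        "Do not infer meanings not present in the data.",
    ]
    parts = [
        master_prompt.strip(),
        f"Category system (version {system_version}):\n{category_system.strip()}",
        f"Unit of analysis: {unit_of_analysis}",
        f"Output format: {output_format}",
        "Reminders:\n" + "\n".join(f"- {r}" for r in reminders),
        f"Data block:\n{block_text.strip()}",
    ]
    return "\n\n".join(parts)

prompt = build_block_prompt(
    master_prompt="You are assisting with a content analysis.",  # placeholder text
    category_system="1. Personal training gaps\n2. Insufficient technological resources",
    unit_of_analysis="complete response",
    output_format="table with columns: Text, Category, Justification",
    block_text="R1: We never received any training.",
    system_version="2.0",
)
```

Because only `block_text` changes between calls, the criteria cannot drift from one block to the next without a visible change in the code.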

Remember it is necessary to keep all instructions given saved in an external document (and I take this opportunity to insist: make backups).

The error: not specifying the unit of analysis in the LLM instruction.

Why this happens: without explicit instruction, the LLM freely decides how to segment.

How to avoid it: always include the unit of analysis in the prompt and check it during manual review.

Chapter 9

Validation and quality control

Code at least one block of the corpus with the LLM. → Ch. 8
Export the coding tables to local files before continuing.
⚠ Warning: manual review is mandatory. Do not skip this chapter even if the LLM has generated tables that look impeccable.

LLM-assisted coding does not, in any case, eliminate the need for human control or inter-rater agreement mechanisms. As in traditional content analysis, the quality of the category system and coding is strengthened when the extent to which different judges or coders agree when applying the categories is verified. In this context, judges may be:

one or more human researchers,
one or more LLMs configured as coders,
a combination of both.

Although LLMs can speed up the process, methodological validation still requires three types of control: manual review, assisted double coding and cross-review between models.

Here is a summary of what I will explain below.

Figure 8. Validation strategies: when to apply each

Manual review of assisted coding

Even when coding has been carried out with LLM support, it is essential that you manually review a sufficiently large sample of the material (for example, between 10% and 20% of the corpus, or more in sensitive studies).
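Drawing the review sample at random, with a recorded seed, makes the selection reproducible and easy to document. The sketch below assumes a hypothetical corpus of 200 coded responses and a 15% fraction; `review_sample` and the seed value are illustrative choices.

```python
import math
import random

def review_sample(unit_ids, fraction=0.15, seed=2026):
    """Pick a reproducible random subset of coded units for manual review."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    size = max(1, math.ceil(len(unit_ids) * fraction))
    rng = random.Random(seed)  # fixed seed so the draw can be documented and repeated
    return sorted(rng.sample(list(unit_ids), size))

ids = [f"R{i:03d}" for i in range(1, 201)]  # hypothetical corpus of 200 coded responses
to_review = review_sample(ids, fraction=0.15)
```

Reporting the seed and the fraction in the method section lets another analyst reconstruct exactly which units were checked.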

You must review:

The coherence of the chosen category - Assess whether the assigned category really fits the meaning of the fragment and the defined category system (name, definition, inclusion and exclusion criteria).
The suitability of the verbatim quotation used as justification - Verify that the quotation selected by the LLM is representative of the fragment and actually serves to justify the assigned category.
The absence of invented content - Confirm that the model has not added words, nuances or examples that do not appear in the original corpus. Any synthesis or reformulation must be traceable to real data.

This manual review allows detection of systematic LLM error patterns (for example, a tendency to over-generalise, always apply the same catch-all category, or introduce subtle interpretations not in the text). If a very high number of errors is detected, or few but very important ones, try to understand why, return to square one and correct the process from the start.
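Part of the check for invented content can be supported mechanically: a quotation offered as justification should appear verbatim in the original response. The following is a minimal sketch assuming coding rows with hypothetical `id`, `category` and `quote` fields. It flags candidates for inspection and does not replace reading the flagged items yourself.

```python
import re

def normalise(text):
    """Lowercase and collapse whitespace so trivial layout differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()

def untraceable_quotes(coding_rows, corpus):
    """Return IDs whose justification quote is not found verbatim in the original response."""
    flagged = []
    for row in coding_rows:
        original = normalise(corpus.get(row["id"], ""))
        quote = normalise(row["quote"])
        if not quote or quote not in original:
            flagged.append(row["id"])
    return flagged

# Hypothetical corpus and LLM output, for illustration only.
corpus = {
    "R1": "We never received any training on the new platform.",
    "R2": "The school has only   two laptops for thirty pupils.",
}
coding_rows = [
    {"id": "R1", "category": "Personal training gaps",
     "quote": "never received any training"},
    {"id": "R2", "category": "Insufficient technological resources",
     "quote": "the school lacks enough laptops"},  # paraphrase, not a verbatim quotation
]
flagged = untraceable_quotes(coding_rows, corpus)
```

A flagged row is not necessarily a hallucination (the model may have summarised), but it is exactly the kind of case the manual review should look at first.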

But do not wait to stumble across an error. As the person ultimately responsible for the analysis (P1), be proactive about limiting hallucinations — send this challenge to the model to verify it is not inventing things (P2):

Instruction for the LLM

Review the coding table you have just generated. Identify whether any verbatim quotations have been summarised or altered. If you have "hallucinated" or invented any nuance not in the original text, acknowledge it now so I can correct it.

If the model apologises and corrects a quotation, you will have improved the validity of your analysis before it reaches the final report.

AiRISES case study

Following the AiRISES case study: we were not satisfied with the first LLM output. A critical example was the gender prompt: in the first trial, the model had a 14% error rate. Through manual review and prompt refinement, we achieved a second version with 100% agreement with the human expert. This procedure validated our methodology: you, with a well-configured LLM, can be far more effective than a generic automatic procedure.

Double coding assisted by the same LLM

A useful internal control strategy consists of asking the same LLM for two alternative codings of the same data block, with identical methodological instructions but with some small variation (for example, in role), and then comparing the results. This logic is analogous to human double coding when two coders are asked to apply the same category system independently.

Some options:

Ask for a coding with a more conservative approach, for example:

Instruction for the LLM

Recode this block using a more conservative approach.

Assign only one category per response.

Ask for coding with strict criteria:

Instruction for the LLM

Code this block applying strict criteria.

Do not code anything that is not explicitly expressed in the text.

Comparing the initial coding with the conservative or strict coding allows you to:

identify responses where the model hesitates or changes its criteria,
detect categories applied too loosely,
refine category definitions if ambiguities are observed.

From the standpoint of inter-rater agreement, these two codings can be treated as if they came from two different judges (LLM-1 in standard mode and LLM-1 in conservative mode), and agreement indicators can then be calculated (for example, percentage of agreement or chance-corrected statistics such as Kappa).
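As an illustration of this comparison, the sketch below computes the percentage of agreement between two codings and lists the responses coded differently. The two dictionaries are invented data, and `compare_codings` is a hypothetical helper that assumes one category per response.

```python
def compare_codings(coding_a, coding_b):
    """Compare two codings given as {response_id: category}.

    Returns (percentage agreement, sorted list of IDs coded differently).
    Only responses present in both codings are compared.
    """
    shared = sorted(set(coding_a) & set(coding_b))
    if not shared:
        raise ValueError("the two codings share no response IDs")
    disagreements = [rid for rid in shared if coding_a[rid] != coding_b[rid]]
    agreement = 100 * (len(shared) - len(disagreements)) / len(shared)
    return agreement, disagreements

# Hypothetical output of the standard and the conservative run on the same block.
standard = {"R1": "A", "R2": "B", "R3": "A", "R4": "C", "R5": "B"}
conservative = {"R1": "A", "R2": "B", "R3": "C", "R4": "C", "R5": "A"}
agreement, to_discuss = compare_codings(standard, conservative)
# Here agreement is 60.0 and the responses to re-examine are R3 and R5.
```

The list of disagreements is usually more informative than the percentage itself, because it points to the definitions that need refining.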

Discussing coding decisions with the model

The LLM can also be used as a judge that explains its decisions. In practice, this means asking the model to justify why it has applied a specific category and to re-evaluate doubtful decisions.

Instruction for the LLM

Why did you assign category X to response R7?

Which elements of the text justify this category and not another?

Do you detect any response that is coded inconsistently with the definition of category X?

This type of dialogue serves to:

make visible the implicit criteria the model is using,
check whether those criteria match the formal definition of the category,
adjust or reformulate the instructions if deviations are detected.

Although the model does not reason like a human, these explanations can help the researcher identify blind spots, biases or misunderstandings.

In addition, to avoid over-reliance and ensure you remain in charge, do not accept the model's first response. Test it:

Instruction for the LLM

I know you assigned category 'A' to this block. Now, act as an external critical evaluator and argue why the same data could fit category 'B'. What nuances would we be ignoring if we stayed with your first option?

This will force you to reflect on the coherence of your categories and to decide with much greater certainty.

Cross-review using multiple LLMs (assisted triangulation)

A particularly interesting way of approaching inter-rater agreement in an LLM-assisted environment is to use several different models as if they were independent judges. It is like asking another model for a second opinion, as one might consult a colleague from another department. For example:

use ChatGPT for a first coding,
use Claude to critically review that coding,
use Gemini to check whether there are significant discrepancies.

This strategy functions as a form of assisted triangulation:

If several different LLMs consistently agree when applying a clear category system, confidence in the stability of the system and the robustness of the coding increases.
If, on the other hand, the models frequently disagree, this may indicate problems in the category definitions, the clarity of instructions or the corpus structure itself.

From a methodological standpoint, this cross-review allows each LLM to be treated as an additional judge, similar to what is done with multiple human coders.

This implies the need to analyse the degree of agreement between different codings, which is assessed using classic inter-rater agreement indicators (percentage of agreement, Kappa, etc.).

In a context involving LLMs, these indicators can be applied to:

two codings by the same model with different instructions,
human coding vs. LLM coding (one or several models),
several codings by different LLMs (which would allow complete automation of the process).
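For any of these pairings, Cohen's Kappa corrects the raw agreement for the agreement expected by chance. The sketch below implements the standard two-judge formula, kappa = (p_o - p_e) / (1 - p_e), and applies it to every pair of judges. The three label lists are invented, the judge names are placeholders, and returning 1.0 when chance agreement equals 1 is a convention chosen here for an otherwise undefined case.

```python
from collections import Counter
from itertools import combinations

def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two judges who each assigned one category to the same units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both judges must code the same non-empty list of units")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each judge's marginal proportions, summed over categories.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both judges used one identical category throughout (convention)
    return (observed - expected) / (1 - expected)

# Hypothetical codings of the same eight units by one human and two models.
judges = {
    "human": ["A", "A", "B", "B", "A", "B", "A", "B"],
    "llm_1": ["A", "A", "B", "B", "A", "B", "B", "B"],
    "llm_2": ["A", "B", "B", "B", "A", "B", "A", "A"],
}
pairwise = {(x, y): cohen_kappa(judges[x], judges[y]) for x, y in combinations(judges, 2)}
```

With these invented labels the human and the first model reach 0.75, while the two models agree with each other only at 0.25, the kind of contrast that would prompt a review of the category definitions. For multi-label coding or more than two judges at once, other statistics would be needed.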

In the context of this guide, the most important thing is to understand that:

it is possible to treat the codings of different LLMs and of human researchers as comparable data sources,
calculating these indicators strengthens the credibility of the category system and the coding process,
low levels of agreement are a signal that category definitions, model instructions or even the study design need to be revised (and, unfortunately, the process restarted).

Taken together, the combination of manual review, assisted double coding, discussion with the model and cross-review using multiple LLMs, supported by classic inter-rater agreement indicators, allows the quality standards of content analysis to be transferred to LLM-assisted analysis.

Partial automation of analysis in advanced stages

Once the category system has been constructed, reviewed and validated, you can consider partially automating some phases of the analysis, especially the coding of large volumes of data and the generation of preliminary summaries.

By automation I mean the creation of workflows that execute various tasks systematically and repeatably. Tasks that can be automated include initial data cleaning, table generation, file organisation, spreadsheet reading, splitting a corpus into blocks, converting file formats, extracting text fragments, consolidating coding tables and, where appropriate, interaction with an LLM to apply a category system or generate summaries. Automation thus consists of chaining steps that are traditionally carried out manually so that a digital tool performs them in a stable and controlled way. Fortunately, whatever your level of technical knowledge, many of these tasks can be automated using applications that do not require programming, such as ChatGPT Automations or Make, which allow processes to be designed through visual interfaces. The reference to these tools reflects the landscape at the time of writing (December 2025); given recent developments in the sector, these functionalities are likely to become increasingly accessible and easy to use. Automation should not be understood as a replacement for traditional content analysis, but as an efficiency mechanism applied only after verifying that the model applies categories coherently and consistently. Overall, it allows technical work to be accelerated and time freed up for interpretation, which remains the researcher's responsibility.

Necessary conditions

We have already mentioned all phases and precautions, but it is worth recalling here the requirements for automating a process in a way that provides the greatest possible assurance of result quality. For automation to be methodologically valid, at a minimum the following must be met:

a) Stable, well-defined category system

Categories must have a clear name, precise description, inclusion/exclusion criteria and representative examples.
At least one round of LLM-assisted refinement (version 2.0 or 3.0) must have been completed.

b) Prior reliability tests

Manually review a broad sample of the corpus (10–20%).
Compare two codings generated at different moments.
Verify that there are no invented elements or inferences beyond the text.
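The comparison of two codings generated at different moments can be quantified with the classic agreement indicators mentioned earlier. The following is a minimal sketch, assuming each coding run is available as a list with one category per unit, in the same unit order; the function names and the example categories are hypothetical and are not taken from the AiRISES data.

```python
from collections import Counter


def percent_agreement(run_a, run_b):
    """Share of units that received the same category in both runs."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("Both runs must code the same non-empty set of units.")
    matches = sum(a == b for a, b in zip(run_a, run_b))
    return matches / len(run_a)


def cohen_kappa(run_a, run_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(run_a)
    observed = percent_agreement(run_a, run_b)
    freq_a = Counter(run_a)
    freq_b = Counter(run_b)
    # Chance agreement: sum over categories of the product of marginal proportions.
    expected = sum(
        (freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b)
    )
    if expected == 1.0:
        return 1.0  # both runs used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical example: the same ten units coded in two separate sessions.
first_run = ["guidance", "employment", "guidance", "regret", "employment",
             "guidance", "regret", "employment", "guidance", "employment"]
second_run = ["guidance", "employment", "guidance", "regret", "guidance",
              "guidance", "regret", "employment", "guidance", "employment"]

print(f"Agreement: {percent_agreement(first_run, second_run):.2f}")
print(f"Cohen's kappa: {cohen_kappa(first_run, second_run):.2f}")
```

Kappa is shown alongside raw agreement because raw agreement alone can look high when one category dominates the corpus. Which threshold counts as acceptable remains a methodological decision for the researcher.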

c) Relative corpus homogeneity

Automation works better when the texts:

belong to the same type (all short responses, all interviews…),
refer to the same educational phenomenon,
follow a comparable style and structure.

d) Clear definition of the unit of analysis

Before automating mass coding, you must establish whether you are coding:

the complete response,
the sentence,
the semantic unit.

e) Mandatory quality controls

Even with automation:

periodic manual samples must be reviewed,
contradictions or anomalies must be recorded,
instructions must be refined if biases appear.

What can be automated

Many things can be automated. Here are some examples:

Mass block-based coding: the model can process hundreds of responses in batches of 50–100, applying the established category system.
Generating coding tables in different formats: plain-text table, CSV, Markdown or Excel.
Summaries by category or theme: for example, a summary of all units falling within category X, or detection of cross-cutting patterns.
Automatic detection of inconsistencies: an automation can check whether a category is being used unevenly, whether the quotations justify the assignment, and whether there are empty or overloaded categories.
Extraction of representative quotations: the LLM can automatically collect the most frequent quotations, the most intense ones and those expressing contradictions.
Cross-review with another model: in complex automations you can code with ChatGPT, review with Claude and check for agreement.
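Two of the items above, block-based processing and the detection of empty or overloaded categories, can be sketched in a few lines for readers who do work with code. This is only an illustration under stated assumptions: the function names, the default block size of 50 and the 50% overload share are hypothetical choices, not values prescribed by this guide.

```python
from collections import Counter


def split_into_blocks(responses, block_size=50):
    """Yield consecutive blocks of at most `block_size` responses."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    for start in range(0, len(responses), block_size):
        yield responses[start:start + block_size]


def category_usage_report(assigned, categories, overload_share=0.5):
    """Flag categories never used and categories absorbing a large share of units."""
    counts = Counter(assigned)
    total = len(assigned)
    empty = [c for c in categories if counts[c] == 0]
    overloaded = [
        c for c in categories if total and counts[c] / total > overload_share
    ]
    return {"counts": dict(counts), "empty": empty, "overloaded": overloaded}


# Hypothetical corpus of 120 responses, processed in blocks of 50.
responses = [f"response {i}" for i in range(1, 121)]
blocks = list(split_into_blocks(responses, block_size=50))
print([len(block) for block in blocks])

# Hypothetical coding result for ten units and a three-category system.
assigned = ["guidance"] * 7 + ["employment"] * 3
report = category_usage_report(assigned, ["guidance", "employment", "regret"])
print(report["empty"], report["overloaded"])
```

A flagged category is a prompt for review rather than an error in itself: an empty category may be legitimate in a given block, and an overloaded one may simply reflect the data.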

Risks and warnings

Despite its advantages, automation involves significant risks:

Loss of nuances or exceptions.
Dominant patterns tend to be prioritised.
Reproducing systematic errors.
If the category system contains an ambiguity, automation multiplies the error rather than reducing it.
Over-representation of broad categories.
The LLM tends to lean towards very general categories if not continuously reminded of exclusion criteria.
Excessive dependence on automation.
The researcher may start acting automatically without reviewing critically.

This is why there must always be human verification, even when automation is high.

Automating analysis with ChatGPT Automations (workflow)

ChatGPT Automations currently allow (subject to change) the creation of workflows that execute repetitive content analysis tasks without continuous manual intervention. They are especially useful in advanced phases, when the category system has already been validated and the aim is to process large volumes of data while maintaining rigorous quality control. This section explains how to structure a workflow in ChatGPT to partially automate coding and synthesis, maintaining the principles of traditional content analysis. The specific implementation should be consulted in the application itself, as it may change at any time. Here is an example of the steps that might be automated in sequence:

Automation example: automated profile identification

Trigger: Receipt of the already-segmented corpus (for example, the 1,886 specific messages about Vocational Education).
Action 1 (Mass Classification): The system automatically applies a specialised prompt to each message to identify the author's gender (female, male or neutral) based on linguistic indicators such as predicative adjectives and pronouns.
Action 2 (Alert Generation): The workflow automatically marks as Neutral those messages with no clear gender indicators, preventing the model from making risky or invented inferences.
Action 3 (Integrated Quality Control): The automation sets aside a random sample (for example, 50 messages) for the researcher to carry out an expert review. If the error margin exceeds a threshold (in the real case it was 14%), the workflow stops for prompt refinement.
Action 4 (Data Consolidation): Once the prompt has been validated (reaching 100% agreement), the system processes the remaining corpus and exports the results to a file compatible with statistical software such as JASP or SPSS.
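The quality-control step of this workflow (Action 3) can be illustrated as a simple gate: draw a reproducible random sample, compare the model's labels with the expert's, and stop if the disagreement is too high. This is a sketch under assumptions, not the AiRISES implementation; the function names, the seed and the `max_error` default are hypothetical, and the threshold should be whatever value the researcher has justified.

```python
import random


def draw_review_sample(message_ids, sample_size=50, seed=2026):
    """Set aside a reproducible random sample for expert review."""
    rng = random.Random(seed)  # fixed seed so the sample can be documented
    size = min(sample_size, len(message_ids))
    return rng.sample(message_ids, size)


def error_rate(model_labels, expert_labels):
    """Share of reviewed messages where the expert disagrees with the model."""
    if not expert_labels:
        raise ValueError("The expert review is empty.")
    disagreements = sum(
        1 for msg_id, expert in expert_labels.items()
        if model_labels[msg_id] != expert
    )
    return disagreements / len(expert_labels)


def may_continue(model_labels, expert_labels, max_error=0.05):
    """True only if the observed error rate is within the chosen threshold."""
    return error_rate(model_labels, expert_labels) <= max_error


# Hypothetical labels for four reviewed messages.
model_labels = {1: "female", 2: "male", 3: "neutral", 4: "male"}
expert_labels = {1: "female", 2: "male", 3: "neutral", 4: "female"}

sample = draw_review_sample(list(range(1, 1887)), sample_size=50)
print(len(sample))
print(error_rate(model_labels, expert_labels))
print(may_continue(model_labels, expert_labels))
```

Fixing the seed means the same sample can be redrawn later, which helps when documenting the review in the report.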

Recommendation

Partial automation of the analysis is a powerful and efficient tool, but should only be used after consolidating and validating the category system and verifying the model's stability. The steps can accelerate mass coding, quotation extraction and initial synthesis, but require constant quality controls: human review of samples, triangulation between models and detailed documentation of each step. Before automating, data must be completely anonymised and privacy risks carefully assessed. In no case does automation replace the researcher's analytical responsibility; it should serve only as a technical extension of traditional content analysis.

Once the system has been validated and the coding reviewed, the technical analysis is complete. What comes next is the part that most belongs to you: converting all that category structure into knowledge. The analytical synthesis and report are where your research perspective takes centre stage in a way the LLM can never replicate.

After completing the coding, the most interpretive and conceptual stage of the analysis arrives: analytical synthesis. I believe this is the most interesting stage, and without doubt the most useful from a knowledge-generation perspective. Its aim is to articulate the findings, identify relationships between categories, extract deep meanings and draft a coherent narrative that responds to the research questions. LLMs can help by generating synthesis drafts, identifying tensions and grouping patterns. However, as I insist once again, the final interpretation is the researcher's responsibility — it is your responsibility.

The mistake: accepting LLM tables without manual review.

Why this happens: the LLM can generate tables that are formally impeccable but contain altered or invented quotations.

How to avoid it: review at least 10–20%, comparing against the original corpus.

Part IV

Synthesis and reporting

Analytical synthesis and inclusion of LLM use in the academic report.

Chapter 10

Analytical synthesis

Validate the coding and consolidate all tables into a single file. → Ch. 9
Confirm that the category system is in its final version.
Remember: the LLM can generate drafts, but interpretation is your responsibility.

From the coding, thematic narratives are constructed that integrate categories and respond to the research questions. The LLM can propose drafts of these narratives. To this end, you can ask the LLM to:

Identify patterns between categories

Instruction for the LLM

From the category system and coding table, identify the main thematic findings and organise them into 3–6 themes.

Draft a synthesis

Instruction for the LLM

Draft an academic thematic analysis of 3–5 paragraphs, based solely on the information contained in the coded data. Include brief textual examples.

Compare themes to find differences and similarities

Instruction for the LLM

Explain the relationships between these categories and how they group into broader themes. The key is to maintain fidelity to the corpus.

Integrating direct quotations

Including verbatim quotations strengthens the credibility of the analysis. The model can suggest options, but the researcher must verify their accuracy and relevance. It is not just about including quotations — it is about identifying where a quotation can be relevant.

Verbatim quotations should:

represent the data adequately,
strengthen the credibility of the analysis,
show nuances that synthesis alone would not capture.

Instruction for the LLM

Include 1–2 representative verbatim quotations per category or sub-theme. Do not invent content. Use only exact quotations from the corpus.

Good practices:

Avoid always selecting the same responses.
Combine brief and moderate-length quotations.
Very important: remove personal details, if there are any.
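Checking that every quotation proposed by the model actually appears in the corpus can be partly mechanised. The sketch below assumes the corpus is available as a list of plain-text strings; the function names are hypothetical. It only tests whether a quotation occurs verbatim (ignoring whitespace and typographic-quote differences) and says nothing about whether the quotation is relevant, which remains the researcher's judgement.

```python
import re
import unicodedata


def normalise(text):
    """Unify Unicode form, typographic quotes and whitespace; wording is untouched."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace("“", '"').replace("”", '"').replace("’", "'")
    return re.sub(r"\s+", " ", text).strip()


def find_unverified_quotes(quotes, corpus):
    """Return the quotations that do not appear verbatim in any corpus text."""
    normalised_corpus = [normalise(doc) for doc in corpus]
    return [
        q for q in quotes
        if not any(normalise(q) in doc for doc in normalised_corpus)
    ]


# Hypothetical corpus and quotations proposed by the model.
corpus = [
    "I chose this degree because of the job prospects.",
    "Nobody explained  the options\nto me before enrolling.",
]
quotes = [
    "because of the job prospects",
    "Nobody explained the options to me",
    "I regret my choice completely",
]

print(find_unverified_quotes(quotes, corpus))
```

Any quotation returned by `find_unverified_quotes` should be traced back by hand: it may be invented, altered, or simply stitched together from separate passages.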

Relationships between categories

The synthesis must go beyond listing categories: it must show how they relate (support, tension, explicit causality, thematic hierarchy). The model can suggest these connections, which are then contrasted with the corpus and theory. LLMs can help identify relationships of the type:

causal (always with caution: they must be explicit),
conditional,
complementary,
contradictory,
hierarchical,
thematic.

Instruction for the LLM

Explain how these categories relate to one another. Indicate which support, contradict or form part of the same process.

Here, the LLM can detect, for example:

tensions between personal motivation and lack of resources,
contradictions between institutional policies and teacher perceptions,
differences between intentional discourses and actual practice.

Tensions, contradictions and exceptions

Cases that do not fit, contradictions and minority voices contribute important nuances. Explicitly asking the model to identify them helps avoid overly homogeneous conclusions. For this reason, a fundamental part of content analysis consists of identifying:

cases that do not fit,
internal contradictions,
discordant elements,
minority perspectives.

LLMs can help to find them:

Instruction for the LLM

Identify atypical, contradictory or minority cases within the corpus and explain why they are relevant to the analysis.

These cases help to:

avoid simplistic conclusions,
enrich the interpretation,
show diversity within the data.

Chapter 11

Reporting LLM use in the academic report

Draft an analytical synthesis with the main findings. → Ch. 10
Keep a log of all relevant methodological decisions.
Suggestion: review the LLM use declaration in the preface as a reference for your report.

When content analysis has been LLM-assisted, the academic report (article, doctoral thesis, master's dissertation, undergraduate dissertation or technical report) must transparently include which tasks were carried out with LLM support and which were the researcher's responsibility. This transparency is an ethical, methodological and reproducibility requirement.

The research report must mention LLM use in the method, limitations, reproducibility and ethics sections, and in results and discussion only when the LLM has explicitly contributed to those parts. Below I indicate in detail where it must be mentioned, together with the justification and a brief writing example for each case.

Table 9. Inclusion and justification of LLM use in the research report

Report section	Mandatory?	Justification and ethical/methodological requirement
Introduction	Optional	Contextualises LLM use as part of the study's novelty or general framework.
Method	Mandatory	Describes LLM tasks, instructions (prompts), human supervision and quality control.
Results	Conditional	Necessary only if the LLM generated syntheses, thematic summaries or pattern identification.
Discussion	Optional	Reinforces transparency about the human interpretive role relative to technical support.
Limitations	Mandatory	Acknowledges model biases, probabilistic variability and technological dependencies.
Transparency/Appendices	Mandatory	Ensures reproducibility through inclusion of prompts and category system versions.
Ethical considerations	Mandatory	Details the protection, anonymisation and privacy of data uploaded to external platforms.

Introduction (optional, in some cases only)

When to mention it — Only when LLM use is part of the study's novelty, the general methodological framework, or the justification of the work's relevance.

Justification — Allows contextualisation that the study draws on emerging assistance tools, without yet attributing a central methodological role to them.

Brief example - “This study incorporates assistance tools based on language models (LLMs) to support, not replace, certain processes of the content analysis.”

Method (mandatory)

When to mention it — Always.

This is the main place where the use should be described:

which tasks the LLM performed,
how it was instructed (instructions/prompt),
which decisions remained the researcher's responsibility,
and how the quality of the process was controlled.

Justification — Methodologically, the reader must be able to understand how categories were generated, how they were applied and what role the LLM played in each phase of the analysis. This information also allows evaluation of validity, biases and reproducibility.

Brief example - “The content analysis was carried out using an LLM-assisted procedure. The model was used to generate initial category proposals, apply the validated category system and produce preliminary syntheses. All review, adjustment and validation decisions were taken by the researcher. A sample of 20% of the codings was reviewed manually and a cross-review with alternative models was carried out.”

Results (depending on use)

When to mention it - Only if the LLM took part in generating thematic summaries, grouping quotations or identifying patterns.

Justification — Allows distinction between:

what comes directly from human analysis,
what was generated with LLM support,
and what controls were applied to avoid errors (hallucinations, false quotations, unjustified inferences).

Brief example - “The preliminary syntheses for each category were generated with the support of an LLM and subsequently reviewed and adjusted manually to ensure fidelity to the corpus and coherence with the category system.”

Discussion (optional, brief)

When to mention it — When LLM use has influenced how findings are interpreted, or when it is relevant to explain how the model was prevented from introducing external inferences.

Justification — Serves to reinforce transparency and to show that the interpretive role remains human.

Brief example - “The interpretations presented were developed exclusively from human analysis; the LLM was used only as technical support to organise the data and generate preliminary drafts.”

Limitations (mandatory)

When to mention it — Always when LLMs are used.

The following must be explained:

technological dependencies,
possible model biases,
variability between runs,
risks of interpretations not grounded in the data.

Justification — It is an ethical and methodological requirement. It allows the reader to evaluate the robustness of the work.

Brief example - “The use of LLMs introduces limitations associated with possible model biases and with variation between runs, since the same prompt applied to the same data set may generate slightly different responses at different moments, owing to the probabilistic nature of the model. These differences may affect the classification, categorisation or interpretation of the content. To minimise these risks, comparisons between models and systematic manual reviews were carried out.”

Transparency and reproducibility (mandatory, may be integrated into the method or appendices)

When to mention it — Always.

The following must be included:

which prompts/instructions were used (or the most important ones),
which versions of the category system were provided to the LLM,
which quality controls were applied.

Justification — Allows other researchers to replicate the process or evaluate its credibility.

Brief example - (in Method) “The prompts used for LLM-assisted coding and synthesis are included in Appendix 2, together with the final version of the category system and a description of the quality controls applied.”

Ethical considerations (mandatory)

When to mention it — Always, especially if data were uploaded to external platforms.

Justification — It must be explained how participant data were protected and how the LLM was prevented from processing identifiable information.

Brief example - “All data were anonymised before being processed by the LLMs, and platforms that do not reuse the information to train models were used.”

AiRISES case study

Following the AiRISES case study: when writing the final report, we included a transparency declaration about how the LLM helped us process a volume that would have taken a human researcher years. We highlighted that the AI allowed us to identify a relative increase in employment-related concern (rising from 33% in 2016 to 49% in 2023). But the conclusion that students demand "seasonal guidance" and that a significant proportion of university students express regret about their choice is our interpretation as researchers: supported by the synthesis the model provided, but not dictated by it.

Writing the report well is necessary but not sufficient. LLM use in research raises questions that go beyond methodology: what is delegated to the model, what data are entrusted to it, how all of this is declared. Part V brings together the ethical principles that must run throughout the entire process, not just the final report.

Part V

Ethics and quality

Good practices, biases, reproducibility and process documentation.

Chapter 12

Good practice and ethics in LLM-assisted analysis

Cross-cutting chapter: it applies from the start of the project to the delivery of the report.
If you read it at the end: use it to check that you have respected principles P1–P6 throughout the whole process.
If you read it at the beginning: it will help you anticipate and plan ethical decisions from the study design stage.

The use of large language models (LLMs) in content analysis offers significant methodological opportunities, but also ethical, epistemological and practical risks. This chapter brings together fundamental principles to ensure responsible, rigorous and transparent use. I indicate here some ordered suggestions, already mentioned throughout the text.

If by the time you read this ChatGPT et al. are already making our mid-morning coffee too, it does not matter; what I explain here about bias and interpretation will still be what distinguishes a researcher from someone who only knows how to copy and paste.

Model biases and researcher biases

LLMs are not neutral: they learn linguistic patterns from enormous data corpora that carry social, cultural and political biases, they inherit those biases, and they may favour certain perspectives or discourses. The researcher also has their own biases. Awareness of both is an ethical and methodological requirement.

Model biases

They may appear in:

interpretations that favour majority discourses,
invisibilisation of minority voices,
stereotypes about teachers or students,
overly simplified categorisations.

Researcher biases

The LLM can reinforce pre-existing human biases:

interpreting responses selectively,
accepting convenient categories,
over-trusting automated synthesis.

Recommendation:

Always ask the LLM to identify possible biases in its output:

Instruction for the LLM

Do you detect any bias or interpretation not supported by the data?

Limitations of using LLMs in content analysis

Models can over-interpret, invent details or simplify complex phenomena. They do not replace critical reflection or the negotiation of meanings between researchers. Among their most common limitations are:

Limited understanding: they identify linguistic patterns, but do not understand intentions, contexts or deep human meanings.
Risk of hallucinations: they may generate quotations, data or inferences not present in the analysed material.
Result inconsistencies: they may classify or interpret differently depending on the order, context or formulation of the request.
Analytical reductionism: they tend to simplify complex phenomena, prioritising dominant patterns and potentially engaging in cherry picking (partial selection of evidence that reinforces an interpretation while omitting nuances or minority cases).
Excessive dependence: there is a risk that the researcher delegates interpretation without maintaining sufficient critical control.

How to avoid over-reliance

To avoid over-relying on AI, it is recommended to contrast results with manual analysis, consult other researchers and maintain a critical attitude towards the model's proposals. Good practices:

Manually review a sample of the coding.
Compare the LLM version with a human coding.
Use several models (ChatGPT, Gemini, Claude) for triangulation.
Carry out periodic critical checks:

Explain why this category is the most appropriate.

What alternatives might be plausible?

Useful technique: “counter-interviewing the model”. This makes it possible to detect weaknesses in its reasoning.

Instruction for the LLM

Defend the opposite category to the one you applied and justify your position.
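Triangulation with several models, mentioned in the good practices above, amounts to comparing their codings unit by unit and sending every disagreement to the researcher. A minimal sketch follows, assuming each model's coding has been exported as a list of categories in the same unit order; the model labels, function names and example categories are hypothetical.

```python
from collections import Counter


def triangulate(codings):
    """Compare codings from several models unit by unit.

    `codings` maps a model name to a list of categories, one per unit,
    all in the same unit order.
    """
    lengths = {len(labels) for labels in codings.values()}
    if len(lengths) != 1:
        raise ValueError("Every model must code the same number of units.")
    results = []
    for index, labels in enumerate(zip(*codings.values())):
        most_common, votes = Counter(labels).most_common(1)[0]
        results.append({
            "unit": index,
            "labels": dict(zip(codings.keys(), labels)),
            "majority": most_common,  # a suggestion only, not a decision
            "unanimous": votes == len(labels),
        })
    return results


def needs_human_review(results):
    """Units on which the models disagree and the researcher must decide."""
    return [r["unit"] for r in results if not r["unanimous"]]


# Hypothetical codings of three units by three models.
codings = {
    "model_a": ["guidance", "employment", "regret"],
    "model_b": ["guidance", "employment", "guidance"],
    "model_c": ["guidance", "regret", "guidance"],
}

results = triangulate(codings)
print(needs_human_review(results))
```

The majority label is recorded for convenience, but agreement between models is not evidence of correctness: they may share the same bias, so unanimous units still belong in the manual review sample.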

Responsible use and reproducibility

Reproducibility in AI-assisted analysis requires documenting prompts, model versions, methodological decisions and key outputs. Although model responses may vary, it is possible to offer a sufficient description of the process.

Good practices for reproducibility:

Save all versions of the category system.
Save the prompts used.
Record each revision made to the system.
Save coding tables by block.
Flag which parts of the analysis were carried out by the LLM.

Mandatory declarations in undergraduate/master's dissertations/doctoral theses/articles:

model version (ChatGPT 4.x, Gemini Pro, Claude 3, etc.),
instructions given,
role of the LLM at each stage,
measures taken to verify accuracy.
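One way to make these declarations easy to produce later is to log each coding run as it happens. The sketch below writes one JSON record per run; every field name is a hypothetical choice made for illustration, and the example values are placeholders rather than a recommendation of any model or version.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class CodingRunRecord:
    """One entry in the analysis log (field names are illustrative)."""
    model_name: str
    model_version: str
    category_system_version: str
    stage: str
    prompt: str
    block_id: str
    verification: str

    def to_json_line(self):
        record = asdict(self)
        # A hash lets you prove later which exact prompt text was used.
        record["prompt_sha256"] = hashlib.sha256(
            self.prompt.encode("utf-8")
        ).hexdigest()
        record["logged_at"] = datetime.now(timezone.utc).isoformat()
        return json.dumps(record, ensure_ascii=False)


# Hypothetical record for one coded block.
record = CodingRunRecord(
    model_name="example-model",
    model_version="version as shown by the platform",
    category_system_version="3.0",
    stage="coding",
    prompt="Code this block according to category system version 3.0.",
    block_id="block-01",
    verification="10% of the block reviewed manually",
)

line = record.to_json_line()
print(line)
```

Appending each line to a single text file gives a chronological log from which the Method and Transparency sections of the report can be drafted.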

How to document the process

In the doctoral thesis or article (etc.), it must be described how AI was used, at which stages, under what controls and with what limitations. This transparency increases the credibility of the work. Documentation is key to transparency and academic rigour. A typical section may include (as a checklist of information to be included):

Technical description:

which model(s) were used,
why they were selected,
at which phases they participated.

Operational detail:

master prompts,
units of analysis,
coding procedures,
justification of methodological decisions.

Verification strategies:

manual review,
triangulation,
contrast with literature.

Example of a transparent declaration:

Declaration of LLM use (Principle P6)

“The LLM was used to generate preliminary category proposals, carry out a first assisted coding and produce synthesis drafts. All final analytical decisions, including the merging and delimitation of categories, were taken by the researcher. All verbatim quotations were reviewed manually and the internal coherence of the category system was validated.”

Having reached this point, you have the complete procedure. What follows is not more theory but tools for putting it to use: checklists and ready-to-copy prompts.

The appendices provide practical materials that can be reused directly in future analyses. They function as quick-reference tools and complementary methodological support for the reader.

Part VI

Additional resources

Appendices: checklists and ready-to-use prompts for each phase.

Chapter 13

Appendices

Checklist: peace of mind

Before you start

I have clearly defined the study objective and research questions.
I have decided whether the analysis will be inductive, deductive or mixed.
I have determined the type of corpus (surveys, interviews, forums, institutional documents).

During the analysis

The LLM has received clear, stable instructions.
Verbatim quotations have been used to justify categories.
I have manually reviewed corpus samples.

When finishing

The category system is coherent and stable.
Validation strategies have been applied.
The process is documented transparently.

Data preparation and cleaning checklist

The data are complete and free of duplicates.
Empty or non-codeable responses have been removed.
Proper names, institutions and locations have been anonymised.
The texts are in a uniform format (table, plain text, document).
A decision has been made on how to treat minimal responses (“Yes”, “No”, “It depends”).
Changes made to the data have been documented.

Checklist for defining units of analysis and coding type

I have clearly defined the unit of analysis (document, segment, sentence, semantic unit).
I have decided whether I will code:
- segments within the document,
- the complete document as a case,
- both.

I have defined which categories function as document attributes/variables.
I have explicitly stated these decisions in the LLM instructions.

Category system construction checklist

Each category has a clear and precise name.
Each category has an explicit definition.
Inclusion and exclusion criteria exist.
The categories are conceptually distinct from one another.
The system is exhaustive with respect to the corpus.
Each category includes representative textual examples.

LLM-assisted coding checklist

The LLM has the definitive category system.
The coding includes exact verbatim quotations.
A clear coding table has been requested.
Minimal responses are marked as such.
A part of the corpus has been manually reviewed.
Inconsistencies have been corrected before continuing.

Validation and quality control checklist

Manual review of a sample has been carried out.
Double coding (human or assisted) has been applied.
The LLM has been asked to justify doubtful decisions.
A second LLM has been used as a critical reviewer.
Discrepancies have been reviewed and categories adjusted.
Validation decisions are documented.

Ethics and data protection checklist

Data have been anonymised before using the LLM.
The privacy policies of the platform used have been reviewed.
No sensitive or identifiable data have been uploaded.
Unnecessary files have been deleted after the analysis.
LLM use is declared in the academic report.

Academic report checklist

LLM use is described in the Method section.
Final analytical decisions are human.
Results include verbatim quotations.
Limitations associated with LLM use are declared.
The ethical measures adopted are described.
The process is traceable and reproducible.

Methodological checklist

A final checklist to ensure all aspects have been covered:

Anonymised data
Master prompt defined
Versioned category system (min. v2.0)
Sample manually reviewed (min. 10%)
Methodological decisions documented
LLM use declared in method
Backups made
Acknowledged limits of the analyses

Examples of ready-to-use prompts (by phase)

1. Master prompt (use at the start of each session)

Instruction for the LLM

Act as an expert assistant in content analysis for educational research.
Follow my instructions scrupulously.
Do not invent information.
If you detect ambiguity, ask for clarification immediately.
We will work in stages: exploration, categories, coding and synthesis.

2. Preliminary exploration

Instruction for the LLM

Here is a set of responses about [topic].
Carry out a preliminary analysis that includes:
1) General summary (5–7 lines)
2) 5–10 preliminary patterns or themes
3) Representative verbatim quotations
4) Emerging tensions or contradictions
Do not generate definitive categories.

3. Creating the category system (inductive)

Instruction for the LLM

Generate an initial category system based on the data.
For each category include:
- name
- clear description
- inclusion criteria
- exclusion criteria
- 2–3 verbatim quotations from the corpus

4. Refinement

Instruction for the LLM

Review this category system (version X).
Identify:
- overlaps
- broad or ambiguous categories
- redundancies
- differences in conceptual level
Propose an improved version and justify the changes.

5. Coding

Instruction for the LLM

Code this block according to category system version X.
Return the output in a table with:
[Text] – [Category(ies)] – [Justification with verbatim quotation]
Do not invent quotations. If a response does not fit, say so.

6. Analytical synthesis

Instruction for the LLM

Based on the coding table and category system version X,
draft an academic thematic analysis that includes:
- main themes
- relationships between categories
- tensions or contradictions
- 1–2 representative quotations per sub-theme
Do not add information not present in the data.

7. Academic report

Instruction for the LLM

Draft an academic report that includes:
- brief introduction
- method
- results (by theme, with quotations)
- preliminary discussion
- limitations of the analysis and of LLM use
Base it exclusively on the data provided.

LLM-Assisted Content Analysis

Preface

Foundations

Introduction

Purpose of the guide

Who it is for

Scope and limits of LLMs in content analysis

Rationale for using LLMs in educational research

Formal aspects of the guide

Cross-cutting operative principles (mandatory throughout)

Foundations of content analysis

Definitions and key concepts

Types of analysis: inductive, deductive and mixed

Units of analysis and text segmentation.

The role of the researcher in interpretation.

Differences between manual and LLM-assisted analysis

Preparation

Data preparation

Data collection: common sources in educational research

Cleaning and anonymisation

Recommended formats

Block-based analysis management

Special cases: short responses, noise and multilingual data

Configuring the LLM for content analysis

The LLM's “analytical role”

Master prompt: what to include

Version management and conversational memory.

Sources of error and how to mitigate them

Using projects or persistent workspaces (ChatGPT Projects, NotebookLM, etc.)

Advantages

Disadvantages and risks

Recommendation

Analysis

Initial exploratory analysis

How to present data to the LLM

Requesting summaries, patterns and possible themes

Quick identification of interpretive angles

Possible errors at this stage

Building the category system

Academic criteria for a good category system

Inductive: generating categories from the data

Deductive: applying pre-existing categories

A special case of deductive coding: using lexical dictionaries (controlled lexicon)

Mixed: template + emerging categories

How to ask the LLM for clear descriptions, inclusions, exclusions and examples

Comparison between manually and LLM-generated systems

Refining and validating the category system

Merging, splitting and level of abstraction

Internal and external coherence

LLM assistance for detecting ambiguities

Example evolution: version 1.0 → 2.0 → 3.0

Documenting methodological decisions

Assisted coding

How to request clear, justified coding tables

How to manage large volumes of data

Validation and quality control

Manual review of assisted coding

Double coding assisted by the same LLM

Discussing coding decisions with the model

Cross-review using multiple LLMs (assisted triangulation)

Partial automation of analysis in advanced stages

Necessary conditions

What can be automated

Risks and warnings

Automating analysis with ChatGPT Automations (workflow)

Recommendation

Synthesis and reporting

Analytical synthesis

Integrating direct quotations

Relationships between categories

Tensions, contradictions and exceptions

Reporting LLM use in the academic report

Ethics and quality

Good practice and ethics in LLM-assisted analysis

Model biases and researcher biases

Limitations of using LLMs in content analysis

How to avoid over-reliance

Responsible use and reproducibility

How to document the process
