Magic Words or Methodical Work? Challenging Conventional Wisdom in LLM-Based Political Text Annotation
Authors: Lorcan McLaren, James Cross, Zuzanna Krakowska, Robin Rauner, Martijn Schoonvelde
Abstract: Political scientists are rapidly adopting large language models (LLMs) for text annotation, yet the sensitivity of annotation results to implementation choices remains poorly understood. Most evaluations test a single model or configuration; how model choice, model size, learning approach, and prompt style interact, and whether popular "best practices" survive controlled comparison, are largely unexplored. We present a controlled evaluation of these pipeline choices, testing six open-weight models across four political science annotation tasks under identical quantisation, hardware, and prompt-template conditions. Our central finding is methodological: interaction effects dominate main effects, so seemingly reasonable pipeline choices can become consequential researcher degrees of freedom. No single model, prompt style, or learning approach is uniformly superior, and the best-performing model varies across tasks. Two corollaries follow. First, model size is an unreliable guide both to cost and to performance: cross-family efficiency differences are so large that some larger models are less resource-intensive than much smaller alternatives, while within model families mid-range variants often match or exceed larger counterparts. Second, widely recommended prompt engineering techniques yield inconsistent and sometimes negative effects on annotation performance. We use these benchmark results to develop a validation-first framework (with a principled ordering of pipeline decisions, guidance on prompt freezing and held-out evaluation, reporting standards, and open-source tools) to help researchers navigate this decision space transparently.
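The abstract's advice on prompt freezing and held-out evaluation can be illustrated with a minimal Python sketch. It is not the authors' framework or their open-source tooling. Every name in it (`select_and_validate`, `toy_annotate`, `model_a`, `plain`, and so on) is a hypothetical placeholder, the toy annotator is a keyword rule standing in for a real LLM call, and the toy data are invented purely so the example runs. The idea shown is only this: compare configurations on a development split, freeze the chosen one, then score it once on data that played no part in the choice.

```python
import random
from itertools import product


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    per_label = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        per_label.append(2 * tp / denom if denom else 0.0)
    return sum(per_label) / len(per_label) if per_label else 0.0


def split_dev_heldout(items, heldout_fraction=0.5, seed=0):
    """Shuffle with a fixed seed and split into (development, held-out)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - heldout_fraction))
    return shuffled[:cut], shuffled[cut:]


def select_and_validate(items, annotate, models, prompt_styles, seed=0):
    """Pick a (model, prompt_style) pair on dev data, then score it on held-out data."""
    dev, heldout = split_dev_heldout(items, seed=seed)

    def score(config, subset):
        gold = [label for _, label in subset]
        pred = [annotate(text, *config) for text, _ in subset]
        return macro_f1(gold, pred)

    # Full factorial grid: every model crossed with every prompt style.
    configs = list(product(models, prompt_styles))
    dev_scores = {config: score(config, dev) for config in configs}

    # Freeze the configuration here; the held-out split is touched only once.
    frozen = max(configs, key=dev_scores.get)
    return frozen, dev_scores, score(frozen, heldout)


def toy_annotate(text, model, prompt_style):
    """Hypothetical stand-in for an LLM call (a keyword rule, not a real model)."""
    if model == "model_a" and prompt_style == "plain":
        return "econ" if "tax" in text else "other"
    return "other"


# Invented toy data, for illustration only.
toy_items = [
    ("raise the tax", "econ"),
    ("cut the tax", "econ"),
    ("open a park", "other"),
    ("school vote", "other"),
] * 2

frozen, dev_scores, heldout_score = select_and_validate(
    toy_items, toy_annotate, ["model_a", "model_b"], ["plain", "persona"]
)
print(frozen, heldout_score)
```

Scoring the full grid rather than tuning one factor at a time matters if, as the abstract reports, interactions between model and prompt style outweigh the main effects. The best prompt for one model then need not be the best for another.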
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2603.26898 [cs.CL]
(or arXiv:2603.26898v2 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2603.26898
arXiv-issued DOI via DataCite
Submission history
From: Lorcan McLaren
[v1] Fri, 27 Mar 2026 18:17:21 UTC (3,215 KB)
[v2] Tue, 31 Mar 2026 10:22:41 UTC (3,215 KB)