ClaimPT: A Portuguese Dataset of Annotated Claims in News Articles
Authors: Ricardo Campos, Raquel Sequeira, Sara Nerea, Inês Cantante, Diogo Folques, Luís Filipe Cunha, João Canavilhas, António Branco, Alípio Jorge, Sérgio Nunes, Nuno Guimarães, Purificação Silvano
Abstract: Fact-checking remains a demanding and time-consuming task, still largely dependent on manual verification and unable to match the rapid spread of misinformation online. This is particularly important because debunking false information typically takes longer to reach consumers than the misinformation itself; accelerating corrections through automation can therefore help counter it more effectively. Although many organizations perform manual fact-checking, this approach is difficult to scale given the growing volume of digital content. These limitations have motivated interest in automating fact-checking, where identifying claims is a crucial first step. However, progress has been uneven across languages, with English dominating due to abundant annotated data. Portuguese, like other languages, still lacks accessible, licensed datasets, limiting research, NLP developments and applications. In this paper, we introduce ClaimPT, a dataset of European Portuguese news articles annotated for factual claims, comprising 1,308 articles and 6,875 individual annotations. Unlike most existing resources based on social media or parliamentary transcripts, ClaimPT focuses on journalistic content, collected through a partnership with LUSA, the Portuguese News Agency. To ensure annotation quality, two trained annotators labeled each article, with a curator validating all annotations according to a newly proposed scheme. We also provide baseline models for claim detection, establishing initial benchmarks and enabling future NLP and IR applications. By releasing ClaimPT, we aim to advance research on low-resource fact-checking and enhance understanding of misinformation in news media.
Subjects:
Computation and Language (cs.CL)
Cite as: arXiv:2601.19490 [cs.CL]
(or arXiv:2601.19490v3 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2601.19490
arXiv-issued DOI via DataCite
Journal reference: Advances in Information Retrieval. ECIR 2026. Lecture Notes in Computer Science, vol 16486. Springer, Cham
Related DOI:
https://doi.org/10.1007/978-3-032-21321-1_58
Submission history
From: Nuno Guimaraes
[v1] Tue, 27 Jan 2026 11:22:00 UTC (1,147 KB)
[v2] Mon, 9 Feb 2026 10:12:13 UTC (1,147 KB)
[v3] Thu, 26 Mar 2026 20:14:25 UTC (1,147 KB)