
TailNLG: A Multilingual Benchmark Addressing Verbalization of Long-Tail Entities

arXiv, March 31, 2026

arXiv:2603.27768v1 Announce Type: new

Authors: Lia Draetta, Michael Oliverio, Virginia Ramón-Ferrer, Pier Felice Balestrucci, Flaviana Corallo, Carlos Badenes-Olmedo, Alessandro Mazzei, Marco Antonio Stranisci, Rossana Damiano


Abstract: The automatic verbalization of structured knowledge is a key task for making knowledge graphs accessible to non-expert users and supporting retrieval-augmented generation systems. Although recent advances in Data-to-Text generation have improved multilingual coverage, little attention has been paid to potential biases in the verbalization of rare entities, frequently known as long-tail entities. In this work, we present the first systematic study of long-tail entities in Data-to-Text generation. We introduce TailNLG, a new multilingual benchmark in English, Italian, and Spanish, built from Wikidata and covering entities with varying levels of popularity. We evaluate three different families of large language models in zero-shot settings and compare their performance on rare versus common entities, as well as against the established WebNLG benchmark. Our results reveal a consistent bias against long-tail entities: embedding-based scores are lower, and model uncertainty is higher for rare entities. We further show that the impact of long-tail entities varies across models and languages, and that existing evaluation metrics do not consistently capture these differences, highlighting the need for more reliable evaluation frameworks.
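The rare-versus-common comparison described in the abstract can be pictured with a small sketch. This is not the authors' evaluation code: the function name `score_gap_by_popularity`, the group labels `"common"` and `"long_tail"`, and the input values are all hypothetical, and the score stands in for any per-example metric (for instance an embedding-based similarity).

```python
from collections import defaultdict
from statistics import mean


def score_gap_by_popularity(records):
    """Average a per-example score within each popularity group.

    `records` is an iterable of (group, score) pairs. The group labels
    "common" and "long_tail" are assumptions made for this sketch.
    Returns (means, gap), where gap = mean(common) - mean(long_tail),
    or None if either group has no examples.
    """
    by_group = defaultdict(list)
    for group, score in records:
        by_group[group].append(score)

    means = {group: mean(scores) for group, scores in by_group.items()}

    gap = None
    if "common" in means and "long_tail" in means:
        gap = means["common"] - means["long_tail"]
    return means, gap


# Placeholder values for illustration only; these are NOT results from the paper.
toy_records = [
    ("common", 0.9),
    ("common", 0.7),
    ("long_tail", 0.6),
    ("long_tail", 0.4),
]
means, gap = score_gap_by_popularity(toy_records)
```

A positive `gap` would mean the metric is lower on long-tail entities. The same grouping could be applied to an uncertainty measure, where the abstract reports the opposite direction (higher for rare entities).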

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.27768 [cs.CL]

(or arXiv:2603.27768v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.27768

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Marco Antonio Stranisci

[v1] Sun, 29 Mar 2026 17:01:54 UTC (463 KB)

Original source

arXiv

https://arxiv.org/abs/2603.27768
