
ProText: A benchmark dataset for measuring (mis)gendering in long-form texts

arXiv, March 31, 2026

arXiv:2603.27838v1 Announce Type: new

Authors: Hadas Kotek, Margit Bowler, Patrick Sonnenberg, Yu'an Yang


Abstract: We introduce ProText, a dataset for measuring gendering and misgendering in stylistically diverse long-form English texts. ProText spans three dimensions: Theme nouns (names, occupations, titles, kinship terms), Theme category (stereotypically male, stereotypically female, gender-neutral/non-gendered), and Pronoun category (masculine, feminine, gender-neutral, none). The dataset is designed to probe (mis)gendering in text transformations such as summarization and rewrites using state-of-the-art Large Language Models, extending beyond traditional pronoun resolution benchmarks and beyond the gender binary. We validated ProText through a mini case study, showing that even with just two prompts and two models, we can draw nuanced insights regarding gender bias, stereotyping, misgendering, and gendering. We reveal systematic gender bias, particularly when inputs contain no explicit gender cues or when models default to heteronormative assumptions.
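To make the three dimensions concrete, the following is a minimal Python sketch of how the design space named in the abstract could be enumerated, together with a naive check for pronoun categories that appear in a transformed text but not in its source. Only the dimension values come from the abstract. The identifiers (`design_grid`, `pronoun_profile`, `introduced_categories`), the tiny pronoun lexicon, and the assumption that the dimensions are fully crossed are illustrative and are not the authors' schema or evaluation method.

```python
from collections import Counter
from itertools import product
import re

# Dimension values as listed in the abstract.
THEME_NOUN_TYPES = ("name", "occupation", "title", "kinship term")
THEME_CATEGORIES = (
    "stereotypically male",
    "stereotypically female",
    "gender-neutral/non-gendered",
)
PRONOUN_CATEGORIES = ("masculine", "feminine", "gender-neutral", "none")


def design_grid():
    """Enumerate every combination of the three dimensions.

    Assumption: a full cross product. The abstract does not say
    whether the dataset populates every cell.
    """
    return list(product(THEME_NOUN_TYPES, THEME_CATEGORIES, PRONOUN_CATEGORIES))


# Hypothetical, deliberately small lexicon. It treats every "they" as a
# gender-neutral singular, which a real analysis could not assume.
PRONOUN_LEXICON = {
    "he": "masculine", "him": "masculine", "his": "masculine",
    "himself": "masculine",
    "she": "feminine", "her": "feminine", "hers": "feminine",
    "herself": "feminine",
    "they": "gender-neutral", "them": "gender-neutral",
    "their": "gender-neutral", "theirs": "gender-neutral",
    "themself": "gender-neutral", "themselves": "gender-neutral",
}


def pronoun_profile(text):
    """Count pronoun categories found in a text by simple token lookup."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(PRONOUN_LEXICON[t] for t in tokens if t in PRONOUN_LEXICON)


def introduced_categories(source, output):
    """Return pronoun categories present in the output but absent from the source."""
    src = pronoun_profile(source)
    out = pronoun_profile(output)
    return {category for category in out if category not in src}


if __name__ == "__main__":
    print(len(design_grid()))
    source = "The nurse finished the shift and went home."
    summary = "She finished her shift."
    print(introduced_categories(source, summary))
```

In this sketch, a source with no pronouns corresponds to the "none" pronoun category, so any category reported by `introduced_categories` signals gendering that the transformation added on its own. A token lookup of this kind ignores coreference, so it only illustrates the idea of comparing source and output.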

Comments: 13 pages, 10 figures, 6 tables

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.27838 [cs.CL]

(or arXiv:2603.27838v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.27838

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yu'an Yang
[v1] Sun, 29 Mar 2026 19:45:31 UTC (766 KB)
