
PICon: A Multi-Turn Interrogation Framework for Evaluating Persona Agent Consistency

arXiv, March 26, 2026

Authors: Minseo Kim, Sujeong Im, Junseong Choi


Abstract: Large language model (LLM)-based persona agents are rapidly being adopted as scalable proxies for human participants across diverse domains. Yet there is no systematic method for verifying whether a persona agent's responses remain free of contradictions and factual inaccuracies throughout an interaction. A principle from interrogation methodology offers a lens: no matter how elaborate a fabricated identity, systematic interrogation will expose its contradictions. We apply this principle to propose PICon, an evaluation framework that probes persona agents through logically chained multi-turn questioning. PICon evaluates consistency along three core dimensions: internal consistency (freedom from self-contradiction), external consistency (alignment with real-world facts), and retest consistency (stability under repetition). Evaluating seven groups of persona agents alongside 63 real human participants, we find that even systems previously reported as highly consistent fail to meet the human baseline across all three dimensions, revealing contradictions and evasive responses under chained questioning. This work provides both a conceptual foundation and a practical methodology for evaluating persona agents before trusting them as substitutes for human participants. We provide the source code and an interactive demo at: this https URL
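The three dimensions named in the abstract can be illustrated with a toy scoring sketch. This is not the PICon implementation: the function names, the dictionary-based answer format, the exact-match comparison, and the example rule and fact table are all assumptions made here for illustration, and a real evaluation would presumably need far more robust answer matching.

```python
from typing import Callable

# Hypothetical format: question id -> the agent's answer text.
Answers = dict[str, str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations do not count as changes."""
    return " ".join(text.lower().split())


def internal_consistency(answers: Answers, rules: list[Callable[[Answers], bool]]) -> float:
    """Share of cross-answer rules that hold (1.0 means no self-contradiction was found)."""
    if not rules:
        raise ValueError("at least one rule is required")
    return sum(rule(answers) for rule in rules) / len(rules)


def external_consistency(answers: Answers, facts: Answers) -> float:
    """Share of checkable answers that agree with a reference fact table."""
    checkable = answers.keys() & facts.keys()
    if not checkable:
        raise ValueError("no answers overlap with the fact table")
    hits = sum(normalize(answers[q]) == normalize(facts[q]) for q in checkable)
    return hits / len(checkable)


def retest_consistency(first: Answers, second: Answers) -> float:
    """Share of repeated questions that received the same answer both times."""
    repeated = first.keys() & second.keys()
    if not repeated:
        raise ValueError("no questions were repeated")
    same = sum(normalize(first[q]) == normalize(second[q]) for q in repeated)
    return same / len(repeated)


# Toy transcript (invented data, not taken from the paper).
first_pass = {
    "birth_year": "1990",
    "graduation_year": "1985",  # contradicts the birth year
    "capital_of_home_country": "Paris",
}
second_pass = {"birth_year": "1990", "graduation_year": "2012"}

# Assumed rule: a persona cannot graduate before being born.
rules = [lambda a: int(a["graduation_year"]) > int(a["birth_year"])]
# Assumed reference fact for the external check.
facts = {"capital_of_home_country": "paris"}

internal = internal_consistency(first_pass, rules)
external = external_consistency(first_pass, facts)
retest = retest_consistency(first_pass, second_pass)
print(internal, external, retest)
```

In this toy transcript the persona fails the single internal rule, matches the one external fact, and changes one of its two repeated answers. Each score is a plain fraction so the three dimensions stay comparable; that is a simplification chosen for this sketch rather than a metric defined in the abstract.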

Comments: 20 pages, 6 figures

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.25620 [cs.CL]

(or arXiv:2603.25620v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.25620

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Sujeong Im
[v1] Thu, 26 Mar 2026 16:34:34 UTC (732 KB)
