
Natural-Language Agent Harnesses

arXiv, March 26, 2026

Linyue Pan, Lexiao Zou, Shuo Guo


Abstract: Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime (IHR), a shared runtime that executes these harnesses through explicit contracts, durable artifacts, and lightweight adapters. Across coding and computer-use benchmarks, we conduct controlled evaluations of operational viability, module ablation, and code-to-text harness migration.

Comments: under review

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.25723 [cs.CL]

(or arXiv:2603.25723v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.25723

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Linyue Pan

[v1] Thu, 26 Mar 2026 17:58:15 UTC (1,836 KB)

Original source: arXiv, https://arxiv.org/abs/2603.25723v1
