
To Write or to Automate Linguistic Prompts, That Is the Question

arXiv [Submitted on 26 Mar 2026]

Marina Sánchez-Torrón, Daria Akselrod, Jason Rauchwerk


Abstract: LLM performance is highly sensitive to prompt design, yet whether automatic prompt optimization can replace expert prompt engineering in linguistic tasks remains unexplored. We present the first systematic comparison of hand-crafted zero-shot expert prompts, base DSPy signatures, and GEPA-optimized DSPy signatures across translation, terminology insertion, and language quality assessment (LQA), evaluating five model configurations. Results are task-dependent. In terminology insertion, optimized and manual prompts produce mostly statistically indistinguishable quality. In translation, each approach wins on different models. In LQA, expert prompts achieve stronger error detection, while optimization improves error characterization. Across all tasks, GEPA elevates minimal DSPy signatures, and the majority of expert-versus-optimized comparisons show no statistically significant difference. We note that the comparison is asymmetric: GEPA optimization searches programmatically over gold-standard splits, whereas expert prompts require, in principle, no labeled data, relying instead on domain expertise and iterative refinement.
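The abstract reports that most expert-versus-optimized comparisons show no statistically significant difference, but it does not say which test was used. As an illustration only, the sketch below shows one common way to compare two prompting approaches on the same test segments: a paired sign-flip permutation test. The function name, the parameters, and the sample scores are all hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def paired_permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired per-segment scores.

    Assumption: scores_a[i] and scores_b[i] are quality scores for the same
    test segment under two prompting approaches (for example, expert prompt
    versus optimized prompt). Returns an estimated p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")

    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))

    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two labels are exchangeable within
        # each pair, so each difference is equally likely to change sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1

    # Add-one smoothing keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Made-up scores for illustration; these are not results from the paper.
    expert = [0.81, 0.77, 0.90, 0.68, 0.74, 0.88, 0.79, 0.83]
    optimized = [0.80, 0.79, 0.88, 0.70, 0.73, 0.89, 0.78, 0.84]
    p_value = paired_permutation_test(expert, optimized)
    print(f"estimated p-value: {p_value:.3f}")
```

A p-value above the chosen threshold (0.05 is a common convention) would be read as "no statistically significant difference" between the two approaches on that test set. The pairing matters here: both approaches are scored on identical segments, so the test works on per-segment differences rather than on two independent samples.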

Comments: 10 pages, to be submitted for EAMT 2026

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.25169 [cs.CL] (or arXiv:2603.25169v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.25169 (arXiv-issued DOI via DataCite, pending registration)

Submission history

From: Marina Sánchez-Torrón
[v1] Thu, 26 Mar 2026 08:42:06 UTC (31 KB)
