L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search

arXiv · March 31, 2026
Authors: Ziqi Wang, Boqin Yuan

Abstract: We present L-MARS (Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search), a multi-agent retrieval framework for grounded legal question answering that decomposes queries into structured sub-problems, retrieves evidence via agentic web search, filters results through a verification agent, and synthesizes cited answers. Existing legal QA benchmarks test either closed-book reasoning or retrieval over fixed corpora, but neither captures scenarios requiring current legal information. We introduce LegalSearchQA, a 50-question benchmark across five legal domains whose answers depend on recent developments that post-date model training data. L-MARS achieves 96.0% accuracy on LegalSearchQA, an improvement of 38.0 percentage points over zero-shot performance (58.0%), while chain-of-thought prompting degrades performance to 30.0%. On Bar Exam QA (Zheng et al., 2025), a reasoning-focused benchmark of 594 bar examination questions, retrieval provides negligible gains (+0.7 percentage points), consistent with prior findings. These results show that agentic retrieval dramatically improves legal QA when tasks require up-to-date factual knowledge, but the benefit is benchmark-dependent, underscoring the need for retrieval-focused evaluation. Code and data are available at: this https URL
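The four stages named in the abstract (decompose, retrieve, verify, synthesize) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (Evidence, decompose, verify, stub_search, answer_with_citations) is hypothetical, the decomposition and verification rules are toy keyword heuristics standing in for LLM agents, and the search step is an in-memory stub with placeholder snippets rather than a real web search.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Evidence:
    source: str  # hypothetical identifier for where the snippet came from
    text: str    # the retrieved snippet


# Toy stopword list; a real verification agent would not work this way.
STOPWORDS: Set[str] = {"what", "is", "the", "a", "of", "about"}


def _words(text: str) -> Set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,?!:;").lower() for w in text.split()}


def decompose(question: str) -> List[str]:
    """Stage 1 (toy): split the question into sub-problems on ' and '."""
    parts = [p.strip(" ?") for p in question.split(" and ")]
    return [p for p in parts if p]


def verify(sub_question: str, ev: Evidence) -> bool:
    """Stage 3 (toy): keep evidence sharing a non-stopword with the sub-question."""
    return bool((_words(sub_question) - STOPWORDS) & _words(ev.text))


# Placeholder corpus; the snippets are invented for illustration only.
CORPUS = [
    Evidence("example.org/a", "Placeholder snippet about a filing deadline."),
    Evidence("example.org/b", "Placeholder snippet about a filing fee."),
    Evidence("example.org/c", "Placeholder snippet about the weather."),
]


def stub_search(query: str) -> List[Evidence]:
    """Stage 2 (stub): return the whole corpus; filtering is left to verify()."""
    return list(CORPUS)


def answer_with_citations(
    question: str, search: Callable[[str], List[Evidence]]
) -> str:
    """Stage 4 (toy): list verified evidence as numbered citations."""
    kept: List[Evidence] = []
    for sub in decompose(question):
        for ev in search(sub):
            # Skip unverified evidence and duplicates across sub-problems.
            if verify(sub, ev) and ev not in kept:
                kept.append(ev)
    if not kept:
        return "No verified evidence found."
    return "\n".join(
        f"[{i}] {ev.text} ({ev.source})" for i, ev in enumerate(kept, 1)
    )


if __name__ == "__main__":
    print(answer_with_citations(
        "What is the filing deadline and what is the filing fee?", stub_search
    ))
```

Passing the search function in as a parameter keeps the retrieval backend swappable, which matters for a setting like the one the abstract describes, where answers depend on information newer than the model's training data.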

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Cite as: arXiv:2509.00761 [cs.AI] (or arXiv:2509.00761v3 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2509.00761

arXiv-issued DOI via DataCite

Submission history

From: Boqin Yuan
[v1] Sun, 31 Aug 2025 09:23:26 UTC (912 KB)
[v2] Wed, 3 Sep 2025 00:57:14 UTC (912 KB)
[v3] Mon, 30 Mar 2026 02:42:59 UTC (857 KB)
