
Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs

arXiv cs.CL · by Zhuowen Liang, Xiaotian Lin, Zhengxuan Zhang, Yuyu Luo, Haixun Wang, Nan Tang · April 1, 2026

Abstract: Large language models (LLMs) are widely applied to data analytics over documents, yet direct reasoning over long, noisy documents remains brittle and error-prone. Hence, we study document question answering (QA) that consolidates dispersed evidence into a structured output (e.g., a table, graph, or chunks) to support reliable, verifiable QA. We propose a two-pillar framework, LiteCoST, to achieve both high accuracy and low latency with small language models (SLMs). Pillar 1: Chain-of-Structured-Thought (CoST). We introduce a CoST template, a schema-aware instruction that guides a strong LLM to produce both a step-wise CoST trace and the corresponding structured output. The process induces a minimal structure, normalizes entities/units, aligns records, serializes the output, and verifies/refines it, yielding auditable supervision. Pillar 2: SLM fine-tuning. The compact models are trained on LLM-generated CoST data in two stages: Supervised Fine-Tuning for structural alignment, followed by Group Relative Policy Optimization (GRPO) incorporating triple rewards for answer/format quality and process consistency. By distilling structure-first behavior into SLMs, this approach achieves LLM-comparable quality on multi-domain long-document QA using 3B/7B SLMs, while delivering 2-4x lower latency than GPT-4o and DeepSeek-R1 (671B). The code is available at this https URL.
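The GRPO stage described above combines three reward signals: answer quality, format quality, and process consistency. The abstract does not spell out how these are computed or weighted, so the following is a minimal illustrative sketch under assumptions of my own: exact-match answer scoring, JSON-parseability as the format check, a simple "trace steps are grounded in the output" heuristic for process consistency, and arbitrary weights. All function names and the weighting scheme are hypothetical, not the paper's implementation.

```python
import json

def answer_reward(pred: str, gold: str) -> float:
    """1.0 for a case-insensitive exact answer match, else 0.0 (assumed scoring)."""
    return 1.0 if pred.strip().lower() == gold.strip().lower() else 0.0

def format_reward(structured: str) -> float:
    """1.0 if the serialized structured output parses as JSON, else 0.0."""
    try:
        json.loads(structured)
        return 1.0
    except json.JSONDecodeError:
        return 0.0

def process_reward(trace_steps: list[str], structured: str) -> float:
    """Fraction of trace steps whose cited snippet appears in the structured
    output -- a crude stand-in for the paper's process-consistency reward."""
    if not trace_steps:
        return 0.0
    hits = sum(1 for step in trace_steps if step in structured)
    return hits / len(trace_steps)

def triple_reward(pred: str, gold: str, structured: str,
                  trace_steps: list[str],
                  weights: tuple[float, float, float] = (0.5, 0.25, 0.25)) -> float:
    """Weighted sum of the three signals; the weights are illustrative."""
    wa, wf, wp = weights
    return (wa * answer_reward(pred, gold)
            + wf * format_reward(structured)
            + wp * process_reward(trace_steps, structured))
```

In GRPO, a scalar reward like this would be computed per sampled completion and normalized within each sampled group to form relative advantages; the sketch covers only the reward side.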

Comments: 26 pages, 17 figures, 10 tables. Accepted at ICLR 2026

Subjects:

Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Cite as: arXiv:2603.29232 [cs.CL]

(or arXiv:2603.29232v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.29232

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zhuowen Liang [v1] Tue, 31 Mar 2026 04:03:07 UTC (7,242 KB)
