
Read More, Think More: Revisiting Observation Reduction for Web Agents

arXiv cs.CL [Submitted on 2 Apr 2026] April 4, 2026



Abstract: Web agents based on large language models (LLMs) rely on observations of web pages, commonly represented as HTML, as the basis for identifying available actions and planning subsequent steps. Prior work has treated the verbosity of HTML as an obstacle to performance and adopted observation reduction as a standard practice. We revisit this trend and demonstrate that the optimal observation representation depends on model capability and thinking token budget: (1) compact observations (accessibility trees) are preferable for lower-capability models, while detailed observations (HTML) are advantageous for higher-capability models; moreover, increasing thinking tokens further amplifies the benefit of HTML. (2) Our error analysis suggests that higher-capability models exploit layout information in HTML for better action grounding, while lower-capability models suffer from increased hallucination under longer inputs. We also find that incorporating observation history improves performance across most models and settings, and a diff-based representation offers a token-efficient alternative. Based on these findings, we suggest practical guidelines: adaptively select observation representations based on model capability and thinking token budget, and incorporate observation history using diff-based representations.
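The two guidelines in the abstract could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the diff format or how model capability is judged, so the use of a line-level unified diff from Python's `difflib`, the function names `diff_observation` and `select_observation`, and the boolean `high_capability` flag are all assumptions made for this example.

```python
import difflib


def diff_observation(previous: str, current: str, context: int = 0) -> str:
    """Return a line-level unified diff between two page observations.

    Assumption: a unified diff stands in for the "diff-based representation";
    the abstract does not say which diff format the authors use.
    """
    diff = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        n=context,      # number of unchanged context lines kept around each change
        lineterm="",    # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(diff)


def select_observation(html: str, axtree: str, high_capability: bool) -> str:
    """Pick the observation format for the next agent step.

    Assumption: capability is reduced to a hypothetical boolean flag.
    Detailed HTML goes to higher-capability models, the compact
    accessibility tree to lower-capability ones.
    """
    return html if high_capability else axtree


if __name__ == "__main__":
    before = "<button>Add</button>\n<span>Cart: 0</span>"
    after = "<button>Add</button>\n<span>Cart: 1</span>"
    # Only the changed line is carried forward as history.
    print(diff_observation(before, after))
```

With `context=0`, unchanged lines are omitted entirely, so the history passed to the model grows with the size of the change rather than the size of the page. A real agent would likely also need the thinking token budget as an input to the selection step, which this sketch leaves out.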

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2604.01535 [cs.CL] (or arXiv:2604.01535v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01535 (arXiv-issued DOI via DataCite, pending registration)

Submission history

From: Masafumi Enomoto [v1] Thu, 2 Apr 2026 02:14:47 UTC (325 KB)
