
PaperVoyager: Building Interactive Web with Visual Language Models

arXiv, March 31, 2026

arXiv:2603.22999v2 Announce Type: replace

Authors: Dasen Dai, Biao Wu, Meng Fang, Wenhao Wang


Abstract: Recent advances in visual language models have enabled autonomous agents for complex reasoning, tool use, and document understanding. However, existing document agents mainly transform papers into static artifacts such as summaries, webpages, or slides, which are insufficient for technical papers involving dynamic mechanisms and state transitions. In this work, we propose a Paper-to-Interactive-System Agent that converts research papers into executable interactive web systems. Given a PDF paper, the agent performs end-to-end processing without human intervention, including paper understanding, system modeling, and interactive webpage synthesis, enabling users to manipulate inputs and observe dynamic behaviors. To evaluate this task, we introduce a benchmark of 19 research papers paired with expert-built interactive systems as ground truth. We further propose PaperVoyager, a structured generation framework that explicitly models mechanisms and interaction logic during synthesis. Experiments show that PaperVoyager significantly improves the quality of generated interactive systems, offering a new paradigm for interactive scientific paper understanding.

Comments: 9 pages, 5 figures

Subjects: Computation and Language (cs.CL)

ACM classes: F.2.2; I.2.7

Cite as: arXiv:2603.22999 [cs.CL]

(or arXiv:2603.22999v2 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.22999

arXiv-issued DOI via DataCite

Submission history

From: Biao Wu
[v1] Tue, 24 Mar 2026 09:42:27 UTC (4,062 KB)
[v2] Sat, 28 Mar 2026 06:16:28 UTC (4,062 KB)
