Models model language model announce multimodal agent paper

Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations

arXiv cs.MAby Yuheng Yang, Wenjia Jiang, Yang Wang, Yi Song, Yiwei Wang, Chi ZhangApril 2, 20261 min read0 views

arXiv:2509.11062v3 Announce Type: replace-cross Abstract: The rapid progress of large language models (LLMs) has opened new opportunities for education. While learners can interact with academic papers through LLM-powered dialogue, limitations still exist: the lack of structured organization and the heavy reliance on text can impede systematic understanding and engagement with complex concepts. To address these challenges, we propose Auto-Slides, an LLM-driven system that converts research papers into pedagogically structured, multimodal slides (e.g., diagrams and tables). Drawing on cognitive science, it creates a presentation-oriented narrative and allows iterative refinement via an interactive editor to better match learners' knowledge level and goals. Auto-Slides further incorporates v

View PDF HTML (experimental)

Abstract:The rapid progress of large language models (LLMs) has opened new opportunities for education. While learners can interact with academic papers through LLM-powered dialogue, limitations still exist: the lack of structured organization and the heavy reliance on text can impede systematic understanding and engagement with complex concepts. To address these challenges, we propose Auto-Slides, an LLM-driven system that converts research papers into pedagogically structured, multimodal slides (e.g., diagrams and tables). Drawing on cognitive science, it creates a presentation-oriented narrative and allows iterative refinement via an interactive editor to better match learners' knowledge level and goals. Auto-Slides further incorporates verification and knowledge retrieval mechanisms to ensure accuracy and contextual completeness. Through extensive user studies, Auto-Slides demonstrates strong learner acceptance, improved structural support for understanding, and expert-validated gains in narrative quality compared with conventional LLM-based reading. Our contributions lie in designing a multi-agent framework for transforming academic papers into pedagogically optimized slides and introducing interactive customization for personalized learning.

Comments: Project Homepage: this https URL

Subjects:

Human-Computer Interaction (cs.HC); Multiagent Systems (cs.MA)

Cite as: arXiv:2509.11062 [cs.HC]

(or arXiv:2509.11062v3 [cs.HC] for this version)

https://doi.org/10.48550/arXiv.2509.11062

arXiv-issued DOI via DataCite

Submission history

From: Yuheng Yang [view email] [v1] Sun, 14 Sep 2025 03:05:54 UTC (4,973 KB) [v2] Wed, 17 Sep 2025 11:06:49 UTC (4,973 KB) [v3] Wed, 1 Apr 2026 08:41:51 UTC (2,051 KB)

Original source

arXiv cs.MA

https://arxiv.org/abs/2509.11062

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modelannounce

ProductsFresh

UniCon: A Unified System for Efficient Robot Learning Transfers

arXiv:2601.14617v2 Announce Type: replace Abstract: Deploying learning-based controllers across heterogeneous robots is challenging due to platform differences, inconsistent interfaces, and inefficient middleware. To address these issues, we present UniCon, a lightweight framework that standardizes states, control flow, and instrumentation across platforms. It decomposes workflows into execution graphs with reusable components, separating system states from control logic to enable plug-and-play deployment across various robot morphologies. Unlike traditional middleware, it prioritizes efficiency through batched, vectorized data flow, minimizing communication overhead and improving inference latency. This modular, data-oriented approach enables seamless sim-to-real transfer with minimal re-

arXiv cs.RO

1mabout 6 hours ago

ProductsFresh

A Survey of Real-Time Support, Analysis, and Advancements in ROS 2

arXiv:2601.10722v2 Announce Type: replace Abstract: The Robot Operating System 2 (ROS~2) has emerged as a relevant middleware framework for robotic applications, offering modularity, distributed execution, and communication. In the last six years, ROS~2 has drawn increasing attention from the real-time systems community and industry. This survey presents a comprehensive overview of research efforts that analyze, enhance, and extend ROS~2 to support real-time execution. We first provide a detailed description of the internal scheduling mechanisms of ROS~2 and its layered architecture, including the interaction with DDS-based communication and other communication middleware. We then review key contributions from the literature, covering timing analysis for both single- and multi-threaded exe

arXiv cs.RO

2mabout 6 hours ago

ReleasesFresh

Simulation of Active Soft Nets for Capture of Space Debris

arXiv:2511.17266v2 Announce Type: replace Abstract: In this work, we propose a simulator, based on the open-source physics engine MuJoCo, for the design and control of soft robotic nets for the autonomous removal of space debris. The proposed simulator includes net dynamics, contact between the net and the debris, self-contact of the net, orbital mechanics, and a controller that can actuate thrusters on the four satellites at the corners of the net. It showcases the case of capturing Envisat, a large ESA satellite that remains in orbit as space debris following the end of its mission. This work investigates different mechanical models, which can be used to simulate the net dynamics, simulating various degrees of compliance, and different control strategies to achieve the capture of the deb

arXiv cs.RO

2mabout 6 hours ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 229 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Models

ModelsFresh

An Open-Source LiDAR and Monocular Off-Road Autonomous Navigation Stack

arXiv:2604.03096v1 Announce Type: new Abstract: Off-road autonomous navigation demands reliable 3D perception for robust obstacle detection in challenging unstructured terrain. While LiDAR is accurate, it is costly and power-intensive. Monocular depth estimation using foundation models offers a lightweight alternative, but its integration into outdoor navigation stacks remains underexplored. We present an open-source off-road navigation stack supporting both LiDAR and monocular 3D perception without task-specific training. For the monocular setup, we combine zero-shot depth prediction (Depth Anything V2) with metric depth rescaling using sparse SLAM measurements (VINS-Mono). Two key enhancements improve robustness: edge-masking to reduce obstacle hallucination and temporal smoothing to mit

arXiv cs.RO

1mabout 6 hours ago

ModelsFresh

FSUNav: A Cerebrum-Cerebellum Architecture for Fast, Safe, and Universal Zero-Shot Goal-Oriented Navigation

arXiv:2604.03139v1 Announce Type: new Abstract: Current vision-language navigation methods face substantial bottlenecks regarding heterogeneous robot compatibility, real-time performance, and navigation safety. Furthermore, they struggle to support open-vocabulary semantic generalization and multimodal task inputs. To address these challenges, this paper proposes FSUNav: a Cerebrum-Cerebellum architecture for fast, safe, and universal zero-shot goal-oriented navigation, which innovatively integrates vision-language models (VLMs) with the proposed architecture. The cerebellum module, a high-frequency end-to-end module, develops a universal local planner based on deep reinforcement learning, enabling unified navigation across heterogeneous platforms (e.g., humanoid, quadruped, wheeled robots

arXiv cs.RO

1mabout 6 hours ago

ModelsLive

ChatGPT web service hit by brief disruption, OpenAI investigates - news.cgtn.com

ChatGPT web service hit by brief disruption, OpenAI investigates news.cgtn.com

Google News: ChatGPT

1mabout 1 hour ago

ModelsLive

Anthropic cuts OpenClaw access from Claude subscriptions, offers credits to ease transition - InfoWorld

Anthropic cuts OpenClaw access from Claude subscriptions, offers credits to ease transition InfoWorld

Google News: Claude

1mabout 1 hour ago