Impact of Multimodal and Conversational AI on Learning Outcomes and Experience
arXiv:2604.02221v1 Announce Type: new
Abstract: Multimodal Large Language Models (MLLMs) offer an opportunity to support multimedia learning through conversational systems grounded in educational content. However, while conversational AI is known to boost engagement, its impact on learning in visually rich STEM domains remains under-explored. Moreover, there is limited understanding of how multimodality and conversationality jointly influence learning in generative AI systems. This work reports findings from a randomized controlled online study (N = 124) comparing three approaches to learning biology from textbook content: (1) a document-grounded conversational AI with interleaved text-and-image responses (MuDoC), (2) a document-grounded conversational AI with text-only responses (TexDoC), and (3) a textbook interface with semantic search and highlighting (DocSearch). Learners using MuDoC achieved the highest post-test scores and reported the most positive learning experience. Notably, while TexDoC was rated as significantly more engaging and easier to use than DocSearch, it led to the lowest post-test scores, revealing a disconnect between student perceptions and learning outcomes. Interpreted through the lens of Cognitive Load Theory, these findings suggest that conversationality reduces extraneous load, while the visual-verbal integration induced by multimodality increases germane load, leading to better learning outcomes. When conversationality is not complemented by multimodality, reduced cognitive effort may instead inflate perceived understanding without improving learning outcomes.
Comments: 16 pages, 3 figures, Accepted to AIED 2026 (Seoul, South Korea)
Subjects:
Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)
Cite as: arXiv:2604.02221 [cs.HC]
(or arXiv:2604.02221v1 [cs.HC] for this version)
https://doi.org/10.48550/arXiv.2604.02221
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Karan Taneja [view email] [v1] Thu, 2 Apr 2026 16:12:00 UTC (936 KB)
