
REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour

ArXiv CS.AI | by Fares Fawzi, Seyed Parsa Neshaei, Marta Knezevic, Tanya Nazaretsky, Tanja Käser | April 1, 2026


Abstract: Formative feedback is central to effective learning, yet providing timely, individualised feedback at scale remains a persistent challenge. While recent work has explored the use of large language models (LLMs) to automate feedback, most existing systems still conceptualise feedback as a static, one-way artifact, offering limited support for interpretation, clarification, or follow-up. In this work, we introduce REFINE, a locally deployable, multi-agent feedback system built on small, open-source LLMs that treats feedback as an interactive process. REFINE combines a pedagogically-grounded feedback generation agent with an LLM-as-a-judge-guided regeneration loop using a human-aligned judge, and a self-reflective tool-calling interactive agent that supports student follow-up questions with context-aware, actionable responses. We evaluate REFINE through controlled experiments and an authentic classroom deployment in an undergraduate computer science course. Automatic evaluations show that judge-guided regeneration significantly improves feedback quality, and that the interactive agent produces efficient, high-quality responses comparable to a state-of-the-art closed-source model. Analysis of real student interactions further reveals distinct engagement patterns and indicates that system-generated feedback systematically steers subsequent student inquiry. Our findings demonstrate the feasibility and effectiveness of multi-agent, tool-augmented feedback systems for scalable, interactive feedback.
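The abstract names a judge-guided regeneration loop but gives no implementation details, so the following is only a minimal sketch of the general pattern, not the authors' method. Everything in it is an assumption made for illustration: the `Verdict` fields, the `regenerate_until_accepted` function, the score threshold, the round limit, and the toy `toy_generate` and `toy_judge` stand-ins that replace real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    score: float   # hypothetical quality score in [0, 1]
    critique: str  # hypothetical free-text critique from the judge


def regenerate_until_accepted(
    generate: Callable[[str, Optional[str]], str],
    judge: Callable[[str, str], Verdict],
    submission: str,
    threshold: float = 0.8,  # assumed acceptance cut-off
    max_rounds: int = 3,     # assumed budget of generation attempts
) -> Tuple[str, Verdict, int]:
    """Return the best-scoring feedback, its verdict, and the rounds used."""
    best_feedback, best_verdict = "", Verdict(score=-1.0, critique="")
    critique: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        # The generator sees the previous critique (None on the first round).
        feedback = generate(submission, critique)
        verdict = judge(submission, feedback)
        if verdict.score > best_verdict.score:
            best_feedback, best_verdict = feedback, verdict
        if verdict.score >= threshold:
            return best_feedback, best_verdict, round_no
        critique = verdict.critique
    # Budget exhausted: fall back to the best attempt seen so far.
    return best_feedback, best_verdict, max_rounds


# Toy stand-ins for the two agents; a real system would call LLMs here.
def toy_generate(submission: str, critique: Optional[str]) -> str:
    if critique is None:
        return "Good job."
    return "Your loop never updates i; try incrementing it inside the body."


def toy_judge(submission: str, feedback: str) -> Verdict:
    if "try" in feedback:
        return Verdict(0.9, "Actionable.")
    return Verdict(0.2, "Too vague; point to a concrete next step.")


feedback, verdict, rounds = regenerate_until_accepted(
    toy_generate, toy_judge, "while i < n: print(i)"
)
```

With these toy agents the first draft is rejected as too vague and the second, which incorporates the critique, is accepted. Keeping the best attempt rather than the last one is a design choice of this sketch: it guarantees a usable result even when no draft reaches the threshold.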

Comments: Accepted to AIED 2026

Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)

Cite as: arXiv:2603.29142 [cs.AI]

(or arXiv:2603.29142v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.29142

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Fares Fawzi
[v1] Tue, 31 Mar 2026 01:48:08 UTC (1,382 KB)
