
REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour

ArXiv CS.AI, by Fares Fawzi, Seyed Parsa Neshaei, Marta Knezevic, Tanya Nazaretsky, Tanja Käser. April 1, 2026




Abstract: Formative feedback is central to effective learning, yet providing timely, individualised feedback at scale remains a persistent challenge. While recent work has explored the use of large language models (LLMs) to automate feedback, most existing systems still conceptualise feedback as a static, one-way artifact, offering limited support for interpretation, clarification, or follow-up. In this work, we introduce REFINE, a locally deployable, multi-agent feedback system built on small, open-source LLMs that treats feedback as an interactive process. REFINE combines a pedagogically-grounded feedback generation agent with an LLM-as-a-judge-guided regeneration loop using a human-aligned judge, and a self-reflective tool-calling interactive agent that supports student follow-up questions with context-aware, actionable responses. We evaluate REFINE through controlled experiments and an authentic classroom deployment in an undergraduate computer science course. Automatic evaluations show that judge-guided regeneration significantly improves feedback quality, and that the interactive agent produces efficient, high-quality responses comparable to a state-of-the-art closed-source model. Analysis of real student interactions further reveals distinct engagement patterns and indicates that system-generated feedback systematically steers subsequent student inquiry. Our findings demonstrate the feasibility and effectiveness of multi-agent, tool-augmented feedback systems for scalable, interactive feedback.
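The abstract does not give implementation details, so the following is only an illustrative sketch of what a judge-guided regeneration loop can look like in general, not REFINE's actual code. Every name here (`Verdict`, `regenerate_with_judge`, the 0 to 1 score scale, the `threshold` and `max_rounds` parameters, and the toy generator and judge) is a hypothetical stand-in; in a real system the two callables would wrap LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    score: float    # hypothetical quality score in [0, 1]
    critique: str   # judge's explanation, passed back to the generator


def regenerate_with_judge(
    generate: Callable[[str, str], str],
    judge: Callable[[str, str], Verdict],
    submission: str,
    threshold: float = 0.8,   # assumed acceptance cut-off
    max_rounds: int = 3,      # assumed retry budget
) -> tuple[str, Verdict, int]:
    """Return the best feedback seen, its verdict, and the rounds used."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    critique = ""
    best = None
    for round_no in range(1, max_rounds + 1):
        feedback = generate(submission, critique)
        verdict = judge(submission, feedback)
        # Keep the highest-scoring candidate in case no round passes.
        if best is None or verdict.score > best[1].score:
            best = (feedback, verdict)
        if verdict.score >= threshold:
            break
        # Otherwise feed the judge's critique into the next attempt.
        critique = verdict.critique
    return best[0], best[1], round_no


# Toy stand-ins for the two agents (assumptions, no LLM involved).
def toy_generate(submission: str, critique: str) -> str:
    text = "Your loop never terminates."
    if critique:
        text += " Try adding an exit condition."
    return text


def toy_judge(submission: str, feedback: str) -> Verdict:
    actionable = "Try" in feedback
    return Verdict(
        score=0.9 if actionable else 0.4,
        critique="" if actionable else "Add an actionable next step.",
    )


feedback, verdict, rounds = regenerate_with_judge(
    toy_generate, toy_judge, "while True: pass"
)
```

In this toy run the first draft is rejected for lacking an actionable step, and the second draft, produced with the judge's critique in hand, passes the threshold. Returning the best candidate rather than the last one is a design choice of this sketch: it avoids handing the student a worse draft when the retry budget runs out.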

Comments: Accepted to AIED 2026

Subjects: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)

Cite as: arXiv:2603.29142 [cs.AI]

(or arXiv:2603.29142v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.29142

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Fares Fawzi
[v1] Tue, 31 Mar 2026 01:48:08 UTC (1,382 KB)

Original source

ArXiv CS.AI

https://arxiv.org/abs/2603.29142



