Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems

arXiv, March 31, 2026

arXiv:2602.17542v2 (announce type: replace). Authors: Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan


Abstract: Fine-grained skill representations, commonly referred to as knowledge components (KCs), are fundamental to many approaches in student modeling and learning analytics. However, KC-level correctness labels are rarely available in real-world datasets, especially for open-ended programming tasks where solutions typically involve multiple KCs simultaneously. Simply propagating problem-level correctness to all associated KCs obscures partial mastery and often leads to poorly fitted learning curves. To address this challenge, we propose an automated framework that leverages large language models (LLMs) to label KC-level correctness directly from student-written code. Our method assesses whether each KC is correctly applied and further introduces a temporal context-aware Code-KC mapping mechanism to better align KCs with individual student code. We evaluate the resulting KC-level correctness labels in terms of learning curve fit and predictive performance using the power law of practice and the Additive Factors Model. Experimental results show that our framework leads to learning curves that are more consistent with cognitive theory and improves predictive performance, compared to baselines. Human evaluation further demonstrates substantial agreement between LLM and expert annotations.
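To make the learning-curve idea in the abstract concrete, here is a minimal sketch (not the authors' implementation) of how KC-level labels could be turned into per-opportunity error rates and checked against the power law of practice, E(t) = a * t^(-b), using a log-log least-squares fit. All function names, the tuple layout, and the KC name "loop" are hypothetical and chosen only for illustration; the paper's actual fitting procedure may differ.

```python
import math


def propagate_problem_label(student, kcs, problem_correct):
    """Baseline described in the abstract: copy the problem-level
    correctness to every KC associated with the problem."""
    return [(student, kc, problem_correct) for kc in kcs]


def error_rate_by_opportunity(attempts):
    """attempts: chronological list of (student, kc, correct) tuples.
    Returns {kc: [error rate at opportunity 1, 2, ...]}."""
    seen = {}    # (student, kc) -> number of opportunities so far
    totals = {}  # (kc, opportunity) -> [error count, attempt count]
    for student, kc, correct in attempts:
        opp = seen.get((student, kc), 0) + 1
        seen[(student, kc)] = opp
        cell = totals.setdefault((kc, opp), [0, 0])
        cell[0] += 0 if correct else 1
        cell[1] += 1
    curves = {}
    for (kc, _opp), (errors, n) in sorted(totals.items()):
        curves.setdefault(kc, []).append(errors / n)
    return curves


def fit_power_law(errors):
    """Fit E(t) = a * t**(-b) by linear regression on log E vs. log t.
    Zero error rates are skipped because their log is undefined.
    Returns (a, b, r_squared)."""
    pts = [(math.log(t), math.log(e))
           for t, e in enumerate(errors, start=1) if e > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two non-zero error rates")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for _, y in pts)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in pts)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(intercept), -slope, r2
```

Under this sketch, a better set of KC-level labels would be expected to yield curves with a positive decay exponent b and a higher R^2 than curves built from labels produced by `propagate_problem_label`.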

Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)

Cite as: arXiv:2602.17542 [cs.CL] (or arXiv:2602.17542v2 [cs.CL] for this version)

DOI: https://doi.org/10.48550/arXiv.2602.17542 (arXiv-issued DOI via DataCite)

Submission history

From: Zhangqi Duan
[v1] Thu, 19 Feb 2026 16:58:34 UTC (792 KB)
[v2] Fri, 27 Mar 2026 21:30:24 UTC (794 KB)
