Training data generation for context-dependent rubric-based short answer grading

arXiv, March 31, 2026

Authors: Pavel Šindelář, Dávid Slivka, Christopher Bouma, Filip Prášil, Ondřej Bojar

Abstract: Every 4 years, the PISA test is administered by the OECD to test the knowledge of teenage students worldwide and allow for comparisons of educational systems. However, having to avoid language differences and annotator bias makes the grading of student answers challenging. For these reasons, it would be interesting to compare methods of automatic student answer grading. To train some of these methods, which require machine learning, or to compute parameters or select hyperparameters for those that do not, a large amount of domain-specific data is needed. In this work, we explore a small number of methods for creating a large-scale training dataset using only a relatively small confidential dataset as a reference, leveraging a set of very simple derived text formats to preserve confidentiality. Using these methods, we successfully created three surrogate datasets that are, at the very least, superficially more similar to the reference dataset than purely the result of prompt-based generation. Early experiments suggest one of these approaches might also lead to improved model training.

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2603.28537 [cs.CL]

(or arXiv:2603.28537v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.28537

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Pavel Šindelář
[v1] Mon, 30 Mar 2026 14:59:53 UTC (66 KB)
