SciEGQA: A Dataset for Scientific Evidence-Grounded Question Answering and Reasoning
arXiv:2511.15090v2 Announce Type: replace-cross
Authors: Wenhan Yu, Zhaoxi Zhang, Wang Chen, Guanqiang Qi, Weikang Li, Lei Sha, Deguo Xia, Jizhou Huang
Abstract: Scientific documents contain complex multimodal structures, which makes evidence localization and scientific reasoning in Document Visual Question Answering particularly challenging. However, most existing benchmarks evaluate models only at the page level without explicitly annotating the evidence regions that support the answer, which limits both interpretability and the reliability of evaluation. To address this limitation, we introduce SciEGQA, a scientific document question answering and reasoning dataset with semantic evidence grounding, where supporting evidence is represented as semantically coherent document regions annotated with bounding boxes. SciEGQA consists of two components: a human-annotated fine-grained benchmark containing 1,623 high-quality question-answer pairs, and a large-scale automatically constructed training set with over 30K QA pairs generated through an automated data construction pipeline. Extensive experiments on a wide range of Vision-Language Models (VLMs) show that existing models still struggle with evidence localization and evidence-based question answering in scientific documents. Training on the proposed dataset significantly improves the scientific reasoning capabilities of VLMs. The project page is available at this https URL.
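To make the idea of bounding-box evidence concrete, here is a minimal Python sketch of how one evidence-grounded QA record might be stored and how a predicted region could be compared with the annotated one. The abstract does not specify the dataset schema or the localization metric, so the `Box` and `EvidenceQA` types, their field names, the sample values, and the use of intersection-over-union (IoU) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in page coordinates (hypothetical layout)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as empty.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


@dataclass(frozen=True)
class EvidenceQA:
    """One QA pair with its supporting region (hypothetical schema)."""
    question: str
    answer: str
    page: int
    evidence: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 when they do not overlap."""
    inter = Box(max(a.x0, b.x0), max(a.y0, b.y0),
                min(a.x1, b.x1), min(a.y1, b.y1)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Illustrative values only; not taken from the dataset.
sample = EvidenceQA(
    question="Which method has the lowest error in the results table?",
    answer="Method B",
    page=3,
    evidence=Box(100, 200, 400, 350),
)
predicted = Box(120, 210, 420, 360)
print(f"IoU with annotated evidence: {iou(sample.evidence, predicted):.3f}")
```

A page-level benchmark would only check that `page` matches; scoring the overlap with `evidence` additionally checks whether the model looked at the right region of that page.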
Comments: 8 pages, 4 figures, 3 tables
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2511.15090 [cs.DB]
(or arXiv:2511.15090v2 [cs.DB] for this version)
https://doi.org/10.48550/arXiv.2511.15090
arXiv-issued DOI via DataCite
Submission history
From: Wenhan Yu
[v1] Wed, 19 Nov 2025 04:03:54 UTC (2,212 KB)
[v2] Mon, 30 Mar 2026 06:53:39 UTC (22,175 KB)