Live
Black Hat USAAI BusinessBlack Hat AsiaAI BusinessLost Warship From Battle of Copenhagen Found After 225 YearsGizmodoThese One-of-a-Kind Objects Are in the Wrong MuseumsGizmodoNew 'GeForge' and 'GDDRHammer' attacks can fully infiltrate your system through Nvidia's GPU memory — Rowhammer attacks in GPUs force bit flips in protected VRAM regions to gain read/write accesstomshardware.comI Uploaded My Blood Work to AI. Am I Oversharing? - WSJGNews AI healthcareAI’s next frontier is the real worldFortune TechDebris from aerial interception strikes Oracle building in Dubai, UAE saysCNBC TechnologyI Audited 30+ Small Businesses on Their AI Visibility. Here's What Most Are Getting Wrong.Dev.to AIHow to Actually Monitor Your LLM Costs (Without a Spreadsheet)Dev.to AIОдин промпт приносит мне $500 в неделю на фрилансеDev.to AINetflix AI Team Just Open-Sourced VOID: an AI Model That Erases Objects From Videos — Physics and AllMarkTechPostUnderstanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.DEV CommunityHow to Supercharge Your AI Coding Workflow with Oh My CodexDev.to AIBlack Hat USAAI BusinessBlack Hat AsiaAI BusinessLost Warship From Battle of Copenhagen Found After 225 YearsGizmodoThese One-of-a-Kind Objects Are in the Wrong MuseumsGizmodoNew 'GeForge' and 'GDDRHammer' attacks can fully infiltrate your system through Nvidia's GPU memory — Rowhammer attacks in GPUs force bit flips in protected VRAM regions to gain read/write accesstomshardware.comI Uploaded My Blood Work to AI. Am I Oversharing? - WSJGNews AI healthcareAI’s next frontier is the real worldFortune TechDebris from aerial interception strikes Oracle building in Dubai, UAE saysCNBC TechnologyI Audited 30+ Small Businesses on Their AI Visibility. Here's What Most Are Getting Wrong.Dev.to AIHow to Actually Monitor Your LLM Costs (Without a Spreadsheet)Dev.to AIОдин промпт приносит мне $500 в неделю на фрилансеDev.to AINetflix AI Team Just Open-Sourced VOID: an AI Model That Erases Objects From Videos — Physics and AllMarkTechPostUnderstanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.DEV CommunityHow to Supercharge Your AI Coding Workflow with Oh My CodexDev.to AI
AI NEWS HUBbyEIGENVECTOREigenvector

Containing the Reproducibility Gap: Automated Repository-Level Containerization for Scholarly Jupyter Notebooks

arXiv cs.SEby [Submitted on 1 Apr 2026]April 2, 20262 min read1 views
Source Quiz

arXiv:2604.01072v1 Announce Type: new Abstract: Computational reproducibility is fundamental to trustworthy science, yet remains difficult to achieve in practice across various research workflows, including Jupyter notebooks published alongside scholarly articles. Environment drift, undocumented dependencies and implicit execution assumptions frequently prevent independent re-execution of published research. Despite existing reproducibility guidelines, scalable and systematic infrastructure for automated assessment remains limited. We present an automated, web-oriented reproducibility engineering pipeline that reconstructs and evaluates repository-level execution environments for scholarly notebooks. The system performs dependency inference, automated container generation, and isolated exe

View PDF HTML (experimental)

Abstract:Computational reproducibility is fundamental to trustworthy science, yet remains difficult to achieve in practice across various research workflows, including Jupyter notebooks published alongside scholarly articles. Environment drift, undocumented dependencies and implicit execution assumptions frequently prevent independent re-execution of published research. Despite existing reproducibility guidelines, scalable and systematic infrastructure for automated assessment remains limited. We present an automated, web-oriented reproducibility engineering pipeline that reconstructs and evaluates repository-level execution environments for scholarly notebooks. The system performs dependency inference, automated container generation, and isolated execution to approximate the notebook's original computational context. We evaluate the approach on 443 notebooks from 116 GitHub repositories referenced by publications in PubMed Central. Execution outcomes are classified into four categories: resolved environment failures, persistent logic or data errors, reproducibility drift, and container-induced regressions. Our results show that containerization resolves 66.7% of prior dependency-related failures and substantially improves execution robustness. However, a significant reproducibility gap remains: 53.7% of notebooks exhibit low output fidelity, largely due to persistent runtime failures and stochastic non-determinism. These findings indicate that standardized containerization is essential for computational stability but insufficient for full bit-wise reproducibility. The framework offers a scalable solution for researchers, editors, and archivists seeking systematic, automated assessment of computational artifacts.

Subjects:

Software Engineering (cs.SE); Computational Engineering, Finance, and Science (cs.CE)

Cite as: arXiv:2604.01072 [cs.SE]

(or arXiv:2604.01072v1 [cs.SE] for this version)

https://doi.org/10.48550/arXiv.2604.01072

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Sheeba Samuel [view email] [v1] Wed, 1 Apr 2026 16:07:54 UTC (524 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

announcearxivresearch

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Containing …announcearxivresearchpublishedfindingsgithubarXiv cs.SE

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 276 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers