Research Papers research paper arxiv machine-learning deep-learning

In-Context Molecular Property Prediction with LLMs: A Blinding Study on Memorization and Knowledge Conflicts

arXivby [Submitted on 26 Mar 2026]March 30, 20261 min read1 views

arXiv:2603.25857v1 Announce Type: new Abstract: The capabilities of large language models (LLMs) have expanded beyond natural language processing to scientific prediction tasks, including molecular property prediction. However, their effectiveness in in-context learning remains ambiguous, particularly given the potential for training data contamination in widely used benchmarks. This paper investigates whether LLMs perform genuine in-context regression on molecular properties or rely primarily on memorized values. Furthermore, we analyze the interplay between pre-trained knowledge and in-conte — Matthias Busch, Marius Tacke, Sviatlana V. Lamaka, Mikhail L. Zheludkevich, Christian J. Cyron, Christian Feiler, Roland C. Aydin

View PDF HTML (experimental)

Abstract:The capabilities of large language models (LLMs) have expanded beyond natural language processing to scientific prediction tasks, including molecular property prediction. However, their effectiveness in in-context learning remains ambiguous, particularly given the potential for training data contamination in widely used benchmarks. This paper investigates whether LLMs perform genuine in-context regression on molecular properties or rely primarily on memorized values. Furthermore, we analyze the interplay between pre-trained knowledge and in-context information through a series of progressively blinded experiments. We evaluate nine LLM variants across three families (GPT-4.1, GPT-5, Gemini 2.5) on three MoleculeNet datasets (Delaney solubility, Lipophilicity, QM7 atomization energy) using a systematic blinding approach that iteratively reduces available information. Complementing this, we utilize varying in-context sample sizes (0-, 60-, and 1000-shot) as an additional control for information access. This work provides a principled framework for evaluating molecular property prediction under controlled information access, addressing concerns regarding memorization and exposing conflicts between pre-trained knowledge and in-context information.

Subjects:

Machine Learning (cs.LG)

Cite as: arXiv:2603.25857 [cs.LG]

(or arXiv:2603.25857v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.25857

arXiv-issued DOI via DataCite

Submission history

From: Matthias Busch [view email] [v1] Thu, 26 Mar 2026 19:27:30 UTC (796 KB)

Original source

arXiv

https://arxiv.org/abs/2603.25857

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

Research Papers

Exclusive | OpenAI’s Former Research Chief Aims to Automate Manufacturing With AI - WSJ

Exclusive | OpenAI’s Former Research Chief Aims to Automate Manufacturing With AI WSJ

GNews AI manufacturing

1mabout 1 month ago

ModelsLive

DenseNet Paper Walkthrough: All Connected

When we try to train a very deep neural network model, one issue that we might encounter is the vanishing gradient problem. This is essentially a problem where the weight update of a model during training slows down or even stops, hence causing the model not to improve. When a network is very deep, the [ ] The post DenseNet Paper Walkthrough: All Connected appeared first on Towards Data Science .

Towards Data Science

23mabout 1 hour ago

Models

Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models - WSJ

Exclusive | Caltech Researchers Claim Radical Compression of High-Fidelity AI Models WSJ

GNews AI energy

1m3 days ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 162 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Research Papers

Research Papers

Exclusive | OpenAI’s Former Research Chief Aims to Automate Manufacturing With AI - WSJ

Exclusive | OpenAI’s Former Research Chief Aims to Automate Manufacturing With AI WSJ

GNews AI manufacturing

1mabout 1 month ago

Research PapersFresh

Experts to address AI at BC3 cybersecurity conference - Butler Eagle

Experts to address AI at BC3 cybersecurity conference Butler Eagle

GNews AI cybersecurity

1mabout 3 hours ago

Research PapersLive

TROY student Eli Hankinson showcases research on AI and interactive learning at regional conference - Troy University

TROY student Eli Hankinson showcases research on AI and interactive learning at regional conference Troy University

GNews AI education

1mabout 2 hours ago

Research PapersFresh

How Leg Stiffness Affects Energy Economy in Hopping

arXiv:2501.03971v2 Announce Type: replace Abstract: In the fields of robotics and biomechanics, the integration of elastic elements such as springs and tendons in legged systems has long been recognized for enabling energy-efficient locomotion. Yet, a significant challenge persists: designing a robotic leg that perform consistently across diverse operating conditions, especially varying average forward speeds. It remains unclear whether, for such a range of operating conditions, the stiffness of the elastic elements needs to be varied or if a similar performance can be obtained by changing the motion and actuation while keeping the stiffness fixed. This work explores the influence of the leg stiffness on the energy efficiency of a monopedal robot through an extensive parametric study of it

arXiv cs.RO

2mabout 11 hours ago