
TabPFN-Wide: Continued Pre-Training for Extreme Feature Counts

arXiv, March 31, 2026

arXiv:2510.06162v2 Announce Type: replace

Authors: Christopher Kolberg, Jules Kreuer, Jonas Huurdeman, Sofiane Ouaari, Katharina Eggensperger, Nico Pfeifer


Abstract: Revealing novel insights from the relationship between molecular measurements and pathology remains a very impactful application of machine learning in biomedicine. Data in this domain typically contain only a few observations but thousands of potentially noisy features, posing challenges for conventional tabular machine learning approaches. While prior-data fitted networks emerge as foundation models for predictive tabular data tasks, they are currently not suited to handle large feature counts (>500). Although feature reduction enables their application, it hinders feature importance analysis. We propose a strategy that extends existing models through continued pre-training on synthetic data sampled from a customized prior. The resulting model, TabPFN-Wide, matches or exceeds its base model's performance, while exhibiting improved robustness to noise. It seamlessly scales beyond 30,000 categorical and continuous features, regardless of noise levels, while maintaining inherent interpretability, which is critical for biomedical applications. Our results demonstrate that prior-informed adaptation is suitable to enhance the capability of foundation models for high-dimensional data. On real-world omics datasets, we show that many of the most relevant features identified by the model overlap with previous biological findings, while others propose potential starting points for future studies.
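The data regime the abstract describes (few observations, thousands of potentially noisy features) can be illustrated with a minimal sketch. The generator below is a hypothetical toy and is not the customized prior used in the paper: the function name, its parameters, and the linear labeling rule are all assumptions made for illustration.

```python
import random


def sample_wide_dataset(n_samples=20, n_features=2000, n_informative=5, seed=0):
    """Toy generator for a 'wide' table: few rows, many mostly-irrelevant columns.

    Hypothetical illustration only; NOT the prior from the TabPFN-Wide paper.
    """
    rng = random.Random(seed)
    # Assumption: only a handful of columns carry signal; the rest are pure noise.
    informative = sorted(rng.sample(range(n_features), n_informative))
    weights = {j: rng.gauss(0.0, 1.0) for j in informative}

    X, y = [], []
    for _ in range(n_samples):
        # Every feature is drawn from a standard normal distribution.
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        # Assumption: the binary label depends linearly on the informative columns.
        score = sum(weights[j] * row[j] for j in informative)
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y, informative


X, y, informative = sample_wide_dataset()
print(len(X), len(X[0]), len(informative))
```

With the defaults, the table has 100 times more columns than rows, and only 5 of the 2,000 columns influence the label. A feature importance analysis on such data would ideally recover the indices in `informative`, which is harder to check once the features have been reduced or projected beforehand.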

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2510.06162 [cs.LG] (or arXiv:2510.06162v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2510.06162

arXiv-issued DOI via DataCite

Submission history

From: Christopher Kolberg
[v1] Tue, 7 Oct 2025 17:28:49 UTC (10,898 KB)
[v2] Sun, 29 Mar 2026 15:04:23 UTC (1,387 KB)

Original source

arXiv

https://arxiv.org/abs/2510.06162
