MarkushGrapher-2: End-to-end Multimodal Recognition of Chemical Structures
Authors: Tim Strohmeyer, Lucas Morin, Gerhard Ingmar Meijer, Valéry Weber, Ahmed Nassar, Peter Staar
Abstract: Automatically extracting chemical structures from documents is essential for the large-scale analysis of the literature in chemistry. Automatic pipelines have been developed to recognize molecules represented either in figures or in text independently. However, methods for recognizing chemical structures from multimodal descriptions (Markush structures) lag behind in precision and cannot be used for automatic large-scale processing. In this work, we present MarkushGrapher-2, an end-to-end approach for the multimodal recognition of chemical structures in documents. First, our method employs a dedicated OCR model to extract text from chemical images. Second, the text, image, and layout information are jointly encoded through a Vision-Text-Layout encoder and an Optical Chemical Structure Recognition vision encoder. Finally, the resulting encodings are fused through a two-stage training strategy and used to auto-regressively generate a representation of the Markush structure. To address the lack of training data, we introduce an automatic pipeline for constructing a large-scale dataset of real-world Markush structures. In addition, we present IP5-M, a large manually annotated benchmark of real-world Markush structures, designed to advance research on this challenging task. Extensive experiments show that our approach substantially outperforms state-of-the-art models in multimodal Markush structure recognition, while maintaining strong performance in molecule structure recognition. Code, models, and datasets are released publicly.
Comments: 15 pages, to be published in CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2603.28550 [cs.CV]
(or arXiv:2603.28550v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.28550
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Tim Strohmeyer
[v1] Mon, 30 Mar 2026 15:11:17 UTC (7,368 KB)
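
The abstract describes three stages: OCR, joint encoding by two encoders, and auto-regressive decoding of the fused encodings. The data flow between them can be pictured with a minimal sketch like the one below. This is not the authors' code: every name (`run_ocr`, `encode_vtl`, `encode_ocsr`, `fuse`, `next_token`, `recognize`) is hypothetical, every stage is a toy stub standing in for a neural model, and fusing by sequence concatenation is an assumption, since the abstract only says the encodings are fused. The two-stage training strategy is not covered.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vector = List[float]
Image = Sequence[Sequence[int]]  # toy stand-in for a pixel grid


@dataclass
class OcrCell:
    """One text span read from the image, with its bounding box (layout)."""
    text: str
    box: Tuple[int, int, int, int]  # x0, y0, x1, y1


def run_ocr(image: Image) -> List[OcrCell]:
    # Stage 1 (stub): a real system would run a chemistry-specific OCR model.
    return [OcrCell("R1", (10, 12, 24, 22)),
            OcrCell("R1 = H, CH3", (0, 40, 80, 52))]


def encode_vtl(image: Image, cells: List[OcrCell]) -> List[Vector]:
    # Stage 2a (stub): one toy embedding per OCR cell, mixing text and layout.
    return [[float(len(c.text)), float(c.box[0]), float(c.box[1])] for c in cells]


def encode_ocsr(image: Image) -> List[Vector]:
    # Stage 2b (stub): one toy embedding per image row, vision only.
    return [[float(sum(row)), float(len(row)), 0.0] for row in image]


def fuse(vtl: List[Vector], ocsr: List[Vector]) -> List[Vector]:
    # ASSUMPTION: fusion shown as concatenation along the sequence axis.
    return vtl + ocsr


def next_token(memory: List[Vector], prefix: List[str]) -> str:
    # Stub for one decoder step: depends on the fused memory and the prefix.
    return "<eos>" if len(prefix) > len(memory) else f"tok{len(prefix)}"


def recognize(image: Image, max_steps: int = 8) -> List[str]:
    cells = run_ocr(image)
    memory = fuse(encode_vtl(image, cells), encode_ocsr(image))
    # Stage 3: auto-regressive loop, each token conditioned on earlier ones.
    tokens = ["<bos>"]
    for _ in range(max_steps):
        tokens.append(next_token(memory, tokens))
        if tokens[-1] == "<eos>":
            break
    return tokens


if __name__ == "__main__":
    print(recognize([[0, 1], [1, 0]]))
```

The point of the sketch is only the shape of the pipeline: the OCR output feeds one encoder together with the image, the image alone feeds the other, and the decoder sees a single fused sequence.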