A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction
arXiv:2601.00809v2 Announce Type: replace-cross Abstract: Agentic workflows driven by large language models (LLMs) are increasingly applied to Building Information Modelling (BIM), enabling natural-language retrieval, modification and generation of IFC models. Recent work has begun adopting the emerging Model Context Protocol (MCP) as a uniform tool-calling interface for LLMs, simplifying the agent side of BIM interaction. While MCP standardises how LLMs invoke tools, current BIM-side implementations are still authoring tool-specific and ad hoc, limiting reuse, evaluation, and workflow portabi — Tobias Heimig-Elschner, Changyu Du, Anna Scheuvens, Andr\'e Borrmann, Jakob Beetz
View PDF HTML (experimental)
Abstract:Agentic workflows driven by large language models (LLMs) are increasingly applied to Building Information Modelling (BIM), enabling natural-language retrieval, modification and generation of IFC models. Recent work has begun adopting the emerging Model Context Protocol (MCP) as a uniform tool-calling interface for LLMs, simplifying the agent side of BIM interaction. While MCP standardises how LLMs invoke tools, current BIM-side implementations are still authoring tool-specific and ad hoc, limiting reuse, evaluation, and workflow portability across environments. This paper addresses this gap by introducing a modular reference architecture for MCP servers that enables API-agnostic, isolated and reproducible agentic BIM interactions. From a systematic analysis of recurring capabilities in recent literature, we derive a core set of requirements. These inform a microservice architecture centred on an explicit adapter contract that decouples the MCP interface from specific BIM-APIs. A prototype implementation using IfcOpenShell demonstrates feasibility across common modification and generation tasks. Evaluation across representative scenarios shows that the architecture enables reliable workflows, reduces coupling, and provides a reusable foundation for systematic research.
Comments: Accepted at the GNI Symposium on Artificial Intelligence for the Built World (Technical University of Munich, May 18--20, 2026)
Subjects:
Other Computer Science (cs.OH); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Cite as: arXiv:2601.00809 [cs.OH]
(or arXiv:2601.00809v2 [cs.OH] for this version)
https://doi.org/10.48550/arXiv.2601.00809
arXiv-issued DOI via DataCite
Submission history
From: Tobias Heimig-Elschner [view email] [v1] Sun, 21 Dec 2025 23:12:26 UTC (4,437 KB) [v2] Sat, 28 Mar 2026 22:25:06 UTC (4,438 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxiv[P] Federated Adversarial Learning
I'm a CS/ML engineering student in my 4th year, and I need help for a project I recently got assigned to (as an "end of the year" project). I am familiar with basic ML stuff, deep learning etc and made a few "standard" projects here and there about it... However I found this topic a bit challenging since it combines both FL and the adversarial aspect, I did a lot of research especially on arxiv to try to understand the gist of it. HOWEVER, the subject is essentially "federated adversarial learning" and I am struggeling to understand what I'm supposed to do. (I found ONE article on arxiv but ngl i find it very hard to understand as it is very theoritical.) I talked to my teachers/supervisors about this but they said "do whatever you want" which doesn't help AT ALL..... They did provide a da
[R] The SPORE Clustering Algorithm
https://preview.redd.it/di99yw56tksg1.png?width=992 format=png auto=webp s=8828c9459dcf8f8541718e4d7a9fae52bfc0b95a I created a clustering algorithm SPORE ( S keleton P ropagation O ver R ecalibrating E xpansions) for general purpose clustering, intended to handle nonconvex, convex, low-d and high-d data alike. I've benchmarked it on 28 datasets from 2-784D and released a Python package as well as a research paper . Short Summary SPORE is a density-variance-based method meant for general clustering in arbitrary geometries and dimensionalities. After building a knn graph, it has 2 phases. Phase 1 (Expansion) uses BFS with a continually refined density-variance constraint to expand initial clusters in a way that adapts to their specific scale. The aim is to capture inner, well-shielded skele
[D] Does seeing the identify of authors influence your scoring?
Let's be honest, at some stage of the review process. A lot of us have gotten bored and tried to Google the papers we are reviewing. And sometimes those papers might have already been uploaded onto arXiv with the identity of the authors. Which we then tried to look them up. As a first-time reviewer, I noticed the top 2 papers in my batch happened to be the only papers in my batch that is on arXiv. I am trying to work out if revealing the author's identity had influenced my decision. Or it's just a coincidence. submitted by /u/d_edge_sword [link] [comments]
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers
[D] Does seeing the identify of authors influence your scoring?
Let's be honest, at some stage of the review process. A lot of us have gotten bored and tried to Google the papers we are reviewing. And sometimes those papers might have already been uploaded onto arXiv with the identity of the authors. Which we then tried to look them up. As a first-time reviewer, I noticed the top 2 papers in my batch happened to be the only papers in my batch that is on arXiv. I am trying to work out if revealing the author's identity had influenced my decision. Or it's just a coincidence. submitted by /u/d_edge_sword [link] [comments]

Leveraging Commit Size Context and Hyper Co-Change Graph Centralities for Defect Prediction
arXiv:2604.01132v1 Announce Type: new Abstract: File-level defect prediction models traditionally rely on product and process metrics. While process metrics effectively complement product metrics, they often overlook commit size the number of files changed per commit despite its strong association with software quality. Network centrality measures on dependency graphs have also proven to be valuable product level indicators. Motivated by this, we first redefine process metrics as commit size aware process metric vectors, transforming conventional scalar measures into 100 dimensional profiles that capture the distribution of changes across commit size strata. We then model change history as a hyper co change graph, where hyperedges naturally encode commit-size semantics. Vector centralities

Detecting Call Graph Unsoundness without Ground Truth
arXiv:2604.00885v1 Announce Type: new Abstract: Java static analysis frameworks are commonly compared under the assumption that analysis algorithms and configurations compose monotonically and yield semantically comparable results across tools. In this work, we show that this assumption is fundamentally flawed. We present a large-scale empirical study of semantic consistency within and across four widely used Java static analysis frameworks: Soot, SootUp, WALA, and Doop. Using precision partial orders over analysis algorithms and configurations, we systematically identify violations where increased precision introduces new call-graph edges or amplifies inconsistencies. Our results reveal three key findings. First, algorithmic precision orders frequently break within frameworks due to moder

Containing the Reproducibility Gap: Automated Repository-Level Containerization for Scholarly Jupyter Notebooks
arXiv:2604.01072v1 Announce Type: new Abstract: Computational reproducibility is fundamental to trustworthy science, yet remains difficult to achieve in practice across various research workflows, including Jupyter notebooks published alongside scholarly articles. Environment drift, undocumented dependencies and implicit execution assumptions frequently prevent independent re-execution of published research. Despite existing reproducibility guidelines, scalable and systematic infrastructure for automated assessment remains limited. We present an automated, web-oriented reproducibility engineering pipeline that reconstructs and evaluates repository-level execution environments for scholarly notebooks. The system performs dependency inference, automated container generation, and isolated exe

Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!