A Tight Expressivity Hierarchy for GNN-Based Entity Resolution in Master Data Management
arXiv:2603.27154v1 Announce Type: cross Abstract: Entity resolution -- identifying database records that refer to the same real-world entity -- is naturally modelled on bipartite graphs connecting entity nodes to their attribute values. Applying a message-passing neural network (MPNN) with all available extensions (reverse message passing, port numbering, ego IDs) incurs unnecessary overhead, since different entity resolution tasks have fundamentally different complexity. For a given matching criterion, what is the cheapest MPNN architecture that provably works? We answer this with a four-theo — Ashwin Ganesan
View PDF HTML (experimental)
Abstract:Entity resolution -- identifying database records that refer to the same real-world entity -- is naturally modelled on bipartite graphs connecting entity nodes to their attribute values. Applying a message-passing neural network (MPNN) with all available extensions (reverse message passing, port numbering, ego IDs) incurs unnecessary overhead, since different entity resolution tasks have fundamentally different complexity. For a given matching criterion, what is the cheapest MPNN architecture that provably works? We answer this with a four-theorem separation theory on typed entity-attribute graphs. We introduce co-reference predicates $\mathrm{Dup}r$ (two same-type entities share at least $r$ attribute values) and the $\ell$-cycle predicate $\mathrm{Cyc}\ell$ for settings with entity-entity edges. For each predicate we prove tight bounds -- constructing graph pairs provably indistinguishable by every MPNN lacking the required adaptation, and exhibiting explicit minimal-depth MPNNs that compute the predicate on all inputs. The central finding is a sharp complexity gap between detecting any shared attribute and detecting multiple shared attributes. The former is purely local, requiring only reverse message passing in two layers. The latter demands cross-attribute identity correlation -- verifying that the same entity appears at several attributes of the target -- a fundamentally non-local requirement needing ego IDs and four layers, even on acyclic bipartite graphs. A similar necessity holds for cycle detection. Together, these results yield a minimal-architecture principle: practitioners can select the cheapest sufficient adaptation set, with a guarantee that no simpler architecture works. Computational validation confirms every prediction.
Subjects:
Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS)
ACM classes: I.2.6; H.2.8; F.2
Cite as: arXiv:2603.27154 [cs.LG]
(or arXiv:2603.27154v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.27154
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Ashwin Ganesan [view email] [v1] Sat, 28 Mar 2026 06:38:03 UTC (52 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxivChatGPT acts as a "cognitive crutch" that weakens memory, new research suggests - PsyPost
<a href="https://news.google.com/rss/articles/CBMiowFBVV95cUxQTC13Zm5WZG9iQmRmZFpXM0ctamxRZ3E5N0ZFRDlIOWhHa2l1eTVmaTBzZFR6VGx6QjR2VEd0SFZoY0l4ZXVPSVF1c3FDTnE4Nk5zanNMWXhyLVpxVVlIUEZhZUFxXzYtQkRnM2E5eTN5M3NYUmJsX09YMTR6dWhLQ2hUSk55S2FJQXV4WFEzVFB4ZzhyS21RUzFoMDJzSXpQQ1pR?oc=5" target="_blank">ChatGPT acts as a "cognitive crutch" that weakens memory, new research suggests</a> <font color="#6f6f6f">PsyPost</font>
Gemini Deep Think: Redefining the Future of Scientific Research - Google DeepMind
<a href="https://news.google.com/rss/articles/CBMipgFBVV95cUxPRmtMZnRYNW04a3Q4b0dSQm9aall0S3BJWFFOczQ3dmdfX3cyR1plYlotZHg5ekhlZ2s3cUd6Y1pyT3lkVEJrV1V0c0NWVlBQNHlMbXlEbXpTYWlSVUZHVllYZWdSb2RMU2JTelVGMXBEckZSdWt5VUs1d24zdUVLaExpS0NZMmtpSTRoNDd2MHRZdlBRaWxSWmVTNk0wRWtRQ2NaV2ln?oc=5" target="_blank">Gemini Deep Think: Redefining the Future of Scientific Research</a> <font color="#6f6f6f">Google DeepMind</font>
Alibaba Poaches Google DeepMind Research Scientist For Qwen AI Push - Yahoo Finance
<a href="https://news.google.com/rss/articles/CBMijwFBVV95cUxOYTZwZk0walRzazJQampab1FCM2k4Uy1SYk12UWZraENkUXYzZU9kbnlGTGZJS0pFaTZIUFlKZFkwVnJkRzhKbXhNV3lNdUZpdF8tSU1LMklqcTZlUDZERDZ3VzdWbjNQYUN4T2d2ZkRQT1R1MUc0LXdYNndPQTNzbXBXMXJhb3ZEZE00ZFMtaw?oc=5" target="_blank">Alibaba Poaches Google DeepMind Research Scientist For Qwen AI Push</a> <font color="#6f6f6f">Yahoo Finance</font>
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers
Alibaba Poaches Google DeepMind Research Scientist For Qwen AI Push - Yahoo Finance
<a href="https://news.google.com/rss/articles/CBMijwFBVV95cUxOYTZwZk0walRzazJQampab1FCM2k4Uy1SYk12UWZraENkUXYzZU9kbnlGTGZJS0pFaTZIUFlKZFkwVnJkRzhKbXhNV3lNdUZpdF8tSU1LMklqcTZlUDZERDZ3VzdWbjNQYUN4T2d2ZkRQT1R1MUc0LXdYNndPQTNzbXBXMXJhb3ZEZE00ZFMtaw?oc=5" target="_blank">Alibaba Poaches Google DeepMind Research Scientist For Qwen AI Push</a> <font color="#6f6f6f">Yahoo Finance</font>
Is AI's visual understanding mostly a 'mirage'? New research suggests so. - Fortune
<a href="https://news.google.com/rss/articles/CBMihgFBVV95cUxORGxTdWF3bnBiU0VaUEVtanJCT1htWVdjTUo3UnJycUxKcl9HU3Q1ODNINW9na1R0aENXXzhGYnc0Qlg3aGFGM2hiTVNFSjBZQ2FPUElZYmVGdzhfU0d5QkR2cDVnSzJBd2Y5WEVMLUJfWHY4YUc5c1I1U1dUQW9TeU56U1JjQQ?oc=5" target="_blank">Is AI's visual understanding mostly a 'mirage'? New research suggests so.</a> <font color="#6f6f6f">Fortune</font>
71% of Businesses Are Invisible to AI - And Most Don't Know It Yet
Search didn't evolve - it got replaced. AI systems don't return links, they return answers. New research shows 71% of businesses are invisible to AI recommendation engines. Brand size doesn't matter. Vanguard scores 16/100 while ProtonMail scores 88. The playbook that won Google doesn't work here. Clarity beats clout. Read All

Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!