
Beyond Static RAG: Using 1958 Biochemistry to Beat Multi-Hop Retrieval by 14%

DEV Community, by Emil, April 1, 2026, 2 min read


Standard Retrieval-Augmented Generation (RAG) often falls short on complex, multi-hop questions because it relies on static "lock and key" query matching. If the information needed to answer a query is semantically distant from the original text, standard vector search simply won't find it.

We've developed Induced-Fit Retrieval (IFR), a dynamic graph traversal approach that mutates the query vector at every step to discover semantically distant but logically connected information.

The Core Results

We ran our prototype through a test suite of 30 queries across multiple graph sizes, up to 5.2 million atoms.

14.3% higher nDCG@10 compared to a competitive RAG-rerank baseline.

15% Multi-hop Hit@20 in scenarios where traditional RAG methods scored 0%.

O(1) Latency Scaling: Query latency stays near 10 ms whether the graph holds 100 atoms or 5.2 million.
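For readers unfamiliar with the two ranking metrics above, here is a minimal sketch of how nDCG@k and Hit@k are commonly computed. The function names and the `relevance` dictionary format are assumptions made for illustration; the post does not say how the benchmark harness computes these scores.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevances."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k: DCG of the ranking divided by the DCG of an ideal ranking.

    `relevance` maps a document id to its graded relevance (assumed format).
    """
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def hit_at_k(ranked_ids, target_ids, k=20):
    """Hit@k: 1.0 if any target appears in the top k results, else 0.0."""
    return 1.0 if any(doc_id in target_ids for doc_id in ranked_ids[:k]) else 0.0
```

A multi-hop Hit@20 of 0% therefore means that no target document appeared in the top 20 results for any of the multi-hop queries.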

Why Biochemistry?

The system is inspired by Daniel Koshland’s 1958 "induced fit" model. In biology, enzymes change shape upon encountering a substrate to improve binding.

IFR applies this to Information Retrieval: instead of a static query vector, the vector mutates at each hop based on the visited node's embedding. This allows the query to follow the "curved manifolds" of high-dimensional embedding space that a fixed vector cannot reach.
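The idea of a query that mutates at each hop can be sketched as a greedy graph walk. This is an illustration under stated assumptions, not the implementation in the repo: the linear blend, the `alpha` mixing weight, the greedy neighbor choice, and all names (`induced_fit_walk`, `embeddings`, `neighbors`) are hypothetical, since the post does not specify the actual update rule.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def induced_fit_walk(query, start, embeddings, neighbors, alpha=0.7, max_hops=3):
    """Greedy walk that blends the query toward each visited node's embedding.

    alpha (assumed) is the share of the current query kept at each hop.
    """
    current_query = normalize(query)
    node, path = start, [start]
    for _ in range(max_hops):
        # Mutate the query using the embedding of the node just visited.
        node_vec = normalize(embeddings[node])
        current_query = normalize(
            [alpha * q + (1 - alpha) * n for q, n in zip(current_query, node_vec)]
        )
        # Move to the unvisited neighbor closest to the mutated query.
        candidates = [n for n in neighbors.get(node, []) if n not in path]
        if not candidates:
            break
        node = max(candidates, key=lambda n: cosine(current_query, embeddings[n]))
        path.append(node)
    return path, current_query

# Toy graph: C is orthogonal to the query, but reachable through B.
embeddings = {"A": [1.0, 0.0], "B": [0.7, 0.7], "C": [0.0, 1.0], "D": [-1.0, 0.0]}
neighbors = {"A": ["B", "D"], "B": ["C"], "C": []}
path, final_query = induced_fit_walk([1.0, 0.0], "A", embeddings, neighbors)
```

In this toy graph, a static query `[1.0, 0.0]` has zero cosine similarity with node `C`, yet the walk reaches it through `B` because the query has already shifted toward the nodes it visited.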

Lessons from the Data

Transparency is key to research, so we are also sharing our failures:

Catastrophic Drift: 67% of our failures occurred because the query mutated too aggressively, losing its original intent.

The Solution: v2 will implement an "Alpha Floor" to preserve at least 50% of the original query signal at all times.
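One possible reading of the "Alpha Floor" is a lower bound on the weight given to the original query whenever the query is blended with a visited node's embedding. The sketch below assumes that interpretation; the function name, the linear blend, and the `alpha_floor` parameter are hypothetical, and v2 may implement the idea differently.

```python
def blend_with_alpha_floor(original, visited, alpha, alpha_floor=0.5):
    """Blend the original query with a visited node's embedding.

    The weight on the original query never drops below alpha_floor (assumed
    to be 0.5, matching the "at least 50%" figure in the post).
    """
    effective_alpha = max(alpha, alpha_floor)
    return [
        effective_alpha * o + (1.0 - effective_alpha) * v
        for o, v in zip(original, visited)
    ]

# An aggressive request (alpha=0.2) is clamped to the floor of 0.5.
clamped = blend_with_alpha_floor([1.0, 0.0], [0.0, 1.0], alpha=0.2)
# A conservative request (alpha=0.8) passes through unchanged.
unclamped = blend_with_alpha_floor([1.0, 0.0], [0.0, 1.0], alpha=0.8)
```

Blending against the original query rather than the previous hop's query is a deliberate choice in this sketch: it bounds the drift no matter how many hops the traversal takes.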

We have open-sourced the prototype, our 18 raw JSON result logs, ablation studies, and full technical reports.

Check out the repo on GitHub: https://github.com/emil-celestix/celestix-ifr

Original source: DEV Community, https://dev.to/emilcelestix/beyond-static-rag-using-1958-biochemistry-to-beat-multi-hop-retrieval-by-14-4hfn
