
Towards Robustness: A Critique of Current Vector Database Assessments

arXiv cs.DB [Submitted on 1 Jul 2025 (v1), last revised 2 Apr 2026 (this version, v2)] April 3, 2026


Abstract: Vector databases are critical infrastructure in AI systems, and average recall is the dominant metric for their evaluation. Both users and researchers rely on it to choose and optimize their systems. We show that relying on average recall is problematic. It hides variability across queries, allowing systems with strong mean performance to underperform significantly on hard queries. These tail cases confuse users and can lead to failure in downstream applications such as RAG. We argue that robustness (consistently achieving acceptable recall across queries) is crucial to vector database evaluation. We propose Robustness-$\delta$@K, a new metric that captures the fraction of queries with recall above a threshold $\delta$. This metric offers a deeper view of the recall distribution, helps select vector indexes according to application needs, and guides the optimization of tail performance. We integrate Robustness-$\delta$@K into existing benchmarks and evaluate mainstream vector indexes, revealing significant robustness differences. More robust vector indexes yield better application performance, even with the same average recall. We also identify design factors that influence robustness, providing guidance for improving real-world performance.
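The metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `recall_at_k` and `robustness_at_k`, the inclusive comparison `recall >= delta` (the abstract only says "above a threshold"), and the toy recall values are all assumptions made for illustration.

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[int], ground_truth: Sequence[int], k: int) -> float:
    """Recall@K for one query: share of the true top-k ids found in the first k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth[:k])
    if not truth:
        raise ValueError("ground_truth must not be empty")
    hits = len(truth.intersection(retrieved[:k]))
    return hits / len(truth)


def robustness_at_k(recalls: Sequence[float], delta: float) -> float:
    """Fraction of queries whose per-query recall reaches delta.

    Assumption: the comparison is inclusive (recall >= delta).
    """
    if not recalls:
        raise ValueError("recalls must not be empty")
    return sum(1 for r in recalls if r >= delta) / len(recalls)


def mean(values: Sequence[float]) -> float:
    """Plain arithmetic mean, i.e. the usual 'average recall'."""
    return sum(values) / len(values)


# Hypothetical per-query recalls for two indexes with identical average recall.
index_a = [0.75, 0.75, 0.75, 0.75]  # every query is served reasonably well
index_b = [1.0, 1.0, 1.0, 0.0]      # one query fails completely

print("mean recall:", mean(index_a), mean(index_b))
print("robustness at delta=0.5:",
      robustness_at_k(index_a, 0.5), robustness_at_k(index_b, 0.5))
```

In this toy data both indexes have the same average recall, yet only the first keeps every query at or above the threshold, which is the kind of tail behavior the abstract says average recall hides.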

Subjects: Databases (cs.DB)

Cite as: arXiv:2507.00379 [cs.DB]

(or arXiv:2507.00379v2 [cs.DB] for this version)

https://doi.org/10.48550/arXiv.2507.00379

arXiv-issued DOI via DataCite

Submission history

From: Zikai Wang

[v1] Tue, 1 Jul 2025 02:27:57 UTC (255 KB)

[v2] Thu, 2 Apr 2026 00:55:48 UTC (203 KB)

Original source

arXiv cs.DB

https://arxiv.org/abs/2507.00379
