How We Finally Solved Test Discovery
Yesterday I wrote about why test file discovery is still unsolved (https://gitauto.ai/blog/why-our-test-writing-agent-wasted-12-iterations-reading-files). Three approaches (stem matching, content grepping, hybrid), each failing differently. The hybrid worked best but had a broken ranking function - flat scoring that gave src/ the same weight as src/pages/checkout/. Today it's solved.
The Problem With Flat Scoring
The March 30 post ended with this bug: +30 points for any shared parent directory. One shared path component got the same bonus as three. With 3 synthetic inputs, other factors dominated. With 29 real file paths, unrelated test files ranked above relevant ones.
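To make the failure concrete, here's a minimal sketch of flat scoring. The function name, constant, and exact matching logic are my reconstruction, not the actual code - the point is just that any shared parent directory earned the same fixed bonus:

```python
from pathlib import PurePath

SHARED_PARENT_BONUS = 30  # hypothetical constant standing in for the old +30

def flat_score(impl_path: str, test_path: str) -> int:
    """Old-style flat scoring: any shared parent directory earns the
    same fixed bonus, no matter how deep the overlap goes."""
    impl_parts = PurePath(impl_path).parent.parts
    test_parts = PurePath(test_path).parent.parts
    shares_any_parent = any(part in test_parts for part in impl_parts)
    return SHARED_PARENT_BONUS if shares_any_parent else 0

# One shared component scores the same as three:
flat_score("src/pages/checkout/Button.tsx", "src/utils/math.test.ts")           # 30
flat_score("src/pages/checkout/Button.tsx", "src/pages/checkout/Button.test.tsx")  # 30
```

With 29 real paths, that tie is exactly what let unrelated test files rank above relevant ones - the shared-parent signal contributed nothing to the ordering.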
The fix wasn't tweaking the constant. It was replacing the scoring model entirely.
Five Tiers, Not Points
Instead of adding up weighted scores, we rank by structural relationship. Higher tiers always win over lower ones, regardless of path depth or name similarity.
Tier 1 - Colocated tests. Same directory, same stem with a test suffix. Button.tsx and Button.test.tsx side by side. This is the strongest signal possible.
Tier 2 - Same-directory content match. A test file in the same directory whose source code imports the implementation file.
Tier 3 - Path-based match. The test file's path contains the implementation stem. tests/test_client.py for services/client.py. The classic mirror-tree convention.
Tier 4 - Content grep match. A test file anywhere in the repo references the implementation file in its source code.
Tier 5 - Parent directory content match. A test file in a parent directory that references the impl. Weakest signal, but still a real connection.
The key insight: tiers are ordinal, not additive. A Tier 1 match always outranks a Tier 3 match. No combination of bonus points can promote a distant test above a colocated one.
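A rough Python sketch of the tiering. The stem and reference checks are simplified stand-ins for the real implementation (the "references" check here is a naive stem-in-source test, not a real import resolver), and I'm assuming Tier 4 excludes ancestor directories so Tier 5 stays reachable:

```python
from pathlib import PurePath

def tier(impl_path: str, test_path: str, test_source: str) -> int:
    """Classify a candidate test file into tiers 1-5 (lower wins);
    returns 6 when no structural relationship exists."""
    impl, test = PurePath(impl_path), PurePath(test_path)
    stem = impl.stem
    same_dir = impl.parent == test.parent
    colocated_stems = {f"{stem}.test", f"{stem}.spec", f"test_{stem}", f"{stem}_test"}
    references = stem in test_source          # naive content-grep stand-in
    in_ancestor = test.parent in impl.parents  # test sits above the impl

    if same_dir and test.stem in colocated_stems:
        return 1  # colocated: Button.tsx next to Button.test.tsx
    if same_dir and references:
        return 2  # same directory, imports the implementation
    if stem in test_path:
        return 3  # mirror tree: tests/test_client.py for services/client.py
    if references and not in_ancestor:
        return 4  # content match anywhere else in the repo
    if references and in_ancestor:
        return 5  # weakest: content match from a parent directory
    return 6

# Rank candidates by tier - ordinal comparison, no point totals to game.
best = min(
    ["tests/test_client.py", "tests/test_misc.py"],
    key=lambda t: tier("services/client.py", t, "from services.client import Client"),
)
```

Because the comparison is ordinal, a deep directory path or a lucky name collision in a lower tier can never outrank a colocated match.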
Content-Aware Matching
Path matching alone can't handle barrel re-exports. When a test imports from '@/pages/checkout' and that resolves to index.tsx, the string "index" never appears in the import statement. Path matching sees nothing.
Content-aware matching reads the test file and greps for references to the implementation. If a test file contains import { CheckoutPage } from './index' or require('./checkout'), the content grep catches it. Tiers 2, 4, and 5 are the content tiers that fill gaps path-only matching leaves open.
Single-Source Patterns
Every language has its own test naming convention:
- .test.ts, .test.tsx - JavaScript/TypeScript (Jest, Vitest)
- .spec.ts, .spec.tsx - Angular, Cypress, Playwright
- test_*.py - Python (pytest)
- *_test.go - Go
- *Test.java, *Test.kt - Java/Kotlin (JUnit)
- *_spec.rb - Ruby (RSpec)
- *.spec.js - JavaScript (Mocha, Jasmine)
All of these are defined once and imported everywhere. Before this change, three different functions each maintained their own pattern list - slightly different, each missing cases the others caught.
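Single-sourcing can be as simple as one shared list compiled into one predicate. The constant and function names here are hypothetical, just to show the shape:

```python
import re

# Hypothetical single source of truth for test-file naming conventions;
# every discovery function imports this instead of keeping its own copy.
TEST_FILE_PATTERNS = [
    r"\.test\.(ts|tsx|js|jsx)$",   # Jest, Vitest
    r"\.spec\.(ts|tsx|js|jsx)$",   # Angular, Cypress, Playwright, Mocha, Jasmine
    r"(^|/)test_[^/]+\.py$",       # pytest
    r"_test\.go$",                 # Go
    r"Test\.(java|kt)$",           # JUnit
    r"_spec\.rb$",                 # RSpec
]

_TEST_FILE_RE = re.compile("|".join(f"(?:{p})" for p in TEST_FILE_PATTERNS))

def is_test_file(path: str) -> bool:
    """Single predicate shared by every discovery function."""
    return bool(_TEST_FILE_RE.search(path))
```

One list, one compile, one predicate - a pattern added for a new language is immediately visible to every caller, instead of silently missing from two of three copies.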
The Takeaway
Test file discovery looks like a string matching problem. It's actually a ranking problem with structural priors. Flat scoring collapses structure into numbers and loses information. Tiered ranking preserves the structural relationship and makes the algorithm's priorities explicit and debuggable. And the only way to validate ranking is against real data at real scale - not 3 curated inputs that any algorithm can pass.