
Two-Pass LLM Processing: When Single-Pass Classification Isn't Enough

Dev.to AI · by Diven Rastdus · April 5, 2026 · 7 min read


Here's a pattern I keep running into: you have a batch of items (messages, tickets, documents, transactions) and you need to classify each one. The obvious approach is one LLM call per item. It works fine until it doesn't.

The failure mode is subtle. Each item gets classified correctly in isolation. But the relationships between items -- escalation patterns, contradictions, duplicate reports of the same issue -- are invisible to a single-pass classifier because it never sees the full picture.

The problem

Say you're triaging a CEO's morning messages. Three Slack messages from the same person:

  • 9:15 AM: "API migration 60% done, no blockers"

  • 10:30 AM: "Found an issue with payment endpoints, investigating"

  • 11:45 AM: "3% of live payments failing, need rollback/hotfix decision within an hour"

A single-pass classifier looks at message #1 and says: "FYI, low priority." It's correct -- in isolation.

But a human reading all three messages sees an escalation from "no blockers" to "production incident requiring executive decision." The classification of message #1 should change in light of messages #2 and #3, because it's the start of a thread that ended in a crisis.

Single-pass classification can't do this. It processes each item without context from the others.

The two-pass architecture

The fix is straightforward: run the LLM twice.

Pass 1 -- Independent classification. Process each item individually. Get per-item labels: category, urgency, metadata. This is your standard classification pass. It runs fast because items can be processed in parallel.

```typescript
interface Pass1Result {
  itemId: string;
  category: "urgent" | "delegate" | "fyi" | "ignore";
  urgency: "critical" | "high" | "medium" | "low";
  summary: string;
  suggestedAction: string;
}

async function pass1(items: Item[]): Promise<Pass1Result[]> {
  // Each item classified independently
  const results = await Promise.all(
    items.map(item => classifyItem(item))
  );
  return results;
}
```


Pass 2 -- Cross-reference and synthesis. Feed ALL items plus ALL Pass 1 classifications back into the LLM. Ask it to find relationships, patterns, and adjust classifications based on the full picture.

```typescript
interface Pass2Result {
  threads: Thread[]; // Related item clusters
  flags: Flag[]; // Cross-item alerts
  adjustedClassifications: Map<string, Pass1Result>; // keyed by itemId
  briefing: string; // Synthesized summary
}

async function pass2(
  items: Item[],
  pass1Results: Pass1Result[]
): Promise<Pass2Result> {
  const prompt = buildCrossReferencePrompt(items, pass1Results);
  return await llm.generate(prompt);
}
```


The key insight: Pass 2 sees what Pass 1 cannot. It catches:

  • Escalation threads: Items from the same source that increase in severity

  • Contradictions: One person says X, then reverses to Y

  • Scheduling conflicts: Two items reference the same time slot

  • Resolved issues: Problem reported, then resolved -- no action needed

  • Deal changes: Financial figures that shift between messages
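
To make this concrete, here's a hypothetical Pass 2 output for the three-message example above. The overall shape follows the `Pass2Result` sketch, but the thread and flag field names here are illustrative assumptions, not a fixed schema:

```typescript
// Hypothetical Pass 2 output for the three CEO messages above.
// Field names inside threads/flags/reclassifications are illustrative.
const pass2Output = {
  threads: [
    {
      itemIds: ["msg-1", "msg-2", "msg-3"],
      topic: "API migration / payment endpoints",
      escalates: true, // "no blockers" -> live payment failures
    },
  ],
  flags: [{ type: "escalation", itemIds: ["msg-1", "msg-2", "msg-3"] }],
  reclassifications: [
    {
      itemId: "msg-1",
      from: { category: "fyi", urgency: "low" },
      to: { category: "urgent", urgency: "high" },
      reason: "Start of a thread that escalated to a production incident",
    },
  ],
  briefing:
    "Payment incident in progress: thread escalated from routine update to 3% of live payments failing; rollback/hotfix decision needed within the hour.",
};

console.log(pass2Output.reclassifications[0].to.category); // urgent
```

Note that message #1 itself never changed; only the context around it did. That's exactly the information a single-pass classifier can't see.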

Why not just do one big pass?

You might think: "Why not feed everything into one prompt and classify + cross-reference in a single call?"

Three reasons:

  1. Classification quality degrades in long prompts. When you ask an LLM to do two things at once (classify each item AND find cross-item patterns), it tends to do both worse than if you split them. The model's attention is divided.

  2. Structured output is more reliable for simple tasks. Pass 1 returns a clean, typed classification per item. No ambiguity, no free-text interpretation needed. Pass 2 can then assume correct per-item labels and focus entirely on relationships.

  3. You can parallelize Pass 1. If you have 50 items, you can fire 50 parallel classification calls. Single-pass would need to fit all 50 items in one prompt, hitting context limits and increasing latency.
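
In practice you usually want a concurrency cap rather than firing all 50 calls at once, to stay under provider rate limits. A minimal sketch, assuming a stubbed `classifyItem` in place of the real LLM call (`pass1WithLimit` is a name introduced here for illustration):

```typescript
interface Item { id: string; body: string; }
interface Pass1Result { itemId: string; category: string; }

// Stub standing in for the real LLM classification call.
async function classifyItem(item: Item): Promise<Pass1Result> {
  return { itemId: item.id, category: "fyi" };
}

// Run classifications in parallel, with at most `limit` calls in flight.
async function pass1WithLimit(items: Item[], limit: number): Promise<Pass1Result[]> {
  const results: Pass1Result[] = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // synchronous claim, safe in single-threaded JS
      results[i] = await classifyItem(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}

const items: Item[] = Array.from({ length: 10 }, (_, i) => ({ id: `m${i}`, body: "..." }));
const results = await pass1WithLimit(items, 4);
console.log(results.length); // 10
```

The worker-pool shape keeps results in input order, which matters later when Pass 2 zips items with their classifications by index.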

The prompt structure matters

For Pass 1, keep it focused:

```typescript
function buildPass1Prompt(item: Item): string {
  return `Classify this message.

Channel: ${item.channel}
From: ${item.from}
Time: ${item.timestamp}
Body: ${item.body}

Respond with JSON:
{
  "category": "urgent" | "delegate" | "fyi" | "ignore",
  "urgency": "critical" | "high" | "medium" | "low",
  "summary": "one sentence",
  "suggestedAction": "what the recipient should do"
}`;
}
```


For Pass 2, structure the input so the model can scan efficiently:

```typescript
function buildPass2Prompt(
  items: Item[],
  classifications: Pass1Result[]
): string {
  const combined = items.map((item, i) => ({
    ...item,
    classification: classifications[i],
  }));

  return `You have ${items.length} classified messages. Review ALL messages together and identify:

1. THREADS: Groups of messages from the same person about the same topic. Flag if a thread escalates in severity.

2. CONTRADICTIONS: Cases where someone says X then reverses to Y.

3. SCHEDULING CONFLICTS: Multiple items referencing overlapping times.

4. RESOLVED ITEMS: Problems that were reported then resolved (these need no action, even if Pass 1 flagged them as urgent).

5. RECLASSIFICATIONS: Any items where the Pass 1 classification should change based on cross-item context.

Messages and their classifications:
${JSON.stringify(combined, null, 2)}

Respond with JSON matching this schema: { threads, flags, reclassifications, briefing }`;
}
```


When to use two-pass

This pattern adds latency (two sequential LLM calls instead of one). It's worth it when:

  • Items have relationships. Messages in a thread, tickets about the same system, transactions from the same account. If items are truly independent, single-pass is fine.

  • Cross-item patterns have high consequences. Missing an escalation thread or a scheduling conflict costs more than the extra 2-3 seconds of latency.

  • You need both per-item labels AND a synthesis. Dashboards that show individual items with classifications AND a summary view.

  • The item count fits in a single context window. Pass 2 needs all items at once. If you have 10,000 items, you'll need a different approach (clustering, then two-pass within clusters).

When it's overkill:

  • Items are truly independent (product reviews, standalone support tickets from different customers)

  • You only need per-item labels, not cross-item analysis

  • Latency is more important than accuracy

Production considerations

Caching. A Pass 1 result depends only on the item itself, so it's safe to cache. If the same item appears in multiple batches, reuse the cached Pass 1 result instead of re-classifying.
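
A minimal cache sketch, keyed on message content rather than object identity so the same message re-sent in a later batch hits the cache (the `classifyCached` wrapper and the stubbed `classifyItem` are illustrative names, not from the article):

```typescript
import { createHash } from "node:crypto";

interface Item { channel: string; from: string; body: string; }
interface Pass1Result { category: string; summary: string; }

// Stub standing in for the real LLM call; counts invocations.
let llmCalls = 0;
async function classifyItem(item: Item): Promise<Pass1Result> {
  llmCalls++;
  return { category: "fyi", summary: item.body.slice(0, 40) };
}

const cache = new Map<string, Pass1Result>();

// Key on content so two distinct objects with the same message collide.
function cacheKey(item: Item): string {
  return createHash("sha256")
    .update(`${item.channel}|${item.from}|${item.body}`)
    .digest("hex");
}

async function classifyCached(item: Item): Promise<Pass1Result> {
  const key = cacheKey(item);
  const hit = cache.get(key);
  if (hit) return hit;
  const result = await classifyItem(item);
  cache.set(key, result);
  return result;
}

const msg = { channel: "slack", from: "cto", body: "API migration 60% done" };
await classifyCached(msg);
await classifyCached({ ...msg }); // same content, different object: cache hit
console.log(llmCalls); // 1
```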

Error handling. If Pass 2 fails, you still have valid Pass 1 classifications. Degrade gracefully to single-pass results rather than failing entirely.
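
One way to sketch that degradation, assuming a `triage` wrapper (a name introduced here) around the two passes; the stubbed `pass2` always throws so the fallback path is exercised:

```typescript
interface Pass1Result { itemId: string; category: string; }
interface Pass2Result { reclassifications: unknown[]; briefing: string; }

// Stub Pass 2 that fails, to exercise the fallback path.
async function pass2(_results: Pass1Result[]): Promise<Pass2Result> {
  throw new Error("LLM timeout");
}

interface TriageOutput {
  classifications: Pass1Result[];
  synthesis: Pass2Result | null; // null = degraded to single-pass
}

async function triage(pass1Results: Pass1Result[]): Promise<TriageOutput> {
  try {
    const synthesis = await pass2(pass1Results);
    return { classifications: pass1Results, synthesis };
  } catch (err) {
    // Pass 2 failed: keep the valid per-item labels, skip the synthesis.
    console.warn("pass2 failed, degrading to single-pass:", err);
    return { classifications: pass1Results, synthesis: null };
  }
}

const out = await triage([{ itemId: "m1", category: "urgent" }]);
console.log(out.synthesis); // null
console.log(out.classifications.length); // 1
```

The caller can branch on `synthesis === null` to render the single-pass view instead of the briefing.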

Cost. Pass 2 uses more tokens (all items + all classifications in one prompt). Use a fast, cheap model for Pass 1 (Gemini Flash, Haiku) and a more capable model for Pass 2 if needed. Often the same fast model works for both.

Structured output. Both Gemini (responseMimeType: "application/json") and OpenAI (response_format: { type: "json_object" }) support forcing JSON output. Use it. Parsing free-text LLM output is fragile.

```typescript
const result = await model.generateContent({
  contents: [{ role: "user", parts: [{ text: prompt }] }],
  generationConfig: {
    responseMimeType: "application/json",
    temperature: 0.1, // Low temp for classification
  },
});
```

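
Even with JSON mode on, it's worth validating the parsed object against the expected shape before trusting it, since the model can still emit valid JSON with wrong fields. A hedged sketch of a validator for the Pass 1 schema (`parsePass1` is a name introduced here; a library like Zod would do the same job more thoroughly):

```typescript
interface Pass1Result {
  category: "urgent" | "delegate" | "fyi" | "ignore";
  urgency: "critical" | "high" | "medium" | "low";
  summary: string;
  suggestedAction: string;
}

const CATEGORIES = ["urgent", "delegate", "fyi", "ignore"];
const URGENCIES = ["critical", "high", "medium", "low"];

// Parse the model's JSON and check it matches the schema; return null on
// any mismatch so the caller can retry or fall back.
function parsePass1(raw: string): Pass1Result | null {
  let obj: any;
  try {
    obj = JSON.parse(raw);
  } catch {
    return null;
  }
  if (
    !CATEGORIES.includes(obj?.category) ||
    !URGENCIES.includes(obj?.urgency) ||
    typeof obj?.summary !== "string" ||
    typeof obj?.suggestedAction !== "string"
  ) {
    return null;
  }
  return obj as Pass1Result;
}

const good = parsePass1(
  '{"category":"urgent","urgency":"critical","summary":"Payments failing","suggestedAction":"Decide rollback vs hotfix"}'
);
const bad = parsePass1('{"category":"panic"}');
console.log(good?.category, bad); // urgent null
```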

The general principle

Two-pass processing is a specific case of a broader pattern: separate extraction from synthesis. Pass 1 extracts structured data from each item. Pass 2 synthesizes across the extracted data.

This applies beyond text classification:

  • Code review: Pass 1 annotates each file, Pass 2 finds cross-file issues

  • Financial analysis: Pass 1 categorizes transactions, Pass 2 finds patterns

  • Research synthesis: Pass 1 summarizes each paper, Pass 2 identifies themes and contradictions

The architectural insight is that LLMs are better at focused tasks than multi-objective tasks. Splitting the work into two focused passes produces better results than one ambitious pass, even though it uses more compute.

I build production AI systems. If you're designing LLM pipelines, I'm at astraedus.dev.
