How I built an AI that reads bank contracts the way bankers do (not the way customers do)
The problem started in 2009. I was a banker. I watched loan officers use internal scoring grids that customers never saw. The information asymmetry wasn't illegal — it was just never shared.
Fifteen years later, the asymmetry got worse. Banks now run LLMs on customer data before any human reviews it. The customer still signs without understanding what they're signing.
So I built the reverse.
The core insight: bankers read contracts differently than customers
A customer reads a loan contract linearly — page by page, looking for the monthly payment.
A banker reads it dimensionally — simultaneously scanning for:
- **Covenant triggers** (what makes the loan callable)
- **Cross-default clauses** (what other contracts could trigger this one)
- **Margin ratchets** (how the rate changes under specific conditions)
- **Termination asymmetries** (who can exit and under what conditions)
These aren't hidden in fine print. They're just never explained. An LLM trained to scan for these patterns — in the order a banker would — surfaces what a linear read misses.
The architecture
The system runs four specialized agents in parallel rather than one general-purpose model. This is borrowed from the 4D analytical framework we use at WASA Confidence — the principle being that parallel agents surfacing contradictions are more reliable than a single agent producing a confident answer.
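One way to wire that up is sketched below, assuming Python's asyncio for the concurrency. The agent bodies are placeholders; in the real system each would wrap an LLM call with its own system prompt, and all function and field names here are mine, not the production system's.

```python
import asyncio

# Placeholder agent bodies; in the real system each wraps an LLM call.
async def clause_extractor(doc):
    return {"clauses": []}  # structured clause map (Agent 1)

async def risk_scanner(clause_map):
    return {"flags": []}  # per-clause risk scores (Agent 2)

async def cross_contract_analyzer(clause_map, other_contracts):
    return {"links": []}  # cross-contract exposure (Agent 3)

async def contradiction_detector(upstream_outputs, intake_form):
    return {"conflicts": []}  # belief-vs-contract mismatches (Agent 4)

async def analyze(doc, other_contracts, intake_form):
    # Agent 1 runs first: everything downstream consumes its clause map.
    clause_map = await clause_extractor(doc)
    # Agents 2 and 3 both read the clause map and can run concurrently.
    risks, links = await asyncio.gather(
        risk_scanner(clause_map),
        cross_contract_analyzer(clause_map, other_contracts),
    )
    # Agent 4 looks for disagreements across everything produced so far.
    return await contradiction_detector([clause_map, risks, links], intake_form)

report = asyncio.run(analyze("...contract text...", [], {}))
```

The point of the structure is that Agents 2 and 3 never see each other's answers, so any disagreement Agent 4 finds is a genuine signal, not an echo.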
Agent 1 — Clause Extractor
Parses the document structure. Identifies clause types, cross-references, and defined terms. Does not interpret — only maps.
```python
system_prompt = """You are a legal document parser. Your only task is to:
- List every clause by type (payment, covenant, default, termination, rate)
- Flag every cross-reference between clauses
- Flag every defined term that appears in a clause but is defined elsewhere
Output JSON only. No interpretation. No summary."""
```
Agent 2 — Risk Scanner
Takes the clause map from Agent 1. Scores each clause against a library of 340 known adverse patterns — built from 15 years of banking experience.
```python
system_prompt = """You are a senior credit analyst. You receive a structured clause map. For each clause, return:
- risk_level: none / low / medium / high / critical
- pattern_match: which known adverse pattern this matches (if any)
- plain_language: one sentence explaining what this means for the borrower
Do not summarize the document. Score each clause independently."""
```
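For illustration, here is one record Agent 2 might emit for a single clause, with the fields named in the prompt above, plus a guard that rejects malformed records before anything downstream consumes them. The values and the `valid` helper are invented, not the production schema.

```python
# Hypothetical example of one per-clause record from Agent 2.
RISK_LEVELS = ("none", "low", "medium", "high", "critical")

record = {
    "clause_id": "7.2",
    "risk_level": "high",
    "pattern_match": "margin_ratchet_on_rating_downgrade",
    "plain_language": "Your rate rises automatically if your credit rating drops.",
}

def valid(rec):
    # Reject anything that drifts from the schema before Agent 4 sees it.
    return rec.get("risk_level") in RISK_LEVELS and isinstance(
        rec.get("plain_language"), str
    )
```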
Agent 3 — Cross-Contract Analyzer
This is the one customers never run. It takes the flagged clauses and checks them against the borrower's other contracts — insurance policies, supplier agreements, other loans.
A cross-default clause in a bank loan that triggers on a supplier payment delay is invisible if you only read the bank contract.
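A toy version of that check makes the point. The clause and contract structures here are invented for illustration; in the real system the references come from Agent 1's clause map, not string matching.

```python
# Invented example data: one cross-default clause, two other contracts.
loan_clauses = [
    {"id": "9.1", "type": "cross_default", "triggers_on": "supplier_payment_default"},
]
other_contracts = [
    {"name": "supplier agreement", "events": ["supplier_payment_default"]},
    {"name": "property lease", "events": ["early_termination"]},
]

def exposure(clauses, contracts):
    # A cross-default clause only matters if another contract can fire it.
    hits = []
    for clause in clauses:
        for contract in contracts:
            if clause["triggers_on"] in contract["events"]:
                hits.append((clause["id"], contract["name"]))
    return hits
```

Run against only the loan document, `exposure` finds nothing; run against the borrower's full contract set, the supplier-agreement link surfaces.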
Agent 4 — Contradiction Detector
Runs against the outputs of Agents 1, 2 and 3. Looks for contradictions between what the contract says and what the borrower believes (captured in a short intake form).
The contradictions between agents are often more informative than any single agent's output. This is the core principle behind the WASA Confidence 4D methodology — parallel analysis surfaces what sequential analysis misses.
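A stripped-down sketch of the belief-vs-contract comparison. The intake fields and findings below are hypothetical; the real agent works over free-text outputs, not booleans.

```python
# Hypothetical intake form: what the borrower believes they signed.
intake = {
    "rate_is_fixed": True,
    "loan_is_callable": False,
}
# What Agents 1-3 actually found in the contract.
contract_findings = {
    "rate_is_fixed": False,    # margin ratchet flagged by Agent 2
    "loan_is_callable": True,  # covenant trigger flagged by Agent 2
}

def contradictions(beliefs, findings):
    # Every mismatch is a question the borrower should ask before signing.
    return [k for k in beliefs if k in findings and beliefs[k] != findings[k]]
```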
What it finds in practice
Across a sample of 47 SME loan contracts:
| Finding | Count |
| --- | --- |
| Margin ratchet clause the borrower was unaware of | 31 / 47 |
| Cross-default linking the loan to unrelated supplier contracts | 19 / 47 |
| Callable provisions triggered by unmonitored financial ratios | 8 / 47 |
| Termination asymmetries giving the bank unilateral exit rights | 3 / 47 |
None of these were illegal. None were hidden. All were unread.
The technical limit worth being honest about
LLMs hallucinate on numerical conditions. If a covenant says "ratio must remain above 1.35x adjusted EBITDA" — the model will extract the clause correctly but may misinterpret what counts as adjusted EBITDA without the definition section.
The fix: Agent 1 explicitly maps every defined term before Agent 2 interprets any condition. You cannot let a model interpret a covenant before it has resolved every defined term in that covenant.
This sounds obvious. It isn't how most people prompt document analysis.
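A minimal sketch of that ordering constraint, using the EBITDA covenant from above. The inline resolution format is mine, not the production prompt; the idea is only that interpretation refuses to run on a covenant containing a bare defined term.

```python
# Definitions mapped by Agent 1 before any interpretation happens.
definitions = {
    "Adjusted EBITDA": "EBITDA excluding one-off restructuring costs (clause 1.4)",
}
covenant = "ratio must remain above 1.35x Adjusted EBITDA"

def resolve_terms(text, defs):
    # Agent 1's job: attach each definition where the term is used.
    for term, definition in defs.items():
        text = text.replace(term, f"{term} [defined as: {definition}]")
    return text

def interpret(text, defs):
    # Agent 2's guard: refuse any covenant containing an unresolved term.
    for term in defs:
        if term in text and f"{term} [defined as:" not in text:
            raise ValueError(f"unresolved defined term: {term}")
    return text  # only now does the risk scanner see the clause
```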
Where this goes
The same architecture applies to insurance contracts, supplier agreements, and lease terms. Anywhere a professional on one side of the table reads dimensionally and a non-professional on the other side reads linearly.
The full service — contract analysis, banking condition audit, transaction data room — is at mainstreetbrigade.org. The underlying 4D analytical framework is documented at wasaconf.org.
The code above is simplified but the architecture is production. Happy to discuss the prompt engineering for the contradiction detection agent in the comments — that's where most of the interesting edge cases live.