I Built a Python Tool to Check If AI Search Engines Can Find Your Website
You spent months tuning your `<title>` tags, chasing backlinks, submitting sitemaps to Google Search Console. Your rankings are solid. Then you ask ChatGPT about your industry — and it cites three of your competitors but not you.
You are not invisible to Google. You are invisible to the AI that is increasingly replacing Google.
This is the problem that Generative Engine Optimization (GEO) solves. And in this post, you will learn what GEO is, why it matters right now, and how to measure and fix your site's AI visibility using an open-source Python tool — in under 10 minutes.
## SEO vs GEO: What's the Difference?
Traditional SEO optimizes for ranking: getting your blue link to appear on page one of Google's results. The signals are well understood — crawlability, backlinks, Core Web Vitals, structured data.
Generative Engine Optimization optimizes for citation: getting an AI model (ChatGPT, Perplexity, Claude, Gemini) to mention, quote, or link to your content when a user asks a relevant question. These models do not return a list of ten blue links. They synthesize an answer — and if your site is not part of that synthesis, you simply do not exist in that response.
The signals are fundamentally different:
| Signal | SEO | GEO |
|---|---|---|
| Primary goal | Rank high in SERPs | Be cited in AI answers |
| Crawler | Googlebot | GPTBot, ClaudeBot, PerplexityBot... |
| Key file | sitemap.xml | llms.txt |
| Schema priority | Breadcrumbs, Products | FAQPage, Article, Organization |
| Content style | Keyword density | Factual claims, statistics, citations |
| Trust signal | Backlinks | Authorship, dates, authoritative quotes |
The research backing this comes from Princeton KDD 2024 and AutoGEO ICLR 2026 — peer-reviewed work showing that specific content and technical signals consistently increase a site's citation rate in large language model responses.
## Meet GEO Optimizer
GEO Optimizer is an open-source Python toolkit (MIT license) that audits your website across all eight GEO signal categories, gives you a 0–100 score, and generates the files you need to fix the gaps.
- 1030 tests, zero external HTTP calls in the test suite
- Based on Princeton KDD 2024 + AutoGEO ICLR 2026 research
- Four CLI commands: `geo audit`, `geo fix`, `geo llms`, `geo schema`
- MCP server for AI-powered IDE integration (Claude Code, Cursor, Windsurf)
- Web demo at geo-optimizer-web.onrender.com
- Current version: v4.0.0-beta.1
## Installation

Requires Python 3.9+.

```bash
pip install geo-optimizer-skill
```
That is the entire installation. Verify it worked:
```bash
geo --version
geo-optimizer-skill 4.0.0b1
```
## Your First Audit

```bash
geo audit --url https://yoursite.com
```
The tool fetches your homepage, robots.txt, and llms.txt, then checks for JSON-LD schema blocks, meta tags, content quality signals, and AI discovery endpoints. The whole audit runs in a few seconds.
A typical output looks like this:
```text
GEO Optimizer — AI Citability Audit
https://yoursite.com

ROBOTS.TXT ─────────────────────────────────────────────────
  GPTBot         MISSING  (OpenAI — ChatGPT training)    critical
  OAI-SearchBot  MISSING  (OpenAI — ChatGPT citations)   critical
  ClaudeBot      allowed
  PerplexityBot  MISSING                                 critical

LLMS.TXT ────────────────────────────────────────────────────
  Not found at https://yoursite.com/llms.txt

SCHEMA JSON-LD ──────────────────────────────────────────────
  WebSite schema   found
  FAQPage schema   missing
  Article schema   missing
  Organization     missing

META TAGS ───────────────────────────────────────────────────
  Title        yoursite.com - Home
  Description  missing
  Canonical    found
  OG tags      found

CONTENT QUALITY ─────────────────────────────────────────────
  Headings        8
  Statistics      0   add numbers + data
  External links  0   add authoritative citations

AI DISCOVERY ────────────────────────────────────────────────
  /.well-known/ai.txt  missing
  /ai/summary.json     missing

──────────────────────────────────────────────────────────────
GEO SCORE  [████████░░░░░░░░░░░░]  41 / 100   FOUNDATION
──────────────────────────────────────────────────────────────

Top recommendations:
  - Add all 24 AI bots to robots.txt (currently blocking ChatGPT)
  - Create llms.txt — biggest single GEO win available
  - Add FAQPage schema for AI answer extraction
  - Add statistics and data references to content
```
Score bands:
| Score | Band | What it means |
|---|---|---|
| 86–100 | Excellent | Optimized for AI citation |
| 68–85 | Good | Solid foundation, tune for specifics |
| 36–67 | Foundation | Gaps exist, AI crawlers partially blocked |
| 0–35 | Critical | Invisible or blocked from most AI engines |
## The 8 Audit Categories Explained
GEO Optimizer evaluates eight signal areas, each weighted by its empirical impact on AI citation rates.
### 1. Robots.txt (18 points)
What it checks: Whether the 24 known AI crawlers are explicitly allowed in your robots.txt. Many sites have a blanket `User-agent: *` rule that technically allows everything — but missing explicit entries for bots like GPTBot, OAI-SearchBot, PerplexityBot, and ClaudeBot can mean those bots apply conservative defaults.
Why it matters: If a bot cannot crawl your site, it cannot index or cite it. This is the single fastest fix available — it takes five minutes and affects everything downstream.
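For reference, explicitly allowing the bots flagged in the sample audit above looks like this — an illustrative subset, not the tool's full list of 24 crawlers:

```text
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
```

`geo fix` generates a complete patch, so you do not need to maintain the bot list by hand.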
### 2. llms.txt (18 points)
What it checks: Whether your site has an /llms.txt file, and whether that file includes a proper H1, blockquote description, structured sections, links to key pages, and a full-text variant (/llms-full.txt).
Why it matters: llms.txt is an emerging standard (proposed 2024) that gives AI models a curated, machine-readable summary of your site. It is the sitemap.xml of the GEO era. Sites with a well-formed llms.txt see measurably higher citation rates in Perplexity and other retrieval-augmented systems.
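A minimal llms.txt sketch following the llmstxt.org conventions the audit checks for (H1, blockquote description, linked sections) — the page names and URLs here are placeholders:

```markdown
# YourSite

> One- or two-sentence description of what the site offers.

## Docs

- [Getting started](https://yoursite.com/docs/start): Setup guide
- [API reference](https://yoursite.com/docs/api): Endpoint details

## Optional

- [Blog](https://yoursite.com/blog): Articles and updates
```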
### 3. JSON-LD Schema (16 points)
What it checks: Presence and quality of structured data — specifically WebSite, Organization, FAQPage, and Article schema types.
Why it matters: FAQPage schema is directly extracted by AI systems to populate answer snippets. Article schema provides authorship and date signals that LLMs use to assess freshness and trustworthiness.
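As a concrete example, a minimal FAQPage block in schema.org JSON-LD looks like this (question and answer text are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Generative Engine Optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "GEO is the practice of optimizing content so AI engines cite it in generated answers."
      }
    }
  ]
}
</script>
```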
### 4. Meta Tags (14 points)
What it checks: Title tag quality, meta description, canonical URL, and Open Graph tags.
Why it matters: Meta descriptions and OG descriptions are often used verbatim by AI systems when summarizing a page. A missing description means the AI has to guess — and it usually gets it wrong or omits your site.
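A sketch of the tags this category looks for, with placeholder values:

```html
<title>YourSite — What You Do, In One Line</title>
<meta name="description" content="A one- or two-sentence summary an AI can quote verbatim.">
<link rel="canonical" href="https://yoursite.com/">
<meta property="og:title" content="YourSite — What You Do, In One Line">
<meta property="og:description" content="Same summary, reused for social and AI previews.">
```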
### 5. Content Quality (12 points)
What it checks: Heading hierarchy (h1 through h3), presence of statistics and numeric claims, front-loaded key information, use of lists, word count, and external citation links.
Why it matters: Princeton GEO research found that content with verifiable statistics and authoritative citations is cited 2–3x more frequently than equivalent content without them. "Cite your sources" turns out to be good advice for getting cited yourself.
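To make the "statistics and external links" checks concrete, here is a rough sketch of the kind of counting such an audit performs. This is an illustration, not GEO Optimizer's actual implementation:

```python
import re

def content_quality_signals(html_text: str, own_domain: str) -> dict:
    """Count two content-quality signals: numeric claims and external links.

    Illustrative only -- the real audit weighs more signals
    (heading hierarchy, lists, word count, front-loading).
    """
    # Numeric claims: integers, decimals, and percentages in the text
    statistics = re.findall(r"\b\d+(?:\.\d+)?%?", html_text)
    # External links: hrefs pointing at a different domain
    hrefs = re.findall(r'href="(https?://[^"]+)"', html_text)
    external = [h for h in hrefs if own_domain not in h]
    return {"statistics": len(statistics), "external_links": len(external)}
```

A page that scores zero on both counts, like the sample audit above, reads as opinion rather than evidence to a retrieval system.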
### 6. Signals (6 points)
What it checks: the `lang` attribute on the `<html>` element, RSS/Atom feed presence, and content freshness indicators (structured date data or visible publication dates).
Why it matters: AI systems use language declarations to route queries correctly. RSS feeds allow AI-integrated news systems to track your content. Date signals affect how AI systems rank freshness for time-sensitive queries.
### 7. AI Discovery Endpoints (6 points)
What it checks: Whether your site exposes /.well-known/ai.txt, /ai/summary.json, /ai/faq.json, and /ai/service.json.
Why it matters: These endpoints let AI crawlers self-serve a structured overview of your site without parsing full HTML. They are the API layer for AI discovery.
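The post does not spell out the exact schema of these endpoints, but as a hypothetical example, an /ai/summary.json could expose something like the following (all field names illustrative):

```json
{
  "name": "YourSite",
  "description": "One or two sentences describing what the site offers.",
  "topics": ["generative engine optimization", "ai search visibility"],
  "pages": [
    { "url": "https://yoursite.com/docs", "title": "Documentation" }
  ],
  "contact": "https://yoursite.com/contact"
}
```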
### 8. Brand and Entity (10 points)
What it checks: Coherence of brand name across pages, knowledge graph readiness, presence of About and Contact pages, geographic identity signals, and topic authority clustering.
Why it matters: LLMs build entity graphs. A site with a clear, consistent entity identity (one brand name, one headquarters, one topical focus) is significantly more likely to be cited as an authoritative source than a site with scattered signals.
## Auto-Fix: Generate the Missing Files

Auditing is the diagnosis. `geo fix` is the treatment:

```bash
geo fix --url https://yoursite.com
```
This generates ready-to-deploy files:
- A robots.txt patch with all 24 AI bots explicitly allowed
- A complete llms.txt built from your sitemap
- Missing JSON-LD schema blocks as snippets
- Meta tag HTML for any missing tags
You can also target a specific category:
```bash
geo fix --url https://yoursite.com --only llms
geo fix --url https://yoursite.com --only schema
```
And generate just the llms.txt separately:
```bash
geo llms --url https://yoursite.com
```
## Python API Usage
If you need to integrate GEO auditing into your own tooling, the Python API is clean and returns typed dataclasses — it never prints to stdout.
```python
from geo_optimizer.core.audit import run_full_audit

result = run_full_audit("https://yoursite.com")

print(result.score)         # 41
print(result.band)          # "foundation"
print(result.robots.score)  # 8
print(result.llms.score)    # 0

for rec in result.recommendations:
    print(f"- {rec}")
# - Add all 24 AI bots to robots.txt
# - Create llms.txt
# - Add FAQPage schema
```
For async contexts (FastAPI, async scripts):
```python
import asyncio

from geo_optimizer.core.audit import run_full_audit_async

async def check_site(url: str) -> dict:
    result = await run_full_audit_async(url)
    return {
        "score": result.score,
        "band": result.band,
        "top_issues": result.recommendations[:3],
    }

asyncio.run(check_site("https://yoursite.com"))
```
The JSON output format works well for dashboards and monitoring pipelines:
```bash
geo audit --url https://yoursite.com --format json | jq '.score'
```
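If you would rather gate a pipeline in Python than in jq, a minimal sketch — assuming only that the JSON output carries a top-level score field, matching the score shown in the audit summary above:

```python
import json

def check_threshold(audit_json: str, threshold: int = 68) -> bool:
    """Return True if the GEO score meets the threshold.

    Assumes `geo audit --format json` emits a top-level "score" field.
    Pipe the audit output in and exit non-zero on failure to fail a build.
    """
    score = json.loads(audit_json)["score"]
    return score >= threshold
```

The default threshold of 68 matches the lower bound of the "Good" band from the table above.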
## CI/CD Integration: Catch Regressions Before They Ship
One of the most practical use cases is automated GEO regression testing. A CMS update can silently break your schema. A robots.txt change can accidentally block AI bots. Catching this in CI costs nothing.
The easiest path is the official GitHub Action:
```yaml
# .github/workflows/geo-audit.yml
name: GEO Audit

on:
  push:
    branches: [main]
  pull_request:

jobs:
  geo:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Auriti-Labs/geo-optimizer-skill@v1
        with:
          url: https://yoursite.com
          threshold: 68   # Fail if score drops below "good" band
          format: sarif   # Appears in GitHub Security tab
```
With format: sarif, findings automatically populate the Security tab of your repository as Code Scanning alerts — no extra configuration needed.
For PR comments that show the score on every pull request:
```yaml
- uses: Auriti-Labs/geo-optimizer-skill@v1
  id: geo
  with:
    url: https://yoursite.com
- uses: actions/github-script@v7
  if: github.event_name == 'pull_request'
  with:
    script: |
      const score = '${{ steps.geo.outputs.score }}';
      const band = '${{ steps.geo.outputs.band }}';
      await github.rest.issues.createComment({
        owner: context.repo.owner,
        repo: context.repo.repo,
        issue_number: context.issue.number,
        body: `## GEO Audit\n\n**Score:** ${score}/100 **Band:** ${band}`
      });
```
For teams using JUnit-compatible CI dashboards (Jenkins, CircleCI, etc.):
```yaml
- uses: Auriti-Labs/geo-optimizer-skill@v1
  with:
    url: https://yoursite.com
    format: junit
    output-file: geo-results
- uses: dorny/test-reporter@v1
  with:
    name: GEO Audit
    path: geo-results.xml
    reporter: java-junit
```
## MCP Server: GEO Audits Inside Your AI IDE
If you use Claude Code, Cursor, or Windsurf, you can install the GEO Optimizer MCP server and audit sites directly from your AI assistant without leaving the editor.
```bash
pip install geo-optimizer-skill[mcp]
```
Claude Code setup:
```bash
claude mcp add geo-optimizer -- geo-mcp
```
Cursor setup — add to .cursor/mcp.json:
```json
{
  "mcpServers": {
    "geo-optimizer": {
      "command": "geo-mcp",
      "args": []
    }
  }
}
```
Once connected, you can ask your AI assistant things like:
"Run a GEO audit on my-client-site.com and list the top three issues."
"Generate an llms.txt for https://docs.myproduct.com"
"Validate the JSON-LD schema on the homepage"
The MCP server exposes eight tools: geo_audit, geo_fix, geo_llms_generate, geo_schema_validate, geo_citability, geo_ai_discovery, geo_trust_score, and geo_compare. The last one is particularly useful for competitive analysis — you can compare your GEO score against a competitor's in a single call.
## Try It Now
The fastest way to see where you stand is the web demo — no installation required:
geo-optimizer-web.onrender.com
Paste your URL, get a full breakdown in seconds.
If you want the CLI:
```bash
pip install geo-optimizer-skill
geo audit --url https://yoursite.com
```
## Key Takeaways
- **GEO is not SEO.** Ranking on Google and being cited by ChatGPT require different signals. Both matter in 2026.
- **The biggest wins are quick.** Fixing robots.txt to allow AI bots and adding llms.txt can be done in under an hour and covers 36 of the 100 available points.
- **Automate the regression check.** One GitHub Actions step catches GEO regressions the same way ESLint catches code quality issues — before they reach production.
- **The MCP server brings auditing into your editor.** If you are already using an AI IDE, you can add GEO checks to your development workflow with a single command.
## Resources
- GitHub: github.com/Auriti-Labs/geo-optimizer-skill — star the repo to follow updates
- Web demo: geo-optimizer-web.onrender.com — free, no account required
- Documentation: auriti-labs.github.io/geo-optimizer-skill
- Princeton KDD 2024 paper: GEO: Generative Engine Optimization
- llms.txt standard: llmstxt.org
If the tool helps you, a GitHub star helps more developers find it. If you find a bug or want to contribute a new audit check, pull requests are open and the contributing guide is in the repo.
What AI search visibility issues have you run into? Drop them in the comments — I read everything.