SameDayDesk · Guide/Comparison · June 2026

Is llms.txt worth it? We checked the data (2026)

Short answer: for getting cited by AI search, largely no. Google doesn't support llms.txt, and roughly 97% of LLM crawler hits never fetch it. Here's what actually moves the needle, with numbers.

~97% of LLM crawler hits never fetch llms.txt (Ahrefs)
0 Google support for llms.txt: "not supported, not planned"
~87% of SearchGPT citations matched Bing's top 20 (Seer)
~44% of LLM citations come from the first 30% of a page

The direct answer

llms.txt is a proposed file you drop at your domain root to give large language models a curated map of your content. The pitch sounds great: a clean, markdown-friendly index that "helps AI understand your site." In practice, the engines that send you traffic mostly ignore it.

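Two facts settle it for 2026: Google has said llms.txt is "not supported, not planned," and Ahrefs found that roughly 97% of LLM crawler hits never fetch the file.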

So is it worth adding? It's cheap, and it doesn't hurt. Treat llms.txt as optional hygiene, the same way you'd keep a tidy humans.txt. Just don't confuse it with a citation strategy. If you spend a Saturday on llms.txt and skip structured data, you optimized the thing nobody reads and ignored the thing every engine parses.
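If you do add one, the file itself is small. Here is a minimal sketch that renders it in the layout the public llms.txt proposal describes (an H1 title, a blockquote summary, then H2 sections of markdown links). The `render_llms_txt` helper, the site name, and every URL are hypothetical placeholders, not part of any official tooling.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
llms_txt = render_llms_txt(
    "Example Co",
    "Example Co sells a scheduling tool for small teams.",
    {
        "Docs": [("Quickstart", "https://example.com/docs/quickstart.md", "install and first run")],
        "Optional": [("Changelog", "https://example.com/changelog.md", "release notes")],
    },
)
print(llms_txt)
```

Save the output as /llms.txt at the domain root. That is the whole job, which is why it counts as hygiene and not strategy.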

About 97% of LLM crawler hits never fetch llms.txt, and Google says it's not supported and not planned. The file you're told to write is the one the bots don't open.

What actually drives AI citation

AI search runs on live retrieval (RAG), which is independent of model training data. A freshly indexed page can be cited the same week it goes live, without ever being in a training set. That changes the playbook: the question isn't "did the model learn about me," it's "can the engine find and extract me right now." Four levers do almost all the work.

1. Get into the Bing index (the fast lane to ChatGPT)

ChatGPT Search and Microsoft Copilot largely retrieve from the Bing index. Seer Interactive found about 87% of SearchGPT citations matched Bing's top 20 results, a finding Search Engine Land reconfirmed in April 2026. The lever: push your URLs to Bing via IndexNow, which needs no account and gets new content indexed in hours to days. We already submit URLs through IndexNow for the sites we work on.
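A submission is one HTTP POST. IndexNow needs no account, but the protocol does expect you to generate a key and host it as a text file on your own domain so the search engine can verify ownership. The sketch below builds a bulk submission with the standard library only. The host, key, and URL are hypothetical, and the key file is assumed to live at the site root.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) an IndexNow bulk-submission request."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file is served from the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def submit(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical host, key, and URL, for illustration only.
request = build_indexnow_request(
    "example.com",
    "0123456789abcdef0123456789abcdef",
    ["https://example.com/llms-txt-guide"],
)
```

Calling `submit(request)` sends it. Building and sending are kept separate so you can inspect the payload before anything leaves your machine.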

2. Don't wait on Google's sandbox

Google sandboxes new domains for roughly 3 to 9 months on commercial queries, so don't gate your strategy on it. Better news: Google AI Overviews citation is largely independent of organic rank. About 68% of AIO-cited pages were not in the top 10 organic results. You can be quoted by AI without ranking on page one.

3. Answer first, then add structure

Per the Princeton/Georgia Tech GEO study (KDD 2024), about 44% of LLM citations come from the first 30% of a page. Put the answer at the top. Then ship JSON-LD structured data: Organization and Article for content, SoftwareApplication for a tool, Service plus Offer (with price) for a paid product. Skip FAQPage and HowTo as a rich-result play. Google deprecated those rich results between 2023 and 2026; the markup is fine for machine-readability but it won't win you a result snippet.
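To make the structured-data step concrete, here is a minimal sketch that builds two of the shapes named above (Article with an Organization publisher, and Service with a priced Offer) and wraps them in a script tag. The helper names and all values are hypothetical, and the property set is a bare minimum, not a complete schema.org profile.

```python
import json


def article_jsonld(headline, url, date_published, org_name, org_url):
    """Return an Article with an embedded Organization publisher."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "datePublished": date_published,
        "publisher": {"@type": "Organization", "name": org_name, "url": org_url},
    }


def service_offer_jsonld(name, price, currency):
    """Return a Service carrying a single priced Offer."""
    return {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }


def to_script_tag(data):
    """Serialize to a JSON-LD script tag, escaping '</' so the tag cannot close early."""
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical values, for illustration only.
article_tag = to_script_tag(
    article_jsonld(
        "Is llms.txt worth it?",
        "https://example.com/llms-txt",
        "2026-06-01",
        "Example Co",
        "https://example.com",
    )
)
service_tag = to_script_tag(service_offer_jsonld("Example Service", "39.00", "USD"))
print(article_tag)
```

Paste the resulting tags into the page head, then run them through a structured-data validator before shipping.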

4. Write the way models like to quote

The same GEO research measured what raises a page's odds of being cited. The lifts are large and stackable:

| Technique | Visibility lift | Why it works |
| --- | --- | --- |
| Adding statistics | ~+32% | Concrete numbers are easy to extract and verify |
| Direct quotations | ~+41% | Quotable spans drop cleanly into an answer |
| Citing named sources | ~+30% | Attribution signals trustworthiness to the model |

llms.txt vs the levers that actually work

| Tactic | Effort | Impact on AI citation |
| --- | --- | --- |
| llms.txt file | Low | Negligible: unsupported by Google, ~97% of bots skip it |
| IndexNow / Bing index | Low | High: feeds ~87% of SearchGPT citations |
| JSON-LD structured data | Medium | High: the format every engine parses |
| Answer-first content | Medium | High: ~44% of citations come from the top 30% |
| Stats + quotes + named sources | Medium | High: up to ~+41% visibility per technique |
| Comparison / "vs" / "alternatives" pages | Medium | High: ~40.9% of citations on commercial queries |

That last row matters if you sell something. Comparison, "X vs Y," "alternatives to," and listicle formats are among the formats AI cites most on commercial-intent queries, at around 40.9% of citations on those queries. And AI-search referral traffic reportedly converts at about 4.4x the rate of organic search traffic, because the visitor already got a recommendation before clicking.

The plot twist: even the experts are leaving points on the table

We scored 189 well-known companies across 10 industries on six fundamentals (AI-crawler access, JSON-LD, title/meta, Open Graph, XML sitemap, and yes, llms.txt). The gaps weren't where you'd expect. OpenAI and GitHub scored a D on their homepages, mostly because the pages are JS-heavy with thin server-rendered content. Perplexity, an AI search engine, scored a C. LlamaIndex, whose entire product is making data readable by LLMs, scored a D. Klarna scored the lowest at 38 (F), and Ars Technica scored an F: it lets crawlers in but ships no structured data at all.

SaaS averaged 87 across 24 sites, yet 1 in 3 had no JSON-LD whatsoever. Stripe, Supabase, Webflow, Vercel (93), and HubSpot (90) led. Figma (73), Linear (71), Substack (68), Airtable (68), and Clerk (61) trailed, and 8 of the 24, including Notion, Linear, Airtable, Clerk, Cal.com, PostHog, and Gumroad, ship no JSON-LD at all. The takeaway: the basics that drive citation are unevenly done, even by category leaders. That's the opening.
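Checking whether a page ships any JSON-LD takes a few lines. The sketch below is one way to do it with Python's standard HTML parser. It is not the scoring code behind the benchmark; the class and function names are hypothetical, and it only reads top-level `@type` values (it does not walk `@graph` arrays).

```python
import json
from html.parser import HTMLParser


class JsonLdFinder(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed JSON-LD is treated as absent.


def jsonld_types(html):
    """Return the top-level @type of each JSON-LD item found in the HTML."""
    finder = JsonLdFinder()
    finder.feed(html)
    finder.close()
    types = []
    for block in finder.blocks:
        items = block if isinstance(block, list) else [block]
        types.extend(i["@type"] for i in items if isinstance(i, dict) and "@type" in i)
    return types


# Hypothetical page, for illustration only.
sample_html = """<html><head>
<script>console.log("not structured data")</script>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}</script>
</head><body><p>Hello</p></body></html>"""
print(jsonld_types(sample_html))
```

An empty list means the page has no parseable JSON-LD in the HTML you fetched. On JS-heavy sites, fetch the server-rendered HTML, since markup injected client-side will not appear here.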

See your real AI-citation gaps in 60 seconds

Skip the guesswork. Our free checker scans your site for the same six fundamentals we benchmarked (crawler access, JSON-LD, title/meta, Open Graph, XML sitemap, and llms.txt) and shows where you stand against the 189-company benchmark.

Run the free check · Get the $9 AI Readiness Kit

What to do instead (a 5-step checklist)

  1. Verify crawler access. Make sure GPTBot, ClaudeBot, PerplexityBot, and Google-Extended aren't blocked in robots.txt. You can't be cited if you can't be crawled.
  2. Push to Bing with IndexNow. It's free, needs no account, and is the fastest path into ChatGPT Search and Copilot. Bing indexes new content in hours to days.
  3. Ship JSON-LD. Organization + Article on content, SoftwareApplication on tools, Service + Offer (with price) on paid products. This is the format engines extract from.
  4. Restructure answer-first. Put the direct answer in the first 30% of every page, then back it with statistics, a direct quote, and named sources.
  5. Build comparison pages for your commercial queries ("X vs Y," "alternatives to Z"). They earn the highest AI-citation share on buyer-intent searches.
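Step 1 is the easiest to automate. As a minimal sketch, Python's built-in robots.txt parser can tell you which of the four crawlers named above are locked out. The `blocked_ai_crawlers` helper and the sample rules are hypothetical; in practice you would pass in the text of your own /robots.txt.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_crawlers(robots_txt, url="/"):
    """Return the AI crawlers that robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks one crawler and allows the rest.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_crawlers(sample_robots))
```

Anything in the returned list cannot crawl the page, so it cannot cite it. Pass a specific path as `url` to test a single section of the site.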

You can run the whole audit yourself with our free open-source CLI: npx github:epistemedeus/ai-readiness yoursite.com (repo on GitHub). Prefer the browser? Use the free AI Readiness Checker. Want the full benchmark, every structured-data template, and the checklist in one download? The $9 AI Readiness Kit has it.

If you'd rather not touch code: the $39 Fix Pack is built for your exact site, same day, with the structured data and answer-first edits done for you. And if you want proof, the $249 AI-Search Visibility Audit runs real citation testing against your named competitors so you can see exactly where you win and lose.

Want to check our math? The full 189-company benchmark is open under CC-BY: download the raw CSV.