AI Search Optimisation
llms.txt Explained: The New File That Tells AI How to Use Your Site
llms.txt is the emerging standard for telling AI systems how to navigate and cite your website. Here's what it is, who's adopting it, what to put in yours, and whether it's worth your time today.
If you've spent any time on AI search optimisation lately, you've probably bumped into llms.txt. It's positioned as a new web standard — sometimes described as "robots.txt but for AI" — and the biggest names in tech are quietly adopting it. So what is it actually, and should you care?
What is llms.txt?
llms.txt is a plain-text Markdown file at the root of your website (yourdomain.com/llms.txt) that gives AI systems a curated, structured map of your most important content. Think of it as "if I were going to summarise my whole website to an LLM, what's the table of contents I'd hand over?"
It was proposed by Jeremy Howard in late 2024 and adoption has accelerated through 2025-2026. The pitch is simple:
- Modern AI systems have context-window limits — they can't fit your whole site in a single prompt
- HTML is noisy — navigation, ads, scripts, and boilerplate get in the way of the actual content
- llms.txt is a clean, opinionated index that tells AI: "here are the canonical pages on this site, in priority order"
It complements robots.txt (which controls access) and sitemap.xml (which lists all URLs). llms.txt is curated and Markdown-formatted — designed to be human-and-AI-readable.
llms.txt vs robots.txt vs sitemap.xml
Easy to confuse — they sound similar. Here's the real distinction:
- robots.txt: Permission. Tells crawlers what they're allowed to access. (See our robots.txt guide.)
- sitemap.xml: Inventory. Lists every URL on your site for search engines to discover.
- llms.txt: Curation. Picks your most important pages and presents them in priority order, in Markdown, for AI extraction.
Who's adopting it?
As of mid-2026, the public adopters include:
- Anthropic — docs.anthropic.com/llms.txt
- Vercel — published an llms.txt across their docs and product surfaces
- Mintlify — auto-generates llms.txt for every documentation site they host
- Stripe — partial adoption on developer docs
- Cloudflare, Hugging Face, Replicate — published versions on their docs sites
What's missing? Direct confirmation from OpenAI, Google, or Anthropic that their crawlers actually use llms.txt as a primary signal. As of mid-2026, the file is read by some crawlers and ignored by others. Adoption is real but the consumer side is still settling.
llms.txt is a low-effort, low-risk addition. Even if AI systems aren't using it heavily today, the cost to publish one is 30 minutes and the upside if it becomes a standard is non-trivial. We add one for every client. We don't expect it to move the needle on its own — it's a "fundamentals box-tick" alongside robots.txt and sitemap.xml.
What goes in an llms.txt?
The format is simple. A single # heading with your business name, a short blockquote with a one-line description, then sections of links. Here's the canonical structure:
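A minimal skeleton of that structure, following the proposed spec. The business name, section names, and URLs here are placeholders, not a real site:

```markdown
# Acme Plumbing

> Family-run plumbing company serving Bristol. Emergency call-outs, boiler servicing, bathroom installs.

## Services
- [Emergency Plumbing](https://acme.example/emergency): 24/7 call-out service and response times
- [Boiler Servicing](https://acme.example/boilers): Annual service plans and repairs

## About
- [About Us](https://acme.example/about): Company history, team, and accreditations
- [Contact](https://acme.example/contact): Phone, email, and service area
```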
Every link should have a short description after the colon. AI systems use these descriptions to decide which pages to deep-read for a given query.
What we'd put in an Arclight llms.txt
For our own site, the file looks something like:
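A hypothetical sketch of that file. The URLs are illustrative stand-ins (example.com), not our live paths:

```markdown
# Arclight

> AI search optimisation for small businesses: llms.txt, robots.txt, schema markup, and FAQ structure.

## Services
- [AI Search Optimisation](https://example.com/ai-search): What the service covers and how it works
- [Free Audit](https://example.com/audit): Request a free AI-readiness audit

## Guides
- [llms.txt Explained](https://example.com/llms-txt): What the file is and how to publish one
- [robots.txt Guide](https://example.com/robots-txt): Controlling crawler access
```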
How to publish llms.txt
- Save the content above (with your details) as a plain text file named llms.txt.
- Upload it to the root of your site, the same folder as your homepage. Goal: yourdomain.com/llms.txt loads in a browser.
- For Squarespace / Wix / Webflow: most platforms don't natively support custom root files, so you may need to use a redirect or wait for a platform update. Custom builds (static, Vercel, Netlify, WordPress) handle it natively.
- Optional: also publish an llms-full.txt with deeper content extracts. We don't do this for small business sites — overkill.
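Before uploading, it's worth sanity-checking the file's shape. A minimal sketch of a local validator; the function name and the specific checks are our own, not part of any spec:

```python
import re

def validate_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in an llms.txt document.

    Checks the minimal shape of the proposed format: exactly one
    '# ' title, and list items that are Markdown links with a
    ': description' after the URL.
    """
    problems = []
    lines = text.splitlines()

    # Exactly one top-level title is expected.
    h1s = [l for l in lines if l.startswith("# ")]
    if len(h1s) != 1:
        problems.append(f"expected exactly one '# ' title, found {len(h1s)}")

    # Each list item should be '- [Title](url): description'.
    link_re = re.compile(r"^- \[[^\]]+\]\([^)]+\)(: .+)?$")
    for i, line in enumerate(lines, start=1):
        if line.startswith("- "):
            m = link_re.match(line)
            if m is None:
                problems.append(f"line {i}: list item is not a Markdown link")
            elif m.group(1) is None:
                problems.append(f"line {i}: link has no ': description'")
    return problems
```

Run it over the file before publishing; an empty list means the basic structure is in order.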
Should I bother today?
Pragmatic answer for small businesses in 2026:
- Yes, if it costs you 30 minutes — easy win, future-proof, signals intent.
- No, if you don't have robots.txt, schema, and FAQ blocks sorted first — those move the needle 10x more than llms.txt right now.
The full AI Search Optimisation playbook is: fix robots.txt → add schema → add FAQ structure → then add llms.txt. Don't skip steps.
What's next for the standard?
Two open questions in the standard's evolution:
- Will major AI crawlers commit to using it? OpenAI and Google have not officially confirmed. Anthropic has hinted but not specified. The standard's success depends on consumer-side adoption.
- Will it be extended with priority signals or schema? The current spec is intentionally minimal. Some discussion in the community about adding pricing, last-updated, or content-type metadata. Watch this space.
For now: publish a clean llms.txt, link to it from your robots.txt with a Sitemap:-style reference (proposed but not standardised), and revisit in 6 months.
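Because that reference is proposed rather than standardised, the safest way to include it today is as a comment, which every robots.txt parser will ignore. The directive name below is an assumption, not an agreed standard:

```text
# robots.txt
User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml

# Proposed, non-standard pointer to the AI index:
# llms-txt: https://yourdomain.com/llms.txt
```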
Want one of these for your site?
Our AI Search Optimisation service includes llms.txt, robots.txt, schema markup, and FAQ structure as standard. Get a free audit to see where you stand.