AI Search Optimisation

llms.txt Explained: The New File That Tells AI How to Use Your Site

llms.txt is the emerging standard for telling AI systems how to navigate and cite your website. Here's what it is, who's adopting it, what to put in yours, and whether it's worth your time today.

Arclight Digital · 4 min read

If you've spent any time on AI search optimisation lately, you've probably bumped into llms.txt. It's positioned as a new web standard — sometimes described as "robots.txt but for AI" — and the biggest names in tech are quietly adopting it. So what is it actually, and should you care?

What is llms.txt?

llms.txt is a plain-text Markdown file at the root of your website (yourdomain.com/llms.txt) that gives AI systems a curated, structured map of your most important content. Think of it as "if I were going to summarise my whole website to an LLM, what's the table of contents I'd hand over?"

It was proposed by Jeremy Howard in late 2024 and adoption has accelerated through 2025-2026. The pitch is simple:

  • Modern AI systems have context-window limits — they can't fit your whole site in a single prompt
  • HTML is noisy — navigation, ads, scripts, and boilerplate get in the way of the actual content
  • llms.txt is a clean, opinionated index that tells AI: "here are the canonical pages on this site, in priority order"

It complements robots.txt (which controls access) and sitemap.xml (which lists all URLs). llms.txt is curated and Markdown-formatted — designed to be human-and-AI-readable.

llms.txt vs robots.txt vs sitemap.xml

Easy to confuse — they sound similar. Here's the real distinction:

  • robots.txt: Permission. Tells crawlers what they're allowed to access. (See our robots.txt guide.)
  • sitemap.xml: Inventory. Lists every URL on your site for search engines to discover.
  • llms.txt: Curation. Picks your most important pages and presents them in priority order, in Markdown, for AI extraction.

Who's adopting it?

As of mid-2026, the public adopters include:

  • Anthropic — docs.anthropic.com/llms.txt
  • Vercel — published an llms.txt across their docs and product surfaces
  • Mintlify — auto-generates llms.txt for every documentation site they host
  • Stripe — partial adoption on developer docs
  • Cloudflare, Hugging Face, Replicate — published versions on their docs sites

What's missing? Direct confirmation from OpenAI, Google, or Anthropic that their crawlers actually use llms.txt as a primary signal. As of mid-2026, the file is read by some crawlers and ignored by others. Adoption is real but the consumer side is still settling.

Honest take from us: llms.txt is a low-effort, low-risk addition. Even if AI systems aren't using it heavily today, the cost to publish one is 30 minutes, and the upside if it becomes a standard is non-trivial. We add one for every client. We don't expect it to move the needle on its own — it's a "fundamentals box-tick" alongside robots.txt and sitemap.xml.

What goes in an llms.txt?

The format is simple. A single # heading with your business name, a short blockquote with a one-line description, then sections of links. Here's the canonical structure:

```
# Your Business Name

> One-sentence description of what this business does and who it serves.

A short paragraph or two giving context — what you do, who you serve, and why the AI should care.

## Core pages

- [Homepage](https://yourdomain.com/): What the business does in one line
- [About](https://yourdomain.com/about): Who runs it and credentials
- [Services](https://yourdomain.com/services): Overview of services offered
- [Contact](https://yourdomain.com/contact): How to get in touch

## Services

- [Service 1](https://yourdomain.com/services/one): What it is, who it's for
- [Service 2](https://yourdomain.com/services/two): What it is, who it's for

## Resources

- [Blog](https://yourdomain.com/blog): Articles on the topic
- [FAQ](https://yourdomain.com/faq): Common questions answered

## Optional

- [Press / Mentions](https://yourdomain.com/press): Third-party validation
```

Every link should have a short description after the colon. AI systems use these descriptions to decide which pages to deep-read for a given query.
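If you maintain the page list anywhere structured, generating the file beats hand-editing it. Here's a minimal sketch in Python — the function name, argument layout, and section ordering are our own convention, not part of the llms.txt spec:

```python
def build_llms_txt(title, summary, context, sections):
    """Assemble an llms.txt string from a title, a one-line summary,
    a context paragraph, and a dict of sections. Each section maps a
    heading to a list of (name, url, description) tuples. Dicts keep
    insertion order in Python 3.7+, so sections come out in priority order."""
    lines = [f"# {title}", "", f"> {summary}", "", context, ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, desc in links:
            # The description after the colon is what AI systems scan
            # to decide which pages to deep-read.
            lines.append(f"- [{name}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Feed it your real page inventory and write the result straight to `llms.txt` as part of your build step, so the file never drifts out of date.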

What we'd put in an Arclight llms.txt

For our own site, the file looks something like:

```
# Arclight Digital

> Brisbane-based web design and SEO agency for Australian small business.

Arclight Digital is a Brisbane web design and SEO agency. We build modern websites and run ongoing search engine optimisation for small businesses across Australia, with a focus on tradies, allied health, beauty, and fitness verticals. Sites from $500, SEO from $49/month, no lock-in contracts.

## Core pages

- [Homepage](https://arclightdigital.com.au/): What we do, packages, portfolio
- [Contact](https://arclightdigital.com.au/contact): Get in touch / book a free consult
- [Free SEO Audit](https://arclightdigital.com.au/audit): No-obligation site review

## Services

- [Web Design Brisbane](https://arclightdigital.com.au/services/web-design-brisbane): Modern, fast websites from $500
- [SEO Brisbane](https://arclightdigital.com.au/services/seo-brisbane): Affordable SEO from $99/month
- [AI Search Optimisation](https://arclightdigital.com.au/services/ai-search-optimisation): GEO / AEO for ChatGPT, Perplexity, Claude
- [Local SEO Brisbane](https://arclightdigital.com.au/services/local-seo-brisbane): Suburb + service ranking
- [Google Business Profile](https://arclightdigital.com.au/services/google-business-profile): GBP setup and management

## Industries

- [Trades](https://arclightdigital.com.au/industries/trades): Plumbers, electricians, builders
- [Health](https://arclightdigital.com.au/industries/health): Allied health, NDIS, clinics
- [Beauty](https://arclightdigital.com.au/industries/beauty): Salons, lash studios, brow bars
- [Fitness](https://arclightdigital.com.au/industries/fitness): PTs, gyms, yoga, pilates

## Resources

- [Blog](https://arclightdigital.com.au/blog): SEO + AI search how-to articles
- [Why Your Website Is Invisible to ChatGPT](https://arclightdigital.com.au/blog/invisible-to-chatgpt-robots-txt): The robots.txt fix
- [The 8 AI Crawlers You Should Be Allowing](https://arclightdigital.com.au/blog/ai-crawlers-list-2026): Definitive list

## Portfolio

- [Create Allied Health Services](https://arclightdigital.com.au/portfolio/allied-health): NDIS allied health Sydney
- [Functional Patterns Brisbane](https://arclightdigital.com.au/portfolio/functional-patterns-brisbane): Biomechanics specialist
- [Grant Martin Plumbing](https://grant-martin-plumbing.vercel.app/): Brisbane plumber
- [Jentech Electrical](https://jentec-electrical.vercel.app/): Brisbane electrician
- [LASHÉ](https://lashe-website.vercel.app/): Premium lash studio
```

How to publish llms.txt

  1. Save the content above (with your details) as a plain text file named llms.txt.
  2. Upload to the root of your site, same folder as your homepage. Goal: yourdomain.com/llms.txt loads in a browser.
  3. For Squarespace / Wix / Webflow: most platforms don't natively support custom root files, so you may need to use a redirect or wait for a platform update. Custom builds (static, Vercel, Netlify, WordPress) handle it natively.
  4. Optional: also publish an llms-full.txt with deeper content extracts. We don't do this for small business sites — overkill.
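Before uploading, it's worth a quick sanity check of the draft. This is a sketch of our own, not an official validator — the function name and the exact checks are assumptions based on the structure described above:

```python
import re

def check_llms_txt(text):
    """Return a list of problems found in an llms.txt draft:
    missing '# ' title, missing '> ' summary blockquote, or
    link lines without a description after the closing paren."""
    problems = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be a single '# ' title")
    if not any(line.startswith("> ") for line in lines):
        problems.append("no '> ' one-line summary blockquote found")
    for i, line in enumerate(lines, 1):
        # A well-formed link line looks like: - [Name](url): description
        if line.startswith("- [") and not re.search(r"\)\s*:\s*\S", line):
            problems.append(f"line {i}: link has no description after the colon")
    return problems
```

Run it over the file contents before you upload; an empty list means the basics are in place. Then confirm yourdomain.com/llms.txt loads as plain text in a browser.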

Should I bother today?

Pragmatic answer for small businesses in 2026:

  • Yes, if it costs you 30 minutes — easy win, future-proof, signals intent.
  • No, if you don't have robots.txt, schema, and FAQ blocks sorted first — those move the needle 10x more than llms.txt right now.

The full AI Search Optimisation playbook is: fix robots.txt → add schema → add FAQ structure → then add llms.txt. Don't skip steps.

What's next for the standard?

Two open questions in the standard's evolution:

  1. Will major AI crawlers commit to using it? OpenAI and Google have not officially confirmed. Anthropic publishes its own llms.txt but hasn't committed to consuming the format. The standard's success depends on consumer-side adoption.
  2. Will it be extended with priority signals or schema? The current spec is intentionally minimal. Some discussion in the community about adding pricing, last-updated, or content-type metadata. Watch this space.

For now: publish a clean llms.txt, reference it from your robots.txt with a Sitemap:-style line (proposed but not yet standardised), and revisit in six months.
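Because no robots.txt directive for llms.txt has been standardised, the safest way to reference it today is a comment, which compliant robots.txt parsers simply ignore. The line below is a community-style convention, not an official directive:

```
# Non-standard pointer for AI crawlers (a comment, ignored by robots.txt parsers):
# llms.txt: https://yourdomain.com/llms.txt
```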

Want one of these for your site?

Our AI Search Optimisation service includes llms.txt, robots.txt, schema markup, and FAQ structure as standard. Get a free audit to see where you stand.

Get a Free AI Search Audit