llms.txt Presence

llms.txt is the emerging convention for telling AI models which parts of your site to prioritize for citation. Sites without one give AI no curation signal — the models guess what matters about your brand. Sites with a well-structured one direct citations toward your actual brand story, key products, and contact channels.

Methodology

Cited checks for /llms.txt at your domain root and, if found, grades the file across five content sections. The signal scores out of 8: 4 points for the file existing with content, plus up to 4 bonus points for substantive structure.

We fetch from your domain root first, then try the www or non-www variant if the primary fails. An HTTP 200 with non-empty body counts as found. Any other response (404, 500, timeout) means the file is missing, and the signal returns 0/8 with a recommendation to create one.

When the file exists, we grade five sections used by AI models to understand your brand:

Company description. At least 2 lines of prose (over 40 characters each) describing what your company does. This is what AI models cite when asked “what is BrandName?”
Product or service list. At least 2 enumerated items (markdown lists or numbered lists). This is what AI models cite when asked “what does BrandName sell?”
Key content URLs. At least 2 https:// links to important pages: blog, docs, pricing, case studies. These tell AI which pages to fetch for deeper context.
Contact information. An email address, a contact/support/help URL, or a social media link (LinkedIn, Twitter/X, Facebook, Instagram). Used when AI is asked “how do I reach BrandName?”
Preferred citation format. Text matching patterns like “cite as”, “preferred reference”, “please cite”, “brand name:”, or “official name:”. Tells AI how you want to be named in answers: full name, abbreviation, or with a tagline.

Each section found adds 2 points to a separate quality grade (out of 10) reported alongside the score. The main signal score also picks up bonus points for structure: any markdown headings (+1), any links (+1), any descriptive lines over 50 characters (+1), and a file longer than 10 lines (+1). The score is capped at 8.
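The scoring arithmetic described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Cited's scanner code: the function names `main_score` and `quality_grade` are hypothetical, and the exact tests for "markdown heading" and "descriptive line" are guesses at what the scanner checks.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+\S")   # assumed definition of a markdown heading
LINK = re.compile(r"https?://\S+")


def main_score(body: str) -> int:
    """Return the 0-8 main score for a fetched llms.txt body (sketch)."""
    if not body.strip():
        return 0  # missing or empty file scores 0/8
    lines = body.splitlines()
    score = 4  # file exists with content
    if any(HEADING.match(ln) for ln in lines):
        score += 1  # any markdown headings
    if LINK.search(body):
        score += 1  # any links
    # Assumption: a "descriptive" line is any non-heading line over 50 characters.
    if any(len(ln) > 50 and not ln.startswith("#") for ln in lines):
        score += 1
    if len(lines) > 10:
        score += 1  # file longer than 10 lines
    return min(score, 8)  # cap at 8


def quality_grade(sections_found: int) -> int:
    """Return the 0-10 quality grade: 2 points per section found."""
    return 2 * sections_found
```

Under these assumptions, a one-line file with no headings or links scores 4, and a file that trips all four structural bonuses scores 8. The quality grade is computed separately, so a file can reach 8/8 on the main score while still missing sections.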

Verification

You can verify our finding yourself in any browser.

Step 1: Check the file exists. Visit https://yoursite.com/llms.txt directly. If you see plain text or markdown content, the file exists. If you get a 404, you don’t have one. Also try the www or non-www variant. Cited tests both, so the file at https://www.yoursite.com/llms.txt works even if your canonical is the apex domain.

Step 2: Check for the five sections. The llms.txt spec (llmstxt.org) is intentionally loose, but the scanner grades against five practical sections. Open your file and check:

Two or more lines of prose describing your company (each over 40 characters, not lists or headings)
A list of products or services (markdown list with - or *, or numbered)
Two or more https:// links to key pages
A contact method — email, social link, or contact/support URL
A citation hint — text saying how to refer to your brand

Step 3: Compare with a known-good example. Anthropic’s docs.anthropic.com/llms.txt and Mintlify’s mintlify.com/docs/llms.txt are widely cited references. They use the spec’s standard structure: an H1 with the brand name, a blockquote summary, and sectioned link lists. Yours doesn’t have to match exactly, but the structural pattern works.

Step 4: Check for the full-content variant. Some sites publish both /llms.txt (curation file) and /llms-full.txt (full content dump for AI training). Cited only checks the first. If you have both, the curation file is what matters for citation routing.

If your verification disagrees with Cited’s finding, that’s a bug. Let us know.

Technical detail

llms.txt is an emerging convention proposed by Jeremy Howard in September 2024 and codified at llmstxt.org. It is not yet an RFC or IETF standard; adoption is voluntary and the spec is evolving. AI platforms including Anthropic, OpenAI, Perplexity, and Mintlify have published their own llms.txt files as reference implementations, signaling community acceptance.

Parsing logic. The scanner does no XML or strict schema parsing, because llms.txt is unstructured markdown by design. Detection is heuristic:

File existence: HTTP 200 with non-empty body at /llms.txt
Prose detection: lines longer than 40 characters that don’t start with #, -, or http
List detection: lines matching ^\s*[-*]\s+\S (markdown bullets) or ^\s*\d+[.)]\s+\S (numbered)
Link detection: regex match against https?://\S+
Contact detection: email regex ([\w.-]+@[\w.-]+\.\w{2,}), contact URL pattern (/contact, /support, /help), or social platform URLs (LinkedIn, Twitter/X, Facebook, Instagram)
Citation hint detection: case-insensitive string matching against six phrases including “cite as”, “please cite”, “brand name:”, “official name:”, “preferred reference”
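The heuristics listed above could be expressed roughly as follows. This is a sketch, not the scanner's source: `detect_sections` and `quality_grade` are hypothetical names, the contact-URL and social-URL patterns are assumptions (the list names the paths and platforms but not the exact expressions), and only the five citation phrases named above are included.

```python
import re

BULLET = re.compile(r"^\s*[-*]\s+\S")        # markdown bullets
NUMBERED = re.compile(r"^\s*\d+[.)]\s+\S")   # numbered items
LINK = re.compile(r"https?://\S+")
EMAIL = re.compile(r"[\w.-]+@[\w.-]+\.\w{2,}")
# Assumed URL shapes for contact pages and social profiles.
CONTACT_URL = re.compile(r"https?://\S*/(?:contact|support|help)\b", re.I)
SOCIAL_URL = re.compile(
    r"https?://(?:www\.)?(?:linkedin|twitter|x|facebook|instagram)\.com/", re.I
)
# Only the five phrases named in the text; any further phrase is unknown here.
CITATION_PHRASES = (
    "cite as",
    "please cite",
    "brand name:",
    "official name:",
    "preferred reference",
)


def detect_sections(body: str) -> dict:
    """Return a found/not-found flag for each of the five graded sections."""
    lines = body.splitlines()
    # Prose: lines over 40 characters that don't start with #, -, or http.
    prose = [
        ln for ln in lines
        if len(ln) > 40 and not ln.startswith(("#", "-", "http"))
    ]
    items = [ln for ln in lines if BULLET.match(ln) or NUMBERED.match(ln)]
    lowered = body.lower()
    return {
        "company_description": len(prose) >= 2,
        "product_list": len(items) >= 2,
        "key_urls": len(LINK.findall(body)) >= 2,
        "contact": bool(
            EMAIL.search(body)
            or CONTACT_URL.search(body)
            or SOCIAL_URL.search(body)
        ),
        "citation_format": any(p in lowered for p in CITATION_PHRASES),
    }


def quality_grade(sections: dict) -> int:
    """Two points per section found, for a 0-10 grade."""
    return 2 * sum(sections.values())
```

Because each check is an independent pattern match, the flags map directly onto per-section recommendations: any key that comes back `False` names a section worth adding.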

Quality grade vs main score. The signal returns two numbers. The main score (0-8) drives the GEO Score pillar. The quality grade (0-10) drives the per-section recommendations surfaced in the Opportunity card, telling the user specifically which sections to add. Both are persisted in the scan result’s details field.

Edge cases the scanner handles:

www/non-www mismatch — If https://yoursite.com/llms.txt fails but https://www.yoursite.com/llms.txt succeeds, the scanner finds it. Either variant counts.
Redirects — HTTP 301/302 are followed. The final URL after redirect is what gets graded.
HTML returned instead of plain text — Some misconfigured servers return the homepage at /llms.txt. The scanner still grades the response content; HTML-only pages typically miss list and citation patterns and score low on quality.
Empty files — A 200 response with an empty body counts as missing. We require at least some content.
Connection timeout — 8-second timeout. Files that take longer to serve are treated as missing.
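The edge cases above suggest a fetch routine along these lines. It is a minimal sketch using only the Python standard library, not Cited's implementation: `candidate_urls`, `http_get`, and `find_llms_txt` are hypothetical names, treating a whitespace-only body as empty is an assumption, and `urlopen`'s `timeout` applies to each blocking socket operation rather than to the whole request, so it only approximates an 8-second limit.

```python
import http.client
import urllib.request
from typing import Callable, List, Optional, Tuple

TIMEOUT_SECONDS = 8


def candidate_urls(host: str) -> List[str]:
    """Return the primary /llms.txt URL, then its www or non-www variant."""
    host = host.strip().lower()
    alt = host[4:] if host.startswith("www.") else "www." + host
    return [f"https://{host}/llms.txt", f"https://{alt}/llms.txt"]


def http_get(url: str) -> Optional[str]:
    """Return the body on a final HTTP 200, else None.

    urlopen follows 301/302 redirects by default and raises HTTPError
    (a subclass of OSError) for 4xx/5xx responses.
    """
    try:
        with urllib.request.urlopen(url, timeout=TIMEOUT_SECONDS) as resp:
            if resp.status != 200:
                return None
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return None  # 404, 500, DNS failure, or timeout all mean "missing"


def find_llms_txt(
    host: str, fetch: Callable[[str], Optional[str]] = http_get
) -> Optional[Tuple[str, str]]:
    """Try each candidate URL; return (url, body) for the first non-empty hit."""
    for url in candidate_urls(host):
        body = fetch(url)
        if body and body.strip():  # empty body counts as missing
            return url, body
    return None
```

Passing `fetch` as a parameter keeps the fallback logic testable without network access: a dictionary's `get` method or a small stub can stand in for the real HTTP call.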

What this signal does not measure:

/llms-full.txt — the long-form content dump variant some sites publish alongside the curation file. AI models may use it for training; Cited doesn’t currently grade it.
llms.txt format compliance against the llmstxt.org spec. The spec proposes a specific H1 + blockquote + sectioned-list structure; the scanner uses looser heuristics so non-spec-conforming files that still cover the five sections get credit.
Whether AI models actually read your llms.txt. Adoption is uneven across platforms — Anthropic and Perplexity reportedly consult it; OpenAI’s posture is unclear; Google has not committed. The signal measures availability, not utilization.
Frequency of update. A stale llms.txt with old product names or removed links still scores well if the structural sections are present.

For most brands without an llms.txt today, creating one is a quick fix — under 30 minutes for a tight, well-structured file. The format is forgiving; the curation signal matters more than the markup precision. See also: robots.txt AI Crawler Access.
