Crawl budget is the number of pages a search engine or AI crawler will fetch from a website in a given time period. Crawlers allocate their resources across millions of sites; each site gets a finite share of crawl activity. If a site has many low-value pages (empty stubs, thin content, duplicate pages), crawlers may spend their budget on those instead of reaching the high-value pages.

Why it matters

For AI visibility, crawl budget matters because AI crawlers such as GPTBot, ClaudeBot, and PerplexityBot (and Google's crawlers, whose AI use of content is governed by the Google-Extended robots.txt token) need to fetch your content before it can be included in AI-generated answers. A site with 10,000 pages where only 50 have substantive content wastes crawler time on the other 9,950. Concentrating crawl budget on the pages that matter, through noindex directives, sitemap curation, and llms.txt, is a basic GEO (generative engine optimization) practice.
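As a rough illustration of sitemap curation, the sketch below keeps only substantive pages in a generated sitemap. The `Page` class, the `is_substantive` helper, the 150-word threshold, and the example.com URLs are all assumptions made for this example; they are not part of any particular tool, and a real site would choose its own definition of "thin" content.

```python
from dataclasses import dataclass
from xml.etree import ElementTree as ET

# Assumed cutoff for "thin" content; tune this for your own site.
MIN_WORDS = 150


@dataclass
class Page:
    """Hypothetical page record: URL, body text, and a noindex flag."""
    url: str
    body: str
    noindex: bool = False


def is_substantive(page: Page, min_words: int = MIN_WORDS) -> bool:
    """A page qualifies if it is not marked noindex and has enough words."""
    return not page.noindex and len(page.body.split()) >= min_words


def build_sitemap(pages: list[Page]) -> str:
    """Return sitemap XML that lists only the substantive pages."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in pages:
        if is_substantive(page):
            url_el = ET.SubElement(urlset, "url")
            ET.SubElement(url_el, "loc").text = page.url
    return ET.tostring(urlset, encoding="unicode")


pages = [
    Page("https://example.com/guide", "word " * 400),
    Page("https://example.com/stub", "Coming soon."),
    Page("https://example.com/draft", "word " * 400, noindex=True),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Only the first page ends up in the sitemap: the stub falls under the word threshold and the draft is flagged noindex. Word count is a crude proxy for value, so a production filter would likely combine it with other signals.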

How Cited uses it

Cited’s own docs site addresses crawl budget by adding noindex: true to stub pages that have no real content yet, keeping them out of the sitemap until they are populated. This focuses crawler attention on the published pages with substantive content. The site’s robots.txt explicitly allows all major AI crawlers to access all published content without restriction.
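One way to sanity-check a robots.txt policy like the one described above is with Python's standard `urllib.robotparser`. The robots.txt content below is a hypothetical example, not Cited's actual file, and the `blocked_crawlers` helper and example.com URL are names invented for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that allows the named AI user agents everywhere.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_crawlers(robots_txt: str, url: str, agents=AI_CRAWLERS) -> list[str]:
    """Return the user agents that the given robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


blocked = blocked_crawlers(ROBOTS_TXT, "https://example.com/docs/crawl-budget")
print(blocked or "all listed AI crawlers may fetch this URL")
```

An empty result means none of the listed user agents is blocked for that URL. The check only reflects what robots.txt permits; it says nothing about whether a crawler actually visits the page.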