Crawl budget is the number of pages a search engine or AI crawler will fetch from a website in a given time period. Crawlers allocate their resources across millions of sites; each site gets a finite share of crawl activity. If a site has many low-value pages (empty stubs, thin content, duplicate pages), crawlers may spend their budget on those instead of reaching the high-value pages.
Why it matters
For AI visibility, crawl budget matters because AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) need to access and index your content before it can appear in AI-generated answers. A site with 10,000 pages where only 50 have substantive content wastes crawler time on the other 9,950. Concentrating crawl budget on the pages that matter, through noindex directives, sitemap curation, and llms.txt, is a basic GEO practice.
How Cited uses it
Cited’s own docs site addresses crawl budget by adding noindex: true to stub pages that have no real content yet, keeping them out of the sitemap until they are populated. This focuses crawler attention on the published pages with substantive content. The site’s robots.txt explicitly allows all major AI crawlers to access all published content without restriction.
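The two mechanisms described here, leaving noindex stubs out of the sitemap and keeping robots.txt open to AI crawlers, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cited's actual build process: the page records, the example.com URLs, the robots.txt contents, and the helper names build_sitemap and blocked_crawlers are all hypothetical. Only the standard library is used (xml.etree.ElementTree and urllib.robotparser).

```python
from urllib.robotparser import RobotFileParser
from xml.etree import ElementTree as ET

# Hypothetical page records. The `noindex` flag stands in for a per-page
# `noindex: true` setting; the URLs are invented for illustration.
PAGES = [
    {"url": "https://example.com/docs/crawl-budget", "noindex": False},
    {"url": "https://example.com/docs/llms-txt", "noindex": False},
    {"url": "https://example.com/docs/coming-soon", "noindex": True},  # stub
]

# Illustrative robots.txt that allows every user agent everywhere.
# This is an assumption, not a copy of any real site's file.
ROBOTS_TXT = """\
User-agent: *
Allow: /
"""

# Crawler names as listed in the text above.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML that lists only pages not marked noindex."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if page["noindex"]:
            continue  # stub pages stay out of the sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["url"]
    return ET.tostring(urlset, encoding="unicode")


def blocked_crawlers(robots_txt, crawlers, url):
    """Return the crawlers that the given robots.txt text bars from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [ua for ua in crawlers if not parser.can_fetch(ua, url)]


if __name__ == "__main__":
    print(build_sitemap(PAGES))
    print(blocked_crawlers(ROBOTS_TXT, AI_CRAWLERS, PAGES[0]["url"]))
```

Running a check like blocked_crawlers as part of a site build is one way to catch a robots.txt change that accidentally shuts out a crawler. An empty list means none of the named crawlers is blocked for that URL.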
Related concepts
- llms.txt — a direct signal to AI crawlers about which pages matter
- Content freshness — crawlers revisit fresh content more frequently
- What sources LLMs cite — crawlability is a prerequisite for citation