Why not schema markup (yet)
Cited’s docs are built on Mintlify, a documentation platform we chose for its built-in llms.txt generation, clean design, and operational simplicity. Mintlify uses React Server Components for rendering, and we found through production testing that custom JSON-LD injected via MDX components does not survive the RSC serialization pipeline: the schema content ends up in the JavaScript RSC payload rather than as parseable <script type="application/ld+json"> DOM elements.
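The difference between schema that lives in a real script element and schema that is only embedded in a JavaScript payload can be checked with a few lines of Python. The sketch below is illustrative rather than the test we ran in production: the function name parseable_jsonld and both sample HTML strings are hypothetical, and the payload sample only imitates the general shape of an RSC payload. It reads raw HTML the way a crawler that does not execute JavaScript would, and returns only the JSON-LD it can actually parse.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def parseable_jsonld(html):
    """Return the JSON-LD objects readable from raw HTML without running JavaScript."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    parsed = []
    for block in extractor.blocks:
        try:
            parsed.append(json.loads(block))
        except json.JSONDecodeError:
            pass  # a malformed block is not usable as structured data
    return parsed


# Hypothetical page where the schema is a real DOM element.
DOM_ELEMENT = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "FAQPage"}</script></head></html>'
)

# Hypothetical page where the same schema only appears as an escaped
# string inside a JavaScript payload.
RSC_PAYLOAD = (
    '<html><body><script>self.__next_f.push('
    '[1,"{\\"@type\\":\\"FAQPage\\"}"])</script></body></html>'
)

print(parseable_jsonld(DOM_ELEMENT))
print(parseable_jsonld(RSC_PAYLOAD))
```

In the second sample the schema text is present in the response body, but no element carries the application/ld+json type, so a parser looking for structured data finds nothing.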
We evaluated several workarounds:
- Frontmatter fields: Mintlify does not currently support a structuredData or jsonLd frontmatter field
- Config-based injection: Mintlify’s docs.json does not support a customHead or head option for arbitrary HTML
- Build-time scripts: Mintlify does not expose prebuild hooks that would allow injecting schema into the generated HTML
- Google Tag Manager: GTM-injected schema executes client-side via JavaScript, which means most AI crawlers (GPTBot, ClaudeBot, PerplexityBot) would not see it. GTM would have partially solved the problem for traditional search engines at the cost of added complexity, without reaching our primary use case.
What we do instead
Four mechanisms carry the discoverability load without requiring custom schema.
Auto-generated llms.txt and llms-full.txt
Mintlify generates /llms.txt with comprehensive page listings and /llms-full.txt with the full markdown content of every page. Both files are served at the domain root and referenced via HTTP Link headers on every page response. This is a strong mechanism for LLM discovery and on-demand content retrieval, arguably more directly useful for AI search than JSON-LD.
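To make the page-listing idea concrete, here is a minimal sketch of how a client might turn an llms.txt file into a list of pages to fetch on demand. It assumes the markdown link-list layout used by the llms.txt proposal, with entries of the form "- [Title](url): description"; the exact output Mintlify produces may differ. The function name parse_llms_txt, the sample file, and the example.com URLs are all hypothetical.

```python
import re

# One listing entry: "- [Title](url)" with an optional ": description" suffix.
LINK_LINE = re.compile(
    r"^\s*-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Extract page entries from llms.txt-style markdown, skipping other lines."""
    pages = []
    for line in text.splitlines():
        match = LINK_LINE.match(line)
        if match:
            pages.append(
                {
                    "title": match["title"],
                    "url": match["url"],
                    "description": (match["desc"] or "").strip(),
                }
            )
    return pages


# Hypothetical file contents, not the real listing for any site.
SAMPLE = """# Example Docs

> Hypothetical summary line.

## Docs

- [Quickstart](https://example.com/quickstart.md): Set up in five minutes
- [FAQ](https://example.com/faq.md)
"""

for page in parse_llms_txt(SAMPLE):
    print(page["title"], page["url"])
```

Headings and the summary line are ignored here because only the link entries are needed to decide which pages to retrieve.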
Textual citability patterns
The content on every page follows a deliberate citability structure:
- Definition-first ledes: the first paragraph is a Wikipedia-style definition of the page subject
- Entity-first sentences: brand names and concepts appear before numbers and data
- Table pre-summaries: natural-language sentences summarize table content before the table itself
- Self-contained FAQ answers: each answer reads standalone without requiring the question for context
- Explicit comparisons: both values appear in the same sentence when comparing
Clean HTML structure
Mintlify renders semantic HTML with proper heading hierarchy, table markup, code blocks, and internal link structure. AI crawlers receive the same full content as human browsers, verified via user-agent-specific crawl tests.
Comprehensive crawler allowlist
The robots.txt file explicitly allows all major AI crawlers (GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, and xAI) alongside traditional search engine crawlers.
What we lose by not having custom schema
To be transparent about the tradeoffs:
- Google’s rich results for FAQ pages (the expandable Q&A displayed in search results) are not available for our docs
- The DefinedTerm entity-graph signal that helps Google’s Knowledge Graph is not present
- A structural signal that reinforces content type and authority is missing
When this may change
We will revisit this decision when any of the following happens:
- Mintlify adds native JSON-LD support. The most likely path. Their platform evolves regularly and structured data is a common request from technical docs sites.
- A new mechanism emerges. Server-side injection patterns, edge workers, or other approaches may become viable over time.
- The tradeoff calculation changes. If schema markup becomes demonstrably critical for AI citation beyond its current role as one signal among several, the GTM workaround or a platform migration may become worth the cost.
Related concepts
- What sources LLMs cite — the broader framework of how LLMs select content to cite
- Benchmarks methodology — how the Cited Index data is constructed
- Citations vs mentions — the basic vocabulary distinction