JSON-LD tells AI models exactly what your page is about — an Organization, a Product, an Article, an FAQ. Without it, AI models guess from prose, and the guesses are often wrong: a product page reads as a blog post, a service page reads as a corporate brochure, and the AI cites accordingly.

Methodology

Cited samples up to 5 pages from your site and extracts every JSON-LD block on each. Schema.org types declared on any sampled page count toward the score: the scanner aggregates across the sample rather than per page, so a Product type declared on one page and an Organization type on another both contribute.

Schema types are scored in two value tiers based on how strongly AI models use each type for grounded answers:
  • High-value types (2 points each): Organization, LocalBusiness, Product, Article, BlogPosting, NewsArticle, FAQPage, HowTo. These are the eight types AI models most reliably ground citations against. They answer “who are you,” “what do you sell,” “what do you say about X,” and “how do I do X.”
  • Medium-value types (1 point each): WebSite, WebPage, BreadcrumbList, Review, Service, Person, ItemList, SoftwareApplication. These add context that AI models use as supporting structure, not primary grounding.

The score sums points across all unique types found in the sample, capped at 8. A site with Organization + Product + Article declared scores 6/8 (three high-value at 2 each). A site with Organization + BreadcrumbList scores 3/8 (one high-value, one medium). A site with JSON-LD present but no scored type (for example, only Thing or a custom type) falls through to a 1-point floor for “JSON-LD on the page but not in a useful form.”

Status thresholds: pass at 6/8, partial at 3/8, fail below.

Why aggregate across the sample? AI citation patterns favor sites that declare structured data consistently: a site with Organization schema in the global layout and Product schema on product pages signals editorial discipline across the whole template. The signal rewards declaring the right type for each page type, not piling all schemas onto one page.
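The scoring arithmetic described above can be sketched in a few lines. This is an illustrative reconstruction, not Cited’s actual code: the function names scoreSchemaTypes and statusFor and the parsedAnyBlock parameter are hypothetical, while the type lists, point values, cap, floor, and thresholds come from this page.

```javascript
// Type lists and point values copied from the tiers described above.
const HIGH_VALUE = new Set([
  "Organization", "LocalBusiness", "Product", "Article",
  "BlogPosting", "NewsArticle", "FAQPage", "HowTo",
]);
const MEDIUM_VALUE = new Set([
  "WebSite", "WebPage", "BreadcrumbList", "Review",
  "Service", "Person", "ItemList", "SoftwareApplication",
]);
const MAX_SCORE = 8;

// Hypothetical helper. `types` holds every @type string found across the
// sample; `parsedAnyBlock` says whether at least one JSON-LD block parsed.
function scoreSchemaTypes(types, parsedAnyBlock = true) {
  let points = 0;
  for (const type of new Set(types)) { // duplicates count once
    if (HIGH_VALUE.has(type)) points += 2;
    else if (MEDIUM_VALUE.has(type)) points += 1;
  }
  // No scored type: 1-point floor if JSON-LD was present, otherwise 0.
  if (points === 0) return parsedAnyBlock ? 1 : 0;
  return Math.min(points, MAX_SCORE);
}

// Status thresholds: pass at 6/8, partial at 3/8, fail below.
function statusFor(score) {
  if (score >= 6) return "pass";
  if (score >= 3) return "partial";
  return "fail";
}
```

With these definitions, scoreSchemaTypes(["Organization", "Product", "Article"]) gives 6 (pass), scoreSchemaTypes(["Organization", "BreadcrumbList"]) gives 3 (partial), and scoreSchemaTypes(["Thing"]) gives the 1-point floor (fail). Five high-value types would sum to 10 and be capped at 8.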

Verification

You can verify our finding yourself in a browser.

Step 1: Open the pages we sampled. Cited reports the URLs we tested. Open each in a new tab.

Step 2: Extract JSON-LD blocks via the console. Open DevTools (Cmd+Option+I / Ctrl+Shift+I), select the Console tab, and run:
[...document.querySelectorAll('script[type="application/ld+json"]')].map(s => { try { return JSON.parse(s.textContent); } catch { return 'INVALID JSON'; } })
This returns every JSON-LD block on the page. Each entry is either a parsed object (success) or 'INVALID JSON'. The scanner silently skips invalid blocks, so fix the JSON if you see this.

Step 3: List the @type declarations. For each parsed block, find the @type field. It’s either a string ("Article"), an array (["Article", "BlogPosting"]), or nested inside @graph (one block wrapping multiple typed objects). Compare against the scored type lists above: every match adds points.

Step 4: Validate with Google’s Rich Results Test. Paste the page URL into Google’s Rich Results Test. It returns every schema Google’s parser detects, plus any validation errors. The same JSON-LD AI models consume from your page is visible here. Schema.org’s own Schema Markup Validator gives a stricter view without Google’s rich-results filter.

If your verification disagrees with Cited’s finding, that’s a bug. Let us know.
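For Step 3, a small helper can list the declared types and look each one up in the scored lists, which saves reading the blocks by hand. The sketch below is a convenience for manual checking, not part of Cited: the name tierReport is hypothetical, and it assumes the input is the array produced by the Step 2 snippet.

```javascript
// Point values copied from the scored type lists in the Methodology section.
const POINTS = new Map([
  ...["Organization", "LocalBusiness", "Product", "Article",
      "BlogPosting", "NewsArticle", "FAQPage", "HowTo"].map((t) => [t, 2]),
  ...["WebSite", "WebPage", "BreadcrumbList", "Review",
      "Service", "Person", "ItemList", "SoftwareApplication"].map((t) => [t, 1]),
]);

// Hypothetical helper: `blocks` is the array returned by the Step 2 snippet.
function tierReport(blocks) {
  const seen = new Set();
  for (const block of blocks) {
    // Expand top-level arrays and @graph members; nothing deeper is visited.
    const objects = (Array.isArray(block) ? block : [block]).flatMap((obj) =>
      [obj, ...(Array.isArray(obj?.["@graph"]) ? obj["@graph"] : [])]);
    for (const obj of objects) {
      if (obj === null || typeof obj !== "object") continue; // e.g. 'INVALID JSON'
      // @type may be a single string or an array of strings.
      for (const type of [].concat(obj["@type"] ?? [])) {
        if (typeof type === "string") seen.add(type);
      }
    }
  }
  // One row per unique type; 0 points means the type is not on either list.
  return [...seen].map((type) => ({ type, points: POINTS.get(type) ?? 0 }));
}
```

In the console, store the Step 2 result in a variable (say, blocks) and run console.table(tierReport(blocks)). Rows with 0 points are types that are present but not on either scored list.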

Technical detail

Schema.org was launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining later that year, to standardize structured data declarations. JSON-LD (JSON for Linked Data) was added as a Schema.org-recognized format in 2013 and is now the dominant serialization. Modern documentation prefers it over microdata and RDFa because it lives in its own <script> element (typically in the <head>), separate from the rendered HTML, which makes it easier for both authors and parsers to work with.

Extraction logic. Cited’s scanner uses regex against the raw HTML to find JSON-LD blocks:
  • Pattern: <script type="application/ld+json">…</script> (case-insensitive on the type attribute)
  • Each block’s text content is passed to JSON.parse() — blocks that throw are silently skipped (an invalid JSON-LD block contributes nothing but doesn’t crash the scan)
  • The parsed value is walked shallowly: a top-level array expands into its elements, and a top-level @graph array expands into its member objects. Nested objects in fields like mainEntity are NOT traversed (only top-level and @graph traversal is performed)
  • Each object’s @type is read — strings count as one type, arrays count as multiple types
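The first two extraction steps (find the script blocks, then parse each one and skip failures) could look roughly like the sketch below. The scanner’s actual regex is not published here, so the pattern and the function name extractJsonLdBlocks are assumptions chosen to match the behavior described in the bullets.

```javascript
// Assumed pattern: matches <script ... type="application/ld+json" ...> blocks,
// case-insensitively, and captures the text between the tags.
const JSON_LD_SCRIPT =
  /<script\b[^>]*\btype\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script\s*>/gi;

// Hypothetical helper: returns the parsed value of every valid JSON-LD block.
function extractJsonLdBlocks(html) {
  const parsed = [];
  for (const match of html.matchAll(JSON_LD_SCRIPT)) {
    try {
      parsed.push(JSON.parse(match[1]));
    } catch {
      // Invalid JSON contributes nothing and does not stop the scan.
    }
  }
  return parsed;
}
```

Because this works on raw HTML rather than a rendered DOM, JSON-LD injected by client-side JavaScript after page load would not be seen by an approach like this.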
Type aggregation. The scanner builds a Set of all types found across all sampled pages. Duplicates within a page (Article declared in two separate blocks) and duplicates across pages (Organization on every page) collapse to a single entry. The set is then scored against the high-value and medium-value lists.

Score calculation. Points = (high-value matches × 2) + (medium-value matches × 1), capped at 8. If no scored types match but at least one JSON-LD block parsed successfully, score = 1. If no JSON-LD blocks parse, score = 0.

Edge cases the scanner handles:
  • @graph arrays — a single JSON-LD block can declare multiple typed objects inside an @graph array (Yoast and RankMath both do this). The scanner walks the array and extracts each object’s @type. A @graph with Organization + WebSite + BreadcrumbList contributes all three types.
  • Top-level arrays — some sites serialize multiple schemas as a top-level array: [{...Article...}, {...Person...}]. The scanner walks each element. Equivalent to @graph in scoring.
  • Array @type values: JSON-LD allows @type: ["Article", "BlogPosting"] to declare a single object as an instance of both types. The scanner counts each string in the array as a separate type, so a block declaring both earns 4 points (2 each).
  • Invalid JSON — a JSON-LD block with a trailing comma, unquoted key, or other syntax error throws on JSON.parse(). The scanner catches the error and continues to the next block. The page isn’t penalized beyond the missing schema.
  • Microdata and RDFa — Schema.org can be expressed inline via itemtype="https://schema.org/Article" (microdata) or typeof="schema:Article" (RDFa). The scanner does NOT extract these formats — only JSON-LD @type declarations count. Sites using microdata get partial credit elsewhere (Article schema’s author field still satisfies the Author Attribution signal) but score 0 here.
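The @graph, top-level array, and array @type cases above can be illustrated with a shallow walker that feeds one Set shared across all sampled pages. This is a sketch under assumptions rather than the scanner’s code: collectTypes and addTypeOf are hypothetical names, and the handling of cases this page does not spell out (such as @graph inside a top-level array element) is a guess.

```javascript
// Shallow walk: top-level arrays and @graph members only, nothing deeper.
function collectTypes(parsed, types = new Set()) {
  const objects = Array.isArray(parsed) ? parsed : [parsed];
  for (const obj of objects) {
    if (obj === null || typeof obj !== "object") continue;
    if (Array.isArray(obj["@graph"])) {
      for (const member of obj["@graph"]) addTypeOf(member, types);
    }
    addTypeOf(obj, types);
  }
  return types;
}

// Reads one object's @type: a string counts once, an array counts per entry.
function addTypeOf(obj, types) {
  if (obj === null || typeof obj !== "object") return;
  const declared = obj["@type"];
  const names = Array.isArray(declared) ? declared : [declared];
  for (const name of names) {
    if (typeof name === "string") types.add(name);
  }
}

// Two sampled pages, each a list of already-parsed JSON-LD blocks.
const pages = [
  [{ "@graph": [{ "@type": "Organization" }, { "@type": "WebSite" },
                { "@type": "BreadcrumbList" }] }],
  [[{ "@type": "Article" }, { "@type": "Person" }],
   { "@type": ["Article", "BlogPosting"], mainEntity: { "@type": "FAQPage" } }],
];

const siteTypes = new Set();
for (const blocks of pages) {
  for (const block of blocks) collectTypes(block, siteTypes);
}
// siteTypes now holds 6 unique types. FAQPage is absent because mainEntity
// is not traversed, and the repeated Article collapses to one entry.
console.log([...siteTypes]);
```

Sharing a single Set is what makes duplicates within a page and across pages collapse automatically, so the scoring step only ever sees unique type names.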
What this signal does not measure:
  • Property completeness. A page with @type: "Article" but no headline, author, or datePublished scores the same as a fully-filled Article schema. The scanner counts type presence, not field richness. Other signals (Author Attribution, Content Freshness) check for specific fields independently.
  • Schema validity beyond JSON parsing. A page with @type: "Prodcut" (typo) parses as valid JSON but doesn’t match any scored Schema.org type. The scanner treats this as “no recognized type” and falls through to the 1-point floor. Validators like Google’s Rich Results Test catch this; Cited reports it implicitly via the missing-types diagnostic.
  • Schema.org version. Schema.org evolves continuously, and some types (e.g., Drug, Course) exist only in recent versions. The scanner uses a fixed scored-type list based on what AI models reliably ground against today — newer or specialized types may be present and useful but don’t score here.
  • @context. The scanner doesn’t validate that @context: "https://schema.org" is declared. Custom contexts that happen to use the same type names still count. This is permissive on purpose — Schema.org’s identity is widely understood and rarely mis-specified.
For brands scoring 0-1/8, the highest-leverage fix is adding Organization schema to the site’s global layout and the appropriate page-type schema (Product for product pages, Article for blog posts, FAQPage for FAQ sections) to each page template. Most modern CMSes have plugins that handle this automatically — WordPress (Yoast, RankMath), Shopify (Smart SEO), Webflow (built-in Open Graph and Schema fields), Next.js (next-seo) — but custom themes often need explicit additions to the <head>. See also: Author Attribution, Open Graph Tags, Content Freshness.
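For custom themes without a plugin, the Organization fix described above comes down to emitting one script tag in the global layout. The sketch below shows one way to generate it. The helper name jsonLdScriptTag is hypothetical, and the organization values (name, URL, logo) are placeholders to replace with your own.

```javascript
// Hypothetical helper: serialize a schema object into a JSON-LD <script> tag.
function jsonLdScriptTag(data) {
  // Escape "<" so a string value containing "</script>" cannot end the tag early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

// Placeholder values: replace with your own organization's details.
const organization = {
  "@context": "https://schema.org",
  "@type": "Organization",
  name: "Example Co",
  url: "https://www.example.com",
  logo: "https://www.example.com/logo.png",
};

// Render this string inside the <head> of the site's global layout.
const organizationTag = jsonLdScriptTag(organization);
console.log(organizationTag);
```

The same helper can serve page-type schemas: pass a Product object from the product template or an Article object from the blog template, so each page type declares its own type rather than every schema landing on one page.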