Content Freshness

AI models prefer recent content for time-sensitive queries, and most queries are more time-sensitive than they appear. A 2019 datestamp on an authoritative page gets flagged as outdated even when the underlying answer hasn’t changed. Instead of citing the page, the AI warns that the information may be outdated, and a competitor with a 2026 datestamp gets the placement.

Methodology

Cited samples up to 5 pages from your site and extracts every date it can find on each. The signal evaluates the most recent date across the entire sample — not per page, not an average — because AI models look at the freshest available timestamp when deciding whether to cite a page or treat its content as stale. Dates come from three structured sources:

<time> elements with a datetime attribute (<time datetime="2026-05-15">) or visible text content
Open Graph article meta tags — <meta property="article:published_time"> and <meta property="article:modified_time">
JSON-LD datePublished and dateModified fields inside any structured data block

The scanner aggregates dates across all sampled pages into a single list, parses each string, rejects invalid dates and dates more than a day in the future (typically parsing errors from misformatted strings), then finds the maximum. Scoring runs against the most-recent date’s age in calendar months:

Less than 3 months old → 5/5. Fresh content — AI models treat the page as current.
3 to 12 months old → 3/5. Moderately fresh — AI may cite but with implicit recency caveats.
More than 12 months old → 1/5. Stale — AI models systematically downweight or skip for time-sensitive queries.
No dates found anywhere → 1/5. Same tier as stale, because AI models can’t distinguish “fresh content with no datestamp” from “old content the author chose not to date.”
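
The tiers above can be sketched as a small scoring function. This is an illustration, not Cited’s implementation: the names monthsBetween and freshnessScore are hypothetical, and counting only completed months in UTC is an assumption, since the text does not say how “calendar months” are counted.

```javascript
// Hypothetical helper: whole calendar months elapsed between two Dates, in UTC.
// Assumption: a month counts only once the day-of-month has been reached.
function monthsBetween(earlier, later) {
  let months =
    (later.getUTCFullYear() - earlier.getUTCFullYear()) * 12 +
    (later.getUTCMonth() - earlier.getUTCMonth());
  if (later.getUTCDate() < earlier.getUTCDate()) months -= 1;
  return Math.max(0, months); // dates slightly in the future count as age 0
}

// Maps the most recent valid date (or null when no date was found) to a score.
function freshnessScore(mostRecent, now = new Date()) {
  if (mostRecent === null) return 1; // no dates: same tier as stale
  const ageInMonths = monthsBetween(mostRecent, now);
  if (ageInMonths < 3) return 5;   // fresh
  if (ageInMonths <= 12) return 3; // moderately fresh
  return 1;                        // stale
}
```

Passing now explicitly keeps the function deterministic, which makes the tier boundaries easy to check by hand.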

The signal scores out of 5. Status thresholds: pass at 4/5 or above (only the under-3-month tier reaches it), partial at 2/5 or above (the 3-to-12-month tier), fail below.

Why use the most recent date instead of per-page averages? AI citation favors the freshest available source, and one recently updated cornerstone page can pull the whole site’s perceived freshness up. Conversely, a site where every page is 3 years old has no cornerstone to anchor freshness against. The aggregation reflects what AI actually consumes.
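
A minimal sketch of the pooling step and the status mapping follows, under stated assumptions. The input shape (an array of per-page date-string arrays) and the names mostRecentAcrossPages and freshnessStatus are hypothetical. parseDate mirrors the filter described in this document, but its signature here is assumed.

```javascript
const ONE_DAY_MS = 86400000;

// Sketch of the parse-and-filter step: returns null for unparseable strings
// and for dates more than one day ahead of `now` (a millisecond timestamp).
function parseDate(dateStr, now) {
  const time = new Date(dateStr).getTime();
  if (Number.isNaN(time) || time > now + ONE_DAY_MS) return null;
  return new Date(time);
}

// Hypothetical aggregation: `pages` is an array of per-page date-string arrays.
// All strings are pooled and the single newest valid date is returned.
function mostRecentAcrossPages(pages, now = Date.now()) {
  const times = pages
    .flat()
    .map(s => parseDate(s, now))
    .filter(d => d !== null)
    .map(d => d.getTime());
  return times.length ? new Date(Math.max(...times)) : null;
}

// Status thresholds from the text: pass at 4/5, partial at 2/5, fail below.
function freshnessStatus(score) {
  if (score >= 4) return 'pass';
  if (score >= 2) return 'partial';
  return 'fail';
}
```

Because the maximum is taken over the pooled list, a single recent date on any sampled page decides the result, whatever the other pages carry.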

Verification

You can verify our finding yourself in a browser.

Step 1: Open the pages we sampled. Cited reports the URLs we tested. Open each in a new tab.

Step 2: Extract dates via the console. Open DevTools (Cmd+Option+I / Ctrl+Shift+I), switch to the Console tab, and run:

[
  // <time>: prefer the datetime attribute, fall back to visible text
  ...[...document.querySelectorAll('time')].map(t => t.getAttribute('datetime') || t.textContent.trim()),
  // Open Graph article meta tags
  document.querySelector('meta[property="article:published_time"]')?.content,
  document.querySelector('meta[property="article:modified_time"]')?.content,
  // JSON-LD: top-level datePublished/dateModified only; malformed blocks are skipped
  ...[...document.querySelectorAll('script[type="application/ld+json"]')].flatMap(s => {
    try { const d = JSON.parse(s.textContent); return (Array.isArray(d) ? d : [d]).flatMap(o => [o?.datePublished, o?.dateModified]); }
    catch { return []; }
  })
].filter(Boolean)

This returns every date string the scanner sees on this page. The most recent one across all sampled pages is the value the score is based on.

Step 3: Validate the most recent date. Compare what the console returns to what Cited reports as the most recent date. They should match. If they don’t, the scanner may have rejected a date as invalid or as more than a day in the future. Try parsing it manually with new Date('your-string'): if isNaN(parsed.getTime()) is true, the string is unparseable.

Step 4: Check for missing dates on important pages. If your homepage, top blog posts, or cornerstone product pages don’t have any of the three date sources, AI models can’t tell when the content was last refreshed. Adding <time datetime="…"> to bylines or dateModified to JSON-LD on those pages is the fastest fix.

If your verification disagrees with Cited’s finding, that’s a bug. Let us know.
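
For Step 4, a small console helper can report which of the three date sources a page carries. This is a sketch: dateSourcesPresent is a hypothetical name, and it checks only whether each source is present, not whether its value parses.

```javascript
// Hypothetical helper: reports which of the three date sources exist on a page.
// Pass `document` when running it in the browser console.
function dateSourcesPresent(doc) {
  const hasTime = doc.querySelectorAll('time').length > 0;
  const hasOpenGraph = doc.querySelector(
    'meta[property="article:published_time"], meta[property="article:modified_time"]'
  ) !== null;
  const hasJsonLd = [...doc.querySelectorAll('script[type="application/ld+json"]')].some(s => {
    try {
      const d = JSON.parse(s.textContent);
      // Top-level objects only, matching the extraction snippet above.
      return (Array.isArray(d) ? d : [d]).some(o => Boolean(o && (o.datePublished || o.dateModified)));
    } catch {
      return false; // malformed JSON-LD counts as absent
    }
  });
  return { hasTime, hasOpenGraph, hasJsonLd };
}
```

Running dateSourcesPresent(document) on a cornerstone page and seeing all three values false suggests the page is effectively undated for this signal.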

Technical detail

Content freshness as a citation signal traces to information retrieval research from the early 2000s, formalized in Google’s “QDF” (Query Deserves Freshness) algorithm description around 2007. AI models inherited the same recency weighting: for queries about evolving topics (product comparisons, pricing, regulatory guidance, anything tied to a year), the freshest credible source ranks first.

Extraction logic. The crawler runs three independent date extraction passes per page and concatenates the results into one list:

<time> elements — every <time> tag in the rendered DOM contributes either its datetime attribute value (preferred) or its trimmed textContent. If both exist, datetime wins because it’s the machine-readable form.
Open Graph meta tags — the article:published_time and article:modified_time meta properties are read directly from <head>. Both contribute if both exist.
JSON-LD dates — every JSON-LD block is parsed (with errors silently swallowed); each parsed object’s datePublished and dateModified fields are extracted. Arrays at the top level are walked. The scanner does NOT recurse into nested objects like mainEntity — only top-level dates count.
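
The JSON-LD pass described above can be sketched as a pure function over the raw text of each block. The name extractJsonLdDates and the input shape (an array of strings) are assumptions made for illustration. The behavior follows the description: malformed blocks are skipped, top-level arrays are walked, and nested objects are ignored.

```javascript
// Hypothetical helper: collects top-level datePublished/dateModified values
// from raw JSON-LD block contents, without recursing into nested objects.
function extractJsonLdDates(blocks) {
  const dates = [];
  for (const raw of blocks) {
    let data;
    try {
      data = JSON.parse(raw);
    } catch {
      continue; // malformed block: skip silently
    }
    const items = Array.isArray(data) ? data : [data];
    for (const item of items) {
      if (item === null || typeof item !== 'object') continue;
      for (const key of ['datePublished', 'dateModified']) {
        if (typeof item[key] === 'string') dates.push(item[key]);
      }
    }
  }
  return dates;
}
```

In this sketch a dateModified that exists only inside mainEntity is never collected, which is the practical consequence of the top-level-only rule.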

The combined list is passed through parseDate(), which calls new Date(dateStr), rejects the result when getTime() returns NaN (an invalid date), and rejects values more than 86,400,000 ms (one day) in the future. This catches the most common parsing errors: malformed timestamps, accidental epoch values, and content-management systems that emit far-future dates as drafts.

Score calculation. Valid dates are sorted descending. The most recent date’s calendar-month delta from Date.now() determines the tier: < 3 → 5, <= 12 → 3, else 1. Sites with zero valid dates score 1, identical to the stale tier.

Edge cases the scanner handles:

Multiple date formats: 2026-05-15, May 15, 2026, 2026-05-15T14:30:00Z, and Fri, 15 May 2026 14:30:00 GMT all parse via new Date() and resolve to May 15, 2026, though not to the same instant: the date-only forms resolve to midnight, the timestamped forms to 14:30 UTC. The scanner doesn’t require a specific format because new Date() accepts most common forms.
Future-dated drafts: some CMSes emit dateModified slightly in the future when content is being staged. The scanner accepts up to 24 hours of future drift but rejects anything beyond. Rejecting every future date outright would miss legitimate scheduled-publish workflows.
Pages with no visible date but dates in JSON-LD: modern sites often strip visible bylines from the design but still emit datePublished in Article schema. These pages count as dated because the scanner sees the JSON-LD. A clean design isn’t penalized as long as the structured data carries the timestamp.
<time> elements without a datetime attribute: visible text like <time>May 15, 2026</time> falls back to the text content. The string is then parsed; failures are silently dropped. Most human-readable date phrasings (“May 15, 2026”, “15 May 2026”) parse; ones with relative terms (“yesterday”, “3 days ago”) don’t.
Time zones: how new Date() reads a string without a time zone depends on its form. Date-only ISO strings (2026-05-15) are interpreted as UTC, while date-time strings without an offset (2026-05-15T14:30:00) and most non-ISO strings are interpreted in the local time zone of the machine running the parser. A site whose datePublished lacks an offset suffix such as +00:00 might therefore appear off by one day near the day boundary, but this affects the month-bucket score only when the date sits right at a tier boundary.
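
The time zone behavior can be checked directly. The snippet below is a small demonstration of standard Date parsing rather than scanner code, and the variable names are illustrative.

```javascript
// Date-only ISO form: the ECMAScript spec treats it as UTC midnight.
const dateOnly = new Date('2026-05-15');
// ISO date-time without an offset: treated as local time.
const localDateTime = new Date('2026-05-15T00:00:00');
// Explicit UTC marker: unambiguous everywhere.
const explicitUtc = new Date('2026-05-15T00:00:00Z');

// Minutes by which the local reading differs from the UTC reading.
// This equals getTimezoneOffset() for that date, so it is 0 on a UTC machine.
const driftMinutes = (localDateTime.getTime() - explicitUtc.getTime()) / 60000;

console.log(dateOnly.toISOString()); // 2026-05-15T00:00:00.000Z
console.log(driftMinutes);
```

Emitting timestamps with an explicit offset (or a trailing Z) removes the ambiguity regardless of where the parser runs.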

What this signal does not measure:

Whether the content was actually updated. A site can update its dateModified field nightly via cron without changing any content. The scanner trusts the timestamp; AI models often cross-reference content changes against timestamps but Cited doesn’t.
Per-page freshness. A site with one updated-yesterday cornerstone page and four 5-year-old pages scores 5/5 because the most-recent date is recent. The aggregation rewards having any fresh anchor; it doesn’t penalize per-page staleness.
The right page being fresh. Updating a footer disclaimer’s timestamp doesn’t help if the article AI wants to cite is still dated 2019. The scanner sees the freshest date anywhere across the sample, not the freshest date on the page AI would actually quote.
Content type appropriateness. An evergreen explainer doesn’t need a 2026 timestamp the way a “Best products of 2026” listicle does. The signal applies the same threshold to all pages because the scanner has no content-type awareness.

For brands scoring 1-3/5, the highest-leverage fix is auditing the top 20 pages by traffic, finding the ones with stale or missing dateModified values, refreshing the underlying content, and updating the timestamp. The order matters: bumping timestamps without refreshing content is a signal AI models eventually detect and penalize. Most CMSes (WordPress with Yoast, Webflow CMS, Ghost) emit dateModified automatically when a post is republished.

See also: JSON-LD Structured Data, Author Attribution, Sitemap Accessibility.
