AI visibility data is inherently noisier than traditional web analytics. LLM responses are non-deterministic, sample sizes are smaller than web traffic datasets, and platform behavior changes as models are updated. This page explains how to interpret Cited’s data with appropriate confidence and what to watch for when drawing conclusions.
Single-run vs aggregated data
A single AI response to a single prompt is one data point. It tells you whether Brand X was mentioned in this specific response, but it does not tell you whether Brand X is generally mentioned for this type of query. The same prompt run a second time may produce a different answer with different brands.
Cited aggregates across multiple runs per query and across multiple queries per brand. Dashboard mention rates are computed from the full set of daily responses, not from any single run. This aggregation is what makes the metrics stable enough to act on.
Sample size considerations
For the Cited Index:
- 253 brands across 8 categories, with 21-55 brands per category
- 185 queries (20-25 per category)
- 740 total responses per monthly refresh
- Category-level percentile distributions (p25, p50, p75) are statistically stable because they aggregate across all brands in a category
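The percentile cut points in the last bullet can be illustrated with a minimal sketch. The mention rates below are made up, and the choice of the standard library's "inclusive" quantile method is an assumption; this page does not say how Cited computes its percentiles.

```python
from statistics import quantiles

# Hypothetical mention rates (percent) for the brands in one category.
category_mention_rates = [2.0, 5.5, 8.0, 12.5, 15.0, 21.0, 27.5, 34.0, 41.0]

# quantiles(..., n=4) returns the three quartile cut points: p25, p50, p75.
# method="inclusive" is an assumption, not Cited's documented method.
p25, p50, p75 = quantiles(category_mention_rates, n=4, method="inclusive")


def quartile_band(rate: float) -> str:
    """Place one brand's mention rate relative to the category quartiles."""
    if rate < p25:
        return "below p25"
    if rate < p50:
        return "p25 to p50"
    if rate < p75:
        return "p50 to p75"
    return "at or above p75"
```

Because each cut point depends on every brand in the category, one brand's noisy reading moves it very little, which is why the category-level distribution is steadier than any single brand's rate.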
| Plan | Queries | Platforms | Daily data points |
|---|---|---|---|
| Starter | 25 | 3 | 75 |
| Pro | 75 | 5 | 375 |
| Scale | 125+ | 5 | 625+ |
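The daily data point figures in the table are consistent with queries multiplied by platforms. The sketch below reproduces that arithmetic and adds a rough margin of error for a mention rate at each sample size. The margin uses a textbook binomial approximation and an illustrative 30% mention rate, both of which are assumptions for this example rather than Cited's published method; responses to the same query on different platforms are probably not fully independent, so real variance may be higher.

```python
import math

# Plan sizes from the table above; "125+" is represented by its minimum.
PLANS = {"Starter": (25, 3), "Pro": (75, 5), "Scale": (125, 5)}


def daily_data_points(queries: int, platforms: int) -> int:
    """Queries times platforms, matching the table's last column."""
    return queries * platforms


def margin_of_error(rate: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin for a proportion (binomial assumption)."""
    return z * math.sqrt(rate * (1.0 - rate) / n)


for plan, (queries, platforms) in PLANS.items():
    n = daily_data_points(queries, platforms)
    moe = margin_of_error(0.30, n)  # 0.30 is an illustrative mention rate
    print(f"{plan}: {n} data points/day, roughly +/-{moe * 100:.1f} points")
```

Under these assumptions the margin shrinks with the square root of the sample size, so pooling several days of responses narrows it in the same way that a larger plan does.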
When to trust a trend
Not all metric movements deserve a reaction. The reliability of a trend depends on its duration and magnitude.
- Single-day changes: Treat as noise unless the change is dramatic (more than 15 percentage points in a single day). Most single-day movements are caused by LLM non-determinism, not by real changes in brand visibility.
- 3-5 day trends: Directional signal. Worth investigating but not acting on. If mention rate drops for 4 consecutive days, start looking for causes, but do not make content or strategy changes based on less than a week of data.
- 2-week trends: Reliable. If a metric moves consistently over two weeks, the change is real. This is the minimum window for making GEO strategy decisions: “our mention rate dropped 5 points over the last two weeks” is actionable.
- Month-over-month: The most reliable comparison window, especially for Cited Index data where monthly refresh cycles align with the comparison period.
Known noise sources
Five things cause legitimate measurement noise that is not a signal of real visibility change:
- LLM non-determinism: the same prompt produces different brand mentions across runs
- Model updates by AI providers: silent updates to ChatGPT, Gemini, or Claude models change response patterns without notice
- Live retrieval variance: search-enabled platforms (Perplexity, ChatGPT) pull different web results across sessions
- Seasonal or news-driven spikes: a brand in the news gets temporary mention rate lifts that regress once the news cycle passes
- Pipeline timing: if a query runs during a platform outage or rate limit, it produces no data for that run
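Because these sources move the daily number without any real change in visibility, the duration rules under "When to trust a trend" are what separate noise from signal. A minimal sketch of those rules follows. The function name and return labels are hypothetical, and the handling of spans the page does not cover (2 days, or 6 to 13 days) is an assumption.

```python
def classify_trend(consecutive_days: int, change_points: float) -> str:
    """Map a metric movement to the guidance in "When to trust a trend".

    consecutive_days: days the metric has moved in the same direction.
    change_points: total change in percentage points over that span.
    """
    if consecutive_days >= 14:
        # Two weeks is the stated minimum window for strategy decisions.
        return "reliable: act"
    if consecutive_days >= 3:
        # The page names 3-5 days; treating 6-13 days the same way is an
        # assumption, since they are still short of the two-week window.
        return "directional: investigate, do not act"
    if consecutive_days == 1 and abs(change_points) > 15:
        return "dramatic single-day change: investigate"
    # Assumption: a 2-day move is treated like ordinary single-day noise.
    return "noise: wait"


print(classify_trend(1, -6.0))
print(classify_trend(4, -6.0))
print(classify_trend(14, -5.0))
```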
What Cited reports vs what Cited recommends
Cited reports raw metrics as computed from the data: mention rates, share of voice, average position, sentiment scores. These are the numbers in your dashboard.
Cited recommends interpreting these numbers through the lens of trend duration and sample size. Look at 2-week moving averages rather than daily snapshots. Compare against category benchmarks rather than absolute thresholds. Do not over-index on single-day movements.
Related concepts
- Non-determinism — the underlying cause of measurement noise
- Mention rate — the primary metric with benchmark data
- Refresh cadence — when data is collected and available
- Cited Index benchmarks — the empirical baseline for comparison
Frequently asked questions
How confident can I be in my brand's mention rate?
At Pro tier (75 queries, 5 platforms, 375 daily data points), mention rate is reliable to within a few percentage points day-over-day. At Starter tier (25 queries, 3 platforms, 75 daily data points), there is more variance. In both cases, 2-week moving averages are the most reliable metric to track.
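A 2-week moving average can be sketched as a trailing mean over the last 14 daily mention rates. This is an illustration only: the function name is hypothetical, and the page does not specify how the dashboard computes its average.

```python
from collections import deque


def moving_average(daily_rates: list[float], window: int = 14) -> list[float]:
    """Trailing mean over up to `window` days, one value per input day."""
    recent: deque[float] = deque(maxlen=window)  # drops the oldest day itself
    smoothed = []
    for rate in daily_rates:
        recent.append(rate)
        # Early days average over fewer than `window` values.
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Hypothetical daily mention rates (percent) with one noisy spike.
daily = [30.0, 31.0, 45.0, 29.0, 30.0]
print(moving_average(daily, window=3))
```

An unweighted mean treats every day equally. If some days have fewer responses than others, for example after a platform outage, weighting each day by its response count would be a reasonable refinement.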
Why not just run more queries to increase confidence?
More queries do increase statistical confidence, but each query costs money. The tier structure (25/75/125+ queries) balances measurement reliability against cost. Deep-dive audits with 200+ prompts are available for brands that need higher-confidence point-in-time snapshots.
How do I distinguish a real drop in mention rate from normal noise?
Check three things: (1) did the drop persist for 3 or more consecutive days, (2) did competitors see similar movement (category-wide shift vs brand-specific), (3) was there a known model update from the AI platform during that period. If (1) is yes and (2) is no (only your brand dropped), investigate. If (1) and (2) are both yes, the shift is category-wide, and (3) tells you whether a model update is the likely cause. If (1) is no, wait.
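The three checks can be written as a small decision function. This is a sketch, not Cited's implementation: the function name and return labels are hypothetical, and the two "monitor" branches are an assumption, because the answer above gives explicit rules only for "investigate" and "wait".

```python
def triage_drop(persisted_3_days: bool,
                competitors_moved: bool,
                known_model_update: bool) -> str:
    """Apply the three checks to a drop in mention rate."""
    if not persisted_3_days:
        return "wait"  # check (1) failed: too short to tell from noise
    if not competitors_moved:
        return "investigate"  # persistent and specific to your brand
    # Assumption: a persistent, category-wide drop is monitored, and
    # check (3) only labels the likely cause.
    if known_model_update:
        return "monitor: likely model update"
    return "monitor: category-wide shift"


print(triage_drop(persisted_3_days=True,
                  competitors_moved=False,
                  known_model_update=False))
```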