After an AI platform returns a response, Cited needs to determine which brands were mentioned, where they appeared (position), and how they were characterized (sentiment). This parsing pipeline runs on every response and produces the structured data that powers dashboards, benchmarks, and reports.
Mention extraction
Each AI response is parsed to identify brand name appearances. The parser handles exact brand name matches, common variations and abbreviations, brand names embedded in longer strings, and multiple mentions of the same brand in one response.
The parser uses Claude Haiku (claude-haiku-4-5-20251001) for structured extraction. Every response is sent through a Haiku parsing call that returns a JSON object with brand mentions, positions, sentiment, and framing tags. This parsing call is separate from the main platform query and adds to per-query cost, but using a smaller, faster model for parsing keeps costs manageable at scale.
The parsing prompt instructs Haiku to extract all brand entities from the response text and return structured data: brand name, position in the response (ordinal rank of first mention), sentiment classification, and descriptive framing tags (1-3 words characterizing how the brand was described).
Position tracking
When a response mentions multiple brands, Cited records the position of each: position 1 for the first brand mentioned, position 2 for the second, and so on. This feeds the average position metric. Position is determined by order of appearance in the response text. For citation-based platforms like Perplexity, where sources are explicitly numbered, the citation number is used directly as the position. For prose-based platforms like Claude, position is inferred from text order.
Sentiment classification
Each mention is classified into one of four sentiment categories:
- Positive: brand described favorably (effective, well-rated, reliable, innovative)
- Neutral: brand mentioned factually without evaluative language (offers, provides, available in)
- Negative: brand described unfavorably or criticized (controversial, overpriced, problematic)
- Mixed: response contains both positive and negative characterization of the brand
The aggregate sentiment score is computed as (positive_count - negative_count) / total_mentions, producing a value between -1.0 (always negative) and +1.0 (always positive). Responses in which the brand was not found produce no sentiment data; they contribute to mention rate as a zero, not to sentiment scoring.
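The per-mention fields and the scoring rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cited's implementation: the `Mention` class, its field names, and the sample data are hypothetical, and average position is assumed here to be a simple mean of recorded positions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The four categories named in this section.
SENTIMENTS = ("positive", "neutral", "negative", "mixed")


@dataclass
class Mention:
    """One brand mention parsed from one AI response (field names are assumed)."""
    brand: str
    position: int   # ordinal rank of first mention; 1 = first brand named
    sentiment: str  # one of SENTIMENTS
    framing_tags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")
        if self.position < 1:
            raise ValueError("position is 1-based")


def sentiment_score(mentions: List[Mention]) -> Optional[float]:
    """(positive_count - negative_count) / total_mentions, or None if no mentions."""
    if not mentions:
        return None  # brand never found: no sentiment data at all
    positive = sum(m.sentiment == "positive" for m in mentions)
    negative = sum(m.sentiment == "negative" for m in mentions)
    # Neutral and mixed mentions count toward the denominator only.
    return (positive - negative) / len(mentions)


def average_position(mentions: List[Mention]) -> Optional[float]:
    """Assumed definition: plain mean of the recorded positions."""
    if not mentions:
        return None
    return sum(m.position for m in mentions) / len(mentions)


# Hypothetical mentions of one brand across four responses.
plum = [
    Mention("Plum", 1, "positive", ["affordable"]),
    Mention("Plum", 2, "positive"),
    Mention("Plum", 3, "neutral"),
    Mention("Plum", 2, "negative", ["overpriced"]),
]
print(sentiment_score(plum))   # 0.25
print(average_position(plum))  # 2.0
```

Returning `None` rather than `0.0` for an empty list keeps "no data" distinct from a genuinely balanced score, which matches the rule that unfound brands affect mention rate but not sentiment.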
What the parser does NOT do
The parser has deliberate scope boundaries:
- It does not extract pricing data or feature comparisons; these require domain-specific parsing that is not generalizable across categories
- It does not distinguish between branded mentions and category-level mentions (“skincare” is not the same as “Plum skincare”)
- Sentiment classification is categorical, not continuous — there is no granular score between 0 and 100
- Small sample sizes (under 10 responses per brand-query combination) produce noisy sentiment distributions, which is why sentiment benchmarks are not yet published at the Cited Index level
Quality control
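Parser output is not guaranteed to be clean JSON. The safeguards covered in this section, tolerant JSON extraction plus a bounded retry, can be sketched roughly as follows. This is a minimal illustration rather than Cited's actual utility: the function names (`extract_json`, `is_valid`, `parse_with_retry`), the top-level `mentions` key, and the required field names are all assumptions, and repair of incomplete brackets is left out for brevity.

```python
import json
import re
from typing import Callable, Optional

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable
# Assumed field names for each parsed mention.
REQUIRED_FIELDS = {"brand", "position", "sentiment", "framing_tags"}


def extract_json(text: str) -> Optional[dict]:
    """Try progressively more forgiving strategies; return None if all fail."""
    candidates = [text]  # strategy 1: the reply is already pure JSON
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))  # strategy 2: JSON inside a fence
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])  # strategy 3: outermost braces
    for candidate in candidates:
        # Naive trailing-comma removal; it would also alter ",}" inside a string.
        cleaned = re.sub(r",\s*([}\]])", r"\1", candidate.strip())
        for attempt in (candidate, cleaned):
            try:
                parsed = json.loads(attempt)
            except json.JSONDecodeError:
                continue
            if isinstance(parsed, dict):
                return parsed
    return None


def is_valid(parsed: dict) -> bool:
    """Check that every mention carries the required fields."""
    mentions = parsed.get("mentions")
    return isinstance(mentions, list) and all(
        isinstance(m, dict) and REQUIRED_FIELDS.issubset(m) for m in mentions
    )


def parse_with_retry(call_parser: Callable[[int], str],
                     max_retries: int = 3) -> Optional[dict]:
    """One initial attempt plus up to max_retries retries."""
    for attempt in range(1 + max_retries):
        # The attempt number lets the caller adjust parameters on each retry.
        parsed = extract_json(call_parser(attempt))
        if parsed is not None and is_valid(parsed):
            return parsed
    return None
```

Here `call_parser` stands in for whatever function issues the parsing request and returns the raw reply text. Passing it the attempt number is one way to let the caller adjust parameters between retries.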
The Haiku parsing call includes retry logic: if the initial parsing returns malformed data (e.g., invalid JSON, missing required fields), the pipeline retries with adjusted parameters up to 3 times. A multi-strategy JSON extraction utility handles common formatting issues (markdown fences around JSON, trailing commas, incomplete brackets).
Dashboard data consistency is maintained by computing all metrics from the same date-windowed base data. When a time window is selected (7-day, 14-day, 30-day, 90-day), every metric on the page respects that window, preventing disagreements between panels that would suggest two different versions of the same truth.
How Cited references competitors in your AI Narrative
Cited only references competitors you have explicitly declared in your brand profile. When generating your AI Narrative summary, Cited reads the competitor list from your brand settings and instructs the AI summarization model to reference only those competitors. It explicitly prevents the model from inferring or naming additional competitors from claim content.
Without this guardrail, the AI summarization model may surface competitors based on what appears in raw claim content, including outdated, irrelevant, or incorrectly extracted mentions. By anchoring on your declared list, the narrative stays accurate and aligned with your strategic positioning.
If you have not declared any competitors, Cited will not name any competitors in your narrative. Instead, it will reference “the broader category” or “other tools in this space.” We recommend declaring at least 3-5 direct competitors in Settings to enable richer competitive narrative analysis.
Related concepts
- Mention rate — the primary metric this pipeline produces
- Sentiment — the sentiment metric and why benchmarks are deferred
- Average position — uses position data from this pipeline
- How we generate queries — what happens before extraction
- Refresh cadence — when the pipeline runs
Frequently asked questions
How accurate is the sentiment classification?
Sentiment classification using Haiku is reliable for clearly positive or negative characterizations but less precise for subtle or mixed framing. The four-category system (positive, neutral, negative, mixed) deliberately avoids false precision. For critical sentiment analysis, Cited recommends reviewing the raw AI responses alongside the classified scores.
What happens if the parser misidentifies a brand mention?
False positives (detecting a mention that is not there) and false negatives (missing a real mention) both occur at low rates. The parser is tuned for precision over recall — it is better to miss an ambiguous mention than to count a false one. Manual audit spot-checks are part of the quality process.
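To make the precision and recall trade-off concrete, here is a minimal sketch of how a manual spot-check could be scored. The function name and the audit counts are illustrative assumptions, not Cited's tooling or measured error rates.

```python
from typing import Optional, Tuple


def precision_recall(true_pos: int, false_pos: int,
                     false_neg: int) -> Tuple[Optional[float], Optional[float]]:
    """Precision = share of detected mentions that are real.
    Recall = share of real mentions that were detected."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else None
    recall = true_pos / actual if actual else None
    return precision, recall


# Hypothetical audit: 45 correct detections, 1 spurious detection, 5 missed mentions.
precision, recall = precision_recall(true_pos=45, false_pos=1, false_neg=5)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A parser tuned as described would be expected to show this kind of pattern: few spurious detections, at the cost of some missed ambiguous mentions.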
Why use Claude Haiku for parsing instead of the same model that generated the response?
Cost and speed. Haiku is significantly cheaper and faster than the models used for response generation. Since parsing is a structured extraction task (not a creative generation task), a smaller model performs well. This keeps per-query costs manageable when processing hundreds of responses daily.