    How to Measure LLM Visibility: The Complete Tracking Stack for 2026

    Most SEO teams know they need to care about AI search. Almost none of them have a measurement system in place for it. That’s the gap this article closes.

    Ranking in ChatGPT, Perplexity, Google AI Overviews, or Claude isn’t a vanity metric anymore — it’s a traffic channel. But unlike Google, AI systems don’t serve a results page you can screenshot. They weave citations into prose. Your brand either shows up in that prose or it doesn’t, and if you’re only watching GA4’s built-in channel reports, you’re flying mostly blind.

    This is a practitioner’s setup guide: the exact metrics, GA4 configuration, and tool stack needed to track LLM visibility systematically.

    The Five Metrics That Define LLM Visibility

    Traditional SEO tracks ranking position, impressions, and clicks. None of those exist in AI search. You need a new metric set:

    Citation frequency — How often your domain or brand is mentioned in AI-generated answers for your target query set. LLMs typically cite 2–7 sources per response. Capturing one of those slots consistently is the entire game.

    Prompt coverage — Out of your tracked prompt library, what percentage of prompts return your brand at all? Calculate it as: (prompts where you appear ÷ total tracked prompts) × 100. A brand actively optimizing for AI search should be above 40% coverage on tier-1 prompts within 90 days of focused content work.

    Share of voice — For a given topic cluster, how often do AI answers cite you versus competitors? If you appear in 12 of 30 tested prompts (40% coverage) and a competitor appears in 20 of 30 (67%), they hold the larger share of voice on that topic. That ratio is more strategically meaningful than any single citation count.

    AI referral sessions — The sessions in GA4 that actually arrived from an AI platform with a usable referrer header. This is the only metric that ties visibility to business outcomes. Setup is covered in the next section.

    Conversion quality from AI traffic — AI-referred visitors behave differently from organic search visitors. They arrive with higher intent (they asked a specific question and your site was the answer). Track engagement rate, pages per session, and goal completions for AI referral sessions separately. If this cohort converts at 2–3× the rate of your organic traffic — which early data from practitioners suggests — it changes how you think about GEO investment.
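
    The first three metrics reduce to simple arithmetic over whatever log you keep. Here is a minimal sketch of the coverage and share-of-voice math in Python, assuming a per-prompt record of which brands each AI answer cited (the brand names and record format are illustrative, not from any tool):

    # Each entry: the set of brands cited in the AI answer for one tracked prompt.
    results = [
        {"yourbrand.com", "rival.com"},
        {"rival.com"},
        {"yourbrand.com"},
        set(),  # answer cited no relevant source
    ]

    def prompt_coverage(results, brand):
        """Percent of tracked prompts where the brand appears at all."""
        hits = sum(1 for cited in results if brand in cited)
        return 100 * hits / len(results)

    print(prompt_coverage(results, "yourbrand.com"))  # 50.0 -> you appear in 2 of 4
    print(prompt_coverage(results, "rival.com"))      # 50.0 -> competitor's coverage, for share of voice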

    Setting Up GA4 to Capture AI Traffic: The Regex You Need

    Out of the box, GA4 misclassifies most AI referral traffic. ChatGPT sessions land in “Referral.” Perplexity sessions land in “Referral.” Claude.ai sessions may land in “Direct.” Without a custom channel group, you have no way to isolate or trend this traffic.

    In GA4: Admin → Data Display → Channel Groups → Create New Channel Group

    Name it “AI Search” and configure the rule:

    • Condition type: Session source
    • Match type: matches regex
    • Pattern (copy exactly):
    ^(chatgpt\.com|openai\.com|chat\.openai\.com|perplexity\.ai|claude\.ai|anthropic\.com|gemini\.google\.com|bard\.google\.com|copilot\.microsoft\.com|deepseek\.com|grok\.com|you\.com|poe\.com|meta\.ai)$

    Note: GA4's session source dimension carries only the referring domain, never a URL path, so a path-based entry such as bing.com/chat can never match an anchored pattern. Copilot referrals arrive as copilot.microsoft.com, which the pattern already covers.

    Critical step: Place the “AI Search” channel above “Referral” in your channel list. GA4 processes channel rules top-to-bottom — if Referral appears first, every AI referral will match Referral before ever reaching your AI channel definition. This is the single most common setup mistake.
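
    Before relying on the channel, it is worth sanity-checking the pattern against the source strings you expect to see. A quick throwaway check in Python (the sample values are illustrative):

    import re

    AI_SOURCES = re.compile(
        r"^(chatgpt\.com|openai\.com|chat\.openai\.com|perplexity\.ai|claude\.ai"
        r"|anthropic\.com|gemini\.google\.com|bard\.google\.com|copilot\.microsoft\.com"
        r"|deepseek\.com|grok\.com|you\.com|poe\.com|meta\.ai)$"
    )

    # Sample session source values as GA4 would report them.
    for source in ["chatgpt.com", "perplexity.ai", "copilot.microsoft.com", "google.com"]:
        print(source, "->", bool(AI_SOURCES.match(source)))
    # google.com -> False: ordinary search referrals stay out of the AI channel.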

    One important caveat on scope: approximately 70% of AI-originated visits arrive without a referrer header. OpenAI’s iOS app, private browsing mode, and in-app browsers all strip referrer data before the request reaches your server. This means your “AI Search” channel in GA4 is capturing the visible minority — the sessions where the referrer was preserved. Don’t benchmark by absolute volume. Benchmark by growth rate. If your AI Search channel is growing month-over-month while overall Direct traffic is stable, your citation presence is expanding.

    To supplement GA4 attribution, add a self-reported source question to high-intent forms: “How did you find us?” Include “ChatGPT / AI assistant” as an option. This provides ground truth that session data alone cannot.

    The Tool Tier: Free to Enterprise

    The LLM visibility tool market matured significantly through 2025 and into 2026. Three tiers have emerged, and most independent publishers and agencies should start at the first tier before paying for anything.

    Free / DIY layer — start here

    Run 20 representative prompts manually across ChatGPT, Perplexity, Claude, and Google AI Overviews each month. Record mentions in a spreadsheet: cited (yes/no), cited with link (yes/no), competitor named instead. This gives you baseline prompt coverage and share of voice data with zero budget. Do this for at least one month before paying for any tool — you’ll understand your own citation patterns much better and know exactly what problem you’re trying to solve with a paid platform.
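
    One way to structure that spreadsheet so it stays machine-readable later (the columns are a suggestion, and the prompt and domain values are placeholders):

    date,platform,prompt,cited,cited_with_link,competitor_cited
    2026-01-05,ChatGPT,"best email tool for small agencies",yes,yes,
    2026-01-05,Perplexity,"best email tool for small agencies",no,no,rival.com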

    Mid-market tools ($100–$500/month)

    Otterly.ai provides automated monitoring across Google AI Overviews, ChatGPT, Perplexity, Gemini, and Microsoft Copilot. It runs scheduled prompt sets on your behalf and tracks brand mention frequency and citation links over time. The value is removing the manual labor of the 20-prompt audit while expanding coverage to more platforms and prompts than you’d realistically run by hand.

    LLMrefs takes a different approach: you input your existing SEO keywords rather than writing prompts, and the platform generates prompt fan-outs automatically, tracking the results in a dashboard that mirrors a traditional rank tracker. The learning curve is lower for teams coming from keyword-centric SEO workflows.

    Enterprise layer ($1,000+/month)

    Profound is built around its proprietary Prompt Volumes dataset — a search-volume equivalent for AI queries. It estimates how often specific questions are actually being asked across LLMs, which lets you prioritize content topics based on demand rather than intuition. This is genuinely useful at scale, but it’s overkill for most independent publishers. It becomes relevant when you’re deciding between 20 possible content angles and need volume data to make the call.

    The 20-Prompt Audit: Your Monthly Baseline Protocol

    Whether you use a paid tool or not, run this protocol monthly:

    1. Build a prompt library of 20 questions your target buyer would ask an AI system. These should be the questions your content is designed to answer — not keyword-formatted phrases, but actual conversational queries.
    2. Run each prompt across ChatGPT, Perplexity, and Google AI Overviews (3 platforms × 20 prompts = 60 data points per month).
    3. For each result, record: was your brand cited in text, was your domain linked, and which competitor was cited if you were not.
    4. Calculate prompt coverage per platform (what % of the 20 prompts returned your brand) and total share of voice versus your top 3 competitors.

    Log results in a spreadsheet with a date column. Three months of monthly data reveals directional trends — whether your GEO and AEO work is moving the needle. No tool gives you this longitudinal view without ongoing, consistent execution.
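
    If you keep the log in the CSV shape sketched earlier, the monthly rollup is a few lines of Python (the llm_audit.csv filename and column names are assumptions carried over from that template):

    import csv
    from collections import defaultdict

    totals = defaultdict(lambda: {"prompts": 0, "hits": 0})
    competitors = defaultdict(int)

    with open("llm_audit.csv", newline="") as f:
        for row in csv.DictReader(f):
            t = totals[row["platform"]]
            t["prompts"] += 1
            if row["cited"].strip().lower() == "yes":
                t["hits"] += 1
            elif row["competitor_cited"]:
                competitors[row["competitor_cited"]] += 1

    for platform, t in totals.items():
        print(f"{platform}: {100 * t['hits'] / t['prompts']:.0f}% coverage ({t['hits']}/{t['prompts']})")
    print("Cited instead of you:", dict(competitors))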

    Diagnosing a Citation Drop

    If your monthly audit shows prompt coverage declining from the previous period, run through this checklist before assuming a platform algorithm change:

    Did you remove or restructure a previously cited page? AI systems build representations of your content over time. Pages that disappear or are significantly restructured lose citation weight. Check your changelog against the prompt set that declined.

    Did a competitor publish stronger content on the topic? AI citation is zero-sum within the 2–7 source window. If a competitor published a more authoritative, well-structured page, it may have displaced yours. Review their recent publishing calendar.

    Check robots.txt and your llms.txt file. A misconfigured robots.txt Disallow directive, or a blanket block on AI crawlers, will cut citation access at the source. Note that llms.txt is advisory rather than an access control; the crawl-blocking risk lives in robots.txt. Verify both: robots.txt should not disallow the pages you expect to be cited, and llms.txt should still point to them.
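
    For reference, a deliberately permissive robots.txt stanza for the major AI crawlers looks like this (these user-agent tokens are the publicly documented ones; match the list to your own policy):

    # Allow the major AI crawlers; other rules in the file still apply.
    User-agent: GPTBot
    Allow: /

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    User-agent: Google-Extended
    Allow: /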

    Check for a model update on the platform. Major model releases can reset citation patterns. GPT-5, Gemini 2.0, and similar releases changed which sources each platform weighted. Check the platform’s public changelog for the period in question.

    If none of these apply, run a structured data audit on the pages that lost citations. Schema markup, FAQ blocks, clear heading hierarchy, and factual density all affect how AI systems extract and attribute content. A page that lost its FAQ section in a redesign may have simultaneously lost its AI citation utility.
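
    For reference, the FAQ markup in question is typically JSON-LD shaped like this (the question and answer text here are placeholders):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "How do I track AI referral traffic in GA4?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Create a custom channel group whose session source regex matches AI platform domains such as chatgpt.com and perplexity.ai."
        }
      }]
    }
    </script>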

    The Bottom Line

    LLM visibility measurement is not a solved problem, but the measurement primitives exist today: GA4 custom channel groups for traffic attribution, manual prompt audits for citation coverage, and mid-market tools for automated monitoring at scale. The sites building this infrastructure now will have 12–18 months of baseline data by the time the rest of the market treats it as standard practice.

    Build the 20-prompt library this week. Set up the GA4 channel group today. Everything else layers on top of those two data streams.