Tag: schema

  • The LLMs.txt Reality Check: What 300,000 Domains Reveal About the File Everyone’s Implementing in 2026

    The LLMs.txt file was supposed to be the AI-era equivalent of robots.txt: a clean, declarative way to hand large language models a curated map of your most valuable content. Less than two years after Jeremy Howard proposed the spec in September 2024, the data is in, and it is not what implementation evangelists have been promising.

    This is a case study teardown of the three largest independent measurement efforts on LLMs.txt adoption and citation impact, the one documented recovery case where it did move the needle, and the structural lesson every practitioner should pull from the divergence.

    The 300,000-Domain Study That Reset the Conversation

    A widely circulated dataset of nearly 300,000 domains — analyzed across multiple AI search citation benchmarks and reported by Search Engine Journal — found no statistically significant relationship between implementing LLMs.txt and how often AI engines cite a brand. Both standard statistical analysis and machine-learning models showed no effect. Removing LLMs.txt as a feature actually improved citation prediction accuracy in one model run, meaning the file’s presence was less than noise.

    Adoption sits at 10.13% of domains in that dataset, distributed evenly across traffic tiers. Translation: it is neither standard practice nor a differentiator.

    A separate bot-traffic audit reported by adoption researchers found that out of 62,100-plus AI bot visits over a 90-day window, only 84 requests targeted the /llms.txt path. Across half a billion LLM bot traffic events analyzed in another dataset — filtering for the agents that actually drive citations (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended) — the share of requests touching /llms.txt was statistically negligible.
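
    To run the same kind of audit on your own server logs, a minimal sketch follows. It assumes combined-format access log lines and matches AI crawlers by user-agent substring. The sample lines, the `AI_AGENTS` tuple, and the `llms_txt_share` helper are illustrative assumptions, not the methodology of the audits cited above.

```python
import re

# Substrings of AI crawler user agents to count (assumption: substring match is enough).
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Combined log format: ... "METHOD /path HTTP/x" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def llms_txt_share(lines):
    """Return (ai_bot_hits, llms_txt_hits) for an iterable of access-log lines."""
    ai_hits = llms_hits = 0
    for line in lines:
        m = LINE_RE.search(line)
        if not m or not any(agent in m.group("ua") for agent in AI_AGENTS):
            continue
        ai_hits += 1
        # Strip any query string before comparing the path.
        if m.group("path").split("?", 1)[0] == "/llms.txt":
            llms_hits += 1
    return ai_hits, llms_hits

# Made-up sample lines for illustration only.
sample = [
    '1.2.3.4 - - [01/Mar/2026:00:00:01 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '1.2.3.5 - - [01/Mar/2026:00:00:02 +0000] "GET /llms.txt HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '1.2.3.6 - - [01/Mar/2026:00:00:03 +0000] "GET /llms.txt HTTP/1.1" 200 128 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
ai_hits, llms_hits = llms_txt_share(sample)
print(f"{llms_hits} of {ai_hits} AI bot requests hit /llms.txt")
```

    If the ratio on your own logs is as small as the audits report, that is a direct signal of how much tuning time the file deserves.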

    The Vendor Reality Behind the Numbers

    As of Q1 2026, no major AI company — OpenAI, Google, Anthropic, Meta, or Mistral — has publicly committed to reading or acting on LLMs.txt in production systems. The file is a community proposal, not a supported standard. AI language models learn what to trust from the web as it existed during training. Citation behavior reflects which sources appeared consistently in training corpora, which were cited by other credible sources, and which had claims independently corroborated. A crawl-directive file published after training cannot retroactively change any of that.

    The Recovery Case That Actually Moved Traffic

    Compare that to a documented recovery case reported by SEO Algorithm Recovery and corroborated by independent AI Overviews tracking: a Dallas retailer lost 72% of organic traffic to AI Overviews. Their agency deployed schema markup and restructured 150 pages around answer-first formatting. Traffic recovered to 118% of pre-AI Overview levels in 120 days, with $1.4M in revenue growth attributed to the recovered organic channel.

    No LLMs.txt was involved. The intervention stack was schema markup, content restructuring for AI-extractable answers, and entity disambiguation in headings. Schema markup alone has been reported to recover 45%-plus of lost AI Overview traffic in case-study compilations across the recovery agency space.

    The Structural Lesson

    The contrast is the case study. LLMs.txt is a static directive file that AI crawlers do not currently read at scale. Schema markup is a structured-data layer that AI systems already parse to construct answer panels and citation surfaces. One is aspirational. The other is operational.

    The structural pattern under every documented AI-search recovery in 2026 is the same: answer-first content directly under each H2, structured data on the entity being described, tables for comparison data, and explicit source attribution inline. Sites earning AI citations report traffic gains. Brands with strong authority signals benefit from the halo effect. Companies adopting these specific structural interventions early, rather than the file directives, are the ones reporting growth exceeding pre-AI Overview levels.

    A Minimum-Viable LLMs.txt Anyway

    The skeptical case is not “skip LLMs.txt entirely.” It is “do not let it absorb hours that should go to schema and content restructuring.” A minimum-viable LLMs.txt is ten lines and takes ten minutes to ship:

    # Your Brand Name
    
    > One-sentence description of what your site is and who it serves.
    
    ## Core Pages
    - [About](https://yoursite.com/about): Who you are, in one paragraph.
    - [Products](https://yoursite.com/products): What you sell, structured.
    - [Pricing](https://yoursite.com/pricing): Numbers, plans, comparison.
    
    ## Documentation
    - [Getting Started](https://yoursite.com/docs/start): The 5-step onboarding.
    - [API Reference](https://yoursite.com/docs/api): Full method index.
    

    Ship it. Stop tuning it. Then spend the rest of the week on schema and answer-first H2 restructuring, which is where the recovery cases are actually being won.
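
    If your page list already lives in a CMS or a spreadsheet, generating the file keeps the ten-minute budget honest. The sketch below is one way to do it. The `render_llms_txt` helper and its argument shape are hypothetical, and the output simply mirrors the H1, blockquote, and H2 link-list layout shown above.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt body: H1, blockquote summary, then H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}"]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
    return "\n".join(lines) + "\n"

# Placeholder pages; swap in your own titles, URLs, and one-line notes.
sections = {
    "Core Pages": [
        ("About", "https://yoursite.com/about", "Who you are, in one paragraph."),
        ("Pricing", "https://yoursite.com/pricing", "Numbers, plans, comparison."),
    ],
}
llms_txt = render_llms_txt("Your Brand Name", "One-sentence description.", sections)
print(llms_txt)
```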

    The Practitioner Takeaway

    When two independent measurement methodologies across nearly 300,000 domains agree that an optimization has no measurable effect on the outcome it is sold to improve, the rational move is to stop selling it as a primary intervention. Treat LLMs.txt as future-proofing insurance with a ten-minute implementation cost. Treat schema, entity binding, and answer-first content structure as the actual lever. The recovery cases that crossed pre-AI Overview revenue did the second set of things. The Search Engine Land-reported audit where 8 of 9 sites saw no measurable change after implementation did the first.

    Frequently Asked Questions

    Does LLMs.txt help with AI citations?

    Independent studies across approximately 300,000 domains have found no statistically significant relationship between LLMs.txt presence and AI citation frequency. Major AI vendors have not publicly committed to reading the file in production. Implement it as low-cost future-proofing, not as a primary citation strategy.

    What actually recovers traffic lost to AI Overviews?

    Documented recovery cases share a consistent intervention pattern: schema markup deployment, content restructuring with answer-first formatting directly under each H2, entity disambiguation, and inline source attribution. One published case showed 118% recovery of pre-AI Overview traffic in 120 days using this stack.

    What is the minimum-viable LLMs.txt?

    Ten lines: an H1 with your brand name, a blockquote with one-sentence site description, and grouped H2 sections listing your core pages and documentation with one-line summaries. Ship it once, do not over-tune it.

    Which AI bot user agents matter for citation visibility?

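    The agent tokens most often tied to AI citation visibility are GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot, plus Google-Extended, which is a robots.txt control token rather than a separate crawler (Google fetches with its existing user agents). Whether these agents are allowed to fetch your pages determines whether your content can surface in AI answer panels.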
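
    A quick way to confirm that none of these tokens is blocked by accident is to test your robots.txt rules against them. The sketch below uses Python's standard `urllib.robotparser`. The `ROBOTS_TXT` content, the URL, and the `access_report` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks GPTBot from /private/, allows everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot", "Google-Extended"]

def access_report(robots_txt, url):
    """Map each agent token to whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AGENTS}

report = access_report(ROBOTS_TXT, "https://yoursite.com/private/report")
for agent, allowed in report.items():
    print(f"{agent:16} {'allowed' if allowed else 'blocked'}")
```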

    If LLMs.txt does not work, why is everyone implementing it?

    Three reasons: it is genuinely cheap to ship, it signals to clients that you are paying attention to AI search, and there is a non-zero chance AI vendors adopt it in the future. None of those reasons justify it being your primary AI-search intervention in 2026.

    Sources: Search Engine Journal’s coverage of the 300,000-domain LLMs.txt citation study; SEO Algorithm Recovery’s documented AI Overviews recovery case study; published bot traffic audits from Authority Tech and Generix Marketing on LLMs.txt request rates; recovery-stack analysis aggregated from BlankBoard Studio, Stackmatix, and Mersel AI’s 2026 AI Overviews recovery compilations.

  • 5 GEO and AEO Case Studies From 2026 — What Actually Worked, Decoded

    Most GEO and AEO case studies you can find online are vendor-published and short on implementation detail. So instead of stacking another “look at this 300% lift” headline, this piece walks through five publicly documented results from 2026 — and pulls out the structural change that actually drove the win in each one. If you want to copy what works, copy the structure, not the percentage.

    1) HubSpot: 3x lead conversion from AEO traffic

    HubSpot’s own 2026 State of Marketing reporting found 58% of marketers saying AI-referred visitors convert at higher rates than traditional organic, with HubSpot itself reporting roughly 3x better lead conversion from AEO sources versus other channels. The implementation pattern across HubSpot’s blog: question-led H2s, a 40–60 word direct answer in the first paragraph below the heading, then expanded context, then a structured FAQ block with FAQPage schema.

    The before/after isn’t “more content.” It’s “the same content, restructured so the answer arrives in the first 60 words.” That single edit is what featured snippets and AI Overviews both reward.

    2) Hashmeta e-commerce client: +50% zero-click visibility

    Hashmeta documented a 50% increase in zero-click visibility for an e-commerce client after a targeted AEO sprint. The lever: rebuilding product and category pages around explicit question intent (“what is the difference between X and Y,” “is X worth it for Z use case”) and adding HowTo and FAQPage schema. The page didn’t get more traffic from the same query — it started winning the answer position on related queries it wasn’t competing for before.

    The takeaway for practitioners: zero-click visibility is its own funnel. Track it separately from sessions, because the value shows up in branded search lift two to four weeks later, not in same-day clicks.

    3) SaaS brand: 20+ free-trial signups per month from ChatGPT citations

    One SaaS case study circulating in the GEO community in early 2026 reported 20+ free-trial signups per month attributed directly to ChatGPT citations, identified via a unique UTM and a referral-source filter in their analytics. The structural pattern: a single canonical comparison page per top competitor, written as a third-person reference rather than first-person marketing, with a clear definition block, a structured comparison table, and a “when to choose X” section.

    This is the format ChatGPT cites because it’s the format ChatGPT was trained to produce. Match the output shape and you become the source.

    4) Generic brand study: 140% lift in AI-driven search traffic

    A widely cited 2026 GEO case study reported a 140% increase in LLM and AI-driven search traffic alongside a 62% rise in AI mentions after a strategy that prioritized entity saturation, internal-link clustering, and structured data over keyword density. The implementation detail worth copying: a single hub page per entity with at least 15 distinct factual data points, then 8–12 supporting articles linking back to it with descriptive anchor text.

    The 15-data-point threshold matches what GEO researchers have flagged repeatedly: articles with 15+ verifiable data points receive substantially more AI citations than articles with fewer than five.

    5) Mangools: featured-snippet capture from a single edit

    Mangools published a walkthrough showing how rewriting one blog post to lead with a 50-word direct answer captured a featured snippet for a head-term query, with the resulting traffic and brand exposure outpacing the rest of the content cluster. No new backlinks, no new content — just a structural rewrite of the first 100 words.

    The pattern across all five

    Every win has the same shape: question-led H2, 40–60 word direct answer, structured supporting content, schema markup. Here is the minimum viable AEO block, drop-in ready:

    <h2>What is generative engine optimization?</h2>
    <p><strong>Generative engine optimization (GEO) is the practice of structuring web content so AI systems like ChatGPT, Claude, Gemini, and Perplexity cite it as a source.</strong> Unlike SEO, which optimizes for ranking in a list of links, GEO optimizes for being included in a generated answer. The core levers are entity clarity, factual density, structured data, and crawlability via LLMs.txt and robots.txt.</p>
    
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "What is generative engine optimization?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Generative engine optimization (GEO) is the practice of structuring web content so AI systems cite it as a source in generated answers."
        }
      }]
    }
    </script>

    The measurement layer

    None of these case studies mean anything without isolation. The minimum tracking stack: a referrer filter for chatgpt.com, perplexity.ai, claude.ai, gemini.google.com, and copilot.microsoft.com in GA4; a separate event for zero-click impressions from Google Search Console; and a manual citation log — query a representative model with your top 25 prompts weekly and record whether your domain is cited. The third one is what most teams skip, and it’s the only one that tells you whether GEO is working before traffic shows up.
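
    The referrer filter is the easiest piece to prototype outside GA4. The sketch below classifies a referrer URL against the five hostnames listed above. The labels and the `classify_referrer` helper are assumptions for illustration, and the same logic would need to be re-expressed in whatever filter syntax your analytics tool uses.

```python
from urllib.parse import urlparse

# Referrer hostnames named in the text, mapped to a label (labels are an assumption).
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
}

def classify_referrer(referrer_url):
    """Return the AI source label for a referrer URL, or None if it is not one."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, label in AI_REFERRERS.items():
        # Match the domain itself or any subdomain such as www.perplexity.ai.
        if host == domain or host.endswith("." + domain):
            return label
    return None

print(classify_referrer("https://www.perplexity.ai/search?q=crm"))
print(classify_referrer("https://www.google.com/"))
```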

    What to copy this week

    Pick your top five highest-intent pages. For each one, rewrite the first 100 words as a direct-answer block, add a single FAQPage schema with three questions, and add the page to your LLMs.txt manifest. That is the entire implementation. Every case study above is a variation on those three moves.
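
    For the schema step, a small builder keeps the three-question FAQPage block consistent across all five pages. This is a sketch only: the `faq_page` helper is hypothetical, and the second and third question-and-answer pairs are placeholders to replace with each page's real content.

```python
import json

def faq_page(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

faq = faq_page([
    ("What is generative engine optimization?",
     "Generative engine optimization (GEO) is the practice of structuring web "
     "content so AI systems cite it as a source in generated answers."),
    ("Placeholder question two?", "Placeholder answer two."),
    ("Placeholder question three?", "Placeholder answer three."),
])
print(json.dumps(faq, indent=2))
```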

  • AgentConcentrate: Why Standard Schema Markup Is a Business Card When AI Needs a Full Dossier

    The Lab · Tygart Media
    Experiment Nº 422 · Methodology Notes

    TL;DR: Standard schema.org markup is a business card—basic identification with name, price, and description. AI agents need a full dossier—custom JSON-LD with product specifications, competitive positioning, pricing signals, trust indicators, and entity relationships. Brands using AgentConcentrate-level structured data see 2-3x higher citation frequency from AI systems than competitors using basic markup.

    The JSON-LD Problem: Abundance Without Depth

    Every modern website uses schema.org markup. Google recommends it. Yoast includes it. Shopify auto-generates it. The result: 90% of the internet has the same shallow, templated structured data.

    A standard Product schema tells an AI system:

    {"@type": "Product", "name": "Widget X", "description": "A great widget", "offers": {"@type": "Offer", "price": "99.00", "priceCurrency": "USD"}}

    That’s it. Name, price, description. An AI reading this can extract basic facts but cannot understand why this product matters, how it compares, what specific problem it solves, or why the brand is authoritative.

    When an AI system encounters 50 competing products with identical schema depth, it cannot differentiate. It treats them all as peers. Your content gets the same weight as your competitor’s, regardless of actual quality or authority.

    This is why citation frequency is equal across competitors. Standard markup eliminates differentiation.

    AgentConcentrate: Building a Full Dossier

    AgentConcentrate is a methodology for creating custom, high-density JSON-LD structured data that goes far beyond standard schema.org.

    A complete AgentConcentrate dossier includes:

    Specification Layer: Not just “description.” Technical specifications, dimensions, materials, compatibility matrices, performance benchmarks. Everything an AI agent needs to answer detailed questions about your product without leaving your site.

    Positioning Layer: Competitor comparison embedded in your schema. Not “we’re the best.” Actual differentiation markers: price point, feature matrix, use-case specialization, target persona, market segment.

    Pricing Layer: Dynamic pricing signals. Volume tiers, loyalty pricing, seasonal adjustments, enterprise rates. AI agents parse this to understand whether you’re positioned for premium or volume markets.

    Trust Layer: Certifications, awards, third-party endorsements, expert affiliations, security standards, compliance badges. Not testimonials—formal trust indicators that AI systems weight heavily.

    Entity Layer: Relationships embedded in schema. Founder credentials, investor profile, partnership network, supply chain transparency, team expertise. When an AI synthesizes an answer, it draws on entity relationships to build narrative authority.

    Claim Layer: Canonical assertions marked as “claims” within your JSON-LD. “Our product reduces customer acquisition cost by 40%.” “We serve 10,000+ enterprise customers.” “We have 99.99% uptime.” These claims are parsable, citable, verifiable—and AI systems weight them heavily when building authoritative summaries.
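
    As a rough illustration of how several of these layers could sit in one JSON-LD object, here is a sketch built only from existing schema.org properties. Which property carries which layer is an assumption of this sketch, not a published AgentConcentrate mapping, and every name and value is hypothetical. The Claim layer is left out because Product has no obvious standard property for it.

```python
import json

dossier = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example CRM",  # hypothetical product name
    # Specification layer: one PropertyValue per concrete spec.
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "Native integrations", "value": "40+ growth tools"},
    ],
    # Positioning layer: who the product is for.
    "audience": {"@type": "Audience", "audienceType": "Series A founders"},
    # Pricing layer: one Offer per tier (billing period is not modeled here).
    "offers": [
        {"@type": "Offer", "name": "Startup", "price": "29.00", "priceCurrency": "USD"},
        {"@type": "Offer", "name": "Scale-up", "price": "199.00", "priceCurrency": "USD"},
    ],
    # Trust layer: aggregate rating across reviews.
    "aggregateRating": {"@type": "AggregateRating", "ratingValue": "4.9", "reviewCount": "2000"},
    # Entity layer: the organization and people behind the product.
    "brand": {
        "@type": "Organization",
        "name": "Example Co",
        "founder": {"@type": "Person", "name": "Jane Doe"},
    },
}
print(json.dumps(dossier, indent=2))
```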

    Why AI Systems Parse JSON-LD First

    When an AI system crawls your page, it doesn’t read like a human. It reads structurally. The parsing order:

    1. JSON-LD first. This is machine-readable metadata. No natural-language interpretation required. High signal, high confidence.

    2. Semantic HTML second. Heading hierarchy, landmark tags, aria labels. Structure that indicates importance and relationship.

    3. Entity extraction third. Named entities, relationships, implicit hierarchies in text.

    4. Text body last. Raw prose. Lower confidence. Most likely to be filtered as marketing copy.

    This is why your JSON-LD matters enormously. It’s the first signal. It’s high-confidence metadata. It sets the frame for everything that follows.
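
    To see why JSON-LD is cheap for a machine to consume, the sketch below pulls every JSON-LD block out of a page using only Python's standard library. It shows how a crawler could extract structured data before touching the prose. It is not a description of how any particular AI system actually orders its parsing. The `JsonLdExtractor` class and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON-LD objects from <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped rather than guessed at

html = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Product", "name": "Widget X"}</script>
</head><body><h1>Widget X</h1><p>A great widget.</p></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(html)
print(extractor.blocks)
```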

    Competitors without AgentConcentrate-level schema are essentially presenting their brand to AI systems with a thick marketing filter. Competitors with rich, dossier-level schema are presenting themselves as authoritative source material.

    Real Example: Product Search in Generative Engines

    Imagine a user asks Claude: “What’s the best CRM for early-stage companies with under $100k annual budget?”

    Claude crawls 50 CRM vendors’ websites. Here’s what it finds:

    Competitor A (standard schema): Name, price, description. No pricing tiers, no target customer, no differentiators. Treated as a generic option.

    Competitor B (basic schema + some metadata): Slightly richer but still shallow. Unclear positioning. Could be SMB or enterprise.

    Your site (AgentConcentrate): Full dossier. Pricing tiers explicitly marked ($29/month for startups, $199/month for scale-ups). Target persona: Series A founders. Specific differentiation: “native integration with 40+ growth tools.” Trust indicators: backed by Tier 1 VCs, 4.9 rating across 2000+ reviews. Entity relationships: CEO is ex-Salesforce, CTO is ex-Stripe.

    When Claude synthesizes its answer, it doesn’t just cite you. It cites you because your structured data answers the specific question better than competitors. Your schema told Claude exactly what to know about you. Your competitors’ schema told Claude almost nothing.

    Result: You get cited. They don’t. Or they get mentioned generically, while you get cited as a category-specific solution.

    Building Your Own AgentConcentrate Dossier

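    Audit your current schema. Use Google’s Rich Results Test or the Schema Markup Validator at validator.schema.org (Google retired its older Structured Data Testing Tool in favor of these). How deep is it? Basic name/price/description? Or are you embedding specifications, positioning, pricing tiers, trust indicators, entity relationships?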

    Map your competitive differentiators. Not marketing copy. Actual differentiation. What do you do better? For whom? At what price point? What’s your specific expertise? Map this to schema properties.

    Build custom schema extensions. Standard schema.org may not have properties for your specific differentiators. Create custom namespaces. Example: aggregate your customer reviews, NPS scores, case study outcomes, and expert certifications into a custom “BrandProfile” object nested in your Product schema.

    Automate dossier generation. Don’t hand-code JSON-LD. Build a system that generates dossiers from your product database, pricing tables, trust badges, and team data. Update automatically as your business evolves.

    Version your schema. AgentConcentrate isn’t static. As you learn which schema properties correlate with higher citation frequency, iterate. Add new properties. Deepen existing ones. Track the impact on AI citation metrics (using Living Monitor).
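
    The "custom schema extensions" step can be sketched with a JSON-LD context that adds a private namespace alongside schema.org. The `ex` prefix, the `https://example.com/ns#` IRI, the `BrandProfile` type, and its properties are all hypothetical. Consumers that only understand schema.org vocabulary will most likely ignore the custom terms, so treat this as an experiment to measure rather than a guaranteed signal.

```python
import json

# "ex" is a hypothetical namespace prefix; https://example.com/ns# is a placeholder IRI.
product = {
    "@context": ["https://schema.org", {"ex": "https://example.com/ns#"}],
    "@type": "Product",
    "name": "Widget X",
    # Custom object nested in the Product, as described in the step above.
    "ex:brandProfile": {
        "@type": "ex:BrandProfile",
        "ex:npsScore": 62,         # placeholder value
        "ex:caseStudyCount": 14,   # placeholder value
        "ex:schemaVersion": "1",   # lets you track iterations over time
    },
}
print(json.dumps(product, indent=2))
```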

    The Economic Impact

    Brands implementing AgentConcentrate consistently see:

    2-3x increase in AI system citations within 60 days. The structured data makes differentiation visible to machines. Machines cite more frequently.

    3-5x improvement in competitive displacement. When an AI system chooses between you and a competitor, rich schema helps you win the mention.

    30-50% improvement in AI-driven qualified traffic. Not all traffic. Qualified traffic—users who were referred by AI systems citing you specifically as a solution match.

    The ROI is straightforward: if your average customer lifetime value is $5,000, and AgentConcentrate enables 10 additional qualified customers per month, that’s $50,000 in incremental lifetime value added each month. The investment in schema design and maintenance is under $5,000/month.
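
    The arithmetic behind that claim, spelled out with the figures from the paragraph above. Note that the result is lifetime value booked per month, which is not the same as cash received in that month, and that the cost figure is the stated upper bound.

```python
ltv = 5_000            # average customer lifetime value, from the text
new_customers = 10     # additional qualified customers per month, from the text
monthly_cost = 5_000   # upper bound on schema spend, from the text

added_ltv = ltv * new_customers        # lifetime value added each month
net = added_ltv - monthly_cost         # net of the maximum stated cost
ratio = added_ltv / monthly_cost       # return per dollar spent
print(added_ltv, net, ratio)
```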

    Why This Matters Now

    In the Google era, search was about keywords, links, and content volume. Rich schema was nice-to-have. Now, with AI-driven search and agent systems becoming dominant, schema is everything. It’s how machines understand you. It’s how they differentiate you. It’s how they cite you.

    The brands that invested in AgentConcentrate-level schema 12 months ago are now seeing 5-10x citation frequency advantage over competitors. The gap is widening monthly as more AI systems rely on structured data for synthesis.

    This is not optional. This is foundational. Start here.

  • Schema Markup Is the New Meta Description

    Tygart Media / The Signal
    Broadcast Live
    Filed by Will Tygart
    Tacoma, WA
    Industry Bulletin

    Meta descriptions used to be the way you told Google what your page was about. They still matter, but schema markup (JSON-LD structured data) is how you tell AI crawlers what your content actually means. If you’re not injecting schema, you’re invisible to modern search.

    Why Schema Matters Now
    Google, Perplexity, Claude, and every AI search engine read schema markup to understand page context. A page about “water damage” without schema is ambiguous. A page about “water damage” with proper schema tells crawlers:
    – This is about a specific service (water damage restoration)
    – Here’s the price range
    – Here’s the service area
    – Here are customer reviews
    – Here’s how long it takes
    – Here’s what it includes

    Without schema, the crawler has to guess. With schema, it knows exactly what you’re offering.

    The Schema Types That Matter
    For content and commerce sites, these schema types drive visibility:

    Article Schema
    Tells search engines this is an article (not a product page, review, or other content type). Includes:
    – Author (byline)
    – Publication date
    – Update date (critical for AEO)
    – Image (featured image)
    – Description

    Service Schema
    For service businesses (restoration, plumbing, etc.):
    – Service name
    – Service description
    – Price range
    – Service area
    – Provider (business name)
    – Reviews/rating

    FAQPage Schema
    If you have FAQ sections (and you should for AEO):
    – Each question and answer pair
    – Marked up so Google/Perplexity can pull exact answers

    LocalBusiness Schema
    For any geographically-relevant business:
    – Business name and address
    – Phone number
    – Opening hours
    – Service area

    Review/AggregateRating Schema
    Social proof for AI crawlers:
    – Review text and rating
    – Author and date
    – Average rating across all reviews
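
    To make the field lists above concrete, here is a sketch of one JSON-LD object that combines LocalBusiness, Service, and AggregateRating for the water damage example. The business name, phone number, hours, rating, and review count are all made-up placeholders, and nesting the Service under `makesOffer` is one reasonable arrangement rather than the only valid one.

```python
import json

business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Restoration Co",  # hypothetical business
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Houston",
        "addressRegion": "TX",
    },
    "openingHours": "Mo-Su 00:00-23:59",
    "areaServed": "Houston, TX",
    "priceRange": "$$",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.8",
        "reviewCount": "120",
    },
    # The service itself, offered by this business.
    "makesOffer": {
        "@type": "Offer",
        "itemOffered": {
            "@type": "Service",
            "name": "Water damage restoration",
            "description": "Extraction, drying, and repair after water damage.",
            "areaServed": "Houston, TX",
        },
    },
}
print(json.dumps(business, indent=2))
```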

    How Schema Affects AEO Visibility
    When Perplexity asks “what’s the best water damage restoration in Houston?”, it doesn’t just crawl text—it reads schema markup.

    Pages WITH proper schema:
    – Get pulled into answer synthesis faster
    – Can be directly cited (“According to [X] restoration, it takes 3-7 days”)
    – Show up in comparison queries
    – Display with rich snippets (ratings, prices, etc.)

    Pages WITHOUT schema:
    – Get crawled as generic content
    – Can be used but aren’t preferred
    – Missing from comparison queries
    – Look unprofessional in AI-generated answers

    The Implementation
    Schema is injected as JSON-LD in the page head. For WordPress, you can:
    1. Use a plugin (Yoast, RankMath) that auto-generates schema based on content
    2. Inject schema programmatically (via custom code)
    3. Use Google’s Structured Data Markup Helper to generate markup, then verify it with the Rich Results Test

    We recommend programmatic injection because you have control over exactly what’s marked up, and you can customize based on content type and intent.
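
    The core of programmatic injection is small: serialize a dictionary and wrap it in a script tag. The sketch below shows that step in Python as a language-neutral illustration. On WordPress the same logic would live in PHP inside your theme or a plugin. The `jsonld_script_tag` helper is hypothetical.

```python
import json

def jsonld_script_tag(data):
    """Serialize a dict as a JSON-LD <script> tag safe to embed in HTML."""
    payload = json.dumps(data, ensure_ascii=False)
    # Escape "<" so a value containing "</script>" cannot close the tag early.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Schema Markup Is the New Meta Description",
    "datePublished": "2026-03-30",
    "dateModified": "2026-04-03",
}
tag = jsonld_script_tag(article)
print(tag)
```

    Building the dictionary per content type (Article for posts, Service for service pages) is where the control mentioned above comes from.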

    The Validation
    Always validate your schema using Google’s Rich Results Test. Malformed schema is worse than no schema (it signals trust issues).

    Common schema errors:
    – Missing required fields (schema incomplete)
    – Wrong schema types (marking a service page as a product)
    – Conflicting data (schema says price is $100, content says $150)
    – Outdated information (old dates, expired URLs)
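
    Two of these errors, missing fields and conflicting prices, are easy to catch before publishing. The sketch below is a pre-flight check, not a replacement for the Rich Results Test. The `REQUIRED` field sets are chosen for illustration and are not Google's official requirements, and the `audit_schema` helper is hypothetical.

```python
REQUIRED = {
    # Minimal required-field sets chosen for illustration, not Google's official lists.
    "Article": {"headline", "datePublished", "author"},
    "Service": {"name", "provider"},
}

def audit_schema(schema, page_price=None):
    """Return a list of human-readable problems found in a JSON-LD dict."""
    problems = []
    missing = REQUIRED.get(schema.get("@type"), set()) - set(schema)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    # Conflicting data: price in schema differs from the price shown on the page.
    schema_price = schema.get("offers", {}).get("price")
    if page_price is not None and schema_price is not None:
        if float(schema_price) != float(page_price):
            problems.append(f"price mismatch: schema {schema_price} vs page {page_price}")
    return problems

service = {
    "@type": "Service",
    "name": "Water damage restoration",
    "offers": {"price": "100"},
}
problems = audit_schema(service, page_price="150")
print(problems)
```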

    Schema for AEO Specifically
    To rank well in Perplexity and Claude-based answers, prioritize:
    – Article schema with detailed author/date: Shows freshness and authority
    – FAQPage schema: Answer engines pull exact Q&A pairs
    – Service/LocalBusiness schema: Provides context for geographic queries
    – AggregateRating schema: Builds trust in AI summaries

    The Competitive Reality
    In competitive verticals, the top 5 ranking sites all have proper schema. If you don’t, you’re competing with one hand tied behind your back.

    We now add schema markup to every article before it goes live. It’s as important as the headline. It’s how modern search engines understand what you’re actually saying.

    Quick Audit
    Check your site: Run your homepage through Google’s Rich Results Test. If your schema is minimal or non-existent, that’s a competitive disadvantage waiting to be fixed.

    Schema markup isn’t optional anymore. It’s the way you communicate with AI crawlers. Without it, you’re invisible to the systems that matter most in 2026.

    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "Schema Markup Is the New Meta Description",
      "description": "Meta descriptions used to be the way you told Google what your page was about. They still matter, but schema markup (JSON-LD structured data) is how you tell AI",
      "datePublished": "2026-03-30",
      "dateModified": "2026-04-03",
      "author": {
        "@type": "Person",
        "name": "Will Tygart",
        "url": "https://tygartmedia.com/about"
      },
      "publisher": {
        "@type": "Organization",
        "name": "Tygart Media",
        "url": "https://tygartmedia.com",
        "logo": {
          "@type": "ImageObject",
          "url": "https://tygartmedia.com/wp-content/uploads/tygart-media-logo.png"
        }
      },
      "mainEntityOfPage": {
        "@type": "WebPage",
        "@id": "https://tygartmedia.com/schema-markup-is-the-new-meta-description/"
      }
    }