Category: Tygart Media Editorial

Tygart Media’s core editorial publication — AI implementation, content strategy, SEO, agency operations, and case studies.

  • How to Write Content That AI Systems Actually Cite

    How to Write Content That AI Systems Actually Cite

    Tygart Media / Content Strategy
    The Practitioner Journal · Field Notes
    By Will Tygart
    · Practitioner-grade
    · From the workbench

    Being cited by AI systems is not luck and it’s not purely a domain authority game. There are structural characteristics of content that make AI systems more or less likely to pull from it. Here’s what those characteristics are and how to build them in deliberately.

    Why Content Structure Determines Citation Likelihood

    AI systems — whether Perplexity, ChatGPT with web search, or Google AI Overviews — are trying to answer a question. When they search the web and retrieve candidate content, they’re looking for the passage or page that most directly and reliably answers the query. The content that wins is the content that makes the answer easiest to extract.

    This has direct structural implications. A 3,000-word narrative essay that eventually answers a question on page 2 loses to a 600-word page that answers the question in the first paragraph, provides supporting evidence, and includes a definition. Not because shorter is better, but because clarity of answer placement is better.

    The Structural Characteristics That Drive Citation

    1. Direct Answer in the First 100 Words

    Every piece of content you want AI systems to cite should answer the primary question it’s targeting before the first scroll. AI retrieval systems don’t read like humans — they identify the most relevant passage, and that passage needs to contain the answer, not just lead toward it.

    Test: take your target query and your first 100 words. Does the answer exist in those 100 words? If not, restructure until it does. The rest of the piece can develop nuance, context, and supporting evidence — but the answer must be front-loaded.

    2. Explicit Q&A Formatting

    Question-and-answer structure signals to AI systems that the content is explicitly organized around answering queries. H3 headers phrased as questions, followed by direct answers, are one of the most reliable patterns for citation capture.

    This is why FAQ sections work — not because of FAQPage schema specifically, but because the underlying structure gives AI systems a clean extraction target. Schema reinforces it; the structure is the foundation.

    3. Defined Terms and Named Concepts

    Content that defines terms clearly — “X is Y” statements — becomes citable for queries looking for definitions. AI systems frequently answer “what is X” queries by pulling the clearest definition they can find. If your content doesn’t include a crisp definitional sentence, it’s not competing for definition queries even if you’ve written a thorough treatment of the topic.

    Add definition boxes. State “AI citation rate is the percentage of sampled AI queries where your domain appears as a cited source.” Don’t bury the definition in the third paragraph of an explanation.

    4. Specific, Verifiable Facts

    AI systems weight specificity. “$0.08 per session-hour” gets cited. “A relatively modest fee” does not. “60 requests per minute for create endpoints” gets cited. “Limited rate limits apply” does not.

    Replace hedged language with concrete numbers and specific claims wherever your content supports it. Don’t fabricate specificity — wrong specific numbers are worse than honest hedging. But wherever you have real, verifiable data, make it explicit and prominent.

    5. Entity Clarity

    Content that makes clear who is speaking, what organization they represent, and what their basis for authority is gets cited more reliably. This is the E-E-A-T signal applied to AI citation: the system needs to assess whether this source is credible enough to cite.

    Name the author. State the organization. Link to primary sources. Include dates on time-sensitive claims (“as of April 2026”). These signals tell the AI system this content has an accountable source, not anonymous text.

    6. Freshness on Time-Sensitive Topics

    For any topic where recency matters — product pricing, regulatory status, current events — AI systems heavily weight recently indexed, recently updated content. A page published April 2026 beats a page published January 2025 for queries about current status, even if the older page has higher domain authority.

    Update time-sensitive content. Add “last updated” dates. Re-publish with fresh timestamps when the underlying facts change. Freshness signals are real citation drivers for volatile topic areas.

    7. Speakable and Structured Data Markup

    Speakable schema explicitly marks the passages in your content best suited for AI extraction. It’s a direct signal to AI retrieval systems: “this paragraph is the answer.” Combined with FAQPage schema, Article schema, and HowTo schema where relevant, structured markup makes your content more parseable.

    Schema doesn’t replace the underlying structure — it reinforces it. A well-structured page with schema beats a poorly structured page with schema. But a well-structured page with schema beats a well-structured page without it.
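
    As a concrete sketch, here's the kind of JSON-LD this section describes, generated from Python so it can be templated across pages. The question, answer, and CSS selector are placeholder values; validate against schema.org's current definitions before shipping.

    ```python
    import json

    # Minimal FAQPage + speakable markup for one Q&A block.
    # All content strings are placeholders; swap in your own page's content.
    markup = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [{
            "@type": "Question",
            "name": "What is AI citation rate?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": ("AI citation rate is the percentage of sampled AI "
                         "queries where your domain appears as a cited source."),
            },
        }],
        # speakable is valid on WebPage subtypes like FAQPage; it points
        # extraction at the paragraph that holds the direct answer.
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": [".direct-answer"],
        },
    }

    print(f'<script type="application/ld+json">\n{json.dumps(markup, indent=2)}\n</script>')
    ```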

    8. Internal Link Architecture

    AI systems that crawl the web assess topical depth partly through link structure. A page that sits within a tight cluster of related pages — all cross-linking around a topic — signals topical authority more strongly than an isolated page, even if the isolated page’s content is comparable.

    Build the cluster. The hub-and-spoke architecture is as relevant for AI citation as it is for traditional SEO. Every spoke article should link to the hub; the hub should link to every spoke.

    What Doesn’t Work

    A few patterns that are intuitively appealing but don’t translate to citation lift:

    • More content for its own sake: 5,000 words of padded content is not more citable than 900 words of dense, accurate content. AI retrieval is looking for passage quality, not page length.
    • Keyword density: Traditional keyword repetition strategies don’t make content more citable. The query match is handled at retrieval; the citation decision is about answer quality, not keyword frequency.
    • Generic authority claims: “We’re the leading experts in X” is not citable. A specific data point that demonstrates expertise is.

    The Compound Effect

    These characteristics compound. A page with a direct front-loaded answer, Q&A structure, defined terms, specific facts, clear entity signals, fresh timestamps, and schema markup sitting within a well-linked cluster is materially more citable than a page with only two or three of these characteristics. The full stack produces disproportionate results.

    For the monitoring layer: How to Track When AI Systems Cite You. For the metrics: What Is AI Citation Rate?. For the full citation monitoring guide: AI Citation Monitoring Guide.


    For the infrastructure layer: Claude Managed Agents Pricing Reference | Complete FAQ Hub.

  • AI Citation Monitoring Tools — What Exists, What Doesn’t, What We Built

    AI Citation Monitoring Tools — What Exists, What Doesn’t, What We Built

    The Lab · Tygart Media
    Experiment Nº 570 · Methodology Notes
    METHODS · OBSERVATIONS · RESULTS

    You want to monitor whether AI systems are citing your content. What tools actually exist for this, what they do, what they don’t do, and what we’ve built ourselves when nothing on the market fit.

    The Market as of April 2026

    The AI citation monitoring category is real but nascent. Here’s an honest inventory:

    Established SEO Platforms Adding AI Visibility Metrics

    Several major SEO platforms have added “AI visibility” or “AI search” modules in the past 6–12 months. These generally track:

    • Whether your domain appears in AI Overviews for tracked keywords (via SERP scraping)
    • Brand mentions in AI-generated snippets
    • Comparative visibility versus competitors in AI search results

    Ahrefs, Semrush, and Moz have all moved in this direction to varying degrees. Verify current feature availability — this has been an active development area and capabilities have changed rapidly.

    Mention Monitoring Tools Expanding to AI

    Brand mention tools like Brand24 and Mention have begun tracking AI-generated content that includes brand references. The challenge: they’re tracking brand name occurrences in crawled content, not necessarily AI citation events. Useful for brand visibility in AI-generated content that gets published, less useful for tracking in-session citations.

    Purpose-Built AI Citation Tools (Emerging)

    Several purpose-built tools targeting AI citation tracking specifically have launched or raised funding in early 2026. This category is moving fast. As of our last check:

    • Tools focused on tracking specific brand or entity mentions across AI platforms
    • API-first tools targeting developers who want to build citation monitoring into their own workflows
    • Dashboard tools with pre-built query sets for common industry categories

    Treat any specific product recommendation here as a starting point for your own research — the category will look different in 6 months.

    Google Search Console

    The strongest existing tool, and it’s free. AI Overviews that cite your pages register as impressions and clicks in GSC under the relevant queries. This is first-party data from Google itself. Limitation: covers only Google AI Overviews, not Perplexity, ChatGPT, or other platforms.

    What We Built

    When no existing tool covered the specific workflows we needed, we built our own. The stack:

    Perplexity API Query Runner

    A Cloud Run service that runs a predefined query set against Perplexity’s API on a weekly schedule. It parses the citations field from each response, checks for domain appearances, and writes results to a BigQuery table. Total engineering time: roughly one day. Ongoing cost: minimal (Cloud Run idle cost + Perplexity API usage).

    The output: a weekly BigQuery record per query showing which domains Perplexity cited, with timestamps. Trend queries show citation rate over time by query cluster.
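
    The trend query itself is short. A sketch of the weekly rollup, assuming the table shape our runner writes (one row per query per run, with a boolean for domain appearance; the project, dataset, and column names are ours, not a standard schema):

    ```python
    from google.cloud import bigquery

    client = bigquery.Client()

    # Weekly citation rate per query cluster. Assumes one row per (query, run)
    # with a domain_cited boolean, which is what our runner logs.
    sql = """
    SELECT
      DATE_TRUNC(run_date, WEEK) AS week,
      query_cluster,
      COUNTIF(domain_cited) / COUNT(*) AS citation_rate
    FROM `your-project.citation_monitoring.perplexity_runs`
    GROUP BY week, query_cluster
    ORDER BY week, query_cluster
    """

    for row in client.query(sql).result():
        print(row.week, row.query_cluster, f"{row.citation_rate:.1%}")
    ```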

    GSC AI Overview Monitor

    Not a custom build — just systematic review of GSC data. We check weekly which queries are generating AI Overview impressions for our tracked sites. The signal: if a page is generating AI Overview impressions on new queries, that’s a citation event.

    Manual ChatGPT Sampling

    For highest-priority queries, manual weekly sampling of ChatGPT with web search enabled. We log results to a shared spreadsheet. Less scalable than the API approach, but ChatGPT’s web search activation is inconsistent enough that API automation adds complexity without proportional reliability gain.

    What Doesn’t Exist (That Would Be Useful)

    The tool gaps that we still feel:

    • Cross-platform citation dashboard: A single view showing citation rate across Perplexity, ChatGPT, Gemini, and AI Overviews for the same query set. Nobody has built this cleanly yet.
    • Historical citation rate database: Knowing your citation rate is useful. Knowing whether it improved after you published a new piece of content is more useful. The temporal correlation is hard to establish with spot-check sampling.
    • Competitor citation tracking at scale: Easy to check manually for specific queries; hard to monitor systematically across a large competitor set and query space.

    These gaps exist because the category is new, not because the problems are technically hard. Expect the tool landscape to fill in significantly over the next 12 months.

    How to calculate citation rate: What Is AI Citation Rate?. How to set up tracking: How to Track When ChatGPT or Perplexity Cites Your Content. How to optimize for citations: How to Write Content That AI Systems Actually Cite.


    The Perplexity API monitoring stack we built runs on Claude. For the hosted infrastructure context: Claude Managed Agents Pricing Reference | Complete FAQ.

  • What Is AI Citation Rate? (And How to Calculate Yours)

    What Is AI Citation Rate? (And How to Calculate Yours)

    AI citation rate is a metric that doesn’t have a standard definition yet, which means everyone using the term might mean something slightly different. Here’s what it is, how to calculate it, and what it actually measures — and doesn’t.

    Definition

    AI Citation Rate

    The percentage of sampled AI queries where a specific domain or URL appears as a cited source in the AI system’s response.

    Formula: (Queries where your domain appeared as a source) ÷ (Total queries sampled) × 100

    A Concrete Example

    You run 50 queries in Perplexity across your core topic cluster. Your domain appears as a cited source in 12 of those responses. Your AI citation rate for that query set on that platform: 12/50 = 24%.

    That’s the basic calculation. The complexity is in what you define as your query set, which platforms you sample, and what counts as a “citation.”

    What Counts as a Citation

    Not all AI source mentions are equal. Some distinctions worth tracking separately:

    • Direct URL citation: The AI explicitly lists your URL as a source. Highest confidence — trackable programmatically via API.
    • Domain mention: Your domain name appears in the response text but not necessarily as a formal source citation.
    • Brand mention: Your brand name appears in the response. May or may not correlate with your web content being the source.
    • Implied citation: Content clearly derived from your page but no explicit attribution. Only detectable through content fingerprinting — difficult at scale.

    For tracking purposes, direct URL citation is the most reliable signal. Brand mentions are noisier but still worth tracking for brand visibility purposes.

    How to Calculate It

    Step 1: Define Your Query Set

    Select 20–100 queries where you want to appear. Good sources for your query set:

    • Your highest-impression GSC queries (you rank for these — do AI systems cite you?)
    • Queries where you’ve published dedicated content
    • Queries from your keyword research that match your expertise
    • Questions your clients or prospects actually ask

    Step 2: Sample Across Platforms

    Run each query in Perplexity (most trackable — consistent citation format), ChatGPT with web search enabled, and Google AI Overviews (via organic search). Track results separately by platform — citation rates vary significantly between platforms for the same query set.

    Step 3: Log Results

    For each query on each platform, record:

    • Whether your domain appeared as a citation (binary: yes/no)
    • Position if ranked (first citation, third citation, etc.)
    • Date of query

    Step 4: Calculate Rate

    Aggregate by time period (weekly or monthly). Calculate separately by platform and by topic cluster — aggregate rate across all platforms and queries hides the variation that’s actually useful.
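
    Here's Steps 3 and 4 as a minimal sketch; the log fields and sample rows are illustrative, not a required schema:

    ```python
    from collections import defaultdict

    # One row per query per platform per sampling run (Step 3's log).
    log = [
        {"platform": "perplexity", "cluster": "core-topic", "cited": True},
        {"platform": "perplexity", "cluster": "core-topic", "cited": False},
        {"platform": "chatgpt",    "cluster": "core-topic", "cited": False},
        # ... one entry per sampled query
    ]

    # (platform, cluster) -> [citations, queries sampled]
    tally = defaultdict(lambda: [0, 0])
    for row in log:
        key = (row["platform"], row["cluster"])
        tally[key][0] += row["cited"]
        tally[key][1] += 1

    for (platform, cluster), (cited, sampled) in sorted(tally.items()):
        print(f"{platform} / {cluster}: {cited}/{sampled} = {cited / sampled:.0%}")
    ```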

    Step 5: Establish Baseline, Then Track Change

    Your first 4–6 weeks of data sets your baseline. After that, track directional change — is the rate improving, declining, or stable? Correlate changes with content updates, new publications, and competitor activity.

    What Citation Rate Actually Measures (And Doesn’t)

    AI citation rate is a proxy for content authority signal in AI systems — not a direct ranking factor you can optimize mechanically. It reflects:

    • Whether your content is being indexed and surfaced by AI systems for your target queries
    • Whether your content structure and freshness match what AI systems prefer to cite
    • Relative authority versus competitors for the same query space

    It doesn’t measure:

    • Whether AI systems are using your content without citation (training data influence)
    • User behavior after AI responses (do they click through to your site?)
    • Revenue impact of being cited (cited ≠ converting)

    Benchmarks and Context

    Because this metric is new, industry benchmarks don’t exist yet. What matters is your own trend line, not comparison to a published standard. A 20% citation rate in a highly competitive topic cluster might represent strong performance; 20% in a niche you should dominate might indicate underperformance. Context is everything.

    For the full monitoring setup: How to Track When ChatGPT or Perplexity Cites Your Content. For tools available: AI Citation Monitoring Tools Comparison. For content optimization: How to Write Content That AI Systems Actually Cite.


    For the agent infrastructure behind automated citation tracking: Claude Managed Agents Pricing and FAQ Hub.

  • How to Track When ChatGPT or Perplexity Cites Your Content

    How to Track When ChatGPT or Perplexity Cites Your Content

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    ChatGPT cited a competitor’s blog post instead of yours. Perplexity summarized the wrong article. An AI answer engine described your service category without mentioning you. You’d like to know when this happens — and whether it’s improving over time.

    The problem: no one has built a clean, turnkey tool for this yet. Here’s what actually exists, what we’ve pieced together, and what a real tracking setup looks like.

    Why This Is Hard

    Traditional search visibility tracking is solved: rank trackers show where you rank, and backlink indexes like Ahrefs and Semrush show who’s linking to what. AI citation tracking has no equivalent infrastructure. Here’s why:

    • Non-deterministic outputs: Ask ChatGPT the same question twice; you may get different sources cited, or no sources at all. There’s no persistent ranking to track.
    • No public citation index: Google’s index is crawlable. There’s no equivalent for “content that AI systems have cited in responses.” You can’t pull a report.
    • Variable source disclosure: Perplexity shows sources. ChatGPT’s web-enabled mode shows sources sometimes. Gemini shows sources. Claude generally doesn’t show sources in the same way. Tracking works where sources are disclosed; it breaks where they aren’t.
    • Query sensitivity: Your content might get cited for one phrasing and completely missed for a near-synonym. There’s no search volume data to tell you which phrasings matter.

    What Actually Exists Today

    Manual Query Sampling

    The only fully reliable method: run queries yourself and check the sources cited. For a content monitoring program this might look like:

    • Define 20–50 queries where you want to appear (covering your core topics)
    • Run each query in Perplexity, ChatGPT (web-enabled), and Gemini weekly or biweekly
    • Log whether your domain appears in cited sources
    • Track citation rate (appearances / total queries run) over time

    This is tedious but gives you ground truth. It’s what a real monitoring program looks like before you automate it.

    Perplexity Source Tracking

    Perplexity consistently displays its sources, making it the most tractable platform for systematic citation tracking. A simple automated approach:

    • Use Perplexity’s API to query your target questions programmatically
    • Parse the citations field in the response
    • Check whether your domain appears
    • Log and aggregate over time

    Perplexity’s API is available with a subscription. The citations field returns the URLs Perplexity used to generate its answer. You can run this as a scheduled Cloud Run job and dump results to BigQuery for trend analysis.
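
    Here's the core of that loop as a sketch. The endpoint, the sonar model name, and the top-level citations field match Perplexity's API as of our last check; verify against their current docs before building on this.

    ```python
    import requests

    PERPLEXITY_KEY = "pplx-..."          # your API key
    TRACKED_DOMAIN = "tygartmedia.com"   # the domain you're tracking
    QUERIES = ["what is ai citation rate", "how to track ai citations"]

    def check_citations(query: str) -> tuple[list[str], bool]:
        """Run one query; return cited URLs and whether our domain appears."""
        resp = requests.post(
            "https://api.perplexity.ai/chat/completions",
            headers={"Authorization": f"Bearer {PERPLEXITY_KEY}"},
            json={"model": "sonar",  # current online model name at last check
                  "messages": [{"role": "user", "content": query}]},
            timeout=60,
        )
        resp.raise_for_status()
        citations = resp.json().get("citations", [])
        return citations, any(TRACKED_DOMAIN in url for url in citations)

    for q in QUERIES:
        urls, cited = check_citations(q)
        print(f"{q!r}: cited={cited} ({len(urls)} sources)")
    ```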

    ChatGPT Web Search Mode

    When ChatGPT uses web search (either via the browsing tool or search-enabled API), it returns source citations. The search-enabled ChatGPT API (available with OpenAI API access) gives you programmatic access to these citations. Same approach: define queries, run them, parse citations, track your domain.

    Limitation: not all ChatGPT responses use web search. For queries it answers from training data, no source is cited and you have no visibility into whether your content influenced the answer.

    Google AI Overviews

    Google AI Overviews (formerly SGE) shows cited sources inline in search results. You can track these through Google Search Console for your own content — if Google’s AI Overview cites your page, that page gets an impression and potentially a click recorded in GSC under that query. This is the only AI citation signal with first-party tracking infrastructure.
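
    If you'd rather pull that GSC data programmatically than eyeball it in the UI, the Search Console API exposes the same query-level impressions. A sketch (site URL, dates, and the service-account file are placeholders; GSC doesn't label AI Overview impressions separately, so the workable signal is a page surfacing on new queries week over week):

    ```python
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        "service-account.json",
        scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
    )
    gsc = build("searchconsole", "v1", credentials=creds)

    # Pull query+page impressions for the week; diff against last week's pull
    # to spot pages surfacing on new queries.
    report = gsc.searchanalytics().query(
        siteUrl="https://www.example.com/",
        body={
            "startDate": "2026-04-01",
            "endDate": "2026-04-07",
            "dimensions": ["query", "page"],
            "rowLimit": 1000,
        },
    ).execute()

    for row in report.get("rows", []):
        query, page = row["keys"]
        print(query, page, row["impressions"], row["clicks"])
    ```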

    Emerging Tools

    As of April 2026, several tools are building toward AI citation tracking as a category: mention monitoring services that have added AI search coverage, SEO platforms adding “AI visibility” metrics, and purpose-built tools targeting this specific problem. The category is forming but not mature. Verify current capabilities — this space has changed significantly in the past six months.

    What a Real Monitoring Setup Looks Like

    Here’s the practical stack we’ve assembled for tracking citation presence across AI platforms:

    1. Define your query set: 30–50 queries across your core topic clusters. Weight toward queries where you have existing content and where you’re trying to establish authority.
    2. Perplexity API integration: Scheduled weekly run. Parse citations. Log domain appearances to a tracking spreadsheet or BigQuery table.
    3. ChatGPT web search sampling: Less systematic — manual sampling weekly for highest-priority queries. The API approach works but requires more engineering to handle variability in when web search activates.
    4. Google Search Console: Monitor AI Overview impressions. This is your strongest signal because it’s Google’s own data, not sampled queries.
    5. Baseline and trend: After 4–6 weeks of tracking, you have a baseline citation rate. Changes correlate (imperfectly) with content quality improvements, new publications, and competitor activity.

    What Citation Rate Actually Tells You

    Citation rate — your domain appearances divided by total queries sampled — is a proxy metric, not a direct ranking signal. What drives it:

    • Content freshness: AI systems prefer recently indexed, recently updated content for queries about current information
    • Structural clarity: Content with explicit Q&A structure, defined terms, and direct factual claims gets cited more reliably than narrative content
    • Domain authority signals: The same signals that help SEO rankings help AI citation rates — but the weighting may differ by platform
    • Entity specificity: Content that clearly establishes your brand as an entity with defined characteristics gets cited more consistently than generic content

    For the content optimization angle: AI Citation Monitoring Guide. For the broader GEO picture: What Managed Agents means for content visibility.

    For the hosted agent infrastructure context: Claude Managed Agents Pricing Reference — how the billing works for agents that could automate citation monitoring workflows.

  • The Real Monthly Cost of Running Claude Managed Agents 24/7

    The Real Monthly Cost of Running Claude Managed Agents 24/7

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    If you’re considering running Claude Managed Agents around the clock, you want a number. Not “it depends.” An actual number you can put in a budget. Here’s the math, worked out by scenario, with the honest caveats about where the real costs are.

    The Formula

    Total monthly cost = (Active session hours × $0.08) + token costs + optional tool costs

    The $0.08/session-hour charge only applies during active execution. Idle time — waiting for input, tool confirmations, external API responses — doesn’t count. This matters significantly for 24/7 workloads, because very few agents are active 100% of the time even when “running around the clock.”
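
    Here's the formula as a small calculator, using the rates this article works with: $0.08 per active session-hour, and $3/$15 per million input/output tokens for Sonnet-class models. Rates change, so treat it as a template, not a quote.

    ```python
    def monthly_cost(active_hours_per_day: float,
                     input_tok_per_hr: int = 0,
                     output_tok_per_hr: int = 0,
                     days: int = 30,
                     session_rate: float = 0.08,   # $/active session-hour
                     in_rate: float = 3.00,        # $/M input tokens
                     out_rate: float = 15.00):     # $/M output tokens
        active_hours = active_hours_per_day * days
        runtime = active_hours * session_rate
        tokens = active_hours * (input_tok_per_hr * in_rate
                                 + output_tok_per_hr * out_rate) / 1_000_000
        return {"runtime": round(runtime, 2),
                "tokens": round(tokens, 2),
                "total": round(runtime + tokens, 2)}

    # Monitoring agent: ~5% of 24h active, light token flow
    print(monthly_cost(24 * 0.05, 5_000, 1_000))
    # Continuous pipeline: 20 active hrs/day, heavy token flow
    print(monthly_cost(20, 100_000, 20_000))  # tokens dominate: $360 vs $48
    ```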

    The Maximum Theoretical Cost

    Scenario: Agent running continuously, zero idle time, 24 hours a day, 30 days a month.

    • Session runtime: 24 hrs × $0.08 × 30 days = $57.60/month
    • Token costs: separate, highly variable (see below)

    $57.60/month is the ceiling on session runtime charges. You cannot pay more than this in session fees under any 24/7 scenario. But here’s the reality: that ceiling assumes zero idle time across the entire month, which doesn’t describe any real production agent.

    Realistic 24/7 Scenarios

    Monitoring Agent (High Idle Ratio)

    Runs continuously watching for triggers — error alerts, specific data patterns, incoming requests. Activates on trigger, processes, returns to monitoring state.

    • Assumption: 5% active execution time (watching 95% of the time, executing 5%)
    • Active hours: 24 × 30 × 0.05 = 36 hours/month
    • Session runtime: 36 × $0.08 = $2.88/month
    • Token costs: low — moderate bursts on trigger events
    • Realistic total: $5–15/month

    Customer Support Agent (Business Hours Active)

    “24/7” in the sense of always-available, but actual request volume concentrates in business hours. Waits for tickets, processes them, waits again.

    • Assumption: 8 hours/day active execution, 16 hours waiting
    • Active hours: 8 × 30 = 240 hours/month
    • Session runtime: 240 × $0.08 = $19.20/month
    • Token costs: depends heavily on ticket volume and average length
    • At 100 tickets/day with moderate length: likely $30–80/month in tokens
    • Realistic total: $50–100/month

    Continuous Autonomous Pipeline

    Batch processing agent that runs continuously through a queue with minimal waiting — the closest to true 24/7 active execution.

    • Assumption: 20 hours/day truly active (4 hours queue exhaustion/maintenance)
    • Active hours: 20 × 30 = 600 hours/month
    • Session runtime: 600 × $0.08 = $48/month
    • Token costs: high — continuous processing means continuous token consumption
    • This is where tokens become the dominant cost driver by a significant margin
    • Realistic total: $200–500+/month (tokens dominate)

    The Real Variable: Token Costs

    For any 24/7 workload that’s genuinely busy, token costs will substantially exceed session runtime costs. The math:

    A moderately active agent processing 10,000 input tokens and 2,000 output tokens per hour with Claude Sonnet 4.6:

    • Input: 10,000 tokens × $3/million = $0.03/hour
    • Output: 2,000 tokens × $15/million = $0.03/hour
    • Token cost: $0.06/hour vs. session runtime of $0.08/hour — roughly equal at this volume

    Scale to 100,000 input tokens and 20,000 output tokens per hour (a busy processing agent):

    • Input: $0.30/hour; Output: $0.30/hour
    • Token cost: $0.60/hour vs. session runtime of $0.08/hour — tokens are 7.5× the runtime charge

    The session runtime fee is flat and bounded. Token costs scale with workload volume. For high-volume 24/7 agents, optimize token efficiency (prompt caching, context management, output brevity) before worrying about the session runtime charge.

    Prompt Caching Changes the Token Math

    If your agent has a large, stable system prompt — common in agents with extensive tool definitions or knowledge bases — prompt caching dramatically reduces input token costs. Cache hits cost a fraction of base input rates. For a 24/7 agent with a 20,000-token system prompt hitting the same context repeatedly, caching that prompt can cut input costs by 80–90%. The session runtime charge is unchanged, but the total cost picture improves significantly.
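
    The arithmetic behind that 80–90% figure, assuming the roughly 90% cache-read discount Anthropic has published for prompt caching (verify current rates for your model):

    ```python
    base_input = 3.00 / 1_000_000      # $/input token, Sonnet-class base rate
    cache_read = 0.10 * base_input     # assumed ~90% discount on cache hits

    system_prompt = 20_000             # tokens, resent on every call
    calls_per_hour = 60

    uncached = system_prompt * calls_per_hour * base_input   # $3.60/hour
    cached   = system_prompt * calls_per_hour * cache_read   # $0.36/hour

    print(f"uncached: ${uncached:.2f}/hr, cached: ${cached:.2f}/hr "
          f"({1 - cached / uncached:.0%} saved)")
    ```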

    The Budget Summary

    Agent Type | Runtime/mo | Typical Total
    Monitoring / low activity | ~$3 | $5–15
    Support agent (business hours volume) | ~$19 | $50–100
    Continuous processing pipeline | ~$48 | $200–500+
    Theoretical maximum (zero idle) | $57.60 | Unbounded (tokens)

    Complete pricing reference: Claude Managed Agents Pricing Guide. How idle time affects billing: Idle Time and Billing Explained. All questions: FAQ Hub.

  • Claude Managed Agents vs. OpenAI Agents API — A Direct Comparison

    Claude Managed Agents vs. OpenAI Agents API — A Direct Comparison

    Quick Comparison — May 2026

    Feature | Claude Managed Agents | OpenAI Agents API
    Model lock-in | Claude only | GPT-4o, o3, multi-model
    Orchestration | Fully managed — no infra to build | SDK-based — you build the harness
    Memory | Built-in (public beta, May 2026) | Manual — implement via vector DB
    Multi-agent | Native (May 2026) — lead + specialists | Supported via Swarm/SDK patterns
    Sandboxing | Secure containers included | You manage execution environment
    Pricing model | Tokens (Sonnet 4.6 / Opus 4.7) + $0.08/session-hr | Tokens (GPT-4o / o3 rates) + tool usage
    Best for | Claude-native stacks, fast production | Multi-model flexibility, existing OAI infra

    Model Accuracy Note — Updated May 2026

    Current flagship: Claude Opus 4.7 (claude-opus-4-7). Current models: Opus 4.7 · Sonnet 4.6 · Haiku 4.5. Claude Opus 4.6 referenced in this article has been superseded. See current model tracker →

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    You’re evaluating hosted agent infrastructure. Both Anthropic and OpenAI have one. Before you commit to either, here’s what’s actually different — not the marketing version, the architectural and pricing version.

    Bottom Line Up Front

    If your stack is Claude-native and you want to get to production fast without building orchestration infrastructure, Managed Agents is hard to beat. If you need multi-model flexibility or have OpenAI deeply embedded in your stack, the calculus changes. Lock-in is real on both sides.

    Still Deciding?

    I’ve run both. Email me your use case and I’ll tell you which one fits.

    No pitch. If Claude isn’t the right call for what you’re building, I’ll tell you that too.

    Email Will → will@tygartmedia.com

    What Each Product Is

    Claude Managed Agents

    Anthropic’s hosted runtime for long-running Claude agent work. You define an agent (model, system prompt, tools, guardrails), configure a cloud environment, and launch sessions. Anthropic handles sandboxing, state management, checkpointing, tool orchestration, and error recovery. Launched April 8, 2026 in public beta.

    OpenAI Agents API

    OpenAI’s hosted agent infrastructure layer, launched earlier in 2026. Provides similar capabilities: hosted execution, tool integration, multi-agent coordination. Supports multiple OpenAI models (GPT-4o, o1, o3, etc.).

    Model Flexibility

    Managed Agents: Claude models only. Sonnet 4.6 and Opus 4.6 are the primary options for agent work. No multi-model mixing within the managed infrastructure.

    OpenAI Agents API: OpenAI models only, but a wider current model lineup (GPT-4o, o1, o3-mini depending on task). Also single-provider within its own ecosystem — not multi-model in the cross-provider sense.

    The practical implication: If your evaluation is “I want the best model for this specific task regardless of provider,” neither hosted solution gives you that. Both lock you to their provider’s models. The multi-model comparison matters for self-hosted frameworks (LangChain, etc.), not for managed hosted solutions.

    Pricing Structure

    Claude Managed Agents: Standard Claude token rates + $0.08/session-hour of active runtime. Idle time doesn’t bill. Code execution containers included in session runtime — not separately billed.

    OpenAI Agents API: Standard OpenAI token rates + usage-based tooling costs. Pricing structure varies by tool and model tier. Verify current rates at OpenAI’s pricing page — rates have changed multiple times as their agent products have evolved.

    Direct comparison difficulty: Without modeling the same specific workload against both providers’ current rates, headline comparisons mislead. Token rates differ by model, model capabilities differ, and “session runtime” isn’t a category OpenAI uses. Model the workload, not the headline number.

    Infrastructure and Lock-In

    Both solutions create meaningful lock-in. This isn’t a criticism — it’s an honest description of the trade-off you’re making:

    Claude Managed Agents lock-in: Your agents run on Anthropic’s infrastructure with their tools, session format, sandboxing model, and checkpointing. Migrating to OpenAI’s Agents API or self-hosted infrastructure requires rearchitecting session management, tool integrations, and guardrail logic. One developer’s reaction at launch: “Once your agents run on their infra, switching cost goes through the roof.”

    OpenAI Agents API lock-in: Symmetric. Same dynamic in reverse. OpenAI’s session format, tool integration patterns, and infrastructure assumptions create equivalent switching costs to move to Anthropic’s platform.

    The honest framing: You’re not choosing “open” vs. “locked.” You’re choosing which provider’s lock-in you’re more comfortable with, given your existing infrastructure, model preferences, and vendor relationship.

    Data Sovereignty

    Both solutions run your data on provider-managed infrastructure. Neither currently offers native on-premise or multi-cloud deployment for the managed hosted layer. For companies with strict data sovereignty requirements, this is a parallel constraint on both platforms — not a differentiator.

    Production Track Record

    Claude Managed Agents: Launched April 8, 2026. Production users at launch: Notion, Asana, Rakuten (5 agents in one week), Sentry, Vibecode, Allianz. Anthropic’s agent developer segment run-rate exceeds $2.5 billion.

    OpenAI Agents API: Earlier launch gives more time in production, but the product has been revised significantly since initial release. Longer production history, but also more legacy architectural assumptions baked in.

    When to Choose Claude Managed Agents

    • Your stack is already Claude-native (you’re using Sonnet or Opus for most model calls)
    • You want to reach production without building orchestration infrastructure
    • Your tasks are long-running and asynchronous — the session-hour model fits naturally
    • The Notion, Asana, or Sentry integrations are relevant to your workflow
    • You want Anthropic’s specific safety and reliability guarantees

    When to Consider OpenAI’s Agents API Instead

    • Your stack is already heavily OpenAI-integrated (GPT-4o for primary model work, existing tool integrations)
    • You need access to reasoning models (o1, o3) for specific task types — Anthropic’s equivalent is Claude’s extended thinking, which has different characteristics
    • The specific tool integrations in OpenAI’s ecosystem are better matched to your stack
    • You want more production time at scale before committing to a platform

    When to Use Neither (Self-Hosted Frameworks)

    LangChain, LlamaIndex, and similar self-hosted frameworks remain viable — and better — when you genuinely need multi-model flexibility, on-premise execution, or tighter loop control than either hosted solution provides. The trade-off is engineering effort: months of infrastructure work that Managed Agents or OpenAI’s API eliminates.

    Complete pricing breakdown: Claude Managed Agents Pricing Reference. All Managed Agents questions: FAQ Hub. Enterprise deployment example: Rakuten: 5 Agents in One Week.

  • How Claude Managed Agents Handles Idle Time (And Why It Matters for Your Bill)

    How Claude Managed Agents Handles Idle Time (And Why It Matters for Your Bill)

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    The most counterintuitive thing about Claude Managed Agents pricing is what you don’t pay for. Most people, when they hear “$0.08 per session-hour,” mentally model a virtual machine running continuously. That’s the wrong mental model. Here’s the right one, and why it matters for your bill.

    The Core Distinction: Active vs. Idle

    Managed Agents session runtime only accrues while your session’s status is running. The session can exist — open, initialized, capable of continuing — without accumulating runtime charges when it’s not actively executing.

    The specific states that do not count toward your $0.08/hr charge:

    • Time spent waiting for your next message
    • Time waiting for a tool confirmation
    • Time waiting on an external API response your tool is calling
    • Rescheduling delays
    • Terminated session time

    This is a meaningful architectural decision by Anthropic. They’re billing on what actually taxes their compute — active execution — not on session existence or wall-clock time.

    Why This Is Different From How You Might Expect Billing to Work

    Compare three billing models:

    Virtual machine billing (what this is not): You pay for every hour the instance exists, whether it’s idle or saturated. A VM running 24/7 with 10% actual utilization still costs 24 hours/day.

    Lambda/function billing (closer analogy): AWS Lambda bills on execution duration and invocation count — you pay when code actually runs, not when a function is “available.” Idle Lambda functions cost nothing.

    Managed Agents billing (what this actually is): Closer to Lambda than VM. You pay $0.08 per hour of active execution. A session that stays open for 2 hours of wall-clock time but spends 90 minutes waiting bills only for the 30 active minutes: $0.08 × 0.5 hours = $0.04, not $0.08 × 2 hours = $0.16.
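
    In code, the distinction is just which hours you multiply. A minimal sketch:

    ```python
    SESSION_RATE = 0.08  # $/hour, active execution only

    def session_charge(wall_clock_hours: float, idle_hours: float) -> float:
        """Bill active time only; idle time inside the session is free."""
        active = wall_clock_hours - idle_hours
        return round(active * SESSION_RATE, 4)

    print(session_charge(2.0, 1.5))  # 0.5 active hours -> $0.04
    print(session_charge(2.0, 0.0))  # fully active     -> $0.16
    ```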

    A Real Scenario: The Human-in-the-Loop Agent

    Consider an agent that processes your inbox for action items and waits for your approval before sending replies. Wall-clock time: 4 hours open during your workday. Actual active execution: 20 minutes of processing across that 4-hour window, with the rest spent waiting for your review decisions.

    • VM billing equivalent: 4 hours × rate = significant charge
    • Managed Agents billing: 20 minutes × $0.08/hr = $0.027

    The difference is real. For interaction-heavy agents where the agent frequently waits for human decisions, the idle-time exclusion significantly reduces costs versus a naive per-hour model.

    A Real Scenario: The Autonomous Batch Agent

    Now consider an agent running a fully autonomous content pipeline — no human checkpoints, just continuous execution through a queue. Wall-clock time and active execution time are nearly identical because the agent never waits.

    • A 2-hour autonomous batch: 2 hours × $0.08 = $0.16

    Here, the idle-time model provides no benefit — the agent has no idle time. The billing is effectively equivalent to per-hour pricing because execution is continuous.

    Code Execution Containers Are Included

    One more billing nuance worth knowing: when your agent runs code, the execution happens in sandboxed Linux containers. These containers are not separately billed on top of session runtime. The $0.08/hr covers both the session runtime and the container execution. This is explicitly documented by Anthropic and represents meaningful savings if your agent is doing significant code execution work — you’re not paying twice.

    What This Means for Workload Design

    If you’re designing agent workflows and have the choice between architectures, the billing model creates a useful signal:

    • Agents that wait on humans: Metered billing is favorable — you only pay for the actual reasoning and execution time, not the human decision time
    • Fully autonomous agents: Billing approaches equivalent to per-hour rates — optimize these on token efficiency, not idle reduction
    • Scheduled batch agents: Natural fit — run when needed, terminate when done, no idle accumulation

    The 24/7 Agent Math

    For anyone doing the 24/7 always-on calculation: the maximum theoretical runtime exposure is 24 hrs × $0.08 × 30 days = $57.60/month in session fees. But a 24/7 agent with zero idle time is rare in practice. Agents that sleep between triggers, wait on external data, or hold for human decisions have meaningful idle windows that reduce the actual charge below the theoretical ceiling.

    Full monthly cost analysis: The Real Monthly Cost of Running Claude Managed Agents 24/7. Pricing reference: Complete Pricing Guide. All questions: FAQ Hub.

  • Claude Managed Agents Rate Limits — What 60 Requests Per Minute Means in Practice

    Claude Managed Agents Rate Limits — What 60 Requests Per Minute Means in Practice

    The Lab · Tygart Media
    Experiment Nº 561 · Methodology Notes
    METHODS · OBSERVATIONS · RESULTS

    You’re planning to run Claude Managed Agents at scale. You’ve modeled the token costs, the session-hour charge, the workload cadence. Then you hit the actual constraint: rate limits. Here’s what 60 requests per minute actually means in practice, and whether it’s going to be your ceiling.

    The Two Limits You Need to Know

    Managed Agents has two endpoint-specific rate limits, separate from your standard Claude API limits:

    • Create endpoints: 60 requests per minute
    • Read endpoints: 600 requests per minute

    Your organization-level API limits apply on top of these. If your org is on a tier with a lower requests-per-minute ceiling, that’s the actual binding constraint.

    What “60 Create Requests Per Minute” Actually Means

    A create request, in Managed Agents context, is typically a session creation call — starting a new agent session. 60/minute means you can start 60 sessions per minute maximum. For almost all real workloads, this is not the binding constraint. Here’s why:

    Think about what generates create requests. If you’re running a batch pipeline that starts one new agent session per content item, processing 60 items per minute would saturate the limit. But a 60-item-per-minute content pipeline is running 3,600 items per hour — a genuinely high-volume operation. Most production agent workloads don’t look like this. They look like one session that runs for minutes or hours, processes multiple tasks within that session, and terminates when done.

    The create limit matters most for architectures where you’re spinning up a new session per task rather than running tasks within a persistent session. If that’s your pattern, 60/minute is a hard ceiling you’ll need to design around.
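
    If session-per-task is genuinely your pattern, put a client-side governor in front of the create calls rather than discovering the ceiling through 429s. A minimal spacing throttle; the 60/minute figure is the published limit, the rest is ours:

    ```python
    import threading
    import time

    class CreateThrottle:
        """Space out session-create calls to stay under a per-minute ceiling."""

        def __init__(self, max_per_minute: int = 60):
            self.interval = 60.0 / max_per_minute   # seconds between calls
            self.lock = threading.Lock()
            self.next_slot = time.monotonic()

        def acquire(self) -> None:
            with self.lock:
                now = time.monotonic()
                wait = max(0.0, self.next_slot - now)
                self.next_slot = max(now, self.next_slot) + self.interval
            if wait:
                time.sleep(wait)

    throttle = CreateThrottle(max_per_minute=60)
    # call throttle.acquire() immediately before each session-create request
    ```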

    What “600 Read Requests Per Minute” Actually Means

    Read requests include polling session status, reading agent output, checking checkpoints, and retrieving session state. 600/minute is a relatively generous limit — that’s 10 reads per second. For a monitoring dashboard polling 10 active sessions every second, you’d hit this. For most production monitoring patterns (checking status every 5–30 seconds per session), you’re well under the ceiling.

    The read limit becomes relevant in high-concurrency architectures where many sessions are running in parallel and all being polled aggressively. If you’re running 50 concurrent agents and checking each one every 2 seconds, that’s 25 reads/second — well over the 10 reads/second ceiling. Stretch the polling interval to 10 seconds and the same fleet drops to 5 reads/second, comfortably within the limit.

    The Limit That’s More Likely to Actually Stop You

    For most agent workloads, token throughput limits hit before request rate limits do. The reasoning: a long-running agent session processing significant context generates a lot of tokens. If you’re running many such sessions in parallel, you’ll hit your organization’s token-per-minute limit before you hit 60 sessions created per minute.

    Token limits depend on your API tier. Higher tiers have higher token throughput limits. Rate limit increases and custom limits for high-volume enterprise customers are negotiated with Anthropic’s sales team.

    Designing Around the 60 Create Limit

    If your architecture genuinely needs more than 60 new sessions per minute, the primary design pattern is batching more work within each session rather than creating more sessions. A single Managed Agents session can handle sequential tasks — you don’t need a new session per task if your tasks can be queued and processed within one session’s lifecycle.

    The tradeoff: longer-running sessions accumulate more runtime charge ($0.08/hr active). For most workloads, the efficiency gains from batching outweigh the marginal runtime cost.

    The Agent Teams Implication

    Agent Teams — Managed Agents’ multi-agent coordination feature — coordinate multiple Claude instances with independent contexts. Each instance in an Agent Team is a separate entity from a context standpoint. How Agent Team member sessions count against the create rate limit is worth verifying against current documentation if you’re architecting a high-concurrency Agent Teams deployment.

    For Enterprise Workloads

    If you’re evaluating Managed Agents for enterprise-scale deployment and the published limits don’t fit your volume requirements, contact Anthropic’s enterprise sales team. Rate limit increases for high-volume applications are a documented option — they’re negotiated, not self-serve.

    Contact: [email protected] or through the Claude Console.

    Frequently Asked Questions

    Does the 60 requests/minute limit apply to all API calls or just session creation?

    The 60/minute limit applies to create endpoints — session creation being the primary one. Read operations have a separate 600/minute limit. Standard Messages API calls are governed by your organization’s standard tier limits, not these Managed Agents-specific limits.

    Do subagents count against the create rate limit separately from the parent session?

    Subagents operate within the parent session’s context and report results upward — they’re architecturally different from new sessions. Verify current documentation for the precise rate-limit treatment of subagent creation calls vs. Agent Team session creation.

    What happens when I hit the rate limit?

    Standard API rate limit behavior applies — requests over the limit receive a 429 response. Implement exponential backoff in your session creation logic for any high-volume pattern that approaches the 60/minute ceiling.
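
    A minimal sketch of that backoff, with jitter. The endpoint URL and payload shape are placeholders, not the documented Managed Agents API:

    ```python
    import random
    import time

    import requests

    def create_with_backoff(url: str, payload: dict, headers: dict,
                            max_retries: int = 6) -> dict:
        """POST with exponential backoff plus jitter on 429 responses."""
        for attempt in range(max_retries):
            resp = requests.post(url, json=payload, headers=headers, timeout=30)
            if resp.status_code != 429:
                resp.raise_for_status()
                return resp.json()
            # 1s, 2s, 4s, ... capped at 60s, plus jitter to avoid retry herds
            time.sleep(min(60, 2 ** attempt) + random.uniform(0, 1))
        raise RuntimeError("create endpoint still rate-limited after retries")
    ```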

    How does this compare to OpenAI’s Agents API limits?

    Rate limit structures differ by product and tier. Direct comparison requires checking both providers’ current documentation for your specific tier. The full comparison: Claude Managed Agents vs. OpenAI Agents API.

    Full pricing context including rate limits: Claude Managed Agents Complete Pricing Reference. All questions: Claude Managed Agents FAQ.

  • What Notion’s Claude Managed Agents Integration Actually Does

    What Notion’s Claude Managed Agents Integration Actually Does

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    When Anthropic launched Claude Managed Agents, Notion was one of four launch partners. That detail got buried in the announcement. Here’s what it actually means for people who use Notion for knowledge work, and why “Notion voice input desktop” keeps showing up as a query against a Managed Agents page.

    Short answer: Managed Agents in Notion is an ambient intelligence layer. It’s not a chatbot in a sidebar. It’s an agent that watches your workspace and acts — without you directing every step.

    What the Notion Integration Actually Does

    Notion’s Claude Managed Agents integration runs as a persistent background agent with access to your workspace. The practical capabilities, as documented at launch:

    • Autonomous page updates: The agent can read, summarize, and rewrite Notion pages without manual triggers. You set a task; it works through it.
    • Cross-database synthesis: Pull data from multiple Notion databases, synthesize it, and write outputs to a target page or database entry
    • Meeting note processing: Ingest raw meeting notes and produce structured summaries, action items, and task entries in your project database
    • Workflow automation: Trigger actions based on database property changes — a status update in one database can kick off agent work in another

    The key difference from Notion AI (which Notion has had for some time): Notion AI is request-response. You ask it something; it answers. Managed Agents in Notion can be configured to run autonomously on a schedule or on trigger, keep working through multi-step tasks, and report back when done. It’s closer to a background employee than an on-demand assistant.

    Why This Showed Up in Search as “Notion Voice Input Desktop”

    This is worth explaining, because that query cluster is real and mildly interesting. The Managed Agents announcement included voice input functionality — the ability to interact with agents via voice in some contexts. People searching “notion voice input desktop” and “notion ai voice input desktop” were looking for whether this voice capability existed in the desktop client for Notion specifically.

    The honest answer as of April 2026: voice input capabilities are in preview or context-dependent. Verify current availability in Notion’s desktop client against their current documentation — this is an area that may have evolved since launch.

    The “Decoupled Brain and Hands” Model Applied to Notion

    Anthropic describes their Managed Agents architecture as decoupling the brain (Claude, the reasoning layer) from the hands (the sandboxed containers where actions execute). In Notion’s context, this maps cleanly:

    • The brain reads your Notion workspace, understands context, makes decisions about what to do
    • The hands execute — writing to pages, updating database entries, moving content between sections

    The brain and hands operate independently. The agent can reason about what your project needs without being tightly coupled to the specific API calls that will implement it. This matters because it means the agent can handle ambiguity — “clean up the Q2 notes and create action items” is a goal, not a procedure, and the agent figures out the procedure.

    What You Actually Configure

    To run Claude Managed Agents in Notion, you’re defining:

    • Task definition: What the agent is supposed to accomplish (in natural language or structured format)
    • Tool access: Which Notion databases, pages, and capabilities the agent can read and write
    • Guardrails: What the agent cannot do — pages it can’t modify, actions it must confirm before taking
    • Trigger: When the agent runs — on schedule, on database trigger, or on demand

    You don’t write the orchestration logic. Anthropic’s infrastructure handles session management, state persistence, and error recovery. If the agent hits an error mid-task, it checkpoints and recovers — you don’t lose progress.
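
    To make those four pieces concrete, a hypothetical sketch: the field names are ours for illustration, not Anthropic's published configuration schema.

    ```python
    # Hypothetical config sketch. Field names are illustrative only.
    agent_config = {
        "task": (
            "Each Friday, summarize the week's entries in Meeting Notes and "
            "create action items in the Projects database."
        ),
        "tool_access": {
            "read":  ["Meeting Notes", "Projects"],  # databases it may read
            "write": ["Projects"],                   # where it may write
        },
        "guardrails": {
            "never_modify": ["Company Wiki"],        # off-limits pages
            "confirm_first": ["delete", "external_share"],
        },
        "trigger": {"type": "schedule", "cadence": "weekly", "day": "friday"},
    }
    ```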

    The Practical Cost of Running Notion Agents

    Using Managed Agents in Notion triggers the same billing as any Managed Agents session: standard token rates plus $0.08/session-hour of active runtime. For typical knowledge work tasks:

    • A daily meeting summary agent running 15 minutes of active execution: ~$0.02/day in runtime (~$0.60/month), plus token costs for the volume of notes processed
    • A weekly database synthesis task running 45 minutes: ~$0.06/run

    For most knowledge workers, the session runtime cost is negligible — the token costs (driven by how much content the agent reads and writes) are the actual variable to model. See the complete pricing reference for worked examples.

    Asana and the Broader Pattern

    Asana was also a Managed Agents launch partner, and the integration pattern is similar: an agent that can read project data, update task statuses, move cards, and generate project summaries without constant human direction. The launch partner list (Notion, Asana, Rakuten, Sentry) suggests Anthropic targeted four categories: knowledge management (Notion), project management (Asana), enterprise operations (Rakuten), and developer tools (Sentry).

    That’s a deliberate wedge. If agents can handle the administrative layer of these four categories, the surface area for autonomous business work expands significantly.

    What This Means for How You Work

    The honest use case for most people reading this: you have a Notion workspace with databases that need regular synthesis, and you’re currently doing that manually. Managed Agents is the path to automating that synthesis without building and maintaining a custom integration.

    The constraint worth naming: you’re running your workspace data through Anthropic’s infrastructure. That’s the trade-off. For most knowledge work, the data sensitivity concern is low. For anything involving client data, legal documents, or proprietary strategy — read Anthropic’s data handling terms before configuring access.

    For the full Managed Agents setup and pricing context: Claude Managed Agents: Every Question Answered. For the enterprise deployment pattern: How Rakuten Deployed 5 Enterprise Agents in a Week.

  • Claude Managed Agents — Every Question Answered (Complete FAQ 2026)

    Claude Managed Agents — Every Question Answered (Complete FAQ 2026)

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    Everything people actually ask about Claude Managed Agents, answered straight. No preamble about “the exciting world of AI agents.” If you’re here, you already know why this matters — you just need answers.

    This page covers pricing, setup, capabilities, limits, comparisons, and the specific questions that don’t have obvious homes in Anthropic’s documentation. It updates as the beta evolves.

    Context

    Claude Managed Agents launched April 8, 2026 as a public beta. All answers reflect current documentation as of April 2026. Beta details change — verify specifics at platform.claude.com/docs.

    Pricing Questions

    What does Claude Managed Agents cost?

    Two charges: standard Claude API token rates (same as calling the Messages API directly) plus $0.08 per session-hour of active runtime. That’s the complete formula. See the complete pricing reference for worked examples by workload type.

    What exactly is a “session-hour” and when does it start billing?

    A session-hour is one hour of active session runtime — time when your session’s status is running. Billing is metered to the millisecond. It does not accrue during idle time, time waiting for your input, time waiting for tool confirmations, or after session termination.

    What’s included in the $0.08/session-hour charge?

    The session runtime charge covers Anthropic’s managed infrastructure: sandboxed code execution containers, state management, checkpointing, tool orchestration, error recovery, and scaling. You are not separately billed for container hours on top of session runtime.

    Does the $0.08/hr apply even if my agent is just waiting?

    No. Time spent waiting for your message, waiting for tool confirmations, or sitting idle does not accumulate runtime charges. Only active execution time counts.

    What does web search cost inside a Managed Agents session?

    $10 per 1,000 searches ($0.01 per search), billed separately from session runtime and token costs. This is the same rate as web search through the standard API.

    Are there volume discounts?

    Yes, negotiated case-by-case for high-volume users. Contact [email protected] or through the Claude Console.

    How does Managed Agents pricing compare to running my own agent infrastructure?

    The $0.08/session-hour is almost always cheaper than equivalent provisioned compute — but you trade infrastructure control and data locality for that simplicity. For a full comparison: Build vs. Buy: The Real Infrastructure Cost.

    What’s the real monthly cost if I run an agent 24/7?

    Maximum theoretical session runtime: 24 hrs × $0.08 × 30 days = $57.60/month. In practice, no production agent has zero idle time. Token costs become the dominant cost driver long before you hit the runtime ceiling. Detailed breakdown: The Real Monthly Cost of Running Claude Managed Agents 24/7.

    Setup and Access Questions

    How do I get access to Claude Managed Agents?

    Available to all Anthropic API accounts in public beta — no separate signup. You need the managed-agents-2026-04-01 beta header in your API requests. The Claude SDK adds this header automatically.

    Does it work with my existing API key?

    Yes. Same API key you’re already using for the Messages API. Same authentication. The beta header is the only new requirement.

    What three ways can I access Managed Agents?

    Via the Claude SDK (recommended — handles the beta header automatically), via direct API calls with the beta header, or via the Claude Console’s new Managed Agents section for no-code agent configuration and session tracing.

    Can I use Managed Agents through AWS Bedrock or Google Vertex AI?

    Managed Agents runs on Anthropic-managed infrastructure. This is distinct from Bedrock and Vertex AI deployments. Check Anthropic’s current documentation for multi-cloud availability status — this is an area of active development.

    Capability Questions

    What can Claude Managed Agents actually do?

    Run long autonomous sessions with persistent state, execute code in sandboxed Linux containers, use tools including web search and MCP servers, coordinate multiple Claude instances via Agent Teams, and maintain checkpoints for crash recovery. The session can last minutes or hours without you staying in the loop.

    What’s the difference between Agent Teams and subagents?

    Agent Teams coordinate multiple Claude instances with independent contexts, direct agent-to-agent communication, and a shared task list — suited for complex parallel tasks. Subagents operate within the same session as the main agent and only report results upward — more economical for sequential targeted tasks but less capable of true parallelism.

    Does it support MCP servers?

    Yes. MCP servers can be integrated as tool sources in Managed Agents sessions, extending what the agent can access and act on.

    How long can a session run?

    Anthropic’s documentation currently references session durations of minutes to hours. Claude Code’s longest autonomous sessions have reached 45 minutes. Managed Agents is architected for longer-running work. Check current documentation for specific session duration limits as the beta matures.

    What happened to Claude Code — is it the same as Managed Agents?

    No. Claude Code is a separate local coding workflow product. Anthropic’s docs explicitly note partners should not conflate the two. Managed Agents is a hosted API runtime service. Claude Code is a developer tool. Different products, different use cases, different billing.

    Rate Limit Questions

    What are the rate limits for Managed Agents?

    60 requests per minute for create endpoints; 600 requests per minute for read endpoints. Organization-level API limits still apply on top of these. For higher limits, contact Anthropic enterprise sales. Detailed breakdown: Claude Managed Agents Rate Limits Explained.

    Do standard Claude API rate limits still apply inside a session?

    Organization-level limits apply. The session runtime and create/read endpoint limits are Managed Agents-specific. If you’re running many parallel Agent Teams, token throughput limits will become the relevant constraint.

    Comparison Questions

    How does Managed Agents compare to OpenAI’s Agents API?

    Both offer hosted agent infrastructure. Key differences: Managed Agents is Claude-native (no multi-model flexibility), sessions bill on runtime + tokens vs. OpenAI’s different pricing model, and lock-in dynamics differ. Full comparison: Claude Managed Agents vs. OpenAI Agents API.

    Should I use Managed Agents or the Claude Agent SDK?

    Use Managed Agents when you want Anthropic to host the runtime — less infrastructure work, faster to production. Use the SDK when you need tighter loop control, on-premise execution, or multi-cloud flexibility. Anthropic’s own migration docs draw this line clearly: SDK runs in your environment; Managed Agents runs in theirs.

    What companies are already using Managed Agents in production?

    Notion, Asana, Rakuten, Sentry, and Vibecode were launch partners. Rakuten deployed five enterprise agents within a week. Allianz is using Claude for insurance agent workflows. Anthropic’s run-rate from the agent developer segment exceeds $2.5 billion. How Rakuten did it in a week →

    Data and Security Questions

    Where does my data go when running in Managed Agents?

    Execution runs on Anthropic’s infrastructure. This is the explicit trade-off: you get managed infrastructure; they manage the compute. For companies with strict data sovereignty requirements, this is the key constraint to evaluate. On-premise or native multi-cloud deployment is not currently available.

    What are the sandboxing guarantees?

    Anthropic uses disposable Linux containers — “decoupled hands” in their terminology. Each container is a fresh sandboxed environment for code execution. State persistence is managed separately from the execution environment.

    Strategic Questions

    Is this a bet worth making?

    That depends on your switching cost tolerance. Lock-in is real: once your agents run on Anthropic’s infrastructure with their tools, session format, and sandboxing, switching providers isn’t trivial. The counter-argument: the infrastructure you’d otherwise build to match this is months of engineering. One developer’s reaction at launch was blunt: “there goes a whole YC batch.” That captures both the opportunity and the risk. Our take on why we’re staying our course →

    What does this mean for AI citation and visibility?

    Agents running on Anthropic’s infrastructure make decisions about what content to surface, cite, and synthesize. As agent workloads grow, being present in the knowledge sources agents draw from becomes a search strategy question in itself. What AI citation monitoring looks like →