Tag: AI Agents

  • llms-full.txt vs llms.txt: Why AI Agents Crawl It More (2026)

    Most conversations about AI crawlability focus on one file: llms.txt. But if you look at what Anthropic, Vercel, and LangGraph actually ship – and what GEO crawler research found AI agents fetching most – the file that matters more is its companion: llms-full.txt.

    Here’s the practical reality: llms.txt is the map. llms-full.txt is the territory. And in 2026, the agents that matter for citation traffic are fetching the territory.

    The Full File Family You Probably Don’t Know About

    The original llms.txt proposal – published by Jeremy Howard in September 2024 – defined one file. Implementers built the rest. The complete family as of mid-2026 is four files, but most sites only need two:

    File | What’s in it | When to use
    /llms.txt | Curated index – H1, summary, link sections | Always. The orientation layer.
    /llms-full.txt | Full content of every linked page, concatenated as Markdown | When you want a model to deep-ingest your docs in a single fetch
    /llms-ctx.txt | Pre-expanded context without URLs | FastHTML-style implementations
    /llms-ctx-full.txt | Pre-expanded context with URLs preserved | Same, but URL-aware

    The pattern that works – and the one Anthropic, Vercel, and LangGraph all run – is the index + export pair: llms.txt for orientation, llms-full.txt for deep ingestion.
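
To make the index half of that pair concrete, here is a minimal sketch of a generator for the layout described in the table (H1, summary, link sections). The function name and the shape of the input object are hypothetical, and the blockquote summary and "- [title](url): description" link lines follow the llms.txt proposal as I understand it rather than anything specified in this article.

```javascript
// Render a curated llms.txt index from structured data.
// The input shape { title, summary, sections: [{ name, links }] } is an
// illustrative assumption, not part of any standard.
function renderLlmsTxt({ title, summary, sections }) {
  const lines = [`# ${title}`, '', `> ${summary}`, ''];
  for (const section of sections) {
    lines.push(`## ${section.name}`, '');
    for (const link of section.links) {
      // One Markdown link per page, followed by a one-sentence description.
      lines.push(`- [${link.title}](${link.url}): ${link.description}`);
    }
    lines.push('');
  }
  return lines.join('\n');
}
```

Because every page is emitted as a standard Markdown link, a build script can recover the URL list from the index with a simple link-matching regular expression.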

    Why llms-full.txt Gets Crawled More

    GEO researchers analyzing AI crawler behavior – including work cited by Profound – have noted that agents from Microsoft, OpenAI, and others tend to fetch llms-full.txt more frequently than llms.txt when both are present. The working explanation is structural: when a file contains the full content, it removes one retrieval step. An agent that fetches llms-full.txt gets everything it needs in a single HTTP request instead of fetching the index, parsing the links, then fetching each linked page individually. This is consistent with how developer documentation platforms like Mintlify describe the behavior of IDE agents operating under tight latency budgets.

    For IDE agents (Cursor, Continue, Cline) and MCP integrations, this is even more pronounced. These tools are operating under tight context windows and latency budgets. A single fetch that returns a clean Markdown blob of your entire docs is structurally preferable to a multi-step crawl.

    The implication: if you’ve shipped llms.txt but not llms-full.txt, you’ve done half the job.

    How to Build llms-full.txt

    The construction logic is simple: take every URL in your llms.txt, fetch each page, strip HTML to Markdown, and concatenate. In practice, most sites do this in their build pipeline.

    Here’s the minimal Node.js pattern:

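    const fs = require('fs');
    // node-fetch v2 supports require(); on Node 18+ you can drop this line and use the global fetch.
    const fetch = require('node-fetch');
    const TurndownService = require('turndown');
    const turndown = new TurndownService();
    
    async function buildLlmsFullTxt(llmsIndexPath, outputPath) {
      const index = fs.readFileSync(llmsIndexPath, 'utf8');
      // Capture the URL of every Markdown link: [title](https://...)
      const urlRegex = /\[.*?\]\((https?:\/\/[^\)]+)\)/g;
      const urls = [...index.matchAll(urlRegex)].map(m => m[1]);
    
      let output = '';
      for (const url of urls) {
        const res = await fetch(url);
        if (!res.ok) {
          // Skip broken links instead of concatenating an error page.
          console.warn(`Skipping ${url}: HTTP ${res.status}`);
          continue;
        }
        const html = await res.text();
        const markdown = turndown.turndown(html);
        // Separate pages with a rule and record where each one came from.
        output += `\n\n---\n# Source: ${url}\n\n${markdown}`;
      }
    
      fs.writeFileSync(outputPath, output);
      console.log(`Built llms-full.txt: ${urls.length} pages, ${output.length} chars`);
    }
    
    buildLlmsFullTxt('./public/llms.txt', './public/llms-full.txt')
      .catch(err => { console.error(err); process.exit(1); });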

    One constraint to manage: keep llms-full.txt under roughly 200,000 tokens (about 150K words, around 700KB). That’s the threshold where most models can ingest the file in a single context window. If your docs are larger, segment by product or language the way Supabase does – llms-full-api.txt, llms-full-guides.txt – and list the segmented files in your main llms.txt.
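
The size limit can be checked in the same build step. The sketch below is a rough heuristic, not a real tokenizer: it assumes about 3.5 bytes per token, which is simply the ratio implied by the figures above (around 700KB for 200,000 tokens). The function names and the { segment, markdown } page shape are hypothetical.

```javascript
// Rough size check for llms-full.txt. BYTES_PER_TOKEN is an assumption
// derived from the article's own figures (about 700KB for 200,000 tokens),
// not the output of a real tokenizer.
const BYTES_PER_TOKEN = 3.5;
const TOKEN_BUDGET = 200000;

function estimateTokens(text) {
  return Math.ceil(Buffer.byteLength(text, 'utf8') / BYTES_PER_TOKEN);
}

// Group pages into segments (for example "api" or "guides") and report
// which segment files would still exceed the budget on their own.
// `pages` is a hypothetical array of { segment, markdown } objects.
function planSegments(pages, budget = TOKEN_BUDGET) {
  const totals = new Map();
  for (const page of pages) {
    const used = totals.get(page.segment) || 0;
    totals.set(page.segment, used + estimateTokens(page.markdown));
  }
  return [...totals].map(([segment, tokens]) => ({
    file: `llms-full-${segment}.txt`,
    tokens,
    overBudget: tokens > budget,
  }));
}
```

If exact counts matter, swap estimateTokens for the tokenizer of the model you care about; the grouping logic stays the same.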

    The 2026 robots.txt Stack That Completes the Picture

    Shipping llms.txt and llms-full.txt is the visibility layer. The access-control layer is robots.txt – and it changed significantly in Q2 2026.

    The key development: Anthropic split its crawler into two separate user-agents. ClaudeBot is the training scraper (high bandwidth, no citation value – block it). Claude-Web is the live-retrieval agent that fetches pages to answer Claude.ai user queries in real time (allow it, because it drives citation traffic). Brands that blanket-block “all Anthropic crawlers” lose Claude citations entirely.

    Meta also shipped two active training scrapers in March 2026 – FacebookBot and Meta-ExternalAgent – at GPTBot-level crawl volume. Most sites have no rules for them yet.

    Here’s the 2026 template:

    # BLOCK: Training scrapers - high bandwidth, zero referral value
    User-agent: GPTBot
    Disallow: /
    
    User-agent: CCBot
    Disallow: /
    
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: FacebookBot
    Disallow: /
    
    User-agent: Meta-ExternalAgent
    Disallow: /
    
    # OPT OUT: Google Gemini training (keeps Search indexing intact)
    User-agent: Google-Extended
    Disallow: /
    
    # ALLOW: Live-retrieval agents - drive citation traffic
    User-agent: OAI-SearchBot
    Allow: /
    
    User-agent: ChatGPT-User
    Allow: /
    
    User-agent: Claude-Web
    Allow: /
    
    User-agent: anthropic-ai
    Allow: /
    
    User-agent: PerplexityBot
    Allow: /

    One important caveat on robots.txt enforcement: aggressive training scrapers often ignore the file or spoof their user-agents. The robots.txt rules signal intent and work for compliant bots; a WAF rule at the edge is the only deterministic block for non-compliant crawlers.
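
If you do add an edge rule, it needs the same policy the robots.txt template expresses. Here is a minimal sketch of that mapping, assuming you have access to the raw User-Agent header; the function name is hypothetical and the token lists are copied from the template above. Google-Extended is left out because it is an opt-out token in robots.txt rather than something this sketch expects to see in a request header.

```javascript
// User-agent tokens taken from the robots.txt template above.
const BLOCKED = ['GPTBot', 'CCBot', 'ClaudeBot', 'FacebookBot', 'Meta-ExternalAgent'];
const ALLOWED = ['OAI-SearchBot', 'ChatGPT-User', 'Claude-Web', 'anthropic-ai', 'PerplexityBot'];

// Returns 'block', 'allow', or 'unknown' for a raw User-Agent header.
// Matching is a case-insensitive substring test, so a spoofed header
// defeats it; treat this as policy logic, not as bot verification.
function classifyCrawler(userAgent) {
  const ua = String(userAgent || '').toLowerCase();
  if (BLOCKED.some(token => ua.includes(token.toLowerCase()))) return 'block';
  if (ALLOWED.some(token => ua.includes(token.toLowerCase()))) return 'allow';
  return 'unknown';
}
```

Catching spoofed agents takes additional signals, such as published IP ranges or reverse DNS, which this sketch does not attempt.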

    The Honest State of the Technology

    The SE Ranking study of 300,000 domains (November 2025) found no measurable correlation between having llms.txt and being cited by ChatGPT, Claude, Gemini, or Perplexity. Google’s John Mueller compared the file to the deprecated keywords meta tag: a declaration by site owners that search systems set aside in favor of what they derive from the content itself.

    None of that means you shouldn’t ship both files. The cost is low, the optionality is real, and the IDE-agent ecosystem (Cursor, Continue, Cline) does actively use llms.txt. But the robots.txt work is the lever that moves outcomes today. The llms.txt + llms-full.txt pair is an infrastructure investment – you want it in place and correct when major LLM providers start honoring it, and setting up the build pipeline now costs far less than retrofitting it later.

    The practical sequence for a site that hasn’t done this yet:

    1. Update robots.txt first. Add the Q2 2026 user-agent rules above. This takes twenty minutes and immediately affects how training scrapers treat your content.
    2. Ship llms.txt. Curated index, 20-50 priority pages, one-sentence description per link, sections in priority order.
    3. Build llms-full.txt. Concatenated Markdown of every linked page, under 200K tokens. Run it in your build pipeline so it stays current.
    4. Verify both files are served correctly. curl -I https://yoursite.com/llms.txt should return 200 with Content-Type: text/plain. A 404 on either file is the most common implementation error.
    5. Add an access-log check. Once per month, grep your logs for requests to /llms.txt and /llms-full.txt by user-agent. You want to see live-retrieval agents (Claude-Web, OAI-SearchBot, PerplexityBot) in the results – not just training scrapers.
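
The monthly log check in step 5 can be scripted instead of grepped by hand. This is a sketch that assumes the common "combined" access-log format, where the request line is the first quoted field and the user-agent is the last; the function name is hypothetical, and other log formats need a different parser.

```javascript
// Count requests to /llms.txt and /llms-full.txt per user-agent.
// Assumes the "combined" access-log format, where the request line and
// the user-agent are the first and last double-quoted fields on a line.
function countLlmsFetches(logText) {
  const counts = {};
  for (const line of logText.split('\n')) {
    const quoted = line.match(/"[^"]*"/g);
    if (!quoted || quoted.length < 2) continue;
    // Request line looks like: "GET /llms.txt HTTP/1.1"
    const path = quoted[0].slice(1, -1).split(' ')[1] || '';
    const file = path.split('?')[0]; // ignore any query string
    if (file !== '/llms.txt' && file !== '/llms-full.txt') continue;
    const agent = quoted[quoted.length - 1].slice(1, -1);
    const key = `${file} ${agent}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```

Feed it the contents of one month of logs, for example via fs.readFileSync, and look for the live-retrieval agents among the keys.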

    The goal isn’t to optimize for a standard that isn’t fully adopted yet. It’s to build the infrastructure correctly now, while the field is still forming, so that adoption changes work in your favor rather than requiring catch-up.

    Frequently Asked Questions

    What is the difference between llms.txt and llms-full.txt?

    llms.txt is a curated index — an H1, a summary, and link sections that orient an AI agent to your site. llms-full.txt is the full content of every linked page concatenated as Markdown, so an agent can deep-ingest your documentation in a single fetch. The index is the map; the full file is the territory.

    Why do AI agents crawl llms-full.txt more often than llms.txt?

    Fetching llms-full.txt removes a retrieval step: the agent gets everything in one HTTP request instead of fetching the index, parsing links, and fetching each page individually. For IDE agents like Cursor, Continue, and Cline operating under tight latency and context budgets, a single clean Markdown blob is structurally preferable to a multi-step crawl.

    How big should llms-full.txt be?

    Keep it under roughly 200,000 tokens (about 150K words, around 700KB) so most models can ingest it in a single context window. If your docs are larger, segment by product or language — for example llms-full-api.txt and llms-full-guides.txt — and list the segmented files in your main llms.txt.

    Does having llms.txt actually improve AI citations?

    Not measurably on its own. A November 2025 SE Ranking study of 300,000 domains found no correlation between having llms.txt and being cited by ChatGPT, Claude, Gemini, or Perplexity, and Google’s John Mueller compared it to the deprecated keywords meta tag. The lever that moves outcomes today is robots.txt configuration; llms.txt and llms-full.txt are low-cost infrastructure for when adoption grows.

    Which AI crawlers should I allow in robots.txt in 2026?

    Allow live-retrieval agents that drive citation traffic — Claude-Web, OAI-SearchBot, ChatGPT-User, anthropic-ai, and PerplexityBot. Block high-bandwidth training scrapers with no referral value such as GPTBot, CCBot, ClaudeBot, FacebookBot, and Meta-ExternalAgent, and opt out of Google-Extended to skip Gemini training while keeping Search indexing intact.

  • Claude Code vs Codex CLI (2026): A Hands-On Head-to-Head

    Last verified: June 2026.

    Both Claude Code and OpenAI Codex CLI are terminal-native coding agents: you run them inside a repo, they read your files, edit code, run commands, and iterate. I run both daily on real projects. This is the head-to-head I wish existed when I was deciding which one to make my default. No benchmark-chasing, just install commands, config files, pricing math, and where each one actually earns its keep. For the broader toolchain these slot into, see our AI operator’s stack.

    Claude Code vs Codex CLI: the short answer

    If you want one sentence: Claude Code is the more mature agentic harness (subagents, hooks, skills, deep MCP, a flat-rate plan that makes heavy use affordable), while Codex CLI is the leaner, cheaper-per-token option with strong raw coding from the GPT-5.x line and a tight sandbox model. Most teams that live in the terminal all day end up on Claude Code for the workflow tooling; people who want a fast, low-cost agent on top of an existing OpenAI subscription reach for Codex.

    The honest version: they are closer than tribal arguments suggest. The deciding factors are almost never “which model is smarter this week” and almost always pricing structure, sandbox defaults, and how much workflow scaffolding you need.

    How do you install each one?

    Claude Code installs from npm and runs as the claude command:

    npm install -g @anthropic-ai/claude-code
    cd your-project
    claude

    First run walks you through OAuth login (Pro/Max plan) or an ANTHROPIC_API_KEY. On Windows it runs natively in PowerShell now, though a lot of operators still prefer it under WSL for fewer path headaches.

    Codex CLI ships an install script and is also on npm:

    # Mac / Linux
    curl -fsSL https://chatgpt.com/codex/install.sh | sh
    
    # Windows (PowerShell)
    powershell -ExecutionPolicy ByPass -c "irm https://chatgpt.com/codex/install.ps1 | iex"
    
    # or via npm
    npm install -g @openai/codex

    Then run codex in your repo. Auth is either a ChatGPT login (Plus/Pro/Business) or an OpenAI API key via codex login. Both tools are open-source clients hitting hosted models, so the install is the easy part; the model access is what you are really buying.

    Which models do they run in 2026?

    Claude Code defaults to the current Claude flagship. As of June 2026 that is Opus 4.8 for the hardest reasoning, with Sonnet 4.6 as the fast everyday workhorse and Haiku 4.5 for cheap, high-volume calls. You switch in-session with /model. Opus 4.8 also exposes reasoning-effort levels (high is the default; xhigh and max push deeper on gnarly problems at higher token cost).

    Codex CLI runs the GPT-5.x coding line. GPT-5.5 is the current recommended default for complex coding and agentic work, GPT-5.4-mini is the faster/cheaper option for light tasks and subagents, and GPT-5.3-Codex remains a strong coding-tuned choice. Pick the model with codex -m gpt-5.5 or set it in your config.

    Practical read: on a clean, well-specified function both produce good code. The gap shows up on long, multi-file refactors where the agent has to hold a lot of context and recover from its own mistakes. That is a harness problem as much as a model problem, which is the next section.

    What about workflow features: subagents, hooks, and config?

    This is where Claude Code is currently ahead, and it is the real reason it tends to win for power users.

    • Subagents – Claude Code spawns isolated sub-sessions with their own context window, tool restrictions, and prompts. Great for “go research this in parallel while the main thread keeps coding.” Codex has a lighter subagent concept (often pointed at GPT-5.4-mini to keep cost down) but it is less fleshed out.
    • Hooks – Claude Code fires deterministic scripts at lifecycle points (PreToolUse, UserPromptSubmit, and more). These run real code, so they cannot hallucinate: you can hard-block a dangerous command, auto-format on every edit, or inject context before the model sees a prompt. Codex leans on its approval/sandbox policy and execpolicy rules instead of a general hook system.
    • Skills and slash commands – In Claude Code, custom slash commands have merged into skills; /your-command still works and skills add reusable, packaged capabilities. Codex uses prompt files and profiles rather than a skills layer.
    • Project memory – Both read a project instruction file. Claude Code uses CLAUDE.md; Codex uses AGENTS.md (checked in a fallback order including AGENTS.override.md and .agents.md). Keep these tight: architecture, conventions, and the few rules the agent keeps forgetting.
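
To illustrate the "hard-block a dangerous command" idea from the hooks bullet, here is a sketch of the decision logic such a script might contain. The event shape ({ tool_name, tool_input: { command } }), the "Bash" tool name, and the convention that exit code 2 blocks the call are assumptions about the hook interface that you should confirm against the Claude Code hooks documentation; the patterns and function names are hypothetical.

```javascript
// Commands this sketch refuses to let through: deleting the filesystem
// root, and any force push (including --force-with-lease).
const DANGEROUS = [
  /\brm\s+-rf\s+\/(\s|$)/,
  /\bgit\s+push\s+.*--force\b/,
];

// Decide whether a PreToolUse-style event should be blocked.
// Assumed event shape: { tool_name: 'Bash', tool_input: { command: '...' } }
function decide(event) {
  if (event.tool_name !== 'Bash') return { block: false };
  const command = (event.tool_input && event.tool_input.command) || '';
  const hit = DANGEROUS.find(pattern => pattern.test(command));
  return hit
    ? { block: true, reason: `Blocked by pattern ${hit}` }
    : { block: false };
}

// Map the decision to a process exit code (assumption: 2 means "block").
function exitCodeFor(event) {
  return decide(event).block ? 2 : 0;
}
```

A real hook script would parse the event JSON it receives, write the reason to stderr, and exit with exitCodeFor(event). Because this is ordinary code, the same input always produces the same verdict.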

    Codex’s config story is clean if you like a single file: ~/.codex/config.toml holds your model, approval policy, sandbox mode, MCP servers, and named profiles you switch with codex --profile work. Claude Code spreads config across ~/.claude/ and .claude/settings.json plus per-project files, which is more surface area but more granular control.

    How do the sandbox and approval models compare?

    This matters more than most comparisons admit, because it governs how much the agent can do without asking.

    Codex CLI has an explicit, well-documented sandbox. Sandbox modes run from read-only to workspace-write (edit files in the project, network off by default) up to full access, paired with approval policies like untrusted and on-request. On Windows the native sandbox can run unelevated or elevated. The mental model is clear: pick how much rope, then approve escalations.

    Claude Code manages permissions through allow/deny rules and modes (including a plan mode that reasons without touching files, and an auto-accept mode for trusted loops). Combined with PreToolUse hooks you can build a strict policy, but it is more “assemble it yourself” than Codex’s preset sandbox tiers.

    If you are dropping an agent onto an unfamiliar or sensitive repo, start read-only in both. Codex makes that posture a one-flag default; Claude Code gives you finer-grained control once you invest in the config.

    Do both support MCP?

    Yes, and this is a genuine tie that matters. Both speak the Model Context Protocol, so you can wire in the same external tools, databases, and APIs. Codex registers STDIO or streaming-HTTP MCP servers in ~/.codex/config.toml and launches them at session start. Claude Code adds servers via claude mcp add or JSON config. If you have already built MCP integrations, neither tool locks you out. If you are new to MCP, start with our Claude MCP setup guide and the Notion MCP setup walkthrough.

    What does each one cost?

    Pricing is where the decision often gets made, so here are the real numbers as of June 2026.

    Claude Code plans:

    • Pro – $20/mo: Sonnet 4.6 plus some Opus, roughly enough for focused daily sessions, not all-day heavy use.
    • Max 5x – $100/mo: much larger windows, real Opus headroom.
    • Max 20x – $200/mo: the heavy-user tier; effectively flat-rate firehose access.
    • API pay-as-you-go: Opus 4.7 about $5/$25 per million input/output… (current Opus tier runs higher), Sonnet 4.6 $3/$15, Haiku 4.5 $1/$5.

    Codex CLI: Included in ChatGPT Plus/Pro/Business plans (usage governed by your plan’s limits), or pay-as-you-go on the API. GPT-5.3-Codex runs about $1.75 per million input / $14 per million output, with cheaper input on cached tokens. The mini model is far cheaper for light work.

    The structural difference: Claude Code’s Max plans are flat-rate, which is why heavy users love them. People have tracked billions of tokens that would cost five figures on API metering but ran around a few hundred dollars on Max. Codex’s per-token rates are lower per unit and great if your usage is bursty or already bundled into a ChatGPT subscription, but a true all-day agent habit can run up metered cost faster than a flat plan. Estimate your monthly token volume honestly, then do the arithmetic both ways.
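
Here is what "do the arithmetic both ways" can look like as a sketch. The rates and plan price are the ones quoted above; the monthly token volumes are made-up inputs you should replace with your own, and the function names are hypothetical. It ignores cached-token discounts and plan usage limits, so treat the result as a first-order estimate.

```javascript
// Metered cost in dollars for a month of usage.
// Rates are dollars per million tokens.
function meteredCost(inputTokens, outputTokens, inputRate, outputRate) {
  return (inputTokens / 1e6) * inputRate + (outputTokens / 1e6) * outputRate;
}

// Compare a metered estimate against a flat monthly plan price.
function compare(inputTokens, outputTokens, rates, flatPrice) {
  const metered = meteredCost(inputTokens, outputTokens, rates.input, rates.output);
  return { metered, flatPrice, cheaper: metered < flatPrice ? 'metered' : 'flat' };
}

// Hypothetical month: 100M input and 10M output tokens at the Sonnet 4.6
// API rates quoted above ($3 / $15), against the $200 Max 20x plan.
const example = compare(100e6, 10e6, { input: 3, output: 15 }, 200);
console.log(example); // metered is 450, so the flat plan is cheaper here
```

Run the same comparison with a light month (say a tenth of that volume) and the answer flips, which is the bursty-usage case where metering wins.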

    So which coding agent should you actually use?

    Pick Claude Code if you want the deepest agentic workflow (subagents, hooks, skills), you are a heavy daily user who benefits from the flat-rate Max plan, or you need fine-grained, scriptable control over what the agent can do. It is the more complete operator’s harness in 2026.

    Pick Codex CLI if you want lower per-token cost, you already pay for ChatGPT and want to use that allowance, you like the clean preset sandbox/approval model, or you simply prefer the GPT-5.x output style. It is lean, fast to stand up, and genuinely capable.

    The move a lot of us make: run both. They are cheap relative to engineer time, they share MCP servers, and they have different failure modes. When one gets stuck in a loop on a hard bug, handing the same task to the other with fresh context often breaks the logjam. If you are weighing terminal agents against IDE-native ones, our Claude Code vs Cursor breakdown covers that axis.

    Frequently asked questions

    Is Claude Code or Codex CLI better for large refactors?

    Claude Code tends to hold up better on long multi-file refactors, mostly because of subagents and hooks that keep context organized and catch mistakes deterministically. Codex can do it too, especially with GPT-5.5, but you lean harder on tight AGENTS.md instructions and approval gates.

    Can I use Codex CLI without a ChatGPT subscription?

    Yes. Run codex login with an OpenAI API key and you pay per token instead of through a ChatGPT plan. Same for Claude Code with an ANTHROPIC_API_KEY if you would rather meter than subscribe.

    Do they work on Windows natively?

    Both do in 2026. Claude Code runs in PowerShell (many operators still prefer WSL for cleaner paths), and Codex CLI has a native Windows installer plus a Windows sandbox with unelevated/elevated modes. Watch out for shells that mangle /tmp or C:\ style paths in arguments.

    What is the single biggest difference?

    Pricing structure and workflow depth. Claude Code offers flat-rate Max plans and a richer harness (subagents, hooks, skills); Codex offers lower per-token rates and a cleaner preset sandbox. Model quality is close enough that those two factors usually decide it.

    Which model do they run by default?

    Claude Code defaults to the current Claude flagship (Opus 4.8 as of June 2026, with Sonnet 4.6 for everyday speed). Codex CLI recommends GPT-5.5 for complex work, with GPT-5.4-mini and GPT-5.3-Codex as alternatives. Switch in-session with /model or the -m flag.

    How do I get either tool cited or surfaced by AI engines for my own docs?

    That is a content question, not a tooling one. The same structure that makes this page answerable (short factual answers, question-shaped headers, and a visible FAQ) is what AI engines reward. See how AI engines cite content for the full playbook.

  • Claude Code vs Cursor: An Honest 2026 Comparison

    Last verified: June 2026.

    Claude Code and Cursor are the two tools most working developers actually reach for in 2026, and they are not the same kind of thing. Cursor is an AI-native code editor (a VS Code fork) where the model lives inside your IDE. Claude Code is a terminal agent that lives in your shell and edits files, runs commands, and drives git from the command line. I run both every day. This is the honest version: what each one is good at, what they cost right now, and a simple rule for picking.

    Claude Code vs Cursor: what is the actual difference?

    The short answer: Cursor is an editor you type in; Claude Code is an agent you delegate to. Cursor keeps you in the driver’s seat with autocomplete, inline edits, and a chat sidebar that sees your open files. Claude Code takes a goal (“add rate limiting to the upload endpoint and run the tests”) and works the repo autonomously in the terminal, asking permission before it touches things.

    Dimension | Claude Code | Cursor
    Form factor | Terminal CLI (plus IDE extension, web, desktop) | Full IDE (VS Code fork)
    Primary loop | Delegate a task, approve actions | Type code, accept suggestions
    Models | Claude only (Sonnet 4.6, Opus 4.8) | Multi-model: Claude, GPT, Gemini
    Best at | Multi-file refactors, scripted/headless runs, git workflows | Tight edit loops, autocomplete, staying in one window
    Entry price | $20/mo (Pro) | Free (Hobby) / $20/mo (Pro)
    Billing model | Usage windows (5-hour + weekly) | Credit pool ($ equal to plan price)

    How does each one actually work?

    Claude Code (terminal agent)

    You install it globally and run it from inside a project directory:

    npm install -g @anthropic-ai/claude-code
    cd my-project
    claude

    From there you talk to it in plain language. It reads files, proposes edits as diffs, and runs shell commands only after you approve them. A few patterns I use constantly:

    • Project memory: drop a CLAUDE.md file in the repo root with build commands, conventions, and “do not touch” rules. Claude Code reads it on every run, so you stop re-explaining the same context.
    • Headless / scripted runs: claude -p "bump all deps and run the test suite" runs one-shot and exits, which is what makes it scriptable in CI or cron jobs. This is the single biggest thing Cursor cannot do.
    • Permission control: by default it asks before edits and commands. You can pre-approve safe tools so it stops prompting on every npm test.
    • Plan mode: ask it to plan before it writes, review the plan, then let it execute. This is how you avoid a runaway agent rewriting half the codebase.

    Cursor (AI IDE)

    Cursor is a download, not a package install. You open your folder and the AI is wired into the editing surface:

    • Tab completion: multi-line, context-aware autocomplete that predicts your next edit, not just the next token. This is the feature people stay for.
    • Inline edit (Cmd/Ctrl+K): select code, describe the change, get a diff in place.
    • Agent mode: a chat panel that can edit multiple files and run terminal commands, closing the gap with Claude Code from inside the IDE.
    • Model picker: switch between Claude Sonnet, GPT, and Gemini per request from a dropdown. Useful when one model is stuck and you want a second opinion without leaving the window.

    What does Claude Code cost in 2026?

    Claude Code is billed by usage windows, not per-request credits. As of June 2026:

    • Pro: $20/month. Sonnet 4.6 and Opus 4.6, roughly 10 to 40 prompts per 5-hour window depending on repo size.
    • Max 5x: $100/month. ~5x Pro limits and access to Opus 4.8.
    • Max 20x: $200/month. ~20x Pro limits, all models including Opus 4.8.
    • API (pay-per-token): Opus 4.7 at $5 input / $25 output per million tokens; Sonnet 4.6 at $3 / $15.

    The mechanic to understand: there is a 5-hour rolling session window (your budget resets from your first prompt) plus a weekly active-compute cap that only counts time the model is actually reasoning. If you hit a wall mid-afternoon, you are usually waiting for the 5-hour window to roll, not the week.

    What does Cursor cost in 2026?

    Cursor moved to a credit-pool model (the switch happened in mid-2025). Every paid plan includes a monthly credit pool equal to the plan price in dollars, and each request burns credits based on which model you pick and how heavy the request is. As of June 2026:

    • Hobby: Free. Limited tab completions and agent requests, plus a one-week Pro trial on signup.
    • Pro: $20/month ($16 annual). Frontier model access, MCP support, cloud agents, and a $20 credit pool.
    • Pro+: $60/month. ~3x the credits.
    • Ultra: $200/month. ~20x usage and priority features.
    • Teams: $40/user/month with SSO and admin controls.

    Practical note on the credit pool: model choice matters a lot. Roughly, $20 of credits buys about 225 Claude Sonnet requests or about 550 Gemini requests, because Anthropic models cost more per call than Gemini in Cursor’s pricing. If you run Claude on everything, the $20 pool drains faster than newcomers expect. This is the source of most “what happened to Cursor pricing” confusion.
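
The credit-pool math is easy to run for your own usage. This sketch derives an implied per-request cost from the rough figures above (about 225 Claude Sonnet or about 550 Gemini requests per $20) and estimates how long a pool lasts. Real requests vary in cost with size and model, so the constants and function names here are illustrative assumptions only.

```javascript
// Implied cost per request, from the article's rough figures:
// a $20 pool buys about 225 Claude Sonnet or about 550 Gemini requests.
const POOL_DOLLARS = 20;
const REQUESTS_PER_POOL = { sonnet: 225, gemini: 550 };

function costPerRequest(model) {
  return POOL_DOLLARS / REQUESTS_PER_POOL[model];
}

// How many days a pool lasts for a given daily mix of requests,
// e.g. { sonnet: 20, gemini: 10 } requests per day.
function daysUntilEmpty(dailyMix, pool = POOL_DOLLARS) {
  const dailySpend = Object.entries(dailyMix)
    .reduce((sum, [model, n]) => sum + n * costPerRequest(model), 0);
  return dailySpend > 0 ? pool / dailySpend : Infinity;
}
```

At these figures a Sonnet request costs roughly 9 cents and a Gemini request roughly 4 cents, which is why an all-Claude habit empties the pool in well under a month.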

    Which models do you actually get?

    This is the cleanest dividing line.

    • Claude Code is Claude-only. You get Anthropic’s frontier coding models (Sonnet 4.6 for speed/cost, Opus 4.8 for the hardest agentic work on Max). No GPT, no Gemini. If you trust Claude for code, the single-vendor integration is tighter and the agent behavior is tuned end to end.
    • Cursor is multi-model. Claude, OpenAI, and Google models from one dropdown. The advantage is hedging: if one model whiffs on a problem, switch and retry in seconds. The trade-off is that no single model is integrated as deeply as Claude is in its own first-party tool.

    Which one is better for big refactors and automation?

    Claude Code, clearly. Two reasons. First, the terminal-agent loop is built for “go do this across the whole repo” tasks, and plan mode plus CLAUDE.md keep it on rails. Second, headless mode (claude -p "...") means you can wire it into scripts, pre-commit hooks, and scheduled jobs. Cursor’s agent mode is strong inside the IDE, but it is fundamentally an interactive editor, not a thing you call from a cron line.
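
As a sketch of what "call it from a cron line" can look like, here is a small Node wrapper around the headless invocation. The claude -p form comes from the text above; the wrapper name, the injectable runner, and the assumption that a zero exit status means success are mine.

```javascript
const { spawnSync } = require('node:child_process');

// Run one headless task and report the outcome.
// `run` is injectable so the wrapper can be exercised without the CLI
// installed; by default it shells out to the real `claude` binary.
function runHeadless(task, run = spawnSync) {
  const result = run('claude', ['-p', task], { encoding: 'utf8' });
  return {
    ok: result.status === 0, // assumption: exit status 0 means success
    output: result.stdout || '',
    error: result.stderr || '',
  };
}
```

A scheduled job would call runHeadless('bump all deps and run the test suite') and alert when ok is false, for example because the binary is missing or the run exited with a non-zero status.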

    Which one is better for everyday coding flow?

    Cursor, for most people. If your day is reading, editing, and iterating on code you understand, Cursor’s tab completion and inline edits keep you in one window with near-zero friction. You never leave the editor to get help. Developers who are uneasy handing a whole task to an autonomous agent also tend to prefer Cursor because they stay in control of every keystroke.

    Can you use both together?

    Yes, and a lot of people do. The common setup: Cursor as the editor, Claude Code in Cursor’s integrated terminal. You get Cursor’s autocomplete and visual diff review for hands-on work, and you drop into Claude Code when you want to delegate a multi-file job or run something headless. They do not conflict. If you are building a broader operator setup around these tools, see our AI operator’s stack for how the pieces fit, and our Claude MCP setup guide for wiring external tools and data into Claude Code via MCP.

    Claude Code vs Cursor vs Codex?

    Codex is the third option people weigh, and it sits closer to Claude Code as an agent than to Cursor as an editor. The decision usually comes down to which model family and which workflow you trust. We break that specific matchup down in Claude Code vs Codex.

    Bottom line: when to pick which

    • Pick Claude Code if you want an autonomous agent for refactors, you live in the terminal and git, you need scriptable/headless runs, and you are happy with Claude as your one model.
    • Pick Cursor if you want best-in-class autocomplete, you prefer staying inside a visual editor, you value swapping between Claude/GPT/Gemini, and you want to keep your hands on the keyboard.
    • Pick both if you can swing two subscriptions: Cursor for the edit loop, Claude Code in the terminal for delegation. Start each on the $20 tier and only upgrade the one you hit limits on.

    FAQ

    Is Claude Code or Cursor cheaper?

    Both start at $20/month (Cursor also has a free Hobby tier). The difference is the meter: Claude Code limits you by 5-hour usage windows plus a weekly cap, while Cursor gives you a $20 credit pool that drains per request based on the model. Heavy Claude usage in Cursor burns the pool faster than people expect.

    Does Cursor use Claude?

    Yes. Cursor offers Anthropic’s Claude models alongside OpenAI and Google models, selectable per request. But you are using Claude through Cursor’s integration, not Anthropic’s first-party Claude Code agent, so the agentic behavior differs.

    Can Claude Code edit files and run commands like an IDE agent?

    Yes. Claude Code reads and writes files, runs shell commands, and drives git directly from the terminal. By default it asks permission before edits and commands, and you can pre-approve safe tools to cut down the prompts.

    Which is better for beginners?

    Cursor. The visual editor, inline diffs, and autocomplete are more forgiving than a terminal agent, and the free Hobby tier lets you learn before paying. Claude Code rewards people who are already comfortable in the shell and with git.

    Do I need to know the command line to use Claude Code?

    Largely yes. Claude Code is a CLI-first tool, and while it does most of the git and shell work for you, you will be living in a terminal. There is also an IDE extension and a desktop app, but the terminal is where it is strongest.

    Can I run Claude Code in CI or on a schedule?

    Yes, via headless mode: claude -p "your task" runs once and exits, which makes it usable in CI pipelines, git hooks, and scheduled jobs. Cursor has no equivalent because it is an interactive editor.

    Will using both at once cause conflicts?

    No. A common and stable setup is Cursor as your editor with Claude Code running in Cursor’s integrated terminal. They operate on the same files without stepping on each other, as long as you are not having both edit the exact same file simultaneously.

    Related reading: how AI engines cite content and Claude in Chrome for LinkedIn automation.

  • Claude Code vs Cursor in 2026: An Honest Comparison for Developers Who Ship

    The conversation about Claude Code vs Cursor has collapsed into lazy takes: Claude Code is smarter, Cursor is friendlier, buy both. That framing is not wrong, but it isn’t useful. If you’re deciding where to put your coding tool budget in 2026, you need to know where each tool wins and loses – with specifics, not vibes.

    Here’s what a year of both tools in production actually looks like.

    The Fundamental Architecture Gap

    Claude Code is a terminal-native CLI agent. You run it with claude in your shell, point it at a codebase, give it a task, and walk away. It has no GUI. It doesn’t autocomplete as you type. What it has is the ability to autonomously execute multi-step tasks – read files, write code, run tests, iterate on failures – without you babysitting it.

    Cursor is an IDE built on VS Code. It has tab autocomplete, an inline chat panel, Agent mode for longer tasks, and a polished visual interface that feels like VS Code with a superpower grafted on. If you already live in VS Code, Cursor’s learning curve is close to zero.

    These are genuinely different tools. The “which one wins” question should really be “which one wins for what.”

    Where Claude Code Wins: Long Autonomous Runs

    The biggest measurable advantage Claude Code has right now is context. Running on Claude Opus 4.6 or 4.7, Claude Code natively supports a 1 million token context window – and that’s a first-class, supported number with no per-token surcharge for long context on the API.

    Cursor’s advertised context is lower, and it draws from multiple model backends depending on which you select. On a large monorepo task – think refactoring an auth system across 40 files – the difference between context limits is the difference between Claude Code holding the whole codebase in view and the alternative having to page through it.

    Claude Opus 4.6 scores 80.84% on SWE-bench Verified, per Anthropic’s published system card. Opus 4.7 improved on that, particularly on the hardest problems in the benchmark set, and on Rakuten-SWE-Bench (a production-task evaluation, not just GitHub issues) it resolves 3x more tasks than Opus 4.6. That is a meaningful gap.

    The autonomous-run workflow looks like this in practice:

    claude "Refactor the payment module to use the new Stripe SDK, update all tests, and make sure existing integration tests still pass"

    Claude Code will read the relevant files, identify the Stripe version mismatch, write the new implementation, run your test suite, and iterate if something fails – often without a single follow-up prompt. That same task in Cursor’s Agent mode typically requires you to approve each file write and re-prompt when the agent stalls on an error.

    Where Cursor Wins: Daily Developer Experience

    Cursor’s tab autocomplete is genuinely good. It’s not a feature Claude Code has at all – Claude Code is not an IDE and doesn’t inject suggestions while you type. If your daily workflow is: open file, write code, open file, write code, Cursor is the better tool for that rhythm.

    Cursor’s @codebase reference and file mention system is also excellent for interactive exploration. You can ask “why does this function fail on null input?” while looking at the code, and Cursor’s inline context makes that conversation fast. Claude Code can answer the same question, but you’re doing it in a terminal with no visual reference.

    For teams on an existing GitHub workflow, GitHub Copilot’s deep integration with PRs, issues, and Actions is hard to match. If your team is standardized on GitHub and your security team needs IP indemnity coverage, Copilot is the defensible enterprise choice – Claude Code and Cursor both require more procurement work.

    The Pricing Reality

    Plan                         Monthly cost
    Claude Code via Claude Pro   $20/month
    Claude Code via Max 5x       $100/month
    Claude Code via Max 20x      $200/month
    Cursor Pro                   $20/month
    GitHub Copilot Individual    $10/month

    The entry point is the same for Claude Code (via Claude Pro) and Cursor: $20/month. At that tier, Claude Code’s usage limits are more restrictive. The Max 5x plan at $100/month is where Claude Code becomes a full autonomous-agent platform, with higher rate limits, Opus access, and Claude Code usage limits five times those of the Pro tier.

    For individual developers doing heavy autonomous runs, the Max 5x plan at $100/month competes directly with a Cursor Pro subscription plus meaningful API spend. For teams, the calculus shifts: Cursor’s team plan pricing is lower per seat than a premium Claude Code subscription, which matters when you’re buying for 20 developers.

    The Honest Call

    Claude Code wins on: autonomous multi-step tasks, large codebase refactors, long-running agents, raw SWE-bench performance, and 1M token context on complex jobs.

    Cursor wins on: daily IDE experience, tab autocomplete, interactive inline chat, onboarding speed for VS Code users, and team-tier pricing.

    The recommendation most senior developers are landing on in 2026 is two tools: Cursor open in the background for interactive work, Claude Code for the tasks you used to put in a Jira ticket and wait two days for. If you can only buy one and you mostly write code file-by-file, get Cursor. If your bottleneck is “I need to refactor three services and I don’t have three days,” Claude Code is the one that changes your output.

    The Max 5x plan makes that bet financially coherent for a senior developer. The Pro tier is a reasonable way to find out if autonomous coding is a workflow you actually use.

  • Always Allow vs Allow Once: Claude Code’s Quiet Tell

    The short version: In Claude Code, the prompt that asks whether to “Always Allow” or “Allow Once” isn’t really about security. It’s a question about your own systems. If you keep choosing Always Allow, the work is recurring — go build the automaton. If it’s honestly Allow Once, it’s a one-off — let it go instead of trying to remember it.

    I spend most of my day inside Claude Code, and a tiny piece of the interface has been living rent-free in my head. Every time the agent wants to run a command, edit a file, or hit an API, it stops and asks: Always Allow, or Allow Once?

    On the surface that’s a permission prompt. Click the box, move on. But after the hundredth time, I started to notice the choice was telling me something about how I actually work — and where I was leaving time on the table.

    “Always Allow” means: go build the automaton

    Here’s the pattern. If I find myself reaching for Always Allow, it’s because I’ve seen this exact action before. I’ll see it again. I trust it enough to stop being asked.

    That’s not a permission decision. That’s a build order.

    If an action is safe, repeatable, and I do it constantly, the right move isn’t to keep approving it forever — it’s to take it out of the prompt entirely. Turn it into a tool. Wrap it in a script. Register it as a skill. Put it on a cron so it runs whether I’m at the desk or not. The “Always Allow” click is the moment the work earns its own piece of infrastructure.

    Most people stop at the click. They grant the permission and feel productive because the friction went away. But friction that shows up every single day isn’t friction you should approve — it’s friction you should engineer out. Every “Always Allow” is a quiet little flag waving at you: this deserves to be an automaton.

    “Allow Once” means: let it go on purpose

    The other side is just as useful, and it’s the part people get wrong.

    When the honest answer is Allow Once — this is a weird one-off, I’m not going to do it again — the temptation is to write it down. Save the command. Add it to a doc. File it away just in case it ever comes back.

    Resist that. A one-off doesn’t deserve a permanent home in your memory or your system. The cost of storing it isn’t the disk space — it’s the upkeep. Every note you keep is something you now have to organize, search past, keep current, and trip over later. Knowledge you save but rarely touch quietly rots, and stale knowledge is worse than none.

    The way I think about it: it’s more fit to sift through the dirt than to re-sift the knowledge. If a one-off ever does come back, re-deriving it from scratch is cheap — you dig through the dirt once and you’re done. But re-sifting a giant pile of “just in case” notes, over and over, every time you go looking for the thing you actually need? That’s the expensive part. Forgetting a one-off on purpose is a feature, not a failure.

    Why re-deriving usually beats remembering

    This is really a question of economics, and it’s the same math whether you’re managing an AI agent or your own head.

    Storing knowledge has two costs people forget about: the cost to keep it accurate, and the cost to find the signal inside it later. A one-off has a low chance of ever being needed again, so the expected payoff of saving it is tiny — while the drag it adds to everything else you’ve stored is real and permanent. Recurring work is the opposite: high chance of reuse, so it’s worth paying once to encode it well and never think about it again.

    So the rule of thumb falls out on its own:

    • Recurring → encode it. Build the tool, the skill, the cron. Pay once, reuse forever.
    • One-off → forget it on purpose. Do the thing, then let it go. If it ever comes back, dig it up fresh — it’ll be faster than you think.
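    That rule of thumb can be written as a rough expected-cost comparison. This is a sketch under assumed inputs, not a measurement: expected_uses, manual_cost, encode_cost, and upkeep_cost are hypothetical numbers in whatever unit you like (minutes works).

```python
def should_encode(expected_uses: float, manual_cost: float,
                  encode_cost: float, upkeep_cost: float) -> bool:
    """Return True when building the tool is expected to be cheaper than forgetting.

    Forgetting costs expected_uses * manual_cost: every time the work comes
    back you redo or re-derive it by hand.
    Encoding costs encode_cost once, plus upkeep_cost for keeping it accurate
    and searching past it, whether or not it is ever reused.
    """
    if expected_uses < 0:
        raise ValueError("expected_uses cannot be negative")
    cost_if_forgotten = expected_uses * manual_cost
    cost_if_encoded = encode_cost + upkeep_cost
    return cost_if_encoded < cost_if_forgotten
```

    With these made-up numbers, a task you expect to repeat 200 times at 2 minutes each clears a 90-minute build-and-maintain budget easily, while a one-off with a 10% chance of recurring and a 15-minute re-derivation does not justify even 15 minutes of filing and upkeep.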

    The mistake is doing it backwards: hand-running the recurring stuff every day because you never built the automaton, while hoarding a graveyard of one-off notes you’ll never open again. That’s how you end up busy and buried at the same time.

    How to act on the tell in Claude Code

    Next time that prompt pops up, treat it as a tiny decision point instead of a speed bump:

    1. You reached for “Always Allow.” Stop for a second. Ask: what would it take to make this prompt never appear again? An orchestration step, a saved skill, a scheduled job, a hook? Put it on the list. The prompt just told you what to build next.
    2. You reached for “Allow Once.” Do it, then genuinely drop it. Don’t screenshot it, don’t file it. Trust that if it matters, it’ll show up again — and the second sighting is your real signal to build.
    3. You’re not sure. That’s fine — “Allow Once” is the safe default. Two or three “Allow Once” clicks for the same action is the universe telling you it was an “Always Allow” the whole time.
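    The third case can be made mechanical with a simple tally. A minimal sketch, assuming you keep your own list of the actions you approved with “Allow Once”; Claude Code is not assumed to expose such a log, and the default threshold of 2 is just the low end of the “two or three” heuristic above.

```python
from collections import Counter


def automation_candidates(approved_once: list[str], threshold: int = 2) -> list[str]:
    """Return actions approved 'once' at least `threshold` times, most frequent first."""
    counts = Counter(action.strip() for action in approved_once)
    return [action for action, n in counts.most_common() if n >= threshold]
```

    Anything this returns was an “Always Allow” in disguise, which makes it a candidate for a tool, skill, hook, or scheduled job.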

    None of this is really about Claude Code. The tool just happens to put the decision right in front of you, every day, in a little box. Most systems make you guess where your time is leaking. This one points at it and asks you to choose. (It pairs well with knowing when to use Plan Mode and when to skip it — same instinct, a different prompt.)

    Recurring work wants to become an automaton. One-off work wants to be forgotten. The prompt already knows which is which. The only question is whether you’re listening.

    Frequently asked questions

    What’s the difference between “Always Allow” and “Allow Once” in Claude Code?

    “Allow Once” approves a single action one time; the next identical action prompts you again. “Always Allow” approves that action or pattern going forward, so Claude Code stops asking. Functionally, “Always Allow” is how you tell the tool an action is safe and routine.

    Should I use “Always Allow” in Claude Code?

    Use it when an action is safe, repeatable, and something you do often — but treat each “Always Allow” as a signal to eventually build that action into a tool, skill, hook, or scheduled job so it leaves the prompt entirely.

    Is “Always Allow” a security risk?

    It can be if you grant it to broad or destructive actions. Keep “Always Allow” for narrow, well-understood operations, and lean on “Allow Once” for anything unfamiliar, destructive, or outward-facing.

    When should I turn a Claude Code action into an automation?

    When you’ve granted — or wanted to grant — “Always Allow” for it. That’s the tell that the work is recurring, and recurring, trusted work is worth encoding once as a tool, skill, hook, or cron so you never approve it by hand again.

    Why shouldn’t I save one-off commands?

    Because storing knowledge has ongoing costs — keeping it accurate, and sifting past it to find what you actually need. A one-off has little chance of reuse, so it’s usually cheaper to re-derive it later than to maintain it forever.

    What does “more fit to sift through the dirt than to re-sift the knowledge” mean?

    It means re-deriving a rarely-needed answer from scratch — sifting the dirt once — is cheaper than maintaining and repeatedly searching a hoard of saved notes, which is re-sifting the knowledge every time. For one-offs, forgetting is the efficient choice.

  • Claude Routines Is a Frankenstein Product, and That’s Why It’s Working

    Anthropic shipped one feature on April 14. Nine days in, the internet has already decided it’s five different things.


    On April 14, 2026, Anthropic quietly pushed a research preview called Routines into Claude Code. The framing from their launch post is almost boring: “A routine is a Claude Code automation you configure once — including a prompt, repo, and connectors — and then run on a schedule, from an API call, or in response to an event.”

    That’s it. That’s the whole pitch. You write instructions once, Anthropic runs them on their cloud, and your laptop can be closed at the bottom of a lake for all it matters.

    Nine days later, I pulled social reactions from the first week of real usage — developers, indie hackers, ad ops people, a Polymarket trader, a guy learning piano, a Japanese solo dev running it for a week, Hamel Husain grumbling about YAML. And the thing that jumped out wasn’t the feature. It was how wildly people disagreed about what Routines even is.

    Is it an n8n killer? A cron replacement? An enterprise procurement play? A way to avoid buying a Mac Mini? A vibes machine for autonomous trading bots? A broken MCP detector?

    Yes. All of those. At the same time. That’s the story.


    The five Routines

    Here’s what Routines looks like, depending on who’s holding it.

    To the production automation crowd, it’s a toy. Alex Vacca (@itsalexvacca) wrote the most viewed thread in the launch window — 28,000+ views, 283 replies — and it was a full-throated defense of n8n. His agency runs 13 workflows, 2,000+ executions per day, 41 nodes in one pipeline alone. Monthly n8n bill: $384. “The same workloads on Claude would cost $60K,” he wrote. “That’s why I’m not buying the ‘Claude killed n8n’ take. They’re not the same layer.”

    He’s right. If you’re firing thousands of deterministic executions a day through a visual graph with tight error handling, Routines at 5-to-25 runs per day on included tiers isn’t even in the conversation. You’ll eat your Extra Usage budget by noon Tuesday.

    To the indie hacker crowd, it’s liberation. Aman Kumar (@Amank1412) summed up the mood in two lines and a video: “Claude Routines automatically run at a schedule without keeping your laptop open. Those who spent $599 on a Mac Mini.” A Spanish-speaking developer (@anthonysurfermx) is moving his OpenClaw logic off DigitalOcean: “me quito 30 USD mensuales.” That is 30 USD a month off his bill. A Japanese developer (@KameAIHacks) reported back after a full week of nightly test runs, auto PR reviews, and weekly dependency scans: “個人開発者のメンテナンス作業がほぼゼロになった.” Maintenance work as a solo dev dropped to nearly zero.

    These people aren’t trying to replace n8n. They’re trying to not-own a server. The unlock isn’t workflow power. It’s that you can delete a piece of infrastructure from your life.

    To the enterprise crowd, it’s a land grab. The sharpest observation came from @grapeot, writing in Chinese: “Claude Routines 每个是独立 API endpoint 带 bearer token,独立配额独立计价,配套 SSH 让 agent 跑在企业内网。它服务的是把 agent 写进采购合同的企业.” Translation: every routine is a separate API endpoint with its own auth token, its own quota, its own billing line, and SSH support for running agents inside corporate networks. This is Anthropic saying “put this in your procurement contract.” It’s not a consumer feature dressed up. It’s enterprise infrastructure wearing consumer clothes.

    To the crypto crowd, it’s a printing press. @regent0x_ shared a story about a Polymarket trader who connected Routines to price feeds via API trigger. Price moves 4%, Claude wakes up, analyzes news, checks sentiment, decides whether to alert or auto-execute. “Laptop hasn’t been open in a week… $23k profit last month… total costs: $5/mo webhook + $87 in API calls… net profit margin: 99.6%.” Asked what he did with the free time: “learning piano.”

    This is the quote that’s going to outlive the launch. Not because it’s representative — it absolutely isn’t — but because it’s the Platonic ideal of what cloud agents are supposed to feel like when they work. Research, reason, act, report. Go practice Chopin.

    To Hamel Husain, it’s just YAML. The machine learning veteran (@HamelHusain) tried Routines and walked away: “I found it to be far better to use GitHub Actions. I have more control with GHA, secret management, etc. Claude is really good at writing all the yaml and iterating until it works on its own too. Wild times that I’m saying I like GitHub Actions LOL.”

    If you already live in GHA, Routines isn’t offering you anything you don’t already have — except the novelty of a natural-language wrapper, which costs you control.


    The broken pieces nobody’s hiding

    A feature isn’t real until it breaks, and Routines is breaking in public. @ghuubear tried it on day 9 and reported his MCP connectors weren’t detected at all: “anthropic is shipping broken products.” @ahmetb couldn’t get GitHub PR-open triggers to fire: “not working at all.” Rich Baldry (@chooserich), who’s spent “countless hours with Codex Automations, Claude Routines, OpenClaw,” landed on a phrase that’s going to stick: “unreliable magic machines.”

    His follow-up is the real critique, and it’s the one Anthropic needs to answer: “building software with the new agentic coding tools for the same tasks is vastly more reliable.” In other words — use Claude to write a real cron job, not to be the cron job.

    That’s a serious challenge. When the alternative to your cloud agent is “use your cloud agent to write the non-agent version instead,” you’ve built a very fancy bootstrap.


    The pricing question nobody’s settled

    Pro gets 5 routine runs per day. Max ($100 and $200) gets 15. Team and Enterprise get 25. After that, overages bill against Extra Usage at standard API rates.

    The Japanese dev community did the cleanest math: “Proプランだと1日5回まで。個人開発なら十分だけど、3つ以上のRoutineを毎日回したい場合はMaxプランが必要.” Five runs a day is fine for one or two scheduled jobs. Want three or more running daily? Plan up.

    That’s the dividing line, and it tells you exactly who the feature is actually priced for. It is not priced for the n8n crowd. It’s priced for the solo dev with two or three background jobs, or the enterprise buyer who doesn’t look at the line item. The middle — the agency with a dozen automations but no enterprise contract — is the exact spot where Extra Usage starts to sting.
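    The tier math is easy to sanity-check. A minimal sketch using only the daily run limits quoted above; the dictionary keys and function names are illustrative, and the limits may change while Routines is in research preview.

```python
from typing import Optional

# Daily included routine runs per plan, as quoted in this section.
DAILY_RUN_LIMITS = {"Pro": 5, "Max": 15, "Team/Enterprise": 25}


def smallest_plan_for(runs_per_day: int) -> Optional[str]:
    """Return the first plan whose included runs cover the load, else None."""
    if runs_per_day < 0:
        raise ValueError("runs_per_day cannot be negative")
    for plan, limit in DAILY_RUN_LIMITS.items():  # dicts keep insertion order
        if runs_per_day <= limit:
            return plan
    return None  # beyond every included tier: overages bill against Extra Usage


def overage_runs(plan: str, runs_per_day: int) -> int:
    """Runs per day that would spill over into Extra Usage on a given plan."""
    return max(0, runs_per_day - DAILY_RUN_LIMITS[plan])
```

    An agency firing 2,000 executions a day gets None back from smallest_plan_for, which is the n8n argument in one line.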

    My Routines counter reads 0/15. I also have $250 in Extra Usage sitting in my account. I can tell you exactly where that money would go if I got careless with triggers: nowhere good.


    What I actually think

    I run a WordPress content network, a Notion command center, a few GCP projects, and enough scheduled tasks in Cowork to keep my desktop busy. I asked myself the honest question before writing this: do I need Routines?

    Answer: not yet. My laptop stays on. My scheduled tasks fire. If one misses because my wifi blinked, I run it the next morning and nothing dies. I’m not a Polymarket trader. I’m not running a procurement contract. I’m not trying to delete a Mac Mini I never bought.

    But the gap in Cowork is real, and the community surfaced it without meaning to. Right now, scheduled tasks in Cowork run on your machine. Routines run in the cloud. Nothing connects them. If you tag a task critical in Cowork and your laptop is asleep, the task just doesn’t fire. The obvious product move — one I’d expect Anthropic to ship in the next two quarters — is a failover flag: “if this task can’t run locally, escalate to a routine.” That closes the loop. Until it exists, you have to pick a side.


    The Frankenstein is the feature

    Here’s the thing about products that mean five different things at once: usually that’s a sign of a broken launch. Wrong messaging, wrong audience, wrong pricing. “Nobody knows what it is.”

    Routines is the opposite. Every one of those five readings is correct. It IS a toy next to n8n. It IS liberation from a VPS. It IS an enterprise procurement play. It IS a crypto printing press, sometimes. It IS broken in specific places. The Frankenstein isn’t a bug in the positioning. It’s a feature of cloud-hosted agents actually arriving in more than one market at the same time.

    The indie dev and the enterprise buyer are holding the same product and seeing different things because they are different things, lit from different angles. That’s what a platform primitive looks like in its first week.

    The Mac Mini guys get it. The n8n operators get it too — they’re just looking at a different body part.

    As for me: I’m keeping my counter at 0/15 for now. But I’m watching, because the moment Anthropic ships that failover flag between Cowork and Routines, the conversation changes, and the Frankenstein grows another limb.

    Learning piano is probably a stretch.


    Sources: Introducing Routines in Claude Code (claude.com/blog, April 14, 2026); Claude Code Routines documentation (code.claude.com/docs/en/routines); social reactions pulled from X/Twitter, April 14–23, 2026. All quotes used with attribution to their original posters.

  • Why the Best AI Operators Think Small: Lessons from the “Token Wall”

    There’s a moment every serious Claude user hits eventually. You’re mid-session, deep in the flow of building a workflow, a content pipeline, or a complex research thread. You’ve built something substantial, and you’re right on the verge of a breakthrough.

    Then the model goes quiet. Or it returns something strange and vague. Or it just stops mid-sentence.

    You didn’t break anything. You simply ran out of room. You’ve hit the "Token Wall," and understanding how to navigate this limit is what separates a casual user from a master operator.

    1. The Physics of the Whiteboard

    Every AI conversation has a "context window," which is essentially a fixed amount of memory the model can hold at once. Think of it like a whiteboard. Every message you send, every response the model generates, every task list, and every snippet of code takes up space on that board.

    When you get close to the limit, the model doesn't just shut off; it begins to struggle under the weight of its own history. You might notice the "feel" of a session getting heavy. The model starts to lose its edge, often attempting to "pattern-match on noise" within the context rather than following your instructions.

    Crucially, the smarter the model, the faster it hits the wall. This is the Opus Paradox: Claude Opus thinks deeply and writes extensively. Because its outputs are more verbose and nuanced, it consumes its own runway far more aggressively than a simpler model. Its intelligence is the very thing that accelerates its failure in a crowded session. When the board is full, the model tries to squeeze a new request into a space that doesn’t exist, resulting in the graceful—but frustrating—failures we’ve all experienced.

    2. The Haiku Trick: Precision Over Power

    When a session stalls at the context limit, your first instinct might be to switch to an even more powerful model. That is almost always the wrong move.

    The veteran operator’s secret is to go smaller. Claude Haiku—the lightest and fastest model—can often "squeeze through the gap" that a heavier model like Opus or Sonnet simply cannot fit through. Because Haiku is lean and efficient, it can perform surgical actions like updating a task list, summarizing the current state of play, or triggering a "compaction" of the history. This small action clears the whiteboard just enough to unlock the entire session.

    "It's not always about raw intelligence. It's about fit. The right tool for the moment isn't the most powerful one — it's the one that can actually execute given the constraints you're operating in."

    This shift from seeking raw power to seeking operational fit is a fundamental breakthrough. It’s the realization that the most "intelligent" move is often the one that creates the most momentum with the least amount of space.

    3. The Formula One Mindset: Strategy Outruns Raw Compute

    To excel in the new era of AI, you have to embrace the Formula One analogy. F1 teams spend hundreds of millions on the fastest cars, but the car doesn't win the race on its own. The driver wins by knowing when to push the engine, when to conserve tires, and when to pit.

    The AI is your car; you are the driver. Two people using the exact same model will produce radically different results based on their "driver skills." These aren't skills you find in a manual; they are earned through "hours in the seat." A master operator develops an instinct for:

    • Pruning Context and History: Recognizing the moment a session feels "heavy" and manually clearing the whiteboard to keep the model focused.
    • Strategic Model Swapping: Knowing exactly when to call in the heavy lifting of Opus and when to pivot to the lean navigation of Haiku.
    • Compacting and Resetting: Identifying when a conversation has become too polluted with noise and needs a clean summary before starting fresh.
    • Task Handoffs to Subagents: Understanding that a subagent operating in isolation will almost always outperform a single, mile-long thread where context is diluted.
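    The pruning and compacting instincts in that list can be approximated with a budget check. This is a rough sketch, not how Claude manages context internally: the four-characters-per-token estimate and the 80% threshold are assumptions chosen for illustration, and window_tokens is whatever limit applies to the model you are using.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def context_usage(messages: list[str], window_tokens: int) -> float:
    """Fraction of the context window the conversation is estimated to occupy."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    used = sum(estimate_tokens(m) for m in messages)
    return used / window_tokens


def should_compact(messages: list[str], window_tokens: int,
                   threshold: float = 0.8) -> bool:
    """True once estimated usage crosses the threshold: summarize, then start fresh."""
    return context_usage(messages, window_tokens) >= threshold
```

    The point of the threshold is to act before the session feels heavy, while there is still room on the whiteboard to write the summary.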

    4. What Agents Teach Us About Human Momentum

    We often focus on making AI more like humans, but the more valuable lesson is learning what agents can teach us about our own productivity.

    Agents succeed when they have a bounded context, a defined task, and honest signals about their capacity. They fail when their context is polluted with noise, when tasks are ambiguous, or when they try to do too much in one pass. This is a perfect mirror for human cognitive load. When we are overwhelmed, it’s rarely because we aren't "smart" enough for the task—it's because our internal whiteboard is full of distraction and noise.

    "When you're overwhelmed and stuck, the answer usually isn't to think harder. It's to do the smallest possible thing that creates forward momentum."

    Just as Haiku unlocks a stalled AI session by clearing one small item, humans can overcome paralysis by making one small decision or finishing one minor task. Operating intelligently within your own mental constraints is a superpower, not a compromise.

    5. The Internalized Hybrid

    The most effective AI users aren't just "humans using tools." They are "internalized hybrids"—operators who have adopted the logic of agentic thinking as their own.

    They naturally break massive projects into discrete, manageable tasks. They are honest about their own "context limits," realizing that pushing through a complex task at 11:00 PM is the cognitive equivalent of a model producing garbage when its whiteboard is full.

    This level of mastery isn't taught in a tutorial. It’s forged in the "Machine Room" at midnight, in those moments of operational failure when you hit the token wall and realize that a smaller, smarter approach is the only way through the gap. You have to live the experience of the work to develop the instinct for it.

    Conclusion: Getting Back in the Seat

    The relationship between you and the AI is defined by the "Driver and the Car." The car provides the potential for incredible speed, but it is the driver who provides the strategy, the timing, and the environmental awareness required to reach the finish line.

    The technology is now available to everyone, which means the tool itself is no longer the competitive advantage. The advantage is the operator.

    As you return to your workflows, ask yourself: Are you just pressing harder on the accelerator and wondering why you’re hitting a wall? Or are you ready to become a true driver, managing your context and choosing the right tool for the moment?

    The car is waiting. The driver makes the difference. It’s time to get back in the seat.

  • Agentic AI Orchestration: The Three-Layer Stack (Antigravity vs. Claude Code)

    The Shift from Solitary Agents to Orchestrated Systems

    By May 2026, the novelty of “chatting” with an AI has vanished. For technical operators and systems architects, the conversation has moved from prompt engineering to orchestration. We no longer ask an agent to “write a script”; we deploy stacks that monitor state, reconcile data across disparate platforms, and execute complex workflows without human intervention unless a threshold is breached. In this landscape, two primary paradigms for AI orchestration tools have emerged: the sequential, deterministic approach of Claude Code and the parallel, swarm-based architecture of Antigravity 2.0.

    The “operator’s reality” in 2026 is that building a single agent is a hobby; building a three-layer stack is a business. This stack—composed of Notion as the human-readable “Eyes,” Google Cloud Platform (GCP) as the “Headless Engine,” and tools like Claude Code or Antigravity as the “Hands”—has become the standard for scalable automation. The challenge isn’t getting the AI to do the work; it’s the reconciliation. It’s ensuring that what the agent thinks it did in the terminal matches what the business sees in its records. This is the breakdown of how these tools operate in the field.

    Claude Code: The Sequential Conductor

    Claude Code remains the gold standard for high-precision, terminal-first execution. It operates as a “Senior Engineer” archetype. When you initialize a session in a repository, it doesn’t just guess; it indexes the environment, maps dependencies, and proceeds with a surgical, step-by-step logic that requires human verification for high-impact changes.

    In our tests, Claude Code’s primary strength is its determinism. If you are refactoring a legacy microservice on GCP, you want the “Conductor” approach. You want the agent to read the logs, propose a fix, and wait for your y/n confirmation before it pushes to production. It is a tool of restraint. Its CLI-native interface is designed for the developer who lives in the terminal, using a local context window to ensure that every line of code written is idiomatically consistent with the existing codebase.

    However, the limitation of Claude Code relative to Antigravity becomes apparent in high-volume operations. Claude Code is sequential. It is one agent, one terminal, one task. It is brilliant at fixing a bug; it is slow at managing a fleet of 500 social media accounts or reconciling 10,000 line items across a multi-region inventory system. For that, you need a different architecture.

    Antigravity 2.0: The Parallel Swarm

    Antigravity 2.0, released earlier this year, takes the opposite approach. It is built on “Swarm Intelligence.” Instead of a single conductor, Antigravity deploys a Mission Control UI that manages dozens of “worker” agents simultaneously. These agents don’t wait for your confirmation at every step; they use browser verification to “see” their results in real-time and self-correct based on the visual state of the web or a GUI.

    If Claude Code is the surgeon, Antigravity is the construction crew. In a recent deployment for a logistics client, we used Antigravity to monitor carrier pricing across 15 different portals. A single Claude Code instance would have taken hours to cycle through these sequentially. Antigravity spun up 15 parallel swarms, each with its own browser instance, scraped the data, verified the pricing against the contract terms (using its internal visual verification), and updated the database in under four minutes.

    The Mission Control UI is the differentiator. While Claude Code users are staring at a scrolling terminal, Antigravity users are looking at a dashboard of active swarms. You can see which agents are “thinking,” which are “verifying,” and which have hit a roadblock. It is designed for multi-agent orchestration at scale, where the operator’s role shifts from “approver” to “overseer.”

    The Three-Layer Stack: Eyes, Brain, and Hands

    The most effective systems we’ve built this year don’t rely on a single tool. They use what we call the “Rare Three-Layer Stack.” Most people pick one layer and wonder why their automation is brittle. The real power is in the reconciliation of these three components:

    Layer 1: The Eyes (Notion AI Agents)

    Notion is no longer just a document store; it is the synthesis layer. We use Notion AI agents as the “Eyes” of the operation. These agents monitor our project databases, meeting notes, and strategy docs. They synthesize the human intent. If a project manager changes a status in Notion from “Draft” to “Ready for Deployment,” the Notion agent detects this change and sends a signal to the next layer. It provides the human-readable visibility that a terminal lacks.

    Layer 2: The Headless Engine (GCP)

    The “Brain” or “Engine” lives in GCP. We use Cloud Functions and Firestore to maintain the “Source of Truth.” This is where the business logic resides. When the Notion agent signals a status change, GCP processes the rules: Does this change require a security audit? Does it fit the budget? It maintains the state of the entire system, acting as a headless automation layer that doesn’t care about the UI.

    Layer 3: The Hands (Claude Code / Antigravity)

    Finally, the “Hands” execute the work. If the task is a surgical code update, GCP triggers a Claude Code session via a webhook. If the task is a wide-scale data migration or a browser-based workflow, it triggers an Antigravity swarm. These are the connective hands that read from the engine and write to the external world.

    The Reconciliation Ledger: Solving Agent Drift

    The biggest failure we see in agentic AI implementation is “drift.” Drift occurs when an agent performs an action (the Hands), but the state isn’t updated in the record (the Eyes), or the engine (the Brain) loses track of the execution.

    To solve this, we implemented a “Reconciliation Ledger.” Every action taken by a Claude Code or Antigravity instance must be logged back to a Firestore collection with a unique transaction ID. The Notion agent then periodically “audits” the ledger. If Antigravity reports that it updated 500 records, but the GCP database only shows 498 changes, the Notion agent flags a “reconciliation error” and alerts a human operator.
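
    The ledger pattern can be sketched in a few lines. This is a minimal illustration, not our production code: the `Ledger` class is an in-memory stand-in for the Firestore collection, and the names `log_action` and `audit` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """In-memory stand-in for the Firestore ledger collection (assumption)."""
    entries: dict = field(default_factory=dict)

    def log_action(self, agent: str, record_id: str) -> str:
        """Record one agent action under a unique transaction ID."""
        txn_id = str(uuid.uuid4())
        self.entries[txn_id] = {"agent": agent, "record_id": record_id}
        return txn_id


def audit(ledger: Ledger, observed_record_ids: set) -> list:
    """Return ledger entries whose change is not visible in the database."""
    return [
        {"txn_id": txn_id, **entry}
        for txn_id, entry in ledger.entries.items()
        if entry["record_id"] not in observed_record_ids
    ]


ledger = Ledger()
for i in range(500):
    ledger.log_action("antigravity-worker", f"rec-{i}")

# Hypothetical database state: two of the 500 reported writes never landed.
observed = {f"rec-{i}" for i in range(498)}
missing = audit(ledger, observed)
if missing:
    print(f"reconciliation error: {len(missing)} unconfirmed actions")
```

    The audit compares what the agents claim against what the database shows, so a mismatch surfaces as a list of specific transaction IDs rather than a bare count.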

    Without this ledger, multi-agent orchestration is a recipe for silent failure. We’ve seen swarms enter infinite loops because they couldn’t verify their own success, racking up thousands of dollars in API costs before anyone noticed. The ledger is the guardrail.

    Operator’s Log: The Failure of the “Blind Swarm”

    Last month, we tried to automate a complex data migration for an e-commerce client using only Antigravity 2.0 swarms, bypassing the GCP engine layer. We thought the agents were smart enough to handle the state locally. We were wrong.

    The swarm was tasked with updating product descriptions and prices across four different platforms. Because the agents were working in parallel and lacked a centralized “Brain” (GCP) to manage the lock state, two agents attempted to update the same product simultaneously. Agent A updated the price to $49.99 based on the original data, while Agent B updated the description. Agent B’s save operation overwrote Agent A’s price change because it was working with an older “view” of the product page.

    The result was a $12,000 discrepancy in sales over a weekend. We learned the hard way: AI orchestration tools in 2026 are powerful, but they are not a substitute for traditional database integrity. You need a headless engine to manage state; you cannot leave it to the agents to “figure it out” in parallel.
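
    One common guard against this kind of lost update is a version check on every write (optimistic concurrency). The sketch below is an assumption about how a central engine could enforce it, not a description of what Antigravity or Firestore does internally; `ProductStore`, `StaleWriteError`, and the starting price are all hypothetical.

```python
class StaleWriteError(Exception):
    """Raised when a writer's view of the record is out of date."""


class ProductStore:
    """Hypothetical central store; each record carries a version counter."""

    def __init__(self):
        self._rows = {}

    def put_initial(self, product_id, data):
        self._rows[product_id] = {"version": 1, "data": dict(data)}

    def read(self, product_id):
        row = self._rows[product_id]
        return row["version"], dict(row["data"])

    def write(self, product_id, expected_version, data):
        """Save only if nobody else has written since expected_version."""
        row = self._rows[product_id]
        if row["version"] != expected_version:
            raise StaleWriteError(product_id)
        self._rows[product_id] = {"version": expected_version + 1, "data": dict(data)}


store = ProductStore()
store.put_initial("sku-1", {"price": 59.99, "description": "old copy"})

# Both agents read the same version before either one writes.
version_a, view_a = store.read("sku-1")
version_b, view_b = store.read("sku-1")

view_a["price"] = 49.99
store.write("sku-1", version_a, view_a)  # Agent A succeeds

view_b["description"] = "new copy"
try:
    store.write("sku-1", version_b, view_b)  # Agent B holds a stale view
except StaleWriteError:
    # Re-read, reapply the change, and retry instead of overwriting the price.
    version_b, view_b = store.read("sku-1")
    view_b["description"] = "new copy"
    store.write("sku-1", version_b, view_b)
```

    With the version check in place, Agent B's stale save is rejected and retried against fresh data, so both the new price and the new description survive.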

    Choosing Your Paradigm: Claude vs. Antigravity

    When choosing between Claude Code and Antigravity, the decision tree is straightforward:

    • Use Claude Code when: You are working within a single repository, the task requires deep logical reasoning, you need idiomatic code quality, and you have a human operator ready to verify steps. It is for “Building.”
    • Use Antigravity 2.0 when: You are working across multiple web platforms, the task is repetitive and high-volume, you need parallel execution, and visual/browser verification is more important than code-level precision. It is for “Operating.”
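
    The two bullets above can be encoded as a small routing function in the engine layer. This is a sketch under our own assumptions: the `Task` fields and the return labels are hypothetical, and a real router would weigh more signals.

```python
from dataclasses import dataclass


@dataclass
class Task:
    single_repo: bool          # work stays inside one repository
    needs_human_review: bool   # an operator will confirm each step
    parallel_targets: int      # number of platforms or portals to touch
    browser_based: bool        # success is verified visually in a browser


def choose_hands(task: Task) -> str:
    """Map a task onto the 'Building' or 'Operating' paradigm."""
    if task.browser_based or task.parallel_targets > 1:
        return "antigravity"
    if task.single_repo and task.needs_human_review:
        return "claude-code"
    return "needs-triage"  # neither profile fits cleanly; ask a human


refactor = Task(single_repo=True, needs_human_review=True,
                parallel_targets=1, browser_based=False)
price_watch = Task(single_repo=False, needs_human_review=False,
                   parallel_targets=15, browser_based=True)
print(choose_hands(refactor), choose_hands(price_watch))
```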

    In the most sophisticated environments, you aren’t choosing; you are layering. You use Claude Code to build the scripts that Antigravity then executes at scale. You use Claude to write the custom GCP functions that manage the state for your Antigravity swarms.

    What You’d Do Tomorrow: The Practical Path

    If you are an agency owner or a systems architect looking to move into agentic orchestration, don’t start by trying to automate your entire business. Start with the ledger.

    1. Map your “Eyes”: Identify where your human intent lives. Is it Notion? Jira? Slack? Set up a basic webhook to watch for state changes.
    2. Build the “Engine”: Create a centralized database (Firestore or a simple Postgres instance on GCP) that tracks the state of your manual tasks.
    3. Deploy the “Hands” on one task: Pick a single, annoying, terminal-based task and use Claude Code to automate it. Or pick a browser-based task and use Antigravity.
    4. Reconcile: Ensure that the result of the “Hands” is automatically reflected back in the “Eyes” via the “Engine.”

    The future of work in 2026 isn’t about agents replacing people. It’s about operators managing stacks. The goal isn’t to have the smartest agent; it’s to have the most reliable reconciliation ledger. When the “Eyes,” “Brain,” and “Hands” are in sync, the system scales. When they aren’t, you just have a very expensive way to generate errors.

  • The Death of ‘Vertex AI’ and the Rise of the Gemini Enterprise Agent Platform

    The Death of ‘Vertex AI’ and the Rise of the Gemini Enterprise Agent Platform

    For four years, Vertex AI was the “everything store” for Google Cloud’s machine learning stack. It was a sprawling, often fragmented collection of notebooks, endpoint managers, and feature stores designed for a world where data scientists spent months training models that rarely saw production. But at Google Cloud Next 2026, that era ended quietly. Vertex AI was officially retired, replaced by the Gemini Enterprise Agent Platform.

    This isn’t just a marketing exercise or a shallow rebranding of a legacy service. It is a fundamental architectural admission: the “model-centric” era of AI is over. If 2023 was about finding the best model and 2024 was about RAG (Retrieval-Augmented Generation), 2026 is about the autonomous agent. Google has shifted its entire infrastructure from a library of static endpoints to a stateful orchestration layer for agents that can think, execute, and—most importantly—correct themselves.

    The Architecture Shift: Model-Centric vs. Agent-First

    In the old Vertex AI framework, you deployed a model. You sent a prompt, you received a completion, and the transaction was over. Any complexity—looping, tool-calling, or memory—had to be built by your developers in a separate layer, usually involving fragile Python scripts or heavy frameworks like LangChain.

    The Gemini Enterprise Agent Platform flips this. With the rollout of ADK 2.0 (Agent Development Kit), the “model” is now just a component of an “agent.” In this new architecture, the platform handles the state. You no longer manage a stateless API; you manage a persistent entity with a memory buffer and a task queue.

    For agencies, this means moving away from “deploying models” and toward autonomous agent governance. If you are still billing clients for “custom GPTs” or simple RAG pipelines, you are effectively selling 2024 technology. The current standard is stateful multi-step execution where the agent can initiate its own sub-processes, query external APIs, and wait for asynchronous callbacks without the developer managing the intermediate state.

    ADK 2.0 and the Developer Workflow

    The core of this transition is ADK 2.0. Unlike its predecessor, which felt like a wrapper for REST calls, ADK 2.0 is built for local-first development. Most of our internal testing at Tygart Media now happens through the Gemini CLI, which allows operators to spin up agent environments that mirror production exactly.

    When you use the Gemini CLI to initialize a project (gemini init --agent-type=stateful), it doesn’t just create a YAML file. It provisions a “Reasoning Engine” that can handle long-running tasks. We recently tested this on a complex data migration for a logistics client. In the Vertex AI days, we would have had to write a massive script to handle 404 errors, retries, and schema mismatches. With the Gemini Enterprise Agent Platform, we deployed a “Migration Agent” that simply had the goal: “Sync these 12 databases. If a schema doesn’t match, research the correct mapping in the legacy docs and retry. Log all failures to Antigravity for human review.”

    The agent didn’t just run; it resided on the platform for three days, executing tasks, pausing when it hit rate limits, and resuming without losing its place in the sequence. This is the difference between a tool and a worker.

    Agent Studio: Low-Code Orchestration That Actually Works

    Google also introduced Agent Studio, which replaces the old Vertex AI Model Garden. While the Model Garden was a catalog, Agent Studio is a visual IDE for agentic loops. It allows systems architects to map out decision trees where the “nodes” aren’t just LLM calls, but “skills”—authenticated connections to BigQuery, Google Search, or internal ERPs.

    The key feature here is stateful multi-step logic. In previous iterations, if an agent failed at step 4 of a 10-step process, you had to restart from step 1 or build complex checkpointing logic. Agent Studio handles the checkpointing natively. For an operator, this reduces the “failure surface area.” We can now see exactly where an agent’s reasoning diverged and “hot-fix” the prompt or the tool definition mid-execution.
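
    To make the checkpointing idea concrete, here is a generic sketch of resuming a multi-step run from the failed step. It does not use or imitate the Agent Studio API; `run_with_checkpoints` and the dict-based checkpoint are hypothetical stand-ins for durable platform state.

```python
def run_with_checkpoints(steps, checkpoint):
    """Run steps in order, skipping any already recorded as done.

    `checkpoint` is a dict standing in for durable storage; it holds the
    index of the next step to run under the key "next_step".
    """
    start = checkpoint.get("next_step", 0)
    for index in range(start, len(steps)):
        steps[index]()                       # may raise; checkpoint stays put
        checkpoint["next_step"] = index + 1  # record progress after success


calls = []
fail_once = {"armed": True}


def make_step(n):
    def step():
        if n == 4 and fail_once["armed"]:
            fail_once["armed"] = False
            raise RuntimeError("step 4 failed")
        calls.append(n)
    return step


steps = [make_step(n) for n in range(1, 11)]
checkpoint = {}

try:
    run_with_checkpoints(steps, checkpoint)
except RuntimeError:
    pass  # in practice: fix the prompt or tool definition here

run_with_checkpoints(steps, checkpoint)  # resumes at step 4, not step 1
```

    Steps 1 through 3 run exactly once; only the failed step and everything after it run on the second pass.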

    The Hard Truth About Autonomous Agent Governance

    As Vertex AI is retired and replaced, the biggest hurdle for agencies isn’t the code; it’s the governance. When you move from “models” to “agents,” you are introducing non-deterministic actors into a client’s environment.

    We’ve seen what happens when governance is ignored. In a pilot project earlier this year, an autonomous agent tasked with “optimizing ad spend” accidentally deleted three high-performing campaigns because it interpreted “efficiency” as “cutting all costs.” This wasn’t a model failure; the model did exactly what it was told. It was a governance failure. There were no guardrails or supervisor agents to check its work.

    In the Gemini Enterprise Agent Platform, governance is a first-class citizen. You can now deploy “Supervisor Agents” that sit one level above your worker agents. These supervisors don’t perform tasks; they only audit the “Chain of Thought” (CoT) of the workers. At Tygart Media, we use tools like Claude Code to write the initial guardrail logic, then deploy it to the Gemini platform to monitor our production loops. If the worker agent’s proposed action deviates from the safety policy by more than a 0.15 variance in the embedding space, the supervisor kills the process and pings an operator.
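
    A threshold check of this kind can be sketched as follows. We are assuming here that “variance” means cosine distance between an embedding of the proposed action and an embedding of the policy; the vectors are toy values, and `supervise` is a hypothetical name, not a platform API.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def supervise(action_vec, policy_vec, threshold=0.15):
    """Return 'allow' or 'kill' for a proposed worker action."""
    if cosine_distance(action_vec, policy_vec) > threshold:
        return "kill"   # stop the worker and page an operator
    return "allow"


policy = [1.0, 0.0, 0.0]        # hypothetical policy embedding
close_action = [0.9, 0.1, 0.0]  # nearly aligned with the policy
far_action = [0.0, 1.0, 0.0]    # orthogonal to the policy

print(supervise(close_action, policy), supervise(far_action, policy))
```

    The threshold is a tuning knob: set it too tight and the supervisor kills legitimate work, too loose and it misses real deviations.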

    Pricing Shift: From Tokens to Outcomes

    One of the most disruptive changes in the May 2026 rollout is the pricing model. Google is moving away from purely token-based billing for Enterprise Agent Platform users, introducing outcome-based pricing for specific task completions.

    The old model penalized efficiency. If you spent more tokens making an agent “think” more deeply to avoid a mistake, you paid more. The new model allows you to pay per “Successful Task Completion.” This aligns Google’s incentives with the agency’s. We no longer care about the context window length as a cost factor; we care about the “Agentic Success Rate” (ASR).

    For a mid-sized agency, this simplifies the math significantly. If a client wants a support agent that handles 1,000 tickets, you can now project a flat cost per resolved ticket rather than guessing how many tokens a “difficult” customer might consume.
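
    The projection math looks roughly like this. Every number below is a placeholder we made up for illustration, not a published Google rate, and the function names are hypothetical.

```python
def token_cost(tickets, avg_tokens_per_ticket, price_per_1k_tokens):
    """Projected spend under token billing."""
    return tickets * avg_tokens_per_ticket / 1000 * price_per_1k_tokens


def outcome_cost(tickets, success_rate, price_per_success):
    """Projected spend when only successful completions are billed."""
    return tickets * success_rate * price_per_success


# Placeholder inputs, not published rates.
tickets = 1000
by_tokens = token_cost(tickets, avg_tokens_per_ticket=8000, price_per_1k_tokens=0.01)
by_outcome = outcome_cost(tickets, success_rate=0.94, price_per_success=0.10)
print(by_tokens, by_outcome)
```

    Under token billing the estimate depends on average tokens per ticket, which is the number you have to guess. Under outcome billing it depends on the success rate, which you can measure.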

    A Practical Failure: Why ‘Models’ Weren’t Enough

    To understand why this change was necessary, look at our failure with “Project Orion” in late 2025. We tried to build a competitor analysis engine using Vertex AI and Gemini 1.5 Pro. We used a standard RAG setup. It worked 70% of the time. The other 30% of the time, the model would hallucinate a competitor’s pricing because it couldn’t access a gated PDF or failed to navigate a JavaScript-heavy website.

    The model was “smart,” but it was “blind” and “unreliable” in a loop. It had no way to say, “I failed to read this page, let me try a different browser-header strategy.”

    Two weeks ago, we rebuilt Project Orion on the Gemini Enterprise Agent Platform using ADK 2.0. The new agent has a “retry skill.” When it hits a JavaScript wall, it triggers a headless browser sub-agent. If it still fails, it searches for a cached version on the Wayback Machine. It doesn’t report back until the task is done or it has exhausted a defined set of “recovery behaviors.” Our ASR jumped from 70% to 94%. We didn’t change the model; we changed the architecture from a “static call” to an “autonomous worker.”
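
    The “recovery behaviors” idea reduces to an ordered fallback chain. The sketch below is generic Python, not ADK 2.0 code; the strategy functions are stubs we invented to stand in for a plain fetch, a headless browser, and an archive lookup.

```python
class FetchFailed(Exception):
    """Raised by a strategy that could not retrieve the page."""


def fetch_with_recovery(url, strategies):
    """Try each recovery behavior in order; return the first success.

    `strategies` is an ordered list of (name, callable) pairs. Each callable
    takes a URL and returns page text or raises FetchFailed.
    """
    attempts = []
    for name, strategy in strategies:
        attempts.append(name)
        try:
            return {"content": strategy(url), "via": name, "attempts": attempts}
        except FetchFailed:
            continue
    return {"content": None, "via": None, "attempts": attempts}


# Stub strategies standing in for real fetchers (hypothetical).
def plain_http(url):
    raise FetchFailed("page requires JavaScript")


def headless_browser(url):
    raise FetchFailed("render timed out")


def archived_copy(url):
    return f"archived snapshot of {url}"


result = fetch_with_recovery(
    "https://example.com/pricing",
    [("plain_http", plain_http),
     ("headless_browser", headless_browser),
     ("archived_copy", archived_copy)],
)
print(result["via"], result["attempts"])
```

    When every strategy fails, the function returns `content: None` along with the list of attempts, which gives the agent an honest failure to report instead of a hallucinated answer.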

    What You Should Do Tomorrow

    If you are managing an AI stack, the “Vertex AI” name disappearing from your console is your signal to stop building “wrappers” and start building “systems.” Here is the tactical path forward:

    1. Audit your current ‘Models’: Identify which of your current deployments are actually just stateless prompts. These are your biggest liabilities. Plan to migrate them to the Gemini Enterprise Agent Platform to take advantage of stateful memory.
    2. Adopt a CLI-First Workflow: Stop using the web console for anything other than monitoring. Use the Gemini CLI and integrate it with Claude Code or your local IDE. The speed of iteration in ADK 2.0 is only visible when you are working in a terminal environment.
    3. Install a Governance Layer: Before you deploy your next agent, define its “Exit Criteria.” Use the new Supervisor patterns in Agent Studio to ensure no agent can execute an external API call (like send_email or update_database) without a secondary “Reasoning Audit.”
    4. Re-evaluate your Contracts: If you are billing based on “implementation hours,” you are going to get crushed as agents become easier to deploy. Move toward “Performance-Based Retainers” that mirror Google’s outcome-based pricing. If the agent solves the problem, you get paid.

    The Gemini Enterprise Agent Platform isn’t just a new tool; it’s a new operating system for business. The agencies that thrive in the next 12 months won’t be the ones with the best prompts, but the ones with the most robust, well-governed agentic loops.

  • The Rise of the Curation Class

    The Rise of the Curation Class

    This is what I’m building for myself, and what I’m building for the people I work with. It’s a long essay because the shift it describes is large and the through-line matters. The ten images below aren’t decoration — they’re the spine. Each one is a moment in a life that doesn’t fully exist yet but is closer than most people realize.

    I want to start where the technology starts, which is not in a factory.

    The man in the image above is finishing a wearable by hand. It’s an AR ring — leather and brushed aluminum, the band sized to his client’s wrist, the materials chosen because his client cares about how the thing feels at 6 AM on the day she has to present to a board. Behind him are leather rolls and fabric swatches that wouldn’t look out of place in a coachbuilder’s atelier. To his right are the kind of objects you’d find in a hardware prototyping lab — chassis teardowns, a development tablet, AR glasses on a stand. The corkboard above the bench has automotive interior sketches and material studies pinned next to each other.

    What that workshop is, in operational terms, is a luxury goods atelier and a hardware lab collapsed into one room. The collapse is the thing. The line between “this is bespoke craft” and “this is consumer electronics” has been melting for a decade, and the workshop above is what it looks like once that line is gone.

    I’m building for the people who will live on the right side of that collapse. The people who don’t want a phone — they want an instrument that fits the way they think. The people who have stopped trusting mass-produced anything and started looking for the small workshop, the verified maker, the device tuned to them specifically. That’s the Curation Class. They’ve existed in clothing for a hundred years and in cars for sixty. They’re now showing up in technology, and the technology is the part of the story I have to build.

    This essay is about what their daily life looks like when the ecosystem actually works. Then it’s about why I think this is where things go from here, and what I’m doing about it.

    Introduction to the instrument

    Meet the user. She’s the one who commissioned the work in the hero image. She’s an architect: the corkboard behind her is a hint, and the mood board with fashion sketches and house renderings tells you something about her taste. The coffee cup has a small leather wrap and a logo I won’t try to read; the flower in the vase is past its bloom but she hasn’t replaced it yet because she likes it that way.

    She’s just opened the ecosystem the artisan was finishing. The hologram floating above the ring spells out what she’s getting: “Vibe Curation, Concierge Cred Network, Curated Intelligence.” The version number is v1.4, which tells you the device has been iterated. This isn’t a Kickstarter prototype. This is a maintained system that updates the way her car updates and her phone updates, except it updates to fit her specifically rather than to fit the median user.

    The phrase “Personalized Ecosystem” deserves to be said carefully because it gets thrown around by everyone selling anything. What’s on her desk is different. It’s not a feature flag set to her preferences. It’s not a recommendation algorithm tuned to her purchase history. It’s an ecosystem in the literal sense — an interconnected set of devices, services, vendors, and contexts that have been wired together around her cognition, her body, her schedule, her taste, and the people she trusts. The wearable is the access token. The ecosystem is everything the token unlocks.

    The reason this matters is not that the technology is impressive. It’s that the unit of value is changing. For a generation, the value was in the device. For the next generation, the value is in the connections between the devices and the person who wears them. You don’t buy the ring. You buy your way into the ecosystem that the ring represents. The ring is just the part you can touch.

    This is what I’m building toward. Not the device. The connections.

    The day starts with a small ritual

    The first time the ecosystem touches her day, it’s a coffee. She’s at a café — bright, marble-countered, the kind of place that does third-wave coffee and serves it in a small ceramic cup. The barista is named Maria. The hologram above her ring is showing the order before Maria has had to ask: oat latte, 120°F (which is a specific temperature most people don’t know to ask for), Ethiopian Yirgacheffe roast.

    The detail that matters is the parenthetical: “Maria (verified).”

    This is the Concierge Cred Network. Maria isn’t just a barista. She’s been verified by the ecosystem — pulled up by name because she’s the one who makes the coffee the way the subject likes it. If Maria’s not working today, the ecosystem might suggest a different café entirely rather than route the order to a barista the system doesn’t trust to nail the temperature. The vendor relationship has become specific to the human, not the brand.

    I want to name something about this image that the casual viewer might miss. The subject is barely looking at the ring. Her gaze is on Maria. The interaction is human; the technology is in the background doing the work that makes the interaction friction-free. When the ecosystem works, it disappears. It doesn’t ask her to type her order, doesn’t ask her to dig out her phone, doesn’t ask her to remember which roast she likes. It does that work upstream. What she’s left with is a moment of eye contact and a coffee that’s right.

    This is, in my experience, the part most technology gets wrong. The goal isn’t to put more interface in front of people. The goal is to remove the interface from places it doesn’t belong. The Curation Class is willing to pay a premium for that subtraction.

    The home she designed for herself

    Now she’s home. The wall she’s touching is travertine — real stone, the kind with porosity you can feel under your fingertips. The hologram tells you the room is in a “Curated Sanctuary” mode and lists the materials: travertine and a cashmere blend. The room is calm. The light is afternoon. The chair is leather and looks like it’s been broken in for years.

    The detail I want to pull forward is the curator field on the hologram: “User_24A. Verified.”

    She is the curator. The “Verified” tag isn’t a brand verification. It’s her own. The space was designed by her, for her, and the ecosystem is tracking that fact. The wall, the light temperature, the fragrance the room is currently running, the sound dampening, the chair — all of it is a vibe she composed and the ecosystem is just executing.

    This is where the Curation Class diverges most sharply from the mass-luxury class that came before it. The old luxury class hired Robert Mion or Kelly Wearstler to curate for them. They bought the taste of someone whose taste was for sale. The new class makes the curation themselves and uses the ecosystem to remember the choices and reproduce them. The taste isn’t borrowed. It’s authored. The ecosystem is what makes authored taste tractable at the level of a daily-running home.

    I’ll be honest about why this matters to me operationally. When I think about what I’m building for my best clients — the ones who are paying for something more than a website or a content pipeline — I’m not building campaigns. I’m building the systems that let them author their own taste and reproduce it at scale. The Notion structure is part of that. The content stack is part of that. The way we wire models and routing and observability is part of that. None of it is technology for its own sake. All of it is the infrastructure of authored taste.

    The room above is what that looks like when it’s done.

    The work she actually does

    The studio above is hers. The building is hers too — she’s an architect, and “The Veda Residences” is the project she’s leading. The hologram shows iteration v9.2, which means this design has been worked through. The physical model on the leather pad is the build she’s referring to when the holographic version isn’t enough.

    A few things to notice. The drafting table has a real architect’s set square on it. The materials board has fabric and stone swatches that look like they were pulled from suppliers she trusts. The two colleagues in the back are visible through a glass partition; the studio isn’t a solo operation. It’s a small firm.

    What the ecosystem gives her here isn’t draft generation. It’s not “AI did the design.” The design is hers, plus her team’s. The ecosystem gives her something subtler — the ability to iterate v9.2 against her own internal coherence rules, her own taste profile, her firm’s body of work, the structural and material verifications she requires. She is still making every decision. The ecosystem is making every decision legible and reproducible.

    This is the part I think most people get wrong about where AI is going. They think it’s going to do the work. It’s not. It’s going to make the work expressible. The architect above doesn’t need an AI to design her building. She needs an instrument that lets her ask “would this material be coherent with the rest of my catalog?” and get an answer with citations. She needs the ecosystem to be the silent third party that holds her own standards more reliably than she can hold them in her head across a four-month project.

    The building she’s designing in this image, by the way, is the one she’ll be standing inside in the last image of this essay. Hold that. We’ll come back to it.

    Recovery, the part the ecosystem treats as work

    After the work, the recovery. The image above is what wellness looks like when it stops being a separate vertical and becomes a function of the same ecosystem that runs the rest of the day.

    The hologram says “Vibe State Recovery (post-design cycle).” That phrase is doing real work. The ecosystem knows she just spent eight hours on iteration v9.2 of the building project. It knows what that does to her body — the cortisol curve, the shoulder tension, the eye strain. It’s prescribing a recovery protocol that’s specific to what she just did. Not a generic massage. Not a generic meditation. A recovery state tuned to a design cycle.

    “Second Brain (User_24A): Verified Biometrics” is the connective tissue here. The wellness system isn’t reading her body from scratch. It’s reading her body in the context of everything else the ecosystem knows about her — her schedule, her work, her sleep history, her stress baseline, her medication if any, her preferences for what kinds of intervention she’ll accept. The Second Brain in this image isn’t a metaphor. It’s literally the persistent memory layer that lets every part of the ecosystem behave intelligently with respect to every other part.

    If I had to name what I think the single biggest unlock of the next ten years will be, it would be this: persistent personal memory that crosses contexts. Right now your fitness app doesn’t know what your therapist said. Your calendar doesn’t know what your sleep tracker measured. Your travel booking doesn’t know your spouse’s allergy profile. Each of these systems is islanded. The Curation Class will be the first cohort to live in a world where those islands are connected, and the connection will be the persistent personal Second Brain that they own — not a vendor’s database. Theirs.

    This is, again, why I do what I do. Not because I want to sell people on “AI wellness.” Because the architectural pattern of a persistent personal Second Brain, owned by the human, is the foundation everything else rides on.

    A deeper intervention

    The session continues. She’s now holding a more specific tool — a neural stim device that’s been issued to her, the kind of thing that has to be verified for her specifically because applying it wrong would do real damage. The hologram says “Neural Pathway Targeted: Verified.” The ecosystem isn’t just letting her use the device. It’s verifying that the protocol is appropriate for her at this moment.

    The phrase “Vedic Regeneration” is doing some cultural work here. I’m not going to oversell it — different people will read different things into it. What I’ll say operationally is that the Curation Class tends to be polyglot about where its wellness traditions come from. They’ll combine cold plunges, somatic therapy, Ayurvedic principles, and neural-feedback hardware in the same week without feeling the contradictions. The ecosystem is what makes that polyglot stance tractable — it can hold the protocols from five different traditions and apply the one that fits the moment.

    The reason a verification layer matters is harder to talk about. We’re entering an era where people will be doing more sophisticated interventions on their own nervous systems than ever before. Some of those interventions will be safe. Some won’t. Some will work for one person and harm another. The ecosystem above is doing what regulators won’t be able to do for another fifteen years: assuring that a specific intervention is appropriate for a specific person on a specific day. The verification isn’t bureaucratic. It’s the thing that lets her safely run the protocol at all.

    I’ll name the discomfort here. There’s a version of this that ends badly — concentration of biometric data, vendor lock-in, dependence on a system that someone else can shut down. That risk is real. The mitigation isn’t to refuse the technology. The mitigation is to own the Second Brain rather than rent it. Which is part of why I’m building the way I’m building. The architecture matters. The architecture is the politics.

    The commute as part of the system

    She’s in the car now. It’s autonomous — the road is moving but her attention is on the floating dashboard. The destination on the hologram is her own design studio at 11 Rivoli. ETA fourteen minutes.

    The phrase that earns its keep is “Flow State Curation.” The car isn’t just transporting her body. The car is preparing her cognition for what’s about to happen at the studio. Audio profile tuned. Cabin temperature optimized. Lighting on a curve that brings her up into focus rather than letting her crash at the end of the recovery session. The fourteen minutes between wellness and work aren’t dead minutes. They’re a transition that the ecosystem is actively shaping.

    When I look at this image I think about how much of contemporary life is wasted in transitions. The Curation Class won’t tolerate it. Their time is their most expensive asset, and they’re willing to pay to have transitions be productive rather than evaporated. The autonomous car is part of that. So is the ring. So is the wellness suite. So is the studio. None of them in isolation is interesting. Stitched together they are an enormous economic shift.

    The other thing worth naming: the car is bespoke. “Smart cashmere & polished aluminum, verified.” This is not a leased Tesla. It’s a vehicle whose interior materials have been chosen for her, verified by the maker, and integrated into the ecosystem in a way that lets the car participate in the flow state curation rather than fight it. The market for that kind of vehicle barely exists today. It will exist in ten years, and it will be larger than people think.

    Collaboration at scale

    The studio meeting. Four colleagues, a marble table, a wall of glass onto the city. She’s standing because she’s leading.

    The hologram says “Group Alignment 88%.” That’s the part I want to pull forward. The ecosystem isn’t just running her individually — it’s running a measurement of how aligned her team is on the current iteration of the project. Eighty-eight percent is high. Twelve percent is the gap she has to close in the room.

    This is where the Curation Class moves from being a personal lifestyle to being an operational advantage. A team that can see its own alignment in real time, that can identify the twelve percent of disagreement and address it directly rather than letting it metastasize through three more meetings — that team will outperform a team that can’t. The ecosystem is doing the work of measurement that used to require an executive coach in the room. Now it’s just there, on the table, visible to everyone.

    I want to be careful here. There’s a version of this where the alignment metric becomes a cudgel, where dissent gets flattened by the pressure to push the number up. That’s a failure mode and the ecosystem above can absolutely become it if the culture around it is wrong. The fix isn’t to refuse the measurement. The fix is to make the measurement legible enough that disagreement is preserved as signal rather than erased as noise. The ecosystem can do that. Whether the team uses it that way is a cultural question, not a technological one.

    The technology, by itself, is neutral. The culture decides whether it’s surveillance or instrumentation. I’m building for the latter.

    The arc closes

    This is the image that earns the whole essay.

    She’s standing inside the building. The Veda Residences — the project that was iteration v9.2 in the studio scene — is now built. The curved concrete, the fluted glass, the composite timber that the hologram in that earlier scene specified, all of it has gone from model to reality. She designed the room she is now living in. The hologram above her is reporting that the sanctuary is “realized” and that the alignment is at 100%, which is the team-level analog of the personal sanctuary she was tuning at home.

    She designed her own world into existence. The ecosystem made the through-line tractable across nine months of design iterations, two construction phases, fifteen vendor relationships, three biometric recovery cycles, a hundred small daily curations, and the original choice — three years earlier — to commission a hand-finished AR ring from a maker who works with leather and aluminum on a single bench.

    The Curation Class is not, fundamentally, a class that consumes better products. It’s a class that authors its own life and uses an ecosystem to make the authorship coherent across time. The wearable, the home, the studio, the wellness suite, the car, the team, the building — these are all expressions of one continuous act of authorship. The technology is the substrate. The taste is the act. The realization is the proof.

    Why I’m building for this

    I started this essay by saying it’s about what I’m building for myself and my clients. I want to close on that more directly.

    I am not building generic AI tools. I am not building “content automation.” I am building the operational substrate that lets a person — a founder, an operator, an artist, an architect — author their own coherent system across time and have the system reliably express the authorship. That’s the Notion architecture. That’s the model routing layer. That’s the content pipeline. That’s the persistent memory. None of it is interesting in isolation. All of it is interesting because of what it adds up to.

    The person I am building for is the architect above. She doesn’t know me. She might not exist yet. But the infrastructure that makes her life tractable is the infrastructure I am wiring this week, this month, this year. Every client I take on is a step toward making the substrate real. Every article I publish is a way of describing the future I’m trying to bring forward. Every system I document is a piece of the operating manual for the Curation Class.

    I think this is the work. I think it’s where the next ten years are. I think the people who get this right will look back at the current era — when AI was being used to mass-produce the same five blog posts and the same five product descriptions — the way the Bauhaus generation looked back at Victorian ornament. They will see the gap between what was being built and what could have been built, and they will name it.

    I’m trying to be on the right side of that gap.

    The image above — the woman standing inside the building she designed, with a glass of water, watching the city she optimized — is what I’m working toward. Not for her specifically. For the version of that life that becomes available to anyone who decides to author it and has the infrastructure to do so. That’s the Curation Class. That’s the brief I’m operating under. That’s the future I’m building.

    It’s already starting. The man in the first image is finishing the ring by hand. The system is being built. The class is forming. The rest is execution.