Tag: AI Tools

  • The Three-Legged Stack: Why I Run Everything on Notion, Claude, and Google Cloud

    The Three-Legged Stack: Why I Run Everything on Notion, Claude, and Google Cloud

    Last refreshed: May 15, 2026

    A surveyor's tripod with copper, porcelain, and steel legs planted on rocky ground at sunrise above the clouds — representing the Notion, Claude, and Google Cloud three-legged stack
    The three-legged stack — Notion, Claude, Google Cloud — is what’s actually holding up the operation.

    I run a portfolio of businesses — restoration companies, content properties, creative ventures, a software platform, a comedy site, a few things I haven’t decided what to do with yet — on three legs. Notion. Claude. Google Cloud. That’s it. Everything else either fits inside that triangle or it doesn’t last in my stack.

    This article is the doctrine. Not “here’s a list of tools I like.” The actual operating philosophy of why this specific three-piece architecture is what holds the work up, where each leg’s job ends, and what I learned the hard way about which tools belong on the floor instead of the table.

    If you’re trying to decide what your own AI-driven operating stack should look like, what follows is what I’d tell you over coffee.

    Why three legs and not two, four, or twelve

    I tried twelve. I tried four. I lived for a while with two. Three is what’s left after everything else either failed in production, got absorbed into one of the three legs, or became overhead that didn’t pay for itself.

    The reason it’s not two is that you need a place where state lives, a place where reasoning happens, and a place where heavy compute runs. If you collapse two of those into one tool, the tool has to be excellent at both jobs and almost nothing is. If you keep them separate, each tool gets to be excellent at its actual job.

    The reason it’s not four is that every additional leg multiplies the surface area of what can break, what needs to be monitored, what needs to be paid for, what needs to be learned by every new person you bring in. Four legs sounds like it would be more stable but it isn’t. It’s more rigid. Three legs sit flat on uneven ground.

    The reason it’s not twelve is that I tried that and the cognitive cost of remembering which tool did which job was higher than the work the tools were supposed to be saving.

    Notion is the system of record

    State lives in Notion. That’s the rule. If a piece of information needs to exist tomorrow, it goes in Notion first.

    That includes the things you’d expect — clients, projects, content pipelines, scheduled tasks, the Promotion Ledger that governs which autonomous behaviors are running at what tier — and a lot of things you might not. Meeting notes go in Notion. Random ideas at 11pm go in Notion. The reasons I made a particular architectural decision six months ago go in Notion. Anything I might want Claude to read later goes in Notion.

    The reason this leg has to be Notion specifically — and not, say, a folder of markdown files, or a Google Doc, or Airtable — is structured queryability paired with human-readable rendering. Notion databases let me describe my business in shapes (a content piece is a row, a project is a row, a contact is a row) while keeping every row a real document I can read and write to like a normal page. That dual nature is rare. Most systems force you to pick between structured and prose. Notion lets the same object be both.

    The May 13, 2026 Notion Developer Platform launch made this leg even stronger. Workers, database sync, and the External Agents API mean the system of record can now do active things on its own and host outside agents (including Claude) as native collaborators. Notion stopped being a passive document store and started being a programmable control plane. That’s a big deal for this architecture and I wrote about it in my piece on the platform launch.

    Claude is the reasoning layer

    Claude does the thinking. That’s the rule on the second leg.

    Anywhere I would otherwise have to write something from scratch, decide between options, summarize a long document, generate code, audit content, or do any task that requires a brain rather than just a database query, Claude is the first thing I reach for. The work happens in Claude. The result lands in Notion.

    I want to be specific about why Claude and not “an LLM” generically. I have used the others. I have used GPT in production. I have used Gemini in production. They all work. Claude is what I picked, and the reasons aren’t religious.

    First, the writing is recognizable. Claude’s voice has a calibration to it that the others don’t quite have for the kind of work I’m doing — long-form content, operator-voice editorial, technical explainers. I can edit a Claude draft to feel like me much faster than I can edit the others.

    Second, the agentic behavior is the most stable across long sessions. Claude Managed Agents and Claude Code in particular are willing to think for a long time without losing the plot. For multi-step work that involves reading a lot of context, holding it, and acting on it across many turns, the difference is real.

    Third, the tooling around Claude — Claude Code, Cowork, the Agent SDK, MCP — is the most operator-friendly of the bunch right now. The other models will catch up. As of May 2026, Claude is the best fit for how I actually work.

    Fourth, and this matters more than people give it credit for: I am willing to bet on Anthropic the company. I am betting my operations on the leg that bears my reasoning load. Whose roadmap I’m comfortable with, whose values I find legible, whose engineering culture I trust to keep shipping the thing without breaking it underneath me — that’s a real input to the decision, not a soft preference.

    Google Cloud is the substrate

    The third leg is the heavy one. Google Cloud is where the things live that have to be reliable in a way that Notion can’t be and Claude isn’t supposed to be.

    The 27 WordPress sites I manage all live on GCP infrastructure. The knowledge-cluster-vm hosts five interconnected sites. The proxy that lets Claude talk safely to WordPress sites runs on Cloud Run. The cron jobs that fire scheduled work, the Python services that handle image pipelines, the AI Media Architect that runs autonomously — all on GCP. Anything that involves real compute, regulated data, behind-a-firewall execution, or sustained reliability lives on the third leg.

    The reason this leg has to be a real cloud and not just a laptop or a Hetzner box is that I run autonomous behaviors. Tier C autonomous behaviors run unattended, which means the substrate they run on has to be more reliable than I am. GCP gives me that. It’s also where Anthropic’s Claude is available through Vertex AI, which means there’s a path where the entire stack can run inside one cloud’s perimeter when that becomes operationally necessary.

    I picked GCP specifically over AWS or Azure for a few reasons. Vertex AI’s first-party Claude access matters to me. The GCP control surface is the one I’m fastest in. Cost-wise it’s been competitive for the workloads I run. None of those are universal — your third leg might be AWS, or Azure, or a hybrid with on-premise hardware. The doctrine isn’t “use GCP.” The doctrine is “have a real substrate that can carry the heavy work.”

    How the three legs hold each other up

    The thing that makes this an actual stack and not just three tools is the load each leg puts on the others.

    Notion holds Claude’s memory. Claude doesn’t have persistent memory across sessions in any deep way — what it remembers is what’s in the prompt and what it’s allowed to look up. Notion is where I put the things I want Claude to know tomorrow. Project briefs, brand voice docs, the Promotion Ledger, client context, my preferences. When Claude starts a session it looks at Notion. When the session is done, what mattered gets written back to Notion. The memory leg is Notion. Without it, Claude is amnesiac and has to be re-briefed every time.

    Claude does the work that Notion can’t and that GCP isn’t shaped for. Notion can hold structured data and run light automation through Workers and database sync. Notion can’t write a 2,000-word article in your voice. GCP can run a reliable cron job and host whatever you want on Cloud Run. GCP isn’t going to read your existing client notes and propose a follow-up email. The reasoning leg is Claude. Without it, you have a database and a server and no one to think.

    GCP holds the things that have to keep running when nobody is watching. Notion can’t host a WordPress site. Claude can’t run a cron job by itself. The compute leg is GCP. Without it, the autonomous behaviors that make this a system instead of a tool collection have nowhere to live.

    Each leg fails gracefully into the others. If Notion is down, GCP keeps the live workloads running and Claude can still do work in a session. If Claude is down, Notion still holds state and GCP still runs the autonomous infrastructure. If GCP is down, the websites are unreachable but the planning surface (Notion) and the reasoning surface (Claude) still let me figure out what to do about it. No single failure takes the whole operation down.

    What I tried that didn’t make the cut

    For honesty’s sake, here’s what I had in earlier versions of the stack that’s no longer there:

    Zapier and Make for orchestration. They worked. They cost real money at the volumes I was running. The May 13 Notion Developer Platform launch absorbed most of what I was using them for into native Notion functionality. What’s left I do with Cloud Run jobs.

    Multiple LLMs for “best tool for the job.” I went through a phase of routing different work to different models. The cognitive overhead of “which one for this task” was higher than any quality gain from the routing. I picked Claude and stayed.

    Custom CRMs and project management tools. Tried several. None of them did the job better than a well-structured set of Notion databases with the right templates and views. The CRM is in Notion now. The project management is in Notion. The pipeline tracking is in Notion.

    A second cloud “for redundancy.” Sounded smart, was actually overhead. If GCP goes down catastrophically I have bigger problems than my stack. Single-cloud is fine for a small operator portfolio.

    Local AI models for cost savings. The math didn’t work for me. I have a powerful workstation that can run open models, but the time cost of running them, debugging them, and maintaining them outweighed the API savings. Claude through the subscription and through Vertex when I need it is what I pay for now.

    Why this matters beyond my own operation

    I write about this not because anyone is required to copy it but because the shape of the answer — three legs, one for state, one for reasoning, one for compute — generalizes.

    If you’re a solo operator, a small agency, a content business, a service business with operational complexity, this shape works. Your specific tool choices for each leg will be different. Maybe your state lives in Airtable instead of Notion. Maybe your reasoning leg is GPT or Gemini. Maybe your substrate is AWS or Vercel or your own bare metal. The three-leg architecture survives the substitutions.

    What doesn’t survive substitutions is collapsing the legs. Putting state and reasoning in the same tool (anyone who has tried to use ChatGPT as their CRM knows what I mean) doesn’t work. Putting reasoning and compute in the same tool means you’re either compromising on reasoning to keep compute simple or compromising on compute to keep reasoning fluid. The separation is where the strength is.

    Where the stack is going next

    Three things I’m watching:

    Notion’s platform maturation. The May 13 launch is version 1 of what Notion as a programmable platform looks like. If Workers and database sync continue to grow into real automation surface, more of what I do on GCP could move to Notion. I don’t expect the heavy stuff to migrate, but the lightweight glue is moving in that direction.

    Claude’s agentic capabilities. Claude Managed Agents and the Agent SDK are getting better fast. Some of what I currently script in Python on Cloud Run will move into Claude-native agentic loops as the agents become more capable of long-running, reliable work without supervision.

    The fortress pattern on GCP. The ability to run Claude inside a private GCP perimeter via Vertex AI is becoming more important as I take on regulated industry work. The substrate leg is staying GCP precisely because of this — the perimeter matters.

    The stack will evolve. The three-leg shape probably won’t.

    Frequently Asked Questions

    Why Notion and not Airtable, Coda, or Obsidian?

    Notion’s combination of structured databases and human-readable page rendering is what makes it work as both a database and a knowledge base for Claude. Airtable is more powerful as a database but worse as a document. Coda is similar in spirit but smaller community and tooling around it. Obsidian is excellent for personal knowledge but doesn’t have the multi-user, structured-database surface I need to run businesses on.

    Why Claude and not GPT or Gemini?

    Voice quality for the kind of writing I do, agentic stability across long sessions, operator-friendly tooling (Claude Code, Cowork, MCP), and Anthropic’s roadmap and culture being legible to me. The other models work; Claude is what I picked.

    Why Google Cloud and not AWS?

    Vertex AI’s first-party Claude access, GCP’s control surface fitting how I work, competitive cost on my specific workloads. AWS would also work. The doctrine is “have a real substrate,” not “use GCP specifically.”

    Can a small operator afford this stack?

    Yes. Notion is $10/seat. Claude Pro is $20/month, Max is $100-$200. GCP costs scale with what you actually run — my 27-site infrastructure runs in the low three figures monthly. Total monthly stack cost for a solo operator running this architecture is well under what most people pay for a single SaaS tool that does only one of these jobs.

    What if one of the legs goes away or pivots badly?

    Each leg is replaceable. The shape of the stack matters more than the specific brands. If Notion pivots away from being useful, the state leg moves somewhere else. If Anthropic pivots, the reasoning leg moves. If I leave GCP, the substrate leg moves. The architecture is durable; the specific tool choices are not load-bearing in the way the architecture is.

    How long did it take to settle on this shape?

    Roughly two years of trying things. I write the doctrine now because I want my own next iteration to start from this shape rather than rebuilding it from scratch. If you want to skip those two years, this is the shortcut.

    Related Reading

  • Claude Models Roadmap May 2026: Opus 4.7, Knowledge Cutoffs, the 1M Context Window, and What’s Real About Claude 5

    Claude Models Roadmap May 2026: Opus 4.7, Knowledge Cutoffs, the 1M Context Window, and What’s Real About Claude 5

    Last refreshed: May 15, 2026

    The pace of new Claude releases in 2026 has been fast enough that the canonical question — “what’s the latest Claude model and what’s it actually good for?” — has a different answer almost every quarter. This article is the current map, dated and sourced, of what Anthropic has shipped in 2026, what’s confirmed about each model’s specs and knowledge cutoffs, and what’s been claimed (but not officially confirmed by Anthropic) about what’s coming next.

    Two ground rules first, because the model-roadmap space is full of speculation:

    • Specs and release dates marked as verified come from Anthropic’s own documentation, news posts, or help center pages. We list the specific source.
    • Anything marked as reported or claimed comes from third-party reporting (TechCrunch, secondary news sites, analyst commentary) that we could not independently confirm against an Anthropic-published source as of May 15, 2026.

    If you’re making product decisions on this information, treat verified facts as actionable and reported facts as directional.

    The current generally-available Claude models (May 15, 2026)

    From Anthropic’s official models overview and pricing pages, the current production Claude lineup is:

    Claude Opus 4.7claude-opus-4-7

    • Status: Generally available, currently the most capable Claude model
    • Context window: 1 million tokens at standard pricing (no long-context premium)
    • Max output: 128,000 tokens
    • Knowledge cutoff: January 2026 (per Anthropic Help Center, verified May 15, 2026)
    • Pricing: $5/MTok input, $25/MTok output (base rates)
    • Notable changes from 4.6: New tokenizer (uses up to ~35% more tokens for the same text), high-resolution image support up to 2576px / 3.75MP, new xhigh effort level, task budgets beta. Extended thinking budgets and sampling parameters (temperature, top_p, top_k) are removed.

    Claude Opus 4.6 — Still generally available, $5/MTok input, $25/MTok output. Released February 2026.

    Claude Sonnet 4.6 — $3/MTok input, $15/MTok output. Includes the 1M token context window at standard pricing.

    Claude Haiku 4.5 — Cheapest model in the active lineup at $1/MTok input, $5/MTok output.

    Earlier models still active or in deprecation: Opus 4.5, Opus 4.1, Sonnet 4.5, and Haiku 3.5 (retired except on Bedrock and Vertex AI). Opus 4 and Sonnet 4 are listed as deprecated.

    Knowledge cutoff dates that actually matter

    Per Anthropic’s Help Center article on training-data recency (verified May 15, 2026), the most recent generally-available models have January 2026 knowledge cutoffs. That means:

    • Anything that happened after January 2026 is outside the model’s training data
    • For current events, recent product launches, recent legal or regulatory changes, or very recent technical documentation, the model needs to be given the information directly (in the prompt, via web search, or through tool use) — it can’t be relied on to know it
    • The model still has tools available (web search, code execution, file access) that can access post-cutoff information when explicitly invoked

    The practical version: don’t ask Claude what happened last week and expect it to know. Hand it the source material and ask it to analyze, summarize, or work with what you’ve given it.

    The 1M token context window — what it actually unlocks

    Per Anthropic’s official pricing documentation (verified May 15, 2026), Opus 4.7, Opus 4.6, and Sonnet 4.6 all include the full 1 million token context window at standard pricing. There’s no long-context premium — a 900,000-token request is billed at the same per-token rate as a 9,000-token request.

    That’s an enormous practical change from earlier Claude generations. A 1M context window is roughly:

    • ~750,000 words of English text
    • Most full books or technical specifications in a single context
    • ~8 hours of meeting transcripts at typical density
    • An entire mid-sized codebase, including most or all source files

    Prompt caching and batch processing discounts both apply at standard rates across the full 1M window. For workloads that involve sending the same large document repeatedly with different questions, prompt caching against a 1M context is one of the highest-leverage cost optimizations available in the current Claude lineup.

    What’s reported about Claude 5 (and what we cannot independently verify)

    Multiple third-party sources reported in early 2026 that Anthropic CEO Dario Amodei confirmed a Q2 2026 launch window for Claude 5 in a TechCrunch interview published February 1, 2026. The same sources cited an internal-roadmap leak suggesting an April 28 target date.

    What we can verify as of May 15, 2026:

    • Anthropic’s official model lineup, news page, and platform documentation list the latest production models as Opus 4.7 and earlier 4.x variants. Anthropic has not, to our review, published an official “Claude 5” launch announcement on its anthropic.com news page or its docs.claude.com release notes as of this date.
    • The third-party reporting on Claude 5 specifications (500K context window, 20-25% benchmark improvements, ~90%+ on SWE-bench Verified) is widely repeated but, as far as we could verify, is not sourced to an Anthropic-published document.

    The honest read: Q2 2026 ends June 30, so if the reported timeline is accurate, an official Claude 5 announcement could plausibly land in the next several weeks. If you’re planning a project that depends on a specific Claude 5 capability, build against current Opus 4.7 first and treat any Claude 5-specific work as speculative until Anthropic publishes official model details.

    Claude Sonnet 5 — separate question

    Some 2026 third-party reporting refers to “Claude Sonnet 5” launching in early February 2026 under an internal codename. We could not, in our May 15, 2026 review, find this model listed in Anthropic’s official models overview, pricing page, or release notes — only Sonnet 4.6 and earlier Sonnet variants are listed as currently available models. If “Sonnet 5” was a real intermediate release, it does not appear in Anthropic’s current public model documentation under that name.

    Two possibilities to consider, neither of which we can confirm: the reported Sonnet 5 may have been folded into the broader 4.x lineup under a different name, or the reporting may have been speculative or premature. If you’re tracking model identifiers for production use, only model IDs published in Anthropic’s documentation (such as claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5) are guaranteed to be valid against the API.

    How to actually keep up with Claude releases

    The signal-to-noise ratio in the model-release coverage space is not great. Two practical sources are reliable enough to bookmark:

    • Anthropic’s news page at anthropic.com/news — first-party launch announcements with full model details
    • Claude API release notes at the Help Center release-notes page — concise, dated, version-specific

    For breaking changes that affect production code, the Anthropic platform documentation publishes per-version “What’s new” pages (Opus 4.7’s, for example, lists every API breaking change at launch). Those are the canonical reference for migration work.

    For everything else — analyst commentary, predictions, leak coverage — treat it as commentary, not as fact.

    What this means for your work today

    Based on what is verifiable on May 15, 2026:

    • If you need the most capable Claude model available, use Opus 4.7. It has the largest context window, the highest knowledge cutoff (January 2026), and the strongest reported coding/agentic performance.
    • If you need cost-efficient production work, use Sonnet 4.6. Same 1M context, much lower per-token rates than Opus.
    • If you need cheap, fast, simple-task workloads, use Haiku 4.5.
    • If you’re planning around Claude 5, treat the timing as unconfirmed and build resilience into your code (don’t hard-code model IDs that don’t exist yet).
    • For knowledge cutoff-sensitive use cases (current events, recent regulatory data, post-January 2026 news), always provide the information directly or use tool calls — don’t rely on training data alone.

    Frequently Asked Questions

    What is the knowledge cutoff for Claude Opus 4.7?

    January 2026, per Anthropic’s Help Center documentation verified May 15, 2026. Information about events, products, or developments after that date is not in the model’s training data and must be provided directly.

    What is the largest Claude context window currently available?

    1 million tokens, available on Opus 4.7, Opus 4.6, and Sonnet 4.6 at standard pricing with no long-context premium.

    Has Anthropic officially announced Claude 5?

    As of May 15, 2026, we could not locate an Anthropic-published announcement of a Claude 5 model on anthropic.com or docs.claude.com. Multiple third-party sources have reported a Q2 2026 launch window based on a TechCrunch interview with Dario Amodei, but we could not independently confirm those specifications against a primary source.

    Is Claude Sonnet 5 a real model I can use?

    As of May 15, 2026, “Claude Sonnet 5” does not appear in Anthropic’s official models overview or pricing documentation. The currently available Sonnet model is Claude Sonnet 4.6 (model ID claude-sonnet-4-6). Earlier reports of a Sonnet 5 release were not confirmed against an Anthropic-published source in our review.

    Why does Opus 4.7 use more tokens than Opus 4.6 for the same text?

    Opus 4.7 ships with a new tokenizer that contributes to its improved performance but uses approximately 1x to 1.35x as many tokens for the same input text compared to previous models. Anthropic recommends increasing max_tokens headroom and adjusting compaction triggers accordingly.

    Are sampling parameters (temperature, top_p, top_k) still supported on Opus 4.7?

    No. Setting temperature, top_p, or top_k to any non-default value on Opus 4.7 returns a 400 error. Migration guidance: omit these parameters and use prompting to guide the model’s behavior.

    Related Reading

    How we sourced this

    Sources reviewed May 15, 2026:

    • Anthropic Pricing Documentation: docs.claude.com/en/docs/about-claude/pricing (primary source for model lineup, per-token rates, context window pricing)
    • Anthropic Platform Documentation: What’s new in Claude Opus 4.7 (primary source for Opus 4.7 features, breaking changes, tokenizer, image support, task budgets)
    • Anthropic Help Center: How up-to-date is Claude’s training data? (primary source for knowledge cutoff dates)
    • Anthropic news page (primary source check for Claude 5 announcement — none located as of May 15, 2026)
    • Third-party reporting on Claude 5 / Sonnet 5 (TechCrunch interview reports, Claude5.com, Fello AI, WaveSpeed Blog) — cited as reported but not independently confirmed against primary sources

    This article applies the verified vs. reported distinction throughout. If any of the unverified third-party claims are confirmed by Anthropic in the weeks after this article’s date stamp, the relevant sections should be updated to reflect the new primary-source documentation.

  • Amazon Prime Student and Claude Pro: Is There a Bundle or Discount? (May 2026 Honest Answer)

    Amazon Prime Student and Claude Pro: Is There a Bundle or Discount? (May 2026 Honest Answer)

    Last refreshed: May 15, 2026

    If you’re a student paying for Amazon Prime Student and you’re wondering whether your subscription includes Claude Pro — or unlocks a discount on it — here’s the direct answer first, and then the supporting context.

    As of May 15, 2026, after reviewing Amazon’s official Prime Student benefits page, Anthropic’s pricing and plans pages, Anthropic’s published news and partnership announcements, and AWS Public Sector publications, we found no announced partnership, bundle, or discount between Amazon Prime Student and Claude Pro.

    That does not confirm such a partnership doesn’t exist or won’t exist later. It confirms that we searched the places you would expect to find an announcement and could not locate one. If Amazon or Anthropic launches this kind of program after the date stamp on this article, this conclusion will be out of date — and the right place to check is always Amazon’s Prime Student benefits page and Anthropic’s own announcements.

    Why people are searching for this

    Search Console data and general 2026 web trends show consistent volume on queries like “amazon prime student claude pro” and “amazon prime student claude code.” The pattern usually reflects one of three things:

    • Students assuming that because Amazon Prime Student bundles several other digital subscriptions and benefits, it would make sense for Claude Pro to be on the list
    • Confusion between Amazon (the retailer/Prime Student parent), AWS (the cloud platform where Anthropic’s Claude is available), and Anthropic (the company that makes Claude)
    • A misread of news coverage about Claude’s availability on AWS Bedrock or AWS Marketplace as some sort of consumer bundle

    None of those are unreasonable assumptions. They’re just not, as far as we can verify in May 2026, actual partnerships.

    What Amazon Prime Student actually includes (as of May 2026)

    Per Amazon’s official Prime Student benefits page, the core benefits are:

    • Six-month free trial, then ~50% off standard Prime pricing
    • Free same-day or one-day shipping on eligible items
    • Prime Video, Amazon Music Prime, and Prime Reading access
    • Exclusive student deals and promotions
    • Bundled access to select third-party services (this list rotates and varies by region)

    Claude Pro is not currently listed among those bundled third-party services. AWS-side products and developer tools are separate from the Prime Student consumer benefit set.

    What students can actually do to access Claude at reduced cost

    Anthropic does not run a public, individual Claude Pro student discount. What it does run, verified May 15, 2026, is a set of institutional and program-based paths to discounted or free access:

    Claude for Education. Launched in April 2025, this is Anthropic’s program for higher-education institutions. Students, faculty, and staff at participating universities get access to Claude’s premium features for free as long as they remain enrolled or employed. Known partner institutions include Northeastern University, the London School of Economics, Champlain College, the University of San Francisco School of Law, and Northumbria University. If your school is part of the program, signing in to claude.ai with your school email upgrades your account automatically — no application or payment required.

    GitHub Student Developer Pack. Verified students enrolled in degree-granting programs can claim a developer pack that has historically included credits or premium access to a wide range of developer tools. Claude offerings within the pack have varied over time — check the current pack contents at GitHub’s education portal for what’s available the day you apply.

    Direct Anthropic partnerships with specific universities. Beyond the formal Claude for Education program, Anthropic has signed individual agreements with universities providing campus-wide access at institutional rates. If your university isn’t on the public partner list, it’s worth asking your IT or library services whether they have a direct arrangement.

    The standard Claude free tier. Anyone can use Claude without paying. The free tier provides limited daily messages on a recent model, and for many students that’s sufficient for coursework that doesn’t require sustained heavy use.

    For a broader breakdown of every legitimate path students can take to reduce Claude costs, see our existing guide: Claude Student Discount: The Honest Guide to Getting Claude for Less.

    What about AWS Marketplace and Claude for Education?

    One source of search confusion is that Claude for Education became available through AWS Marketplace in 2026 (covered in the AWS Public Sector Blog). This is an institutional purchasing path for universities — it allows schools to procure Claude for Education through their existing AWS billing relationship — not a consumer or student-facing benefit.

    It’s also distinct from the underlying availability of Claude models on AWS Bedrock for developers, which is again an enterprise/developer feature, not a Prime Student benefit.

    What to be wary of

    Because there’s real search demand for a Prime Student + Claude Pro discount that doesn’t currently exist, third-party sites have filled the gap with content of varying quality. Specifically:

    • “Promo code” pages claiming 50% off Claude Pro through Prime Student. We could not verify any of these against Anthropic’s official pricing, and Anthropic’s Help Center has stated that support cannot issue one-off discounts.
    • Reseller and account-sharing services that advertise Claude Pro at a discount through some Amazon channel. These typically involve shared logins, terms-of-service violations, or both.
    • YouTube videos and articles that describe a Prime Student / Claude bundle as if it exists — usually republishing each other’s speculation rather than citing a primary source.

    The honest read: until Amazon or Anthropic announces a partnership directly, on their own properties, treat any third-party claim of a Prime Student + Claude Pro discount as unverified.

    What we’d actually like to see

    A Prime Student + Claude Pro bundle would make sense. Prime Student is a credible distribution channel for student-facing digital benefits, Claude is increasingly central to how students do research and writing, and Anthropic has shown it’s willing to do institutional deals for the education market. There’s a logical product collaboration sitting on the table.

    Whether either party is interested in pursuing it isn’t something we can speak to. If it happens, we’ll update this article. If you’ve seen a credible announcement we missed, let us know — the methodology in this article is exactly the kind of finding that should get re-checked when the facts change.

    Frequently Asked Questions

    Does Amazon Prime Student include Claude Pro?

    No, as of May 15, 2026, Amazon Prime Student does not include Claude Pro. We reviewed Amazon’s official Prime Student benefits page, Anthropic’s plans and pricing pages, and Anthropic’s news releases, and found no announced partnership, bundle, or discount linking the two products.

    Is there an Amazon Prime Student discount on Claude Code?

    No, as of May 15, 2026. Claude Code uses the same subscription tiers as Claude Pro (or runs against a Claude Developer Platform API key), and no Amazon Prime Student discount or bundle on either product has been announced through official channels we reviewed.

    Why do search engines suggest “amazon prime student claude pro” if it doesn’t exist?

    Search engines surface query suggestions based on actual user search volume, not on whether the underlying product exists. The high volume of users searching for this combination reflects assumption and curiosity, not a confirmed offering.

    What’s the cheapest legitimate way for a student to use Claude Pro?

    If your university participates in Claude for Education, sign in to claude.ai with your school email — that’s free premium access. If not, the GitHub Student Developer Pack sometimes includes Claude-related benefits. Beyond those, the standard Claude free tier costs nothing, and individual Claude Pro subscriptions are $20/month at standard pricing.

    Can students share a single Claude Pro account to save money?

    Account sharing typically violates Anthropic’s terms of service. The Team plan exists for groups that need multi-user access at a per-seat rate.

    Will Anthropic ever offer a public student discount?

    Unknown. As of May 2026, Anthropic’s stated position is that it focuses student access through institutional Claude for Education partnerships rather than individual discount codes. That could change at any time.

    Related Reading

    How we sourced this

    Sources reviewed May 15, 2026:

    • Amazon Prime Student official benefits page (primary source for what Prime Student actually includes)
    • Anthropic pricing page and plans page at claude.com/pricing (primary source for Claude pricing structure and absence of student discount)
    • Anthropic Help Center and news releases (primary source for Claude for Education and partnership announcements)
    • AWS Public Sector Blog: Claude for Education now available in AWS Marketplace (primary source for the AWS Marketplace path)
    • Multiple independent comparison sources (Krater, GamsGo, Get AI Perks, Krater, others) consistently reporting no Prime Student / Claude partnership exists — Tier 2 confirming sources

    This article applies a negative-finding standard: when a claim can’t be verified, we state what we searched and what we did not find, rather than declaring the claim false. If the partnership status changes after May 15, 2026, the conclusion here should be re-verified against the original sources before being treated as current.

  • Claude MCP Token Cost Reality: Why Your Model Context Protocol Setup Is Burning 18,000 Tokens Per Turn

    Claude MCP Token Cost Reality: Why Your Model Context Protocol Setup Is Burning 18,000 Tokens Per Turn

    Last refreshed: May 15, 2026

    If you’ve ever connected a few Model Context Protocol (MCP) servers to Claude Code and watched your usage limit drain faster than the work you actually did would explain, you’re not imagining it. There’s a real, documented, and sometimes substantial token cost to wiring MCP servers into your Claude environment — and most setup guides don’t mention it.

    The short version: each MCP server you connect injects its complete tool schema into the context of every message you send. Multiple servers stack. The total overhead can range from a few thousand tokens for a single server up to roughly 18,000 tokens per turn when you’re running a typical multi-server developer setup. Anthropic’s own engineering team has acknowledged this in a public GitHub issue and shipped optimizations to reduce it.

    This article walks through where the overhead actually comes from, how to measure your own setup, what Anthropic has changed in 2026 to ease the cost, and the concrete steps you can take to keep MCP useful without burning through your token budget.

    What MCP actually is, briefly

    The Model Context Protocol is an open standard created by Anthropic that lets Claude (and other LLMs that adopt the standard) connect to external tools and data sources through a common interface. Instead of writing a custom integration for every API or database you want Claude to access, you point Claude at an MCP server, and the server exposes its capabilities — file access, Slack messages, GitHub repos, database queries — in a format Claude can use.

    It’s a real productivity unlock. It’s also why the token math gets complicated.

    Where the token cost comes from

    When you connect an MCP server to Claude Code (or any MCP-aware client), three things happen on every message:

    1. Tool schema injection. Every tool the server exposes — every name, every description, every parameter definition — is included in the context Claude sees. A Slack MCP server with 10–15 tools typically adds about 2,000 tokens. A GitHub server is heavier. A custom internal-tooling server with verbose descriptions can run 5,000–8,000 tokens on its own.

    2. Tool-use system prompt overhead. Anthropic’s documentation confirms that whenever tools are present in a request, a special system prompt is automatically prepended that teaches the model how to use tools. For Claude 4.x models with tool_choice: auto, that’s an additional 346 tokens per request. The bash tool adds 245. The text editor tool adds 700. The computer-use tool adds 735 plus a 466–499 token system prompt extension.

    3. Stateless re-sending. Each message in a conversation is a fresh API request that includes the full conversation history plus the full tool schema. Claude does not “remember” your tools from the last turn the way a human remembers a colleague’s job description. Every turn pays the schema cost again.

    That’s the math. Now multiply by the number of MCP servers you have connected. A developer running Slack + GitHub + a database connector + an internal custom server can easily land in the 15,000–20,000 tokens-per-turn range — and that’s before you’ve typed your actual question.

    The 18,000-token figure, sourced

    The “up to 18,000 tokens per turn” number comes from a combination of public sources verified May 15, 2026:

    • Anthropic’s own GitHub repo for Claude Code, issue #3406, titled “Built-in tools + MCP descriptions load on first message causing 10–20k token overhead.” Anthropic engineers acknowledged the issue and have shipped progressive optimizations against it.
    • Independent analysis by MindStudio measuring real Claude Code sessions with multiple MCP servers attached.
    • Anthropic’s official Claude Code documentation on cost management explicitly recommends running /mcp to inspect connected servers and disabling unused ones to control token consumption.

    The exact number for your setup will be different. The shape of the problem is the same.

    Why this matters more than it looks

    Claude’s standard context window is 200,000 tokens. Losing 18,000 of those to tool definitions before you start typing represents about 9% of your effective working space. That’s a real ceiling cost — but it’s not the part that hurts most.

    The part that hurts is the cumulative bill. If you’re on a Claude subscription with a usage limit, every turn through Claude Code is paying the full schema cost again. A workflow that takes 30 turns of back-and-forth burns 540,000 tokens worth of tool definitions across that session — even if the tool descriptions never change. On the API at standard Sonnet 4.6 rates, that’s about $1.62 in pure schema overhead per session, before any of the actual work gets billed.

    Multiply by a team of engineers running Claude Code daily, and the overhead becomes the largest single line item in your token spend.

    What Anthropic has changed in 2026

    Anthropic has shipped two meaningful optimizations against MCP token bloat over the past few months:

    Deferred tool loading. In recent Claude Code releases, MCP tool definitions are no longer all loaded into context at the start of a session by default. Tool names enter context, but the full schemas only load when Claude actually invokes a particular tool. This is a substantial improvement for sessions where you have many tools available but only use a few.

    Tool Search. A new built-in search mechanism lets Claude discover relevant MCP tools on demand rather than carrying them all in context. One independent measurement reported a Claude Code MCP context cut of 46.9% — from roughly 51,000 tokens down to 8,500 tokens — by using Tool Search instead of full upfront loading.

    These optimizations help, but they don’t make the overhead zero. The baseline cost of having any MCP server connected at all is real, and you still pay it on every turn even with deferral active.

    How to measure your own MCP token cost

    Two practical methods work for most setups:

    Method 1 — The /mcp command. In Claude Code, run /mcp to see every server currently connected. For each one, check how many tools it exposes. Anthropic’s documentation explicitly recommends this as the first step to controlling MCP costs.

    Method 2 — Token-count delta. Send a single message in Claude Code with no MCP servers connected and note the input token count from the API response. Reconnect your MCP servers one at a time. The delta in input tokens between configurations is the per-turn cost of each server. This is the most precise way to know your own number.

    Anything north of about 8,000 tokens per turn in pure MCP overhead is worth optimizing. North of 15,000 is a flag.

    Concrete steps to control MCP token cost

    • Disable MCP servers you aren’t actively using. The single highest-leverage move. If you connected a server two weeks ago for one experiment and never went back to it, every turn you’ve taken since has been paying for it.
    • Prefer CLI tools over MCP servers when both exist. Anthropic’s own cost-management guidance notes that tools like gh, aws, gcloud, and sentry-cli remain more context-efficient than equivalent MCP servers because they don’t add per-tool listing overhead. Claude can simply invoke them via the bash tool.
    • Use MCP gateways for large server counts. If you genuinely need many tools available, gateway products (Maxim, Milvus-backed setups, others) consolidate tools and surface only relevant ones per query, cutting net overhead substantially.
    • Run a complex CLAUDE.md audit. Long project-level CLAUDE.md files compound the per-turn baseline. Treat CLAUDE.md as an asset that’s expensive to keep verbose.
    • Watch for context compounding. In long Claude Code sessions, conversation history grows alongside the tool schema cost. If you’re running a workflow longer than 20 turns, periodically clear context (/clear) to reset the per-turn cost to baseline.

    Frequently Asked Questions

    Does every MCP server cost 18,000 tokens?

    No. The 18,000-token figure is for a typical multi-server setup with several connected servers and built-in tools active. A single small MCP server (5–10 tools, concise descriptions) might only add 1,500–3,000 tokens. The cost scales with the number of servers and the verbosity of their tool definitions.

    Why does Claude reload the tool definitions every turn?

    The Claude API is stateless. Every message is a fresh API request containing the full conversation history and the full tool schema. The model has no memory between requests, so the schema must be present every time tools could be used. Recent deferred-loading optimizations reduce this for unused tools, but anything Claude actually needs still loads each turn.

    How do I see what’s loaded in my Claude Code environment?

    Run /mcp in Claude Code to list every connected MCP server and its tool count. To check the actual token cost, send a test message and inspect the input token count returned by the API.

    Are CLI tools really cheaper than MCP servers?

    Yes, for tools that have both options. CLI tools accessed via the bash tool only add the bash tool’s 245-token overhead. An equivalent MCP server adds its full tool schema for every tool it exposes. For tools you use frequently, MCP can still be worth it for the structured interface; for tools you use rarely, CLI is more efficient.

    Does this affect Claude on the web (claude.ai) too?

    Web Claude does not use the same MCP server-connection model as Claude Code. The MCP token-overhead pattern primarily affects Claude Code, custom Agent SDK applications, and other developer-facing clients where you wire in MCP servers directly.

    Will this get better in future Claude releases?

    Likely. Anthropic has already shipped deferred tool loading and Tool Search in 2026, both of which materially reduce the per-turn overhead for unused tools. The architectural baseline (tools must be present in context to be invoked) is unlikely to change, but the practical cost should keep dropping as the deferred-loading optimizations mature.

    Related Reading

    How we sourced this

    Sources reviewed May 15, 2026:

    • Anthropic GitHub: anthropics/claude-code issue #3406, “Built-in tools + MCP descriptions load on first message causing 10-20k token overhead” (primary source for the overhead figure and Anthropic acknowledgment)
    • Anthropic Claude Code documentation: Connect Claude Code to tools via MCP and Manage costs effectively (primary source for /mcp command and CLI vs. MCP guidance)
    • Anthropic Pricing Documentation: tool-use system prompt token counts, bash/text-editor/computer-use overheads (primary source for the per-tool fixed costs)
    • Independent analysis: MindStudio (multiple Claude Code MCP measurements), Joe Njenga’s Tool Search 51K→8.5K measurement, Maxim and Scott Spence on optimization patterns (Tier 2 confirming sources)

    Token-cost numbers in this article are accurate as of May 15, 2026. Anthropic is shipping MCP optimizations regularly, so the practical overhead may be lower in your environment than what’s described here.

  • Claude Code Pricing in May 2026: What $20, $100, and $200 a Month Actually Buy You

    Claude Code Pricing in May 2026: What $20, $100, and $200 a Month Actually Buy You

    Last refreshed: May 15, 2026

    Claude Code pricing has stopped being a clean sticker number and started being a question of which ceiling you hit first. There is a $20 plan, a $100 plan, and a $200 plan — and underneath all three sits a 5-hour rolling window, a weekly active-hours cap added in August 2025, and a per-model multiplier that quietly makes Opus 4.7 the most expensive thing you can do inside the terminal. If you came looking for the right plan, the honest answer is: it depends on whether you are mostly a Sonnet operator or you live in Opus.

    The three subscription tiers, stripped down

    Pro — $20/month. Access to Claude Code in the terminal, web, and desktop, with both Sonnet 4.6 and Opus 4.7 available. The practical envelope is about 44,000 tokens per 5-hour window and roughly 40–80 weekly active hours on Sonnet, depending on session concurrency. This is the plan for someone running Claude Code a few hours a day on focused work — refactors, scoped feature builds, debugging passes — not someone leaving an agent running while they eat lunch.

    Max 5x — $100/month. Five times the Pro envelope, plus priority during peak demand. The window allocation lands around 88,000 tokens per 5-hour block. This is the tier where you stop thinking about token budgets during a single working day and start thinking about them across a whole week. Picked correctly, it is the cheapest way to use Claude Code as your primary IDE companion without flipping over to API billing.

    Max 20x — $200/month. Twenty times Pro — about 220,000 tokens per window — which translates to roughly 480 Sonnet-hours or about 40 Opus-hours per week before the weekly cap kicks in. Real-world reports from early 2026 had $200/month users watching single Opus prompts eat 10–20% of their daily allocation; Anthropic publicly acknowledged the problem, expanded capacity, and doubled the 5-hour rate limit for Pro and Max accounts. If you are running Claude Code across multiple repos all week and reaching for Opus on the hard problems, this is the tier that stops you from staring at a rate-limit wall.

    The API, as a sanity check

    If you want a sanity check on whether the subscription math works, price the same workload against the API:

    • Claude Haiku 4.5 (claude-haiku-4-5-20251001): $1.00 input / $5.00 output per million tokens
    • Claude Sonnet 4.6 (claude-sonnet-4-6): $3.00 input / $15.00 output per million tokens
    • Claude Opus 4.7 (claude-opus-4-7): $5.00 input / $25.00 output per million tokens

    Prompt caching is the lever almost nobody uses correctly. Cache writes cost 1.25x input price for the 5-minute TTL or 2.0x for the 1-hour TTL, but cache reads cost 0.10x — a 90% discount on every subsequent request that hits the same context. If your .clauderules file, project map, and the file you are editing are all stable for an hour, the bill on a long pairing session can drop by an order of magnitude. The Batch API knocks another 50% off both directions for asynchronous workloads, which is worth knowing if you are running large refactor sweeps.

    One trap on Opus 4.7 specifically: the model uses a new tokenizer that inflates token counts by up to 35% on identical text compared to Opus 4.6. The headline price did not change, but your effective spend per request did — sometimes by nothing, sometimes by a third, depending on the content. If you migrated from Opus 4.6 and your bill went up without your prompt patterns changing, that is the reason.

    How to actually choose

    The cleanest way to pick a plan is to first decide your model mix, then your weekly hours.

    If you are mostly a Sonnet operator — long agentic runs, multi-file edits, codebase Q&A, with Opus only reached for on the architectural questions — Pro at $20 is plausible up to about 5–8 hours of focused use per day, Max 5x covers most full-time individual developers, and Max 20x is overkill unless you are running multiple sessions in parallel.

    If you live in Opus — long-horizon agentic work, hard refactors across many files, anything where you would rather have one good attempt than three Sonnet retries — Pro will frustrate you within two weeks, Max 5x is the realistic floor, and Max 20x is the only tier that gives you a defensible Opus envelope without bouncing over to API billing.

    And if you are running Claude Code across multiple repos all week, leaving agents to grind on tasks while you do other things, Max 20x is the only subscription that holds up — and even then, the weekly cap is real. Use the API for the spillover and you will still come out cheaper than trying to brute-force a smaller plan.

    The number that matters

    One developer’s public report this year: roughly 10 billion tokens consumed across Claude Code over eight months. API metered cost would have exceeded $15,000. The same workload on Max at $100/month for the same window came in around $800 — about 93% cheaper. That is the gap that makes the subscription model worth taking seriously, even when the rate limits feel arbitrary. The $200 tier is not a vanity number; it is the price Anthropic charges to stop being a meaningful constraint on your workflow.

    The right way to read Claude Code pricing in May 2026 is not to ask which plan is cheapest. It is to ask which plan is the cheapest one that disappears — the one that stops appearing in your day. For most full-time developers reaching for Opus regularly, that plan is Max 20x. For everyone else, Max 5x is the first plan that actually gets out of your way.

  • LLMs.txt in 2026: The 4-Element Spec, The Robots.txt Pairing, and How to Verify Crawlers Are Reading It

    LLMs.txt in 2026: The 4-Element Spec, The Robots.txt Pairing, and How to Verify Crawlers Are Reading It

    If you publish an llms.txt file this week, no major model is going to fetch it tonight. That is the honest 2026 read on the spec — and yet the file is still worth shipping for narrow, specific reasons. This guide covers the 4-element specification published at llmstxt.org, the robots.txt pairing that actually controls AI crawler behavior right now, and a server-log filter you can run to verify whether anyone is reading the file you just shipped.

    What llms.txt actually is (and what it isn’t)

    llms.txt is a Markdown file served at the site root — /llms.txt — proposed by Jeremy Howard of Answer.AI on September 3, 2024. The spec at llmstxt.org defines four elements: a required H1 with the project or site name; a blockquote summary; zero or more Markdown content sections (no headings); and zero or more H2-delimited file-list sections containing annotated Markdown links to deeper content. That is the entire specification. There is no header convention, no schema requirement, no robots-style allow/deny syntax.

    What llms.txt is not: it is not a substitute for robots.txt, it is not an access-control mechanism, and as of May 2026 it is not consumed at inference time by ChatGPT, Claude, Gemini, Perplexity, or Copilot in any documented production system. Server-log audits across multiple independent practitioners show GPTBot, ClaudeBot, and Google-Extended do not request /llms.txt in meaningful volume during routine crawls.

    The realistic 2026 use case is developer tooling. AI coding assistants and IDE agents — Cursor, GitHub Copilot, Claude Code, and similar tools — retrieve docs in real time, and a curated llms.txt cuts token waste by pointing them at canonical Markdown sources instead of HTML-rendered pages bloated with nav and tracking. Companies like Anthropic, Stripe, Cursor, Cloudflare, Vercel, Mintlify, Supabase, and LangGraph ship llms.txt for that reason.

    The 4-element template — a working example

    Here is a real, valid llms.txt for a hypothetical SaaS docs site. Copy this structure, change the project name, and you have a shippable file in under 30 minutes:

    # Acme Analytics
    
    > Acme Analytics is a self-hosted product analytics platform for SaaS teams. This file points AI assistants and IDE agents at canonical Markdown documentation, not the rendered HTML.
    
    Authoritative Markdown sources for product, API, and SDK documentation. Use the `.md` variant of any docs page (append `.md` to the URL) for a clean, agent-friendly version.
    
    ## Getting Started
    
    - [Quickstart](https://acme.example/docs/quickstart.md): 10-minute setup, install through first event.
    - [Concepts](https://acme.example/docs/concepts.md): events, properties, identities, sessions — definitions and examples.
    
    ## API Reference
    
    - [REST API Reference](https://acme.example/docs/api/rest.md): every endpoint, request/response schema, rate limits.
    - [Webhook Reference](https://acme.example/docs/api/webhooks.md): payload contracts and retry behavior.
    
    ## SDKs
    
    - [JavaScript SDK](https://acme.example/docs/sdk/js.md): browser and Node, including server-side rendering notes.
    - [Python SDK](https://acme.example/docs/sdk/python.md): server-side ingestion patterns.
    
    ## Optional
    
    - [Changelog](https://acme.example/docs/changelog.md): version history, breaking changes flagged inline.
    

    Two practitioner notes. First, the spec uses an “Optional” H2 as a soft signal — links under that heading can be skipped by aggressive token budgets. Second, the file is most useful when every linked URL has a parallel .md Markdown version. If your site is pure HTML, llms.txt without paired Markdown does little.

    The robots.txt pairing — this is what actually controls AI bots today

    The lever that meaningfully controls AI crawler behavior in 2026 is robots.txt with user-agent–specific rules. Anthropic publishes official documentation for three bots — ClaudeBot for training, Claude-User for user-initiated fetches, and Claude-SearchBot for search indexing — and confirms all three honor robots.txt. OpenAI runs GPTBot (training) and OAI-SearchBot (live ChatGPT search). Google’s AI training opt-out is the Google-Extended user-agent. Perplexity uses PerplexityBot.

    The two-bucket pattern most practitioner sites should ship: block training-only crawlers, allow search and user-initiated retrieval so your content can still be cited in answers.

    # Allow AI search and user-fetch traffic (citations, attribution)
    User-agent: Claude-SearchBot
    Allow: /
    
    User-agent: Claude-User
    Allow: /
    
    User-agent: OAI-SearchBot
    Allow: /
    
    User-agent: PerplexityBot
    Allow: /
    
    # Block training-only crawlers
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: GPTBot
    Disallow: /
    
    User-agent: Google-Extended
    Disallow: /
    
    # Standard search crawler — leave open
    User-agent: Googlebot
    Allow: /
    
    Sitemap: https://example.com/sitemap.xml
    

    One operational caveat: robots.txt is policy, not enforcement. Anthropic, OpenAI, and Google have all publicly committed their named bots to compliance, but unnamed scrapers and residential-IP harvesters routinely ignore it. For sites with sensitive content, pair robots.txt with WAF or Cloudflare bot-management rules at the edge.

    Structured data still does more heavy lifting than llms.txt

    If your goal is AI citation rather than IDE-agent retrieval, structured data on the page itself moves the needle more than llms.txt. The minimum stack for any article you want cited: Article schema with named author and publisher, FAQPage schema on any post that answers a discrete question, and speakable markup on the answer paragraphs. These get parsed during normal HTML fetches by every major AI crawler — no separate file required.

    How to verify your llms.txt is actually being read

    Ship the file, then run this server-log filter weekly for 30 days. On any standard access-log format (nginx, Apache, or a Cloudflare log push), grep for requests to /llms.txt and break them down by user-agent:

    grep "GET /llms.txt" /var/log/nginx/access.log \
      | awk -F\" '{print $6}' \
      | sort | uniq -c | sort -rn
    

    What you will almost certainly see in May 2026: a steady trickle of human curl requests, the occasional IDE agent fetch tagged with a Cursor or VS Code user-agent, and effectively zero hits from GPTBot, ClaudeBot, or Google-Extended. That null result is itself the measurement — it tells you llms.txt is a developer-experience asset right now, not an AI-citation asset, and your investment should match that reality.

    The recommended 2026 rollout

    For most sites, the right sequence is: ship the robots.txt user-agent rules above first, because those are enforceable today and shape every AI crawler interaction. Add structured data to every article that competes for AI citation. Then publish llms.txt — under 30 minutes of work — for the IDE-agent and dev-tooling upside, with no expectation of immediate search lift. When OpenAI, Anthropic, or Google publicly confirm production llms.txt consumption, you are already in position.

  • Claude MCP in 2026: What Actually Changed and How to Configure It Without Wasting Tokens

    Claude MCP in 2026: What Actually Changed and How to Configure It Without Wasting Tokens

    Last refreshed: May 15, 2026

    If you set up Claude MCP six months ago and have not touched the config since, three things have changed underneath you: the recommended transport, how tools are loaded into context, and how teams share server configs. None of these are cosmetic. If you ignore them, you are leaving tokens, money, and stability on the table.

    This is the working Claude MCP setup I use in May 2026 — what the claude mcp add command actually does, which scope to pick, what the deprecation of SSE means in practice, and where Claude Code still falls short.

    The three-scope mental model

    Every MCP server you wire into Claude Code lives at exactly one of three scopes. Get this wrong and you will either leak credentials into git or wonder why your teammate cannot use the same database the AI just queried.

    • Local (default): the server is available only to you, only inside the current project. Config is written into your project’s entry inside ~/.claude.json. Good for project-specific servers like a dev database or a Sentry project key you do not want other repos to inherit.
    • User: the server is available to you across every project on your machine. Also stored in ~/.claude.json. This is where GitHub, search providers, and personal productivity servers belong.
    • Project: the server is written to a .mcp.json file at the repo root and shared with the whole team via git. Claude Code prompts for approval the first time a teammate opens the project — by design, because anyone who can push to the repo can wire a new server into your environment.

    When the same server is defined in more than one scope, Claude Code resolves it in this order: local beats project beats user beats plugin-provided. This is the part that bites people the most. If you have a “github” entry at user scope and someone adds a different “github” entry at project scope in .mcp.json, the project definition wins for that repo. Run claude mcp list when something behaves strangely.

    The commands you actually need

    The CLI is more useful than the docs make it look. Three commands cover ~90% of real setup work:

    # Add a remote HTTP MCP server at user scope (available everywhere)
    claude mcp add --transport http hubspot --scope user https://mcp.hubspot.com/anthropic
    
    # Add a local stdio server scoped only to this project
    claude mcp add my-db -s local -- node ./scripts/db-mcp.js
    
    # Share a server with your team via the repo's .mcp.json
    claude mcp add my-server -s project -- node server.js

    The short flag is -s, the long is --scope. The -- separator is required for stdio servers because everything after it is treated as the literal command to spawn. Forget it and Claude Code will try to interpret your Node arguments as its own flags.

    SSE is dead. Use Streamable HTTP.

    If your MCP server documentation still tells you to use the sse transport, the documentation is stale. The MCP spec dated 2025-03-26 introduced Streamable HTTP and simultaneously deprecated HTTP+SSE. Through 2026, vendor after vendor has set hard cutoff dates — Atlassian’s Rovo MCP server keeps SSE around until June 30, 2026 and then drops it; Keboola pulled SSE on April 1; Cumulocity’s AI Agent Manager flipped to Streamable HTTP on May 8.

    Why this matters beyond a name change: SSE required Claude Code to hold a persistent connection to a single server replica, which broke horizontal scaling and made every transient network blip a reconnection drama. Streamable HTTP is stateless. Multiple replicas behind a load balancer just work. If you have flaky MCP connections in production, the first thing to check is whether the server is still on SSE.

    For new setups, use --transport http. The older --transport sse still functions but is on the deprecation path.

    Tool Search is the feature you should actually care about

    The single biggest change in how Claude Code uses MCP in 2026 is lazy tool loading via Tool Search. Older MCP clients dumped every tool schema from every connected server into the model’s context window at the start of every conversation. With ten servers wired up that could easily be 20,000+ tokens of overhead before you typed a single character.

    Tool Search inverts this. Claude Code keeps only the server names and short descriptions resident. When a tool is actually needed, it fetches that tool’s full schema on demand. Anthropic’s own documentation says this reduces tool-definition context usage by roughly 95% versus eager-loading clients. In practice that means you can run a serious MCP fleet — GitHub, Sentry, a database, a search provider, your internal API — without quietly burning through your context budget. The Sonnet 4.6 and Opus 4.7 1M-token context window does not save you here, because anything you let crowd the prompt is also being re-read on every turn.

    Companion feature: list_changed notifications. An MCP server can now tell Claude Code “my tool list changed” and Claude Code refreshes capabilities without a disconnect-reconnect dance. If you build your own server, emit this when you swap tool definitions and you save users a restart.

    What it still gets wrong

    Honest take: claude mcp list still does not surface scope information for every entry in a useful way — there is an open issue on the anthropics/claude-code repo asking for it (#8288 if you want to track). Project-scoped servers from .mcp.json have a separate history of not appearing in the list output (#5963) depending on how you opened the project. If you cannot find a server, check both ~/.claude.json and ./.mcp.json directly.

    The other rough edge is the project-approval prompt. The first time you open a repo with a new .mcp.json, Claude Code asks you to approve each project-scoped server. That is the right security default. It is also infuriating in CI or any non-interactive shell, where the prompt blocks the session. The current workaround is to bake the servers in at user scope on build agents so the project-scope approval never fires in CI. A cleaner non-interactive approval flow is the single most-requested fix I see in real teams.

    The setup I would run on a new machine today

    User-scope: GitHub, a code search server, and a single notes/Notion server. Project-scope in each repo’s .mcp.json: whatever database the project owns and whatever observability backend it reports to. Local-scope: anything experimental I am evaluating but do not want my team or my other repos to inherit.

    Pin --transport http on everything remote. Skip Desktop Extensions (.dxt) for anything you want versioned with the codebase — they are a Claude Desktop convenience, not a Claude Code primitive, and they hide the config from your team. Run claude mcp list when something is off and read .mcp.json directly when list is unhelpful.

    That is the whole working model. The pieces that matter — three scopes, Streamable HTTP, Tool Search — fit on a single screen. The pieces that have not caught up yet — list output, non-interactive approvals — are visible in the issue tracker and will move.

  • Claude Context Window — Every Question Answered (Complete FAQ 2026)

    Last refreshed: May 15, 2026

    Tygart Media · Claude Context Window Reference

    Claude Context Window — Every Question Answered

    Updated May 9, 2026 · Sizes verified from Anthropic’s official models page · Based on production use

    Context window questions answered from someone who actually uses the 1M token window in production — not from a spec sheet alone.

    Covers window sizes by model, what 1M tokens holds, the memory vs context distinction, performance at long context, and API-specific details. Full explainer: Claude Context Window Size 2026

    Size Questions

    What is Claude’s context window size in 2026?

    Model API String Context Window Max Output
    Claude Opus 4.7 claude-opus-4-7 1,000,000 tokens 128,000 tokens
    Claude Sonnet 4.6 claude-sonnet-4-6 1,000,000 tokens 64,000 tokens
    Claude Haiku 4.5 claude-haiku-4-5-20251001 200,000 tokens 64,000 tokens

    Source: Anthropic’s official models page, verified May 9, 2026.

    What does 1 million tokens actually hold?

    • ~750,000 words of English text — roughly 10 full-length novels, or 1,500 average blog posts
    • A full mid-size codebase — a 50,000-line Python project with comments
    • ~60–100 research PDFs at 20–30 pages each, all simultaneously
    • Hours of meeting transcripts — a full workday of recorded calls, transcribed
    • Our full WordPress site audit — 200+ posts worth of content loaded in one session for comprehensive SEO analysis

    The shift from 200K to 1M wasn’t just “more room.” It changed what we could ask Claude to do in a single session — whole-codebase reasoning, multi-document synthesis, full-history context.

    How many pages can Claude read at once?

    A typical 20-page PDF is roughly 10,000–15,000 tokens, so at 1M tokens you could load 60–100 such documents simultaneously. A 300-page book runs roughly 150,000–200,000 tokens — Claude can hold 5–6 full books in context at once. In practice, the constraint is usually time to upload and your session structure, not the window ceiling.

    What’s the difference between context window and memory?

    Three distinct things that get conflated:

    • Context window: Everything Claude can see right now in this session. Temporary — disappears when the session ends.
    • claude.ai memory: Facts extracted from past conversations and injected as a summary into new sessions. Persistent but compressed — a small snippet in the context, not the full history.
    • Managed Agents memory stores / Dreaming: Developer-layer knowledge graphs that agents build and refine between sessions. More structured than consumer memory, requires API implementation.

    The 1M context window is your working memory for one session. Memory systems are what carry information across sessions — they work by injecting a summary into the new session’s context, not by giving Claude access to the full prior history.


    Performance Questions

    Does performance degrade at very long context lengths?

    The honest answer: yes, somewhat, and it depends on the task. The “lost in the middle” pattern is real — models tend to weight the beginning and end of very long contexts more heavily than the middle. For tasks that require pinpointing specific information buried deep in a 500-page document, performance is lower than for shorter contexts. For tasks that benefit from broad synthesis across a large body of material — architectural review, theme identification, cross-document comparison — long context is a net positive. Structure important information at natural reference points rather than burying it in the middle of a large document.

    How does Opus 4.7’s context window differ from Sonnet 4.6?

    Same 1M input context window. The difference is max output: Opus 4.7 can generate up to 128,000 tokens in a single response; Sonnet 4.6 caps at 64,000. For most tasks this doesn’t matter. It matters for generating very long documents, large codebases in a single pass, or batch outputs that need to be very long. If you’re not generating 64K+ token outputs, choose between models on capability and cost, not on output ceiling.

    What happens when I hit the context window limit?

    Earlier messages begin dropping out of the active context. Claude can no longer reference information from those dropped messages — it effectively forgets that part of the conversation. In the claude.ai interface, you’ll see a notification as you approach the limit. In API usage, the context window limit is enforced hard — requests exceeding it return an error.


    API and Technical Questions

    Is the 1M context window available on the free plan?

    The model available to free plan users supports the 1M window technically, but free plan rate limits mean sustained heavy long-context use hits limits quickly. The window is available; using it intensively for extended periods is more practical on paid tiers.

    What’s the extended output option on the Batch API?

    On the Message Batches API, Opus 4.7, Opus 4.6, and Sonnet 4.6 support up to 300,000 output tokens using the output-300k-2026-03-24 beta header. This applies only to batch processing — not to synchronous API calls. Useful for large documentation generation, book-length content, or large codebase outputs in batch.

    Can I query context window limits programmatically?

    Yes. The Models API returns max_input_tokens, max_tokens, and a capabilities object for every available model. If you’re building systems that need to programmatically enforce context limits or route by capability, this is the right way to get current values rather than hardcoding from documentation.

    Does context window size affect API cost?

    Only indirectly — you pay for tokens consumed, not for context window capacity. A 1M token window doesn’t cost more than a 200K window. You pay for the tokens you actually send and receive. Loading a 500K-token document into context costs the same per token regardless of whether the model has a 200K or 1M window. The window size determines whether the request is possible at all — not what it costs per token.

  • Claude Pricing — Every Question Answered (Complete FAQ 2026)

    Last refreshed: May 15, 2026

    Tygart Media · Claude Pricing Reference

    Claude Pricing — Every Question Answered

    Updated May 9, 2026 · All prices verified from Anthropic’s official pricing page · Model strings current

    Subscription vs. API. Free vs. Pro vs. Max. Managed Agents on top. What actually changed in May 2026. The answers without the marketing layer.

    Covers subscription plans, API token rates, Managed Agents pricing, Claude Security, and the May 2026 rate limit changes. Full pricing page: Claude AI Pricing — All Plans

    Plan Pricing

    What does each Claude plan cost?

    Plan Price Claude Code Best For
    Free $0 Casual / evaluation use
    Pro $20/mo Individual daily power use
    Max 5× $100/mo Heavy individual use, no peak throttle
    Max 20× $200/mo Highest individual ceiling available
    Team Standard $25/seat/mo (annual) · $30 monthly Shared team access, no coding
    Team Premium $100/seat/mo (annual) · $125 monthly Shared team access + coding
    Enterprise Custom Large orgs, custom limits, SSO

    All subscription prices are per-user per-month. Annual billing locks in the lower rate.

    What’s the difference between Pro and Max?

    Same models, same Claude Code access. Max gives you more usage within the 5-hour rolling window — 5× or 20× Pro’s limit depending on tier — and eliminates peak-hours throttling. If you regularly hit Pro’s limits mid-session, Max is the upgrade. If you haven’t hit limits on Pro, you don’t need Max.

    Did the May 2026 SpaceX deal change subscription pricing?

    May 6, 2026Prices unchanged. Limits doubled. Peak-hours throttling eliminated for Pro and Max. Free plan unchanged.

    The SpaceX Colossus 1 compute expansion doubled the 5-hour rate limit ceiling for Pro, Max, Team, and Enterprise — at no price increase. If you’ve been hitting limits and considering upgrading to Max, check first whether the doubled Pro ceiling now fits your workflow.


    API Pricing

    How does API pricing work?

    API pricing is pay-per-token — you pay for what you use, no subscription required. Rates as of May 2026 (verified from Anthropic’s official models page):

    Model API String Input / MTok Output / MTok
    Claude Opus 4.7 claude-opus-4-7 $5 $25
    Claude Sonnet 4.6 claude-sonnet-4-6 $3 $15
    Claude Haiku 4.5 claude-haiku-4-5-20251001 $1 $5

    Batch API discounts, prompt caching rates, and extended thinking costs apply on top — see Anthropic’s full pricing page for those specifics.

    Is subscription or API cheaper for my use case?

    Subscription wins for consistent daily use (claude.ai interface, Claude Code). API wins for variable-volume programmatic use and batch workloads. The breakeven point: if you’re using Claude heavily enough to hit Pro’s limits even weekly, you’re likely consuming more than $20/month in equivalent API tokens. For batch processing at scale, the Batch API with its discount rate is almost always the most cost-efficient path.

    What’s the real cost of Opus 4.7 vs Sonnet 4.6?

    List price: Opus 4.7 is $5/$25 per MTok input/output vs Sonnet 4.6’s $3/$15 — roughly 1.67× more expensive at list. However, Opus 4.7’s tokenizer produces approximately 1.46× more tokens per task than Sonnet 4.6 on typical workloads, meaning real-world Opus 4.7 costs can run meaningfully higher than the list price ratio implies. For most production API workloads, Sonnet 4.6 is the right default. Use Opus 4.7 when the task genuinely requires maximum reasoning and cost is secondary.


    Managed Agents Pricing

    What does Claude Managed Agents cost?

    Two charges: standard API token rates for whatever model you use, plus $0.08 per session-hour of active runtime. That’s the complete formula — no other managed infrastructure fee on top.

    A session-hour is one hour of active session status. Billing is metered to the millisecond. Idle time, time waiting for your input, and time waiting for tool confirmations do not accrue charges.

    Maximum theoretical monthly runtime cost (24/7 agent): 24 hrs × $0.08 × 30 days = $57.60/month. In practice, token costs become the dominant cost driver well before you approach this ceiling.

    Full breakdown: Claude Managed Agents Complete Pricing Reference

    What does web search cost inside a Managed Agents session?

    $10 per 1,000 searches ($0.01 per search), billed separately from session runtime and token costs. Same rate as web search via the standard API.

    What does Dreaming cost?

    Dreaming uses an advisor/executor billing model. The advisor generates a short plan (typically 400–700 tokens) at the advisor model’s rate; the executor handles the full memory reorganization at its rate. Combined cost stays well below running the advisor model end-to-end. Use max_uses to cap advisor calls per request. Dreaming is developer preview — invitation-only access as of May 2026. Docs: platform.claude.com/docs/en/managed-agents/dreams


    Specialty Model Pricing

    What does Claude Mythos Preview cost?

    $25 per million input tokens, $125 per million output tokens. Invitation-only through Project Glasswing — no self-serve access. Contact Anthropic at anthropic.com/glasswing. Claude Mythos is not available through any subscription tier or standard API access.

    Is Claude Security Beta included in my plan?

    Claude Security Beta is available to all Enterprise customers during the beta period — included as part of Enterprise, no separate per-scan fee. Underlying model is Opus 4.7 ($5/$25 per MTok at API rates). For Enterprise pricing including Claude Security, contact Anthropic sales. Standard API users do not have access during beta.

  • Claude Code — Every Question Answered (Complete FAQ 2026)

    Last refreshed: May 15, 2026

    Tygart Media · Claude Code Reference

    Claude Code — Every Question Answered

    Updated May 9, 2026 · Verified against Anthropic docs · Claude Code v2.1.133

    No preamble. If you’re here, you’re trying to install Claude Code, figure out pricing, or understand what changed. Here are the actual answers.

    This page covers installation, pricing by plan, what’s new in 2026, and the questions that don’t have clean homes in Anthropic’s documentation. Updates as Claude Code ships new versions — currently tracking weekly releases.

    Pricing Questions

    How much does Claude Code cost?

    Claude Code has no separate subscription fee. Access is included in these Claude plans:

    Plan Monthly Cost Claude Code Rate Limits
    Free $0 ❌ Not included
    Pro $20 ✅ Included 5-hr window, doubled May 2026
    Max (5×) $100 ✅ Included 5× Pro limits, no peak throttle
    Max (20×) $200 ✅ Included 20× Pro limits, no peak throttle
    Team Standard $25/seat ❌ Not included
    Team Premium $100/seat ✅ Included 6.25× Pro limits, doubled May 2026
    Enterprise Custom ✅ Included Custom

    API usage (tokens consumed by Claude Code) is billed separately at standard API rates on top of your subscription. For most users, subscription is the dominant cost.

    Is there a Claude Code student discount or Amazon Prime bundle?

    No. As of May 2026, there is no Claude Code-specific student discount and no Amazon Prime Student bundle that includes Claude Code. Pro at $20/month is the cheapest plan that includes Claude Code access. See the full student discount guide for what legitimate options exist for reducing cost.

    What did the May 2026 SpaceX deal change for Claude Code users?

    May 6, 2026 UpdatePeak-hours throttling eliminated for Pro and Max. 5-hour rate limits doubled for Pro, Max, Team Premium, and Enterprise. Free plan unchanged.

    If you’ve been hitting limits during long agentic runs or multi-file refactors, the ceiling is now twice as high. Source: anthropic.com/news/higher-limits-spacex


    Installation Questions

    What are the system requirements for Claude Code?

    • Node.js 18+ required (Node.js 20+ recommended)
    • macOS, Linux, or Windows (Windows support GA as of April 2026 — PowerShell is now the default shell, Git Bash no longer required)
    • Active Anthropic account on a plan that includes Claude Code (Pro, Max, Team Premium, or Enterprise)

    How do I install Claude Code?

    One command:

    npm install -g @anthropic-ai/claude-code

    Then authenticate:

    claude

    Full installation walkthrough with troubleshooting: How to Install Claude Code

    How do I update Claude Code to the latest version?

    npm update -g @anthropic-ai/claude-code

    Current version as of May 9, 2026: v2.1.133 (released May 7, 23:49 UTC). Check your version with claude --version.

    What’s in the latest Claude Code release?

    v2.1.133 (May 7, 2026) key changes:

    • Subagent skill discovery fix — subagents now correctly find project, user, and plugin skills via the Skill tool. Previously a silent failure that broke multi-agent pipelines without obvious error.
    • worktree.baseRef setting (fresh | head) — controls whether EnterWorktree branches from origin/<default> or local HEAD. Default is fresh — this changes prior behavior if you relied on EnterWorktree inheriting unpushed commits.
    • Hooks now receive active effort level via effort.level JSON field and $CLAUDE_EFFORT env var
    • Memory improvement: warm-spare background workers release under memory pressure
    • Fixed parallel sessions hitting 401 from a refresh-token race

    Full release notes: github.com/anthropics/claude-code/releases


    Model Questions

    Which Claude model does Claude Code use?

    By default, Claude Code uses the model Anthropic recommends for coding tasks — currently claude-sonnet-4-6 for most operations, with claude-opus-4-7 available for complex reasoning tasks. The v2.1.126 gateway model picker lets you configure multi-model routing. Current model strings (verified from Anthropic docs):

    • claude-opus-4-7 — most capable, 1M context, 128K max output
    • claude-sonnet-4-6 — balanced speed/intelligence, 1M context, 64K max output
    • claude-haiku-4-5-20251001 — fastest, 200K context

    What happens when Claude Sonnet 4 and Opus 4 retire June 15, 2026?

    If you have any Claude Code configuration or scripts pinning the 20250514 date-string model IDs, those will break. Claude Code’s default model routing will update automatically — but custom configurations pointing to specific deprecated strings won’t. Search your config files for 20250514 now and update to claude-sonnet-4-6 or claude-opus-4-7.


    Capability Questions

    What is Claude Code actually good at vs. not good at?

    Strong: Multi-file refactors, understanding existing codebases, writing tests against real code, debugging with full context, long-horizon tasks that require holding many files in mind simultaneously, architectural reasoning across a full project.

    Less strong: Tasks requiring real-time external data without a tool, highly specialized domain knowledge that isn’t well-represented in training, generating correct code for very niche frameworks with limited documentation.

    Can Claude Code run terminal commands on my machine?

    Yes — with your permission. Claude Code operates in a permission model where it asks before running commands, editing files, or taking actions outside the current working directory. You configure which operations auto-approve and which require confirmation. The claude CLI runs with your local user permissions, not elevated ones.

    What is computer use in Claude Code?

    Computer use (research preview as of April 2026) lets Claude Code open native apps, navigate desktop UI, click through interfaces, and verify results from the terminal — without needing an API or automation script. Available on macOS and Windows within the Cowork desktop app. Useful for tools with no accessible API; slower than direct API integrations when those exist.

    What’s the difference between Claude Code CLI and Claude Code in the IDE?

    The CLI (claude command) is the core product — works in any terminal, any OS, any project. IDE extensions (VS Code, JetBrains) provide UI integration on top of the same underlying capability. Both use the same authentication and the same model. The CLI is the authoritative version for anything involving automation, scripts, or multi-step agentic workflows.