Tag: AI Context

  • Claude Models Roadmap May 2026: Opus 4.7, Knowledge Cutoffs, the 1M Context Window, and What’s Real About Claude 5

    Claude Models Roadmap May 2026: Opus 4.7, Knowledge Cutoffs, the 1M Context Window, and What’s Real About Claude 5

    Last refreshed: May 15, 2026

    The pace of new Claude releases in 2026 has been fast enough that the canonical question — “what’s the latest Claude model and what’s it actually good for?” — has a different answer almost every quarter. This article is the current map, dated and sourced, of what Anthropic has shipped in 2026, what’s confirmed about each model’s specs and knowledge cutoffs, and what’s been claimed (but not officially confirmed by Anthropic) about what’s coming next.

    Two ground rules first, because the model-roadmap space is full of speculation:

    • Specs and release dates marked as verified come from Anthropic’s own documentation, news posts, or help center pages. We list the specific source.
    • Anything marked as reported or claimed comes from third-party reporting (TechCrunch, secondary news sites, analyst commentary) that we could not independently confirm against an Anthropic-published source as of May 15, 2026.

    If you’re making product decisions on this information, treat verified facts as actionable and reported facts as directional.

    The current generally-available Claude models (May 15, 2026)

    From Anthropic’s official models overview and pricing pages, the current production Claude lineup is:

    Claude Opus 4.7claude-opus-4-7

    • Status: Generally available, currently the most capable Claude model
    • Context window: 1 million tokens at standard pricing (no long-context premium)
    • Max output: 128,000 tokens
    • Knowledge cutoff: January 2026 (per Anthropic Help Center, verified May 15, 2026)
    • Pricing: $5/MTok input, $25/MTok output (base rates)
    • Notable changes from 4.6: New tokenizer (uses up to ~35% more tokens for the same text), high-resolution image support up to 2576px / 3.75MP, new xhigh effort level, task budgets beta. Extended thinking budgets and sampling parameters (temperature, top_p, top_k) are removed.

    Claude Opus 4.6 — Still generally available, $5/MTok input, $25/MTok output. Released February 2026.

    Claude Sonnet 4.6 — $3/MTok input, $15/MTok output. Includes the 1M token context window at standard pricing.

    Claude Haiku 4.5 — Cheapest model in the active lineup at $1/MTok input, $5/MTok output.

    Earlier models still active or in deprecation: Opus 4.5, Opus 4.1, Sonnet 4.5, and Haiku 3.5 (retired except on Bedrock and Vertex AI). Opus 4 and Sonnet 4 are listed as deprecated.

    Knowledge cutoff dates that actually matter

    Per Anthropic’s Help Center article on training-data recency (verified May 15, 2026), the most recent generally-available models have January 2026 knowledge cutoffs. That means:

    • Anything that happened after January 2026 is outside the model’s training data
    • For current events, recent product launches, recent legal or regulatory changes, or very recent technical documentation, the model needs to be given the information directly (in the prompt, via web search, or through tool use) — it can’t be relied on to know it
    • The model still has tools available (web search, code execution, file access) that can access post-cutoff information when explicitly invoked

    The practical version: don’t ask Claude what happened last week and expect it to know. Hand it the source material and ask it to analyze, summarize, or work with what you’ve given it.

    The 1M token context window — what it actually unlocks

    Per Anthropic’s official pricing documentation (verified May 15, 2026), Opus 4.7, Opus 4.6, and Sonnet 4.6 all include the full 1 million token context window at standard pricing. There’s no long-context premium — a 900,000-token request is billed at the same per-token rate as a 9,000-token request.

    That’s an enormous practical change from earlier Claude generations. A 1M context window is roughly:

    • ~750,000 words of English text
    • Most full books or technical specifications in a single context
    • ~8 hours of meeting transcripts at typical density
    • An entire mid-sized codebase, including most or all source files

    Prompt caching and batch processing discounts both apply at standard rates across the full 1M window. For workloads that involve sending the same large document repeatedly with different questions, prompt caching against a 1M context is one of the highest-leverage cost optimizations available in the current Claude lineup.

    What’s reported about Claude 5 (and what we cannot independently verify)

    Multiple third-party sources reported in early 2026 that Anthropic CEO Dario Amodei confirmed a Q2 2026 launch window for Claude 5 in a TechCrunch interview published February 1, 2026. The same sources cited an internal-roadmap leak suggesting an April 28 target date.

    What we can verify as of May 15, 2026:

    • Anthropic’s official model lineup, news page, and platform documentation list the latest production models as Opus 4.7 and earlier 4.x variants. Anthropic has not, to our review, published an official “Claude 5” launch announcement on its anthropic.com news page or its docs.claude.com release notes as of this date.
    • The third-party reporting on Claude 5 specifications (500K context window, 20-25% benchmark improvements, ~90%+ on SWE-bench Verified) is widely repeated but, as far as we could verify, is not sourced to an Anthropic-published document.

    The honest read: Q2 2026 ends June 30, so if the reported timeline is accurate, an official Claude 5 announcement could plausibly land in the next several weeks. If you’re planning a project that depends on a specific Claude 5 capability, build against current Opus 4.7 first and treat any Claude 5-specific work as speculative until Anthropic publishes official model details.

    Claude Sonnet 5 — separate question

    Some 2026 third-party reporting refers to “Claude Sonnet 5” launching in early February 2026 under an internal codename. We could not, in our May 15, 2026 review, find this model listed in Anthropic’s official models overview, pricing page, or release notes — only Sonnet 4.6 and earlier Sonnet variants are listed as currently available models. If “Sonnet 5” was a real intermediate release, it does not appear in Anthropic’s current public model documentation under that name.

    Two possibilities to consider, neither of which we can confirm: the reported Sonnet 5 may have been folded into the broader 4.x lineup under a different name, or the reporting may have been speculative or premature. If you’re tracking model identifiers for production use, only model IDs published in Anthropic’s documentation (such as claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5) are guaranteed to be valid against the API.

    How to actually keep up with Claude releases

    The signal-to-noise ratio in the model-release coverage space is not great. Two practical sources are reliable enough to bookmark:

    • Anthropic’s news page at anthropic.com/news — first-party launch announcements with full model details
    • Claude API release notes at the Help Center release-notes page — concise, dated, version-specific

    For breaking changes that affect production code, the Anthropic platform documentation publishes per-version “What’s new” pages (Opus 4.7’s, for example, lists every API breaking change at launch). Those are the canonical reference for migration work.

    For everything else — analyst commentary, predictions, leak coverage — treat it as commentary, not as fact.

    What this means for your work today

    Based on what is verifiable on May 15, 2026:

    • If you need the most capable Claude model available, use Opus 4.7. It has the largest context window, the highest knowledge cutoff (January 2026), and the strongest reported coding/agentic performance.
    • If you need cost-efficient production work, use Sonnet 4.6. Same 1M context, much lower per-token rates than Opus.
    • If you need cheap, fast, simple-task workloads, use Haiku 4.5.
    • If you’re planning around Claude 5, treat the timing as unconfirmed and build resilience into your code (don’t hard-code model IDs that don’t exist yet).
    • For knowledge cutoff-sensitive use cases (current events, recent regulatory data, post-January 2026 news), always provide the information directly or use tool calls — don’t rely on training data alone.

    Frequently Asked Questions

    What is the knowledge cutoff for Claude Opus 4.7?

    January 2026, per Anthropic’s Help Center documentation verified May 15, 2026. Information about events, products, or developments after that date is not in the model’s training data and must be provided directly.

    What is the largest Claude context window currently available?

    1 million tokens, available on Opus 4.7, Opus 4.6, and Sonnet 4.6 at standard pricing with no long-context premium.

    Has Anthropic officially announced Claude 5?

    As of May 15, 2026, we could not locate an Anthropic-published announcement of a Claude 5 model on anthropic.com or docs.claude.com. Multiple third-party sources have reported a Q2 2026 launch window based on a TechCrunch interview with Dario Amodei, but we could not independently confirm those specifications against a primary source.

    Is Claude Sonnet 5 a real model I can use?

    As of May 15, 2026, “Claude Sonnet 5” does not appear in Anthropic’s official models overview or pricing documentation. The currently available Sonnet model is Claude Sonnet 4.6 (model ID claude-sonnet-4-6). Earlier reports of a Sonnet 5 release were not confirmed against an Anthropic-published source in our review.

    Why does Opus 4.7 use more tokens than Opus 4.6 for the same text?

    Opus 4.7 ships with a new tokenizer that contributes to its improved performance but uses approximately 1x to 1.35x as many tokens for the same input text compared to previous models. Anthropic recommends increasing max_tokens headroom and adjusting compaction triggers accordingly.

    Are sampling parameters (temperature, top_p, top_k) still supported on Opus 4.7?

    No. Setting temperature, top_p, or top_k to any non-default value on Opus 4.7 returns a 400 error. Migration guidance: omit these parameters and use prompting to guide the model’s behavior.

    Related Reading

    How we sourced this

    Sources reviewed May 15, 2026:

    • Anthropic Pricing Documentation: docs.claude.com/en/docs/about-claude/pricing (primary source for model lineup, per-token rates, context window pricing)
    • Anthropic Platform Documentation: What’s new in Claude Opus 4.7 (primary source for Opus 4.7 features, breaking changes, tokenizer, image support, task budgets)
    • Anthropic Help Center: How up-to-date is Claude’s training data? (primary source for knowledge cutoff dates)
    • Anthropic news page (primary source check for Claude 5 announcement — none located as of May 15, 2026)
    • Third-party reporting on Claude 5 / Sonnet 5 (TechCrunch interview reports, Claude5.com, Fello AI, WaveSpeed Blog) — cited as reported but not independently confirmed against primary sources

    This article applies the verified vs. reported distinction throughout. If any of the unverified third-party claims are confirmed by Anthropic in the weeks after this article’s date stamp, the relevant sections should be updated to reflect the new primary-source documentation.

  • Claude MCP in 2026: What Actually Changed and How to Configure It Without Wasting Tokens

    Claude MCP in 2026: What Actually Changed and How to Configure It Without Wasting Tokens

    Last refreshed: May 15, 2026

    If you set up Claude MCP six months ago and have not touched the config since, three things have changed underneath you: the recommended transport, how tools are loaded into context, and how teams share server configs. None of these are cosmetic. If you ignore them, you are leaving tokens, money, and stability on the table.

    This is the working Claude MCP setup I use in May 2026 — what the claude mcp add command actually does, which scope to pick, what the deprecation of SSE means in practice, and where Claude Code still falls short.

    The three-scope mental model

    Every MCP server you wire into Claude Code lives at exactly one of three scopes. Get this wrong and you will either leak credentials into git or wonder why your teammate cannot use the same database the AI just queried.

    • Local (default): the server is available only to you, only inside the current project. Config is written into your project’s entry inside ~/.claude.json. Good for project-specific servers like a dev database or a Sentry project key you do not want other repos to inherit.
    • User: the server is available to you across every project on your machine. Also stored in ~/.claude.json. This is where GitHub, search providers, and personal productivity servers belong.
    • Project: the server is written to a .mcp.json file at the repo root and shared with the whole team via git. Claude Code prompts for approval the first time a teammate opens the project — by design, because anyone who can push to the repo can wire a new server into your environment.

    When the same server is defined in more than one scope, Claude Code resolves it in this order: local beats project beats user beats plugin-provided. This is the part that bites people the most. If you have a “github” entry at user scope and someone adds a different “github” entry at project scope in .mcp.json, the project definition wins for that repo. Run claude mcp list when something behaves strangely.

    The commands you actually need

    The CLI is more useful than the docs make it look. Three commands cover ~90% of real setup work:

    # Add a remote HTTP MCP server at user scope (available everywhere)
    claude mcp add --transport http hubspot --scope user https://mcp.hubspot.com/anthropic
    
    # Add a local stdio server scoped only to this project
    claude mcp add my-db -s local -- node ./scripts/db-mcp.js
    
    # Share a server with your team via the repo's .mcp.json
    claude mcp add my-server -s project -- node server.js

    The short flag is -s, the long is --scope. The -- separator is required for stdio servers because everything after it is treated as the literal command to spawn. Forget it and Claude Code will try to interpret your Node arguments as its own flags.

    SSE is dead. Use Streamable HTTP.

    If your MCP server documentation still tells you to use the sse transport, the documentation is stale. The MCP spec dated 2025-03-26 introduced Streamable HTTP and simultaneously deprecated HTTP+SSE. Through 2026, vendor after vendor has set hard cutoff dates — Atlassian’s Rovo MCP server keeps SSE around until June 30, 2026 and then drops it; Keboola pulled SSE on April 1; Cumulocity’s AI Agent Manager flipped to Streamable HTTP on May 8.

    Why this matters beyond a name change: SSE required Claude Code to hold a persistent connection to a single server replica, which broke horizontal scaling and made every transient network blip a reconnection drama. Streamable HTTP is stateless. Multiple replicas behind a load balancer just work. If you have flaky MCP connections in production, the first thing to check is whether the server is still on SSE.

    For new setups, use --transport http. The older --transport sse still functions but is on the deprecation path.

    Tool Search is the feature you should actually care about

    The single biggest change in how Claude Code uses MCP in 2026 is lazy tool loading via Tool Search. Older MCP clients dumped every tool schema from every connected server into the model’s context window at the start of every conversation. With ten servers wired up that could easily be 20,000+ tokens of overhead before you typed a single character.

    Tool Search inverts this. Claude Code keeps only the server names and short descriptions resident. When a tool is actually needed, it fetches that tool’s full schema on demand. Anthropic’s own documentation says this reduces tool-definition context usage by roughly 95% versus eager-loading clients. In practice that means you can run a serious MCP fleet — GitHub, Sentry, a database, a search provider, your internal API — without quietly burning through your context budget. The Sonnet 4.6 and Opus 4.7 1M-token context window does not save you here, because anything you let crowd the prompt is also being re-read on every turn.

    Companion feature: list_changed notifications. An MCP server can now tell Claude Code “my tool list changed” and Claude Code refreshes capabilities without a disconnect-reconnect dance. If you build your own server, emit this when you swap tool definitions and you save users a restart.

    What it still gets wrong

    Honest take: claude mcp list still does not surface scope information for every entry in a useful way — there is an open issue on the anthropics/claude-code repo asking for it (#8288 if you want to track). Project-scoped servers from .mcp.json have a separate history of not appearing in the list output (#5963) depending on how you opened the project. If you cannot find a server, check both ~/.claude.json and ./.mcp.json directly.

    The other rough edge is the project-approval prompt. The first time you open a repo with a new .mcp.json, Claude Code asks you to approve each project-scoped server. That is the right security default. It is also infuriating in CI or any non-interactive shell, where the prompt blocks the session. The current workaround is to bake the servers in at user scope on build agents so the project-scope approval never fires in CI. A cleaner non-interactive approval flow is the single most-requested fix I see in real teams.

    The setup I would run on a new machine today

    User-scope: GitHub, a code search server, and a single notes/Notion server. Project-scope in each repo’s .mcp.json: whatever database the project owns and whatever observability backend it reports to. Local-scope: anything experimental I am evaluating but do not want my team or my other repos to inherit.

    Pin --transport http on everything remote. Skip Desktop Extensions (.dxt) for anything you want versioned with the codebase — they are a Claude Desktop convenience, not a Claude Code primitive, and they hide the config from your team. Run claude mcp list when something is off and read .mcp.json directly when list is unhelpful.

    That is the whole working model. The pieces that matter — three scopes, Streamable HTTP, Tool Search — fit on a single screen. The pieces that have not caught up yet — list output, non-interactive approvals — are visible in the issue tracker and will move.

  • The Context That Lives Between People

    The Context That Lives Between People

    There’s a simple version of the AI-in-organizations problem that’s wrong: you build the system, give it access to the right data, write a thorough system prompt, and it operates in your organizational context. The prompt is the context. The context is the prompt.

    This framing is everywhere. It’s also the reason most organizational AI deployments produce work that is technically correct and somehow off.

    The context that matters — the context that determines whether a decision lands right, whether a draft feels aligned, whether a flagged opportunity is genuinely actionable — is not stored anywhere. It lives between people.


    Every organization operates on a layer of standing assumptions that nobody explicitly maintains and nobody could fully articulate on request. Not values, not principles, not priorities — something below those. The interpretive substrate that makes the documented values mean anything.

    When someone joins a team and violates one of these assumptions — proposes the wrong thing in the wrong meeting, pushes a decision that is technically within their authority but somehow not theirs to make, surfaces a priority the organization agreed to de-emphasize without announcing it — everyone feels it. The violator usually doesn’t. The substance was fine. Something else was wrong.

    That something else is the context AI systems don’t have.


    Documentation can encode explicit knowledge. It cannot encode the community that makes the documentation mean anything.

    A system prompt can say “this organization prioritizes speed over perfect.” What it cannot encode is whether that norm has actually been consistent for the last six months, or whether leadership has been quietly walking it back after three bad launches, or whether it applies to customer-facing work but not internal infrastructure, or whether the one person whose approval you need is the one exception to the norm.

    The standing assumptions are not stored. They are enacted. They show up in what gets committed to and what sits in the inbox for thirty days.

    Watch a team’s queue long enough and you can read the context. Not from the items themselves — from the pattern of what moves and what doesn’t. Stalled items tell you which commitments have real backing and which are aspirational. Rapid movement in one lane tells you where the actual authority is concentrated. The gap between what the organization says it prioritizes and what it actually processes is a map of the standing assumptions it hasn’t named.

    A single operator can solve this. They can read the board, feel the friction, and say: the predicate is wrong. The item needs to be reframed before it moves. They can do this because they hold the context in their own head, accumulated over months, updated daily.

    A team cannot do this as easily. The context is distributed. Each person holds part of it. The standing assumptions live in the gaps between what anyone would say individually. Ask the team to write down why something has been stalled for thirty days and you’ll get five different answers, each of which is partially true, none of which is sufficient.


    The naive solution is documentation. Write the standing assumptions down. Build a better system prompt. Give the AI more context.

    This helps at the margins. It doesn’t solve the problem.

    Documentation of standing assumptions produces a different artifact — a curated version of the context, shaped by whoever did the writing, frozen at the moment of writing, immediately in tension with the organizational reality it was supposed to encode. It becomes a reference document. The context moves on. The document does not.

    The less naive solution — the one organizations rarely take — is to treat context as an ongoing artifact rather than a static one. Not a document but a practice. Something that gets updated not when someone decides to update it, but when a decision is made that the prior version couldn’t have predicted.

    Every time a team makes a decision that would have surprised an outside observer, that decision contains information about the organizational context. The surprise is the data. The question is whether anyone captures it — not as documentation but as signal, living in the same system as the work itself.

    This is not how most organizational AI deployments are built. They treat context as given — encoded once, referenced forward. The system prompt goes stale six weeks in and nobody notices because the outputs are still technically correct. The work product is fine. The alignment is drifting.


    A system that can only read your context is a tool. A system that reads the gaps between your documented context and your actual decisions is starting to understand something harder to name.

    The implication isn’t that AI systems need more access. More access to documented context doesn’t help if the relevant context isn’t documented. The implication is that organizational deployment requires a different architecture: one where the context layer is treated as a first-class input that needs active maintenance, and where the signal for updating it is not a calendar prompt but a decision that contradicts the prior version.

    This is harder to build than a thorough system prompt. It requires the organization to treat its own implicit knowledge as an artifact worth maintaining — which means surfacing it, which requires the uncomfortable process of naming standing assumptions that everyone was benefiting from not naming.

    The systems that work at organizational scale will have solved this. Not by encoding context better but by treating context as a process rather than a state.


    Prior pieces in this series have addressed the individual operator: memory as infrastructure, capture versus commitment, the discipline of waiting. Those all assumed a single person holding the context in their own head, updated daily, acted on personally.

    The team changes the shape of the problem. Not because teams are harder — though they are — but because the context is no longer located anywhere. It exists only in the aggregate of how the team behaves, and that aggregate is not readable from any single vantage point, including the AI’s.

    The context lives between people. You cannot put it in the prompt. The first step is admitting that.

    The second step — what an organization can actually do about it — is less clean than any framework suggests, and probably requires a different piece.

  • The Context Stack: How I Give Claude Memory Across 27 Sites and 6 Businesses

    The Context Stack: How I Give Claude Memory Across 27 Sites and 6 Businesses

    Last refreshed: May 15, 2026

    The most common question I get from people who read the Split-Brain Architecture piece is some version of: how does Claude actually know what it’s working on? If you are managing 27 sites, 6 businesses, and hundreds of ongoing tasks, how do you avoid spending the first ten minutes of every session re-explaining your entire operation to an AI that has no memory of yesterday?

    The answer is what I call the Context Stack. It is not a single file or a single tool — it is a layered system where each layer handles a different time horizon of memory, and Claude reads exactly what it needs for the task at hand without being overwhelmed by everything else.

    The Problem With AI Memory

    Claude does not have persistent memory across sessions by default. Every conversation starts blank. For someone running a simple use case — drafting an email, summarizing a document — this is fine. For someone running a content network across 27 WordPress sites with different brand voices, different SEO strategies, different clients, and different publishing schedules, a blank slate every session is an operational catastrophe.

    The naive solution is to paste a giant context document at the start of every conversation. I tried this. It doesn’t work. Not because Claude can’t read it — it can — but because a 5,000-word context dump at the start of every session is cognitively expensive for the human, slows down the first response, and buries the relevant information under a pile of irrelevant information.

    The right solution is a stack: different layers of context loaded at different times, for different purposes.

    Layer One — The Global Layer (Always Loaded)

    The global layer is the context that is true across everything I do, all the time. It lives in a CLAUDE.md file at the workspace root and in a persistent system prompt inside Claude’s project settings.

    What goes here: my name, my email, the fact that I manage a network of WordPress sites, the Notion workspace structure, the proxy URL and authentication pattern for WordPress API calls, and a handful of behavioral rules that apply universally — brevity preferences, how I want work logged, what “done” means to me.

    What does not go here: anything site-specific, client-specific, or task-specific. The global layer is 200 lines maximum. Anthropic’s own guidance on CLAUDE.md length is right — longer files reduce adherence. I treat the 200-line limit as a hard constraint, not a guideline.

    Layer Two — The Site Layer (Loaded Per Project)

    Each WordPress site I manage has its own Claude Project, and each project has its own knowledge files. These files contain everything Claude needs to work on that specific site without me having to explain it: the brand voice, the target audience, the top-performing content, the internal linking structure, the credentials, the publishing cadence, and the current content roadmap.

    I generate these files programmatically when I onboard a new site. They pull from the WordPress REST API, the site’s GA4 data, and the Notion database for that client. A site knowledge file for an established site runs about 800–1,200 words. Claude reads it at the start of any session for that project and immediately knows the difference between how to write for a Houston restoration contractor versus a New York luxury lender.

    The site layer is why I can switch from working on a restoration contractor to a luxury lender to a live comedy platform in the same afternoon without losing context. The context travels with the project, not with me.

    Layer Three — The Task Layer (Loaded On Demand)

    The task layer is ephemeral. It is the specific context for the thing I am doing right now: the article brief, the GA data from this session, the list of posts that need refreshing, the client’s feedback on last week’s content.

    This layer lives nowhere permanent. I paste it into the conversation, Claude uses it, and when the session ends it is gone. The task layer is intentionally disposable. If it matters beyond this session, it gets promoted to the site layer or the global layer. If it doesn’t matter beyond this session, it doesn’t need to be stored.

    Most AI users try to make everything permanent. The discipline of the context stack is knowing what deserves permanence and what doesn’t.

    Layer Four — The Second Brain (Asynchronous)

    The second brain layer is Notion. It is not loaded into Claude’s context window directly — it is queried via the Notion MCP when Claude needs specific information.

    What lives here: every session log, every publish log, every piece of competitive intelligence, every client preference that has emerged over time, the Promotion Ledger for autonomous behaviors, the Second Brain database of extracted knowledge from prior sessions.

    The key distinction: Notion is not context I push into Claude. It is context Claude pulls from Notion when it needs it. The MCP connection means Claude can search the Second Brain mid-session, find a relevant prior session log, and use it — without me having to remember that the prior session happened.

    This is the layer that makes the system feel like it has long-term memory even though it doesn’t. Claude doesn’t remember. But it can look things up, and the things worth looking up are stored.

    What This Looks Like In Practice

    A typical session for me starts with a project context already loaded (site layer). Within thirty seconds Claude knows which site it’s working on, what voice to use, and what the current priorities are. I drop in the task layer — a GA report, a list of post IDs, a brief — and we are working within two minutes of starting.

    When something important happens — a new client preference, a site credential change, a strategy decision — I say “log this to Notion” and Claude writes it to the Second Brain. I don’t maintain the second brain manually. Claude maintains it as a byproduct of doing the work.

    When I need to recall something from months ago — what we decided about the internal linking structure for a specific site, what the client said about their brand voice in March — Claude searches Notion and finds it. The retrieval is imperfect but it is dramatically better than my own memory.

    The Honest Constraints

    This system took months to build and it is still not finished. The site knowledge files need updating when strategies change and I don’t always remember to update them. The Second Brain has gaps where sessions weren’t logged properly. The global CLAUDE.md drifts toward bloat and needs periodic pruning.

    The bigger constraint is that this architecture assumes you are operating at a certain scale — multiple sites, multiple clients, recurring workflows. If you are running one site for one business, the overhead of building and maintaining this stack is probably not worth it. A well-written CLAUDE.md and a single Notion page of context will get you most of the way there.

    But if you are scaling past three or four sites, or if you find yourself re-explaining the same context in every session, the stack pays for itself quickly. The ten minutes you spend building a site knowledge file saves you two minutes per session indefinitely.

    The goal is not to give Claude everything. The goal is to give Claude exactly what it needs, when it needs it, at the right layer of permanence.

    Building Your Own Context Stack?

    Email me what you are managing and I will tell you which layers you actually need.

    Most people over-engineer the global layer and under-invest in the site layer. Five minutes of conversation usually fixes it.

    Email Will → will@tygartmedia.com

  • Second-Brain Architecture in the Age of Notion Agents

    Second-Brain Architecture in the Age of Notion Agents

    Second-Brain Architecture in the Age of Notion Agents

    The 60-second version

    The pre-AI second brain was a personal information system. The post-AI second brain is a personal information system that an agent can also navigate. The two are different. A pile of brilliant unstructured notes is great for human recall and useless for agent synthesis. The shift is structural: more databases, fewer floating pages; controlled tags instead of free-text; cross-links between related items; an explicit glossary. Most second brains need to be partially rebuilt to work as agent substrate.

    What changes with agents in the picture

    Pre-agent, the second brain optimization was retrieval-for-humans: how fast can I find the thing I’m looking for. Post-agent, it’s retrieval-for-agents: how reliably can the agent find and synthesize across the right things without human guidance.
    These are different optimizations. Humans use intuition, recent memory, and visual scanning. Agents use semantic search, structured queries, and link traversal. A second brain optimized for one isn’t optimized for the other.

    Five structural shifts

    1. Pages → Databases. Floating pages don’t query well. Databases with consistent properties do. If you have a “books I’ve read” pile of pages, convert it to a database with author, genre, key insight, related-projects properties.
    2. Free tags → Controlled vocabulary. Twenty variations of “client” produces an agent that misses things. One canonical “Client” tag with defined scope works.
    3. Standalone pages → Cross-linked graph. Notion’s link system is the agent’s navigation. A new page should link to at least 2-3 related existing pages. Pages with no inbound or outbound links are dead to the agent.
    4. Implicit conventions → Explicit glossary. A page that captures “this is what we call things and how we structure projects” gives the agent rules instead of guesses.
    5. Recent-memory archives → Continuously enriched archives. Old projects shouldn’t decay. AI Autofill can re-summarize, re-tag, and re-cross-link old pages so they stay queryable.

    The agent-aware folder structure

    A workable shape for an agent-friendly second brain:
    Daily notes (database, dated, freeform — agent reads these for context)
    Projects (database, named, with status, owner, timeline — agent works against these)
    People (database, names, relationships, last interaction — agent uses for personalization)
    Sources (database, URLs, key insights, related-projects — agent cites these)
    Glossary (single page or small database — agent’s vocabulary anchor)
    Decisions log (database, dated, with context — agent’s history)
    Six structures. That’s it. Most second-brain sprawl can be consolidated to this.

    What this enables

    Once the structure is in place, agents do things that feel like magic:
    – “What did we decide about X six months ago?” returns the actual decision plus the context.
    – “Summarize what I’ve learned about Y this year” produces a real synthesis.
    – “Draft a brief on Z” pulls from sources, projects, decisions, and prior work.
    None of this works without the substrate. All of it is trivial with it.

    What to read next

    Editorial Surface Area, Gates Before Volume, AI-Native Company Patterns.

  • Editorial Surface Area: Why Notion AI Only Works as Well as Your Inputs

    Editorial Surface Area: Why Notion AI Only Works as Well as Your Inputs

    Editorial Surface Area: Why Notion AI Only Works as Well as Your Inputs

    The 60-second version

    Notion AI doesn’t make you smarter. It makes your existing editorial infrastructure faster. If your workspace is well-organized, well-tagged, and well-written, the agent produces output that feels like a sharp teammate. If your workspace is sparse, contradictory, or under-tagged, the agent produces output that feels generic. Editorial Surface Area is the operator’s term for the substrate the agent runs on. The smartest move before scaling agents is widening that surface — not buying more credits.

    Why this matters more than tooling debates

    Most operator conversations about AI fixate on which model is best, which platform is winning, and which prompts to use. Those debates miss the underlying mechanic: the agent’s output is a function of the input substrate. A great agent on a thin substrate produces thin work. A mediocre agent on a deep substrate produces strong work. The substrate is the leverage point.
    This is why two operators using the same Notion AI on the same plan get wildly different value. The one with three years of organized project notes, tagged client databases, and structured meeting archives gets an agent that can synthesize anything. The one who joined Notion last month and hasn’t filled in fields gets an agent that hallucinates plausibly.

    What editorial surface area actually consists of

    Five layers, in rough order of impact:
    1. Structured databases with consistent properties. Not pages, databases. With named columns, controlled vocabularies, and reliable filling. This is the substrate agents query best.
    2. Cross-linked pages. Pages that reference each other through Notion’s link system give the agent a navigable graph. Standalone pages are dead ends.
    3. Tagged content with controlled taxonomy. Tags only help if they’re consistent. Twenty different spellings of “client” produces an agent that can’t find anything.
    4. Written-down conventions. A page that says “this is how we name projects, this is how we structure client folders” gives the agent the rules of your house.
    5. Historical archives. Old meeting notes, decided projects, retired playbooks. Agents synthesize patterns from history. The deeper the archive, the better the synthesis.

    The operator’s mistake

    The mistake is treating AI as a substitute for editorial work rather than as an amplifier of it. The pattern goes:
    1. Operator decides to “use AI more”
    2. Operator turns on Custom Agents
    3. Outputs feel underwhelming
    4. Operator concludes AI isn’t ready
    5. Real conclusion: the substrate wasn’t ready
    The fix isn’t different prompts or different models. The fix is widening the surface. Spend two weeks tightening database schemas, cross-linking pages, normalizing tags. Then run the agent again. The improvement is dramatic.

    How to widen your editorial surface area

    Five moves that pay back fast:
    1. Pick three databases and standardize their properties. Same column types, same controlled vocabularies, same filling discipline.
    2. Add a “context” page to every major project. A short page that captures decisions made, constraints, and stakeholder map.
    3. Build a glossary page. What you call things. Your acronyms. Your team conventions.
    4. Migrate Slack-quality conversations into Notion. The decisions that happen in Slack but never make it to a Notion page are invisible to the agent.
    5. Set a “tag review” calendar event monthly. Twenty minutes to clean up taxonomy drift.

    The Tygart Media thesis

    This idea has a name in the Tygart Media editorial line: gates before volume. You don’t scale by adding more outputs. You scale by tightening the gates that produce the outputs. AI amplifies whatever you point it at. If you point it at a sloppy substrate, you get sloppy output at scale. If you point it at a tight substrate, you get tight output at scale.
    The work that feels boring — schema cleanup, tag discipline, archive organization — is the work that makes AI worth running.

    What to read next

    Gates Before Volume (the operational version of this idea), Second-Brain Architecture (how to structure the substrate), Trust Gap (why even good substrate doesn’t eliminate human review).

  • Prompt Patterns That Work Inside Notion: What Generic Prompting Guides Miss

    Prompt Patterns That Work Inside Notion: What Generic Prompting Guides Miss

    Prompt Patterns That Work Inside Notion: What Generic Prompting Guides Miss

    The 60-second version

    Most prompting advice was written for ChatGPT. ChatGPT prompts treat the AI as a blank-context entity that needs everything explained. Notion AI is different — it knows your workspace, so the right prompt patterns reference workspace structure rather than recreate it. Generic “act as an expert and provide a detailed analysis” prompts work poorly. Specific “read the project page X, summarize against rubric Y, output in format Z” prompts work well.

    Five patterns that work in Notion specifically

    1. Reference workspace structure explicitly.
    “Read the [Project Name] page and the linked research database. Summarize key decisions in the format below.”
    Better than: “Summarize this project.”
    2. Pin sources by name.
    “Using only content from the Q3 Strategy database and the Customer Interviews page, identify themes.”
    Better than: “Identify themes from our research.”
    3. Specify output structure with examples.
    “Output as: [Decision], [Date], [Owner], [Status]. Example: ‘Switch CRM to HubSpot, 2026-03-15, Sarah, Approved’.”
    Better than: “Format as a table.”
    4. Constrain length per section.
    “Five sections, two sentences each, in active voice.”
    Better than: “Be concise.”
    5. Reference style guides as named sources.
    “Match the voice of the Tygart Media style guide page.”
    Better than: “Use a professional tone.”

    Three patterns that don’t work in Notion

    1. Role-play prompts. “Act as an expert McKinsey consultant” produces generic consultancy-speak. Notion AI doesn’t need persona priming; it needs context priming.
    2. Long preamble. “I am working on a project that involves…” is wasted tokens when the agent can read the project page directly.
    3. Hypothetical scenarios. Notion AI works on workspace reality. Hypothetical prompts pull the agent away from the actual data.

    The compound prompt pattern

    Effective complex prompts inside Notion stack three elements:
    Source pinning (which pages/databases)
    Task specification (what to do with the source)
    Output specification (format, length, sections)
    A good prompt reads like a small specification. A bad prompt reads like a conversation starter.

    Where this goes wrong

    1. Importing ChatGPT habits. Long preambles and role-play priming hurt Notion AI more than they help.
    2. Vague source references. “Our notes” is ambiguous; “the Customer Interviews database” is specific.
    3. Output ambiguity. “Summarize” produces variance. “Five-section summary, two sentences each” produces consistency.

    What to read next

    How Notion Skills Work, Building Your First Skill, Auto Model Selection, Editorial Surface Area.

  • Designing a Database Schema for AI Autofill That Stays Trustworthy

    Designing a Database Schema for AI Autofill That Stays Trustworthy

    Designing a Database Schema for AI Autofill That Stays Trustworthy

    The 60-second version

    Most database schemas were designed for humans typing things in. Autofill works differently — it processes one row at a time using row content and a prompt. Schemas designed for Autofill make the prompt’s job easier and the human’s job auditable. Controlled vocabularies. Source attribution. Fill-date stamps. Clear separation between human and agent fields. Get the schema right and Autofill is reliable. Get it wrong and you’ll fight Autofill forever.

    Schema design principles

    1. Controlled vocabularies over free text. A “category” field with five select options outperforms a free-text field. Autofill picks from a list reliably; it improvises inconsistently.
    2. Atomic fields over compound fields. “Customer info” as a single text field is bad for Autofill. Separate fields (name, industry, size, region) each get filled cleanly.
    3. Source attribution columns. Add a “filled by” select (Human / Basic Autofill / Custom Agent) and a “fill date.” The audit trail makes drift visible.
    4. Separate human and agent fields. Don’t let Autofill overwrite human-entered fields. Configure Autofill to only fill empty cells or only specific columns marked for agent use.
    5. Validation columns where stakes are high. A “verified by human” checkbox on agent-filled fields creates a gate where human review happens before the field is trusted downstream.

    Patterns for specific use cases

    Content library: title (human), URL (human), summary (Autofill), category (Autofill from controlled list), tags (Autofill from controlled list), filled-by (auto), fill-date (auto), verified (human checkbox).
    CRM: company name (human), industry (Autofill from list), size (Autofill from list), key contacts (Autofill extraction), notes (human), last interaction (formula from related database).
    Research database: source (human), key claim (Autofill summary), category (Autofill), related projects (Autofill relation), my take (human), filled-by (auto).

    Three schema mistakes

    1. Letting Autofill manage relation properties. Cross-row relationships are judgment calls. Autofill misses context. Keep relations human.
    2. No fill date. Without a date stamp, you can’t tell stale data. After 30 days, Autofill output may not reflect current page state.
    3. Mixing free text with structured fields. A free-text “notes” field next to an Autofill “summary” creates confusion about which is canonical.

    What to read next

    AI Autofill Databases foundation piece, Editorial Surface Area, Second-Brain Architecture, Trust Gap.

  • Notion AI vs Claude Projects: Which Belongs in Your Stack

    Notion AI vs Claude Projects: Which Belongs in Your Stack

    Last refreshed: May 15, 2026

    Update — May 15, 2026: Two things have shifted since this article was originally written. First, Claude Opus 4.7 (released April 2026) is now Anthropic’s most capable model with a 1M token context window at standard pricing — which changes the calculus for any task involving large documents or long-form reasoning, where Claude was already the stronger choice. Second, on May 13, 2026, Notion shipped the Notion Developer Platform with Claude as a launch partner, which means the comparison is no longer just “Notion AI vs Claude Projects” — Claude can now operate natively inside Notion via the External Agents API. For the platform launch breakdown, see Notion Developer Platform Launch (May 13, 2026). For the current Claude model lineup, see Claude Models Roadmap May 2026. For how this fits into a working stack, see The Three-Legged Stack.

    Notion AI vs Claude Projects: Which Belongs in Your Stack

    The 60-second version

    Notion AI and Claude Projects both let you bring custom context to AI. The difference is what surrounds the AI. Notion AI lives inside a workspace with databases, integrations, schedules, and a team. Claude Projects lives inside a conversation with files, instructions, and the conversation history. For ongoing operational work where the AI needs to be part of how you work, Notion AI fits. For deep focused work where conversation quality is the primary value, Claude Projects fits. Many operators use both.

    When Notion AI wins

    • Persistent operational context across the workspace
    • Custom Agents on schedules
    • Database fluency and Autofill
    • Native integrations (Slack, Mail, Calendar)
    • Team collaboration patterns
    • Mobile and cross-device access

    When Claude Projects wins

    • Deep, focused task work
    • Strong conversation continuity within a topic
    • Specific instruction sets per project
    • File-heavy reference contexts (code, research, large documents)
    • When conversation quality (Claude’s strength) matters more than integration

    The stacking pattern

    The pattern many operators use:
    Notion AI for the ongoing rhythm of work — agents, databases, daily operational synthesis
    Claude Projects for “I need to deeply work on X” sessions — heavy reasoning, complex code, large reference contexts
    The two don’t conflict; they cover different time horizons. Notion AI is always-on background. Claude Projects is intentional focused sessions.

    What Claude Projects does that Notion AI doesn’t

    • File upload context with longer effective memory in-conversation
    • More flexible custom instructions per project
    • Conversation continuity that’s purely Claude-native (no model-switching)

    What Notion AI does that Claude Projects doesn’t

    • Workspace databases and Autofill
    • Scheduled agent execution
    • Native integrations beyond conversation
    • Multi-user collaboration on the same context

    Where comparisons go wrong

    1. Treating them as direct substitutes. They overlap but serve different shapes of work.
    2. Picking based on raw conversation quality alone. That favors Claude. But conversation quality isn’t the whole product.
    3. Picking based on integration breadth alone. That favors Notion. But integration matters more for some workflows than others.

    What to read next

    Notion AI vs ChatGPT, Notion AI vs Gemini, Editorial Surface Area, Custom Agents vs Basic.

  • Google Drive + Notion AI: Bringing External Documents Into Agent Context

    Google Drive + Notion AI: Bringing External Documents Into Agent Context

    Google Drive + Notion AI: Bringing External Documents Into Agent Context

    The 60-second version

    Most teams have content split between Notion and Google Drive. Drive holds the “I’m collaborating in real-time with five people” docs; Notion holds the structured workspace and database content. The Drive integration lets agents read across both. The result: synthesis that pulls from “the project doc in Drive” plus “the project page in Notion” plus “the related research in Notion’s research database” without manual copy-paste.

    Three patterns that work

    1. Cross-source synthesis. “Summarize the state of project X” pulls from the Notion project page, the Google Doc collaborators are working in, and the Sheets file with the metrics. Agent produces one synthesis from three sources.
    2. Drive-content-as-source for Notion drafts. Drafting a Notion document, agent pulls from a Drive Doc as reference. Useful when the source-of-truth lives in Drive but the deliverable lives in Notion.
    3. Migration assistance. Teams moving from Drive to Notion can use the integration to surface “what’s still in Drive that should be in Notion.” Helps the migration without forcing it.

    What stays manual

    • The actual collaboration in Drive (real-time editing isn’t an agent task)
    • Decisions about which content lives where (organizational, not synthesis)
    • Sensitive Drive content the agent shouldn’t see (don’t connect it)

    Permission inheritance

    The Drive integration uses the connected user’s permissions. The agent sees what you see. Two practical implications:
    – For org-wide Drive content, connect through an account with broad access
    – For personal Drive, connect your personal account; the agent sees only your stuff

    Where this goes wrong

    1. Connecting too broadly. A Drive integration that gives the agent access to your entire org’s Drive includes things you didn’t think about (HR docs, finance, executive). Scope tightly.
    2. Letting Drive content lag behind Notion content. When a Notion page is canonical, the agent should reference it, not the Drive doc. Mark canonical sources clearly.
    3. Treating Drive as substrate without organization. A messy Drive feeds an agent that produces messy synthesis. The Editorial Surface Area thesis applies to Drive too.

    What to read next

    Editorial Surface Area, Slack Integration, Calendar + Notion AI, MCP foundation piece.