The Agency Stack in 2026: Notion + Claude + One Human

I’m going to describe the stack I actually run, and then I’m going to tell you honestly whether you should copy it.

Most writing about “AI agencies” in April 2026 is either pitch deck vapor or hedged-everything consultant speak — pieces that tell you “AI is transforming agencies” without telling you which tools, which workflows, which tradeoffs. This article is the opposite. I’m going to name specifics. I’m going to say what’s working. I’m going to say what isn’t. I’m going to skip the part where I pretend this is a solved problem, because it isn’t, and pretending is how operators who listened to the pitch deck end up eighteen months into a rebuild.

The stack that follows is what a real, paying-bills agency runs to manage dozens of active properties, real client relationships, and a content production operation that ships every day — with one human in the operator chair. It is not hypothetical. It is also not recommended for everyone, which is the part most of these articles leave out.

Here’s the real version. You can decide whether it’s for you when we get to the bottom.


The one-line version of the stack

Notion is the control plane. Claude is the intelligence layer. A handful of operational services run the work. One human makes the calls.

That’s it. That’s the whole stack at the summary level. Everything that follows is specificity about what each of those pieces does, why it’s there, and what happens when you try to run a real business through it.

The four pieces are load-bearing in different ways. Notion holds the state of the business — what’s happening, what’s decided, what’s next. Claude provides the judgment and the synthesis when judgment is needed. The operational services (publishers, research tools, deployment pipelines) do the deterministic work that judgment shouldn’t be wasted on. The human reads, decides, approves, and occasionally gets out of the way.

Fifteen years ago the same agency would have needed forty people. Ten years ago it would have needed twenty. Five years ago it would have needed eight. In April 2026 it needs one human plus the stack. That’s the thesis. The question is whether you can actually run it that way.


What “AI-native” actually means in this context

The phrase “AI-native” has been worn out enough that I need to be specific about what I mean.

AI-native doesn’t mean “uses AI tools.” Every agency uses AI tools. Every freelancer uses AI tools. That bar is on the floor.

AI-native means the operating model of the business assumes AI is a teammate, not a productivity tool. AI is in the loop on strategic thinking. AI is reading the state of the workspace and synthesizing it. AI is drafting, reviewing, triaging, and sometimes deciding — with human oversight, but as a continuous participant, not an occasional assistant you turn to when you get stuck.

The practical difference: an agency that uses AI tools works the way agencies have always worked, but with ChatGPT open in a tab. An AI-native agency has rebuilt its workflows around the assumption that there’s a persistent intelligence layer in the substrate of the business.

The stack below is what the second version looks like when you commit to it.


The control plane: Notion

Notion is where I live during the working day. Not where I put things when I’m done with them — where I actually do the work.

The workspace is organized around the Control Center pattern I’ve written about before. A single root page that surfaces the live state of the business: what’s on fire today, what’s progressing, what’s waiting on me, what the week’s focus is. Under it sits a database spine that maps to the actual operational objects — properties, clients, projects, briefs, drafts, published work, decisions, open loops. Each database answers a specific question someone running the business would ask regularly.

Every meaningful page in the workspace has a small JSON metadata block at the top — page type, status, summary, last updated. That metadata block is for the AI, not for me. It lets Claude read the state of a page in a hundred tokens instead of three thousand. Across a workspace of thousands of pages, the compounding context savings are enormous, and it changes what Claude can realistically see in a session.
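To make the pattern concrete, here is a minimal sketch of what such a metadata block might look like. The field names and values are illustrative, not a Notion feature — the point is that the block is small, machine-readable, and answers "what is this page and where does it stand" in a few dozen tokens:

```json
{
  "page_type": "client-project",
  "status": "active",
  "summary": "Q2 content refresh for an editorial property",
  "last_updated": "2026-04-14"
}
```

Any AI reading the workspace can parse this block first and decide whether the full page body is worth fetching at all.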

The workspace is sharded deliberately. The master context index lives as a small router page that points to larger domain-specific shards. When Claude needs to reason about a specific area of the business, it fetches the shard for that area. When it needs the whole picture, it fetches the router. This is not a product feature anyone has written about — it’s a pattern I arrived at after the main index page got too large to fit into Claude’s context window without truncation. It works. It’s probably what a lot of operators will end up doing.
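The routing logic itself is trivial — which is the point. A hypothetical sketch in Python, with page IDs and domain names invented for illustration:

```python
# Router-and-shards pattern: the router page is small enough to
# always fit in context and points to larger domain-specific shards.
# All page IDs and domain names below are illustrative.

ROUTER_PAGE = "router-index"

SHARDS = {
    "clients": "shard-clients",
    "content": "shard-content",
    "finance": "shard-finance",
}

def resolve_context(domain=None):
    """Return the page the AI should fetch for a question.

    A domain-specific question gets its shard; a whole-business
    question (no domain) gets the router itself, which summarizes
    and links out to everything else.
    """
    if domain in SHARDS:
        return SHARDS[domain]
    return ROUTER_PAGE
```

The durable idea is the indirection, not the code: no single page ever has to hold the whole business, so no single fetch ever blows the context window.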

What Notion is great at: holding operational state, being legible to both humans and AI, letting you traverse the business by asking questions of the workspace rather than navigating folders, integrating cleanly with Claude via MCP, running background rhythms through Custom Agents.

What Notion is not great at: being a database in the performance sense (anything heavy goes somewhere else), being the source of truth for code (version control is), being the source of truth for financial transactions (a real accounting system is), being reliable as the only source for anything mission-critical (it has an outage SLA, not an uptime guarantee).

The rule I follow: Notion holds the operating company. It does not hold the substrate the operating company depends on. That distinction is what keeps the pattern stable.


The intelligence layer: Claude

Claude is the AI I actually run the business with. Not because Claude is strictly better than the alternatives at every task — at this point in 2026 the frontier models are all highly capable — but because Claude’s design posture matches what an operator actually needs.

Specifically: Claude is thoughtful about uncertainty, tells me when it doesn’t know, asks for clarification instead of fabricating, and has a deep integration with Notion via MCP that makes the workspace-and-AI pattern actually work. Those qualities are worth more to me than any single-task benchmark. An AI that sometimes gets things wrong but tells me when it’s uncertain is far more useful than an AI that confidently hallucinates.

The intelligence layer shows up in three configurations:

Chat Claude — what I use for strategic thinking, drafting, review, and synthesis. A conversation on claude.ai or the desktop app with the Notion MCP wired in, so Claude can reach into the workspace to ground its answers in real context. This is where the high-judgment work happens. When I’m making a decision, I work through it in a Claude conversation before I commit to it.

Claude Code — the terminal-based version that lives at the intersection of code and agent. This is where the more technical work happens — building publishers, writing scripts, managing infrastructure, executing multi-step workflows that touch multiple systems. Claude Code reads my codebase, reaches into Notion when it needs to, calls external services through MCP, and writes back run reports.

Notion’s in-workspace AI (Custom Agents and Notion Agent) — the on-demand and autonomous agents that live inside Notion itself. These handle the rhythms: the daily brief that’s written before I wake up, the triage agent that sorts whatever lands in the inbox, the weekly review that gets drafted on Friday. I didn’t build these to be clever. I built them because I was doing the same small synthesis tasks over and over, and Custom Agents let me stop.

Three configurations, three different jobs. Each one’s strengths map to a different kind of work. Together they cover the whole territory.

What Claude is great at: synthesis across real context, drafting with judgment, reasoning through decisions, catching inconsistencies in my thinking, executing defined workflows with honest failure modes.

What Claude is not great at: being the last line of defense on anything (always have a human gate), handling workflows where one error compounds (use deterministic tools for those), long-horizon autonomy without oversight (agents drift, supervise accordingly), making decisions that require context it doesn’t have access to.

The mental model I use: Claude is a thoughtful senior teammate who happens to be infinitely patient and always awake. That framing gets the relationship right. Over-rely on it and you get hurt. Under-rely on it and you’ve hired a senior teammate and asked them to run errands.


The operational services: the things that do the work

The third layer is the part most agency-AI writeups skip, because it’s unglamorous. It’s the set of operational services that do the actual deterministic work. Publishing. Research. Deployment. Monitoring. The stuff that shouldn’t require judgment once you’ve set it up correctly.

I’m going to describe the shape without naming specific tools, because the shape is what’s durable and the specific tools will change.

Publishers — services that take content prepared upstream and push it to the properties where it needs to live. WordPress for editorial content, social media scheduling for distribution, email tools for outbound. The publisher’s job is to execute reliably and log honestly. When it fails, it fails loudly enough that I notice.

Research infrastructure — services that pull structured data about keywords, competitors, search volumes, backlink profiles, and so on. This is where AI-native agencies diverge most sharply from traditional ones. Traditional agencies do research manually. AI-native agencies run research as a pipeline: the structured data comes in, gets processed, and lands in the workspace as briefs and intelligence reports that the human and the AI both read.

Background pipelines — the scheduled services that keep the workspace fresh. New briefs get generated. Stale content gets flagged. Traffic data gets ingested. The kinds of things that an agency would traditionally ask a human to do on a weekly rhythm, running autonomously in the background.

Deployment and monitoring — how the technical side ships. Version control holds the source of truth. Deployments run on triggers. When something breaks, it breaks to a channel I actually read.

The principle that holds all of this together: deterministic work belongs in deterministic systems. Don’t use an AI agent to do something a script can do. An AI agent adds judgment, which is valuable when you need judgment, and costly when you don’t. The operational services do the work that has a right answer every time. The AI handles the work that requires judgment.
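What "deterministic with honest logging" means in practice can be sketched in a few lines. This is a minimal illustration, not my actual publisher — the `push` callable stands in for whatever real publishing client (WordPress API, scheduler, etc.) you wire up:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("publisher")

def publish(posts, push):
    """Push each prepared post via the `push` callable.

    Deterministic: no guessing, no silent skips. Every failure is
    logged loudly and returned so the run report can surface it.
    """
    failures = []
    for post in posts:
        try:
            push(post)
            log.info("published %s", post["slug"])
        except Exception as exc:
            log.error("FAILED %s: %s", post["slug"], exc)
            failures.append(post["slug"])
    return failures  # non-empty means the run report flags a problem
```

There is no judgment anywhere in that loop, and that's the feature: it either works or it tells you it didn't. An agent in the same seat would mostly work and occasionally fail in ways nothing logs.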

Most agency-AI failures I’ve watched happen are cases where someone tried to use an AI agent for the deterministic work. The agent mostly succeeds, occasionally hallucinates, and introduces a class of silent failure that didn’t exist in the deterministic version. It feels like you’re being clever. You’re introducing unreliability.


The one human in the chair

This is the part the vendor writeups never include, and it’s the most important piece.

There is one human in the operator chair. That human is non-optional. Every workflow, every agent, every pipeline eventually terminates at a human decision or a human review gate. The AI stack does not run the business. The AI stack is a lever that makes one human capable of running what used to take many.

What the human does in this configuration is different from what they would have done in a traditional agency. The human is not writing every post. The human is not doing every bit of research. The human is not executing every workflow. The human is:

Setting the posture. What are we working on this week? What’s the priority? What’s the theme? The AI is exceptional at executing against clarity. It is not exceptional at deciding what to be clear about.

Reading the synthesis. The AI surfaces what matters. The human decides what to do about it. Every morning brief, every weekly review, every escalation flag lands in front of the human, who makes the call.

Making the judgment calls. When a client needs a difficult conversation. When a strategy needs to change. When something the AI suggested is actually wrong. These are the moments the AI can’t be left alone with. The operator role is increasingly concentrated around exactly these moments.

Holding the relationships. Clients don’t want to talk to an AI. They want to talk to a human who happens to be very well-supported by AI. The difference matters enormously in trust, tone, and staying power of the engagement.

Maintaining the stack itself. The stack doesn’t maintain itself. Every week there are small adjustments, small rewirings, small improvements. The operator is also the architect of the operating company, and the architecture is a living thing.

A person who thought they were buying “AI that runs my agency for me” is going to be disappointed. A person who understood they were buying “a lever that makes them ten times more effective at the parts of agency work that actually matter” is going to be delighted. The difference is what you think you’re getting.


The daily rhythm (what it actually looks like)

Let me describe a real working day in this stack, because the abstract description doesn’t convey what using it feels like.

Morning. I open Notion. The Morning Brief Agent ran overnight; the top of today’s Daily page already has a three-paragraph synthesis of the state of the business, pulled from the active projects, the task database, yesterday’s run reports, and the overnight changes. I read it in ninety seconds. I know what’s on fire, what’s progressing, what’s waiting on me. The context tax that used to cost me the first hour of every day is already paid.

Morning block. I work through the highest-leverage thing on the day’s priority list. If it’s strategic, I work through it in a Claude conversation with the Notion workspace wired in, because grounding the AI in real context produces dramatically better thinking than working in isolation. If it’s technical, I work in Claude Code, because the terminal version handles multi-step technical work better. Either way, I’m working with the AI as a thinking partner, not a tool I reach for occasionally.

Mid-day. The triage agent has processed whatever landed in the inbox. I scan its decisions, override the ones I disagree with, and route anything important into the database where it belongs. The escalation agent has flagged the three things that need my attention today. I make the calls. These are the moments the stack needs a human for — no amount of clever configuration replaces them.

Afternoon block. Content operations. Research intelligence lands as structured data in the workspace. Briefs get drafted. I review them. Approved briefs flow to the publishing pipeline. The pipeline runs, logs back to the workspace, and I get notified of anything that failed. I don’t write every post. I write the ones where my voice specifically matters, and I review the rest. The ratio is maybe one in ten that I write from scratch these days.

Evening. Five minutes of close. Anything that didn’t get done gets re-dated. Tomorrow’s priority list gets pre-staged. I close Notion. The overnight agents will handle the rhythms while I sleep.

That’s the day. It is dramatically different from running a traditional agency, and dramatically more sustainable. The cognitive load is substantially lower even while the operational throughput is substantially higher. That’s the whole promise of the pattern, and it’s the part that’s real.


What this stack actually costs (and doesn’t)

The direct tool costs for the stack in April 2026, at the level I run it:

  • Notion Business plan with AI add-on
  • Claude subscription (Max tier for the agent budget)
  • A cloud provider account for the operational services (running pennies to small dollars per day at my volume)
  • A handful of research and analysis tool subscriptions
  • Domain, email, and the usual small-business infrastructure

Total monthly direct tool cost is the equivalent of what a traditional agency would spend on a single junior employee’s salary for one week. The leverage ratio is extreme, and it will get more extreme.

What it costs that isn’t money:

  • Setup time. Weeks to stand up the initial version, months to iterate it into something that runs smoothly. This is not a weekend project.
  • Ongoing attention to the stack itself. Maybe ten percent of my week is spent on the operating company rather than on client work. That ratio is load-bearing; if I let it go below that, the stack rots.
  • Discipline about not adding cleverness. Every new tool, every new agent, every new integration is a tax on the coherence of the system. Most weeks I’m resisting the urge to add something, not looking for something to add.
  • Loneliness of the role. One-human agencies are lonely. You don’t have a team meeting. You don’t have a coffee conversation with a coworker. The stack is not a substitute for colleagues. This is the part nobody writes about and it’s genuinely significant.

What this stack is not good for

If I’m being honest about who should not run this pattern, it includes:

Agencies that want to scale headcount. This stack is designed to make one human capable of more. It’s not designed to coordinate ten humans. A ten-person agency on this stack would have chaos problems I haven’t solved.

Businesses where the work is primarily relational. Sales-heavy businesses, high-touch consulting, therapy practices. The stack is strong at operational and production work. It is weak at anything where the work is fundamentally “I am present with this other person.”

Anyone uncomfortable with AI making meaningful decisions. The stack assumes you’re willing to let AI make decisions that have real consequences — triage, synthesis, drafting under your name. If that crosses your line philosophically, don’t force it. The stack won’t be fun for you.

People looking for a plug-and-play system. This is a living architecture. It requires ongoing maintenance. It never stops being built. If you want something that works out of the box and stays working, buy software; don’t build an operating company.

Early-stage businesses without a clear shape yet. The stack rewards clarity about what your business is. If you’re still figuring that out, the stack will accelerate whatever direction you’re going — which is great if the direction is right and brutal if it isn’t. Figure out the direction first, then build the stack.


Who this stack is good for

The operators I’ve seen get the most out of this pattern share a specific profile:

  • Running businesses with high operational complexity but small team size. Multi-property content operations, advisory practices, specialist agencies. The kind of business where one capable person with leverage beats a team without it.
  • Comfortable with systems thinking. The stack rewards people who think in terms of flows, interfaces, and substrates. If that vocabulary feels alien, the stack will feel alien.
  • Honest about what they’re good at and what they aren’t. The stack amplifies the operator. If the operator is strong at strategy and weak at execution, the stack handles the execution. If the operator is strong at execution and weak at strategy, the stack does not magically produce strategy. Know which version you are.
  • Willing to maintain the architecture. The stack is a long commitment to the operating company, not a one-time setup. Operators who enjoy tending the system do well. Operators who resent tending the system should not run it.

If you recognize yourself in the good-fit list and not the bad-fit list, this pattern is probably worth the investment. If you’re on the fence, it probably isn’t yet — come back when the decision is clearer.


The part I want to be brave about

Here’s the part this article is supposed to be honest about.

This pattern works for me. It might not work for you. The vendor-shaped narrative says every business should be AI-native, every agency should be running this stack, every operator should be ten times leveraged. That narrative is wrong. It’s wrong in the boring, everyday way that industry narratives are always wrong: it oversells, it under-discloses the costs, and it creates an expectation gap that a lot of operators are going to run into eighteen months from now.

The accurate narrative is this: for a specific kind of operator running a specific kind of business, this stack produces a kind of leverage that was not previously available. For everyone else, it’s a distraction from what they should actually be doing, which is the hard work of their specific business with the tools that fit their specific situation.

I am describing what I run because I think honest examples are more useful than vague generalities. I am not recommending you run it. I am recommending you look at your actual business, your actual operating constraints, and your actual relationship with AI tools, and decide whether a version of this pattern — adapted, simplified, or rejected — makes sense for you.

There’s a version of this article that promises that if you copy my stack, you’ll get my outcomes. That article is lying to you. The outcomes come from matching the stack to the business, not from the stack itself.

If you read this and it resonates, take the pieces that apply. If you read this and it doesn’t, take what you learned about what’s possible and leave the rest. Either response is correct.


The five things I’d tell someone thinking about building something like this

Start with the Control Center, not the agents. The Control Center is the anchor everything else builds against. If you build agents before you have the Control Center, the agents have nothing to write to. Build the workspace shape first. The rest follows.

Resist the urge to add complexity. The operators who succeed with this pattern run simpler versions than they could. The operators who fail run more elaborate versions than they need. Every piece of the stack should be earning its place every week.

Write everything down as you go. The operating company is a living architecture. Six months from now you will have forgotten why you made a specific configuration choice. Document the choices in the workspace as you make them. Future-you will thank present-you.

Don’t over-trust the AI. It’s a teammate, not an oracle. It’s wrong sometimes. It’s confident when it shouldn’t be sometimes. Build review gates. Assume failure. The stack is resilient when you don’t assume otherwise.

Accept that you are building an operating company, not deploying software. This is a long game. It doesn’t work in the first week. It starts working in the second month. It starts compounding in the sixth month. If you’re not willing to tend it for that long, don’t start.


A closing observation

I’ve been running variations of this stack for long enough to have opinions that don’t match what I thought I believed when I started. The biggest surprise has been how much of the work is operational hygiene rather than AI cleverness. Building an agent was the easy part. Running an agency on the operating company pattern has mostly been a discipline problem — staying consistent about metadata, about documentation, about review gates, about when to let the AI decide and when to intervene.

The AI is not the interesting part anymore. The interesting part is the operating model the AI makes possible. That’s the part this article has tried to describe honestly, and that’s the part worth thinking about if you’re considering something similar.

If you do build a version of this, I’d genuinely like to hear how it turns out. The frontier here is being figured out by operators sharing what works and doesn’t, and every honest report makes the next person’s build better. This is my report. I hope it helps.


FAQ

Can I run this stack solo? Yes. The stack is explicitly designed for solo operators or very small teams. One-human operation is the whole point. Multi-person teams work too but introduce coordination complexity the pattern doesn’t directly solve.

How long does it take to build? The minimum viable version — Control Center, a handful of databases, one Custom Agent, Claude wired in — is a week of part-time work. The version that actually earns its place takes two to three months of iteration. It never stops getting built; it compounds over time.

Do I need to know how to code? For the minimum viable version, no. Notion + Claude + Notion Custom Agents gets you a long way without writing code. For the operational services layer, some technical comfort is needed or you’ll need a technical collaborator. Claude Code dramatically lowers the bar here.

What if Notion gets replaced by a competitor? The pattern survives. The Control Center, the database spine, the metadata discipline, the workspace-as-control-plane posture — all of those port to any capable workspace tool. If something displaces Notion in 2027, the migration is real work but the operating model is durable. The durable asset is the pattern, not the specific tool.

What if Claude gets replaced by a competitor? Also fine. The pattern assumes there’s an intelligence layer wired into the workspace; Claude is the current implementation of that layer. If another frontier model becomes more suitable, swap it. The MCP standard that connects everything is model-agnostic. This is deliberate.

Can I use ChatGPT or another AI instead of Claude? Mostly yes. The MCP-to-Notion pattern works with any AI that supports MCP, including ChatGPT, Cursor, and others. I use Claude for the reasons described above, but the stack pattern is compatible with other frontier models. Don’t let tool preferences get in the way of the architecture.

How much does this cost to run? The tool subscription stack costs roughly what one junior employee’s weekly salary would cost per month, total. The non-monetary costs (setup time, maintenance attention, lifestyle tradeoffs of solo operation) are more significant and worth thinking about before committing.

Is this sustainable for a growing business? Yes, up to a point. The pattern scales smoothly to a certain operational volume per human. Beyond that, you need more humans, and coordinating multiple humans on this stack introduces problems that the solo version doesn’t have. Most operators hit the natural ceiling before they hit the growth limit.

