The Missing Layer: Why Split Brain Stacks Need a Conversational State Store

My operating stack has three layers. Claude is the brain. Google Cloud Platform is the brawn. Notion is the memory. Each layer has a clear job and the handoffs between them work well most of the time. But there is a fourth layer I did not notice was missing until I had to name it, and the gap it covers runs through every working relationship I have. I am calling it the conversational state store and I think most AI-native stacks have the same hole.

The three layers that already exist

Let me start by describing what I do have, because the shape of the gap only becomes visible against the shape of the things that are already in place.

The Notion layer holds facts. It is the human-readable operational backbone. Six core databases — Master Entities, Master CRM, Revenue Pipeline, Master Actions, Content Pipeline, Knowledge Lab — with filtered views per entity. Every client, every contact, every deal, every task, every article, every SOP. When I want to see the state of a client, I open their Focus Room and the dashboards pull from the six core databases. When Pinto wants to understand the architecture, he reads Knowledge Lab. When I want to know which posts are scheduled for next week, I filter the Content Pipeline. Notion is where humans (me, Pinto, future collaborators) go to read the state of the business.

The BigQuery layer holds embeddings. The operations_ledger dataset has eight tables including knowledge_pages and knowledge_chunks. The chunks carry Vertex AI embeddings generated by text-embedding-005. This is where semantic retrieval happens. When Claude needs to find “everything I have ever thought about tacit knowledge extraction,” it does not keyword-search Notion. It runs a cosine similarity query against the chunks table and gets back the passages that are semantically closest to the question. BigQuery is where Claude goes to read.
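
To make the retrieval step concrete, here is a minimal sketch of cosine-similarity ranking in plain Python. It is not the actual BigQuery query. The two-dimensional vectors, the chunk texts, and the function names cosine_similarity and top_chunks are all made up for illustration; real text-embedding-005 vectors are much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(query_vec, chunks, k=3):
    """Return the k (score, text) pairs whose embeddings are closest to query_vec."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c["text"]) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy data standing in for rows of the knowledge_chunks table.
chunks = [
    {"text": "unrelated notes", "embedding": [0.0, 1.0]},
    {"text": "tacit knowledge notes", "embedding": [1.0, 0.0]},
    {"text": "mixed notes", "embedding": [1.0, 1.0]},
]
results = top_chunks([1.0, 0.0], chunks, k=2)
```

The point of the sketch is that the ranking depends only on vector direction, which is why this layer is good at "what is this about" and has nothing to say about "what happened last".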

The Claude layer holds orchestration. Claude is the thing that decides which of the other two layers to consult, composes queries across both, synthesizes the results, and produces outputs. It reads Notion through the Notion API when it needs current operational state. It queries BigQuery when it needs semantic retrieval. It writes to WordPress through the REST API when it needs to publish. It is the brain that knows which limb to use.

Three layers, three clear jobs, handoffs that mostly work. I have been operating this way for months and it scales well for running 27 client WordPress sites as a solo operator.

The thing that is missing

None of those three layers tracks the state of open conversational loops between me and the people I work with.

Here is a concrete example. Yesterday I sent Pinto an email with a P1 task. This morning he replied with a completion email. His completion email is sitting in my Gmail inbox, unread. Somewhere in the next few hours I am going to send him a new task. When I do, I need to know three things: (1) did Pinto finish the last thing? (2) did I acknowledge that he finished it? (3) what is the current state of the implicit trust ledger between us — do I owe him a thank-you, does he owe me a response, or are we even?

None of those questions can be answered by Notion. Notion does not know about Gmail threads. None of them can be answered by BigQuery in any useful way because the embeddings are semantic, not temporal. Claude can answer them — but only by reading Gmail live at the start of every session, holding the state in its working memory for the duration of that session, and losing it all when the session ends.

That is the gap. There is no persistent layer that holds the state of conversations. Every session, Claude rebuilds it from scratch, and the rebuild is expensive in tokens and time and prone to missing things.
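
A rough sketch of what that per-session rebuild amounts to: replay the thread in order and derive the answers to the three questions. This assumes each message has already been labeled with a hypothetical "kind" (task, completion, or ack). Producing that label is the expensive part Claude does when it reads Gmail, and it is not shown here. The function name and message fields are illustrative, not an existing API.

```python
def derive_loop_state(thread, me):
    """Replay a chronologically ordered thread and answer the three questions.

    Each message is a dict with hypothetical keys "sender" and "kind".
    """
    state = {"task_done": None, "acknowledged": None, "next_move": "nobody"}
    for msg in thread:
        mine = msg["sender"] == me
        if msg["kind"] == "task" and mine:
            # A new task resets the loop: they owe me a result.
            state = {"task_done": False, "acknowledged": False, "next_move": "them"}
        elif msg["kind"] == "completion" and not mine:
            # They finished: I now owe an acknowledgment.
            state["task_done"] = True
            state["next_move"] = "me"
        elif msg["kind"] == "ack" and mine:
            # I said thanks: the ledger is even.
            state["acknowledged"] = True
            state["next_move"] = "nobody"
    return state


# The example from above: a P1 task sent yesterday, a completion reply this morning.
thread = [
    {"sender": "me", "kind": "task"},
    {"sender": "Pinto", "kind": "completion"},
]
state = derive_loop_state(thread, me="me")
```

Nothing here survives the session. The same replay has to run again next time, over the same messages plus whatever arrived since.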

Why the existing layers cannot fill it

You might ask: why not just put it in Notion? Create a new database called Open Loops, add a row for every active conversation, let Claude read it like any other database. The problem is that Notion is a human-readable layer. It is optimized for humans to see state, not for a machine to update state tens of times per day. Adding rows to Notion costs an API call per row. Open loops change constantly. Every time Pinto sends me a message, the state changes. Every time I reply, the state changes again. Updating Notion in real time for every state change would generate hundreds of API calls per day and would make the Notion workspace feel cluttered to the humans who actually read it.

You might ask: why not put it in BigQuery? BigQuery is the machine layer, after all. It can handle high-frequency writes. The problem is that BigQuery is optimized for analytical queries over large datasets, not for real-time state lookups on small ones. Every time Claude needs to know “what is the current state of my conversation with Pinto,” a BigQuery query would take two to three seconds. That latency at the start of every response breaks the conversational flow. BigQuery is also append-heavy, not update-heavy, which is the wrong shape for conversational state that changes constantly.

You might ask: why not let Claude hold it in working memory across sessions? Because Claude does not have persistent memory across sessions in the way this requires. Each new conversation starts fresh. Claude can read Gmail live at the start of each session, but that forces a full re-derivation of conversational state every single time, which is wasteful and lossy.

The right shape for a conversational state store is none of the above. It is something closer to a key-value store or a document database, optimized for low-latency reads, moderate-frequency writes, and small record sizes. Something like Firestore or a Redis cache, living on the GCP side of the stack, read by Claude at the start of every session and updated whenever a new message flows through.

What the store would actually hold

The schema does not need to be complicated. Per collaborator, I need to know:

  • Last inbound message (timestamp, subject, one-sentence summary)
  • Last outbound message (timestamp, subject, one-sentence summary)
  • Open loops: questions I have asked that are unanswered, with shape and age
  • Acknowledgment debt: things they completed that I have not explicitly thanked them for
  • Active tasks: things I have asked them to do, status, last update
  • Implicit tone: is the relationship warm, neutral, or strained right now

That is maybe ten fields per collaborator. Even with a hundred collaborators, the whole table fits in memory on a laptop. This is not a big-data problem. It is a schema design problem.
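
One way the fields listed above could be written down, as a sketch using Python dataclasses. Every class name, field name, and value below is an assumption for illustration. Nothing here is built, and the timestamps are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MessageSummary:
    timestamp: str  # ISO 8601 string
    subject: str
    summary: str    # one sentence


@dataclass
class OpenLoop:
    question: str
    shape: str      # e.g. "yes/no", "needs a decision" (hypothetical categories)
    asked_at: str


@dataclass
class ActiveTask:
    description: str
    status: str
    last_update: str


@dataclass
class CollaboratorState:
    name: str
    last_inbound: Optional[MessageSummary] = None
    last_outbound: Optional[MessageSummary] = None
    open_loops: List[OpenLoop] = field(default_factory=list)
    acknowledgment_debt: List[str] = field(default_factory=list)
    active_tasks: List[ActiveTask] = field(default_factory=list)
    tone: str = "neutral"  # "warm", "neutral", or "strained"

    def to_document(self) -> str:
        """Serialize to JSON, the shape a document or key-value store would hold."""
        return json.dumps(asdict(self))


# Hypothetical record for the Pinto example; dates are placeholders.
pinto = CollaboratorState(
    name="Pinto",
    last_inbound=MessageSummary("2025-01-02T08:00:00Z", "Re: P1 task", "Task completed."),
    last_outbound=MessageSummary("2025-01-01T09:00:00Z", "P1 task", "Asked for the P1 task."),
    acknowledgment_debt=["P1 task"],
)
document = pinto.to_document()
```

Serialized, a record like this is a few hundred bytes, which is the whole argument for treating it as a schema problem and not a storage problem.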

Claude reads the store at the start of every session, checks which collaborators are relevant to the current task, and surfaces any open loops or acknowledgment debt that should be addressed inside the work. When Claude sends a message, it updates the store. When a new inbound message arrives, a Cloud Function parses it and updates the store.
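
The read and update flow can be sketched with an in-memory dictionary standing in for Firestore or Redis. The class and method names are hypothetical, and the sketch assumes that whatever parses a message (Claude or the Cloud Function) has already worked out which questions it asks or answers and which tasks it completes or acknowledges.

```python
class ConversationalStateStore:
    """In-memory stand-in for a small key-value store, keyed by collaborator."""

    def __init__(self):
        self._records = {}

    def _record(self, collaborator):
        return self._records.setdefault(collaborator, {
            "last_inbound": None,
            "last_outbound": None,
            "open_loops": [],
            "acknowledgment_debt": [],
        })

    def record_outbound(self, collaborator, timestamp, subject, asks=(), acknowledges=()):
        """Called when I send a message: opens loops, pays down acknowledgment debt."""
        rec = self._record(collaborator)
        rec["last_outbound"] = {"timestamp": timestamp, "subject": subject}
        rec["open_loops"].extend({"question": q, "asked_at": timestamp} for q in asks)
        rec["acknowledgment_debt"] = [
            item for item in rec["acknowledgment_debt"] if item not in acknowledges
        ]

    def record_inbound(self, collaborator, timestamp, subject, answers=(), completed=()):
        """Called when a message arrives: closes loops, adds acknowledgment debt."""
        rec = self._record(collaborator)
        rec["last_inbound"] = {"timestamp": timestamp, "subject": subject}
        rec["open_loops"] = [
            loop for loop in rec["open_loops"] if loop["question"] not in answers
        ]
        rec["acknowledgment_debt"].extend(completed)

    def session_briefing(self, relevant):
        """Called at session start: only collaborators with something outstanding."""
        briefing = {}
        for name in relevant:
            rec = self._records.get(name)
            if rec and (rec["open_loops"] or rec["acknowledgment_debt"]):
                briefing[name] = {
                    "open_loops": [loop["question"] for loop in rec["open_loops"]],
                    "acknowledgment_debt": list(rec["acknowledgment_debt"]),
                }
        return briefing


# Placeholder timestamps; the flow mirrors the Pinto example.
store = ConversationalStateStore()
store.record_outbound("Pinto", "2025-01-01T09:00:00Z", "P1 task",
                      asks=["Is the P1 task done?"])
store.record_inbound("Pinto", "2025-01-02T08:00:00Z", "Re: P1 task",
                     answers=["Is the P1 task done?"], completed=["P1 task"])
briefing = store.session_briefing(["Pinto"])
```

After the inbound message, the briefing shows no open loops and one item of acknowledgment debt. A later outbound message with acknowledges=["P1 task"] clears it, and the briefing for Pinto comes back empty.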

Why I am writing this instead of building it

Because I have a rule: don’t build until the principle is clear. I have an ongoing tension in my operation between building new tools and using the tools I already have. Every new database is a maintenance burden. Every new Cloud Run service is a monthly cost and a failure mode. I have made the mistake before of getting excited about an architectural insight and spending three weeks building something that, once built, I used for four days and then forgot about.

Before I build the conversational state store, I want to know: can I get 80% of the value by letting Claude read Gmail live at the start of every session? If yes, the store is not worth building. If the live-read approach loses state in ways that matter, then the store earns its place.

My honest guess is that the live-read approach is fine for now. I have only one active collaborator (Pinto) and a handful of active client contacts. At that scale, Claude reading Gmail at the start of a session takes two seconds and catches everything I care about. The conversational state store would be justified when I have ten or fifteen active collaborators and the live-read cost becomes prohibitive. Today it is not justified.

But I am naming the layer anyway because naming it is the first step. If I ever do build it, I will know what I am building and why. And if someone else reading this has the same shape of operation with more collaborators, they might build it before I do, and that is fine too.

When this goes wrong

The failure mode I want to flag most is building the store and then abandoning it because the maintenance cost exceeds the value. This is the universal failure mode of custom knowledge systems and I have fallen into it multiple times. The rule I am setting for myself: if the store cannot be updated automatically from Gmail, Slack, and calendar feeds through Cloud Functions, do not build it. A store that requires manual updates will die within thirty days.

The second failure mode is over-engineering. The moment you decide to build a conversational state store, the next thought is “and it should track sentiment, and it should predict response times, and it should flag relationship risk, and it should integrate with calendar for context.” Stop. Ten fields. Two endpoints. One cron. If the MVP does not prove value in two weeks, the elaborate version will not save it.

The third failure mode is pretending this layer is optional. It is not. Every AI-native operator has conversational state. The only question is whether it lives in your head or in a system. Your head is a lossy, biased, forgetful system that works fine until you have more collaborators than you can track mentally, and then it breaks without warning.

The generalization

Any AI-native stack built from a facts layer, an embeddings layer, and an orchestrator is missing a conversational state layer, and the absence shows up first in async remote collaboration because that is where relational debt compounds fastest. If you operate this way and you feel a vague sense that your working relationships are getting worse in ways you cannot quite articulate, the missing layer is probably part of the explanation. Name it. Decide whether to build it. If you decide not to, at least let Claude read your inbox live so the gap gets covered by runtime instead of persistence.

I am still in the decide-not-to-build phase. I am writing this so that future-me, when I reread it, remembers what the decision was and why.


The Five-Node Series

This piece is part of a five-article knowledge node series on async AI-native solo operations.
