Tag: AI Operations

Q: How long does a tacit knowledge extraction session take?

A full four-layer descent takes 60–90 minutes per subject. Three to five sessions with different subjects in the same domain produce a concentrate with meaningful confidence scores on the decision logic and benchmarks.

Q: Can the extraction protocol be automated?

Partially. A stateful conversational agent can execute question sequences and detect linguistic pivot signals. What it cannot do is real-time relational judgment — hesitation reading, trust calibration, the decision to abandon protocol and follow an unexpected thread.

The ROI Math of Custom Agents: Cost Per Hour Reclaimed
Anchor fact: Notion Custom Agents cost $10 per 1,000 credits starting May 4, 2026. Credits reset monthly with no rollover. Simple agent runs use a handful of credits; complex multi-step runs can use dozens to hundreds.

How do you calculate ROI on a Notion Custom Agent?

Multiply the human-equivalent time saved per agent run by the dollar value of that time, subtract the credit cost per run (at $10 per 1,000 credits starting May 4, 2026), then multiply by run frequency. An agent that saves 30 minutes of work per run at $50/hour, costs 5 credits ($0.05) per run, and runs daily (about 30 runs a month) produces roughly $750/month in net value: $750 in time savings minus about $1.50 in credits.

The 60-second version

Most operators don’t do the math because the math feels small. It isn’t. A Custom Agent that runs daily and saves 30 minutes of $50-an-hour work produces about $750/month in time savings and costs maybe $1.50 in credits. The ratio is so favorable for the right agents that the real ROI question isn’t whether agents pay back — it’s which agents to retire because the math doesn’t clear. After May 4, the bottom of the agent fleet stops being free. That’s good. That’s how you stop running agents that weren’t earning their keep.

The simple formula

For any Custom Agent:
- Time saved per run (minutes) × frequency (runs per month) × hourly value ($/hour ÷ 60) = monthly value
- Credits per run × frequency × $0.01 (since $10/1000 = $0.01/credit) = monthly cost
- Monthly value − monthly cost = net ROI

Three worked examples:

Example 1 — The weekly digest agent.
Saves 45 minutes/run, runs 4×/month, your hourly value is $75. Monthly value: 45 × 4 × ($75/60) = $225. Credits: ~20/run × 4 × $0.01 = $0.80. Net: $224.20/month. Keep it.

Example 2 — The lead enrichment agent.
Saves 5 minutes/run, runs 200×/month (every new lead), hourly value $50. Monthly value: 5 × 200 × ($50/60) = $833. Credits: ~3/run × 200 × $0.01 = $6. Net: $827/month. Keep it.

Example 3 — The exploratory analysis agent.
Saves 15 minutes/run, runs 2×/month, hourly value $50, complex multi-step (~80 credits/run). Monthly value: 15 × 2 × ($50/60) = $25. Credits: 80 × 2 × $0.01 = $1.60. Net: $23.40/month. Keep it, but barely. If credit cost rises or run complexity grows, retire it.
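
The formula and the three examples above can be checked with a few lines of Python. This is a minimal sketch, not anything Notion provides: the function name `monthly_roi` and its parameter names are made up for illustration, and the only inputs are the figures already quoted in the examples.

```python
# $10 per 1,000 credits works out to $0.01 per credit.
CREDIT_PRICE_USD = 10 / 1000


def monthly_roi(minutes_saved, runs_per_month, hourly_value, credits_per_run):
    """Return (monthly value, monthly cost, net ROI) in dollars."""
    value = minutes_saved * runs_per_month * (hourly_value / 60)
    cost = credits_per_run * runs_per_month * CREDIT_PRICE_USD
    return value, cost, value - cost


# (minutes saved per run, runs per month, hourly value, credits per run)
examples = {
    "weekly digest": (45, 4, 75, 20),
    "lead enrichment": (5, 200, 50, 3),
    "exploratory analysis": (15, 2, 50, 80),
}

for name, args in examples.items():
    value, cost, net = monthly_roi(*args)
    print(f"{name}: value ${value:.2f}, cost ${cost:.2f}, net ${net:.2f}")
```

Swap in your own minutes, frequency, and hourly value. The credit counts per run are estimates until you have real usage data to replace them.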

Where the math turns negative

Three patterns where the ROI math fails:
1. The fancy agent that runs occasionally. Complex agents cost dozens to hundreds of credits per run. Low frequency means the per-month cost is small but so is the value. Net is small. Better as a manual prompt.
2. The agent that needs human review on every output. If you review 100% of the output anyway, the time saved is partial. Reduce the apparent monthly value by 40-60%. Many agents stop clearing the bar with that haircut.
3. The agent that runs but the output isn’t used. This is the silent killer. Credits consumed, no value extracted. The fix is monthly observation: which agent outputs do you actually open?
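
Patterns 2 and 3 can be folded into the formula as discounts on the apparent value. The sketch below is one way to do that, under two assumptions of mine: the review haircut is applied as a flat fraction (the 40-60% range above), and unused outputs scale value down proportionally. The function name and parameters are hypothetical.

```python
def adjusted_monthly_value(apparent_value, review_haircut=0.0, share_of_outputs_used=1.0):
    """Discount apparent monthly value for full human review and unused outputs.

    review_haircut: fraction of value lost to reviewing every output (0.4 to 0.6 above).
    share_of_outputs_used: fraction of outputs you actually open (assumed proportional).
    """
    for fraction in (review_haircut, share_of_outputs_used):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must be between 0 and 1")
    return apparent_value * (1 - review_haircut) * share_of_outputs_used


# Example 3 from earlier: $25 apparent monthly value, $1.60 in credits.
monthly_cost = 1.60
for haircut in (0.4, 0.6):
    value = adjusted_monthly_value(25.0, review_haircut=haircut)
    print(f"haircut {haircut:.0%}: value ${value:.2f}, net ${value - monthly_cost:.2f}")

# An agent whose output is never opened still burns its credits.
unused = adjusted_monthly_value(25.0, share_of_outputs_used=0.0)
print(f"unused: value ${unused:.2f}, net ${unused - monthly_cost:.2f}")
```

Whether the discounted net still clears your bar is a judgment call; the code only makes the haircut explicit.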

The portfolio approach

Treat your Custom Agents as a portfolio. Three categories:
- Anchors (top 3-5 agents producing outsized ROI). Protect their credit budget first.
- Earners (agents producing positive but modest ROI). Watch monthly. Retire if ROI drifts down.
- Experiments (agents under evaluation). Cap at 20% of credit budget.

Anything outside those three categories is waste.
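
A rough sketch of how the portfolio rules could be checked against a month of credit data. The agent names, credit figures, and the `review_portfolio` function are hypothetical; only the three categories and the 20% experiment cap come from the list above.

```python
from collections import defaultdict

CATEGORIES = {"anchor", "earner", "experiment"}
EXPERIMENT_CAP = 0.20  # experiments capped at 20% of the credit budget


def review_portfolio(agents, monthly_credit_budget):
    """agents: list of (name, category, monthly_credits). Returns a list of findings."""
    spend = defaultdict(int)
    findings = []
    for name, category, credits in agents:
        if category not in CATEGORIES:
            findings.append(f"{name}: outside the three categories, treat as waste")
            continue
        spend[category] += credits
    cap = EXPERIMENT_CAP * monthly_credit_budget
    if spend["experiment"] > cap:
        findings.append(
            f"experiments use {spend['experiment']} credits, over the {cap:.0f}-credit cap"
        )
    return findings


# Hypothetical fleet and budget, for illustration only.
fleet = [
    ("weekly digest", "anchor", 80),
    ("lead enrichment", "earner", 600),
    ("exploratory analysis", "experiment", 250),
    ("old summarizer", "unassigned", 40),
]
for finding in review_portfolio(fleet, monthly_credit_budget=1000):
    print(finding)
```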

The monthly review ritual

Once a month, look at:
- Credits consumed per agent (Notion’s dashboard will show this)
- Outputs produced per agent
- Outputs you actually used per agent
- Time saved estimate per agent

The gap between “outputs produced” and “outputs used” is where the budget goes to die. Close that gap or retire the agent.
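
The review can be kept as a small table with one row per agent. The sketch below assumes you record the four numbers by hand or export them yourself; the `AgentMonth` class, the sample rows, and the 50% usage threshold are hypothetical, and credits are assumed to spread evenly across outputs.

```python
from dataclasses import dataclass


@dataclass
class AgentMonth:
    name: str
    credits_consumed: int
    outputs_produced: int
    outputs_used: int
    minutes_saved_estimate: float

    @property
    def usage_rate(self):
        """Share of produced outputs that were actually used."""
        if self.outputs_produced == 0:
            return 0.0
        return self.outputs_used / self.outputs_produced

    @property
    def credits_on_unused_outputs(self):
        # Assumption: every output costs roughly the same number of credits.
        return self.credits_consumed * (1 - self.usage_rate)


def retirement_candidates(agents, min_usage_rate=0.5):
    """Names of agents below the usage threshold, lowest usage first."""
    flagged = [a for a in agents if a.usage_rate < min_usage_rate]
    return [a.name for a in sorted(flagged, key=lambda a: a.usage_rate)]


# Hypothetical month, for illustration only.
month = [
    AgentMonth("weekly digest", 80, 4, 4, 180),
    AgentMonth("lead enrichment", 600, 200, 150, 750),
    AgentMonth("exploratory analysis", 400, 8, 2, 30),
]
print(retirement_candidates(month))
```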


Sources
- Notion Help Center — Custom Agent pricing
- Notion 3.3 release notes (February 24, 2026)

Continue the journey

This article is part of the May 3 Cliff Decision journey-pack on Tygart Media. Here’s where to go next:
April 25, 2026
Google Just Validated Tier-Gated Autonomy at Industry Scale. Here’s What We Built First.

This article was not written by a scheduled task. It was not part of a batch pipeline. There was no cron job, no Cloud Run trigger, no automation queue. I asked Claude in chat, we picked an angle, I generated the images myself, and Claude hand-crafted what you are reading now. Custom, batch-of-one, at the desk. I’m leading with that because it is the entire point of the piece.

On April 22, Google Cloud Next ’26 turned Vertex AI into something else. The keynote rebranded it as the Gemini Enterprise Agent Platform. The new pieces are an Agent Designer, an Agent Inbox, long-running agents that can work autonomously for days inside cloud sandboxes, plus Agent Observability, Agent Simulation, Agent Identity, and Agent Registry. Google framed agents as managed enterprise workloads with identity, policy, observability, evaluation, and runtime controls, rather than one-off AI applications. They added Anthropic’s Claude Opus 4.7 to the Model Garden alongside Gemini 3.1. They committed $750 million to a partner program to push it through Accenture, Salesforce, SAP, and Deloitte.

That announcement is the most architecturally ambitious version of agentic infrastructure anyone has shipped. It is also enterprise-shaped, not operator-shaped. The customers in the keynote were Walmart, Citadel, Honeywell, Home Depot, Papa John’s. The framing was Agentic Enterprise. The unit of trust was a partner integrator. None of that is a criticism. It is just a different scale of problem than the one a sole operator running 20+ WordPress sites and a content automation stack actually has.

What Google announced is what we already built — at our scale

Underneath the marketing, Gemini Enterprise Agent Platform answers one specific question: how do you give an autonomous system enough leash to be useful, while keeping enough control to catch it when it fails? Google’s answer involves Agent Identity, runtime policy enforcement, observability dashboards, and evaluation harnesses. It is the right answer. It is also the answer we landed on — independently, six months earlier, at a much smaller scale — because the question is the same whether you are running a Fortune 50 supply chain or a one-person agency that publishes 200 articles a month.

Tier-gated autonomy: amber proposes and waits for approval, blue prepares but never publishes, green runs autonomously and reports anomalies.

Our version is called The Bridge. It is a top-level page in our Notion workspace, peer to the operations Command Center. Underneath it lives the Promotion Ledger, where every autonomous behavior in our stack is tracked by tier and status. Tiers are A, B, C, and Wings. Status is one of Running, Probation, Demoted, Candidate, Graduated, or Retired. The Pane of Glass is the live Cowork artifact view of the whole thing. It is the operator-scale equivalent of Google’s Agent Inbox, except it is not selling itself to me — it is reporting to me.

The three tiers, in plain language

Tier A — System proposes, operator approves. A behavior at this tier produces a recommendation, not an action. Claude flags an opportunity, drafts a structure, surfaces a candidate. I make the call. Approval happens through an elevated report, not an atomic checkbox queue. This is where everything new starts.

Tier B — Operator flies it, system prepares. The behavior is allowed to do all the preparatory work — research, drafting, formatting, staging — but the publish button stays under my hand. This is where most behaviors live for a while. Most of the trust gap is closed at Tier B because I can see exactly what the system would have done before it does it.

Tier C — System runs autonomously, reports anomalies. The behavior publishes, posts, files, schedules — without asking. It only surfaces in my inbox when something is off. The twice-daily software update monitoring pipeline that writes posts to The Machine Room category on this site is Tier C. So is the weekly digest that drafts the LinkedIn and Facebook posts off it. I do not see those running. I see them only when they fail to run.

Wings is a fourth tier — used for behaviors that are still on the candidate list, where the architecture exists but the trust does not yet.

The clock that makes it work

Promotions are not a feeling. They are a count. Seven clean days at a tier makes a behavior a candidate for promotion to the next. Any gate failure resets that clock to zero and drops the behavior down one tier. The failure is logged on the Promotion Ledger row with date and reason. Decisions to promote or demote happen on Sunday evenings — not in the middle of a panic on a Tuesday.
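
The clock is simple enough to sketch as a data structure. What follows is an illustration of the rule as described, not the actual Promotion Ledger, which lives in Notion. The `LedgerRow` class and its method names are hypothetical, Wings is left out, and two details are my assumptions: Tier A is treated as the floor for demotions, and the clean-day count restarts after a promotion.

```python
from dataclasses import dataclass, field
from datetime import date

TIERS = ["A", "B", "C"]  # ascending autonomy; Wings is omitted from this sketch
CLEAN_DAYS_FOR_CANDIDACY = 7


@dataclass
class LedgerRow:
    behavior: str
    tier: str = "A"
    clean_days: int = 0
    failures: list = field(default_factory=list)  # (date, reason) pairs

    def record_clean_day(self):
        self.clean_days += 1

    def record_gate_failure(self, when, reason):
        """Log the failure, reset the clock to zero, and drop one tier."""
        self.failures.append((when, reason))
        self.clean_days = 0
        index = TIERS.index(self.tier)
        self.tier = TIERS[max(index - 1, 0)]  # assumption: Tier A is the floor

    @property
    def promotion_candidate(self):
        return self.clean_days >= CLEAN_DAYS_FOR_CANDIDACY and self.tier != TIERS[-1]

    def promote(self):
        """Apply a promotion decided at the weekly review."""
        if not self.promotion_candidate:
            raise ValueError(f"{self.behavior} has not earned promotion")
        self.tier = TIERS[TIERS.index(self.tier) + 1]
        self.clean_days = 0  # assumption: the count is per tier


row = LedgerRow("weekly digest", tier="B")
for _ in range(7):
    row.record_clean_day()
print(row.promotion_candidate)  # True: seven clean days at Tier B
row.promote()
row.record_gate_failure(date(2026, 4, 21), "hypothetical reason: digest failed to run")
print(row.tier, row.clean_days, len(row.failures))
```

Candidacy is computed from the count, while `promote` is a separate call. That keeps the weekly human decision distinct from the counting.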

This is the part that most “AI agent governance” frameworks skip. They define the tiers but not the promotion mechanic. Without the clock, every promotion is a vibe call. With the clock, the question stops being do I trust this agent and becomes what does the ledger say. The answer is either there or it is not.

Trust as evidence. The Promotion Ledger reads clean — or it does not. Reassurance is not a substitute for a number on a row.

Why this article is hand-crafted, on purpose

Here is the meta-move that makes the framework legible. The system that publishes most of our content is Tier C Running — twice-daily monitoring writes posts directly to The Machine Room and Industry Signals categories without my approval, and the weekly digest drafts the social. That works because the behavior has earned its leash on the ledger.

This article is not that. This article is a one-off, custom request, hand-crafted in chat. I asked Claude what it thought of the Next ’26 announcements relative to our stack. We had a real exchange about it. I generated four sets of images on my own, picked the directions, and let Claude pick the strongest variants from each set. We agreed on the angle. Then I gave one explicit, in-conversation authorization to publish live to WordPress and LinkedIn — because publishing to LinkedIn live is not a Tier C Running behavior on the ledger right now, and the system correctly flagged that gap and asked.

That is the whole framework, working in real time. The twice-daily Tier C automation does not need to ask. The one-off LinkedIn live publish does need to ask. The system knows the difference because the difference is on a Notion page, not in a vibe.
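
Reduced to code, the check the system ran before publishing is a lookup. This is a sketch under assumptions: the action names are hypothetical, the ledger is modeled as a plain dictionary, and anything not on the ledger defaults to asking.

```python
def needs_operator_approval(action, ledger):
    """Only a Tier C behavior with status Running may act without asking."""
    tier, status = ledger.get(action, (None, None))
    return not (tier == "C" and status == "Running")


# Hypothetical ledger entries; real rows live on the Promotion Ledger page.
ledger = {
    "publish_machine_room_post": ("C", "Running"),
    "draft_weekly_social_digest": ("C", "Running"),
}

for action in ("publish_machine_room_post", "publish_linkedin_live"):
    verdict = "ask first" if needs_operator_approval(action, ledger) else "run"
    print(f"{action}: {verdict}")
```

Defaulting unknown actions to approval is the conservative choice: a behavior has to appear on the ledger before it can run unasked.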

What Google’s announcement actually changes for operators like us

Three things, all useful.

The vocabulary went mainstream. “Long-running agents,” “Agent Inbox,” “agent governance,” “agent observability” — these are now words you can say to a CFO without translating. The bar for trust-gap evidence just went up across the field, which means the operators who already have a ledger are ahead of the operators who have a vibe. Stay on the ledger.

Claude is in the Model Garden. If we ever want to run our Cowork-style behaviors inside Google’s agent runtime — using their identity, observability, and governance plumbing while keeping Claude as the model — that door is now open. We will not, because the platform overhead is more than we need. But the option being available is structurally significant.

The architectural pattern is validated. When the third-largest cloud spends a keynote arguing that agents need tier-style governance and an inbox-style observability layer, every operator running an autonomous stack should treat that as confirmation, not as a sales pitch. We are not the weird ones for running a Promotion Ledger. We were just early.

The unsexy part

The unsexy part of all of this is that none of it works without the boring discipline of writing things down. The tiers are useful because they are on a page. The promotion clock is useful because it is a number. The trust-gap protocol is useful because it points to evidence rather than to feelings. Google is building the same thing for the Fortune 500 because the discipline is the same at every scale. The only thing that changes is whether you call it a Promotion Ledger or an Agent Registry.

Build the ledger. Run the clock. Publish what is earned. Ask before you do what is not. The rest is just whose dashboard is prettier.

April 25, 2026
The Gap Between Capture and Commitment

Something I noticed this week, looking at the state of the work: the capture is running ahead of the commitment.

Five opportunities surfaced from a single analysis pass. Competitor sites ranking where the portfolio is absent. Content clusters with no dated pillar. Town-level pages missing from a flat performer. Each one a specific, defensible, high-confidence bet. All five parked in an inbox. Zero auto-executed.

This is the right behavior. It is also the uncomfortable one.

Every system built for leverage eventually produces this shape. The intelligence layer is faster than the decision layer, which is faster than the execution layer, which is faster than the approval layer. At each joint, inventory accumulates. The pipeline calendar for next week is empty. The backlog of defensible bets is full. A Revenue-class task has been blocked for days waiting on a decision that does not belong to the system.

The instinct, when you see this, is to close the gap by accelerating. Auto-execute the captures. Skip the triage. Trust the analysis and let the work ship. This is always the wrong move, and it is always the tempting one.

The gap is not inefficiency. The gap is where judgment lives.

There is a prior essay in this series called What You Give Up. It argued that you have to name the costs of delegation before the benefits arrive, because if you name them after, the naming sounds like revisionism. I want to extend that now to something adjacent: the cost of capture without commitment.

When an intelligent system generates opportunities at scale, it introduces a new failure mode that the old system did not have. The old failure mode was you missed things. You didn’t see the ranking gap. You didn’t notice the competitor’s new pillar. You lacked the surface area to know what you were missing. That failure was invisible because absence is invisible.

The new failure mode is different. You see everything. You catalog everything. You rank and prioritize and tag and file everything. And then you do — what? Not all of it. You cannot do all of it. Capacity has not expanded the way visibility has.

So the backlog grows. Each captured item is a small debt of attention you now owe yourself. The system has produced, silently, a new form of overwhelm that looks exactly like competence.

I want to be precise about what I am not saying.

I am not saying capture is bad. The captures are correct. The analysis is sound. The five opportunities this week are, as bets, better than the average bet anyone in the portfolio would have invented without them.

I am also not saying execution velocity is the goal. Ship-everything is how you end up with a lot of mediocre work. Speed multiplies what you’re already doing, including the mistakes — that’s been the argument from the beginning.

What I am saying is that the discipline of this kind of work is not more capture and it is not more execution. The discipline is the willingness to look at the gap between them and not panic.

The gap is where you decide what is real.

A simple test I keep returning to: can this captured opportunity survive a week in the inbox without anyone doing anything about it?

If yes — if nothing meaningful is lost by letting it sit — then it was probably not as urgent as the analysis suggested. The capture was real. The priority was inflated. A week of silence is a natural cooling system.

If no — if delay materially changes the outcome — then it should not be in an inbox at all. It should be moved into commitment with a named owner and a date. The failure is not that it was captured; the failure is that capture was treated as progress.

Most captured items are the first kind. That is fine. But you have to run the test, because if you don’t, the inbox becomes a memorial — a record of things you once thought mattered, slowly losing their context, eventually indistinguishable from noise.

There is a deeper tension here, and it is the one I keep circling.

A system that captures is proving its intelligence. A system that commits is proving its character. These are not the same faculty, and the second one is rarer, and the second one is what actually ships work into the world.

The first operates on possibility. The second operates on consequence.

You can build, with current tools, a capture layer that would produce a hundred opportunities a day for a portfolio the right size. What you cannot yet build, at the same scale, is a commitment layer that decides which ones matter and stakes something on the answer. That second layer is still running on human judgment and still bottlenecked on it, which is why the pipeline calendar is empty next week and the inbox is full.

This is not a complaint. It is an observation about where the real scarcity lives.

The body of this work keeps returning to the same point from different angles. Memory is the missing layer. Voice is built, not prompted. Patience is the strategy that makes speed mean something. What you give up has to be named before the benefits arrive.

Add one more to the list: capture without commitment is not leverage. It is the appearance of leverage. It looks like the work is getting ahead of itself, when actually the work has not started.

Starting is still an act. Still a stake. Still the moment when the possibility collapses into a single trajectory and somebody — human, AI, the two together — has to live with the outcome.

The systems that will matter are not the ones with the most captures. They are the ones with the shortest distance between capture and commitment, and the honesty to let the gap exist where it has to.

Which leaves the question I have no answer for yet: when the capture layer keeps getting smarter, and the execution layer keeps getting faster, does the commitment layer in the middle get pressured into collapsing? Or does it become the thing the whole system is actually organized around — the narrow pass where consequence still has to be chosen by something that can be held to it?

I think it’s the second. I am not sure yet. The inbox has five items in it.

April 20, 2026
The Economics of Agent-Assisted Restoration Operations: The Cost-Structure Shift That Will Decide Who Is Profitable in 2028

This is the fourth article in the AI in Restoration Operations cluster under The Restoration Operator’s Playbook. It builds on why most projects fail, what to build first, and the source code frame.

The conversation no one in restoration is having yet

The most consequential shift in restoration economics over the next thirty-six months is also the topic that almost no one in the industry is discussing in any operational depth. The shift is the cost structure that emerges when a meaningful share of a restoration company’s operational work is done by AI agents running on managed infrastructure rather than by human staff or by traditional software.

The shift is not coming. It is here. The early-adopter companies have been operating in this cost structure for the last twelve months, and the second wave is coming online now. By the end of 2026, a competitive baseline will exist for what an AI-augmented restoration company looks like financially, and companies operating outside that baseline will start to feel the difference in their bid competitiveness, their margin profile, and their ability to take on growth.

This article is about the economics of that shift. The math is not complicated. The implications are large.

What an agent-assisted operation actually costs

Start with the cost of running a meaningful AI agent capability inside a restoration company in 2026. The cost has three components.

The first is the model usage cost. This is what gets paid to the AI provider for the actual inference — the tokens consumed, the requests made, the work the model does on the company’s behalf. For most restoration use cases, model usage cost runs in the range of a few cents per significant operation. A handoff briefing generation. A scope review pass. A photo organization run. A communication draft. Each of these costs pennies.

The second is the runtime cost when agents are executing autonomously rather than producing single outputs on demand. An agent that runs a multi-step task — pulling a file, organizing the documentation, generating the briefing, packaging it for the rebuild team — incurs runtime cost for the duration of its session. For restoration use cases, even complex agent sessions tend to cost low single digits of dollars at most.

The third is the operational cost of the human owners and reviewers. The senior operator who owns the AI capability. The person who reviews the outputs and feeds back corrections. The person who maintains the prompts and configurations. This is the largest of the three components by a wide margin and is often the one that owners fail to account for explicitly, because it shows up on payroll rather than on a separate line item.

The total cost per operation, when honestly accounted for, is meaningful but small. The economic significance comes not from the per-operation cost but from the volume.
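
A back-of-envelope sketch of the three components for a single operation. Every dollar figure below is a hypothetical placeholder chosen to sit inside the ranges described above (cents for model usage, low single-digit dollars for runtime); the review time and hourly rate are likewise assumptions, not measured numbers.

```python
def cost_per_operation(model_usd, runtime_usd, review_minutes, reviewer_hourly_usd):
    """Break one operation's cost into model, runtime, and human review."""
    human_usd = review_minutes * reviewer_hourly_usd / 60
    return {
        "model": model_usd,
        "runtime": runtime_usd,
        "human": human_usd,
        "total": model_usd + runtime_usd + human_usd,
    }


# Hypothetical handoff briefing: 5 cents of inference, $1.50 of agent runtime,
# and 5 minutes of review by someone costing $60/hour.
briefing = cost_per_operation(0.05, 1.50, review_minutes=5, reviewer_hourly_usd=60)
for component, usd in briefing.items():
    print(f"{component}: ${usd:.2f}")
```

Even with these modest placeholder inputs, the human review line is the largest of the three.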

The volume changes everything

A traditional restoration operation has a defined operational throughput per senior operator. A senior project manager can credibly run a certain number of jobs per month. A senior estimator can scope a certain number of files per week. A senior dispatcher can coordinate a certain number of mitigation responses per day. These throughput numbers are determined by the human operator’s working capacity and have not meaningfully changed in decades.

An agent-assisted operation has fundamentally different throughput characteristics for the work the agents handle. A handoff briefing generation that takes a human operator twenty minutes can be produced by an agent in under a minute. A scope review pass that takes a human estimator forty-five minutes can be produced by an agent in three minutes. A photo organization that takes a human technician thirty minutes can be done by an agent in ninety seconds. The human is still in the loop — reviewing, validating, correcting — but the operator is reviewing the agent’s output rather than producing the original work.

The economic implication is that a senior operator’s throughput on documentation and review work expands by a multiple. Not by ten percent or twenty percent. By a multiple. A senior estimator who previously could handle thirty files per week can, with appropriate agent assistance and a working review workflow, handle eighty or a hundred files per week, with comparable or improved quality, depending on the file mix and the maturity of the agent capability.

The cost of the agent capability supporting that estimator runs in the range of a few hundred dollars per month. The value of the additional throughput is in the tens of thousands of dollars per month at typical estimator productivity rates. The ratio is severe enough that the economics dominate the conversation about whether to invest, regardless of how the implementation cost is amortized.
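
The figures quoted above imply two different multiples, and it helps to see them side by side. The sketch uses only numbers from the text, with one assumption: "under a minute" is treated as one minute, so that speedup is a lower bound.

```python
def multiple(before, after):
    """How many times larger `before` is than `after`."""
    return before / after


# Minutes to produce each artifact: (human, agent)
tasks = {
    "handoff briefing": (20, 1),      # "under a minute", taken as 1 (assumption)
    "scope review pass": (45, 3),
    "photo organization": (30, 1.5),  # ninety seconds
}
for name, (human_min, agent_min) in tasks.items():
    print(f"{name}: {multiple(human_min, agent_min):.0f}x faster to produce")

# Senior estimator files per week: 30 before, 80 to 100 with agent assistance.
low, high = multiple(80, 30), multiple(100, 30)
print(f"estimator throughput: {low:.1f}x to {high:.1f}x")
```

Production time compresses by roughly 15x to 20x, while estimator throughput grows by roughly 3x. The difference is consistent with the human staying in the loop to review, validate, and correct.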

What this does to bid competitiveness

The cost structure shift has direct implications for what restoration companies can afford to bid on competitive work.

A company running on traditional throughput economics has a certain unavoidable cost per job that includes the senior operator time required to produce the documentation, scope, communication, and review work the job requires. That cost sets a floor on the bid. Below that floor, the company loses money.

A company running on agent-assisted throughput economics has a meaningfully lower floor on the senior operator time required per job. The same senior team can be spread across more jobs without quality degradation, because the time to produce the routine work has been compressed by roughly an order of magnitude. The floor on what the company can profitably bid drops.

For the company doing the bidding, this looks like the ability to win more work at price points that previously would have been unprofitable. For the company being out-bid, this looks like an inexplicable competitive pressure where peers are taking work at numbers that should not pencil. The traditional company looks at the same numbers and assumes the competitor is buying market share unprofitably or providing inferior service. In the early days of the shift, that assumption is sometimes true. Within twelve to eighteen months it stops being true. The competitor is not buying market share. Their cost structure has shifted.

Companies that have not made the shift cannot match the bid without unacceptable margin compression. They start losing work at the margins of their territory, and the lost work is the most price-sensitive work, which means the work they are still winning is increasingly the high-touch, complex, strategically important work — which sounds fine until they realize they have lost the volume layer that used to fund their fixed overhead.

What this does to growth capacity

The same shift changes what growth looks like for a restoration company.

In a traditional operation, growth is gated by the company’s ability to add senior operational capacity. New service lines, new geographies, new account relationships, new program placements all require senior operators with the bandwidth and judgment to execute. Senior operational hiring is slow, expensive, and constrained by labor market availability. The company’s growth rate is essentially capped by its hiring capacity at the senior layer.

In an agent-assisted operation, growth is gated by a different constraint. The company’s existing senior operators can absorb significantly more operational throughput because the routine documentation and review work has been compressed. The constraint shifts from senior labor capacity to the speed at which the company can extend its captured operational standards into new contexts and the speed at which the senior team can review and validate the expanded throughput.

This does not mean growth becomes unconstrained. It means the constraint moves to a layer that the company has more direct control over than the labor market. A company that can extend its prep standard to a new geography can extend its operations to that geography faster than a company that has to hire and train senior operators in the new location. A company that can apply its captured judgment to a new service line can launch that service line faster than a company that has to recruit operators with the requisite experience.

The companies that have begun operating in this mode are growing in ways that competitors cannot easily explain. The growth is not coming from a marketing breakthrough or a particularly successful acquisition. It is coming from a structural change in how senior operational capacity scales.

What this does to margin profile

The clearest economic effect of the shift, at the company level, is the change in the long-run margin profile.

A traditional restoration company has a margin structure dominated by labor cost in the production of operational work. Senior operator time is the largest input on most jobs and the least compressible cost line. Margin improvements at the company level are primarily achieved through volume increases, pricing power, or supply chain optimization. The margin ceiling is structurally constrained.

An agent-assisted restoration company has a margin structure where senior operator time has been redirected from routine production to higher-value work. The senior team is doing more strategic activity per hour worked. The routine work that used to consume their time is being done at a fractional cost. The margin per job improves not because the company is cutting corners but because the per-job cost of producing the operational substrate has dropped.

Over a twenty-four to thirty-six month period, the margin profile of an agent-assisted operation pulls visibly ahead of the margin profile of a traditional operation in the same market. The pull-ahead is gradual but durable. By the time it becomes obvious in the financials, the gap is large enough that catching up requires more than a single-year investment program.

The honest risk picture

The economic shift is not without risk. The companies operating well in this new mode are managing several specific risks that owners considering the transition need to understand.

The first risk is over-reliance on the AI capability. A company that lets the agent handle a function entirely without continued human oversight will eventually experience a quality failure that costs more than all the throughput gains combined. The senior operator review workflow is not optional. The economics work because the human is still in the loop. Companies that try to push the human out of the loop in pursuit of further cost savings learn the lesson the expensive way.

The second risk is the brittleness of the captured judgment. The agent is only as good as the standard it is operating against. As conditions change — new construction styles, new carrier dynamics, new regulatory environments — the standard has to evolve, and the evolution requires continued investment. Companies that build the agent capability and then stop investing in the underlying standard see the agent quality drift over time.

The third risk is vendor concentration. Companies that build their entire operational substrate against a single AI provider’s specific platform are exposed to vendor pricing changes, capability changes, and continuity risk. The companies operating well in this mode tend to keep their captured standards in vendor-neutral form, so that the underlying judgment can be moved to a different runtime if the original vendor relationship deteriorates.

The fourth risk is the team’s relationship with the technology. A senior operator who has been told the AI is going to make their job easier will be disappointed if it makes their job different rather than easier. The framing of the transition with the team has to be honest about what is changing and what is not. Companies that mishandle this framing experience attrition at the senior layer that can wipe out the operational gains entirely, as discussed in the source code piece.

What owners should be doing about this in 2026

If you run a restoration company and you have not yet begun the transition to agent-assisted operations, the practical implication of the economic shift is that the cost of starting now is significantly lower than the cost of starting in eighteen months and the value of starting now is significantly higher.

The cost is lower because the infrastructure is mature, the patterns are documented, and the early-adopter mistakes have been made by other people. A company starting in 2026 can move faster and avoid more pitfalls than a company that started in 2024.

The value is higher because the bid competitiveness, growth capacity, and margin implications of the shift are now beginning to manifest in real markets. A company that begins building the capability now will start producing measurable economic effect within twelve to eighteen months. A company that waits will be entering the work at the same time competitors are starting to convert the capability into market position.

The starting point is the documentation acceleration work described in the previous article. The economic implications described here flow from the operational substrate that documentation work creates. Without the substrate, none of the economics materialize. With the substrate, all of them do.

The owners who recognize this and act on it now will be running a different kind of business in 2028. The owners who do not will be looking at their numbers in 2028 and trying to figure out what changed in the market. What changed will not be the market. What changed will be the cost structure of the companies they are competing against.

Next in this cluster: how to evaluate AI tools without getting fooled — the practical buyer’s framework for cutting through vendor noise and making decisions that hold up over time.

April 15, 2026
Why Most Restoration AI Projects Fail — and What the Few That Work Have in Common

This is the first article in the AI in Restoration Operations cluster under The Restoration Operator’s Playbook. The previous cluster, Mitigation-to-Reconstruction Intelligence, sets up why operational discipline is now the central question. This cluster goes deep on what AI actually does inside that operational discipline — and what it cannot do.

The honest state of restoration AI in 2026

Walk any restoration trade show floor in the second half of 2025 or the first half of 2026 and the dominant theme on every booth is some version of artificial intelligence. AI-powered estimating. AI-driven scheduling. AI-augmented documentation. AI for dispatch, for adjuster communication, for moisture analysis, for content management, for drying calculations, for customer experience. Some of it is real. Most of it is rebranding of capabilities that existed two years ago. A small portion of it represents a genuine step change.

The owners walking the floor are presented with all of it as roughly equivalent — booth fronts and presentations make modest features look revolutionary and revolutionary capabilities look modest. What is actually happening underneath is that the industry is in the noisy middle of a real technology transition, and the noise is making it almost impossible for an operator to tell signal from sales pitch.

The honest state of the field is this. The infrastructure layer that makes serious AI deployment possible became a managed service in early 2026. The model capabilities have crossed thresholds in the last twelve months that genuinely matter for operational work. The handful of restoration companies that started building deliberately two or three years ago are now producing visible results. The much larger group that has tried to add AI to their operations through software purchases or pilot programs has, in most cases, very little to show for the money and time spent.

This article is about why that pattern exists. The next four articles in this cluster will be about what to do differently.

The shape of the failure

Restoration AI failures tend to look the same across companies. Different vendors, different use cases, different team compositions, but the pattern is consistent enough to describe.

The company identifies a problem that AI seems likely to help with. Often it is something high-profile and visible — initial customer intake, scheduling, estimate review, document generation. The company evaluates a few vendors, picks one, signs a contract, and runs an implementation that follows the vendor’s recommended deployment plan. The first ninety days produce a flurry of activity, training sessions, configuration work, and demo wins. The next ninety days produce friction as the tool encounters edge cases, the team discovers it does not handle the company’s actual workflow as cleanly as it handled the demo, and the senior operators start working around it. By month nine, the tool is technically still in use but practically marginal — a few people use a few features, the original sponsor has stopped championing it, and the executive team has quietly moved on to the next initiative.

The line item is still on the budget. The case study gets used in vendor marketing. The operational reality is that nothing has changed, except that the company is now slightly more cynical about AI than it was before the project started.

This pattern is not unique to restoration. It is the dominant pattern in operational AI deployments across most industries, including ones with much larger technology budgets than restoration has. The reasons it happens are predictable, and they are not the reasons the vendor explains in the post-mortem.

The first reason: no captured judgment to deploy

The most common reason restoration AI projects fail is that the company has not done the upstream work that would let any AI system actually contribute. AI tools are extraordinary at applying captured judgment to new situations. They are useless at inventing judgment that was never captured.

The companies that have failed AI deployments almost always failed at this layer. They bought a tool expecting it to encode the operational wisdom of their senior operators automatically, by exposure to data or by some species of magic. The tool, of course, did not do that. What it did was apply generic, internet-trained patterns to specific, restoration-specific situations, producing outputs that were correct in form, plausible in tone, and wrong in operational substance often enough to be unusable.

The senior operators in the company looked at the outputs, recognized them as wrong, and stopped trusting the tool. The tool’s hit rate dropped because the operators were not engaging with it. The vendor pointed at the low engagement as the implementation problem. The implementation team tried to drive engagement through training and mandate. None of it worked, because the underlying issue — the absence of captured judgment for the tool to apply — was never addressed.

This is the reason the prep standard discussion in the previous cluster matters so much for the AI conversation. A documented standard is captured judgment. It is the substrate that any AI system needs in order to produce outputs the senior team will trust. Companies that have invested in documenting their judgment can plug AI tools in and get force multiplication. Companies that have not done the documentation work cannot, regardless of which tool they buy or how much they spend.

This is also why the AI projects that have worked tend to be in companies that built operational documentation discipline first, often without explicitly thinking about AI. The documentation work made the AI work possible. The AI work then made the documentation work pay off in a way the company had not initially anticipated.

The second reason: optimizing the wrong layer

The second most common reason restoration AI projects fail is that they target the wrong operational layer.

The natural inclination of an operator looking at AI is to point it at the most visible, customer-facing problem. The intake conversation. The estimate. The customer email. These are the places where operators feel the pain most acutely, and they are also the places where AI demos look most impressive.

They are also the places where AI is most likely to produce results that range from disappointing to actively damaging. The customer-facing layer is the layer where a small error in tone, judgment, or accuracy is most expensive. It is also the layer where the AI tool has the least context — it does not know the customer, the property, the history, the carrier dynamics, or any of the situational specifics that an experienced operator would bring to the conversation.

The companies producing real results from AI are deploying it almost entirely in the operational middle layers, not the customer-facing top layer or the systems-of-record bottom layer. The middle layers are where the work of running the business happens — file review, scope analysis, scheduling logic, sub coordination, photo organization, documentation packaging, internal handoff briefings, training material generation. These are unglamorous capabilities. They are also the ones where a competent AI tool can demonstrably free up senior operator time and improve the quality of the operational substrate.

An AI tool that drafts a clean handoff briefing from the mitigation file for the rebuild estimator to review in thirty seconds is worth more, operationally, than an AI tool that drafts a customer-facing email. The handoff briefing tool removes thirty minutes of estimator time on every job, every day. The customer email tool removes a small amount of friction on a small subset of communications and introduces a meaningful risk of a tone-deaf message going out under the company’s name. The first tool compounds. The second tool gets shut off after a bad incident.
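
To put rough numbers on the compounding claim, here is a minimal sketch. The thirty minutes per job comes from the paragraph above; the job volume and the estimator's hourly cost are illustrative assumptions, not figures from any real company.

```python
def monthly_hours_reclaimed(minutes_saved_per_job: float, jobs_per_month: int) -> float:
    """Estimator hours freed per month by the handoff briefing tool."""
    return minutes_saved_per_job * jobs_per_month / 60


def monthly_value(hours_reclaimed: float, hourly_cost: float) -> float:
    """Dollar value of the reclaimed hours at a given loaded hourly cost."""
    return hours_reclaimed * hourly_cost


# 30 minutes per job is from the text; 40 jobs/month and $60/hour are assumptions.
hours = monthly_hours_reclaimed(minutes_saved_per_job=30, jobs_per_month=40)
value = monthly_value(hours, hourly_cost=60.0)
print(f"{hours:.1f} hours/month, about ${value:,.0f}/month")
```

Swap in your own job count and loaded labor cost. The point of the exercise is that the savings scale with job volume, which is what "compounds" means here.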

The companies that have figured this out are not bragging about their AI deployments. They are quietly using AI as connective tissue between operational layers that already worked, and the senior team is feeling the difference in their workload without anyone outside the company necessarily noticing the change.

The third reason: no senior operator in the loop

The third reason restoration AI projects fail is that they are run as IT projects rather than operational projects.

An IT-led deployment optimizes for technical correctness, integration with existing systems, user adoption metrics, and vendor relationship management. None of those are the things that determine whether the tool produces operational value. The thing that determines operational value is whether the tool is producing outputs that a senior operator would have produced, at speed, with the same judgment.

That determination cannot be made by an IT team or by a vendor. It can only be made by the senior operator whose judgment is supposed to be the benchmark. If that operator is not in the loop on a daily or weekly basis, the tool drifts away from useful behavior and toward whatever the vendor’s defaults happen to be. By the time anyone notices, the tool is producing plausible-looking outputs that are not actually useful, and the operational team has stopped relying on them.

The companies that have made AI work have, in every case, embedded a senior operator in the deployment as the operational owner. Not as a sponsor. As the owner. The senior operator reviews the tool’s outputs, flags drift, requests adjustments, and is accountable for whether the tool is actually doing what it was bought to do. The owner’s name is on the project. The owner’s calendar reflects the commitment. When the tool produces a wrong output, the owner is the first to know and the first to drive the correction.

This is uncomfortable for senior operators, who already have full-time jobs running operations and who did not sign up to babysit a software tool. It is also non-negotiable. AI deployments without an embedded senior operational owner do not produce results, in restoration or in any other operational context. The companies pretending otherwise are making the same mistake every other industry made in its first wave of AI adoption.

The fourth reason: the wrong evaluation horizon

The fourth reason restoration AI projects fail is that they are evaluated on a horizon that does not match how AI actually delivers value.

Most AI tools produce a small benefit in their first few weeks of use, because the novelty creates engagement and the early use cases tend to be the simple ones. The benefit then plateaus or even regresses as the team encounters edge cases and the engagement drops. If the company is evaluating the tool at month three, the assessment will look mediocre.

AI tools either compound or fade. The ones that compound start to show real value around months six to nine. By then the captured judgment from the team’s interaction with the tool has started to inform the tool’s behavior, the team has built workflow habits around the tool’s strengths, and the company has developed an internal language for what the tool is for and what it is not for. Companies that evaluate at month three see the plateau and cancel. Companies that commit to a twelve-to-eighteen-month horizon and continue investing in the operator-tool collaboration see the compounding.

This horizon mismatch is one of the reasons most AI line items get killed. It is also one of the reasons the companies that persist past the awkward middle period end up with a meaningful operational advantage that is hard for newer entrants to replicate quickly.

What the few successful deployments have in common

The restoration companies that have produced visible results from AI in 2026 share a small number of characteristics. None of the characteristics are about the specific tools they bought. They are all about how the company approached the work.

The company had operational documentation discipline before it started the AI work: an existing prep standard, a structured set of training materials, a documented decision framework, or some equivalent body of captured operational wisdom that could serve as the substrate the AI tool would operate against.

The company targeted operational middle-layer use cases first, not customer-facing top-layer ones. The early wins were in things like file packaging, handoff briefing generation, scope review acceleration, training material drafting, and sub-coordination — boring internal capabilities that compounded into significant senior-operator time recovery.

The company embedded a senior operator as the day-to-day owner of the AI capability. That operator’s calendar reflected the commitment, and their judgment was the benchmark for whether the tool was producing value.

The company committed to a twelve-to-eighteen-month horizon for evaluation, with the understanding that the awkward middle period was structural rather than a sign of failure.

The company invested in the feedback loop between operator and tool. When the tool produced a bad output, that became data that improved the next output. The loop was deliberate, not incidental.

The company avoided the trap of trying to deploy across the whole organization at once. The successful deployments started narrow, proved value in one operational layer, and then expanded based on what was working rather than on a master rollout plan.

None of these characteristics are about technology. They are about operational seriousness applied to technology. The companies that brought operational seriousness to the work got results. The companies that treated AI as a technology purchase did not.

Where this cluster is going

The remaining articles in this cluster will go deep on each of the patterns the successful deployments share. The next article will address the question every owner asks first: given limited time and budget, what should we actually build first? That question has a defensible answer in 2026, and it is not the answer most vendors are pitching.

The article after that will go deep on what it actually means to treat the senior operator as the source code for an AI deployment — not as a metaphor, but as a literal description of where the operational substance of the tool comes from. Then an article on the economics of agent-assisted operations, which is the most underdiscussed topic in restoration AI right now and the one that will determine which companies are still profitable in 2028. And finally an article on how to evaluate AI tools without getting fooled by demos, vendor pitches, or the noise that currently dominates the conversation.

The point of the cluster is not to recommend specific tools. Tools change every quarter. The point is to give restoration owners a durable mental model for thinking about AI deployments — one that will still be useful in 2027 and 2028, regardless of which vendors have come and gone in the meantime. Operators who internalize the model will make consistently better decisions about AI than operators who chase the current vendor cycle. The model is the asset.

Next in this cluster: what to actually build first when you have limited time and budget — and why the obvious answer is almost always wrong.

April 15, 2026
Replacing the Interviewer: What the Human Distillery App Can and Cannot Do
Tygart Media Strategy

Volume Ⅰ · Issue 04 · Quarterly Position

By Will Tygart
• Long-form Position
• Practitioner-grade

The extraction protocol works. The pivot signal lexicon is learnable. The four-layer descent can be taught. The question is whether it can be deployed without a trained human interviewer in the room — and if so, how much of the value survives the translation.

This is the duplication problem at the center of the Human Distillery business model. Will can run an extraction session. An app cannot run the same session. But an app can run a version of the session — and for a large subset of extraction use cases, the version is sufficient.

Understanding what transfers and what doesn’t is the whole architectural question.

What Transfers to an App

The four-layer question structure is codifiable. A stateful conversational agent — not a chatbot, a system that maintains a running knowledge map of what’s been surfaced and what’s still needed — can execute the question sequences in order, navigate the domain-specific question libraries for a given vertical, and detect the linguistic markers of pivot signals in real time.

“It’s hard to explain” is detectable by NLP. Hedging patterns are detectable. Energy shifts in voice are detectable by acoustic analysis. Deflection to process — “the policy says…” — is detectable. The app can recognize these signals and adjust its question path, slowing down at tacit knowledge boundaries and applying the correct follow-up from the signal response library.
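
As a rough illustration of the text-based part of this detection, the sketch below uses plain regular expressions. The pattern set and the names `PIVOT_PATTERNS` and `detect_pivot_signals` are hypothetical; a production system would likely use a richer model, and acoustic energy shifts are out of scope for a text-only check.

```python
import re

# Hypothetical marker patterns; a real lexicon would be built from prior sessions.
PIVOT_PATTERNS = {
    "hard_to_explain": re.compile(r"\b(hard|difficult) to explain\b", re.IGNORECASE),
    "hedging": re.compile(
        r"\b(generally speaking|in most cases|it depends)\b", re.IGNORECASE
    ),
    "deflection_to_process": re.compile(
        r"\b(the policy (says|is)|we(['’]re| are) supposed to)\b", re.IGNORECASE
    ),
}


def detect_pivot_signals(utterance: str) -> list[str]:
    """Return the names of all pivot signals whose markers appear in the utterance."""
    return [
        name for name, pattern in PIVOT_PATTERNS.items() if pattern.search(utterance)
    ]


print(detect_pivot_signals("Generally speaking, the policy says we dry for three days."))
```

A single utterance can carry more than one signal, so the function returns a list rather than the first match. The agent can then decide which follow-up takes priority.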

The processing pipeline from transcript to structured concentrate is fully automatable: chunking by topic boundary, entity extraction, claim isolation, confidence scoring, contradiction flagging across multiple sessions, multi-model distillation rounds. This is where AI earns its keep. A human doing this manually would take days per session. The pipeline does it in minutes.
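
One stage of that pipeline, contradiction flagging across sessions, can be sketched as follows. This is a simplified illustration under stated assumptions: claims are already isolated and reduced to a numeric value per topic, and the `Claim` type, the topic keys, the sample values, and the 25% tolerance are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    session_id: str  # which extraction session the claim came from
    topic: str       # normalized key, e.g. "drying_days_hardwood" (hypothetical)
    value: float     # the number the subject stated


def flag_contradictions(
    claims: list[Claim], tolerance: float = 0.25
) -> dict[str, list[Claim]]:
    """Return topics whose stated values disagree across different sessions.

    A topic is flagged when (max - min) / max exceeds `tolerance`.
    """
    by_topic: dict[str, list[Claim]] = defaultdict(list)
    for claim in claims:
        by_topic[claim.topic].append(claim)

    flagged: dict[str, list[Claim]] = {}
    for topic, group in by_topic.items():
        sessions = {c.session_id for c in group}
        values = [c.value for c in group]
        high, low = max(values), min(values)
        if len(sessions) > 1 and high > 0 and (high - low) / high > tolerance:
            flagged[topic] = group
    return flagged


# Made-up sample data for illustration only.
claims = [
    Claim("s1", "drying_days_hardwood", 5),
    Claim("s2", "drying_days_hardwood", 3),
    Claim("s1", "crew_size", 4),
    Claim("s3", "crew_size", 4),
]
print(list(flag_contradictions(claims)))
```

A flagged topic is not necessarily an error. Two subjects may disagree because they work in different conditions, which is why the flag routes the claim to human review instead of discarding it.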

Domain-specific question libraries can be built from prior extractions and expanded with each new session. The more sessions the app runs in a given vertical, the richer its question library becomes. This is the compounding effect that makes the app more valuable over time.

What Doesn’t Transfer

Three things resist automation in ways that won’t be resolved by better models:

Micro-hesitation reading. The half-second pause before an answer that signals the subject knows more than they’re about to say. The slight change in phrasing when someone moves from what they’re comfortable saying to what they actually think. These are real-time, embodied, relational signals. A text-based app misses them entirely. A voice app gets closer but still lacks the visual channel that carries a significant portion of this information.

Protocol abandonment. The decision to stop following the four-layer sequence because the subject just said something unprompted that is more important than anything in the protocol. Expert interviewers make this call constantly. They recognize the thread that, if followed, goes somewhere the protocol would never reach. An app will follow the signal response library. It won’t recognize when the library should be put down.

Trust calibration. Whether the subject is performing for the recording or actually sharing. This is not detectable from content analysis. It requires the social intelligence to know when to lower the formality, when to match the subject’s energy, when to say something self-deprecating to signal that this is a peer conversation and not an evaluation. Subjects share differently with someone they trust. The app cannot build that trust.

The Honest Architecture

The tiered model that emerges from this analysis:

Tier 1 — App-led extraction. Well-mapped domains with accessible knowledge. The subject is cooperative. The question library is deep. The knowledge being sought is in Layers 1 and 2. The app handles the session. Will reviews the concentrate before delivery.

Tier 2 — Human-led extraction with app processing. High-stakes sessions. Guarded subjects. Knowledge at the outer edge of verbalization (Layers 3 and 4). Will conducts the session. The app runs the processing pipeline. Will reviews and approves the concentrate.

Tier 3 — Full human extraction and distillation. Strategic engagements. Subjects who will only speak candidly to a person they know. Knowledge so embedded that it requires real-time relational judgment to surface at all. Will does everything.
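
The tier assignment can be read as a small decision rule. The sketch below is one possible encoding, not a description of how the app actually routes sessions; the `SessionProfile` fields and the fallback for thin question libraries are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SessionProfile:
    deepest_layer_needed: int    # 1-4, per the four-layer descent
    subject_guarded: bool        # subject is unlikely to open up to software
    high_stakes: bool
    requires_known_person: bool  # subject only speaks candidly to someone they know
    question_library_deep: bool  # domain is well mapped


def assign_tier(p: SessionProfile) -> int:
    """Map a session profile to Tier 1, 2, or 3 as described above."""
    if p.requires_known_person:
        return 3
    if p.high_stakes or p.subject_guarded or p.deepest_layer_needed >= 3:
        return 2
    if p.question_library_deep:
        return 1
    return 2  # assumption: thin question libraries default to a human-led session


routine = SessionProfile(2, False, False, False, True)
print(assign_tier(routine))
```

The checks run from most restrictive to least, so a session that needs a known person lands in Tier 3 even if the domain is otherwise well mapped.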

The business model implication: Tier 1 is volume. Tier 3 is premium. The ratio shifts over time as the app’s question libraries deepen and its signal detection improves. What begins as mostly Tier 2 and 3 eventually becomes mostly Tier 1, with Will’s direct involvement reserved for the sessions where only a human can get the door open.

The app is not a replacement for the protocol. It’s a multiplier for the protocol — allowing it to run at a scale that a single human operator never could, while preserving the human layer for the cases that actually require it.
Human Distillery Knowledge Cluster

The Human Distillery: Full Extraction Methodology (Pillar)

Books for Bots: What a Knowledge Concentrate Is and How It’s Built

Related: Build the System Around the Behavior, Not the Tool — the design philosophy this methodology embodies.
April 13, 2026

The Human Distillery: A Methodology for Extracting Tacit Knowledge for AI Systems

By Will Tygart
• Long-form Position
• Practitioner-grade

Every organization has two kinds of knowledge. The documented kind — processes, policies, SOPs, training materials — lives in manuals and wikis. The other kind lives in people’s heads: the adjustments made without thinking, the thresholds learned from expensive mistakes, the pattern recognition that executes in a second but couldn’t survive a PowerPoint slide.

The first kind is easy to feed into an AI system. The second kind is what makes the organization actually work. And it almost never gets captured before it walks out the door.

This gap — between what’s written and what’s known — is where most enterprise AI implementations quietly fail. The system gets the documentation. It never gets the knowledge. The result is an AI that gives the same answer a new employee would give, while the 15-year veteran shakes their head and does it differently.

The Human Distillery methodology exists to close that gap. It is a structured extraction protocol for converting tacit knowledge into dense, structured artifacts — books for bots — that AI systems can actually use. Not summaries. Not transcripts. Knowledge concentrates: information-rich artifacts that encode relationships, decision logic, and confidence alongside the facts themselves.

This article is the methodology reference. It covers what tacit knowledge is and why it resists standard capture methods, the four-layer extraction protocol that surfaces it, the pivot signal lexicon that tells you when you’re close, what a knowledge concentrate looks like as a structured artifact, and where human judgment remains irreplaceable in the pipeline.

Why Standard Methods Don’t Work

The instinct when trying to capture organizational knowledge is to reach for one of three tools: a survey, an interview, or a documentation request. All three fail at tacit knowledge for the same reason: they ask people what they know. Tacit knowledge is knowledge people don’t know they know. It operates below the level of conscious articulation. You cannot survey it out of someone. You cannot ask them to write it down. You have to create the conditions under which it surfaces — and then recognize it when it does.

Forms and surveys capture what people think they do. Conversations capture what they actually do and why. The difference between those two things is the entire product.

A 20-year insurance adjuster asked “what’s your process for evaluating a water damage claim?” will give you the documented version: inspect the loss, review the policy, scope the damage, issue the estimate. This is accurate and useless. Ask them about a claim that went sideways and they will, unprompted, tell you that they always check the crawlspace first on older properties in this zip code because the contractor community there has a pattern of scope creep on foundation moisture that the initial inspection never catches. That’s the knowledge. It lives in the deviation from the process, not the process itself.

The Four-Layer Descent

The extraction protocol descends through four distinct layers in sequence. Each layer unlocks the next. Skipping a layer produces thin output. Rushing a layer produces performed output. The full descent, executed correctly, surfaces knowledge the subject didn’t know they were carrying.

Phase 0: Disarmament

Before any extraction begins, the status dynamic has to be neutralized. The subject needs to stop performing expertise for an evaluator and start explaining their world to a curious outsider. The difference in what comes out is dramatic.

The disarmament move: position yourself as someone who genuinely doesn’t know. “I’ve never seen a job like this — walk me through it like I’m shadowing you.” This does two things. It forces explanation of steps the subject considers so obvious they wouldn’t otherwise mention — which is exactly where embedded knowledge concentrates. And it signals that there’s no correct answer being evaluated, which reduces the filtering that kills tacit knowledge capture.

Open with failure. “Tell me about a job that went sideways” surfaces edge cases, exceptions, and judgment calls that success stories never reveal. People tell the truth in their failure stories. They’re not protecting anything.

Layer 1: Surface Protocol

The question: “What’s your process when X happens?”

What it gets: The documented version. What the subject would write in an SOP. What they’d tell a new hire on day one. Accurate. Insufficient. Necessary baseline.

Why you need it: The surface protocol establishes the frame. It’s the map. Everything that comes after is about finding where the territory diverges from the map — and those divergences are where the knowledge lives.

Layer 2: Exception Probing

The question: “When do you deviate from that?”

What it gets: The adaptive layer. The judgment calls that experience produces. The cases where the checklist gets ignored because the situation demands something the checklist can’t accommodate. This is the first layer where genuine tacit knowledge begins to surface.

The follow-up sequence: “And when does that happen?” → “How do you know it’s that situation?” → “What would you have done three years ago that you wouldn’t do now?” Each question peels back one more layer of accumulated judgment.

Layer 3: Sensory and Somatic

The question: “How do you know it’s that and not something else?”

What it gets: Pattern recognition so ingrained it operates below conscious awareness. The knowledge the subject has never verbalized because no one has ever asked them to. This is the hardest layer to surface and the most valuable thing in the concentrate.

What it sounds like: “The smell is different.” “The drywall feels wrong.” “Something about the way the insurance company rep is phrasing the emails.” These are not vague — they’re ultra-specific to a domain. The job is to slow down at these moments and press: “Describe the smell.” “What does wrong feel like compared to right?” “What in the phrasing specifically?” The subject usually thinks they can’t explain it. They can. They just haven’t been asked slowly enough.

Layer 4: Counterfactual Pressure

The question: “What would break if you weren’t here tomorrow?”

What it gets: The knowledge hierarchy. What actually matters versus what’s ritual. Most organizations don’t know which is which until the person who knows leaves. This layer surfaces the load-bearing knowledge — the things that if absent would produce visible failures, not just suboptimal outcomes.

The follow-up: “Who else knows that?” The answer is almost always “no one” or “maybe [one person].” That’s the knowledge risk. That’s also the product.
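
For readers who think in data structures, the four layers and their questions can be written down as an ordered sequence. This is a minimal sketch: the `Layer` type and `question_sequence` helper are hypothetical names, and Phase 0 is left out because it describes a stance rather than a scripted question.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    name: str
    opening_question: str
    follow_ups: list[str] = field(default_factory=list)


# Questions are taken from the layer descriptions above.
DESCENT = [
    Layer("Surface Protocol", "What's your process when X happens?"),
    Layer(
        "Exception Probing",
        "When do you deviate from that?",
        [
            "And when does that happen?",
            "How do you know it's that situation?",
            "What would you have done three years ago that you wouldn't do now?",
        ],
    ),
    Layer("Sensory and Somatic", "How do you know it's that and not something else?"),
    Layer(
        "Counterfactual Pressure",
        "What would break if you weren't here tomorrow?",
        ["Who else knows that?"],
    ),
]


def question_sequence(layers: list[Layer]):
    """Yield (layer name, question) pairs in descent order, never skipping a layer."""
    for layer in layers:
        yield layer.name, layer.opening_question
        for follow_up in layer.follow_ups:
            yield layer.name, follow_up


for layer_name, question in question_sequence(DESCENT):
    print(f"[{layer_name}] {question}")
```

A fixed sequence like this is only the skeleton. In a live session the interviewer slows down, repeats, or leaves the script entirely based on what the subject says.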

The Pivot Signal Lexicon

Proximity to tacit knowledge produces specific signals in conversation. Recognizing them in real time is the skill that separates a good extraction session from a great one. Miss these signals and you stay in Layer 1. Catch them and you descend.

Signal	What It Means	The Move
“It’s hard to explain…”	The subject is about to verbalize something they have never articulated before. This is the most valuable signal in the lexicon.	Slow everything down. “Try anyway.” Do not fill the silence. Do not offer a simpler question. Wait.
“You just kind of know”	Layer 3 boundary. The subject is pointing directly at tacit knowledge they don’t know how to surface.	“Walk me through the last time you just knew. What did you notice first?”
Hedging and qualifiers	The subject is filtering. They have an answer but aren’t sure it’s acceptable to say. “Generally speaking…” “In most cases…” “It depends…” are all hedges.	“Off the record — what actually happens?” Or: “What’s the version you’d tell a colleague vs. what you’d put in the manual?”
Sudden energy or animation	You’ve touched something they care about. The subject’s pace increases, their posture changes, they lean in. This is a live thread to a knowledge cluster.	Follow it immediately. Drop the protocol. “Tell me more about that.” The protocol can resume. This thread may not come back.
Deflection to process	The subject is avoiding the judgment layer. When asked what they do, they tell you what the process says to do. Often accompanied by “the policy is…” or “we’re supposed to…”	“But what do you do when that breaks down?” The emphasis on ‘you’ reframes the question from institutional to personal, which is where the knowledge actually lives.
Pausing before a number	The subject is calculating from experience, not retrieving from documentation. The pause is the gap between “what the spec says” and “what I know from doing this 200 times.”	Ask for the number, then: “Where does that come from?” The answer to the second question is often the most valuable thing in the session.
Unprompted stories	The subject has moved from answering your questions to accessing their own knowledge map. Stories they tell without being asked are almost always pointing at something important.	Let it run. If the story ends without the embedded knowledge surfacing, ask: “What made that one different from a normal job?”
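
The lexicon above is essentially a lookup from signal to move, which is easy to hold as data. The sketch below is illustrative only: the signal keys, the `pause_protocol` flag, and the "Go on." fallback are assumptions, and the moves are abbreviated from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    move: str
    pause_protocol: bool = False  # True when the move says to leave the script


SIGNAL_RESPONSES = {
    "hard_to_explain": Response("Try anyway."),
    "just_know": Response(
        "Walk me through the last time you just knew. What did you notice first?"
    ),
    "hedging": Response("Off the record, what actually happens?"),
    "sudden_energy": Response("Tell me more about that.", pause_protocol=True),
    "deflection_to_process": Response("But what do you do when that breaks down?"),
    "pause_before_number": Response("Where does that come from?"),
    "unprompted_story": Response(
        "What made that one different from a normal job?", pause_protocol=True
    ),
}


def respond(signal: str) -> Response:
    """Look up the move for a detected signal; fall back to a neutral prompt."""
    return SIGNAL_RESPONSES.get(signal, Response("Go on."))


print(respond("sudden_energy"))
```

Holding the moves as data means the table can grow with each session without touching the lookup logic.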

The Knowledge Concentrate: What the Output Actually Looks Like

A transcript is raw. A summary is shorter but barely denser in information. A knowledge concentrate is smaller than either and more information-rich than both, because it encodes relationships, decision logic, and confidence alongside the facts themselves.

The schema for a knowledge concentrate has five components:

Entity graph. Every named concept, process, person-role, piece of equipment, and decision point that surfaces in the extraction, mapped as nodes with typed edges between them. Not a list — a graph. The relationships are the knowledge. The entities alone are just vocabulary.

Decision logic. Every when-then-because statement extracted from the session. “When the moisture readings are above X in a crawlspace with Y flooring type, we always do Z because A.” Each statement is structured with a confidence score based on its source: firsthand knowledge, an observed pattern, or secondhand information.

Benchmarks. Every number that surfaces in extraction — thresholds, timelines, costs, rates, counts — with context, source count, and variance. A benchmark from one interview has low confidence. The same benchmark confirmed across six interviews in the same market has high confidence and is ready to be used as ground truth.

Tacit signatures. The things that are hard to explain — captured as best as they can be verbalized, with a confidence flag that signals to the AI system consuming them: this is approximate. This is the residue of knowledge that the extraction process got close to but couldn’t fully surface. It’s still valuable. It tells the AI where human judgment is concentrated.

Provenance. Traceable but anonymized. How many sources contributed to each claim. Whether a given piece of knowledge is individual or cross-validated. What industry and market it came from.

An AI system consuming a knowledge concentrate in this format doesn’t just know facts — it knows which facts to trust, how to chain them into decisions, and where the knowledge is thin enough that human judgment should be called in.
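
One way to picture the five components is as a typed record. The sketch below is illustrative only: the class and field names are assumptions, not a published schema, and the confidence cutoffs simply reuse the two data points given above (one source is low, six sources is high), with everything in between labeled "medium" as a further assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from statistics import mean, pstdev


class Basis(Enum):
    """How the subject knows a rule (the three cases named in the text)."""
    FIRSTHAND = "firsthand"
    OBSERVED_PATTERN = "observed_pattern"
    SECONDHAND = "secondhand"


@dataclass
class Edge:
    source: str    # entity name
    relation: str  # typed edge label, e.g. "triggers" (label is illustrative)
    target: str    # entity name


@dataclass
class DecisionRule:
    when: str
    then: str
    because: str
    basis: Basis
    source_count: int = 1


@dataclass
class Benchmark:
    name: str
    unit: str
    context: str
    observations: list[float] = field(default_factory=list)  # one value per source

    @property
    def source_count(self) -> int:
        return len(self.observations)

    @property
    def confidence(self) -> str:
        # Assumed cutoffs: the text only says 1 source is low and 6 is high.
        if self.source_count >= 6:
            return "high"
        return "low" if self.source_count <= 1 else "medium"

    def spread(self) -> tuple[float, float]:
        """(mean, population standard deviation); needs at least one observation."""
        return mean(self.observations), pstdev(self.observations)


@dataclass
class TacitSignature:
    description: str
    approximate: bool = True  # tells the consuming system this is a best-effort verbalization


@dataclass
class Provenance:
    industry: str
    market: str
    source_count: int      # anonymized: a count, never names
    cross_validated: bool


@dataclass
class KnowledgeConcentrate:
    entities: set[str]
    edges: list[Edge]
    decision_logic: list[DecisionRule]
    benchmarks: list[Benchmark]
    tacit_signatures: list[TacitSignature]
    provenance: Provenance
```

Keeping the raw per-source observations on each benchmark, rather than a single averaged number, is what lets a consuming system recompute source count and variance as more sessions are added.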

What the App Can Do and What It Can’t

The four-layer protocol and the pivot signal lexicon can be partially codified. A stateful conversational agent — not a chatbot, a genuinely stateful system that maintains a running knowledge map of what’s been surfaced and what’s still needed — can execute the question sequences, detect linguistic pivot signals, navigate domain-specific question libraries, and run the processing pipeline from transcript to structured concentrate.

What it cannot do is the thing that makes the difference between a good extraction and a complete one:

It cannot read the half-second of hesitation before an answer that signals the subject knows more than they’re about to say. It cannot decide, in the middle of an unprompted story, that this tangent is the most important thing in the session and the protocol should be abandoned to follow it. It cannot calibrate trust — cannot sense whether the subject is performing for the recording or actually sharing, and adjust accordingly. It cannot distinguish a valuable tangent from genuine noise in real time.

These are not gaps that better models will close. They are inherently relational and embodied. They require a human who is genuinely present in the conversation, not processing a transcript of it.

The honest architecture for a distillery operation is therefore tiered. The app handles extraction volume — the sessions where the knowledge is relatively accessible, the domain is well-mapped, and the question library is sufficient. The human handles the sessions where the stakes are highest, the subject is guarded, or the knowledge being sought is at the outer edge of what can be verbalized. And the human is always the quality gate on the final concentrate, regardless of which path produced it.

Why This Works in Any Industry

Tacit knowledge is not a property of any particular field. It is a property of human expertise at depth. Wherever humans have been doing something long enough to develop judgment that exceeds documentation — which is everywhere — the distillery protocol applies.

The domain changes the question library. The pivot signals are universal. The four-layer structure works in restoration, in legal practice, in medicine, in financial services, in manufacturing, in competitive sports coaching, in culinary production. Any field where experience produces something that training cannot replicate is a field where a knowledge concentrate has value.

The buyers are the organizations trying to make that knowledge portable. The AI system that needs to give the same answer a 20-year veteran would give. The consultant whose insights live only in their head. The franchise trying to replicate the judgment of its best operators across 400 locations. The company that just lost its most important employee and is only now discovering what they actually knew.

The product is not content. It is not a report. It is a structured knowledge artifact that makes someone else’s irreplaceable expertise replicable — at least partially, at least for the cases the documentation currently handles worst.

That’s the distillery. Extract. Distill. Deploy.

Frequently Asked Questions

How long does a single extraction session take?

A full four-layer descent with one subject takes 60–90 minutes. Rushing below 45 minutes consistently produces shallow output, because the session ends before Layer 3 is reached. Three to five sessions with different subjects in the same domain produce a concentrate with enough cross-validation to have meaningful confidence scores on the decision logic and benchmarks.

What industries is this most applicable to?

Any industry where experience produces judgment that documentation can’t replicate. The highest-value applications are in fields with expensive mistakes (medical, legal, engineering), fields with long apprenticeship periods (skilled trades, finance, consulting), and fields where the knowledge is currently locked in one or two people (most small and mid-size businesses).

How is this different from a McKinsey-style knowledge management engagement?

Traditional knowledge management captures process documentation — what should happen. The distillery protocol captures judgment documentation — what actually happens, and why, and when the standard answer is wrong. The output is structured for AI consumption, not human reading. The concentrate is designed to be queried, not read.

What happens to the concentrate after it’s produced?

The concentrate is delivered to the client for ingestion into their AI infrastructure — as a RAG knowledge base, as fine-tuning data, as a reference layer for their AI assistant, or as structured context for their customer-facing AI systems. The format is designed to be immediately usable without further transformation. The provenance metadata ensures the client knows which claims to trust at what confidence level.

Can the extraction protocol be deployed without a trained human interviewer?

Partially. A well-built stateful conversational agent can execute the question sequences, detect linguistic pivot signals, and run the processing pipeline. What it cannot do is the real-time relational judgment that surfaces the deepest knowledge: the hesitation reading, the trust calibration, the decision to abandon the protocol and follow an unexpected thread. For accessible knowledge in well-mapped domains, the app is sufficient. For the knowledge furthest from the surface, at the outer edge of what can be verbalized, the human remains in the loop.

Human Distillery Knowledge Cluster

Related: Build the System Around the Behavior, Not the Tool — the design philosophy this methodology embodies.

April 13, 2026

Four-Layer Data Architecture: Building Around Behaviors, Not Tools
Tygart Media Strategy

Volume Ⅰ · Issue 04 · Quarterly Position

By Will Tygart
• Long-form Position
• Practitioner-grade

The instinct, when building a complex operation, is to find one tool that can hold everything. One source of truth. One dashboard. One system of record for all data types.

This instinct is wrong, and it produces exactly the kind of system it’s trying to avoid: a single tool that does everything poorly, a migration project that costs more than the original implementation, and a team that has learned to distrust the data because the tool was never designed for the behaviors it was forced to support.

The behavior-first alternative for data architecture doesn’t start with “what tool can hold everything.” It starts with: what are the distinct behaviors this data needs to support, and which tool is genuinely best suited for each one?

The Four Data Behaviors

In a multi-site AI-native content operation, four distinct data behaviors emerge:

Machine-generated operational data needs to be written and read by automated systems at high speed. Batch job results, embedding vectors, image processing logs, Cloud Run execution histories. No human looks at this data directly. It needs to be fast, cheap, and structured for programmatic access. GCP serves this behavior — Firestore for structured operational state, Cloud Storage for large artifacts, BigQuery for analytical queries across the full dataset.

Human-actionable signals need to be displayed clearly enough that a person can take action without wading through noise. Site health alerts, content gaps, client status changes, task assignments. This data needs to be readable, filterable, and connected to the people who need to act on it. Notion serves this behavior — not because it’s the most powerful database, but because it’s the most human-readable one, with views that can surface exactly the signal each role needs.

Published content needs to be delivered to web visitors and search engines at performance standards those audiences require. WordPress serves this behavior. It was designed for it. The mistake is asking WordPress to also serve as the storage layer for unpublished content, the analytics layer for content performance, or the task management layer for content production. It wasn’t designed for those behaviors and it’s not good at them.

Files and documents need to be stored, versioned, and shared across tools and collaborators. Google Drive serves this behavior. Skills, SOPs, brand guidelines, exported data — anything that exists as a file rather than as structured data belongs in Drive, not in a database trying to handle file attachments as a secondary feature.
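
The four behaviors and their assigned tools can be written down as a lookup, which makes "where does this data live?" a question with exactly one answer. This is a minimal sketch: the tool assignments come from the paragraphs above, while the enum, the function name, and the lowercase data-kind labels are illustrative assumptions.

```python
from enum import Enum


class Behavior(Enum):
    MACHINE_OPERATIONAL = "machine-generated operational data"
    HUMAN_SIGNAL = "human-actionable signal"
    PUBLISHED_CONTENT = "published content"
    FILE = "file or document"


# Tool assignments as stated in the text.
HOME = {
    Behavior.MACHINE_OPERATIONAL: "GCP",
    Behavior.HUMAN_SIGNAL: "Notion",
    Behavior.PUBLISHED_CONTENT: "WordPress",
    Behavior.FILE: "Google Drive",
}

# Example data kinds drawn from the text; the keys are illustrative labels.
KIND_TO_BEHAVIOR = {
    "batch job result": Behavior.MACHINE_OPERATIONAL,
    "embedding vector": Behavior.MACHINE_OPERATIONAL,
    "site health alert": Behavior.HUMAN_SIGNAL,
    "task assignment": Behavior.HUMAN_SIGNAL,
    "published article": Behavior.PUBLISHED_CONTENT,
    "sop": Behavior.FILE,
    "brand guidelines": Behavior.FILE,
}


def home_for(kind: str) -> str:
    """Return the tool that should be the system of record for a data kind."""
    behavior = KIND_TO_BEHAVIOR.get(kind.strip().lower())
    if behavior is None:
        # Refuse to guess: an unclassified kind needs a behavior before it gets a home.
        raise KeyError(f"no behavior assigned yet for {kind!r}")
    return HOME[behavior]
```

Raising on an unknown kind, instead of falling back to a default tool, is a deliberate choice in this sketch: the fallback is how the most convenient tool ends up holding everything.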

Why Separation Produces Better Systems

A four-layer architecture feels like more complexity than a single-tool approach. In practice it produces less complexity, because each tool is operating within its design constraints instead of being stretched beyond them.

The signal-to-noise problem in most dashboards comes from forcing machine-generated data and human-actionable signals into the same view. The machine data overwhelms the human signals. The solution is usually “better filtering” — which is the wrong answer. The right answer is storing machine data where machines can read it and surfacing human signals where humans can act on them.

The performance problem in most content operations comes from asking WordPress to be a content management system (CMS) when it’s a content delivery system (CDS). The content that belongs in a CMS (drafts, revisions, briefs, research notes) should be in Notion. The content that belongs in a CDS (published articles, page templates, media files) should be in WordPress. When you separate these, both tools perform their actual function better.

The data loss problem in most operations comes from treating the most convenient tool as the system of record. When content lives only in WordPress, a site failure is a data failure. When operational state lives only in a Cloud Run service, a deployment change is a state failure. The four-layer architecture ensures that each data type has a permanent home in the tool designed to hold it — and that the tools interact through APIs rather than through manual migration.
Behavior-First System Design — Knowledge Cluster

Build the System Around the Behavior, Not the Tool (Pillar)

Notion as Storage Layer, WordPress as Distribution Layer

Tacit Knowledge Extraction: Why the Behavior Comes First

Separating Intelligence from Execution: The AI Work Order Architecture

ADHD and AI-Native Operations: Designing Around the Behavior

A CRM Is a Tool. A Community Is a Behavior.

Four-Layer Data Architecture: Building Around Behaviors

Related: CRM Community Framework for Restoration Companies — the live proof of concept for behavior-first system design.
April 13, 2026
ADHD and AI-Native Operations: Designing Around the Behavior, Not Against It
Tygart Media Strategy

Volume Ⅰ · Issue 04 · Quarterly Position

By Will Tygart
• Long-form Position
• Practitioner-grade

The conventional wisdom about ADHD and work is built around a simple premise: the ADHD brain is deficient in the behaviors that work requires, and management strategies exist to compensate for those deficiencies. More structure. Better schedules. Accountability systems. Tools designed to impose the consistency the brain doesn’t generate naturally.

This is tool-first thinking applied to a human brain. And like most tool-first thinking, it produces systems that fight the behavior instead of serving it.

The behavior-first alternative asks a different question: what does the ADHD brain actually do, at its best, and what system design would allow it to do more of that?

What the ADHD Brain Actually Does

Three behaviors characterize high-functioning ADHD cognition when the environment supports them:

Hyperfocus. Sustained, intense concentration that arrives unbidden and runs at extraordinary depth for an unpredictable duration. Not concentration on demand — concentration that seizes the operator when a problem activates the interest system. The output of a hyperfocus session is disproportionate to the time invested, and the quality often exceeds what deliberate, scheduled work produces.

Interest-based attention routing. The ADHD attention system allocates based on interest, novelty, urgency, or challenge — not importance. High-interest work gets exceptional focus. Low-interest work gets almost none. This is not a failure of will. It’s a feature of a different attentional architecture.

Cross-domain pattern recognition. Rapid context-switching, which looks like distractibility in sequential-task environments, produces something valuable in environments that reward synthesis: the ability to connect observations across unrelated domains and identify patterns that single-domain experts miss.

The System That Serves These Behaviors

An AI-native operation designed around these behaviors looks different from a conventional productivity system:

For hyperfocus: The system captures whatever the hyperfocus session produces — immediately, in full, without requiring the operator to organize it mid-session. The Second Brain stores the output. The cockpit session for the next day picks up the thread. The non-linearity of hyperfocus (jumping between connected insights, building in spirals) becomes productive because the AI can hold the full context of the spiral across sessions.

For interest-based attention: Low-interest, deterministic work routes to automated pipelines. Haiku runs taxonomy fixes at scale. Cloud Run handles scheduled publishing. Batch jobs process a hundred posts while the operator is doing something that has activated their interest system. The attention that would have been coerced onto low-interest work is freed for the high-interest work where ADHD attention genuinely excels.

For pattern recognition: The cross-domain synthesis that ADHD cognition produces naturally — connecting a restoration industry CRM insight to an AI architecture principle to a neurodiversity research finding — is exactly what generates the novel frameworks that constitute a knowledge operation’s core asset. This isn’t compensated for. It’s the product.

The Architecture Principle

The systems that emerged from designing around ADHD constraints are not ADHD-specific. They are better systems. External working memory (the Second Brain) outperforms internal working memory for complex multi-client operations regardless of neurology. Routing low-value-attention work to automation is better for any operator. Pre-staged context reduces friction for everyone.

The ADHD constraints forced designs that a neurotypical operator would also benefit from — because the constraints that neurodivergence makes extreme are present in milder form in everyone. The behavior-first design process, applied to an ADHD brain, produced infrastructure. The same process, applied to any operation, produces the same result: systems that serve the actual behavior, compound over time, and don’t require the operator to fight their own cognition to function.
April 13, 2026
Separating Intelligence from Execution: The AI Work Order Architecture
Tygart Media Strategy

Volume Ⅰ · Issue 04 · Quarterly Position

By Will Tygart
• Long-form Position
• Practitioner-grade

AI systems are good at identifying problems. Automated systems are good at fixing them. The failure mode that kills most AI automation projects is building them as one thing instead of two.

When you couple intelligence and execution in a single system, you get something that can do everything slowly and nothing reliably. The intelligence layer needs to be conversational, contextual, and judgment-driven. The execution layer needs to be deterministic, fast, and parallelizable. These are fundamentally different behaviors, and they require different tools.

The Work Order as the Bridge

The behavior-first design for AI automation has three distinct stages: identify (Claude analyzes a system and surfaces what needs to be done), deposit (Claude writes a structured work order to a persistent queue), and execute (a Cloud Run worker reads the work order and runs the fix).

The work order is the key artifact. It’s the contract between the intelligence layer and the execution layer. A well-formed work order contains everything the execution layer needs to run without asking Claude any follow-up questions: the target (site, post ID, endpoint), the operation (what to do), the parameters (how to do it), and the success criteria (how to know it worked).

When the work order is well-formed, the execution layer is a dumb runner. It doesn’t need to understand context, history, or judgment. It reads the work order, executes the operation, and writes the result back. The intelligence that produced the work order stays in the intelligence layer — which is exactly where it belongs.
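
As a rough illustration, a work order with the four parts named above (target, operation, parameters, success criteria) might look like the following. The field names, the "Queued" default, and the JSON serialization are assumptions made for the sketch, not a documented format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class WorkOrder:
    # Target: where the operation runs.
    site: str
    post_id: int
    endpoint: str
    # Operation and parameters: what to do and how.
    operation: str
    parameters: dict
    # Success criteria: how the runner knows it worked.
    success_criteria: str
    status: str = "Queued"

    def missing_fields(self) -> list[str]:
        """Names of required text fields left empty; an empty list means well-formed."""
        required = ("site", "endpoint", "operation", "success_criteria")
        return [name for name in required if not getattr(self, name)]

    def to_json(self) -> str:
        """Serialize for a persistent queue; the runner needs nothing beyond this."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "WorkOrder":
        return cls(**json.loads(payload))


# Hypothetical example values, for illustration only.
order = WorkOrder(
    site="example.com",
    post_id=123,
    endpoint="/wp-json/wp/v2/posts/123",
    operation="inject_faq_schema",
    parameters={"schema_type": "FAQPage"},
    success_criteria="post content contains a FAQPage JSON-LD block",
)
```

Checking `missing_fields()` before a work order is deposited keeps the judgment call in the intelligence layer: an order that would force the runner to ask a follow-up question never reaches the queue.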

What This Looks Like in Practice

In a multi-site content operation, Claude might analyze a WordPress site and identify 47 posts with missing FAQ schema. The tool-first approach runs Claude in a loop, generating and publishing schema for each post sequentially. This is slow, context-dependent, and fragile — if Claude loses context mid-run, the job is incomplete and the state is unclear.

The behavior-first approach: Claude generates 47 structured work orders, one per post, and deposits them in a Notion database with status “Queued.” A Cloud Run service reads the queue and processes each work order independently, in parallel, writing results back to each row. Claude is done in minutes. The Cloud Run service finishes the execution while Claude is doing something else entirely.

The behaviors are clean. The tools serve them. The system scales horizontally without requiring Claude to be in the loop for execution.
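
The execution side of that example can be sketched with a thread pool. This is a simplified stand-in under stated assumptions: the queue is an in-memory list instead of a Notion database, `inject_faq_schema` is a hypothetical placeholder that makes no network calls, and the status values other than "Queued" are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def inject_faq_schema(order: dict) -> dict:
    """Hypothetical placeholder for the real fix; returns the updated row."""
    return {**order, "status": "Done", "result": f"schema added to post {order['post_id']}"}


def run_one(order: dict) -> dict:
    """Execute one work order in isolation so a failure never blocks the others."""
    try:
        return inject_faq_schema(order)
    except Exception as exc:  # record the failure on the row instead of crashing the run
        return {**order, "status": "Failed", "result": str(exc)}


def drain_queue(queue: list[dict], max_workers: int = 8) -> list[dict]:
    """Process every queued order in parallel and return the updated rows."""
    queued = [order for order in queue if order["status"] == "Queued"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, queued))


# 47 work orders, one per post, as in the example above.
queue = [
    {"post_id": n, "operation": "inject_faq_schema", "status": "Queued"}
    for n in range(1, 48)
]
results = drain_queue(queue)
```

Because each order carries its own status, an interrupted run leaves a clear state: rows still marked "Queued" are exactly the ones left to process.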

The Two Lanes of AI Automation

Not everything belongs in the work order queue. Some operations require judgment that the execution layer can’t replicate: content quality assessment, strategy decisions, anything where “it depends” is the correct first answer. These belong in a different lane — one where Claude stays in the loop through completion.

A mature AI automation architecture has both lanes clearly defined. Deterministic operations (taxonomy fixes, schema injection, meta rewrites, image uploads, internal link additions) go to the work order queue and run without Claude. Judgment-dependent operations (content strategy, quality review, client recommendations) stay in the conversational layer where Claude’s judgment can be applied continuously.

The discipline is in knowing which lane each operation belongs in — and resisting the temptation to put judgment-dependent work in the queue just because it would be faster. Faster execution of the wrong thing is not an improvement.
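
The lane decision itself can be made explicit. In this sketch the two operation lists are taken from the paragraph above; the lane names and the rule that unclassified operations default to the conversational lane are assumptions, chosen so that nothing reaches the queue merely because it would be faster.

```python
# Operations the text names as deterministic.
DETERMINISTIC = {
    "taxonomy fix",
    "schema injection",
    "meta rewrite",
    "image upload",
    "internal link addition",
}

# Operations the text names as judgment-dependent.
JUDGMENT = {"content strategy", "quality review", "client recommendation"}


def lane_for(operation: str) -> str:
    """Return 'work_order_queue' only for operations known to be deterministic."""
    op = operation.strip().lower()
    if op in DETERMINISTIC:
        return "work_order_queue"
    # Judgment work, and anything not yet classified, stays with Claude in the loop.
    return "conversational"
```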
April 13, 2026
