The Plan-Mode-Plus-Hooks Pattern: How to Actually Trust Claude Code in a Production Repo

Claude Notion editorial image 29

About Will

I run a multi-site content operation on Claude and Notion with autonomous agents — and I write about what we do, including what breaks.

Connect on LinkedIn →

There is a workflow gap most Claude Code users walk straight into and never quite close. CLAUDE.md tells Claude what should happen. Plan mode lets you see what Claude intends to do. Hooks decide what Claude is physically allowed to do. Pick any one of those in isolation and you get a tool that is impressive in a demo and unreliable in a real repo. Pair plan mode with hooks the right way and Claude Code stops being a chat surface and starts behaving like a constrained junior engineer you can leave alone for an hour.

This is the workflow I have moved every non-trivial repo onto. It is not the simplest setup — that would be raw claude with a CLAUDE.md and trust. It is the setup that survives the moment Claude decides, with great confidence, to delete the wrong file.

The three layers, and why most people only use two

Claude Code as a programmable platform has three durable surfaces for shaping its behavior in 2026:

  1. CLAUDE.md — the markdown memory file Claude reads at the start of every session. Project conventions, glossary, “don’t touch this directory,” coding style.
  2. Plan mode — the read-only review gate, activated with Shift+Tab twice or /plan. No edits, no shell, no git. Claude proposes an implementation plan against the live codebase and waits.
  3. Hooks — deterministic shell scripts that fire on specific tool calls or session events. Pre-commit linting, blocking edits to generated files, refusing pushes to main.

The standard pattern I see in repos is CLAUDE.md plus vibes. Sometimes plan mode for the big tasks. Almost no one is running hooks until they have been burned once. That is the wrong order. Hooks are not advanced — they are the thing that lets plan mode actually mean something.

The reason is empirical and uncomfortable: CLAUDE.md instructions get followed roughly 70% of the time. That is acceptable for “prefer arrow functions” and catastrophic for “don’t push to main.” Plan mode raises the floor on the high-stakes decisions because you see the plan before any tool runs. Hooks raise the ceiling on the boring ones because they execute regardless of Claude’s intent.

What the pairing actually looks like

The mental model: plan mode is for novel work where you need to inspect the strategy. Hooks are for recurring boundaries you do not want to inspect ever again. If you find yourself reviewing the same kind of decision in plan mode twice, that decision belongs in a hook.

A concrete setup from one of my repos:

CLAUDE.md — short. Project glossary, the test command, the “production data is in prod/ and is read-only” rule, the rule that all new files in src/ need a test in tests/. Maybe forty lines. No essay.

Plan mode discipline — anything that touches more than three files, anything that changes a public interface, anything that touches the database schema, I open with /plan. I read the plan. I push back. Then I let it run. For one-file edits, bug fixes I have already scoped, or doc changes, I skip planning. The cost of planning a two-line fix is higher than the cost of undoing it.

Hooks doing the actual enforcement. This is where the work lives. The hooks I run on every active repo:

  • A PreToolUse hook on Bash that blocks any command matching git push.*main, rm -rf, or any reference to a path under prod/. Returns a non-zero exit and tells Claude what to do instead.
  • A PreToolUse hook on Edit and Write that refuses any file path matching the generated-code globs from .gitattributes. If the file is autogenerated, Claude is rewriting source-of-truth, not output.
  • A PostToolUse hook on Edit that runs the linter on just the touched file and surfaces the diagnostics back to Claude. Cheap, fast, closes the loop without waiting for the next test run.
  • A Stop hook that runs the test suite. Claude does not get to mark the task done if tests are red. This single hook eliminated about 80% of my “it said it was done but” moments.

That last one is the one I would put in every repo before anything else. Without it, Claude verifies its work using its own judgment, which degrades as context fills. With it, each red-to-green cycle is an unambiguous external signal that the work is actually done.

Where this pairing earns its keep

Two scenarios where the plan-mode-plus-hooks combination pays for the setup time:

The unfamiliar-codebase refactor. Claude in plan mode reads the codebase, proposes a refactor across eight files, lists what it will touch and what it will leave alone. You scan the plan, notice it wants to modify a file in a directory that should be read-only, and instead of arguing in chat you add a hook. The hook is now permanent. The next session cannot make the same mistake.

The long-running, multi-step job. You send Claude off to add a feature with twelve subtasks. You are not watching. The Stop hook running tests means Claude either finishes with a green suite or stops and reports. The push-to-main hook means even if Claude decides the merge looks fine, it physically cannot ship it. You get back, read the report, merge. The autonomy is real because the guardrails are real.

What this pattern is not

It is not a replacement for reading Claude’s diffs. Hooks catch categorical mistakes — wrong directory, wrong branch, wrong command — and miss subtle ones, like a refactor that compiles and passes tests but breaks a contract no test covered. Plan mode catches strategic mistakes — wrong approach, wrong scope — and misses tactical ones, like an off-by-one. You still review code. You just stop spending review time on things a script can check.

It is also not a substitute for subagents or skills. Hooks are deterministic enforcement. Subagents are context isolation for parallel work. Skills are reusable procedural knowledge. The Anthropic team’s own framing — start with skills, add hooks when you need deterministic enforcement, add subagents when parallel work or context isolation matters — is correct, and the three layers compose. But the order most practitioners actually need is the inverse of the order they reach for. Most teams reach for subagents first because they sound powerful. Hooks are what makes any of it trustworthy.

The setup that gets you to a usable baseline

If you have one hour, do this in this order:

First, write a forty-line CLAUDE.md. The test command, the build command, the directory rules, the glossary. Do not try to write an essay about your codebase. Claude will read it every session — keep it dense.

Second, add three hooks: a PreToolUse Bash hook blocking destructive commands on your protected paths, a PostToolUse Edit hook running the linter on the touched file, and a Stop hook running the test suite. Twenty lines of shell each. None of them require any framework — they are just executables that read JSON from stdin and exit non-zero to block.

Third, develop the habit of /plan for anything you would not be comfortable letting a new contractor commit without review. For everything else, let it run.

That is the baseline. You can layer on subagents, MCP servers, skills, custom slash commands — all of it is useful, none of it is required to ship reliably. The reliability comes from the boring layer: a memory file Claude reads, a plan mode you actually use, and hooks that mean what they say.

The Claude Code documentation will teach you the syntax for any of this in an afternoon. The pattern is the part that took a year of watching it go wrong to settle on.

Sources: Anthropic’s Claude Code documentation, the model list at the Anthropic docs site (verified at runtime), and a year of repos.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *