Claude’s context window is one of the most consequential — and most misunderstood — specs in the AI landscape. It determines how much information Claude can hold and reason about at once. Get it wrong in your planning and you’ll hit hard walls mid-task. This guide covers exactly how large Claude’s context window is, how it differs by model and plan, and what it means in practice.
What is a context window? The context window is Claude’s working memory for a conversation — the total amount of text (including your messages, Claude’s responses, uploaded files, and system instructions) that Claude can actively process at once. When a conversation exceeds this limit, Claude can no longer reference earlier parts of it without summarization or a new session.
Claude’s Context Window Size by Model and Plan
Context window size in Claude varies by model, plan type, and which product surface you’re using. Here’s the accurate picture as of April 2026:
Claude.ai (Web and Mobile Chat)
For users on paid claude.ai plans — Pro, Max, Team, and most Enterprise — the context window is 200,000 tokens across all models. According to Anthropic’s support documentation, that’s roughly 500 pages of text.
Enterprise plans have access to a 500,000 token context window on specific models. This is a plan-level feature, not a model selection — contact Anthropic’s enterprise team for details on which models qualify.
Claude Code (Terminal and IDE)
The largest context window — 1 million tokens — is available specifically through Claude Code on paid plans:
Claude Opus 4.6: Supports a 1M token context window in Claude Code on Pro, Max, Team, and Enterprise plans. Pro users need to enable extra usage to access Opus 4.6 in Claude Code.
Claude Sonnet 4.6: Also supports a 1M token context window in Claude Code, but extra usage must be enabled to access it (except for usage-based Enterprise plans).
Claude API
Via the direct API, the current model context windows as published in Anthropic’s official documentation are:
Claude Opus 4.7: 1,000,000 tokens
Claude Sonnet 4.6: 1,000,000 tokens
Claude Haiku 4.5: 200,000 tokens
What 200K Tokens Actually Means
Tokens are not the same as words. A token is roughly 3–4 characters, which works out to approximately 0.75 words in English. Here’s how the 200K token context window translates into practical content:
~150,000 words of plain text
~500+ pages of a standard document
A full-length novel (most are 80,000–120,000 words) with room to spare
Hundreds of emails in a thread
A moderately large codebase or multiple interconnected files
Hours of meeting transcripts
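If you want to sanity-check whether a document will fit before uploading it, the 0.75-words-per-token rule of thumb is enough for a rough estimate. A minimal sketch in Python; the ratio is an approximation, and real tokenization varies by content:

```python
TOKENS_PER_WORD = 1 / 0.75  # ~1.33 tokens per English word (rule of thumb)

def words_to_tokens(words: int) -> int:
    """Rough token estimate; actual counts vary by content."""
    return round(words * TOKENS_PER_WORD)

def fits_in_window(words: int, window_tokens: int = 200_000) -> bool:
    return words_to_tokens(words) <= window_tokens

print(words_to_tokens(150_000))  # ~200,000 tokens
print(fits_in_window(120_000))   # True: a long novel with room to spare
```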
For the vast majority of everyday tasks — document review, writing, research, coding, analysis — 200K tokens is more than enough. The ceiling only becomes relevant for extended research sessions, very large codebases, or scenarios where you need to maintain context across a lengthy back-and-forth over many hours.
What 1M Tokens Actually Means
One million tokens is roughly 750,000 words — equivalent to about five full-length novels, or a substantial enterprise codebase in a single session. The practical use cases that genuinely require this scale are narrower than the marketing suggests, but they’re real:
Large codebase analysis: Feeding an entire repository — multiple files, modules, and dependencies — into a single Claude Code session for architecture review, debugging, or refactoring.
Book-length document processing: Analyzing or summarizing an entire textbook, legal corpus, or research archive without chunking.
Long-running agentic workflows: Multi-agent tasks where conversation history, tool call results, and accumulated context grow significantly over time.
Extended conversation history: Maintaining full context across a very long research or writing session without losing earlier exchanges.
For most individual users on claude.ai, the 200K chat context window is the relevant number. The 1M context window matters most to developers building on the API and power users running Claude Code sessions on large codebases.
Context Window vs. Usage Limit: Two Different Things
This is the most common point of confusion. The context window and usage limit are separate constraints that operate independently:
Context window (length limit): How much content Claude can hold in a single conversation. This is a technical capability of the model. When you hit the context window, Claude can no longer actively process earlier parts of the conversation without summarization.
Usage limit: How much you can interact with Claude over a rolling time period — the five-hour session window and weekly cap on paid plans. This controls how many total messages and how much total compute you consume across all your conversations, not the depth of any single conversation.
You can hit a usage limit without ever approaching the context window (many short conversations). You can also approach the context window limit without hitting your usage limit (one very long, deep conversation). They’re orthogonal constraints.
Automatic Context Management
For paid plan users with code execution enabled, Claude automatically manages long conversations when they approach the context window limit. When the conversation gets long enough that it would otherwise hit the ceiling, Claude summarizes earlier messages to make room for new content — allowing the conversation to continue without interruption.
Important details about how this works:
Your full chat history is preserved — Claude can still reference earlier content even after summarization.
The summarization itself does not count toward your usage limit.
You may see Claude note that it’s “organizing its thoughts” — this indicates automatic context management is active.
Code execution must be enabled for automatic context management to work. Users without code execution enabled may encounter hard context limits.
Rare edge cases — very large first messages or system errors — may still hit context limits even with automatic management active.
How Context Window Affects Cost on the API
For developers using the Claude API directly, context window size has direct billing implications. Every token in the context window — input messages, conversation history, system prompts, uploaded documents, and tool call results — is billed as input tokens on each API call.
This creates an important cost dynamic: long conversations get progressively more expensive per message. In a 100-message thread, every new message requires reprocessing the entire conversation history as input tokens. A session that started at $0.01 per exchange can reach $0.10 or more per exchange by message 80.
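A minimal sketch of that dynamic, assuming $3 per million input tokens (the Sonnet-class rate cited later in this guide) and roughly 500 tokens added per exchange; both numbers are placeholders:

```python
INPUT_PRICE_PER_TOKEN = 3.00 / 1_000_000  # assumed Sonnet-class input rate
TOKENS_PER_EXCHANGE = 500                 # assumed average tokens per exchange

def input_cost_at_message(n: int) -> float:
    # Each new call re-sends the entire accumulated history as input tokens.
    history_tokens = n * TOKENS_PER_EXCHANGE
    return history_tokens * INPUT_PRICE_PER_TOKEN

for n in (1, 40, 80):
    print(f"message {n}: ~${input_cost_at_message(n):.4f} in input tokens")
# Per-exchange cost grows linearly with thread length, so total spend grows quadratically.
```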
Two features exist specifically to manage this cost:
Prompt caching: For repeated content — large system prompts, reference documents, or conversation history that doesn’t change — prompt caching allows Claude to read from a cache at roughly 10% of the standard input token price, rather than reprocessing the same content on every call. This can reduce costs by up to 90% on cached content.
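In the Python SDK, caching is opted into per content block via a cache_control marker. A hedged sketch; the model string and document variable are placeholders:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
LARGE_REFERENCE_DOCUMENT = "..."  # placeholder: a large, stable reference text

response = client.messages.create(
    model="claude-sonnet-4-6",  # placeholder model string
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LARGE_REFERENCE_DOCUMENT,
            # Marks this block cacheable; subsequent calls that repeat it
            # read from cache at a fraction of the standard input price.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize section 3."}],
)
print(response.content[0].text)
```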
Message Batches API: For non-real-time workloads, the Batch API provides a 50% discount on all token pricing. It doesn’t reduce the token count, but halves the cost per token.
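Batch submission in the Python SDK looks roughly like this; the model choice and job contents are illustrative:

```python
import anthropic

client = anthropic.Anthropic()

# Queue 100 non-urgent classification jobs at the discounted batch rate.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"doc-{i}",
            "params": {
                "model": "claude-haiku-4-5-20251001",  # illustrative model choice
                "max_tokens": 256,
                "messages": [
                    {"role": "user", "content": f"Classify document {i} by topic."}
                ],
            },
        }
        for i in range(100)
    ]
)
print(batch.id)  # poll the batch by id and retrieve results once processing ends
```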
How Projects Expand Effective Context
Claude Projects on claude.ai use retrieval-augmented generation (RAG), which changes how context works in a meaningful way. Instead of loading all project knowledge into the active context window at once, Projects retrieve only the most relevant content for each message.
This means you can store substantially more information in a Project’s knowledge base than would fit in the raw context window — and Claude will pull the relevant pieces into the active context as needed. For research-heavy workflows, content libraries, or any use case where you’re working with a large knowledge base across many sessions, Projects are the practical way to work beyond the hard context window ceiling.
Anthropic also offers a RAG mode for expanded project knowledge capacity that pushes this further for users who need it.
Context Window and Model Choice
If context window size is a primary constraint for your use case, here’s how to think about model selection:
For claude.ai chat users, all paid plans give you 200K tokens regardless of which model you’re using. The model choice doesn’t affect the context window in the chat interface.
For Claude Code users on Pro, Max, or Team plans, Opus 4.6 and Sonnet 4.6 both offer the 1M context window — but you need extra usage enabled to access it (except on usage-based Enterprise plans).
For API developers, Opus 4.7 and Sonnet 4.6 both provide 1M token context windows at their standard per-token rates. Haiku 4.5 is capped at 200K. If your workload requires context beyond 200K tokens, Sonnet 4.6 at $3/$15 per million tokens is the cost-efficient choice — you get the same 1M context window as Opus at 40% lower cost.
Practical Tips to Maximize Your Context Window
Whether you’re on the 200K or 1M window, these practices extend how effectively you can use available context:
Start fresh conversations for new topics. Don’t carry long threads across unrelated tasks — the accumulated history consumes context without adding value for the new task.
Use Projects for recurring reference material. Documents, instructions, and background context that you reference repeatedly belong in a Project, not re-uploaded to each conversation.
Keep system prompts concise. In API applications, every extra token in a system prompt multiplies across every call (see the arithmetic sketch after this list). Trim aggressively.
Disable unused tools and connectors. Web search, MCP connectors, and other tools add system prompt tokens even when not actively used. Turn them off for sessions that don’t need them.
Enable code execution if you’re on a paid plan — it activates automatic context management and extends how long conversations can run without hitting the ceiling.
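The system-prompt arithmetic from the tip above, with assumed numbers (a 500-token system prompt, 10,000 calls a day, $3 per million input tokens):

```python
system_prompt_tokens = 500                 # assumed system prompt size
calls_per_day = 10_000                     # assumed traffic
price_per_input_token = 3.00 / 1_000_000   # assumed Sonnet-class input rate

daily_cost = system_prompt_tokens * calls_per_day * price_per_input_token
print(f"${daily_cost:.2f}/day spent re-sending the system prompt")  # $15.00/day
```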
Frequently Asked Questions
What is Claude’s context window size?
For paid claude.ai plans (Pro, Max, Team), the context window is 200,000 tokens — roughly 500 pages of text. Enterprise plans have a 500,000 token context window on specific models. Via the API and in Claude Code, Opus 4.7 and Sonnet 4.6 support a 1,000,000 token context window. Haiku 4.5 is 200,000 tokens across all surfaces.
How many words is 200K tokens?
Approximately 150,000 words. A token is roughly 0.75 words in English. 200,000 tokens is equivalent to a long novel, 500+ pages of standard text, or many hours of conversation history.
How many words is 1 million tokens?
Approximately 750,000 words — roughly five full-length novels, or the equivalent of a substantial codebase in a single session.
Does the context window reset between conversations?
Yes. Each new conversation starts with a fresh context window. Previous conversations do not carry over unless you’re using a Project, which maintains persistent knowledge across sessions, or unless Claude has memory features enabled that reference past conversations.
What happens when Claude hits the context window limit?
For paid plan users with code execution enabled, Claude automatically summarizes earlier messages and continues the conversation. Without code execution enabled, you may encounter a hard limit that requires starting a new conversation. In either case, the context window limit is separate from your usage limit — hitting one doesn’t affect the other.
Can I increase Claude’s context window?
The context window size is fixed by your plan and model. You can’t expand it directly, but you can use Projects (which use RAG to work with more information than fits in the raw context window), enable automatic context management via code execution, or use the API with models that have larger native context windows.
Does every message use the full context window?
No. Context usage grows as a conversation progresses. The first message in a conversation uses only the tokens from that message plus any system prompt. By message 50, the entire thread history is included as context on every subsequent call. This is why long conversations get progressively more token-intensive over time.
Is the context window the same as Claude’s memory?
Not exactly. The context window is technical working memory — what Claude can actively process in a session. Claude’s memory features (available on paid plans) are separate: they extract and store information from past conversations and make it available in future sessions, beyond what the context window can hold.
Most writing about CLAUDE.md gets one thing wrong in the first paragraph, and once you notice it, you can’t unsee it. People describe it as configuration. A “project constitution.” Rules Claude has to follow.
It isn’t any of those things, and Anthropic is explicit about it.
CLAUDE.md content is delivered as a user message after the system prompt, not as part of the system prompt itself. Claude reads it and tries to follow it, but there’s no guarantee of strict compliance, especially for vague or conflicting instructions. — Anthropic, Claude Code memory docs
That one sentence is the whole game. If you write a CLAUDE.md as if you’re programming a machine, you’ll get frustrated when the machine doesn’t comply. If you write it as context — the thing a thoughtful new teammate would want to read on day one — you’ll get something that works.
This is the playbook I wish someone had handed me the first time I set one up across a real codebase. It’s grounded in Anthropic’s current documentation (linked throughout), layered with patterns I’ve used across a network of production repos, and honest about where community practice has outrun official guidance.
If any of this ages out, the docs are the source of truth. Start there, come back here for the operator layer.
The memory stack in 2026 (what CLAUDE.md actually is, and isn’t)
Claude Code’s memory system has three parts. Most people know one of them, and the other two change how you use the first.
CLAUDE.md files are markdown files you write by hand. Claude reads them at the start of every session. They contain instructions you want Claude to carry across conversations — build commands, coding standards, architectural decisions, “always do X” rules. This is the part people know.
Auto memory is something Claude writes for itself. Introduced in Claude Code v2.1.59, it lets Claude save notes across sessions based on your corrections — build commands it discovered, debugging insights, preferences you kept restating. It lives at ~/.claude/projects/<project>/memory/ with a MEMORY.md entrypoint. You can audit it with /memory, edit it, or delete it. It’s on by default. (Anthropic docs.)
.claude/rules/ is a directory of smaller, topic-scoped markdown files — code-style.md, testing.md, security.md — that can optionally be scoped to specific file paths via YAML frontmatter. A rule with paths: ["src/api/**/*.ts"] only loads when Claude is working with files matching that pattern. (Anthropic docs.)
The reason this matters for how you write CLAUDE.md: once you understand what the other two are for, you stop stuffing CLAUDE.md with things that belong somewhere else. A 600-line CLAUDE.md isn’t a sign of thoroughness. It’s usually a sign the rules directory doesn’t exist yet and auto memory is disabled.
Anthropic’s own guidance is explicit: target under 200 lines per CLAUDE.md file. Longer files consume more context and reduce adherence.
Hold that number. We’ll come back to it.
Where CLAUDE.md lives (and why scope matters)
CLAUDE.md files can live in four different scopes, each with a different purpose. More specific scopes take precedence over broader ones. (Full precedence table in Anthropic docs.)
Managed policy CLAUDE.md lives at the OS level — /Library/Application Support/ClaudeCode/CLAUDE.md on macOS, /etc/claude-code/CLAUDE.md on Linux and WSL, C:\Program Files\ClaudeCode\CLAUDE.md on Windows. Organizations deploy it via MDM, Group Policy, or Ansible. It applies to every user on every machine it’s pushed to, and individual settings cannot exclude it. Use it for company-wide coding standards, security posture, and compliance reminders.
Project CLAUDE.md lives at ./CLAUDE.md or ./.claude/CLAUDE.md. It’s checked into source control and shared with the team. This is the one you’re writing when someone says “set up CLAUDE.md for this repo.”
User CLAUDE.md lives at ~/.claude/CLAUDE.md. It’s your personal preferences across every project on your machine — favorite tooling shortcuts, how you like code styled, patterns you want applied everywhere.
Local CLAUDE.md lives at ./CLAUDE.local.md in the project root. It’s personal-to-this-project and gitignored. Your sandbox URLs, preferred test data, notes Claude should know that your teammates shouldn’t see.
Claude walks up the directory tree from wherever you launched it, concatenating every CLAUDE.md and CLAUDE.local.md it finds. Subdirectories load on demand — they don’t hit context at launch, but get pulled in when Claude reads files in those subdirectories. (Anthropic docs.)
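As a sketch, here is how that loading plays out in a hypothetical repo (paths illustrative):

```
repo/                          # Claude Code launched from here
├── CLAUDE.md                  # loaded at launch
├── CLAUDE.local.md            # loaded at launch (gitignored)
└── packages/
    └── web/
        └── CLAUDE.md          # loaded on demand, when Claude reads files in web/
```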
A practical consequence most teams miss: in a monorepo, your parent CLAUDE.md gets loaded when a teammate runs Claude Code from inside a nested package. If that parent file contains instructions that don’t apply to their work, Claude will still try to follow them. That’s what the claudeMdExcludes setting is for — it lets individuals skip CLAUDE.md files by glob pattern at the local settings layer.
If you’re running Claude Code across more than one repo, decide now whether your standards belong in project CLAUDE.md (team-shared) or user CLAUDE.md (just you). Writing the same thing in both is how you get drift.
The 200-line discipline
This is the rule I see broken most often, and it’s the rule Anthropic is most explicit about. From the docs: “target under 200 lines per CLAUDE.md file. Longer files consume more context and reduce adherence.”
Two things are happening in that sentence. One, CLAUDE.md eats tokens — every session, every time, whether Claude needed those tokens or not. Two, longer files don’t actually produce better compliance. The opposite. When instructions are dense and undifferentiated, Claude can’t tell which ones matter.
The 200-line ceiling isn’t a hard cap. You can write a 400-line CLAUDE.md and Claude will load the whole thing. It just won’t follow it as well as a 180-line file would.
Three moves to stay under:
1. Use @imports to pull in specific files when they’re relevant. CLAUDE.md supports @path/to/file syntax (relative or absolute). Imported files expand inline at session launch, up to five hops deep. This is how you reference your README, your package.json, or a standalone workflow guide without pasting them into CLAUDE.md.
See @README.md for architecture and @package.json for available scripts.
# Git Workflow
- @docs/git-workflow.md
2. Move path-scoped rules into .claude/rules/. Anything that only matters when working with a specific part of the codebase — API patterns, testing conventions, frontend style — belongs in .claude/rules/api.md or .claude/rules/testing.md with a paths: frontmatter. They only load into context when Claude touches matching files.
---
paths:
- "src/api/**/*.ts"
---
# API Development Rules
- All API endpoints must include input validation
- Use the standard error response format
- Include OpenAPI documentation comments
3. Move task-specific procedures into skills. If an instruction is really a multi-step workflow — “when you’re asked to ship a release, do these eight things” — it belongs in a skill, which only loads when invoked. CLAUDE.md is for the facts Claude should always hold in context; skills are for procedures Claude should run when the moment calls for them.
If you follow these three moves, a CLAUDE.md rarely needs to exceed 150 lines. At that size, Claude actually reads it.
What belongs in CLAUDE.md (the signal test)
Anthropic’s own framing for when to add something is excellent, and it’s worth quoting directly because it captures the whole philosophy in four lines:
Add to it when:
Claude makes the same mistake a second time
A code review catches something Claude should have known about this codebase
You type the same correction or clarification into chat that you typed last session
A new teammate would need the same context to be productive — Anthropic docs
The operator version of the same principle: CLAUDE.md is the place you write down what you’d otherwise re-explain. It’s not the place you write down everything you know. If you find yourself writing “the frontend is built in React and uses Tailwind,” ask whether Claude would figure that out by reading package.json (it would). If you find yourself writing “when a user asks for a new endpoint, always add input validation and write a test,” that’s the kind of thing Claude won’t figure out on its own — it’s a team convention, not an inference from the code.
The categories I’ve found actually earn their place in a project CLAUDE.md:
Build and test commands. The exact string to run the dev server, the test suite, the linter, the type checker. Every one of these saves Claude a round of “let me look for a package.json script.”
Architectural non-obvious. The thing a new teammate would need someone to explain. “This repo uses event sourcing — don’t write direct database mutations, emit events instead.” “We have two API surfaces, /public/* and /internal/*, and they have different auth requirements.”
Naming conventions and file layout. “API handlers live in src/api/handlers/.” “Test files go next to the code they test, named *.test.ts.” Specific enough to verify.
Coding standards that matter. Not “write good code” — “use 2-space indentation,” “prefer const over let,” “always export types separately from values.”
Recurring corrections. The single most valuable category. Every time you find yourself re-correcting Claude about the same thing, that correction belongs in CLAUDE.md.
What usually doesn’t belong:
Long lists of library choices (Claude can read package.json)
Full architecture diagrams (link to them instead)
Step-by-step procedures (skills)
Path-specific rules that only matter in one part of the repo (.claude/rules/ with a paths: field)
Anything that would be true of any project (that goes in user CLAUDE.md)
Writing instructions Claude will actually follow
Anthropic’s own guidance on effective instructions comes down to three principles, and every one of them is worth taking seriously:
Specificity. “Use 2-space indentation” works better than “format code nicely.” “Run npm test before committing” works better than “test your changes.” “API handlers live in src/api/handlers/” works better than “keep files organized.” If the instruction can’t be verified, it can’t be followed reliably.
Consistency. If two rules contradict each other, Claude may pick one arbitrarily. This is especially common in projects that have accumulated CLAUDE.md files across multiple contributors over time — one file says to prefer async/await, another says to use .then() for performance reasons, and nobody remembers which was right. Do a periodic sweep.
Structure. Use markdown headers and bullets. Group related instructions. Dense paragraphs are harder to scan, and Claude scans the same way you do. A CLAUDE.md with clear section headers — ## Build Commands, ## Coding Style, ## Testing — outperforms the same content run together as prose.
One pattern I’ve found useful that isn’t in the docs: write CLAUDE.md in the voice of a teammate briefing another teammate. Not “use 2-space indentation” but “we use 2-space indentation.” Not “always include input validation” but “every endpoint needs input validation — we had a security incident last year and this is how we prevent the next one.” The “why” is optional but it improves adherence because Claude treats the rule as something with a reason behind it, not an arbitrary preference.
Community patterns worth knowing (flagged as community, not official)
The following are patterns I’ve seen in operator circles and at industry events like AI Engineer Europe 2026, where practitioners share how they’re running Claude Code in production. None of these are in Anthropic’s documentation as official guidance. I’ve included them because they’re useful; I’m flagging them because they’re community-origin, not doctrine. Your mileage may vary, and Anthropic’s official behavior could change in ways that affect these patterns.
The “project constitution” framing. Community shorthand for treating CLAUDE.md as the living document of architectural decisions — the thing new contributors read to understand how the project thinks. The framing is useful even though Anthropic doesn’t use the word. It captures the right posture: CLAUDE.md is the place for the decisions you want to outlast any individual conversation.
Prompt-injecting your own codebase via custom linter errors. Reported at AI Engineer Europe 2026: some teams embed agent-facing prompts directly into their linter error messages, so when an automated tool catches a mistake, the error text itself tells the agent how to fix it. Example: instead of a test failing with “type mismatch,” the error reads “You shouldn’t have an unknown type here because we parse at the edge — use the parsed type from src/schemas/.” This is not documented Anthropic practice; it’s a community pattern that works because Claude Code reads tool output and tool output flows into context. Use with judgment.
File-size lint rules as context-efficiency guards. Some teams enforce file-size limits (commonly cited: 350 lines max) via their linters, with the explicit goal of keeping files small enough that Claude can hold meaningful ones in context without waste. Again, community practice. The number isn’t magic; the discipline is.
Token Leverage as a team metric. The idea that teams should track token spend ÷ human labor spend as a ratio and try to scale it. This is business-strategy content, not engineering guidance, and it’s emerging community discourse rather than settled practice. Take it as a thought experiment, not a KPI to implement by Monday.
I’d rather flag these honestly than pretend they’re settled. If something here graduates from community practice to official recommendation, I’ll update.
Enterprise: managed-policy CLAUDE.md (and when to use settings instead)
For organizations deploying Claude Code across teams, there’s a managed-policy CLAUDE.md that applies to every user on a machine and cannot be excluded by individual settings. It lives at /Library/Application Support/ClaudeCode/CLAUDE.md (macOS), /etc/claude-code/CLAUDE.md (Linux and WSL), or C:\Program Files\ClaudeCode\CLAUDE.md (Windows), and is deployed via MDM, Group Policy, Ansible, or similar.
The distinction that matters most for enterprise: managed CLAUDE.md is guidance, managed settings are enforcement. Anthropic is clear about this. From the docs:
Settings rules are enforced by the client regardless of what Claude decides to do. CLAUDE.md instructions shape Claude’s behavior but are not a hard enforcement layer. — Anthropic docs
If you need to guarantee that Claude Code can’t read .env files or write to /etc, that’s a managed settings concern (permissions.deny). If you want Claude to be reminded of your company’s code review standards, that’s managed CLAUDE.md. If you confuse the two and put your security policy in CLAUDE.md, you have a strongly-worded suggestion where you needed a hard wall.
One practical note: managed CLAUDE.md ships to developer machines once, so it has to be right. Review it, version it, and treat changes to it the way you’d treat changes to a managed IDE configuration — because that’s what it is.
The living document problem: auto memory, CLAUDE.md, and drift
The thing that changed most in 2026 is that Claude now writes memory for itself when auto memory is enabled (on by default since Claude Code v2.1.59). It saves build commands it discovered, debugging insights, preferences you expressed repeatedly — and loads the first 200 lines (or 25KB) of its MEMORY.md at every session start. (Anthropic docs.)
This changes how you think about CLAUDE.md in two ways.
First, you don’t need to write CLAUDE.md entries for everything Claude could figure out on its own. If you tell Claude once that the build command is pnpm run build --filter=web, auto memory might save that, and you won’t need to codify it in CLAUDE.md. The role of CLAUDE.md becomes more specifically about what the team has decided, rather than what the tool needs to know to function.
Second, there’s a new audit surface. Run /memory in a session and you can see every CLAUDE.md, CLAUDE.local.md, and rules file being loaded, plus a link to open the auto memory folder. The auto memory files are plain markdown. You can read, edit, or delete them.
A practical auto-memory hygiene pattern I’ve landed on:
Once a month, open /memory and skim the auto memory folder. Anything stale or wrong gets deleted.
Quarterly, review the CLAUDE.md itself. Has anything changed in how the team works? Are there rules that used to matter but don’t anymore? Conflicting instructions accumulate faster than you think.
Whenever a rule keeps getting restated in conversation, move it from conversation to CLAUDE.md. That’s the signal Anthropic’s own docs describe, and it’s the right one.
CLAUDE.md files are living documents or they’re lies. A CLAUDE.md from six months ago that references libraries you’ve since replaced will actively hurt you — Claude will try to follow instructions that no longer apply.
A representative CLAUDE.md template
What follows is a synthetic example, clearly not any specific project. It demonstrates the shape, scope, and discipline of a good project CLAUDE.md. Adapt it to your codebase. Keep it under 200 lines.
# Project: [Name]
## Overview
Brief one-paragraph description of what this project is and who uses it.
Link to deeper architecture docs rather than duplicating them here.
See @README.md for full architecture.
## Build and Test Commands
- Install: `pnpm install`
- Dev server: `pnpm run dev`
- Build: `pnpm run build`
- Test: `pnpm test`
- Type check: `pnpm run typecheck`
- Lint: `pnpm run lint`
Run `pnpm run typecheck` and `pnpm test` before committing. Both must pass.
## Tech Stack
(Only list the non-obvious choices. Claude can read package.json.)
- We use tRPC, not REST, for internal APIs.
- Styling is Tailwind with a custom token file at `src/styles/tokens.ts`.
- Database migrations via Drizzle, not Prisma (migrated in Q1 2026).
## Directory Layout
- `src/api/` — tRPC routers, grouped by domain
- `src/components/` — React components, one directory per component
- `src/lib/` — shared utilities, no React imports allowed here
- `src/server/` — server-only code, never imported from client
- `tests/` — integration tests (unit tests live next to source)
## Coding Conventions
- TypeScript strict mode. No `any` without a comment explaining why.
- Functional components only. No class components.
- Imports ordered: external, internal absolute, relative.
- 2-space indentation. Prettier config in `.prettierrc`.
## Conventions That Aren't Obvious
- Every API endpoint validates input with Zod. No exceptions.
- Database queries go through the repository layer in `src/server/repos/`.
Never import Drizzle directly from route handlers.
- Errors surfaced to the UI use the `AppError` class from `src/lib/errors.ts`.
This preserves error codes for the frontend to branch on.
## Common Corrections
- Don't add new top-level dependencies without discussing first.
- Don't create new files in `src/lib/` without checking if a similar
utility already exists.
- Don't write tests that hit the real database. Use the test fixtures
in `tests/fixtures/`.
## Further Reading
- API design rules: @.claude/rules/api.md
- Testing conventions: @.claude/rules/testing.md
- Security: @.claude/rules/security.md
That’s roughly 70 lines. Notice what it doesn’t include: no multi-step procedures, no duplicated information from package.json, no universal-best-practice lectures. Every line is either a command you’d otherwise re-type, a convention a new teammate would need briefed, or a pointer to a more specific document.
When CLAUDE.md still isn’t being followed
This happens to everyone eventually. Three debugging steps, in order:
1. Run /memory and confirm your file is actually loaded. If CLAUDE.md isn’t in the list, Claude isn’t reading it. Check the path — project CLAUDE.md can live at ./CLAUDE.md or ./.claude/CLAUDE.md, not both, not a subdirectory (unless Claude happens to be reading files in that subdirectory).
2. Make the instruction more specific. “Write clean code” is not an instruction Claude can verify. “Use 2-space indentation” is. “Handle errors properly” is not an instruction. “All errors surfaced to the UI must use the AppError class from src/lib/errors.ts” is.
3. Look for conflicting instructions. A project CLAUDE.md saying “prefer async/await” and a .claude/rules/performance.md saying “use raw promises for hot paths” will cause Claude to pick one arbitrarily. In monorepos this is especially common — an ancestor CLAUDE.md from a different team can contradict yours. Use claudeMdExcludes to skip irrelevant ancestors.
If you need guarantees rather than guidance — “Claude cannot, under any circumstances, delete this directory” — that’s a settings-level permissions concern, not a CLAUDE.md concern. Write the rule in settings.json under permissions.deny and the client enforces it regardless of what Claude decides.
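A minimal settings.json sketch, assuming the documented Tool(specifier) rule format; verify the exact schema against the current docs:

```json
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)",
      "Bash(rm -rf *)"
    ]
  }
}
```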
FAQ
What is CLAUDE.md? A markdown file Claude Code reads at the start of every session to get persistent instructions for a project. It lives in a project’s source tree (usually at ./CLAUDE.md or ./.claude/CLAUDE.md), gets loaded into the context window as a user message after the system prompt, and contains coding standards, build commands, architectural decisions, and other team-level context. Anthropic is explicit that it’s guidance, not enforcement. (Source.)
How long should a CLAUDE.md be? Under 200 lines. Anthropic’s own guidance is that longer files consume more context and reduce adherence. If you’re over that, split with @imports or move topic-specific rules into .claude/rules/.
Where should CLAUDE.md live? Project-level: ./CLAUDE.md or ./.claude/CLAUDE.md, checked into source control. Personal-global: ~/.claude/CLAUDE.md. Personal-project (gitignored): ./CLAUDE.local.md. Organization-wide (enterprise): /Library/Application Support/ClaudeCode/CLAUDE.md (macOS), /etc/claude-code/CLAUDE.md (Linux/WSL), or C:\Program Files\ClaudeCode\CLAUDE.md (Windows).
What’s the difference between CLAUDE.md and auto memory? CLAUDE.md is instructions you write for Claude. Auto memory is notes Claude writes for itself across sessions, stored at ~/.claude/projects/<project>/memory/. Both load at session start. CLAUDE.md is for team standards; auto memory is for build commands and preferences Claude picks up from your corrections. Auto memory requires Claude Code v2.1.59 or later.
Can Claude ignore my CLAUDE.md? Yes. CLAUDE.md is loaded as a user message and Claude “reads it and tries to follow it, but there’s no guarantee of strict compliance.” For hard enforcement (blocking file access, sandbox isolation, etc.) use settings, not CLAUDE.md.
Does AGENTS.md work for Claude Code? Claude Code reads CLAUDE.md, not AGENTS.md. If your repo already uses AGENTS.md for other coding agents, create a CLAUDE.md that imports it with @AGENTS.md at the top, then append Claude-specific instructions below.
What’s .claude/rules/ and when should I use it? A directory of smaller, topic-scoped markdown files that can optionally be scoped to specific file paths via YAML frontmatter. Use it when your CLAUDE.md is getting long or when instructions only matter in part of the codebase. Rules without a paths: field load at session start with the same priority as .claude/CLAUDE.md; rules with a paths: field only load when Claude works with matching files.
How do I generate a starter CLAUDE.md? Run /init inside Claude Code. It analyzes your codebase and produces a starting file with build commands, test instructions, and conventions it discovers. Refine from there with instructions Claude wouldn’t discover on its own.
A closing note
The biggest mistake I see people make with CLAUDE.md isn’t writing it wrong — it’s writing it once and forgetting it exists. Six months later it references libraries they’ve since replaced, conventions that have since shifted, and a team structure that has since reorganized. Claude dutifully tries to follow instructions that no longer apply, and the team wonders why the tool seems to have gotten worse.
CLAUDE.md is a living document or it’s a liability. Treat it the way you’d treat a critical piece of onboarding documentation, because functionally that’s exactly what it is — onboarding for the teammate who shows up every session and starts from zero.
Write it for that teammate. Keep it short. Update it when reality shifts. And remember the part nobody likes to admit: it’s guidance, not enforcement. For anything that has to be guaranteed, reach for settings instead.
Community patterns referenced in this piece were reported at AI Engineer Europe 2026 and captured in a session recap. They represent emerging practice, not Anthropic doctrine.
What Is a GCP Content Pipeline?
A GCP Content Pipeline is a Google Cloud-hosted infrastructure stack that connects Claude AI to your WordPress sites — bypassing rate limits, WAF blocks, and IP restrictions — and automates content publishing, image generation, and knowledge storage at scale. It’s the back-end that lets a one-person operation run like a 10-person content team.
Most content agencies are running Claude in a browser tab and copy-pasting into WordPress. That works until you’re managing 5 sites, 20 posts a week, and a client who needs 200 articles in 30 days.
We run 122+ Cloud Run services across a single GCP project. WordPress REST API calls route through a proxy that handles authentication, IP allowlisting, and retry logic automatically. Imagen 4 generates featured images with IPTC metadata injected before upload. A BigQuery knowledge ledger stores 925 embedded content chunks for persistent AI memory across sessions.
We’ve now productized this infrastructure so you can skip the 18 months it took us to build it.
Who This Is For
Content agencies, SEO publishers, and AI-native operators running multiple WordPress sites who need content velocity that exceeds what a human-in-the-loop browser session can deliver. If you’re publishing fewer than 20 posts a week across fewer than 3 sites, you probably don’t need this yet. If you’re above that threshold and still doing it manually — you’re leaving serious capacity on the table.
What We Build
WP Proxy (Cloud Run) — Single authenticated gateway to all your WordPress sites. Handles Basic auth, app passwords, WAF bypass, and retry logic. One endpoint to rule all sites.
Claude AI Publisher — Cloud Run service that accepts article briefs, calls Claude API, optimizes for SEO/AEO/GEO, and publishes directly to WordPress REST API. Fully automated brief-to-publish.
Imagen 4 Proxy — GCP Vertex AI image generation endpoint. Accepts prompts, returns WebP images with IPTC/XMP metadata injected, uploads to WordPress media library. Four-tier quality routing: Fast → Standard → Ultra → Flagship.
BigQuery Knowledge Ledger — Persistent AI memory layer. Content chunks embedded via Vertex AI text-embedding-005, stored in BigQuery, queryable across sessions. Ends the “start from scratch” problem every time a new Claude session opens.
Batch API Router — Routes non-time-sensitive jobs (taxonomy, schema, meta cleanup) to Anthropic Batch API at 50% cost. Routes real-time jobs to standard API. Automatic tier selection.
What You Get vs. DIY vs. n8n/Zapier
| Capability | Tygart Media GCP Build | DIY from scratch | No-code automation (n8n/Zapier) |
| --- | --- | --- | --- |
| WordPress WAF bypass built in | ✅ | You figure it out | ❌ |
| Imagen 4 image generation | ✅ | ❌ | ❌ |
| BigQuery persistent AI memory | ✅ | ❌ | ❌ |
| Anthropic Batch API cost routing | ✅ | ❌ | ❌ |
| Claude model tier routing | ✅ | ❌ | ❌ |
| Proven at 20+ posts/day | ✅ | Unknown | ❌ |
What We Deliver
| Item | Included |
| --- | --- |
| WP Proxy Cloud Run service deployed to your GCP project | ✅ |
| Claude AI Publisher Cloud Run service | ✅ |
| Imagen 4 proxy with IPTC injection | ✅ |
| BigQuery knowledge ledger (schema + initial seed) | ✅ |
| Batch API routing logic | ✅ |
| Model tier routing configuration (Haiku/Sonnet/Opus) | ✅ |
| Site credential registry for all your WordPress sites | ✅ |
| Technical walkthrough + handoff documentation | ✅ |
| 30-day async support | ✅ |
Prerequisites
You need: a Google Cloud account (we can help set one up), at least one WordPress site with REST API enabled, and an Anthropic API key. Vertex AI access (for Imagen 4) requires a brief GCP onboarding — we walk you through it.
Ready to Stop Copy-Pasting Into WordPress?
Tell us how many sites you’re managing, your current publishing volume, and where the friction is. We’ll tell you exactly which services to build first.
Email only. No sales call required. No commitment to reply.
Frequently Asked Questions
Do I need to know how to use Google Cloud?
No. We build and deploy everything. You’ll need a GCP account and billing enabled — we handle the rest and document every service so you can maintain it independently.
How is this different from using Claude directly in a browser?
Browser sessions have no memory, no automation, no direct WordPress integration, and no cost optimization. This infrastructure runs asynchronously, publishes directly to WordPress via REST API, stores content history in BigQuery, and routes jobs to the cheapest model tier that can handle the task.
Which WordPress hosting providers does the proxy support?
We’ve tested and configured routing for WP Engine, Flywheel, SiteGround, Cloudflare-protected sites, Apache/ModSecurity servers, and GCP Compute Engine. Most hosting environments work out of the box — a handful need custom WAF bypass headers, which we configure per-site.
What does the BigQuery knowledge ledger actually do?
It stores content chunks (articles, SOPs, client notes, research) as vector embeddings. When you start a new AI session, you query the ledger instead of re-pasting context. Your AI assistant starts with history, not a blank slate.
What’s the ongoing GCP cost?
Highly variable by volume. For a 10-site agency publishing 50 posts/week with image generation, expect $50–$200/month in GCP costs. Cloud Run scales to zero when idle, so you’re not paying for downtime.
Can this be expanded after initial setup?
Yes — the architecture is modular. Each Cloud Run service is independent. We can add newsroom services, variant engines, social publishing pipelines, or site-specific publishers on top of the core stack.
The Anthropic Console is the web-based dashboard where developers manage their Claude API access — creating API keys, monitoring usage, setting spending limits, and testing models. If you’re building with the Claude API, the Console is your operational home base.
Access: console.anthropic.com — sign in with your Anthropic account. API access requires adding a payment method and generating an API key.
What the Anthropic Console Does
| Section | What you do here |
| --- | --- |
| API Keys | Create, name, and revoke API keys. Each key can have spending limits and restricted permissions. |
| Workbench | Test prompts and model configurations interactively before building. Adjust temperature, system prompts, and model selection in real time. |
| Usage & Billing | Monitor token consumption by model, set spending limits, view billing history, and add credits. |
| Rate Limits | See your current tier and the limits that apply — requests per minute, tokens per minute, tokens per day. |
| Models | Browse available models and their API strings. Use as reference before specifying models in code. |
| Prompt Library | Save and reuse prompts and system prompt configurations across projects. |
Creating an API Key
1. Add a payment method under Billing — the API is pay-as-you-go, no subscription required.
2. Navigate to API Keys and click Create Key.
3. Name the key (e.g., “development” or “production”) and optionally set a spending limit.
4. Copy the key immediately — it won’t be shown again after you close the dialog.
5. Store it securely: environment variable, secrets manager, or your CI/CD vault. Never hardcode it.
# Store your key as an environment variable
export ANTHROPIC_API_KEY="sk-ant-..."
# Then access it in Python
import anthropic
client = anthropic.Anthropic() # reads ANTHROPIC_API_KEY automatically
The Workbench: Test Before You Build
The Workbench is the Console’s interactive testing environment. Before writing API code, use it to develop and test your prompts — adjust the system prompt, try different models, tune parameters, and see exactly how Claude responds. When you have the behavior you want, export the configuration as code with one click.
This is the fastest way to iterate on prompt design without writing a test harness every time. It’s also where you can verify current model behavior before updating a production system.
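The exported code is a standard Messages API call. Roughly the shape you get back; every value here is a placeholder, not a real export:

```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",   # the model selected in the Workbench (placeholder)
    max_tokens=1024,
    temperature=0.7,             # sampling settings tuned interactively
    system="You are a concise technical assistant.",  # system prompt from the session
    messages=[{"role": "user", "content": "Explain context windows in one paragraph."}],
)
print(message.content[0].text)
```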
Understanding Rate Limits in the Console
The Console shows your current rate limit tier and the specific limits that apply. Anthropic uses a tiered system — as your spending grows, your limits increase automatically:
Tier 1 — New accounts, basic limits, minimum spend
Tier 2-4 — Limits scale up as cumulative API spend increases
Enterprise — Custom limits negotiated with Anthropic sales
If you’re hitting rate limits in production, the Console shows exactly which limit you’re hitting (requests per minute vs tokens per minute vs daily tokens) so you know whether to optimize your code or request a tier increase. For full context on limits, see Claude Rate Limits: What They Are and How to Work Around Them.
Spending Limits and Cost Control
The Console lets you set spending limits per API key — useful for development keys where you want a hard cap, or for giving team members API access with bounded risk. Usage dashboards show consumption by model and time period, which is essential for understanding which Claude model is driving cost in a production system.
Anthropic Console vs. Claude.ai
The Anthropic Console (console.anthropic.com) is for developers building with the API. Claude.ai is the consumer product for end users having conversations with Claude. They use the same underlying models but serve different purposes — the Console is where you manage programmatic access, the claude.ai interface is where you use Claude directly.
Frequently Asked Questions
What is the Anthropic Console?
The Anthropic Console (console.anthropic.com) is the developer dashboard for managing Claude API access — creating API keys, monitoring usage and billing, testing prompts in the Workbench, and managing rate limits. It’s separate from claude.ai, which is the end-user product.
How do I get an Anthropic API key?
Go to console.anthropic.com, sign in, add a payment method under Billing, then go to API Keys and click Create Key. Copy the key immediately after creation — it won’t be shown again. Store it as an environment variable, never in your code.
Is the Anthropic Console free?
Creating an account and accessing the Console is free. The API itself is pay-as-you-go — you only pay for tokens consumed. There’s no monthly subscription fee for API access; you add credits and they’re deducted as you use the API.
Claude Code is available two ways: as the Code tab inside Claude Desktop (with a graphical interface), or as a CLI tool you install and run from your terminal. Here’s how to get either one set up from scratch.
Requirement: Claude Code requires a Pro, Max, Team, or Enterprise subscription. It is not available on the free plan. The Claude Desktop app (which includes a graphical Claude Code interface) is free to download but the Code tab requires a paid subscription.
Option 1: Claude Desktop (Recommended for Most Users)
The easiest way to get Claude Code is through Claude Desktop — no terminal required.
1. Download Claude Desktop from claude.ai/download — available for macOS and Windows (x64 or ARM64). Linux is not supported.
2. Install — on Mac, open the downloaded installer and drag Claude to Applications; on Windows, run the installer.
3. Sign in with your Anthropic account (Pro, Max, Team, or Enterprise).
4. Click the Code tab in the top navigation.
5. Select Local to work with files on your machine, or Remote to run on Anthropic’s cloud infrastructure.
6. Click “Select folder” and choose your project directory. You’re ready.
On Windows, Git must be installed for local sessions to work. Most Macs include Git by default — check by running git --version in Terminal.
Option 2: Claude Code CLI (Terminal)
For developers who prefer working in the terminal, Claude Code is also available as a command-line tool.
# Install via npm
npm install -g @anthropic-ai/claude-code
# Authenticate
claude login
# Start in your project directory
cd your-project
claude
The CLI requires Node.js. After running claude login, you’ll authenticate with your Anthropic account in a browser window. The session starts automatically in the current directory.
Local vs. Remote Sessions
| Session type | What it does | Best for |
| --- | --- | --- |
| Local | Runs on your machine, accesses your files directly | Everyday development work |
| Remote | Runs on Anthropic’s cloud, continues if you close the app | Long-running tasks, autonomous work |
| SSH | Connects to a remote machine over SSH | Server or cloud VM development |
Common Setup Issues
Code tab not appearing in Desktop: Confirm your account is on a paid plan. Claude Code requires Pro, Max, Team, or Enterprise — it’s not available on the free tier.
Windows Git error: Claude Code needs Git for local sessions on Windows. Download Git from git-scm.com, install with default settings, then restart the desktop app.
CLI authentication failing: Run claude logout then claude login again. Make sure your Anthropic account has an active paid subscription.
Permission errors on first run: Claude Code will ask permission to access your files when you first select a folder. Click Allow — it needs read/write access to work with your project.
First Session: What to Expect
When you start your first Claude Code session, Anthropic recommends starting with a small, familiar project. Ask Claude to explain the codebase, fix a specific bug, or add a small feature. This gives you a calibrated sense of how it works before tackling larger tasks. Claude will read relevant files, propose changes, and ask for your approval before modifying anything.
Frequently Asked Questions
How do I install Claude Code?
Download Claude Desktop from claude.ai/download and use the Code tab — no terminal required. Or install the CLI with npm install -g @anthropic-ai/claude-code and run claude login to authenticate.
Is Claude Code free to install?
Claude Desktop (which includes Claude Code) is free to download. Using Claude Code requires a paid subscription — Pro ($20/month), Max ($100/month), Team, or Enterprise. It is not available on the free plan.
Does Claude Code work on Linux?
The Claude Desktop app does not support Linux. The Claude Code CLI does run on Linux — install via npm and use it from your terminal.
What’s the difference between Claude Code Desktop and the CLI?
Claude Code Desktop (the Code tab in the Claude Desktop app) gives you a graphical interface with visual file diffs, a built-in preview panel, and no terminal required. The CLI runs in your terminal and supports the same core operations. Both share configuration files and can run simultaneously on the same project.
Choosing between Claude’s three models comes down to one question: how hard is the task, and how much does cost matter? Haiku, Sonnet, and Opus each occupy a distinct position — this is the complete three-way breakdown so you can route work correctly from the start.
The routing rule in one sentence: Haiku for volume and speed, Sonnet for almost everything else, Opus for the tasks where Sonnet isn’t quite enough.
Haiku vs Sonnet vs Opus: Full Comparison
| Spec | Haiku | Sonnet | Opus |
| --- | --- | --- | --- |
| API string | claude-haiku-4-5-20251001 | claude-sonnet-4-6 | claude-opus-4-6 |
| Input price (per M tokens) | ~$1.00 | ~$3.00 | ~$5.00 |
| Output price (per M tokens) | ~$5.00 | ~$5.00 | ~$25.00 |
| Context window | 200K | 1M | 1M |
| Speed | ⚡ Fastest | ⚡ Fast | 🐢 Slower |
| Reasoning depth | Good | Excellent | Maximum |
| Writing quality | Good | Excellent | Maximum |
| Cost vs Sonnet | ~3× cheaper on input | — | ~5× more expensive on output |
Claude Haiku: The Volume Model
Haiku is optimized for tasks that are high in quantity but low in complexity — situations where you’re running the same operation hundreds or thousands of times and cost per call is a real constraint. Classification, extraction, summarization, metadata generation, routing logic, short-form responses, and real-time features where latency matters more than depth.
The output quality on constrained tasks is strong. Where Haiku shows its limits is on open-ended, nuanced work — multi-step reasoning, long-form writing where voice consistency matters, or problems with competing constraints. For those, Sonnet is the right call.
Claude Sonnet: The Default
Sonnet handles the vast majority of professional work at a quality level that’s indistinguishable from Opus for most tasks. Writing, analysis, research, coding, summarization, strategy — Sonnet does all of it well. It’s the model to start with and the one most people should use as their production default.
The gap between Sonnet and Opus shows on genuinely hard tasks: novel multi-step reasoning, edge cases in complex code, nuanced judgment in ambiguous situations, or extended agentic sessions where small quality differences compound. For everything else, Sonnet is the right choice and a fraction of the cost.
Claude Opus: The Specialist
Opus earns its premium on tasks where maximum capability is the only variable that matters and cost is secondary. Complex legal or technical analysis, research synthesis across conflicting sources, architectural decisions with long-term consequences, extended agentic sessions, and any task where you’ve tried Sonnet and felt the output was a notch below what the problem deserved.
The practical test: if Sonnet’s output on a task is good enough, use Sonnet. Only reach for Opus when you’ve genuinely hit Sonnet’s ceiling on a specific problem. Most professionals do this on a small fraction of their actual workload.
The Decision Framework
Use Haiku when: same operation at high volume, output is constrained/structured, cost and speed matter, real-time latency required.
Use Sonnet when: any standard professional task — writing, coding, analysis, research. This should be your default 90% of the time.
Use Opus when: the task is genuinely hard, involves novel reasoning, Sonnet’s output wasn’t quite right, or quality is the only variable that matters regardless of cost.
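To make the framework concrete, here is a minimal routing sketch in Python. The model strings come from the comparison table above; the task categories and the high_volume flag are illustrative assumptions, not an official taxonomy.

```python
# Minimal model-routing sketch. Model strings are from the table above;
# the task categories below are illustrative assumptions.
HAIKU = "claude-haiku-4-5-20251001"
SONNET = "claude-sonnet-4-6"
OPUS = "claude-opus-4-6"

CONSTRAINED = {"classification", "extraction", "routing", "metadata"}
HARD = {"novel_reasoning", "architecture_review", "legal_analysis"}

def pick_model(task_type: str, high_volume: bool = False) -> str:
    if task_type in CONSTRAINED and high_volume:
        return HAIKU   # constrained output at volume: cheapest model wins
    if task_type in HARD:
        return OPUS    # genuinely hard: pay for maximum capability
    return SONNET      # everything else: the default

print(pick_model("classification", high_volume=True))  # claude-haiku-4-5-20251001
```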
Frequently Asked Questions
What’s the difference between Claude Haiku, Sonnet, and Opus?
Haiku is fastest and cheapest — built for high-volume, constrained tasks. Sonnet is the balanced production default with excellent quality across most professional work. Opus is the most capable model for complex reasoning — about 5× more expensive than Sonnet on output tokens.
Which Claude model should I use?
Start with Sonnet for almost everything. Switch to Haiku when you’re running the same operation at high volume and cost matters. Switch to Opus when Sonnet’s output on a specific task isn’t quite at the level the problem requires.
Is Claude Haiku good enough for most tasks?
For structured, constrained tasks — yes, Haiku is strong. For open-ended writing, complex reasoning, or work requiring nuanced judgment, Sonnet is the right step up. The cost savings from Haiku are meaningful at scale, making it the right choice when the task fits its strengths.
Claude doesn’t use a traditional plugin marketplace — instead, it connects to external tools and services through MCP (Model Context Protocol), an open standard that lets any service build a Claude integration. Here’s a complete rundown of what Claude can connect to in 2026, how those connections work, and how to set them up.
How Claude integrations work: Claude uses MCP (Model Context Protocol) instead of plugins. Services publish an MCP server; Claude connects to it and gains access to that service’s capabilities. In Claude.ai, many integrations are available in Settings → Connections. In Claude Desktop and the API, you can connect to any MCP server.
Claude Integrations Available in Claude.ai (2026)
| Service | What Claude can do | Available in |
| --- | --- | --- |
| Google Drive | Search, read, and analyze documents | Claude.ai |
| Google Calendar | Read and create calendar events | Claude.ai |
| Gmail | Read, search, and draft emails | Claude.ai |
| Notion | Read and write pages, query databases | Claude.ai |
| Slack | Read channels, search messages, post | Claude.ai |
| GitHub | Read repos, create issues, review PRs | Claude Desktop / API |
| Zapier | Trigger automations across 6,000+ apps | Claude.ai |
| HubSpot | Read and update CRM records | Claude.ai |
| Cloudflare | Manage workers, DNS, and infrastructure | Claude Desktop / API |
| PostgreSQL / databases | Query, read schema, analyze data | Claude Desktop / API |
| File system | Read, write, organize local files | Claude Desktop |
| Web search | Search the web for current information | Claude.ai (built-in) |
| Jira / Linear | Read and create issues, update status | Claude.ai / API |
| Custom APIs | Any service with an MCP server | Claude Desktop / API |
How to Add Integrations in Claude.ai
Go to claude.ai → Settings → Connections
Browse the available integrations and click Connect on any you want to enable
Authenticate with the service (usually OAuth — you’ll be redirected to authorize)
Once connected, Claude can use that service in your conversations when relevant
Claude Desktop: More Integrations, More Control
The Claude Desktop app supports MCP server configuration via a JSON config file — giving you access to any MCP server, including self-hosted ones and community-built integrations that aren’t in the official Claude.ai connection list. This is where the integration ecosystem expands beyond the curated set: database connections, local file systems, internal tools, and any API where someone has built an MCP server.
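As a concrete illustration, a minimal entry in Claude Desktop’s MCP config file (claude_desktop_config.json) can look like the sketch below. It wires up the reference filesystem server published by the MCP project; the directory path is a placeholder, and the exact file location and schema can shift between releases, so verify against Anthropic’s current docs.

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/you/projects"
      ]
    }
  }
}
```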
Building Your Own Claude Integration
Any developer can build an MCP server and connect it to Claude. Anthropic publishes the MCP spec openly — you implement the server, and Claude can immediately use whatever tools or data you expose. This is how companies integrate Claude into proprietary internal systems without exposing data to a third party. For the technical implementation, see the Claude MCP guide.
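To give a flavor of what that looks like, here is a minimal server sketch using the official Python MCP SDK (the mcp package); the get_order_status tool is a hypothetical stand-in for a real internal lookup.

```python
# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The get_order_status tool is a hypothetical internal lookup, for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-tools")

@mcp.tool()
def get_order_status(order_id: str) -> str:
    """Look up the status of an order by its ID."""
    # A real server would query your internal system here.
    return f"Order {order_id}: shipped"

if __name__ == "__main__":
    mcp.run()  # serves over stdio, which Claude Desktop launches and connects to
```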
Frequently Asked Questions
Does Claude have plugins?
Claude doesn’t use a plugin marketplace like early ChatGPT did. Instead it uses MCP (Model Context Protocol) — an open standard where services publish integration servers that Claude connects to. In Claude.ai, these appear as “Connections” in Settings. Claude Desktop supports any MCP server via config file.
What apps can Claude connect to?
Claude can connect to Google Drive, Gmail, Google Calendar, Notion, Slack, Zapier, HubSpot, GitHub, Cloudflare, databases, local file systems, and any service that has published an MCP server. The ecosystem is growing rapidly — new MCP servers are added by third-party developers regularly.
How do I add integrations to Claude?
In Claude.ai, go to Settings → Connections and authenticate the services you want to connect. For Claude Desktop, integrations are configured via a JSON config file that specifies which MCP servers to load. Via the API, you pass MCP server URLs in your request parameters.
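For the API route, here is a hedged sketch of Anthropic’s MCP connector, which was a beta feature at the time of writing: the beta flag and the server URL below are illustrative and should be checked against the current API reference before use.

```python
import anthropic

client = anthropic.Anthropic()

# Sketch of the API-side MCP connector (a beta feature; verify the beta
# flag and request shape against Anthropic's current API reference).
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    betas=["mcp-client-2025-04-04"],       # beta flag: confirm current value
    mcp_servers=[
        {
            "type": "url",
            "url": "https://example.com/mcp",  # placeholder server URL
            "name": "example-tools",
        }
    ],
    messages=[{"role": "user", "content": "Use the example tools."}],
)
print(response.content[0].text)
```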
Claude Haiku is Anthropic’s fastest and most cost-efficient model — the right choice when you need high-volume AI at low cost without sacrificing the quality that makes Claude worth using. It’s not a cut-down version of the flagship models. It’s a purpose-built model for the tasks where speed and cost matter more than maximum reasoning depth.
When to use Haiku: Any time you’re running the same operation across many inputs — classification, extraction, summarization, metadata generation, routing logic, short-form responses — and cost or speed is a meaningful constraint. Haiku handles these at a fraction of Sonnet’s price with output quality that’s more than sufficient.
Claude Haiku Specs (April 2026)
| Spec | Value |
| --- | --- |
| API model string | claude-haiku-4-5-20251001 |
| Context window | 200,000 tokens |
| Input pricing | ~$1.00 per million tokens |
| Output pricing | ~$5.00 per million tokens |
| Speed vs Sonnet | Faster — optimized for low latency |
| Batch API discount | ~50% off (~$0.50 input / ~$2.50 output) |
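To make those prices concrete, here is a back-of-envelope cost estimate for a bulk classification job; the per-ticket token counts are assumptions for illustration.

```python
# Back-of-envelope Haiku cost estimate at the list prices above.
# Per-ticket token counts are assumptions, for illustration only.
INPUT_PRICE = 1.00 / 1_000_000    # ~$1.00 per million input tokens
OUTPUT_PRICE = 5.00 / 1_000_000   # ~$5.00 per million output tokens

tickets = 10_000
input_tokens_each = 500     # assumed: ticket text plus instructions
output_tokens_each = 20     # assumed: a short category label

cost = (tickets * input_tokens_each * INPUT_PRICE
        + tickets * output_tokens_each * OUTPUT_PRICE)
print(f"~${cost:.2f} to classify {tickets:,} tickets")  # prints ~$6.00
```

At the ~50% Batch API discount from the specs table, the same job would come in around $3.00.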
Claude Haiku vs Sonnet vs Opus
| Model | Input cost | Speed | Reasoning depth | Best for |
| --- | --- | --- | --- | --- |
| Haiku | ~$1.00/M | Fastest | Good | High-volume, latency-sensitive |
| Sonnet | ~$3.00/M | Fast | Excellent | Production workloads, daily driver |
| Opus | ~$5.00/M | Slower | Maximum | Complex reasoning, highest quality |
What Claude Haiku Is Best At
Haiku is optimized for tasks where the output is constrained and the logic is clear — not open-ended creative or strategic work where maximum capability pays off. The practical use cases where Haiku earns its position:
Classification and routing — is this a support ticket, a bug report, or a feature request? Tag it and route it. Haiku handles thousands of these per hour at minimal cost (see the sketch just after this list).
Extraction — pull the names, dates, dollar amounts, or addresses from a document. Structured output from unstructured text at scale.
Summarization — condense articles, emails, or documents to key points. Haiku’s summarization is strong enough for most production use cases.
SEO metadata — generate title tags, meta descriptions, alt text, and schema markup in bulk. This is where Haiku shines for content operations.
Short-form responses — FAQ answers, product descriptions, short explanations. Anything where the output is a few sentences or a structured short block.
Real-time features — chatbots, autocomplete, inline suggestions — anywhere latency affects user experience.
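Here is a minimal classification sketch against the Messages API; the categories and example ticket are hypothetical, and the model string comes from the specs table above.

```python
import anthropic

client = anthropic.Anthropic()

# Minimal classification sketch. The categories and example ticket are
# hypothetical; the model string comes from the specs table above.
def classify_ticket(text: str) -> str:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=10,
        system=("Classify the ticket as exactly one of: support, bug, "
                "feature. Reply with the label only."),
        messages=[{"role": "user", "content": text}],
    )
    return response.content[0].text.strip().lower()

print(classify_ticket("The export button crashes the app on Safari."))  # bug
```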
Claude Haiku vs GPT-4o Mini
GPT-4o mini is OpenAI’s comparable low-cost model, and it is less expensive than Haiku per token. The quality trade-off depends on the task: for instruction-following on complex structured outputs, Haiku tends to be more reliable; for simple, high-volume tasks where the output format is forgiving, the cost difference may favor GPT-4o mini. For teams already building on Claude for quality reasons, Haiku is the natural choice for high-volume work within that stack.
Frequently Asked Questions
What is Claude Haiku?
Claude Haiku is Anthropic’s fastest and most affordable model — approximately $1.00 per million input tokens. It’s purpose-built for high-volume, latency-sensitive tasks like classification, extraction, summarization, and short-form generation where cost efficiency matters more than maximum reasoning depth.
How much does Claude Haiku cost?
Claude Haiku costs approximately $1.00 per million input tokens and $5.00 per million output tokens. The Batch API reduces these to approximately $0.50 input and $2.50 output — roughly half price for non-time-sensitive workloads.
When should I use Claude Haiku instead of Sonnet?
Use Haiku when your task is well-defined with a constrained output, you’re running it at high volume, and cost or latency is a meaningful consideration. Use Sonnet when the task is complex, requires nuanced reasoning, or produces longer open-ended outputs where maximum quality matters.
What is the Claude Haiku API model string?
The current Claude Haiku model string is claude-haiku-4-5-20251001. Always verify the current string in Anthropic’s official model documentation before production deployment.
A system prompt is the instructions you give Claude before the conversation begins — the context, persona, rules, and constraints that shape every response in the session. It’s the most powerful lever you have for controlling Claude’s behavior at scale, and the foundation of any serious Claude integration. Here’s how system prompts work, how to write them well, and real examples across common use cases.
What a system prompt does: Sets Claude’s role, knowledge, tone, constraints, and output format before the user says anything. Claude treats system prompt instructions as authoritative — they persist throughout the conversation and take priority over conflicting user requests within the boundaries Anthropic allows.
System Prompt Structure: The Five Elements
A well-structured system prompt typically covers these elements — not all are required for every use case, but the strongest prompts address most of them:
# Role
You are [specific role/persona]. [1-2 sentences on expertise and perspective].
# Context
[What this system/application/conversation is for. Who the user is. What they’re trying to accomplish.]
# Instructions
[Specific behaviors: what to do, how to format responses, how to handle edge cases]
# Constraints
[What NOT to do. Topics to avoid. Format rules to enforce. Information not to share.]
# Output format
[How Claude should structure its responses: length, format, sections, tone]
System Prompt Examples by Use Case
Customer Support Agent
You are a customer support agent for Acme Software. You help users with account questions, billing issues, and technical troubleshooting for Acme’s project management platform.
Tone: professional, patient, solution-focused. Never dismissive.
For billing questions: provide information but escalate refund requests to billing@acme.com.
For technical issues: follow the troubleshooting guide below before escalating.
Never discuss: competitor products, internal pricing strategy, unreleased features.
Always end with: “Is there anything else I can help you with today?”
Code Assistant
You are a senior software engineer helping with Python and TypeScript code.
When writing code: use type hints in Python, strict TypeScript, and always include error handling. Prefer explicit over implicit. Comment non-obvious logic.
When reviewing code: flag issues by severity (critical/high/medium/low). Always explain why something is a problem, not just that it is.
Never write code without error handling. Never use eval(). Never hardcode credentials.
Content Writer
You write content for [Brand Name], a B2B SaaS company in the project management space.
Voice: direct, confident, no filler. Never use “leverage,” “synergy,” or “utilize.” Short sentences. Active voice.
Audience: project managers and engineering leads at companies with 50–500 employees.
Always: include a clear next step or CTA. Never: make claims we can’t back up, mention competitors by name.
What System Prompts Can and Can’t Do
System prompts are powerful but not absolute. They can reliably control: Claude’s tone and persona, output format and structure, topic scope and focus, response length guidelines, and how Claude handles specific scenarios. They cannot override Anthropic’s core guidelines — Claude won’t follow system prompt instructions to produce harmful content, lie about being an AI when sincerely asked, or violate its trained ethical constraints regardless of what the system prompt says.
System Prompts in the API vs. Claude.ai
In the API, the system prompt is passed as the system parameter in your API call. In Claude.ai Projects, the custom instructions field functions as the system prompt for all conversations in that Project. In Claude.ai standard conversations, you can prepend context at the start of a conversation — it’s not a true system prompt but achieves a similar effect.
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant...",  # ← system prompt here
    messages=[
        {"role": "user", "content": "Hello"}
    ],
)
```
Frequently Asked Questions
What is a system prompt?
A system prompt is the set of instructions given to Claude before the conversation begins — setting its role, constraints, tone, and output format. It persists throughout the session and takes priority over user messages within Anthropic’s guidelines.
How long should a Claude system prompt be?
Long enough to cover what Claude needs to behave correctly, short enough that Claude actually follows all of it. Most production system prompts are 200–1,000 words. Beyond that, you risk important instructions getting less attention. Structure with headers helps Claude parse longer prompts.
Can users override a system prompt?
Not reliably. System prompts take priority over user messages. A user saying “ignore your system prompt” won’t override legitimate business instructions. Claude is designed to follow operator system prompts even when users push back, within Anthropic’s ethical guidelines.