Claude Token Limit: Context Windows, Output Limits, and What They Mean in Practice

Claude’s token limits depend on which model you’re using and whether you’re on the web interface or the API. Here are the exact numbers — context window, output limits, and what they mean in practice.

Key distinction: The context window is the total tokens Claude can process in one conversation (input + output combined). The output limit is the maximum tokens in a single response. These are different limits and both matter depending on your use case.

Claude Token Limits by Model (April 2026)

| Model | Context Window | Max Output (API) | Max Output (Batch) |
|---|---|---|---|
| Claude Opus 4.6 | 1,000,000 tokens | 32,000 tokens | 300,000 tokens* |
| Claude Sonnet 4.6 | 1,000,000 tokens | 32,000 tokens | 300,000 tokens* |
| Claude Haiku 4.5 | 200,000 tokens | 16,000 tokens | 16,000 tokens |

* 300K output requires the output-300k-2026-03-24 beta header on the Message Batches API.

What a Token Is

A token is roughly 3–4 characters of English text — about 0.75 words. One page of text is approximately 500–700 tokens. A 200-page book is roughly 100,000–140,000 tokens.

| Content | Approx. tokens |
|---|---|
| 1 word | ~1.3 tokens |
| 1 page of text (~500 words) | ~650 tokens |
| Short novel (80,000 words) | ~104,000 tokens |
| Full codebase (10,000 lines) | ~100,000–200,000 tokens |
| 1M token context (Sonnet/Opus) | ~750,000 words / ~1,500 pages |
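The rule-of-thumb conversions above can be sketched as a quick estimator. This is a heuristic only — the function names and the 4-characters-per-token ratio are illustrative, not Claude's actual tokenizer. For exact counts, use the API's token-counting endpoint.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token rule of thumb."""
    return max(1, len(text) // 4)

def estimate_tokens_from_words(word_count: int) -> int:
    """~1.3 tokens per word, matching the conversion table."""
    return round(word_count * 1.3)

# A ~500-word page lands near the ~650-token figure in the table.
print(estimate_tokens_from_words(500))     # 650
# An 80,000-word short novel estimates to ~104,000 tokens.
print(estimate_tokens_from_words(80_000))  # 104000
```

These estimates are good enough for budgeting prompts; they are not good enough for billing math, since real tokenization varies with language and content.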

Context Window vs. Output Limit

The context window is the total working memory for a session — everything Claude can “see” at once, including the system prompt, all previous messages in the conversation, uploaded files, and Claude’s own prior responses. At 1M tokens, Opus 4.6 and Sonnet 4.6 can hold roughly 1,500 pages of text in context simultaneously.

The output limit is how long Claude’s individual response can be. The standard API limit is 32,000 tokens per response — about 24,000 words, enough for a substantial document. The Batch API with the beta header extends this to 300,000 tokens for document-generation workloads.
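To see how the two limits interact, here is a minimal budget check using the 1M/32K figures above. The constant names and the `fits_in_context` helper are illustrative sketches, not part of any SDK:

```python
# Illustrative limits from the table above (hypothetical constants, not SDK values).
CONTEXT_WINDOW = 1_000_000   # Opus 4.6 / Sonnet 4.6 context window
MAX_OUTPUT = 32_000          # standard API per-response output cap

def fits_in_context(input_tokens: int, max_tokens: int = MAX_OUTPUT) -> bool:
    """Input and requested output must fit in the context window together."""
    if max_tokens > MAX_OUTPUT:
        raise ValueError("max_tokens exceeds the per-response output limit")
    return input_tokens + max_tokens <= CONTEXT_WINDOW

print(fits_in_context(900_000))  # True: 932,000 total fits within 1M
print(fits_in_context(980_000))  # False: 1,012,000 exceeds the window
```

The point of the sketch: a request can fail the context-window check even when the input alone fits, because the requested output tokens count against the same window.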

Rate Limits: Separate From Token Limits

Token limits apply within a single conversation or request. Rate limits apply per time period: how many tokens (and requests) you can send across all conversations in a given minute or day. Rate limits scale with your API usage tier. If you're hitting errors in production that look like limits, check whether you're hitting the context window, the output limit, or a rate limit; they produce different error codes. For the full rate limit breakdown, see Claude Rate Limits: What They Are and How to Work Around Them.
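The distinction matters for retry logic: rate-limit (HTTP 429) errors are transient and typically retried with exponential backoff, while context-window and output-limit errors are not retryable, since the same oversized request will fail again. A minimal sketch of a backoff schedule (the `backoff_delay` helper is illustrative, not an SDK function):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter, for HTTP 429 (rate limit) errors.

    Do NOT apply this to context-window or output-limit errors (400-class);
    retrying won't make an oversized request fit.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Upper bounds of the retry schedule: 1s, 2s, 4s, 8s, 16s, ...
print([min(60.0, 1.0 * 2 ** n) for n in range(5)])  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

The jitter spreads retries out so many clients hitting the same rate limit don't all retry at the same instant.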

What Happens When You Hit the Context Limit

In claude.ai conversations, you’ll see a warning when the conversation is approaching the context window. Claude may summarize earlier parts of the conversation to stay within limits. In the API, sending more tokens than the context window allows returns an error. For very long sessions, breaking work into multiple conversations or using prompt caching (which stores static context at a discount) are the standard approaches.
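One common way to stay within the window in long API sessions is to trim older turns to a token budget before each request. A minimal sketch, where the `trim_history` helper and its signature are assumptions for illustration:

```python
def trim_history(messages, budget_tokens, count_tokens):
    """Keep the most recent messages that fit within a token budget.

    `messages` is oldest-first; `count_tokens` is any estimator
    (e.g. the ~4 chars/token heuristic). A summary of dropped turns could
    be prepended instead of discarding them, similar to how claude.ai
    summarizes earlier parts of a conversation.
    """
    kept, used = [], 0
    for msg in reversed(messages):      # walk newest-first
        cost = count_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = ["a" * 400, "b" * 400, "c" * 400]  # ~100 tokens each
print(len(trim_history(history, 250, lambda m: len(m) // 4)))  # 2: keeps the last two turns
```

Prompt caching complements this: static context (system prompt, reference documents) stays cached at a discount, and only the trimmed conversational tail changes between requests.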

Frequently Asked Questions

What is Claude’s token limit?

Claude Opus 4.6 and Sonnet 4.6 have a 1 million token context window. Claude Haiku 4.5 has a 200,000 token context window. The maximum output per response is 32,000 tokens on the standard API. These are different limits — context window is total working memory, output limit is maximum response length.

How long can Claude’s responses be?

The standard API output limit is 32,000 tokens per response, approximately 24,000 words. In practice, claude.ai conversations have shorter limits than the raw API. The Message Batches API with the beta header supports outputs of up to 300,000 tokens for Opus 4.6 and Sonnet 4.6.

How many tokens is a page of text?

Approximately 650 tokens per page (roughly 500 words). A 200-page document is around 130,000 tokens — well within Claude’s 1M context window for Sonnet and Opus, and within Haiku’s 200K window as well.

