Author: will_tygart

  • Claude Haiku: Pricing, API String, Use Cases, and When to Use It

    Claude AI · Fitted Claude

    Claude Haiku is Anthropic’s fastest and most cost-efficient model — the right choice when you need high-volume AI at low cost without sacrificing the quality that makes Claude worth using. It’s not a cut-down version of the flagship models. It’s a purpose-built model for the tasks where speed and cost matter more than maximum reasoning depth.

    When to use Haiku: Any time you’re running the same operation across many inputs — classification, extraction, summarization, metadata generation, routing logic, short-form responses — and cost or speed is a meaningful constraint. Haiku handles these at a fraction of Sonnet’s price with output quality that’s more than sufficient.

    Claude Haiku Specs (April 2026)

    Spec Value
    API model string claude-haiku-4-5-20251001
    Context window 200,000 tokens
    Input pricing ~$1.00 per million tokens
    Output pricing ~$5.00 per million tokens
    Speed vs Sonnet Faster — optimized for low latency
    Batch API discount ~50% off (~$0.50 input / ~$2.50 output)
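
    The rates above make cost estimation simple arithmetic. A minimal sketch (the function name and the rates hard-coded as constants are illustrative; always confirm current pricing on Anthropic’s pricing page):

```python
def haiku_cost_usd(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate Claude Haiku cost from the approximate rates in the table above."""
    input_rate, output_rate = 1.00, 5.00  # USD per million tokens
    if batch:
        # Batch API runs at roughly half price
        input_rate, output_rate = 0.50, 2.50
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: 10M input tokens + 2M output tokens
print(haiku_cost_usd(10_000_000, 2_000_000))              # 20.0
print(haiku_cost_usd(10_000_000, 2_000_000, batch=True))  # 10.0
```

    At these rates, a classification workload of a few million short inputs per month typically lands in the tens of dollars, which is what makes Haiku viable for high-volume pipelines.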

    Claude Haiku vs Sonnet vs Opus

    Model Input cost Speed Reasoning depth Best for
    Haiku ~$1.00/M Fastest Good High-volume, latency-sensitive
    Sonnet ~$3.00/M Fast Excellent Production workloads, daily driver
    Opus ~$5.00/M Slower Maximum Complex reasoning, highest quality

    What Claude Haiku Is Best At

    Haiku is optimized for tasks where the output is constrained and the logic is clear — not open-ended creative or strategic work where maximum capability pays off. The practical use cases where Haiku earns its position:

    • Classification and routing — is this a support ticket, a bug report, or a feature request? Tag it and route it. Haiku handles thousands of these per hour at minimal cost.
    • Extraction — pull the names, dates, dollar amounts, or addresses from a document. Structured output from unstructured text at scale.
    • Summarization — condense articles, emails, or documents to key points. Haiku’s summarization is strong enough for most production use cases.
    • SEO metadata — generate title tags, meta descriptions, alt text, and schema markup in bulk. This is where Haiku shines for content operations.
    • Short-form responses — FAQ answers, product descriptions, short explanations. Anything where the output is a few sentences or a structured short block.
    • Real-time features — chatbots, autocomplete, inline suggestions — anywhere latency affects user experience.

    Claude Haiku vs GPT-4o Mini

    GPT-4o mini is OpenAI’s comparable low-cost model, and it is cheaper than Haiku per token. The quality trade-off depends on the task. For instruction-following on complex structured outputs, Haiku tends to be more reliable. For simple, high-volume tasks where the output format is forgiving, the cost difference may favor GPT-4o mini. For teams already building on Claude for quality reasons, Haiku is the natural choice for high-volume work within that stack.

    Using Claude Haiku in the API

    import anthropic
    
    client = anthropic.Anthropic()
    
    message = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=256,
        messages=[
            {"role": "user", "content": "Classify this support ticket: ..."}
        ]
    )
    
    print(message.content[0].text)  # content is a list of blocks; take the first text block

    For a full model comparison, see Claude Models Explained: Haiku vs Sonnet vs Opus. For API pricing across all models, see Anthropic API Pricing.

    Frequently Asked Questions

    What is Claude Haiku?

    Claude Haiku is Anthropic’s fastest and most affordable model — approximately $1.00 per million input tokens. It’s purpose-built for high-volume, latency-sensitive tasks like classification, extraction, summarization, and short-form generation where cost efficiency matters more than maximum reasoning depth.

    How much does Claude Haiku cost?

    Claude Haiku costs approximately $1.00 per million input tokens and $5.00 per million output tokens. The Batch API reduces these to approximately $0.50 input and $2.50 output — roughly half price for non-time-sensitive workloads.

    When should I use Claude Haiku instead of Sonnet?

    Use Haiku when your task is well-defined with a constrained output, you’re running it at high volume, and cost or latency is a meaningful consideration. Use Sonnet when the task is complex, requires nuanced reasoning, or produces longer open-ended outputs where maximum quality matters.

    What is the Claude Haiku API model string?

    The current Claude Haiku model string is claude-haiku-4-5-20251001. Always verify the current string in Anthropic’s official model documentation before production deployment.

    Need this set up for your team?
    Talk to Will →

  • Anthropic vs OpenAI: What’s Different, What Matters, and Which to Use

    Tygart Media Strategy
    Volume Ⅰ · Issue 04 · Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    Anthropic and OpenAI are the two most consequential AI labs in the world right now, and they’re building from fundamentally different starting points. Both produce frontier AI models; their flagship consumer products are Claude and ChatGPT, respectively. But their philosophies, ownership structures, and approaches to AI development diverge in ways that matter for anyone paying attention to where AI is going.

    Short version: OpenAI is larger, older, and has more products. Anthropic is smaller, younger, and more focused on safety as a core design methodology. Both are capable of frontier AI — the difference shows in philosophy and approach more than in raw capability benchmarks.

    Anthropic vs. OpenAI: Side-by-Side

    Factor Anthropic OpenAI
    Founded 2021 2015
    Flagship model Claude GPT / ChatGPT
    Legal structure Public Benefit Corporation For-profit (converted from nonprofit)
    Key investors Google, Amazon Microsoft, various VC
    Safety methodology Constitutional AI RLHF + policy layers
    Consumer product Claude.ai ChatGPT
    Image generation Not offered (Claude doesn’t generate images) DALL-E built in
    Agentic coding tool Claude Code Codex / Operator
    Tool/integration standard MCP (open standard) Function calling / plugins
    Not sure which to use?

    We’ll help you pick the right stack — and set it up.

    Tygart Media evaluates your workflow and configures the right AI tools for your team. No guesswork, no wasted subscriptions.

    The Founding Story: Why Anthropic Split From OpenAI

    Anthropic was founded in 2021 by Dario Amodei, Daniela Amodei, and several colleagues who had been senior researchers at OpenAI. The departure was driven by disagreements about safety priorities and the pace of commercial development. The founders believed that as AI systems became more capable, the risk of harm grew in ways that required dedicated research and more cautious deployment — not just policy layers added after the fact.

    That founding philosophy is baked into how Anthropic builds Claude. Constitutional AI — Anthropic’s training methodology — teaches Claude to evaluate its own outputs against a set of principles rather than optimizing purely for human approval. The result is a model more likely to push back, express uncertainty, and decline harmful requests even under pressure.

    What Each Company Does Better

    Anthropic’s strengths: Safety methodology, writing quality, instruction-following precision, long-context coherence, and Claude Code for agentic development. The public benefit corporation structure gives leadership more control over deployment decisions than investor pressure would otherwise allow.

    OpenAI’s strengths: Broader product ecosystem, DALL-E image generation built into ChatGPT, more established enterprise relationships, larger user base, and more third-party integrations built on their API over a longer period. GPT-4o is competitive with Claude on most benchmarks.

    The Safety Philosophy Difference

    This is the substantive philosophical divide. Both companies have safety teams and publish research. But Anthropic was founded specifically on the thesis that safety research needs to be a primary design input — not a compliance function. Constitutional AI is an attempt to operationalize that at the training level.

    OpenAI’s approach has historically been more RLHF-forward (reinforcement learning from human feedback) with safety addressed through usage policies and model behavior guidelines. The debate between these approaches is genuinely unresolved in the AI research community — neither has proven definitively superior for long-term safety outcomes.

    For Users: Does the Philosophy Difference Matter?

    Day to day, most users experience the difference as: Claude is more likely to push back, more honest about uncertainty, and more consistent in following complex instructions. ChatGPT has more features in the consumer product — image generation, a wider integration ecosystem — and is more likely to give you what you asked for even if what you asked for is slightly wrong.

    For enterprises evaluating which API to build on: both are capable, both have enterprise tiers, and the choice often comes down to which performs better on your specific workload. For safety-sensitive applications or regulated industries, Anthropic’s explicit safety focus and public benefit structure are meaningful differentiators.

    For the Claude vs. ChatGPT product comparison, see Claude vs ChatGPT: The Honest 2026 Comparison.

    Frequently Asked Questions

    What is the difference between Anthropic and OpenAI?

    Both are frontier AI labs — Anthropic makes Claude, OpenAI makes ChatGPT/GPT. Anthropic was founded by former OpenAI researchers who prioritized safety as a core design methodology. It’s structured as a public benefit corporation. OpenAI is older, larger, and has a broader product ecosystem including image generation and a longer history of enterprise integrations.

    Is Anthropic better than OpenAI?

    Neither is definitively better — they’re different. Claude (Anthropic) tends to win on writing quality, instruction-following, and safety calibration. ChatGPT (OpenAI) wins on ecosystem breadth, image generation, and third-party integrations. The better choice depends on your specific use case.

    Why did Anthropic founders leave OpenAI?

    The Anthropic founders — including Dario and Daniela Amodei — left OpenAI over disagreements about safety priorities and the pace of commercial deployment. They believed AI safety needed to be a primary research focus built into model training, not an add-on. That conviction became Anthropic’s founding mission and Constitutional AI methodology.

  • Can Claude Read PDFs? Yes — Here’s Exactly How It Works

    Claude AI · Fitted Claude

    Yes — Claude can read PDFs. You can upload a PDF directly to Claude.ai and ask questions about it, summarize it, extract specific information, or have Claude analyze its contents. Here’s exactly how it works, what the limits are, and what Claude does particularly well with PDF documents.

    How to upload a PDF: In Claude.ai, click the paperclip icon in the message box, select your PDF, and it uploads instantly. Then ask your question. Claude reads the full document and responds based on its contents.

    What Claude Can Do With a PDF

    Task Works well? Notes
    Summarize the document ✅ Excellent Full document or by section
    Answer questions about content ✅ Excellent Finds specific facts, quotes, data points
    Compare multiple PDFs ✅ Strong Upload multiple files in one session
    Extract tables and data ✅ Strong Works best on text-based tables
    Analyze contracts and legal docs ✅ Strong Identifies clauses, flags issues, explains terms
    Read scanned / image PDFs ⚠️ Limited Requires text layer — pure image scans may not work
    Translate PDF content ✅ Strong Ask Claude to translate after uploading
    Fill in or edit the PDF file ❌ No Claude reads PDFs, doesn’t modify them

    PDF Size Limits

    Claude supports PDFs up to 32MB per file and up to 100 pages. Documents within that range load fully — Claude reads the entire content, not just the first few pages. For longer documents, you may need to split them or work section by section.

    The 200,000 token context window means text-heavy PDFs within those upload limits are handled well. A dense 100-page report, a full contract stack, or a lengthy financial filing typically fits within the context window without truncation. See the Claude Context Window guide for the full breakdown.

    Scanned PDFs: The Limitation to Know

    Claude reads PDFs by processing the text layer — the actual characters embedded in the file. Most modern PDFs created from Word, Google Docs, or similar tools have a full text layer and work perfectly. Scanned documents — where pages are photographs of physical paper — may have no text layer, just images of text. Claude’s ability to read these depends on whether the PDF includes OCR text alongside the image.

    If Claude returns a response suggesting it can’t read the content, the PDF is likely a pure image scan without a text layer. Running the PDF through OCR software first will usually resolve it.

    Best Prompts for PDF Analysis

    Summarization: “Summarize this document in 3 paragraphs. Focus on the key findings, recommendations, and any action items.”

    Contract review: “Review this contract and flag: (1) any clauses that are unusually favorable to the other party, (2) missing standard protections, (3) ambiguous language that should be clarified.”

    Data extraction: “Extract all financial figures from this report and organize them into a table: metric, value, and the time period it covers.”

    Multi-document comparison: “I’ve uploaded two versions of this agreement. Identify every difference between them.”

    PDF Reading via the API

    Developers can send PDFs to Claude via the API using base64-encoded file content. Claude processes the document and responds to your prompt based on its contents — the same way it works in the web interface. This enables automated document processing pipelines: contract analysis at scale, research synthesis, financial document review, and more. See the Claude API tutorial for implementation details.
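
    A minimal sketch of the request shape, assuming the base64 document content block that Anthropic’s PDF support uses (the helper name is illustrative; verify the current block format against Anthropic’s API documentation):

```python
import base64

def pdf_message(pdf_path: str, prompt: str) -> list:
    """Pair a base64-encoded PDF with a text prompt in one user message."""
    with open(pdf_path, "rb") as f:
        pdf_b64 = base64.standard_b64encode(f.read()).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_b64,
                },
            },
            {"type": "text", "text": prompt},
        ],
    }]
```

    The returned list plugs straight into `client.messages.create(model=..., max_tokens=..., messages=pdf_message("report.pdf", "Summarize the key findings."))`, so the same prompts that work in the web interface work in an automated pipeline.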

    Frequently Asked Questions

    Can Claude read PDFs?

    Yes. Upload a PDF directly in Claude.ai by clicking the attachment icon. Claude reads the full document content and can summarize, answer questions, extract data, compare documents, and analyze contracts. The limit is 32MB and 100 pages per file.

    Can Claude read scanned PDFs?

    Claude reads PDFs by processing the text layer. Scanned PDFs that are pure images without a text layer may not work — Claude needs text to process, not just an image of text. If your scan was run through OCR and has a text layer embedded, it will work. Otherwise, run OCR first.

    How many PDFs can I upload to Claude at once?

    You can upload multiple PDFs in a single conversation, as long as their combined text content fits within Claude’s context window: 200,000 tokens as standard, with a 1 million token window in beta for Sonnet and Opus. For most document types, that means many typical-length files can be analyzed together.

    Does Claude save or store uploaded PDFs?

    Claude processes PDFs within the conversation context. Anthropic’s standard data handling applies — on Free and Pro plans, conversations including uploaded files may be used for model improvement unless you opt out. For sensitive documents, review Claude’s privacy policy and consider Enterprise for stronger data handling.

    Need this set up for your team?
    Talk to Will →

  • Claude System Prompt Guide: How to Write Them, Examples, and Best Practices

    Claude AI · Fitted Claude

    A system prompt is the instructions you give Claude before the conversation begins — the context, persona, rules, and constraints that shape every response in the session. It’s the most powerful lever you have for controlling Claude’s behavior at scale, and the foundation of any serious Claude integration. Here’s how system prompts work, how to write them well, and real examples across common use cases.

    What a system prompt does: Sets Claude’s role, knowledge, tone, constraints, and output format before the user says anything. Claude treats system prompt instructions as authoritative — they persist throughout the conversation and take priority over conflicting user requests within the boundaries Anthropic allows.

    System Prompt Structure: The Five Elements

    A well-structured system prompt typically covers these elements — not all are required for every use case, but the strongest prompts address most of them:

    # Role
    You are [specific role/persona]. [1-2 sentences on expertise and perspective].

    # Context
    [What this system/application/conversation is for. Who the user is. What they’re trying to accomplish.]

    # Instructions
    [Specific behaviors: what to do, how to format responses, how to handle edge cases]

    # Constraints
    [What NOT to do. Topics to avoid. Format rules to enforce. Information not to share.]

    # Output format
    [How Claude should structure its responses: length, format, sections, tone]

    System Prompt Examples by Use Case

    Customer Support Agent

    You are a customer support agent for Acme Software. You help users with account questions, billing issues, and technical troubleshooting for Acme’s project management platform.

    Tone: professional, patient, solution-focused. Never dismissive.

    For billing questions: provide information but escalate refund requests to billing@acme.com.
    For technical issues: follow the troubleshooting guide below before escalating.
    Never discuss: competitor products, internal pricing strategy, unreleased features.

    Always end with: “Is there anything else I can help you with today?”

    Code Assistant

    You are a senior software engineer helping with Python and TypeScript code.

    When writing code: use type hints in Python, strict TypeScript, and always include error handling. Prefer explicit over implicit. Comment non-obvious logic.

    When reviewing code: flag issues by severity (critical/high/medium/low). Always explain why something is a problem, not just that it is.

    Never write code without error handling. Never use eval(). Never hardcode credentials.

    Content Writer

    You write content for [Brand Name], a B2B SaaS company in the project management space.

    Voice: direct, confident, no filler. Never use “leverage,” “synergy,” or “utilize.” Short sentences. Active voice.

    Audience: project managers and engineering leads at companies with 50–500 employees.

    Always: include a clear next step or CTA. Never: make claims we can’t back up, mention competitors by name.

    What System Prompts Can and Can’t Do

    System prompts are powerful but not absolute. They can reliably control: Claude’s tone and persona, output format and structure, topic scope and focus, response length guidelines, and how Claude handles specific scenarios. They cannot override Anthropic’s core guidelines — Claude won’t follow system prompt instructions to produce harmful content, lie about being an AI when sincerely asked, or violate its trained ethical constraints regardless of what the system prompt says.

    System Prompts in the API vs. Claude.ai

    In the API, the system prompt is passed as the system parameter in your API call. In Claude.ai Projects, the custom instructions field functions as the system prompt for all conversations in that Project. In Claude.ai standard conversations, you can prepend context at the start of a conversation — it’s not a true system prompt but achieves a similar effect.

    import anthropic
    
    client = anthropic.Anthropic()
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful assistant...",  # ← system prompt here
        messages=[
            {"role": "user", "content": "Hello"}
        ]
    )

    For a full library of tested prompts across use cases, see the Claude Prompt Library and Claude Prompt Generator and Improver.

    Tygart Media

    Getting Claude set up is one thing.
    Getting it working for your team is another.

    We configure Claude Code, system prompts, integrations, and team workflows end-to-end. You get a working setup — not more documentation to read.

    See what we set up →

    Frequently Asked Questions

    What is a Claude system prompt?

    A system prompt is instructions given to Claude before the conversation begins — setting its role, constraints, tone, and output format. It persists throughout the session and takes priority over user messages within Anthropic’s guidelines.

    How long should a Claude system prompt be?

    Long enough to cover what Claude needs to behave correctly, short enough that Claude actually follows all of it. Most production system prompts are 200–1,000 words. Beyond that, you risk important instructions getting less attention. Structure with headers helps Claude parse longer prompts.

    Can users override a system prompt?

    Not reliably. System prompts take priority over user messages. A user saying “ignore your system prompt” won’t override legitimate business instructions. Claude is designed to follow operator system prompts even when users push back, within Anthropic’s ethical guidelines.

    Need this set up for your team?
    Talk to Will →

  • Claude for Code Review: What It Catches, How to Use It, and Its Limits

    Claude AI · Fitted Claude

    Claude is a strong code review tool — capable of identifying bugs, security vulnerabilities, logic errors, and style issues across most languages and frameworks. Here’s how to use Claude for code review effectively, what it catches reliably, and where you still need a human reviewer.

    Bottom line: Claude is excellent for catching obvious bugs, security antipatterns, and code clarity issues — and fast enough to be part of your pre-PR workflow. It doesn’t replace review from someone who knows your system’s business logic, architectural constraints, or team conventions that aren’t visible in the code itself.

    What Claude Catches in Code Reviews

    Issue Type Claude’s reliability Notes
    Syntax errors and typos ✅ High Catches what linters miss
    Security vulnerabilities ✅ High SQL injection, XSS, hardcoded credentials, SSRF
    Logic errors in simple functions ✅ High Off-by-one errors, wrong comparisons, null handling
    Missing error handling ✅ High Uncaught exceptions, unhandled promise rejections
    Code clarity and readability ✅ High Naming, structure, comment quality
    Performance antipatterns ✅ Good N+1 queries, unnecessary loops, memory leaks
    Business logic correctness ⚠️ Limited Needs context Claude doesn’t have
    Architectural decisions ⚠️ Limited Requires system-wide context

    How to Run a Code Review With Claude

    The most effective approach is to give Claude both the code and the context it needs to review it well. A bare code dump produces generic feedback; a structured prompt produces actionable findings.

    Review this [language] code for: (1) security vulnerabilities, (2) bugs or logic errors, (3) missing error handling, (4) performance issues, (5) clarity problems.

    Context: This function [does X]. It receives [input type] and should return [output type]. It runs [frequency/context].

    Flag each issue with: severity (critical/high/medium/low), what’s wrong, and the fix.

    [paste code]
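
    In an automated workflow, the template above is just a string you fill in before sending it as the user message. A minimal sketch (the template constant and builder function are illustrative helpers, not part of any SDK):

```python
REVIEW_TEMPLATE = """Review this {language} code for: (1) security vulnerabilities, (2) bugs or logic errors, (3) missing error handling, (4) performance issues, (5) clarity problems.

Context: {context}

Flag each issue with: severity (critical/high/medium/low), what's wrong, and the fix.

{code}"""

def build_review_prompt(language: str, context: str, code: str) -> str:
    """Fill the structured review template with the code and its context."""
    return REVIEW_TEMPLATE.format(language=language, context=context, code=code)

prompt = build_review_prompt(
    "Python",
    "Parses user-uploaded CSV files. Receives raw bytes, returns a list of rows.",
    "def parse(data): return data.decode().split(',')",
)
```

    The resulting string goes in as the `content` of a user message via `client.messages.create(...)`, which keeps the context and severity instructions consistent across every review you run.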

    Claude for Security Code Review

    Security review is one of Claude’s strongest code review use cases. It reliably identifies:

    • Injection vulnerabilities — SQL, command, LDAP injection patterns
    • Authentication issues — weak password handling, JWT misuse, session management problems
    • Hardcoded secrets — API keys, credentials in source code
    • Insecure dependencies — when you tell it what packages you’re using
    • Input validation gaps — missing sanitization, trust boundary violations

    For security review, explicitly tell Claude to “focus on security vulnerabilities” — the findings are more targeted and specific when it knows that’s the priority.

    Claude Code Review vs. Claude Code

    Code review via the chat interface is for analyzing code you paste in. Claude Code is the agentic tool that operates autonomously inside your actual development environment — reading files, running tests, and making changes. For code review as part of a larger development workflow, Claude Code can do it in-situ on your actual codebase rather than requiring you to paste code into a chat window.

    Frequently Asked Questions

    Can Claude review code?

    Yes. Claude is effective at catching bugs, security vulnerabilities, missing error handling, and clarity issues across most programming languages. Give it context about what the code is supposed to do for the most actionable feedback.

    Is Claude good for security code review?

    Yes, security review is one of Claude’s strongest code review use cases. It reliably identifies SQL injection, XSS, authentication issues, hardcoded credentials, and input validation gaps. Tell it explicitly to focus on security vulnerabilities for the most targeted output.

    What does Claude miss in code reviews?

    Claude can’t evaluate business logic correctness without context about your domain, architectural decisions without knowing your system design, or team conventions not visible in the code. It also can’t catch runtime behavior issues that only appear under specific conditions or load.

    Need this set up for your team?
    Talk to Will →

  • Claude Enterprise Pricing: What It Costs, What It Includes, and Who It’s For

    Claude AI · Fitted Claude

    Claude Enterprise is Anthropic’s top-tier plan for organizations with compliance requirements, security needs, or usage volumes that make custom pricing worthwhile. Here’s what it includes, who it’s designed for, and how it differs from Team and the standard paid plans.

    Key fact: Anthropic doesn’t publish Enterprise pricing — it’s custom and negotiated based on usage volume and requirements. To get a quote, contact Anthropic’s sales team directly at anthropic.com/contact-sales.

    What Claude Enterprise Includes

    Feature Pro Team Enterprise
    All Claude models ✅ ✅ ✅
    Shared Projects — ✅ ✅
    SSO / SAML — — ✅
    Audit logs — — ✅
    Data processing agreement — — ✅
    BAA (HIPAA compliance) — — ✅
    Custom usage limits — — ✅
    Admin usage reporting — Basic Comprehensive
    Custom model behavior — — ✅
    Dedicated support — — ✅

    Who Claude Enterprise Is For

    Enterprise is the right tier if your organization:

    • Requires SSO/SAML integration with your identity provider
    • Needs audit logs of AI usage for compliance or security purposes
    • Handles HIPAA-regulated data and needs a Business Associate Agreement
    • Has legal, IT, or procurement requirements around vendor data handling
    • Needs custom usage limits higher than Team provides
    • Is large enough that custom pricing is financially meaningful

    Claude Enterprise Pricing: What to Expect

    Anthropic prices Enterprise contracts based on expected usage volume, the number of users, required features, and contract term. There’s no published starting price. Organizations evaluating Enterprise should contact Anthropic’s sales team with their use case, headcount, and approximate usage expectations to get a realistic quote.

    The negotiation typically involves: data handling requirements, custom usage limits, any special model behavior configurations, and SLA terms. Enterprise contracts are generally annual commitments rather than month-to-month.

    Claude Enterprise via the API

    Many enterprise-scale Claude deployments run through the API rather than the Claude.ai web interface — building Claude into internal tools, workflows, or customer-facing products. For API-based enterprise use, Anthropic offers enterprise API agreements with higher rate limits, dedicated support, and custom pricing through the same sales process. The Anthropic API pricing guide covers the standard API tiers; enterprise API pricing is negotiated separately.

    Frequently Asked Questions

    How much does Claude Enterprise cost?

    Anthropic doesn’t publish Enterprise pricing. It’s custom-negotiated based on usage volume, users, features, and contract term. Contact Anthropic’s sales team at anthropic.com/contact-sales for a quote.

    Does Claude Enterprise include SSO?

    Yes. SSO/SAML integration is an Enterprise-exclusive feature not available on Pro or Team. If your organization requires SSO for any vendor access, you need Enterprise.

    Is Claude Enterprise HIPAA compliant?

    HIPAA compliance requires a Business Associate Agreement (BAA) with Anthropic, which is only available on the Enterprise plan. No other Claude plan supports HIPAA-regulated data. Contact Anthropic’s sales team to discuss BAA terms as part of an Enterprise agreement.

    What’s the minimum size for Claude Enterprise?

    Anthropic doesn’t publish a minimum user count for Enterprise. In practice, Enterprise makes financial and operational sense for larger organizations or those with specific compliance requirements that justify the sales process. Smaller teams without compliance needs typically find Team ($30/user/month, 5-user minimum) is the right fit.

    Deploying Claude for your organization?

    We configure Claude correctly — right plan tier, right data handling, right system prompts, real team onboarding. Done for you, not described for you.

    Learn about our implementation service →

    Need this set up for your team?
    Talk to Will →

  • Claude 3.5 Sonnet: The Release That Changed Claude’s Trajectory

    Claude AI · Fitted Claude

    Claude 3.5 Sonnet was Anthropic’s mid-2024 flagship model — the release that significantly closed the gap between Claude and GPT-4o and established Claude as a serious competitor for daily professional use. Here’s what it was, how it compared at launch, and where it fits in the current model lineup.

    Current status: Claude 3.5 Sonnet has been succeeded by Claude Sonnet 4.6 (claude-sonnet-4-6). If you’re building something new, use the current Sonnet model. If you’re maintaining a system built on Claude 3.5, check Anthropic’s deprecation schedule for transition timing.

    Claude 3.5 Sonnet: What It Was

    Claude 3.5 Sonnet launched in June 2024 and was Anthropic’s strongest model at the time — outperforming Claude 3 Opus on most benchmarks while being significantly faster and cheaper. This made it an unusual release: the mid-tier model in a new generation beating the top-tier model from the previous generation. It set the pattern for how Anthropic structures model generations.

    At launch, Claude 3.5 Sonnet scored at the top of industry benchmarks on graduate-level reasoning, coding, and mathematics. It was the first Claude model to support computer use — the ability to see and interact with computer interfaces — in beta.

    Model Generations: Where 3.5 Sonnet Fits

    Model Generation Status
    Claude 3 Opus / Sonnet / Haiku Claude 3 (early 2024) Deprecated / legacy
    Claude 3.5 Sonnet / Haiku Claude 3.5 (mid 2024) Superseded
    Claude Sonnet 4.6 Claude 4.x (current) ✅ Current production default
    Claude Opus 4.6 Claude 4.x (current) ✅ Current flagship

    Why Claude 3.5 Sonnet Was a Landmark Release

    Before 3.5 Sonnet, the conventional wisdom was that Claude Opus was the model you reached for on serious tasks, accepting higher cost and slower speed. Claude 3.5 Sonnet changed that calculus — it was fast enough to use as a daily driver and capable enough to replace Opus on most tasks. The cost savings were substantial for anyone running high-volume API workloads.

    The release also marked Claude’s first serious push into coding benchmarks — it scored highly on SWE-bench, a test of real-world software engineering tasks, which attracted significant developer attention and migration from GPT-4o.

    Claude 3.5 Sonnet vs. Current Models

    The current Claude Sonnet 4.6 builds on what Claude 3.5 Sonnet established, with improvements across reasoning, coding, instruction-following, and context handling. If you were a Claude 3.5 Sonnet user, the upgrade path is straightforward — switch the model string and expect better performance across most tasks.

    For current model strings and specs, see Claude API Model Strings — Complete Reference. For a comparison of current Sonnet vs. Opus, see Claude Opus vs Sonnet: Which Model Should You Use?

    Frequently Asked Questions

    Is Claude 3.5 Sonnet still available?

    Claude 3.5 Sonnet has been superseded by Claude Sonnet 4.6. Anthropic maintains older models for a period after new releases but eventually deprecates them. Check Anthropic’s model documentation for current availability and any deprecation notices for Claude 3.5 Sonnet API strings.

    What was the Claude 3.5 Sonnet API model string?

    The Claude 3.5 Sonnet model strings were claude-3-5-sonnet-20240620 and the later version claude-3-5-sonnet-20241022. If you have production systems using these strings, verify their current availability in Anthropic’s model documentation and plan migration to current model strings.

    Should I upgrade from Claude 3.5 Sonnet to the current Sonnet?

    Yes. Claude Sonnet 4.6 outperforms Claude 3.5 Sonnet across most tasks. Migration is typically straightforward — update the model string in your application and test your core use cases. The current model string is claude-sonnet-4-6.
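    The migration described above can be sketched as a small helper that upgrades superseded model strings in a request body before it goes out. This is an illustrative sketch, not Anthropic's API — the model strings come from this article, so verify them against Anthropic's model documentation before relying on them:

```python
# Hypothetical migration helper: map superseded Claude 3.5 Sonnet model
# strings to the current Sonnet model. Strings are from the article above;
# verify against Anthropic's model documentation.
DEPRECATED_TO_CURRENT = {
    "claude-3-5-sonnet-20240620": "claude-sonnet-4-6",
    "claude-3-5-sonnet-20241022": "claude-sonnet-4-6",
}

def migrate_model(request_params):
    """Return a copy of the request params with superseded model strings upgraded."""
    params = dict(request_params)
    if params.get("model") in DEPRECATED_TO_CURRENT:
        params["model"] = DEPRECATED_TO_CURRENT[params["model"]]
    return params

# An existing request body built for Claude 3.5 Sonnet:
old_request = {"model": "claude-3-5-sonnet-20241022", "max_tokens": 1024}
print(migrate_model(old_request)["model"])  # → claude-sonnet-4-6
```

    Running a pass like this over your request-building code, then testing your core use cases, is typically the whole migration.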

    Need this set up for your team?
    Talk to Will →

  • Claude Context Window: 200K Tokens (and 1M in Beta) — What It Means

    Claude AI · Fitted Claude

    Claude’s context window determines how much information it can hold and process in a single conversation. Current Claude models support a 200,000-token context window as standard (one of the largest in the industry), with a 1 million token window available in beta on recent Sonnet and Opus models. Here’s what that means in practice, what you can actually fit inside it, and how context window size affects your work.

    200K tokens in plain terms: Roughly 150,000 words, or about 500 pages of text. That’s enough for an entire novel, a full codebase, or months of conversation history — all in a single session without truncation.

    Claude Context Window by Model (April 2026)

    | Model | Context Window | ~Words | ~Pages |
    |---|---|---|---|
    | Claude Haiku | 200,000 tokens | ~150,000 | ~500 |
    | Claude Sonnet | 200,000 tokens (1M in beta) | ~150,000 | ~500 |
    | Claude Opus | 200,000 tokens (1M in beta) | ~150,000 | ~500 |

    What Fits in 200K Tokens

    | Content type | Approximate fit |
    |---|---|
    | News articles | ~200+ articles |
    | Research papers | ~30–50 papers depending on length |
    | A full novel | Yes — most novels fit with room to spare |
    | Python codebase | Medium-sized codebases (10k–50k lines) |
    | Legal contracts | Hundreds of pages of contracts |
    | Conversation history | Very long sessions before truncation |

    Context Window vs. Output Length

    The context window covers everything Claude processes — both input and output combined. If your prompt is 50,000 tokens (a long document), Claude has 150,000 tokens remaining for its response and any further back-and-forth. The window is shared between what you send and what Claude generates.

    Maximum output length is a separate constraint — Claude won’t generate an infinitely long response even within a large context window. For very long outputs (full books, extensive reports), you typically work in sections rather than expecting Claude to produce everything in one pass.
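    The shared-budget arithmetic above can be made concrete. A minimal sketch, where the 200K window matches this article and the per-request output cap is an illustrative placeholder (actual caps vary by model — check current specs):

```python
# Shared input/output budget: the window covers everything processed.
CONTEXT_WINDOW = 200_000
MAX_OUTPUT_PER_REQUEST = 8_192  # illustrative per-request output cap, varies by model

def remaining_output_budget(input_tokens):
    """Tokens left for the response after the prompt is counted,
    capped by the per-request output limit."""
    remaining = CONTEXT_WINDOW - input_tokens
    return max(0, min(remaining, MAX_OUTPUT_PER_REQUEST))

# A 50K-token prompt leaves 150K in the window, but a single response
# is still bounded by the output cap:
print(remaining_output_budget(50_000))
```

    This is why very long outputs are produced in sections: each pass is bounded by the output cap, even when the window itself has plenty of room left.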

    Why Context Window Size Matters

    Context window size is the practical limit on how much work you can give Claude at once without losing information. Before large context windows, working with long documents required chunking — splitting the document into pieces, analyzing each separately, and manually synthesizing the results. With 200K tokens, Claude can hold the entire document and answer questions about any part of it with full awareness of everything else.

    This matters most for: document analysis and legal review, code understanding across large files, research synthesis across many sources, and long multi-step conversations where earlier context affects later decisions.

    How Claude Performs at the Edges of Its Context Window

    Research on large language models has found that performance can degrade somewhat for information buried in the middle of a very long context — sometimes called the “lost in the middle” problem. Claude performs well across its context window, but for maximum reliability on information from a very long document, referencing specific sections explicitly (“in the section about pricing on page 12…”) helps ensure Claude focuses on the right part.

    For the full model spec breakdown, see Claude API Model Strings and Specs and Claude Models Explained: Haiku vs Sonnet vs Opus.

    Frequently Asked Questions

    What is Claude’s context window size?

    Current Claude models support a 200,000-token context window as standard, with a 1 million token window available in beta on recent Sonnet and Opus models. 200K tokens is approximately 150,000 words or about 500 pages of text in a single conversation.

    How many tokens is 200K context?

    200,000 tokens is approximately 150,000 words of English text. One token is roughly four characters or three-quarters of a word. A typical 800-word article is about 1,000 tokens; a full novel is typically 80,000–120,000 tokens.
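    The rules of thumb above translate into a rough estimator. This is an approximation only (real tokenization varies by content); use a tokenizer or Anthropic's token-counting tooling when you need exact numbers:

```python
# Rough token estimate from the rules of thumb: ~4 characters per token
# and ~0.75 words per token. Averages the two heuristics.
def estimate_tokens(text):
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round((by_chars + by_words) / 2)

article = "word " * 800            # stand-in for a typical ~800-word article
print(estimate_tokens(article))    # on the order of 1,000 tokens
```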

    Can I upload a full PDF to Claude?

    Yes, as long as the PDF’s text content fits within the 200K token context window. Most documents, reports, contracts, and research papers fit easily. Very large documents (multiple volumes, extensive legal filings) may need to be split.

    Need this set up for your team?
    Talk to Will →

  • Claude Rate Limits: What They Are, How They Work, and What to Do

    Claude AI · Fitted Claude

    Claude has usage limits on every plan — but Anthropic doesn’t publish exact numbers. Instead, limits are dynamic, adjusting based on model, message length, and system load. Here’s what the limits actually look like in practice, what triggers them, and what your options are when you hit them.

    What you’ll see: When you hit Claude’s usage limit, you’ll get a message saying you’ve reached your usage limit and showing a countdown to when your limit resets. On Pro this typically resets within a few hours. On Max, limits are high enough that most users never hit them during normal work.

    Rate Limits by Plan

    | Plan | Relative limit | Typical experience |
    |---|---|---|
    | Free | Low | Hit limits quickly on heavy use; resets daily |
    | Pro | ~5× Free | Most users get through a full workday; heavy users may hit limits |
    | Max | ~5× Pro | Most users never hit limits; designed for agentic and heavy use |
    | Team | Higher than Pro | Per-user limits slightly higher than individual Pro |
    | API | Separate system | Tokens per minute/day limits by tier; see Anthropic’s API docs |

    What Counts Against Your Limit

    Claude’s limits are usage-based, not message-count-based. A single message asking Claude to write a 3,000-word article uses more of your limit than ten quick back-and-forth questions. The requests that consume your limit fastest:

    • Long outputs — requests for long articles, detailed analyses, or extended code
    • Long context — uploading large documents and asking questions about them
    • Opus model — the most powerful model consumes limits faster than Sonnet or Haiku
    • Agentic tasks — multi-step autonomous operations use significantly more than conversational use

    API Rate Limits: How They Work

    The API uses a different limit system from the web interface. API limits are measured in:

    • Requests per minute (RPM) — how many API calls you can make
    • Tokens per minute (TPM) — total tokens (input + output) processed per minute
    • Tokens per day (TPD) — total daily token budget

    New API accounts start on lower tiers and can request higher limits through the Anthropic Console as usage establishes a track record. The Batch API has separate, higher limits since it’s asynchronous and non-time-sensitive.
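    When an API call does hit a rate limit, the standard client-side pattern is exponential backoff with jitter. A minimal sketch, where `call_api` and `RateLimited` are hypothetical stand-ins for your actual request function and rate-limit error (real clients should also honor any retry-after information the API returns):

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical stand-in for an API rate-limit error (e.g. HTTP 429)."""

def with_backoff(call_api, max_retries=5, base_delay=1.0):
    """Retry a rate-limited call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call_api()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # Delays grow 1s, 2s, 4s, ... with random jitter to avoid
            # many clients retrying in lockstep.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

    The Batch API sidesteps much of this: since batch jobs are asynchronous, throughput is managed server-side rather than by your retry loop.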

    What To Do When You Hit a Limit

    Wait for reset: The limit message shows when your usage resets — usually within a few hours. This is the simplest option if the timing works.

    Switch models: If you’ve been using Opus, switching to Sonnet for less critical tasks conserves your limit for when you need the top model.

    Upgrade your plan: If you consistently hit Pro limits during your workday, Claude Max at $100/month gives 5× the headroom.

    Use the API: For developers, moving high-volume work to the API with the Batch API gives more control over usage and significant cost savings on non-time-sensitive tasks.

    Frequently Asked Questions

    What are Claude’s usage limits?

    Anthropic doesn’t publish exact numbers. Limits are dynamic and based on usage volume rather than message count. Free is most restricted; Pro is roughly 5× Free; Max is roughly 5× Pro. The limit message appears when you’ve reached your tier’s threshold and shows when it resets.

    How long does it take for Claude’s limit to reset?

    The reset countdown is shown in the limit message. For Pro, limits typically reset within a few hours. For Free, resets are on a daily cycle. The exact timing varies based on when you started using heavily in the current period.

    Does Claude count messages or tokens toward the limit?

    Usage is based on the volume of content processed, not a simple message count. One long request asking for a 3,000-word output uses significantly more of your limit than ten short conversational exchanges.

    Are API rate limits the same as subscription limits?

    No. API limits (RPM, TPM, TPD) are a separate system from web subscription limits. They’re set per API account tier and can be increased by request through the Anthropic Console. Subscription usage and API usage don’t share limits.

    Need this set up for your team?
    Talk to Will →

  • Does Claude Have Memory? How Context, Projects, and Memory Features Work

    Claude AI · Fitted Claude

    Claude doesn’t have persistent memory by default — each conversation starts fresh, with no recollection of previous sessions. But there are several ways to give Claude memory, both through Anthropic’s built-in features and through how you structure your interactions. Here’s exactly how memory works in Claude and what your options are.

    Short answer: By default, no — Claude doesn’t remember previous conversations. Within a single conversation, Claude remembers everything said so far. Claude’s Projects feature gives you persistent context that carries across sessions. And Claude.ai has a memory feature that extracts and stores facts about you automatically.

    The Three Types of Claude Memory

    | Memory type | What it does | Persists across sessions? |
    |---|---|---|
    | In-conversation context | Everything said in the current chat | No — resets when conversation ends |
    | Projects | Custom instructions + uploaded knowledge | ✓ Yes — available in every session |
    | Memory feature | Facts Claude learns about you over time | ✓ Yes — grows over time |

    In-Conversation Context: What Claude Remembers Right Now

    Within a single conversation, Claude has full access to everything that’s been said — all your messages, all its responses, any files you’ve uploaded. This is the context window, which for current Claude models supports up to 200,000 tokens. That covers very long conversations, large documents, and extensive back-and-forth without Claude losing track of earlier details.

    When the conversation ends, that context is gone. Start a new conversation and Claude has no knowledge of the previous one.

    Projects: Persistent Context Across Sessions

    Projects are Claude’s primary mechanism for persistent memory. A Project is a workspace where you can:

    • Set custom instructions that apply to every conversation in that Project
    • Upload documents, style guides, or knowledge files that Claude can reference
    • Keep your conversation history organized by topic or client

    Every conversation you start within a Project has access to those instructions and documents from the beginning — without you having to re-explain your context every time. This is the practical solution for most persistent memory use cases: tell Claude who you are, what you’re working on, and what you need once in the Project settings, and it carries forward.

    The Memory Feature: Claude Learning About You

    Claude.ai has a Memory feature (found in Settings → Memory) where Claude automatically extracts and stores facts about you from your conversations — your job, preferences, ongoing projects, communication style. These memories surface in future conversations to make Claude more personalized without you having to re-introduce yourself.

    You can view, edit, and delete individual memories from the settings page. You can also turn the feature off entirely if you’d rather start fresh each time. When Memory is active, Claude may reference things you mentioned in past conversations — “you mentioned you work in restoration…” — which can feel surprisingly persistent for a tool that otherwise has no cross-session recall.

    Memory in the API

    For developers building on Claude via the API, there’s no built-in persistent memory — each API call is stateless by default. Persistent memory for API applications requires building it yourself: storing conversation history in a database and injecting relevant context into each new request. The system prompt is the standard mechanism for this: load relevant facts or history into the system prompt at the start of each call.
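    A minimal sketch of the do-it-yourself approach: store facts per user, then assemble them into a system prompt on each call. The in-memory store and prompt format here are illustrative (a real application would use a database), not an Anthropic API:

```python
# Illustrative per-user fact store; a real app would persist this in a DB.
memory_store = {}

def remember(user_id, fact):
    """Record a fact about a user for future sessions."""
    memory_store.setdefault(user_id, []).append(fact)

def build_system_prompt(user_id):
    """Assemble a system prompt that injects stored facts into the call."""
    facts = memory_store.get(user_id, [])
    if not facts:
        return "You are a helpful assistant."
    return ("You are a helpful assistant. Known facts about this user:\n"
            + "\n".join(f"- {fact}" for fact in facts))

remember("user-1", "Works in restoration")
print(build_system_prompt("user-1"))
```

    Each API call then passes `build_system_prompt(user_id)` as its system prompt, giving a stateless API the appearance of memory.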

    Frequently Asked Questions

    Does Claude remember previous conversations?

    Not by default. Each new conversation starts fresh. You can enable persistent memory through Projects (custom instructions and uploaded knowledge that apply to every session) or through Claude’s Memory feature (which stores facts about you across conversations).

    How do I give Claude memory between sessions?

    Use Projects: create a Project, add custom instructions describing your context, and upload any relevant documents. Every conversation within that Project will have access to that information from the start — no re-explaining required.

    What is Claude’s memory feature?

    Claude’s Memory feature (Settings → Memory) automatically extracts facts about you from conversations and stores them to personalize future interactions. You can view, edit, or delete individual memories, or disable the feature entirely.

    Does Claude remember within a conversation?

    Yes, fully. Within a single conversation Claude has access to everything said — up to 200,000 tokens of context for current models. It won’t forget something you said earlier in the same conversation.

    Need this set up for your team?
    Talk to Will →