Tygart Media Editorial - Tygart Media

Category: Tygart Media Editorial

Tygart Media’s core editorial publication — AI implementation, content strategy, SEO, agency operations, and case studies.

  • Claude API Pricing Explained: Token Costs, Rate Limits, and How to Calculate Your Monthly Bill

    Claude’s API pricing is token-based: you pay for the tokens you send (input) and the tokens Claude generates (output). But raw per-token prices are only part of the story. Rate limits, service tiers, prompt caching, batch processing, and feature-specific charges all affect your actual bill. This guide covers every component of Claude API pricing as of June 2026.

    Per-Token Pricing by Model

    All prices are per million tokens (MTok). Opus 4.8, Anthropic’s most intelligent model for agents and coding, costs $5/MTok input and $25/MTok output. Sonnet 4.6, the balanced option for most production workloads, costs $3/MTok input and $15/MTok output. Haiku 4.5, the fastest and cheapest model, costs $1/MTok input and $5/MTok output. Across all current-generation models, output tokens cost exactly 5x input tokens.
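
    As a rough illustration, the per-request cost at these list prices can be computed as follows. The `PRICES` table and the `request_cost` helper are hypothetical names for this sketch, not part of any Anthropic SDK, and the dictionary keys are informal labels rather than API model IDs.

```python
# Hypothetical price table (USD per million tokens), copied from the rates quoted above.
# The keys are informal labels for this sketch, not official API model identifiers.
PRICES = {
    "opus-4.8": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of a single request."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Compare the three models on the same request shape.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 2_000, 500):.4f} per request")
```

    Keeping the rates in one table makes it easy to update the sketch when prices change.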

    Prompt Caching Pricing

    Prompt caching lets you store frequently used context (system prompts, reference documents, conversation history) so you don’t pay full input price every time. Caching has two cost components: a cache write at 1.25x the standard input rate (charged when the content is first cached), and a cache read at approximately 10% of the standard input rate. For Opus 4.8, cache writes cost $6.25/MTok and cache reads cost $0.50/MTok. For Sonnet 4.6, writes are $3.75/MTok and reads are $0.30/MTok. For Haiku 4.5, writes are $1.25/MTok and reads are $0.10/MTok. The default cache TTL is 5 minutes, with extended 1-hour caching available.
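
    To see when caching pays for itself, the sketch below compares the cost of resending a shared prefix with and without caching. It assumes a single cache write followed by cache reads that all land within the TTL; `prefix_cost` is a hypothetical helper, and the multipliers are the ones quoted above.

```python
WRITE_MULTIPLIER = 1.25  # cache write, relative to the standard input rate
READ_MULTIPLIER = 0.10   # cache read, approximately 10% of the standard input rate


def prefix_cost(input_rate: float, prefix_tokens: int, uses: int, cached: bool) -> float:
    """Cost in USD of sending the same prefix `uses` times.

    Assumption: with caching, the first use is a cache write and every later
    use is a cache read (no expiry between requests).
    """
    if uses <= 0:
        return 0.0
    mtok = prefix_tokens / 1_000_000
    if not cached:
        return uses * mtok * input_rate
    return mtok * input_rate * (WRITE_MULTIPLIER + READ_MULTIPLIER * (uses - 1))


if __name__ == "__main__":
    # A 10,000-token system prompt on Sonnet 4.6 ($3/MTok input).
    for uses in (1, 2, 100):
        plain = prefix_cost(3.00, 10_000, uses, cached=False)
        cached = prefix_cost(3.00, 10_000, uses, cached=True)
        print(f"{uses:>3} uses: uncached ${plain:.4f}, cached ${cached:.4f}")
```

    Under these assumptions a single use is slightly more expensive with caching (the write premium), and caching comes out ahead from the second use onward.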

    Batch Processing: 50% Off

    The Batch API processes requests asynchronously and charges half the standard rate. If you have workloads that don’t need real-time responses (document processing, content generation, data analysis), batch processing cuts their token costs in half. Combining batch processing with prompt caching can reduce the cost of cached input tokens by up to 95% compared to standard synchronous requests, because a cache read at roughly 10% of the input rate is then halved again.
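
    The "up to 95%" figure follows from multiplying the two discounts. A minimal sketch, assuming the batch discount stacks on top of the cache-read rate as the text implies (`effective_input_rate` is a hypothetical helper):

```python
BATCH_MULTIPLIER = 0.50       # batch requests are billed at half the standard rate
CACHE_READ_MULTIPLIER = 0.10  # cache reads cost roughly 10% of the input rate


def effective_input_rate(input_rate: float, cached: bool = False, batched: bool = False) -> float:
    """USD per MTok for input tokens after the selected discounts.

    Assumption: the batch and cache-read discounts multiply together.
    """
    rate = input_rate
    if cached:
        rate *= CACHE_READ_MULTIPLIER
    if batched:
        rate *= BATCH_MULTIPLIER
    return rate


if __name__ == "__main__":
    base = 3.00  # Sonnet 4.6 input, USD per MTok
    best = effective_input_rate(base, cached=True, batched=True)
    print(f"Effective rate: ${best:.2f}/MTok ({(1 - best / base) * 100:.0f}% below list)")
```

    The 95% reduction applies only to input tokens that are both cached and batched. Output tokens and uncached input see at most the 50% batch discount.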

    How to Calculate Your Monthly Bill

    A practical example: suppose your application sends an average of 2,000 tokens of input and receives 500 tokens of output per request, and you make 10,000 requests per day using Sonnet 4.6. Daily input tokens: 2,000 × 10,000 = 20M tokens → 20 MTok × $3 = $60/day. Daily output tokens: 500 × 10,000 = 5M tokens → 5 MTok × $15 = $75/day. Daily total: $135/day. Monthly total (30 days): approximately $4,050/month.

    Now apply optimizations. If 80% of your input is cacheable after the first request: cached input = 16 MTok × $0.30 = $4.80 + uncached 4 MTok × $3 = $12 → $16.80/day of input instead of $60 (ignoring cache-write charges). With output unchanged at $75/day, that is $91.80/day, or about $2,750/month. If you can also batch 50% of requests, that half of the spend drops by 50%: $91.80 × 0.75 = $68.85/day. Optimized monthly estimate: roughly $2,070/month versus $4,050 at list price.
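
    The worked example can be reproduced with a short script. This is a sketch under simplifying assumptions: cache-write charges are ignored, and the batch discount is applied to the batched share of the whole daily bill. The function and constant names are hypothetical.

```python
# Sonnet 4.6 rates from the example, in USD per million tokens.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00
CACHE_READ_RATE = 0.30
BATCH_DISCOUNT = 0.50  # batched requests are billed at half price


def daily_cost(requests, input_tokens, output_tokens, cached_share=0.0, batched_share=0.0):
    """Estimated daily spend in USD.

    Assumptions: cache-write charges are ignored, and the batch discount
    applies to `batched_share` of the total (input plus output) spend.
    """
    input_mtok = requests * input_tokens / 1_000_000
    output_mtok = requests * output_tokens / 1_000_000
    blended_input_rate = cached_share * CACHE_READ_RATE + (1 - cached_share) * INPUT_RATE
    subtotal = input_mtok * blended_input_rate + output_mtok * OUTPUT_RATE
    return subtotal * (1 - batched_share * BATCH_DISCOUNT)


list_price = daily_cost(10_000, 2_000, 500)
optimized = daily_cost(10_000, 2_000, 500, cached_share=0.8, batched_share=0.5)

print(f"List price: ${list_price:.2f}/day, ${list_price * 30:,.2f}/month")
print(f"Optimized:  ${optimized:.2f}/day, ${optimized * 30:,.2f}/month")
```

    With these inputs the script reports $135.00/day at list price and $68.85/day after caching and batching, which is about $2,066 over a 30-day month.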

    Service Tiers and Rate Limits

    Anthropic offers three service tiers that affect availability and pricing. Priority tier guarantees availability and predictable pricing for time-sensitive workloads. Standard tier is the default for both piloting and scaling everyday use cases. Batch tier offers 50% savings for asynchronous workloads. Rate limits — requests per minute and tokens per minute — increase as your account matures and spending grows. You can view your current limits in the Anthropic Console.

    Additional Platform Costs

    Beyond token costs, Anthropic charges for specific platform features. Managed Agents cost $0.08 per session-hour for active runtime plus standard token rates. Web search costs $10 per 1,000 searches (tokens for processing the search results are billed separately). Code execution includes 50 free hours daily per organization with additional hours at $0.05/hour. US-only inference for data residency requirements costs 1.1x standard token rates. Fast mode for Opus 4.8 costs 2x standard pricing for up to 2.5x faster speeds.
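
    These feature charges sit on top of token costs. The sketch below estimates one day of platform fees from the rates above. `daily_platform_fees` is a hypothetical helper, and it treats the free code-execution allowance as a simple daily deduction.

```python
AGENT_SESSION_HOUR = 0.08    # USD per Managed Agents session-hour
WEB_SEARCH_PER_1000 = 10.00  # USD per 1,000 web searches
CODE_EXEC_FREE_HOURS = 50    # free code-execution hours per organization per day
CODE_EXEC_HOUR = 0.05        # USD per additional code-execution hour


def daily_platform_fees(agent_hours: float, web_searches: int, code_exec_hours: float) -> dict:
    """Estimate one day of feature charges in USD, excluding all token costs."""
    agents = agent_hours * AGENT_SESSION_HOUR
    search = web_searches / 1000 * WEB_SEARCH_PER_1000
    billable_hours = max(0.0, code_exec_hours - CODE_EXEC_FREE_HOURS)
    code_exec = billable_hours * CODE_EXEC_HOUR
    return {
        "agents": agents,
        "web_search": search,
        "code_execution": code_exec,
        "total": agents + search + code_exec,
    }


if __name__ == "__main__":
    fees = daily_platform_fees(agent_hours=100, web_searches=2_500, code_exec_hours=60)
    for name, amount in fees.items():
        print(f"{name}: ${amount:.2f}")
```

    Token charges for agent sessions and for processing search results are billed separately and are not included in this estimate.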

    Frequently Asked Questions

    How much does Claude API cost for a small project?

    A small project making 100-500 API calls per day with Haiku 4.5 might cost $5-30/month. Using Sonnet 4.6 at the same volume would be roughly $15-90/month. Your actual cost depends on the length of inputs and outputs.

    Is there a free tier for the Claude API?

    Anthropic does not offer a permanent free API tier. You need to add a payment method and load credits to use the API. New accounts start with conservative rate limits that increase over time.

    What’s the cheapest way to use the Claude API?

    Use Haiku 4.5 ($1/MTok input), enable prompt caching for repeated context (90% savings on cached reads), and use batch processing for non-real-time work (50% off). The combination can reduce effective costs by over 90%.

    How do Claude API costs compare to OpenAI?

    At the flagship level, Claude Opus 4.8 ($5/$25 per MTok) is competitive with GPT-4-class pricing. At the mid-tier, Sonnet 4.6 ($3/$15) competes with GPT-4o. At the economy tier, Haiku 4.5 ($1/$5) competes with GPT-4o-mini. Both platforms offer similar cost optimization features.

    Related: Claude AI Pricing (2026) — every plan, API rate, and the cost calculator

  • Claude in Chrome: What It Does, How to Set It Up, and Practical Use Cases in 2026

    Claude in Chrome is a browser extension that brings Claude directly into your web browsing experience. Rather than switching between tabs to copy-paste content into Claude, the extension lets Claude see and interact with the page you’re viewing. It launched as a beta feature and has become one of the most practical ways to use Claude for daily knowledge work. Here’s what it actually does, how to get it running, and where it shines.

    What Claude in Chrome Actually Does

    The extension gives Claude the ability to read the content of web pages you’re viewing and take actions within the browser. When activated, Claude can read and summarize articles, reports, documentation, or any text-heavy page. It can extract key information from complex pages like product comparisons, financial reports, or academic papers. It can help you draft responses to emails and messages while viewing them. It can analyze data tables and charts visible on web pages. It can assist with form filling and data entry tasks. It can also help navigate complex web applications.

    The extension works through a sidepanel interface — Claude appears alongside your browser content rather than replacing it. This side-by-side layout is what makes it practical: you can reference the page content while working with Claude’s output.

    How to Install Claude in Chrome

    Claude in Chrome is available through the Chrome Web Store. Search for “Claude” or navigate directly to the extension page. Click “Add to Chrome” and confirm the permissions. Once installed, you’ll see the Claude icon in your browser toolbar. Click it to open the sidepanel interface. You’ll need to sign in with your Claude account — the extension works with Free, Pro, Max, Team, and Enterprise plans.

    Practical Use Cases

    Research and summarization is the most common use case. When you’re reading a long article, technical documentation, or research paper, Claude can summarize it, extract key arguments, identify the main data points, and highlight what’s novel versus what’s already well-established. This works especially well with academic papers, legal documents, and technical specifications.

    Competitive analysis becomes faster when Claude can read competitor websites directly. Open a competitor’s pricing page, product page, or blog and ask Claude to compare it against your offering. No more copying and pasting between tabs.

    Email and messaging gets a boost when Claude can see the email you’re replying to. It understands the context — tone, topic, relationship dynamics — and can draft responses that match.

    Data extraction from web tables, dashboards, and reports is another strong use case. Claude can read HTML tables, identify patterns, and help you pull specific numbers without manual work.

    Learning and studying is enhanced when Claude can see the material you’re working through. Open a textbook chapter online, a course page, or documentation, and ask Claude to explain concepts, quiz you, or create study notes.

    What Claude in Chrome Cannot Do

    The extension has limitations worth understanding. It cannot access pages behind login walls unless you’re already authenticated. It cannot always interact with content inside iframes or heavily JavaScript-rendered single-page applications. It does not have access to your browsing history, saved passwords, or other browser data. It cannot make purchases, submit forms, or take irreversible actions without your explicit confirmation.

    Privacy and Security

    Claude in Chrome only accesses page content when you actively invoke it. It does not passively monitor your browsing. Page content sent to Claude follows the same data handling policies as regular Claude conversations — on Team and Enterprise plans, content is not used for model training by default. The extension requires specific permissions that are reviewed during installation.

    Claude in Chrome vs Claude Desktop App

    The Chrome extension and the Claude desktop app serve different purposes. The desktop app (available for macOS and Windows) provides Claude Code, Cowork mode, and can interact with your local file system. The Chrome extension is browser-specific — it reads web pages and operates within Chrome. Many users run both: the desktop app for deep work with files and code, and the Chrome extension for web-based tasks.

    Frequently Asked Questions

    Is Claude in Chrome free?

    The extension itself is free to install. It uses your Claude account’s usage allowance — so free-tier users can use it within their free limits, and paid users get their plan’s full usage.

    Does Claude in Chrome work with other browsers?

    As of June 2026, Claude in Chrome is built specifically for Google Chrome. It may work on Chromium-based browsers like Edge and Brave, but it is officially supported only on Chrome.

    Can Claude in Chrome see my passwords or personal data?

    No. Claude in Chrome only reads the visible content of pages you actively share with it. It does not access saved passwords, autofill data, browsing history, or other stored browser information.

    How is Claude in Chrome different from Claude for Microsoft 365?

    Claude in Chrome works within your web browser on any website. Claude for Microsoft 365 integrates directly into Word, Outlook, Teams, and other Microsoft applications. They are separate products that serve different workflows.

  • How Much Does Claude AI Cost? The Plain-English Pricing Breakdown for 2026

    If you searched “how much is Claude AI” or “Claude AI cost,” you’re probably looking for a straightforward answer, not a marketing page. Here it is: Claude has a free tier that costs nothing, a Pro plan at $20/month, a Max plan starting at $100/month, a Team plan starting at $20/seat/month, Enterprise pricing at $20/seat plus usage, and API access billed per token. Let’s break down what each actually gets you.

    The Free Tier: $0

    Claude’s free tier is genuinely free — no credit card required, no trial period. You get access to chat on web, mobile, and desktop apps. You can search the web, use memory across conversations, create and execute code, and even use extended thinking for complex tasks. The catch is usage limits: you’ll hit rate limits faster than paid users, and during high-traffic periods, free users may experience wait times.

    The free tier is surprisingly capable. You can connect Slack and Google Workspace, use desktop extensions, and access remote MCP integrations. For someone who uses Claude a few times a day for quick questions, writing help, or light coding, the free tier may be all you need.

    Claude Pro: $20/Month

    Pro costs $20/month billed monthly or $17/month if you pay annually ($200 upfront). Pro unlocks significantly more usage than the free tier, plus Claude Code (the command-line coding tool), Claude Cowork (the desktop automation tool), unlimited Projects, Research mode, access to additional models, and Claude for Microsoft 365 and Outlook. If you use Claude daily for work — writing, coding, analysis, research — Pro is the sweet spot for most individual users.

    Claude Max: $100 or $200/Month

    Max comes in two tiers. The $100/month tier gives you approximately 5x the usage of Pro. The $200/month tier gives approximately 20x. Max also adds higher output limits, early access to advanced features, and priority access during peak times. Max is for power users — people who spend hours a day in Claude Code, run long research sessions, or produce high volumes of content.

    Claude Team: From $20/Seat/Month

    Team pricing requires a minimum of 5 seats. Standard seats cost $25/seat/month (monthly) or $20/seat/month (annual). Premium seats cost $125/seat/month (monthly) or $100/seat/month (annual) for 5x the usage. Teams get SSO, central billing, admin controls, enterprise desktop deployment, and content that isn’t used for model training by default.

    Claude Enterprise: $20/Seat + Usage

    Enterprise charges $20/seat as a base, with additional usage billed at API rates. Enterprise adds SCIM, audit logs, compliance API, custom data retention, HIPAA readiness, IP allowlisting, role-based access, and Claude Security. Enterprise is available both as self-serve (sign up directly) and sales-assisted (custom contracts).

    Claude API: Pay Per Token

    If you’re building applications with Claude, API pricing is separate from subscription plans. The most cost-efficient model, Haiku 4.5, costs $1 per million input tokens and $5 per million output tokens. Sonnet 4.6 costs $3/$15. Opus 4.8 costs $5/$25. Batch processing cuts all rates by 50%, and prompt caching can reduce repeated input costs by up to 90%.

    Quick Cost Comparison Table

    Here’s a summary of what you’ll pay at each tier:

      • Free: $0, with basic usage limits
      • Pro: $20/month ($17/month on annual billing), with standard usage
      • Max 5x: $100/month, with 5x Pro usage
      • Max 20x: $200/month, with 20x Pro usage
      • Team Standard: $20-25/seat/month
      • Team Premium: $100-125/seat/month
      • Enterprise: $20/seat plus usage at API rates
      • API, Haiku 4.5: $1/MTok input
      • API, Sonnet 4.6: $3/MTok input
      • API, Opus 4.8: $5/MTok input
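
    For the plans that quote both monthly and annual prices, the yearly difference is simple arithmetic. A small sketch, with `annual_savings` as a hypothetical helper and the prices taken from the sections above:

```python
def annual_savings(monthly_price: float, annual_total: float) -> float:
    """USD saved per year by paying annually instead of month to month."""
    return monthly_price * 12 - annual_total


if __name__ == "__main__":
    # Pro: $20/month billed monthly, or $200 billed upfront for the year.
    print(f"Pro: ${annual_savings(20, 200):.0f} saved per year")
    # Team seats: the annual price is quoted per seat per month, so multiply by 12.
    print(f"Team Standard: ${annual_savings(25, 20 * 12):.0f} saved per seat per year")
    print(f"Team Premium: ${annual_savings(125, 100 * 12):.0f} saved per seat per year")
```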

    Frequently Asked Questions

    How much is Claude AI per month?

    Claude AI ranges from $0 (free tier) to $200/month (Max 20x) for individuals. Team plans start at $20/seat/month on annual billing. The most common paid tier is Pro at $20/month.

    Is Claude more expensive than ChatGPT?

    Claude Pro ($20/month) and ChatGPT Plus ($20/month) are priced identically. At the API level, Claude’s newest Opus models ($5/$25 per MTok) are competitive with GPT-4-class pricing. Both platforms offer free tiers.

    Can I use Claude for free forever?

    Yes. Claude’s free tier is not a trial — it’s a permanent plan with no expiration. Usage limits apply, but there’s no time restriction on free access.

    What’s the best value Claude plan?

    For most individual users, Pro at $20/month (or $17 annual) offers the best balance of features and usage. For teams, Standard seats at $20/seat/month (annual) provide the core collaborative features at a reasonable price point.

  • Claude Team Pricing in 2026: Standard vs Premium Seats, What’s Included, and How to Choose

    Claude’s Team plan is built for groups of 5 to 150 people who need collaborative AI access with centralized administration. As of June 2026, Anthropic offers two seat types within the Team plan — Standard and Premium — with meaningfully different usage allowances and price points. This guide breaks down exactly what each seat type includes, what the real costs look like, and how to decide which mix works for your organization.

    Team Plan Pricing Overview

    The Team plan uses per-seat pricing with two tiers. Standard seats cost $25 per seat per month on monthly billing, or $20 per seat per month on annual billing. Premium seats cost $125 per seat per month on monthly billing, or $100 per seat per month on annual billing. You can mix and match seat types within the same organization — not everyone needs the same usage level.

    For a 10-person team on annual billing with 7 Standard and 3 Premium seats, the monthly cost would be (7 × $20) + (3 × $100) = $440/month, or $5,280/year. Compare that to putting all 10 on Standard ($200/month) or all 10 on Premium ($1,000/month) to see why the mix-and-match model matters.
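
    The same calculation generalizes to any seat mix. The helper below is a hypothetical sketch that uses the seat prices and the 5 to 150 seat range stated in this guide:

```python
# USD per seat per month, keyed by (seat type, billing period).
SEAT_PRICES = {
    ("standard", "monthly"): 25,
    ("standard", "annual"): 20,
    ("premium", "monthly"): 125,
    ("premium", "annual"): 100,
}

MIN_SEATS = 5
MAX_SEATS = 150


def team_monthly_cost(standard: int, premium: int, billing: str = "annual") -> int:
    """Monthly Team plan cost in USD for a mix of Standard and Premium seats."""
    seats = standard + premium
    if not MIN_SEATS <= seats <= MAX_SEATS:
        raise ValueError(f"Team plans cover {MIN_SEATS} to {MAX_SEATS} seats, got {seats}")
    return (standard * SEAT_PRICES[("standard", billing)]
            + premium * SEAT_PRICES[("premium", billing)])


if __name__ == "__main__":
    mixed = team_monthly_cost(standard=7, premium=3)
    print(f"7 Standard + 3 Premium: ${mixed}/month, ${mixed * 12:,}/year")
    print(f"All Standard: ${team_monthly_cost(10, 0)}/month")
    print(f"All Premium: ${team_monthly_cost(0, 10)}/month")
```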

    What Standard Seats Include

    Standard seats include all Claude features — chat across web, iOS, Android, and desktop — plus more usage than what individual Pro subscribers get. Standard seat holders can access Claude Code and Claude Cowork, connect Microsoft 365, Slack, and other integrations, and use Enterprise search across the organization. They get SSO, admin controls, and the enterprise desktop app deployment. The key differentiator from Pro is the organizational layer: centralized billing, admin controls, and content that isn’t used for model training by default.

    What Premium Seats Add

    Premium seats provide approximately 5x the usage of Standard seats. This is designed for power users — engineers running Claude Code all day, researchers doing deep analysis sessions, content teams producing high volumes of output. Premium seats are the Team-plan equivalent of individual Max plans, but with all the organizational infrastructure (SSO, admin controls, no training on content) included.

    Team Plan vs Individual Pro/Max Plans

    The question many organizations face: should each person just buy their own Pro or Max subscription? The Team plan adds several capabilities that individual plans lack. Central billing means one invoice instead of individual expense reports. SSO and domain capture ensure that everyone in your organization uses the managed account. Admin controls let you manage connectors and desktop app deployment centrally. Content is not used for model training by default — individual free and Pro accounts have an opt-out option, but Team accounts are opted out by default. Enterprise search lets team members search across organizational knowledge.

    Team Plan vs Enterprise Plan

    The Team plan caps at 150 users. If you need more, or if you need features like SCIM provisioning, audit logs, compliance API, custom data retention, HIPAA readiness, IP allowlisting, or role-based access with fine-grained permissions, you need Enterprise. Enterprise pricing starts at $20/seat with usage at API rates — the per-seat cost is actually lower, but total cost depends on how much your team uses Claude.

    How to Choose Between Standard and Premium Seats

    Start with Standard seats for everyone and monitor usage. If specific team members consistently hit rate limits — especially developers using Claude Code heavily or analysts running extended research sessions — upgrade those individuals to Premium seats. The mix-and-match model means you don’t need to over-provision. A typical pattern for a 20-person team might be 4-5 Premium seats for heavy users and 15-16 Standard seats for everyone else.

    Frequently Asked Questions

    What is the minimum team size for Claude Team?

    The Claude Team plan requires a minimum of 5 seats. You can mix Standard and Premium seats within that minimum.

    Can I switch between Standard and Premium seats?

    Yes. Administrators can upgrade individual seats from Standard to Premium or downgrade from Premium to Standard. Changes take effect on the next billing cycle.

    Does Claude Team include Claude Code?

    Yes. Both Standard and Premium Team seats include access to Claude Code and Claude Cowork.

    Is my team’s data used for training on the Team plan?

    No. Content is not used for model training by default on the Claude Team plan.

    Related: Claude AI Pricing (2026) — every plan, API rate, and the cost calculator

  • Anthropic Console in 2026: The Complete Developer Guide to API Keys, Billing, and the Dashboard

    The Anthropic Console at platform.claude.com is where developers manage everything related to the Claude API. Whether you’re generating your first API key, tracking token usage, setting spend limits, or managing team workspaces, the console is your control center. This guide walks through every section of the console as it exists in June 2026.

    What Is the Anthropic Console?

    The Anthropic Console — also called the Anthropic Developer Console — is the web-based dashboard at platform.claude.com where you manage your Claude API access. It is separate from claude.ai, which is the consumer chat interface. The console handles API key generation, billing and payment, usage monitoring, workspace and team management, rate limit visibility, and access to developer documentation. Think of claude.ai as where you use Claude, and platform.claude.com as where you build with Claude.

    Getting Started: Creating an Account

    Navigate to platform.claude.com and sign up with your email or Google account. You’ll need to add a payment method before you can make API calls. Anthropic uses a prepaid credit system — you load credits onto your account and API calls draw from that balance. New accounts start with a default spending limit that increases as you build usage history.

    API Keys: Creating and Managing

    API keys are generated in the console under the API Keys section. Each key begins with “sk-ant-” and should be treated as a secret credential. Best practices include creating separate keys for different applications or environments (development, staging, production), naming keys descriptively so you can identify which application uses which key, rotating keys periodically, and never committing keys to source control. If a key is compromised, you can revoke it immediately from the console without affecting your other keys.
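
    One common way to keep keys out of source control is to read them from the environment at startup. The sketch below assumes the key is stored in an environment variable named `ANTHROPIC_API_KEY`. That variable name and the `load_api_key` helper are assumptions of this example, and the prefix check simply mirrors the "sk-ant-" convention described above.

```python
import os

KEY_PREFIX = "sk-ant-"  # prefix described above for Anthropic API keys


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the app")
    if not key.startswith(KEY_PREFIX):
        raise RuntimeError(f"{env_var} does not look like an Anthropic API key")
    return key


def mask(key: str) -> str:
    """Return a log-safe form of the key that reveals only the prefix."""
    return KEY_PREFIX + "***"
```

    Using a different variable per environment (for example one for staging and one for production) keeps the separate keys recommended above from being mixed up, and logging only `mask(key)` avoids leaking the secret into log files.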

    Billing and Usage Monitoring

    The billing section shows your current credit balance, spending history, and usage breakdown by model. You can view costs broken down by Opus, Sonnet, and Haiku usage, see daily and monthly spending trends, set up automatic credit top-ups, and configure spending alerts. Usage is reported in tokens — both input tokens (what you send to Claude) and output tokens (what Claude generates). The console shows real-time and historical usage data with charts that break down costs by model, feature, and time period.

    Workspaces and Team Management

    For organizations, the console supports workspace-level management. You can invite team members with specific roles, set per-user or per-workspace spending limits, view aggregated usage across your organization, and manage API keys at the workspace level rather than individually. This is particularly useful for agencies or development teams where multiple people need API access but you want centralized billing and usage controls.

    Rate Limits and Service Tiers

    The console displays your current rate limits, which depend on your service tier. Anthropic offers three service tiers: Priority for when time, availability, and predictable pricing matter most; Standard as the default tier for both piloting and scaling everyday use cases; and Batch for asynchronous workloads processed together at 50% off. Rate limits increase as your account matures and your spending history grows. The console shows your current limits for requests per minute and tokens per minute across each model.
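
    Once you know your requests-per-minute limit, a small client-side guard can keep an application under it. The class below is a generic sliding-window sketch, not an Anthropic SDK feature. The class name is hypothetical, and the limit you pass in should come from the values shown in your own console.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Client-side guard that keeps calls under a requests-per-window budget."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock    # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before another request fits (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

    A caller would check `wait_time()`, sleep for that long if it is non-zero, send the request, and then call `record()`. This tracks requests only. A tokens-per-minute budget would need a second counter weighted by token counts.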

    Developer Documentation Access

    The console links directly to Anthropic’s developer documentation at platform.claude.com/docs, which includes API reference with endpoint specifications, SDK guides for Python and TypeScript, prompt engineering best practices, tool use and function calling documentation, vision and multimodal capabilities, and integration guides for AWS Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

    Console vs Claude.ai: Key Differences

    A common point of confusion: the Anthropic Console (platform.claude.com) is not the same as Claude.ai. Claude.ai is the consumer-facing chat interface where individuals and teams interact with Claude through conversation. The console is the developer-facing dashboard for API management, billing, and infrastructure. You can have accounts on both — your Claude.ai subscription (Free, Pro, Max, Team, Enterprise) is separate from your API credits on the console.

    Frequently Asked Questions

    How do I access the Anthropic Console?

    Go to platform.claude.com and sign in with your Anthropic account. If you don’t have one, you can create a free account and add billing information to start making API calls.

    Is the Anthropic Console free to use?

    The console itself is free. You only pay for API usage based on the tokens consumed. There is no monthly fee for console access — you pay per token as you use the API.

    What is the difference between the Anthropic Console and the Anthropic Developer Console?

    They are the same thing. “Anthropic Console” and “Anthropic Developer Console” both refer to the dashboard at platform.claude.com where developers manage API keys, billing, and usage.

    Can I set spending limits on the Anthropic Console?

    Yes. The console allows you to set both per-workspace and per-user spending limits. You can also configure automatic credit top-ups and spending alerts to stay within budget.

  • Claude AI Pricing in June 2026: The Complete Guide to Every Plan, Model, and Cost

    Updated June 12, 2026: Added Claude Fable 5 — Anthropic’s new top-tier model released June 9, 2026 at $10/$50 per million tokens.

    Claude AI pricing changed significantly in mid-2026. Claude Fable 5 launched June 9 as the new most-capable model — above Opus 4.8 in the lineup at $10 input / $50 output per million tokens. The Team Premium tier and Enterprise self-serve path arrived earlier in the year. This guide covers every plan, every model, and every cost as of June 12, 2026 — verified directly from claude.com/pricing.

    Individual Plans: Free, Pro, and Max

    Claude offers three individual tiers. The Free plan costs nothing and gives you access to chat on web, iOS, Android, and desktop. You get web search, memory across conversations, file creation with code execution, desktop extensions, and the ability to connect Slack and Google Workspace services through connectors. Free users can access extended thinking for complex work and use remote MCP integrations. The limitation is usage volume — you hit rate limits faster than paid users.

    The Pro plan costs $20 per month billed monthly or $17 per month with an annual subscription ($200 billed upfront). Pro includes everything in Free plus significantly more usage, access to Claude Code and Claude Cowork, unlimited Projects for organizing chats and documents, Research mode, access to additional Claude models, and Claude for Microsoft 365 and Outlook.

    The Max plan starts at $100 per month and offers two tiers: $100/month for approximately 5x more usage than Pro, or $200/month for approximately 20x more usage than Pro. Max users get higher output limits for all tasks, early access to advanced Claude features, and priority access during high-traffic periods.

    Team Plan: Standard and Premium Seats

    The Team plan serves groups of 5 to 150 users and comes in two seat types. Standard seats cost $25 per seat per month billed monthly or $20 per seat per month billed annually. Standard seats include all Claude features plus more usage than Pro. Premium seats cost $125 per seat per month billed monthly or $100 per seat per month billed annually, offering 5x more usage than standard seats.

    Team plans include Claude Code and Claude Cowork, Microsoft 365 and Slack integrations, Enterprise search across the organization, central billing and administration, single sign-on (SSO), admin controls for connectors, enterprise desktop app deployment, and the ability to mix and match seat types. Content is not used for model training by default on Team plans.

    Enterprise Plan: Self-Serve and Sales-Assisted

    Enterprise pricing follows a seat-plus-usage model: $20 per seat with usage billed at API rates that scale with model and task. Anthropic now offers two Enterprise paths: a self-serve option where organizations can sign up at claude.ai/create/enterprise without contacting sales, and a traditional sales-assisted path for organizations needing custom contracts, MSAs, purchase orders, or usage commitments.

    Enterprise includes everything in Team plus admin-set user and org spend limits, role-based access with fine-grained permissioning, SCIM, audit logs, compliance API, custom data retention controls, network-level access control, IP allowlisting, HIPAA-ready offerings, and Claude Security (currently in beta). As of June 2026, Anthropic is running a promotion: $1,000 in Claude Code and Claude Cowork credits for every seat activated by July 2.

    API Pricing: Per-Token Costs for Every Model

    All API prices are per million tokens (MTok). Current models as of June 2026:

    Fable 5 (New — June 9, 2026)

    Input: $10/MTok. Output: $50/MTok. Prompt caching write: $12.50/MTok. Prompt caching read: $1.00/MTok. Fable 5 is Anthropic’s first Mythos-class model released for general availability — the highest-capability Claude model as of June 2026. It supports a 1M token context window with 128K max output and adaptive thinking always on. Two important constraints: (1) mandatory 30-day data retention (zero data retention not available), and (2) safety classifiers route certain domain prompts (cybersecurity, biology, chemistry, distillation) to an Opus 4.8 fallback at Fable 5 API rates.

    Opus 4.8

    Input: $5/MTok. Output: $25/MTok. Prompt caching write: $6.25/MTok. Prompt caching read: $0.50/MTok. Opus 4.8 is Anthropic’s most intelligent model below Fable 5, optimized for agents and coding. It supports a 1M token context window with flat-rate pricing and no surcharge for long contexts.

    Sonnet 4.6

    Input: $3/MTok. Output: $15/MTok. Prompt caching write: $3.75/MTok. Prompt caching read: $0.30/MTok. Sonnet 4.6 balances intelligence, cost, and speed. It also supports a 1M token context window at flat rates.

    Haiku 4.5

    Input: $1/MTok. Output: $5/MTok. Prompt caching write: $1.25/MTok. Prompt caching read: $0.10/MTok. Haiku 4.5 is the fastest and most cost-efficient model with a 200K token context window.

    Cost Optimization Features

    Batch processing saves 50% on all token rates for asynchronous workloads. Prompt caching reduces repeated context costs by up to 90%: cached reads cost roughly 10% of standard input rates. Combining both can cut the cost of cached input tokens by up to 95% (10% of the input rate, then 50% off). US-only inference is available at 1.1x standard pricing for workloads requiring data residency. Fast mode for Opus 4.8 runs at 2x standard pricing and is up to 2.5x as fast.
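
    The discount arithmetic can be sketched as follows. This assumes the batch and cache-read discounts multiply (which is what the 95% figure implies) and that every reuse of a cached context lands inside the cache TTL. The 1.25x write multiplier follows from the listed rates (for example $6.25 against $5 on Opus 4.8); the function names are hypothetical.

```javascript
const BATCH_MULTIPLIER = 0.5;        // batch processing: 50% off token rates
const CACHE_WRITE_MULTIPLIER = 1.25; // first write of cached content: 1.25x input rate
const CACHE_READ_MULTIPLIER = 0.1;   // later reads: roughly 10% of input rate

// Effective $/MTok for input tokens under the chosen discounts.
// Assumption: the two discounts stack multiplicatively.
function effectiveInputRate(baseRate, { cached = false, batch = false } = {}) {
  let rate = baseRate;
  if (cached) rate *= CACHE_READ_MULTIPLIER;
  if (batch) rate *= BATCH_MULTIPLIER;
  return rate;
}

// Cost in USD of sending the same context `uses` times, with and without caching.
// Assumption: one cache write, then every later use is a cache read within the TTL.
function repeatedContextCost(baseRate, contextTokens, uses) {
  const mtok = contextTokens / 1e6;
  const uncached = mtok * baseRate * uses;
  const cached =
    mtok * baseRate * (CACHE_WRITE_MULTIPLIER + CACHE_READ_MULTIPLIER * (uses - 1));
  return { uncached, cached };
}

// Opus 4.8 input at $5/MTok, read from cache inside a batch job.
console.log(effectiveInputRate(5, { cached: true, batch: true })); // 0.25

// A 100K-token system prompt reused 20 times on Sonnet 4.6 ($3/MTok input).
console.log(repeatedContextCost(3, 100000, 20));
```

    In this model the cached path is already cheaper on the second use (1.25 + 0.1 = 1.35x versus 2x), so caching pays off for any context that is reused at least once before it expires.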

    Platform Feature Pricing

    Managed Agents cost $0.08 per session-hour for active runtime, plus standard token rates. Web search costs $10 per 1,000 searches (not including input/output tokens for processing). Code execution includes 50 free hours daily per organization, with additional hours at $0.05 per container-hour.
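
    These feature charges sit on top of token costs, so it helps to total them separately. A rough sketch, assuming the 50 free code-execution hours reset each day and do not roll over (the text above does not say either way); the function and field names are hypothetical.

```javascript
const SESSION_HOUR_RATE = 0.08;     // Managed Agents, USD per session-hour
const SEARCH_RATE_PER_1000 = 10;    // web search, USD per 1,000 searches
const FREE_CONTAINER_HOURS_PER_DAY = 50;
const CONTAINER_HOUR_RATE = 0.05;   // code execution, USD per extra container-hour

// containerHoursByDay: one number per day, total container-hours used that day.
function platformFeatureCost({ sessionHours = 0, webSearches = 0, containerHoursByDay = [] }) {
  const managedAgents = sessionHours * SESSION_HOUR_RATE;
  const webSearch = (webSearches / 1000) * SEARCH_RATE_PER_1000;
  // Assumption: only hours above the daily free allowance are billed.
  const codeExecution = containerHoursByDay.reduce(
    (sum, hours) => sum + Math.max(0, hours - FREE_CONTAINER_HOURS_PER_DAY) * CONTAINER_HOUR_RATE,
    0
  );
  return { managedAgents, webSearch, codeExecution, total: managedAgents + webSearch + codeExecution };
}

// 100 session-hours, 2,500 searches, and three days of code execution.
console.log(platformFeatureCost({
  sessionHours: 100,
  webSearches: 2500,
  containerHoursByDay: [40, 60, 50],
}));
```

    Token usage from Managed Agents sessions and from processing search results is billed at the normal per-token rates and is not included in this total.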

    Legacy Model Pricing

    Opus 4.7 and Opus 4.6 retain the same $5/$25 per MTok pricing as Opus 4.8. Sonnet 4.5 and Sonnet 4 maintain $3/$15. The older Opus 4.1 and Opus 4 remain at their higher legacy rates of $15/$75 per MTok — making the current-generation Opus models 66.7% cheaper than their predecessors for the same token volume.

    Frequently Asked Questions

    How much does Claude AI cost?

    Claude AI is free to use with usage limits. The Pro plan costs $20/month ($17/month annual), Max starts at $100/month, Team starts at $20/seat/month (annual), and Enterprise is $20/seat plus usage at API rates.

    Is Claude AI free?

    Yes. Claude offers a permanent free tier with access to chat, web search, memory, code execution, desktop extensions, and extended thinking. The free plan has lower usage limits than paid plans.

    What is the most capable Claude API model?

    Claude Fable 5, released June 9, 2026 (API ID: claude-fable-5). It is priced at $10 input / $50 output per million tokens, 2x the cost of Opus 4.8. It scores significantly higher than Opus 4.8 on SWE-bench Pro (80% vs 69.2%) and on the Senior Engineer benchmark (91 vs ~63 out of 100). Use Fable 5 for complex engineering tasks and long-horizon agentic work where quality justifies the cost.

    What is the cheapest Claude API model?

    Haiku 4.5 at $1/MTok input and $5/MTok output. With batch processing (50% off) and prompt caching (cache reads at roughly 10% of the input rate), the effective rate for cached input can drop to about $0.05/MTok if the two discounts stack.

    Does Claude offer a student discount?

    Anthropic does not offer an individual student discount as of June 2026. However, they have an Education plan for universities that provides comprehensive institution-wide access at discounted rates for students, faculty, and staff.

    What is the difference between Claude Pro and Claude Max?

    Pro costs $20/month and provides a standard amount of usage. Max costs $100/month (5x usage) or $200/month (20x usage) and adds higher output limits, early access to features, and priority access during peak times.

    Ready to build with Claude?

    Claude Seed Kits give you a pre-configured skill file, 20 tested prompts, and a setup guide tailored to your use case. Install in minutes and start getting real output immediately — $47 each.

  • llms-full.txt vs llms.txt: Why AI Agents Crawl It More (2026)

    Most conversations about AI crawlability focus on one file: llms.txt. But if you look at what Anthropic, Vercel, and LangGraph actually ship – and what GEO crawler research found AI agents fetching most – the file that matters more is its companion: llms-full.txt.

    Here’s the practical reality: llms.txt is the map. llms-full.txt is the territory. And in 2026, the agents that matter for citation traffic are fetching the territory.

    The Full File Family You Probably Don’t Know About

    The original llms.txt proposal – published by Jeremy Howard in September 2024 – defined one file. Implementers built the rest. The complete family as of mid-2026 is four files, but most sites only need two:

    File | What’s in it | When to use
    /llms.txt | Curated index – H1, summary, link sections | Always. The orientation layer.
    /llms-full.txt | Full content of every linked page, concatenated as Markdown | When you want a model to deep-ingest your docs in a single fetch
    /llms-ctx.txt | Pre-expanded context without URLs | FastHTML-style implementations
    /llms-ctx-full.txt | Pre-expanded context with URLs preserved | Same, but URL-aware

    The pattern that works – and the one Anthropic, Vercel, and LangGraph all run – is the index + export pair: llms.txt for orientation, llms-full.txt for deep ingestion.
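
    For the index half of the pair, the structure is small enough to generate from a page list. The sketch below follows the layout described in the table (H1, summary, link sections) and writes the summary as a blockquote, as the original llms.txt proposal does. The function name, input shape, and example.com URLs are all hypothetical.

```javascript
// Builds the text of an llms.txt index from a title, a summary, and link sections.
function buildLlmsTxt({ title, summary, sections }) {
  const lines = [`# ${title}`, '', `> ${summary}`];
  for (const section of sections) {
    lines.push('', `## ${section.heading}`, '');
    for (const link of section.links) {
      // One Markdown link plus a one-sentence description per page.
      lines.push(`- [${link.title}](${link.url}): ${link.description}`);
    }
  }
  return lines.join('\n') + '\n';
}

const exampleIndex = buildLlmsTxt({
  title: 'Example Docs',
  summary: 'Documentation for the Example API.',
  sections: [
    {
      heading: 'Docs',
      links: [
        {
          title: 'Quickstart',
          url: 'https://example.com/docs/quickstart',
          description: 'Install the client and make a first request.',
        },
      ],
    },
  ],
});
console.log(exampleIndex);
```

    Because every entry is a standard Markdown link, a build script can later pull the URLs back out of this file with a simple link regex to assemble llms-full.txt.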

    Why llms-full.txt Gets Crawled More

    GEO researchers analyzing AI crawler behavior – including work cited by Profound – have noted that agents from Microsoft, OpenAI, and others tend to fetch llms-full.txt more frequently than llms.txt when both are present. The working explanation is structural: when a file contains the full content, it removes one retrieval step. An agent that fetches llms-full.txt gets everything it needs in a single HTTP request instead of fetching the index, parsing the links, then fetching each linked page individually. This is consistent with how developer documentation platforms like Mintlify describe the behavior of IDE agents operating under tight latency budgets.

    For IDE agents (Cursor, Continue, Cline) and MCP integrations, this is even more pronounced. These tools are operating under tight context windows and latency budgets. A single fetch that returns a clean Markdown blob of your entire docs is structurally preferable to a multi-step crawl.

    The implication: if you’ve shipped llms.txt but not llms-full.txt, you’ve done half the job.

    How to Build llms-full.txt

    The construction logic is simple: take every URL in your llms.txt, fetch each page, convert the HTML to Markdown, and concatenate. In practice, most sites do this in their build pipeline.

    Here’s the minimal Node.js pattern:

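    // Requires: npm install node-fetch@2 turndown
    // (node-fetch v3 is ESM-only; on Node 18+ you can drop it and use the built-in fetch)
    const fs = require('fs');
    const fetch = require('node-fetch');
    const TurndownService = require('turndown');
    const turndown = new TurndownService();
    
    async function buildLlmsFullTxt(llmsIndexPath, outputPath) {
      const index = fs.readFileSync(llmsIndexPath, 'utf8');
      // Capture the URL of every Markdown link: [title](https://...)
      const urlRegex = /\[.*?\]\((https?:\/\/[^\)]+)\)/g;
      const urls = [...index.matchAll(urlRegex)].map(m => m[1]);
    
      let output = '';
      let pages = 0;
      for (const url of urls) {
        const res = await fetch(url);
        if (!res.ok) {
          // Skip error pages so they do not end up in the export
          console.warn(`Skipping ${url}: HTTP ${res.status}`);
          continue;
        }
        const html = await res.text();
        const markdown = turndown.turndown(html);
        // Separate pages with a rule and a source header
        output += `\n\n---\n# Source: ${url}\n\n${markdown}`;
        pages += 1;
      }
    
      fs.writeFileSync(outputPath, output);
      console.log(`Built llms-full.txt: ${pages} pages, ${output.length} chars`);
    }
    
    buildLlmsFullTxt('./public/llms.txt', './public/llms-full.txt').catch(console.error);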

    One constraint to manage: keep llms-full.txt under roughly 200,000 tokens (about 150K words, around 700KB). That’s the threshold where most models can ingest the file in a single context window. If your docs are larger, segment by product or language the way Supabase does – llms-full-api.txt, llms-full-guides.txt – and list the segmented files in your main llms.txt.
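
    A build step can enforce that budget before writing the file. The sketch below is an approximation: it estimates tokens from character count at 3.5 characters per token (the ratio implied by 700KB for 200,000 tokens above), which is a heuristic rather than a real tokenizer, and the function names and page shape are hypothetical.

```javascript
const TOKEN_BUDGET = 200000;
const CHARS_PER_TOKEN = 3.5; // heuristic from ~700KB per 200K tokens; not a real tokenizer

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Greedily packs pages, in order, into segments that each stay under the budget.
// A single page larger than the budget still gets a segment of its own.
function segmentPages(pages, budget = TOKEN_BUDGET) {
  const segments = [];
  let current = [];
  let used = 0;
  for (const page of pages) {
    const tokens = estimateTokens(page.markdown);
    if (current.length > 0 && used + tokens > budget) {
      segments.push(current);
      current = [];
      used = 0;
    }
    current.push(page);
    used += tokens;
  }
  if (current.length > 0) segments.push(current);
  return segments;
}

// Three pages of roughly 120K, 100K, and 50K estimated tokens.
const samplePages = [
  { url: 'https://example.com/api', markdown: 'x'.repeat(420000) },
  { url: 'https://example.com/guides', markdown: 'x'.repeat(350000) },
  { url: 'https://example.com/faq', markdown: 'x'.repeat(175000) },
];
console.log(segmentPages(samplePages).map(segment => segment.map(page => page.url)));
```

    Each resulting segment would be written to its own file (for example llms-full-api.txt) and linked from the main llms.txt. Packing in page order keeps related pages together; grouping by product or language first, as described above, is usually the better split when the docs have that structure.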

    The 2026 robots.txt Stack That Completes the Picture

    Shipping llms.txt and llms-full.txt is the visibility layer. The access-control layer is robots.txt – and it changed significantly in Q2 2026.

    The key development: Anthropic split its crawler into two separate user-agents. ClaudeBot is the training scraper (high bandwidth, no citation value – block it). Claude-Web is the live-retrieval agent that fetches pages to answer Claude.ai user queries in real time (allow it, because it drives citation traffic). Brands that blanket-block “all Anthropic crawlers” lose Claude citations entirely.

    Meta also shipped two active training scrapers in March 2026 – FacebookBot and Meta-ExternalAgent – at GPTBot-level crawl volume. Most sites have no rules for them yet.

    Here’s the 2026 template:

    # BLOCK: Training scrapers - high bandwidth, zero referral value
    User-agent: GPTBot
    Disallow: /
    
    User-agent: CCBot
    Disallow: /
    
    User-agent: ClaudeBot
    Disallow: /
    
    User-agent: FacebookBot
    Disallow: /
    
    User-agent: Meta-ExternalAgent
    Disallow: /
    
    # OPT OUT: Google Gemini training (keeps Search indexing intact)
    User-agent: Google-Extended
    Disallow: /
    
    # ALLOW: Live-retrieval agents - drive citation traffic
    User-agent: OAI-SearchBot
    Allow: /
    
    User-agent: ChatGPT-User
    Allow: /
    
    User-agent: Claude-Web
    Allow: /
    
    User-agent: anthropic-ai
    Allow: /
    
    User-agent: PerplexityBot
    Allow: /

    One important caveat on robots.txt enforcement: aggressive training scrapers often ignore the file or spoof their user-agents. The robots.txt rules signal intent and work for compliant bots; a WAF rule at the edge is the only deterministic block for non-compliant crawlers.
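
    If you also want server-side or edge enforcement, the same policy can be expressed as a user-agent check. This is a minimal sketch, not a WAF rule: it only catches crawlers that identify themselves honestly, so spoofed agents still need IP-based or behavioral rules. Google-Extended is left out because, as far as we know, it is a robots.txt control token rather than a string that appears in request headers. The function name is hypothetical.

```javascript
// Agent names taken from the robots.txt template above.
const BLOCKED_AGENTS = ['GPTBot', 'CCBot', 'ClaudeBot', 'FacebookBot', 'Meta-ExternalAgent'];
const ALLOWED_AGENTS = ['OAI-SearchBot', 'ChatGPT-User', 'Claude-Web', 'anthropic-ai', 'PerplexityBot'];

// Returns 'block', 'allow', or 'default' for a raw User-Agent header value.
function classifyUserAgent(userAgentHeader) {
  const ua = (userAgentHeader || '').toLowerCase();
  const matches = name => ua.includes(name.toLowerCase());
  if (BLOCKED_AGENTS.some(matches)) return 'block';
  if (ALLOWED_AGENTS.some(matches)) return 'allow';
  return 'default'; // regular browsers and unlisted bots
}

console.log(classifyUserAgent('Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)')); // block
console.log(classifyUserAgent('Mozilla/5.0 (compatible; PerplexityBot/1.0)'));                      // allow
console.log(classifyUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/125.0'));           // default
```

    In a request handler, a 'block' result would typically map to a 403 response, while 'allow' and 'default' pass through unchanged.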

    The Honest State of the Technology

    The SERanking study of 300,000 domains (November 2025) found no measurable correlation between having llms.txt and being cited by ChatGPT, Claude, Gemini, or Perplexity. Google’s John Mueller compared the file to the deprecated keywords meta tag – something site owners declare but that search systems derive from the content itself.

    None of that means you shouldn’t ship both files. The cost is low, the optionality is real, and the IDE-agent ecosystem (Cursor, Continue, Cline) does actively use llms.txt. But the robots.txt work is the lever that moves outcomes today. The llms.txt + llms-full.txt pair is an infrastructure investment – you want it in place when major LLM providers start honoring it, and setting up the build pipeline now costs far less than retrofitting it later.

    The practical sequence for a site that hasn’t done this yet:

    1. Update robots.txt first. Add the Q2 2026 user-agent rules above. This takes twenty minutes and immediately affects how training scrapers treat your content.
    2. Ship llms.txt. Curated index, 20-50 priority pages, one-sentence description per link, sections in priority order.
    3. Build llms-full.txt. Concatenated Markdown of every linked page, under 200K tokens. Run it in your build pipeline so it stays current.
    4. Verify both files are served correctly. curl -I https://yoursite.com/llms.txt should return 200 with Content-Type: text/plain. A 404 on either file is the most common implementation error.
    5. Add an access-log check. Once per month, grep your logs for requests to /llms.txt and /llms-full.txt by user-agent. You want to see live-retrieval agents (Claude-Web, OAI-SearchBot, PerplexityBot) in the results – not just training scrapers.
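
    The monthly log check in step 5 can be scripted. The sketch below assumes your server writes the common "combined" access-log format (request line, status, size, referrer, user-agent in quotes); adjust the regex if your format differs. The function name and the agent list are illustrative.

```javascript
const TRACKED_PATHS = ['/llms.txt', '/llms-full.txt'];
const KNOWN_AGENTS = [
  'Claude-Web', 'OAI-SearchBot', 'PerplexityBot', 'ChatGPT-User',
  'GPTBot', 'ClaudeBot', 'CCBot',
];
// Matches: "GET /path HTTP/1.1" 200 1234 "referrer" "user agent"
const LOG_LINE = /"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"/;

// Counts requests to the tracked files, keyed by "<path> <agent>".
function countLlmsFetches(logText) {
  const counts = {};
  for (const line of logText.split('\n')) {
    const match = LOG_LINE.exec(line);
    if (!match) continue;
    const path = match[1].split('?')[0]; // ignore query strings
    if (!TRACKED_PATHS.includes(path)) continue;
    const agent = KNOWN_AGENTS.find(name => match[3].includes(name)) || 'other';
    const key = `${path} ${agent}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const sampleLog = [
  '203.0.113.5 - - [01/Jun/2026:10:00:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 51234 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
  '203.0.113.9 - - [01/Jun/2026:10:05:00 +0000] "GET /llms.txt HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
  '203.0.113.5 - - [01/Jun/2026:11:00:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 51234 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
].join('\n');
console.log(countLlmsFetches(sampleLog));
```

    In real use you would read the log with fs.readFileSync (or stream it for large files) and pass the text in. A result dominated by GPTBot or ClaudeBot rather than the live-retrieval agents suggests the robots.txt rules from step 1 are not being honored.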

    The goal isn’t to optimize for a standard that isn’t fully adopted yet. It’s to build the infrastructure correctly now, while the field is still forming, so that adoption changes work in your favor rather than requiring catch-up.

    Frequently Asked Questions

    What is the difference between llms.txt and llms-full.txt?

    llms.txt is a curated index — an H1, a summary, and link sections that orient an AI agent to your site. llms-full.txt is the full content of every linked page concatenated as Markdown, so an agent can deep-ingest your documentation in a single fetch. The index is the map; the full file is the territory.

    Why do AI agents crawl llms-full.txt more often than llms.txt?

    Fetching llms-full.txt removes a retrieval step: the agent gets everything in one HTTP request instead of fetching the index, parsing links, and fetching each page individually. For IDE agents like Cursor, Continue, and Cline operating under tight latency and context budgets, a single clean Markdown blob is structurally preferable to a multi-step crawl.

    How big should llms-full.txt be?

    Keep it under roughly 200,000 tokens (about 150K words, around 700KB) so most models can ingest it in a single context window. If your docs are larger, segment by product or language — for example llms-full-api.txt and llms-full-guides.txt — and list the segmented files in your main llms.txt.

    Does having llms.txt actually improve AI citations?

    Not measurably on its own. A November 2025 SERanking study of 300,000 domains found no correlation between having llms.txt and being cited by ChatGPT, Claude, Gemini, or Perplexity, and Google’s John Mueller compared it to the deprecated keywords meta tag. The lever that moves outcomes today is robots.txt configuration; llms.txt and llms-full.txt are low-cost infrastructure for when adoption grows.

    Which AI crawlers should I allow in robots.txt in 2026?

    Allow live-retrieval agents that drive citation traffic — Claude-Web, OAI-SearchBot, ChatGPT-User, anthropic-ai, and PerplexityBot. Block high-bandwidth training scrapers with no referral value such as GPTBot, CCBot, ClaudeBot, FacebookBot, and Meta-ExternalAgent, and opt out of Google-Extended to skip Gemini training while keeping Search indexing intact.

  • How AI Engines Actually Cite Your Content: Grounding and GEO Guide

    Last verified: June 2026.

    Most “GEO” advice is recycled SEO with the word “AI” pasted on top. This guide is different. It describes what actually happens when Microsoft Copilot, Bing’s AI answers, and Google’s AI Overviews build a response and decide whose page to cite — based on running content sites that get cited tens of thousands of times a month. The short version: AI engines do not cite the page that ranks #1 for a head term. They cite the page that most directly answers the specific sub-question the model is grounding on. That distinction changes everything about what you should write.

    How grounding actually works (the part nobody explains)

    When you ask Copilot or Bing’s AI a question, the model does not answer from memory. It runs a retrieval step called grounding: it rewrites your question into one or more search queries, fetches a handful of live web results, reads them, and composes an answer with inline citations pointing back at the pages it used. Google’s AI Overviews work the same way, using a technique Google calls “query fan-out”: one user question becomes many narrower synthetic queries.

    Two things follow directly from this mechanism:

    • The model is not searching for your keyword. It is searching for the answer to a decomposed sub-question. A user who asks “what’s the best way to instantly index a new page” triggers grounding queries like “IndexNow API endpoint”, “submit URL to Bing programmatically”, and “IndexNow key file location”. The page that wins is the one that answers those narrow strings, not the one optimized for “indexing tips”.
    • Citations are extracted at the passage level, not the page level. The model lifts the specific sentence or table that answers the sub-question. If your answer is buried under 600 words of preamble, it loses to a page that states the fact in the first line under a matching heading.

    This is why a niche, specific page routinely out-cites a high-authority generalist. The generalist ranks; the specialist gets quoted.
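
    To make the passage-level point concrete, here is a deliberately simple toy model: it scores each passage by how many terms of a grounding sub-query it contains. Real engines use far more sophisticated retrieval and ranking that is not public, so treat this only as an illustration of why a passage that matches the narrow sub-question beats a general one. All names and sample passages are hypothetical.

```javascript
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Returns the passage whose heading + text covers the largest share of the
// sub-query's terms. Toy scoring only; not how any production engine ranks.
function bestPassage(passages, subQuery) {
  const queryTerms = new Set(tokenize(subQuery));
  if (queryTerms.size === 0) return null;
  let best = null;
  for (const passage of passages) {
    const terms = new Set(tokenize(`${passage.heading} ${passage.text}`));
    let overlap = 0;
    for (const term of queryTerms) {
      if (terms.has(term)) overlap += 1;
    }
    const score = overlap / queryTerms.size;
    if (!best || score > best.score) best = { ...passage, score };
  }
  return best;
}

const passages = [
  { heading: 'Indexing tips', text: 'General advice for getting new pages discovered faster.' },
  { heading: 'IndexNow key file location', text: 'Place the key file at the site root as a UTF-8 text file.' },
];
console.log(bestPassage(passages, 'IndexNow key file location').heading);
// prints "IndexNow key file location"
```

    The generic "Indexing tips" passage shares no terms with the sub-query, while the specific passage covers all of them, which is the specialist-beats-generalist effect in miniature.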

    Why operational and comparison pages win over head terms

    Across real citation data, the pages that get pulled into AI answers cluster into three shapes. None of them are “ultimate guide to X”.

    1. Operational pages with real commands, configs, and error messages

    When someone asks an AI assistant “how do I fix [specific error]” or “what’s the exact command to do X”, the model needs a page that contains the literal command, the literal config, or the literal error string. Generic advice cannot be cited because there is nothing concrete to quote. A page that says:

    curl "https://www.bing.com/indexnow?url=https://example.com/new-page/&key=YOUR_KEY"
    # 200 = received (not "indexed"), 422 = URL/key mismatch, 429 = too many submits

    …is citation gold, because the model can extract that block verbatim and the user can act on it. The error-code annotations matter: questions about failures (“IndexNow 422”, “why am I getting 429”) are high-intent and low-competition, and a page that names the exact codes owns them.

    2. Comparison pages (“X vs Y”)

    “Which is better, X or Y” is one of the most common shapes of AI query, and comparison content is structurally easy to cite because it maps cleanly to a decision. If you maintain honest, current head-to-head pages, you become the default source the model reaches for when a user is choosing between tools. This is exactly why we keep dedicated comparison pages like Claude Code vs Cursor and Claude Code vs Codex — they answer a decision the model is constantly being asked to make, and a table of differences is trivially quotable.

    3. Fresh, dated pages on fast-moving topics

    For anything that changes — pricing, model versions, API limits, feature availability — grounding strongly favors recency. The model would rather cite a page dated this month than an “authoritative” page from two years ago that might be wrong. A visible “Last verified” date and a real publish/update timestamp are not decoration; they are a relevance signal the retrieval layer reads.

    The losing move is chasing broad head terms. “Best AI coding assistant” is saturated, generic, and rarely the literal grounding query. The winning move is to own the long, specific, operational and comparison strings that the fan-out actually generates.

    IndexNow: how to get cited the same day you publish

    Grounding can only cite pages the engine knows about. The bottleneck for new content is crawl latency — and IndexNow collapses it. IndexNow is an open protocol (backed by Microsoft Bing and Yandex) that lets you push a URL to the index the instant you publish, instead of waiting for a crawler to wander by.

    Setup is two steps:

    1. Host a key file. Generate a key of 8-128 hex characters and place it at your site root as a UTF-8 text file named {key}.txt containing exactly that key. Example: https://example.com/daa44a2c....txt. This proves you own the host.
    2. Ping on publish. Single URL via GET:
      curl "https://api.indexnow.org/indexnow?url=https://example.com/new-page/&key=YOUR_KEY"

      Or batch up to 10,000 URLs in one POST:

      curl -X POST "https://api.indexnow.org/indexnow" \
        -H "Content-Type: application/json" \
        -d '{"host":"example.com","key":"YOUR_KEY","urlList":["https://example.com/a/","https://example.com/b/"]}'

    A 200 means the endpoint received your URL (not that it is indexed yet). Submitting to api.indexnow.org shares the ping with all participating engines, so you do not need to hit Bing and Yandex separately. Most WordPress SEO plugins (Rank Math, Yoast, SEOPress) have IndexNow built in — turn it on and it fires automatically on every publish and update. The practical payoff: pages can enter Bing’s crawl queue within hours, which means they are eligible to be grounded and cited the same day, not next week.
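
    If you publish from your own pipeline rather than a plugin, the batch POST can be wrapped in a few lines of Node. The sketch below assumes Node 18+ (for the built-in fetch) and splits large URL lists into groups of 10,000 to respect the per-request limit; the function names are hypothetical, and nothing here retries or backs off on 429 responses.

```javascript
const INDEXNOW_ENDPOINT = 'https://api.indexnow.org/indexnow';
const MAX_URLS_PER_POST = 10000;

// Splits a URL list into request bodies of at most 10,000 URLs each.
function buildIndexNowBatches(host, key, urls) {
  const batches = [];
  for (let i = 0; i < urls.length; i += MAX_URLS_PER_POST) {
    batches.push({ host, key, urlList: urls.slice(i, i + MAX_URLS_PER_POST) });
  }
  return batches;
}

// Sends each batch and returns the HTTP status codes in order.
// 200 means "received", not "indexed".
async function submitToIndexNow(host, key, urls) {
  const statuses = [];
  for (const body of buildIndexNowBatches(host, key, urls)) {
    const res = await fetch(INDEXNOW_ENDPOINT, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json; charset=utf-8' },
      body: JSON.stringify(body),
    });
    statuses.push(res.status);
  }
  return statuses;
}

// Usage (not run here because it makes a network call):
// submitToIndexNow('example.com', 'YOUR_KEY', ['https://example.com/a/'])
//   .then(statuses => console.log(statuses));
```

    Keeping the batching separate from the network call makes the size limit easy to test without sending anything.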

    One caveat worth stating plainly: IndexNow accelerates indexing, which is a precondition for citation. It does not force a citation. You still need the page to be the best answer to the sub-question. But for fresh, time-sensitive content, same-day indexing is often the difference between getting cited while the topic is hot and showing up after the conversation has moved on.

    How to actually measure your AI citations

    For a long time AI citations were invisible — you could see referral clicks in analytics but not the citations themselves (most AI answers are zero-click). That changed. As of February 2026, Bing Webmaster Tools ships an AI Performance report (public preview) that shows when your pages are cited across Microsoft Copilot, Bing’s AI answers, and partner surfaces. It is the first direct, free window into AI citation behavior, and you should be reading it weekly.

    The four metrics that matter:

    • Total citations — how many times your site was cited as a source in AI answers over the period.
    • Average cited pages — the daily average count of unique URLs from your site that got referenced. This tells you whether citations are concentrated on one page or spread across the site.
    • Grounding queries — sample query phrases the AI used to retrieve and cite you. This is the single most actionable field in the report. It is a literal list of the sub-questions you are winning, which tells you exactly which operational/comparison angles to expand next.
    • Page-level citation activity — citations by URL, so you can see which pages are doing the work.

    Two limitations to keep in mind so you read the data honestly: the report does not show click data (you see citations, not visits from them), and it aggregates Copilot with Bing summaries, so you cannot isolate one surface from the other. For Google’s AI Overviews there is still no equivalent citation dashboard — the closest proxy is watching impressions and referral patterns in GA4 and Search Console, plus spot-checking your target queries by hand.

    The workflow that works: pull the grounding-queries list, find the patterns, and feed them straight back into your content plan. If you are getting cited for “claude mcp setup” variants, that is a signal to deepen pages like the Claude MCP setup guide and adjacent operational walkthroughs, not to chase a new head term.
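
    Finding the patterns can start with a simple term count over the grounding queries. The sketch below assumes you have copied the query phrases into a plain array of strings (the report's actual export format is not covered here); the stopword list and function name are illustrative.

```javascript
const STOPWORDS = new Set([
  'a', 'an', 'the', 'to', 'for', 'how', 'do', 'i', 'in', 'of', 'is', 'what', 'and', 'on', 'with',
]);

// Counts how many queries contain each term and returns the most common ones
// as [term, count] pairs, highest count first (ties broken alphabetically).
function topQueryTerms(queries, limit = 5) {
  const counts = new Map();
  for (const query of queries) {
    const terms = new Set(query.toLowerCase().match(/[a-z0-9]+/g) || []);
    for (const term of terms) {
      if (STOPWORDS.has(term)) continue;
      counts.set(term, (counts.get(term) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit);
}

const groundingQueries = [
  'claude mcp setup',
  'claude mcp setup windows',
  'how to set up claude mcp server',
  'indexnow key file location',
];
console.log(topQueryTerms(groundingQueries, 3));
// [ [ 'claude', 3 ], [ 'mcp', 3 ], [ 'setup', 2 ] ]
```

    Terms that recur across many queries point at the topic cluster you are already winning, which is where deeper operational pages are most likely to pay off.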

    A repeatable checklist for citation-optimized pages

    Everything above reduces to a build pattern. For any page you want AI engines to cite:

    • Lead with the answer. Put a short, factual, quotable answer in the first 1-2 sentences under each heading. Assume the model reads only that passage.
    • Use question-shaped headings. H2s and H3s that mirror real queries (“How does IndexNow work?”, “How do I measure AI citations?”) match the grounding query and give the extractor a clean anchor.
    • Be specific and operational. Real commands, real config, real numbers, real error codes and fixes. Concrete text is extractable; vague advice is not.
    • Add a visible FAQ near the end. Plain question/answer pairs are the single most citation-friendly format, because each pair is a self-contained answer to a discrete sub-question. You do not need JSON-LD schema for this to work — visible Q&A text is what the model reads.
    • Date it and keep it current. A “Last verified” line plus genuine updates on fast-moving topics buys you the recency edge in grounding.
    • Push it with IndexNow so it is indexable the same day, then watch the AI Performance report to see which sub-questions it wins.

    If you want the larger system this fits into — the full toolchain for operating as an AI-first publisher, from MCP servers to publishing pipelines — start with the AI operator’s stack.

    FAQ

    Do AI engines cite the page that ranks #1 on Google?

    Not reliably. AI engines run their own grounding retrieval and cite the page that most directly answers the specific decomposed sub-question, which is often a niche, operational page rather than the head-term winner. Ranking helps your page be discoverable, but the citation goes to whichever passage best answers the exact grounding query.

    What is grounding in AI search?

    Grounding is the retrieval step where an AI assistant rewrites your question into search queries, fetches live web pages, reads them, and builds an answer with inline citations to those pages. It is why current, specific pages can get cited even by a model whose training data predates them.

    Does IndexNow guarantee my page will be cited by AI?

    No. IndexNow guarantees fast indexing, which is a precondition for being cited. The page still has to be the best, most specific answer to the sub-question the model is grounding on. Think of IndexNow as removing the crawl-latency excuse, not as buying a citation.

    How do I measure how often AI cites my site?

    Use the AI Performance report in Bing Webmaster Tools (public preview since February 2026). It shows total citations, average cited pages per day, sample grounding queries, and citation counts by URL across Microsoft Copilot and Bing AI answers. It does not yet show click-through from those citations, and there is no equivalent dashboard for Google AI Overviews.

    Do I need JSON-LD or schema markup to get cited?

    No. Citation extraction works on visible, well-structured text — question-shaped headings, short factual answers, and a plain visible FAQ. Schema can help search features generally, but it is not required for AI grounding to read and quote your page.

    What kind of pages get cited most?

    Three shapes dominate: operational pages with real commands, configs, and error fixes; comparison pages that resolve a “X vs Y” decision; and fresh, dated pages on fast-moving topics like pricing and model versions. Broad head-term content tends to get skipped because it rarely matches the literal grounding query and offers nothing concrete to quote.

  • How I Made a $400 Laptop More AI-First Than a Copilot+ PC

    All fall, Microsoft has been selling one idea: the future is the AI PC — a Copilot+ machine with a dedicated neural chip (an NPU), Recall, Click to Do, a thousand dollars and up, and your old laptop need not apply.

    I had a $400 budget laptop on my desk — an AMD Ryzen 5 7520U, 16 GB of RAM, no NPU — and a hunch that the whole framing was backwards. The AI-first laptop was never about the chip. It’s about architecture.

    A few hours later, that $400 laptop had a private AI brain, voice control, and a control panel I run from my phone. On the things that actually matter for operating a machine, it does more than the Copilot+ PC it’s supposedly too cheap to be. Here’s the exact build.

    The thesis: AI-first is architecture, not a chip

    The trick is to stop asking your laptop to be the supercomputer. Split the job:

    • The brain lives in the cloud. The heavy reasoning runs on a frontier model (I use Claude) with effectively unlimited horsepower. No NPU on Earth competes with that.
    • The body lives on your laptop. Your machine becomes the always-on hands: it holds your private data, runs small models locally for anything sensitive, and executes the actions the brain decides on.

    An NPU optimizes a handful of on-device Windows features. Architecture gives you an actual operator. Guess which one you feel every day.

    Step 0 — Make it always-on

    An operator rig is a little server, and servers don’t nap. My laptop kept sleeping and killing background jobs, so the first move was to take that off the table (while plugged in):

    powercfg /change monitor-timeout-ac 0
    powercfg /change standby-timeout-ac 0
    powercfg /setacvalueindex SCHEME_CURRENT SUB_BUTTONS LIDACTION 0
    powercfg /setactive SCHEME_CURRENT

    Screen never blanks, never sleeps, and it keeps running with the lid closed — while still sleeping on battery as a safety. Now it’s a real always-on host.

    Step 1 — A private AI brain that lives on the laptop

    The local engine is Ollama; the chat interface is open-webui (running in Docker). If you want the multi-agent version of this idea, I’ve also written up building a free AI agent army with Ollama and Claude. The only thing standing between me and a private, offline ChatGPT was one wrong setting — open-webui was pointed at a dead address. The fix was to aim it at the host:

    docker run -d --name open-webui --restart always -p 3000:8080 \
      -v open-webui:/app/backend/data \
      -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
      ghcr.io/open-webui/open-webui:main

    The proof: a 3-billion-parameter model (Llama 3.2) introduced itself in about 10 seconds at ~12 tokens/second — on the CPU, no NPU, no discrete GPU. Fast enough for real Q&A, drafting, and summaries. Seven models sit ready on disk, and the whole thing is reachable from my phone over a private network.
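
    You can reproduce a tokens-per-second figure like this from Ollama’s own response metadata. The sketch below assumes Ollama is listening on its default port (11434), that Node 18+ provides the built-in fetch, and that the model is pulled under the tag llama3.2; the tag and the function names are assumptions. With streaming disabled, the /api/generate response includes eval_count (tokens generated) and eval_duration (in nanoseconds).

```javascript
const OLLAMA_GENERATE_URL = 'http://localhost:11434/api/generate';

// eval_count is the number of generated tokens; eval_duration is in nanoseconds.
function tokensPerSecond(result) {
  return result.eval_count / (result.eval_duration / 1e9);
}

// Sends one non-streaming prompt and reports the generation speed.
async function benchmarkModel(model, prompt) {
  const res = await fetch(OLLAMA_GENERATE_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const result = await res.json();
  return { text: result.response, tokensPerSecond: tokensPerSecond(result) };
}

// Usage (requires a running Ollama instance, so it is not executed here):
// benchmarkModel('llama3.2', 'Introduce yourself in one sentence.')
//   .then(({ text, tokensPerSecond }) => console.log(tokensPerSecond.toFixed(1), text));
```

    The figure covers generation only; model load time and prompt processing are reported separately in the same response, which is why the first answer after a cold start takes longer than the token rate alone suggests.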

    Everything here runs offline. For anything I don’t want leaving the machine, that’s the entire point.

    Step 2 — Voice that never leaves the machine

    A local Whisper speech-to-text container (OpenAI-compatible API) became a push-to-talk dictation tool: hold a key, talk, release, and the text drops into whatever app is focused. I verified the pipeline without even touching the mic — Windows text-to-speech generated a clip, the local Whisper transcribed it, and it round-tripped clean:

    Spoken: “Testing one two three. This is the private local transcription engine.”
    Whisper heard: “Testing 1-2-3. This is the private local transcription engine.”

    Windows has built-in dictation (Win+H) and Copilot voice too — but those ship your audio to the cloud. The local version does the same job, and your voice never leaves the laptop.

    Step 3 — Turn your phone into the control panel

    Using Tailscale (a private mesh network), every service on the laptop is reachable from my phone — without exposing anything to the public internet. I added a tiny web page (one small nginx container) as a mobile operator console: one tap to the local AI, automations, status, and finance dashboards. Pin it to the home screen and the laptop is in your pocket.

    The honest scoreboard vs. a Copilot+ PC

    Capability | Copilot+ PC ($1,000+) | This $400 laptop
    Private AI running on the device | Limited (small NPU models) | ✅ Full Ollama stack, 7 models
    An AI that operates the machine | | ✅ Runs commands, edits files, fixes things
    Private, offline voice dictation | ❌ (cloud) | ✅ Local Whisper
    Phone control panel | | ✅ Tailscale operator console
    Recall / Click to Do / Cocreator | ✅ (needs the NPU) |
    Screenshots everything you do | ⚠️ Recall does, by design | ✅ No, nothing is recorded

    I’m being fair: the NPU-only features are genuinely off the table on cheap hardware. But for operating your computer — and for privacy — the architecture beats the chip.

    Why this matters more than it looks

    The quiet headline isn’t “I saved money.” It’s where the data lives. Microsoft’s flagship AI-PC feature, Recall, works by screenshotting everything you do. This build does the opposite: the sensitive payload stays on your machine, and the cloud is used only for the heavy thinking that doesn’t need your private files.

    That’s not just a hobbyist’s preference. It’s the exact requirement for anyone in a regulated field — healthcare, legal, finance — who can’t send client data to a third party but still wants real AI leverage. The cheap laptop isn’t the story. The architecture is.

    Frequently asked questions

    Do I need a Copilot+ PC or an NPU to run local AI?

    No. Any laptop with around 16 GB of RAM and a modern CPU can run small local models. An NPU accelerates certain Windows features but is not required for Ollama or local chat.

    Is local AI actually private?

    Yes. With Ollama, the model runs on your own machine and works with no internet connection — nothing is sent to a cloud service.

    What is the difference between Ollama and open-webui?

    Ollama is the engine that runs the models. open-webui is the friendly chat interface that sits in front of it.

    How fast is a local model on a budget laptop?

    On a CPU-only AMD Ryzen 5 with 16 GB of RAM, a 3-billion-parameter model answered at roughly 12 tokens per second — fine for quick questions, drafting, and summaries. Larger models run slower.

    Can I use it from my phone?

    Yes. Over a private Tailscale network you can reach your laptop’s AI and tools from your phone without exposing anything to the public internet.

    Is this better than a Copilot+ PC?

    For operating your machine and for privacy, this setup does more. For NPU-specific Windows features like Recall and Click to Do, a Copilot+ PC is required.

    Want this on your machine?

    Tygart Media builds privacy-first, local-AI operator setups — especially for teams in regulated industries that need real AI leverage without sending data to the cloud. Reach out and we’ll scope it to your hardware.

  • Using Claude in Chrome with LinkedIn: What It Is Good For (and What to Avoid)

    Last verified: June 2026.

    What Claude in Chrome can and can’t do on LinkedIn

    Task                               Verdict             Notes
    Summarize a profile                ✅ Safe and useful   Read-only, no automation signal
    Draft a personalized DM            ✅ Safe and useful   You review and send manually
    Research a company page            ✅ Safe and useful   Read-only extraction
    Summarize a post or thread         ✅ Safe and useful   Read-only, no interaction
    Auto-post to your feed             ❌ High risk         Violates ToS, triggers automation detection
    Auto-connect with multiple people  ❌ High risk         Account restriction risk
    Bulk message sending               ❌ High risk         Spam detection, potential ban

    The Claude for Chrome extension lets Claude see and act inside your browser. The obvious temptation is to point it at LinkedIn and have it post for you. Do not do that. Here is what the extension is genuinely useful for on a professional network – and the one job you should never hand it.

    What to avoid: automated feed posting

    Driving the browser to auto-post feed content is a high-risk move. Professional networks actively detect automation, auto-posting violates their terms of service, and it can get an account throttled or suspended. If you want scheduled feed posts, use a social scheduler’s official API – that is the supported, durable path, and the one that will not get your account flagged. The browser is an assistant, not a posting robot.

    What it is actually good for

    1. Paste-assist for long-form Articles

    This is the real opportunity. Social schedulers – and every other third-party tool – can only push short feed posts through the official API. Native long-form Articles and Newsletters have no public publishing endpoint, so publishing them stays a manual copy-paste job. That matters because AI engines cite long-form Articles far more often than short posts. The most citation-valuable format is the one no tool can automate. That is exactly where an in-browser assistant earns its place: with you in the loop, it can help move a finished, formatted draft into the Article composer and tidy the formatting – turning a tedious manual paste into a guided one.

    2. Multi-account navigation

    If you operate a personal profile plus several company pages, the extension can help you move between already-authenticated sessions and keep track of which identity you are acting as – reducing the “posted from the wrong account” mistakes that come with juggling many pages by hand.

    3. Research, review, and drafting

    Reading a profile and summarizing it, scanning a feed for the day’s relevant threads, or drafting a thoughtful comment for your approval are all squarely in bounds. The assistant prepares; you decide and click.

    How to do it safely

    • Keep a human in the loop on anything that publishes or sends – review before you submit.
    • Never bulk-send connection requests, messages, or comments. That is the behavior detectors look for.
    • Use the official scheduler API for anything recurring; reserve the browser for the manual, assistive steps.
    • Treat the extension as read-and-prepare by default, act-and-publish only with your explicit click.
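    The rules above amount to a default-deny policy, which can be sketched in a few lines. This is an illustration only: the action names and the `is_allowed` function are hypothetical and do not correspond to any setting in the extension.

```python
# Hypothetical action names, grouped by whether they change anything on the platform.
READ_ONLY = {"summarize_profile", "research_company", "summarize_thread", "draft_message"}
PUBLISHING = {"publish_post", "send_message", "send_connection_request", "post_comment"}


def is_allowed(action: str, human_approved: bool = False, batch_size: int = 1) -> bool:
    """Read-and-prepare by default; act-and-publish only with an explicit human click."""
    if action in READ_ONLY:
        return True
    if action in PUBLISHING:
        # Never bulk-send, even with approval; single actions need a human click.
        return batch_size == 1 and human_approved
    # Anything unrecognized is denied by default.
    return False


if __name__ == "__main__":
    print(is_allowed("summarize_profile"))
    print(is_allowed("send_message", human_approved=True))
    print(is_allowed("send_message", human_approved=True, batch_size=50))
```

    Denying unknown actions by default keeps the safe lane narrow: a new capability has to be classified before it can run.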

    Frequently asked questions

    Can Claude auto-post to LinkedIn for me?

    Not safely, and you should not try. Use a social scheduler’s API for feed posts. The browser extension is for assistive, human-in-the-loop work – especially the long-form Articles that no API can publish.

    Why can’t scheduling tools publish Articles or Newsletters?

    Because the platform exposes no public API for them. Feed posts have an endpoint; long-form does not. That limitation is shared by every tool, which is why the manual paste persists.

    Is browser automation against the rules?

    Automated posting and bulk outreach generally violate the terms and risk the account. Assistive, human-approved use – drafting, summarizing, helping you paste – is the safe lane. When in doubt, keep a person on the trigger.

    For the bigger picture of how this fits a full content operation, see The AI Operator’s Stack.

    Frequently Asked Questions

    What is the Claude for Chrome extension?

    Claude for Chrome (Claude in Chrome) is a browser extension that lets Claude see and interact with the page currently open in your browser. It can read page content, summarize what’s visible, draft responses based on what it sees, and in some configurations take actions like clicking or filling forms — depending on what permissions are active.

    Can I use Claude to automate LinkedIn posts?

    You should not. Professional networks like LinkedIn actively detect browser automation, and auto-posting violates their Terms of Service. Using Claude in Chrome to drive automated feed posting can result in account throttling or permanent suspension. Claude is useful for drafting post content, but you should always review and publish manually.

    What is Claude in Chrome actually useful for on LinkedIn?

    Legitimate high-value uses include: summarizing a prospect’s profile before a sales call, researching a company page, drafting a personalized connection request or DM based on what you read on a profile, and summarizing a post or comment thread. All of these are read-and-assist operations that don’t trigger automation signals.

    Does using Claude in Chrome on LinkedIn violate their terms of service?

    Read-only operations (summarizing, researching, drafting) generally do not violate LinkedIn’s terms. Automated actions (clicking, posting, connecting, messaging at scale) do. The key distinction is whether Claude is taking actions on LinkedIn’s platform autonomously versus helping you draft content that you then review and submit yourself.

    How is Claude in Chrome different from a LinkedIn scraper?

    Claude in Chrome reads what’s visible on the page you have open — it is not a bulk scraper that crawls hundreds of profiles automatically. It operates within your active browser session, one page at a time, and does not bypass LinkedIn’s normal page rendering. A scraper typically makes API calls or headless browser requests at volume; Claude in Chrome is a single-session reading assistant.

    What Claude model powers Claude in Chrome?

    Claude in Chrome uses Anthropic’s Claude models — currently Claude Sonnet 4.6 is the primary model for browser interactions, balancing capability and speed. Anthropic may update the underlying model over time. You can check your current model in the extension settings.