Tag: Claude Code

  • Connect Claude Code to Postgres via MCP: The Right Way (2026)

    Connect Claude Code to Postgres via MCP: The Right Way (2026)

    The most useful thing you can wire into Claude Code isn’t a new model or a clever prompt — it’s your actual database. When Claude Code can read your schema, it stops guessing at table names, column types, and relationships. It starts writing queries that work the first time.

    This is the practical walkthrough for connecting Claude Code to a Postgres database via MCP: the command, the credential setup, the safety pattern, and what the workflow actually looks like once it’s running.

    What the Postgres MCP server does

    The official @modelcontextprotocol/server-postgres package (maintained in the MCP reference implementations repo) gives Claude Code four tools: schema inspection, query execution inside a read-only transaction, table detail lookup, and relationship traversal. The server cannot write data — it’s read-only by design in the reference implementation, though third-party variants like postgres-mcp-pro add configurable write access if you need it.

    For the majority of development workflows — debugging, writing migrations, generating queries — read-only is exactly what you want. Claude Code can see the shape of your data without being able to touch it.

    The fastest path: single command setup

    If you just want it running against a local database:

    claude mcp add postgres -- npx -y @modelcontextprotocol/server-postgres "postgresql://USER:PASSWORD@localhost:5432/mydb"

    The -y flag on npx auto-accepts the package install so the command doesn’t hang on first run. Verify with:

    claude mcp list

    You should see postgres with a connected status. That’s it — Claude Code now has schema access in the current project.

    Don’t do this for a production database. The connection string above is hardcoded. It goes into ~/.claude.json as plaintext. Use a dedicated local or staging database during development, and use env vars for anything that matters.

    The right way: env vars and a read-only user

    Two things to do before connecting to any real database:

    1. Create a read-only Postgres user:

    CREATE USER claude_readonly WITH PASSWORD 'your-password-here';
    GRANT CONNECT ON DATABASE your_db TO claude_readonly;
    GRANT USAGE ON SCHEMA public TO claude_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO claude_readonly;
    ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO claude_readonly;

    This user can see everything in public and do nothing else. If something goes wrong — a rogue tool call, a compromised session — blast radius is zero.

    2. Reference the connection string via env var in .mcp.json:

    {
      "mcpServers": {
        "db": {
          "type": "stdio",
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-postgres"],
          "env": {
            "POSTGRES_CONNECTION_STRING": "${DATABASE_URL}"
          }
        }
      }
    }

    Set DATABASE_URL in your shell environment or in a .env file (not committed). Add .mcp.json to the repo — everyone on the team gets the same server config — but the actual connection string lives in each developer’s local environment. This is the same pattern you already use for application config. It’s the right pattern here too.

    Add the server at project scope so it’s committed:

    claude mcp add --scope project --transport stdio db -- npx -y @modelcontextprotocol/server-postgres

    Then edit .mcp.json to replace the hardcoded connection string with the ${DATABASE_URL} env var reference shown above.

    What Claude Code can actually do with schema access

    Once the server is connected, the workflow changes significantly. A few real examples:

    Schema exploration: Ask Claude Code “what tables are in this database and how are they related?” and it traverses foreign keys, describes join paths, and builds a mental model of your data layer. No more copy-pasting \dt output.

    Query generation: “Write a query that finds users who signed up in the last 30 days but haven’t completed onboarding” produces accurate SQL because Claude Code knows your actual column names. With a generic prompt and no schema access, you’d get plausible-looking SQL that fails because user_status is actually onboarding_state.

    Migration drafting: “I need to add a last_login_at column to users — show me the migration and check for existing timestamp patterns in the schema.” Claude Code inspects the schema first, matches your existing column naming conventions, and produces a migration that fits your codebase.

    Debugging: “This query is returning the wrong count — here’s the query, check it against the schema.” Claude Code can spot that you’re joining on a nullable column, missing a filter on a soft-delete flag, or aggregating before filtering.

    Neon and cloud Postgres

    If you’re on Neon, there’s a first-party MCP server with additional capabilities: branch management, database creation, and schema migrations via Neon’s branching model. Set it up with:

    npx neonctl mcp add

    This runs OAuth through your browser and configures Claude Code automatically. The Neon MCP server is intended for local development and IDE workflows — not production automation — same caution applies.

    Debugging when it doesn’t connect

    Three commands for when the server shows as disconnected:

    claude mcp list          # check registered servers and status
    claude mcp test db       # test a specific server
    claude --debug           # tail logs including MCP stderr output

    Most connection failures are either a wrong connection string, a missing env var, or Node version issues with npx. The debug log shows the exact error from the server process — read it before assuming the problem is Claude Code.

    The practical baseline

    Five minutes to set up. The productivity delta on any codebase larger than a few tables is immediate — Claude Code stops making column-name mistakes and starts being genuinely useful for data-layer work. Wire up the read-only user, commit the .mcp.json, and add DATABASE_URL to your team’s .env.example. Done.

    The model doing the work in a typical Claude Code session is claude-sonnet-4-6 (workhorse) — it handles schema-aware query generation well without burning through Opus 4.8 credits on every lookup.

    Related Reading

    Frequently Asked Questions

    How do I connect Claude Code to a Postgres database via MCP?

    Run: claude mcp add postgres — npx -y @modelcontextprotocol/server-postgres “postgresql://USER:PASSWORD@localhost:5432/mydb”. The -y flag auto-accepts the package install so the command doesn’t hang. Then run claude mcp list and confirm postgres shows a connected status. Use a local or staging database for this quick path, not production.

    Is the Postgres MCP server read-only?

    Yes. The official @modelcontextprotocol/server-postgres package is read-only by design — it exposes schema inspection, query execution inside a read-only transaction, table detail lookup, and relationship traversal. It cannot write data. Third-party variants like postgres-mcp-pro add configurable write access if you need it.

    What’s the safe way to connect Claude Code to a production database?

    Create a dedicated read-only Postgres user (GRANT SELECT only), and reference the connection string through an environment variable in .mcp.json using ${DATABASE_URL} rather than hardcoding it. Commit .mcp.json so the team shares the server config, but keep the actual connection string in each developer’s local .env. If a session is ever compromised, the blast radius is zero.

    Why does schema access make Claude Code more accurate?

    With schema access, Claude Code reads your real table names, column types, and relationships, so it writes queries that work the first time instead of guessing. Without it, you get plausible-looking SQL that fails because user_status is actually onboarding_state. It also improves migration drafting and query debugging by matching your existing conventions.

    How do I debug a Postgres MCP server that won’t connect?

    Use three commands: claude mcp list to check registered servers and status, claude mcp test db to test a specific server, and claude –debug to tail logs including MCP stderr. Most failures are a wrong connection string, a missing env var, or a Node version issue with npx — the debug log shows the exact server error.

  • Claude Code vs Codex CLI (2026): A Hands-On Head-to-Head

    Claude Code vs Codex CLI (2026): A Hands-On Head-to-Head

    Last verified: June 2026.

    Both Claude Code and OpenAI Codex CLI are terminal-native coding agents: you run them inside a repo, they read your files, edit code, run commands, and iterate. I run both daily on real projects. This is the head-to-head I wish existed when I was deciding which one to make my default. No benchmarks-chasing, just install commands, config files, pricing math, and where each one actually earns its keep. For the broader toolchain these slot into, see our AI operator’s stack.

    Claude Code vs Codex CLI: the short answer

    If you want one sentence: Claude Code is the more mature agentic harness (subagents, hooks, skills, deep MCP, a flat-rate plan that makes heavy use affordable), while Codex CLI is the leaner, cheaper-per-token option with strong raw coding from the GPT-5.x line and a tight sandbox model. Most teams that live in the terminal all day end up on Claude Code for the workflow tooling; people who want a fast, low-cost agent on top of an existing OpenAI subscription reach for Codex.

    The honest version: they are closer than tribal arguments suggest. The deciding factors are almost never “which model is smarter this week” and almost always pricing structure, sandbox defaults, and how much workflow scaffolding you need.

    How do you install each one?

    Claude Code installs from npm and runs as the claude command:

    npm install -g @anthropic-ai/claude-code
    cd your-project
    claude

    First run walks you through OAuth login (Pro/Max plan) or an ANTHROPIC_API_KEY. On Windows it runs natively in PowerShell now, though a lot of operators still prefer it under WSL for fewer path headaches.

    Codex CLI ships an install script and is also on npm:

    # Mac / Linux
    curl -fsSL https://chatgpt.com/codex/install.sh | sh
    
    # Windows (PowerShell)
    powershell -ExecutionPolicy ByPass -c "irm https://chatgpt.com/codex/install.ps1 | iex"
    
    # or via npm
    npm install -g @openai/codex

    Then codex in your repo. Auth is either a ChatGPT login (Plus/Pro/Business) or an OpenAI API key via codex login. Both tools are open-source clients hitting hosted models, so the install is the easy part; the model access is what you are really buying.

    Which models do they run in 2026?

    Claude Code defaults to the current Claude flagship. As of June 2026 that is Opus 4.8 for the hardest reasoning, with Sonnet 4.6 as the fast everyday workhorse and Haiku 4.5 for cheap, high-volume calls. You switch in-session with /model. Opus 4.8 also exposes reasoning-effort levels (high is the default; xhigh and max push deeper on gnarly problems at higher token cost).

    Codex CLI runs the GPT-5.x coding line. GPT-5.5 is the current recommended default for complex coding and agentic work, GPT-5.4-mini is the faster/cheaper option for light tasks and subagents, and GPT-5.3-Codex remains a strong coding-tuned choice. Pick the model with codex -m gpt-5.5 or set it in your config.

    Practical read: on a clean, well-specified function both produce good code. The gap shows up on long, multi-file refactors where the agent has to hold a lot of context and recover from its own mistakes. That is a harness problem as much as a model problem, which is the next section.

    What about workflow features: subagents, hooks, and config?

    This is where Claude Code is currently ahead, and it is the real reason it tends to win for power users.

    • Subagents – Claude Code spawns isolated sub-sessions with their own context window, tool restrictions, and prompts. Great for “go research this in parallel while the main thread keeps coding.” Codex has a lighter subagent concept (often pointed at GPT-5.4-mini to keep cost down) but it is less fleshed out.
    • Hooks – Claude Code fires deterministic scripts at lifecycle points (PreToolUse, UserPromptSubmit, and more). These run real code, so they cannot hallucinate: you can hard-block a dangerous command, auto-format on every edit, or inject context before the model sees a prompt. Codex leans on its approval/sandbox policy and execpolicy rules instead of a general hook system.
    • Skills and slash commands – In Claude Code, custom slash commands have merged into skills; /your-command still works and skills add reusable, packaged capabilities. Codex uses prompt files and profiles rather than a skills layer.
    • Project memory – Both read a project instruction file. Claude Code uses CLAUDE.md; Codex uses AGENTS.md (checked in a fallback order including AGENTS.override.md and .agents.md). Keep these tight: architecture, conventions, and the few rules the agent keeps forgetting.

    Codex’s config story is clean if you like a single file: ~/.codex/config.toml holds your model, approval policy, sandbox mode, MCP servers, and named profiles you switch with codex --profile work. Claude Code spreads config across ~/.claude/ and .claude/settings.json plus per-project files, which is more surface area but more granular control.

    How do the sandbox and approval models compare?

    This matters more than most comparisons admit, because it governs how much the agent can do without asking.

    Codex CLI has an explicit, well-documented sandbox. Sandbox modes run from read-only to workspace-write (edit files in the project, network off by default) up to full access, paired with approval policies like untrusted and on-request. On Windows the native sandbox can run unelevated or elevated. The mental model is clear: pick how much rope, then approve escalations.

    Claude Code manages permissions through allow/deny rules and modes (including a plan mode that reasons without touching files, and an auto-accept mode for trusted loops). Combined with PreToolUse hooks you can build a strict policy, but it is more “assemble it yourself” than Codex’s preset sandbox tiers.

    If you are dropping an agent onto an unfamiliar or sensitive repo, start read-only in both. Codex makes that posture a one-flag default; Claude Code gives you finer-grained control once you invest in the config.

    Do both support MCP?

    Yes, and this is a genuine tie that matters. Both speak the Model Context Protocol, so you can wire in the same external tools, databases, and APIs. Codex registers STDIO or streaming-HTTP MCP servers in ~/.codex/config.toml and launches them at session start. Claude Code adds servers via claude mcp add or JSON config. If you have already built MCP integrations, neither tool locks you out. New to MCP, start with our Claude MCP setup guide and the Notion MCP setup walkthrough.

    What does each one cost?

    Pricing is where the decision often gets made, so here are the real numbers as of June 2026.

    Claude Code plans:

    • Pro – $20/mo: Sonnet 4.6 plus some Opus, roughly enough for focused daily sessions, not all-day heavy use.
    • Max 5x – $100/mo: much larger windows, real Opus headroom.
    • Max 20x – $200/mo: the heavy-user tier; effectively flat-rate firehose access.
    • API pay-as-you-go: Opus 4.7 about $5/$15 per million input/output… (current Opus tier runs higher), Sonnet 4.6 $3/$15, Haiku 4.5 $1/$5.

    Codex CLI: Included in ChatGPT Plus/Pro/Business plans (usage governed by your plan’s limits), or pay-as-you-go on the API. GPT-5.3-Codex runs about $1.75 per million input / $14 per million output, with cheaper input on cached tokens. The mini model is far cheaper for light work.

    The structural difference: Claude Code’s Max plans are flat-rate, which is why heavy users love them. People have tracked billions of tokens that would cost five figures on API metering but ran around a few hundred dollars on Max. Codex’s per-token rates are lower per unit and great if your usage is bursty or already bundled into a ChatGPT subscription, but a true all-day agent habit can run up metered cost faster than a flat plan. Estimate your monthly token volume honestly, then do the arithmetic both ways.

    So which coding agent should you actually use?

    Pick Claude Code if you want the deepest agentic workflow (subagents, hooks, skills), you are a heavy daily user who benefits from the flat-rate Max plan, or you need fine-grained, scriptable control over what the agent can do. It is the more complete operator’s harness in 2026.

    Pick Codex CLI if you want lower per-token cost, you already pay for ChatGPT and want to use that allowance, you like the clean preset sandbox/approval model, or you simply prefer the GPT-5.x output style. It is lean, fast to stand up, and genuinely capable.

    The move a lot of us make: run both. They are cheap relative to engineer time, they share MCP servers, and they have different failure modes. When one gets stuck in a loop on a hard bug, handing the same task to the other with fresh context often breaks the logjam. If you are weighing terminal agents against IDE-native ones, our Claude Code vs Cursor breakdown covers that axis.

    Frequently asked questions

    Is Claude Code or Codex CLI better for large refactors?

    Claude Code tends to hold up better on long multi-file refactors, mostly because of subagents and hooks that keep context organized and catch mistakes deterministically. Codex can do it too, especially with GPT-5.5, but you lean harder on tight AGENTS.md instructions and approval gates.

    Can I use Codex CLI without a ChatGPT subscription?

    Yes. Run codex login with an OpenAI API key and you pay per token instead of through a ChatGPT plan. Same for Claude Code with an ANTHROPIC_API_KEY if you would rather meter than subscribe.

    Do they work on Windows natively?

    Both do in 2026. Claude Code runs in PowerShell (many operators still prefer WSL for cleaner paths), and Codex CLI has a native Windows installer plus a Windows sandbox with unelevated/elevated modes. Watch out for shells that mangle /tmp or C:\ style paths in arguments.

    What is the single biggest difference?

    Pricing structure and workflow depth. Claude Code offers flat-rate Max plans and a richer harness (subagents, hooks, skills); Codex offers lower per-token rates and a cleaner preset sandbox. Model quality is close enough that those two factors usually decide it.

    Which model do they run by default?

    Claude Code defaults to the current Claude flagship (Opus 4.8 as of June 2026, with Sonnet 4.6 for everyday speed). Codex CLI recommends GPT-5.5 for complex work, with GPT-5.4-mini and GPT-5.3-Codex as alternatives. Switch in-session with /model or the -m flag.

    How do I get either tool cited or surfaced by AI engines for my own docs?

    That is a content question, not a tooling one. The same structure that makes this page answerable, short factual answers, question-shaped headers, and a visible FAQ, is what AI engines reward. See how AI engines cite content for the full playbook.

  • Claude Code vs Cursor: An Honest 2026 Comparison

    Claude Code vs Cursor: An Honest 2026 Comparison

    Last verified: June 2026.

    Claude Code and Cursor are the two tools most working developers actually reach for in 2026, and they are not the same kind of thing. Cursor is an AI-native code editor (a VS Code fork) where the model lives inside your IDE. Claude Code is a terminal agent that lives in your shell and edits files, runs commands, and drives git from the command line. I run both every day. This is the honest version: what each one is good at, what they cost right now, and a simple rule for picking.

    Claude Code vs Cursor: what is the actual difference?

    The short answer: Cursor is an editor you type in; Claude Code is an agent you delegate to. Cursor keeps you in the driver’s seat with autocomplete, inline edits, and a chat sidebar that sees your open files. Claude Code takes a goal (“add rate limiting to the upload endpoint and run the tests”) and works the repo autonomously in the terminal, asking permission before it touches things.

    Dimension Claude Code Cursor
    Form factor Terminal CLI (plus IDE extension, web, desktop) Full IDE (VS Code fork)
    Primary loop Delegate a task, approve actions Type code, accept suggestions
    Models Claude only (Sonnet 4.6, Opus 4.8) Multi-model: Claude, GPT, Gemini
    Best at Multi-file refactors, scripted/headless runs, git workflows Tight edit loops, autocomplete, staying in one window
    Entry price $20/mo (Pro) Free (Hobby) / $20/mo (Pro)
    Billing model Usage windows (5-hour + weekly) Credit pool ($ equal to plan price)

    How does each one actually work?

    Claude Code (terminal agent)

    You install it globally and run it from inside a project directory:

    npm install -g @anthropic-ai/claude-code
    cd my-project
    claude

    From there you talk to it in plain language. It reads files, proposes edits as diffs, and runs shell commands only after you approve them. A few patterns I use constantly:

    • Project memory: drop a CLAUDE.md file in the repo root with build commands, conventions, and “do not touch” rules. Claude Code reads it on every run, so you stop re-explaining the same context.
    • Headless / scripted runs: claude -p "bump all deps and run the test suite" runs one-shot and exits, which is what makes it scriptable in CI or cron jobs. This is the single biggest thing Cursor cannot do.
    • Permission control: by default it asks before edits and commands. You can pre-approve safe tools so it stops prompting on every npm test.
    • Plan mode: ask it to plan before it writes, review the plan, then let it execute. This is how you avoid a runaway agent rewriting half the codebase.

    Cursor (AI IDE)

    Cursor is a download, not a package install. You open your folder and the AI is wired into the editing surface:

    • Tab completion: multi-line, context-aware autocomplete that predicts your next edit, not just the next token. This is the feature people stay for.
    • Inline edit (Cmd/Ctrl+K): select code, describe the change, get a diff in place.
    • Agent mode: a chat panel that can edit multiple files and run terminal commands, closing the gap with Claude Code from inside the IDE.
    • Model picker: switch between Claude Sonnet, GPT, and Gemini per request from a dropdown. Useful when one model is stuck and you want a second opinion without leaving the window.

    What does Claude Code cost in 2026?

    Claude Code is billed by usage windows, not per-request credits. As of June 2026:

    • Pro: $20/month. Sonnet 4.6 and Opus 4.6, roughly 10 to 40 prompts per 5-hour window depending on repo size.
    • Max 5x: $100/month. ~5x Pro limits and access to Opus 4.8.
    • Max 20x: $200/month. ~20x Pro limits, all models including Opus 4.8.
    • API (pay-per-token): Opus 4.7 at $5 input / $25 output per million tokens; Sonnet 4.6 at $3 / $15.

    The mechanic to understand: there is a 5-hour rolling session window (your budget resets from your first prompt) plus a weekly active-compute cap that only counts time the model is actually reasoning. If you hit a wall mid-afternoon, you are usually waiting for the 5-hour window to roll, not the week.

    What does Cursor cost in 2026?

    Cursor moved to a credit-pool model (the switch happened in mid-2025). Every paid plan includes a monthly credit pool equal to the plan price in dollars, and each request burns credits based on which model you pick and how heavy the request is. As of June 2026:

    • Hobby: Free. Limited tab completions and agent requests, plus a one-week Pro trial on signup.
    • Pro: $20/month ($16 annual). Frontier model access, MCP support, cloud agents, and a $20 credit pool.
    • Pro+: $60/month. ~3x the credits.
    • Ultra: $200/month. ~20x usage and priority features.
    • Teams: $40/user/month with SSO and admin controls.

    Practical note on the credit pool: model choice matters a lot. Roughly, $20 of credits buys about 225 Claude Sonnet requests or about 550 Gemini requests, because Anthropic models cost more per call than Gemini in Cursor’s pricing. If you run Claude on everything, the $20 pool drains faster than newcomers expect. This is the source of most “what happened to Cursor pricing” confusion.

    Which models do you actually get?

    This is the cleanest dividing line.

    • Claude Code is Claude-only. You get Anthropic’s frontier coding models (Sonnet 4.6 for speed/cost, Opus 4.8 for the hardest agentic work on Max). No GPT, no Gemini. If you trust Claude for code, the single-vendor integration is tighter and the agent behavior is tuned end to end.
    • Cursor is multi-model. Claude, OpenAI, and Google models from one dropdown. The advantage is hedging: if one model whiffs on a problem, switch and retry in seconds. The trade-off is that no single model is integrated as deeply as Claude is in its own first-party tool.

    Which one is better for big refactors and automation?

    Claude Code, clearly. Two reasons. First, the terminal-agent loop is built for “go do this across the whole repo” tasks, and plan mode plus CLAUDE.md keep it on rails. Second, headless mode (claude -p "...") means you can wire it into scripts, pre-commit hooks, and scheduled jobs. Cursor’s agent mode is strong inside the IDE, but it is fundamentally an interactive editor, not a thing you call from a cron line.

    Which one is better for everyday coding flow?

    Cursor, for most people. If your day is reading, editing, and iterating on code you understand, Cursor’s tab completion and inline edits keep you in one window with near-zero friction. You never leave the editor to get help. Developers who are uneasy handing a whole task to an autonomous agent also tend to prefer Cursor because they stay in control of every keystroke.

    Can you use both together?

    Yes, and a lot of people do. The common setup: Cursor as the editor, Claude Code in Cursor’s integrated terminal. You get Cursor’s autocomplete and visual diff review for hands-on work, and you drop into Claude Code when you want to delegate a multi-file job or run something headless. They do not conflict. If you are building a broader operator setup around these tools, see our AI operator’s stack for how the pieces fit, and our Claude MCP setup guide for wiring external tools and data into Claude Code via MCP.

    Claude Code vs Cursor vs Codex?

    Codex is the third option people weigh, and it sits closer to Claude Code as an agent than to Cursor as an editor. The decision usually comes down to which model family and which workflow you trust. We break that specific matchup down in Claude Code vs Codex.

    Bottom line: when to pick which

    • Pick Claude Code if you want an autonomous agent for refactors, you live in the terminal and git, you need scriptable/headless runs, and you are happy with Claude as your one model.
    • Pick Cursor if you want best-in-class autocomplete, you prefer staying inside a visual editor, you value swapping between Claude/GPT/Gemini, and you want to keep your hands on the keyboard.
    • Pick both if you can swing two subscriptions: Cursor for the edit loop, Claude Code in the terminal for delegation. Start each on the $20 tier and only upgrade the one you hit limits on.

    FAQ

    Is Claude Code or Cursor cheaper?

    Both start at $20/month (Cursor also has a free Hobby tier). The difference is the meter: Claude Code limits you by 5-hour usage windows plus a weekly cap, while Cursor gives you a $20 credit pool that drains per request based on the model. Heavy Claude usage in Cursor burns the pool faster than people expect.

    Does Cursor use Claude?

    Yes. Cursor offers Anthropic’s Claude models alongside OpenAI and Google models, selectable per request. But you are using Claude through Cursor’s integration, not Anthropic’s first-party Claude Code agent, so the agentic behavior differs.

    Can Claude Code edit files and run commands like an IDE agent?

    Yes. Claude Code reads and writes files, runs shell commands, and drives git directly from the terminal. By default it asks permission before edits and commands, and you can pre-approve safe tools to cut down the prompts.

    Which is better for beginners?

    Cursor. The visual editor, inline diffs, and autocomplete are more forgiving than a terminal agent, and the free Hobby tier lets you learn before paying. Claude Code rewards people who are already comfortable in the shell and with git.

    Do I need to know the command line to use Claude Code?

    Largely yes. Claude Code is a CLI-first tool, and while it does most of the git and shell work for you, you will be living in a terminal. There is also an IDE extension and a desktop app, but the terminal is where it is strongest.

    Can I run Claude Code in CI or on a schedule?

    Yes, via headless mode: claude -p "your task" runs once and exits, which makes it usable in CI pipelines, git hooks, and scheduled jobs. Cursor has no equivalent because it is an interactive editor.

    Will using both at once cause conflicts?

    No. A common and stable setup is Cursor as your editor with Claude Code running in Cursor’s integrated terminal. They operate on the same files without stepping on each other, as long as you are not having both edit the exact same file simultaneously.

    Related reading: how AI engines cite content and Claude in Chrome for LinkedIn automation.

  • Notion MCP Setup with Claude: Complete Config + the Errors Nobody Documents

    Notion MCP Setup with Claude: Complete Config + the Errors Nobody Documents

    Last verified: June 2026.

    There are two ways to connect Notion to Claude over the Model Context Protocol (MCP), and almost every tutorial only covers one of them. Worse, none of them tell you why the connection succeeds but Claude still says it cannot find any of your pages. This guide covers both paths – the hosted OAuth connector and the self-hosted token route – with the exact config, the scopes that matter, the rate limit you will hit, and the specific error strings that show up when something is wrong. It is written from actually running this, not from reading the docs.

    Which Notion MCP should you use: hosted or self-hosted?

    Short answer: use the hosted MCP at https://mcp.notion.com/mcp for almost everything. It uses OAuth, so you never handle a token, and it respects your existing Notion permissions automatically. Use the self-hosted npm server with an internal integration token only when you need headless, unattended automation (a cron job, a server with no human to click “Allow”), because the hosted server requires an interactive OAuth approval that a background process cannot complete.

    Factor Hosted (mcp.notion.com) Self-hosted (npm + token)
    Auth OAuth (interactive) Internal integration token (ntn_)
    Permissions Inherits your full Notion access Only pages you explicitly share
    Setup time ~2 minutes ~10 minutes
    Headless / cron No (needs a human to approve) Yes
    Best for Claude Desktop, Claude Code, daily use Servers, scripts, multi-user backends

    How do I connect Notion to Claude using the hosted MCP?

    This is the fast path. No token, no npm.

    Claude Desktop / Claude.ai: Open Settings > Connectors, click Add custom connector (or pick Notion if it appears in the directory), and paste the URL https://mcp.notion.com/mcp. A Notion OAuth window opens. Approve it, choose which workspace and which top-level pages the connector may see, and you are done. The scope you pick in that OAuth screen is the whole ballgame – see the gotcha below.

    Claude Code (CLI): one command.

    claude mcp add --transport http notion https://mcp.notion.com/mcp

    Then run /mcp inside Claude Code to trigger the OAuth login. After approving, verify it is live:

    claude mcp list

    You should see notion with a connected status. If it shows failed or needs auth, run /mcp again and complete the browser flow – the CLI cannot proceed past OAuth on its own.

    How do I set up the self-hosted Notion MCP with an integration token?

    Use this when you need automation that runs without a human. Three steps: create the integration, get the token, then wire it into your MCP config.

    1. Create the integration and copy the token

    1. Go to https://www.notion.so/my-integrations and click New integration. You must be a Workspace Owner – if the button is greyed out, that is why.
    2. Name it, select the workspace, and choose Internal integration type.
    3. On the integration’s settings page, set Capabilities: Read content, plus Update/Insert content if you want Claude to write. If you only grant Read, every write attempt fails with a permission error later – this is a common self-inflicted wound.
    4. Click Show, then copy the Internal Integration Token. New tokens start with ntn_ (older ones start with secret_ and still work). Treat it like a password.

    2. Add it to your MCP config

    For Claude Desktop, edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/; Windows: %APPDATA%\Claude\). Add:

    {
      "mcpServers": {
        "notion": {
          "command": "npx",
          "args": ["-y", "@notionhq/notion-mcp-server"],
          "env": {
            "NOTION_TOKEN": "ntn_your_token_here"
          }
        }
      }
    }

    Restart Claude Desktop completely (quit, do not just close the window). On Windows, if npx is not found, use the full path to npx.cmd or run via cmd /c – the config does not inherit your shell PATH.

    3. The step everyone forgets: share the pages

    An internal integration starts with access to nothing. Creating it and pasting the token is not enough. For every page or database you want Claude to touch, open it in Notion, click the menu (top right), choose Connections (or Add connections), and select your integration. Access is inherited by child pages, so sharing a top-level page covers its whole subtree. Skip this and Claude will connect cleanly and then truthfully report that it cannot see any content.

    What are the most common Notion MCP errors and how do I fix them?

    These are the failure messages you will actually see, and what each one means.

    “Could not find page” / Claude returns zero results from a workspace that clearly has pages

    Cause, in order of likelihood: (1) you did not share the page with the integration (self-hosted), or you scoped the OAuth grant too narrowly (hosted); (2) the page is in a different workspace than the one the integration is tied to. Fix: re-check the Connections menu on the specific page, or re-run the OAuth flow and widen the page selection. An integration token is bound to one workspace – it cannot see another.

    API token is invalid (HTTP 401 unauthorized)

    The token is wrong, was regenerated, or has a stray space/newline from copy-paste. Re-copy it with the Show button, paste it into the env block with no surrounding whitespace, and restart the client. If you rotated the token in Notion, the old one dies immediately – update every config that used it.

    Validation error / “body failed validation” (HTTP 400)

    Claude tried to write a property type that does not match the database schema – for example sending plain text into a Select, or a malformed date. This is not a connection problem. Tell Claude the exact property names and types, or ask it to fetch the database schema first before writing.

    “Rate limited” / HTTP 429

    Notion enforces roughly 3 requests per second per integration (averaged, with short bursts tolerated). Bulk operations – “update 200 rows,” “scan every page” – blow straight through this. The fix is pacing, not a bigger plan: have Claude batch in small groups and add a short delay between calls. If you are scripting around the MCP, honor the Retry-After header on a 429 instead of hammering.

    spawn npx ENOENT (server will not start, self-hosted)

    Node/npx is not on the PATH the client sees. Install Node, then point command at the absolute path to npx (or npx.cmd on Windows). This is the number-one self-hosted startup failure.

    MCP server shows “failed” in claude mcp list right after adding it

    For the hosted server this almost always means OAuth was never completed – run /mcp and approve in the browser. For the self-hosted server it means the process crashed on launch; run the npx command manually in a terminal to see the real stack trace.

    What can Claude actually do with Notion once connected?

    With read + write capabilities granted, Claude can search your workspace, read pages and databases, create pages, append blocks, and update properties on existing rows. Practical things that work well: “summarize this Notion doc,” “create a row in my tasks database with these fields,” “find every page mentioning X and list the links,” “turn this conversation into a meeting-notes page.” Things that get awkward: very large pages (the response can get truncated – verify the write actually landed by fetching it back), and bulk edits that trip the rate limit. A reliable habit is to treat Notion as the source of truth and have Claude verify by re-fetching after any write, because timeouts on big pages can commit silently and still return a malformed response.

    Security notes from running this in production

    • Scope the OAuth grant tightly. The hosted connector inherits whatever you approve. If you only need one project area, share only that top-level page, not the whole workspace.
    • Never hardcode the token in a file you might commit. Keep ntn_ tokens in an env var or your OS secret store, and reference them from config. If a token leaks, revoke it in my-integrations immediately – revocation is instant.
    • Grant the minimum capability. If Claude only needs to read, do not enable insert/update. You can always widen it later.
    • Audit which integrations have access to sensitive databases periodically; the Connections menu on each page shows the current list.

    If you are wiring up several connectors at once, the mechanics generalize – see our broader Claude MCP setup guide for the config patterns that apply across servers, and the AI operator’s stack for how Notion fits alongside the other tools day to day. If your reason for connecting Notion is to get your own content cited by AI systems, the structure of the pages matters as much as the wiring – see how AI engines cite content.

    FAQ

    Do I need a paid Notion plan to use the MCP?

    No. The API and integrations work on free Notion workspaces. You do need to be a Workspace Owner to create an internal integration.

    Why does Claude say it connected but can’t find my pages?

    Almost always the page-sharing step. A self-hosted integration has access to nothing until you add it via the page’s Connections menu. For the hosted connector, you scoped the OAuth approval too narrowly – re-run it and select more pages.

    What is the Notion API rate limit?

    About 3 requests per second per integration, averaged over time with small bursts allowed. Exceeding it returns HTTP 429. Pace bulk operations and respect the Retry-After header.

    What does the ntn_ prefix mean on my token?

    It is the current internal integration token format. Tokens created today start with ntn_; older secret_ tokens still function and do not need to be replaced.

    Can the hosted Notion MCP run in a headless cron job?

    No. The hosted server requires interactive OAuth approval, which an unattended process cannot complete. Use the self-hosted npm server with an ntn_ token for headless automation.

    How do I let Claude write to Notion, not just read?

    Enable Update content and Insert content under the integration’s Capabilities (self-hosted), or approve write scope during OAuth (hosted). Read-only is the default and will reject every write with a permission error.

    Where do I put the Notion MCP config for Claude Desktop?

    In claude_desktop_config.json~/Library/Application Support/Claude/ on macOS, %APPDATA%\Claude\ on Windows. Add a notion entry under mcpServers and fully restart the app.

    Can one integration access two workspaces?

    No. An internal integration is bound to the single workspace it was created in. For a second workspace, create a second integration (or a second OAuth grant on the hosted connector).

    Related reading from operators who run AI tooling daily: Claude Code vs Cursor, Claude Code vs Codex, and Claude in Chrome for LinkedIn automation.

    Frequently Asked Questions

    What is the Notion MCP server for Claude?

    The Notion MCP server is a connector that lets Claude read, search, and interact with your Notion workspace through the Model Context Protocol. With it connected, you can ask Claude to find pages, summarize databases, draft content into Notion, or retrieve information — all without copying and pasting anything manually.

    What is the difference between the Notion OAuth connector and the self-hosted token route?

    The OAuth connector (available in Claude Desktop’s built-in integrations) is the fastest path — authorize once and Claude gets read access to pages you share with the integration. The self-hosted route uses a Notion Internal Integration token in your claude_desktop_config.json, giving you more control over scopes and works in Claude Code environments where OAuth connectors aren’t available.

    Why does Claude say it can’t find my Notion pages after connecting?

    The most common reason is that you haven’t shared the specific pages or databases with your Notion integration. Notion’s permission model requires explicit page-level grants — a successful connection does not automatically give access to all your content. Go to the page in Notion, click the three-dot menu, choose ‘Add connections’, and select your integration.

    What scopes does the Notion MCP integration need?

    For read operations: Read content and Read user information. For write operations: Update content and Insert content. Avoid requesting more scopes than you need — the minimum set reduces the blast radius if a token is ever compromised. Set scopes when creating your Notion Internal Integration at notion.so/my-integrations.

    Does the Notion MCP integration work with Claude Code?

    Yes. Add it via ‘claude mcp add notion npx @notionhq/notion-mcp-server’ and set your NOTION_API_KEY as an environment variable. Claude Code picks it up on the next session. The self-hosted token route works in both Claude Desktop and Claude Code; the OAuth connector is currently Desktop-only.

    What rate limits does the Notion API have?

    Notion’s API enforces a rate limit of 3 requests per second per integration. For typical conversational use this is invisible, but if you ask Claude to crawl a large database or process many pages in sequence, you may see 429 errors. The fix is to add a short sleep between bulk operations or use Notion’s pagination to fetch in smaller batches.


  • How to Connect Any Tool to Claude with MCP: Complete Setup Guide

    How to Connect Any Tool to Claude with MCP: Complete Setup Guide

    Last verified: June 2026

    MCP (Model Context Protocol) is how you give Claude hands. Out of the box Claude can talk; with an MCP server connected, it can read your files, query a database, hit an API, or drive a browser. This guide covers the two places you actually wire servers up: the claude_desktop_config.json file for the Claude Desktop app, and the claude mcp add command for Claude Code (the terminal/IDE tool). It ends with the troubleshooting section the official docs skip: path problems, JSON syntax traps, servers that silently never load, and the Windows quirks that cost people an afternoon.

    Everything below is checked against the current Claude Code MCP docs and tested commands. If you want the wider picture of how MCP fits into a working setup, see the AI operator’s stack.

    What is MCP and what is an MCP server?

    MCP transport options at a glance

    Transport Use case Config location Works with
    stdio Local tools, CLIs, scripts claude_desktop_config.json or claude mcp add Claude Desktop, Claude Code
    SSE (HTTP) Remote servers, cloud services URL in config Claude Desktop, Claude Code
    Streamable HTTP Production remote MCP URL in config Claude Desktop, Claude Code

    MCP is an open standard for connecting AI models to external tools and data. An MCP server is a small program that exposes a set of tools (functions Claude can call) over that protocol. Claude is the client; the server is the thing that actually does the work, like reading a Postgres table or creating a GitHub issue.

    There are two transport types you will deal with in practice:

    • stdio (local): the server runs as a process on your machine. Claude talks to it over standard input/output. Best for filesystem access, local databases, and custom scripts. This is what most “install this MCP server” instructions mean.
    • HTTP (remote): the server lives on the internet at a URL. Best for cloud services (Notion, Sentry, Stripe, GitHub). Often uses OAuth. SSE is the older remote transport and is now deprecated; use HTTP for new remote servers.

    The same server can usually be added to both Claude Desktop and Claude Code. The mechanics differ: Desktop uses a JSON file you edit by hand, Claude Code gives you a CLI that writes the config for you.

    How do I add an MCP server to Claude Desktop?

    Claude Desktop reads a single JSON file. You edit it, fully quit the app, and reopen. There is no in-app “add server” button for custom servers as of June 2026, so the file is the source of truth.

    Where is claude_desktop_config.json?

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json (paste that into the File Explorer address bar; it expands to C:\Users\YOU\AppData\Roaming\Claude\)

    The fastest way to open it: in Claude Desktop go to Settings > Developer > Edit Config. That button creates the file if it does not exist and opens the folder. If the file is brand new it may be empty or just {}.

    The config file structure

    Everything lives under a top-level mcpServers object. Each key is a name you choose; each value describes how to launch the server. Here is a complete, working file with a local filesystem server and a remote Notion server:

    {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "C:\\Users\\YOU\\Documents"
          ]
        },
        "notion": {
          "command": "npx",
          "args": ["-y", "@notionhq/notion-mcp-server"],
          "env": {
            "NOTION_TOKEN": "secret_your_token_here"
          }
        }
      }
    }

    Three fields do all the work:

    • command the executable to run (npx, node, python, uvx, or an absolute path to a binary).
    • args an array of arguments. Each flag and value is its own array element. "--port 8080" as a single string will not work; use "--port", "8080".
    • env an object of environment variables (API keys, tokens). Optional.

    After saving, completely quit Claude Desktop (on Windows, right-click the system tray icon and choose Quit; closing the window is not enough) and reopen it. You should see a tools/connector indicator in the chat input. For a full Notion walkthrough including the token, see connecting Notion to Claude with MCP.

    How do I add an MCP server to Claude Code (CLI)?

    Claude Code does not make you hand-edit JSON. The claude mcp command manages servers for you and writes to the right file. This is the part people get wrong most often, so here is the exact, verified syntax.

    claude mcp add

    The general form for a local (stdio) server is:

    claude mcp add [options] <name> -- <command> [args...]

    The -- is load-bearing. Everything before it is for Claude Code; everything after it is the command that launches your server. Real examples:

    # Local filesystem server
    claude mcp add filesystem -- npx -y @modelcontextprotocol/server-filesystem ~/Documents
    
    # Local server with an environment variable
    claude mcp add --env AIRTABLE_API_KEY=YOUR_KEY airtable -- npx -y airtable-mcp-server
    
    # Remote HTTP server
    claude mcp add --transport http notion https://mcp.notion.com/mcp
    
    # Remote HTTP server with an auth header
    claude mcp add --transport http github https://api.githubcopilot.com/mcp/ --header "Authorization: Bearer YOUR_PAT"

    Option ordering matters. All flags (--transport, --env, --scope, --header) must come before the server name. Put a flag after the name and you will get a confusing parse error or the flag will be passed to your server instead of to Claude Code.

    Scopes: local, project, user

    The --scope flag decides where the config is written and who sees it:

    • local (the default) loads only in the current project, private to you. Stored in ~/.claude.json keyed by the project path.
    • project loads in the current project and is shared with your team. Stored in a .mcp.json file at the project root that you commit to git.
    • user loads across all your projects, private to you. Also stored in ~/.claude.json.
    # Available across all your projects
    claude mcp add --scope user --transport http sentry https://mcp.sentry.dev/mcp
    
    # Shared with your team via .mcp.json in the repo
    claude mcp add --scope project filesystem -- npx -y @modelcontextprotocol/server-filesystem .

    When the same server name exists at multiple scopes, local wins over project, which wins over user. The whole entry from the winning scope is used; fields are not merged.

    claude mcp list, get, and remove

    # List every configured server and its connection status
    claude mcp list
    
    # Show the full config for one server
    claude mcp get github
    
    # Remove a server
    claude mcp remove github

    claude mcp list is your first diagnostic. It shows each server as connected, pending, or failed. Project-scoped servers from a .mcp.json you have not approved yet show as Pending approval until you run claude interactively and accept them.

    Other useful commands

    # Add from a raw JSON blob (handy when copying from a server's README)
    claude mcp add-json weather '{"type":"stdio","command":"npx","args":["-y","weather-mcp"]}'
    
    # Import everything you already set up in Claude Desktop (macOS / WSL)
    claude mcp add-from-claude-desktop

    Inside a running Claude Code session, type /mcp to see live server status, tool counts, and to trigger OAuth login for remote servers that need it.

    Troubleshooting: the errors the docs skip

    Most MCP failures are not exotic. They are paths, quoting, and the app not restarting. Work this list top to bottom.

    Server not loading / no tools appear

    1. Did you actually restart? Claude Desktop only reads the config at launch. Fully quit (tray icon > Quit on Windows, Cmd+Q on macOS) and reopen. In Claude Code, run claude mcp list to see the real status instead of guessing.
    2. Is the command on PATH? The single most common stdio failure is npx, node, python, or uvx not being found by the app. GUI apps often have a narrower PATH than your terminal. Fix it by using an absolute path. Find it with which npx (macOS/Linux) or where npx (Windows), then put that full path in command.
    3. Read the logs. Claude Desktop writes per-server logs. macOS: ~/Library/Logs/Claude/. Windows: %APPDATA%\Claude\logs\. Look for mcp-server-NAME.log. The actual error (missing module, bad token, wrong path) is almost always sitting right there.

    spawn ENOENT or “command not found”

    This means the OS could not find the executable named in command. It is a PATH problem, not a Claude problem. Use the absolute path (see above). On Windows specifically, see the npx note below.

    JSON syntax errors (the silent killer)

    If claude_desktop_config.json has a single syntax error, Claude Desktop loads zero servers and usually says nothing. The usual culprits:

    • Trailing commas. JSON forbids a comma after the last item in an object or array. "args": ["a", "b",] is invalid.
    • Smart quotes. If you edited the file in a word processor, curly quotes (the slanted kind) break the parser. Use a code editor and straight quotes only.
    • Unescaped Windows backslashes. In JSON, \ is an escape character, so every backslash in a Windows path must be doubled: C:\\Users\\YOU\\Documents. A single backslash silently corrupts the string.

    Paste the whole file into any JSON validator before restarting. Thirty seconds there saves an hour of staring.

    Claude Code: flag passed to the wrong place

    If claude mcp add behaves strangely, you almost certainly put a flag after the server name or forgot the --. Reread the order: claude mcp add [flags] NAME -- COMMAND ARGS. The -- separates Claude Code’s flags from your server’s command.

    Windows quirks

    • npx needs a wrapper in some setups. If a bare "command": "npx" fails on Windows, launch it through cmd: "command": "cmd" with "args": ["/c", "npx", "-y", "the-server-package"]. This resolves a class of “npx works in my terminal but not in Claude” failures.
    • Use the right config root. It is %APPDATA%\Claude\ (which is AppData\Roaming), not AppData\Local. Mixing these up means you are editing a file the app never reads.
    • Git Bash mangles paths. If you run commands through Git Bash, it can rewrite paths like /c/Users or absolute paths inside arguments. Prefer PowerShell or cmd for claude mcp add, or pass paths exactly as Windows expects them.
    • WSL is a separate world. A server installed inside WSL is not visible to a Windows-native Claude Desktop, and vice versa. Keep both on the same side.

    Remote server returns 401 / 403

    The server needs authentication. In Claude Code, run /mcp and complete the OAuth flow in your browser. If you hardcoded an Authorization header and it is rejected, the token is wrong for that endpoint; remove the header and let OAuth handle it instead.

    Frequently asked questions

    What is an MCP server in one sentence?

    A small program that exposes tools (functions like read-file or query-database) to Claude over the Model Context Protocol, so Claude can take real actions instead of only producing text.

    What is the difference between Claude Desktop and Claude Code for MCP?

    Claude Desktop is the chat app and you configure servers by hand-editing claude_desktop_config.json, then restarting. Claude Code is the terminal/IDE tool and you configure servers with the claude mcp add command, which writes the config for you.

    Where is the Claude Desktop config file?

    macOS: ~/Library/Application Support/Claude/claude_desktop_config.json. Windows: %APPDATA%\Claude\claude_desktop_config.json. The quickest way to open it is Settings > Developer > Edit Config inside the app.

    Why is my MCP server not showing up?

    In order of likelihood: you did not fully restart the app, the command is not on the app’s PATH (use an absolute path), or there is a JSON syntax error such as a trailing comma, a smart quote, or an unescaped Windows backslash. Check the per-server log in the Claude logs folder for the exact error.

    What does the double dash do in claude mcp add?

    The -- separates Claude Code’s own flags from the command that launches your server. Flags like --transport and --env go before it; the actual launch command (npx -y some-server) goes after it.

    What is the default scope for claude mcp add?

    local. The server loads only in the current project and stays private to you. Use --scope user to make it available in every project, or --scope project to share it with your team via a committed .mcp.json.

    Is SSE or HTTP the right transport for remote servers?

    HTTP. SSE still works but is deprecated. Use --transport http for any new remote server unless its documentation specifically requires SSE.

    How do I remove an MCP server?

    In Claude Code, run claude mcp remove NAME. In Claude Desktop, delete that server’s entry from the mcpServers object in claude_desktop_config.json and restart the app.

    Where to go next

    Once your servers connect cleanly, the leverage comes from using them well. For how this fits a daily workflow, read the AI operator’s stack; if you are choosing between coding environments, see Claude Code vs Cursor. And if you want a concrete first server to wire up, the Notion MCP setup is a clean, high-value place to start.

    Frequently Asked Questions

    What is MCP and why does it matter for Claude?

    MCP (Model Context Protocol) is an open standard that lets Claude connect to external tools, databases, APIs, and services. Without MCP, Claude can only work with text in the conversation window. With an MCP server connected, Claude can read your files, query a live database, hit an API, or control a browser — turning it from a chat interface into an autonomous agent.

    How do I add an MCP server to Claude Desktop?

    Edit the claude_desktop_config.json file (found at ~/Library/Application Support/Claude/ on Mac, %APPDATA%/Claude/ on Windows). Add your server under the ‘mcpServers’ key with a ‘command’, ‘args’, and optional ‘env’ block. Save the file and restart Claude Desktop. The server appears in Claude’s tool list if it loaded correctly.

    How do I add an MCP server to Claude Code?

    Run ‘claude mcp add [args…]’ from your terminal. For a remote SSE server use ‘claude mcp add –transport sse ‘. List configured servers with ‘claude mcp list’. Servers added this way are scoped to your user profile unless you use the –project flag.

    Why does my MCP server connect but Claude can’t see my data?

    The most common cause is scope or permission gaps. For Notion, verify your integration has been granted access to the specific pages/databases — the connection succeeds even without page access. For file servers, check that the path in your config resolves correctly from the shell Claude launches (not your interactive shell). For OAuth servers, re-authorize and confirm token scopes.

    What is the difference between stdio and SSE MCP transports?

    stdio runs the MCP server as a local subprocess — Claude launches the command you specify and communicates over stdin/stdout. This is used for local tools and CLIs. SSE (Server-Sent Events) connects to a remote HTTP server. Use stdio for local tools like filesystem access or database clients; use SSE or Streamable HTTP for cloud services or shared team servers.

    Which MCP servers should I start with?

    The highest-value first connections are: filesystem (read/write your local files), a database client for your primary DB, and a service you interact with daily (Notion, Slack, GitHub, Linear). The Notion MCP server is a clean first project — see the dedicated setup guide on this site for the exact config and common errors.


  • Claude Code vs Cursor in 2026: An Honest Comparison for Developers Who Ship

    Claude Code vs Cursor in 2026: An Honest Comparison for Developers Who Ship

    The conversation about Claude Code vs Cursor has collapsed into lazy takes: Claude Code is smarter, Cursor is friendlier, buy both. That framing is not wrong, but it isn’t useful. If you’re deciding where to put your coding tool budget in 2026, you need to know where each tool wins and loses – with specifics, not vibes.

    Here’s what a year of both tools in production actually looks like.

    The Fundamental Architecture Gap

    Claude Code is a terminal-native CLI agent. You run it with claude in your shell, point it at a codebase, give it a task, and walk away. It has no GUI. It doesn’t autocomplete as you type. What it has is the ability to autonomously execute multi-step tasks – read files, write code, run tests, iterate on failures – without you babysitting it.

    Cursor is an IDE built on VS Code. It has tab autocomplete, an inline chat panel, Agent mode for longer tasks, and a polished visual interface that feels like VS Code with a superpower grafted on. If you already live in VS Code, Cursor’s learning curve is close to zero.

    These are genuinely different tools. The “which one wins” question should really be “which one wins for what.”

    Where Claude Code Wins: Long Autonomous Runs

    The biggest measurable advantage Claude Code has right now is context. Running on Claude Opus 4.6 or 4.7, Claude Code natively supports a 1 million token context window – and that’s a first-class, supported number with no per-token surcharge for long context on the API.

    Cursor’s advertised context is lower, and it draws from multiple model backends depending on which you select. On a large monorepo task – think refactoring an auth system across 40 files – the difference between context limits is the difference between Claude Code holding the whole codebase in view and the alternative having to page through it.

    Claude Opus 4.6 scores 80.84% on SWE-bench Verified, per Anthropic’s published system card. Opus 4.7 improved on that, particularly on the hardest problems in the benchmark set, and on Rakuten-SWE-Bench (a production-task evaluation, not just GitHub issues) it resolves 3x more tasks than Opus 4.6. That is a meaningful gap.

    The autonomous-run workflow looks like this in practice:

    claude "Refactor the payment module to use the new Stripe SDK, update all tests, and make sure existing integration tests still pass"

    Claude Code will read the relevant files, identify the Stripe version mismatch, write the new implementation, run your test suite, and iterate if something fails – often without a single follow-up prompt. That same task in Cursor’s Agent mode typically requires you to approve each file write and re-prompt when the agent stalls on an error.

    Where Cursor Wins: Daily Developer Experience

    Cursor’s tab autocomplete is genuinely good. It’s not a feature Claude Code has at all – Claude Code is not an IDE and doesn’t inject suggestions while you type. If your daily workflow is: open file, write code, open file, write code, Cursor is the better tool for that rhythm.

    Cursor’s @codebase reference and file mention system is also excellent for interactive exploration. You can ask “why does this function fail on null input?” while looking at the code, and Cursor’s inline context makes that conversation fast. Claude Code can answer the same question, but you’re doing it in a terminal with no visual reference.

    For teams on an existing GitHub workflow, GitHub Copilot’s deep integration with PRs, issues, and Actions is hard to match. If your team is standardized on GitHub and your security team needs IP indemnity coverage, Copilot is the defensible enterprise choice – Claude Code and Cursor both require more procurement work.

    The Pricing Reality

    Plan Monthly Cost
    Claude Code via Claude Pro $20/month
    Claude Code via Max 5x $100/month
    Claude Code via Max 20x $200/month
    Cursor Pro $20/month
    GitHub Copilot Individual $10/month

    The entry point is the same for Claude Code (via Claude Pro) and Cursor. At that tier, Claude Code’s usage limits are more restricted. The Max 5x plan at /month is where Claude Code becomes a full autonomous-agent platform – higher rate limits, Opus access, and Claude Code usage limits that are double the Pro tier.

    For individual developers doing heavy autonomous runs, the Max 5x plan at competes directly with a Cursor Pro subscription plus meaningful API spend. For teams, the calculus shifts: Cursor’s team plan pricing is lower per seat than a premium Claude Code subscription, which matters when you’re buying for 20 developers.

    The Honest Call

    Claude Code wins on: autonomous multi-step tasks, large codebase refactors, long-running agents, raw SWE-bench performance, and 1M token context on complex jobs.

    Cursor wins on: daily IDE experience, tab autocomplete, interactive inline chat, onboarding speed for VS Code users, and team-tier pricing.

    The recommendation most senior developers are landing on in 2026 is two tools: Cursor open in the background for interactive work, Claude Code for the tasks you used to put in a Jira ticket and wait two days for. If you can only buy one and you mostly write code file-by-file, get Cursor. If your bottleneck is “I need to refactor three services and I don’t have three days,” Claude Code is the one that changes your output.

    The Max 5x plan makes that bet financially coherent for a senior developer. The Pro tier is a reasonable way to find out if autonomous coding is a workflow you actually use.

    Frequently Asked Questions

    Is Claude Code better than Cursor in 2026?

    It depends on your workflow. Claude Code is a terminal-native CLI agent best for large codebase refactors, multi-file operations, and agentic tasks run from the command line. Cursor is an IDE-first editor with inline completions and a chat sidebar — better for continuous editing with visual feedback. Most developers who ship code daily use both rather than choosing.

    What is the difference between Claude Code and Cursor?

    Claude Code is a CLI tool you run with the ‘claude’ command in your terminal — it acts as an autonomous agent that can read, edit, and run files across a codebase. Cursor is a VS Code fork with AI completions and chat built into the editor interface. Claude Code suits agentic automation; Cursor suits interactive editing.

    Can I use Claude Code and Cursor at the same time?

    Yes. Many developers run Claude Code from the terminal for large refactors or test-writing sessions while keeping Cursor open for active editing. They complement each other: Claude Code for autonomous multi-step tasks, Cursor for line-by-line interactive work.

    How much does Claude Code cost in 2026?

    Claude Code usage is billed through your Anthropic API account against whichever Claude model you select. Claude Opus 4.8 runs $5 per million input tokens and $25 per million output tokens. Claude Sonnet 4.6 runs $3/$15 per million tokens. Claude Haiku 4.5 runs $1/$5 per million tokens. Cursor’s plans start around $20/month for Pro.

    Does Cursor use Claude under the hood?

    Cursor supports multiple underlying models including Claude (Anthropic), GPT-4 (OpenAI), and others. You can select which model Cursor routes to in its settings. Claude Code, by contrast, is a dedicated Anthropic CLI tool that only runs on Anthropic’s Claude models.

    What is Claude Code best used for?

    Claude Code excels at large-scale codebase operations: refactoring across multiple files, writing comprehensive test suites, navigating unfamiliar codebases, and running agentic tasks that chain multiple steps. It is less suited for inline autocomplete as you type — Cursor is better at that.


  • Always Allow vs Allow Once: Claude Code’s Quiet Tell

    Always Allow vs Allow Once: Claude Code’s Quiet Tell

    The short version: In Claude Code, the prompt that asks whether to “Always Allow” or “Allow Once” isn’t really about security. It’s a question about your own systems. If you keep choosing Always Allow, the work is recurring — go build the automaton. If it’s honestly Allow Once, it’s a one-off — let it go instead of trying to remember it.

    I spend most of my day inside Claude Code, and a tiny piece of the interface has been living rent-free in my head. Every time the agent wants to run a command, edit a file, or hit an API, it stops and asks: Always Allow, or Allow Once?

    On the surface that’s a permission prompt. Click the box, move on. But after the hundredth time, I started to notice the choice was telling me something about how I actually work — and where I was leaving time on the table.

    “Always Allow” means: go build the automaton

    Always Allow vs Allow Once: quick reference

    Signal Always Allow Allow Once
    Task type Recurring, repeating work One-off, situational
    Right response Build an automation Let it go — don’t memorize it
    Security posture Persistent permission for that tool+action Single-use, no persistent grant
    What it reveals A system worth building An edge case not worth systemizing
    Risk if overused Broad standing permissions accumulate Missed automation opportunity

    Here’s the pattern. If I find myself reaching for Always Allow, it’s because I’ve seen this exact action before. I’ll see it again. I trust it enough to stop being asked.

    That’s not a permission decision. That’s a build order.

    If an action is safe, repeatable, and I do it constantly, the right move isn’t to keep approving it forever — it’s to take it out of the prompt entirely. Turn it into a tool. Wrap it in a script. Register it as a skill. Put it on a cron so it runs whether I’m at the desk or not. The “Always Allow” click is the moment the work earns its own piece of infrastructure.

    Most people stop at the click. They grant the permission and feel productive because the friction went away. But friction that shows up every single day isn’t friction you should approve — it’s friction you should engineer out. Every “Always Allow” is a quiet little flag waving at you: this deserves to be an automaton.

    “Allow Once” means: let it go on purpose

    The other side is just as useful, and it’s the part people get wrong.

    When the honest answer is Allow Once — this is a weird one-off, I’m not going to do it again — the temptation is to write it down. Save the command. Add it to a doc. File it away just in case it ever comes back.

    Resist that. A one-off doesn’t deserve a permanent home in your memory or your system. The cost of storing it isn’t the disk space — it’s the upkeep. Every note you keep is something you now have to organize, search past, keep current, and trip over later. Knowledge you save but rarely touch quietly rots, and stale knowledge is worse than none.

    The way I think about it: it’s more fit to sift through the dirt than to re-sift the knowledge. If a one-off ever does come back, re-deriving it from scratch is cheap — you dig through the dirt once and you’re done. But re-sifting a giant pile of “just in case” notes, over and over, every time you go looking for the thing you actually need? That’s the expensive part. Forgetting a one-off on purpose is a feature, not a failure.

    Why re-deriving usually beats remembering

    This is really a question of economics, and it’s the same math whether you’re managing an AI agent or your own head.

    Storing knowledge has two costs people forget about: the cost to keep it accurate, and the cost to find the signal inside it later. A one-off has a low chance of ever being needed again, so the expected payoff of saving it is tiny — while the drag it adds to everything else you’ve stored is real and permanent. Recurring work is the opposite: high chance of reuse, so it’s worth paying once to encode it well and never think about it again.

    So the rule of thumb falls out on its own:

    • Recurring → encode it. Build the tool, the skill, the cron. Pay once, reuse forever.
    • One-off → forget it on purpose. Do the thing, then let it go. If it ever comes back, dig it up fresh — it’ll be faster than you think.

    The mistake is doing it backwards: hand-running the recurring stuff every day because you never built the automaton, while hoarding a graveyard of one-off notes you’ll never open again. That’s how you end up busy and buried at the same time.

    How to act on the tell in Claude Code

    Next time that prompt pops up, treat it as a tiny decision point instead of a speed bump:

    1. You reached for “Always Allow.” Stop for a second. Ask: what would it take to make this prompt never appear again? An orchestration step, a saved skill, a scheduled job, a hook? Put it on the list. The prompt just told you what to build next.
    2. You reached for “Allow Once.” Do it, then genuinely drop it. Don’t screenshot it, don’t file it. Trust that if it matters, it’ll show up again — and the second sighting is your real signal to build.
    3. You’re not sure. That’s fine — “Allow Once” is the safe default. Two or three “Allow Once” clicks for the same action is the universe telling you it was an “Always Allow” the whole time.

    None of this is really about Claude Code. The tool just happens to put the decision right in front of you, every day, in a little box. Most systems make you guess where your time is leaking. This one points at it and asks you to choose. (It pairs well with knowing when to use Plan Mode and when to skip it — same instinct, a different prompt.)

    Recurring work wants to become an automaton. One-off work wants to be forgotten. The prompt already knows which is which. The only question is whether you’re listening.

    Frequently asked questions

    What’s the difference between “Always Allow” and “Allow Once” in Claude Code?

    “Allow Once” approves a single action one time; the next identical action prompts you again. “Always Allow” approves that action or pattern going forward, so Claude Code stops asking. Functionally, “Always Allow” is how you tell the tool an action is safe and routine.

    Should I use “Always Allow” in Claude Code?

    Use it when an action is safe, repeatable, and something you do often — but treat each “Always Allow” as a signal to eventually build that action into a tool, skill, hook, or scheduled job so it leaves the prompt entirely.

    Is “Always Allow” a security risk?

    It can be if you grant it to broad or destructive actions. Keep “Always Allow” for narrow, well-understood operations, and lean on “Allow Once” for anything unfamiliar, destructive, or outward-facing.

    When should I turn a Claude Code action into an automation?

    When you’ve granted — or wanted to grant — “Always Allow” for it. That’s the tell that the work is recurring, and recurring, trusted work is worth encoding once as a tool, skill, hook, or cron so you never approve it by hand again.

    Why shouldn’t I save one-off commands?

    Because storing knowledge has ongoing costs — keeping it accurate, and sifting past it to find what you actually need. A one-off has little chance of reuse, so it’s usually cheaper to re-derive it later than to maintain it forever.

    What does “more fit to sift through the dirt than to re-sift the knowledge” mean?

    It means re-deriving a rarely-needed answer from scratch — sifting the dirt once — is cheaper than maintaining and repeatedly searching a hoard of saved notes, which is re-sifting the knowledge every time. For one-offs, forgetting is the efficient choice.

    Frequently Asked Questions

    What does ‘Always Allow’ mean in Claude Code?

    When Claude Code asks to run a tool or shell command, ‘Always Allow’ grants a persistent permission for that specific tool and action combination. Claude will not ask again for that combination in future sessions. ‘Allow Once’ grants permission only for the current request — Claude will ask again next time.

    Is it safe to click Always Allow in Claude Code?

    It depends on the action. Always Allow for read operations (reading files, querying a database) is generally low risk. Always Allow for write or execute operations (editing files, running shell commands) creates persistent permissions that compound over time. The best practice is to use Always Allow deliberately for actions you will genuinely repeat, and Allow Once for anything new or situational.

    What is the deeper meaning of Always Allow vs Allow Once?

    The choice is a signal about your own workflow. If you keep clicking Always Allow for the same action, that’s the system telling you the task is recurring and worth automating. If it’s genuinely Allow Once, the task is a one-off and you shouldn’t try to systemize it. The prompt is less about security and more about recognizing patterns in your own work.

    How do I review or remove Always Allow permissions in Claude Code?

    Run ‘claude permissions list’ to see what standing permissions you’ve granted. Use ‘claude permissions reset’ to clear them, or edit the .claude/settings.json file in your project directory to remove specific entries. Review these periodically — accumulated Always Allow grants are a common source of unexpected autonomous behavior.

    Does Always Allow apply to a specific project or globally?

    By default, permissions granted with Always Allow are scoped to the project where you granted them (stored in .claude/settings.json). If you use the –global flag, they apply across all projects. Be cautious with global Always Allow grants for write/execute operations — they persist across every codebase you open.


  • The Top Claude 4.6 Prompt for React Developers This Week

    The Top Claude 4.6 Prompt for React Developers This Week

    The Top Claude 4.6 Prompt for React Developers This Week

    If you are building front-end applications, you already know that Claude 4.6 Sonnet’s context window can handle massive files. But how do you prevent the model from ‘lazy coding’ (leaving // rest of code here comments)?

    The Anti-Lazy Prompt:

    “You are a Senior Staff Engineer. Rewrite this entire React component. Under NO circumstances are you allowed to use placeholders, comments like ‘// existing code’, or brevity. You must output the entire, complete, and fully functional file from line 1 to EOF. Failure to do so will break the CI/CD pipeline.”

    Why it works: By framing the omission as a pipeline-breaking failure, Claude’s alignment training prioritizes the completion of the file over token conservation.

  • Claude Routines Is a Frankenstein Product, and That’s Why It’s Working

    Claude Routines Is a Frankenstein Product, and That’s Why It’s Working

    Anthropic shipped one feature on April 14. Nine days in, the internet has already decided it’s five different things.


    On April 14, 2026, Anthropic quietly pushed a research preview called Routines into Claude Code. The framing from their launch post is almost boring: “A routine is a Claude Code automation you configure once — including a prompt, repo, and connectors — and then run on a schedule, from an API call, or in response to an event.”

    That’s it. That’s the whole pitch. You write instructions once, Anthropic runs them on their cloud, and your laptop can be closed at the bottom of a lake for all it matters.

    Nine days later, I pulled social reactions from the first week of real usage — developers, indie hackers, ad ops people, a Polymarket trader, a guy learning piano, a Japanese solo dev running it for a week, Hamel Husain grumbling about YAML. And the thing that jumped out wasn’t the feature. It was how wildly people disagreed about what Routines even is.

    Is it an n8n killer? A cron replacement? An enterprise procurement play? A way to avoid buying a Mac Mini? A vibes machine for autonomous trading bots? A broken MCP detector?

    Yes. All of those. At the same time. That’s the story.


    The five Routines

    Here’s what Routines looks like, depending on who’s holding it.

    To the production automation crowd, it’s a toy. Alex Vacca (@itsalexvacca) wrote the most viewed thread in the launch window — 28,000+ views, 283 replies — and it was a full-throated defense of n8n. His agency runs 13 workflows, 2,000+ executions per day, 41 nodes in one pipeline alone. Monthly n8n bill: $384. “The same workloads on Claude would cost $60K,” he wrote. “That’s why I’m not buying the ‘Claude killed n8n’ take. They’re not the same layer.”

    He’s right. If you’re firing thousands of deterministic executions a day through a visual graph with tight error handling, Routines at 5-to-25 runs per day on included tiers isn’t even in the conversation. You’ll eat your Extra Usage budget by noon Tuesday.

    To the indie hacker crowd, it’s liberation. Aman Kumar (@Amank1412) summed up the mood in two lines and a video: “Claude Routines automatically run at a schedule without keeping your laptop open. Those who spent $599 on a Mac Mini.” A Spanish developer (@anthonysurfermx) is moving his OpenClaw logic off Digital Ocean: “me quito 30 USD mensuales.” A Japanese developer (@KameAIHacks) reported back after a full week: nightly test runs, auto PR reviews, weekly dependency scans — “個人開発者のメンテナンス作業がほぼゼロになった.” Maintenance work as a solo dev dropped to nearly zero.

    These people aren’t trying to replace n8n. They’re trying to not-own a server. The unlock isn’t workflow power. It’s that you can delete a piece of infrastructure from your life.

    To the enterprise crowd, it’s a land grab. The sharpest observation came from @grapeot, writing in Chinese: “Claude Routines 每个是独立 API endpoint 带 bearer token,独立配额独立计价,配套 SSH 让 agent 跑在企业内网。它服务的是把 agent 写进采购合同的企业.” Translation: every routine is a separate API endpoint with its own auth token, its own quota, its own billing line, and SSH support for running agents inside corporate networks. This is Anthropic saying “put this in your procurement contract.” It’s not a consumer feature dressed up. It’s enterprise infrastructure wearing consumer clothes.

    To the crypto crowd, it’s a printing press. @regent0x_ shared a story about a Polymarket trader who connected Routines to price feeds via API trigger. Price moves 4%, Claude wakes up, analyzes news, checks sentiment, decides whether to alert or auto-execute. “Laptop hasn’t been open in a week… $23k profit last month… total costs: $5/mo webhook + $87 in API calls… net profit margin: 99.6%.” Asked what he did with the free time: “learning piano.”

    This is the quote that’s going to outlive the launch. Not because it’s representative — it absolutely isn’t — but because it’s the Platonic ideal of what cloud agents are supposed to feel like when they work. Research, reason, act, report. Go practice Chopin.

    To Hamel Husain, it’s just YAML. The machine learning veteran (@HamelHusain) tried Routines and walked away: “I found it to be far better to use GitHub Actions. I have more control with GHA, secret management, etc. Claude is really good at writing all the yaml and iterating until it works on its own too. Wild times that I’m saying I like GitHub Actions LOL.”

    If you already live in GHA, Routines isn’t offering you anything you don’t already have — except the novelty of a natural-language wrapper, which costs you control.


    The broken pieces nobody’s hiding

    A feature isn’t real until it breaks, and Routines is breaking in public. @ghuubear tried it on day 9 and reported his MCP connectors weren’t detected at all: “anthropic is shipping broken products.” @ahmetb couldn’t get GitHub PR-open triggers to fire: “not working at all.” Rich Baldry (@chooserich), who’s spent “countless hours with Codex Automations, Claude Routines, OpenClaw,” landed on a phrase that’s going to stick: “unreliable magic machines.”

    His follow-up is the real critique, and it’s the one Anthropic needs to answer: “building software with the new agentic coding tools for the same tasks is vastly more reliable.” In other words — use Claude to write a real cron job, not to be the cron job.

    That’s a serious challenge. When the alternative to your cloud agent is “use your cloud agent to write the non-agent version instead,” you’ve built a very fancy bootstrap.


    The pricing question nobody’s settled

    Pro gets 5 routine runs per day. Max ($100 and $200) gets 15. Team and Enterprise get 25. After that, overages bill against Extra Usage at standard API rates.

    The Japanese dev community did the cleanest math: “Proプランだと1日5回まで。個人開発なら十分だけど、3つ以上のRoutineを毎日回したい場合はMaxプランが必要.” Five runs a day is fine for one or two scheduled jobs. Want three or more running daily? Plan up.

    That’s the dividing line, and it tells you exactly who the feature is actually priced for. It is not priced for the n8n crowd. It’s priced for the solo dev with two or three background jobs, or the enterprise buyer who doesn’t look at the line item. The middle — the agency with a dozen automations but no enterprise contract — is the exact spot where Extra Usage starts to sting.

    My Routines counter reads 0/15. I also have $250 in Extra Usage sitting in my account. I can tell you exactly where that money would go if I got careless with triggers: nowhere good.


    What I actually think

    I run a WordPress content network, a Notion command center, a few GCP projects, and enough scheduled tasks in Cowork to keep my desktop busy. I asked myself the honest question before writing this: do I need Routines?

    Answer: not yet. My laptop stays on. My scheduled tasks fire. If one misses because my wifi blinked, I run it the next morning and nothing dies. I’m not a Polymarket trader. I’m not running a procurement contract. I’m not trying to delete a Mac Mini I never bought.

    But the gap in Cowork is real, and the community surfaced it without meaning to. Right now, scheduled tasks in Cowork run on your machine. Routines run in the cloud. Nothing connects them. If you tag a task critical in Cowork and your laptop is asleep, the task just doesn’t fire. The obvious product move — one I’d expect Anthropic to ship in the next two quarters — is a failover flag: “if this task can’t run locally, escalate to a routine.” That closes the loop. Until it exists, you have to pick a side.


    The Frankenstein is the feature

    Here’s the thing about products that mean five different things at once: usually that’s a sign of a broken launch. Wrong messaging, wrong audience, wrong pricing. “Nobody knows what it is.”

    Routines is the opposite. Every one of those five readings is correct. It IS a toy next to n8n. It IS liberation from a VPS. It IS an enterprise procurement play. It IS a crypto printing press, sometimes. It IS broken in specific places. The Frankenstein isn’t a bug in the positioning. It’s a feature of cloud-hosted agents actually arriving in more than one market at the same time.

    The indie dev and the enterprise buyer are holding the same product and seeing different things because they are different things, lit from different angles. That’s what a platform primitive looks like in its first week.

    The Mac Mini guys get it. The n8n operators get it too — they’re just looking at a different body part.

    As for me: I’m keeping my counter at 0/15 for now. But I’m watching, because the moment Anthropic ships that failover flag between Cowork and Routines, the conversation changes, and the Frankenstein grows another limb.

    Learning piano is probably a stretch.


    Sources: Introducing Routines in Claude Code (claude.com/blog, April 14, 2026); Claude Code Routines documentation (code.claude.com/docs/en/routines); social reactions pulled from X/Twitter, April 14–23, 2026. All quotes used with attribution to their original posters.

  • Claude Orchestrates, Gemini Executes: A Multi-CLI Production Run

    Claude Orchestrates, Gemini Executes: A Multi-CLI Production Run

    The Architecture of Delegation: Moving Beyond the Chat Interface

    I spent today wiring Claude Code to boss around the Gemini CLI, clearing a 1,256-post WordPress tagging backlog without a single hallucinated tag. If you operate an agency or manage technical strategy at any reasonable scale, you already know the fundamental truth about current AI tools: the chat interface is a massive bottleneck. Copying, pasting, and waiting for a typing animation isn’t a workflow; it’s theater. Real, scalable throughput requires system-to-system communication and architectural delegation.

    The goal for today wasn’t just to write a python script. The goal was to establish a functional hierarchy between two distinct AI systems operating locally on my machine. Claude Code, operating directly in my terminal, would act as the lead engineer and orchestrator. It would handle the logic, map out the API calls, write the Python bridges, and manage the error handling. Gemini, accessed via its official command-line interface, would act as the high-context, high-throughput worker.

    The setup was brutally simple but effective. I installed the Gemini CLI using a standard node package manager command (npm install -g @google/gemini-cli) and authenticated it with a Google One AI Ultra account. This gave my local environment direct, command-line access to Google’s most capable models without needing to manage raw API keys or custom curl requests. From there, Claude Code was instructed to shell out via bash, calling the gemini command non-interactively to pass massive data payloads for processing, and then ingesting the structured output back into the orchestration pipeline.

    It is an assembly line in the truest sense. Claude builds the machinery and defines the parameters; Gemini operates the heavy press, stamping out classifications at a volume that would break a standard chat context window.

    Quantifying the Backlog and the Taxonomy Threat

    Before you throw compute at a problem, you have to measure it accurately. I directed Claude to run a full audit of tygartmedia.com using the native WordPress REST API. The numbers came back clean, but the scale of the maintenance debt was daunting.

    • Total published posts: 2,529 individual pieces of content.
    • SEO infrastructure: RankMath confirmed healthy and active across the board.
    • Existing tag vocabulary: 931 distinct, strategically established tags.
    • The deficit: 1,256 posts sitting entirely untagged, orphaned from the site’s primary taxonomy.

    In the past, solving this was a lose-lose proposition. It was either a job for a junior employee spending three agonizing weeks in the wp-admin panel, or it was a job for a messy automated script that inevitably hallucinates a thousand new, slightly misspelled tags. When you let an LLM tag 1,256 posts without strict, physical constraints, you don’t get an organized site. You get “Marketing”, “marketing”, “digital-marketing”, and “Digital Marketing Strategy” added as four completely separate taxonomy terms, permanently bloating your wp_terms table and diluting your internal link equity.

    The constraint I set for this pipeline was absolute. The system had to read the 1,256 untagged posts, assign 5 to 8 highly relevant tags to each post, and only use tags from the exact 931-item vocabulary we already had. Zero deviation. Zero hallucination. If a perfect tag didn’t exist in the vocabulary, the system had to settle for the closest existing match rather than inventing a new one.

    The Pilot Test and the Strict JSON Constraint

    We started small to validate the pipeline. Claude pulled a pilot batch of 10 untagged posts from the WordPress API, along with the complete, raw list of 931 acceptable tags. It packaged this massive block of text into a single, dense prompt and fired it over to the Gemini CLI.

    The instruction was clear and unforgiving: read the text of the posts, evaluate them against the vocabulary, and return ONLY a valid JSON object. I did not want markdown formatting. I did not want a polite introductory sentence. I needed a raw JSON string mapping each specific post_id to an array of its assigned tag IDs.

    If you’ve spent any significant time wrestling with large language models, you know that asking for strict adherence to a vocabulary and strict, unformatted JSON output is exactly where things usually break down. Models inherently want to chat. They want to explain their reasoning. They want to invent a 932nd tag because it felt slightly more semantically accurate for a specific paragraph.

    Gemini didn’t flinch. It processed the prompt and returned a raw, perfectly formatted JSON string directly to the standard output. Claude parsed it in memory, validated the suggested tags against the local vocabulary list, and found a 100% match rate. Every single tag suggested by Gemini was real. There was no conversational filler, no missing structural brackets, and no invented taxonomy. Claude immediately took that JSON, formatted the correct POST requests, and pushed the updates back to WordPress via the REST API.

    Scaling Up: Hitting the Windows Bottlenecks

    With the pilot completely successful, it was time to scale. Processing 1,256 posts one by one is inefficient, both in terms of time and system calls. We grouped the remaining posts into chunks of 25. This meant Claude would need to loop through roughly 50 distinct batches. For each batch, it would dynamically construct the prompt with the 931 tags and the 25 new post payloads, call Gemini, parse the resulting JSON, and patch the WordPress database.

    That is where the friction started. Building a local orchestration pipeline means you are no longer just dealing with AI limitations; you are dealing with local OS limits. Windows had two specific, technical walls waiting for us.

    Failure 1: WinError 2 (File Not Found)
    The initial Python orchestration script used the standard subprocess.run(['gemini', '-p', prompt]) command to invoke the CLI. It failed almost immediately with a WinError 2. The issue? When npm installs global packages on a Windows machine, it doesn’t create a raw binary; it creates a .cmd wrapper. Python’s subprocess module doesn’t automatically resolve these wrappers unless you pass shell=True, which introduces a host of security and string parsing headaches. The clean, robust fix was forcing Claude to locate the executable and use the absolute, fully qualified path to gemini.cmd in the subprocess call. It’s a minor detail, but one that breaks entire automation pipelines if you don’t know what you’re looking at.

    Failure 2: “The command line is too long”
    Once the executable actually resolved, the script crashed again on the very first batch. Windows threw a fatal error: “The command line is too long.” Windows enforces a strict character limit on command-line arguments—roughly 8,191 characters depending on the exact environment. Our dynamically generated prompt, containing the full text of 25 blog posts and 931 taxonomy terms, hovered around 20KB. Trying to pass that payload via the standard -p argument flag was physically impossible for the operating system to handle.

    The solution was architectural. Instead of trying to cram the prompt into an argument, Claude rewrote the Python script to pipe the prompt directly into Gemini’s standard input (stdin). By restructuring the workflow to write the 20KB payload to a temporary text file on disk, and then piping it via a standard input redirect (gemini < prompt.txt), we bypassed the OS argument limit entirely. The data flowed, and the pipeline spun back up to full speed.

    The Verdict: The Orchestrator vs. The Worker

    Watching this script hum through 50 consecutive batches crystalized a specific, actionable opinion about the current state of local agentic workflows. You do not need one god-model to do everything; you need specialized roles operating within a hierarchy.

    Claude Code is unmatched as an orchestrator. It understands the local filesystem, it navigates REST API documentation with ease, it writes robust, defensive Python, and it can dynamically debug Windows-specific OS errors on the fly. But using Claude for the repetitive, high-volume, token-heavy classification of thousands of posts is an expensive and slow use of a strategic brain. It is the equivalent of having your lead architect nailing drywall.

    Gemini, operating locally via its CLI, proved to be the ultimate high-throughput worker. It absorbed the massive context window of 931 tags and 25 full articles simultaneously, over and over again, without degrading in quality. It maintained absolute discipline over the JSON output structure across 50 separate invocations. It didn’t need to understand how the WordPress API worked, and it didn’t need to know how to write Python. It only needed to process the classification task it was handed and get out of the way.

    When Gemini acts as the worker and Claude acts as the boss, you get the absolute best of both architectures. You get the system-level problem-solving and environmental awareness of Claude, combined with the raw, reliable, high-context processing power of Gemini.

    Tomorrow’s Takeaway

    If you operate an agency and have a massive backlog of unstructured data—whether it is untagged content, uncategorized financial transactions, or messy CRM records—stop trying to fix it manually inside a browser window. The chat interface is dead for real, scalable work.

    Tomorrow, install an agentic CLI like Claude Code. Give it access to a high-context execution model via a secondary CLI, like Gemini. Tell the orchestrator to write a local script that batches your data, hands the batches to the execution model, forces a strict, structured JSON return, and posts the results directly back to your database or CMS. Expect the script to break on local OS limits. Fix the pipes, use standard input instead of arguments for massive payloads, and let the machines clear the backlog while you focus on actual strategy.