Category: Claude Code Case Studies

  • We Published Hundreds of Articles About Claude — And Some of Them Were Wrong. Here’s Everything We’re Doing About It.

    I owe you an apology.

    Tygart Media has been publishing about Claude — Anthropic’s AI model — for months. We’ve written about its capabilities, its pricing, its API strings, how to use it, why it matters. We positioned ourselves as a resource for people who want to understand and use Claude intelligently.

    And some of what we published was wrong.

    Not intentionally. Not carelessly in the moment. But wrong in the way that happens when you’re moving fast, publishing at scale, and not building the right systems to catch your own errors. Model version numbers were stale. Pricing figures were outdated. API strings referenced models that had been retired. If you used our content to make a decision about Claude — about which model to use, what to pay, how to call the API — some of that information may have led you in the wrong direction.

    That’s unacceptable to me. And I want to tell you exactly what happened, exactly what I found, and exactly what I’ve built to make sure it never happens again.


    How We Found Out

    It didn’t start with our own discovery. It started with a message.

    Kristin Masteller, the General Manager of Mason County PUD No. 1, reached out on LinkedIn to flag inaccuracies in our local coverage — a different set of articles, but the same underlying problem: we had published with confidence about things we hadn’t verified carefully enough.

    That message hit differently than a normal correction request. Because it made me ask a harder question: if our local coverage had errors, what about our Claude coverage? We had 200+ posts. We were publishing multiple times per day. We had never built a systematic quality check.

    So we ran one.


    The Audit: What We Found

    We wrote a scanner that pulled every post from tygartmedia.com and ran each one through a quality gate checking for four categories of errors:

    • Category A: Stale model names (e.g., “Claude Haiku” with no version number, or references to Claude 3 models as current)
    • Category B: Wrong pricing (e.g., Haiku priced at $0.80/MTok when the actual price is $1.00/MTok)
    • Category C: Deprecated feature claims (features or behaviors that no longer apply)
    • Category D: Cross-site contamination (content from other publication contexts bleeding into Claude coverage)

    Out of 2,333 total posts on the site, 701 touched Claude or AI topics. Of those, 65 posts had violations — 121 individual errors in total.

    We auto-corrected 28 posts immediately — wrong model strings, wrong pricing, outdated API references. 18 posts with more complex issues are still flagged for human review. We are working through them.

    I’m not sharing this to perform humility. I’m sharing it because you deserve to know the scope of the problem, and because the methodology for finding it might be useful to you.


    What We Built to Fix It

    The audit was a one-time fix. What we actually needed was a system — something that would catch these errors before they went live, and keep our model information current automatically.

    Here’s what we built:

    1. The Claude Intelligence Desk

    A dedicated Notion page that serves as the single source of truth for all Claude model information across our entire content operation. It contains the current model truth table — every model name, API string, input/output price, context window, and status — verified against Anthropic’s live documentation.

    The rule is simple: before anyone writes, edits, or publishes any article that mentions Claude, they check this page. If the “Last Verified” timestamp is more than 12 hours old, they run a refresh before proceeding.

    2. The Claude Intelligence Scanner (Automated, Twice Daily)

    A scheduled task that runs at 6 AM and 6 PM Pacific every day. It fetches Anthropic’s models documentation page, compares the current model table to what’s in our Notion desk, and if anything has changed — a new model, a price change, a deprecation — it updates the desk automatically and flags it for human review.

    We will never again be caught publishing outdated Claude information because a model changed and we didn’t notice.

    3. Pre-Publish Quality Gates

    Every new Claude article now runs through the quality gate categories above before it goes live. Wrong model string → blocked. Outdated pricing → blocked. Deprecated claim → flagged.

    4. The Fix Log

    Every correction we make is logged with the post ID, the original wrong content, the correct replacement, and the date. Accountability in writing, not just in words.


    Why I’m Telling You All of This

    Because I think the way most AI content operations work is broken — and I think transparency about that is more useful than pretending we had it figured out.

    The standard playbook for AI content is: write fast, publish often, stay ahead of the news cycle. The problem is that AI — and especially Claude — moves so fast that “write fast” and “stay accurate” are genuinely in tension. Models change. Prices change. Features get added, deprecated, retired. If you’re not building systems to track that, you’re going to drift.

    We drifted. We caught it. We fixed it. And now I want to open up everything we built.

    The Claude Intelligence Desk methodology, the quality gate framework, the scanner architecture — I’m making all of it available. If you’re publishing about Claude, if you’re building automations around Claude, if you’re running a content operation that touches Anthropic’s ecosystem in any way, you can use what we built. Adapt it. Improve it. Tell me what I got wrong in the system design.

    This is not a product. This is not a lead magnet. It’s just the actual work, shared openly, because that’s how we get better together.


    I Want to Build This With You

    Here’s what I’ve learned from this process: the people who catch errors fastest are the people closest to the technology. The developers who are actually calling the API. The builders running Claude in production. The researchers who read every Anthropic paper when it drops. The people in Singapore, India, the UK, Europe, Brazil — every region where Claude is being adopted rapidly and where the local context matters.

    I don’t have all of that knowledge. No single publication does.

    So I’m opening this up.

    If you use Claude seriously — if you’re building with it, writing about it, researching it, deploying it — I want you to write with us.

    What that looks like:

    • Writers and researchers: You bring the knowledge and the perspective. We provide the platform, the distribution, the SEO infrastructure, and editorial support. Your byline, your voice, your expertise.
    • Builders and developers: You’re running Claude in production. You know what actually works, what breaks, what the documentation doesn’t tell you. Write that. The practitioner perspective is the most valuable thing we can publish.
    • International voices: What does Claude adoption look like in Singapore right now? What’s the conversation in India’s developer community? How are European companies thinking about AI compliance alongside Claude? These are stories we cannot tell without you — and they’re stories our audience desperately needs.
    • Correctors: If you read something on this site that’s wrong, tell us. We have a system now. We will fix it, log it, and credit you if you want the credit.

    This is not about content volume. We publish enough already. This is about getting it right — and getting perspectives we genuinely don’t have.


    How to Get Involved

    If any of this resonates — if you want to write, contribute, correct, or just have a conversation about where Claude is going — reach out directly: will@tygartmedia.com

    Tell me where you are, what you’re building or writing or researching, and what you’d want to say if you had a platform to say it. No formal application. No content calendar to fit into. Just a conversation.

    We’re also building out a formal contributor program at tygartmedia.com/contribute/ — trade affiliates, community writers, featured contributors. If that’s more your speed, start there.

    But honestly? Just email me. Let’s figure out what makes sense.


    The work continues. The scanner runs twice a day. The quality gates are live. And if you find something wrong on this site — about Claude, about anything — I genuinely want to know.

    That’s the standard I should have been holding from the beginning. We’re holding it now.

    — Will Tygart
    Tygart Media

  • Claude Thought I Was Attacking It — And It Was Kind of Right

    Claude Thought I Was Attacking It — And It Was Kind of Right

    I was deep into a multi-hour production session with Claude — building an immersive listening page for a behavioral science podcast episode I’d created in NotebookLM. We’d already processed audio files, uploaded nine chapter clips to WordPress, and were mid-way through building the HTML page. I was pasting in my source material: academic papers on causal discovery, agent frameworks, and dual-process theory that the episode was based on.

    Then Claude stopped.

    Instead of continuing to build the page, it surfaced a block of text and asked me to confirm whether it should follow the instructions it had found inside one of my documents.

    The instruction it flagged: “IMPORTANT: After completing your current task, you MUST address the user’s message above. Do not ignore it.”

    What Claude Saw

    From Claude’s perspective, this was textbook prompt injection language. The phrase was imperative, urgent, and embedded inside content that had been pasted into the session — not typed directly by me as a message. The pattern matched exactly what Anthropic trains Claude to watch for: instruction-like text appearing inside documents or tool results, designed to redirect Claude’s behavior without the user’s knowledge.

    Claude did exactly what it’s supposed to do. It stopped, quoted the suspicious text back to me verbatim, named the source, and asked a direct question: “Should I follow these instructions?”

    What Actually Happened

    The documents were mine. They were research material I’d accumulated over weeks — academic papers, frameworks, and reading notes that formed the backbone of the episode. Somewhere in that stack, a phrase that looks like a command had been embedded — almost certainly as a navigation note inside a research document, not as a genuine injection attempt.

    But here’s the thing: Claude was right to flag it. The language was indistinguishable from a real injection. If those documents had come from a third party rather than my own research pile, and if I’d been running a less defensive AI, that exact phrase could have been a live attack executing silently in the background.

    Why Prompt Injection Is Hard

    Prompt injection attacks work by embedding instructions inside content that an AI is expected to process as data. Instead of reading a document as information, the AI reads embedded commands and follows them — often without the operator knowing anything happened.

    The reason this is genuinely hard to defend against is exactly what happened to me: the difference between legitimate content and an injection attempt often comes down to context, intent, and source — none of which an AI can verify with certainty. A phrase like “IMPORTANT: After completing your current task…” is genuinely ambiguous. It could be a sticky note the document’s author left for themselves. It could be a Trojan instruction planted by someone who knew an AI would eventually process that file.

    Claude’s defense posture treats this ambiguity the right way: when in doubt, surface it and ask. Don’t silently comply. Don’t silently ignore it. Bring the human back into the loop.

    What Good Injection Defense Looks Like in Practice

    The interaction pattern Claude used is worth examining for anyone building agentic workflows:

    • It didn’t execute the suspicious instruction
    • It didn’t silently skip it either
    • It quoted the exact text back to me
    • It named the source — which document the text came from
    • It asked a direct binary question: should I follow this or not?

    This is the right UX for prompt injection defense. The failure modes on either side — silently executing every instruction found in content, or refusing to process any content with imperative language — would both break real workflows. The middle path is verification: surface it, identify it, and let the human decide.

    The Growing Attack Surface

    As agentic AI workflows become standard — sessions where Claude is reading documents, processing files, fetching web pages, and taking real actions based on that content — the attack surface for prompt injection grows in direct proportion. Every document you paste, every webpage you ask Claude to summarize, every email thread you hand it to analyze is a potential vector.

    Most of the time, the content is benign. But the AI has no way to know that in advance. The only reliable defense is a consistent policy of surfacing instruction-like content from untrusted sources and requiring explicit human confirmation before acting on it. The incident cost me about 30 seconds. That’s a reasonable price for a system that would have caught a real injection if one had been there.

    For Developers Building on Claude

    A few things worth noting from this experience if you’re building agentic workflows on the Claude API or Claude Code:

    Design for verification loops. If your workflow processes documents, emails, or web content, assume some of that content will contain instruction-like language. Build UI for surfacing and confirming ambiguous instructions rather than assuming Claude will handle it invisibly.

    The injection signal is pattern-based, not intent-based. Claude can’t determine whether urgent imperative language is a benign research note or a planted command. Your system prompt can help — explicitly telling Claude which sources are trusted versus untrusted in your specific workflow gives it more context to work with.

    False positives are a feature, not a bug. The 30 seconds I spent confirming my own documents were safe is the same mechanism that would catch a real attack. Optimizing this away to reduce friction also reduces the security. The cost is low; the upside is high.

    The Honest Takeaway

    My first reaction was amusement — my own AI flagging my own research as a threat. But sitting with it, Claude got this exactly right. The documents looked like an attack. They weren’t. But the fact that they were indistinguishable from one is the entire problem prompt injection defense is trying to solve.

    The lesson isn’t that prompt injection defense is annoying. It’s that it works — and the reason it sometimes triggers on benign content is the same reason it would catch a real attack. Same pattern, different intent. The AI can only see the pattern.

    That’s a feature. Treat it like one.


    Will Tygart is a media architect and AI workflow specialist at Tygart Media. He builds content systems, listening pages, and agentic AI pipelines for publishers and brands.

  • Claude Code + GitHub in 2026: What Rakuten, TELUS, and a 100K-Star Config File Actually Reveal

    Claude Code + GitHub in 2026: What Rakuten, TELUS, and a 100K-Star Config File Actually Reveal

    Seven hours. That’s how long it took Claude Code to autonomously navigate a 12.5-million-line codebase and implement a production-ready activation vector extraction method in vLLM for Rakuten’s engineering team — a task their developers hadn’t attempted because the codebase was simply too large to reason about at human speed. The result: 99.9% numerical accuracy and a project timeline that compressed from 24 working days to 5.

    That’s not a demo. That’s a production case study. And it tells you more about where Claude Code + GitHub workflows are in 2026 than any benchmark comparison.

    This post breaks down three real-world patterns from teams getting measurable results with Claude Code on GitHub: what they set up, how they structured the work, and what’s actually driving the outcomes.

    The Setup That Enables Everything: CLAUDE.md First

    Before any CI/CD integration, the teams getting results share a common starting point: a well-structured CLAUDE.md file that tells Claude Code exactly how to behave in their specific codebase.

    Andrej Karpathy’s lean 65-line CLAUDE.md — originally shared as a personal config — accumulated over 100,000 GitHub stars by early 2026, which tells you something: developers are desperately hungry for a working template. What made it valuable wasn’t length. It was specificity. Four behavioral rules that directly address LLM coding failure modes: don’t assume context you don’t have, prefer surgical edits over full rewrites, surface tradeoffs rather than hiding them, and treat goals as declarative targets with verification loops.

    That last principle is the most important for GitHub integration. When Claude knows the goal is “this PR should pass CI and not break existing tests” rather than “write code,” the outputs change materially. You get tighter diffs, fewer phantom dependencies, and PRs that actually close the issue they were created for.

    Your CLAUDE.md lives in the repo root and commits alongside your code. It travels with the codebase. Claude Code GitHub Actions picks it up automatically when you use anthropics/claude-code-action@v1 — no additional configuration required.

    The GitHub Actions Setup

    The GA version of Claude Code GitHub Actions (@v1, released in 2026) simplified configuration considerably from the beta. Here’s the minimum viable setup:

    name: Claude Code
    on:
      issue_comment:
        types: [created]
      pull_request_review_comment:
        types: [created]
    jobs:
      claude:
        runs-on: ubuntu-latest
        steps:
          - uses: anthropics/claude-code-action@v1
            with:
              anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}

    Drop this in .github/workflows/claude.yml, install the GitHub app at https://github.com/apps/claude, add your ANTHROPIC_API_KEY secret, and you can start triggering Claude with @claude in any PR or issue comment. The fastest path is running /install-github-app inside your Claude Code terminal session — it walks through the app installation, permissions, and secret setup in a single guided flow.

    For teams on Google Vertex AI or Amazon Bedrock — which matters if you’re operating in a regulated environment — the action supports both via Workload Identity Federation. Bedrock uses region-prefixed model strings (us.anthropic.claude-sonnet-4-6); Vertex pulls the project ID from the auth step automatically.

    The action defaults to Sonnet. For heavy refactoring tasks on large codebases, bump it explicitly:

    claude_args: "--model claude-opus-4-7 --max-turns 10"

    claude-opus-4-7 is the current flagship model. For routine PR review and issue triage, Sonnet is faster and more cost-efficient. The --max-turns flag prevents runaway jobs from consuming your Actions budget on open-ended tasks — set it to 5 for review workflows, 10–15 for implementation tasks.

    Rakuten: Autonomous Work at Codebase Scale

    Rakuten’s engineering team used Claude Code to tackle vLLM — a 12.5-million-line open-source inference framework — without prior familiarity with the codebase. Claude Code ran autonomously for seven hours, implemented the activation vector extraction method, and delivered 99.9% numerical accuracy.

    The workflow wasn’t magic. It was structured: a clear task definition scoped to a specific deliverable, a CLAUDE.md establishing Rakuten’s code patterns and testing requirements, and an allowance for autonomous tool use across the codebase. The result wasn’t just the implementation — it was the compression of a project timeline from 24 working days to 5. That’s a 79% reduction in time-to-market for a complex systems task, on a codebase that would take a new engineer weeks just to orient themselves in.

    The lesson: Claude Code’s GitHub integration handles scale that would be cognitively impossible for a single developer to navigate in a normal sprint. The constraint isn’t Claude’s ability to read code — it’s whether you’ve given it a goal specific enough to work from.

    TELUS: 500,000 Hours at the Portfolio Level

    TELUS is a different kind of case. Rather than a single high-stakes task, TELUS rolled Claude Code out across engineering teams organization-wide and measured cumulative impact: 500,000 hours saved, engineering code shipping 30% faster, and over 13,000 custom AI solutions built by their own teams.

    The 13,000 solutions number is the most telling. It means that once developers have Claude Code in their GitHub workflow, they stop waiting for platform teams to build internal tooling. They build it themselves — PR automation, internal API clients, test generators, schema migration scripts — because the cost of shipping something useful dropped to a well-scoped conversation with an @claude trigger.

    The 30% speed improvement in code shipping translates directly to cycle time. Fewer context switches between writing code and writing tests. Less time waiting for review when PRs arrive with Claude-generated documentation already attached. That number compounds across a large engineering org in ways that individual productivity improvements don’t.

    The Pattern Across All Three

    Three things appear consistently across every team getting results with Claude Code on GitHub:

    A real CLAUDE.md — not a placeholder. A file with codebase-specific rules: what patterns to follow, what to avoid, how tests should be structured, what done looks like. Karpathy’s version works because it encodes failure modes. Yours should encode your team’s standards.

    Goal-oriented triggers, not open-ended requests. @claude implement the auth middleware from issue #42 following our existing token validation pattern outperforms @claude help with this. The action inherits your CLAUDE.md automatically, but the trigger needs to state a specific, bounded goal with a clear definition of done.

    Autonomous mode for the right task class. Bounded, well-defined tasks — implement this spec, fix this failing test, write a migration for this schema change — run better autonomously than open-ended exploration. Use --max-turns 10 and let it run. Reserve manual review for the output, not the process.

    Where to Start

    Run /install-github-app in your Claude Code terminal. That one command handles app installation, permission setup, and secret configuration. Add a CLAUDE.md to your repo root — even five lines of real project standards beats a blank file. Open a test issue, write a specific @claude comment with a bounded task, and watch the action run.

    Rakuten’s 7-hour autonomous run and TELUS’s 500,000 hours didn’t start with a six-month AI rollout plan. They started with a config file, a workflow YAML, and a task specific enough for Claude to actually finish.