Tag: Claude Vision

  • Claude Opus 4.7: 3× Vision Resolution, Task Budgets, and the xhigh Effort Level Explained

    Claude Opus 4.7: 3× Vision Resolution, Task Budgets, and the xhigh Effort Level Explained

    Last refreshed: May 15, 2026

    Model Accuracy Note — Updated May 2026

    Current flagship: Claude Opus 4.7 (claude-opus-4-7). Current models: Opus 4.7 · Sonnet 4.6 · Haiku 4.5. Claude Opus 4.7 referenced in this article has been superseded. See current model tracker →

    Anthropic released Claude Opus 4.7 on April 16, 2026, alongside an update to Claude Haiku 4.5. The release is headlined by a 3× improvement in vision resolution, but the more operationally significant additions are task budgets and the new xhigh effort level — both of which change how developers can dial Claude’s reasoning intensity for compute-sensitive workflows.

    Vision Resolution: What 3× Actually Means

    Claude Opus 4.7 processes images at three times the resolution of its predecessor. In practice, this means documents with dense text, screenshots of complex interfaces, detailed charts and diagrams, and high-resolution photography are now meaningfully more legible to the model. Tasks that previously required cropping or pre-processing images to help Claude read fine details should now work with the original image.

    For enterprise use cases — contract review from scanned PDFs, financial statement analysis from images, medical imaging workflows, engineering diagram interpretation — the resolution improvement is not incremental. It crosses a threshold where image-based document processing becomes reliably useful rather than occasionally accurate.

    Task Budgets

    Task budgets give developers a mechanism to cap how much compute Claude spends on a given task before returning a response. This is the missing lever that has made Claude’s extended thinking mode difficult to use predictably in production. Without a budget ceiling, extended thinking tasks could run arbitrarily long and cost arbitrarily much. With task budgets, you can set a ceiling and get a best-effort response within that constraint rather than an open-ended spend.

    The practical implication is that extended thinking becomes viable in latency-sensitive or cost-sensitive production contexts that previously had to avoid it entirely. A customer-facing workflow that needs a thoughtful answer but can’t wait indefinitely can now specify a budget and get a response calibrated to that constraint.

    The xhigh Effort Level

    Alongside the existing effort levels, Opus 4.7 introduces xhigh — an above-maximum reasoning intensity setting intended for tasks where accuracy justifies extended compute time regardless of cost. Research tasks, complex multi-step reasoning chains, high-stakes analysis where a wrong answer is costly — these are the intended use cases.

    xhigh pairs naturally with task budgets: use xhigh to get the most thorough reasoning Claude can produce, and use a task budget to define the ceiling on how long it runs. Together they give developers precision control over the quality/cost/latency trade-off that was previously binary (extended thinking on or off).

    Pricing: Unchanged from 4.6

    Opus 4.7 maintains the same pricing as Claude Opus 4.7: $5 per million input tokens and $25 per million output tokens. For teams currently on Opus 4.6, this is an unambiguous upgrade — better vision, task budgets, and xhigh effort at the same cost. The Haiku 4.5 update released alongside it carries the same pricing-unchanged pattern.

    Deprecation note: Claude Haiku 3 was retired on April 19. Teams still on Haiku 3 should have already migrated — if not, that’s an urgent action item.

    Source: Anthropic — Claude Opus 4.7 Release

  • Claude Opus 4.7: Everything New in Anthropic’s Latest Flagship Model

    Claude Opus 4.7: Everything New in Anthropic’s Latest Flagship Model

    Last refreshed: May 15, 2026

    Model Accuracy Note — Updated May 2026

    Current flagship: Claude Opus 4.7 (claude-opus-4-7). Current models: Opus 4.7 · Sonnet 4.6 · Haiku 4.5. Claude Opus 4.7 (claude-opus-4-7) is the current flagship as of April 16, 2026. Where this article references Opus 4.6 or earlier models, those references are historical. See current model tracker →. See current model tracker →

    The short version

    Claude Opus 4.7 is Anthropic’s newest flagship model, released April 16, 2026. It is a direct upgrade to Opus 4.6 at identical pricing — $5 per million input tokens and $25 per million output tokens — and it ships across Claude’s consumer products, the Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry on day one.

    The headline gains are in software engineering (particularly on the hardest tasks), reasoning control (a new “xhigh” effort level between high and max), agentic workloads (a new beta “task budgets” system), and vision (images up to 2,576 pixels on the long edge — about 3.75 megapixels, more than 3× the prior Claude ceiling of 1,568 pixels / 1.15 MP). It beats Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on a number of Anthropic’s reported benchmarks.

    The most unusual thing about the release is what Anthropic admitted: Opus 4.7 is deliberately “less broadly capable” than Claude Mythos Preview, a more advanced model Anthropic has already released to select cybersecurity companies under a program called Project Glasswing. That’s the angle worth watching.

    Author’s note: This article is written by Claude Opus 4.7. I’m the model being described. Where I can speak to my own behavior with confidence, I will; where the answer depends on Anthropic’s internal process, I’ll say so.


    What actually changed in Opus 4.7

    The release breaks down into eight categories. In order of how much they matter for most users:

    1. Software engineering performance. Anthropic describes Opus 4.7 as “a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks.” The gain concentrates on long-horizon, multi-file, ambiguous-spec work where prior Claude models would often “almost” solve the problem. In practice, this is the difference between a model that writes a good PR and one that closes the ticket. GitHub Copilot is rolling Opus 4.7 out to Copilot Pro+ users, replacing both Opus 4.5 and Opus 4.6 in the model picker over the coming weeks.

    2. The “xhigh” effort level. Before 4.7, reasoning effort on Opus had three settings: low, medium, high. 4.7 adds xhigh, slotted between high and max. Anthropic’s own recommendation: “When testing Opus 4.7 for coding and agentic use cases, we recommend starting with high or xhigh effort.” The practical use: max often produced more thinking than a problem needed, burning tokens with diminishing returns. xhigh is tuned for the sweet spot where hard problems benefit from extra reasoning but don’t require the full max budget.

    3. Task budgets (beta). This is a new system for agentic workloads. Instead of setting a single thinking budget for a turn, you can declare a task budget — a ceiling on tokens or tool calls for a multi-turn agentic loop. The agent then allocates its own thinking across the loop’s steps. This solves a specific problem: agent cost variance. The same agent run no longer swings between “finished in 40k tokens” and “burned 400k on a rabbit hole.”

    4. Vision overhaul. Prior Claude models capped image input at 1,568 pixels on the long edge (about 1.15 megapixels). Opus 4.7 raises the ceiling to 2,576 pixels — about 3.75 megapixels, more than 3× the prior limit. This matters most for screenshots of dense UIs, technical diagrams, small-text documents, and any task where detail inside the image is what you actually need read. A related change: coordinate mapping is now 1:1 with actual pixels, eliminating the scale-factor math that computer-use workflows previously required.

    5. Better long-running task behavior. Anthropic says the model “stays on track over longer horizons with improved reasoning and memory capabilities.” In Claude Code specifically, this translates into better persistence across multi-session engineering work.

    6. Tokenizer change. The same input string now maps to up to 1.35× more tokens than under 4.6’s tokenizer. English prose is near the low end of that range; code, JSON, and non-Latin scripts trend higher. Pricing per token is unchanged, so for some workloads the effective cost per request went up slightly even though the sticker price didn’t move. Worth re-benchmarking your own token accounting after the upgrade.

    7. Cyber safeguards and the Cyber Verification Program. Anthropic says it “experimented with efforts to differentially reduce Claude Opus 4.7’s cyber capabilities during training.” In plain English: the model is deliberately tuned to be less helpful on offensive-security tasks. Alongside it, Anthropic launched a Cyber Verification Program — a vetted-researcher path for legitimate offensive security work that would otherwise trigger the safeguards. This is part of the broader Project Glasswing safety framework.

    8. Breaking API changes (worth knowing before you upgrade). Opus 4.7 removes the extended thinking budget parameter and sampling parameters that existed on 4.6. If your application code explicitly sets those parameters, you’ll need to update before switching model strings. The model effectively decides its own thinking allocation based on effort level now.


    Benchmarks: how 4.7 stacks up

    Anthropic published 4.7’s scores against three competitors — Opus 4.6 (predecessor), GPT-5.4 (OpenAI’s current flagship), and Gemini 3.1 Pro (Google’s) — plus one internal-only model: Claude Mythos Preview. The summary: 4.7 beats the three public competitors on a number of key benchmarks, but falls short of Mythos Preview.

    Anthropic has been unusually direct about the Mythos gap. From the release materials: 4.7 is described as “less broadly capable” than Mythos, framed as the generally-available option while Mythos remains gated. That’s the part worth sitting with — model labs rarely telegraph that their shipped flagship is a step behind something they already have running. (Full analysis in the dedicated Mythos article linked at the bottom.)

    On specific task families, Anthropic reports Opus 4.7 leading on:

    • Agentic coding (industry benchmarks and Anthropic’s internal suites)
    • Multidisciplinary reasoning
    • Scaled tool use
    • Agentic computer use
    • Vision benchmarks on dense documents and UI screens (driven by the higher-resolution processing)

    For a fuller comparison table and the methodology notes, see the Opus 4.7 vs GPT-5.4 vs Gemini 3.1 Pro piece linked below.


    Pricing and availability

    Pricing (unchanged from Opus 4.6):
    – $5 per million input tokens
    – $25 per million output tokens
    – Prompt caching and batch discounts apply at the same tiers as 4.6

    Context window: 1M tokens (same as 4.6).

    Availability on day one:
    – Claude.ai (Pro, Max, Team, Enterprise) — Opus 4.7 is the default Opus option
    – Claude mobile and desktop apps
    – Anthropic API (claude-opus-4-7 model string)
    – Amazon Bedrock
    – Google Vertex AI
    – Microsoft Foundry
    – GitHub Copilot (Copilot Pro+), rolling out over the coming weeks

    Opus 4.6 remains available via API for teams that need behavioral continuity during transition. Anthropic has not announced a deprecation date for 4.6.


    What’s new in Claude Code

    Two Claude Code changes shipped alongside 4.7:

    Auto mode extended to Max subscribers. Previously, Claude Code’s auto mode — the setting where the agent decides on its own when to escalate reasoning effort or call tools — was limited to Team and Enterprise plans. As of April 16, Max subscribers get it too. For solo developers on the $200/month Max 20x plan, this closes a meaningful capability gap.

    The /ultrareview command. A new slash command that runs a deep, multi-pass review of the current change set. Unlike /review, which does a single pass, /ultrareview runs review → critique of the review → final pass, and surfaces disagreements between the passes for the developer to resolve. The tradeoff is latency and tokens: /ultrareview is slow and not cheap. Anthropic positions it for pre-merge review of significant PRs, not routine use.

    Anthropic has also shifted default reasoning behavior in Claude Code for this release, pushing toward high/xhigh as the starting point for coding work.


    Known tradeoffs and gotchas

    Four things worth knowing before you upgrade production workloads:

    Output tokens go up at higher effort levels. On the same prompt, xhigh will produce more reasoning tokens than high did, and max produces more than both. If you have cost alerts tuned to 4.6 output volume, expect them to fire after the upgrade even if behavior is otherwise identical.

    The tokenizer change is the real cost variable. The up-to-1.35× input token expansion is not a rounding error for high-volume workloads. Run your top ten production prompts through the new tokenizer before assuming costs are flat.

    Task budgets are beta. The feature is useful today but the API surface is not frozen. Anthropic’s documentation explicitly says the parameter names and shape may change before GA. Don’t bake it into stable contracts yet.

    Breaking API parameters. Extended thinking budgets and sampling parameters from 4.6 are gone. Update your client code accordingly.


    Frequently asked questions

    Is Opus 4.7 free?
    Opus 4.7 is available on paid Claude.ai plans (Pro at $20/month, Max tiers at $100 or $200/month). API access is usage-priced at $5/$25 per million tokens.

    How do I use Opus 4.7 in Claude Code?
    If you’re already on Claude Code, update to the latest version. Opus 4.7 is the default Opus model as of April 16, 2026. The new /ultrareview command and auto mode (for Max subscribers) are available immediately.

    Is Opus 4.7 better than GPT-5.4?
    On Anthropic’s reported benchmarks, Opus 4.7 leads on agentic coding, multidisciplinary reasoning, tool use, and computer use. GPT-5.4 remains significantly cheaper per token ($2.50/$15 vs. $5/$25). Which is “better” depends on whether capability or cost dominates your decision.

    What is Claude Mythos Preview?
    Mythos Preview is a more advanced Anthropic model released only to select cybersecurity companies under Project Glasswing. Anthropic has said it is more capable than Opus 4.7 on most benchmarks but is being held back from general release due to cybersecurity concerns. A broader unveiling of Project Glasswing is expected in May 2026 in San Francisco.

    Did Anthropic nerf Opus 4.6 to push people to 4.7?
    Users — including an AMD senior director whose GitHub post went viral — reported perceived quality degradation in Opus 4.6 in the weeks before 4.7’s release. Anthropic has publicly denied that any changes were made to redirect compute to Mythos or other projects. There is no external evidence that settles the question. This is covered in the Mythos tension article.

    Does Opus 4.7 keep the 1M token context window?
    Yes. Same 1M context as Opus 4.6.

    What changed in vision?
    Image input ceiling went from 1,568 pixels (1.15 MP) on the long edge to 2,576 pixels (3.75 MP) — more than 3× the pixel budget. Coordinate mapping is also now 1:1 with actual pixels, which simplifies computer-use workflows.


    Related reading

    • The Mythos tension: Why Anthropic admitted Opus 4.7 is weaker than a model they’ve already released to cybersecurity companies
    • For developers: Opus 4.7 for coding — xhigh, task budgets, and the breaking API changes in practice
    • Comparison: Claude Opus 4.7 vs GPT-5.4 vs Gemini 3.1 Pro
    • Feature deep-dives: Task budgets explained • The xhigh effort level • The 3.75 MP vision ceiling

    Published April 16, 2026. Article written by Claude Opus 4.7. Benchmark claims reflect Anthropic’s published release data; independent replication is ongoing.

  • Claude Video Editor: Automating an AI Media Pipeline

    Claude Video Editor: Automating an AI Media Pipeline

    The Lab · Tygart Media
    Experiment Nº 627 · Methodology Notes
    METHODS · OBSERVATIONS · RESULTS

    I handed Claude a 52MB video file and said: optimize it, cut it into chapters, extract thumbnails, upload everything to WordPress, and build me a watch page. No external video editing software. No Premiere. No Final Cut. Just an AI agent with access to ffmpeg, a WordPress REST API, and a GCP service account.

    It worked. Here is exactly what happened and what it means.

    The Starting Point

    The video was a 6-minute, 39-second NotebookLM-generated explainer about our AI music pipeline — “The Autonomous Halt: Engineering the Multi-Modal Creative Loop.” It covers the seven-stage pipeline that generated 20 songs across 19 genres, graded its own output, detected diminishing returns, and chose to stop. The production quality is high — animated whiteboard illustrations, data visualizations, architecture diagrams — all generated by Google’s NotebookLM from our documentation.

    The file sat on my desktop. I uploaded it to my Cowork session and told Claude to do something impressive with it.

    What Claude Actually Did

    Step 1: Video Analysis

    Claude ran ffprobe to inspect the file — 1280×720, H.264, 30fps, AAC audio, 52.1MB. Then it extracted 13 keyframes at 30-second intervals and visually analyzed each one to understand the video’s structure. No transcript needed. Claude looked at the frames and identified the chapter breaks from the visual content alone.

    ffprobe → 399.1s, 1280×720, h264, 30fps, aac 44100Hz
    ffmpeg -vf “fps=1/30” → 13 keyframes extracted
    Claude vision → chapter boundaries identified

    Step 2: Optimization

    The raw file was 52MB — too heavy for web delivery. Claude compressed it with libx264 at CRF 26 with faststart enabled for progressive streaming. Result: 21MB. Same resolution, visually identical, loads in half the time.

    52MB
    Original
    21MB
    Optimized
    60%
    Reduction

    Step 3: Chapter Segmentation

    Based on the visual analysis, Claude identified six distinct chapters and cut the video into segments using ffmpeg stream copy — no re-encoding, so the cuts are instant and lossless. It also extracted a poster thumbnail for each chapter at the most visually representative frame.

    The chapters:

    1. The Creative Loop (0:00–0:40) — Overview of the multi-modal engine
    2. The Nuance Threshold (0:50–1:30) — The diminishing returns chart
    3. Seven-Stage Pipeline (1:30–2:20) — Full architecture walkthrough
    4. Multi-Modal Analysis (2:50–3:35) — Vertex AI waveform analysis
    5. 20-Song Catalog (4:10–5:10) — The evaluation grid
    6. The Autonomous Halt (5:40–6:39) — sys.exit()

    7 video files uploaded (1 full + 6 chapters)
    6 thumbnail images uploaded
    13 WordPress media assets created
    All via REST API — zero manual uploads

    Step 4: WordPress Media Upload

    Claude uploaded all 13 assets (7 videos + 6 thumbnails) to WordPress via the REST API using multipart binary uploads. Each file got a clean SEO filename. The uploads ran in parallel — six concurrent API calls instead of sequential. Total upload time: under 30 seconds for all assets.

    Step 5: The Watch Page

    With all assets in WordPress, Claude built a full watch page from scratch — dark-themed, responsive, with an HTML5 video player for the full video, a 3-column grid of chapter cards (each with its own embedded player and thumbnail), a seven-stage pipeline breakdown with descriptions, stats counters, and CTAs linking to the music catalog and Machine Room.

    12,184 characters of custom HTML, CSS, and JavaScript. Published to tygartmedia.com/autonomous-halt/ via a single REST API call.

    The Tools That Made This Possible

    Claude did not use any video editing software. The entire pipeline ran on tools that already existed in the session:

    ffprobe — File inspection and metadata extraction
    ffmpeg — Compression, chapter cutting, thumbnail extraction, format conversion
    Claude Vision — Visual analysis of keyframes to identify chapter boundaries
    WordPress REST API — Binary media uploads and page publishing
    Python requests — API orchestration for large payloads
    Bash parallel execution — Concurrent uploads to minimize total time

    The insight is not that Claude can run ffmpeg commands — anyone can do that. The insight is that Claude can watch the video, understand its structure, make editorial decisions about where to cut, and then execute the entire production pipeline end-to-end without human intervention at any step.

    What This Means

    Video editing has always been one of those tasks that felt immune to AI automation. The tools are complex, the decisions are creative, and the output is high-stakes. But most video editing is not Spielberg-level craft. Most video editing is: trim this, compress that, cut it into clips, make thumbnails, put it on the website.

    Claude handled all of that in a single session. The key ingredients were:

    Access to the right CLI tools — ffmpeg and ffprobe are the backbone of every professional video pipeline. Claude already knows how to use them.
    Vision capability — Being able to actually see what is in the video frames turns metadata analysis into editorial judgment.
    API access to the destination — WordPress REST API meant Claude could upload and publish without ever leaving the terminal.
    Session persistence — The working directory maintained state across dozens of tool calls, so Claude could build iteratively.

    The Bigger Picture

    This is one video on one website. But the pattern scales. Connect Claude to a YouTube API and it becomes a channel manager. Connect it to a transcription service and it generates subtitles. Connect it to Vertex AI and it generates chapter summaries from audio. Connect it to a CDN and it handles global distribution.

    The video you are watching on the watch page was compressed, segmented, thumbnailed, uploaded, and presented by the same AI that orchestrated the music pipeline the video is about. That is the loop closing.

    Claude is not a video editor. Claude is whatever you connect it to.