Code with Claude 2026 · Anthropic · Change for Builders

Anthropic News
May 6, San Francisco · ALECIA's Playbook

Keynote Summary · Frameworks · Applied Guide for ITGIRL / Agenicore / Cocode / Filmteams

17× · API volume up YoY
3 · New managed agent features
2× · Claude Code rate limits
220K · SpaceX GPUs added
01 · Keynote Summary

What Actually Happened at Code w/ Claude 2026

The keynote was not a model drop — it was a strategy signal. Anthropic's CPO Ami Vora opened by declaring: "Today is about how we are making our products work better for you." The throughline was orchestration, agent reliability, and giving serious builders infrastructure they can run for hours, not minutes.

Infrastructure
🛰️

SpaceX Colossus Deal

Anthropic is using the full capacity of SpaceX's Colossus data center: 300 MW and 220,000+ NVIDIA GPUs. The direct result: rate limits doubled immediately for Claude Code Pro/Max/Enterprise users.

Platform
🧠

The Advisor Strategy

Opus 4.7 can be called on-demand by Sonnet to provide advice on hard decisions — "advisor mode." One team (eve) got frontier model quality at 5× lower cost using this pattern.

Agents
🤖

Managed Agents 3-Pack

Multi-agent orchestration (public beta), Outcomes (public beta), and Dreaming (research preview) all dropped. This is Anthropic's answer to "how do teams ship 10× faster."

Claude Code

Routines + Async Coding

Routines are "higher-order prompts" — you set up async automations and wake up to merge-ready PRs. The head of Claude Code says most of his code is now built by routines.

Design
🎨

Claude Design + Opus Taste

Opus 4.7 has genuine visual design judgment. Claude Design was announced (labs.anthropic.com). Planning mode on Amp switched to Opus 4.7, with better visual outputs across the board.

Velocity
📊

Hours, Not Minutes

This time last year agents worked for minutes. Today builders run them for hours. Shopify and Mercado Libre (23,000 engineers!) are targeting 90% autonomous coding by Q3 2026.

02 · Extracted Frameworks & Mental Models

The Thinking Tools From the Keynote

These are the repeatable frameworks and strategies Anthropic's leaders surfaced — extracted and translated for your work.

01
The Advisor Pattern
// SMALL MODEL DOES. BIG MODEL THINKS. //

Instead of running Opus for everything (expensive) or Sonnet for everything (lower quality), you architect a two-tier system. Sonnet handles execution — writing code, drafting content, processing tasks. Opus serves as on-call advisor for high-judgment moments: reviewing output quality, making tricky decisions, setting strategy.

The result: frontier-level quality at dramatically lower cost. One production team cut costs 5× while maintaining quality benchmarks.

PATTERN
// Advisor Pattern — two-tier model orchestration

EXECUTOR: claude-sonnet-4-6 // speed + cost
ADVISOR: claude-opus-4-7 // called when judgment needed

// Trigger Opus when:
// - Output confidence below threshold
// - Decision has high downstream impact
// - Novel/ambiguous input detected
// - Quality eval fails
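
The triggers above can be sketched in Python as a small routing function. This is a minimal sketch under stated assumptions, not Anthropic's implementation: the `Draft` fields, the 0.7 confidence threshold, and the `executor` / `advisor` callables are all hypothetical stand-ins for your own model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    """Executor output plus the signals used to decide on escalation (hypothetical fields)."""
    text: str
    confidence: float   # executor's self-reported confidence, 0.0 to 1.0
    high_impact: bool   # decision has high downstream impact
    novel_input: bool   # input was flagged as novel or ambiguous
    eval_passed: bool   # result of an automated quality eval


def should_escalate(draft: Draft, confidence_threshold: float = 0.7) -> bool:
    """Return True when any of the four triggers fires."""
    return (
        draft.confidence < confidence_threshold
        or draft.high_impact
        or draft.novel_input
        or not draft.eval_passed
    )


def run_with_advisor(
    task: str,
    executor: Callable[[str], Draft],
    advisor: Callable[[str, Draft], str],
) -> tuple[str, bool]:
    """Run the executor; consult the advisor only when a trigger fires.

    Returns the final text and whether the advisor was consulted.
    """
    draft = executor(task)
    if should_escalate(draft):
        return advisor(task, draft), True
    return draft.text, False


if __name__ == "__main__":
    # Stub models so the sketch runs without any API access.
    def fake_executor(task: str) -> Draft:
        risky = "migration" in task
        return Draft(f"draft for {task}", 0.9, risky, False, True)

    def fake_advisor(task: str, draft: Draft) -> str:
        return draft.text + " (reviewed)"

    print(run_with_advisor("rename a variable", fake_executor, fake_advisor))
    print(run_with_advisor("plan a schema migration", fake_executor, fake_advisor))
```

Passing the two models in as callables keeps the routing logic testable on its own; in practice each callable would wrap a call to the executor or advisor model.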
02
Design for the Next Model
// BUILD WHAT DOESN'T WORK YET — IT WILL //

Dianne Penn from Anthropic Research dropped this as "classic advice": build things that don't quite work today on the assumption that they'll start working with the next model upgrade. Don't let current model limits define your architecture ceiling.

This is the AI builder's version of Moore's Law thinking. If a workflow is only 70% reliable today but needs to reach 90%, ship the scaffold and trust the model curve.

"Design for the next model. Build things that don't quite work today on the assumption that they'll start working with a model upgrade in the future."

— Dianne Penn, Head of Product for Research, Anthropic

03
The Three Differentiators Framework
// WHAT TOP TEAMS DO DIFFERENTLY //

Anthropic's Dianne Penn shared what separates the teams getting the most value from Claude from everyone else. Three things:

1

Automated Evals

They don't manually review outputs. They build automated evaluation pipelines that score Claude's work against defined criteria at every step. You can't improve what you don't measure systematically.

2

Simple Scaffolding

They resist the urge to over-engineer agent frameworks. Clean, minimal scaffolding that's easy to iterate on. The power is in the model + the task definition, not the orchestration complexity.

3

Imaginative Use Cases

They find use cases others haven't figured out yet. Not the obvious AI features — the adjacent, creative applications that unlock disproportionate value for their specific domain.
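
The first differentiator, automated evals, can be sketched as a weighted rubric scorer. This is an illustrative sketch only: `Criterion`, `score_output`, and the example checks are hypothetical names, and real pipelines often use a model-graded check in place of the simple string tests shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    """One rubric line: a named check with a relative weight (hypothetical structure)."""
    name: str
    check: Callable[[str], bool]  # returns True when the output meets the criterion
    weight: float = 1.0


def score_output(output: str, rubric: list[Criterion]) -> dict:
    """Score one output against a rubric.

    Returns the weighted score in [0, 1] and the names of the failed criteria.
    """
    total = sum(c.weight for c in rubric)
    if total <= 0:
        raise ValueError("rubric must have positive total weight")
    earned = 0.0
    failed = []
    for criterion in rubric:
        if criterion.check(output):
            earned += criterion.weight
        else:
            failed.append(criterion.name)
    return {"score": earned / total, "failed": failed}


if __name__ == "__main__":
    # Example rubric for an email summary; the checks are deliberately simple.
    rubric = [
        Criterion("under_30_words", lambda s: len(s.split()) <= 30),
        Criterion("names_an_action", lambda s: "reply" in s.lower(), weight=3.0),
    ]
    print(score_output("Reply to Dana about the invoice by Friday.", rubric))
```

Logging the `failed` list for every run is what makes the eval useful for iteration: it shows which criterion regressed, not just that the score dropped.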

04
The Dreaming Loop
// AGENTS THAT LEARN FROM THEIR OWN HISTORY //

Anthropic's "Dreaming" feature for Managed Agents introduces a self-improvement loop: an agent reviews its own previous sessions overnight, identifies what it missed or did poorly, and generates new memory or playbooks. In the demo, the agent created a descent-playbook.md from reviewing past drone-landing sessions.

Even if you can't use Dreaming directly yet (research preview), the framework is something you can implement manually or prompt-engineer into your own loops.

1

Run Task

Agent executes multi-step work and logs its session context.

2

Dream (Self-Review)

Agent reads its own session history and identifies gaps, errors, or missing context.

3

Write Playbook

Generates updated playbook/memory file that improves future runs automatically.

4

Repeat

Each cycle the agent is smarter about its own domain without manual prompt tuning.
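
A manual version of the four steps above might look like the following Python sketch. It is not the Dreaming feature itself: the JSON session-log layout, the `error` field, and the `reviewer` callable (which would normally wrap a model call) are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable


def load_sessions(log_dir: Path) -> list[dict]:
    """Step 1 output: read every *.json session log in log_dir, sorted by filename."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(log_dir.glob("*.json"))
    ]


def dream(
    sessions: list[dict],
    reviewer: Callable[[list[dict]], list[str]],
) -> list[str]:
    """Step 2: hand past sessions to a reviewer and keep each lesson once, in order."""
    seen = set()
    lessons = []
    for lesson in reviewer(sessions):
        if lesson not in seen:
            seen.add(lesson)
            lessons.append(lesson)
    return lessons


def write_playbook(lessons: list[str], path: Path) -> None:
    """Step 3: write the lessons to a playbook file that the next run can load."""
    lines = ["# Playbook", ""] + [f"- {lesson}" for lesson in lessons]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def error_reviewer(sessions: list[dict]) -> list[str]:
    """Stub reviewer: turns logged errors into lessons. Replace with a model call."""
    return [f"Avoid: {s['error']}" for s in sessions if s.get("error")]
```

Step 4 is then a scheduling question: run `load_sessions`, `dream`, and `write_playbook` after each batch of work, and include the playbook file in the agent's context on the next run.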

05
Routines = Higher-Order Prompts
// ASYNC AUTOMATION AT THE PROMPT LAYER //

Boris Cherny (who created Claude Code) defined this cleanly: Routines are higher-order prompts — not one-shot instructions but standing, reusable prompt patterns that trigger autonomous workflows. You define the routine once; it runs on schedule or trigger, produces PR-ready output, and you review/merge async.

The key insight: a lot of code going forward will be written asynchronously. You're not supervising a session — you're reviewing work that ran while you were elsewhere.

"The person who owns the PR is never going to see a red X. Claude is prompting Claude Code on its own."

— Boris Cherny, Head of Claude Code, Anthropic
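
To make "standing, reusable prompt that runs on a schedule" concrete, here is a minimal Python model of the idea. It does not use or describe the actual Claude Code Routines interface: `Routine`, `run_due_routines`, and the `agent` callable are hypothetical, and the schedule is a plain time interval rather than whatever triggers the product supports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class Routine:
    """A standing prompt plus a schedule (hypothetical structure)."""
    name: str
    prompt_template: str                 # reusable prompt with {placeholders}
    every: timedelta                     # how often the routine should fire
    last_run: Optional[datetime] = None  # None means it has never run

    def is_due(self, now: datetime) -> bool:
        return self.last_run is None or now - self.last_run >= self.every

    def render(self, **context: str) -> str:
        return self.prompt_template.format(**context)


def run_due_routines(
    routines: list[Routine],
    now: datetime,
    agent: Callable[[str], str],
    context: dict[str, str],
) -> dict[str, str]:
    """Run every routine that is due and return its output keyed by routine name."""
    results = {}
    for routine in routines:
        if routine.is_due(now):
            results[routine.name] = agent(routine.render(**context))
            routine.last_run = now
    return results


if __name__ == "__main__":
    nightly = Routine("fix-ci", "Fix failing checks on {branch}", timedelta(hours=24))
    # The lambda stands in for an agent that would open a pull request.
    print(run_due_routines([nightly], datetime.now(), lambda p: f"PR for: {p}", {"branch": "main"}))
```

The returned dictionary is the async review queue: each entry is work that ran unattended and is waiting for a human to review and merge.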

03 · New Features Dropped

What's Live, Beta, and Preview

Feature status as announced May 6, 2026:

| Feature | Status | What It Does | Relevant To You |
|---|---|---|---|
| Multi-Agent Orchestration | Public Beta | Create Commander/Detector/Navigator-style agent fleets for complex tasks | Cocode, Agenicore |
| Outcomes | Public Beta | Define what success looks like; Claude iterates until it gets there | All projects |
| Dreaming | Research Preview | Agent reviews past sessions, generates self-improvement playbooks overnight | Agenicore, ITGIRL |
| Routines | Live | Async, higher-order prompt automations; wake up to merge-ready PRs | Cocode, Filmteams |
| CI Auto-Fix | Live | Claude automatically files fixes against failing PRs | Cocode dev pipeline |
| Security Reviews | Live | Automated security review pass on your codebase | ITGIRL (Supabase/Stripe) |
| Claude Design | New / Labs | Opus 4.7's visual design taste applied to UI generation | Filmteams UI, ITGIRL |
| Claude Code Desktop | Live | Full-screen GUI with preview pane; multiple parallel sessions | All coding projects |
| Remote Agents | Live | Control your dev laptop from your phone | Mobile-first workflow |
| Doubled Rate Limits | Live (today) | 5-hour Claude Code limit doubled for Pro/Max/Enterprise | All projects |
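
The Outcomes idea listed above (define success, then iterate until it is met) can be approximated in your own code with a bounded retry loop. This is a sketch of the pattern only, not the Outcomes feature or its API: `iterate_until_outcome`, the `generate` callable, and the criteria dictionary are hypothetical.

```python
from typing import Callable


def iterate_until_outcome(
    generate: Callable[[str, list[str]], str],
    criteria: dict[str, Callable[[str], bool]],
    task: str,
    max_attempts: int = 5,
) -> tuple[str, list[str], int]:
    """Regenerate until every success criterion passes or attempts run out.

    `generate` receives the task and the names of the criteria that failed on
    the previous attempt. Returns (last output, unmet criteria, attempts used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    unmet: list[str] = []
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = generate(task, unmet)
        unmet = [name for name, check in criteria.items() if not check(output)]
        if not unmet:
            return output, [], attempt
    return output, unmet, max_attempts


if __name__ == "__main__":
    # Stub generator: adds a title only after being told the title check failed.
    def stub_generate(task: str, feedback: list[str]) -> str:
        return f"Title: {task}" if "has_title" in feedback else task

    criteria = {"has_title": lambda s: s.startswith("Title:")}
    print(iterate_until_outcome(stub_generate, criteria, "episode 1 beat sheet"))
```

The attempt cap matters: without it, a criterion the model cannot satisfy would loop forever, so unmet criteria are returned for human review instead.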
04 · Applied to AJ_Alecia Stack

Framework Map: Claude Projects

ITGIRL
itgirl.aleciaj.com
  • Use the Advisor Pattern: route simple module tasks (organizer, inbox) to Sonnet, escalate complex multi-step architecture questions in "Ask ITGIRL" to Opus 4.7
  • Build Automated Evals for each module output — score accuracy of file org suggestions, email summaries on a rubric
  • Use Outcomes to define success per module (e.g., "inbox reduced by 80%, zero false positives") so Claude iterates until criteria are met
  • Run Security Reviews on your Supabase RLS policies and Stripe edge functions — this is exactly the use case
Agenicore
agenicore.com
  • Multi-agent orchestration is your product's competitive moat — study the Commander/Detector/Navigator pattern from the keynote demo closely
  • The Dreaming loop is essentially what incident automation needs: agents that learn from previous incident patterns and self-improve playbooks
  • Position Agenicore as a Managed Agents layer for enterprise — Anthropic just validated this entire category publicly
  • Use Outcomes in your own infrastructure: define incident resolution SLAs as success criteria and let Claude iterate
Cocode Corporation
agenicore.com/cocode
  • Routines are Cocode's core workflow primitive — "describe it, ship it" should be powered by async Routines that deliver merge-ready code
  • The Agent SDK (code.claude.com/docs/en/agent-sdk) is your foundation layer — Claude Code's IDE and Desktop are built on it; you can build Cocode on it too
  • Use CI Auto-Fix as a native feature in Cocode's pipeline — agents that PR, fail, self-fix, and re-PR
  • Boris Cherny's async-first framing ("wake up to merge-ready PRs") is the exact UX narrative for Cocode — steal this framing for your pitch
Filmteams Platform
filmteams platform
  • Use Multi-Agent Orchestration for your S.I.M.P. pipeline: a Director agent, Prompt Engineer agent, and QA agent as a fleet
  • Claude Design + Opus 4.7 is directly relevant — use Opus in the visual design / mood board generation layer of your pipeline
  • Apply Routines to automate recurring pipeline steps: daily storyboard generation, prompt library refresh, episode beat sheet iterations
  • The "design for next model" framework means build the full production pipeline architecture now even if some AI steps are imperfect — model quality will close the gap
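
The Director / Prompt Engineer / QA fleet described above can be prototyped as a plain sequential hand-off before adopting any managed orchestration. The sketch below is an assumption-heavy illustration, not the Managed Agents API: `run_fleet`, `FleetResult`, and the three agent callables are hypothetical, and each callable would wrap a model call in practice.

```python
from dataclasses import dataclass, field
from typing import Callable

Agent = Callable[[str], str]  # takes the previous agent's output, returns its own


@dataclass
class FleetResult:
    output: str
    approved: bool
    trace: list[tuple[str, str]] = field(default_factory=list)  # (agent name, what it produced)


def run_fleet(
    brief: str,
    director: Agent,
    prompt_engineer: Agent,
    qa: Callable[[str], bool],
) -> FleetResult:
    """Director plans, Prompt Engineer writes prompts from the plan, QA approves or rejects."""
    trace: list[tuple[str, str]] = []
    plan = director(brief)
    trace.append(("director", plan))
    prompts = prompt_engineer(plan)
    trace.append(("prompt_engineer", prompts))
    approved = qa(prompts)
    trace.append(("qa", "approved" if approved else "rejected"))
    return FleetResult(output=prompts, approved=approved, trace=trace)


if __name__ == "__main__":
    result = run_fleet(
        "cold open for episode 1",
        director=lambda brief: f"shot list for {brief}",
        prompt_engineer=lambda plan: f"image prompts based on {plan}",
        qa=lambda prompts: "shot list" in prompts,
    )
    print(result.approved, result.trace)
```

Keeping a trace of each hand-off makes it easier to see which agent introduced a problem when QA rejects the output.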
S.I.M.P. Project
multimedia · memoir · series
  • Use Dreaming pattern manually: after each writing session, have Claude review its own output, identify narrative gaps, and write a session-playbook.md that informs next session
  • Outcomes applied to creative work: define episode success criteria (tone, plot beats met, character arcs) and let Claude iterate drafts until criteria pass
  • The Advisor Pattern for creative: Sonnet drafts, Opus reviews for narrative judgment, emotional resonance, and cinematic quality
Uncoded Series
prestige TV · generational story
  • Apply "design for the next model" here: script the full season bible now; some visual generation steps may be rough today but Opus 4.7's design taste will close the gap
  • Build a Routine for trailer development: weekly prompt that ingests current story beats and produces updated trailer script + Suno music prompts automatically
  • Use Multi-Agent Orchestration for the three-season Westworld-style structure: a Story Architect agent, Historical Research agent, and Dialogue Writer agent fleet
05 · Action Checklist

Your 30-Day Build Plan

Check off as you go. These are ordered by impact and availability right now.

Week 1 — Quick Wins
Week 2 — Architecture
Week 3–4 — Agent Fleet Build
Ongoing Principles