Keynote Summary · Frameworks · Applied Guide for ITGIRL / Agenicore / Cocode / Filmteams
The keynote was not a model drop — it was a strategy signal. Anthropic's CPO Ami Vora opened by declaring: "Today is about how we are making our products work better for you." The throughline was orchestration, agent reliability, and giving serious builders infrastructure they can run for hours, not minutes.
Anthropic is now using the full capacity of SpaceX's Colossus data center — 300 MW, 220,000+ NVIDIA GPUs. The direct result: rate limits for Claude Code Pro/Max/Enterprise users doubled immediately.
Opus 4.7 can be called on-demand by Sonnet to provide advice on hard decisions — "advisor mode." One team (eve) got frontier-model quality at one-fifth the cost using this pattern.
Multi-agent orchestration (public beta), Outcomes (public beta), and Dreaming (research preview) all dropped. This is Anthropic's answer to "how do teams ship 10× faster."
Routines are "higher-order prompts" — you set up async automations and wake up to merge-ready PRs. The head of Claude Code says most of his code is now built by routines.
Opus 4.7 has genuine visual design judgment. Claude Design was announced (labs.anthropic.com). Planning mode on amp switched to Opus 4.7 — better visual outputs across the board.
This time last year, agents worked for minutes. Today builders run them for hours. Shopify and Mercado Libre (23,000 engineers!) are targeting 90% autonomous coding by Q3 2025.
These are the repeatable frameworks and strategies Anthropic's leaders surfaced — extracted and translated for your work.
Instead of running Opus for everything (expensive) or Sonnet for everything (lower quality), you architect a two-tier system. Sonnet handles execution — writing code, drafting content, processing tasks. Opus serves as on-call advisor for high-judgment moments: reviewing output quality, making tricky decisions, setting strategy.
The result: frontier-level quality at dramatically lower cost. One production team cut costs fivefold while maintaining its quality benchmarks.
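The routing logic behind this pattern can be sketched in a few lines. Everything below is illustrative: the model calls are stubbed out (a real system would call the Anthropic Messages API), and the self-reported confidence score, the toy length heuristic, and the 0.7 threshold are assumed conventions for the sketch, not part of any Claude API.

```python
# Minimal sketch of the two-tier "executor + on-call advisor" pattern.
CONFIDENCE_THRESHOLD = 0.7
CALL_LOG = []  # records (model, task) pairs so spend can be audited

def execute(model, task, advice=None):
    """Stub for a model call. Returns text plus a self-reported
    confidence; here confidence is derived from task length as a
    toy stand-in for a real self-assessment step."""
    CALL_LOG.append((model, task))
    base = 0.9 if len(task) < 40 else 0.5  # illustrative heuristic only
    if advice:
        base += 0.3  # advisor input raises execution quality
    return {"text": f"[{model}] {task}", "confidence": min(base, 1.0)}

def run_task(task):
    """The cheap executor tier does the work; the expensive advisor
    tier is consulted only when the executor reports low confidence."""
    draft = execute("sonnet-executor", task)
    if draft["confidence"] < CONFIDENCE_THRESHOLD:
        advice = execute("opus-advisor", "Advise on: " + task)
        draft = execute("sonnet-executor", task, advice=advice["text"])
    return draft
```

The design choice that matters: the expensive model never executes, it only advises, so its tokens are spent only on the small fraction of decisions the cheap tier flags as hard.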
Dianne Penn from Anthropic Research dropped this as "classic advice": build things that don't quite work today on the assumption that they'll start working with the next model upgrade. Don't let current model limits define your architecture ceiling.
This is the AI builder's version of Moore's Law thinking. If a workflow fails at 70% reliability today but needs 90%, ship the scaffold and trust the model curve.
"Design for the next model. Build things that don't quite work today on the assumption that they'll start working with a model upgrade in the future."
— Dianne Penn, Head of Product for Research, Anthropic
Anthropic's Dianne Penn shared what separates the teams getting the most value from Claude from everyone else. Three things:
1. They don't manually review outputs. They build automated evaluation pipelines that score Claude's work against defined criteria at every step. You can't improve what you don't measure systematically.
2. They resist the urge to over-engineer agent frameworks. Clean, minimal scaffolding that's easy to iterate on. The power is in the model + the task definition, not the orchestration complexity.
3. They find use cases others haven't figured out yet. Not the obvious AI features — the adjacent, creative applications that unlock disproportionate value for their specific domain.
Anthropic's "Dreaming" feature for Managed Agents introduces a self-improvement loop: an agent reviews its own previous sessions overnight, identifies what it missed or did poorly, and generates new memory or playbooks. In the demo, the agent created a descent-playbook.md from reviewing past drone-landing sessions.
Even if you can't use Dreaming directly yet (research preview), the framework is something you can implement manually or prompt-engineer into your own loops.
1. The agent executes multi-step work and logs its session context.
2. It reads its own session history and identifies gaps, errors, or missing context.
3. It generates an updated playbook/memory file that improves future runs automatically.
4. Each cycle, the agent gets smarter about its own domain without manual prompt tuning.
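That loop can be approximated today with a plain post-session script. This is a minimal sketch under stated assumptions — a line-based log format and keyword failure markers are inventions for illustration; Dreaming's actual mechanics haven't been published:

```python
# Manual "Dreaming"-style loop: scan past session logs for failure
# lines and append deduplicated lessons to a playbook file that the
# agent loads on its next run.
from pathlib import Path

FAILURE_MARKERS = ("ERROR", "retry", "missing context")  # assumed markers

def dream(session_logs, playbook_path):
    """Review past session logs, extract lines that look like failures,
    and append new (not-yet-recorded) lessons to the playbook."""
    playbook = Path(playbook_path)
    known = set(playbook.read_text().splitlines()) if playbook.exists() else set()
    lessons = []
    for log in session_logs:
        for line in log.splitlines():
            if any(marker in line for marker in FAILURE_MARKERS):
                lesson = f"Avoid repeat of: {line.strip()}"
                if lesson not in known:
                    lessons.append(lesson)
                    known.add(lesson)
    if lessons:
        with playbook.open("a") as f:
            f.write("\n".join(lessons) + "\n")
    return lessons
```

Because the playbook is checked before appending, running `dream` nightly is idempotent: each failure becomes one lesson, once, and the file only grows when the agent hits something new.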
Boris Cherny (who created Claude Code) defined this cleanly: Routines are higher-order prompts — not one-shot instructions but standing, reusable prompt patterns that trigger autonomous workflows. You define the routine once; it runs on schedule or trigger, produces PR-ready output, and you review/merge async.
The key insight: a lot of code going forward will be written asynchronously. You're not supervising a session — you're reviewing work that ran while you were elsewhere.
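The "higher-order prompt" idea can be sketched as data: a standing template plus a trigger, defined once and fired many times. Everything here is an illustrative assumption — real Routines are configured inside Claude Code, not through this (hypothetical) class, and the cron-style schedule string is just one plausible trigger convention:

```python
# Sketch: a routine as a reusable prompt template + trigger, versus a
# one-shot instruction typed into a session.
from dataclasses import dataclass, field

@dataclass
class Routine:
    name: str
    template: str            # prompt with {placeholders}, defined once
    schedule: str            # trigger, e.g. a cron expression (assumed)
    runs: list = field(default_factory=list)

    def fire(self, **context):
        """Render the standing template into a concrete prompt. In a
        real system this prompt would kick off an autonomous session
        whose output is a merge-ready PR to review asynchronously."""
        prompt = self.template.format(**context)
        self.runs.append(prompt)
        return prompt

# Hypothetical example: a nightly dependency-bump routine.
nightly_deps = Routine(
    name="bump-deps",
    template="Update {repo}'s dependencies, run the test suite, "
             "and open a PR only if all tests pass.",
    schedule="0 3 * * *",  # every night at 03:00
)
```

The template is the durable asset: you tune it once, and every subsequent run — across repos, across nights — inherits the improvement without you re-prompting.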
"The person who owns the PR is never going to see a red X. Claude is prompting Claude Code on its own."
— Boris Cherny, Head of Claude Code, Anthropic
Feature status as announced May 6, 2026:
| Feature | Status | What It Does | Relevant To You |
|---|---|---|---|
| Multi-Agent Orchestration | Public Beta | Create Commander/Detector/Navigator-style agent fleets for complex tasks | Cocode, Agenicore |
| Outcomes | Public Beta | Define what success looks like; Claude iterates until it gets there | All projects |
| Dreaming | Research Preview | Agent reviews past sessions, generates self-improvement playbooks overnight | Agenicore, ITGIRL |
| Routines | Live | Async, higher-order prompt automations; wake up to merge-ready PRs | Cocode, Filmteams |
| CI Auto-Fix | Live | Claude automatically files fixes against failing PRs | Cocode dev pipeline |
| Security Reviews | Live | Automated security review pass on your codebase | ITGIRL (Supabase/Stripe) |
| Claude Design | New / Labs | Opus 4.7's visual design taste applied to UI generation | Filmteams UI, ITGIRL |
| Claude Code Desktop | Live | Full-screen GUI with preview pane; multiple parallel sessions | All coding projects |
| Remote Agents | Live | Control your dev laptop from your phone | Mobile-first workflow |
| Doubled Rate Limits | Live (today) | 5-hour Claude Code limit doubled for Pro/Max/Enterprise | All projects |
Check items off as you go; they are ordered by impact and current availability.