LLM orchestration patterns
Composing multiple agents into coordinated workflows: sequential execution, parallel swarms, multi-perspective analysis. The next level after individual skills and agents.
How to move beyond single prompts and compose skills, agents, and coordination layers into autonomous workflows.
The progression
Most teams start with Claude Code as a conversational assistant: ask a question, get an answer. But the real power comes from layering capabilities on top of each other:
| Level | What it looks like | Example |
|---|---|---|
| Prompt | One-shot question and answer | "What does this function do?" |
| Skill | Reusable prompt template with steps | /what-changed summarizes your uncommitted diff |
| Agent | Autonomous expert that handles a full task | A code-reviewer agent that analyzes quality, security, and patterns |
| Orchestration | Multiple agents coordinated by a lead | 8 reviewer agents run in parallel, a lead synthesizes findings |
Each level builds on the last. Skills give Claude repeatable instructions. Agents give Claude specialized personas it can delegate to. Orchestration coordinates multiple agents into workflows that would be too complex for a single prompt.
Getting started
You don't need to build a full orchestration system on day one. Start small and layer up:
- Write a skill that automates a task you do repeatedly. See Skills for the format.
- Extract an agent when part of your skill needs deep expertise. Move that logic into an agent file and have the skill spawn it.
- Add a second agent when you need a different perspective on the same work. Run them in parallel and synthesize.
- Add state tracking when your workflows get long enough to hit context compaction. Write progress to disk.
- Add verification gates when you need confidence that the output is correct. Block progress until evidence is provided.
The marketplace plugins (essentials, multi-agent-orchestrator, refine-prompt, and adesa-workflow) are open source and full of patterns you can study and adapt.
Building blocks
Here's how each Claude Code primitive contributes to orchestration:
| Primitive | Role in orchestration | Where it lives |
|---|---|---|
| Skills | Define the steps of a workflow phase | .claude/commands/*.md |
| Agents | Autonomous specialists that skills delegate to | agents/*.md (in a plugin) |
| Hooks | Automated triggers on events (session start, tool use) | settings.json or settings.local.json |
| MCP Servers | Connect agents to external systems (ADO, databases, APIs) | .mcp.json or settings |
| State files | Persist progress across context compaction | Disk files (JSON, markdown) |
A skill is the entry point: the thing the user invokes. It reads context, makes decisions, and spawns agents to do specialized work. Agents operate autonomously within their scope. Hooks react to system events. MCP servers bridge to external data. State files keep everything in sync when the conversation gets long.
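For example, the state-file idea can be as simple as a small JSON checkpoint that a skill writes after each step and re-reads when a session resumes. A minimal sketch, assuming a hypothetical progress file (the path and fields are illustrative, not a fixed convention):

```python
import json
from pathlib import Path

# Hypothetical state file; the real plugins choose their own paths and formats.
STATE_FILE = Path(".claude/state/progress.json")

def save_checkpoint(completed: list[str], pending: list[str]) -> None:
    """Write workflow progress to disk so it survives context compaction."""
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps({"completed": completed, "pending": pending}, indent=2))

def load_checkpoint() -> dict:
    """Re-read progress at the start of a session (or after compaction)."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"completed": [], "pending": []}
```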
Orchestration patterns
These are the four patterns used across the Carvana and Wholesale marketplaces. Each solves a different coordination problem.
Pattern 1: Sequential execution
One agent at a time, each building on the previous result. Used when tasks have dependencies.
How it works:
- Skill loads a plan (task list with ordering)
- For each task, spawn a fresh agent
- Agent implements the task, returns results
- Skill verifies the result before moving to the next task
- If verification fails, create a fix task and retry
Real example (essentials-execute):

```mermaid
flowchart LR
    A[Plan] --> B[Spawn agent] --> C{Verify}
    C -- Pass --> D[Next task]
    C -- Fail --> E[Fix task] --> B
    D --> B
```

Key design decisions:
- Fresh agent per task: each agent gets a clean context window. This prevents context pollution from earlier tasks bleeding into later ones.
- Skill verifies, not the agent: the orchestrating skill reads the code itself after the agent finishes. Agents shouldn't mark their own homework.
- Checkpoint after each batch: write progress to disk so work isn't lost if the session ends.
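Put together, the loop looks roughly like the sketch below. spawn_fresh_agent and verify are hypothetical stand-ins for Claude Code's agent spawning and the skill's own checks, not real APIs:

```python
def spawn_fresh_agent(task: str) -> None:
    """Hypothetical stand-in: run one task in a clean agent context."""

def verify(task: str) -> bool:
    """Hypothetical stand-in: the orchestrating skill re-reads code/tests itself."""
    return True

def execute_plan(tasks: list[str]) -> list[str]:
    queue, completed = list(tasks), []
    while queue:
        task = queue.pop(0)
        spawn_fresh_agent(task)              # fresh context window per task
        if verify(task):                     # the skill verifies, not the agent
            completed.append(task)
        else:
            queue.insert(0, f"fix: {task}")  # create a fix task and retry
        # checkpoint progress to disk here so work survives a session ending
    return completed

execute_plan(["add endpoint", "add tests"])
```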
Pattern 2: Parallel swarm
Multiple agents working simultaneously on independent tasks. Used when work can be divided without dependencies.
How it works:
- Skill decomposes work into independent streams
- Spawn all agents in a single message (parallel execution)
- Collect results as agents complete
- Synthesize findings or merge outputs
Real example (adesa-workflow wave-based execution):

```mermaid
flowchart LR
    A[Analyze deps] --> W1
    subgraph W1 [Wave 1]
        R1[repo-a] & R2[repo-b] & R3[repo-c]
    end
    W1 --> V1{Verify} -- Pass --> W2
    subgraph W2 [Wave 2]
        R4[repo-d] & R5[repo-e]
    end
    W2 --> V2{Verify}
```

Key design decisions:
- Dependency ordering: analyze which repos/tasks depend on each other. Independent work runs in parallel; dependent work waits.
- Wave boundaries: don't start the next wave until the current one is verified. This prevents cascading failures.
- Gotcha propagation: lessons learned in Wave 1 (unexpected patterns, edge cases) get passed to Wave 2 agents so they don't hit the same issues.
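A rough sketch of that wave logic, with run_repo_agent and verify_wave as hypothetical stand-ins (a thread pool substitutes here for spawning agents in a single message):

```python
from concurrent.futures import ThreadPoolExecutor

def run_repo_agent(repo: str, gotchas: list[str]) -> list[str]:
    """Hypothetical stand-in: apply the change in one repo, return lessons learned."""
    return []

def verify_wave(repos: list[str]) -> bool:
    """Hypothetical stand-in: the skill checks the wave before moving on."""
    return True

def execute_waves(waves: list[list[str]]) -> None:
    gotchas: list[str] = []                    # lessons carried into later waves
    for wave in waves:
        with ThreadPoolExecutor() as pool:     # independent repos run in parallel
            results = list(pool.map(lambda repo: run_repo_agent(repo, gotchas), wave))
        for lessons in results:
            gotchas.extend(lessons)
        if not verify_wave(wave):              # wave boundary: verify before the next wave
            raise RuntimeError(f"Wave failed verification: {wave}")

execute_waves([["repo-a", "repo-b", "repo-c"], ["repo-d", "repo-e"]])
```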
Pattern 3: Multi-perspective analysis
Multiple agents examine the same artifact from different angles, then a lead synthesizes findings. Used for reviews, audits, and design evaluation.
How it works:
- Skill identifies the artifact to analyze (diff, design doc, codebase)
- Auto-triage: select which specialist agents are relevant
- Spawn all selected agents in parallel
- Each agent returns findings from its perspective
- Lead synthesizes: deduplicate, prioritize, produce unified report
Real example (essentials-code-review, 11 reviewers):

```mermaid
flowchart LR
    A[Diff] --> B[Auto-triage]
    B --> R["Reviewers (2-11 in parallel)"]
    R --> S[Lead synthesizes]
    S --> F[Findings + verdicts + top fixes]
```

Key design decisions:
- Auto-triage: don't run all reviewers every time. If the diff has no SQL, skip the injection scanner. If there are no new interfaces, skip the contract guardian.
- Agents are documentarians, not fixers: reviewers describe what they find. The lead (or user) decides what to fix. This prevents agents from stepping on each other.
- Structured output: each agent returns findings in the same format (severity, file:line, description, suggestion). This makes synthesis possible.
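A sketch of what the shared format and the lead's synthesis step might look like; the field names mirror the description above but are illustrative, not the plugin's exact schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    severity: str      # e.g. "critical" | "major" | "minor" (illustrative levels)
    location: str      # "file:line"
    description: str
    suggestion: str

SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}

def synthesize(per_reviewer: list[list[Finding]]) -> list[Finding]:
    """Deduplicate findings reported by multiple reviewers, then order by severity."""
    unique = {(f.location, f.description): f for findings in per_reviewer for f in findings}
    return sorted(unique.values(), key=lambda f: SEVERITY_RANK.get(f.severity, 99))
```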
Pattern 4: Coordinator + specialists
A central orchestrator agent discovers available specialists at runtime, routes work based on expertise, and resolves conflicts. Used when the agent roster isn't known at design time.
How it works:
- Orchestrator globs `agents/*.md` to discover available agents
- Parses each agent's scope (what it owns, what it defers)
- Analyzes the incoming task and matches it to relevant agents
- Consults agents (parallel or sequential, based on config)
- Detects conflicts between agent recommendations
- Resolves conflicts using configured strategy
- Executes the merged plan
Real example (multi-agent-orchestrator):

```mermaid
flowchart LR
    A[Task] --> B[Discover agents]
    B --> C[Consult in parallel]
    C --> D{Conflict?}
    D -- No --> F[Execute]
    D -- Yes --> E[Resolve] --> F
```

Agent scope declarations make routing possible. Each agent's definition includes what it owns and what it defers:
```markdown
## Scope

**Owns:**
- Redis configuration and caching patterns

**Defers to:**
- Infrastructure provisioning → @infra-expert
- Cross-cutting architecture → @architecture-reviewer

## Handoff Triggers
- "If the question involves infrastructure sizing" → @infra-expert
```

The orchestrator reads these declarations to decide which agents are relevant for a task and how to handle overlaps.
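A rough sketch of how an orchestrator could do that discovery and routing, assuming agent files use the scope format shown above; the keyword matching here is deliberately naive and only illustrates the idea, not the plugin's actual algorithm:

```python
from pathlib import Path

def parse_scope(agent_file: Path) -> dict:
    """Pull the 'Owns' bullet points out of an agent definition."""
    owns, in_owns = [], False
    for line in agent_file.read_text().splitlines():
        if line.strip().startswith("**Owns:**"):
            in_owns = True
        elif line.strip().startswith("**Defers to:**"):
            in_owns = False
        elif in_owns and line.strip().startswith("- "):
            owns.append(line.strip()[2:].lower())
    return {"name": agent_file.stem, "owns": owns}

def route(task: str, agents_dir: str = "agents") -> list[str]:
    """Pick agents whose ownership keywords overlap with the task description."""
    scopes = [parse_scope(path) for path in Path(agents_dir).glob("*.md")]
    words = set(task.lower().split())
    return [s["name"] for s in scopes
            if any(words & set(item.split()) for item in s["owns"])]
```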
Conflict resolution strategies:
| Strategy | How it works | Best for |
|---|---|---|
| `orchestrator_decides` | Orchestrator weighs rationale, makes final call | Fast decisions, clear domain ownership |
| `consensus_required` | Re-query agents with each other's reasoning until aligned | High-stakes architectural choices |
| `majority_vote` | Majority wins, dissent is logged | Lower-risk decisions with many agents |
| `escalate_to_human` | Present all perspectives to the user | Ambiguous trade-offs, policy decisions |
Consensus is a loop, not a vote
The `consensus_required` strategy deserves special attention because it's fundamentally different from the others. Instead of picking a winner, it feeds each agent's rationale back to the disagreeing agents and asks them to reconsider. This creates an iterative refinement loop:
```mermaid
flowchart LR
    A[Agent opinions diverge] --> B[Share rationale across agents]
    B --> C{Aligned?}
    C -- No --> B
    C -- Yes --> D[Proceed with consensus]
```

In practice, this usually converges in 1-2 rounds: agents see constraints they hadn't considered and adjust. If it doesn't converge after a few iterations, the orchestrator can fall back to `escalate_to_human` rather than looping forever.
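In pseudocode, the loop might look like the following, with ask_agent as a hypothetical stand-in for re-consulting an agent with the other agents' rationale attached:

```python
def ask_agent(agent: str, task: str, other_rationale: list[str]) -> str:
    """Hypothetical stand-in: returns the agent's (possibly revised) recommendation."""
    return "recommendation"

def reach_consensus(agents: list[str], task: str, max_rounds: int = 3) -> str | None:
    opinions = {a: ask_agent(a, task, []) for a in agents}
    for _ in range(max_rounds):
        if len(set(opinions.values())) == 1:      # aligned: proceed with consensus
            return next(iter(opinions.values()))
        for agent in agents:                      # share everyone else's reasoning
            others = [o for a, o in opinions.items() if a != agent]
            opinions[agent] = ask_agent(agent, task, others)
    return None  # still diverging: fall back to escalate_to_human
```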
When to use consensus vs the alternatives:
- Use `orchestrator_decides` as the default: it's fast and works when domain boundaries are clear
- Use `consensus_required` when the decision crosses multiple domains and no single agent has the full picture (e.g., a database schema change that affects API contracts, infrastructure, and testing)
- Use `majority_vote` when you have many agents with overlapping expertise and want democratic resolution
- Use `escalate_to_human` as a safety valve for policy decisions or when agents surface genuine trade-offs that require business context
Decision tracking
The orchestrator can optionally log every decision to `.claude/state/decisions.json`: which agents were consulted, what conflicts arose, and how they were resolved. This creates an audit trail for understanding why the system made a particular choice. Three tracking modes:
| Mode | Behavior |
|---|---|
| none | No tracking (default) |
| per-session | Track during task, clear after |
| persistent | Full history preserved across sessions |
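A sketch of what appending one record might look like; the field names are illustrative, since the source only says the log captures which agents were consulted, what conflicts arose, and how they were resolved:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

DECISIONS = Path(".claude/state/decisions.json")

def log_decision(mode: str, consulted: list[str], conflict: str | None, resolution: str) -> None:
    if mode == "none":                        # default: no tracking at all
        return
    history = json.loads(DECISIONS.read_text()) if DECISIONS.exists() else []
    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "consulted": consulted,
        "conflict": conflict,
        "resolution": resolution,
    })
    DECISIONS.parent.mkdir(parents=True, exist_ok=True)
    DECISIONS.write_text(json.dumps(history, indent=2))
    # "per-session" mode would clear this file when the task finishes;
    # "persistent" keeps the full history across sessions.
```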
Quality gates
Orchestration without verification is just automated guessing. Build gates into your workflow that block progress until evidence is provided.
Verification gate pattern
The core principle: no completion claims without fresh evidence.
```
IDENTIFY what to verify → RUN the verification (build, test, grep, read) → READ the output → VERIFY it matches expectations → CLAIM completion only if evidence supports it
```

Three-gate workflow
| Gate | When | What it checks | Verdict |
|---|---|---|---|
| Pre-flight | Before implementation starts | Plan completeness, API contracts, DI wiring, config | CLEAR or HOLD |
| In-flight | After each task completes | Code compiles, tests pass, spec compliance | PASS or FAIL (retry) |
| Post-flight | After all implementation | Full build, integration tests, behavioral equivalence | CLEAR, ISSUES, or WARN |
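To make the gate concrete, here is a minimal sketch of an in-flight style check: run the verification fresh, read the output, and only claim success when the evidence supports it. The command and expected string are examples, not part of any plugin:

```python
import subprocess

def verification_gate(command: list[str], expected: str) -> bool:
    """Run the check, read its output, and verify it before claiming completion."""
    result = subprocess.run(command, capture_output=True, text=True)
    evidence = result.stdout + result.stderr
    passed = result.returncode == 0 and expected in evidence
    if not passed:
        print(f"FAIL: evidence does not support completion\n{evidence}")
    return passed

# Example: don't mark the task done unless the test run actually passed.
# verification_gate(["pytest", "-q"], expected="passed")
```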
Real-world example: adesa-workflow
The adesa-workflow plugin in the Wholesale marketplace ties all these patterns together into a full development lifecycle. Here's how the pieces connect:
The flow
```
discover → breakdown → coordinate → pickup → preflight → execute → review → verify → pr → reflect
```

Each phase is a skill. Each skill spawns specialized agents. The whole thing is coordinated from a single planning repo with absolute paths to code repos.
How it uses each pattern
| Phase | Pattern | What happens |
|---|---|---|
| Discover | Parallel swarm | Research agents scan all affected repos simultaneously |
| Breakdown | Sequential | Decompose feature into stories, then specs, then review checklist (18 categories) |
| Preflight | Multi-perspective | 4 checker agents (branch, wiring, contracts, config) run in parallel |
| Execute | Sequential + parallel | Repos grouped into dependency waves; repos within a wave run in parallel |
| Review | Multi-perspective | 8-9 reviewer agents + spec reviewer + simplify pass |
| Verify | Parallel swarm | Per-repo verifier agents run simultaneously |
State architecture
```
planning-repo/
├── repos.json                        # Team's repo registry
├── features/
│   └── {feature-id}/
│       ├── feature-overview.md       # High-level feature description
│       └── stories/
│           ├── {story}.md            # Business-focused story
│           └── {story}.impl.md       # Technical spec per repo
├── .context/{id}/
│   ├── discovery.md                  # Research notes
│   └── decisions.md                  # Settled questions
├── .executions/{story-id}.md         # Progress tracking (append-only)
├── .reflections/{story-id}.md        # Lessons learned
└── .handoffs/{id}-{timestamp}.md     # Session resume state
```

Every piece of state lives on disk. When context compaction hits, hooks re-inject the execution tracking. When a session ends, /aw-handoff saves everything needed to resume. When a new feature starts, /aw-discover loads past reflections as guardrails.
The feedback loop
After implementation, /aw-reflect captures what went well and what didn't. These reflections feed back into /aw-discover and /aw-breakdown for the next feature. Over time, the system learns from its mistakes: plans get more accurate, specs get more detailed where they need to be, and common gotchas are caught earlier. See Reflection & Feedback Loops for more on this pattern.