LLM orchestration patterns
Composing multiple agents into coordinated workflows: sequential execution, parallel swarms, multi-perspective analysis. The next level after individual skills and agents.
How to move beyond single prompts and compose skills, agents, and coordination layers into autonomous workflows.
The progression
Most teams start with Claude Code as a conversational assistant: ask a question, get an answer. But the real power comes from layering capabilities on top of each other:
| Level | What it looks like | Example |
|---|---|---|
| Prompt | One-shot question and answer | "What does this function do?" |
| Skill | Reusable prompt template with steps | /what-changed summarizes your uncommitted diff |
| Agent | Autonomous expert that handles a full task | A code-reviewer agent that analyzes quality, security, and patterns |
| Orchestration | Multiple agents coordinated by a lead | 8 reviewer agents run in parallel, a lead synthesizes findings |
Each level builds on the last. Skills give Claude repeatable instructions. Agents give Claude specialized personas it can delegate to. Orchestration coordinates multiple agents into workflows that would be too complex for a single prompt.
Getting started
You don't need to build a full orchestration system on day one. Start small and layer up:
- Write a skill that automates a task you do repeatedly. See Skills for the format.
- Extract an agent when part of your skill needs deep expertise. Move that logic into an agent file and have the skill spawn it.
- Add a second agent when you need a different perspective on the same work. Run them in parallel and synthesize.
- Add state tracking when your workflows get long enough to hit context compaction. Write progress to disk.
- Add verification gates when you need confidence that the output is correct. Block progress until evidence is provided.
The marketplace plugins (essentials, multi-agent-orchestrator, refine-prompt, and adesa-workflow) are open source and full of patterns you can study and adapt.
Building blocks
Here's how each Claude Code primitive contributes to orchestration:
| Primitive | Role in orchestration | Where it lives |
|---|---|---|
| Skills | Define the steps of a workflow phase | .claude/commands/*.md |
| Agents | Autonomous specialists that skills delegate to | agents/*.md (in a plugin) |
| Hooks | Automated triggers on events (session start, tool use) | settings.json or settings.local.json |
| MCP Servers | Connect agents to external systems (ADO, databases, APIs) | .mcp.json or settings |
| State files | Persist progress across context compaction | Disk files (JSON, markdown) |
A skill is the entry point: the thing the user invokes. It reads context, makes decisions, and spawns agents to do specialized work. Agents operate autonomously within their scope. Hooks react to system events. MCP servers bridge to external data. State files keep everything in sync when the conversation gets long.
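For example, the state-file idea can be as simple as a small JSON checkpoint that a skill writes after each step and re-reads when a session resumes. A minimal sketch, assuming a hypothetical progress file (the path and fields are illustrative, not a fixed convention):

```python
import json
from pathlib import Path

# Hypothetical state file; the real plugins choose their own paths and formats.
STATE_FILE = Path(".claude/state/progress.json")

def save_checkpoint(completed: list[str], pending: list[str]) -> None:
    """Write workflow progress to disk so it survives context compaction."""
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps({"completed": completed, "pending": pending}, indent=2))

def load_checkpoint() -> dict:
    """Re-read progress at the start of a session (or after compaction)."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"completed": [], "pending": []}
```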
Orchestration patterns
These are the four patterns used across the Carvana and Wholesale marketplaces. Each solves a different coordination problem.
Pattern 1: Sequential execution
One agent at a time, each building on the previous result. Used when tasks have dependencies.
How it works:
- Skill loads a plan (task list with ordering)
- For each task, spawn a fresh agent
- Agent implements the task, returns results
- Skill verifies the result before moving to the next task
- If verification fails, create a fix task and retry
Real example (essentials-execute):

```mermaid
flowchart LR
    A[Plan] --> B[Spawn agent] --> C{Verify}
    C -- Pass --> D[Next task]
    C -- Fail --> E[Fix task] --> B
    D --> B
```

Key design decisions:
- Fresh agent per task: each agent gets a clean context window. This prevents context pollution from earlier tasks bleeding into later ones.
- Skill verifies, not the agent: the orchestrating skill reads the code itself after the agent finishes. Agents shouldn't mark their own homework.
- Checkpoint after each batch: write progress to disk so work isn't lost if the session ends.
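Put together, the loop looks roughly like the sketch below. spawn_fresh_agent and verify are hypothetical stand-ins for Claude Code's agent spawning and the skill's own checks, not real APIs:

```python
def spawn_fresh_agent(task: str) -> None:
    """Hypothetical stand-in: run one task in a clean agent context."""

def verify(task: str) -> bool:
    """Hypothetical stand-in: the orchestrating skill re-reads code/tests itself."""
    return True

def execute_plan(tasks: list[str]) -> list[str]:
    queue, completed = list(tasks), []
    while queue:
        task = queue.pop(0)
        spawn_fresh_agent(task)              # fresh context window per task
        if verify(task):                     # the skill verifies, not the agent
            completed.append(task)
        else:
            queue.insert(0, f"fix: {task}")  # create a fix task and retry
        # checkpoint progress to disk here so work survives a session ending
    return completed

execute_plan(["add endpoint", "add tests"])
```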
Pattern 2: Parallel swarm
Multiple agents working simultaneously on independent tasks. Used when work can be divided without dependencies.
How it works:
- Skill decomposes work into independent streams
- Spawn all agents in a single message (parallel execution)
- Collect results as agents complete
- Synthesize findings or merge outputs
Real example (adesa-workflow wave-based execution):

```mermaid
flowchart LR
    A[Analyze deps] --> W1
    subgraph W1 [Wave 1]
        R1[repo-a] & R2[repo-b] & R3[repo-c]
    end
    W1 --> V1{Verify} -- Pass --> W2
    subgraph W2 [Wave 2]
        R4[repo-d] & R5[repo-e]
    end
    W2 --> V2{Verify}
```

Key design decisions:
- Dependency ordering: analyze which repos/tasks depend on each other. Independent work runs in parallel; dependent work waits.
- Wave boundaries: don't start the next wave until the current one is verified. This prevents cascading failures.
- Gotcha propagation: lessons learned in Wave 1 (unexpected patterns, edge cases) get passed to Wave 2 agents so they don't hit the same issues.
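A rough sketch of that wave logic, with run_repo_agent and verify_wave as hypothetical stand-ins (a thread pool substitutes here for spawning agents in a single message):

```python
from concurrent.futures import ThreadPoolExecutor

def run_repo_agent(repo: str, gotchas: list[str]) -> list[str]:
    """Hypothetical stand-in: apply the change in one repo, return lessons learned."""
    return []

def verify_wave(repos: list[str]) -> bool:
    """Hypothetical stand-in: the skill checks the wave before moving on."""
    return True

def execute_waves(waves: list[list[str]]) -> None:
    gotchas: list[str] = []                    # lessons carried into later waves
    for wave in waves:
        with ThreadPoolExecutor() as pool:     # independent repos run in parallel
            results = list(pool.map(lambda repo: run_repo_agent(repo, gotchas), wave))
        for lessons in results:
            gotchas.extend(lessons)
        if not verify_wave(wave):              # wave boundary: verify before the next wave
            raise RuntimeError(f"Wave failed verification: {wave}")

execute_waves([["repo-a", "repo-b", "repo-c"], ["repo-d", "repo-e"]])
```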
Pattern 3: Multi-perspective analysis
Multiple agents examine the same artifact from different angles, then a lead synthesizes findings. Used for reviews, audits, and design evaluation.
How it works:
- Skill identifies the artifact to analyze (diff, design doc, codebase)
- Auto-triage: select which specialist agents are relevant
- Spawn all selected agents in parallel
- Each agent returns findings from its perspective
- Lead synthesizes: deduplicate, prioritize, produce unified report
Real example (essentials-code-review, 11 reviewers):

```mermaid
flowchart LR
    A[Diff] --> B[Auto-triage]
    B --> R["Reviewers (2-11 in parallel)"]
    R --> S[Lead synthesizes]
    S --> F[Findings + verdicts + top fixes]
```

Key design decisions:
- Auto-triage: don't run all reviewers every time. If the diff has no SQL, skip the injection scanner. If there are no new interfaces, skip the contract guardian.
- Agents are documentarians, not fixers: reviewers describe what they find. The lead (or user) decides what to fix. This prevents agents from stepping on each other.
- Structured output: each agent returns findings in the same format (severity, file:line, description, suggestion). This makes synthesis possible.
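A sketch of what the shared format and the lead's synthesis step might look like; the field names mirror the description above but are illustrative, not the plugin's exact schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    severity: str      # e.g. "critical" | "major" | "minor" (illustrative levels)
    location: str      # "file:line"
    description: str
    suggestion: str

SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}

def synthesize(per_reviewer: list[list[Finding]]) -> list[Finding]:
    """Deduplicate findings reported by multiple reviewers, then order by severity."""
    unique = {(f.location, f.description): f for findings in per_reviewer for f in findings}
    return sorted(unique.values(), key=lambda f: SEVERITY_RANK.get(f.severity, 99))
```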
Pattern 4: Coordinator + specialists
A central orchestrator agent discovers available specialists at runtime, routes work based on expertise, and resolves conflicts. Used when the agent roster isn't known at design time.
How it works:
- Orchestrator globs `agents/*.md` to discover available agents
- Parses each agent's scope (what it owns, what it defers)
- Analyzes the incoming task and matches it to relevant agents
- Consults agents (parallel or sequential, based on config)
- Detects conflicts between agent recommendations
- Resolves conflicts using configured strategy
- Executes the merged plan
Real example (multi-agent-orchestrator):

```mermaid
flowchart LR
    A[Task] --> B[Discover agents]
    B --> C[Consult in parallel]
    C --> D{Conflict?}
    D -- No --> F[Execute]
    D -- Yes --> E[Resolve] --> F
```

Agent scope declarations make routing possible. Each agent's definition includes what it owns and what it defers:
```markdown
## Scope

**Owns:**
- Redis configuration and caching patterns

**Defers to:**
- Infrastructure provisioning → @infra-expert
- Cross-cutting architecture → @architecture-reviewer

## Handoff Triggers
- "If the question involves infrastructure sizing" → @infra-expert
```

The orchestrator reads these declarations to decide which agents are relevant for a task and how to handle overlaps.
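A rough sketch of how an orchestrator could do that discovery and routing, assuming agent files use the scope format shown above; the keyword matching here is deliberately naive and only illustrates the idea, not the plugin's actual algorithm:

```python
from pathlib import Path

def parse_scope(agent_file: Path) -> dict:
    """Pull the 'Owns' bullet points out of an agent definition."""
    owns, in_owns = [], False
    for line in agent_file.read_text().splitlines():
        if line.strip().startswith("**Owns:**"):
            in_owns = True
        elif line.strip().startswith("**Defers to:**"):
            in_owns = False
        elif in_owns and line.strip().startswith("- "):
            owns.append(line.strip()[2:].lower())
    return {"name": agent_file.stem, "owns": owns}

def route(task: str, agents_dir: str = "agents") -> list[str]:
    """Pick agents whose ownership keywords overlap with the task description."""
    scopes = [parse_scope(path) for path in Path(agents_dir).glob("*.md")]
    words = set(task.lower().split())
    return [s["name"] for s in scopes
            if any(words & set(item.split()) for item in s["owns"])]
```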
Conflict resolution strategies:
| Strategy | How it works | Best for |
|---|---|---|
| `orchestrator_decides` | Orchestrator weighs rationale, makes final call | Fast decisions, clear domain ownership |
| `consensus_required` | Re-query agents with each other's reasoning until aligned | High-stakes architectural choices |
| `majority_vote` | Majority wins, dissent is logged | Lower-risk decisions with many agents |
| `escalate_to_human` | Present all perspectives to the user | Ambiguous trade-offs, policy decisions |
Consensus is a loop, not a vote
The `consensus_required` strategy deserves special attention because it's fundamentally different from the others. Instead of picking a winner, it feeds each agent's rationale back to the disagreeing agents and asks them to reconsider. This creates an iterative refinement loop:
```mermaid
flowchart LR
    A[Agent opinions diverge] --> B[Share rationale across agents]
    B --> C{Aligned?}
    C -- No --> B
    C -- Yes --> D[Proceed with consensus]
```

In practice, this usually converges in 1-2 rounds: agents see constraints they hadn't considered and adjust. If it doesn't converge after a few iterations, the orchestrator can fall back to `escalate_to_human` rather than looping forever.
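In pseudocode, the loop might look like the following, with ask_agent as a hypothetical stand-in for re-consulting an agent with the other agents' rationale attached:

```python
def ask_agent(agent: str, task: str, other_rationale: list[str]) -> str:
    """Hypothetical stand-in: returns the agent's (possibly revised) recommendation."""
    return "recommendation"

def reach_consensus(agents: list[str], task: str, max_rounds: int = 3) -> str | None:
    opinions = {a: ask_agent(a, task, []) for a in agents}
    for _ in range(max_rounds):
        if len(set(opinions.values())) == 1:      # aligned: proceed with consensus
            return next(iter(opinions.values()))
        for agent in agents:                      # share everyone else's reasoning
            others = [o for a, o in opinions.items() if a != agent]
            opinions[agent] = ask_agent(agent, task, others)
    return None  # still diverging: fall back to escalate_to_human
```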
When to use consensus vs the alternatives:
- Use `orchestrator_decides` as the default: it's fast and works when domain boundaries are clear
- Use `consensus_required` when the decision crosses multiple domains and no single agent has the full picture (e.g., a database schema change that affects API contracts, infrastructure, and testing)
- Use `majority_vote` when you have many agents with overlapping expertise and want democratic resolution
- Use `escalate_to_human` as a safety valve for policy decisions or when agents surface genuine trade-offs that require business context
Decision tracking
The orchestrator can optionally log every decision to `.claude/state/decisions.json`: which agents were consulted, what conflicts arose, and how they were resolved. This creates an audit trail for understanding why the system made a particular choice. Three tracking modes:
| Mode | Behavior |
|---|---|
| none | No tracking (default) |
| per-session | Track during task, clear after |
| persistent | Full history preserved across sessions |
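A sketch of what appending one record might look like; the field names are illustrative, since the source only says the log captures which agents were consulted, what conflicts arose, and how they were resolved:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

DECISIONS = Path(".claude/state/decisions.json")

def log_decision(mode: str, consulted: list[str], conflict: str | None, resolution: str) -> None:
    if mode == "none":                        # default: no tracking at all
        return
    history = json.loads(DECISIONS.read_text()) if DECISIONS.exists() else []
    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "consulted": consulted,
        "conflict": conflict,
        "resolution": resolution,
    })
    DECISIONS.parent.mkdir(parents=True, exist_ok=True)
    DECISIONS.write_text(json.dumps(history, indent=2))
    # "per-session" mode would clear this file when the task finishes;
    # "persistent" keeps the full history across sessions.
```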
Quality gates
Orchestration without verification is just automated guessing. Build gates into your workflow that block progress until evidence is provided.
Verification gate pattern
The core principle: no completion claims without fresh evidence.
```
IDENTIFY what to verify → RUN the verification (build, test, grep, read) → READ the output → VERIFY it matches expectations → CLAIM completion only if evidence supports it
```

Three-gate workflow
| Gate | When | What it checks | Verdict |
|---|---|---|---|
| Pre-flight | Before implementation starts | Plan completeness, API contracts, DI wiring, config | CLEAR or HOLD |
| In-flight | After each task completes | Code compiles, tests pass, spec compliance | PASS or FAIL (retry) |
| Post-flight | After all implementation | Full build, integration tests, behavioral equivalence | CLEAR, ISSUES, or WARN |
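To make the gate concrete, here is a minimal sketch of an in-flight style check: run the verification fresh, read the output, and only claim success when the evidence supports it. The command and expected string are examples, not part of any plugin:

```python
import subprocess

def verification_gate(command: list[str], expected: str) -> bool:
    """Run the check, read its output, and verify it before claiming completion."""
    result = subprocess.run(command, capture_output=True, text=True)
    evidence = result.stdout + result.stderr
    passed = result.returncode == 0 and expected in evidence
    if not passed:
        print(f"FAIL: evidence does not support completion\n{evidence}")
    return passed

# Example: don't mark the task done unless the test run actually passed.
# verification_gate(["pytest", "-q"], expected="passed")
```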
Real-world example: adesa-workflow
The adesa-workflow plugin in the Wholesale marketplace ties all these patterns together into a full development lifecycle. Here's how the pieces connect:
The flow
```
discover → breakdown → coordinate → pickup → preflight → execute → review → verify → pr → reflect
```

Each phase is a skill. Each skill spawns specialized agents. The whole thing is coordinated from a single planning repo with absolute paths to code repos.
How it uses each pattern
| Phase | Pattern | What happens |
|---|---|---|
| Discover | Parallel swarm | Research agents scan all affected repos simultaneously |
| Breakdown | Sequential | Decompose feature into stories, then specs, then review checklist (18 categories) |
| Preflight | Multi-perspective | 4 checker agents (branch, wiring, contracts, config) run in parallel |
| Execute | Sequential + parallel | Repos grouped into dependency waves; repos within a wave run in parallel |
| Review | Multi-perspective | 8-9 reviewer agents + spec reviewer + simplify pass |
| Verify | Parallel swarm | Per-repo verifier agents run simultaneously |
State architecture
```
planning-repo/
├── repos.json                        # Team's repo registry
├── features/
│   └── {feature-id}/
│       ├── feature-overview.md       # High-level feature description
│       └── stories/
│           ├── {story}.md            # Business-focused story
│           └── {story}.impl.md       # Technical spec per repo
├── .context/{id}/
│   ├── discovery.md                  # Research notes
│   └── decisions.md                  # Settled questions
├── .executions/{story-id}.md         # Progress tracking (append-only)
├── .reflections/{story-id}.md        # Lessons learned
└── .handoffs/{id}-{timestamp}.md     # Session resume state
```

Every piece of state lives on disk. When context compaction hits, hooks re-inject the execution tracking. When a session ends, /aw-handoff saves everything needed to resume. When a new feature starts, /aw-discover loads past reflections as guardrails.
The feedback loop
After implementation, /aw-reflect captures what went well and what didn't. These reflections feed back into /aw-discover and /aw-breakdown for the next feature. Over time, the system learns from its mistakes: plans get more accurate, specs get more detailed where they need to be, and common gotchas are caught earlier. See Reflection & Feedback Loops for more on this pattern.