
LLM orchestration patterns

Composing multiple agents into coordinated workflows: sequential execution, parallel swarms, and multi-perspective analysis. The next level after individual skills and agents.


How to move beyond single prompts and compose skills, agents, and coordination layers into autonomous workflows.


The progression

Most teams start with Claude Code as a conversational assistant: ask a question, get an answer. But the real power comes from layering capabilities on top of each other:

| Level | What it looks like | Example |
| --- | --- | --- |
| Prompt | One-shot question and answer | "What does this function do?" |
| Skill | Reusable prompt template with steps | `/what-changed` summarizes your uncommitted diff |
| Agent | Autonomous expert that handles a full task | A code-reviewer agent that analyzes quality, security, and patterns |
| Orchestration | Multiple agents coordinated by a lead | 8 reviewer agents run in parallel, a lead synthesizes findings |

Each level builds on the last. Skills give Claude repeatable instructions. Agents give Claude specialized personas it can delegate to. Orchestration coordinates multiple agents into workflows that would be too complex for a single prompt.


Getting started

You don't need to build a full orchestration system on day one. Start small and layer up:

  1. Write a skill that automates a task you do repeatedly. See Skills for the format.
  2. Extract an agent when part of your skill needs deep expertise. Move that logic into an agent file and have the skill spawn it.
  3. Add a second agent when you need a different perspective on the same work. Run them in parallel and synthesize.
  4. Add state tracking when your workflows get long enough to hit context compaction. Write progress to disk.
  5. Add verification gates when you need confidence that the output is correct. Block progress until evidence is provided.

The marketplace plugins (essentials, multi-agent-orchestrator, refine-prompt, and adesa-workflow) are open source and full of patterns you can study and adapt.


Building blocks

Here's how each Claude Code primitive contributes to orchestration:

| Primitive | Role in orchestration | Where it lives |
| --- | --- | --- |
| Skills | Define the steps of a workflow phase | `.claude/commands/*.md` |
| Agents | Autonomous specialists that skills delegate to | `agents/*.md` (in a plugin) |
| Hooks | Automated triggers on events (session start, tool use) | `settings.json` or `settings.local.json` |
| MCP Servers | Connect agents to external systems (ADO, databases, APIs) | `.mcp.json` or settings |
| State files | Persist progress across context compaction | Disk files (JSON, markdown) |

A skill is the entry point: the thing the user invokes. It reads context, makes decisions, and spawns agents to do specialized work. Agents operate autonomously within their scope. Hooks react to system events. MCP servers bridge to external data. State files keep everything in sync when the conversation gets long.


Orchestration patterns

These are the four patterns used across the Carvana and Wholesale marketplaces. Each solves a different coordination problem.

Pattern 1: Sequential execution

One agent at a time, each building on the previous result. Used when tasks have dependencies.

How it works:

  1. Skill loads a plan (task list with ordering)
  2. For each task, spawn a fresh agent
  3. Agent implements the task, returns results
  4. Skill verifies the result before moving to the next task
  5. If verification fails, create a fix task and retry

Real example (essentials-execute):

```mermaid
flowchart LR
A[Plan] --> B[Spawn agent] --> C{Verify}
C -- Pass --> D[Next task]
C -- Fail --> E[Fix task] --> B
D --> B
```

Key design decisions:

  • Fresh agent per task β€” Each agent gets a clean context window. Prevents context pollution from earlier tasks bleeding into later ones.
  • Skill verifies, not the agent β€” The orchestrating skill reads the code itself after the agent finishes. Agents shouldn’t mark their own homework.
  • Checkpoint after each batch β€” Write progress to disk so work isn’t lost if the session ends.

Pattern 2: Parallel swarm

Multiple agents working simultaneously on independent tasks. Used when work can be divided without dependencies.

How it works:

  1. Skill decomposes work into independent streams
  2. Spawn all agents in a single message (parallel execution)
  3. Collect results as agents complete
  4. Synthesize findings or merge outputs

Real example (adesa-workflow wave-based execution):

```mermaid
flowchart LR
A[Analyze deps] --> W1
subgraph W1 [Wave 1]
R1[repo-a] & R2[repo-b] & R3[repo-c]
end
W1 --> V1{Verify} -- Pass --> W2
subgraph W2 [Wave 2]
R4[repo-d] & R5[repo-e]
end
W2 --> V2{Verify}
```

Key design decisions:

  • Dependency ordering β€” Analyze which repos/tasks depend on each other. Independent work runs in parallel; dependent work waits.
  • Wave boundaries β€” Don’t start the next wave until the current one is verified. This prevents cascading failures.
  • Gotcha propagation β€” Lessons learned in Wave 1 (unexpected patterns, edge cases) get passed to Wave 2 agents so they don’t hit the same issues.

Pattern 3: Multi-perspective analysis

Multiple agents examine the same artifact from different angles, then a lead synthesizes findings. Used for reviews, audits, and design evaluation.

How it works:

  1. Skill identifies the artifact to analyze (diff, design doc, codebase)
  2. Auto-triage: select which specialist agents are relevant
  3. Spawn all selected agents in parallel
  4. Each agent returns findings from its perspective
  5. Lead synthesizes: deduplicate, prioritize, produce unified report

Real example (essentials-code-review, 11 reviewers):

```mermaid
flowchart LR
A[Diff] --> B[Auto-triage]
B --> R["Reviewers (2-11 in parallel)"]
R --> S[Lead synthesizes]
S --> F[Findings + verdicts + top fixes]
```

Key design decisions:

  • Auto-triage β€” Don’t run all reviewers every time. If the diff has no SQL, skip the injection scanner. If there are no new interfaces, skip the contract guardian.
  • Agents are documentarians, not fixers β€” Reviewers describe what they find. The lead (or user) decides what to fix. This prevents agents from stepping on each other.
  • Structured output β€” Each agent returns findings in the same format (severity, file:line, description, suggestion). This makes synthesis possible.

Pattern 4: Coordinator + specialists

A central orchestrator agent discovers available specialists at runtime, routes work based on expertise, and resolves conflicts. Used when the agent roster isn't known at design time.

How it works:

  1. Orchestrator globs agents/*.md to discover available agents
  2. Parses each agent's scope (what it owns, what it defers)
  3. Analyzes the incoming task and matches it to relevant agents
  4. Consults agents (parallel or sequential, based on config)
  5. Detects conflicts between agent recommendations
  6. Resolves conflicts using configured strategy
  7. Executes the merged plan

Real example (multi-agent-orchestrator):

```mermaid
flowchart LR
A[Task] --> B[Discover agents]
B --> C[Consult in parallel]
C --> D{Conflict?}
D -- No --> F[Execute]
D -- Yes --> E[Resolve] --> F
```

Agent scope declarations make routing possible. Each agent's definition includes what it owns and what it defers:

```markdown
## Scope

**Owns:**
- Redis configuration and caching patterns

**Defers to:**
- Infrastructure provisioning → @infra-expert
- Cross-cutting architecture → @architecture-reviewer

## Handoff Triggers
- "If the question involves infrastructure sizing" → @infra-expert
```

The orchestrator reads these declarations to decide which agents are relevant for a task and how to handle overlaps.
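
A hypothetical sketch of that discovery step, assuming the scope format shown above (this is not the multi-agent-orchestrator's actual code, and the parsing is deliberately naive):

```python
from pathlib import Path

def discover_agents(agents_dir="agents"):
    """Glob agents/*.md and collect each agent's '**Owns:**' bullet items
    so tasks can be routed by topic overlap."""
    roster = {}
    for path in Path(agents_dir).glob("*.md"):
        owns, in_owns = [], False
        for raw in path.read_text().splitlines():
            line = raw.strip()
            if line.startswith("**Owns:**"):
                in_owns = True
            elif line.startswith("**") or line.startswith("##"):
                in_owns = False               # left the Owns section
            elif in_owns and line.startswith("- "):
                owns.append(line[2:].lower())
        roster[path.stem] = owns
    return roster

def match_agents(task, roster):
    """Route a task to every agent whose owned topics overlap its text."""
    text = task.lower()
    return [name for name, owns in roster.items()
            # crude keyword match on the first word of each owned topic
            if any(word in text for topic in owns for word in topic.split()[:1])]
```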

Conflict resolution strategies:

| Strategy | How it works | Best for |
| --- | --- | --- |
| `orchestrator_decides` | Orchestrator weighs rationale, makes final call | Fast decisions, clear domain ownership |
| `consensus_required` | Re-query agents with each other's reasoning until aligned | High-stakes architectural choices |
| `majority_vote` | Majority wins, dissent is logged | Lower-risk decisions with many agents |
| `escalate_to_human` | Present all perspectives to the user | Ambiguous trade-offs, policy decisions |

Consensus is a loop, not a vote

The `consensus_required` strategy deserves special attention because it's fundamentally different from the others. Instead of picking a winner, it feeds each agent's rationale back to the disagreeing agents and asks them to reconsider. This creates an iterative refinement loop:

```mermaid
flowchart LR
A[Agent opinions diverge] --> B[Share rationale across agents]
B --> C{Aligned?}
C -- No --> B
C -- Yes --> D[Proceed with consensus]
```

In practice, this usually converges in 1-2 rounds: agents see constraints they hadn't considered and adjust. If it doesn't converge after a few iterations, the orchestrator can fall back to `escalate_to_human` rather than looping forever.
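
In code form the loop might look like this; the agent objects, `reconsider`, and the round cap are assumptions for illustration, not the plugin's interface:

```python
def seek_consensus(agents, question, reconsider, max_rounds=3):
    """consensus_required as an iterative loop: share rationale, ask each
    agent to reconsider, repeat until positions align or rounds run out."""
    opinions = {agent: agent.answer(question) for agent in agents}

    def aligned():
        return len({op.position for op in opinions.values()}) == 1

    for _ in range(max_rounds):
        if aligned():
            return opinions                     # proceed with consensus
        shared = [op.rationale for op in opinions.values()]
        opinions = {agent: reconsider(agent, question, shared) for agent in agents}
    return opinions if aligned() else None      # None => escalate_to_human
```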

When to use consensus vs the alternatives:

  • Use orchestrator_decides as the default β€” it’s fast and works when domain boundaries are clear
  • Use consensus_required when the decision crosses multiple domains and no single agent has the full picture (e.g., a database schema change that affects API contracts, infrastructure, and testing)
  • Use majority_vote when you have many agents with overlapping expertise and want democratic resolution
  • Use escalate_to_human as a safety valve for policy decisions or when agents surface genuine trade-offs that require business context

Decision tracking

The orchestrator can optionally log every decision to `.claude/state/decisions.json`: which agents were consulted, what conflicts arose, and how they were resolved. This creates an audit trail for understanding why the system made a particular choice. Three tracking modes:

| Mode | Behavior |
| --- | --- |
| `none` | No tracking (default) |
| `per-session` | Track during task, clear after |
| `persistent` | Full history preserved across sessions |
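
A sketch of what a persistent-mode entry could record; the field names are illustrative, not the plugin's documented schema:

```python
import json, time
from pathlib import Path

DECISIONS = Path(".claude/state/decisions.json")

def log_decision(task, consulted, conflicts, resolution, strategy):
    """Append one audit-trail entry (persistent mode keeps the full list)."""
    history = json.loads(DECISIONS.read_text()) if DECISIONS.exists() else []
    history.append({
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "task": task,
        "agents_consulted": consulted,    # which agents weighed in
        "conflicts": conflicts,           # what they disagreed about
        "resolution": resolution,         # what was decided
        "strategy": strategy,             # e.g. "orchestrator_decides"
    })
    DECISIONS.parent.mkdir(parents=True, exist_ok=True)
    DECISIONS.write_text(json.dumps(history, indent=2))
```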

Quality gates

Orchestration without verification is just automated guessing. Build gates into your workflow that block progress until evidence is provided.

Verification gate pattern

The core principle: no completion claims without fresh evidence.

```
IDENTIFY what to verify
→ RUN the verification (build, test, grep, read)
→ READ the output
→ VERIFY it matches expectations
→ CLAIM completion only if evidence supports it
```
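
The same principle as a minimal sketch: run the check, read the real output, and only then accept the claim. The example command is a placeholder for your project's own build or test invocation.

```python
import subprocess

def verified(claim, command, expect):
    """No completion claims without fresh evidence: run the verification,
    read its actual output, and accept the claim only if both the exit
    code and the expected marker check out."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    evidence = proc.stdout + proc.stderr
    ok = proc.returncode == 0 and expect in evidence
    return ok, evidence[-500:]  # keep the output tail as supporting evidence

# hypothetical usage: ok, evidence = verified("tests pass", "pytest -q", "passed")
```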

Three-gate workflow

| Gate | When | What it checks | Verdict |
| --- | --- | --- | --- |
| Pre-flight | Before implementation starts | Plan completeness, API contracts, DI wiring, config | CLEAR or HOLD |
| In-flight | After each task completes | Code compiles, tests pass, spec compliance | PASS or FAIL (retry) |
| Post-flight | After all implementation | Full build, integration tests, behavioral equivalence | CLEAR, ISSUES, or WARN |

Real-world example: adesa-workflow

The adesa-workflow plugin in the Wholesale marketplace ties all these patterns together into a full development lifecycle. Here's how the pieces connect:

The flow

discover → breakdown → coordinate → pickup → preflight → execute → review → verify → pr → reflect

Each phase is a skill. Each skill spawns specialized agents. The whole thing is coordinated from a single planning repo with absolute paths to code repos.

How it uses each pattern

| Phase | Pattern | What happens |
| --- | --- | --- |
| Discover | Parallel swarm | Research agents scan all affected repos simultaneously |
| Breakdown | Sequential | Decompose feature into stories, then specs, then review checklist (18 categories) |
| Preflight | Multi-perspective | 4 checker agents (branch, wiring, contracts, config) run in parallel |
| Execute | Sequential + parallel | Repos grouped into dependency waves; repos within a wave run in parallel |
| Review | Multi-perspective | 8-9 reviewer agents + spec reviewer + simplify pass |
| Verify | Parallel swarm | Per-repo verifier agents run simultaneously |

State architecture

```
planning-repo/
├── repos.json                        # Team's repo registry
├── features/
│   └── {feature-id}/
│       ├── feature-overview.md       # High-level feature description
│       └── stories/
│           ├── {story}.md            # Business-focused story
│           └── {story}.impl.md       # Technical spec per repo
├── .context/{id}/
│   ├── discovery.md                  # Research notes
│   └── decisions.md                  # Settled questions
├── .executions/{story-id}.md         # Progress tracking (append-only)
├── .reflections/{story-id}.md        # Lessons learned
└── .handoffs/{id}-{timestamp}.md     # Session resume state
```

Every piece of state lives on disk. When context compaction hits, hooks re-inject the execution tracking. When a session ends, /aw-handoff saves everything needed to resume. When a new feature starts, /aw-discover loads past reflections as guardrails.
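
As an illustration of the append-only idea (the file name follows the layout above; the helper itself is hypothetical):

```python
import time
from pathlib import Path

def record_progress(story_id, message, planning_repo="."):
    """Append a timestamped line to .executions/{story-id}.md. Append-only
    history survives context compaction and can be re-injected by hooks."""
    path = Path(planning_repo) / ".executions" / f"{story_id}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(f"- {time.strftime('%Y-%m-%d %H:%M')} {message}\n")
```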

The feedback loop

After implementation, /aw-reflect captures what went well and what didn't. These reflections feed back into /aw-discover and /aw-breakdown for the next feature. Over time, the system learns from its mistakes: plans get more accurate, specs get more detailed where they need to be, and common gotchas are caught earlier. See Reflection & Feedback Loops for more on this pattern.