Wholesale AI Champions Claude Code playbook
External · GitHub

GitHub Agent HQ

GitHub's surface for delegating coding work to AI agents (Copilot Coding Agent, Claude, others), with gh-aw layered on top so workflows ship as markdown.

GitHub Agent HQ is the surface where you hand work off to AI coding agents instead of doing it interactively. The agent runs on GitHub-managed infrastructure, pushes commits to a branch, and opens a draft PR for review. It’s async, it’s bounded, and it never touches your local machine.

For most production work, the higher-leverage layer on top of Agent HQ is gh-aw (GitHub Agentic Workflows), a CLI extension that lets you write agentic workflows as markdown files with YAML frontmatter in .github/workflows/*.md. Each file compiles to a .lock.yml workflow that GitHub Actions runs. The agents are the same, but they are driven by triggers (push, schedule, workflow_dispatch) instead of by you assigning issues.

Two modes of using it

Ad-hoc: Issue → PR. A new issue arrives. You assign it to the copilot user (or wire an issue-to-pr.yml workflow that assigns it automatically on issues: opened). Copilot Coding Agent reads the issue, creates a branch, implements the change, and opens a draft PR. Useful for the unpredictable backlog: bugs, small features, minor dependency work.

Scheduled: Trigger → PR. A workflow runs on a push to master, a cron, or a workflow_dispatch. It executes a markdown-defined agent job: for example, scan the codebase, query MCP servers for context, refine prompts, and open a PR. Useful for recurring maintenance you’d otherwise forget to do.

A real example: autonomously-improving-agents

The workflow lives at adesa-digital-inventory-manager/.github/workflows/autonomously-improving-agents.md. It triggers on every push to master that touches src/**, any *.csproj, or any Directory.*.props.

What it does end-to-end:

  1. Sets up .NET 10 and the Aspire CLI on the runner.
  2. Boots the Aspire AppHost (aspire run --launch-profile dev) so the Aspire MCP can resolve real resource state — same MCP wiring as local Claude Code.
  3. Globs .claude/agents/*.md to enumerate every Claude Code agent in the repo.
  4. For each agent, scans the codebase (Glob + Grep) for actual usage of that agent’s domain technology — package versions in csprojs, using statements, key API calls.
  5. Calls Context7 MCP for current docs on the agent’s primary library and compares against what the agent prompt references.
  6. Scores each agent on Freshness (40%), Accuracy (40%), and Coverage (20%), combined into a weighted composite out of 100.
  7. Refines any agent scoring below 85 — edits the agent file with targeted fixes against the identified gaps.
  8. Logs results to .claude/state/decisions.json under an evolution_log array.
  9. If anything was refined, opens a PR titled “chore: evolve agent prompts” from branch chore/agent-evolution.
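
The scoring and threshold in steps 6 and 7 can be sketched in a few lines. This is an illustration only: in the real workflow the agent does the scoring from its prompt. The sketch assumes each sub-score is on a 0 to 100 scale, and the agent names and scores are hypothetical; only the weights and the 85 cutoff come from the steps above.

```python
from dataclasses import dataclass

# Weights (in percent) and cutoff as stated in steps 6 and 7.
WEIGHTS = {"freshness": 40, "accuracy": 40, "coverage": 20}
REFINE_BELOW = 85


@dataclass(frozen=True)
class AgentScore:
    name: str
    freshness: float  # assumed 0-100
    accuracy: float   # assumed 0-100
    coverage: float   # assumed 0-100

    @property
    def composite(self) -> float:
        """Weighted composite out of 100."""
        total = (
            WEIGHTS["freshness"] * self.freshness
            + WEIGHTS["accuracy"] * self.accuracy
            + WEIGHTS["coverage"] * self.coverage
        )
        return total / 100

    @property
    def needs_refinement(self) -> bool:
        return self.composite < REFINE_BELOW


def agents_to_refine(scores: list[AgentScore]) -> list[str]:
    """Names of agents whose composite falls below the cutoff."""
    return [s.name for s in scores if s.needs_refinement]


# Hypothetical agents and sub-scores, for illustration only.
scores = [
    AgentScore("aspire-expert", freshness=90, accuracy=95, coverage=80),
    AgentScore("ef-core-expert", freshness=60, accuracy=90, coverage=100),
]

for s in scores:
    print(f"{s.name}: {s.composite:.1f} refine={s.needs_refinement}")
```

With these made-up inputs the first agent lands at 90 and is left alone, while the second lands at 80 and would be refined. Because Freshness and Accuracy together carry 80% of the weight, a stale prompt can drop below the cutoff even with full Coverage.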

The whole job is one markdown file. The frontmatter declares the trigger, permissions, allowed network destinations (context7.com, mcp.context7.com), MCPs, and safe outputs (create-pull-request, add-comment, push-to-pull-request-branch). The body is the prompt the agent runs. No YAML acrobatics, no separate prompt file.

That’s the pattern that makes Agent HQ + gh-aw genuinely useful in this codebase: not “assign random issues to Copilot,” but “have an agent maintain my agents while I sleep.”

When to reach for it

  • Recurring maintenance work — agent prompt evolution, Splunk index scans, dependency triage, doc freshness checks. Anything you’d otherwise queue mentally and never get to.
  • Ad-hoc fire-and-forget tickets — bugs and small features with clear acceptance criteria, no architectural calls required.
  • Parallel exploration — three agents on three branches trying three approaches; keep the one that works.
  • Repetitive PRs — same shape of change, different files, churned through the queue while you focus elsewhere.

Skip it for:

  • Architectural decisions that require taste or product judgment
  • Multi-repo changes (each session sees one repo)
  • Anything where you’d want to interrupt mid-flight to course-correct

How it pairs with Claude Code

Local Claude Code is conversational and interactive — you’re in the loop, steering, reviewing partial output, making judgment calls. Agent HQ + gh-aw is the opposite: you describe the work in markdown, the trigger fires, and you review the PR when it’s ready.

The two complement each other. Use Claude Code for the work you’re actively driving; use Agent HQ + gh-aw for the long tail and the recurring maintenance. Same Claude models in both places, so capability is consistent — what differs is your role.

Concretely:

  • Complex feature you’re shaping → Claude Code (you’re driving)
  • Cross-repo work with planning specs → Adesa workflow + Claude Code
  • Library upgrade with mechanical test changes → assign issue to Copilot
  • “Keep my agents fresh against the codebase” → gh-aw on push to master

Setup

  1. Enable Copilot Coding Agent on the repo. An org admin enables Coding Agent in the Copilot policy; per-repo settings expose the agent’s permissions.
  2. Pick a default model. Claude Sonnet, Opus, and Haiku are first-class; OpenAI, Gemini, and Grok are also selectable per session. Set the org-wide default in Copilot settings.
  3. Run gh aw init in the repo. This scaffolds .github/workflows/copilot-setup-steps.yml, the hook Copilot Coding Agent looks for to provision its environment, with the gh-aw setup-cli action wired in. The job has to be named copilot-setup-steps for Copilot to find it; init handles that for you, so you don’t write this file by hand.
  4. Drop your first workflow markdown into .github/workflows/. Frontmatter declares the trigger, permissions, MCPs, allowed network destinations, and safe outputs; the body is the prompt the agent runs.
  5. Compile. gh aw compile [workflow-name] produces the .lock.yml that GitHub Actions actually runs. Re-compile whenever you edit the markdown or bump the gh-aw version.
  6. Wire MCPs in workflow frontmatter. The same MCPs you connect locally (Drive, Slack, Splunk, Context7, Aspire) can be referenced from the remote agent’s frontmatter; list each MCP’s hosts in the network.allowed block so the runner can reach them.
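
Step 5 is the one that is easiest to forget: an edited markdown file with an out-of-date .lock.yml means GitHub Actions is still running the old workflow. Below is a minimal local check, not part of gh-aw itself. It assumes name.md compiles to name.lock.yml in the same directory and compares file modification times, which is only meaningful in a working tree you have been editing (a fresh clone resets them). The workflow names in the demo are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def stale_workflows(workflows_dir: Path) -> list[str]:
    """Return workflow markdown files whose lock file is missing or older."""
    stale = []
    for md in sorted(workflows_dir.glob("*.md")):
        # Assumption: name.md compiles to name.lock.yml next to it.
        lock = md.with_suffix(".lock.yml")
        if not lock.exists() or lock.stat().st_mtime < md.stat().st_mtime:
            stale.append(md.name)
    return stale


# Demo against a throwaway directory with hypothetical workflow names.
with tempfile.TemporaryDirectory() as tmp:
    wf = Path(tmp)
    for name, mtime in [
        ("fresh.md", 1_000), ("fresh.lock.yml", 2_000),    # compiled after edit
        ("edited.md", 3_000), ("edited.lock.yml", 2_000),  # edited after compile
        ("never-compiled.md", 1_000),                      # no lock file at all
    ]:
        path = wf / name
        path.write_text("placeholder\n", encoding="utf-8")
        os.utime(path, (mtime, mtime))  # pin mtimes so the demo is deterministic
    stale = stale_workflows(wf)

print(stale)
```

In a real repo you would point stale_workflows at Path(".github/workflows") and run gh aw compile for anything it reports.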

Writing issues that agents finish

For the ad-hoc Issue → PR mode, the shape of the issue determines the shape of the PR. Issues that fail are the ones whose acceptance criteria didn’t define done.

Issues that succeed:

  • Single repo, bounded scope (one feature, one bug, one refactor)
  • Acceptance criteria as observable behavior, not implementation
  • Existing patterns to follow (link to a similar PR or file)
  • Tests that exist and pass before the change
  • Files to touch named explicitly when the agent might guess wrong

Issues that fail:

  • “Improve X” without defining what improvement looks like
  • Cross-repo work (the agent only sees one repo)
  • “Architect a solution for Y” — agents implement; they don’t make architectural choices
  • Acceptance criteria that read as “looks good to me”

The same shape /aw-execute consumes from ADO tasks transfers cleanly to GitHub issues — one-paragraph summary, bulleted acceptance criteria, “Files to touch” and “Files to leave alone” sections.
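
One way to keep issues in that shape is to generate the body from a small structure instead of free-typing it. The sketch below is an assumption about how you might do that, not something /aw-execute or GitHub provides: the AgentIssue class, its field names, and the example endpoint and paths are all hypothetical. It renders the four sections named above and refuses to render an issue with no acceptance criteria.

```python
from dataclasses import dataclass, field


def _section(title: str, items: list[str]) -> list[str]:
    """Render one markdown section as a heading plus a bullet list."""
    bullets = [f"- {item}" for item in items] or ["- (none)"]
    return [f"## {title}", "", *bullets, ""]


@dataclass
class AgentIssue:
    summary: str                    # one-paragraph summary
    acceptance_criteria: list[str]  # observable behavior, not implementation
    files_to_touch: list[str] = field(default_factory=list)
    files_to_leave_alone: list[str] = field(default_factory=list)

    def render(self) -> str:
        if not self.acceptance_criteria:
            raise ValueError("an issue without acceptance criteria does not define done")
        lines = [self.summary.strip(), ""]
        lines += _section("Acceptance criteria", self.acceptance_criteria)
        lines += _section("Files to touch", self.files_to_touch)
        lines += _section("Files to leave alone", self.files_to_leave_alone)
        return "\n".join(lines).rstrip() + "\n"


# Hypothetical bug report, for illustration only.
issue = AgentIssue(
    summary=(
        "Vehicle search returns HTTP 500 when the VIN filter is empty; "
        "it should return an empty result set."
    ),
    acceptance_criteria=[
        "GET /vehicles?vin= responds 200 with an empty list",
        "Existing search tests still pass",
    ],
    files_to_touch=["src/Search/VinFilter.cs"],
    files_to_leave_alone=["src/Search/Legacy/**"],
)
body = issue.render()
print(body)
```

The printed body can be pasted into a new GitHub issue as-is.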

Limits worth knowing

  • One repo per session. No cross-repo coordination; for multi-repo features, use the Adesa workflow.
  • Limited shared context across runs. Each session starts cold by default. gh-aw exposes a “memories” surface that agents can write to, but only the Platform team can read or delete them — engineers and product can’t, so they’re effectively write-only from your seat. For real cross-run state (like the autonomously-improving-agents evolution log), commit state files back to the repo under .claude/state/... and the next run picks them up as part of the checkout.
  • Review-first, not merge-first. Agent HQ opens draft PRs that wait for your review. Don’t auto-merge — the value is in the agent doing the work, not in skipping the gate.
  • Compute is bounded. Long-running tasks can time out; if a session needs hours of work, break it into smaller workflows or smaller issues.
  • Logs live in the PR or the workflow run. For ad-hoc work, the agent panel on the PR has the transcript; for gh-aw workflows, gh aw logs and gh aw audit <run-id> pull the session detail.
  • Dependabot targets generated manifest files. When gh-aw compiles with --dependabot, it generates package.json/requirements.txt/go.mod under .github/workflows/. Don’t merge Dependabot PRs against those files directly; update the source .md and re-run gh aw compile --dependabot.
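
For the shared-context limit above, committing state back to the repo comes down to a read-modify-write on a JSON file that the next checkout will see. A minimal sketch follows. The source only names the file (.claude/state/decisions.json) and the evolution_log array, so the entry fields (timestamp, agent, composite, refined) and the agent names are hypothetical. The demo writes to a temporary directory instead of a real repo.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_entry(agent: str, composite: float, refined: bool) -> dict:
    # Hypothetical entry shape; adapt to whatever your workflow records.
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "composite": composite,
        "refined": refined,
    }


def append_evolution_entry(state_file: Path, entry: dict) -> dict:
    """Append one entry to evolution_log, creating the file if needed."""
    if state_file.exists():
        state = json.loads(state_file.read_text(encoding="utf-8"))
    else:
        state = {}
    # Other top-level keys in the file are preserved untouched.
    state.setdefault("evolution_log", []).append(entry)
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state


# Demo: two "runs" appending to the same file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / ".claude" / "state" / "decisions.json"
    append_evolution_entry(state_file, make_entry("aspire-expert", 90.0, refined=False))
    final_state = append_evolution_entry(
        state_file, make_entry("ef-core-expert", 80.0, refined=True)
    )

print(len(final_state["evolution_log"]))
```

The file only carries over to the next run if the change is committed, so in the workflow this write has to land in the PR (or on the branch) along with the refined agent files.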