GitHub Agent HQ
GitHub's surface for delegating coding work to AI agents (Copilot Coding Agent, Claude, others), with gh-aw layered on top so workflows ship as markdown.
GitHub Agent HQ is the surface where you hand work off to AI coding agents instead of doing it interactively. The agent runs on GitHub-managed infrastructure, pushes commits to a branch, and opens a draft PR for review. It’s async, it’s bounded, and it never touches your local machine.
For most production work, the higher-leverage layer on top of Agent HQ is gh-aw (GitHub Agentic Workflows): a CLI extension that lets you write agentic workflows as markdown files in .github/workflows/*.md with YAML frontmatter, which it compiles to the .lock.yml files GitHub Actions actually runs. Same agents, but driven by triggers (push, schedule, workflow_dispatch) instead of by you assigning issues.
Two modes of using it
Ad-hoc: Issue → PR. A new issue arrives. You assign it to the copilot user (or wire an issue-to-pr.yml workflow that does it automatically on issues: opened). Copilot Coding Agent reads the issue, branches, implements, and opens a draft PR. Useful for the unpredictable backlog — bugs, small features, dependency-ish work.
Scheduled: Trigger → PR. A workflow runs on a push to master, a cron, or a workflow_dispatch. It executes a markdown-defined agent job — scan the codebase, hit MCPs for context, refine prompts, open a PR. Useful for recurring maintenance you’d otherwise forget to do.
A real example: autonomously-improving-agents
Lives at adesa-digital-inventory-manager/.github/workflows/autonomously-improving-agents.md. Triggered on every push to master that touches src/** or any *.csproj / Directory.*.props.
What it does end-to-end:
- Sets up .NET 10 and the Aspire CLI on the runner.
- Boots the Aspire AppHost (aspire run --launch-profile dev) so the Aspire MCP can resolve real resource state, with the same MCP wiring as local Claude Code.
- Globs .claude/agents/*.md to enumerate every Claude Code agent in the repo.
- For each agent, scans the codebase (Glob + Grep) for actual usage of that agent’s domain technology: package versions in csprojs, using statements, key API calls.
- Calls Context7 MCP for current docs on the agent’s primary library and compares against what the agent prompt references.
- Scores each agent on Freshness (40%), Accuracy (40%), Coverage (20%), for a composite out of 100.
- Refines any agent scoring below 85 by editing the agent file with targeted fixes against the identified gaps.
- Logs results to .claude/state/decisions.json under an evolution_log array.
- If anything was refined, opens a PR titled “chore: evolve agent prompts” on branch chore/agent-evolution.
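The scoring step in the list above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual implementation: the weights (40/40/20) and the refinement threshold (85) come from the steps above, while the function names and the assumption that each dimension is scored on a 0-100 scale are made up for the example.

```python
# Weights and threshold come from the steps above; names and the 0-100
# per-dimension scale are assumptions made for this sketch.
WEIGHTS = {"freshness": 40, "accuracy": 40, "coverage": 20}  # percentages, sum to 100
REFINE_BELOW = 85


def composite_score(scores: dict[str, float]) -> float:
    """Weighted composite out of 100, given per-dimension scores on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def needs_refinement(scores: dict[str, float]) -> bool:
    """True when the composite falls below the refinement threshold."""
    return composite_score(scores) < REFINE_BELOW


# Hypothetical agent: docs are current, but the prompt misses part of the
# codebase's real usage, so coverage drags the composite down.
example = {"freshness": 90, "accuracy": 80, "coverage": 70}
print(composite_score(example), needs_refinement(example))  # 82.0 True
```

Because Freshness and Accuracy carry 80% of the weight between them, an agent with perfect coverage but stale or wrong references still lands well below the threshold.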
The whole job is one markdown file. The frontmatter declares the trigger, permissions, allowed network destinations (context7.com, mcp.context7.com), MCPs, and safe outputs (create-pull-request, add-comment, push-to-pull-request-branch). The body is the prompt the agent runs. No YAML acrobatics, no separate prompt file.
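To make the one-file layout concrete, here is a rough sketch that splits a workflow file into its frontmatter and its prompt body. The sample text is illustrative only: network.allowed and the Context7 hosts come from this page, but the other key names and values are assumptions, so check the gh-aw documentation for the exact frontmatter schema. The split_workflow helper is hypothetical and not part of gh-aw.

```python
def split_workflow(text: str) -> tuple[str, str]:
    """Split a workflow markdown file into (frontmatter, prompt body).

    Assumes the frontmatter is fenced by '---' lines at the very top of the file.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("workflow must start with a '---' frontmatter fence")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter fence is never closed")


# Illustrative workflow text. Key names and values are assumptions; check the
# gh-aw documentation for the exact frontmatter schema.
SAMPLE = """\
---
on:
  push:
    branches: [master]
    paths: ["src/**"]
network:
  allowed:
    - context7.com
    - mcp.context7.com
safe-outputs:
  create-pull-request:
---
# Evolve agent prompts

For each file matching .claude/agents/*.md, score it and refine it if needed.
"""

frontmatter, prompt = split_workflow(SAMPLE)
print(frontmatter.splitlines()[0])  # on:
print(prompt.splitlines()[0])       # # Evolve agent prompts
```

Everything above the closing fence is configuration that gets compiled; everything below it is the prompt the agent receives.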
That’s the pattern that makes Agent HQ + gh-aw genuinely useful in this codebase: not “assign random issues to Copilot,” but “have an agent maintain my agents while I sleep.”
When to reach for it
- Recurring maintenance work — agent prompt evolution, Splunk index scans, dependency triage, doc freshness checks. Anything you’d otherwise queue mentally and never get to.
- Ad-hoc fire-and-forget tickets — bugs and small features with clear acceptance criteria, no architectural calls required.
- Parallel exploration — three agents on three branches trying three approaches; keep the one that works.
- Repetitive PRs — same shape of change, different files, churned through the queue while you focus elsewhere.
Skip it for:
- Architectural decisions that require taste or product judgment
- Multi-repo changes (each session sees one repo)
- Anything where you’d want to interrupt mid-flight to course-correct
How it pairs with Claude Code
Local Claude Code is conversational and interactive — you’re in the loop, steering, reviewing partial output, making judgment calls. Agent HQ + gh-aw is the opposite: you describe the work in markdown, the trigger fires, and you review the PR when it’s ready.
The two complement each other. Use Claude Code for the work you’re actively driving; use Agent HQ + gh-aw for the long tail and the recurring maintenance. Same Claude models in both places, so capability is consistent — what differs is your role.
Concretely:
- Complex feature you’re shaping → Claude Code (you’re driving)
- Cross-repo work with planning specs → Adesa workflow + Claude Code
- Library upgrade with mechanical test changes → assign issue to Copilot
- “Keep my agents fresh against the codebase” → gh-aw on push to master
Setup
- Enable Copilot Coding Agent on the repo. Org admin enables Coding Agent in the Copilot policy; per-repo settings expose the agent’s permissions.
- Pick a default model. Claude Sonnet, Opus, and Haiku are first-class; OpenAI, Gemini, and Grok are also selectable per session. Set the org-wide default in Copilot settings.
- Run gh aw init in the repo. This scaffolds .github/workflows/copilot-setup-steps.yml, the hook Copilot Agent looks for to provision its environment, with the gh-aw setup-cli action wired in. The job has to be named copilot-setup-steps for Copilot to find it; init handles that for you. You don’t write this file by hand.
- Drop your first workflow markdown into .github/workflows/. Frontmatter declares the trigger, permissions, MCPs, allowed network destinations, and safe outputs; the body is the prompt the agent runs.
- Compile. gh aw compile [workflow-name] produces the .lock.yml that GitHub Actions actually runs. Re-compile whenever you edit the markdown or bump the gh-aw version.
- Wire MCPs in workflow frontmatter. Same MCPs you connect locally (Drive, Slack, Splunk, Context7, Aspire) can be referenced from the remote agent’s frontmatter; give the runner network access in the network.allowed block.
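The "re-compile whenever you edit the markdown" rule in the steps above is easy to forget, so a small local check can help. The sketch below is one possible approach, not a gh-aw feature: it assumes the compiled file for name.md sits beside it as name.lock.yml, and it compares modification times, which is only a local heuristic because a fresh git checkout resets them. The stale_workflows helper is hypothetical.

```python
from pathlib import Path


def stale_workflows(workflows_dir: Path) -> list[str]:
    """Return names of workflow .md files whose .lock.yml is missing or older.

    Assumes the compiled file for <name>.md is <name>.lock.yml in the same
    directory. Modification times are a local heuristic only.
    """
    stale = []
    for source in sorted(workflows_dir.glob("*.md")):
        lock = source.with_name(source.stem + ".lock.yml")
        if not lock.exists() or lock.stat().st_mtime < source.stat().st_mtime:
            stale.append(source.stem)
    return stale


if __name__ == "__main__":
    # Print the compile command for each stale workflow instead of running it.
    for name in stale_workflows(Path(".github/workflows")):
        print(f"gh aw compile {name}")
```

Printing the commands rather than executing them keeps the check side-effect free; whether to wire something like this into a pre-commit hook is a team decision.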
Writing issues that agents finish
For the ad-hoc Issue → PR mode, the shape of the issue determines the shape of the PR. Issues that fail are the ones whose acceptance criteria didn’t define done.
Issues that succeed:
- Single repo, bounded scope (one feature, one bug, one refactor)
- Acceptance criteria as observable behavior, not implementation
- Existing patterns to follow (link to a similar PR or file)
- Tests that exist and pass before the change
- Files to touch named explicitly when the agent might guess wrong
Issues that fail:
- “Improve X” without defining what improvement looks like
- Cross-repo work (the agent only sees one repo)
- “Architect a solution for Y” — agents implement; they don’t make architectural choices
- Acceptance criteria that read as “looks good to me”
The same shape /aw-execute consumes from ADO tasks transfers cleanly to GitHub issues — one-paragraph summary, bulleted acceptance criteria, “Files to touch” and “Files to leave alone” sections.
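As a rough illustration of that shape, the sketch below renders an issue body with the four parts named above. The render_issue helper, the example bug, and the file paths are all hypothetical; only the section names come from this page.

```python
def render_issue(summary: str, criteria: list[str],
                 touch: list[str], leave_alone: list[str]) -> str:
    """Render an issue body: summary, acceptance criteria, and two file lists."""
    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items)

    sections = [
        summary.strip(),
        "Acceptance criteria\n" + bullets(criteria),
        "Files to touch\n" + bullets(touch),
        "Files to leave alone\n" + bullets(leave_alone),
    ]
    return "\n\n".join(sections)


# Hypothetical issue; the bug and the file paths are made up for illustration.
print(render_issue(
    summary="Return 404 instead of 500 when a vehicle ID is not found.",
    criteria=[
        "GET on an unknown vehicle ID responds with 404 and an error body",
        "Existing tests still pass, plus one new test for the 404 case",
    ],
    touch=["src/Api/VehiclesController.cs"],
    leave_alone=["src/Api/Program.cs"],
))
```

Note that both criteria describe observable behavior; neither says how the fix should be implemented.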
Limits worth knowing
- One repo per session. No cross-repo coordination; for multi-repo features, use the Adesa workflow.
- Limited shared context across runs. Each session starts cold by default. gh-aw exposes a “memories” surface that agents can write to, but only the Platform team can read or delete them; engineers and product can’t, so they’re effectively write-only from your seat. For real cross-run state (like the autonomously-improving-agents evolution log), commit state files back to the repo under .claude/state/... and the next run picks them up as part of the checkout.
- Review-first, not merge-first. Agent HQ opens draft PRs that wait for your review. Don’t auto-merge: the value is in the agent doing the work, not in skipping the gate.
- Compute is bounded. Long-running tasks can time out; if a session needs hours of work, break it into smaller workflows or smaller issues.
- Logs live in the PR or the workflow run. For ad-hoc work, the agent panel on the PR has the transcript; for gh-aw workflows, gh aw logs and gh aw audit <run-id> pull the session detail.
- Dependabot bumps generated lock files. When gh-aw compiles, it generates package.json / requirements.txt / go.mod under .github/workflows/. Don’t merge Dependabot PRs against those directly; update the source .md and re-run gh aw compile --dependabot.
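The commit-state-back-to-the-repo pattern from the limits above could look roughly like this on the writing side. It is a sketch under assumptions: the evolution_log key and the decisions.json file name come from this page, but the entry fields and the append_evolution_entry helper are hypothetical, since the actual schema is not specified here.

```python
import json
import tempfile
from pathlib import Path


def append_evolution_entry(state_file: Path, entry: dict) -> int:
    """Append one entry to the evolution_log array and return the new length.

    Creates the file and the array when they do not exist yet, and leaves
    any other top-level keys in the state file untouched.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    log = state.setdefault("evolution_log", [])
    log.append(entry)
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state, indent=2) + "\n")
    return len(log)


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing in a real repo is modified.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / ".claude" / "state" / "decisions.json"
        # Entry fields are hypothetical; the source does not specify a schema.
        append_evolution_entry(path, {"agent": "example-agent", "score": 82, "refined": True})
        print(path.read_text())
```

Because the file is committed with the rest of the PR, the next run sees the accumulated log simply by checking out the repo, with no dependency on the memories surface.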