A deterministic MCP server that transforms your AI coding assistant into a multi-agent advisory team. Classify tasks, score risk, assemble specialist reviewers, implement with human gates, and build persistent project memory — all with zero LLM calls in the server.
Most AI coding tools are black boxes. squad-mcp gives you a transparent, auditable, and configurable advisory layer between you and your AI assistant.
Zero LLM calls in the server. Every tool is a pure function — classification, risk scoring, agent selection, and consolidation are all heuristic and auditable.
Two explicit gates in every implementation run: plan approval and Blocker halt. The squad never writes code you didn't sign off on. Ever.
Auto-scales from quick (2 agents, sub-30s) to deep (5+ agents, architect + security forced). Override with --quick / --deep flags.
.squad/learnings.jsonl records every accept/reject decision. The squad stops re-raising findings your team already resolved. Decisions are versioned in git.
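As a rough illustration of how recorded decisions could suppress repeat findings, here is a minimal sketch. The record shape (`findingId`, `decision`) and the helper names `parseLearnings` and `suppressResolved` are hypothetical; the actual schema of .squad/learnings.jsonl is not described here and may differ. The sketch also assumes that both accepted and rejected findings count as resolved.

```typescript
// Hypothetical record shape; the real .squad/learnings.jsonl schema may differ.
interface Learning {
  findingId: string; // assumed stable identifier for a finding
  decision: "accept" | "reject";
}

interface Finding {
  id: string;
  message: string;
}

// Parse JSONL text: one JSON object per non-empty line.
function parseLearnings(jsonl: string): Learning[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Learning);
}

// Drop findings that already have a recorded decision (assumption: any
// recorded decision, accept or reject, means the finding is resolved).
function suppressResolved(findings: Finding[], learnings: Learning[]): Finding[] {
  const resolved = new Set(learnings.map((l) => l.findingId));
  return findings.filter((f) => !resolved.has(f.id));
}
```

Because the function is pure, the same findings and the same learnings file always yield the same filtered list, which fits the "no LLM calls" design described above.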
Feed a PRD, get atomic tasks with dependencies, scope globs, and agent hints. Each task narrows the squad's focus — no more re-analyzing the whole repo.
Every review produces a weighted rubric scorecard (0–100) per dimension. Configurable thresholds per repo. APPROVED, CHANGES_REQUIRED, or REJECTED.
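To make the scorecard idea concrete, the sketch below combines per-dimension scores into a weighted average and maps it to a verdict. The aggregation formula, the `Thresholds` shape, and the function names are assumptions for illustration; only the 0 to 100 range, the per-repo thresholds, and the three verdict names come from the description above.

```typescript
type Verdict = "APPROVED" | "CHANGES_REQUIRED" | "REJECTED";

interface DimensionScore {
  dimension: string;
  score: number; // 0 to 100
  weight: number; // relative weight, assumed positive
}

// Hypothetical per-repo thresholds.
interface Thresholds {
  approve: number; // score at or above this is APPROVED
  reject: number; // score below this is REJECTED
}

// Weighted average of the per-dimension scores (assumed aggregation).
function weightedScore(scores: DimensionScore[]): number {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  if (totalWeight <= 0) {
    throw new Error("at least one positive weight is required");
  }
  const weighted = scores.reduce((sum, s) => sum + s.score * s.weight, 0);
  return weighted / totalWeight;
}

function verdictFor(score: number, t: Thresholds): Verdict {
  if (score >= t.approve) return "APPROVED";
  if (score < t.reject) return "REJECTED";
  return "CHANGES_REQUIRED";
}
```

For example, a security score of 60 with weight 2 and a quality score of 90 with weight 1 average to 70, which falls between illustrative thresholds of 50 and 80 and would map to CHANGES_REQUIRED.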
A single /squad:implement command threads through classification, planning, advisory, and consolidation, with human approval at every critical point.
The server analyzes your prompt + changed files, classifies the work type (Feature, Bug Fix, Security, etc.), and computes a Low/Medium/High risk score from boolean signals (auth, money, migration, file count).
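A heuristic of this kind can be sketched as a pure function over boolean signals. The signal names, the file-count cutoff of 10, and the "count the hits" rule below are all assumptions chosen for illustration, not the server's actual scoring rules.

```typescript
type Risk = "Low" | "Medium" | "High";

// Hypothetical signal set mirroring the examples named above.
interface RiskSignals {
  touchesAuth: boolean;
  touchesMoney: boolean;
  touchesMigration: boolean;
  fileCount: number;
}

// Assumed rule: count the signals that fire; two or more is High,
// exactly one is Medium, none is Low. `manyFiles` is an illustrative cutoff.
function scoreRisk(s: RiskSignals, manyFiles = 10): Risk {
  const hits = [
    s.touchesAuth,
    s.touchesMoney,
    s.touchesMigration,
    s.fileCount > manyFiles,
  ].filter(Boolean).length;

  if (hits >= 2) return "High";
  if (hits === 1) return "Medium";
  return "Low";
}
```

Since the inputs are plain booleans and a number, every score can be traced back to the exact signals that produced it.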
Based on the work type and risk, specialist agents are selected. Depth auto-resolves: quick for low-risk, deep for high-risk/security, normal otherwise. Each agent gets only the files in their domain.
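The depth rule above is simple enough to write down directly. In this sketch the function name, the string used for the security work type, and the precedence (an explicit override first, then deep, then quick) are assumptions; the three depth names and the mapping from risk come from the text.

```typescript
type Depth = "quick" | "normal" | "deep";
type RiskLevel = "Low" | "Medium" | "High";

// `override` stands in for an explicit --quick / --deep flag.
// Assumed precedence: override, then deep, then quick, then normal.
function resolveDepth(risk: RiskLevel, workType: string, override?: Depth): Depth {
  if (override !== undefined) return override;
  if (risk === "High" || workType === "Security") return "deep";
  if (risk === "Low") return "quick";
  return "normal";
}
```

Checking the deep condition before the quick one means low-risk security work would still get a deep review under this sketch; the real server may order these checks differently.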
A plan is drafted and sent to the tech-lead-planner for review (skipped in quick mode). The skill stops and asks you to approve. Reply "go" to proceed; anything else cancels.
Every selected agent reviews in parallel, emitting findings + a Score: NN/100. Architect, DBA, Developer, QA, Security, Reviewer: each sees only their slice.
The tech-lead-consolidator produces a verdict (APPROVED / CHANGES_REQUIRED / REJECTED) with a weighted scorecard. If any Blocker is found, the squad halts and asks you.
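The Blocker halt can be pictured as a small decision function that runs after consolidation. The severity names other than Blocker, the `NextStep` values, and the choice to stop on any non-approved verdict are assumptions for illustration.

```typescript
type ReviewVerdict = "APPROVED" | "CHANGES_REQUIRED" | "REJECTED";

// Only "Blocker" is named in the text; the other severities are hypothetical.
type Severity = "Blocker" | "Major" | "Minor";

interface ReviewFinding {
  severity: Severity;
  message: string;
}

type NextStep = "halt_and_ask" | "implement" | "stop";

// A Blocker always halts and hands control back to the human,
// regardless of the verdict. Otherwise only APPROVED proceeds.
function nextStep(verdict: ReviewVerdict, findings: ReviewFinding[]): NextStep {
  if (findings.some((f) => f.severity === "Blocker")) return "halt_and_ask";
  if (verdict === "APPROVED") return "implement";
  return "stop"; // assumption: non-approved runs do not reach the implementer
}
```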
If approved, the implementer writes the code. It never commits or pushes; that's your call. The run is recorded to .squad/runs.jsonl for observability.
Every command is a standalone skill. Chain them with /squad:pipeline for a full cradle-to-grave workflow.
Every example is a single line you type. The squad sizes itself from the prompt + changed files. You only reach for a flag to override.
Four ways to install. Pick your host and go.
/plugin marketplace add ggemba/squad-mcp
/plugin install squad@gempack
squad MCP server.
npx -y @gempack/squad-mcp
squad-mcp binary and works with any MCP-capable client via stdio.
{
  "mcpServers": {
    "squad": {
      "command": "npx",
      "args": ["-y", "@gempack/squad-mcp"]
    }
  }
}
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "squad": {
      "command": "npx",
      "args": ["-y", "@gempack/squad-mcp"]
    }
  }
}
.cursor/mcp.json or global Cursor settings. Same config shape works for Warp too.
The server makes no LLM calls. The host LLM does all reasoning; the server hands it building blocks.
Each agent has a domain-specific system prompt and reviews only the files in their scope.