M10: Agent Teams and Parallel Orchestration
Tier 2 — Mastery | Duration: 90 min (pre-work + workshop)
Overview
You’ve mastered single-session workflows and subagents. Now scale to coordinated teams. Agent Teams let multiple Claude sessions work in parallel on related tasks: the Team Lead (orchestrator) manages a shared task list, Teammates execute in isolation, and they message each other to coordinate.
But parallelism has a cost: up to roughly 7x the token consumption of a single session. Teams make sense for large refactors, parallel reviews, multi-component features, and cross-repo migrations. They don’t make sense for sequential dependencies, same-file edits, simple tasks, or cost-sensitive work.
This module teaches you when Agent Teams add value and when simpler patterns work better. You’ll study the theory, and the workshop (M10-Agent-Teams-workshop.md) will have you run a real multi-component task and develop judgment about orchestration. Agent Teams ship by default as of v2.1.7 — no feature flag required.
Takeaway: Judgment about when Agent Teams pay off and when a single session or subagents work better
Prerequisites
- M09 completion (code review patterns)
- 1-2 weeks Claude Code usage
- Understanding of subagents and parallel execution
- Familiarity with task scheduling and dependencies
- Claude Code v2.1.7+ (Agent Teams is in production and enabled by default as of March 2026)
Pre-work: Theory (15-20 minutes)
The Parallelism Stack
There’s a progression from sequential to parallel execution:
| Layer | Parallelism | Cost | Best For | Worst For |
|---|---|---|---|---|
| Single Session | None (sequential) | 1x | Simple tasks, quick scripts | Large, independent tasks |
| Subagents | Limited (serial calls within main session) | Variable (1.3x for sequential; much higher with parallelism) | Role-based work (reviewer, security checker) | Truly parallel tasks |
| Manual Parallel Sessions | Full (you open 2 Claude Code instances) | 2x | Work that takes days (one person handles security, another does UI) | Tasks with shared state, high coordination cost |
| Agent Teams | Full with orchestration | 7x | Large refactors, parallel reviews, multi-component feature, cross-repo migration | Sequential dependencies, same-file edits, cost-sensitive work |
| Alternative Orchestrators | Full with custom logic | Varies | Specialized workflows (LangGraph, CrewAI, OpenAI Agents SDK) | Vendor lock-in risk; steeper setup |
| /batch Command | Full, independent tasks only | 2-3x | Parallel refactoring (rename function across 20 files) | Tasks with dependencies |
Agent Teams Architecture
Team Lead (Orchestrator):
- Central coordinator
- Manages shared task list
- Distributes work to teammates
- Receives progress updates
- Handles inter-agent messaging
Teammates (Workers):
- Independent sessions (separate context windows)
- Each has its own tools, skills, hooks
- Can work on different files in parallel
- Report status back to lead
- Can message each other
Shared Task List (Dependency-Aware):

    Tasks:
      - id: "api_endpoint"
        description: "Create new /users endpoint with validation"
        dependencies: ["types_definition"]
        assigned_to: "api-developer"
        status: "in_progress"
      - id: "types_definition"
        description: "Define User and UserInput TypeScript types"
        dependencies: []
        assigned_to: "types-developer"
        status: "completed"
      - id: "frontend_ui"
        description: "Build UserProfile component"
        dependencies: ["api_endpoint"]
        assigned_to: "frontend-developer"
        status: "pending"
      - id: "integration_tests"
        description: "Test API → Frontend integration"
        dependencies: ["api_endpoint", "frontend_ui"]
        assigned_to: "qa-agent"
        status: "pending"

Mailbox (Inter-Agent Messaging): Teammates message the lead:

    api-developer → lead: "API endpoint complete. Types used: User, UserInput (v1.2)"
    frontend-developer → lead: "Ready to start UI. Need: API types definition"
    lead → frontend-developer: "See message from types-developer at 14:32"

When Agent Teams Make Sense
YES, use teams:
- Large refactors (rename pattern across 500+ files)
- Parallel independent reviews (5 PRs reviewed simultaneously)
- Multi-component feature (API + UI + tests + docs, no file overlap)
- Cross-repo migration (update 10 repos in parallel)
- Complex dependency resolution (DAG of 20+ tasks)
NO, don’t use teams:
- Sequential dependencies (B waits for A, C waits for B)
- Same-file edits (conflicts, merge pain)
- Simple tasks (<30 min of work per task)
- Cost-sensitive work (7x token cost is not justified)
- Highly interdependent logic
Context Requirements for Team Success
Agent Teams succeed or fail based on task definition quality. Ambiguous task definitions are a primary cause of coordination failures. Before launching a team:
- Define file ownership explicitly. Each task definition must state which files or directories the agent may modify. Example: “Task: API implementation. Must NOT modify the `ui/` directory; must export `UserType` and `InputType` before completion.”
- Use a shared `CLAUDE.md` for constraints. Document shared rules (coding conventions, forbidden patterns, dependency boundaries) in a CLAUDE.md visible to all teammates. This prevents agents from diverging in their understanding of project standards.
- Document dependencies precisely. Vague dependencies (“depends on the types work”) lead to integration failures. Reference exact exported symbols, file paths, or interface contracts.
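The file-ownership rule above can be checked mechanically before a team is launched. What follows is a minimal sketch under stated assumptions, not a Claude Code feature: the `plan` mapping, the `find_overlaps` helper, and the `src/...` paths are all hypothetical, and ownership is assumed to be declared as file or directory paths.

```python
from itertools import combinations
from pathlib import PurePosixPath


def _covers(owner: str, other: str) -> bool:
    """True if `owner` is the same path as `other` or one of its parent directories."""
    a, b = PurePosixPath(owner), PurePosixPath(other)
    return a == b or a in b.parents


def find_overlaps(owned_paths: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """List every pair of tasks whose declared paths overlap.

    `owned_paths` maps a task id to the files or directories it may modify
    (a hypothetical structure used only for this illustration).
    """
    conflicts = []
    for (task_a, paths_a), (task_b, paths_b) in combinations(owned_paths.items(), 2):
        for pa in paths_a:
            for pb in paths_b:
                if _covers(pa, pb) or _covers(pb, pa):
                    conflicts.append((task_a, pa, task_b, pb))
    return conflicts


# Hypothetical plan: the frontend task reaches into a directory the API task owns.
plan = {
    "api_endpoint": ["src/api/"],
    "types_definition": ["src/types/user.ts"],
    "frontend_ui": ["src/ui/", "src/api/client.ts"],
}

for task_a, pa, task_b, pb in find_overlaps(plan):
    print(f"Overlap: {task_a} owns {pa}, {task_b} owns {pb}")
```

An empty result means no two tasks claim the same file or nested directory. Comparing whole path components rather than string prefixes keeps `src/ui/` from being confused with a sibling such as `src/uix/`.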
Autonomy Levels & Trust Calibration
Agent Teams don’t operate in a single autonomy mode. Match oversight intensity to task risk:
- Supervised (default for new teams): Lead reviews all teammate work before merge. Use this when agents are unfamiliar with the codebase or when the task carries high risk.
- Plan-approval pattern: Teammates propose an implementation plan before writing code. Lead approves or redirects. Useful for complex refactors where a wrong approach wastes significant tokens.
- Low-risk autonomous: Non-critical changes (documentation updates, test generation) can proceed without lead approval. Reserve for teammates with demonstrated reliability on the codebase.
Start supervised. Expand autonomy only after teammates have proven reliable on the specific codebase and task type.
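One way to make the three levels concrete is to write the selection down as a small policy function. This is an illustrative sketch only: the `Oversight` enum, the `choose_oversight` function, its boolean inputs, and the precedence order among the rules are assumptions of this example, not settings that Agent Teams exposes.

```python
from enum import Enum


class Oversight(Enum):
    SUPERVISED = "supervised"
    PLAN_APPROVAL = "plan-approval"
    AUTONOMOUS = "low-risk autonomous"


def choose_oversight(*, high_risk: bool, complex_refactor: bool,
                     non_critical: bool, proven_teammate: bool) -> Oversight:
    """Pick an oversight mode. The precedence order is an assumption of this sketch."""
    if high_risk or not proven_teammate:
        return Oversight.SUPERVISED      # default for new teams and risky tasks
    if complex_refactor:
        return Oversight.PLAN_APPROVAL   # lead approves the plan before code is written
    if non_critical:
        return Oversight.AUTONOMOUS      # e.g. documentation updates, test generation
    return Oversight.SUPERVISED          # anything unclassified falls back to the default


# A documentation update by a teammate that has not yet proven itself stays supervised.
print(choose_oversight(high_risk=False, complex_refactor=False,
                       non_critical=True, proven_teammate=False).value)

# The same update by a proven teammate may proceed without lead approval.
print(choose_oversight(high_risk=False, complex_refactor=False,
                       non_critical=True, proven_teammate=True).value)
```

Falling back to supervised whenever a task does not clearly match a looser category mirrors the "start supervised" advice.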
Failure Modes & Mitigation
Coordination breakdowns account for 36.9% of multi-agent failures (MAST study, 2025). Multi-agent error rates amplify roughly 17x compared to single-agent systems: a small misunderstanding in one agent’s task definition can cascade through dependent agents before the lead detects it.
Common failure types and mitigations:
- Race conditions: Two agents editing overlapping code regions. Mitigate with strict file ownership in task definitions — partition files before the team starts.
- Merge conflicts: Agents that don’t respect file boundaries create conflicts that require manual resolution. Enforce directory-level ownership, not just file-level.
- Coordination bottleneck: Lead becomes a single point of failure when all teammates wait for its responses. Mitigate with periodic checkpoints rather than continuous approval.
- Context drift: Agents independently interpret ambiguous requirements and diverge. Mitigate with a shared CLAUDE.md that makes constraints explicit.
- Error amplification: A type mismatch in a foundational task (types agent) will break every downstream task that depends on it. Validate outputs at dependency boundaries before unblocking dependent teammates.
Guardrails for production teams:
- Run a 2-agent pilot before scaling to a full team
- Validate output at each task boundary before unblocking dependents
- Set conservative token limits per teammate in pilots; adjust based on measured usage
- Use independent verification (a separate reviewer agent or CI) before merging team output
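The "validate before unblocking" guardrail can be expressed as a gate over the task list. The sketch below reuses the task ids from the shared task list example; the `Task` dataclass, the `unblocked` function, and the separate `validated` set are hypothetical names introduced for illustration, and how validation itself is performed (reviewer agent, CI) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    dependencies: list[str] = field(default_factory=list)
    status: str = "pending"  # "pending", "in_progress", or "completed"


def unblocked(tasks: list[Task], validated: set[str]) -> list[str]:
    """Return ids of pending tasks whose dependencies are completed AND validated."""
    by_id = {t.id: t for t in tasks}
    ready = []
    for task in tasks:
        if task.status != "pending":
            continue
        if all(by_id[d].status == "completed" and d in validated
               for d in task.dependencies):
            ready.append(task.id)
    return ready


tasks = [
    Task("types_definition", [], "completed"),
    Task("api_endpoint", ["types_definition"], "in_progress"),
    Task("frontend_ui", ["api_endpoint"]),
    Task("integration_tests", ["api_endpoint", "frontend_ui"]),
]

# api_endpoint is still running, so nothing downstream may start.
print(unblocked(tasks, validated={"types_definition"}))

# api_endpoint finishes but its output has not been checked yet: still blocked.
tasks[1].status = "completed"
print(unblocked(tasks, validated={"types_definition"}))

# Only after validation does frontend_ui become available.
print(unblocked(tasks, validated={"types_definition", "api_endpoint"}))
```

Keeping "completed" and "validated" as separate states is the point of the sketch: a finished foundational task with a type mismatch never releases its dependents.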
Token Cost Analysis
Single session: ~10,000 tokens for a feature (context + reasoning + code)

Agent Teams: ~70,000 tokens for the same feature (in plan mode with full context duplication)
- Each teammate: ~10,000 (own context window)
- Lead: ~5,000 (task list, coordination)
- Duplication: ~5,000 (shared context sent to all teammates)
The 7x figure is a ceiling, not a constant. Real-world multipliers typically fall between 2–5x, depending on task interdependency, team size, model choice (Sonnet vs. Opus), and how much context is shared across teammates. Validate actual costs with a small pilot before committing to a full team.
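The breakdown above does not state a team size. One reading under which the figures sum to roughly 70,000 tokens is six teammates, with the duplication counted once as a flat total; that reading is an assumption of the sketch below, not something the estimate confirms. Under it, the multiplier for smaller teams can be worked out directly:

```python
SINGLE_SESSION = 10_000  # tokens, from the single-session estimate
PER_TEAMMATE = 10_000    # each teammate's own context window
LEAD = 5_000             # task list and coordination
DUPLICATION = 5_000      # shared context; ASSUMED here to be a flat total


def team_tokens(teammates: int) -> int:
    """Estimated total tokens for a team of the given size."""
    return teammates * PER_TEAMMATE + LEAD + DUPLICATION


def multiplier(teammates: int) -> float:
    """Team cost relative to a single session."""
    return team_tokens(teammates) / SINGLE_SESSION


for n in (2, 3, 4, 6):
    print(f"{n} teammates: ~{team_tokens(n):,} tokens, {multiplier(n):.1f}x")
```

With these inputs, two to four teammates land at 3x to 5x and six teammates reach 7x. Real usage depends on measured context sizes, so the constants should be replaced with figures from a pilot.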
Team sizing guidelines:
- Start with 3–5 teammates for most workflows
- Assign 5–6 tasks per teammate to keep everyone productive without excessive context switching
- Three focused teammates often outperform five scattered ones
- Returns diminish around 4 agents: beyond that, coordination costs begin to exceed the parallelism gains
- For tasks requiring more than 5 specialized roles, use hierarchical nesting: spawn smaller sub-teams with lead agents coordinating between them
Use Shift+Down to cycle through teammates in the terminal and inspect their progress.
ROI threshold: Teams pay off when parallelism saves more time than the token cost.
Example:
- Feature with 4 independent components
- Single session: 4 hours to complete sequentially
- Agent Teams: 1 hour (4 parallel) + coordination overhead
- Cost increase: ~3–7x tokens (measure with a pilot)
- Time saved: 3 hours of wall-clock time, or ~36 developer-hours if 12 people are waiting on the result
- ROI: Positive for large teams, negative for 1-2 people
Workshop
The hands-on session for this module: M10 Workshop: Agent Teams and Parallel Orchestration
Takeaway
You now own:
- ✓ Understanding of when Agent Teams add value vs when they’re overkill
- ✓ Ability to design tasks with parallel independence
- ✓ Experience running a real coordinated team
- ✓ Judgment: teams for large refactors, /batch for independent changes, single session for sequential work
- ✓ Cost-awareness: token multiplier vs time saved
Apply immediately:
- Don’t default to teams; start with single session or /batch
- Use teams only when the value of the time saved exceeds the added token cost (up to ~7x a single session)
- Practice designing tasks with genuine parallelism (no same-file conflicts)
- Build team habits: “Is parallelism justified here?”
Key Concepts
Agent Teams: Multiple Claude sessions (Teammates) coordinated by a Lead (Orchestrator). Each teammate has its own context window and can work in parallel on independent tasks.
Task Dependency Graph (DAG): Visual or YAML representation of which tasks must complete before others can start. Agent Teams respect these dependencies.
Teammate Isolation: Each teammate is a separate session with its own tools, context, and state. They communicate via messaging, not shared memory.
Mailbox (Inter-Agent Messaging): Asynchronous messaging between teammates and the lead. Allows coordination without blocking.
/batch Command: Simpler alternative for truly independent tasks (no coordination needed). Lower token cost, faster execution.
Orchestrator vs Worker:
- Orchestrator (Lead): Manages task list, resolves blockers, facilitates messaging
- Worker (Teammate): Executes assigned tasks, reports status, messages lead
ROI Threshold: Teams are justified when: (Time saved in hours) × (cost per hour) > (Token multiplier × token cost)
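The threshold can be evaluated with a few lines of arithmetic. This sketch applies the inequality as stated, reading "token cost" as the dollar cost of the single-session run; the hourly rate and baseline token cost are hypothetical figures chosen for illustration, while the 3-hour and 36-hour cases come from the ROI example in the Token Cost Analysis section. A stricter accounting would charge only the increase over a single session, that is, (multiplier − 1) × token cost.

```python
def teams_justified(hours_saved: float, cost_per_hour: float,
                    token_multiplier: float, single_session_token_cost: float) -> bool:
    """Apply the threshold as stated: value of time saved > multiplied token cost."""
    return hours_saved * cost_per_hour > token_multiplier * single_session_token_cost


# Hypothetical figures for illustration only.
HOURLY_RATE = 100.0      # assumed cost of one developer-hour, in dollars
BASELINE_TOKENS = 50.0   # assumed dollar cost of the single-session run

# One developer saves 3 hours: 300 is not greater than 350.
print(teams_justified(3, HOURLY_RATE, 7, BASELINE_TOKENS))

# Twelve people waiting, 36 developer-hours saved: 3600 is greater than 350.
print(teams_justified(36, HOURLY_RATE, 7, BASELINE_TOKENS))
```

Passing a measured multiplier from a pilot instead of the 7x ceiling gives a less pessimistic answer for small teams.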
References
- Claude Code Agent Teams Documentation: https://docs.anthropic.com/en/docs/claude-code/agent-teams
- Parallel Task Orchestration: https://en.wikipedia.org/wiki/Job_scheduling
- DAG (Directed Acyclic Graph) Tools:
  - Airflow: https://airflow.apache.org/
  - Dask: https://www.dask.org/
- Alternative Orchestration Frameworks:

  | Framework | Strengths | Tradeoffs | Status |
  |---|---|---|---|
  | Claude Code Agent Teams | Native integration, fast iteration, automatic coordination | Less fine-grained control; vendor lock-in risk | Production (v2.1.7+) |
  | LangGraph (v1.0+) | Explicit state management, streaming, human-in-the-loop | Code-based definition; steeper learning curve | Production (Oct 2025) |
  | CrewAI Flows | Role-based simplicity, minimal code, lightweight | Less control over state transitions | Production (Oct 2025) |
  | Microsoft Agent Framework | Enterprise runtime, multi-language, managed deployment | Early-stage (GA pending Q1 2026) | Emerging |
  | OpenAI Agents SDK | Handoff pattern for sequential expertise delegation | Complementary to parallel patterns, not a replacement | Production (March 2025) |
  | AutoGen | Historical multi-agent framework | Maintenance mode; not recommended for new projects | Deprecated |
  | Multiclaude | Team review workflows, long prompts then walk away | Claude-specific; less flexible than native teams | Community |
  | Gas Town | Many agents in parallel for solo devs | More complex setup | Community |
  | OpenClaw/OpenCode | Mixed-model teams (GPT + Gemini + Claude in same team) | Vendor-agnostic but higher coordination overhead | Community |

- MAST (Multi-Agent Systems Failure Taxonomy): arXiv 2601.13671 (March 2025)
- Towards a Science of Scaling Agent Systems: arXiv 2512.08296 (December 2025)
- Cost-Benefit Analysis: https://en.wikipedia.org/wiki/Cost%E2%80%93benefit_analysis
- The /batch Command: Claude Code documentation
Next: [M11 — Post-Deployment: Monitoring AI-Generated Systems](../Tier 3 - Operations and Scale/M11-Post-Deployment.md)