12: Fun with Adversarial Agents
30 minutes | You need: modules 1-6 completed, a file worth reviewing
If you’ve ever watched HBO’s Silicon Valley, you know Dinesh and Gilfoyle. Gilfoyle finds everything wrong with Dinesh’s code. Dinesh defends it — sometimes poorly, sometimes brilliantly. The audience laughs, but the code actually gets better.
That dynamic is the foundation of a real multi-agent pattern: adversarial review. Two agents with opposing objectives debate until they converge. The friction between them surfaces issues that a single reviewer would miss — and filters out false positives that a single critic would overreport.
In this module you’ll install a skill that implements this pattern, run it on your own code, and then dissect why it works.
Install the /dg Skill
From your project root:

    mkdir -p .claude/skills/dg
    curl -sL https://raw.githubusercontent.com/v1r3n/dinesh-gilfoyle/main/dg/SKILL.md -o .claude/skills/dg/SKILL.md
    curl -sL https://raw.githubusercontent.com/v1r3n/dinesh-gilfoyle/main/dg/gilfoyle-agent.md -o .claude/skills/dg/gilfoyle-agent.md
    curl -sL https://raw.githubusercontent.com/v1r3n/dinesh-gilfoyle/main/dg/dinesh-agent.md -o .claude/skills/dg/dinesh-agent.md

Start a new Claude Code session (the skill loader runs at startup), then verify:

    /dg src/some-file.ts

You should see Gilfoyle attack the file, Dinesh defend it, and a structured verdict at the end.
Do This
1. Run a review
Pick a file you wrote recently, something with real logic rather than a config file:

    /dg path/to/your/file

Watch the debate unfold. Pay attention to:
- Which of Gilfoyle’s findings does Dinesh concede? Those are your real issues.
- Which does Dinesh successfully defend? Those validate your code.
- Which does Dinesh dismiss as nitpicks? The synthesis will tell you if that dismissal held.
2. Read the agent prompts
Open the three files you just downloaded:

    @.claude/skills/dg/SKILL.md
    @.claude/skills/dg/gilfoyle-agent.md
    @.claude/skills/dg/dinesh-agent.md

Notice the structure:
- SKILL.md is the orchestrator — it gathers code, dispatches agents, detects convergence, and synthesizes the final review
- gilfoyle-agent.md defines the attacker’s persona, technical domains, and output format
- dinesh-agent.md defines the defender’s persona, response strategies, and honesty constraints
3. Trace the architecture
Ask Claude to explain the orchestration:

    How does the /dg skill orchestrate the debate between agents? Walk me through the flow.

How It Works: Adversarial Agents as a GAN
The /dg skill implements a pattern borrowed from machine learning: Generative Adversarial Networks (GANs). In a GAN, two neural networks compete: a generator creates content, and a discriminator tries to tell that content apart from real data and reject it. Neither is useful alone. Together, they push each other toward quality.
The /dg skill maps this to code review:
| GAN Concept | /dg Implementation |
|---|---|
| Generator | Dinesh (the code author/defender) |
| Discriminator | Gilfoyle (the critic/attacker) |
| Training loop | Multi-round debate with convergence detection |
| Loss function | Structured findings with severity ratings |
| Convergence | Gilfoyle repeats findings or Dinesh concedes all points |
Why two agents beat one
A single agent asked to “review this code thoroughly” faces a fundamental tension: it has to be both critic and advocate in the same context window. It hedges. It qualifies. It finds three issues and says “overall the code is solid.”
Splitting the roles eliminates that tension:
Gilfoyle (the attacker) has one job: find everything wrong. His prompt tells him to be devastating, to scale contempt to the severity of the issue, and to never let Dinesh feel like he won. This asymmetric pressure means he doesn’t stop at the obvious bugs — he digs into security, dependency vulnerabilities, distributed systems anti-patterns, and architectural rot.
Dinesh (the defender) has a different job: separate real issues from noise. His prompt requires him to classify each finding as [concede], [defend], or [dismiss] — and critically, to be honest in findings even when the banter is defensive. This constraint means his concessions are high-signal: if Dinesh can’t mount a defense, the issue is real.
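To make the three labels concrete, here is a minimal Python sketch of how a round's findings might be grouped by Dinesh's response. The `Finding` class, its field names, the `triage` helper, and the sample findings are all hypothetical illustrations; the skill's actual data format may look quite different.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    CONCEDE = "concede"   # Dinesh agrees the issue is real
    DEFEND = "defend"     # Dinesh argues the code is correct as written
    DISMISS = "dismiss"   # Dinesh calls the finding a nitpick


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape, assumed for illustration only.
    finding_id: str
    severity: str         # e.g. "critical", "major", "minor"
    response: Response


def triage(findings):
    """Group finding ids by Dinesh's response so concessions stand out."""
    buckets = {r: [] for r in Response}
    for f in findings:
        buckets[f.response].append(f.finding_id)
    return buckets


# Made-up sample findings, not output from a real /dg run.
findings = [
    Finding("sql-injection", "critical", Response.CONCEDE),
    Finding("retry-without-backoff", "major", Response.DEFEND),
    Finding("variable-naming", "minor", Response.DISMISS),
]
buckets = triage(findings)
print(buckets[Response.CONCEDE])  # ['sql-injection']
```

The conceded bucket is the high-signal one: under the honesty constraint described above, anything that lands there is an issue the defender could not argue away.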
The convergence loop
The orchestrator runs rounds until one of two things happens:
- Gilfoyle repeats himself — he’s found everything worth finding
- Dinesh concedes everything — the code needs work
This is not a fixed number of passes. The debate self-terminates when it stops producing new information. That’s the same principle behind GAN training: you stop when the discriminator can no longer distinguish real from generated — or in this case, when the critic can no longer find new issues the defender hasn’t already addressed.
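The two stopping conditions can be sketched as a single predicate. This is an illustration under stated assumptions, not the skill's implementation: `should_stop` and `normalize` are hypothetical names, findings are plain strings, and repetition is detected by exact match after normalizing case and whitespace. The real orchestrator is prompt-driven and presumably judges repetition semantically.

```python
def normalize(finding: str) -> str:
    """Collapse case and whitespace so trivial rewording does not look new."""
    return " ".join(finding.lower().split())


def should_stop(seen, gilfoyle_findings, dinesh_responses):
    """Return True when either stopping condition from the text holds.

    seen: set of normalized findings from earlier rounds
    gilfoyle_findings: finding strings raised this round
    dinesh_responses: dict of finding -> "concede" | "defend" | "dismiss"
    """
    current = {normalize(f) for f in gilfoyle_findings}
    gilfoyle_repeats = current <= seen  # nothing new this round
    dinesh_concedes_all = bool(dinesh_responses) and all(
        r == "concede" for r in dinesh_responses.values()
    )
    return gilfoyle_repeats or dinesh_concedes_all


# Scripted rounds standing in for real agent output (made-up data).
rounds = [
    (["Unbounded retry loop", "No input validation"],
     {"Unbounded retry loop": "concede", "No input validation": "defend"}),
    (["No input validation", "Missing timeout"],
     {"No input validation": "defend", "Missing timeout": "dismiss"}),
    (["missing   timeout", "no input validation"],
     {"missing   timeout": "dismiss", "no input validation": "defend"}),
]

seen = set()
rounds_run = 0
for findings, responses in rounds:
    rounds_run += 1
    if should_stop(seen, findings, responses):
        break
    seen |= {normalize(f) for f in findings}

print(rounds_run)  # 3: the third round adds nothing new, so the loop ends
```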
The structured output trick
Both agents produce two sections: BANTER (the entertainment) and FINDINGS (the data). The banter keeps the review engaging and gives each agent a character voice that reinforces their adversarial role. The findings give the orchestrator machine-readable data to classify issues by severity and track concessions across rounds.
This dual output is a pattern you can reuse: let agents be expressive in one section (which helps them reason better — personas improve output quality) while producing structured data in another (which lets the system act on results programmatically).
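A rough sketch of what consuming that dual output could look like in Python. It assumes each section starts with a header line reading exactly `BANTER` or `FINDINGS`, and that findings are written as `- [severity] description`. Both the layout and the sample text are assumptions for illustration; check the downloaded agent prompts for the format the skill really uses.

```python
import re

# Made-up agent output in the assumed layout.
SAMPLE_OUTPUT = """\
BANTER
Did you write this with your eyes closed, Dinesh?

FINDINGS
- [critical] User input is concatenated into the SQL query
- [minor] Function name does not describe what it does
"""

# Assumed line shape: "- [severity] description".
FINDING_LINE = re.compile(r"^-\s*\[(?P<severity>\w+)\]\s*(?P<text>.+)$")


def split_sections(output: str) -> dict:
    """Split agent output into sections keyed by their header line."""
    sections = {}
    current = None
    for line in output.splitlines():
        header = line.strip()
        if header in ("BANTER", "FINDINGS"):
            current = header
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def parse_findings(findings_text: str) -> list:
    """Turn the FINDINGS section into (severity, text) tuples."""
    parsed = []
    for line in findings_text.splitlines():
        match = FINDING_LINE.match(line.strip())
        if match:
            parsed.append((match["severity"], match["text"]))
    return parsed


sections = split_sections(SAMPLE_OUTPUT)
findings = parse_findings(sections["FINDINGS"])
print(findings[0])
```

The banter is kept as free text and never parsed, so the agent can stay in character without affecting the data the system acts on.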
Why the personas matter technically
This isn’t just comedy. Character personas serve three engineering purposes:
- Role commitment: A prompt that says “find bugs” gets a generic review. A prompt that says “you are Gilfoyle, and bad code is a moral failing” gets an agent that refuses to let anything slide. The persona creates pressure to be thorough.
- Asymmetric objectives: Gilfoyle is rewarded (in-character) for finding issues. Dinesh is rewarded for defending valid code. This creates genuine tension that surfaces edge cases neither role would find alone.
- Calibrated concessions: Dinesh’s character is “defensive but competent.” He doesn’t concede easily, which means when he does concede, it carries weight. A less defined character might concede everything to seem agreeable, destroying the signal.
The Pattern Beyond Code Review
Adversarial agents work anywhere you need robust evaluation:
| Use Case | Attacker | Defender |
|---|---|---|
| Code review | Critic (finds bugs) | Author (defends decisions) |
| Proposal review | Skeptic (finds gaps) | Proposer (justifies choices) |
| Security audit | Red team (finds exploits) | Blue team (validates controls) |
| Documentation | Confused user (finds gaps) | Writer (defends clarity) |
| Architecture review | Pessimist (finds failure modes) | Architect (defends tradeoffs) |
The core recipe is always the same:
- Define two agents with opposing objectives
- Give each a structured output format
- Loop until convergence
- Synthesize the debate into a verdict
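The four steps above can be sketched as one generic function, shown here for the proposal-review row of the table. Everything in it is hypothetical: the agents are plain Python callables standing in for model calls, the `must_fix` and `contested` verdict keys are invented, and `max_rounds` is a safety cap added for the sketch rather than something the text describes.

```python
def adversarial_review(artifact, attack, defend, max_rounds=5):
    """Run attacker and defender in rounds, then synthesize a verdict.

    attack(artifact, round_no) -> list of finding strings
    defend(artifact, findings) -> dict of finding -> response label
    """
    seen = set()
    latest = {}  # finding -> most recent defender response
    for round_no in range(1, max_rounds + 1):
        findings = attack(artifact, round_no)
        new = [f for f in findings if f not in seen]
        if not new:
            break  # attacker is repeating itself: converged
        seen.update(new)
        latest.update(defend(artifact, new))
        if all(r == "concede" for r in latest.values()):
            break  # defender conceded everything
    # Synthesis: conceded findings become action items.
    return {
        "must_fix": sorted(f for f, r in latest.items() if r == "concede"),
        "contested": sorted(f for f, r in latest.items() if r != "concede"),
    }


def skeptic(artifact, round_no):
    # Stub attacker: raises a second gap in round 2, then repeats itself.
    gaps = ["no rollback plan", "cost estimate missing"]
    return gaps[: min(round_no, 2)]


def proposer(artifact, findings):
    # Stub defender: concedes anything "missing", defends the rest.
    return {f: "concede" if "missing" in f else "defend" for f in findings}


verdict = adversarial_review("migration proposal", skeptic, proposer)
print(verdict)
```

Swapping in a different attacker and defender pair, for example red team and blue team, changes the domain without touching the loop.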
Artifact
A completed /dg review of your own code. Understanding of why adversarial multi-agent patterns produce better results than single-agent review.
Go Deeper
See Module 11, Everything Everywhere, for the full taxonomy of multi-agent orchestration patterns, including fan-out, consensus councils, and structured instruction design.