When you’re not sure which approach will work best, build all of them. Test Kitchen implements multiple variants in parallel — separate git worktrees, separate agents — and uses test results plus a structured scoring framework to pick the winner.
Install
/plugin marketplace add 2389-research/claude-plugins
/plugin install test-kitchen
Test Kitchen orchestrates several superpowers skills (brainstorming, writing-plans, dispatching-parallel-agents, etc.) and falls back gracefully if any aren’t installed.
What it does
Test Kitchen adds two gate skills to your workflow.
Omakase-off is the entry gate. It triggers when you ask to build something, such as “build a notification system”, “create a CLI tool”, or “add auth”. You get two choices: brainstorm the design collaboratively, or go omakase (chef’s choice), where three to five architectural approaches get implemented in parallel and tests determine which one wins.
If you pick brainstorming, omakase-off watches for architectural indecision. When you say “not sure” or “either works” on two or more design decisions, it offers to explore those alternatives in parallel instead of forcing a premature choice.
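The indecision rule can be pictured as a simple counter. This is only an illustrative sketch: the skill itself is prompt-driven, and the phrase list, function names, and sample answers below are hypothetical. Only the two example phrases and the "two or more" threshold come from the description above.

```python
import re

# Hypothetical phrase list: the text names only "not sure" and "either works".
INDECISION_PHRASES = ("not sure", "either works")
# Threshold taken from the text: two or more undecided design decisions.
INDECISION_THRESHOLD = 2


def is_undecided(answer: str) -> bool:
    """Return True if an answer contains one of the indecision phrases."""
    normalized = re.sub(r"\s+", " ", answer.lower())
    return any(phrase in normalized for phrase in INDECISION_PHRASES)


def undecided_decisions(answers: dict[str, str]) -> list[str]:
    """List the design decisions whose answers signal indecision."""
    return [name for name, answer in answers.items() if is_undecided(answer)]


def should_offer_parallel(answers: dict[str, str]) -> bool:
    """Offer parallel exploration once enough decisions are undecided."""
    return len(undecided_decisions(answers)) >= INDECISION_THRESHOLD


# Sample brainstorming answers (made up for illustration).
answers = {
    "storage": "Not sure, SQLite or Postgres?",
    "transport": "Either works for me",
    "auth": "Use OAuth",
}
print(undecided_decisions(answers))    # ['storage', 'transport']
print(should_offer_parallel(answers))  # True
```

A single undecided answer stays below the threshold, so the brainstorm would simply continue.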
Cookoff is the exit gate. It triggers when design is done and you’re ready to code. Multiple agents each read the same design doc but create their own implementation plans. No shared plan means you get real differences in approach. Each agent builds in an isolated worktree. After they finish, a judge skill scores all implementations across five criteria: fitness for purpose, justified complexity, readability, robustness, and maintainability.
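To make the scoring step concrete, here is a minimal sketch of a five-criteria worksheet. The criteria names come from the paragraph above. The 1-5 score range, the unweighted sum, and the `Scorecard` and `rank` names are assumptions for illustration, not the judge skill's actual format.

```python
from dataclasses import dataclass

# The five criteria named in the text.
CRITERIA = (
    "fitness_for_purpose",
    "justified_complexity",
    "readability",
    "robustness",
    "maintainability",
)


@dataclass
class Scorecard:
    """One implementation's worksheet. The 1-5 range is an assumption."""

    name: str
    scores: dict[str, int]

    def __post_init__(self) -> None:
        # Every criterion needs a score before implementations are compared.
        missing = set(CRITERIA) - set(self.scores)
        if missing:
            raise ValueError(f"{self.name}: missing scores for {sorted(missing)}")
        for criterion, score in self.scores.items():
            if not isinstance(score, int) or not 1 <= score <= 5:
                raise ValueError(f"{self.name}: {criterion} must be an integer 1-5")

    @property
    def total(self) -> int:
        """Unweighted sum across the five criteria (assumed aggregation)."""
        return sum(self.scores[c] for c in CRITERIA)


def rank(cards: list[Scorecard]) -> list[Scorecard]:
    """Sort scorecards from highest to lowest total."""
    return sorted(cards, key=lambda card: card.total, reverse=True)


# Made-up scores for two implementations.
cards = [
    Scorecard("impl-1", dict(zip(CRITERIA, (4, 3, 4, 3, 4)))),
    Scorecard("impl-2", dict(zip(CRITERIA, (5, 4, 3, 4, 4)))),
]
print([(card.name, card.total) for card in rank(cards)])
# [('impl-2', 20), ('impl-1', 18)]
```

The validation in `__post_init__` rejects a worksheet with a missing or non-integer score, so every implementation is compared on the same five numbers.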
Both gates offer a single-agent or local implementation option too. The parallel path is there when the cost of picking wrong is higher than the cost of building three prototypes and throwing two away.
How it works
The lifecycle for a parallel run:
- Worktree setup — each variant gets its own git worktree and branch (feature/omakase/variant-slug or feature/cookoff/impl-N)
- Parallel dispatch — all agents launch in a single message for true parallelism
- Independent planning — each agent writes its own implementation plan from the shared design doc
- TDD implementation — agents follow red-green-refactor in their isolated worktrees
- Scenario testing — same test scenarios run against every implementation
- Fresh-eyes review — survivors get a security and logic check before judging
- Judge scoring — 5-criteria worksheet with integer scores, hard gates on fitness gaps and critical flaws
- Cleanup — user picks winner from scored results, loser worktrees and branches get removed
Results go in docs/plans/<feature>/cookoff/result.md so you have a record of what was tried and why the winner won.
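The worktree setup and cleanup steps in the lifecycle above map onto ordinary git commands. The sketch below only builds the command strings and runs nothing. The branch pattern follows the names in the list; the `.worktrees/` directory layout and the helper names are hypothetical, and the plugin may organize its worktrees differently.

```python
import shlex

GATES = ("omakase", "cookoff")


def variant_branch(gate: str, slug: str) -> str:
    """Build a branch name following the feature/<gate>/<slug> pattern."""
    if gate not in GATES:
        raise ValueError(f"unknown gate: {gate}")
    return f"feature/{gate}/{slug}"


def worktree_path(gate: str, slug: str, root: str = ".worktrees") -> str:
    """Hypothetical on-disk location for a variant's worktree."""
    return f"{root}/{gate}-{slug}"


def setup_commands(gate: str, slugs: list[str]) -> list[str]:
    """One `git worktree add -b <branch> <path>` per variant."""
    return [
        shlex.join(["git", "worktree", "add", "-b",
                    variant_branch(gate, slug), worktree_path(gate, slug)])
        for slug in slugs
    ]


def cleanup_commands(gate: str, slugs: list[str], winner: str) -> list[str]:
    """Remove the worktree and branch of every variant except the winner."""
    commands = []
    for slug in slugs:
        if slug == winner:
            continue
        commands.append(shlex.join(["git", "worktree", "remove", worktree_path(gate, slug)]))
        # -D force-deletes the branch, since losing branches are never merged.
        commands.append(shlex.join(["git", "branch", "-D", variant_branch(gate, slug)]))
    return commands


slugs = ["impl-1", "impl-2", "impl-3"]
for command in setup_commands("cookoff", slugs):
    print(command)
for command in cleanup_commands("cookoff", slugs, winner="impl-2"):
    print(command)
```

Generating the commands as strings keeps the sketch safe to run anywhere. To apply them for real, you could pass each one to your shell from the repository root after reviewing the list.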
