Superpowers: Engineering Workflows for Coding Agents
1. The Problem Is Workflow, Not Output
Modern Coding Agents can write a lot of code. They can read a repository, edit files, run commands, debug failures, and often produce implementation-level work that looks close to what a senior engineer would write inside a well-scoped module. The harder problem is usually not that the agent cannot write code. It is that the agent does not reliably work like an engineer.
When a user says “help me add login”, a typical Coding Agent can jump straight into tables, APIs, and frontend pages, then report that the task is done. The problem is what may be missing before that implementation starts: Is this password login, OAuth, passkeys, or a migration path from an existing account system? What should happen after repeated failures? What are the error codes? Is this an MVP, an internal admin capability, or production-facing authentication? This is not only a model capability problem. It is an engineering workflow problem.
Human engineers carry an implicit checklist. Clarify the requirement. Align on the design. Isolate the workspace. Define correctness before writing the implementation. Review the code. Verify before delivery. A Coding Agent does not have those professional habits by default unless we encode them into its context and workflow.
That is where Superpowers is interesting. It does not try to replace the agent runtime. It asks a narrower engineering question: how do we make existing agents follow known software engineering practices? Its answer is to encode the workflow of an experienced engineer into agentic workflows that are triggerable, composable, and verifiable.
2. Engineering Practice as Reusable SOPs
At first glance, Superpowers looks like a collection of skills. In practice, it is closer to a software engineering methodology for Coding Agents. It breaks the development lifecycle into composable skills, then uses hooks and plugin manifests to adapt the same practices to Claude Code, Codex, Cursor, and other agent runtimes.
The static structure is straightforward. skills/ defines engineering SOPs. hooks/ injects startup behavior. Plugin manifests under .*-plugin/ adapt the same workflow to different runtimes. scripts/ and tests/ support synchronization, release, and verification:
superpowers/
├── skills/ # Triggerable engineering SOPs
│ ├── using-superpowers/
│ ├── brainstorming/
│ ├── writing-plans/
│ ├── test-driven-development/
│ ├── subagent-driven-development/
│ ├── requesting-code-review/
│ ├── receiving-code-review/
│ ├── verification-before-completion/
│ └── finishing-a-development-branch/
├── hooks/ # Inject startup rules into agent sessions
│ ├── session-start
│ ├── hooks.json
│ ├── hooks-cursor.json
│ └── run-hook.cmd
├── .claude-plugin/
├── .codex-plugin/
├── .cursor-plugin/
├── .opencode/ # Runtime-specific plugin adapters
├── docs/
├── scripts/
└── tests/
2.1 Skills: The Smallest Unit of Engineering Practice
Skills are the core of the methodology. Each skill is a progressively loadable set of markdown instructions. It defines when the skill applies, what steps to follow, what mistakes to avoid, and what completion looks like. In other words, it gives the agent a concrete engineering SOP for a specific phase:
| # | Phase | Skills | Practice | Agent problem |
|---|---|---|---|---|
| 1 | Bootstrap | using-superpowers | Check the workflow before acting | Forgetting to load relevant skills |
| 2 | Requirement Clarification | brainstorming | Clarify goals and design first | Starting implementation from a vague request |
| 3 | Planning | writing-plans | Write an executable implementation plan | Vague plans with no files, tests, or commands |
| 4 | Workspace Setup | using-git-worktrees | Isolate development work | Polluting the current branch or user changes |
| 5 | Execution | subagent-driven-development, executing-plans | Execute and track scoped tasks | Drifting in a long context or skipping steps |
| 6 | Parallel Investigation | dispatching-parallel-agents | Investigate independent questions in parallel | Serializing unrelated work |
| 7 | Development Discipline | test-driven-development | Red -> Green -> Refactor | Writing implementation first and tests later |
| 8 | Debugging | systematic-debugging | Find the root cause before fixing | Guessing at code changes |
| 9 | Review | requesting-code-review, receiving-code-review | Review early and handle feedback seriously | Self-confirmation or blind agreement with reviewers |
| 10 | Verification | verification-before-completion | Evidence before conclusions | Claiming completion without running checks |
| 11 | Finishing | finishing-a-development-branch | Choose merge, PR, keep, or discard | Leaving the repo in an unclear state |
| 12 | Meta | writing-skills | Write workflow documents with TDD | Skills that cannot be verified |
This table is not a hard-coded state machine. It is a representative task path. In real work, the Main Agent chooses skills from context: a failure triggers systematic-debugging; an independent investigation triggers dispatching-parallel-agents; writing a new skill triggers writing-skills; and so on.
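A minimal sketch of that context-driven selection follows. In practice the choice is made by the model reading skill descriptions, not by a lookup table; the signal names here are hypothetical labels invented for illustration, while the skill names come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    signal: str  # hypothetical label for something observed in context
    skill: str   # skill name taken from the table above


# Illustrative mapping only; the signal names are assumptions.
TRIGGERS = (
    Trigger("vague_feature_request", "brainstorming"),
    Trigger("test_failure", "systematic-debugging"),
    Trigger("independent_questions", "dispatching-parallel-agents"),
    Trigger("new_skill_requested", "writing-skills"),
)


def skills_for(signals: set[str]) -> list[str]:
    """Return the skills whose trigger signal is present, in declaration order."""
    return [t.skill for t in TRIGGERS if t.signal in signals]


print(skills_for({"test_failure", "new_skill_requested"}))
```

The point of the sketch is that there is no fixed sequence: the same function returns a different skill set whenever the observed signals change.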
2.2 Hooks: Inject the Startup Rule
Hooks are the startup injection layer. The important one is session-start: it reads the using-superpowers skill and emits it as extra context for the runtime. hooks.json registers that hook on the SessionStart event for triggers such as startup, clear, and compact, so the rule is re-injected whenever the context is reset. hooks-cursor.json connects the same startup rule to Cursor’s sessionStart.
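The injection step can be sketched in a few lines. This is not the project's actual hook, which is a script under hooks/; it only illustrates the idea of reading a skill file and handing it to the runtime as extra context. The file name SKILL.md and the JSON field names are assumptions and should be checked against the target runtime's hook documentation.

```python
import json
from pathlib import Path


def build_session_start_output(skill_path: Path) -> str:
    """Read a skill file and wrap it as extra context for the runtime."""
    skill_text = skill_path.read_text(encoding="utf-8")
    payload = {
        # Assumed output shape; field names vary by runtime.
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": skill_text,
        }
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Hypothetical location of the bootstrap skill inside the plugin.
    path = Path("skills/using-superpowers/SKILL.md")
    if path.exists():
        print(build_session_start_output(path))
```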
2.3 Plugin Manifests: The Cross-Runtime Adapter
Plugin manifests are the platform adapter layer. .codex-plugin/, .cursor-plugin/, and .claude-plugin/ serve Codex, Cursor, and Claude Code respectively. They define the skills directory, plugin metadata, hook configuration, and other runtime-specific wiring so the same engineering practice can be exposed to different Coding Agents.
3. How a Full Engineering Task Runs
The easiest way to understand the system is to walk through a typical task. First, the roles:
| Role | Responsibility | Why it matters |
|---|---|---|
| User | Defines the goal, confirms requirements, approves key decisions | Prevents the agent from expanding or misunderstanding the task |
| Agent Runtime | Runs the session, injects context, exposes tools | Determines how skills and hooks are loaded |
| Main Agent | Detects phase, loads skills, splits work, integrates results | Turns loose capabilities into an engineering process |
| Skill Loader | Reads matching skills | Brings the right SOP into context at the right time |
| Implementer Subagent | Implements, tests, commits, and self-checks scoped work | Isolates context and reduces pollution in the main session |
| Reviewer Subagent | Reviews code for compliance and quality | Separates “did we build the right thing” from “did we build it well” |
| Tool Executor | Runs shell, file, git, and test operations | Converts natural language decisions into verifiable actions |
3.1 Session Bootstrap: Put the Workflow into Context
Before the task starts, the Agent Runtime runs the session-start hook. The hook reads using-superpowers and injects the rule “check whether a relevant skill applies before acting” into the session:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(148,163,184,0.14)
Note over Runtime,Main: Session Bootstrap
Runtime->>Runtime: Run `session-start` hook
Runtime->>Tool: Read `using-superpowers`
Tool-->>Runtime: Return startup instructions
Runtime-->>Main: Start Main Agent with injected workflow rules
end
This step does not solve the user’s business problem. It installs the entry point for later decisions. As soon as the Main Agent starts, it knows that it should check for applicable skills.
using-superpowers behaves more like a soft scheduler than a traditional workflow engine. It does not maintain an external state table, and it does not hard-code “step one must be X, step two must be Y”. It simply puts one rule into context: before taking action, check whether a relevant skill should be loaded; when multiple skills apply, prefer process-oriented skills first.
The upside is that this is lightweight, cross-runtime, and easy to extend. Adding a new skill does not require changing the runtime. The tradeoff is that the constraint still depends on the Main Agent interpreting context correctly. It is not as strong as a code-level state machine.
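The "process-oriented skills first" rule amounts to a stable sort. The sketch below assumes a particular split between process and implementation skills; that classification is an assumption made for illustration, not something the runtime enforces.

```python
# Assumed classification: which skills count as "process-oriented".
PROCESS_SKILLS = {"brainstorming", "writing-plans", "systematic-debugging"}


def order_skills(applicable: list[str]) -> list[str]:
    """Put process-oriented skills first, preserving the original order otherwise."""
    # sorted() is stable, and False sorts before True.
    return sorted(applicable, key=lambda name: name not in PROCESS_SKILLS)


print(order_skills(["test-driven-development", "brainstorming"]))
```

Because the rule lives in context instead of code, nothing stops the agent from ignoring it. That is the tradeoff described above.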
3.2 Requirement Clarification: Clarify Before Implementing
After the user submits a task, the Main Agent uses using-superpowers to decide whether brainstorming applies. If the task involves design, features, or behavior changes, the agent first reads the project context, then clarifies goals, constraints, edge cases, and success criteria:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(59,130,246,0.14)
Note over User,Main: Requirement Clarification
User->>Main: Submit coding task
Main->>Skill: Check applicable skills via `using-superpowers`
Skill-->>Main: Use matching skill before proceeding
Main->>Skill: Load `brainstorming`
Skill-->>Main: Return brainstorming SOP
loop Until direction is clear
Main->>User: Clarify goals, constraints, edge cases, and success criteria
User-->>Main: Answer and refine direction
end
end
This is the most basic professional habit of engineering work: do not translate a one-line request directly into code. Turn an ambiguous goal into verifiable constraints, so the agent does not build something complete but wrong.
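One way to picture "until direction is clear" is a spec object that reports which parts are still empty. The class and field names below are hypothetical; they mirror the four things the clarification loop asks about.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    goals: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    success_criteria: list[str] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Names of the sections that still have no answer."""
        return [name for name, value in vars(self).items() if not value]


spec = Spec(goals=["password login for existing accounts"])
print(spec.open_questions())
```

The loop ends when `open_questions()` returns an empty list, not when the agent feels ready to start coding.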
3.3 Planning: Write the Plan Before the Code
After clarification, the Main Agent loads writing-plans and turns the confirmed spec into an implementation plan. This is not a one-line plan like “implement login”. A useful plan includes file paths, test code, commands, expected output, and commit steps, then asks the user for approval:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(34,197,94,0.13)
Note over Main,Tool: Planning
Main->>Skill: Load `writing-plans`
Skill-->>Main: Return planning SOP
Main->>Tool: Write implementation plan with files, tests, and commands
Tool-->>Main: Plan document created
Main->>User: Ask for plan approval
alt Changes requested
User-->>Main: Request plan changes
Main->>Tool: Revise implementation plan
Tool-->>Main: Updated plan
else Approved
User-->>Main: Approve plan
end
end
The plan is not ceremony. It externalizes implicit judgment before implementation starts: which files will change, which behaviors need tests, which commands prove the result, and which decisions require user confirmation. The later implementation, review, and verification now share the same reference point.
3.4 Isolated Workspace Setup: Protect the User’s Work
Before implementation, the Main Agent loads using-git-worktrees. It detects whether it is already in an isolated worktree, then creates or reuses a worktree and branch. The goal is to protect the current branch and any uncommitted user changes by separating the agent’s experiment space from the main workspace:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(245,158,11,0.14)
Note over Main,Tool: Isolated Workspace Setup
Main->>Skill: Load `using-git-worktrees`
Skill-->>Main: Return worktree SOP
Main->>Tool: Detect current workspace state
Tool-->>Main: Current repo, branch, and worktree status
Main->>Tool: Create or reuse isolated worktree and branch
Tool-->>Main: Worktree and branch ready
Main->>Tool: Run preflight checks if available
Tool-->>Main: Preflight result
alt Preflight fails
Main-->>User: Report blocking issue before implementation
else Preflight passes or no check exists
Main->>Main: Continue to plan execution
end
end
Once the workspace is isolated, the agent still needs to check that the starting point is trustworthy. Dependencies should install, the project should start, and existing test, lint, or build commands should run according to the current project state. This does not prove the new feature. It proves that the starting environment is usable. Otherwise, a later failure is hard to classify: did this change break something, or was the baseline already broken?
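The classification problem can be written down directly. In this sketch (all names are hypothetical) a missing preflight result makes a later failure unclassifiable, which is the argument for running the baseline check first.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    OK = "ok"
    REGRESSION = "this change broke something"
    BASELINE_BROKEN = "the baseline was already broken"
    UNKNOWN = "cannot tell without a baseline"


def classify(preflight_passed: Optional[bool], final_passed: bool) -> Verdict:
    """Attribute a failure using the preflight result, if one exists."""
    if final_passed:
        return Verdict.OK
    if preflight_passed is None:  # preflight was never run
        return Verdict.UNKNOWN
    return Verdict.REGRESSION if preflight_passed else Verdict.BASELINE_BROKEN


print(classify(None, False).value)
```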
3.5 Plan Execution: Subagents, TDD, and Small Loops
During execution, the Main Agent loads subagent-driven-development, assigns each scoped task to an Implementer Subagent, and the implementer loads test-driven-development when appropriate. The loop is Red, Green, Refactor: write the failing test, confirm it fails for the right reason, implement the minimum code, then refactor while keeping tests green.
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(168,85,247,0.13)
Note over Main,Impl: Plan Execution
Main->>Skill: Load `subagent-driven-development`
Skill-->>Main: Return subagent execution SOP
loop Each task in plan
Main->>Impl: Assign one task with acceptance criteria
Impl->>Skill: Load `test-driven-development`
Skill-->>Impl: Return TDD SOP
Impl->>Tool: Write failing test
Tool-->>Impl: RED: test fails for expected reason
Impl->>Tool: Implement minimal code
Tool-->>Impl: Code changed
Impl->>Tool: Run relevant tests
Tool-->>Impl: GREEN: tests pass
Impl->>Tool: Refactor and rerun tests
Tool-->>Impl: Tests still pass
Impl-->>Main: Return implementation result
end
end
This addresses one of the most common failure modes in long-context agent development: drift. The agent writes too much at once, loses track of the original acceptance criteria, and eventually relies on subjective judgment to declare completion. Breaking the plan into tasks and pushing each task through a TDD loop creates more small feedback cycles, which keeps both the output and the process aligned.
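The ordering constraint in Red, Green, Refactor can be expressed as a small guard. This is an illustrative sketch with hypothetical names, not part of the test-driven-development skill: it rejects a passing run that was never preceded by a failing one.

```python
class TddCycle:
    """Tracks test runs for one task and enforces red before green."""

    def __init__(self) -> None:
        self.saw_red = False
        self.is_green = False

    def record_test_run(self, passed: bool) -> None:
        if not passed:
            # RED, or a refactor that broke something: back to failing.
            self.saw_red = True
            self.is_green = False
        elif self.saw_red:
            self.is_green = True
        else:
            raise RuntimeError("test passed before it was ever seen failing")

    @property
    def ready_to_report(self) -> bool:
        return self.saw_red and self.is_green


cycle = TddCycle()
cycle.record_test_run(False)  # RED
cycle.record_test_run(True)   # GREEN
print(cycle.ready_to_report)
```

A test that passes on its first run proves nothing about the new behavior, so the guard treats it as an error instead of a success.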
3.6 Two-Stage Review: First Correctness, Then Quality
After implementation, the Main Agent loads requesting-code-review. The review is deliberately split into two passes. First comes spec compliance review: did we build the right thing, did we miss requirements, did we add behavior that was not requested? Only after that passes does the agent request code quality review: structure, tests, maintainability, and production risk.
| Review type | Question | Typical issues | Example |
|---|---|---|---|
| spec compliance review | Did we build the right thing? | Missing requirements, extra behavior, misunderstood acceptance criteria | Requirement says 10 failed login attempts return 429; implementation returns 403 after 5 attempts |
| code quality review | Did we build it well? | Duplication, coupling, weak error handling, shallow tests | Behavior is correct, but rate limiting is hard-coded in the controller |
The interaction looks like this:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(236,72,153,0.13)
Note over Main,Review: Two-Stage Review
Main->>Skill: Load `requesting-code-review`
Skill-->>Main: Return review request SOP
Main->>Review: Request spec compliance review
Review-->>Main: Spec review result
alt Spec issue found
Main->>Skill: Load `receiving-code-review`
Skill-->>Main: Return review handling SOP
Main->>Impl: Send required spec fixes
Impl->>Tool: Fix implementation and rerun relevant tests
Tool-->>Impl: Verification result
Impl-->>Main: Updated implementation result
else Spec passes
Main->>Review: Request code quality review
Review-->>Main: Quality review result
alt Quality issue found
Main->>Skill: Load `receiving-code-review`
Skill-->>Main: Return review handling SOP
Main->>Impl: Send required quality fixes
Impl->>Tool: Fix implementation and rerun relevant tests
Tool-->>Impl: Verification result
Impl-->>Main: Updated implementation result
else Quality review passes
Main->>Tool: Mark task complete in plan
Tool-->>Main: Plan document updated
end
end
end
This is not process obsession. It avoids a common false positive: high-quality code can still solve the wrong problem. When review finds an issue, the Main Agent loads receiving-code-review, turns the feedback into a concrete fix, and returns to the implementation and testing loop.
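The gating between the two passes is simple to state in code. In this sketch the reviewers are plain callables that return a list of issues; the function and type names are hypothetical.

```python
from typing import Callable

# A reviewer takes a description of the change and returns a list of issues.
ReviewFn = Callable[[str], list[str]]


def two_stage_review(
    change: str, spec_review: ReviewFn, quality_review: ReviewFn
) -> tuple[str, list[str]]:
    """Run the quality review only after the spec review comes back clean."""
    spec_issues = spec_review(change)
    if spec_issues:
        return ("spec", spec_issues)
    quality_issues = quality_review(change)
    if quality_issues:
        return ("quality", quality_issues)
    return ("passed", [])


stage, issues = two_stage_review(
    "login rate limiting",
    spec_review=lambda change: ["returns 403 after 5 attempts; spec says 429 after 10"],
    quality_review=lambda change: [],
)
print(stage, issues)
```

When the spec review fails, the quality reviewer is never called, so no effort goes into polishing code that solves the wrong problem.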
3.7 Final Verification: Evidence Before Claims
After all tasks are complete, the Main Agent loads verification-before-completion and runs the full checks: tests, lint, build, and any key project-specific validation. If verification fails, it loads systematic-debugging, reproduces the failure, gathers evidence, identifies the root cause, and only then decides how to fix it.
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(20,184,166,0.13)
Note over Main,Tool: Final Verification
Main->>Skill: Load `verification-before-completion`
Skill-->>Main: Return completion verification SOP
Main->>Tool: Run full tests, lint, build, and key checks
Tool-->>Main: Verification output
alt Verification fails
Main->>Skill: Load `systematic-debugging`
Skill-->>Main: Return debugging SOP
Main->>Tool: Reproduce failure and collect evidence
Tool-->>Main: Evidence and logs
Main->>Impl: Return to fix loop
else Verification passes
Main->>Main: Completion evidence is ready
end
end
This step targets one of the most dangerous agent habits: saying “done” without current verification. The workflow puts evidence before the conclusion. The agent must run the command, read the output, confirm the result, and only then claim completion.
It is useful to separate this from preflight checks. Both may look like “run tests”, but they answer different questions. Preflight happens before work starts and asks whether the environment is trustworthy. Final verification happens before delivery and asks whether the result is trustworthy.
| Gate | Timing | Purpose | Examples |
|---|---|---|---|
| preflight checks | Before implementation | Confirm that the workspace, dependencies, and baseline project state are usable | test, lint, build, dependency check, startup check |
| final verification | Before delivery | Prove that the change delivers the intended result and breaks nothing else | full tests, key regressions, build, requirements checklist |
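"Evidence before claims" can be sketched as a function that refuses to report success without fresh command output. The names below are hypothetical, and the commands in the usage example are placeholders for a project's real test, lint, and build commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str


def run_checks(commands: list[list[str]]) -> list[Evidence]:
    """Run each check and keep its exit code and output as evidence."""
    evidence = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        evidence.append(Evidence(cmd, proc.returncode, proc.stdout + proc.stderr))
    return evidence


def may_claim_done(evidence: list[Evidence]) -> bool:
    """No evidence means no claim; any failing check also blocks the claim."""
    return bool(evidence) and all(e.exit_code == 0 for e in evidence)


# Placeholder check; a real project would run its own test and build commands.
results = run_checks([[sys.executable, "-c", "print('checks ran')"]])
print(may_claim_done(results))
```

An empty evidence list returns `False` on purpose: not having run anything is treated the same as having failed.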
3.8 Branch Finishing: Close the Development State
After verification passes, the Main Agent loads finishing-a-development-branch, presents clear options such as merge, PR, keep, or discard, and reports the verification evidence:
sequenceDiagram
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(99,102,241,0.13)
Note over Main,User: Branch Finishing
Main->>Skill: Load `finishing-a-development-branch`
Skill-->>Main: Return branch finishing SOP
Main->>User: Offer merge, PR, keep, or discard
User-->>Main: Choose finishing action
Main->>Tool: Execute chosen git action
Tool-->>Main: Branch finishing done
Main-->>User: Report result with verification evidence
end
This may look like a small cleanup step, but it matters. Code being written is not the same as the task being finished. Whether to merge, open a PR, keep a worktree, or discard an experiment is a separate decision. Without it, the agent easily leaves behind a repository state that feels finished in the conversation but remains unclear in Git.
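The four finishing options can be made explicit as a mapping from the user's choice to git commands. This is only a sketch of one plausible mapping, not what finishing-a-development-branch actually runs; the function name, the default base branch, and the remote name `origin` are assumptions. It builds the commands without executing them.

```python
def finishing_commands(
    action: str, branch: str, worktree_path: str, base: str = "main"
) -> list[list[str]]:
    """Translate an explicit user choice into the git commands it implies."""
    if action == "merge":
        return [["git", "checkout", base], ["git", "merge", branch]]
    if action == "pr":
        # Push the branch; opening the PR itself depends on the hosting service.
        return [["git", "push", "-u", "origin", branch]]
    if action == "keep":
        return []  # leave the worktree and branch untouched
    if action == "discard":
        return [
            ["git", "worktree", "remove", "--force", worktree_path],
            ["git", "branch", "-D", branch],
        ]
    raise ValueError(f"unknown finishing action: {action!r}")


print(finishing_commands("discard", "feature/login", "../wt-login"))
```

Raising on an unknown action reflects the intent of the step: the repository state changes only through a choice the user actually made.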
At this point the main shape of Superpowers is visible: it breaks a development task into triggerable engineering checkpoints and pulls the agent into a standard process at each important moment.
4. Making Practice the Default Path
Superpowers is a workflow layer, not a complete agent runtime. It does not solve evals, observability, cost control, long-term memory, or permission management. It does not provide a hard state machine, a task database, or a unified scheduler. Skill execution quality still depends on the underlying model, and supporting multiple runtimes creates maintenance cost.
That limitation makes its value clearer. Superpowers does not replace the runtime. It injects the default path of an experienced engineer into the runtime. It does not finish the entire eval story, but it encodes task-level constraints such as TDD, review, verification, and human confirmation very clearly.
The valuable part is turning tacit engineering practice into the agent’s default path: ask first, plan first, isolate the workspace, write tests, accept review, and deliver with evidence. The real superpower is not speed of output. It is refusing to skip judgment at the moments where judgment matters.
5. Appendix: Full Task Sequence
The full task sequence looks like this:
sequenceDiagram
autonumber
actor User as User
participant Runtime as Agent Runtime
participant Main as Main Agent
participant Skill as Skill Loader
participant Impl as Implementer Subagent
participant Review as Reviewer Subagent
participant Tool as Tool Executor
rect rgba(148,163,184,0.14)
Note over Runtime,Main: Session Bootstrap
Runtime->>Runtime: Run `session-start` hook
Runtime->>Tool: Read `using-superpowers`
Tool-->>Runtime: Return startup instructions
Runtime-->>Main: Start Main Agent with injected workflow rules
end
rect rgba(59,130,246,0.14)
Note over User,Main: Requirement Clarification
User->>Main: Submit coding task
Main->>Skill: Check applicable skills via `using-superpowers`
Skill-->>Main: Use matching skill before proceeding
Main->>Skill: Load `brainstorming`
Skill-->>Main: Return brainstorming SOP
loop Until direction is clear
Main->>User: Clarify goals, constraints, edge cases, and success criteria
User-->>Main: Answer and refine direction
end
end
rect rgba(34,197,94,0.13)
Note over Main,Tool: Planning
Main->>Skill: Load `writing-plans`
Skill-->>Main: Return planning SOP
Main->>Tool: Write implementation plan with files, tests, and commands
Tool-->>Main: Plan document created
Main->>User: Ask for plan approval
alt Changes requested
User-->>Main: Request plan changes
Main->>Tool: Revise implementation plan
Tool-->>Main: Updated plan
else Approved
User-->>Main: Approve plan
end
end
rect rgba(245,158,11,0.14)
Note over Main,Tool: Isolated Workspace Setup
Main->>Skill: Load `using-git-worktrees`
Skill-->>Main: Return worktree SOP
Main->>Tool: Detect current workspace state
Tool-->>Main: Current repo, branch, and worktree status
Main->>Tool: Create or reuse isolated worktree and branch
Tool-->>Main: Worktree and branch ready
Main->>Tool: Run preflight checks if available
Tool-->>Main: Preflight result
alt Preflight fails
Main-->>User: Report blocking issue before implementation
else Preflight passes or no check exists
Main->>Main: Continue to plan execution
end
end
rect rgba(168,85,247,0.13)
Note over Main,Impl: Plan Execution
Main->>Skill: Load `subagent-driven-development`
Skill-->>Main: Return subagent execution SOP
loop Each task in plan
Main->>Impl: Assign one task with acceptance criteria
Impl->>Skill: Load `test-driven-development`
Skill-->>Impl: Return TDD SOP
Impl->>Tool: Write failing test
Tool-->>Impl: RED: test fails for expected reason
Impl->>Tool: Implement minimal code
Tool-->>Impl: Code changed
Impl->>Tool: Run relevant tests
Tool-->>Impl: GREEN: tests pass
Impl->>Tool: Refactor and rerun tests
Tool-->>Impl: Tests still pass
Impl-->>Main: Return implementation result
end
end
rect rgba(236,72,153,0.13)
Note over Main,Review: Two-Stage Review
Main->>Skill: Load `requesting-code-review`
Skill-->>Main: Return review request SOP
Main->>Review: Request spec compliance review
Review-->>Main: Spec review result
alt Spec issue found
Main->>Skill: Load `receiving-code-review`
Skill-->>Main: Return review handling SOP
Main->>Impl: Send required spec fixes
Impl->>Tool: Fix implementation and rerun relevant tests
Tool-->>Impl: Verification result
Impl-->>Main: Updated implementation result
else Spec passes
Main->>Review: Request code quality review
Review-->>Main: Quality review result
alt Quality issue found
Main->>Skill: Load `receiving-code-review`
Skill-->>Main: Return review handling SOP
Main->>Impl: Send required quality fixes
Impl->>Tool: Fix implementation and rerun relevant tests
Tool-->>Impl: Verification result
Impl-->>Main: Updated implementation result
else Quality review passes
Main->>Tool: Mark task complete in plan
Tool-->>Main: Plan document updated
end
end
end
rect rgba(20,184,166,0.13)
Note over Main,Tool: Final Verification
Main->>Skill: Load `verification-before-completion`
Skill-->>Main: Return completion verification SOP
Main->>Tool: Run full tests, lint, build, and key checks
Tool-->>Main: Verification output
alt Verification fails
Main->>Skill: Load `systematic-debugging`
Skill-->>Main: Return debugging SOP
Main->>Tool: Reproduce failure and collect evidence
Tool-->>Main: Evidence and logs
Main->>Impl: Return to fix loop
else Verification passes
Main->>Main: Completion evidence is ready
end
end
rect rgba(99,102,241,0.13)
Note over Main,User: Branch Finishing
Main->>Skill: Load `finishing-a-development-branch`
Skill-->>Main: Return branch finishing SOP
Main->>User: Offer merge, PR, keep, or discard
User-->>Main: Choose finishing action
Main->>Tool: Execute chosen git action
Tool-->>Main: Branch finishing done
Main-->>User: Report result with verification evidence
end