Zhihao

1. The Problem Is Workflow, Not Output

Modern Coding Agents can write a lot of code. They can read a repository, edit files, run commands, debug failures, and often produce implementation-level work that looks close to what a senior engineer would write inside a well-scoped module. The harder problem is usually not that the agent cannot write code. It is that the agent does not reliably work like an engineer.

When a user says “help me add login”, a typical Coding Agent can jump straight into tables, APIs, and frontend pages, then report that the task is done. The problem is what may be missing before that implementation starts: Is this password login, OAuth, passkeys, or a migration path from an existing account system? What should happen after repeated failures? What are the error codes? Is this an MVP, an internal admin capability, or production-facing authentication? This is not only a model capability problem. It is an engineering workflow problem.

Human engineers carry an implicit checklist. Clarify the requirement. Align on the design. Isolate the workspace. Define correctness before writing the implementation. Review the code. Verify before delivery. A Coding Agent does not have those professional habits by default unless we encode them into its context and workflow.

That is where Superpowers is interesting. It does not try to replace the agent runtime. It asks a narrower engineering question: how do we make existing agents follow known software engineering practices? Its answer is to encode the workflow of an experienced engineer into agentic workflows that are triggerable, composable, and verifiable.

2. Engineering Practice as Reusable SOPs

At first glance, Superpowers looks like a collection of skills. In practice, it is closer to a software engineering methodology for Coding Agents. It breaks the development lifecycle into composable skills, then uses hooks and plugin manifests to adapt the same practices to Claude Code, Codex, Cursor, and other agent runtimes.

The static structure is straightforward. skills/ defines engineering SOPs. hooks/ injects startup behavior. Plugin manifests under .*-plugin/ adapt the same workflow to different runtimes. scripts/ and tests/ support synchronization, release, and verification:

Text
superpowers/
├── skills/                         # Triggerable engineering SOPs
│   ├── using-superpowers/
│   ├── brainstorming/
│   ├── writing-plans/
│   ├── test-driven-development/
│   ├── subagent-driven-development/
│   ├── requesting-code-review/
│   ├── receiving-code-review/
│   ├── verification-before-completion/
│   └── finishing-a-development-branch/
├── hooks/                          # Inject startup rules into agent sessions
│   ├── session-start
│   ├── hooks.json
│   ├── hooks-cursor.json
│   └── run-hook.cmd
├── .claude-plugin/
├── .codex-plugin/
├── .cursor-plugin/
├── .opencode/                      # Runtime-specific plugin adapters
├── docs/
├── scripts/
└── tests/

2.1 Skills: The Smallest Unit of Engineering Practice

Skills are the core of the methodology. Each skill is a progressively loadable set of markdown instructions. It defines when the skill applies, what steps to follow, what mistakes to avoid, and what completion looks like. In other words, it gives the agent a concrete engineering SOP for a specific phase:

| # | Phase | Skills | Practice | Agent problem |
|---|-------|--------|----------|---------------|
| 1 | Bootstrap | using-superpowers | Check the workflow before acting | Forgetting to load relevant skills |
| 2 | Requirement Clarification | brainstorming | Clarify goals and design first | Starting implementation from a vague request |
| 3 | Planning | writing-plans | Write an executable implementation plan | Vague plans with no files, tests, or commands |
| 4 | Workspace Setup | using-git-worktrees | Isolate development work | Polluting the current branch or user changes |
| 5 | Execution | subagent-driven-development, executing-plans | Execute and track scoped tasks | Drifting in a long context or skipping steps |
| 6 | Parallel Investigation | dispatching-parallel-agents | Investigate independent questions in parallel | Serializing unrelated work |
| 7 | Development Discipline | test-driven-development | Red -> Green -> Refactor | Writing implementation first and tests later |
| 8 | Debugging | systematic-debugging | Find the root cause before fixing | Guessing at code changes |
| 9 | Review | requesting-code-review, receiving-code-review | Review early and handle feedback seriously | Self-confirmation or blind agreement with reviewers |
| 10 | Verification | verification-before-completion | Evidence before conclusions | Claiming completion without running checks |
| 11 | Finishing | finishing-a-development-branch | Choose merge, PR, keep, or discard | Leaving the repo in an unclear state |
| 12 | Meta | writing-skills | Write workflow documents with TDD | Skills that cannot be verified |

This table is not a hard-coded state machine. It is a representative task path. In real work, the Main Agent chooses skills from context: a failure triggers systematic-debugging; an independent investigation triggers dispatching-parallel-agents; writing a new skill triggers writing-skills; and so on.
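
The context-driven choice described above can be sketched as a lookup from an observed situation to a skill name. This is only an illustration in Python, not how Superpowers is implemented: in the real system the model makes this decision from the skill descriptions in its context. The `Situation` enum, the `SKILL_FOR` mapping, and `choose_skill` are hypothetical names; only the skill names come from the table.

```python
from enum import Enum, auto


class Situation(Enum):
    """Hypothetical labels for what the Main Agent observes in context."""
    VAGUE_FEATURE_REQUEST = auto()
    TEST_FAILURE = auto()
    INDEPENDENT_QUESTIONS = auto()
    NEW_SKILL_NEEDED = auto()
    ABOUT_TO_CLAIM_DONE = auto()


# Skill names are taken from the table; the mapping itself is an assumption.
SKILL_FOR = {
    Situation.VAGUE_FEATURE_REQUEST: "brainstorming",
    Situation.TEST_FAILURE: "systematic-debugging",
    Situation.INDEPENDENT_QUESTIONS: "dispatching-parallel-agents",
    Situation.NEW_SKILL_NEEDED: "writing-skills",
    Situation.ABOUT_TO_CLAIM_DONE: "verification-before-completion",
}


def choose_skill(situation: Situation) -> str:
    """Return the skill to load for a situation; there is no fixed step order."""
    return SKILL_FOR[situation]
```

The point of the lookup shape is that any situation can occur at any time, which is what distinguishes a representative path from a state machine.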

2.2 Hooks: Inject the Startup Rule

Hooks are the startup injection layer. The important one is session-start: it reads the using-superpowers skill and emits it as extra context for the runtime. hooks.json registers that hook on the SessionStart event for session sources such as startup, clear, and compact, so the rule is re-injected whenever the context is reset. hooks-cursor.json connects the same startup rule to Cursor’s sessionStart.
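
A rough Python sketch of what such a startup hook does: read the bootstrap skill file and print it in a structured form the runtime can inject. The real hook in the repository is a script with its own details; the function name `build_session_context`, the `SKILL.md` stand-in file, and the exact JSON shape here are assumptions for illustration, and each runtime defines its own hook output format.

```python
import json
import tempfile
from pathlib import Path


def build_session_context(skill_file: Path) -> str:
    """Read the bootstrap skill and wrap it as extra context for the runtime.

    The JSON shape below is an assumption; consult the runtime's hook
    documentation for the format it actually expects.
    """
    skill_text = skill_file.read_text(encoding="utf-8")
    payload = {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": skill_text,
        }
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Stand-in skill file so the sketch runs without the real repository.
    with tempfile.TemporaryDirectory() as tmp:
        skill = Path(tmp) / "SKILL.md"
        skill.write_text("Check for a relevant skill before acting.", encoding="utf-8")
        print(build_session_context(skill))
```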

2.3 Plugin Manifests: The Cross-Runtime Adapter

Plugin manifests are the platform adapter layer. .codex-plugin/, .cursor-plugin/, and .claude-plugin/ serve Codex, Cursor, and Claude Code respectively. They define the skills directory, plugin metadata, hook configuration, and other runtime-specific wiring so the same engineering practice can be exposed to different Coding Agents.

3. How a Full Engineering Task Runs

The easiest way to understand the system is to walk through a typical task. First, the roles:

| Role | Responsibility | Why it matters |
|------|----------------|----------------|
| User | Defines the goal, confirms requirements, approves key decisions | Prevents the agent from expanding or misunderstanding the task |
| Agent Runtime | Runs the session, injects context, exposes tools | Determines how skills and hooks are loaded |
| Main Agent | Detects phase, loads skills, splits work, integrates results | Turns loose capabilities into an engineering process |
| Skill Loader | Reads matching skills | Brings the right SOP into context at the right time |
| Implementer Subagent | Implements, tests, commits, and self-checks scoped work | Isolates context and reduces pollution in the main session |
| Reviewer Subagent | Reviews code for compliance and quality | Separates “did we build the right thing” from “did we build it well” |
| Tool Executor | Runs shell, file, git, and test operations | Converts natural language decisions into verifiable actions |

3.1 Session Bootstrap: Put the Workflow into Context

Before the task starts, the Agent Runtime runs the session-start hook. The hook reads using-superpowers and injects the rule “check whether a relevant skill applies before acting” into the session:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(148,163,184,0.14)
        Note over Runtime,Main: Session Bootstrap
        Runtime->>Runtime: Run `session-start` hook
        Runtime->>Tool: Read `using-superpowers`
        Tool-->>Runtime: Return startup instructions
        Runtime-->>Main: Start Main Agent with injected workflow rules
    end

This step does not solve the user’s business problem. It installs the entry point for later decisions. As soon as the Main Agent starts, it knows that it should check for applicable skills.

using-superpowers behaves more like a soft scheduler than a traditional workflow engine. It does not maintain an external state table, and it does not hard-code “step one must be X, step two must be Y”. It simply puts one rule into context: before taking action, check whether a relevant skill should be loaded; when multiple skills apply, prefer process-oriented skills first.

The upside is that this is lightweight, cross-runtime, and easy to extend. Adding a new skill does not require changing the runtime. The tradeoff is that the constraint still depends on the Main Agent interpreting context correctly. It is not as strong as a code-level state machine.
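
The "process-oriented skills first" rule can be pictured as a stable sort over whichever skills currently apply. This Python sketch is an assumption about how one might model the rule, not code from Superpowers, where the ordering is expressed as prose instructions to the model. `SkillMatch`, its `is_process_skill` flag, and `order_skills` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillMatch:
    name: str
    is_process_skill: bool  # hypothetical flag: does the skill govern how to work?


def order_skills(matches: list[SkillMatch]) -> list[str]:
    """Put process-oriented skills first, keeping the input order otherwise."""
    # sorted() is stable, and False sorts before True, so process skills lead.
    ranked = sorted(matches, key=lambda m: not m.is_process_skill)
    return [m.name for m in ranked]
```

For example, `order_skills([SkillMatch("some-implementation-skill", False), SkillMatch("brainstorming", True)])` puts `brainstorming` first. Nothing enforces this at runtime, which is exactly the tradeoff noted above.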

3.2 Requirement Clarification: Clarify Before Implementing

After the user submits a task, the Main Agent uses using-superpowers to decide whether brainstorming applies. If the task involves design, features, or behavior changes, the agent first reads the project context, then clarifies goals, constraints, edge cases, and success criteria:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(59,130,246,0.14)
        Note over User,Main: Requirement Clarification
        User->>Main: Submit coding task
        Main->>Skill: Check applicable skills via `using-superpowers`
        Skill-->>Main: Use matching skill before proceeding
        Main->>Skill: Load `brainstorming`
        Skill-->>Main: Return brainstorming SOP
        loop Until direction is clear
            Main->>User: Clarify goals, constraints, edge cases, and success criteria
            User-->>Main: Answer and refine direction
        end
    end

This is the most basic professional habit of engineering work: do not translate a one-line request directly into code. Turn an ambiguous goal into verifiable constraints, so the agent does not build something complete but wrong.
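
One way to picture "verifiable constraints" is a spec object whose undecided fields are explicit. The sketch below uses the login example from the introduction; `LoginSpec` and its fields are hypothetical and only illustrate the idea that clarification continues until nothing is left open.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LoginSpec:
    """Hypothetical spec for the "add login" request; None means undecided."""
    auth_method: Optional[str] = None          # e.g. "password", "oauth", "passkey"
    max_failed_attempts: Optional[int] = None  # lockout threshold
    lockout_status_code: Optional[int] = None  # HTTP status returned once locked out
    audience: Optional[str] = None             # e.g. "mvp", "internal-admin", "production"


def open_questions(spec: LoginSpec) -> list[str]:
    """List every field the user has not decided yet, in declaration order."""
    return [f.name for f in fields(spec) if getattr(spec, f.name) is None]


def ready_to_plan(spec: LoginSpec) -> bool:
    """Clarification ends only when no question is left open."""
    return not open_questions(spec)
```

A one-line request such as "help me add login" corresponds to `LoginSpec()`, for which every field is still an open question.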

3.3 Planning: Write the Plan Before the Code

After clarification, the Main Agent loads writing-plans and turns the confirmed spec into an implementation plan. This is not a one-line plan like “implement login”. A useful plan includes file paths, test code, commands, expected output, and commit steps, then asks the user for approval:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(34,197,94,0.13)
        Note over Main,Tool: Planning
        Main->>Skill: Load `writing-plans`
        Skill-->>Main: Return planning SOP
        Main->>Tool: Write implementation plan with files, tests, and commands
        Tool-->>Main: Plan document created
        Main->>User: Ask for plan approval
        alt Changes requested
            User-->>Main: Request plan changes
            Main->>Tool: Revise implementation plan
            Tool-->>Main: Updated plan
        else Approved
            User-->>Main: Approve plan
        end
    end

The plan is not ceremony. It externalizes implicit judgment before implementation starts: which files will change, which behaviors need tests, which commands prove the result, and which decisions require user confirmation. The later implementation, review, and verification now share the same reference point.
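
The difference between a vague plan and an executable one can be made concrete with a small check. This is a hedged Python sketch: `PlanTask` and `missing_details` are hypothetical names, and the field list simply mirrors the plan contents named above (file paths, test code, commands, expected output, commit steps).

```python
from dataclasses import dataclass, field


@dataclass
class PlanTask:
    """One task in a hypothetical implementation plan."""
    title: str
    files: list[str] = field(default_factory=list)     # paths to create or modify
    test_code: str = ""                                # the failing test to write first
    commands: list[str] = field(default_factory=list)  # commands that prove the result
    expected_output: str = ""                          # what those commands should print
    commit_message: str = ""


def missing_details(task: PlanTask) -> list[str]:
    """Name the parts a vague task leaves out; an empty list means executable."""
    checks = {
        "files": task.files,
        "test_code": task.test_code.strip(),
        "commands": task.commands,
        "expected_output": task.expected_output.strip(),
        "commit_message": task.commit_message.strip(),
    }
    return [name for name, value in checks.items() if not value]
```

A task that consists only of the title "implement login" fails every check, which is the signal to keep planning rather than start coding.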

3.4 Isolated Workspace Setup: Protect the User’s Work

Before implementation, the Main Agent loads using-git-worktrees. It detects whether it is already in an isolated worktree, then creates or reuses a worktree and branch. The goal is to protect the current branch and any uncommitted user changes by separating the agent’s experiment space from the main workspace:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(245,158,11,0.14)
        Note over Main,Tool: Isolated Workspace Setup
        Main->>Skill: Load `using-git-worktrees`
        Skill-->>Main: Return worktree SOP
        Main->>Tool: Detect current workspace state
        Tool-->>Main: Current repo, branch, and worktree status
        Main->>Tool: Create or reuse isolated worktree and branch
        Tool-->>Main: Worktree and branch ready
        Main->>Tool: Run preflight checks if available
        Tool-->>Main: Preflight result
        alt Preflight fails
            Main-->>User: Report blocking issue before implementation
        else Preflight passes or no check exists
            Main->>Main: Continue to plan execution
        end
    end

Once the workspace is isolated, the agent still needs to check that the starting point is trustworthy. Dependencies should install, the project should start, and the existing test, lint, or build commands should produce the results the project currently expects. This does not prove the new feature. It proves that the starting environment is usable. Otherwise, a later failure is hard to classify: did this change break something, or was the baseline already broken?
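
The two ideas in this step, isolating the work and recording a baseline, can be sketched as follows. The commands are built but not executed, so the example has no side effects. `worktree_commands` and `classify_failure` are hypothetical helpers, and the branch name and path a caller passes are its own choice; `git status --porcelain` and `git worktree add -b` are standard git commands.

```python
from typing import Optional


def worktree_commands(branch: str, path: str) -> list[list[str]]:
    """Build the git commands for an isolated worktree on a new branch."""
    return [
        ["git", "status", "--porcelain"],                # detect uncommitted user changes
        ["git", "worktree", "add", "-b", branch, path],  # new branch in its own directory
    ]


def classify_failure(baseline_passed: Optional[bool], final_passed: bool) -> str:
    """Attribute a failing final check using the preflight result.

    baseline_passed is None when no preflight check was run.
    """
    if final_passed:
        return "ok"
    if baseline_passed is None:
        return "unknown: no preflight was run, so the failure cannot be attributed"
    if baseline_passed:
        return "regression: this change broke something"
    return "pre-existing: the baseline was already broken"
```

Only the case with a recorded passing baseline lets the agent say with confidence that its own change caused the failure.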

3.5 Plan Execution: Subagents, TDD, and Small Loops

During execution, the Main Agent loads subagent-driven-development, assigns each scoped task to an Implementer Subagent, and the implementer loads test-driven-development when appropriate. The loop is Red, Green, Refactor: write the failing test, confirm it fails for the right reason, implement the minimum code, then refactor while keeping tests green.

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(168,85,247,0.13)
        Note over Main,Impl: Plan Execution
        Main->>Skill: Load `subagent-driven-development`
        Skill-->>Main: Return subagent execution SOP
        loop Each task in plan
            Main->>Impl: Assign one task with acceptance criteria
            Impl->>Skill: Load `test-driven-development`
            Skill-->>Impl: Return TDD SOP
            Impl->>Tool: Write failing test
            Tool-->>Impl: RED: test fails for expected reason
            Impl->>Tool: Implement minimal code
            Tool-->>Impl: Code changed
            Impl->>Tool: Run relevant tests
            Tool-->>Impl: GREEN: tests pass
            Impl->>Tool: Refactor and rerun tests
            Tool-->>Impl: Tests still pass
            Impl-->>Main: Return implementation result
        end
    end

This addresses one of the most common failure modes in long-context agent development: drift. The agent writes too much at once, loses track of the original acceptance criteria, and eventually relies on subjective judgment to declare completion. Breaking the plan into tasks and pushing each task through a TDD loop creates many small feedback cycles, each of which pulls the output and the process back to the plan.
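
The Red and Green steps can be shown in miniature. In this sketch the requirement (the 10th failed login attempt returns 429, earlier ones 401) is a hypothetical example, as are the names `check_lockout`, `stub`, `minimal`, and `red_green`. The important part is that a pass is accepted only after a failure has actually been observed.

```python
from typing import Callable

# An implementation maps a count of failed attempts to an HTTP status code.
Impl = Callable[[int], int]


def check_lockout(status_for: Impl) -> None:
    """Hypothetical requirement: the 10th failed attempt returns 429."""
    assert status_for(9) == 401
    assert status_for(10) == 429


def stub(failed_attempts: int) -> int:
    return 401  # no lockout yet, so the check must fail (RED)


def minimal(failed_attempts: int) -> int:
    return 429 if failed_attempts >= 10 else 401  # smallest change that passes (GREEN)


def passes(check: Callable[[Impl], None], impl: Impl) -> bool:
    """Run the check and report whether it raised an AssertionError."""
    try:
        check(impl)
        return True
    except AssertionError:
        return False


def red_green(check: Callable[[Impl], None], before: Impl, after: Impl) -> list[str]:
    """Require an observed failure before accepting a pass."""
    if passes(check, before):
        raise RuntimeError("check passed before implementation; it proves nothing")
    if not passes(check, after):
        raise RuntimeError("implementation does not satisfy the check")
    return ["RED", "GREEN"]
```

Calling `red_green(check_lockout, stub, minimal)` succeeds, while passing `minimal` as both arguments raises, because a test that never failed gives no evidence. Refactoring would then rerun `check_lockout` after each structural change.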

3.6 Two-Stage Review: First Correctness, Then Quality

After implementation, the Main Agent loads requesting-code-review. The review is deliberately split into two passes. First comes spec compliance review: did we build the right thing, did we miss requirements, did we add behavior that was not requested? Only after that passes does the agent request code quality review: structure, tests, maintainability, and production risk.

| Review type | Question | Typical issues | Example |
|-------------|----------|----------------|---------|
| spec compliance review | Did we build the right thing? | Missing requirements, extra behavior, misunderstood acceptance criteria | Requirement says 10 failed login attempts return 429; implementation returns 403 after 5 attempts |
| code quality review | Did we build it well? | Duplication, coupling, weak error handling, shallow tests | Behavior is correct, but rate limiting is hard-coded in the controller |

The interaction looks like this:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(236,72,153,0.13)
        Note over Main,Review: Two-Stage Review
        Main->>Skill: Load `requesting-code-review`
        Skill-->>Main: Return review request SOP
        Main->>Review: Request spec compliance review
        Review-->>Main: Spec review result
        alt Spec issue found
            Main->>Skill: Load `receiving-code-review`
            Skill-->>Main: Return review handling SOP
            Main->>Impl: Send required spec fixes
            Impl->>Tool: Fix implementation and rerun relevant tests
            Tool-->>Impl: Verification result
            Impl-->>Main: Updated implementation result
        else Spec passes
            Main->>Review: Request code quality review
            Review-->>Main: Quality review result
            alt Quality issue found
                Main->>Skill: Load `receiving-code-review`
                Skill-->>Main: Return review handling SOP
                Main->>Impl: Send required quality fixes
                Impl->>Tool: Fix implementation and rerun relevant tests
                Tool-->>Impl: Verification result
                Impl-->>Main: Updated implementation result
            else Quality review passes
                Main->>Tool: Mark task complete in plan
                Tool-->>Main: Plan document updated
            end
        end
    end

This is not process obsession. It avoids a common false positive: high-quality code can still solve the wrong problem. When review finds an issue, the Main Agent loads receiving-code-review, turns the feedback into a concrete fix, and returns to the implementation and testing loop.
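
The ordering of the two passes can be sketched as a simple gate. This is an illustrative Python model, not Superpowers code: `Review` and `two_stage_review` are hypothetical names, and each review is represented as a function returning a list of issues.

```python
from typing import Callable

# A review returns a list of issues; an empty list means the review passed.
Review = Callable[[], list[str]]


def two_stage_review(spec_review: Review, quality_review: Review) -> tuple[str, list[str]]:
    """Run the quality review only after the spec review comes back clean."""
    spec_issues = spec_review()
    if spec_issues:
        # Fix these first; the quality review is not requested at all.
        return ("spec", spec_issues)
    quality_issues = quality_review()
    if quality_issues:
        return ("quality", quality_issues)
    return ("passed", [])
```

Skipping the quality pass while spec issues remain avoids polishing code that may be rewritten once the behavior is corrected.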

3.7 Final Verification: Evidence Before Claims

After all tasks are complete, the Main Agent loads verification-before-completion and runs the full checks: tests, lint, build, and any key project-specific validation. If verification fails, it loads systematic-debugging, reproduces the failure, gathers evidence, identifies the root cause, and only then decides how to fix it.

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(20,184,166,0.13)
        Note over Main,Tool: Final Verification
        Main->>Skill: Load `verification-before-completion`
        Skill-->>Main: Return completion verification SOP
        Main->>Tool: Run full tests, lint, build, and key checks
        Tool-->>Main: Verification output
        alt Verification fails
            Main->>Skill: Load `systematic-debugging`
            Skill-->>Main: Return debugging SOP
            Main->>Tool: Reproduce failure and collect evidence
            Tool-->>Main: Evidence and logs
            Main->>Impl: Return to fix loop
        else Verification passes
            Main->>Main: Completion evidence is ready
        end
    end

This step targets one of the most dangerous agent habits: saying “done” without current verification. The workflow puts evidence before the conclusion. The agent must run the command, read the output, confirm the result, and only then claim completion.
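
"Evidence before the conclusion" can be sketched as a report that is derived only from commands that were actually run. In this Python example, `Evidence`, `run_check`, and `completion_report` are hypothetical names, and the single check is a stand-in for a project's real test, lint, and build commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str


def run_check(command: list[str]) -> Evidence:
    """Run one check now and keep its real output as evidence."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def completion_report(evidence: list[Evidence]) -> str:
    """Claim completion only when checks were run and all exited with 0."""
    if not evidence:
        return "NOT VERIFIED: no checks were run"
    failed = [e for e in evidence if e.exit_code != 0]
    if failed:
        return "FAILED: " + ", ".join(" ".join(e.command) for e in failed)
    return f"DONE: {len(evidence)} check(s) passed"


if __name__ == "__main__":
    # Stand-in for a project's real test, lint, and build commands.
    checks = [[sys.executable, "-c", "print('tests ok')"]]
    print(completion_report([run_check(c) for c in checks]))
```

An empty evidence list is treated as unverified rather than as success, which is the habit this step is meant to enforce.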

It is useful to separate this from preflight checks. Both may look like “run tests”, but they answer different questions. Preflight happens before work starts and asks whether the environment is trustworthy. Final verification happens before delivery and asks whether the result is trustworthy.

| Gate | Timing | Purpose | Examples |
|------|--------|---------|----------|
| preflight checks | Before implementation | Confirm that the workspace, dependencies, and baseline project state are usable | test, lint, build, dependency check, startup check |
| final verification | Before delivery | Prove that the change delivers the intended result and breaks nothing else | full tests, key regressions, build, requirements checklist |

3.8 Branch Finishing: Close the Development State

After verification passes, the Main Agent loads finishing-a-development-branch, presents clear options such as merge, PR, keep, or discard, and reports the verification evidence:

mermaid
sequenceDiagram
    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(99,102,241,0.13)
        Note over Main,User: Branch Finishing
        Main->>Skill: Load `finishing-a-development-branch`
        Skill-->>Main: Return branch finishing SOP
        Main->>User: Offer merge, PR, keep, or discard
        User-->>Main: Choose finishing action
        Main->>Tool: Execute chosen git action
        Tool-->>Main: Branch finishing done
        Main-->>User: Report result with verification evidence
    end

This may look like a small cleanup step, but it matters. Code being written is not the same as the task being finished. Whether to merge, open a PR, keep a worktree, or discard an experiment is a separate decision. Without it, the agent easily leaves behind a repository state that feels finished in the conversation but remains unclear in Git.
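
The four finishing options can be sketched as an explicit choice mapped to git actions. This is a hypothetical Python illustration: `Finish` and `finishing_commands` are invented names, the default base branch `main` and remote `origin` are assumptions, the merge commands are assumed to run from the main checkout rather than the worktree, and the PR path assumes the GitHub CLI (`gh`) is installed. The commands are returned, not executed.

```python
from enum import Enum


class Finish(Enum):
    MERGE = "merge"
    PR = "pr"
    KEEP = "keep"
    DISCARD = "discard"


def finishing_commands(
    choice: Finish, branch: str, worktree: str, base: str = "main"
) -> list[list[str]]:
    """Map the user's choice to commands; nothing is executed here."""
    if choice is Finish.MERGE:
        return [["git", "checkout", base], ["git", "merge", branch]]
    if choice is Finish.PR:
        return [
            ["git", "push", "-u", "origin", branch],
            ["gh", "pr", "create", "--fill"],
        ]
    if choice is Finish.KEEP:
        return []  # leave the branch and worktree untouched
    # DISCARD: remove the worktree first, then delete the unmerged branch.
    return [
        ["git", "worktree", "remove", "--force", worktree],
        ["git", "branch", "-D", branch],
    ]
```

Making the choice an explicit value means the repository state after the task always corresponds to a decision the user made, including the decision to do nothing.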

At this point the main shape of Superpowers is visible: it breaks a development task into triggerable engineering checkpoints and pulls the agent into a standard process at each important moment.

4. Making Practice the Default Path

Superpowers is a workflow layer, not a complete agent runtime. It does not solve evals, observability, cost control, long-term memory, or permission management. It does not provide a hard state machine, a task database, or a unified scheduler. Skill execution quality still depends on the underlying model, and supporting multiple runtimes creates maintenance cost.

That limitation makes its value clearer. Superpowers does not replace the runtime. It injects the default path of an experienced engineer into the runtime. It does not cover the full eval story, but it encodes task-level constraints such as TDD, review, verification, and human confirmation very clearly.

The valuable part is turning tacit engineering practice into the agent’s default path: ask first, plan first, isolate the workspace, write tests, accept review, and deliver with evidence. The real superpower is not speed of output. It is not skipping judgment at the moments where judgment matters.

5. Appendix: Full Task Sequence

The full task sequence looks like this:

mermaid
sequenceDiagram
    autonumber

    actor User as User
    participant Runtime as Agent Runtime
    participant Main as Main Agent
    participant Skill as Skill Loader
    participant Impl as Implementer Subagent
    participant Review as Reviewer Subagent
    participant Tool as Tool Executor

    rect rgba(148,163,184,0.14)
        Note over Runtime,Main: Session Bootstrap
        Runtime->>Runtime: Run `session-start` hook
        Runtime->>Tool: Read `using-superpowers`
        Tool-->>Runtime: Return startup instructions
        Runtime-->>Main: Start Main Agent with injected workflow rules
    end

    rect rgba(59,130,246,0.14)
        Note over User,Main: Requirement Clarification
        User->>Main: Submit coding task
        Main->>Skill: Check applicable skills via `using-superpowers`
        Skill-->>Main: Use matching skill before proceeding
        Main->>Skill: Load `brainstorming`
        Skill-->>Main: Return brainstorming SOP
        loop Until direction is clear
            Main->>User: Clarify goals, constraints, edge cases, and success criteria
            User-->>Main: Answer and refine direction
        end
    end

    rect rgba(34,197,94,0.13)
        Note over Main,Tool: Planning
        Main->>Skill: Load `writing-plans`
        Skill-->>Main: Return planning SOP
        Main->>Tool: Write implementation plan with files, tests, and commands
        Tool-->>Main: Plan document created
        Main->>User: Ask for plan approval
        alt Changes requested
            User-->>Main: Request plan changes
            Main->>Tool: Revise implementation plan
            Tool-->>Main: Updated plan
        else Approved
            User-->>Main: Approve plan
        end
    end

    rect rgba(245,158,11,0.14)
        Note over Main,Tool: Isolated Workspace Setup
        Main->>Skill: Load `using-git-worktrees`
        Skill-->>Main: Return worktree SOP
        Main->>Tool: Detect current workspace state
        Tool-->>Main: Current repo, branch, and worktree status
        Main->>Tool: Create or reuse isolated worktree and branch
        Tool-->>Main: Worktree and branch ready
        Main->>Tool: Run preflight checks if available
        Tool-->>Main: Preflight result

        alt Preflight fails
            Main-->>User: Report blocking issue before implementation
        else Preflight passes or no check exists
            Main->>Main: Continue to plan execution
        end
    end

    rect rgba(168,85,247,0.13)
        Note over Main,Impl: Plan Execution
        Main->>Skill: Load `subagent-driven-development`
        Skill-->>Main: Return subagent execution SOP

        loop Each task in plan
            Main->>Impl: Assign one task with acceptance criteria

            Impl->>Skill: Load `test-driven-development`
            Skill-->>Impl: Return TDD SOP

            Impl->>Tool: Write failing test
            Tool-->>Impl: RED: test fails for expected reason

            Impl->>Tool: Implement minimal code
            Tool-->>Impl: Code changed

            Impl->>Tool: Run relevant tests
            Tool-->>Impl: GREEN: tests pass

            Impl->>Tool: Refactor and rerun tests
            Tool-->>Impl: Tests still pass

            Impl-->>Main: Return implementation result
        end
    end

    rect rgba(236,72,153,0.13)
        Note over Main,Review: Two-Stage Review
        Main->>Skill: Load `requesting-code-review`
        Skill-->>Main: Return review request SOP

        Main->>Review: Request spec compliance review
        Review-->>Main: Spec review result

        alt Spec issue found
            Main->>Skill: Load `receiving-code-review`
            Skill-->>Main: Return review handling SOP
            Main->>Impl: Send required spec fixes
            Impl->>Tool: Fix implementation and rerun relevant tests
            Tool-->>Impl: Verification result
            Impl-->>Main: Updated implementation result

        else Spec passes
            Main->>Review: Request code quality review
            Review-->>Main: Quality review result

            alt Quality issue found
                Main->>Skill: Load `receiving-code-review`
                Skill-->>Main: Return review handling SOP
                Main->>Impl: Send required quality fixes
                Impl->>Tool: Fix implementation and rerun relevant tests
                Tool-->>Impl: Verification result
                Impl-->>Main: Updated implementation result

            else Quality review passes
                Main->>Tool: Mark task complete in plan
                Tool-->>Main: Plan document updated
            end
        end
    end

    rect rgba(20,184,166,0.13)
        Note over Main,Tool: Final Verification
        Main->>Skill: Load `verification-before-completion`
        Skill-->>Main: Return completion verification SOP
        Main->>Tool: Run full tests, lint, build, and key checks
        Tool-->>Main: Verification output

        alt Verification fails
            Main->>Skill: Load `systematic-debugging`
            Skill-->>Main: Return debugging SOP
            Main->>Tool: Reproduce failure and collect evidence
            Tool-->>Main: Evidence and logs
            Main->>Impl: Return to fix loop

        else Verification passes
            Main->>Main: Completion evidence is ready
        end
    end

    rect rgba(99,102,241,0.13)
        Note over Main,User: Branch Finishing
        Main->>Skill: Load `finishing-a-development-branch`
        Skill-->>Main: Return branch finishing SOP
        Main->>User: Offer merge, PR, keep, or discard
        User-->>Main: Choose finishing action
        Main->>Tool: Execute chosen git action
        Tool-->>Main: Branch finishing done
        Main-->>User: Report result with verification evidence
    end

6. References

  1. obra/superpowers: An agentic skills framework & software development methodology that works.