
test-driven-development

by obra

Install and use the test-driven-development skill to enforce strict TDD: write a failing test first, verify the failure, implement the minimum code, then refactor safely.

Stars: 121.8k
Favorites: 0
Comments: 0
Added: Mar 29, 2026
Category: Test Automation
Install Command
npx skills add obra/superpowers --skill test-driven-development
Curation Score

This skill scores 78/100, which means it is a solid directory listing candidate: agents get a clear trigger (`before writing implementation code` for features, bug fixes, refactors, and behavior changes), a strongly defined operating rule set, and enough procedural guidance to execute TDD with less guesswork than a generic prompt. Directory users should still expect a document-centered skill rather than a fully tooled package, since there are no support scripts, install instructions, or embedded automation assets.

78/100
Strengths
  • Highly triggerable: frontmatter and `When to Use` make activation conditions explicit, including common cases and exceptions.
  • Operationally clear: the skill defines strict TDD rules (`NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST`) and a red-green-refactor workflow with verification steps.
  • Useful supporting reference: `testing-anti-patterns.md` adds concrete examples and guardrails around mocks and test design, improving execution quality.
Cautions
  • Adoption is lightweight but manual: beyond SKILL.md and one reference file there are no scripts or automation assets, so users are installing a guidance document rather than an executable workflow.
  • The prescription is intentionally rigid (`Always`, `No exceptions`, `Delete it. Start over.`), which may limit fit for teams that use lighter or context-dependent testing practices.
Overview


What the test-driven-development skill actually does

The test-driven-development skill gives an AI agent a strict TDD workflow for feature work, bug fixes, and behavior changes: write a test first, confirm it fails for the right reason, write the minimum production code to pass, then refactor safely. Its core value is not “write tests too” but enforcing sequencing so implementation is driven by executable behavior.
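The enforced cycle can be sketched in miniature. The `slugify` example below is invented for illustration and uses plain Python asserts rather than any particular framework:

```python
# One red-green-refactor slice, smallest possible. Hypothetical example;
# the skill itself prescribes the sequence, not this function.

# Step 1 (red): the test is written before any production code exists.
# Running it at this point fails with a NameError-free AssertionError
# only once a stub exists; before that, there is no slugify at all.
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("hello world") == "hello-world"

# Step 2 (green): the minimum implementation that makes the test pass.
def slugify(text):
    return text.replace(" ", "-")

# Step 3 (refactor): with the test green, internals can change safely.
# New behavior (e.g. lowercasing) waits for its own failing test first.
```

Each subsequent behavior gets the same treatment: a new failing test, then the smallest change that turns it green.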

Who this skill is best for

This test-driven-development skill fits developers using AI for real repository work where correctness matters: app features, service logic, bug fixes, refactors, and regression prevention. It is especially useful when you want the model to stop jumping straight into implementation and instead produce smaller, verifiable steps.

The real job-to-be-done

Most users install test-driven-development because generic coding prompts often create code first and retrofit tests later. This skill changes that behavior. It helps you get implementation that is anchored to failing tests, making the agent easier to review and less likely to invent unverified behavior.

What makes it different from a generic “write tests” prompt

The differentiator is the “iron law” from the skill: no production code without a failing test first. That is much stricter than ordinary prompting. The skill also emphasizes verifying that the initial failure is the correct failure, not just any red result, which is a practical guardrail many shallow TDD summaries miss.

Important limits before you install

This is a process skill, not a framework-specific testing toolkit. It will not choose your full test architecture for you, and it does not ship helper scripts or rich references beyond SKILL.md and testing-anti-patterns.md. If you need deep Jest, Pytest, JUnit, or Playwright setup guidance, this skill is better as a workflow layer than a complete testing manual.

How to Use test-driven-development skill

Install the test-driven-development skill

Install from the repository with:

npx skills add https://github.com/obra/superpowers --skill test-driven-development

If your environment supports local skill discovery, confirm the skill appears as test-driven-development and is available to the agent before starting feature work.

Read these files first

For this test-driven-development install and usage flow, start with:

  • skills/test-driven-development/SKILL.md
  • skills/test-driven-development/testing-anti-patterns.md

Read SKILL.md first for the workflow and constraints. Read testing-anti-patterns.md next if your task involves mocks, isolation, UI tests, or any temptation to add test-only seams into production code.

Know the minimum input the skill needs

The skill works best when you provide:

  • the feature, bug, or behavior change
  • the relevant files or module boundaries
  • the test framework already used in the repo
  • the desired user-visible or system-visible behavior
  • any constraints on API shape, backward compatibility, or performance

Without that context, the agent can still apply TDD mechanically, but it may choose the wrong test level or create awkward tests that fit the tool more than the codebase.

Turn a rough request into a TDD-ready prompt

Weak prompt:

Add support for password reset.

Stronger prompt:

Use the test-driven-development skill. We need password reset in the existing Node/Express app. Write the first failing integration or service-level test before any production code. Verify the failure is for missing reset behavior, not setup issues. Then implement the minimum code to pass. Keep the current route style, use Jest, and avoid changing unrelated auth flows.

The stronger version gives the agent enough context to choose the right initial test and obey the red-green-refactor cycle.

Use the skill as a stepwise workflow, not one big generation

A practical test-driven-development usage pattern is:

  1. Ask for the first failing test only.
  2. Review whether the failure targets the intended behavior.
  3. Ask for the minimal implementation to make it pass.
  4. Ask for refactoring only after green.
  5. Repeat for the next small behavior slice.

This produces better output than requesting the full feature in one shot, because the skill is built around small validated increments.

Verify the “red” phase correctly

A key detail in this test-driven-development guide is that a failing test is not enough by itself. The failure must prove the test is aimed at the right missing behavior. If the test fails due to import errors, broken fixtures, or unrelated setup, the cycle has not really started yet.

When prompting, explicitly ask the agent to state why the test fails and why that failure is the correct one.

Choose the right first test

The best first test usually targets the smallest externally meaningful behavior change. Good candidates include:

  • a bug reproduction
  • one business rule
  • one endpoint response change
  • one domain method behavior
  • one UI interaction with clear user impact

Bad starting points include giant end-to-end scenarios, broad snapshot coverage, or tests that lock in internal implementation too early.

Apply the anti-pattern guidance when mocks appear

The support file testing-anti-patterns.md matters if the agent starts overusing mocks. The skill strongly warns against testing mock behavior instead of real behavior. That warning is especially relevant in test automation, where AI agents often write assertions against mock placeholders because they are easier to satisfy than real outputs.

If a test asserts that a mock rendered, a mock was called in a trivial way, or a test-only method had to be added to production code, stop and re-scope the test.
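The contrast is easy to see side by side. In this invented example, the first test passes regardless of whether the real function works, because it only asserts against the mock; the second exercises real behavior:

```python
from unittest import mock

# Anti-pattern: the assertion only checks that a mock was called, so the
# test stays green even if format_greeting is broken or missing entirely.
def test_greeting_mock_only():
    fake = mock.Mock(return_value="anything")
    fake("alice")
    fake.assert_called_once_with("alice")  # proves nothing about output

# Better: call the real function and assert on observable behavior.
def format_greeting(name):
    return f"Hello, {name}!"

def test_greeting_real_behavior():
    assert format_greeting("alice") == "Hello, alice!"
```

If every assertion in a test would still pass with the production code deleted, the test is checking the mock, not the behavior.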

Ask the agent to preserve the iron law

If the model already drafted implementation, the skill’s own guidance is strict: delete the production code and restart from a failing test. In practice, you do not need to be theatrical, but you should instruct the agent to ignore prior speculative implementation and regenerate from the test-first sequence.

Useful wording:

Do not continue from implementation-first code. Restart with a failing test and derive the implementation from that test.

Fit the skill to your repository’s test stack

The skill is process-centric, so you should anchor it to your stack:

  • pytest for Python services
  • Jest or Vitest for JS/TS logic
  • RSpec for Ruby
  • JUnit for Java
  • Playwright or similar only when the behavior truly belongs at browser level

If your repo already has a strong test pyramid, tell the agent where this change belongs. Otherwise, the model may default to the most visible test style rather than the cheapest useful one.

Example prompt for realistic repository work

A solid test-driven-development skill prompt looks like this:

Use the test-driven-development skill for a bug fix. In billing/invoice_service.py, invoices with zero-amount adjustments should remain payable if tax is still due. Start by writing the smallest failing pytest that reproduces the current bug. Confirm the failure is caused by the missing business rule, not fixture issues. Then implement the minimum fix, run or describe the expected green result, and suggest any safe refactor only after the test passes.

This prompt gives behavior, location, framework, and review criteria.

test-driven-development skill FAQ

Is test-driven-development worth installing if I already know TDD?

Yes, if your main issue is getting AI agents to actually follow TDD instead of merely talking about it. The test-driven-development skill is useful less as education and more as behavioral constraint for the model.

Is this beginner-friendly?

Mostly yes. The workflow is simple and explicit. The harder part for beginners is choosing the right first test and test level. If you are new to testing, use this skill on small bug fixes first rather than broad new features.

When is test-driven-development a poor fit?

It is a weaker fit for throwaway prototypes, generated code, or pure configuration edits unless correctness risk is high and your human reviewer still wants test-first discipline. The source guidance explicitly treats these as exceptions to discuss with a human partner.

How is this different from ordinary prompting?

Ordinary prompts often say “implement X and add tests.” This skill changes the order of work and treats that order as non-negotiable. That sequencing is the real value because it reduces hallucinated implementation and improves reviewability.

Does this skill cover framework setup too?

Not deeply. Installing the skill is straightforward, but the skill itself does not provide extensive framework-specific setup docs. It assumes you can point the agent at your existing test stack or repository conventions.

Can I use test-driven-development for refactoring?

Yes. It is a good fit for refactoring when behavior must stay stable. The practical pattern is to first lock current behavior with tests, then refactor with green tests protecting you.

Is this good for UI and end-to-end testing?

Sometimes, but use care. For UI work, the anti-pattern file is especially relevant because AI can drift into asserting mock presence or implementation artifacts. Start with the smallest real user behavior you can verify.

How to Improve test-driven-development skill

Give behavior, not solution ideas

To get better test-driven-development usage, describe the expected behavior and constraints rather than dictating implementation. TDD works best when the test defines outcomes and the code emerges from those checks.

Better input:
Users should see an error when uploading files over 10MB.

Worse input:
Add a fileSizeValidator class and call it from the controller.

The first leaves space for a cleaner minimal implementation.
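A behavior-first description translates directly into a test that specifies the outcome without dictating structure. The names below are illustrative, not from the skill; note that no `fileSizeValidator` class is presupposed:

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10MB limit from the stated behavior

# Hypothetical minimal implementation that a behavior-first test allows
# to emerge; it could later grow into a class if a test demands it.
def validate_upload(size_bytes):
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds 10MB upload limit")

def test_rejects_files_over_10mb():
    try:
        validate_upload(MAX_UPLOAD_BYTES + 1)
    except ValueError:
        return
    raise AssertionError("expected a ValueError for oversized upload")

def test_accepts_files_at_the_limit():
    validate_upload(MAX_UPLOAD_BYTES)  # no error expected at exactly 10MB
```

The tests pin the user-visible rule (error over 10MB, no error at 10MB) and leave the implementation shape open.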

Specify the test level you want

Many weak results come from mismatched test scope. Tell the agent whether you want:

  • unit-level business logic
  • integration around a service or API
  • browser-level behavior

This one choice often matters more than any other prompt detail.

Force smaller increments

A common failure mode is asking for too much in one cycle. If the model writes a broad test suite and a large implementation together, narrow it:

Pick one failing test that captures the first slice of behavior. Do not implement the whole feature yet.

That keeps the test-driven-development loop intact.

Require explanation of why the first test is correct

Ask the agent to justify:

  • why this test is the smallest useful slice
  • what exact failure is expected
  • why that failure proves the behavior is missing

This improves quality because it surfaces hidden assumptions before implementation starts.

Watch for anti-patterns early

The most common quality drops are:

  • testing mocks instead of behavior
  • introducing test-only methods into production code
  • writing passing tests first and calling it TDD
  • choosing assertions tied to implementation details
  • skipping the refactor step once green

If you see one, stop the cycle and ask for a corrected first test rather than patching around it.

Provide repository conventions explicitly

The skill gets better when you tell it:

  • naming conventions for tests
  • where tests live
  • fixture patterns
  • mocking policy
  • preferred assertion style

Because the repository only includes light support material, these local conventions materially improve output quality.

Iterate after the first output

After the initial result, do not just ask for “more.” Ask targeted follow-ups:

  • Can you make the failing test narrower?
  • Is this failure due to setup or missing behavior?
  • Can we remove this mock and test real behavior instead?
  • What is the minimum code needed to pass?
  • What refactor is now safe with tests green?

This is the highest-leverage way to improve the test-driven-development skill in practice: keep the agent inside the cycle instead of letting it jump ahead.

Pair it with human judgment on exceptions

The skill is intentionally strict. That is a strength, but it can be over-applied. If the task is a pure config change, generated code refresh, or disposable prototype, ask whether full TDD is worth the cost. Better results come from using the skill where test-first sequencing changes decision quality, not merely where it can be applied.

Ratings & Reviews

No ratings yet