tdd is a Test Driven Development skill that teaches strict red-green-refactor, behavior-focused tests through public interfaces, integration-style testing, and mocking only at system boundaries.
This skill scores 78/100, which makes it a solid skills-directory candidate: users can reasonably expect an agent to recognize when to invoke it and to get better TDD-specific guidance than from a generic prompt, though they should not expect a fully scripted end-to-end workflow.
- Strong triggerability from the frontmatter: it clearly names TDD, red-green-refactor, integration tests, and test-first usage cues.
- Good operational guidance on what to test and what to avoid, with concrete examples in tests.md and mocking.md.
- Useful supporting concepts for better execution, including interface design, deep modules, mocking boundaries, and post-cycle refactoring cues.
- No quick-start or install/execution instructions in SKILL.md, so agents must infer how to apply it in-session.
- Mostly principle-and-example guidance; it lacks a stepwise worked TDD session for a real feature or bug fix.
Overview of tdd skill
The tdd skill is a focused guide for doing test-first development with a strict red-green-refactor loop, not a generic “write some tests” prompt. It is best for developers who want AI help building features or fixing bugs while keeping tests tied to behavior through public interfaces instead of implementation details.
What tdd is for
Use tdd for Test Driven Development when you want the model to:
- plan a feature as small vertical slices
- write one failing test at a time
- implement only enough code to pass
- refactor safely after green
- avoid brittle tests that break during harmless refactors
Who gets the most value from tdd
The tdd skill fits best if you already have:
- a codebase with runnable tests
- a clear public API, endpoint, command, or user workflow to target
- permission to change both tests and production code
- a desire for integration-style tests over heavy mocking
It is especially useful for backend services, libraries, domain logic, and application flows where behavior can be exercised through a real interface.
What makes this tdd skill different
The key differentiator is its opinionated stance on test quality:
- test behavior, not internals
- prefer integration-style tests
- mock only at system boundaries
- avoid “horizontal slicing” where you write all tests first and all code later
- use TDD to shape better interfaces, not just to add coverage
That makes tdd usage more disciplined than an ordinary prompt asking an AI to “write tests and implementation.”
What can block adoption
This skill is a weaker fit if your environment cannot run tests, your codebase has no stable public seams, or your team mainly wants snapshot-heavy or mock-heavy unit tests. It also assumes you are willing to iterate in small steps instead of generating a full feature in one shot.
How to Use tdd skill
Install tdd in your skills environment
If you are using the Skills system from the repository, a common install pattern is:
`npx skills add mattpocock/skills --skill tdd`
Then invoke tdd when asking the model to implement or repair behavior using test-first workflow.
Read these files first before heavy tdd usage
The fastest path to understanding this skill is:
SKILL.md → tests.md → mocking.md → interface-design.md → refactoring.md → deep-modules.md
That order matters. SKILL.md gives the operating philosophy, while the supporting files explain what “good tests” and “good design” mean in practice.
Know the core workflow tdd expects
The skill is built around a tight loop:
- choose one small behavior
- write one failing test through a public interface
- implement the minimum code to pass
- refactor while keeping tests green
- repeat with the next smallest slice
If you ask for a whole feature at once, you lose most of the value of tdd usage.
Start with a behavior, not an implementation idea
Strong input:
- “Add checkout support for expired cards. Public entrypoint is `checkout(cart, paymentMethod)`. Existing test file is `checkout.test.ts`. Keep using integration-style tests.”
Weak input:
- “Create classes for payment orchestration and add unit tests for each method.”
The first prompt gives the skill a behavioral target. The second nudges it toward internal design speculation and brittle tests.
Give the skill the public interface and test command
For good tdd execution outcomes, include:
- the function, route, CLI command, or UI action under test
- where tests live
- the test runner and command
- relevant constraints such as DB, HTTP, or external services
- what can and cannot be mocked
A practical prompt template:
Use the tdd skill.
Goal: Add [behavior].
Public interface: [function/route/command].
Test location: [path].
Run tests with: [command].
Boundaries to mock: [external API, clock, filesystem].
Do not mock: [internal modules/classes].
Work in red-green-refactor steps and explain each step briefly.
Use vertical slices, not horizontal slices
One of the biggest practical takeaways from this repository is to avoid bulk-writing all tests up front. Good tdd usage here means:
- pick one real scenario
- get it passing
- let that result inform the next scenario
This reduces imagined abstractions and usually produces better API shape.
Prefer integration-style tests in tdd
The repository strongly favors tests that exercise real code paths through public APIs. In practice, that means:
- call exported functions instead of private helpers
- hit route handlers through their supported interface
- verify observable outcomes, not internal call order
- name tests by capability, such as “user can checkout with valid cart”
If a refactor changes internals but behavior stays the same, a good test should usually stay green.
Mock only at system boundaries
The tdd skill is not anti-mock; it is anti-mocking your own implementation. Mock:
- payment gateways
- email providers
- time/randomness
- sometimes databases or filesystem, depending on test setup
Do not mock:
- your own modules
- internal collaborators
- private methods
- thin wrappers you control
This guideline alone will change output quality more than most prompt tweaks.
Shape code for testability before writing too much code
The support files make an important point: better interfaces make TDD easier. Ask the model to favor code that:
- accepts dependencies instead of creating them internally
- returns results instead of mutating hidden state
- keeps the public surface area small
If your current design fights testing, tell the model to first propose a smaller, more testable public interface.
A strong tdd prompt example
Use the tdd skill to add password reset token expiry.
Context:
- Node + TypeScript
- Public API: `requestPasswordReset(email)` and `resetPassword(token, newPassword)`
- Tests: `src/auth/password-reset.test.ts`
- Run with: `pnpm test password-reset`
- Mock only email sending and time
- Do not mock repository code or internal services
Please:
1. choose the smallest failing behavior first
2. write integration-style tests through public APIs
3. implement minimum code to pass
4. refactor after green
5. avoid asserting internal call counts unless at an external boundary
This works because it gives the skill a target, a boundary policy, and a real execution path.
Watch for the main output quality signals
A good tdd guide outcome from this skill usually includes:
- tests named around user-visible behavior
- one small scenario at a time
- minimal implementation per step
- refactoring notes after green
- little or no coupling to private structure
A poor outcome usually includes:
- lots of mocks for internal code
- assertions about call order
- giant first-step test suites
- speculative abstractions before any behavior passes
tdd skill FAQ
Is tdd only for brand-new features?
No. The tdd skill also fits bug fixes well. In many mature codebases, the best first step is a failing regression test that reproduces the bug through the public interface, then a minimal fix, then cleanup.
Is this tdd skill beginner-friendly?
Yes, if you already understand how to run your project’s tests. The guidance is opinionated but simple: test behavior, keep slices small, and avoid implementation-detail assertions. Absolute beginners may still need help understanding their project’s architecture and test tooling.
How is tdd different from asking an AI to write tests?
Ordinary prompts often produce coverage-oriented or mock-heavy tests. The tdd skill pushes the model toward:
- behavioral specifications
- safer refactoring
- cleaner interfaces
- smaller iterative steps
That changes both the tests and the production design.
When should I not use tdd?
Skip or limit tdd when:
- the behavior cannot be exercised meaningfully yet
- the environment is too hard to run during iteration
- you are doing exploratory throwaway spikes
- the task is mostly mechanical, like renaming or dependency bumps
You can still return to TDD once the public seam is clearer.
Does tdd require integration tests only?
Not strictly, but its bias is toward integration-style tests through real interfaces. The goal is not maximum test size; it is testing behavior at a stable seam. Small focused tests are still fine if they stay at the public interface and avoid internal coupling.
What languages or frameworks fit this skill?
The ideas are broadly language-agnostic. The examples skew toward TypeScript and JavaScript, but the design principles apply to Python, Java, Go, Ruby, and similar ecosystems wherever you can define clear public interfaces and test boundaries.
How to Improve tdd skill
Give tdd a smaller first slice
The easiest way to improve tdd results is to shrink the first step. Instead of “build user invitations,” start with “user with valid email can request an invitation.” Smaller slices reduce hallucinated architecture and produce cleaner tests.
Provide explicit boundary rules
Many bad outputs come from unclear mocking policy. Tell the model exactly:
- which external systems can be mocked
- which internal modules must stay real
- whether a test DB is available
- whether time should be injected or frozen
This helps the skill stay aligned with the repository’s philosophy.
Ask for public-interface test names
If you want better tests, request names that describe outcomes:
- good: “user can checkout with valid cart”
- weaker: “checkout calls payment service”
That single instruction often prevents implementation-detail drift.
Force red-green-refactor to be visible
If the first answer comes back as a full implementation dump, ask the model to restructure it:
- show the first failing test
- show minimal code to pass
- explain the refactor separately
- stop after one slice if needed
The tdd skill works best when the loop is visible, not implied.
Improve design when tests feel awkward
If the model struggles to write clean tests, the problem is often interface design, not test syntax. Ask it to revise toward:
- dependency injection
- explicit inputs and outputs
- smaller public surfaces
- fewer side effects
This is where interface-design.md and deep-modules.md become especially valuable.
Use refactoring as a separate quality pass
After a few green slices, explicitly ask for a refactor review using the repository’s cues:
- duplication
- long methods
- shallow modules
- feature envy
- primitive obsession
This keeps tdd usage from stopping at “tests pass” and improves maintainability without changing behavior.
Correct common failure modes early
If output quality drops, these are the usual causes:
- too much code generated before the first failing test
- internal collaborators mocked heavily
- tests verify implementation details
- the feature request is too large for one cycle
- the public interface is unclear or changing every step
When that happens, reset with one behavior, one seam, one failing test.
Use repository files as decision aids
For better results, map your problem to the support docs:
- tests.md when test style is weak
- mocking.md when boundary decisions are unclear
- interface-design.md when seams are awkward
- refactoring.md after green
- deep-modules.md when API shape is getting too broad
That reading path gives more value than skimming SKILL.md alone.
Iterate by tightening constraints, not repeating the same prompt
If the first output is mediocre, do not just say “try again.” Improve the next round with concrete constraints:
- target one behavior only
- preserve current public API
- do not mock internal modules
- prefer a test DB over repository mocks
- stop after first red-green-refactor cycle
That kind of iteration consistently improves tdd guide quality more than asking for more detail.
