Skill Testing

Browse Skill Testing agent skills in Skill Building and compare related workflows, tools, and use cases.

32 skills
verification-loop

by affaan-m

verification-loop is a Claude Code verification workflow for checking builds, types, lint, tests, security, and diffs after code changes. Use it before PRs and after refactors when you want a structured post-change guide instead of a generic prompt.

Verification
Favorites 0 | GitHub 156.3k

rust-testing

by affaan-m

rust-testing is a practical guide for Rust testing patterns, including unit tests, integration tests, async testing, property-based testing, mocks, and coverage. It helps you choose the right test shape and follow a TDD workflow with less guesswork.

Skill Testing
Favorites 0 | GitHub 156.2k

python-testing

by affaan-m

python-testing helps you design, write, and review Python tests with a pytest-first workflow. Use it for TDD, fixtures, mocking, parametrization, coverage checks, and maintaining a reliable test suite in real projects.

Skill Testing
Favorites 0 | GitHub 156.2k

perl-testing

by affaan-m

perl-testing is a practical guide for writing, running, and improving Perl tests with Test2::V0, Test::More, prove, mocking, coverage, and TDD. Use it for install guidance, usage patterns, migration help, and faster debugging of failing suites.

Skill Testing
Favorites 0 | GitHub 156.2k

kotlin-testing

by affaan-m

kotlin-testing is a practical guide for Kotlin test automation with Kotest, MockK, coroutine testing, property-based tests, and Kover coverage. Use it to follow a TDD-friendly workflow, write clearer unit and component tests, and reduce guesswork when mocking dependencies or testing suspending code.

Test Automation
Favorites 0 | GitHub 156.2k

golang-testing

by affaan-m

The golang-testing skill helps you write and improve Go tests with table-driven cases, subtests, benchmarks, fuzzing, and coverage-aware TDD. It is designed for developers working on real Go code who want practical, idiomatic guidance rather than generic testing advice.

Test Automation
Favorites 0 | GitHub 156.2k

eval-harness

by affaan-m

The eval-harness skill is a formal evaluation framework for Claude Code sessions and eval-driven development. It helps you define pass/fail criteria, build capability and regression evals, and measure agent reliability before shipping prompt or workflow changes.

Model Evaluation
Favorites 0 | GitHub 156.1k

csharp-testing

by affaan-m

csharp-testing is a practical guide for C# and .NET test automation, covering xUnit, FluentAssertions, mocking, integration tests, and readable test structure for maintainable coverage.

Test Automation
Favorites 0 | GitHub 156.1k

cpp-testing

by affaan-m

The cpp-testing skill helps you write, run, and debug C++ tests with GoogleTest, GoogleMock, CMake, and CTest. Use it for coverage, flaky-test fixes, and sanitizer-backed diagnostics in modern C++ projects.

Test Automation
Favorites 0 | GitHub 156.1k

context-budget

by affaan-m

The context-budget skill audits Claude Code context use across agents, skills, rules, and MCP servers. It helps identify bloat, duplicate content, and high-cost components, then returns prioritized cleanup actions. It is most useful in larger setups where context cost adds up.

Skill Testing
Favorites 0 | GitHub 156.1k

writing-skills

by obra

writing-skills is a Skill Authoring guide for creating, editing, and validating agent skills with a test-driven workflow. Learn the key files, prerequisites, and practical steps for pressure scenarios, baseline tests, and concise SKILL.md iteration.

Skill Authoring
Favorites 0 | GitHub 121.9k

skill-creator

by anthropics

skill-creator is a Skill Authoring meta-skill for drafting new skills, revising existing SKILL.md files, running evals, comparing variants, and improving trigger descriptions with repository scripts and review tools.

Skill Authoring
Favorites 2 | GitHub 105.1k

llm-evaluation

by wshobson

Use the llm-evaluation skill to design repeatable evaluation plans for LLM apps, prompts, RAG systems, and model changes with metrics, human review, benchmarking, and regression checks.

Model Evaluation
Favorites 0 | GitHub 32.6k

agentic-eval

by github

agentic-eval is a GitHub Copilot skill that shows how to build evaluation loops for AI outputs using reflection, rubric-based critique, and evaluator-optimizer patterns.

Model Evaluation
Favorites 0 | GitHub 27.8k

test-driven-development

by addyosmani

The test-driven-development skill helps you change code by writing a failing test first, then making the smallest fix pass. Use it for logic changes, bug fixes, regressions, and edge cases where proof matters more than a plausible patch.

Skill Testing
Favorites 0 | GitHub 18.8k

context-fundamentals

by muratcankoylan

context-fundamentals is a practical guide to context engineering for AI agent systems. It helps you decide what belongs in the prompt, debug context issues, and manage token budgets with clearer context structure. Use it when you need grounded guidance for agent design and prompt optimization.

Context Engineering
Favorites 0 | GitHub 15.6k

skill-builder

by yusufkaraaslan

skill-builder helps skill authors turn docs, GitHub repos, PDFs, videos, and codebases into AI-ready skills with Skill Seekers. It includes source-type detection, a recommended workflow, and tool-based steps for repeatable skill authoring instead of one-off prompting.

Skill Authoring
Favorites 0 | GitHub 13.5k

test-scenarios

by phuryn

The test-scenarios skill turns user stories into execution-ready test scenarios with objectives, starting conditions, user roles, steps, expected outcomes, and edge cases. Use it for QA test cases, acceptance testing, and feature validation when you need a structured approach to test design.

Acceptance Testing
Favorites 0 | GitHub 11k

testing-handbook-generator

by trailofbits

testing-handbook-generator is a meta-skill for creating Claude Code skills from the Trail of Bits Testing Handbook (appsec.guide). It helps skill authors, security engineers, and maintainers turn handbook sections into reusable skills with a clear workflow, scope control, and repeatable generation. Use it when you need a guide for handbook-to-skill authoring.

Skill Authoring
Favorites 0 | GitHub 5k

property-based-testing

by trailofbits

property-based-testing is a guide for writing, reviewing, and improving property-based tests (PBT) across languages and smart contracts. Use it to spot roundtrip, idempotence, invariant, parser, validator, and normalization cases, choose generators, and decide when property-based testing is stronger than example-based tests.

Skill Testing
Favorites 0 | GitHub 5k

create-skill-test

by dotnet

create-skill-test scaffolds eval.yaml test files for agent skills in dotnet/skills. Use it to create skill tests, define scenarios, fixtures, assertions, and rubrics, and reduce overfitting in evaluation design. It is not for running existing tests, debugging validator errors, or authoring SKILL.md files.

Skill Testing
Favorites 0 | GitHub 3k

skill-optimizer

by mcollina

skill-optimizer helps authors improve AI skills for activation, clarity, and cross-model reliability. Use it for Skill Authoring when a skill is written but not reliably followed, when triggers are weak, regressions appear, or context cost needs trimming. It supports benchmark loops, release gates, and tighter usage fidelity.

Skill Authoring
Favorites 0 | GitHub 1.8k

skill-judge

by softaworks

skill-judge is a review and scoring skill for auditing AI skill packages and SKILL.md files. It helps authors and maintainers judge knowledge delta, activation clarity, workflow quality, and publish readiness with actionable improvement guidance.

Skill Validation
Favorites 0 | GitHub 1.3k

judge

by NeoLabHQ

judge is a two-phase evaluation skill that launches a meta-judge first, then a judge sub-agent to score work with isolated context, evidence, and clear criteria. Use it for report-only reviews of code, writing, analysis, or Skill Authoring when you need a defensible assessment instead of a casual opinion.

Skill Authoring
Favorites 0 | GitHub 982