parallel-debugging
by wshobson
parallel-debugging is a structured debugging skill for bugs with multiple plausible causes. Install it from wshobson/agents and use its competing-hypothesis workflow, evidence templates, and arbitration steps to reach a defensible root cause.
This skill scores 78/100, which means it is a solid directory listing candidate for agents that need structured root-cause analysis instead of ad hoc debugging. Repository evidence shows a real workflow: clear use triggers, a defined hypothesis-generation framework, and supporting reference templates for evidence collection and arbitration, though users should expect to translate the method into their own agent/task setup.
- Strong triggerability: the description and "When to Use" section clearly target multi-cause bugs, failed initial debugging, and cross-component issues.
- Operationally useful structure: SKILL.md defines six failure-mode categories and the reference file provides concrete investigation and evidence report templates.
- Good agent leverage over generic prompting: the ACH-style parallel hypothesis workflow helps reduce confirmation bias and organize competing investigations.
- No install or execution scaffolding in the skill itself; there are no scripts, rules, or quick-start commands to show how to run the parallel workflow in practice.
- Workflow is methodology-heavy but repo-light: only one reference file is included, so adoption depends on the agent/user being able to operationalize the templates independently.
Overview of parallel-debugging skill
What parallel-debugging does
The parallel-debugging skill is a structured debugging workflow for cases where one bug has several plausible causes and a normal linear investigation keeps stalling. Instead of chasing a single theory, it uses competing hypotheses, parallel investigation, evidence collection, and explicit arbitration to decide the most likely root cause.
Who should install this skill
This parallel-debugging skill fits developers, AI agents, and debugging-heavy teams working on messy failures that cross files, services, or layers. It is especially useful when symptoms are real but the cause is unclear, when prior debugging attempts were inconclusive, or when confirmation bias is likely.
Best job-to-be-done
Use parallel-debugging when you need to answer: “What is the most defensible root cause, based on evidence?” The real value is not just brainstorming causes. It is turning a vague bug report into falsifiable hypotheses, scoped investigations, file-level evidence, and a reasoned verdict.
What makes it different from a generic debugging prompt
Most ordinary prompts ask the model to “find the bug,” which often leads to one plausible guess. parallel-debugging is stronger when multiple causes could explain the same symptom. The skill pushes the investigation into failure-mode categories, requires confirming and falsifying evidence, and uses an arbitration step rather than treating the first decent explanation as truth.
Core method surfaced by the repository
The repository centers the Analysis of Competing Hypotheses approach and organizes debugging around six failure categories: logic error, data issue, state problem, integration failure, resource issue, and environment. That categorization is practical because it widens search coverage without becoming unbounded.
When this skill is a poor fit
Skip parallel-debugging for simple, local bugs where the failing line is already obvious, for routine syntax errors, or when you only need a quick patch suggestion. The method adds overhead, so it pays off mainly when uncertainty is the problem.
How to Use parallel-debugging skill
parallel-debugging install context
Install from the wshobson/agents repository:
npx skills add https://github.com/wshobson/agents --skill parallel-debugging
If your environment uses a different skill loader, the important part is the source path: plugins/agent-teams/skills/parallel-debugging.
Read these files first before first use
Start with:
- SKILL.md
- references/hypothesis-testing.md
SKILL.md gives the workflow and failure-mode framing. references/hypothesis-testing.md is the higher-value file for actual execution because it contains investigation and evidence report templates you can reuse directly.
What input the skill needs to work well
To get good results from parallel-debugging, give it more than “X is broken.” The skill works best when you provide:
- observed symptom
- expected behavior
- recent change or deployment context
- affected files, modules, or services
- reproduction steps
- logs, stack traces, or failing tests
- constraints on what the agent may inspect or run
Without those, the model can still generate hypotheses, but the investigation becomes generic and less falsifiable.
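One way to check a bug report against the list above before invoking the skill is a small intake structure. This is a minimal sketch, not part of the repository: the `BugReport` class, its field names, and `missing_fields` are hypothetical and simply mirror the inputs listed above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class BugReport:
    """Hypothetical container for the inputs the skill works best with."""
    symptom: str = ""
    expected_behavior: str = ""
    recent_change: str = ""
    affected_areas: list[str] = field(default_factory=list)
    reproduction_steps: list[str] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)  # logs, stack traces, failing tests
    constraints: str = ""

    def missing_fields(self) -> list[str]:
        """Return the names of inputs that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values only.
report = BugReport(
    symptom="Valid users sometimes get 401 on first login attempt",
    expected_behavior="Valid credentials always succeed",
    affected_areas=["src/middleware/auth.ts"],
)
print(report.missing_fields())
```

Anything the check reports as missing is a candidate to fill in before asking for hypotheses.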
Turn a rough bug report into a strong invocation
Weak input:
- “Login is failing in production. Debug this.”
Stronger input:
- “Investigate intermittent login failures after yesterday’s auth middleware change. Symptom: users with valid credentials sometimes get 401 on first attempt but succeed on retry. Check src/middleware/auth.ts, session cache behavior, recent commits from the last 3 days, and tests under tests/auth/. Generate competing hypotheses, collect confirming and falsifying evidence, and rank the most likely root cause.”
The second version gives symptom shape, time window, likely surfaces, and evidence boundaries.
Use the skill as a staged workflow
A practical parallel-debugging workflow looks like this:
- State the symptom and scope.
- Ask for 3–5 competing hypotheses across different failure categories.
- For each hypothesis, define confirming and falsifying evidence.
- Investigate in parallel or simulate parallel branches in one response.
- Compare evidence quality, not just plausibility.
- End with a ranked verdict, confidence level, and next action.
This is the main adoption benefit: it prevents premature convergence.
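If you script the staged workflow yourself, the parallel branch step might look like the sketch below. Everything here is an assumption for illustration: the `investigate` function is a placeholder with hard-coded counts, the hypothesis names are invented, and ranking by net evidence count is a crude stand-in for judging evidence quality.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    hypothesis: str
    confirming: int  # number of confirming evidence items found
    falsifying: int  # number of falsifying evidence items found


def investigate(hypothesis: str) -> Finding:
    """Placeholder investigator. A real one would read code, logs, or tests."""
    # Hard-coded illustrative counts, not real results.
    canned = {
        "stale session cache": (3, 0),
        "clock skew in token validation": (1, 2),
        "race in middleware init": (2, 1),
    }
    confirming, falsifying = canned[hypothesis]
    return Finding(hypothesis, confirming, falsifying)


def run_parallel(hypotheses: list[str]) -> list[Finding]:
    # Each hypothesis gets its own branch so no single theory dominates early.
    with ThreadPoolExecutor(max_workers=len(hypotheses)) as pool:
        findings = list(pool.map(investigate, hypotheses))
    # Falsifying evidence counts against a hypothesis when ranking.
    return sorted(findings, key=lambda f: f.confirming - f.falsifying, reverse=True)


ranked = run_parallel([
    "stale session cache",
    "clock skew in token validation",
    "race in middleware init",
])
for finding in ranked:
    print(finding.hypothesis, finding.confirming - finding.falsifying)
```

The point of the structure is that every branch reports both kinds of evidence before any ranking happens.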
Ask for file:line evidence, not summaries
The reference template explicitly expects file citations and causal chains. In practice, require:
- file:line evidence
- contradiction evidence
- confidence level
- recommended fix only after verdict
That ordering matters. If you ask for fixes too early, the model often optimizes for patch-writing before root-cause certainty.
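To enforce the file:line requirement mechanically, you could scan each evidence statement for a citation. The regular expression below is an assumed convention (a path with a file extension, a colon, then a line number), not a format defined by the repository, and it will miss other citation styles.

```python
import re

# Assumed citation shape: path/with.extension:LINE, e.g. src/middleware/auth.ts:42
CITATION = re.compile(r"\b[\w./-]+\.\w+:\d+\b")


def cited_locations(evidence: str) -> list[str]:
    """Return every file:line citation found in an evidence statement."""
    return CITATION.findall(evidence)


def is_cited(evidence: str) -> bool:
    """True when the statement points at one or more specific locations."""
    return bool(cited_locations(evidence))


# Illustrative statements, not real findings.
print(cited_locations("Token is read before refresh at src/middleware/auth.ts:42"))
print(is_cited("The cache looks stale somewhere in auth."))
```

Statements that fail the check are summaries, and are worth sending back for a concrete location.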
Use the six failure modes to widen search intelligently
If the first hypothesis list is narrow, ask the model to cover all repository-defined categories:
- Logic Error
- Data Issue
- State Problem
- Integration Failure
- Resource Issue
- Environment
This is one of the strongest parts of the parallel-debugging skill: it gives a disciplined way to explore alternatives without random speculation.
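A quick coverage check shows which of the six categories a hypothesis list has not touched yet. This is a sketch under assumptions: the `FailureMode` enum and `uncovered_modes` helper are hypothetical names, and the example hypotheses are invented.

```python
from enum import Enum


class FailureMode(Enum):
    LOGIC_ERROR = "Logic Error"
    DATA_ISSUE = "Data Issue"
    STATE_PROBLEM = "State Problem"
    INTEGRATION_FAILURE = "Integration Failure"
    RESOURCE_ISSUE = "Resource Issue"
    ENVIRONMENT = "Environment"


def uncovered_modes(hypotheses: dict[str, FailureMode]) -> list[str]:
    """List the categories that no hypothesis addresses yet."""
    covered = set(hypotheses.values())
    return [mode.value for mode in FailureMode if mode not in covered]


# Illustrative hypotheses mapped to their category.
hypotheses = {
    "stale session cache": FailureMode.STATE_PROBLEM,
    "token expiry off by one": FailureMode.LOGIC_ERROR,
    "session cache race on first request": FailureMode.STATE_PROBLEM,
}
print(uncovered_modes(hypotheses))
```

The uncovered categories can go straight back into a follow-up prompt asking for one hypothesis in each.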
Suggested prompt pattern for real investigations
Use a prompt shape like:
Use the parallel-debugging skill.
Issue:
{symptom, expected behavior, reproduction}
Scope:
{files, modules, tests, logs, recent commits}
Generate 4 competing hypotheses across different failure modes.
For each hypothesis, provide:
- falsifiable statement
- confirming evidence to seek
- falsifying evidence to seek
- likely files/tests to inspect
Then produce an evidence-based arbitration:
- confirmed, falsified, or inconclusive
- confidence
- causal chain
- recommended next step
This mirrors the repository’s templates closely enough to improve output quality without copying the skill text verbatim.
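If you send this prompt often, filling it from code keeps the structure fixed while the issue and scope change. The sketch below uses Python's standard `string.Template`; the `build_prompt` function and the `$issue`, `$scope`, and `$count` placeholder names are assumptions introduced for illustration.

```python
from string import Template

# Text follows the prompt shape above; the $-placeholders are hypothetical names.
PROMPT = Template("""\
Use the parallel-debugging skill.
Issue:
$issue
Scope:
$scope
Generate $count competing hypotheses across different failure modes.
For each hypothesis, provide:
- falsifiable statement
- confirming evidence to seek
- falsifying evidence to seek
- likely files/tests to inspect
Then produce an evidence-based arbitration:
- confirmed, falsified, or inconclusive
- confidence
- causal chain
- recommended next step
""")


def build_prompt(issue: str, scope: list[str], count: int = 4) -> str:
    """Fill the template, refusing empty issue or scope."""
    if not issue.strip():
        raise ValueError("issue must describe the symptom")
    if not scope:
        raise ValueError("scope must name at least one file, test, or log")
    return PROMPT.substitute(issue=issue, scope=", ".join(scope), count=count)


prompt = build_prompt(
    issue="Valid users intermittently get 401 on first login; retry succeeds.",
    scope=["src/middleware/auth.ts", "tests/auth/"],
)
print(prompt)
```

Rejecting an empty scope up front is a design choice here: it blocks the generic, unfalsifiable runs that come from under-specified input.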
Best workflow for multi-module bugs
For bugs spanning frontend, backend, queueing, and infrastructure boundaries, use parallel-debugging to assign one hypothesis per layer rather than one per file. Example:
- frontend state regression
- API contract mismatch
- cache invalidation problem
- environment/config drift
That framing usually produces better investigations than splitting by random code areas.
Practical constraints to expect
This skill improves reasoning structure, not tool access. If the agent cannot read logs, run tests, inspect git history, or open the relevant code, the outputs may still be thoughtful but lower-confidence. It is also not a replacement for reproducing nondeterministic issues when runtime evidence is essential.
Repository-reading path if you want to customize it
If you plan to adapt the skill for team use:
- Read SKILL.md for the top-level workflow.
- Read references/hypothesis-testing.md for reusable templates.
- Extract the evidence report structure into your own bug triage prompts or internal docs.
This repo is light on helper scripts, so most of the value is in the method and prompt scaffolding.
parallel-debugging skill FAQ
Is parallel-debugging better than a normal debugging prompt?
For straightforward bugs, not necessarily. For ambiguous bugs with multiple plausible causes, yes. The parallel-debugging skill is better when the main risk is latching onto the wrong explanation too early.
Is this skill beginner-friendly?
Yes, if the beginner can describe the symptom clearly and share relevant context. The structure helps less-experienced debuggers ask better questions. But beginners still need enough system awareness to recognize which files, tests, or logs matter.
Do I need multiple agents to use parallel-debugging?
No. The repository frames the method around parallel agent investigation, but you can still use it effectively with one model by asking it to maintain separate hypothesis tracks and then arbitrate between them.
When should I not use parallel-debugging?
Avoid it for trivial defects, obvious stack-trace fixes, pure syntax issues, or situations where execution access is more important than reasoning structure. In those cases, the method can be slower than direct debugging.
What makes evidence good in parallel-debugging?
Good evidence is specific, falsifiable, and cited. The best outputs point to exact files, tests, or logs; explain why that evidence supports or contradicts a hypothesis; and connect cause to symptom in a clear chain.
Does the skill help with root cause analysis after incidents?
Yes. parallel-debugging is a strong fit for post-incident analysis because it separates plausible narratives from evidence-backed conclusions and makes confidence explicit.
How to Improve parallel-debugging skill
Give better starting evidence
The fastest way to improve parallel-debugging results is to supply concrete artifacts:
- stack traces
- failing test names
- suspect commit range
- environment differences
- exact error messages
- timings or intermittency patterns
These sharply reduce generic hypothesis generation.
Force hypotheses to compete, not overlap
A common failure mode is producing four versions of the same idea. Ask for hypotheses that differ by failure mode or system layer. That creates real competition and makes arbitration meaningful.
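One rough way to spot overlap is to tag each hypothesis with a failure mode and a system layer, then look for tags that repeat. The sketch below assumes you assign those tags yourself; the `overlapping` helper and the example entries are hypothetical.

```python
from collections import defaultdict


def overlapping(
    hypotheses: list[tuple[str, str, str]],
) -> dict[tuple[str, str], list[str]]:
    """Group (statement, failure_mode, layer) entries and keep groups with duplicates."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for statement, mode, layer in hypotheses:
        groups[(mode, layer)].append(statement)
    return {key: items for key, items in groups.items() if len(items) > 1}


# Illustrative hypothesis list; tags are assigned by hand.
candidates = [
    ("session cache returns expired entry", "State Problem", "backend"),
    ("session cache not invalidated on logout", "State Problem", "backend"),
    ("API gateway strips auth header", "Integration Failure", "infrastructure"),
    ("token expiry compared in wrong timezone", "Logic Error", "backend"),
]
print(overlapping(candidates))
```

A repeated tag does not prove two hypotheses are identical, but it is a prompt to ask what observation would tell them apart.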
Require falsification, not just confirmation
If you only ask what supports a theory, the model will overfit. The repository reference is valuable because it explicitly asks for falsifying evidence. Keep that requirement in your prompt every time.
Narrow the investigation scope when outputs get vague
If the first run is broad and hand-wavy, rerun with tighter boundaries:
- “Only inspect auth middleware and session caching”
- “Use commits from the last 72 hours”
- “Prioritize evidence from failing integration tests”
Better scope usually beats asking for “more detail.”
Ask for a verdict format with confidence
Insist on:
- Confirmed | Falsified | Inconclusive
- High | Medium | Low confidence
- evidence list
- contradiction list
- causal chain
This makes the parallel-debugging workflow operational instead of exploratory.
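If verdicts are collected programmatically, the format above maps onto a small record with a self-check. This is a sketch, not the repository's schema: the class names, the `problems` method, and the rule that high confidence needs two or more evidence items are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CONFIRMED = "Confirmed"
    FALSIFIED = "Falsified"
    INCONCLUSIVE = "Inconclusive"


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Verdict:
    hypothesis: str
    status: Status
    confidence: Confidence
    evidence: list[str]
    contradictions: list[str]
    causal_chain: list[str]

    def problems(self) -> list[str]:
        """Return reasons this verdict is not yet defensible."""
        issues = []
        if self.status is not Status.INCONCLUSIVE and not self.evidence:
            issues.append("status asserted without evidence")
        if self.status is Status.CONFIRMED and not self.causal_chain:
            issues.append("confirmed without a causal chain")
        # Assumed threshold: one item alone does not justify high confidence.
        if self.confidence is Confidence.HIGH and len(self.evidence) < 2:
            issues.append("high confidence rests on fewer than two evidence items")
        return issues


# Illustrative verdict, not a real finding.
verdict = Verdict(
    hypothesis="stale session cache",
    status=Status.CONFIRMED,
    confidence=Confidence.HIGH,
    evidence=["src/middleware/auth.ts:42 reads cache before refresh"],
    contradictions=[],
    causal_chain=[],
)
print(verdict.problems())
```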
Iterate after the first answer
A strong second-round prompt is:
- “Hypothesis 2 and 4 still look plausible. Compare them directly. What single observation would best distinguish them?”
That question often produces the next best debugging step faster than asking for another full brainstorm.
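The same question can be framed as a lookup: write down what each surviving hypothesis predicts for a few candidate observations and keep the ones where the predictions disagree. The helper name, hypothesis labels, and predictions below are invented for illustration.

```python
def distinguishing_observations(
    predictions: dict[str, dict[str, bool]], a: str, b: str
) -> list[str]:
    """Return observations that hypotheses a and b predict differently."""
    shared = predictions[a].keys() & predictions[b].keys()
    return sorted(obs for obs in shared if predictions[a][obs] != predictions[b][obs])


# True means "this hypothesis predicts the failure still occurs".
predictions = {
    "H2 stale session cache": {
        "fails with cache disabled": False,
        "fails on a single instance": True,
        "fails right after deploy": True,
    },
    "H4 config drift between instances": {
        "fails with cache disabled": True,
        "fails on a single instance": False,
        "fails right after deploy": True,
    },
}
print(distinguishing_observations(
    predictions, "H2 stale session cache", "H4 config drift between instances"
))
```

Observations that both hypotheses predict the same way, such as the post-deploy case here, cannot separate them and are not worth running first.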
Watch for these common failure modes
Typical weak outputs include:
- hypotheses with no testable difference
- no contradiction evidence
- fix suggestions before evidence review
- confidence with no basis
- category labels that do not change investigation strategy
If you see these, tighten the prompt instead of accepting a polished but shallow answer.
Adapt the templates into your team workflow
The best long-term use of the parallel-debugging skill is to standardize how bugs are investigated. Reuse the hypothesis and evidence report formats from references/hypothesis-testing.md in issue templates, incident reviews, or AI debugging playbooks so results are easier to compare across investigations.
