agent-introspection-debugging

by affaan-m

The agent-introspection-debugging skill provides a structured self-debugging workflow for AI agent failures: capture the failure state, diagnose likely causes, apply a contained recovery step, and produce a human-readable introspection report. Use it for looping, retry-heavy, or drift-prone runs, not routine verification.

Stars: 156k
Favorites: 0
Comments: 0
Added: Apr 15, 2026
Category: Debugging
Install Command
npx skills add affaan-m/everything-claude-code --skill agent-introspection-debugging
Curation Score

This skill scores 81/100 because it provides a clearly triggerable self-debugging workflow for agent failures, with enough operational detail to be useful in a directory listing. For directory users, that means it is worth installing if they want a structured recovery path for looping, drifting, or repeatedly failing runs, though they should note the limited supporting files and some bounded scope guidance.

Strengths
  • Clear activation cues for repeated failures, loop limits, token burn, drift, and recoverable tool issues.
  • Concrete four-phase workflow with failure capture, diagnosis, contained recovery, and reporting, which reduces guesswork for agents.
  • Strong operational framing: it explicitly says this is a workflow skill for self-debugging before escalation, not a hidden runtime.
Cautions
  • No scripts, references, or support files are included, so users must rely on the SKILL.md workflow alone.
  • It explicitly excludes some uses, such as feature verification after code changes and narrower framework-specific debugging, which limits breadth.
Overview


What agent-introspection-debugging is for

The agent-introspection-debugging skill is a structured self-debugging workflow for AI agents that are failing, looping, retrying without progress, or drifting off task. Instead of telling the model to “try harder,” it guides the agent to pause, capture the failure state, diagnose likely causes, apply a small recovery step, and produce a readable debug report.

Who should install this skill

This skill fits developers, agent builders, and operators who already run multi-step AI workflows with tools, files, or execution environments. It is most useful when failures are operational rather than purely logical: repeated tool misuse, context bloat, environment mismatch, or a stuck retry loop. If you want a reusable recovery method rather than another generic debugging prompt, agent-introspection-debugging is a strong fit.

What makes it different from a normal prompt

The main differentiator is containment. The skill pushes the agent to stop blind retries, document what happened, and choose a smaller corrective action instead of escalating token waste. It also sets boundaries: it is for agent failure recovery, not full feature verification or framework-specific debugging where a narrower skill would outperform it.

How to Use agent-introspection-debugging skill

Install context and where to read first

Install the agent-introspection-debugging skill through your normal skills workflow for the affaan-m/everything-claude-code repository. Then read skills/agent-introspection-debugging/SKILL.md first; this repo exposes the skill almost entirely through that file, with no extra scripts or reference assets to hide important behavior. That means your adoption decision should focus on the workflow itself, not on missing automation.

When to invoke agent-introspection-debugging

Use agent-introspection-debugging after a failed or degraded run, especially for:

  • loop-limit or max-tool-call failures
  • repeated retries with no forward progress
  • prompt drift or context growth that lowers output quality
  • filesystem or environment state mismatch
  • tool failures that seem recoverable with diagnosis and a narrower next step

Do not invoke it as your default coding flow. It adds the most value when the agent is already off the rails and needs disciplined recovery.
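The activation cues above can be sketched as a simple trigger check. This is a minimal illustration, not part of the skill itself; the `run_state` field names (`hit_loop_limit`, `retries_without_progress`, and so on) are hypothetical values your own harness would have to supply:

```python
# Sketch of a trigger check mirroring the activation cues listed above.
# All run_state field names are hypothetical; adapt them to your harness.

def should_invoke_introspection(run_state: dict) -> bool:
    """Return True when a run shows a failure pattern worth debugging."""
    context_tokens = run_state.get("context_tokens", 0)
    context_budget = run_state.get("context_budget", float("inf"))
    return any([
        run_state.get("hit_loop_limit", False),             # loop-limit / max-tool-call failure
        run_state.get("retries_without_progress", 0) >= 3,  # repeated retries, no forward progress
        context_tokens > context_budget,                    # context growth hurting output quality
        run_state.get("env_mismatch", False),               # filesystem or environment mismatch
        run_state.get("recoverable_tool_failure", False),   # tool failure that diagnosis could fix
    ])
```

The point of the check is the inverse rule as well: when none of these cues fire, stay in your normal coding flow.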

What input produces the best output

Give the skill a compact failure packet, not just “debug this.” Strong input usually includes:

  • original goal
  • expected result
  • actual failure
  • last meaningful tool-call sequence
  • relevant error text or stack trace
  • what changed just before failure
  • current constraints, such as “do not edit more than one file” or “no network access”

Example prompt:
“Use agent-introspection-debugging. Goal: update auth middleware tests. Expected: green test run. Actual: agent retried npm test 6 times, then edited unrelated files. Error: MODULE_NOT_FOUND in tests/auth.spec.ts. Last useful actions: edited jest.config.js, ran tests, listed files. Constraints: no dependency upgrades, keep changes minimal. Produce failure capture, diagnosis, one contained recovery action, and a short introspection report.”

This works better because it gives the skill enough evidence to separate root cause from noise.
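The failure packet above can also be kept as a small structured record and rendered into prompt text. The schema below is illustrative only; the field names mirror the bullet list, and the skill does not mandate any particular format:

```python
# Illustrative failure-packet schema; field names mirror the bullet list above.
from dataclasses import dataclass, field

@dataclass
class FailurePacket:
    """Compact failure evidence to hand to the skill."""
    goal: str
    expected: str
    actual: str
    last_tool_calls: list = field(default_factory=list)
    error_text: str = ""
    recent_change: str = ""
    constraints: list = field(default_factory=list)

    def render(self) -> str:
        """Render the packet as prompt text for the skill invocation."""
        return "\n".join([
            f"Goal: {self.goal}",
            f"Expected: {self.expected}",
            f"Actual: {self.actual}",
            f"Error: {self.error_text}",
            f"Last useful actions: {', '.join(self.last_tool_calls)}",
            f"Changed before failure: {self.recent_change}",
            f"Constraints: {'; '.join(self.constraints)}",
        ])
```

Keeping the packet structured makes it easy to reuse the same evidence across a retry without re-describing the failure from memory.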

Practical workflow and output expectations

A good agent-introspection-debugging usage pattern is:

  1. Trigger it only after a clear failure pattern appears.
  2. Force a capture step before any new edits or retries.
  3. Ask for one contained recovery action, not a broad rewrite.
  4. Review the introspection report before letting the agent resume.

In practice, the skill is strongest when you use it to narrow the next move: confirm environment assumptions, inspect one suspect file, or reverse one harmful change. If you ask for “debug everything,” you lose the containment benefit that makes this skill valuable.
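The four-step pattern above can be sketched as a harness function. This is a schematic, not the skill's implementation: the diagnosis and recovery values below are hard-coded placeholders where a real agent framework would do the work, and the dict shapes are assumptions:

```python
# Schematic of the capture -> diagnose -> recover -> report flow.
# Diagnosis/recovery values are placeholders a real harness would compute.

def run_introspection(failure_packet: dict) -> dict:
    # 1. Capture: freeze the failure state before any new edits or retries.
    capture = {"packet": failure_packet, "frozen": True}

    # 2. Diagnose: rank likely causes from the captured evidence (placeholder).
    diagnosis = {"probable_cause": "environment mismatch", "confidence": "medium"}

    # 3. Recover: choose ONE contained action, not a broad rewrite (placeholder).
    recovery = {"action": "inspect one suspect file", "scope": "single file"}

    # 4. Report: produce a human-readable introspection summary; a human
    #    reviews it before the agent is allowed to resume.
    return {
        "capture": capture,
        "diagnosis": diagnosis,
        "recovery": recovery,
        "resume_approved": False,
    }
```

The `resume_approved` flag encodes step 4 of the pattern: the agent does not continue until the report has been reviewed.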

agent-introspection-debugging skill FAQ

Is this skill better than an ordinary debugging prompt?

Usually yes, when the problem is agent behavior rather than just code defects. A normal prompt often encourages more retries. The agent-introspection-debugging skill is better at stopping loops, preserving failure evidence, and producing a report a human can inspect quickly.

Is agent-introspection-debugging good for beginners?

It is usable by beginners, but it works best if you can recognize symptoms like prompt drift, tool loops, or environment mismatch. If you are very new, the skill still helps because it imposes a checklist-like structure, but you will get better results if you provide concrete failure evidence instead of broad descriptions.

When should I not use agent-introspection-debugging?

Skip it for routine code verification, final QA, or narrow framework debugging where a specialized skill exists. Also skip it when the issue is obviously non-recoverable in the current harness, such as missing permissions or unavailable infrastructure that the agent cannot fix from inside the session.

Does the repository include automation or only guidance?

For this skill, the repository evidence points to guidance in SKILL.md, not helper scripts or rule files. That is not necessarily a weakness, but it means installing agent-introspection-debugging does not give you automatic enforcement. You are adopting a workflow that the agent must follow faithfully.

How to Improve agent-introspection-debugging skill

Give better evidence, not longer prompts

The biggest output-quality lever is sharper failure capture. Include the exact stopping point, the failed command, recent edits, and constraints. Omit unrelated history. The skill's output improves when the model can compare intended action against actual trajectory without scanning noise.

Ask for diagnosis and recovery separately

A common failure mode is collapsing diagnosis into immediate repair. Improve agent-introspection-debugging usage by explicitly requiring:

  • probable failure pattern
  • confidence level
  • smallest next action
  • success check after that action

This prevents the agent from jumping from symptom to large speculative fix.
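The four required fields above can be checked mechanically before accepting a diagnosis. A minimal sketch, assuming you ask the agent to emit the diagnosis as a dict with these (illustrative) keys:

```python
# Illustrative validator for the four diagnosis fields listed above.

REQUIRED_DIAGNOSIS_FIELDS = (
    "probable_failure_pattern",
    "confidence",
    "smallest_next_action",
    "success_check",
)

def missing_diagnosis_fields(diagnosis: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [f for f in REQUIRED_DIAGNOSIS_FIELDS if not diagnosis.get(f)]
```

If the list is non-empty, send the diagnosis back for completion before permitting any repair action.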

Use containment rules to stop repeat damage

If prior runs made the repo worse, add limits such as:

  • inspect before editing
  • one-file change maximum
  • no repeated command without new evidence
  • summarize why the next action is safer than retrying

These constraints align tightly with what agent-introspection-debugging is designed to do: reduce wasted actions while preserving recoverability.
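A harness can enforce two of the limits above mechanically. This sketch tracks edited files and repeated commands; the class and method names are hypothetical, not part of the skill:

```python
# Sketch of mechanical enforcement for two containment rules above:
# one-file change maximum, and no repeated command without new evidence.

class ContainmentGuard:
    def __init__(self, max_file_edits: int = 1):
        self.max_file_edits = max_file_edits
        self.edited_files = set()
        self.seen_commands = set()

    def allow_edit(self, path: str) -> bool:
        """One-file change maximum: block edits beyond the budget."""
        if path in self.edited_files:
            return True  # re-editing an already-touched file stays within budget
        if len(self.edited_files) >= self.max_file_edits:
            return False
        self.edited_files.add(path)
        return True

    def allow_command(self, command: str, new_evidence: bool) -> bool:
        """No repeated command unless new evidence justifies the retry."""
        if command in self.seen_commands and not new_evidence:
            return False
        self.seen_commands.add(command)
        return True
```

The "inspect before editing" and "summarize why the next action is safer" rules remain prompt-level instructions; only the countable limits lend themselves to this kind of guard.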

Iterate on the first report, not from scratch

If the first introspection report is weak, do not restart with a brand-new prompt. Ask the agent to refine the missing parts: “restate root cause candidates,” “separate evidence from assumptions,” or “propose a smaller recovery action.” That preserves the structured loop and usually yields better second-pass results than abandoning the skill entirely.
