differential-review
by trailofbits
differential-review is a security-focused code review skill for PRs, commits, and diffs. It uses baseline history, blast radius, test coverage, and structured reporting to help catch regressions in auth, crypto, external calls, and other high-risk paths. Use it for code review when you need evidence-backed findings.
This skill scores 78/100, which means it is a solid but not top-tier listing candidate: directory users get a clearly security-focused differential review workflow with enough structure to justify installation, but they should expect some manual interpretation and limited onboarding help.
- Explicitly triggers on PRs, commits, and diffs for security-focused review, so agents know when to use it.
- Strong operational guidance: risk-first rules, baseline context building, blast-radius analysis, adversarial modeling, and mandatory report generation.
- Evidence-backed workflow with git history, line numbers, attack scenarios, and explicit confidence/coverage expectations, which improves agent leverage over a generic prompt.
- No install command and no support files, so adoption depends on reading the skill content rather than a packaged setup experience.
- The description/frontmatter is sparse and there is no quick-start example, so agents may still need to infer the exact entrypoint and execution sequence from the body.
Overview of differential-review skill
What differential-review does
The differential-review skill is a security-focused workflow for reviewing PRs, commits, and diffs with more rigor than a normal prompt. It is built for reviewers who need to decide whether a change introduces regressions, especially in auth, crypto, external calls, state changes, and other high-risk paths.
Who it fits best
Use the differential-review skill if you are reviewing security-sensitive code, inherited a large diff, or need a repeatable method that adapts to codebase size. It is a strong fit for engineers, security reviewers, and AI-assisted auditors who want evidence-backed findings instead of a shallow line-by-line skim.
What makes it different
The main value of differential-review is that it forces context before conclusions: baseline history, blast radius, test coverage, and explicit confidence limits. The repository also pushes output into a structured markdown report, so the skill is not just an analysis prompt; it is a review process with a deliverable.
How to Use differential-review skill
Install and load the skill
A typical differential-review install starts with the repository toolchain and then points the agent at the skill folder. For this package, the install path is plugins/differential-review/skills/differential-review. If you are using the Trail of Bits skills repo, install with the project’s skills command, then open SKILL.md before anything else.
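As a quick sanity check (a sketch only: the clone location is an assumption, and the path below comes from this listing rather than the skill's own docs), you can confirm the skill folder is where you expect before pointing an agent at it:

```python
from pathlib import Path

# Assumed location of a local clone of the Trail of Bits skills repo; adjust to your checkout.
repo = Path.home() / "src" / "skills"
skill_dir = repo / "plugins" / "differential-review" / "skills" / "differential-review"

# SKILL.md is the entrypoint the listing says to open first.
entrypoint = skill_dir / "SKILL.md"
if entrypoint.exists():
    print(f"Skill entrypoint found: {entrypoint}")
else:
    raise SystemExit(f"Skill not found under {skill_dir}; check the clone path.")
```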
Give it a review-shaped input
For best differential-review usage, ask it to review a specific base/head range, commit, or PR, and name the security concern if you have one. Strong inputs look like: “Review base..head for auth bypass, reentrancy, and missing tests; focus on external call paths and state transitions.” Weak inputs like “check this diff” leave too much room for guesswork.
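If you are unsure what range to name, a small sketch like this (branch names are assumptions; substitute your own) turns the current branch into a concrete base..head surface you can paste into the request:

```python
import subprocess

def git(*args: str) -> str:
    """Run git in the current repository and return trimmed stdout."""
    return subprocess.run(["git", *args], capture_output=True, text=True, check=True).stdout.strip()

# Assumed target branch; replace "main" with whatever the PR merges into.
base = git("merge-base", "main", "HEAD")
changed = git("diff", "--name-only", f"{base}..HEAD").splitlines()

print(f"Review range: {base[:12]}..HEAD")
print("Changed files:")
for path in changed:
    print(f"  {path}")
```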
Read the right files first
A good differential-review run starts with SKILL.md, then methodology.md, adversarial.md, patterns.md, and reporting.md. These files tell the agent how to build baseline context, what attack models to use, what patterns to scan for, and how to format the final report. There are no helper scripts or extra reference folders in this plugin, so the skill files are the source of truth.
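A minimal sketch of that reading order, assuming a local copy of the skill folder at the path from the install step:

```python
from pathlib import Path

# Assumed relative path to the skill folder inside the cloned repo.
skill_dir = Path("plugins/differential-review/skills/differential-review")

# Reading order from this guide: entrypoint first, then the supporting files.
reading_order = ["SKILL.md", "methodology.md", "adversarial.md", "patterns.md", "reporting.md"]

context = "\n\n".join(
    (skill_dir / name).read_text() for name in reading_order if (skill_dir / name).exists()
)
print(f"Loaded {len(context.splitlines())} lines of skill guidance")
```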
Workflow tips that change output quality
Use the skill when you can provide a clean diff, a baseline commit, and enough repository context to inspect callers and tests. Tell it if the codebase is small, medium, or large, or let it infer scale, but do not skip the baseline/history step. For code review with differential-review, the highest-value inputs are concrete: changed files, likely trust boundaries, suspicious functions, and any regression history you already know about.
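One way to prepare that baseline before invoking the skill (a sketch, not part of the skill itself; the baseline ref is an assumption) is to pull recent history for every changed file:

```python
import subprocess

def git(*args: str) -> str:
    return subprocess.run(["git", *args], capture_output=True, text=True, check=True).stdout.strip()

base = "origin/main"  # assumed baseline ref; use the merge-base commit if you have it
changed = git("diff", "--name-only", f"{base}..HEAD").splitlines()

# Recent history per changed file is the "baseline context" the skill asks you not to skip.
for path in changed:
    history = git("log", "--oneline", "-n", "5", "--", path)
    print(f"## {path}\n{history}\n")
```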
differential-review skill FAQ
Is differential-review only for security reviews?
Yes, primarily. It is designed for security-focused differential review, not general style cleanup or feature acceptance. You can still use it for ordinary code review, but the main value appears when the change could affect trust boundaries, data integrity, or exploitability.
How is it different from a normal prompt?
A normal prompt may summarize the diff; differential-review tries to prove or disprove risk with history, blast radius, and attacker modeling. It also expects a markdown report, which makes the output easier to hand off or archive.
Is it beginner-friendly?
It is usable for beginners, but it assumes the user can point to a specific diff and wants structured analysis. If you do not know the codebase well, the skill still helps because it demands baseline context and makes missing coverage explicit.
When should I not use it?
Do not use differential-review for trivial text changes, low-risk formatting-only PRs, or cases where you only need a one-paragraph summary. It is overkill when there is no meaningful security or regression risk, and its process adds value only if there is something worth checking deeply.
How to Improve differential-review skill
Provide stronger review context
The biggest improvement comes from giving the skill the exact review surface: PR number, commit range, target branch, and any suspected risk area. If you know the project domain, say so up front: a Solidity change, an API auth flow, or a payment path will steer the analysis toward the right attack model.
Ask for the right depth on the first pass
If you want better differential-review usage, specify whether you care more about correctness, exploitability, or regression risk. For example: “Focus on externally callable functions, changed validation, and any missing tests for new branches.” That narrows the search to the paths that matter most and reduces noisy findings.
Watch for the common failure modes
The most common misses are treating small diffs as low risk, ignoring removed code history, and forgetting transitive callers when judging blast radius. The skill is explicitly written to avoid those mistakes, but it still needs a concrete baseline and a clearly bounded diff to do that well.
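Two of those blind spots can be pre-empted with plain git before the review: the pickaxe search recovers the history of removed code, and a repository-wide grep surfaces callers that sit outside the diff. The symbol name below is a placeholder for a function touched by the change:

```python
import subprocess

symbol = "validate_signature"  # placeholder: a function removed or changed in the diff

# Pickaxe: commits that added or removed occurrences of the symbol (removed-code history).
pickaxe = subprocess.run(
    ["git", "log", f"-S{symbol}", "--oneline"], capture_output=True, text=True, check=True
)
print(pickaxe.stdout or "(no history found for this symbol)")

# Callers elsewhere in the repo that still reference the symbol (transitive blast radius).
callers = subprocess.run(["git", "grep", "-n", symbol], capture_output=True, text=True)
print(callers.stdout or "(no remaining references outside the diff)")
```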
Iterate after the first report
Use the first report to refine the next pass. If the result is too broad, ask for a narrower attacker model or a deeper inspection of one subsystem. If it is too shallow, ask it to re-run with more history, stronger test scrutiny, or a stricter focus on invariants and regression paths.
