
benchmark

by garrytan

The benchmark skill helps detect performance regressions in web and app workflows. Use it to establish a baseline, compare before and after changes, and track whether a PR made pages slower, heavier, or less stable. It covers Core Web Vitals, Lighthouse checks, bundle size, and load time trends.

Stars: 91.8k
Favorites: 0
Comments: 0
Added: May 9, 2026
Category: Performance Optimization
Install Command
npx skills add garrytan/gstack --skill benchmark
Curation Score

This skill scores 67/100, which means it is listable for directory users but with clear caveats: it appears genuinely workflow-oriented for performance regression benchmarking, yet the install decision is weakened by missing supporting assets and some placeholder markers. Users who need automated page-speed regression checks should consider it; users who want a very polished, self-contained install experience may want more documentation first.

Strengths
  • Specific, actionable purpose: performance regression detection for page load times, Core Web Vitals, and resource sizes.
  • Good triggerability: explicit use cases and voice aliases such as "speed test" and "check performance" reduce guesswork.
  • Substantial workflow content in SKILL.md with many headings and code-fenced steps, suggesting real operational guidance rather than a stub.
Cautions
  • No install command in the skill itself and no supporting scripts, references, or resources, so adoption may require more manual setup and inspection.
  • Placeholder markers are present, which lowers trust that every branch of the workflow is fully finalized.

Overview of benchmark skill

What benchmark skill does

The benchmark skill is for performance regression detection in web and app workflows. It helps you establish a baseline, compare before/after changes, and track whether a PR made pages slower, heavier, or less stable. In practice, the benchmark skill is most useful for teams trying to answer one question: did this change improve or harm performance?

Who should use it

Use this benchmark skill if you care about page speed, Core Web Vitals, Lighthouse-style checks, bundle size, or load time trends over time. It is a strong fit for reviewers, frontend engineers, and AI agents that need a repeatable way to evaluate performance changes instead of guessing from a screenshot or a quick manual test.

Why it is different

The benchmark skill is not just a generic “run a test” prompt. It is oriented around before/after comparison, regression detection, and ongoing trend awareness, with workflow guidance tuned for browser-based performance measurement. That makes it more useful for performance optimization than a one-off prompt that only asks for “speed issues.”

How to Use benchmark skill

benchmark install and setup

Install the benchmark skill in your Claude skills environment with the repository’s skill command, then open the skill file before using it in a real task. The expected install command is:
npx skills add garrytan/gstack --skill benchmark

After install, confirm the skill is available in the current workspace and that your task is specific enough to measure. The skill works best when the repo under test, the page or route, and the change being evaluated are all known up front.

What to read first

Start with SKILL.md, then inspect SKILL.md.tmpl if you want to understand the generated structure. Because this repository does not expose extra rules/, resources/, or helper scripts for the skill, the main source of truth is the skill file itself. For decision-making, the most important sections are the preamble, plan-mode guidance, and any routing or constraint notes that affect when the benchmark skill should run.

How to write a good prompt

A weak prompt says “check performance.” A stronger prompt names the target, the baseline, and the decision you need:

  • “Compare /pricing before and after the image compression change and report any regressions in LCP, CLS, and total transfer size.”
  • “Benchmark the checkout page on mobile emulation and tell me whether the new bundle split improved load time.”
  • “Run a performance benchmark for the homepage and summarize whether the PR is safe to merge.”

Include the page, device assumptions, and what counts as a failure. That reduces ambiguity and makes the result actionable.
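One way to keep those details from being forgotten is to collect them in a small structure before writing the prompt. The sketch below is only an illustration: `BenchmarkRequest` and its fields are hypothetical names, not part of the skill, and the wording of the generated prompt is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRequest:
    """Hypothetical container for the details a benchmark prompt should name."""
    route: str            # page or route under test, e.g. "/pricing"
    device: str           # device assumption, e.g. "mobile emulation"
    change: str           # the change being evaluated
    metrics: list[str]    # metrics that matter for the decision
    failure_rule: str     # what counts as a failure

    def to_prompt(self) -> str:
        # Assemble one sentence per concern: target, metrics, failure rule.
        metric_list = ", ".join(self.metrics)
        return (
            f"Compare {self.route} on {self.device} before and after "
            f"{self.change}. Report {metric_list}. "
            f"Treat it as a failure if {self.failure_rule}."
        )


request = BenchmarkRequest(
    route="/pricing",
    device="mobile emulation",
    change="the image compression change",
    metrics=["LCP", "CLS", "total transfer size"],
    failure_rule="any metric regresses by more than 5%",
)
print(request.to_prompt())
```

If any field is hard to fill in, that is usually a sign the request is not yet specific enough to benchmark.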

Workflow that produces useful results

Use the skill as a repeatable loop: identify the page, establish the baseline, run the comparison, and then interpret the delta against the change you made. If you are working in plan mode, confirm whether the skill should only inspect or should also execute measurements. For best output, keep the test scope narrow; one important route usually beats a whole-site sweep.
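The "interpret the delta" step can be sketched as a per-metric relative change between a baseline run and a current run. This is a minimal illustration, not the skill's own implementation: the function names, metric keys, and numbers are all assumptions, and every metric here is treated as lower-is-better.

```python
def percent_delta(baseline: float, current: float) -> float:
    """Relative change from baseline, in percent. Positive means the value grew."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a relative change")
    return (current - baseline) / baseline * 100.0


def compare_runs(baseline: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Delta per metric, limited to metrics present in both runs."""
    shared = baseline.keys() & current.keys()
    return {
        name: round(percent_delta(baseline[name], current[name]), 1)
        for name in sorted(shared)
    }


# Illustrative numbers only, not measurements from any real page.
baseline = {"lcp_ms": 2000.0, "transfer_kb": 800.0, "cls": 0.05}
current = {"lcp_ms": 2200.0, "transfer_kb": 720.0, "cls": 0.05}

for metric, delta in compare_runs(baseline, current).items():
    print(f"{metric}: {delta:+.1f}%")
```

With these sample values, LCP grew by 10% while transfer size shrank by 10%, which is the kind of mixed result that needs to be read against the change you actually made.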

benchmark skill FAQ

Is benchmark skill only for web performance?

It is primarily for browser-visible performance optimization, especially pages, routes, and frontend changes. If your task is backend latency, infra profiling, or database tuning, the benchmark skill may not be the best first choice unless the user-facing page metric is the goal.

Do I need a full prompt, or is the skill enough?

The skill helps structure the work, but it still needs a concrete target. A generic prompt can trigger the benchmark skill, but results improve when you provide a route, a change, and a comparison point. The more specific your request, the less the agent has to infer.

Is benchmark good for beginners?

Yes, if you want a guided way to check whether a change made performance worse. It is easier to use than building your own evaluation checklist from scratch, but you still need to know what page or feature you want measured.

When should I not use it?

Do not use the benchmark skill when you only need a qualitative UI review, when the page is too unstable to measure meaningfully, or when your main problem is not performance. If you cannot define a stable before/after comparison, the benchmark result will be noisy.

How to Improve benchmark skill

Give the skill a measurable target

The biggest quality boost comes from specifying exactly what to benchmark and what success looks like. Say which URL, device class, and metric matter most. For performance optimization, that often means naming one primary metric, such as LCP or bundle size, instead of asking for “all performance issues.”
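If the primary metric is a Core Web Vital, "what success looks like" can be tied to Google's published rating bands. The sketch below uses the commonly cited thresholds (LCP 2.5 s and 4 s, INP 200 ms and 500 ms, CLS 0.1 and 0.25); check them against the current Core Web Vitals documentation before relying on them. The `rate` helper and the metric keys are hypothetical names chosen for this example.

```python
# Core Web Vitals bands as (good up to, poor above); values between are
# "needs improvement". Verify against current documentation before use.
THRESHOLDS = {
    "lcp_ms": (2500.0, 4000.0),
    "inp_ms": (200.0, 500.0),
    "cls": (0.1, 0.25),
}


def rate(metric: str, value: float) -> str:
    """Classify a single measurement into one of the three rating bands."""
    good_max, poor_min = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= poor_min:
        return "needs improvement"
    return "poor"


print(rate("lcp_ms", 2300.0))  # good
print(rate("lcp_ms", 3100.0))  # needs improvement
print(rate("cls", 0.31))       # poor
```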

Include the change being tested

Benchmarking is strongest when the skill knows what changed: a new image pipeline, a code-splitting refactor, a font swap, or a third-party script removal. That context helps separate normal variance from a real regression and makes the output easier to trust.
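Separating normal variance from a real regression usually means taking several runs on each side and comparing the shift to the run-to-run spread. The heuristic below is one simple way to do that, assuming a lower-is-better metric; it is not the skill's own method, and the function names and sample values are made up for illustration.

```python
from statistics import median


def spread(samples: list[float]) -> float:
    """Run-to-run range of the samples (max minus min)."""
    return max(samples) - min(samples)


def looks_like_regression(before: list[float], after: list[float]) -> bool:
    """Heuristic: the median got worse by more than the larger run-to-run spread."""
    shift = median(after) - median(before)
    noise = max(spread(before), spread(after))
    return shift > noise


# Illustrative LCP samples in milliseconds, not real measurements.
before = [2010.0, 1980.0, 2050.0, 1995.0, 2020.0]
after_small = [2040.0, 2000.0, 2075.0, 2015.0, 2060.0]
after_large = [2400.0, 2380.0, 2450.0, 2410.0, 2395.0]

print(looks_like_regression(before, after_small))  # False
print(looks_like_regression(before, after_large))  # True
```

A 30 ms shift inside a 75 ms spread is treated as noise, while a 390 ms shift is not. A stricter analysis would use a proper statistical test, but this is often enough to decide whether a rerun is needed.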

Ask for the comparison you will act on

If you need a merge decision, say so. If you need remediation ideas, say that too. Useful follow-up prompts include:

  • “Compare against the last stable build and flag anything above a 5% regression.”
  • “Benchmark this branch, then tell me the highest-impact fix if results are worse.”
  • “Rerun the check on mobile and desktop, but prioritize the route with the worst LCP.”
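The follow-up prompts above map to two small checks: flagging metrics that regressed by more than a threshold, and picking the route with the worst LCP. The sketch below assumes lower-is-better metrics and uses hypothetical function names, metric keys, and sample values.

```python
REGRESSION_LIMIT = 0.05  # the 5% figure from the example prompt above


def flag_regressions(stable: dict[str, float], branch: dict[str, float],
                     limit: float = REGRESSION_LIMIT) -> list[str]:
    """Names of lower-is-better metrics that grew by more than `limit`."""
    return sorted(
        name for name, old in stable.items()
        if name in branch and old > 0 and (branch[name] - old) / old > limit
    )


def worst_lcp_route(lcp_by_route: dict[str, float]) -> str:
    """Route with the highest (slowest) LCP."""
    return max(lcp_by_route, key=lcp_by_route.get)


# Illustrative values only.
stable = {"lcp_ms": 2000.0, "transfer_kb": 800.0}
branch = {"lcp_ms": 2150.0, "transfer_kb": 820.0}
print(flag_regressions(stable, branch))  # ['lcp_ms']
print(worst_lcp_route({"/": 1900.0, "/pricing": 2600.0, "/checkout": 2300.0}))  # /pricing
```

Here LCP grew by 7.5% and is flagged, while the 2.5% growth in transfer size stays under the limit.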

Iterate on the first run

If the first result is noisy, improve the input before rerunning: narrow the route, remove unrelated changes, or define the test conditions more tightly. The benchmark skill works best as a repeatable check for decision support, not as a single-pass diagnostic for every kind of speed problem.
