browse is a fast headless browser skill for QA, dogfooding, and browser automation. Use it to open pages, interact with elements, verify state, compare before and after actions, capture screenshots, and check responsive layouts, forms, uploads, dialogs, and element states. Install browse when you need browser evidence instead of a generic prompt.

Stars: 91.8k
Favorites: 0
Comments: 0
Added: May 9, 2026
Category: Browser Automation
Install Command
npx skills add garrytan/gstack --skill browse
Curation Score

This skill scores 78/100: a solid listing candidate for directory users who need a fast headless-browser workflow for QA, dogfooding, screenshots, and state verification. The repository shows enough real operational content that an agent can likely trigger and use it with less guesswork than a generic prompt, though users should still expect some adoption friction from the missing install-command guidance and a few placeholder markers.

Strengths
  • Explicit trigger language and use cases in SKILL.md: "browse a page," "headless browser," "take page screenshot," plus QA testing, deployment verification, and bug evidence.
  • Large, workflow-heavy skill body with many headings covering scope, workflow, constraints, and practical steps, suggesting real operational guidance rather than a stub.
  • Supporting code and scripts indicate a functioning browser-skill system, including client/server integration, activity/audit logging, and a build script for Node compatibility.
Cautions
  • The SKILL.md excerpt shows placeholder markers and no install command, so first-time setup may require extra repository exploration.
  • The description is broad, but the directory evidence includes no concise quick-start or reference docs, which may slow adoption for users who want immediate execution confidence.

Overview of browse skill

What browse is for

The browse skill is a fast headless browser tool for QA, dogfooding, and browser automation. It is designed for when you need to open a page, interact with it, verify state, compare before and after an action, or capture evidence such as screenshots and element-state checks. If your job is “test this flow in a browser and tell me what happened,” browse is the right fit.

Who should install it

Install browse if you regularly validate web pages, demos, forms, responsive layouts, uploads, dialogs, or deployment checks. It is especially useful for agents that must prove a UI behavior with screenshots or state assertions instead of relying on a generic prompt. It is less useful for pure backend tasks or simple page reading.

What makes browse different

The browse skill is built around real browser execution, not just text-based page inspection. The repo signals support for command routing, browser management, CDP bridging, network capture, cookie handling, and annotated visual checks. That means browse is aimed at practical browser automation with evidence, not a lightweight “summarize this site” helper.

How to Use browse skill

Install browse correctly

Use the install path shown in the skill docs or your skill manager's add command, then confirm the skill is discoverable in your local skill directory. The repo includes helper shims such as bin/find-browse, which suggests browse is meant to be located and invoked from a workspace-aware install. If the binary is missing, the first fix is usually to run the skill's setup/build step rather than to rewrite prompts.

Give browse a task, not a vague goal

Strong browse usage starts with an explicit browser job: URL, action, expected result, and what evidence you want back. Good input looks like: “Open the login page, submit valid credentials, confirm redirect to /dashboard, and return a screenshot plus any console or network errors.” Weak input like “test the site” leaves too much routing ambiguity.
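The four-part shape above (URL, action, expected result, evidence) can be sketched as a small template. The `make_task` helper below is purely illustrative, not part of the browse skill; browse itself just receives the assembled plain-text task.

```shell
#!/bin/sh
# Hypothetical helper: assemble a fully specified browse task from its four parts.
# browse consumes the resulting sentence; make_task only composes it.
make_task() {
  url=$1; action=$2; expected=$3; evidence=$4
  printf 'Open %s, %s, confirm %s, and return %s.\n' \
    "$url" "$action" "$expected" "$evidence"
}

make_task "the login page" \
  "submit valid credentials" \
  "redirect to /dashboard" \
  "a screenshot plus any console or network errors"
```

If any of the four arguments is hard to fill in, that is usually a sign the task is still too vague to route.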

Read these files first

For install and usage decisions, start with SKILL.md, then inspect PLAN-snapshot-dropdown-interactive.md for known workflow constraints, SKILL.md.tmpl for how the skill is generated, and bin/find-browse plus bin/remote-slug for path and repo resolution behavior. If you are evaluating browser automation fit, also skim src/browser-manager.ts, src/cdp-bridge.ts, and src/browser-skill-commands.ts to understand what the skill can actually execute.

Use the skill in a workflow

A reliable browse workflow is: define the page state you want, run the browser action, verify the output, then iterate on the next constraint. For example, specify responsive width, form inputs, or expected DOM changes up front so browse can check them in one pass. This reduces back-and-forth and makes the first run more useful than a generic prompt.
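As a sketch of stating constraints up front, the snippet below generates one responsive-check task per viewport width. The widths, page, and element named here are assumptions for illustration, not values from the skill docs.

```shell
#!/bin/sh
# Illustrative sketch: declare viewport, inputs, and the expected state change
# up front so browse can verify everything in a single pass.
responsive_tasks() {
  for width in 375 768 1280; do
    printf 'At a %spx viewport, open the signup page, fill the email field, and confirm the submit button becomes enabled; return a screenshot.\n' "$width"
  done
}
responsive_tasks
```

Handing browse the full set at once turns three rounds of back-and-forth into one pass with three pieces of evidence.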

browse skill FAQ

Is browse only for screenshots?

No. Screenshots are only one output. The skill is also intended for navigation, interaction, state verification, responsive checks, form testing, uploads, and bug evidence. If your real need is “prove this browser behavior,” browse is more complete than a screenshot-only tool.

How is browse different from a normal prompt?

A normal prompt asks an agent to reason about a browser task. The browse skill gives the agent a browser-specific execution path, including command routing and browser-state checks. That usually means less guesswork, better repeatability, and clearer evidence when a flow fails.

Is browse beginner-friendly?

Yes, if you can describe a browser task clearly. Beginners do best when they provide a URL, one action, one expected result, and one evidence request. If you already know how to write a test case, you can usually use browse effectively on the first try.

When should I not use browse?

Do not use browse when you only need static content extraction, repo inspection, or a plain coding answer. It is also a poor fit if you cannot specify a browser target or if the task does not require an actual rendered page. In those cases, a normal agent prompt is simpler.

How to Improve browse skill

Give stronger browser inputs

The best browse results come from inputs that name the page, the user action, the success condition, and the artifact you want returned. For example: “On the pricing page, switch to annual billing, confirm the total updates, and capture a screenshot of the final state.” That is better than “check pricing,” because it removes ambiguity around what success means.

Watch for the common failure modes

The most common browse failure is under-specification: missing URL, missing state, or missing expected outcome. The second is asking for visual proof without saying what part of the page matters. If the task includes forms, menus, dialogs, or dynamic content, say so explicitly; those details materially affect browse usage.
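A quick lint can catch these failure modes before a task is sent. The checks below are illustrative heuristics of my own, not part of the skill: they look for a target page, an expected outcome, and an evidence request.

```shell
#!/bin/sh
# Sketch: reject under-specified browse tasks. The three pattern checks are
# illustrative heuristics, not browse functionality.
task_is_specified() {
  t=$1
  case "$t" in *"://"*|*page*) ;; *) return 1;; esac                  # target URL or named page
  case "$t" in *confirm*|*expect*|*verify*) ;; *) return 1;; esac     # expected outcome
  case "$t" in *screenshot*|*error*|*evidence*) ;; *) return 1;; esac # evidence request
  return 0
}

task_is_specified "test the site" \
  || echo "under-specified: add a URL, an expected outcome, and an evidence request"
task_is_specified "Open the pricing page, switch to annual billing, confirm the total updates, and return a screenshot." \
  && echo "fully specified"
```

The vague task fails the first check immediately, which is exactly why "test the site" produces weak results.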

Iterate after the first run

If the first result is close but incomplete, refine the next prompt with the exact mismatch: wrong viewport, missed element, stale state, or missing network evidence. Browse is most valuable when each pass narrows uncertainty. Use the output to add constraints rather than restating the same request.

Tune browse for Browser Automation

For browser automation, include concrete fixtures: test account type, device size, locale, and whether cookies or login state matter. If you are validating a bug, include the repro step and the expected/actual delta. This makes browse act like a browser automation assistant instead of a generic QA note-taker, and it usually produces better evidence on the first pass.
