open-source
by browser-use

Documentation lookup for the browser-use Python library. The open-source skill helps with install, setup, Agent and Browser code, model env vars, tools, MCP integrations, monitoring, and legacy Actor API guidance.
This skill scores 82/100, which means it is a solid directory listing candidate: agents get a clear trigger boundary, a usable topic-to-file map, and substantial reference content for coding with the browser-use open-source library, though users should view it as documentation lookup rather than a tightly guided end-to-end workflow.
- Strong triggerability: SKILL.md explicitly says when to use this skill and when to defer to the cloud or browser-use skills.
- Good operational depth: reference files cover install/quickstart, models, agent config, browser config, tools, integrations, monitoring, and examples.
- Trustworthy, concrete details: docs include Python snippets, parameter explanations, environment variables, and MCP/client configuration examples.
- The top-level skill is mostly a routing document; agents still need to choose and read the right reference file rather than follow one unified workflow.
- No install command appears in SKILL.md itself, so basic setup depends on opening the referenced quickstart material.
Overview of open-source skill
What the open-source skill is for
The open-source skill is the documentation lookup skill for the Python browser-use library. It helps an agent answer implementation questions about Agent, Browser, tools, model setup, MCP integrations, monitoring, and the legacy Actor API without guessing from generic browser automation patterns.
This is most useful for developers who are writing or reviewing code that imports from `browser_use`, choosing a runtime setup, or debugging configuration details that are easy to get wrong from memory.
Best-fit users and jobs-to-be-done
Use the open-source skill when you need to:
- install and configure the open-source `browser-use` Python library
- pick an LLM backend and the right environment variables
- write `Agent(...)` or `Browser(...)` code with valid parameters
- add custom tools, hooks, or structured output
- connect browser-use to MCP, skills, docs tooling, or observability
- understand the legacy low-level Actor API
The real job is not “summarize the repo.” It is “help me produce correct `browser_use` code and config faster than I could by manually hunting across reference files.”
What makes this skill different from a generic prompt
A generic prompt may know browser automation broadly, but this skill is anchored to the repository’s own reference set:
- `references/quickstart.md`
- `references/models.md`
- `references/agent.md`
- `references/browser.md`
- `references/tools.md`
- `references/actor.md`
- `references/integrations.md`
- `references/monitoring.md`
- `references/examples.md`
That matters because browser-use has product-specific classes, parameter names, env vars, cloud boundaries, and integration paths that are not interchangeable with Playwright, Selenium, or cloud-only Browser Use APIs.
Key boundary you should know before installing
This open-source skill is for the open-source Python library, not every Browser Use product surface.
Do use it for:
- local or Python library usage
- code generation for `browser_use`
- setup questions around models, tools, hooks, browser sessions, and monitoring
Do not use it for:
- Cloud API or SDK pricing and cloud product workflows
- direct CLI browser automation requests better handled by the separate browser-use skill
If your task is “write Python code with `from browser_use import ...`,” this is the right fit.
How to Use open-source skill
Install context for open-source usage
Install the skill in a skills-enabled environment, then invoke it when your task involves the `browser_use` Python library.
A common add command pattern is:
npx skills add https://github.com/browser-use/browser-use --skill open-source
After install, use the skill as a reference layer while generating code, not as a standalone app. It is designed to guide code-writing and configuration decisions.
Read these files first before asking for code
If you want fast, accurate open-source usage, start with the file that matches your task instead of reading the whole repo:
- install or first run: `references/quickstart.md`
- choose model provider: `references/models.md`
- write an agent: `references/agent.md`
- configure browser sessions: `references/browser.md`
- add tools: `references/tools.md`
- need low-level deterministic control: `references/actor.md`
- wire MCP or skills: `references/integrations.md`
- add tracing or cost tracking: `references/monitoring.md`
- copy working patterns: `references/examples.md`
This skill is strongest when the prompt names the topic explicitly.
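The topic-to-file routing above can be expressed as a plain lookup table. This is purely illustrative code of my own (the skill ships no such module); only the reference file paths come from the skill itself:

```python
# Illustrative routing table: maps a task topic to the reference file
# the open-source skill would consult. The topic keys are paraphrased
# from the list above; this is not code shipped with the skill.
REFERENCE_MAP = {
    "install": "references/quickstart.md",
    "models": "references/models.md",
    "agent": "references/agent.md",
    "browser": "references/browser.md",
    "tools": "references/tools.md",
    "actor": "references/actor.md",
    "integrations": "references/integrations.md",
    "monitoring": "references/monitoring.md",
    "examples": "references/examples.md",
}


def route(topic: str) -> str:
    """Return the reference file for a topic, defaulting to the examples."""
    return REFERENCE_MAP.get(topic, "references/examples.md")
```

Naming the topic in your prompt is doing exactly this lookup for the skill, which is why explicit topics produce better answers.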
What input the open-source skill needs
Provide enough context for the skill to select the right reference file and generate working code. The highest-value inputs are:
- your goal in one sentence
- whether you want `Agent`, `Browser`, tools, or the Actor API
- your model provider, if known
- whether execution is local, remote CDP, or cloud-connected
- any constraints such as headless mode, auth, allowed domains, structured output, or observability
Weak input:
- “Use browser-use for automation.”
Strong input:
- “Write Python code using `browser_use.Agent` with `ChatOpenAI(model="gpt-4.1-mini")`, a non-headless `Browser`, allowed domains limited to `example.com`, and a Pydantic output schema.”
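As a hedged sketch only: assuming a recent browser-use version where `Agent`, `Browser`, and `ChatOpenAI` are importable from `browser_use`, the strong input above might translate into something like the following. Signatures change between releases, so verify every parameter against `references/agent.md` and `references/browser.md` before relying on it:

```python
# Hedged sketch of the "strong input" example. The parameter names
# (headless, allowed_domains, output_model_schema) are the ones this
# skill documents; exact signatures vary by browser-use version.
import asyncio

from pydantic import BaseModel
from browser_use import Agent, Browser, ChatOpenAI


class PageSummary(BaseModel):
    title: str
    summary: str


async def main() -> None:
    browser = Browser(
        headless=False,                   # non-headless, as requested
        allowed_domains=["example.com"],  # restrict navigation
    )
    agent = Agent(
        task="Open example.com and summarize the page.",
        llm=ChatOpenAI(model="gpt-4.1-mini"),  # needs OPENAI_API_KEY
        browser=browser,
        output_model_schema=PageSummary,  # Pydantic structured output
    )
    await agent.run()


if __name__ == "__main__":
    asyncio.run(main())
```

Notice that every line of the sketch maps back to a clause of the one-sentence strong input; that is the level of specificity the skill rewards.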
Turn a rough goal into a strong prompt
To get better code-generation results from the open-source skill, transform a vague request into a prompt with four parts:
- target API surface
- runtime assumptions
- output shape
- constraints
Example:
Use the open-source skill to write a Python example with `browser_use.Agent`.
Model: `ChatGoogle(model="gemini-flash-latest")`.
Browser: headless, custom window size, keep browser alive after run.
Task: log in, navigate to a dashboard, extract three metrics.
Return complete code plus required env vars and pip installs.
Why this works:
- it points the skill toward `agent.md`, `browser.md`, and `models.md`
- it avoids cloud/API confusion
- it asks for code, setup, and operational details in one pass
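The four-part structure is easy to template. A minimal stdlib sketch (the function and field names are my own, not part of the skill):

```python
def build_prompt(api_surface: str, runtime: str,
                 output_shape: str, constraints: str) -> str:
    """Assemble the four-part prompt described above.

    The four fields mirror the list: target API surface, runtime
    assumptions, output shape, and constraints.
    """
    return (
        f"Use the open-source skill. Target: {api_surface}. "
        f"Runtime: {runtime}. "
        f"Return: {output_shape}. "
        f"Constraints: {constraints}."
    )


# Example usage, echoing the Gemini prompt above:
prompt = build_prompt(
    "browser_use.Agent with ChatGoogle(model='gemini-flash-latest')",
    "headless browser, custom window size, keep alive after run",
    "complete code plus required env vars and pip installs",
    "log in, navigate to a dashboard, extract three metrics",
)
```

Filling all four slots, even tersely, is what keeps the skill from drifting into cloud-API territory.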
Minimal open-source install path to ask for
If you are still deciding whether to adopt, ask the skill for the shortest working setup first:
- Python install steps
- the smallest runnable `Agent` example
- one supported LLM option and its env var
- any browser/runtime assumptions
The repo references show that model setup varies by provider, so “install browser-use” is not enough by itself. You also need the correct chat class and API key variable, such as `BROWSER_USE_API_KEY`, `GOOGLE_API_KEY`, or `OPENAI_API_KEY`.
Practical open-source usage patterns it supports well
The skill is strongest for these workflows:
- generate a first `Agent(...)` script
- compare model classes such as `ChatBrowserUse`, `ChatGoogle`, `ChatOpenAI`, or `ChatAnthropic`
- configure `Browser(...)` options like `headless`, `window_size`, `cdp_url`, or domain restrictions
- add custom tools and understand `ActionResult`
- enable structured output with `output_model_schema`
- set timeouts, retries, fallback LLMs, or hooks
- add Laminar or OpenLIT monitoring
- use the legacy Actor API for lower-level page and element control
Important constraints that affect output quality
The open-source skill has a few decision-critical constraints:
- The Actor API is explicitly legacy and not the same as Playwright.
- `Browser` is an alias of `BrowserSession`, which helps when reading examples.
- Domain control uses `allowed_domains` and `prohibited_domains` patterns with specific matching rules.
- Some features, such as loading skills via `skills` or `skill_ids`, require `BROWSER_USE_API_KEY`.
- Cloud MCP setup exists, but that is not the same thing as the open-source Python library workflow.
These details are where generic prompting often fails.
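To illustrate why the matching rules matter, here is a glob-style domain check. This is my own illustration built on `fnmatch`, not browser-use's actual matcher; consult `references/browser.md` for the real rules:

```python
from fnmatch import fnmatch


def domain_allowed(host: str, allowed: list[str],
                   prohibited: list[str]) -> bool:
    """Illustrative glob-style check (NOT browser-use's real matcher).

    A host must match some allowed pattern and no prohibited pattern.
    Note that a pattern like "*.example.com" matches subdomains but
    not the bare domain -- exactly the kind of edge case worth
    confirming in references/browser.md before relying on it.
    """
    if any(fnmatch(host, pattern) for pattern in prohibited):
        return False
    return any(fnmatch(host, pattern) for pattern in allowed)
```

Even in this toy version, `"*.example.com"` rejects `"example.com"` itself, which is why skimming the documented matching rules beats guessing.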
Best workflow for open-source code generation
A practical workflow is:
- Ask for the smallest working example for your exact provider and task.
- Ask the skill to annotate every non-default parameter it adds.
- Run the example locally.
- If it fails, paste the traceback and your current code.
- Ask for a revised version using the relevant reference file.
This works better than asking for a “full production implementation” first, because many failures come from setup mismatch rather than missing business logic.
Example prompt that invokes the skill well
Use the open-source skill for browser-use.
I need Python code, not cloud API usage.
Please build a script that uses `Agent` with `ChatBrowserUse()`, runs headless,
extracts structured output into a Pydantic model, and tracks cost.
Also list the env vars, pip packages, and which reference docs you used.
That prompt gives the skill enough signal to combine `agent.md`, `models.md`, and `monitoring.md`.
When to use Actor API instead of Agent
Use `Agent` when you want goal-driven browsing with LLM planning.
Use the Actor API when you need deterministic low-level actions and can manage timing yourself. The references note important differences from Playwright, including immediate element returns and stricter `evaluate()` formatting. If your code assumes Playwright semantics, ask the skill to adapt the example specifically for Actor API behavior.
open-source skill FAQ
Is open-source only for installation help?
No. open-source covers install, setup, code generation, configuration, integrations, and debugging for the `browser_use` Python library. Installation is just the first step; the bigger value is getting correct parameter names, provider setup, and API-specific examples.
Is the open-source skill good for beginners?
Yes, if you ask for a minimal path. Beginners should request:
- one provider
- one short task
- one complete script
- env vars and install commands
- explanation of each import
Avoid asking for tools, hooks, monitoring, and MCP in the first prompt unless you already know you need them.
How is this different from an ordinary prompt about browser automation?
An ordinary prompt may default to Playwright or Selenium assumptions. The open-source skill is better when you need repository-accurate `browser_use` details such as `ChatBrowserUse`, `output_model_schema`, domain restrictions, fallback LLM behavior, cloud-vs-open-source boundaries, or Actor API quirks.
When should I not use open-source?
Do not use it when your task is:
- Browser Use Cloud pricing or cloud SDK guidance
- generic browser automation without `browser_use`
- direct command-style browser control better matched to another skill
If your request does not involve the Python library or Browser Use docs, this skill is probably the wrong tool.
Does open-source help with model selection?
Yes. The references include supported model providers and env vars across Browser Use, Google Gemini, OpenAI, Anthropic, Azure OpenAI, Bedrock, Groq, Ollama, and OpenAI-compatible APIs. This is one of the most practical reasons to use the skill before coding.
Can open-source help with production concerns?
Yes, within the library scope. It can guide you on retries, fallback LLMs, browser persistence, remote browser connection via `cdp_url`, monitoring with Laminar or OpenLIT, and performance-oriented example patterns like fast mode or parallel browsers.
How to Improve open-source skill
Give open-source a concrete implementation target
The fastest way to improve results is to specify exactly what code object you want:
- “write an `Agent` example”
- “configure a `Browser` with `cdp_url`”
- “add a custom tool”
- “return structured output”
- “show Actor API page interaction”
This reduces reference-file drift and avoids mixed answers.
Include runtime and provider details up front
Many poor outputs come from omitted environment assumptions. State:
- Python context
- chosen model class
- API key source
- headless vs visible browser
- local browser vs remote CDP
- whether skills or MCP are required
Without that, the skill may return a plausible snippet that still cannot run in your setup.
Ask for a runnable example before abstractions
If you want reusable architecture, still ask for a runnable script first. Then iterate toward:
- helper functions
- config extraction
- stronger schemas
- tool registration
- monitoring hooks
This catches install and import mistakes early, which is where most adoption friction happens.
Name the reference file you want the answer grounded in
A high-leverage prompt pattern is:
Use the open-source skill and ground the answer in `references/agent.md` and `references/browser.md`.
Do this when accuracy matters more than breadth. It helps the skill stay aligned with the repository’s actual API surface.
Common failure modes to watch for
The main adoption blockers are:
- mixing cloud product guidance with open-source library code
- assuming Playwright behavior in Actor API examples
- missing provider env vars
- asking for advanced features without naming the base setup
- requesting “browser-use” help without saying whether you mean Agent, Browser, tools, or Actor API
If the first answer feels broad, narrow the API surface instead of asking for “more detail.”
Provide stronger inputs for better code generation
Better prompt:
Use the open-source skill to generate Python code with:
- `from browser_use import Agent, Browser, ChatOpenAI`
- model `gpt-4.1-mini`
- headless browser
- `allowed_domains=["example.com"]`
- structured output via Pydantic
- cost tracking enabled
Return install steps, env vars, and a short explanation of each parameter.
This works because every requested feature maps cleanly to documented references.
Iterate after the first output
After you get an initial answer, improve it by asking one of these:
- “Remove everything non-essential and keep it runnable.”
- “Adapt this to `ChatBrowserUse()` instead of OpenAI.”
- “Add a custom tool and explain where it plugs into the agent.”
- “Switch from Agent to Actor API for deterministic control.”
- “Add monitoring with OpenLIT only.”
These focused revisions usually outperform a single giant prompt.
Use open-source as a doc router, not just a summary tool
The best use of open-source is as a routing layer to the right internal docs. Treat it as a fast path to the exact reference you need, then ask for code grounded in that file. That is where the skill adds real value over a generic prompt or a quick repo skim.
