
open-source

by browser-use

Documentation lookup for the browser-use Python library. The open-source skill helps with install, setup, Agent and Browser code, model env vars, tools, MCP integrations, monitoring, and legacy Actor API guidance.

Stars: 84.9K
Favorites: 0
Comments: 0
Added: Mar 29, 2026
Category: Code Generation
Install Command
npx skills add https://github.com/browser-use/browser-use --skill open-source
Curation Score

This skill scores 82/100, making it a solid directory-listing candidate: agents get a clear trigger boundary, a usable topic-to-file map, and substantial reference content for coding with the browser-use open-source library. Treat it as documentation lookup rather than a tightly guided end-to-end workflow.

82/100
Strengths
  • Strong triggerability: SKILL.md explicitly says when to use this skill and when to defer to the cloud or browser-use skills.
  • Good operational depth: reference files cover install/quickstart, models, agent config, browser config, tools, integrations, monitoring, and examples.
  • Trustworthy, concrete details: docs include Python snippets, parameter explanations, environment variables, and MCP/client configuration examples.
Cautions
  • The top-level skill is mostly a routing document; agents still need to choose and read the right reference file rather than follow one unified workflow.
  • No install command appears in SKILL.md itself, so basic setup depends on opening the referenced quickstart material.
Overview

Overview of open-source skill

What the open-source skill is for

The open-source skill is the documentation lookup skill for the Python browser-use library. It helps an agent answer implementation questions about Agent, Browser, tools, model setup, MCP integrations, monitoring, and the legacy Actor API without guessing from generic browser automation patterns.

This is most useful for developers who are writing or reviewing code that imports from browser_use, choosing a runtime setup, or debugging configuration details that are easy to get wrong from memory.

Best-fit users and jobs-to-be-done

Use the open-source skill when you need to:

  • install and configure the open-source browser-use Python library
  • pick an LLM backend and the right environment variables
  • write Agent(...) or Browser(...) code with valid parameters
  • add custom tools, hooks, or structured output
  • connect browser-use to MCP, skills, docs tooling, or observability
  • understand the legacy low-level Actor API

The real job is not “summarize the repo.” It is “help me produce correct browser_use code and config faster than I could by manually hunting across reference files.”

What makes this skill different from a generic prompt

A generic prompt may know browser automation broadly, but this skill is anchored to the repository’s own reference set:

  • references/quickstart.md
  • references/models.md
  • references/agent.md
  • references/browser.md
  • references/tools.md
  • references/actor.md
  • references/integrations.md
  • references/monitoring.md
  • references/examples.md

That matters because browser-use has product-specific classes, parameter names, env vars, cloud boundaries, and integration paths that are not interchangeable with Playwright, Selenium, or cloud-only Browser Use APIs.

Key boundary you should know before installing

This open-source skill is for the open-source Python library, not every Browser Use product surface.

Do use it for:

  • local or Python library usage
  • code generation for browser_use
  • setup questions around models, tools, hooks, browser sessions, and monitoring

Do not use it for:

  • Cloud API or SDK pricing and cloud product workflows
  • direct CLI browser automation requests better handled by the separate browser-use skill

If your task is “write Python code with `from browser_use import ...`”, this is the right fit.
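Concretely, here is a minimal sketch of such a script, assuming `pip install browser-use` and an `OPENAI_API_KEY` in the environment. The import names follow the examples shown elsewhere in this document; the import is guarded so the sketch can be read (and dry-run) even without the library installed.

```python
import asyncio

# Plain-data config you would pass to Agent(...), kept outside the guarded
# import so the sketch is inspectable without the library.
TASK = "Find the latest release notes for browser-use and summarize them."
MODEL = "gpt-4.1-mini"

try:
    # Import names follow this document's examples; check
    # references/quickstart.md for the authoritative form.
    from browser_use import Agent, ChatOpenAI

    async def main():
        agent = Agent(task=TASK, llm=ChatOpenAI(model=MODEL))
        result = await agent.run()
        print(result)

except ImportError:
    print("browser-use not installed; see references/quickstart.md first")
```

To execute the agent, call `asyncio.run(main())`; the sketch deliberately does not auto-run so it stays safe to import.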

How to Use open-source skill

Install context for open-source usage

Install the skill in a skills-enabled environment, then invoke it when your task involves the browser_use Python library.

A common add command pattern is:

npx skills add https://github.com/browser-use/browser-use --skill open-source

After install, use the skill as a reference layer while generating code, not as a standalone app. It is designed to guide code-writing and configuration decisions.

Read these files first before asking for code

If you want fast, accurate open-source usage, start with the file that matches your task instead of reading the whole repo:

  • install or first run: references/quickstart.md
  • choose model provider: references/models.md
  • write an agent: references/agent.md
  • configure browser sessions: references/browser.md
  • add tools: references/tools.md
  • need low-level deterministic control: references/actor.md
  • wire MCP or skills: references/integrations.md
  • add tracing or cost tracking: references/monitoring.md
  • copy working patterns: references/examples.md

This skill is strongest when the prompt names the topic explicitly.
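The topic-to-file routing above can be expressed as a small lookup table. This is a hypothetical helper that simply mirrors the list, not something the skill ships:

```python
# Map a task topic to the reference file the skill should read first.
ROUTES = {
    "install": "references/quickstart.md",
    "models": "references/models.md",
    "agent": "references/agent.md",
    "browser": "references/browser.md",
    "tools": "references/tools.md",
    "actor": "references/actor.md",
    "integrations": "references/integrations.md",
    "monitoring": "references/monitoring.md",
    "examples": "references/examples.md",
}

def reference_for(topic: str) -> str:
    # Fall back to the examples file when the topic is unrecognized.
    return ROUTES.get(topic, "references/examples.md")

print(reference_for("agent"))  # references/agent.md
```

Naming the topic explicitly in your prompt is what makes this routing deterministic instead of a guess.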

What input the open-source skill needs

Provide enough context for the skill to select the right reference file and generate working code. The highest-value inputs are:

  • your goal in one sentence
  • whether you want Agent, Browser, tools, or Actor API
  • your model provider, if known
  • whether execution is local, remote CDP, or cloud-connected
  • any constraints such as headless mode, auth, allowed domains, structured output, or observability

Weak input:

  • “Use browser-use for automation.”

Strong input:

  • “Write Python code using browser_use.Agent with ChatOpenAI(model="gpt-4.1-mini"), a non-headless Browser, allowed domains limited to example.com, and a Pydantic output schema.”

Turn a rough goal into a strong prompt

To get better code-generation results from the open-source skill, transform a vague request into a prompt with four parts:

  1. target API surface
  2. runtime assumptions
  3. output shape
  4. constraints

Example:

Use the open-source skill to write a Python example with `browser_use.Agent`.
Model: `ChatGoogle(model="gemini-flash-latest")`.
Browser: headless, custom window size, keep browser alive after run.
Task: log in, navigate to a dashboard, extract three metrics.
Return complete code plus required env vars and pip installs.

Why this works:

  • it points the skill toward agent.md, browser.md, and models.md
  • it avoids cloud/API confusion
  • it asks for code, setup, and operational details in one pass

Minimal open-source install path to ask for

If you are still deciding whether to adopt, ask the skill for the shortest working setup first:

  • Python install steps
  • the smallest runnable Agent example
  • one supported LLM option and its env var
  • any browser/runtime assumptions

The repo references show that model setup varies by provider, so “install browser-use” is not enough by itself. You also need the correct chat class and API key variable, such as BROWSER_USE_API_KEY, GOOGLE_API_KEY, or OPENAI_API_KEY.

Practical open-source usage patterns it supports well

The skill is strongest for these workflows:

  • generate a first Agent(...) script
  • compare model classes such as ChatBrowserUse, ChatGoogle, ChatOpenAI, or ChatAnthropic
  • configure Browser(...) options like headless, window_size, cdp_url, or domain restrictions
  • add custom tools and understand ActionResult
  • enable structured output with output_model_schema
  • set timeouts, retries, fallback LLMs, or hooks
  • add Laminar or OpenLIT monitoring
  • use the legacy Actor API for lower-level page and element control

Important constraints that affect output quality

The open-source skill has a few decision-critical constraints:

  • The Actor API is explicitly legacy and not the same as Playwright.
  • Browser is an alias of BrowserSession, which helps when reading examples.
  • Domain control uses allowed_domains and prohibited_domains patterns with specific matching rules.
  • Some features, such as loading skills via skills or skill_ids, require BROWSER_USE_API_KEY.
  • Cloud MCP setup exists, but that is not the same thing as the open-source Python library workflow.

These details are where generic prompting often fails.
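The allowed/prohibited domain idea can be illustrated with a small fnmatch-based sketch. This is an illustration of the concept only; the library documents its own, stricter matching rules, so do not assume these exact semantics:

```python
from fnmatch import fnmatch

def is_allowed(host: str, allowed: list[str], prohibited: tuple = ()) -> bool:
    """Illustrative policy: prohibited patterns veto, then allowed patterns admit."""
    if any(fnmatch(host, pattern) for pattern in prohibited):
        return False
    return any(fnmatch(host, pattern) for pattern in allowed)

# Note: "*.example.com" alone does not match the bare apex "example.com",
# which is why both patterns are listed.
print(is_allowed("docs.example.com", ["*.example.com", "example.com"]))  # True
print(is_allowed("evil.com", ["*.example.com", "example.com"]))          # False
```

The apex-domain gotcha in the comment is exactly the kind of detail worth confirming against the browser.md reference rather than assuming.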

Best workflow for open-source code generation

A practical workflow is:

  1. Ask for the smallest working example for your exact provider and task.
  2. Ask the skill to annotate every non-default parameter it adds.
  3. Run the example locally.
  4. If it fails, paste the traceback and your current code.
  5. Ask for a revised version using the relevant reference file.

This works better than asking for a “full production implementation” first, because many failures come from setup mismatch rather than missing business logic.

Example prompt that invokes the skill well

Use the open-source skill for browser-use.
I need Python code, not cloud API usage.
Please build a script that uses `Agent` with `ChatBrowserUse()`, runs headless,
extracts structured output into a Pydantic model, and tracks cost.
Also list the env vars, pip packages, and which reference docs you used.

That prompt gives the skill enough signal to combine agent.md, models.md, and monitoring.md.

When to use Actor API instead of Agent

Use Agent when you want goal-driven browsing with LLM planning.

Use the Actor API when you need deterministic low-level actions and can manage timing yourself. The references note important differences from Playwright, including immediate element returns and stricter evaluate() formatting. If your code assumes Playwright semantics, ask the skill to adapt the example specifically for Actor API behavior.

open-source skill FAQ

Is open-source only for installation help?

No. open-source covers install, setup, code generation, configuration, integrations, and debugging for the browser_use Python library. Installation is just the first step; the bigger value is getting correct parameter names, provider setup, and API-specific examples.

Is the open-source skill good for beginners?

Yes, if you ask for a minimal path. Beginners should request:

  • one provider
  • one short task
  • one complete script
  • env vars and install commands
  • explanation of each import

Avoid asking for tools, hooks, monitoring, and MCP in the first prompt unless you already know you need them.

How is this different from an ordinary prompt about browser automation?

An ordinary prompt may default to Playwright or Selenium assumptions. The open-source skill is better when you need repository-accurate browser_use details such as ChatBrowserUse, output_model_schema, domain restrictions, fallback LLM behavior, cloud-vs-open-source boundaries, or Actor API quirks.

When should I not use open-source?

Do not use it when your task is:

  • Browser Use Cloud pricing or cloud SDK guidance
  • generic browser automation without browser_use
  • direct command-style browser control better matched to another skill

If your request does not involve the Python library or Browser Use docs, this skill is probably the wrong tool.

Does open-source help with model selection?

Yes. The references include supported model providers and env vars across Browser Use, Google Gemini, OpenAI, Anthropic, Azure OpenAI, Bedrock, Groq, Ollama, and OpenAI-compatible APIs. This is one of the most practical reasons to use the skill before coding.

Can open-source help with production concerns?

Yes, within the library scope. It can guide you on retries, fallback LLMs, browser persistence, remote browser connection by cdp_url, monitoring with Laminar or OpenLIT, and performance-oriented example patterns like fast mode or parallel browsers.
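The fallback-LLM idea is a generic resilience pattern. A dependency-free sketch of it, using a hypothetical helper rather than any browser-use API (the library wires fallbacks through Agent configuration, which references/agent.md covers):

```python
def run_with_fallback(task, llms):
    """Try each LLM callable in order; return the first success."""
    last_err = None
    for llm in llms:
        try:
            return llm(task)
        except Exception as err:
            last_err = err  # remember the failure, try the next backend
    raise RuntimeError("all LLM backends failed") from last_err

# Stand-in callables for demonstration.
def flaky(task):
    raise TimeoutError("primary provider down")

def backup(task):
    return f"handled: {task}"

print(run_with_fallback("extract metrics", [flaky, backup]))  # handled: extract metrics
```

When asking the skill for real fallback behavior, name the primary and fallback chat classes explicitly so it grounds the answer in the documented parameters.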

How to Improve open-source skill

Give open-source a concrete implementation target

The fastest way to improve results is to specify exactly what code object you want:

  • “write an Agent example”
  • “configure a Browser with `cdp_url`”
  • “add a custom tool”
  • “return structured output”
  • “show Actor API page interaction”

This reduces reference-file drift and avoids mixed answers.

Include runtime and provider details up front

Many poor outputs come from omitted environment assumptions. State:

  • Python context
  • chosen model class
  • API key source
  • headless vs visible browser
  • local browser vs remote CDP
  • whether skills or MCP are required

Without that, the skill may return a plausible snippet that still cannot run in your setup.

Ask for a runnable example before abstractions

If you want reusable architecture, still ask for a runnable script first. Then iterate toward:

  • helper functions
  • config extraction
  • stronger schemas
  • tool registration
  • monitoring hooks

This catches install and import mistakes early, which is where most adoption friction happens.

Name the reference file you want the answer grounded in

A high-leverage prompt pattern is:

Use the open-source skill and ground the answer in `references/agent.md` and `references/browser.md`.

Do this when accuracy matters more than breadth. It helps the skill stay aligned with the repository’s actual API surface.

Common failure modes to watch for

The main adoption blockers are:

  • mixing cloud product guidance with open-source library code
  • assuming Playwright behavior in Actor API examples
  • missing provider env vars
  • asking for advanced features without naming the base setup
  • requesting “browser-use” help without saying whether you mean Agent, Browser, tools, or Actor API

If the first answer feels broad, narrow the API surface instead of asking for “more detail.”

Provide stronger inputs for better code generation

Better prompt:

Use the open-source skill to generate Python code with:
- `from browser_use import Agent, Browser, ChatOpenAI`
- model `gpt-4.1-mini`
- headless browser
- `allowed_domains=["example.com"]`
- structured output via Pydantic
- cost tracking enabled
Return install steps, env vars, and a short explanation of each parameter.

This works because every requested feature maps cleanly to documented references.
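For the “structured output via Pydantic” part of that prompt, the shape you are after looks like the following. A dataclass stands in here purely to keep the sketch dependency-free; browser-use expects a Pydantic model passed via `output_model_schema` per the constraints listed earlier:

```python
from dataclasses import dataclass, asdict

# Stand-in for the Pydantic model you would pass as output_model_schema.
# In real code, subclass pydantic.BaseModel instead of using a dataclass.
@dataclass
class DashboardMetrics:
    active_users: int
    error_rate: float
    revenue_usd: float

# The agent would populate this from the page; values here are illustrative.
m = DashboardMetrics(active_users=120, error_rate=0.02, revenue_usd=999.5)
print(asdict(m))
```

Defining the schema first, before prompting, is what lets the skill map your request cleanly onto the structured-output documentation.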

Iterate after the first output

After you get an initial answer, improve it by asking one of these:

  • “Remove everything non-essential and keep it runnable.”
  • “Adapt this to ChatBrowserUse() instead of OpenAI.”
  • “Add a custom tool and explain where it plugs into the agent.”
  • “Switch from Agent to Actor API for deterministic control.”
  • “Add monitoring with OpenLIT only.”

These focused revisions usually outperform a single giant prompt.

Use open-source as a doc router, not just a summary tool

The best use of open-source is as a routing layer to the right internal docs. Treat it as a fast path to the exact reference you need, then ask for code grounded in that file. That is where the skill adds real value over a generic prompt or a quick repo skim.
