open-source
by browser-use

Documentation lookup for the browser-use Python library. The open-source skill helps with install, setup, Agent and Browser code, model env vars, tools, MCP integrations, monitoring, and legacy Actor API guidance.
This skill scores 82/100, which means it is a solid directory listing candidate: agents get a clear trigger boundary, a usable topic-to-file map, and substantial reference content for coding with the browser-use open-source library, though users should view it as documentation lookup rather than a tightly guided end-to-end workflow.
- Strong triggerability: SKILL.md explicitly says when to use this skill and when to defer to the cloud or browser-use skills.
- Good operational depth: reference files cover install/quickstart, models, agent config, browser config, tools, integrations, monitoring, and examples.
- Trustworthy, concrete details: docs include Python snippets, parameter explanations, environment variables, and MCP/client configuration examples.
- The top-level skill is mostly a routing document; agents still need to choose and read the right reference file rather than follow one unified workflow.
- No install command appears in SKILL.md itself, so basic setup depends on opening the referenced quickstart material.
Overview of open-source skill
What the open-source skill is for
The open-source skill is the documentation lookup skill for the Python browser-use library. It helps an agent answer implementation questions about Agent, Browser, tools, model setup, MCP integrations, monitoring, and the legacy Actor API without guessing from generic browser automation patterns.
This is most useful for developers who are writing or reviewing code that imports from `browser_use`, choosing a runtime setup, or debugging configuration details that are easy to get wrong from memory.
Best-fit users and jobs-to-be-done
Use the open-source skill when you need to:
- install and configure the open-source `browser-use` Python library
- pick an LLM backend and the right environment variables
- write `Agent(...)` or `Browser(...)` code with valid parameters
- add custom tools, hooks, or structured output
- connect browser-use to MCP, skills, docs tooling, or observability
- understand the legacy low-level Actor API
The real job is not “summarize the repo.” It is “help me produce correct `browser_use` code and config faster than I could by manually hunting across reference files.”
What makes this skill different from a generic prompt
A generic prompt may know browser automation broadly, but this skill is anchored to the repository’s own reference set:
- `references/quickstart.md`
- `references/models.md`
- `references/agent.md`
- `references/browser.md`
- `references/tools.md`
- `references/actor.md`
- `references/integrations.md`
- `references/monitoring.md`
- `references/examples.md`
That matters because browser-use has product-specific classes, parameter names, env vars, cloud boundaries, and integration paths that are not interchangeable with Playwright, Selenium, or cloud-only Browser Use APIs.
Key boundary you should know before installing
This open-source skill is for the open-source Python library, not every Browser Use product surface.
Do use it for:
- local or Python library usage
- code generation for `browser_use`
- setup questions around models, tools, hooks, browser sessions, and monitoring
Do not use it for:
- Cloud API or SDK pricing and cloud product workflows
- direct CLI browser automation requests better handled by the separate browser-use skill
If your task is “write Python code with `from browser_use import ...`,” this is the right fit.
How to Use open-source skill
Install context for open-source usage
Install the skill in a skills-enabled environment, then invoke it when your task involves the `browser_use` Python library.
A common add command pattern is:
npx skills add https://github.com/browser-use/browser-use --skill open-source
After install, use the skill as a reference layer while generating code, not as a standalone app. It is designed to guide code-writing and configuration decisions.
Read these files first before asking for code
If you want fast, accurate open-source usage, start with the file that matches your task instead of reading the whole repo:
- install or first run: `references/quickstart.md`
- choose model provider: `references/models.md`
- write an agent: `references/agent.md`
- configure browser sessions: `references/browser.md`
- add tools: `references/tools.md`
- need low-level deterministic control: `references/actor.md`
- wire MCP or skills: `references/integrations.md`
- add tracing or cost tracking: `references/monitoring.md`
- copy working patterns: `references/examples.md`
This skill is strongest when the prompt names the topic explicitly.
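The topic-to-file routing above can be expressed as a plain lookup table. This is purely illustrative code of my own (the skill ships no such module); only the reference file paths come from the skill itself:

```python
# Illustrative routing table: maps a task topic to the reference file
# the open-source skill would consult. The topic keys are paraphrased
# from the list above; this is not code shipped with the skill.
REFERENCE_MAP = {
    "install": "references/quickstart.md",
    "models": "references/models.md",
    "agent": "references/agent.md",
    "browser": "references/browser.md",
    "tools": "references/tools.md",
    "actor": "references/actor.md",
    "integrations": "references/integrations.md",
    "monitoring": "references/monitoring.md",
    "examples": "references/examples.md",
}


def route(topic: str) -> str:
    """Return the reference file for a topic, defaulting to the examples."""
    return REFERENCE_MAP.get(topic, "references/examples.md")
```

Naming the topic in your prompt is doing exactly this lookup for the skill, which is why explicit topics produce better answers.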
What input the open-source skill needs
Provide enough context for the skill to select the right reference file and generate working code. The highest-value inputs are:
- your goal in one sentence
- whether you want `Agent`, `Browser`, tools, or the Actor API
- your model provider, if known
- whether execution is local, remote CDP, or cloud-connected
- any constraints such as headless mode, auth, allowed domains, structured output, or observability
Weak input:
- “Use browser-use for automation.”
Strong input:
- “Write Python code using `browser_use.Agent` with `ChatOpenAI(model="gpt-4.1-mini")`, a non-headless `Browser`, allowed domains limited to `example.com`, and a Pydantic output schema.”
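As a hedged sketch only: assuming a recent browser-use version where `Agent`, `Browser`, and `ChatOpenAI` are importable from `browser_use`, the strong input above might translate into something like the following. Signatures change between releases, so verify every parameter against `references/agent.md` and `references/browser.md` before relying on it:

```python
# Hedged sketch of the "strong input" example. The parameter names
# (headless, allowed_domains, output_model_schema) are the ones this
# skill documents; exact signatures vary by browser-use version.
import asyncio

from pydantic import BaseModel
from browser_use import Agent, Browser, ChatOpenAI


class PageSummary(BaseModel):
    title: str
    summary: str


async def main() -> None:
    browser = Browser(
        headless=False,                   # non-headless, as requested
        allowed_domains=["example.com"],  # restrict navigation
    )
    agent = Agent(
        task="Open example.com and summarize the page.",
        llm=ChatOpenAI(model="gpt-4.1-mini"),  # needs OPENAI_API_KEY
        browser=browser,
        output_model_schema=PageSummary,  # Pydantic structured output
    )
    await agent.run()


if __name__ == "__main__":
    asyncio.run(main())
```

Notice that every line of the sketch maps back to a clause of the one-sentence strong input; that is the level of specificity the skill rewards.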
Turn a rough goal into a strong prompt
To get better code-generation results from the open-source skill, transform a vague request into a prompt with four parts:
- target API surface
- runtime assumptions
- output shape
- constraints
Example:
Use the open-source skill to write a Python example with `browser_use.Agent`.
Model: `ChatGoogle(model="gemini-flash-latest")`.
Browser: headless, custom window size, keep browser alive after run.
Task: log in, navigate to a dashboard, extract three metrics.
Return complete code plus required env vars and pip installs.
Why this works:
- it points the skill toward `agent.md`, `browser.md`, and `models.md`
- it avoids cloud/API confusion
- it asks for code, setup, and operational details in one pass
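The four-part structure is easy to template. A minimal stdlib sketch (the function and field names are my own, not part of the skill):

```python
def build_prompt(api_surface: str, runtime: str,
                 output_shape: str, constraints: str) -> str:
    """Assemble the four-part prompt described above.

    The four fields mirror the list: target API surface, runtime
    assumptions, output shape, and constraints.
    """
    return (
        f"Use the open-source skill. Target: {api_surface}. "
        f"Runtime: {runtime}. "
        f"Return: {output_shape}. "
        f"Constraints: {constraints}."
    )


# Example usage, echoing the Gemini prompt above:
prompt = build_prompt(
    "browser_use.Agent with ChatGoogle(model='gemini-flash-latest')",
    "headless browser, custom window size, keep alive after run",
    "complete code plus required env vars and pip installs",
    "log in, navigate to a dashboard, extract three metrics",
)
```

Filling all four slots, even tersely, is what keeps the skill from drifting into cloud-API territory.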
Minimal open-source install path to ask for
If you are still deciding whether to adopt, ask the skill for the shortest working setup first:
- Python install steps
- the smallest runnable `Agent` example
- one supported LLM option and its env var
- any browser/runtime assumptions
The repo references show that model setup varies by provider, so “install browser-use” is not enough by itself. You also need the correct chat class and API key variable, such as `BROWSER_USE_API_KEY`, `GOOGLE_API_KEY`, or `OPENAI_API_KEY`.
Practical open-source usage patterns it supports well
The skill is strongest for these workflows:
- generate a first `Agent(...)` script
- compare model classes such as `ChatBrowserUse`, `ChatGoogle`, `ChatOpenAI`, or `ChatAnthropic`
- configure `Browser(...)` options like `headless`, `window_size`, `cdp_url`, or domain restrictions
- add custom tools and understand `ActionResult`
- enable structured output with `output_model_schema`
- set timeouts, retries, fallback LLMs, or hooks
- add Laminar or OpenLIT monitoring
- use the legacy Actor API for lower-level page and element control
Important constraints that affect output quality
The open-source skill has a few decision-critical constraints:
- The Actor API is explicitly legacy and not the same as Playwright.
- `Browser` is an alias of `BrowserSession`, which helps when reading examples.
- Domain control uses `allowed_domains` and `prohibited_domains` patterns with specific matching rules.
- Some features, such as loading skills via `skills` or `skill_ids`, require `BROWSER_USE_API_KEY`.
- Cloud MCP setup exists, but that is not the same thing as the open-source Python library workflow.
These details are where generic prompting often fails.
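To illustrate why the matching rules matter, here is a glob-style domain check. This is my own illustration built on `fnmatch`, not browser-use's actual matcher; consult `references/browser.md` for the real rules:

```python
from fnmatch import fnmatch


def domain_allowed(host: str, allowed: list[str],
                   prohibited: list[str]) -> bool:
    """Illustrative glob-style check (NOT browser-use's real matcher).

    A host must match some allowed pattern and no prohibited pattern.
    Note that a pattern like "*.example.com" matches subdomains but
    not the bare domain -- exactly the kind of edge case worth
    confirming in references/browser.md before relying on it.
    """
    if any(fnmatch(host, pattern) for pattern in prohibited):
        return False
    return any(fnmatch(host, pattern) for pattern in allowed)
```

Even in this toy version, `"*.example.com"` rejects `"example.com"` itself, which is why skimming the documented matching rules beats guessing.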
Best workflow for open-source code generation
A practical workflow is:
- Ask for the smallest working example for your exact provider and task.
- Ask the skill to annotate every non-default parameter it adds.
- Run the example locally.
- If it fails, paste the traceback and your current code.
- Ask for a revised version using the relevant reference file.
This works better than asking for a “full production implementation” first, because many failures come from setup mismatch rather than missing business logic.
Example prompt that invokes the skill well
Use the open-source skill for browser-use.
I need Python code, not cloud API usage.
Please build a script that uses `Agent` with `ChatBrowserUse()`, runs headless,
extracts structured output into a Pydantic model, and tracks cost.
Also list the env vars, pip packages, and which reference docs you used.
That prompt gives the skill enough signal to combine `agent.md`, `models.md`, and `monitoring.md`.
When to use Actor API instead of Agent
Use `Agent` when you want goal-driven browsing with LLM planning.
Use the Actor API when you need deterministic low-level actions and can manage timing yourself. The references note important differences from Playwright, including immediate element returns and stricter `evaluate()` formatting. If your code assumes Playwright semantics, ask the skill to adapt the example specifically for Actor API behavior.
open-source skill FAQ
Is open-source only for installation help?
No. open-source covers install, setup, code generation, configuration, integrations, and debugging for the `browser_use` Python library. Installation is just the first step; the bigger value is getting correct parameter names, provider setup, and API-specific examples.
Is the open-source skill good for beginners?
Yes, if you ask for a minimal path. Beginners should request:
- one provider
- one short task
- one complete script
- env vars and install commands
- explanation of each import
Avoid asking for tools, hooks, monitoring, and MCP in the first prompt unless you already know you need them.
How is this different from an ordinary prompt about browser automation?
An ordinary prompt may default to Playwright or Selenium assumptions. The open-source skill is better when you need repository-accurate `browser_use` details such as `ChatBrowserUse`, `output_model_schema`, domain restrictions, fallback LLM behavior, cloud-vs-open-source boundaries, or Actor API quirks.
When should I not use open-source?
Do not use it when your task is:
- Browser Use Cloud pricing or cloud SDK guidance
- generic browser automation without `browser_use`
- direct command-style browser control better matched to another skill
If your request does not involve the Python library or Browser Use docs, this skill is probably the wrong tool.
Does open-source help with model selection?
Yes. The references include supported model providers and env vars across Browser Use, Google Gemini, OpenAI, Anthropic, Azure OpenAI, Bedrock, Groq, Ollama, and OpenAI-compatible APIs. This is one of the most practical reasons to use the skill before coding.
Can open-source help with production concerns?
Yes, within the library scope. It can guide you on retries, fallback LLMs, browser persistence, remote browser connection via `cdp_url`, monitoring with Laminar or OpenLIT, and performance-oriented example patterns like fast mode or parallel browsers.
How to Improve open-source skill
Give open-source a concrete implementation target
The fastest way to improve results is to specify exactly what code object you want:
- “write an `Agent` example”
- “configure a `Browser` with `cdp_url`”
- “add a custom tool”
- “return structured output”
- “show Actor API page interaction”
This reduces reference-file drift and avoids mixed answers.
Include runtime and provider details up front
Many poor outputs come from omitted environment assumptions. State:
- Python context
- chosen model class
- API key source
- headless vs visible browser
- local browser vs remote CDP
- whether skills or MCP are required
Without that, the skill may return a plausible snippet that still cannot run in your setup.
Ask for a runnable example before abstractions
If you want reusable architecture, still ask for a runnable script first. Then iterate toward:
- helper functions
- config extraction
- stronger schemas
- tool registration
- monitoring hooks
This catches install and import mistakes early, which is where most adoption friction happens.
Name the reference file you want the answer grounded in
A high-leverage prompt pattern is:
Use the open-source skill and ground the answer in `references/agent.md` and `references/browser.md`.
Do this when accuracy matters more than breadth. It helps the skill stay aligned with the repository’s actual API surface.
Common failure modes to watch for
The main adoption blockers are:
- mixing cloud product guidance with open-source library code
- assuming Playwright behavior in Actor API examples
- missing provider env vars
- asking for advanced features without naming the base setup
- requesting “browser-use” help without saying whether you mean Agent, Browser, tools, or Actor API
If the first answer feels broad, narrow the API surface instead of asking for “more detail.”
Provide stronger inputs for better code generation
Better prompt:
Use the open-source skill to generate Python code with:
- `from browser_use import Agent, Browser, ChatOpenAI`
- model `gpt-4.1-mini`
- headless browser
- `allowed_domains=["example.com"]`
- structured output via Pydantic
- cost tracking enabled
Return install steps, env vars, and a short explanation of each parameter.
This works because every requested feature maps cleanly to documented references.
Iterate after the first output
After you get an initial answer, improve it by asking one of these:
- “Remove everything non-essential and keep it runnable.”
- “Adapt this to `ChatBrowserUse()` instead of OpenAI.”
- “Add a custom tool and explain where it plugs into the agent.”
- “Switch from Agent to Actor API for deterministic control.”
- “Add monitoring with OpenLIT only.”
These focused revisions usually outperform a single giant prompt.
Use open-source as a doc router, not just a summary tool
The best use of open-source is as a routing layer to the right internal docs. Treat it as a fast path to the exact reference you need, then ask for code grounded in that file. That is where the skill adds real value over a generic prompt or a quick repo skim.
