electron
by vercel-labs
Automate existing Electron desktop apps like VS Code, Slack, Discord, Figma, Notion, and Spotify via agent-browser and Chrome DevTools Protocol (CDP). This skill helps you connect to a running Electron app, take snapshots, and interact with its UI as part of end-to-end desktop and workflow automation.
Overview
What the electron skill does
The electron skill lets an agent automate existing Electron desktop applications using agent-browser and the Chrome DevTools Protocol (CDP). Many popular tools such as VS Code, Slack, Discord, Figma, Notion, and Spotify are built with Electron and run on Chromium internally. When such an app is started with a DevTools port exposed, it can be controlled much like a website in a browser.
With this skill, the agent can:
- Launch an Electron app with remote debugging enabled
- Connect to the app’s CDP port via agent-browser
- Take snapshots to understand the current UI
- Interact with elements (click, type, navigate) using stable element references
- Re-snapshot after state changes to continue a multi-step workflow
Who this skill is for
Use the electron skill if you:
- Need to automate workflows inside an Electron desktop app (e.g., send a Slack message, navigate a Notion workspace, trigger a VS Code command)
- Want to include Electron apps in end-to-end tests or regression checks
- Need cross-app workflow automation, using Electron apps alongside web automation
- Prefer CLI-based automation built on CDP rather than GUI recorders or proprietary APIs
It fits teams and individuals who are already comfortable with:
- Running shell commands (macOS, Linux) via Bash
- Using agent-browser or similar CDP tooling
- Treating Electron apps as “browser targets” instead of scripting native OS widgets directly
When electron is not a good fit
The electron skill is not ideal if:
- The app you want to control is not an Electron app and does not expose a DevTools port
- You need deep, OS-level interactions outside the app window (system dialogs, file pickers that are not rendered by Electron, etc.)
- You want to build Electron apps (this skill is for automating existing apps, not developing new ones)
- You require zero-terminal, fully GUI-based automation tools
In those cases, you may want a different desktop automation or OS-native scripting solution.
How the automation model works
Under the hood, the electron skill uses the same snapshot/interact pattern as web automation in agent-browser:
- Launch with --remote-debugging-port so the Electron app exposes CDP
- Connect to that port from agent-browser
- Snapshot to capture the current DOM / accessibility tree
- Interact with UI elements using agent-browser commands and element refs
- Re-snapshot after each major state or navigation change
Because this is CDP-based, the agent sees the app much as it would a browser page, which makes flows repeatable and scriptable across sessions.
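The five steps above can be sketched as a single shell function. This is a minimal illustration, not the skill's own implementation: it assumes macOS, Slack as the target, port 9222, a crude fixed startup wait, and @e5 as a placeholder ref. The run helper, the DRY_RUN and STARTUP_WAIT variables, and the electron_flow name are hypothetical.

```shell
#!/bin/sh
# Sketch of the launch -> connect -> snapshot -> interact -> re-snapshot loop.
# Assumptions: macOS, Slack as the target app, port 9222, and @e5 as a
# placeholder element ref (real refs come from your own snapshot output).

PORT="${PORT:-9222}"
APP="${APP:-Slack}"

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# electron_flow [REF]: walk the five steps once, stopping at the first failure.
electron_flow() {
  run open -a "$APP" --args "--remote-debugging-port=$PORT" || return 1  # 1. launch
  run sleep "${STARTUP_WAIT:-5}" || return 1   # crude wait for the app to start
  run agent-browser connect "$PORT" || return 1                          # 2. connect
  run agent-browser snapshot -i || return 1                              # 3. snapshot
  run agent-browser click "${1:-@e5}" || return 1                        # 4. interact
  run agent-browser snapshot -i                                          # 5. re-snapshot
}

# Preview the commands without touching any app:
#   DRY_RUN=1 electron_flow @e5
```

Running it with DRY_RUN=1 only prints the commands, which is a convenient way to review a flow before letting it drive a real app.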
How to Use
1. Install the electron skill
To make the electron skill available to your agent environment, install it from the vercel-labs/agent-browser repository:
npx skills add https://github.com/vercel-labs/agent-browser --skill electron
This pulls in the electron skill definition and allows the agent to use:
Bash(agent-browser:*)
Bash(npx agent-browser:*)
You should also have agent-browser installed or available via npx so the commands used by this skill can be executed.
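One way to check which of the two forms is available is sketched below. It assumes nothing beyond a POSIX shell; the function name resolve_agent_browser is hypothetical and not part of the skill.

```shell
# resolve_agent_browser: print the command prefix to use for agent-browser.
# Prefers a directly installed binary and falls back to npx.
resolve_agent_browser() {
  if command -v agent-browser >/dev/null 2>&1; then
    echo "agent-browser"
  elif command -v npx >/dev/null 2>&1; then
    echo "npx agent-browser"
  else
    echo "neither agent-browser nor npx found on PATH" >&2
    return 1
  fi
}

# Usage (word splitting of $AB is intentional):
#   AB="$(resolve_agent_browser)" && $AB connect 9222
```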
2. Confirm prerequisites
Before running flows that rely on the electron skill, confirm:
- You have macOS or Linux with a shell where you can run open, bash, or equivalent commands
- The target application is actually Electron-based (Slack, VS Code, Discord, Figma, Notion, Spotify, and many others)
- You can start the app with the --remote-debugging-port flag (this is built into Chromium/Electron)
If you cannot start the app with that flag, the agent will not be able to connect with CDP.
3. Launch an Electron app with CDP enabled
The core requirement is to start the app with a remote debugging port:
# Example: Slack on macOS
open -a "Slack" --args --remote-debugging-port=9222
This pattern applies to other Electron apps as well; adjust the app name as needed.
Once launched, the app exposes a CDP endpoint on the specified port (here, 9222).
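As a rough sketch of adjusting the launch per platform, the helper below prints the command it would use rather than running it. The function name launch_cmd is hypothetical, and the Linux form assumes the app ships a binary on PATH (for example one named slack), which varies by distribution and package.

```shell
# launch_cmd APP_NAME LINUX_BINARY PORT [OS]
# Prints the command that would start the app with CDP enabled.
# OS defaults to the output of `uname -s`; pass it explicitly to preview
# another platform.
launch_cmd() {
  app="$1"; bin="$2"; port="$3"; os="${4:-$(uname -s)}"
  case "$os" in
    Darwin) printf 'open -a "%s" --args --remote-debugging-port=%s\n' "$app" "$port" ;;
    Linux)  printf '%s --remote-debugging-port=%s\n' "$bin" "$port" ;;
    *)      echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Usage: review the command, then run it:
#   launch_cmd Slack slack 9222
#   sh -c "$(launch_cmd Slack slack 9222)"
```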
4. Connect agent-browser to the running app
With the app running under remote debugging, connect using agent-browser:
agent-browser connect 9222
After a successful connection, you can run the usual snapshot and interaction commands against the Electron app window.
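Desktop apps can take a few seconds to start, so connecting immediately after launch may fail. A minimal sketch of waiting for the endpoint first is shown below. It assumes curl is installed and relies on the /json/version HTTP endpoint that Chromium-based targets generally serve on the debugging port; the function name wait_for_cdp and its defaults are hypothetical.

```shell
# wait_for_cdp PORT [ATTEMPTS] [DELAY_SECONDS]
# Polls the DevTools HTTP endpoint until it answers or attempts run out.
wait_for_cdp() {
  port="$1"; attempts="${2:-20}"; delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -sf --max-time 2 "http://127.0.0.1:${port}/json/version" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no CDP endpoint on port ${port} after ${attempts} attempts" >&2
  return 1
}

# Usage: only connect once the endpoint responds.
#   wait_for_cdp 9222 && agent-browser connect 9222
```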
5. Run the standard snapshot–interact workflow
Now you can treat the Electron app much like a browser page:
# Discover interactive elements
agent-browser snapshot -i
# Click a specific element reference (example: @e5)
agent-browser click @e5
# Capture a screenshot of the current window
agent-browser screenshot slack-desktop.png
In a typical agent-run workflow, the agent will:
- Call snapshot to understand the current UI state
- Choose elements by their references (e.g., @e5, @e12) to click or type into
- Use snapshot again after any major change (navigation, modal open/close, etc.)
6. Integrate into larger desktop and workflow automation
The electron skill is especially useful when you need to chain multiple apps together. For example, an agent can:
- Retrieve data from a web app in Chrome
- Open Slack (Electron) and post a formatted status message
- Switch to VS Code (Electron) to trigger a build or run a task
Because everything runs through CDP and agent-browser, you can script this from the CLI or allow an LLM-based agent to orchestrate it automatically.
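When several Electron apps are involved, each needs its own debugging port. The sketch below prints one connect command per app. It assumes agent-browser's --session option for keeping connections separate, which this page does not describe, so treat that flag as an assumption to verify against your installed version. The plan_sessions name and the slack/vscode labels are hypothetical.

```shell
# plan_sessions NAME:PORT ...
# Prints one connect command per app, each in its own named session.
# Assumption: `--session NAME` is accepted by your agent-browser version.
plan_sessions() {
  for pair in "$@"; do
    name="${pair%%:*}"   # text before the first colon
    port="${pair##*:}"   # text after the last colon
    printf 'agent-browser --session %s connect %s\n' "$name" "$port"
  done
}

# Usage: each app must already be running with its own port.
#   plan_sessions slack:9222 vscode:9223
```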
7. Adapting to your environment
While the repository examples focus on the generic Electron pattern, you should adapt the approach to your specific:
- Applications (Slack, Discord, Notion, custom in-house Electron tools)
- Ports (choose a free port; 9222 is common but not required)
- OS commands (use open on macOS, appropriate launch commands on Linux)
Whenever you adjust launch commands, keep the --remote-debugging-port flag intact so the electron skill can still connect via agent-browser.
FAQ
Is the electron skill only for Slack and VS Code?
No. The electron skill works with any Electron app that can be started with --remote-debugging-port. Slack and VS Code are common examples, but the same pattern applies to Discord, Figma, Notion, Spotify, and many other Electron-based tools.
How does electron know it is talking to an Electron app and not a website?
From the automation point of view, Electron apps expose a Chrome DevTools Protocol interface similar to a Chromium browser. Once connected to the specified port, agent-browser interacts with the target as if it were a browser page. The electron skill simply assumes that the port corresponds to an Electron-based Chromium instance.
Do I need to modify the Electron app’s source code?
No. You do not need to change the app’s source code. You only need to launch the existing app with the --remote-debugging-port flag so CDP is exposed. This works for packaged, off-the-shelf Electron applications as long as the OS launch command allows additional arguments.
Can the electron skill automate system dialogs or non-Electron windows?
The electron skill is focused on Electron windows that are backed by Chromium and accessible via CDP. OS-native dialogs or windows outside of Electron are generally not visible through this interface. For that kind of automation, you would need a separate OS-level automation tool.
What commands does the skill rely on?
According to the skill metadata, electron is allowed to use:
Bash(agent-browser:*)
Bash(npx agent-browser:*)
This means the agent can run agent-browser commands directly or via npx, including connect, snapshot, click, screenshot, and other supported subcommands.
How do I troubleshoot connection issues on the CDP port?
If the agent cannot connect:
- Check that the app was started with --remote-debugging-port=<port>
- Confirm the port number used in agent-browser connect matches the launch command
- Verify that only one instance of the app is running; a second launch usually hands off to the already-running instance, which ignores the new flag, so quit the app fully and relaunch with the debugging flag
If the port is blocked or already in use, choose another available port and update both the launch and connect commands.
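Picking another port can be scripted. The following is a sketch under the assumption that lsof is installed (it is by default on macOS and on many Linux systems); the function names port_in_use and find_free_port are hypothetical.

```shell
# port_in_use PORT: succeeds when something is listening on PORT.
# Assumes lsof is installed; if it is missing, every port looks free.
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# find_free_port START [SPAN]: print the first free port in START..START+SPAN.
find_free_port() {
  port="$1"
  last=$(( $1 + ${2:-20} ))
  while [ "$port" -le "$last" ]; do
    if ! port_in_use "$port"; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  echo "no free port between $1 and $last" >&2
  return 1
}

# Usage: reuse the same value for launch and connect.
#   PORT="$(find_free_port 9222)"
#   open -a "Slack" --args --remote-debugging-port="$PORT"
#   agent-browser connect "$PORT"
```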
When should I choose another skill instead of electron?
Choose another skill when:
- The target is a regular website in a browser (use a browser-focused automation skill instead)
- You need OS-level actions like file management, system preferences, or non-Electron apps
- You are primarily building, not automating, Electron applications
The electron skill is most effective when you specifically want CDP-based automation of an existing Electron desktop app as part of broader desktop or workflow automation.
