screenshot

by openai

The screenshot skill helps capture a full screen, an app window, or a pixel region when you need an OS-level image instead of a browser-only capture. It suits Workflow Automation tasks and covers save-location rules, macOS permission handling, and install guidance for reliable desktop captures.

Added: May 8, 2026

Category: Workflow Automation

Install Command

npx skills add openai/skills --skill screenshot

Curation Score

This skill scores 78/100, which makes it a solid listing candidate for directory users who need reliable screenshot capture guidance. The repository clearly defines when to use it, provides operational scripts for macOS and cross-platform capture, and includes preflight steps that reduce guesswork. A clearer quick-start and an explicit install command in SKILL.md would make the install decision easier.

Strengths

Explicit trigger guidance: use when a user asks for a desktop/system screenshot or when tool-specific capture is unavailable.
Operational depth: includes dedicated scripts for macOS, Windows, and Python-based capture, plus macOS permission preflight to reduce repeated prompts.
Good agent leverage: the skill specifies save-location rules and tool priority, helping an agent choose the right capture path with less ambiguity.

Cautions

No install command in SKILL.md, so adopters may need to infer setup rather than follow a one-step install flow.
The excerpt is strong on mechanics but still leaves some platform-specific execution details to the scripts, which may require inspection for edge cases.

Tags: Screenshot, Screenshots, Browser Automation, Desktop Automation, macOS, Windows

Overview of screenshot skill

What the screenshot skill does

The screenshot skill helps an agent capture the right desktop image when a task needs a whole screen, a specific window, or a region of an app, saved to a known file path. It is the right fit when you need a real OS-level screenshot rather than a browser-only capture, a design-tool capture, or a generic prompt answer.

When it is the right install

Install screenshot if your workflow includes desktop apps, multi-window review, OS UI debugging, or cases where tool-specific capture is unavailable. It is especially useful for Workflow Automation jobs that need visual proof, handoff artifacts, or pixel-accurate references.

What makes it different

This screenshot skill is decision-oriented, not just a command wrapper. It encodes capture priority, save-location rules, and macOS permission handling so the agent can choose a workable path faster and with fewer prompts. That reduces guesswork when the user only says “take a screenshot” or gives an incomplete target.

How to Use screenshot skill

Install and locate the core files

Install with npx skills add openai/skills --skill screenshot. Then read SKILL.md first, followed by scripts/take_screenshot.py, scripts/ensure_macos_permissions.sh, and agents/openai.yaml. If you need platform-specific behavior, inspect the Swift helpers in scripts/ before assuming the capture path.

Give the skill a complete capture brief

A strong screenshot request names four things: target, area, output path, and constraints. For example: “Capture the active Photoshop window and save it to /tmp/review.png” or “Take a full-screen screenshot of display 2 in the default screenshot folder.” If the path is omitted, the skill saves to the OS default screenshot location; if Codex only needs the image for inspection, it should save to a temp directory instead.
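The four-part brief and the save-location rule can be sketched in a few lines of Python. This is an illustrative sketch only: `CaptureBrief`, `resolve_output_path`, the `default_dir` parameter, and the `screenshot.png` filename are hypothetical names, not part of the skill's scripts, and the precedence (explicit path, then temp for inspection, then OS default) is one reading of the rule above.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass
class CaptureBrief:
    """Hypothetical container for the four parts of a capture request."""
    target: str                          # e.g. "active Photoshop window"
    area: str = "window"                 # "screen", "window", "app", or "region"
    output_path: Optional[str] = None    # None means the user gave no path
    constraints: Tuple[str, ...] = ()    # e.g. ("display 2",)


def resolve_output_path(brief, default_dir, for_inspection=False,
                        filename="screenshot.png"):
    """Pick a save location; default_dir stands in for the OS default folder."""
    if brief.output_path:
        # An explicit path in the request always wins.
        return Path(brief.output_path)
    if for_inspection:
        # The agent only needs to look at the image, so use a temp location.
        return Path(tempfile.gettempdir()) / filename
    # Otherwise fall back to the OS default screenshot folder.
    return Path(default_dir) / filename
```

For the first example request above, `resolve_output_path(CaptureBrief("active Photoshop window", output_path="/tmp/review.png"), default_dir)` would return `/tmp/review.png` regardless of the default folder.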

Use the right workflow for the platform

For browsers, Figma, or Electron, prefer their native or tool-specific screenshot path first. Use this skill when you need the whole desktop, when app/window capture is the real requirement, or when another tool cannot capture what matters. On macOS, run the permission preflight before window/app capture to avoid repeated Screen Recording prompts.
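The platform guidance above amounts to a small decision procedure. The sketch below is one hypothetical way to express it; `plan_capture`, the step names, and the surface labels are assumptions made for illustration, and the actual skill may order or name its steps differently.

```python
# Surfaces that, per the guidance above, usually have their own capture path.
TOOL_NATIVE_SURFACES = {"browser", "figma", "electron"}


def plan_capture(surface, area, platform_name, tool_capture_available=True):
    """Return an ordered list of hypothetical step names for a capture.

    surface: what is being captured, e.g. "browser" or "desktop_app"
    area: "screen", "window", "app", or "region"
    platform_name: a value in the style of sys.platform ("darwin" on macOS)
    """
    wants_whole_desktop = area == "screen"
    if (surface in TOOL_NATIVE_SURFACES
            and not wants_whole_desktop
            and tool_capture_available):
        # Prefer the native or tool-specific screenshot path first.
        return ["tool_specific_capture"]

    steps = []
    if platform_name == "darwin" and area in {"window", "app"}:
        # Run the permission preflight before window/app capture on macOS
        # (the source names scripts/ensure_macos_permissions.sh for this).
        steps.append("macos_permission_preflight")
    steps.append("os_level_capture")
    return steps
```

Under these assumptions, a window capture of a desktop app on macOS yields the preflight step followed by the OS-level capture, while a browser window with a working tool-specific path never reaches this skill.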

Practical prompt pattern

A good prompt is specific enough to trigger the correct helper and produce the expected output. Example: “Use the screenshot skill to capture the left half of the editor window on macOS, then save it to the default screenshot location.” If you need a region, provide coordinates in x,y,w,h form and say whether the region is relative to the screen or to a window.
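To show why the x,y,w,h form and the screen-versus-window distinction matter, here is a minimal parsing sketch. `Region`, `parse_region`, and `to_screen_coords` are hypothetical helpers written for this example, not functions from the skill, and the window origin is assumed to be supplied by the caller.

```python
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    x: int
    y: int
    w: int
    h: int
    relative_to: str  # "screen" or "window"


def parse_region(spec: str, relative_to: str = "screen") -> Region:
    """Parse an "x,y,w,h" string into a Region, rejecting malformed input."""
    parts = [p.strip() for p in spec.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected x,y,w,h but got {spec!r}")
    x, y, w, h = (int(p) for p in parts)  # int() raises ValueError on non-numbers
    if w <= 0 or h <= 0:
        raise ValueError("width and height must be positive")
    if relative_to not in ("screen", "window"):
        raise ValueError("relative_to must be 'screen' or 'window'")
    return Region(x, y, w, h, relative_to)


def to_screen_coords(region: Region, window_origin: Tuple[int, int]) -> Region:
    """Convert a window-relative region to screen coordinates."""
    if region.relative_to == "screen":
        return region
    wx, wy = window_origin
    return Region(region.x + wx, region.y + wy, region.w, region.h, "screen")
```

For example, a region of `10,20,300,200` relative to a window whose top-left corner sits at (100, 50) on screen maps to `110,70,300,200` in screen coordinates. The same numbers read as screen-relative would capture a different part of the desktop, which is why the request should state the reference frame.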

screenshot skill FAQ

Is screenshot only for full-screen captures?

No. The screenshot skill covers full-screen, window, app, and region captures. Use full-screen only when the entire desktop context matters; otherwise narrow the target to reduce noise and improve usefulness.

When should I not use this skill?

Do not use screenshot when a better-integrated tool can capture the exact surface you need, such as a Figma or browser-specific workflow. Also avoid it if your goal is text extraction or UI reasoning without needing an actual image artifact.

Do beginners need to know OS details first?

No. Beginners can use the skill with a plain request like “take a screenshot of this window.” The main improvement comes from adding the target, save path, and any crop details. On macOS, permissions may still be the main blocker, so expect one setup step.

How is this different from a generic prompt?

A generic prompt may describe the desired image, but the screenshot skill also handles capture choice, save-location rules, and macOS permission friction. That makes it more reliable for Workflow Automation because the agent is guided toward a concrete file output instead of an abstract answer.

How to Improve screenshot skill

Give the clearest target possible

The biggest quality gain comes from naming exactly what should appear in frame. “Capture the editor” is weaker than “capture the VS Code window showing app.py with the terminal visible.” Specific targets reduce failed captures, especially when multiple similar windows are open.

Add constraints that change the result

If the screenshot must exclude private content, include only one monitor, or show a specific resolution, say so up front. For region captures, provide coordinates and explain whether the crop should include chrome, title bars, or only content. These details matter more than extra prose.

Use the first output to refine the next one

If the screenshot is too broad, too small, or missing the relevant UI state, iterate by tightening the target and referencing what was wrong. For example: “Retake with only the modal visible” or “Move the crop down to include the status bar.” That feedback loop is the fastest way to get a usable capture.

Watch for common failure modes

The usual problems are permission prompts on macOS, capturing the wrong monitor, and asking for a screenshot when a better tool could have delivered a cleaner result. If the first capture fails, improve the request by adding the app name, window title, screen number, or exact region.
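The retry advice above can be treated as a checklist. The sketch below assumes a request is represented as a plain dictionary; the field names and `missing_refinements` are hypothetical and chosen only to mirror the four details listed in the paragraph.

```python
# The four details the paragraph above suggests adding after a failed capture.
REFINEMENT_FIELDS = ("app_name", "window_title", "screen_number", "region")


def missing_refinements(request: dict) -> list:
    """Return the refinement fields that are absent or empty in a request."""
    return [
        field
        for field in REFINEMENT_FIELDS
        if request.get(field) is None or request.get(field) == ""
    ]
```

A request such as `{"app_name": "VS Code"}` would report `window_title`, `screen_number`, and `region` as still unspecified, which suggests what to add before retrying. Not every retry needs all four; the point is to add whichever detail disambiguates the target.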
