
transcribe

by openai

transcribe turns audio or video into text with optional diarization and known-speaker hints. It is well suited for Technical Writing, meeting notes, interviews, lectures, and content ops when you need a repeatable transcription workflow with clear output formats and less guesswork than a generic prompt.

Stars: 18.8k
Favorites: 0
Comments: 0
Added: May 11, 2026
Category: Technical Writing
Install Command
npx skills add openai/skills --skill transcribe
Curation Score

This skill scores 74/100, which means it is a credible install candidate for directory users: it has a clear transcription use case, a bundled CLI, and enough operational guidance to reduce guesswork versus a generic prompt. It is still somewhat limited because the repository evidence points to a focused audio-transcription workflow rather than a broadly documented end-to-end package.

74/100
Strengths
  • Explicit triggerability for audio/video transcription, speaker labeling, and interview/meeting use cases in SKILL.md.
  • Bundled script and quick reference document the key operating constraints: response formats, chunking strategy, max file size, and known-speaker limits.
  • Operational workflow is concrete: check API key, run the CLI, validate output, and save results in a standard output path.
Cautions
  • The skill is narrow in scope and centered on one transcription workflow, so users needing broader media-processing behavior will need something else.
  • The install path is not fully self-serve in the evidence shown: SKILL.md mentions dependencies, but the excerpt does not show a complete install command or full quick-start example.

Overview of transcribe skill

What the transcribe skill does

The transcribe skill turns audio or video into text using OpenAI, with optional speaker diarization and known-speaker hints. It is a good fit when you need a reliable transcribe result from recordings, interviews, meetings, lectures, or short video clips, especially when speaker labels matter.

Who should use it

Use this transcribe skill if you want a repeatable workflow rather than a one-off prompt. It is especially useful for Technical Writing, meeting notes, content ops, research interviews, and anyone who needs clean text plus traceable speaker structure.

Why this skill is different

The main advantage is operational clarity: it prefers a bundled CLI, has explicit decision rules for model and output format, and supports diarized output when requested. That makes transcribe easier to run consistently than a generic “please transcribe this” prompt, especially when you care about repeatability and output shape.

How to Use transcribe skill

Install the transcribe skill

Install with npx skills add openai/skills --skill transcribe. If you are using the repository directly, start from skills/.curated/transcribe and keep the bundled workflow intact unless your environment requires a change.

Prepare the right input for transcribe usage

For best transcribe usage, provide:

  • the audio or video file path
  • the desired response format: text, json, or diarized_json
  • an optional language hint
  • known speaker references if you need diarization

A strong prompt looks like: “Transcribe this 18-minute interview, return diarized_json, and label the host and two guests if possible.” That is better than asking for “a transcript” because it tells the skill what output structure and speaker context to optimize for.
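
As a sketch, the inputs above can be collected into a single request payload before anything is run. The field names below are illustrative assumptions, not the skill's actual CLI flags or schema; check SKILL.md for the real interface.

```python
# Sketch: assemble the inputs the transcribe skill expects.
# Field names are illustrative assumptions, not the skill's real schema.

def build_request(file_path, response_format="text", language=None, known_speakers=None):
    """Collect transcription inputs; known_speakers only matters for diarized output."""
    if response_format not in {"text", "json", "diarized_json"}:
        raise ValueError(f"unsupported response format: {response_format}")
    request = {"file": file_path, "response_format": response_format}
    if language:
        request["language"] = language  # optional hint, e.g. "en"
    if known_speakers and response_format == "diarized_json":
        request["known_speakers"] = list(known_speakers)
    return request

# Example: the 18-minute interview prompt above, expressed as structured input.
req = build_request("interview.mp3", "diarized_json", language="en",
                    known_speakers=["Host", "Guest 1", "Guest 2"])
```

Spelling the request out like this makes it obvious when a diarized format is requested without any speaker context.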

Read these files first

Start with SKILL.md, then check references/api.md for format limits and diarization rules. If you are extending or automating the flow, inspect scripts/transcribe_diarize.py and agents/openai.yaml for the default model, CLI behavior, and prompt entrypoint.

Practical workflow tips

Use gpt-4o-mini-transcribe for fast plain transcription, and switch to gpt-4o-transcribe-diarize when speaker labels are important. Keep chunking_strategy on auto for audio longer than about 30 seconds. Make sure OPENAI_API_KEY is set locally before you run; this skill expects a configured environment rather than pasted secrets.
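
The decision rules above can be sketched as a small helper. The model names and the 30-second chunking threshold are taken from the tips above; treat them as the skill's defaults rather than a guaranteed API contract.

```python
# Sketch of the decision rules above: pick a model based on whether speaker
# labels are needed, and enable auto chunking for longer audio.

def pick_settings(need_diarization: bool, duration_seconds: float) -> dict:
    model = "gpt-4o-transcribe-diarize" if need_diarization else "gpt-4o-mini-transcribe"
    settings = {"model": model}
    if duration_seconds > 30:
        settings["chunking_strategy"] = "auto"
    return settings
```

Encoding the choice this way keeps runs consistent: the same inputs always produce the same model and chunking settings.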

transcribe skill FAQ

Is transcribe good for Technical Writing?

Yes. The transcribe skill is a strong fit for Technical Writing when you need source audio converted into editable text for docs, interviews, or content cleanup. It is less about creative rewriting and more about turning speech into dependable structured text.

When should I not use transcribe?

Do not use transcribe if you only need a rough summary with no transcript, or if your file is too large for the supported request limits without splitting. It is also a poor fit if you want heavy paraphrasing instead of literal speech conversion.

How is this different from a normal prompt?

A normal prompt can ask for transcription, but this transcribe skill adds a reproducible workflow, a preferred CLI, explicit response-format choices, and diarization guidance. That reduces guesswork when you need consistent output across multiple files.

Is transcribe beginner-friendly?

Yes, if you can identify the file and desired output. Beginners usually only need to choose between plain text and diarized output. The main blocker is environment setup, so verify OPENAI_API_KEY first.
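
A minimal preflight check for that blocker might look like the following sketch, which fails early instead of partway through a transcription run:

```python
import os

# Fail early if the environment is not configured. The env mapping is a
# parameter so the check is testable; it defaults to the real environment.

def check_environment(env=os.environ) -> None:
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running transcribe")
```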

How to Improve transcribe skill

Give transcribe better source context

The biggest quality gain usually comes from better inputs, not more prompting. For example, say whether the audio is a podcast, call recording, or lecture; whether there are overlapping speakers; and whether you want verbatim text or cleaned transcript output. That helps transcribe choose a more suitable path.

Use speaker hints when diarization matters

If you know the speaker names, include them as references instead of expecting the model to infer everything from audio alone. This is especially important for transcribe when one person sounds similar to another or when the recording has multiple guests. Known speakers improve label consistency, but only if the references are accurate.
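
A sketch of validating speaker references before requesting diarization is below. The cap of four references is an assumption drawn from typical diarization APIs, not a documented limit of this skill; confirm the actual cap in references/api.md.

```python
# Sketch: sanity-check known-speaker references before a diarized run.
# MAX_SPEAKER_REFS is an assumption; see references/api.md for the real limit.

MAX_SPEAKER_REFS = 4  # assumption

def validate_speaker_refs(refs: dict) -> list:
    """refs maps speaker name -> reference audio path; returns a list of warnings."""
    warnings = []
    if len(refs) > MAX_SPEAKER_REFS:
        warnings.append(f"too many speaker references ({len(refs)} > {MAX_SPEAKER_REFS})")
    for name, path in refs.items():
        if not name.strip():
            warnings.append("empty speaker name will produce unusable labels")
        if not path:
            warnings.append(f"missing reference audio for {name!r}")
    return warnings
```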

Iterate with one change at a time

If the first transcribe output is weak, change one variable: model, chunking, response format, or speaker hints. Avoid rewriting the whole request at once. For example, if labels are wrong, keep the transcript goal the same and only add speaker references or switch to diarized JSON.

Watch for common failure modes

The most common issues are missing API keys, unsupported file handling, vague output requests, and asking for diarization without usable speaker context. If you are building a transcribe guide for a workflow, document the file types you expect, the preferred output format, and the fallback when the recording is noisy or too long.
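
Those failure modes can be caught before the run starts. In the sketch below, the accepted extensions and the size limit are placeholder assumptions, not the skill's documented limits; substitute the values from references/api.md.

```python
from pathlib import Path

# Sketch of pre-run checks for the common failure modes above.
ACCEPTED = {".mp3", ".mp4", ".wav", ".m4a"}   # assumption, not the real list
MAX_BYTES = 25 * 1024 * 1024                  # assumption, not the real limit

def preflight(path: str, size_bytes: int, response_format: str, known_speakers=None):
    """Return a list of problems; an empty list means the run can proceed."""
    problems = []
    if Path(path).suffix.lower() not in ACCEPTED:
        problems.append(f"unsupported file type: {Path(path).suffix}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the request size limit; split it first")
    if response_format == "diarized_json" and not known_speakers:
        problems.append("diarization requested without speaker references; labels may be generic")
    return problems
```

Documenting checks like these in your workflow guide doubles as the fallback plan for noisy or oversized recordings.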

Ratings & Reviews

No ratings yet