# elevenlabs-dubbing

by inferen-sh

elevenlabs-dubbing lets you automatically dub and translate audio or video into 29 languages using the inference.sh CLI, preserving the original speakers’ voices. It is ideal for video editors, podcasters, and localization teams who need fast, high‑quality multilingual versions of existing content.
## Overview

### What is elevenlabs-dubbing?
elevenlabs-dubbing is an automated dubbing skill that uses the inference.sh CLI to translate and dub audio or video into 29 languages while preserving the original speakers’ voices. It wraps the ElevenLabs dubbing pipeline in a simple CLI workflow so you can quickly localize existing media for global audiences.
Instead of manually exporting audio, sending it to separate tools, and re‑syncing tracks in your editor, you can run a single command that:
- Detects speakers in the source
- Translates speech into the target language
- Generates natural‑sounding dubbed audio in the speakers’ original voices
- Outputs a finished, localized audio track (and works seamlessly with video files)
### Who is elevenlabs-dubbing for?
elevenlabs-dubbing is a good fit if you:
- Edit or produce video content and need multilingual dubs (YouTube channels, courses, product walkthroughs, marketing videos)
- Run a podcast or audio show and want localized versions for new regions
- Work on localization or post‑production teams and need to scale dubbing without hiring native‑language voice actors for every language
- Build automated media workflows and want a CLI/API‑friendly dubbing step you can script or run in CI
It is less suitable if you:
- Need frame‑accurate, hand‑mixed sound design or creative re‑interpretation rather than straight translation
- Require offline processing without internet access (inference.sh runs as a cloud service)
- Must integrate directly into a GUI NLE (this skill is CLI‑driven and best used alongside your editor, not inside it)
### Key capabilities
Based on the upstream skill definition, elevenlabs-dubbing provides:
- Automatic dubbing for audio and video via the `infsh` CLI
- Translation into 29 languages, controlled with a simple `target_lang` code
- Voice‑preserving dubbing, keeping the original speakers’ identity in the new language
- Auto speaker handling, so multi‑speaker recordings can be processed without per‑speaker setup
- Audio localization for international distribution, ideal for repurposing existing assets at scale
This aligns strongly with video editing, audio editing, translation, and voice generation workflows, making it a versatile tool in a post‑production or localization toolkit.
## How to Use

### Prerequisites and installation
To use elevenlabs-dubbing, you need the inference.sh CLI (`infsh`) installed and authenticated.

1. **Install the inference.sh CLI.** Follow the official instructions from the repository's CLI install guide:

   `https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md`

   Install `infsh` for your platform as described there.

2. **Log in with inference.sh.** Once installed, authenticate your CLI session:

   ```bash
   infsh login
   ```

   Follow the on‑screen prompts (e.g., opening a URL or pasting a token) so the CLI can access the ElevenLabs dubbing app.

3. **Add the skill to your agent environment (optional).** If you are using a skills‑based agent environment, install this skill with:

   ```bash
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-dubbing
   ```

   This makes the elevenlabs-dubbing workflow available to your agents while still using the `infsh` CLI behind the scenes.
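Before moving on, a quick sanity check that the CLI is actually on your PATH can save debugging time. This is a minimal sketch using only the POSIX `command -v` builtin; it makes no assumptions about `infsh` beyond its name:

```shell
# Return success if the named command is available on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd infsh; then
  echo "infsh found: $(command -v infsh)"
else
  echo "infsh not found -- follow the CLI install guide first" >&2
fi
```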
### Basic dubbing workflow (Quick Start)

Once `infsh` is installed and logged in, you can dub a video or audio file into another language with a single command.
#### Example: Dub an English video to Spanish

```bash
infsh app run elevenlabs/dubbing --input '{
  "audio": "https://video.mp4",
  "target_lang": "es"
}'
```
How this works:
- `elevenlabs/dubbing` is the hosted dubbing app invoked by the CLI.
- `audio` is the URL to your source media (audio or video). This can be an `https://` link to a file such as `video.mp4`.
- `target_lang` is the language code for the dubbed output (here `es` for Spanish).
The app processes the source media, translates the speech, and outputs dubbed audio in the target language while preserving speaker voices.
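When scripting this call, it can help to assemble the input JSON from shell variables first. This is a sketch: the source URL is a hypothetical placeholder, and only the `elevenlabs/dubbing` app name and the `audio`/`target_lang` fields come from the example above:

```shell
# Build the --input JSON for the dubbing app from shell variables.
# SRC_URL is a hypothetical placeholder; swap in your hosted media URL.
SRC_URL="https://example.com/video.mp4"
TARGET_LANG="es"

INPUT=$(printf '{"audio": "%s", "target_lang": "%s"}' "$SRC_URL" "$TARGET_LANG")

echo "$INPUT"
# Then run: infsh app run elevenlabs/dubbing --input "$INPUT"
```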
### Supported languages
The skill supports 29 languages via simple language codes (examples from the upstream table):
- `en` – English
- `es` – Spanish
- `fr` – French
- `de` – German
- `it` – Italian
- `pt` – Portuguese
- `pl` – Polish
- `hi` – Hindi
- `ar` – Arabic
- `ko` – Korean
- `ru` – Russian
- `tr` – Turkish
- `nl` – Dutch
- `sv` – Swedish
- `da` – Danish
- `fi` – Finnish
- `no` – Norwegian
- `cs` – Czech
Refer to the full language table in the upstream `SKILL.md` if you need the complete set of supported codes.
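If you script the skill, guarding against typos in language codes is cheap. A small POSIX‑sh helper, using only the codes listed above (extend the list from the upstream `SKILL.md` table if you need the full set):

```shell
# Codes listed above; extend from the upstream SKILL.md table if needed.
SUPPORTED_LANGS="en es fr de it pt pl hi ar ko ru tr nl sv da fi no cs"

# Return success if $1 is one of the supported codes.
is_supported_lang() {
  case " $SUPPORTED_LANGS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

is_supported_lang es && echo "es is supported"
is_supported_lang xx || echo "xx is not supported"
```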
### Typical usage patterns
#### 1. Localizing YouTube or course videos
1. Upload your source video somewhere reachable via HTTPS (e.g., storage bucket or unlisted hosting URL).
2. Run `infsh app run elevenlabs/dubbing` with the video URL and desired `target_lang`.
3. Download the dubbed audio and align or replace audio in your video editor (Premiere Pro, Final Cut, DaVinci Resolve, etc.).
#### 2. Translating podcasts and interviews
1. Host the original audio file (`.mp3`, `.wav`, or video with audio) at a public or authorized URL.
2. Call elevenlabs-dubbing with that URL and a target language code.
3. Publish the localized version as a separate feed or episode.
#### 3. Scripting and automation
Because elevenlabs-dubbing is driven via the CLI, you can:
- Wrap the `infsh app run` command in shell scripts
- Integrate dubbing into CI/CD pipelines for content publishing
- Chain it with other tools (e.g., transcription, clipping, or formatting scripts) in a larger automation flow
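The batch pattern above can be sketched as a small POSIX‑sh loop. The URL and the language list are hypothetical; only the `infsh app run elevenlabs/dubbing` invocation and its input fields come from the quick start, and the real call is left commented out so the loop can be dry‑run:

```shell
# Dub one hosted master file into several languages (dry-run sketch).
SRC_URL="https://example.com/master.mp4"   # hypothetical hosted master

# Build the JSON input for one target language.
build_input() {
  printf '{"audio": "%s", "target_lang": "%s"}' "$SRC_URL" "$1"
}

for lang in es fr de; do
  echo "would run: infsh app run elevenlabs/dubbing --input $(build_input "$lang")"
  # infsh app run elevenlabs/dubbing --input "$(build_input "$lang")"
done
```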
### Where to look in the repository
If you install the skill into an agent environment, explore these files for deeper details:
- `SKILL.md` – Core description, capabilities, and quick start
- `tools/audio/elevenlabs-dubbing` (directory) – Location of this skill in the shared skills repo
Use these as implementation references rather than copying them verbatim; adapt the patterns to your own infrastructure, storage, and security needs.
## FAQ
### When is elevenlabs-dubbing a good fit?
elevenlabs-dubbing is a strong fit when you already have finished or near‑finished video or audio and you want fast, high‑quality multilingual versions without re‑recording:
- Turning a successful English video into Spanish, French, or German versions
- Localizing webinars, tutorials, or e‑learning content
- Expanding podcasts or interviews into new language markets
It shines when you value speed, scalability, and speaker voice preservation over bespoke studio dubbing.
### When is elevenlabs-dubbing not ideal?
Consider other approaches if:
- You need complete creative re‑interpretation (new scripts, comedy timing, or new cast of voice actors)
- Your workflow must be fully offline (no cloud calls)
- You require a point‑and‑click GUI integrated directly into your NLE
In those cases, a traditional dubbing studio or on‑prem voice solution may be more appropriate.
### How do I install elevenlabs-dubbing?
There are two layers:
1. **Install the inference.sh CLI** by following the instructions at:
`https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md`
2. **(Optional) Add the skill to your agent environment** with:

   ```bash
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-dubbing
   ```

The actual dubbing is executed via the `infsh` CLI against the `elevenlabs/dubbing` app.
### What input formats can I use?

The example in the upstream SKILL file shows a video URL (`https://video.mp4`) passed as the `audio` field. That implies:

- You can send video files that contain an audio track (e.g., `.mp4` with sound)
- Audio extraction and dubbing are handled behind the scenes by the app

For best results, provide a clean, well‑recorded source with clear speech and minimal background noise.
### How do I choose the language for dubbing?

Use the `target_lang` field in the JSON input to specify your desired output language:

```bash
infsh app run elevenlabs/dubbing --input '{
  "audio": "https://video.mp4",
  "target_lang": "fr"
}'
```

Replace `fr` with any of the supported language codes, such as `es`, `de`, or `pt`, or others from the supported list.
### Does elevenlabs-dubbing preserve the original speaker’s voice?
Yes. According to the skill description, elevenlabs-dubbing is designed for voice‑preserving translation, keeping the original speakers’ vocal identity while changing the language. This is ideal for creators who want viewers to still feel they are hearing the original person, just in a different language.
### How does elevenlabs-dubbing relate to video editing tools?
elevenlabs-dubbing does not replace your video editor. Instead, it acts as a specialized dubbing step in your workflow:
- Use your editor to cut and finish the master video.
- Export or host that master file.
- Run elevenlabs-dubbing via `infsh` to generate localized audio.
- Re‑import or relink the dubbed audio in your editor to finalize output for each language.
This separation lets you keep your existing editing stack while adding powerful multilingual dubbing as an automated step.
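The relink step can itself be scripted. For example, ffmpeg (an assumption about your local toolchain, not part of this skill) can copy the video stream from the master and take audio from the dubbed file. The filenames here are hypothetical, and the command is built as a string so it can be inspected before running:

```shell
# Swap the dubbed audio onto the finished master without re-encoding video.
MASTER="master.mp4"       # hypothetical finished edit
DUBBED="dubbed_es.mp3"    # hypothetical dubbed audio from the skill
OUTPUT="output_es.mp4"

# -map 0:v:0 takes video from the master, -map 1:a:0 takes audio from the
# dub; -c:v copy avoids re-encoding the video stream.
relink_cmd() {
  printf 'ffmpeg -i %s -i %s -map 0:v:0 -map 1:a:0 -c:v copy -c:a aac -shortest %s' \
    "$MASTER" "$DUBBED" "$OUTPUT"
}

echo "$(relink_cmd)"
# Run it for real with: eval "$(relink_cmd)"
```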
### Where can I see more technical details?

Open the skill’s source in the repository:

- GitHub URL: `https://github.com/inferen-sh/skills/tree/main/tools/audio/elevenlabs-dubbing`
- Skill definition and quick start: `SKILL.md`

Use these files to understand the exact configuration and examples provided by the maintainers.
