# elevenlabs-dubbing

by inferen-sh

elevenlabs-dubbing lets you automatically dub and translate audio or video into 29 languages using the inference.sh CLI, preserving the original speakers’ voices. It is ideal for video editors, podcasters, and localization teams who need fast, high‑quality multilingual versions of existing content.
## Overview

### What is elevenlabs-dubbing?
elevenlabs-dubbing is an automated dubbing skill that uses the inference.sh CLI to translate and dub audio or video into 29 languages while preserving the original speakers’ voices. It wraps the ElevenLabs dubbing pipeline in a simple CLI workflow so you can quickly localize existing media for global audiences.
Instead of manually exporting audio, sending it to separate tools, and re‑syncing tracks in your editor, you can run a single command that:
- Detects speakers in the source
- Translates speech into the target language
- Generates natural‑sounding dubbed audio in the speakers’ original voices
- Outputs a finished, localized audio track (and works seamlessly with video files)
### Who is elevenlabs-dubbing for?
elevenlabs-dubbing is a good fit if you:
- Edit or produce video content and need multilingual dubs (YouTube channels, courses, product walkthroughs, marketing videos)
- Run a podcast or audio show and want localized versions for new regions
- Work on localization or post‑production teams and need to scale dubbing without hiring native‑language voice actors for every language
- Build automated media workflows and want a CLI/API‑friendly dubbing step you can script or run in CI
It is less suitable if you:
- Need frame‑accurate, hand‑mixed sound design or creative re‑interpretation rather than straight translation
- Require offline processing without internet access (inference.sh runs as a cloud service)
- Must integrate directly into a GUI NLE (this skill is CLI‑driven and best used alongside your editor, not inside it)
### Key capabilities
Based on the upstream skill definition, elevenlabs-dubbing provides:
- Automatic dubbing for audio and video via the `infsh` CLI
- Translation into 29 languages, controlled with a simple `target_lang` code
- Voice‑preserving dubbing, keeping the original speakers’ identity in the new language
- Auto speaker handling, so multi‑speaker recordings can be processed without per‑speaker setup
- Audio localization for international distribution, ideal for repurposing existing assets at scale
This aligns strongly with video editing, audio editing, translation, and voice generation workflows, making it a versatile tool in a post‑production or localization toolkit.
## How to Use

### Prerequisites and installation
To use elevenlabs-dubbing, you need the inference.sh CLI (`infsh`) installed and authenticated.

1. **Install the inference.sh CLI.** Follow the official instructions from the repository's CLI install guide:

   `https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md`

   Install `infsh` for your platform as described there.

2. **Log in with inference.sh.** Once installed, authenticate your CLI session:

   ```bash
   infsh login
   ```

   Follow the on‑screen prompts (e.g., opening a URL or pasting a token) so the CLI can access the ElevenLabs dubbing app.

3. **Add the skill to your agent environment (optional).** If you are using a skills‑based agent environment, install this skill with:

   ```bash
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-dubbing
   ```

   This makes the elevenlabs-dubbing workflow available to your agents while still using the `infsh` CLI behind the scenes.
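Before moving on, a quick sanity check that the CLI is actually on your PATH can save debugging time. This is a minimal sketch using only the POSIX `command -v` builtin; it makes no assumptions about `infsh` beyond its name:

```shell
# Return success if the named command is available on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd infsh; then
  echo "infsh found: $(command -v infsh)"
else
  echo "infsh not found -- follow the CLI install guide first" >&2
fi
```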
### Basic dubbing workflow (Quick Start)

Once `infsh` is installed and logged in, you can dub a video or audio file into another language with a single command.
#### Example: Dub an English video to Spanish

```bash
infsh app run elevenlabs/dubbing --input '{
  "audio": "https://video.mp4",
  "target_lang": "es"
}'
```
How this works:
- `elevenlabs/dubbing` is the hosted dubbing app invoked by the CLI.
- `audio` is the URL to your source media (audio or video). This can be an `https://` link to a file such as `video.mp4`.
- `target_lang` is the language code for the dubbed output (here `es` for Spanish).
The app processes the source media, translates the speech, and outputs dubbed audio in the target language while preserving speaker voices.
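When scripting this call, it can help to assemble the input JSON from shell variables first. This is a sketch: the source URL is a hypothetical placeholder, and only the `elevenlabs/dubbing` app name and the `audio`/`target_lang` fields come from the example above:

```shell
# Build the --input JSON for the dubbing app from shell variables.
# SRC_URL is a hypothetical placeholder; swap in your hosted media URL.
SRC_URL="https://example.com/video.mp4"
TARGET_LANG="es"

INPUT=$(printf '{"audio": "%s", "target_lang": "%s"}' "$SRC_URL" "$TARGET_LANG")

echo "$INPUT"
# Then run: infsh app run elevenlabs/dubbing --input "$INPUT"
```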
### Supported languages
The skill supports 29 languages via simple language codes (examples from the upstream table):
- `en` – English
- `es` – Spanish
- `fr` – French
- `de` – German
- `it` – Italian
- `pt` – Portuguese
- `pl` – Polish
- `hi` – Hindi
- `ar` – Arabic
- `ko` – Korean
- `ru` – Russian
- `tr` – Turkish
- `nl` – Dutch
- `sv` – Swedish
- `da` – Danish
- `fi` – Finnish
- `no` – Norwegian
- `cs` – Czech
Refer to the full language table in the upstream `SKILL.md` if you need the complete set of supported codes.
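If you script the skill, guarding against typos in language codes is cheap. A small POSIX‑sh helper, using only the codes listed above (extend the list from the upstream `SKILL.md` table if you need the full set):

```shell
# Codes listed above; extend from the upstream SKILL.md table if needed.
SUPPORTED_LANGS="en es fr de it pt pl hi ar ko ru tr nl sv da fi no cs"

# Return success if $1 is one of the supported codes.
is_supported_lang() {
  case " $SUPPORTED_LANGS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

is_supported_lang es && echo "es is supported"
is_supported_lang xx || echo "xx is not supported"
```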
### Typical usage patterns
#### 1. Localizing YouTube or course videos
1. Upload your source video somewhere reachable via HTTPS (e.g., storage bucket or unlisted hosting URL).
2. Run `infsh app run elevenlabs/dubbing` with the video URL and desired `target_lang`.
3. Download the dubbed audio and align or replace audio in your video editor (Premiere Pro, Final Cut, DaVinci Resolve, etc.).
#### 2. Translating podcasts and interviews
1. Host the original audio file (`.mp3`, `.wav`, or video with audio) at a public or authorized URL.
2. Call elevenlabs-dubbing with that URL and a target language code.
3. Publish the localized version as a separate feed or episode.
#### 3. Scripting and automation
Because elevenlabs-dubbing is driven via the CLI, you can:
- Wrap the `infsh app run` command in shell scripts
- Integrate dubbing into CI/CD pipelines for content publishing
- Chain it with other tools (e.g., transcription, clipping, or formatting scripts) in a larger automation flow
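The batch pattern above can be sketched as a small POSIX‑sh loop. The URL and the language list are hypothetical; only the `infsh app run elevenlabs/dubbing` invocation and its input fields come from the quick start, and the real call is left commented out so the loop can be dry‑run:

```shell
# Dub one hosted master file into several languages (dry-run sketch).
SRC_URL="https://example.com/master.mp4"   # hypothetical hosted master

# Build the JSON input for one target language.
build_input() {
  printf '{"audio": "%s", "target_lang": "%s"}' "$SRC_URL" "$1"
}

for lang in es fr de; do
  echo "would run: infsh app run elevenlabs/dubbing --input $(build_input "$lang")"
  # infsh app run elevenlabs/dubbing --input "$(build_input "$lang")"
done
```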
### Where to look in the repository
If you install the skill into an agent environment, explore these files for deeper details:
- `SKILL.md` – Core description, capabilities, and quick start
- `tools/audio/elevenlabs-dubbing` (directory) – Location of this skill in the shared skills repo
Use these as implementation references rather than copying them verbatim; adapt the patterns to your own infrastructure, storage, and security needs.
## FAQ
### When is elevenlabs-dubbing a good fit?
elevenlabs-dubbing is a strong fit when you already have finished or near‑finished video or audio and you want fast, high‑quality multilingual versions without re‑recording:
- Turning a successful English video into Spanish, French, or German versions
- Localizing webinars, tutorials, or e‑learning content
- Expanding podcasts or interviews into new language markets
It shines when you value speed, scalability, and speaker voice preservation over bespoke studio dubbing.
### When is elevenlabs-dubbing not ideal?
Consider other approaches if:
- You need complete creative re‑interpretation (new scripts, comedy timing, or new cast of voice actors)
- Your workflow must be fully offline (no cloud calls)
- You require a point‑and‑click GUI integrated directly into your NLE
In those cases, a traditional dubbing studio or on‑prem voice solution may be more appropriate.
### How do I install elevenlabs-dubbing?
There are two layers:
1. **Install the inference.sh CLI** by following the instructions at:
`https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md`
2. **(Optional) Add the skill to your agent environment** with:

   ```bash
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-dubbing
   ```

The actual dubbing is executed via the `infsh` CLI against the `elevenlabs/dubbing` app.
### What input formats can I use?

The example in the upstream SKILL file shows a video URL (`https://video.mp4`) passed as the `audio` field. That implies:

- You can send video files that contain an audio track (e.g., `.mp4` with sound)
- Audio extraction and dubbing are handled behind the scenes by the app

For best results, provide a clean, well‑recorded source with clear speech and minimal background noise.
### How do I choose the language for dubbing?

Use the `target_lang` field in the JSON input to specify your desired output language:

```bash
infsh app run elevenlabs/dubbing --input '{
  "audio": "https://video.mp4",
  "target_lang": "fr"
}'
```

Replace `fr` with any of the supported language codes, such as `es`, `de`, or `pt`, or others from the supported list.
### Does elevenlabs-dubbing preserve the original speaker’s voice?
Yes. According to the skill description, elevenlabs-dubbing is designed for voice‑preserving translation, keeping the original speakers’ vocal identity while changing the language. This is ideal for creators who want viewers to still feel they are hearing the original person, just in a different language.
### How does elevenlabs-dubbing relate to video editing tools?
elevenlabs-dubbing does not replace your video editor. Instead, it acts as a specialized dubbing step in your workflow:
- Use your editor to cut and finish the master video.
- Export or host that master file.
- Run elevenlabs-dubbing via `infsh` to generate localized audio.
- Re‑import or relink the dubbed audio in your editor to finalize output for each language.
This separation lets you keep your existing editing stack while adding powerful multilingual dubbing as an automated step.
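The relink step can itself be scripted. For example, ffmpeg (an assumption about your local toolchain, not part of this skill) can copy the video stream from the master and take audio from the dubbed file. The filenames here are hypothetical, and the command is built as a string so it can be inspected before running:

```shell
# Swap the dubbed audio onto the finished master without re-encoding video.
MASTER="master.mp4"       # hypothetical finished edit
DUBBED="dubbed_es.mp3"    # hypothetical dubbed audio from the skill
OUTPUT="output_es.mp4"

# -map 0:v:0 takes video from the master, -map 1:a:0 takes audio from the
# dub; -c:v copy avoids re-encoding the video stream.
relink_cmd() {
  printf 'ffmpeg -i %s -i %s -map 0:v:0 -map 1:a:0 -c:v copy -c:a aac -shortest %s' \
    "$MASTER" "$DUBBED" "$OUTPUT"
}

echo "$(relink_cmd)"
# Run it for real with: eval "$(relink_cmd)"
```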
### Where can I see more technical details?

Open the skill’s source in the repository:

- GitHub URL: `https://github.com/inferen-sh/skills/tree/main/tools/audio/elevenlabs-dubbing`
- Skill definition and quick start: `SKILL.md`

Use these files to understand the exact configuration and examples provided by the maintainers.
