elevenlabs-voice-changer
by inferen-sh
An ElevenLabs voice-changer skill that uses the inference.sh CLI (infsh) to transform recorded speech into a different synthetic voice while preserving content and emotion. It supports eleven_multilingual_sts_v2 (70+ languages) and eleven_english_sts_v2 for speech-to-speech conversion, accent change, and voice disguise in content creation, dubbing, and character voices.
Overview
What is elevenlabs-voice-changer?
elevenlabs-voice-changer is a skill that connects the ElevenLabs speech-to-speech voice changer to the inference.sh command-line interface (infsh). It lets you send an existing audio recording and receive the same speech back in a different synthetic voice, while preserving what is said and how it is expressed.
Under the hood, the skill calls the ElevenLabs voice-changer app via infsh app run elevenlabs/voice-changer, so you don’t need to wire up APIs manually. You describe the input audio and the target voice, and the service returns transformed audio.
Key capabilities
- Speech-to-speech conversion – turn any spoken audio into a new voice without re-recording.
- Multilingual support (70+ languages) – via eleven_multilingual_sts_v2.
- English-optimized model – via eleven_english_sts_v2 for higher-quality English results.
- Accent and style changes – swap accents, tone, or persona using ElevenLabs' premium voices.
- Voice disguise and privacy – anonymize or mask your real voice for public content.
Who is this skill for?
This skill is a good fit if you:
- Create YouTube, TikTok, or social media content and want to change or upgrade your narration voice.
- Produce podcasts or voiceovers and need quick language, accent, or voice swaps.
- Work in marketing or product explainers and want multiple branded voices without hiring different actors.
- Build AI characters or demos and need consistent, reusable voices.
It is less suitable if you:
- Need a GUI-only workflow with timeline editing (this skill is CLI-focused).
- Require completely offline processing (it depends on inference.sh and ElevenLabs in the cloud).
- Want fine-grained audio engineering tools like EQ, mixing, or multi-track editing; this is focused on voice transformation, not full DAW features.
Models and voice options
The elevenlabs-voice-changer skill exposes the same models described in the repository:
- Multilingual STS v2 – model ID: eleven_multilingual_sts_v2 (default, supports 70+ languages).
- English STS v2 – model ID: eleven_english_sts_v2 (optimized for English speech).
It can also use the 22+ premium ElevenLabs voices available in their TTS products, including defaults like:
- george – British, authoritative (the default voice in the docs).
- aria – American, conversational.
You select these voices by passing the voice parameter when calling the app.
How to Use
1. Prerequisites and installation
Before using elevenlabs-voice-changer, you must have the inference.sh CLI installed and authenticated.
1. Install the inference.sh CLI (infsh)
   Follow the official instructions from the repository:
   https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
2. Log in to inference.sh using your account:
   infsh login
3. Add the skill (Agent Skills Finder / skills registry)
   If you are using this as a skill inside the skills collection, add it with:
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-changer
After these steps, your environment is ready to call the ElevenLabs voice changer app via infsh.
2. Basic voice transformation
The quickest way to try elevenlabs-voice-changer is to run the provided example from the skill docs:
infsh login
# Transform voice
infsh app run elevenlabs/voice-changer --input '{"audio": "https://recording.mp3", "voice": "aria"}'
In this example:
- audio is a URL pointing to your input recording (e.g., an .mp3 file hosted online).
- voice is the target ElevenLabs voice ID (aria in this case).
The app processes the recording and returns a new audio file with the same speech content, but in the aria voice.
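Hand-quoting JSON inside --input is easy to get wrong in a shell. One way to build it safely is with jq (a sketch, assuming jq is installed; the audio and voice field names come from the quick-start example, and the URL is a placeholder):

```shell
# Build the --input JSON with jq instead of hand-quoting it in the shell.
# jq is assumed to be installed; the audio/voice keys come from the
# skill's quick-start example, and the URL here is a placeholder.
AUDIO_URL="https://recording.mp3"
VOICE="george"
INPUT=$(jq -n --arg audio "$AUDIO_URL" --arg voice "$VOICE" \
  '{audio: $audio, voice: $voice}')
echo "$INPUT"
# Then: infsh app run elevenlabs/voice-changer --input "$INPUT"
```

This keeps URLs containing quotes or special characters from breaking the command line.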
3. Choosing models and languages
By default, the skill is configured to use:
eleven_multilingual_sts_v2for broad language coverage (70+ languages).
If your use case is strictly English and you want an English-optimized model, configure the app input or your workflow to use:
eleven_english_sts_v2for improved English clarity and prosody.
The exact field for selecting the model is handled inside the ElevenLabs app configuration, but when you choose models, use these IDs as referenced in the skill documentation.
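SKILL.md's example shows only the audio and voice fields. If the app also accepts a model field in the input JSON, a call could look like the sketch below; the "model" key is an assumption, so verify it against the app's input schema before relying on it.

```shell
# Hypothetical sketch: select the English-optimized model explicitly.
# The "model" input field is an ASSUMPTION, not documented in SKILL.md;
# only the model IDs themselves come from the docs.
MODEL="eleven_english_sts_v2"
CMD="infsh app run elevenlabs/voice-changer --input '{\"audio\": \"https://recording.mp3\", \"voice\": \"george\", \"model\": \"$MODEL\"}'"
echo "$CMD"   # printed for review; run it manually once verified
```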
4. Working with different voices and accents
To experiment with different accents or styles, change the voice parameter in your --input JSON.
Examples (pattern):
# British, authoritative
infsh app run elevenlabs/voice-changer --input '{"audio": "https://recording.mp3", "voice": "george"}'
# American, conversational
infsh app run elevenlabs/voice-changer --input '{"audio": "https://recording.mp3", "voice": "aria"}'
You can repeat the same original audio with multiple runs and different voice IDs to quickly audition voices for your project.
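The audition workflow above can be scripted as a small loop. This sketch only prints each command (a dry run); remove the echo, or pipe the output to sh, to execute them. Both voice IDs are the ones named in the docs.

```shell
# Print one infsh command per candidate voice (dry run).
audition() {
  AUDIO="$1"; shift
  for VOICE in "$@"; do
    echo "infsh app run elevenlabs/voice-changer --input '{\"audio\": \"$AUDIO\", \"voice\": \"$VOICE\"}'"
  done
}

audition "https://recording.mp3" george aria
```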
5. Integrating into your workflow
Because elevenlabs-voice-changer runs entirely through the CLI, it integrates well with scripted or automated pipelines:
- Batch processing – loop over a folder of audio URLs or pre-uploaded recordings and call infsh app run repeatedly.
- Content localization – record once, then transform narrations into different accents or voices for different markets.
- Voice anonymization – post-process recorded calls, interviews, or user submissions before publishing.
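The batch-processing idea can be sketched as a loop over a plain-text list of audio URLs, one per line. The urls.txt filename and example.com URLs are placeholders, and as above the loop only prints each command rather than running it.

```shell
# Create a sample URL list; in practice, point this at your own file.
printf '%s\n' "https://example.com/a.mp3" "https://example.com/b.mp3" > urls.txt

VOICE="george"
CMDS=$(while IFS= read -r URL; do
  [ -n "$URL" ] || continue   # skip blank lines
  echo "infsh app run elevenlabs/voice-changer --input '{\"audio\": \"$URL\", \"voice\": \"$VOICE\"}'"
done < urls.txt)
echo "$CMDS"
```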
If you are using a broader agent framework or orchestration layer, you can call this skill as a step in your pipeline wherever “voice conversion” or “dubbing” is required.
6. Files to review in the repository
When you open the skill in the inferen-sh/skills repository, start with:
- SKILL.md – high-level description, capabilities, and a quick-start command you can copy and adapt.
Other common files in the skills repository (like AGENTS.md, metadata.json, and rules/ or scripts/ folders when present in other tools) show how skills fit into larger agent workflows. For elevenlabs-voice-changer, SKILL.md is the main documentation.
FAQ
What does elevenlabs-voice-changer actually do?
elevenlabs-voice-changer uses the ElevenLabs speech-to-speech models, called through the inference.sh CLI, to convert an existing voice recording into a different AI-generated voice. It keeps the wording and emotion of the original but changes how the voice sounds.
How do I install elevenlabs-voice-changer?
You don’t install the skill as a standalone app. Instead, you:
1. Install the infsh CLI using the instructions at:
   https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
2. Run infsh login to authenticate.
3. Optionally, register the skill in your skills setup with:
   npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-changer
After that, you can call the ElevenLabs voice changer app with infsh app run elevenlabs/voice-changer.
Do I need an ElevenLabs account to use this?
The skill itself is a wrapper around ElevenLabs’ models running via inference.sh. Any underlying requirements for ElevenLabs usage (such as accounts, credits, or quotas) are handled by your inference.sh and ElevenLabs setup. Check the inference.sh and ElevenLabs documentation for current access and billing details.
Can I run elevenlabs-voice-changer locally without the cloud?
The repository documentation shows the skill running via infsh against an online ElevenLabs app. It does not document a fully offline mode. Expect to need network access to inference.sh and the ElevenLabs backend.
What audio formats can I use as input?
The example uses an .mp3 file referenced by an HTTPS URL ("https://recording.mp3"). The specific format and size limits are governed by the ElevenLabs app itself. For best results, use common web audio formats (such as mp3) hosted at a stable URL.
Can I use my own custom voice?
The skill description focuses on the standard ElevenLabs voice set (22+ premium voices) such as george and aria. It does not describe custom voice-training flows. If you need a bespoke voice, consult ElevenLabs’ own documentation to see how custom voices integrate with their speech-to-speech app.
Is this good for real-time voice changing?
The repository shows file-based speech-to-speech usage via CLI, where you provide a recorded file URL and get a processed file back. It does not describe real-time or live call voice conversion, so treat it as an asynchronous, file-based tool rather than a live voice changer.
When should I not use elevenlabs-voice-changer?
Consider other tools if you:
- Need a full DAW or nonlinear editor for detailed audio mixing and mastering.
- Require live, low-latency voice effects for streaming or gaming.
- Must run everything offline without cloud services.
For scripted, repeatable speech-to-speech voice conversion via CLI, elevenlabs-voice-changer is a strong fit.
Where can I see or modify the configuration?
Open the skill in the inferen-sh/skills GitHub repository under:
tools/audio/elevenlabs-voice-changer/
Review SKILL.md there to see the official quick start, models, and voice options, and adapt the example commands to your environment.
