elevenlabs-voice-isolator

by inferen-sh

CLI-driven ElevenLabs voice isolator skill for removing background noise and isolating vocals from audio via inference.sh. Ideal for podcast cleanup, interviews, music vocals, noisy recordings, and audio restoration workflows.

Stars: 232

Category: Audio Editing

Install Command

npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-isolator

Tags: Audio, CLI, Bash, FFmpeg

Overview

What is elevenlabs-voice-isolator?

The elevenlabs-voice-isolator skill is a command-line audio cleanup tool that uses the ElevenLabs Voice Isolator app through the inference.sh (infsh) CLI. It focuses on removing background noise and isolating spoken voice or vocals from an input audio file.

It is built as a reusable skill inside the inferen-sh/skills repository, so you can call it from compatible agent environments or from your own terminal as long as you have the infsh CLI set up.

Key capabilities

Using the ElevenLabs voice isolator model via infsh, this skill can:

Remove ambient background noise (room tone, hum, traffic, crowd noise)
Isolate voices or vocals from a noisy recording
Clean up podcast tracks and interview recordings
Improve intelligibility of speech in difficult environments
Support common audio formats (WAV, MP3, FLAC, OGG, AAC)
Handle long recordings (up to 1 hour, 500MB per file as indicated in the skill docs)

Who is this skill for?

Use elevenlabs-voice-isolator if you:

Record podcasts and want cleaner voice tracks without manual noise reduction
Capture remote interviews and need to reduce background noise from guests
Work with music demos or vocal takes and want to better isolate the vocal line
Maintain audio archives and want basic speech-focused restoration
Build AI agents or automation that must clean audio on the fly using a CLI tool

If you already use ffmpeg or a DAW but want a higher-level voice isolation step accessible from the terminal or an agent, this skill fits that niche.

When it’s a good fit (and when it isn’t)

A good fit when:

Your main goal is voice isolation or speech cleanup, not full multitrack audio mixing.
You are comfortable running CLI commands (Bash) and working with URLs or local files.
You can install and authenticate the inference.sh CLI (infsh).

Not the best fit when:

You need deep editing, multitrack mixing, or effects chains inside a GUI DAW.
Your workflow is entirely offline and you cannot use the infsh CLI or external model calls.
You require fine-grained, frame-level control over the DSP process instead of a model-driven isolator.

How to Use

Prerequisites

Before using elevenlabs-voice-isolator, make sure you have:

inference.sh CLI (infsh) installed
- The skill’s quick start references infsh and links to CLI install instructions.
- Follow the latest installation instructions from:
  - https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
Access to the ElevenLabs Voice Isolator app via infsh
- The skill calls elevenlabs/voice-isolator through infsh app run.
Bash-capable environment
- The skill’s allowed-tools include Bash(infsh *), so it is designed for Bash shells and CLI workflows.

Basic installation in an agent skills environment

If you are using an environment that supports npx skills and the inferen-sh/skills repository, you can add the skill with:

npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-isolator

This makes the elevenlabs-voice-isolator skill available alongside other tools from the same repo. Once added, your agent or tooling can invoke the underlying infsh commands defined by the skill.

Log in to inference.sh

Before running any audio isolation, authenticate the CLI:

infsh login

Follow the prompts to complete login. This step is required for the subsequent infsh app run commands to work.

Run a simple voice isolation command

The core usage pattern for elevenlabs-voice-isolator through infsh looks like this:

infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-recording.mp3"}'

The URL https://noisy-recording.mp3 is a placeholder; replace it with the URL of your own noisy audio file. The app processes the input and returns a response (typically JSON) with references to the cleaned audio.
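When the audio URL comes from a variable rather than being typed by hand, quoting the JSON payload correctly is the easy part to get wrong. The following is a minimal sketch, not part of the skill itself: json_escape and build_input_json are hypothetical helper names, and the example.com URL is a placeholder. It only handles backslashes and double quotes, on the assumption that URLs contain no control characters.

```bash
#!/usr/bin/env bash
# Sketch: build the JSON payload for --input from a shell variable.

# json_escape STRING: escape backslashes and double quotes for a JSON string.
# Assumption: the value contains no control characters such as newlines.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}   # backslash -> two backslashes (must run first)
  s=${s//\"/\\\"}   # double quote -> backslash + double quote
  printf '%s' "$s"
}

# build_input_json URL: print {"audio": "URL"} with URL escaped.
build_input_json() {
  printf '{"audio": "%s"}' "$(json_escape "$1")"
}

# Usage (requires infsh to be installed and logged in):
#   infsh app run elevenlabs/voice-isolator \
#     --input "$(build_input_json "https://example.com/noisy-recording.mp3")"
```

If jq is available in your environment, generating the payload with it is a more robust alternative to hand-rolled escaping.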

Supported audio formats and limits

According to the skill documentation, the ElevenLabs voice isolator supports:

WAV – up to 500MB, max 1 hour
MP3 – up to 500MB, max 1 hour
FLAC – up to 500MB, max 1 hour
OGG – up to 500MB, max 1 hour
AAC – up to 500MB, max 1 hour

Keep each file within these size and duration limits when preparing audio for elevenlabs-voice-isolator.
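The limits above can be checked locally before a file is hosted or submitted. This is a rough sketch under stated assumptions: check_audio_limits is a hypothetical helper, "500MB" is interpreted as 500 * 1024 * 1024 bytes (the skill docs do not say which definition applies), and the duration check runs only if ffprobe is installed and can read the file.

```bash
#!/usr/bin/env bash
# Sketch: pre-flight check of a local audio file against the documented limits.

# Assumption: "500MB" is read as 500 * 1024 * 1024 bytes; override via MAX_BYTES.
MAX_BYTES=${MAX_BYTES:-$((500 * 1024 * 1024))}
MAX_SECONDS=${MAX_SECONDS:-3600}

# check_audio_limits FILE: return 0 if FILE looks acceptable, 1 otherwise.
check_audio_limits() {
  local file=$1 base ext size duration
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # Extension check (case-insensitive) against the listed formats.
  base=${file##*/}
  ext=$(printf '%s' "${base##*.}" | tr '[:upper:]' '[:lower:]')
  case $ext in
    wav|mp3|flac|ogg|aac) ;;
    *) echo "unsupported format: .$ext" >&2; return 1 ;;
  esac

  # Size check in bytes.
  size=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "too large: $size bytes" >&2
    return 1
  fi

  # Duration check, only when ffprobe is installed and can read the file.
  if command -v ffprobe >/dev/null 2>&1; then
    duration=$(ffprobe -v error -show_entries format=duration \
      -of default=noprint_wrappers=1:nokey=1 "$file" 2>/dev/null || true)
    case $duration in
      ''|*[!0-9.]*) ;;  # empty or non-numeric: skip the duration check
      *) if awk -v d="$duration" -v m="$MAX_SECONDS" 'BEGIN { exit !(d > m) }'; then
           echo "too long: ${duration}s" >&2
           return 1
         fi ;;
    esac
  fi
  return 0
}

# Usage:
#   check_audio_limits interview.wav && echo "within limits"
```

The extension test is only a cheap filter; it does not verify that the file really contains audio in that format.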

Example: Clean up a podcast recording

This example mirrors the skill’s own quick-start scenario for podcast cleanup:

# Remove background noise from a podcast recording
infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-podcast.mp3"}'

Use this pattern for any spoken-word content where you want clearer narration or dialogue. Host your file somewhere accessible over HTTPS (or follow current infsh guidance for local file usage if supported in your environment).

Example: Clean an interview recording

To improve an interview with room noise or street sounds, adjust the input URL:

infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-interview-file.mp3"}'

You can integrate this command into scripts that automatically clean every new interview file before editing.
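One possible shape for such a script is sketched below. It assumes a text file with one HTTPS audio URL per line and simply saves whatever infsh prints for each call; isolate_all and the file layout are hypothetical, and no particular response fields are assumed.

```bash
#!/usr/bin/env bash
# Sketch: run the isolator over a list of audio URLs, one per line.

# isolate_all URL_LIST OUT_DIR
# Saves each raw response as OUT_DIR/<n>.json, where n is the 1-based index of
# the URL among the non-blank, non-comment lines. Returns 1 if any call failed.
isolate_all() {
  local list=$1 out_dir=$2 url n=0 failed=0
  mkdir -p "$out_dir" || return 1
  while IFS= read -r url || [ -n "$url" ]; do
    case $url in ''|'#'*) continue ;; esac   # skip blank lines and comments
    n=$((n + 1))
    # Assumption: URLs contain no double quotes or backslashes, so they can be
    # embedded in the JSON payload without escaping.
    # stdin is redirected so the command cannot consume the URL list.
    if infsh app run elevenlabs/voice-isolator \
         --input "{\"audio\": \"$url\"}" > "$out_dir/$n.json" < /dev/null; then
      echo "ok: $url"
    else
      echo "failed: $url" >&2
      failed=1
    fi
  done < "$list"
  return "$failed"
}

# Usage:
#   isolate_all interviews.txt cleaned/
```

Processing the URLs sequentially keeps the sketch simple; whether parallel calls are appropriate depends on limits that the skill docs do not specify.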

Integrating with your own tools and agents

Because elevenlabs-voice-isolator is defined as a skill in inferen-sh/skills:

Agents: An AI agent that can call Bash(infsh *) can use this skill to clean audio as part of a pipeline (for example, isolation → transcription → summarization).
CLI pipelines: You can wrap infsh app run elevenlabs/voice-isolator inside shell scripts, CI workflows, or batch processing tools.
Audio post-production: Use this as a pre-processing step before importing the cleaned file into a DAW or editor like Audacity, Reaper, or Adobe Audition.

Files and configuration to inspect

Within the inferen-sh/skills repository, open:

tools/audio/elevenlabs-voice-isolator/SKILL.md

This file contains the skill's description and example usage commands. There is no complex per-user configuration exposed in the skill file, but the CLI and app may offer additional options documented elsewhere in the inference.sh ecosystem.

FAQ

What does elevenlabs-voice-isolator actually do to my audio?

The elevenlabs-voice-isolator skill sends your audio to the ElevenLabs Voice Isolator model via the inference.sh CLI. The model focuses on separating and enhancing the voice while reducing background noise. The result is an audio output where speech or vocals are clearer and less noisy, suitable for podcasts, interviews, and similar content.

Do I need the inference.sh CLI to use elevenlabs-voice-isolator?

Yes. The published quick start shows usage through the inference.sh CLI (infsh). You must install and authenticate infsh before running the example commands or integrating the skill into an agent.

Which audio formats can I process?

Based on the skill’s documentation, elevenlabs-voice-isolator supports:

WAV, MP3, FLAC, OGG, and AAC
Up to 500MB file size and 1 hour duration per file

If your files exceed these limits, trim or split long recordings, or re-encode them at a lower bitrate or sample rate, before processing.
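Since the skill is tagged with ffmpeg, one way to do this is sketched below. The function names, the 3300-second default (chosen to leave margin under the 1-hour limit, because stream-copy cuts are approximate), and the 128k bitrate are illustrative assumptions, not values from the skill docs.

```bash
#!/usr/bin/env bash
# Sketch: bring an oversized recording under the documented limits with ffmpeg.

# split_audio IN OUT_PATTERN [SECONDS]
# Cut IN into consecutive segments of about SECONDS each (default 3300)
# without re-encoding. OUT_PATTERN must contain a printf-style counter,
# for example part_%03d.mp3.
split_audio() {
  ffmpeg -hide_banner -loglevel error -i "$1" \
    -f segment -segment_time "${3:-3300}" -c copy "$2"
}

# shrink_audio IN OUT [BITRATE]
# Re-encode IN as mono audio at BITRATE (default 128k) to reduce file size.
# The output codec follows OUT's extension.
shrink_audio() {
  ffmpeg -hide_banner -loglevel error -i "$1" -vn -ac 1 -b:a "${3:-128k}" "$2"
}

# Usage:
#   split_audio long-episode.mp3 'part_%03d.mp3'
#   shrink_audio huge-take.wav smaller-take.mp3 96k
```

Splitting with stream copy preserves the original quality, so prefer it when duration is the problem. Re-encoding to a lossy format discards information and downmixing to mono drops stereo content, so shrink_audio is better suited to speech than to music.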

Can I run elevenlabs-voice-isolator on local files instead of URLs?

The examples in SKILL.md use HTTPS URLs for the audio field. Whether local paths are supported depends on current infsh capabilities and configuration. Check the latest inference.sh CLI documentation for how to reference local files (for example, via upload or local path conventions) and adapt your --input argument accordingly.

Is elevenlabs-voice-isolator suitable for music production?

It can be helpful for isolating vocals or cleaning noisy demo recordings, but it is not a full music production suite. Use it as a pre-processing or utility step, then finish detailed mixing and mastering in your DAW.

How does this differ from traditional noise reduction in a DAW?

Traditional DAW noise reduction often requires noise prints, manual tuning, and real-time monitoring. elevenlabs-voice-isolator is a model-based, batch-style process accessed via CLI. You pass an audio file, the model performs isolation and noise removal, and you receive a processed output. This is convenient for automated or large-scale cleanup, especially when paired with agents or scripts.

What if I just want a simple denoise filter without voice isolation?

The elevenlabs-voice-isolator skill focuses on voice isolation and background removal together. If you only need basic denoising or EQ, a local ffmpeg filter or DAW plugin may be simpler. Use this skill when you specifically want voice separation and enhanced speech clarity driven by the ElevenLabs model.

Where can I learn more or troubleshoot issues?

For the most accurate and current details:

Open tools/audio/elevenlabs-voice-isolator/SKILL.md in the inferen-sh/skills repository.
Review the general infsh installation and usage guide at cli-install.md in the same repo.
Consult inference.sh and ElevenLabs documentation for service-specific limits, authentication, and error codes.

If something fails, start by confirming infsh login succeeds, your audio URL is reachable, and your file respects the supported formats and size/duration limits.
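Two of these checks can be scripted, as in the sketch below. preflight is a hypothetical helper: it confirms only that infsh is on the PATH and that the URL answers an HTTP HEAD request via curl. It does not verify that you are logged in, so run infsh login separately.

```bash
#!/usr/bin/env bash
# Sketch: quick checks to run before blaming the isolator itself.

# preflight URL: return 0 if infsh is available and URL answers a HEAD request.
preflight() {
  local url=$1
  if ! command -v infsh >/dev/null 2>&1; then
    echo "infsh not found on PATH; install the CLI first" >&2
    return 1
  fi
  # -f: fail on HTTP errors, -sS: quiet but show errors, -I: HEAD request,
  # -L: follow redirects, --max-time: give up after 15 seconds.
  if ! curl -fsSIL --max-time 15 -o /dev/null "$url"; then
    echo "audio URL not reachable: $url" >&2
    return 1
  fi
  echo "preflight ok: $url"
}

# Usage:
#   preflight "https://example.com/noisy-podcast.mp3"
```

Some hosts reject HEAD requests even though a normal download works, so treat a failure here as a hint rather than proof that the URL is unusable.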

More skills

dialogue-audio

by inferen-sh

Create realistic multi-speaker dialogue audio with Dia TTS and ElevenLabs via the inference.sh CLI. The dialogue-audio skill helps you control speakers, emotion, pacing, and conversation flow for podcasts, audiobooks, explainers, character scenes, and other conversational content.

Voice Generation

elevenlabs-stt

by inferen-sh

High-accuracy ElevenLabs speech-to-text via inference.sh CLI using Scribe v1/v2 models. Supports transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, and subtitle generation for meetings, podcasts, and other audio workflows.

Audio Editing

elevenlabs-voice-changer

by inferen-sh

ElevenLabs voice changer skill using the inference.sh CLI (infsh) to transform recorded speech into a different synthetic voice while preserving content and emotion. Supports eleven_multilingual_sts_v2 (70+ languages) and eleven_english_sts_v2 for speech-to-speech, accent change, and voice disguise in content creation, dubbing, and character voices.

Voice Generation

ai-voice-cloning

by inferen-sh

ai-voice-cloning is an inference.sh-based skill for AI voice generation, text-to-speech, and voice cloning from the CLI. It wraps ElevenLabs, Kokoro TTS, DIA, Chatterbox, Higgs, and VibeVoice models for natural speech, multi-voice narration, and voice transformation for audio and video projects.

Voice Generation

elevenlabs-music

by inferen-sh

Generate original AI music from text prompts using the inference.sh CLI and ElevenLabs. Control duration, style, and mood to create royalty-free background music, soundtracks, jingles, podcast beds, and game audio directly from your terminal.

Audio Editing

ai-podcast-creation

by inferen-sh

Create AI-powered podcasts and voice content from text using Kokoro TTS, DIA TTS, and the inference.sh CLI. Mix multiple voices, add music, and assemble full episodes for podcasts, audiobooks, and audio newsletters.

Voice Generation

elevenlabs-sound-effects

by inferen-sh

Generate AI sound effects from text prompts using ElevenLabs via the inference.sh CLI. Ideal for video editors, game developers, podcasters, filmmakers, and content creators who need fast, royalty-free sound design. Supports text-to-sound-effect, adjustable duration, and prompt control for cinematic, ambient, and game-ready SFX.

Audio Editing

ai-music-generation

by inferen-sh

Generate AI music and full songs from text prompts using ElevenLabs Music, Diffrythm, and Tencent Song Generation via the inference.sh CLI. Ideal for background tracks, soundtracks, social clips, podcasts, and royalty-free music. Supports fast song generation, instrumentals, and full vocal songs.

Voice Generation
