elevenlabs-voice-isolator

by inferen-sh

CLI-driven ElevenLabs voice isolator skill for removing background noise and isolating vocals from audio via inference.sh. Ideal for podcast cleanup, interviews, music vocals, noisy recordings, and audio restoration workflows.

Category: Audio Editing
Install Command
npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-isolator
Overview

What is elevenlabs-voice-isolator?

The elevenlabs-voice-isolator skill is a command-line audio cleanup tool that uses the ElevenLabs Voice Isolator app through the inference.sh (infsh) CLI. It focuses on removing background noise and isolating spoken voice or vocals from an input audio file.

It is built as a reusable skill inside the inferen-sh/skills repository, so you can call it from compatible agent environments or from your own terminal as long as you have the infsh CLI set up.

Key capabilities

Using the ElevenLabs voice isolator model via infsh, this skill can:

  • Remove ambient background noise (room tone, hum, traffic, crowd noise)
  • Isolate voices or vocals from a noisy recording
  • Clean up podcast tracks and interview recordings
  • Improve intelligibility of speech in difficult environments
  • Support common audio formats (WAV, MP3, FLAC, OGG, AAC)
  • Handle long recordings (up to 1 hour and 500MB per file, according to the skill docs)

Who is this skill for?

Use elevenlabs-voice-isolator if you:

  • Record podcasts and want cleaner voice tracks without manual noise reduction
  • Capture remote interviews and need to reduce background noise from guests
  • Work with music demos or vocal takes and want to better isolate the vocal line
  • Maintain audio archives and want basic speech-focused restoration
  • Build AI agents or automation that must clean audio on the fly using a CLI tool

If you already use ffmpeg or a DAW but want a higher-level voice isolation step accessible from the terminal or an agent, this skill fits that niche.

When it’s a good fit (and when it isn’t)

A good fit when:

  • Your main goal is voice isolation or speech cleanup, not full multitrack audio mixing.
  • You are comfortable running CLI commands (Bash) and working with URLs or local files.
  • You can install and authenticate the inference.sh CLI (infsh).

Not the best fit when:

  • You need deep editing, multitrack mixing, or effects chains inside a GUI DAW.
  • Your workflow is entirely offline and you cannot use the infsh CLI or external model calls.
  • You require fine-grained, frame-level control over the DSP process instead of a model-driven isolator.

How to Use

Prerequisites

Before using elevenlabs-voice-isolator, make sure you have:

  1. inference.sh CLI (infsh) installed

    • The skill’s quick start references infsh and links to CLI install instructions.
    • Follow the latest installation instructions from:
      • https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
  2. Access to the ElevenLabs Voice Isolator app via infsh

    • The skill calls elevenlabs/voice-isolator through infsh app run.
  3. Bash-capable environment

    • The skill’s allowed-tools include Bash(infsh *), so it is designed for Bash shells and CLI workflows.
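
The prerequisites above can be checked from a script before anything else runs. The following is a minimal sketch: require_cmds is a hypothetical helper name, and it only confirms that a command is on PATH, not that you are logged in or have access to the app.

```bash
# Return 0 if every named command is on PATH; otherwise report what is missing.
# require_cmds is a hypothetical helper, not part of infsh or the skill.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check for the CLI before attempting any app run.
if require_cmds infsh; then
  echo "infsh found"
else
  echo "install infsh first (see cli-install.md)" >&2
fi
```

Running this at the top of a wrapper script gives a clear message instead of a "command not found" error partway through a batch.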

Basic installation in an agent skills environment

If you are using an environment that supports npx skills and the inferen-sh/skills repository, you can add the skill with:

npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-voice-isolator

This makes the elevenlabs-voice-isolator skill available alongside other tools from the same repo. Once added, your agent or tooling can invoke the underlying infsh commands defined by the skill.

Log in to inference.sh

Before running any audio isolation, authenticate the CLI:

```bash
infsh login
```

Follow the prompts to complete login. This step is required for the subsequent infsh app run commands to work.

Run a simple voice isolation command

The core usage pattern for elevenlabs-voice-isolator through infsh looks like this:

infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-recording.mp3"}'

The URL https://noisy-recording.mp3 is a placeholder; replace it with the URL of your own noisy audio file. The app processes the input and returns a response (typically JSON) with references to the cleaned audio.
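
When the URL comes from a variable rather than being typed by hand, it helps to build the --input JSON in one place so that quoting stays correct. The sketch below assumes the {"audio": "<url>"} shape shown in the command above; json_escape, build_input, and isolate_voice are hypothetical helper names, and the escaping covers only backslashes and double quotes.

```bash
# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Control characters are not handled; URLs should not contain them.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the --input payload in the shape used by the examples: {"audio": "<url>"}.
build_input() {
  printf '{"audio": "%s"}' "$(json_escape "$1")"
}

# Hypothetical wrapper; the infsh invocation matches the documented command.
isolate_voice() {
  infsh app run elevenlabs/voice-isolator --input "$(build_input "$1")"
}

# Show the payload without calling the service.
build_input 'https://example.com/noisy-recording.mp3'; echo
# prints {"audio": "https://example.com/noisy-recording.mp3"}
```

With these defined, isolate_voice "$url" runs the same command as the manual example for whatever URL the variable holds.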

Supported audio formats and limits

According to the skill documentation, the ElevenLabs voice isolator supports:

  • WAV – up to 500MB, max 1 hour
  • MP3 – up to 500MB, max 1 hour
  • FLAC – up to 500MB, max 1 hour
  • OGG – up to 500MB, max 1 hour
  • AAC – up to 500MB, max 1 hour

For best stability, stay within these sizes and durations when preparing audio for elevenlabs-voice-isolator.
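
A local pre-flight check can catch unsupported or oversized files before they are uploaded anywhere. This is a sketch under two assumptions: the format is judged by file extension only, and "500MB" is read as 500 × 1024 × 1024 bytes. check_audio_file is a hypothetical helper name, and duration is not checked (that would need a tool such as ffprobe).

```bash
# Assumption: "500MB" is interpreted as 500 MiB.
MAX_BYTES=$((500 * 1024 * 1024))

# Succeeds when the file has a listed extension and is within the size limit.
# check_audio_file is a hypothetical helper; duration is not checked here.
check_audio_file() {
  local file=$1 ext size
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }
  # Lowercase the extension so .MP3 and .mp3 are treated alike.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    wav|mp3|flac|ogg|aac) ;;
    *) echo "unsupported extension: .$ext" >&2; return 1 ;;
  esac
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "too large: $size bytes" >&2
    return 1
  fi
}
```

A call such as check_audio_file episode.mp3 && echo ok fits naturally in front of an upload or processing step.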

Example: Clean up a podcast recording

This example mirrors the skill’s own quick-start scenario for podcast cleanup:

# Remove background noise from a podcast recording
infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-podcast.mp3"}'

Use this pattern for any spoken-word content where you want clearer narration or dialogue. Host your file somewhere accessible over HTTPS (or follow current infsh guidance for local file usage if supported in your environment).

Example: Clean an interview recording

To improve an interview with room noise or street sounds, adjust the input URL:

infsh app run elevenlabs/voice-isolator --input '{"audio": "https://noisy-interview-file.mp3"}'

You can integrate this command into scripts that automatically clean every new interview file before editing.
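
One way such a script might look is a loop over a text file of URLs, one per line. This is a sketch, not the skill's own tooling: clean_all and the RUNNER variable are hypothetical, RUNNER exists only so the loop can be dry-run with echo, and the URLs are assumed to contain no quotes or backslashes.

```bash
# Read one URL per line from a list file and run the isolator on each.
# clean_all and RUNNER are hypothetical; RUNNER=echo gives a dry run.
clean_all() {
  local list=$1 runner=${RUNNER:-infsh} url
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines and comment lines.
    case "$url" in ''|'#'*) continue ;; esac
    # </dev/null stops the command from consuming the rest of the list.
    "$runner" app run elevenlabs/voice-isolator \
      --input "{\"audio\": \"$url\"}" </dev/null \
      || echo "failed: $url" >&2
  done < "$list"
}
```

Running RUNNER=echo clean_all urls.txt prints the commands that would be executed, which is a cheap way to review a batch before spending any processing time on it.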

Integrating with your own tools and agents

Because elevenlabs-voice-isolator is defined as a skill in inferen-sh/skills:

  • Agents: An AI agent that can call Bash(infsh *) can use this skill to clean audio as part of a pipeline (for example, isolation → transcription → summarization).
  • CLI pipelines: You can wrap infsh app run elevenlabs/voice-isolator inside shell scripts, CI workflows, or batch processing tools.
  • Audio post-production: Use this as a pre-processing step before importing the cleaned file into a DAW or editor like Audacity, Reaper, or Adobe Audition.
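
In a pipeline like these, the next stage needs the location of the cleaned file. The response schema is not documented here, so the sketch below makes no assumption about field names and simply pulls the first HTTPS URL out of whatever text it receives. first_url is a hypothetical helper, and the sample response is invented for illustration.

```bash
# Print the first https:// URL found in the text on stdin.
# Assumption: the response includes the cleaned file's URL as a JSON string;
# the real field name and structure may differ.
first_url() {
  grep -oE 'https://[^"[:space:]]+' | head -n 1
}

# Hypothetical response, for illustration only.
sample='{"output": {"audio": "https://example.com/cleaned.mp3"}}'
printf '%s\n' "$sample" | first_url
# prints https://example.com/cleaned.mp3
```

If the CLI prints its JSON response to standard output, piping the infsh app run command into first_url would hand the cleaned file's URL to a transcription or download step. For anything long-lived, parsing the documented field with a JSON tool is more robust than pattern matching.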

Files and configuration to inspect

Within the inferen-sh/skills repository, open:

  • tools/audio/elevenlabs-voice-isolator/SKILL.md

This file contains the skill's description and example usage commands. The skill file exposes no complex per-user configuration, but the CLI and app may offer additional options documented elsewhere in the inference.sh ecosystem.

FAQ

What does elevenlabs-voice-isolator actually do to my audio?

The elevenlabs-voice-isolator skill sends your audio to the ElevenLabs Voice Isolator model via the inference.sh CLI. The model focuses on separating and enhancing the voice while reducing background noise. The result is an audio output where speech or vocals are clearer and less noisy, suitable for podcasts, interviews, and similar content.

Do I need the inference.sh CLI to use elevenlabs-voice-isolator?

Yes. The published quick start shows usage through the inference.sh CLI (infsh). You must install and authenticate infsh before running the example commands or integrating the skill into an agent.

Which audio formats can I process?

Based on the skill’s documentation, elevenlabs-voice-isolator supports:

  • WAV, MP3, FLAC, OGG, and AAC
  • Up to 500MB file size and 1 hour duration per file

If your files exceed these limits, trim them to fit the duration limit, or downsample or re-encode them to fit the size limit, before processing.
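
For recordings longer than an hour, ffmpeg can either trim the file or split it into pieces that each fit the limit. Splitting is an alternative to trimming, not something the skill docs prescribe. In this sketch the function names and the part_000-style output naming are illustrative choices, and both ffmpeg calls copy the audio stream without re-encoding.

```bash
# Number of pieces needed so each is at most `limit` seconds (ceiling division).
chunk_count() {
  local seconds=$1 limit=${2:-3600}
  echo $(( (seconds + limit - 1) / limit ))
}

# Keep only the first hour, copying the stream without re-encoding.
trim_to_hour() {
  local src=$1 dst=$2
  ffmpeg -i "$src" -t 3600 -c copy "$dst"
}

# Split into one-hour pieces with ffmpeg's segment muxer.
# Output names (part_000.ext, part_001.ext, ...) are an illustrative choice.
split_hourly() {
  local src=$1 ext=${1##*.}
  ffmpeg -i "$src" -f segment -segment_time 3600 -c copy "part_%03d.$ext"
}

chunk_count 5400   # a 90-minute recording; prints 2
```

Each piece can then be processed separately and the cleaned results joined again in an editor.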

Can I run elevenlabs-voice-isolator on local files instead of URLs?

The examples in SKILL.md use HTTPS URLs for the audio field. Whether local paths are supported depends on current infsh capabilities and configuration. Check the latest inference.sh CLI documentation for how to reference local files (for example, via upload or local path conventions) and adapt your --input argument accordingly.

Is elevenlabs-voice-isolator suitable for music production?

It can be helpful for isolating vocals or cleaning noisy demo recordings, but it is not a full music production suite. Use it as a pre-processing or utility step, then finish detailed mixing and mastering in your DAW.

How does this differ from traditional noise reduction in a DAW?

Traditional DAW noise reduction often requires noise prints, manual tuning, and real-time monitoring. elevenlabs-voice-isolator is a model-based, batch-style process accessed via CLI. You pass an audio file, the model performs isolation and noise removal, and you receive a processed output. This is convenient for automated or large-scale cleanup, especially when paired with agents or scripts.

What if I just want a simple denoise filter without voice isolation?

The elevenlabs-voice-isolator skill focuses on voice isolation and background removal together. If you only need basic denoising or EQ, a local ffmpeg filter or DAW plugin may be simpler. Use this skill when you specifically want voice separation and enhanced speech clarity driven by the ElevenLabs model.

Where can I learn more or troubleshoot issues?

For the most accurate and current details:

  • Open tools/audio/elevenlabs-voice-isolator/SKILL.md in the inferen-sh/skills repository.
  • Review the general infsh installation and usage guide at cli-install.md in the same repo.
  • Consult inference.sh and ElevenLabs documentation for service-specific limits, authentication, and error codes.

If something fails, start by confirming infsh login succeeds, your audio URL is reachable, and your file respects the supported formats and size/duration limits.
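
The URL part of that checklist can be scripted. This sketch uses two hypothetical helpers: is_https_url is a pure string check, and url_reachable sends a HEAD request with curl. Some hosts reject HEAD requests, so a failure here is a hint rather than proof that the file is unavailable.

```bash
# True when the argument looks like an https:// URL with something after the scheme.
is_https_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# True when a HEAD request succeeds (following redirects, 15 s timeout).
# Needs network access; some servers do not answer HEAD requests.
url_reachable() {
  curl -sfIL --max-time 15 "$1" >/dev/null
}

# Combine both checks for one input.
check_audio_url() {
  is_https_url "$1" || { echo "not an https URL: $1" >&2; return 1; }
  url_reachable "$1" || { echo "not reachable: $1" >&2; return 1; }
}
```

Calling check_audio_url on the input before infsh app run helps separate a bad link from a problem with login or file limits.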
