elevenlabs-tts

by inferen-sh

ElevenLabs text-to-speech via inference.sh CLI, with 22+ premium voices, multilingual support, and fast model options for production voice generation workflows.

Added: Mar 27, 2026
Category: Voice Generation
Install Command
npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-tts
Overview

What is elevenlabs-tts?

The elevenlabs-tts skill connects the ElevenLabs text-to-speech API to the inference.sh (infsh) CLI, giving you a fast, scriptable way to turn text into high-quality speech. It exposes ElevenLabs models and voice options as a reusable tool inside the inferen-sh skills ecosystem.

This skill focuses on premium, natural-sounding voices with support for 32 languages and multiple performance tiers so you can choose between maximum quality or ultra-low latency.

Key capabilities

  • Text-to-speech generation from plain text
  • 22+ premium voices accessible via the CLI
  • Model selection for different speed/quality trade-offs:
    • eleven_multilingual_v2 – highest quality, multilingual
    • eleven_turbo_v2_5 – balanced speed and quality
    • eleven_flash_v2_5 – ultra-fast, low latency
  • Voice selection from the ElevenLabs voice library
  • Designed for CLI and automation workflows using infsh

Who is elevenlabs-tts for?

This skill is aimed at users who:

  • Already use, or are comfortable with, a command line interface
  • Want to automate or batch-produce voiceovers and narration
  • Need consistent, reusable voices across projects
  • Work within the inference.sh / inferen-sh skills ecosystem

Typical users include:

  • Video editors and creators who need voiceovers for YouTube, product demos, and explainer videos
  • Podcasters and audio producers generating intros, outros, and segments
  • E-learning and training teams producing course narration
  • Developers building IVR, assistants, or accessibility features that require natural speech

When is elevenlabs-tts a good fit?

Use elevenlabs-tts when you:

  • Need reliable, production-ready voices rather than experimental models
  • Want to run everything from the CLI rather than a web UI
  • Need to script or schedule TTS generation as part of CI, pipelines, or batch jobs
  • Are already using, or willing to install, the inference.sh CLI (infsh)

It’s not an ideal fit if you:

  • Only want a point-and-click web interface for manual use
  • Need fine-grained audio editing (cutting, mixing, effects) inside the skill itself — you’ll generate audio here, then edit in a DAW (e.g., Audacity, Reaper, Premiere)
  • Cannot use external CLIs or outbound network access in your environment

How to Use

Prerequisites

Before using elevenlabs-tts, make sure you have:

  • inference.sh CLI (infsh) installed
  • A working infsh login configured
  • Access to the ElevenLabs TTS app through inference.sh

You can find CLI install instructions in the repository’s cli-install.md referenced from SKILL.md.

Step 1 – Install the elevenlabs-tts skill

From a compatible Agent Skills / inferen-sh environment, add the skill:

npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-tts

This pulls the elevenlabs-tts skill from the inferen-sh/skills repository and registers it so your agents or workflows can call it.

Step 2 – Log in with the inference.sh CLI

The skill relies on the infsh CLI to talk to the ElevenLabs backend.

infsh login

Follow the prompts to authenticate. Once you’re logged in, the CLI can run the ElevenLabs TTS app on your behalf.

Step 3 – Run a basic text-to-speech conversion

The quickest way to see elevenlabs-tts in action is by calling the ElevenLabs TTS app directly via infsh:

infsh app run elevenlabs/tts --input '{"text": "Hello, welcome to our product demo.", "voice": "aria"}'

This example:

  • Sends the text "Hello, welcome to our product demo."
  • Uses the "aria" voice (a sample voice ID from the ElevenLabs voice library)
  • Returns generated speech audio (e.g., as a file or stream depending on your infsh configuration)
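If your script text contains double quotes or backslashes, the single-quoted JSON in the example above will break. Here is a minimal POSIX-shell sketch of escaping the text before building the payload (the `json_escape` helper is ours, not part of infsh; the payload shape matches the example above):

```shell
#!/bin/sh
# Minimal JSON string escaping for quotes and backslashes (does not handle
# newlines or control characters; use jq for anything more complex).
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

TEXT='She said "hello" and left.'
INPUT="{\"text\": \"$(json_escape "$TEXT")\", \"voice\": \"aria\"}"
echo "$INPUT"
# Then pass it along:
# infsh app run elevenlabs/tts --input "$INPUT"
```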

Once the skill is integrated, your agents can call this same capability programmatically.

Step 4 – Choose the right ElevenLabs model

The elevenlabs-tts skill supports multiple models, each tuned for a specific balance of quality and latency:

  • eleven_multilingual_v2

    • Best for: highest quality, long-form content, and 32-language support
    • Typical use: audiobooks, course narration, branded voiceovers
  • eleven_turbo_v2_5

    • Best for: a balanced mix of quality and speed
    • Typical use: product demos, marketing videos, internal training
  • eleven_flash_v2_5

    • Best for: ultra-low latency responses where speed is critical
    • Typical use: chatbots, assistants, IVR systems that must respond quickly

How you specify the model may depend on your infsh app run configuration or agent wiring. Check your local toolchain docs for how to pass model IDs as parameters when using this skill.
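As a sketch of what that might look like, the payload below adds an explicit model choice. Note that the "model" field name is an assumption on our part; confirm it against the elevenlabs/tts app schema before relying on it:

```shell
#!/bin/sh
# Sketch: build the --input payload with an explicit model choice.
# The "model" field name is an assumption; verify it against the
# elevenlabs/tts app schema.
make_input() {
  printf '{"text": "%s", "voice": "%s", "model": "%s"}' "$1" "$2" "$3"
}

# Low-latency call for an IVR-style prompt:
INPUT=$(make_input "Thanks for calling. How can I help?" "aria" "eleven_flash_v2_5")
echo "$INPUT"
# infsh app run elevenlabs/tts --input "$INPUT"
```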

Step 5 – Integrate into your workflows

Once installed and tested, you can:

  • Wire elevenlabs-tts into agent prompts so text responses are automatically converted to speech
  • Use it in CLI scripts to batch-generate voiceovers from a list of text files
  • Add it to CI pipelines to automatically produce updated narration when documentation or scripts change
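A dry-run sketch of the batch pattern above: it walks a directory of script files and prints the infsh command that would be run for each. Remove the leading echo to actually execute; the directory layout and the "aria" voice are illustrative, and output handling depends on your infsh configuration:

```shell
#!/bin/sh
# Dry-run batch voiceover generation: one clip per .txt script file.
mkdir -p scripts
printf 'Hello from scene one.' > scripts/scene1.txt   # demo input

for f in scripts/*.txt; do
  text=$(cat "$f")
  # Print the command rather than running it (dry run):
  echo infsh app run elevenlabs/tts --input "{\"text\": \"$text\", \"voice\": \"aria\"}"
done
```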

For deeper context on how the skill is defined and any helper logic, open the following repo file:

  • tools/audio/elevenlabs-tts/SKILL.md

That file documents the skill metadata, description, and any specific notes about allowed tools (it currently allows Bash via infsh).


FAQ

What does the elevenlabs-tts skill actually do?

The elevenlabs-tts skill provides a preconfigured way for agents and CLI workflows to call ElevenLabs text-to-speech through the inference.sh CLI. It focuses on generating natural-sounding speech audio from plain text, with access to multiple models and voices.

Do I need the inference.sh CLI to use elevenlabs-tts?

Yes. The repository’s SKILL.md explicitly references infsh and the inference.sh CLI as a requirement. You must install the CLI, run infsh login, and ensure it can access the elevenlabs/tts app.

What kinds of projects is elevenlabs-tts best for?

This skill is well-suited for:

  • Voiceovers for product demos, tutorials, and marketing videos
  • Audiobooks and long-form narration, especially using eleven_multilingual_v2
  • E-learning and training narration
  • Podcasts and trailers (intros, outros, scripted segments)
  • Accessibility and IVR systems that need clear, natural voices

Can I use elevenlabs-tts for real-time applications?

For more responsive use cases, choose eleven_turbo_v2_5 or eleven_flash_v2_5, which are designed for lower latency than the highest-quality multilingual model. Actual “real-time” behavior will depend on your network and integration, but these models are intended to support faster turnarounds.

How many voices does elevenlabs-tts support?

The skill description in SKILL.md notes 22+ premium voices. You can select among these using the voice field (for example, "aria") when calling infsh app run elevenlabs/tts or when wiring the skill into your agents.

Does elevenlabs-tts support multiple languages?

Yes. The eleven_multilingual_v2 model is described as supporting 32 languages, making elevenlabs-tts suitable for multilingual narration and global products. Other models may be more optimized for latency but still offer broad language support through ElevenLabs.
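For multilingual narration, the call looks the same with non-English text and the multilingual model selected. As before, the "model" field name is an assumption to verify against the app schema:

```shell
#!/bin/sh
# Sketch: French narration using the multilingual model.
# The "model" field name is an assumption; confirm against the app schema.
INPUT='{"text": "Bienvenue dans notre démonstration produit.", "voice": "aria", "model": "eleven_multilingual_v2"}'
echo "$INPUT"
# infsh app run elevenlabs/tts --input "$INPUT"
```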

Where can I see how the skill is configured?

Look in the inferen-sh/skills repository under:

  • tools/audio/elevenlabs-tts/SKILL.md

This file contains the official description, allowed tools, and pointers to installation information for the inference.sh CLI.

Can I edit audio inside elevenlabs-tts?

No. The elevenlabs-tts skill focuses on audio generation, not editing. You’ll typically:

  1. Use elevenlabs-tts to generate clean speech audio from text.
  2. Import that audio into a DAW or video editor (e.g., Audacity, Reaper, Premiere, Resolve) for cutting, mixing, and adding effects.

What if I only want a web UI, not a CLI?

If you prefer a purely web-based workflow, elevenlabs-tts may not be the best fit, because it is built around the inference.sh CLI and agent skills ecosystem. In that case, consider using ElevenLabs’ own web dashboard or other UI-focused tools.
