
elevenlabs-music

by inferen-sh

Generate original AI music from text prompts using the inference.sh CLI and ElevenLabs. Control duration, style, and mood to create royalty-free background music, soundtracks, jingles, podcast beds, and game audio directly from your terminal.

Category: Audio Editing
Install Command
npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-music
Overview

What is elevenlabs-music?

elevenlabs-music is a command-line focused AI music generation skill that connects your agent or terminal workflow to the ElevenLabs music model via the inference.sh (infsh) CLI.

With a short text prompt, you can generate original, royalty-free music tailored to your project. The skill wraps the elevenlabs/music app on inference.sh so you can:

  • Turn text descriptions into music (text-to-music)
  • Control track duration from 5 seconds up to 10 minutes
  • Steer genre, mood, and instrumentation in your prompt
  • Produce audio suitable for commercial use, such as videos, podcasts, and games

Who is this skill for?

elevenlabs-music is designed for:

  • Creators and editors who need quick custom background tracks for YouTube, TikTok, livestreams, podcasts, or trailers
  • Game and app developers who want adaptive, on-demand music beds for levels, menus, or in-app experiences
  • Marketers and brand teams creating jingles, short cues, and ad-friendly music without hiring a composer for every variation
  • Developers and agent builders who want a predictable CLI/API-style interface to generate music from within scripts, automations, or AI agents

If you already use inference.sh or build workflows around CLI tools, elevenlabs-music fits naturally into your stack.

What problems does elevenlabs-music solve?

This skill helps when you need:

  • Fast, royalty-free music without digging through stock libraries
  • Consistent style on demand (e.g., multiple tracks with a similar vibe for a series)
  • Automation-friendly audio creation, where an agent or script can generate music in response to user input or content metadata

Because it runs via infsh app run elevenlabs/music, you can integrate it into shell scripts, CI pipelines, or chat-based agents without building a custom API layer.

When is elevenlabs-music a good fit?

Use elevenlabs-music when:

  • You are comfortable with basic CLI commands or agent tools that call the CLI
  • You want to generate background music, ambiences, or simple cues rather than fully structured vocal songs
  • You need quick iteration: try multiple prompts and durations to find the right track

It may be less suitable if:

  • You require fine-grained musical arrangement (bars, tempo maps, chord progressions) controlled programmatically
  • You need vocal performance, lyrics alignment, or multi-stem exports (e.g., separate drums, bass, vocals)
  • You do not want to use the inference.sh CLI at all; this skill depends on infsh

How to Use

1. Prerequisites and installation

Check your environment

Before using elevenlabs-music, make sure you have:

  • A system where you can install and run the inference.sh CLI (infsh)
  • Network access so infsh can call the ElevenLabs-powered elevenlabs/music app

Install the skill into your agent environment

If you are using the skills loader described in the inferen-sh/skills repo, install elevenlabs-music with:

npx skills add https://github.com/inferen-sh/skills --skill elevenlabs-music

This pulls the skill definition from the repository and makes it available to your agent tooling.

Install the inference.sh CLI

elevenlabs-music relies on the infsh CLI. Follow the official installation instructions from the repo:

  • CLI install guide: https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md

After installation, verify it works:

infsh --help

If the command responds with help text, your CLI is ready.

2. Log in to inference.sh

Before generating music, authenticate your CLI session:

infsh login

Follow the interactive prompts to complete login. This links infsh to your inference.sh account and enables access to the elevenlabs/music app.

3. Generate your first AI music track

Basic text-to-music command

Once logged in, you can immediately generate music with a simple prompt:

infsh app run elevenlabs/music --input '{"prompt": "Upbeat electronic dance track with driving synths"}'

This command sends your description to the ElevenLabs music model via inference.sh. The output will be JSON containing references to the generated audio (such as URLs or file metadata, depending on your CLI configuration).

4. Control duration and style

The elevenlabs-music skill exposes parameters from the underlying app so you can tune results.

Available parameters

  • prompt (string, required)
    • A natural language description of the music you want (max 2000 characters).
    • Include genre, mood, tempo, and instruments where possible.
  • duration_seconds (number, optional)
    • Default: 30
    • Min: 5, Max: 600 (up to 10 minutes)
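When generating durations programmatically (for example, matching a video clip's length), it is worth clamping requests to the documented 5–600 second range before calling the CLI. A minimal helper, assuming only the bounds stated above:

```python
def clamp_duration(seconds: float) -> int:
    """Clamp a requested duration to the documented 5-600 second range."""
    return int(min(600, max(5, seconds)))
```

This keeps automated callers from submitting out-of-range values and relying on server-side error handling.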

Usage examples

Example: short background sting (10 seconds)

infsh app run elevenlabs/music --input '{
  "prompt": "Short cinematic logo sting with orchestral hit and subtle whoosh",
  "duration_seconds": 10
}'

Example: lo-fi study beats (2 minutes)

infsh app run elevenlabs/music --input '{
  "prompt": "Lo-fi hip hop beat, chill study music, vinyl crackle, mellow piano",
  "duration_seconds": 120
}'

These examples show how you can adapt duration for intros, stingers, or longer background beds.
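To keep a consistent style across several cues (as mentioned under "Consistent style on demand"), one approach is to share a base style string and vary only the cue name and duration. A sketch, where the cue names and durations are arbitrary examples:

```python
import json

# Shared style fragment so all cues in a series sound related.
base_style = "lo-fi hip hop, vinyl crackle, mellow piano"

# (cue name, duration in seconds) -- illustrative values.
cues = [("intro", 10), ("background", 120), ("outro", 15)]

payloads = {
    name: json.dumps({"prompt": f"{base_style}, {name} cue",
                      "duration_seconds": secs})
    for name, secs in cues
}
```

Each payload can then be passed to `infsh app run elevenlabs/music --input '<payload>'` in a loop.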

5. Interpreting the output

The ElevenLabs music generation runs inside the inference.sh app environment and returns JSON output. While the exact structure can vary over time, you can generally expect keys that reference the generated audio (for example, a URL to the rendered file or an ID within inference.sh).

Typical next steps:

  • Parse the JSON in your script or agent
  • Download the audio file for use in your editor (DAW, video editor, podcast tool)
  • Store metadata (prompt, duration, timestamp) alongside your media assets for later re-generation or documentation
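Since the exact output structure can vary, a defensive way to parse it is to search the JSON for the first URL-like string instead of hard-coding a key path. The nesting in the sample below is hypothetical; only "the output is JSON referencing the audio" comes from this page.

```python
import json

# Hypothetical output payload -- the real key names may differ.
raw = '{"output": {"audio": {"url": "https://example.com/track.mp3"}}}'

def find_audio_ref(obj):
    """Depth-first search for the first URL-like string in the output."""
    if isinstance(obj, str) and obj.startswith("http"):
        return obj
    if isinstance(obj, dict):
        for v in obj.values():
            if (found := find_audio_ref(v)):
                return found
    if isinstance(obj, list):
        for v in obj:
            if (found := find_audio_ref(v)):
                return found
    return None

audio_ref = find_audio_ref(json.loads(raw))
```

Storing the prompt and duration alongside `audio_ref` makes later re-generation reproducible.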

6. Using elevenlabs-music inside agents and workflows

Because this skill is defined in the inferen-sh/skills repository and marked to use Bash via infsh, agents can:

  • Call infsh app run elevenlabs/music when they detect a user intent such as “generate background music for my video intro”
  • Dynamically construct the prompt and duration_seconds based on user instructions
  • Return the music link or file reference to the user or to downstream tools

This makes elevenlabs-music useful for:

  • Multi-step content pipelines (e.g., generate script → generate images → generate matching music)
  • Chat-based creative assistants that can supply custom soundtracks on request

7. Files and configuration to review

After installing the skill, open these files in the inferen-sh/skills repository to understand or customize behavior:

  • SKILL.md (root-level for this skill): High-level description and quick-start commands
  • tools/audio/elevenlabs-music/ (if present in your clone): Implementation details and any helper scripts

These files document how the skill is wired to the CLI and clarify any changes or updates.

FAQ

Is elevenlabs-music free to use?

elevenlabs-music itself is a skill definition that connects to the elevenlabs/music app via inference.sh. Any usage costs or limits come from your inference.sh and ElevenLabs configuration, not from this skill directly.

Check your inference.sh account and ElevenLabs plan for pricing, quotas, and rate limits before heavy use.

What kind of music can elevenlabs-music generate?

The underlying ElevenLabs model is aimed at instrumental and background-style tracks driven by natural language prompts. You can describe:

  • Genres: lo-fi, EDM, cinematic, ambient, rock, orchestral, etc.
  • Moods: upbeat, dark, suspenseful, relaxing, uplifting
  • Contexts: study music, trailer score, game level theme, podcast intro, advertisement bed

Use detailed prompts (mood + genre + instruments + context) to improve results.
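The mood + genre + instruments + context recipe can be automated when an agent assembles prompts from structured user input. A minimal sketch (the field names are illustrative, not part of the skill's schema):

```python
def build_prompt(genre: str, mood: str, instruments: list[str], context: str) -> str:
    """Join mood, genre, instruments, and context into one prompt string,
    dropping any empty fields. The 2000-character limit is documented above."""
    parts = [mood, genre, *instruments, f"for {context}"]
    prompt = ", ".join(p for p in parts if p)
    if len(prompt) > 2000:
        raise ValueError("prompt exceeds the 2000-character limit")
    return prompt

prompt = build_prompt("lo-fi hip hop", "relaxing",
                      ["mellow piano", "vinyl crackle"],
                      "study session background")
```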

How long can the generated tracks be?

You can set duration_seconds between 5 seconds and 600 seconds:

  • Minimum: 5
  • Maximum: 600 (10 minutes)

If you omit the parameter, it defaults to 30 seconds.

How do I change the duration of the music?

Include duration_seconds in the JSON you pass to --input:

infsh app run elevenlabs/music --input '{
  "prompt": "Epic orchestral battle music",
  "duration_seconds": 300
}'

Adjust the number to your required length, within the 5–600 second limits.

Can I use elevenlabs-music tracks commercially?

The SKILL description states royalty-free commercial use as a capability of the ElevenLabs AI music generation via inference.sh. However, always confirm current licensing and terms directly with ElevenLabs and inference.sh, as policies can change over time.

Do I need to write code to use elevenlabs-music?

You do not need full application code, but you should be comfortable with:

  • Running commands in a terminal
  • Providing JSON input via the --input flag

For deeper integration (e.g., inside a web app or agent platform), your code will typically shell out to infsh or use whatever mechanism your agent framework provides for calling CLI tools.
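A typical shell-out from application code might look like the sketch below. It assumes `infsh` is on `PATH` and prints JSON to stdout; both assumptions should be checked against your installed CLI version.

```python
import json
import shutil
import subprocess

def generate_music(prompt: str, duration_seconds: int = 30) -> dict:
    """Run `infsh app run elevenlabs/music` and return its parsed JSON output.

    Assumes the CLI is installed and emits JSON on stdout; verify both
    against your infsh version before relying on this in production.
    """
    if shutil.which("infsh") is None:
        raise RuntimeError("infsh CLI not found; see the install guide above")
    payload = json.dumps({"prompt": prompt,
                          "duration_seconds": duration_seconds})
    result = subprocess.run(
        ["infsh", "app", "run", "elevenlabs/music", "--input", payload],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

Agent frameworks that expose a Bash or tool-call interface can invoke the same command directly instead of going through Python.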

Does elevenlabs-music support voice or lyrics?

This skill is focused on music generation from text prompts, not lyric alignment or vocal performance. You can describe vocal-like textures in your prompt (e.g., “choir pads” or “vocal chops”), but precise lyric-to-melody generation is outside this skill’s documented scope.

Where can I find more details or updates?

Visit the skill in the inferen-sh/skills repository:

  • Repo: https://github.com/inferen-sh/skills
  • Skill path: tools/audio/elevenlabs-music

Check SKILL.md and related files for the latest examples, parameters, and CLI usage notes. If the CLI or app name changes, those files will typically be updated first.
