
ai-video-generation

by inferen-sh

Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via the inference.sh CLI. Supports text-to-video, image-to-video, lipsync, avatar animation, video upscaling, and foley sound for social media clips, marketing content, explainers, and product demos.

Added: Mar 27, 2026
Category: Video Editing
Install Command
npx skills add https://github.com/inferen-sh/skills --skill ai-video-generation
Overview

What is ai-video-generation?

The ai-video-generation skill connects your agent to the inference.sh CLI so it can generate and edit videos with Google Veo, Seedance, Wan, Grok and 40+ AI video models. It is designed for workflows where an AI assistant needs to call a CLI tool (via Bash) to create and refine short-form and long-form video assets.

The skill currently declares Bash(infsh *) as its allowed tool, which means agents can safely run infsh commands to trigger AI video generation and related processing steps.

Key capabilities

Using the underlying models and the infsh CLI, ai-video-generation can support workflows such as:

  • Text-to-video (T2V): Turn natural language prompts into fully rendered video clips.
  • Image-to-video (I2V): Animate a still image into a moving sequence.
  • Lipsync & avatars: Drive faces and characters with audio to create talking-head or presenter-style content (where supported by the selected model).
  • Video upscaling: Enhance resolution and quality of existing footage.
  • Foley and audio: Add or improve soundtracks and ambient audio when provided by the model.

Available models (as described in the skill) include:

  • Google Veo 3.1 / Veo 3 / Veo 3 Fast
  • Seedance 1.5 Pro
  • Wan 2.5
  • Grok Imagine Video
  • OmniHuman, Fabric, HunyuanVideo

and many more via the inference.sh apps catalogue.

Who is this skill for?

ai-video-generation is a good fit if you:

  • Produce social media videos (TikTok, Instagram Reels, YouTube Shorts, X, LinkedIn) and want AI-first visuals.
  • Create marketing assets such as product teasers, launch videos, and ad variations.
  • Build explainers and tutorials where text prompts describe scenes, UI flows, or diagrams that become short videos.
  • Need to prototype AI avatar presenters or talking-head content quickly.
  • Want an agent-driven workflow that programmatically calls the infsh CLI instead of clicking through a web UI.

It is less suitable if you need:

  • A purely GUI-based editor with timeline and manual keyframing.
  • On-premise or offline video generation (inference.sh is a cloud service).
  • Real-time streaming or live video output.

How ai-video-generation fits in your stack

This skill belongs primarily under video editing and content marketing workflows. You can combine it with:

  • Copywriting skills that write scripts and prompts.
  • Image-generation skills that create frames or reference stills, which are then animated via image-to-video.
  • Post-production tools that add branding, captions, and distribution automations after the initial AI render.

Once installed, your agent can:

  1. Draft prompts and storyboards.
  2. Use infsh app run ... commands to render video clips.
  3. Iterate on the prompt until the result matches your creative brief.
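The loop above can be sketched as a small shell helper. The app ID comes from the skill's quick start; commands are echoed rather than executed (a dry run) so you can inspect them before wiring the helper to a live agent:

```shell
# Dry-run sketch of the draft -> render -> iterate loop.
# Swap `echo` for direct execution once infsh is installed and logged in.
APP_ID="google/veo-3-1-fast"   # app ID from the skill's quick start

render() {
  prompt=$1
  # NOTE: prompts containing double quotes would break this naive JSON;
  # build the payload with a proper JSON tool for untrusted input.
  echo "infsh app run $APP_ID --input {\"prompt\": \"$prompt\"}"
}

render "drone shot flying over a forest"                 # first draft
render "drone shot flying over a misty forest at dawn"   # refined iteration
```

Each iteration only changes the prompt string, so the agent can keep calling the same helper until the output matches the brief.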

How to Use

1. Install the ai-video-generation skill

To add this skill to a compatible agent environment using the Skills CLI:

npx skills add https://github.com/inferen-sh/skills --skill ai-video-generation

This pulls the ai-video-generation tool definition from the inferen-sh/skills repository and makes it available to your agent so it can call the infsh CLI through Bash.

After installation, open the SKILL.md file in the tools/video/ai-video-generation directory to see the embedded description and links used by this skill.

2. Install and log in to the inference.sh CLI

The skill depends on the inference.sh CLI (infsh). The repository’s SKILL.md links to installation instructions at:

  • https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md

Follow those steps to install the CLI on your system. Once installed, authenticate:

infsh login

Ensure this works from a regular shell before relying on the agent; the agent invokes the same infsh binary via Bash.
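A minimal preflight check like the following can confirm the CLI is on PATH before the agent depends on it (the install reference is the cli-install.md file linked from SKILL.md):

```shell
# Preflight: confirm the infsh CLI is available before handing it to an agent.
preflight() {
  if command -v infsh >/dev/null 2>&1; then
    echo "infsh found at $(command -v infsh); run 'infsh login' next"
  else
    echo "infsh not found -- install it first (see cli-install.md in the skills repo)"
  fi
}
preflight
```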

3. Quick start: generate your first AI video

The skill’s quick start demonstrates generating a video with Google Veo 3.1 Fast:

# Generate a video with Veo
infsh app run google/veo-3-1-fast --input '{"prompt": "drone shot flying over a forest"}'

In an agent workflow, your assistant will:

  1. Compose the JSON input payload (e.g., prompt text, duration, style options if supported by the app).
  2. Call the allowed Bash tool with an infsh app run ... command.
  3. Parse the CLI response to surface video URLs or IDs back to you.
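Step 1 is the part most likely to break on quoting, so it is worth building the JSON payload programmatically rather than by string concatenation. A sketch, using python3 purely as a stdlib JSON encoder (any JSON tool such as jq works equally well):

```shell
# Build the --input payload safely, validate it, then show the command
# that would be run (dry run -- infsh is not invoked here).
PROMPT="fast-paced montage of city nightlife, neon lights, and skyscrapers"
INPUT=$(python3 -c 'import json, sys; print(json.dumps({"prompt": sys.argv[1]}))' "$PROMPT")

# Validate the payload is well-formed JSON before passing it to the CLI.
echo "$INPUT" | python3 -m json.tool >/dev/null && echo "payload ok"
echo "would run: infsh app run google/veo-3-1-fast --input $INPUT"
```

Using a JSON encoder means prompts containing quotes, commas, or unicode are escaped correctly.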

You can adapt the prompt to your use case, such as:

  • Product demo: "a rotating 3D render of a sleek wireless headset on a dark gradient background"
  • Social teaser: "fast-paced montage of city nightlife, neon lights, and skyscrapers"
  • Explainer: "minimal flat-style animation showing a phone app sending payments across the world"

4. Choose and switch between models

The SKILL.md file documents multiple model categories (for example, Text-to-Video). Each model has an App ID used by infsh.

For text-to-video, the pattern is generally:

infsh app run <APP_ID> --input '{"prompt": "your description here"}'

Examples based on the skill’s model list:

  • High quality with audio (where supported):

    infsh app run google/veo-3 --input '{"prompt": "cinematic close-up of a chef plating gourmet food"}'
    
  • Best quality with frame interpolation (Veo 3.1):

    infsh app run google/veo-3-1 --input '{"prompt": "slow motion shot of waves crashing at sunset"}'
    
  • Fast iterations (Veo 3.1 Fast):

    infsh app run google/veo-3-1-fast --input '{"prompt": "energetic sports highlights reel"}'
    

For image-to-video, lipsync, avatar, or upscaling models, use the model-specific App IDs documented in the repository and adapt the --input JSON fields accordingly (for example, including an image_url, video_url, or audio_url where required by the chosen app).
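As a sketch, an image-to-video call might look like the following. The app ID placeholder and the image_url field name are assumptions for illustration; check the chosen app's documented inputs on inference.sh for the real schema:

```shell
# Hypothetical image-to-video invocation (dry run). <I2V_APP_ID> and the
# image_url field are placeholders -- consult the app's docs for real inputs.
APP_ID="<I2V_APP_ID>"
IMAGE_URL="https://example.com/still.png"
PROMPT="the camera slowly pushes in while leaves drift past"

CMD="infsh app run $APP_ID --input {\"prompt\": \"$PROMPT\", \"image_url\": \"$IMAGE_URL\"}"
echo "$CMD"
```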

5. Integrate into agent prompts and workflows

When wiring ai-video-generation into your agent system:

  • Describe the tool in system prompts: Tell the agent it can generate videos via infsh app run and that model options are available (Veo, Seedance, Wan, etc.).
  • Encourage structured inputs: Ask the agent to build explicit JSON inputs for the CLI, including fields for prompt, duration, and style if supported.
  • Plan for long-running operations: Video generation may take longer than text completions. Design your UX to reflect that (progress messages, polling, etc.).
  • Post-process outputs: Once the CLI returns URLs or file IDs, the agent can write them into project notes, marketing briefs, or downstream automation steps.
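For the long-running-operations point, a generic retry helper is often enough. The sketch below does not assume any particular infsh status subcommand (none is documented here); it simply re-runs an arbitrary check command until it succeeds or a retry budget is exhausted:

```shell
# Generic polling helper: retry a command until success or max attempts.
# Replace `sleep 0` with a real backoff (e.g. `sleep 10`) in production.
poll() {
  cmd=$1; max=$2; i=0
  while [ "$i" -lt "$max" ]; do
    if sh -c "$cmd"; then
      echo "done after $((i + 1)) attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep 0
  done
  echo "gave up after $max attempts"
  return 1
}

poll "true" 3   # stands in for "check whether the render finished"
```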

6. When this skill is not the best fit

You may want a different solution if:

  • You cannot install or use a CLI on the target environment.
  • Your workflow requires strict on-prem compute where external APIs are disallowed.
  • You only need basic trimming or editing of existing footage and no AI generation.

In those cases, look for pure video-editing skills or integrations with desktop NLEs rather than a cloud AI generation stack.

FAQ

What does ai-video-generation actually install?

The ai-video-generation skill installs metadata and tooling configuration from the inferen-sh/skills repository so your agent knows how to call the infsh CLI for AI video generation. It does not itself install the infsh binary or any models. You must install the inference.sh CLI independently using the instructions referenced in SKILL.md.

Do I need an inference.sh account to use ai-video-generation?

Yes. The quick start explicitly uses infsh login, which requires valid credentials for inference.sh. Without an account and login, infsh app run ... commands invoked by the skill will fail.

Which AI video models can I access with this skill?

The skill description lists multiple supported apps, including Google Veo 3.1, Veo 3, Veo 3 Fast, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, and HunyuanVideo, plus many other models available through inference.sh. The exact list and parameters are maintained in the inference.sh catalogue and may evolve over time.

Can I do image-to-video and lipsync, or only text-to-video?

According to the skill description, ai-video-generation supports text-to-video, image-to-video, lipsync, avatar animation, video upscaling, and foley sound, provided you use appropriate models that expose those features through infsh. Check the relevant app documentation on inference.sh for required inputs (for example, image, audio, or video URLs).

How do I control video length, aspect ratio, or style?

Specific control parameters depend on the chosen model’s API surface within inference.sh. The skill itself focuses on wiring the CLI into your agent, not on enforcing a single schema. To adjust duration, aspect ratio, or style, pass the fields supported by the App ID you are using in the --input JSON. Refer to inference.sh app docs for each model for the latest options.
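A dry-run sketch of passing such fields is shown below. The field names duration_seconds and aspect_ratio are purely illustrative, not real schema fields; the actual names depend on the app you choose:

```shell
# Dry run: build an --input payload with hypothetical control fields.
# "duration_seconds" and "aspect_ratio" are illustrative names only.
build_input() {
  python3 -c 'import json, sys; print(json.dumps({
      "prompt": sys.argv[1],
      "duration_seconds": int(sys.argv[2]),
      "aspect_ratio": sys.argv[3],
  }))' "$1" "$2" "$3"
}

INPUT=$(build_input "slow motion shot of waves crashing at sunset" 8 "9:16")
echo "would run: infsh app run <APP_ID> --input $INPUT"
```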

Where are the generated videos stored?

The skill uses the inference.sh CLI, which returns information such as result URLs or IDs. Storage location and retention are managed by inference.sh, not by the skill itself. Typically you will receive a link or reference that you can download, embed in a CMS, or feed into subsequent tools.

Can I run ai-video-generation in CI/CD or headless environments?

Yes, as long as the environment can install and authenticate the infsh CLI and your agent runtime can execute Bash commands. This makes it possible to script bulk marketing video generation, social content variations, or automated preview clips as part of a pipeline.
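A bulk pipeline could be sketched as a loop over prompts, one infsh call each. This is a dry run with made-up prompts; in CI you would execute each command and collect the returned URLs or IDs:

```shell
# Dry-run sketch of headless bulk generation: one command per prompt line.
APP_ID="google/veo-3-1-fast"

gen_all() {
  while IFS= read -r prompt; do
    [ -n "$prompt" ] || continue
    echo "infsh app run $APP_ID --input {\"prompt\": \"$prompt\"}"
  done <<EOF
product teaser: rotating wireless headset on a dark gradient
social teaser: neon city nightlife montage
explainer: flat-style animation of a payments app
EOF
}

gen_all
```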

Is ai-video-generation a good choice for traditional video editing?

Use ai-video-generation when you primarily want AI-generated or AI-transformed video. For fine-grained editing of existing footage (multi-track timelines, manual cuts, complex transitions), you will still need a conventional video editor. You can, however, combine this skill with traditional editing by generating base clips with AI and polishing them in your NLE afterward.

How do I update or remove the skill later?

You manage installation and removal with the same Skills CLI you used to add it. Run the relevant skills command (for example, a remove or update subcommand if supported by your environment). Removing the skill does not uninstall the infsh CLI; it only detaches the ai-video-generation integration from your agent.
