ai-rag-pipeline
by inferen-sh

Build Retrieval Augmented Generation (RAG) pipelines that combine web search tools (Tavily, Exa) with LLMs (Claude, GPT-4, Gemini via OpenRouter) using the inference.sh CLI. Ideal for research agents, fact-checkers, and AI assistants that need grounded, cited answers.
Overview
What is ai-rag-pipeline?
The ai-rag-pipeline skill helps you build Retrieval Augmented Generation (RAG) workflows that combine live web search with large language models via the inference.sh (infsh) CLI. It provides a simple pattern for:
- Running web research with tools like Tavily Search and Exa
- Passing those search results into LLMs such as Claude, GPT-4, or Gemini (via OpenRouter)
- Getting grounded, source-aware answers instead of unsupported guesses
Under the hood, ai-rag-pipeline is a shell-friendly pattern for chaining infsh app run calls, so you can quickly prototype RAG-style research, question answering, and fact-checking pipelines.
Who is this skill for?
ai-rag-pipeline is a good fit if you:
- Use inference.sh to orchestrate LLM tools from the command line
- Want research-style answers with citations or explicit web context
- Are building AI research agents or assistant workflows that must stay up to date
- Need fact-checking or web-grounded summaries from multiple sources
It is especially useful for developers, data/AI researchers, and power users who are comfortable with Bash, CLIs, and JSON inputs.
What problems does ai-rag-pipeline solve?
This skill focuses on a common RAG use case: combining search and LLMs in a repeatable, scriptable way. It helps you:
- Move beyond single-prompt chat to pipeline-style research
- Use Tavily or Exa to pull in fresh, relevant information
- Feed that content into Claude, GPT-4, or Gemini (via OpenRouter) through infsh
- Produce answers that can be inspected, audited, and reused in other tools or agents
If you want a Perplexity-like workflow using your own tools and models, ai-rag-pipeline gives you the building blocks.
When is ai-rag-pipeline not a good fit?
Consider other skills or approaches if:
- You are not using the inference.sh CLI or cannot install it
- You need a fully packaged web app or GUI (this is CLI/bash oriented)
- You need deep, domain-specific indexing over private docs (this skill focuses on live web search patterns rather than full vector database setup)
For higher-level automation around documents, knowledge bases, or agents, use ai-rag-pipeline as a low-level RAG building block and compose it with other skills.
How to Use
Prerequisites
Before installing ai-rag-pipeline, make sure:
- You have the inference.sh CLI (infsh) installed. The upstream repo links to install instructions at:
  https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
- You can run infsh login and authenticate successfully.
- You have access to the tools you intend to use (e.g., Tavily, Exa, OpenRouter-backed models) through inference.sh.
Install the ai-rag-pipeline skill
In an Agent Skills–enabled environment, install the skill with:
```shell
npx skills add https://github.com/inferen-sh/skills --skill ai-rag-pipeline
```
This pulls the ai-rag-pipeline definition from tools/llm/ai-rag-pipeline in the inferen-sh/skills repository and makes it available to your agent or workspace.
After installation, open the Files view and review:
- SKILL.md – core description and quick start
Quick start: Simple search + answer RAG pipeline
The SKILL file illustrates a minimal RAG flow using the infsh CLI.
- Log in to inference.sh:

  ```shell
  infsh login
  ```

- Run a Tavily search and store the result in a Bash variable:

  ```shell
  SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "latest AI developments 2024"}')
  ```

- Pass that research into an LLM (example: a Claude model via OpenRouter) for summarization:

  ```shell
  infsh app run openrouter/claude-sonnet-45 --input "{
    \"prompt\": \"Based on this research, summarize the key trends: $SEARCH\"
  }"
  ```
This pattern demonstrates the core idea of ai-rag-pipeline:
- Retrieval – tavily/search-assistant performs web research
- Augmentation – the search output is embedded in your prompt as $SEARCH
- Generation – the Claude model generates a summary grounded in that research
You can swap in other search tools (e.g., Exa Search / Exa Answer) or other models (e.g., GPT-4, Gemini via OpenRouter) as supported by your inference.sh setup.
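One caveat with the quick-start pattern: interpolating $SEARCH directly into a hand-written JSON string breaks as soon as the search output contains double quotes or newlines. A more robust sketch builds the payload with jq (assuming jq is installed; the literal SEARCH value below is a stand-in for a real tavily/search-assistant result):

```shell
# Build the LLM payload with jq instead of interpolating $SEARCH into a
# hand-written JSON string, so quotes and newlines in the search output
# cannot break the JSON. The literal value below stands in for a real
# `infsh app run tavily/search-assistant ...` result.
SEARCH='{"results":[{"title":"Example","content":"AI news with \"quotes\""}]}'
PROMPT=$(jq -n --arg research "$SEARCH" \
  '{prompt: ("Based on this research, summarize the key trends: " + $research)}')
echo "$PROMPT"
# Then pass the payload straight through:
#   infsh app run openrouter/claude-sonnet-45 --input "$PROMPT"
```

Because jq escapes the embedded search output, this works even when the retrieved pages contain characters that would otherwise terminate the JSON string early.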
Customizing the RAG workflow
Once the basic flow works, adapt it to your use case:
1. Change the research query
Tailor the query field to your domain:
```shell
SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "impact of LLMs on healthcare regulation"}')
```
2. Use a different model
Swap openrouter/claude-sonnet-45 with a different LLM app configured in inference.sh, such as a GPT-4 or Gemini-backed route, if available in your environment.
3. Adjust the output style
Modify the prompt text to request:
- Bullet summaries
- Pros/cons lists
- Fact-check reports
- Study notes with citations
For example:
```shell
infsh app run openrouter/claude-sonnet-45 --input "{
  \"prompt\": \"Using the sources in $SEARCH, write a concise fact-check report. Highlight any conflicting claims and clearly list cited URLs.\"
}"
```
4. Wrap into reusable scripts
Because ai-rag-pipeline is Bash-friendly, you can place the pattern into shell scripts for reuse:
- research-topic.sh – takes a topic and returns a web-grounded summary
- fact-check.sh – takes a claim, runs search, and produces a fact-check report
- briefing.sh – creates a briefing from the latest sources in a given domain
Call these scripts from your agents or CI jobs to automate research workflows.
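As a starting point, here is a minimal sketch of the research-topic pattern as a shell function. The app IDs (tavily/search-assistant, openrouter/claude-sonnet-45) are taken from the quick start and assume your inference.sh setup exposes them; jq is assumed to be installed for safe JSON construction:

```shell
# research_topic: hypothetical sketch of research-topic.sh as a function.
# Assumes tavily/search-assistant and openrouter/claude-sonnet-45 are
# available via `infsh` in your environment, and that jq is installed.
research_topic() {
  local topic="$1"
  local query search prompt
  # Build both inputs with jq so the topic and the raw search output
  # are safely JSON-escaped.
  query=$(jq -n --arg q "$topic" '{query: $q}')
  search=$(infsh app run tavily/search-assistant --input "$query")
  prompt=$(jq -n --arg r "$search" \
    '{prompt: ("Summarize the key findings and list cited URLs: " + $r)}')
  infsh app run openrouter/claude-sonnet-45 --input "$prompt"
}
# Usage: research_topic "impact of LLMs on healthcare regulation"
```

The fact-check and briefing scripts follow the same shape; only the query construction and the final prompt change.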
Integrating with AI agents and workflows
ai-rag-pipeline is designed to work with agent frameworks that:
- Can call Bash (the SKILL allows Bash(infsh *))
- Need a RAG step to fetch and ground context before generating answers
Typical integrations:
- AI research assistants that automatically call infsh app run tavily/... before answering
- Fact-checking agents that run a search pipeline before confirming or rejecting claims
- Knowledge base updaters that periodically fetch and summarize the latest news on specific topics
By standardizing on this pattern, your agents can reuse the same RAG logic while changing queries, tools, or models as needed.
FAQ
What is ai-rag-pipeline in simple terms?
ai-rag-pipeline is a small, CLI-focused RAG blueprint that teaches your agents how to:
- Call web search tools through the inference.sh CLI
- Capture the search output
- Feed it into an LLM for grounded, source-based responses
It does not try to be a full framework; instead, it gives you a practical pattern you can customize.
Do I need inference.sh to use ai-rag-pipeline?
Yes. The skill is built around the inference.sh (infsh) CLI. The Quick Start and example commands depend on infsh app run .... If you do not use inference.sh, this skill will not be directly applicable.
Which tools and models can I use with ai-rag-pipeline?
The SKILL description references these families of tools and models, as long as they are exposed through your inference.sh setup:
- Search / retrieval: Tavily Search, Exa Search, Exa Answer
- LLMs via OpenRouter: Claude variants, GPT-4, Gemini (and other OpenRouter routes supported by your account)
Run infsh app list to see the exact apps available in your environment.
Can I use ai-rag-pipeline for fact-checking?
Yes. Fact-checking is one of the main intended uses. A typical flow is:
- Formulate the claim or question
- Use Tavily or Exa to gather multiple sources
- Ask the LLM to compare sources, highlight conflicts, and provide a justified conclusion
Because the answer is grounded in retrieved content, you can inspect and verify the underlying sources.
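The flow above can be sketched as a small shell function. This is only an illustration: the Tavily and OpenRouter app IDs come from the quick start, and jq is assumed to be installed:

```shell
# fact_check: hypothetical sketch of the fact-checking flow above.
# App IDs (tavily/search-assistant, openrouter/claude-sonnet-45) are
# assumptions from the quick start -- check `infsh app list` for the
# names in your environment; jq is assumed installed.
fact_check() {
  local claim="$1"
  local query sources prompt
  query=$(jq -n --arg q "$claim" '{query: $q}')
  sources=$(infsh app run tavily/search-assistant --input "$query")
  prompt=$(jq -n --arg c "$claim" --arg s "$sources" \
    '{prompt: ("Fact-check this claim: " + $c
               + "\n\nSources:\n" + $s
               + "\n\nCompare the sources, highlight conflicting claims, and give a justified verdict with cited URLs.")}')
  infsh app run openrouter/claude-sonnet-45 --input "$prompt"
}
# Usage: fact_check "Some claim you want to verify"
```

To gather multiple source sets, run a second retrieval step (e.g., an Exa-backed app) and concatenate its output into the same prompt before the generation step.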
Is this a full RAG framework with vector databases?
No. ai-rag-pipeline focuses on live web search–driven RAG via inference.sh. It does not configure databases, embeddings, or index management for private corpora. You can, however, combine its patterns with your own indexing layer if your environment exposes those tools through infsh.
How do I debug issues with the pipeline?
If something goes wrong:
- Run each infsh app run ... command separately to confirm it returns valid JSON
- Echo or log the $SEARCH variable to see the raw search output
- Simplify the prompt and add instructions like “show your reasoning and list the URLs you used”
- Consult the upstream SKILL.md for any updated quick-start examples
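The first check (confirming a step returned valid JSON) is easy to script as a guard between pipeline stages. A minimal sketch, assuming jq is installed; the literal value stands in for a real infsh result:

```shell
# Guard sketch: verify a pipeline stage returned parseable JSON before
# chaining it into the next step (jq assumed installed). The literal
# value below stands in for a real `infsh app run ...` result.
SEARCH='{"results": []}'
if printf '%s' "$SEARCH" | jq -e . >/dev/null 2>&1; then
  echo "search output is valid JSON"
else
  echo "search step returned invalid JSON; raw output follows:" >&2
  printf '%s\n' "$SEARCH" >&2
  exit 1
fi
```

Dropping a guard like this between the retrieval and generation steps makes failures surface at the step that broke, rather than as a confusing LLM response.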
Where can I learn more about RAG concepts?
The SKILL file includes a short explanation of RAG as a three-step process: Retrieval → Augmentation → Generation. For deeper conceptual material, use ai-rag-pipeline itself:
```shell
SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "introduction to retrieval augmented generation"}')
infsh app run openrouter/claude-sonnet-45 --input "{
  \"prompt\": \"Using the following sources, explain RAG to a developer audience: $SEARCH\"
}"
```
This lets you bootstrap your RAG learning using the pipeline you plan to deploy.
