ai-rag-pipeline

by inferen-sh

Build Retrieval Augmented Generation (RAG) pipelines that combine web search tools (Tavily, Exa) with LLMs (Claude, GPT-4, Gemini via OpenRouter) using the inference.sh CLI. Ideal for research agents, fact-checkers, and AI assistants that need grounded, cited answers.

Added: Mar 27, 2026
Category: RAG Workflows
Install Command
npx skills add https://github.com/inferen-sh/skills --skill ai-rag-pipeline
Overview

What is ai-rag-pipeline?

The ai-rag-pipeline skill helps you build Retrieval Augmented Generation (RAG) workflows that combine live web search with large language models via the inference.sh (infsh) CLI. It provides a simple pattern for:

  • Running web research with tools like Tavily Search and Exa
  • Passing those search results into LLMs such as Claude, GPT-4, or Gemini (via OpenRouter)
  • Getting grounded, source-aware answers instead of unsupported guesses

Under the hood, ai-rag-pipeline is a shell-friendly pattern for chaining infsh app run calls, so you can quickly prototype RAG-style research, question answering, and fact-checking pipelines.

Who is this skill for?

ai-rag-pipeline is a good fit if you:

  • Use inference.sh to orchestrate LLM tools from the command line
  • Want research-style answers with citations or explicit web context
  • Are building AI research agents or assistant workflows that must stay up to date
  • Need fact-checking or web-grounded summaries from multiple sources

It is especially useful for developers, data/AI researchers, and power users who are comfortable with Bash, CLIs, and JSON inputs.

What problems does ai-rag-pipeline solve?

This skill focuses on a common RAG use case: combining search and LLMs in a repeatable, scriptable way. It helps you:

  • Move beyond single-prompt chat to pipeline-style research
  • Use Tavily or Exa to pull in fresh, relevant information
  • Feed that content into Claude, GPT-4, Gemini (via OpenRouter) through infsh
  • Produce answers that can be inspected, audited, and reused in other tools or agents

If you want a Perplexity-like workflow using your own tools and models, ai-rag-pipeline gives you the building blocks.

When is ai-rag-pipeline not a good fit?

Consider other skills or approaches if:

  • You are not using the inference.sh CLI or cannot install it
  • You need a fully packaged web app or GUI (this is CLI/bash oriented)
  • You need deep, domain-specific indexing over private docs (this skill focuses on live web search patterns rather than full vector database setup)

For higher-level automation around documents, knowledge bases, or agents, use ai-rag-pipeline as a low-level RAG building block and compose it with other skills.

How to Use

Prerequisites

Before installing ai-rag-pipeline, make sure:

  • You have inference.sh CLI (infsh) installed.
    • The upstream repo links to install instructions at:
      • https://raw.githubusercontent.com/inference-sh/skills/refs/heads/main/cli-install.md
  • You can run infsh login and authenticate successfully.
  • You have access to the tools you intend to use (e.g., Tavily, Exa, OpenRouter-backed models) through inference.sh.

Install the ai-rag-pipeline skill

In an Agent Skills–enabled environment, install the skill with:

npx skills add https://github.com/inferen-sh/skills --skill ai-rag-pipeline

This pulls the ai-rag-pipeline definition from tools/llm/ai-rag-pipeline in the inferen-sh/skills repository and makes it available to your agent or workspace.

After installation, open the Files view and review:

  • SKILL.md – core description and quick start

Quick start: Simple search + answer RAG pipeline

The SKILL file illustrates a minimal RAG flow using the infsh CLI.

  1. Log in to inference.sh:
infsh login
  2. Run a Tavily search and store the result in a Bash variable:
SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "latest AI developments 2024"}')
  3. Pass that research into an LLM (example: a Claude model via OpenRouter) for summarization:
infsh app run openrouter/claude-sonnet-45 --input "{
  \"prompt\": \"Based on this research, summarize the key trends: $SEARCH\"
}"

This pattern demonstrates the core idea of ai-rag-pipeline:

  • Retrieval – tavily/search-assistant performs web research
  • Augmentation – the search output is embedded in your prompt as $SEARCH
  • Generation – the Claude model generates a summary grounded in that research

You can swap in other search tools (e.g., Exa Search / Exa Answer) or other models (e.g., GPT-4, Gemini via OpenRouter) as supported by your inference.sh setup.
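One caveat with the quick-start pattern: interpolating $SEARCH directly into a double-quoted JSON string breaks as soon as the search output contains quotes or backslashes. A safer sketch builds the payload with jq (assumed to be installed); the SEARCH value below is a stand-in for real tool output:

```shell
# Stand-in for real tavily/search-assistant output, including a quote
# that would break naive "..." interpolation:
SEARCH='{"results": "He said \"RAG is just search plus prompting\""}'

# Build the JSON payload with jq so special characters are escaped
# instead of terminating the string early:
PROMPT=$(jq -n --arg ctx "$SEARCH" \
  '{prompt: ("Based on this research, summarize the key trends: " + $ctx)}')

echo "$PROMPT"
# Then pass the payload on unchanged:
# infsh app run openrouter/claude-sonnet-45 --input "$PROMPT"
```

The same jq -n --arg construction works for any of the tool inputs shown on this page.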

Customizing the RAG workflow

Once the basic flow works, adapt it to your use case:

1. Change the research query

Tailor the query field to your domain:

SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "impact of LLMs on healthcare regulation"}')

2. Use a different model

Swap openrouter/claude-sonnet-45 for a different LLM app configured in inference.sh, such as a GPT-4 or Gemini-backed route, if one is available in your environment.

3. Adjust the output style

Modify the prompt text to request:

  • Bullet summaries
  • Pros/cons lists
  • Fact-check reports
  • Study notes with citations

For example:

infsh app run openrouter/claude-sonnet-45 --input "{
  \"prompt\": \"Using the sources in $SEARCH, write a concise fact-check report. Highlight any conflicting claims and clearly list cited URLs.\"
}"

4. Wrap into reusable scripts

Because ai-rag-pipeline is Bash-friendly, you can place the pattern into shell scripts for reuse:

  • research-topic.sh – takes a topic and returns a web-grounded summary
  • fact-check.sh – takes a claim, runs search, and produces a fact-check report
  • briefing.sh – creates a briefing from the latest sources in a given domain

Call these scripts from your agents or CI jobs to automate research workflows.
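As a sketch of the first script, here is a research_topic function built on the same assumptions as the quick start: the tavily/search-assistant and openrouter/claude-sonnet-45 app names come from SKILL.md, and jq is used for safe JSON construction. Confirm both against your own infsh app list before relying on it.

```shell
# research-topic.sh (sketch) — takes a topic, returns a web-grounded summary.
research_topic() {
  if [ "$#" -lt 1 ]; then
    echo "usage: research_topic \"<topic>\"" >&2
    return 1
  fi
  local topic="$1" search
  # Retrieval: run a web search for the topic.
  search=$(infsh app run tavily/search-assistant \
    --input "$(jq -n --arg q "$topic" '{query: $q}')") || return 1
  # Augmentation + generation: grounded summary with cited URLs.
  infsh app run openrouter/claude-sonnet-45 \
    --input "$(jq -n --arg ctx "$search" \
      '{prompt: ("Summarize these sources and list cited URLs:\n" + $ctx)}')"
}
```

The fact-check.sh and briefing.sh variants differ only in the query they build and the prompt they assemble.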

Integrating with AI agents and workflows

ai-rag-pipeline is designed to work with agent frameworks that:

  • Can call Bash (the SKILL allows Bash(infsh *))
  • Need a RAG step to fetch and ground context before generating answers

Typical integrations:

  • AI research assistants that automatically call infsh app run tavily/... before answering
  • Fact-checking agents that run a search pipeline before confirming or rejecting claims
  • Knowledge base updaters that periodically fetch and summarize the latest news on specific topics

By standardizing on this pattern, your agents can reuse the same RAG logic while changing queries, tools, or models as needed.

FAQ

What is ai-rag-pipeline in simple terms?

ai-rag-pipeline is a small, CLI-focused RAG blueprint that teaches your agents how to:

  1. Call web search tools through the inference.sh CLI
  2. Capture the search output
  3. Feed it into an LLM for grounded, source-based responses

It does not try to be a full framework; instead, it gives you a practical pattern you can customize.

Do I need inference.sh to use ai-rag-pipeline?

Yes. The skill is built around the inference.sh (infsh) CLI. The Quick Start and example commands depend on infsh app run .... If you do not use inference.sh, this skill will not be directly applicable.

Which tools and models can I use with ai-rag-pipeline?

The SKILL description references these families of tools and models, as long as they are exposed through your inference.sh setup:

  • Search / retrieval: Tavily Search, Exa Search, Exa Answer
  • LLMs via OpenRouter: Claude variants, GPT-4, Gemini (and other OpenRouter routes supported by your account)

Run infsh app list to see the exact apps available in your environment.

Can I use ai-rag-pipeline for fact-checking?

Yes. Fact-checking is one of the main intended uses. A typical flow is:

  1. Formulate the claim or question
  2. Use Tavily or Exa to gather multiple sources
  3. Ask the LLM to compare sources, highlight conflicts, and provide a justified conclusion

Because the answer is grounded in retrieved content, you can inspect and verify the underlying sources.
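That flow can be sketched end to end. In this sketch the two infsh calls are left as comments and stand-in data is used, so only the prompt assembly (via jq, assumed installed) actually runs; the claim, URLs, and app names are illustrative:

```shell
CLAIM="open-weight models overtook closed models on coding benchmarks"

# In a real run, gather sources first, e.g.:
# SOURCES=$(infsh app run tavily/search-assistant \
#   --input "$(jq -n --arg q "$CLAIM" '{query: $q}')")
SOURCES='{"results": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]}'

# Ask the model to compare sources and justify a verdict:
PROMPT=$(jq -n --arg claim "$CLAIM" --arg ctx "$SOURCES" \
  '{prompt: ("Fact-check this claim: " + $claim
             + "\nSources: " + $ctx
             + "\nCompare the sources, highlight conflicts, and give a justified verdict with cited URLs.")}')

# infsh app run openrouter/claude-sonnet-45 --input "$PROMPT"
echo "$PROMPT" | jq -r .prompt
```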

Is this a full RAG framework with vector databases?

No. ai-rag-pipeline focuses on live web search–driven RAG via inference.sh. It does not configure databases, embeddings, or index management for private corpora. You can, however, combine its patterns with your own indexing layer if your environment exposes those tools through infsh.

How do I debug issues with the pipeline?

If something goes wrong:

  • Run each infsh app run ... command separately to confirm it returns valid JSON
  • Echo or log the $SEARCH variable to see the raw search output
  • Simplify the prompt and add instructions like “show your reasoning and list the URLs you used”
  • Consult the upstream SKILL.md for any updated quick-start examples
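The first two checks can be sketched as a quick validity gate (using jq, assumed installed; SAMPLE stands in for real $SEARCH output):

```shell
SAMPLE='{"results": [{"url": "https://example.com", "content": "..."}]}'

# `jq empty` parses its input and prints nothing: exit 0 means the
# captured output is valid JSON, non-zero means inspect the raw text.
if echo "$SAMPLE" | jq empty 2>/dev/null; then
  echo "valid JSON"
else
  echo "invalid JSON - inspect the raw output" >&2
fi
```

Gating each stage like this catches a failed or truncated search call before its output is silently baked into a prompt.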

Where can I learn more about RAG concepts?

The SKILL file includes a short explanation of RAG as a three-step process: Retrieval → Augmentation → Generation. For deeper conceptual material, use ai-rag-pipeline itself:

SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "introduction to retrieval augmented generation"}')
infsh app run openrouter/claude-sonnet-45 --input "{
  \"prompt\": \"Using the following sources, explain RAG to a developer audience: $SEARCH\"
}"

This lets you bootstrap your RAG learning using the pipeline you plan to deploy.
