hybrid-search-implementation
by wshobson

The hybrid-search-implementation skill shows how to combine vector and keyword retrieval with RRF, linear fusion, reranking, and cascade patterns for RAG and search systems.
This skill scores 71/100, which places it in the directory as a solid but somewhat self-serve implementation guide. The repository provides a clear trigger, substantial body content, and concrete fusion patterns for hybrid search, so an agent is more likely to apply it correctly than from a generic prompt alone. However, install-decision clarity is limited by the lack of support files, quick-start setup, and stronger operational workflow cues.
- Clear use cases in frontmatter and "When to Use" section help agents trigger it for RAG and search tasks.
- Includes concrete implementation patterns such as RRF and other fusion methods, with code fences that add reusable technical substance.
- Substantial written content with structured headings improves scanability and progressive disclosure beyond a minimal prompt template.
- No support files, references, or install command, so users must infer environment, dependencies, and integration steps.
- Workflow guidance appears more pattern-oriented than end-to-end, which may leave agents guessing about production setup and evaluation.
Overview of hybrid-search-implementation skill
What hybrid-search-implementation actually helps you do
The hybrid-search-implementation skill is a practical pattern library for combining vector retrieval and keyword retrieval in one search pipeline. It is best for teams building RAG systems, internal knowledge search, or domain search where pure semantic search misses exact terms and pure lexical search misses intent. The real job-to-be-done is not “add another retrieval method,” but improve recall without losing the precision needed for names, IDs, acronyms, product codes, and specialized vocabulary.
Who should install this skill
This skill is a strong fit for:
- RAG builders seeing missed facts in retrieval
- search teams balancing semantic and exact-match behavior
- developers working with technical, medical, legal, catalog, or enterprise content
- anyone comparing fusion strategies before hard-coding one approach
If your current retrieval works poorly on exact tokens or long-tail terminology, hybrid-search-implementation is more useful than a generic “improve my RAG” prompt.
What makes this skill different from ordinary prompting
The value of the hybrid-search-implementation skill is that it gives you implementation patterns, not just high-level advice. The source focuses on:
- a clear two-branch hybrid architecture
- concrete fusion options such as RRF, linear weighting, cross-encoder reranking, and cascade patterns
- fit guidance for when hybrid retrieval is worth the added complexity
That makes it better for design and implementation decisions than asking a model to improvise a search stack from scratch.
What it does not do for you
This skill does not ship a ready-made production package, indexing pipeline, or benchmark harness. It gives patterns and code templates you adapt to your own stack. If you need vendor-specific setup for Elasticsearch, OpenSearch, Postgres, Pinecone, Weaviate, or Vespa, expect to map the concepts yourself.
How to Use hybrid-search-implementation skill
Install context for hybrid-search-implementation
Install the skill from the repository that contains it:
npx skills add https://github.com/wshobson/agents --skill hybrid-search-implementation
Because this skill lives as a single SKILL.md pattern document, the main install decision is whether you want implementation guidance and templates rather than a full runnable package.
Read this file first
Start with:
plugins/llm-application-dev/skills/hybrid-search-implementation/SKILL.md
The upstream structure is simple, so there is little to inspect beyond that file. Read it in this order:
- When to Use This Skill
- Core Concepts
- Fusion Methods
- template code sections
That path gets you to the key decision quickly: which fusion method fits your latency, quality, and tuning needs.
Inputs the skill needs from you
How well hybrid-search-implementation performs depends heavily on the inputs you provide. Before invoking it, define:
- your corpus type: docs, tickets, manuals, code, product data
- your retrieval backends: vector DB, BM25 engine, SQL full-text, etc.
- your query patterns: natural language, short keywords, identifiers, mixed queries
- your constraints: latency budget, reranking budget, indexing complexity
- your success metric: recall, top-3 precision, answer grounding, cost
Without those, the model can only return generic architecture advice.
Turn a rough goal into a strong prompt
Weak goal:
- “Help me add hybrid search.”
Better prompt:
- “Use the hybrid-search-implementation skill to design a retrieval pipeline for a RAG assistant over 200k technical support articles. Queries often contain product names, error codes, and natural language troubleshooting questions. We currently use vector search only and miss exact error-code matches. Recommend whether to use RRF, linear fusion, or reranking, and show request flow, ranking logic, and evaluation plan under a 500ms latency target.”
This works better because it tells the skill:
- why vector-only retrieval fails
- what exact-match behavior matters
- what fusion tradeoff to optimize
Choose the right fusion method first
The most important decision in the hybrid-search-implementation guide is usually the fusion method:
- RRF: best default if your two systems score differently and you want robust rank fusion without score calibration
- Linear: use when you can normalize scores and want tunable balance between semantic and lexical signals
- Cross-encoder: use when top-result quality matters enough to pay extra latency and compute
- Cascade: use when efficiency matters and you want staged filtering before expensive reranking
A common adoption path is RRF first, then reranking later if quality still plateaus.
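To make the RRF default concrete, here is a minimal sketch of rank fusion over two ranked ID lists. It assumes each retriever returns doc IDs best-first; the constant `k=60` is the commonly cited default, and the example inputs (`d1`, `d3`, etc.) are hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge ranked doc-ID lists without score calibration.

    rankings: one ranked list of doc IDs (best first) per retriever branch.
    k: damping constant; smaller k weights top ranks more heavily.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d3", "d1", "d7"]   # ranked output of the vector branch
keyword_hits = ["d1", "d9", "d3"]  # ranked output of the BM25/keyword branch
fused = rrf_fuse([vector_hits, keyword_hits])
```

Note that `d1` wins because it ranks highly in both branches, which is exactly the behavior that makes RRF a robust default when raw scores are not comparable.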
Suggested workflow for real projects
Use this workflow instead of dropping the template code in unchanged:
- list failure cases from your current search
- separate “semantic miss” from “exact token miss”
- implement parallel vector and keyword retrieval
- fuse with RRF as the baseline
- inspect top-k overlap and disagreement
- evaluate on a small query set before tuning weights
- only add reranking if simple fusion is still not enough
This sequence keeps you from overengineering too early.
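The "inspect top-k overlap and disagreement" step above can be sketched as a small diagnostic, assuming each branch returns a ranked list of doc IDs (the function name and inputs are illustrative, not part of the skill):

```python
def branch_overlap(vector_ids, keyword_ids, k=20):
    """Compare top-k results of the two branches to see where they disagree."""
    v, kw = set(vector_ids[:k]), set(keyword_ids[:k])
    return {
        "overlap": len(v & kw) / max(len(v | kw), 1),  # Jaccard similarity
        "vector_only": sorted(v - kw),    # semantic hits the keyword branch missed
        "keyword_only": sorted(kw - v),   # exact-token hits the vector branch missed
    }

report = branch_overlap(["d1", "d2", "d3"], ["d2", "d4"])
```

Near-zero overlap suggests the branches retrieve complementary results and fusion will matter; near-total overlap suggests hybrid retrieval adds little for those queries.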
What stronger inputs look like in practice
For hybrid-search-implementation for RAG Workflows, useful prompt inputs include examples like:
- “Acronym-heavy enterprise wiki where queries mention exact policy IDs”
- “Ecommerce catalog with brand names, SKU codes, and descriptive shopping language”
- “Support corpus where users type stack traces, error strings, and plain-English symptoms”
Those examples matter because hybrid retrieval pays off most when exact terms and semantic meaning both influence relevance.
Practical output you should ask the skill to produce
Ask for specific deliverables, not just “an architecture”:
- retrieval pipeline pseudocode
- score fusion function
- top-k settings for each branch
- fallback strategy when one branch returns nothing
- evaluation query set design
- failure-mode analysis
- rollout plan from vector-only to hybrid
That turns the skill into implementation support rather than brainstorming.
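One deliverable from the list above, the fallback when one branch returns nothing, can be sketched as follows. The callable interfaces and the round-robin `interleave` stand-in for real fusion are assumptions for illustration:

```python
from itertools import zip_longest

def interleave(a, b):
    """Round-robin merge with dedup; a placeholder for RRF or linear fusion."""
    out, seen = [], set()
    for x, y in zip_longest(a, b):
        for doc in (x, y):
            if doc is not None and doc not in seen:
                seen.add(doc)
                out.append(doc)
    return out

def retrieve_with_fallback(vector_search, keyword_search, query, top_k=10):
    """Run both branches; if one returns nothing, serve the other alone."""
    vec_hits = vector_search(query, top_k)
    kw_hits = keyword_search(query, top_k)
    if not vec_hits:
        return kw_hits   # e.g. embedding service down or out-of-vocabulary query
    if not kw_hits:
        return vec_hits  # e.g. pure natural-language query with no token hits
    return interleave(vec_hits, kw_hits)  # normal hybrid path

hits = retrieve_with_fallback(lambda q, k: [], lambda q, k: ["d1", "d2"], "q")
```

The point of the sketch is the degradation behavior: a hybrid pipeline should never return an empty result just because one branch failed.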
Constraints and tradeoffs to surface early
Before using the hybrid-search-implementation skill, decide:
- whether your keyword engine supports stemming, synonyms, and phrase search
- whether vector scores are comparable across query types
- whether duplicate handling happens before or after fusion
- whether document chunking hurts exact-term retrieval
- whether metadata filters should run in both branches
These details often matter more than the fusion formula itself.
When hybrid-search-implementation is a poor fit
Do not force hybrid retrieval if:
- your corpus is tiny and keyword search already performs well
- your queries are mostly exact IDs with little semantic variation
- you cannot operate two retrieval paths reliably
- you have no evaluation set and cannot tell if complexity helped
In those cases, simpler search may outperform a rushed hybrid design.
hybrid-search-implementation skill FAQ
Is hybrid-search-implementation good for beginners
Yes, if you already understand the basics of vector search and keyword search. The skill explains the main architecture cleanly, but it assumes you can adapt templates into your own codebase. It is more beginner-friendly for retrieval design than for full production deployment.
What problem does hybrid-search-implementation solve better than a normal prompt
A normal prompt may suggest “combine BM25 and embeddings,” but this skill gives you named fusion patterns and clearer decision boundaries. That makes it more useful when you need to choose an implementation path rather than collect generic ideas.
Is the hybrid-search-implementation skill only for RAG
No. It is especially relevant to hybrid-search-implementation for RAG Workflows, but the same patterns apply to site search, enterprise search, product discovery, and knowledge retrieval systems where exact tokens and semantic intent both matter.
Do I need a cross-encoder reranker to benefit
No. Start with RRF or linear fusion first. Cross-encoder reranking improves final ranking quality, but it adds latency and operational complexity. Many teams get meaningful gains from simple hybrid fusion alone.
How does it compare with vector search only
Hybrid search usually helps when vector retrieval misses exact strings, identifiers, rare domain terms, or short keyword-heavy queries. If your failure cases already show that pattern, this skill is likely worth installing.
How does it compare with keyword search only
Keyword-only systems often struggle with paraphrases, intent-level similarity, and natural-language questions. hybrid-search-implementation helps you preserve exact matching while recovering broader semantic recall.
Can I use it with any search backend
Usually yes at the design level. The skill is backend-agnostic, which is helpful for concepts but means you must adapt implementation details to your actual engines and scoring behavior.
How to Improve hybrid-search-implementation skill
Start with failure cases, not architecture diagrams
To get better results from hybrid-search-implementation, collect 20 to 50 real queries where your current retrieval fails. Label why they fail:
- exact term not matched
- semantic intent missed
- wrong document outranked
- duplicate chunks crowding results
This gives the skill something concrete to optimize against.
Give the skill your retrieval realities
Your prompt should include:
- current retriever types
- top-k settings
- chunk size and overlap
- metadata filters
- query examples
- latency budget
That context produces much better output than asking for a generic hybrid design.
Ask for a baseline and an upgrade path
A strong request is:
- “Design the simplest robust baseline first, then show what to add if evaluation still shows misses.”
This usually leads to a practical sequence such as:
- parallel retrieval
- RRF fusion
- deduplication
- optional reranking
That is more actionable than jumping directly to a complex multi-stage stack.
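That sequence — parallel retrieval, RRF fusion, deduplication, optional reranking — can be sketched as one baseline pipeline. The callable interfaces (`vector_search`, `keyword_search`, `reranker`) are hypothetical; adapt them to your actual engines:

```python
def hybrid_pipeline(query, vector_search, keyword_search, reranker=None, top_k=10):
    """Baseline hybrid pipeline: parallel retrieval, RRF, dedup, optional rerank.

    vector_search / keyword_search: callables returning ranked doc-ID lists.
    reranker: optional callable scoring (query, doc_id); leave None until
    evaluation shows the RRF baseline still misses.
    """
    branches = [vector_search(query), keyword_search(query)]
    scores = {}
    for ranked in branches:                        # RRF fusion with k=60
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (60 + rank)
    fused = sorted(scores, key=scores.get, reverse=True)  # dict keys dedup IDs
    if reranker is not None:                       # staged upgrade path
        fused = sorted(fused[:50], key=lambda d: reranker(query, d), reverse=True)
    return fused[:top_k]

ranked = hybrid_pipeline("q", lambda q: ["a", "b"], lambda q: ["b", "c"])
```

Keeping the reranker as an optional last stage mirrors the upgrade path: you ship the RRF baseline first and only pay reranking latency once measurement justifies it.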
Watch for common failure modes
The biggest implementation mistakes are:
- fusing scores that are not comparable
- retrieving too few candidates from one branch
- ignoring duplicate chunk collapse
- treating identifiers the same as natural-language queries
- adding reranking before measuring baseline hybrid gains
If the first output looks overly polished but does not mention these risks, ask the model to revise.
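The first failure mode above, fusing non-comparable scores, is what linear fusion with normalization guards against. A minimal sketch, assuming each branch yields a `{doc_id: raw_score}` dict and an illustrative `alpha` weight:

```python
def minmax(scores):
    """Min-max normalize a {doc_id: raw_score} dict into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid divide-by-zero when all scores are equal
    return {d: (s - lo) / span for d, s in scores.items()}

def linear_fuse(vec_scores, kw_scores, alpha=0.6):
    """Weighted sum of normalized branch scores; alpha favors the semantic branch."""
    v, k = minmax(vec_scores), minmax(kw_scores)
    docs = set(v) | set(k)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

# cosine similarities vs. raw BM25 scores: very different ranges
order = linear_fuse({"a": 0.9, "b": 0.5}, {"b": 12.0, "c": 3.0})
```

Without the `minmax` step, the BM25 branch would dominate simply because its raw scores are numerically larger, which is the exact bug the failure-mode list warns about.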
Improve prompt quality with query examples
A better hybrid-search-implementation usage prompt includes examples like:
- “reset MFA for contractor portal”
- “ERR_AUTH_Z-403”
- “difference between partner and reseller billing”
- “Model X200 battery thermal notice”
Mixed examples force the skill to handle both semantic and lexical behavior.
Iterate using evaluation questions
After the first output, ask follow-ups such as:
- “Which queries benefit most from RRF over linear fusion here?”
- “Where will chunking break exact-match behavior?”
- “How should we normalize scores if our vector and BM25 ranges differ?”
- “What should we log to debug missed retrievals?”
These questions improve implementation quality much faster than asking for more code alone.
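A small evaluation harness makes those follow-up questions answerable with numbers. Here is a minimal recall@k sketch over a hand-labeled query set; the evaluation pairs and the `search_fn` interface are assumptions:

```python
def recall_at_k(eval_set, search_fn, k=5):
    """Fraction of labeled queries whose relevant doc appears in the top k.

    eval_set: list of (query, relevant_doc_id) pairs you labeled by hand.
    search_fn: callable returning a ranked doc-ID list for a query.
    """
    hits = sum(1 for query, relevant in eval_set if relevant in search_fn(query)[:k])
    return hits / len(eval_set)

eval_set = [("reset MFA", "d1"), ("ERR_AUTH 403", "d9")]  # illustrative labels
score = recall_at_k(eval_set, lambda q: ["d1", "d2"], k=2)
```

Run this once on the vector-only baseline and once on the hybrid pipeline; the delta tells you whether the added complexity actually helped, per the skill's own advice against tuning without an evaluation set.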
Use the skill to make decisions, not just generate snippets
The best use of hybrid-search-implementation is to narrow decision uncertainty:
- whether hybrid search is justified
- which fusion method to start with
- how to evaluate it
- what operational tradeoffs come next
If you use it that way, the skill adds real value beyond a quick repo skim.
