
similarity-search-patterns

by wshobson

similarity-search-patterns helps you choose distance metrics, index types, and hybrid retrieval patterns for semantic search and RAG workflows. Use it to plan production vector search tradeoffs around recall, latency, and scale.

Stars: 32.6k
Favorites: 0
Comments: 0
Added: Mar 30, 2026
Category: RAG Workflows
Install Command
npx skills add wshobson/agents --skill similarity-search-patterns
Curation Score

This skill scores 67/100, which means it is listed in the directory as a useful reference-oriented skill, but not a highly operational one. The repository evidence shows solid conceptual coverage for similarity search use cases and clear triggering cues, yet limited step-by-step workflow guidance and no executable artifacts, so agents may still need to infer implementation details.

Strengths
  • Strong triggerability: the description and 'When to Use' section clearly map to semantic search, RAG retrieval, recommendation, latency optimization, and large-scale vector search.
  • Substantial written content: a long SKILL.md with multiple sections, tables, and code fences gives agents reusable patterns beyond a generic one-paragraph prompt.
  • Covers core design decisions such as distance metrics and index types, helping agents reason about common similarity-search tradeoffs in production systems.
Cautions
  • Operational clarity is limited: structural signals show workflow and practical scores of 0, and the skill ships no setup scripts, references, or other supporting resources.
  • Trust and adoption depth are moderate rather than strong because the skill appears documentation-only, with no linked files, examples, or implementation artifacts to validate execution details.
Overview

What similarity-search-patterns actually helps with

The similarity-search-patterns skill is a decision and implementation guide for building vector retrieval that works in production, not just in demos. It is most useful when you need to choose a distance metric, pick an index strategy, balance recall against latency, and design retrieval behavior for semantic search or RAG systems.

Best fit users and projects

This skill is a strong fit for:

  • engineers building semantic search or recommendation features
  • teams designing retrieval patterns for RAG workflows
  • developers moving from “just store embeddings” to production retrieval design
  • practitioners comparing exact search, HNSW, and IVF-style tradeoffs

If you already know your vector database well and only need vendor-specific commands, this skill is less valuable. Its strength is pattern selection and system design, not database-specific setup.

The real job-to-be-done

Most users do not need a generic explanation of embeddings. They need to answer practical questions such as:

  • Which distance metric matches my embedding model?
  • When is exact search acceptable?
  • When should I use HNSW or IVF-style indexing?
  • How do I combine semantic and keyword retrieval?
  • What retrieval pattern fits my scale, latency target, and recall needs?

The similarity-search-patterns skill is useful because it frames those choices directly.

What makes this skill different from a normal prompt

A normal prompt might tell an agent to “implement vector search.” This skill is more valuable when the hard part is architectural judgment. It gives the agent a structured way to reason about:

  • distance metrics and what they imply
  • index types and their recall/latency tradeoffs
  • scaling from small datasets to millions of vectors
  • hybrid retrieval patterns instead of pure vector search everywhere

That makes it more useful for design-quality output than a one-line prompt.

Important limits before you install

This is not a turnkey integration package. The repository evidence shows only a SKILL.md file with no scripts, references, or vendor-specific examples. So expect conceptual and architectural guidance rather than copy-paste setup for Pinecone, Weaviate, pgvector, FAISS, Milvus, Elasticsearch, or OpenSearch.

Install this if you want better retrieval decisions. Do not install it expecting a full implementation scaffold.

How to Use similarity-search-patterns skill

Install context for similarity-search-patterns

Install the skill from the wshobson/agents repository:

npx skills add https://github.com/wshobson/agents --skill similarity-search-patterns

Because this skill is documentation-driven, the main asset to read is:

  • plugins/llm-application-dev/skills/similarity-search-patterns/SKILL.md

There are no support scripts or reference files, so most of the value comes from how well you frame your problem when invoking it.

Read this file first

Start with SKILL.md and focus on the sections covering:

  • when to use the skill
  • distance metrics
  • index types

Those sections are likely to shape most implementation decisions. Read them before asking an agent for code; otherwise, you risk getting a plausible but mismatched retrieval design.

What inputs the skill needs to work well

The quality of similarity-search-patterns output depends heavily on the context you provide. At minimum, include:

  • your use case: semantic search, RAG, recommendation, deduplication
  • approximate corpus size
  • expected query volume and latency target
  • whether recall or speed matters more
  • embedding model or embedding behavior if known
  • whether you need keyword + semantic hybrid search
  • your storage or vector database constraints

Without that, the skill can only return generic advice.

Turn a rough goal into a strong invocation

Weak goal:

  • “Help me build vector search.”

Stronger goal:

  • “Design a similarity search approach for a RAG system over 3 million support documents. We use normalized embeddings, need sub-200ms retrieval, can tolerate slight recall loss, and want to combine semantic retrieval with keyword filtering for product IDs and error codes.”

The stronger version helps the agent choose:

  • cosine vs other metrics
  • HNSW vs IVF-style approaches
  • whether hybrid retrieval is necessary
  • how to reason about filtering and scale

A practical prompt template

Use a prompt like this when calling the similarity-search-patterns skill:

  • “Apply similarity-search-patterns to recommend a retrieval design for [use case]. Corpus size is [size]. Latency target is [target]. Priority is [recall/speed/cost]. Embeddings are [normalized/raw/unknown]. We need [pure semantic search / hybrid keyword+vector / metadata filtering]. Compare index options, recommend a metric, explain tradeoffs, and give an implementation plan.”

This usually produces better output than asking directly for code.

How to use similarity-search-patterns for RAG Workflows

When applying similarity-search-patterns to RAG workflows, ask the agent to reason about retrieval quality, not just indexing. Useful additions:

  • document chunk size and overlap
  • whether metadata filters are required
  • top-k target
  • reranking availability
  • whether exact phrase matches matter
  • expected failure cases like code snippets, IDs, or legal citations

RAG systems often fail because teams use pure semantic retrieval where hybrid retrieval or stronger metadata constraints are needed. This skill is especially helpful for surfacing that mismatch early.
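
To make the hybrid point concrete, here is a minimal reciprocal rank fusion (RRF) sketch for merging keyword and vector result lists. The function name and the k=60 constant are illustrative assumptions, not artifacts shipped with the skill:

  from collections import defaultdict

  def rrf_fuse(keyword_ranking, vector_ranking, k=60):
      # Reciprocal rank fusion: merge two ranked lists of document IDs
      # without having to calibrate BM25 scores against vector distances.
      # k=60 is a conventional damping constant.
      scores = defaultdict(float)
      for ranking in (keyword_ranking, vector_ranking):
          for rank, doc_id in enumerate(ranking):
              scores[doc_id] += 1.0 / (k + rank + 1)
      return sorted(scores, key=scores.get, reverse=True)

  # Hypothetical usage with doc IDs from a BM25 search and a vector search:
  print(rrf_fuse(["d3", "d1", "d7"], ["d1", "d9", "d3"]))  # d1 and d3 rise

RRF is popular precisely because it sidesteps score normalization, which is one of the first practical problems hybrid retrieval runs into.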

Metric choice is one of the highest-value outputs

A common adoption blocker is uncertainty around distance metrics. This skill is most useful when you ask it to justify the metric choice based on your embedding behavior:

  • cosine for normalized embeddings
  • Euclidean for raw embeddings in some setups
  • dot product when magnitude carries signal
  • Manhattan/L1 mainly in sparse or specialized cases

If you do not know whether your embeddings are normalized, say so explicitly and ask the agent to state assumptions.
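
To see why normalization matters, here is a small numpy sketch (an illustration, not content from the skill) showing that cosine similarity and dot product coincide on L2-normalized vectors, and that Euclidean distance then preserves the same ranking:

  import numpy as np

  rng = np.random.default_rng(0)
  a, b = rng.normal(size=384), rng.normal(size=384)  # raw embeddings

  cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

  # After L2 normalization, dot product equals cosine similarity.
  a_n, b_n = a / np.linalg.norm(a), b / np.linalg.norm(b)
  assert np.isclose(cosine, a_n @ b_n)

  # On unit vectors, squared Euclidean distance is 2 - 2*cos(a, b),
  # a monotone transform, so nearest-neighbor rankings agree.
  assert np.isclose(np.sum((a_n - b_n) ** 2), 2 - 2 * cosine)

This is why "cosine for normalized embeddings" and "dot product when magnitude carries signal" are the usual defaults.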

Index selection should match scale and tolerance

One of the highest-value uses of similarity-search-patterns is not installation at all, but avoiding the wrong index choice:

  • flat/exact search for smaller datasets or high-recall validation
  • HNSW for strong practical performance on medium to large datasets
  • IVF+PQ-style approaches when scale and memory pressure matter more than perfect recall

Ask the agent to recommend both a default production choice and a simpler baseline for testing. That gives you a migration path instead of a brittle first decision.
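
If you want to see the three options side by side, here is a hedged FAISS sketch. The parameter values (M=32, 1024 lists, 48 PQ sub-vectors) are illustrative starting points under an assumed 384-dimensional embedding, not recommendations from the skill:

  import faiss
  import numpy as np

  d = 384  # assumed embedding dimension
  xb = np.random.rand(100_000, d).astype("float32")

  # Flat (exact) search: perfect recall, linear scan; a good baseline.
  flat = faiss.IndexFlatL2(d)
  flat.add(xb)

  # HNSW: graph-based ANN with strong recall/latency on medium-large sets.
  hnsw = faiss.IndexHNSWFlat(d, 32)  # 32 = neighbors per node (M)
  hnsw.add(xb)

  # IVF+PQ: coarse clustering plus product quantization; trades some
  # recall for a much smaller memory footprint at large scale.
  quantizer = faiss.IndexFlatL2(d)
  ivfpq = faiss.IndexIVFPQ(quantizer, d, 1024, 48, 8)
  ivfpq.train(xb)    # IVF/PQ indexes must be trained before adding vectors
  ivfpq.add(xb)
  ivfpq.nprobe = 16  # lists probed per query: higher = better recall, slower

  distances, ids = ivfpq.search(xb[:5], k=10)

Keeping the flat index around as a ground-truth baseline is also what makes later recall measurements possible.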

Suggested workflow after first output

A good workflow is:

  1. Ask for a retrieval design recommendation.
  2. Ask the agent to list assumptions it made.
  3. Ask for one “high recall” option and one “low latency” option.
  4. Ask for failure modes specific to your corpus.
  5. Only then ask for implementation steps in your chosen stack.

This keeps the skill focused on decision quality before code generation.

What to ask the agent for next

After the first design pass, useful follow-up requests include:

  • “Compare HNSW vs IVF+PQ for my scale and memory budget.”
  • “When would hybrid search outperform pure semantic retrieval here?”
  • “What test queries should I use to evaluate recall?”
  • “What retrieval mistakes are likely with product codes, names, and abbreviations?”
  • “How should I benchmark latency vs recall before launch?”

Those questions turn the skill into a practical planning tool rather than a glossary.

similarity-search-patterns skill FAQ

Is similarity-search-patterns beginner-friendly?

Yes, if you already understand embeddings at a basic level. The skill explains the major retrieval choices clearly, but it is more about system design than first-principles teaching. Beginners can use it, but they will get more value if they provide a concrete use case.

Is this skill enough to implement a full vector search stack?

No. The similarity-search-patterns guide is best for choosing patterns and tradeoffs. It does not ship with scripts, examples, or vendor-specific integration assets. You will likely pair it with your database documentation and your application framework.

When is similarity-search-patterns better than an ordinary prompt?

It is better when your main risk is choosing the wrong retrieval approach. If you ask a general model for “vector search code,” you may get implementation details without sound index, metric, or hybrid-search reasoning. This skill improves that reasoning layer.

When should I not use similarity-search-patterns?

Skip it if:

  • you only need a quick toy demo
  • your vendor already gives a fixed, opinionated retrieval setup
  • you are solving a purely keyword-based search problem
  • your task is database administration rather than retrieval design

Does similarity-search-patterns cover hybrid semantic and keyword search?

Yes. The source explicitly points toward combining semantic and keyword search as a valid use case. That is important for domains where identifiers, exact phrases, codes, or names matter. Pure embedding search is often not enough in those cases.

Can I use it for recommendation systems too?

Yes. The core ideas transfer well to nearest-neighbor recommendation, especially where you must choose index structures and optimize search latency at scale. Just specify your similarity objective and traffic constraints clearly.

How to Improve similarity-search-patterns skill

Give the skill operational constraints, not just a feature request

The fastest way to improve similarity-search-patterns usage is to include real constraints:

  • corpus size
  • update frequency
  • latency SLO
  • memory budget
  • expected recall target
  • filtering needs
  • whether batch indexing or real-time ingestion matters

This changes the recommendation from generic to actionable.

State embedding assumptions explicitly

Many poor outputs come from hidden embedding assumptions. Improve results by telling the agent:

  • the embedding model name if known
  • whether vectors are normalized
  • embedding dimension if relevant
  • whether semantic similarity alone is trustworthy in your domain

That helps the skill recommend an appropriate metric and avoid mismatched similarity calculations.
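
The normalization question in particular is cheap to answer yourself before invoking the skill. A trivial sketch, assuming your embeddings sit in a numpy array of shape (n, d):

  import numpy as np

  def embeddings_are_normalized(vectors, tol=1e-3):
      # True if every embedding has (approximately) unit L2 norm.
      norms = np.linalg.norm(vectors, axis=1)
      return bool(np.allclose(norms, 1.0, atol=tol))

  # Hypothetical usage:
  # if embeddings_are_normalized(emb):
  #     cosine and dot product are interchangeable
  # else:
  #     normalize first, or choose a metric that respects magnitude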

Ask for tradeoffs in a decision table

A strong way to use the similarity-search-patterns skill is to request a compact comparison table with columns like:

  • option
  • recall
  • latency
  • memory cost
  • implementation complexity
  • best fit
  • risks

This forces clearer decisions than narrative-only answers.

Push for corpus-specific failure modes

Do not stop at “which index should I use?” Ask:

  • what types of queries will vector search miss?
  • where would hybrid search be necessary?
  • what kinds of tokens should bypass semantic retrieval?
  • which queries should be used in offline evaluation?

This is especially important when applying similarity-search-patterns to RAG workflows, where retrieval mistakes directly degrade answer quality.

Common failure modes to watch for

Typical mistakes include:

  • choosing cosine without checking embedding normalization
  • using approximate search before establishing an exact baseline
  • expecting pure semantic retrieval to handle IDs or exact terminology
  • optimizing latency before measuring recall quality
  • selecting an index for current scale but ignoring growth

The skill is strongest when you ask it to surface these risks directly.

Iterate after the first answer

After the initial recommendation, improve output quality by asking the agent to:

  • challenge its own index recommendation
  • propose an evaluation plan
  • separate MVP choices from scale-up choices
  • identify what to test before committing to a vendor
  • rewrite the plan for your specific stack

That turns similarity-search-patterns from a one-shot explainer into a practical design review assistant.

Pair the skill with measurement requests

The best improvement step is to ask for measurement criteria, not just architecture:

  • recall@k targets
  • latency percentiles
  • indexing throughput
  • memory footprint
  • hybrid retrieval lift on difficult queries

If the agent cannot tell you how to evaluate the design, the recommendation is not yet strong enough to implement.
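
As an example of what evaluable means here, this sketch measures an approximate index against an exact baseline for recall@k and records latency percentiles. It assumes FAISS-style indexes like those sketched earlier; nothing here ships with the skill:

  import time
  import numpy as np

  def benchmark(ann_index, exact_index, queries, k=10):
      # Recall@k of an ANN index against exact ground truth, plus latency.
      _, truth = exact_index.search(queries, k)  # ground-truth neighbors
      latencies, hits = [], 0
      for q, gt in zip(queries, truth):
          t0 = time.perf_counter()
          _, ids = ann_index.search(q[None, :], k)
          latencies.append(time.perf_counter() - t0)
          hits += len(set(ids[0]) & set(gt))
      recall = hits / (len(queries) * k)
      p50, p95 = np.percentile(latencies, [50, 95])
      return recall, p50, p95

  # Hypothetical usage, reusing the flat and HNSW indexes from before:
  # recall, p50, p95 = benchmark(hnsw, flat, queries)
  # print(f"recall@10={recall:.3f} p50={p50*1e3:.1f}ms p95={p95*1e3:.1f}ms")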

Use it as a before-you-build checkpoint

For many teams, the highest-value use of similarity-search-patterns comes before any coding starts. Use it to validate:

  • whether vector search is the right approach
  • whether hybrid retrieval is required
  • whether exact search is sufficient at current scale
  • whether your embedding assumptions are valid

That early checkpoint prevents expensive retrieval architecture rework later.

Ratings & Reviews

No ratings yet