
embedding-strategies

by wshobson

embedding-strategies helps you choose and optimize embedding models for semantic search and RAG workflows, with practical guidance on chunking, model tradeoffs, multilingual content, and retrieval evaluation.

Stars: 32.6k
Favorites: 0
Comments: 0
Added: Mar 30, 2026
Category: RAG Workflows
Install Command
npx skills add https://github.com/wshobson/agents --skill embedding-strategies
Curation Score

This skill scores 70/100: it is acceptable to list for directory users who want a substantive written guide to embedding-model choice and chunking tradeoffs, but it stops short of being a highly operational install, because execution still depends on the agent inferring missing evaluation steps and implementation details.

Strengths
  • Strong triggerability: the description and "When to Use" section clearly cover model selection, chunking, RAG, multilingual content, and embedding optimization.
  • Substantive content depth: the SKILL.md is long and structured, with multiple sections, tables, and code fences rather than placeholder text.
  • Useful install-decision signal: the model comparison table names concrete embedding options, dimensions, token limits, and use-case fit, helping users judge relevance before installing.
Cautions
  • Operational leverage is limited by lack of support files, scripts, references, or repo-linked examples, so agents must translate prose guidance into execution on their own.
  • Some trust and freshness risk remains because recommendations rely on an in-document comparison table labeled "2026" without cited sources or validation artifacts.
Overview

Overview of embedding-strategies skill

What embedding-strategies does

The embedding-strategies skill helps you choose, evaluate, and operationalize embedding models for semantic search and retrieval systems. It is most useful when you are building or tuning RAG pipelines and need better decisions than “pick a popular embedding model and hope.”

Who should use embedding-strategies

This skill fits builders working on search, document retrieval, agent memory, knowledge bases, and RAG workflows. It is especially useful if you need to compare hosted vs. local models, handle domain-specific corpora, decide on a chunking strategy, or balance quality against vector size and cost.

The real job-to-be-done

Users do not usually need a generic explanation of embeddings. They need help answering practical questions such as:

  • Which model should I start with for my stack?
  • How should I chunk my documents?
  • When does dimensionality reduction help?
  • How do I evaluate retrieval quality before shipping?

The value of embedding-strategies is that it turns those choices into a structured decision process instead of ad hoc prompting.

What makes this skill different

The skill is stronger than an ordinary “recommend an embedding model” prompt because it focuses on tradeoffs that change production results: context length, domain fit, multilingual support, cost, code retrieval, and chunking implications. It also gives you a current comparison frame for major embedding options rather than treating all embeddings as interchangeable.

Best-fit and misfit cases

Best fit:

  • selecting embeddings for a new RAG system
  • revisiting poor retrieval quality
  • choosing between OpenAI, Voyage, and open-source options
  • handling legal, finance, code, or multilingual content

Misfit:

  • you only need a basic vector database tutorial
  • your problem is really reranking, query rewriting, or bad source data
  • you want benchmark truth without running your own retrieval tests

How to Use embedding-strategies skill

Install context for embedding-strategies

This skill lives in the wshobson/agents repository under plugins/llm-application-dev/skills/embedding-strategies.

If you use the Skills CLI, install it with:

npx skills add https://github.com/wshobson/agents --skill embedding-strategies

If your environment loads skills from a cloned repo, point it at the folder:
plugins/llm-application-dev/skills/embedding-strategies

Read this file first

Start with:

  • SKILL.md

This repository slice is simple: the decision logic is in the main skill file, so you do not need to hunt through helper scripts or reference folders before using it.

What input the skill needs from you

embedding-strategies usage is best when you provide operational context, not just “pick the best model.” Include:

  • document types: docs, PDFs, tickets, code, contracts, chats
  • language mix: English only or multilingual
  • average and max document length
  • expected query style: keyword-ish, natural language, code, entity lookup
  • latency and budget constraints
  • deployment constraints: hosted APIs vs local/self-hosted
  • evaluation goal: recall, precision, cost, or memory footprint

Without this, the skill can only give generic rankings.

Turn a rough goal into a strong prompt

Weak prompt:

Help me choose embeddings for my RAG app.

Better prompt:

Use the embedding-strategies skill to recommend an embedding setup for a support-doc RAG system. Corpus: 250k English docs plus some code snippets. Queries are natural-language troubleshooting questions. We deploy on hosted infrastructure, want good recall, can tolerate moderate latency, and need cost awareness. Compare 2-3 candidate embedding models, suggest chunking ranges, and explain what to test first.

That second version gives the skill enough information to make a usable recommendation.

Suggested workflow for embedding-strategies in RAG

A practical sequence:

  1. Describe your corpus, query patterns, and constraints.
  2. Ask the skill for 2-3 candidate models, not a single “winner.”
  3. Request chunking guidance tied to those models.
  4. Ask for an evaluation plan using your retrieval tasks.
  5. Run a small benchmark before indexing everything.
  6. Iterate on chunk size, overlap, and model choice together.

This workflow matters because embedding quality and chunking quality are tightly coupled.
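Step 5 of the sequence above, the small benchmark, can be as simple as measuring recall@k over a handful of labeled queries before you commit to indexing everything. A minimal sketch follows; the query strings, document IDs, and relevance judgments are illustrative placeholders, not data from the skill:

```python
# Minimal recall@k smoke test for step 5 of the workflow above.
# The benchmark data here is made up for illustration.

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Hypothetical benchmark: each query maps to (IDs the candidate model
# retrieved, in rank order; IDs a human judged relevant).
benchmark = {
    "reset my password": (["d3", "d7", "d1", "d9"], ["d3", "d9"]),
    "invoice export fails": (["d2", "d8", "d4", "d6"], ["d4"]),
}

scores = [recall_at_k(ranked, rel, k=3) for ranked, rel in benchmark.values()]
print(sum(scores) / len(scores))  # mean recall@3 across the query set
```

Running the same loop per candidate model and per chunking setting gives you a comparable number for each combination, which is exactly what step 6's joint iteration needs.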

What the skill helps you decide

The embedding-strategies skill is most useful for decisions like:

  • general-purpose vs domain-specific embeddings
  • hosted API vs open-source local embeddings
  • large vs cost-efficient embedding models
  • code retrieval vs document retrieval
  • multilingual support requirements
  • whether to reduce dimensions to save storage

These are the real adoption blockers for teams, and the skill gives a structured way to reason through them.
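On the last decision in that list, reducing dimensions to save storage often amounts to truncating each vector and re-normalizing, which some recently trained models are designed to tolerate. Whether it is safe for your model is exactly what the skill and your own retrieval tests should confirm; this sketch only shows the mechanics:

```python
# Hedged sketch of dimension reduction by truncation + re-normalization.
# Safe only for models trained to support it; validate with your own
# retrieval benchmark before adopting.
import math

def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components and rescale to unit length."""
    v = vec[:dims]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

full = [0.6, 0.8, 0.1, 0.05]  # pretend full-size embedding
small = truncate_and_normalize(full, 2)
print(small)  # unit-length 2-dim vector
```

Halving dimensions roughly halves vector storage and index memory, which is why this decision belongs next to cost and vector-DB constraints rather than quality alone.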

Model-selection guidance you can expect

From the source, the skill compares modern options such as Voyage models, OpenAI embedding models, and open-source BGE-family choices. In practice, that means:

  • Voyage is a strong fit when you want current high-quality hosted embeddings and longer input windows
  • OpenAI models are a natural fit if your stack already centers on OpenAI APIs
  • BGE-style open-source models matter when local deployment, privacy, or infra control is more important than top hosted quality

Use the skill to narrow candidates, then validate with your own retrieval set.

Chunking advice matters as much as model choice

A common mistake is switching models when the actual problem is chunking. Use the skill to ask:

  • What chunk size matches my document structure?
  • Is overlap needed?
  • Do code, legal, or long-form docs need different segmentation?
  • Should headings, sections, and metadata be preserved?

For many RAG systems, better chunking produces a larger retrieval gain than moving from a decent model to a slightly better one.
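To make the size and overlap questions concrete, here is a deliberately naive fixed-window chunker. The sizes are illustrative starting points, not recommendations from the skill, and real pipelines usually split on structural boundaries (headings, paragraphs, functions) rather than raw characters:

```python
# Naive fixed-size chunking with overlap. Sizes are illustrative;
# structure-aware splitting usually works better in practice.

def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

doc = "A" * 1200
chunks = chunk_text(doc)
print(len(chunks), [len(c) for c in chunks])
```

Sweeping `chunk_size` and `overlap` while re-running your retrieval benchmark is the cheapest way to see whether chunking, not the model, is your bottleneck.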

Practical evaluation questions to ask

After the first recommendation, ask follow-up questions like:

  • Which 20 queries should I use for a smoke test?
  • What failure modes would indicate poor chunking vs poor embeddings?
  • If storage cost is high, where can I reduce dimensions safely?
  • For multilingual content, should I use one embedding space or route by language?

This makes the skill's output more actionable than a static model table.
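The second follow-up question, separating poor chunking from poor embeddings, has a simple diagnostic: embed the query and the chunk you know should have been retrieved (the "gold" chunk), and compare similarities directly. If the gold chunk scores high but still loses to off-topic chunks, suspect chunking or indexing; if even the gold chunk scores low, suspect the embedding model. The vectors below are made up for illustration:

```python
# Toy chunking-vs-embedding diagnostic using cosine similarity.
# Vectors are illustrative stand-ins for real query/chunk embeddings.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

query = [0.9, 0.1, 0.2]
gold_chunk = [0.8, 0.2, 0.3]   # the chunk that should have been retrieved
retrieved = [0.1, 0.9, 0.4]    # an off-topic chunk that ranked higher

print(cosine(query, gold_chunk) > cosine(query, retrieved))
```

Run this over your 20 smoke-test queries and the failure pattern usually becomes obvious.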

Common adoption constraints

Before installing embedding-strategies, check these likely blockers:

  • your vector DB may have storage or dimension constraints
  • your corpus may exceed model token limits unless chunked well
  • local models may increase ops burden significantly
  • domain-specific embeddings help only if your content truly matches that domain
  • benchmark claims do not replace in-domain testing

The skill helps frame these tradeoffs, but it does not remove the need for evaluation.

embedding-strategies skill FAQ

Is embedding-strategies good for beginners?

Yes, if you already understand the basics of RAG. The skill is approachable because it organizes decisions clearly, but it is still aimed at implementation choices, not a first-principles tutorial on vectors.

When should I use embedding-strategies instead of a normal prompt?

Use embedding-strategies when the model choice will affect cost, recall, storage, or deployment architecture. A normal prompt may give a generic recommendation; this skill is better when you need structured tradeoff analysis for a real retrieval system.

Does embedding-strategies pick one best model?

No. It is better used to shortlist candidates based on your workload. The right choice depends on corpus type, language coverage, context length, infrastructure, and evaluation criteria.

Is embedding-strategies only for RAG?

No, but embedding-strategies for RAG Workflows is the clearest fit. It also applies to semantic search, code search, clustering, memory retrieval, and domain-specific vector applications.

Should I trust benchmark-style recommendations without testing?

No. Use the skill to choose a strong starting point, then validate on your own corpus and queries. Retrieval quality is highly workload-specific.

When is this skill not enough by itself?

If your retrieval issues come from bad OCR, poor metadata, missing reranking, weak query rewriting, or low-quality source documents, the embedding-strategies skill alone will not solve the problem.

How to Improve embedding-strategies skill

Give corpus details, not tool preferences

A frequent weak input is:

We use Pinecone and LangChain, what embeddings should we use?

A stronger input is:

Our corpus is 80k internal policy docs and meeting notes, mostly English with some German. Queries are compliance questions with exact terminology. We need high recall, hosted APIs are acceptable, and storage cost matters.

The second prompt leads to better recommendations because it describes retrieval behavior rather than framework branding.

Ask for tradeoffs in a fixed format

To improve embedding-strategies output quality, request a comparison table with:

  • model
  • strengths
  • weaknesses
  • token/window limits
  • cost or efficiency notes
  • best-fit document types
  • risks for your use case

This prevents vague “it depends” answers.

Separate embedding and chunking decisions

If you ask for both at once, require the skill to explain which problem each recommendation addresses. Otherwise, it may over-attribute retrieval issues to the embedding model when segmentation is the bigger problem.

Provide representative queries and documents

The best upgrade you can make is to include:

  • 5-20 real user queries
  • a few sample chunks or raw documents
  • examples of relevant vs irrelevant retrievals

This lets the skill reason about semantic match quality instead of guessing from labels like “knowledge base.”

Watch for common failure modes

Poor results often come from:

  • chunks too large for precise retrieval
  • chunks too small to preserve meaning
  • multilingual content sent to English-centric models
  • code and prose indexed with one generic strategy
  • choosing huge vectors without enough quality gain to justify cost

Ask the skill to identify which of these is most likely in your setup.

Iterate after the first recommendation

A good second-round prompt is:

Based on the recommended setup, what are the top 3 retrieval risks in my pipeline, what metrics should I track, and what one variable should I change first if recall is poor?

This pushes the embedding-strategies skill from static advice into a practical tuning loop.

Improve install-to-value time

To speed adoption of embedding-strategies across a team, standardize a short intake template:

  • use case
  • corpus size and type
  • languages
  • budget and latency target
  • hosted vs local requirement
  • sample queries
  • success metric

That makes the skill consistently useful across projects instead of relying on whoever asks the best ad hoc question.
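One way to standardize the intake template is as a small structured record that gets pasted into every prompt. The field names and example values below are suggestions filled in from the support-doc scenario earlier on this page, not a schema the skill requires:

```python
# Suggested intake template as a plain dict; field names are a
# convention, not a schema the skill defines.
intake = {
    "use_case": "support-doc RAG",
    "corpus": {"size": "250k docs", "types": ["docs", "code snippets"]},
    "languages": ["en"],
    "budget_latency": "cost-aware, moderate latency acceptable",
    "deployment": "hosted APIs",
    "sample_queries": ["How do I reset my password?"],
    "success_metric": "recall@5",
}

for field, value in intake.items():
    print(f"{field}: {value}")
```

Teams can keep one of these per project and prepend it to every embedding-strategies prompt, so recommendations stay comparable across projects.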

Ratings & Reviews

No ratings yet