Semantic Search vs Keyword Search: When to Use Each
Understand the differences between semantic search and keyword search. Learn when embeddings outperform traditional search and how to implement both.
The Problem with Keyword Search
Keyword search works by matching exact terms. Search for "fix flat tire" and you'll find documents containing those words. But a "tire puncture repair guide" barely registers: only one of your three terms matches, even though it's exactly the document you need.
This is the fundamental limitation: keyword search matches words, not meaning.
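You can see the limitation with a toy term-overlap scorer. This is a crude stand-in for what real keyword engines like BM25 rank on (they add term weighting and length normalization), but it shows why synonymous phrasings score poorly:

```javascript
// Toy keyword scorer: fraction of query terms that appear verbatim in the doc.
function keywordScore(query, doc) {
  const queryTerms = new Set(query.toLowerCase().split(/\s+/))
  const docTerms = new Set(doc.toLowerCase().split(/\s+/))
  let overlap = 0
  for (const term of queryTerms) {
    if (docTerms.has(term)) overlap++
  }
  return overlap / queryTerms.size
}

keywordScore('fix flat tire', 'tire puncture repair guide') // only "tire" matches: ~0.33
keywordScore('fix flat tire', 'how to fix a flat tire')     // all three terms: 1.0
```

The second document wins by a wide margin even though the first answers the same need.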
How Semantic Search Works
Semantic search uses embeddings to match by meaning. Here's the process:
1. Convert your documents into vectors (embeddings)
2. Convert the search query into a vector
3. Find documents whose vectors are most similar to the query vector
Because embeddings capture meaning, "fix flat tire" and "tire puncture repair" end up close together in vector space.
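"Close together" is usually measured with cosine similarity. Here's a minimal sketch with illustrative 3-dimensional vectors; the numbers are invented for the example, and real embeddings have hundreds or thousands of dimensions:

```javascript
// Cosine similarity: 1.0 means same direction, near 0 means unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Invented vectors standing in for real embeddings:
const fixFlatTire = [0.81, 0.52, 0.11]
const punctureRepair = [0.78, 0.56, 0.15]
const chocolateCake = [0.05, 0.21, 0.97]

cosineSimilarity(fixFlatTire, punctureRepair) // high: related meaning
cosineSimilarity(fixFlatTire, chocolateCake)  // low: unrelated meaning
```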
When Keyword Search Wins
Keyword search isn't dead. It's actually better in several scenarios:
- Exact matches. Searching for an error code like "ERR_CONNECTION_REFUSED" or a product SKU. Embeddings might match semantically similar but wrong results.
- Known-item search. When users know exactly what they're looking for. "Python datetime documentation" — they want the docs, not a tutorial about time.
- Structured data. Filtering by category, date range, price range, or other structured fields. This is a database query, not a semantic problem.
- Speed at massive scale. Inverted indexes (Elasticsearch, Solr) can search billions of documents in milliseconds. Vector search at that scale requires significant infrastructure.
When Semantic Search Wins
Semantic search excels when:
- Users don't know the right terms. Your documentation says "authentication" but users search for "login." Semantic search bridges this vocabulary gap.
- Questions and natural language. "How do I deploy to production?" won't match a doc titled "Deployment Guide" with keyword search. Semantic search handles this naturally.
- Cross-language search. With multilingual embeddings (like Cohere's), a query in Spanish can find results written in English.
- Conceptual similarity. Finding similar support tickets, related articles, or duplicate content. These require understanding meaning, not matching words.
The Hybrid Approach
The best search systems combine both. Here's how:
async function hybridSearch(query, limit = 10) {
  // 1. Keyword search (BM25)
  const keywordResults = await elasticsearch.search({
    query: { match: { content: query } },
    size: limit,
  })

  // 2. Semantic search
  const queryVector = await getEmbedding(query)
  const semanticResults = await vectorDb.search({
    vector: queryVector,
    limit,
  })

  // 3. Combine with Reciprocal Rank Fusion
  return reciprocalRankFusion(keywordResults, semanticResults)
}
function reciprocalRankFusion(list1, list2, k = 60) {
  const scores = new Map()

  for (const [rank, item] of list1.entries()) {
    const score = 1 / (k + rank + 1)
    scores.set(item.id, (scores.get(item.id) || 0) + score)
  }

  for (const [rank, item] of list2.entries()) {
    const score = 1 / (k + rank + 1)
    scores.set(item.id, (scores.get(item.id) || 0) + score)
  }

  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id)
}
Reciprocal Rank Fusion (RRF) is simple and effective. It combines the rankings from both search methods without needing to normalize scores.
Implementation Guide
Basic Semantic Search in 20 Lines
// Index documents
async function indexDocuments(documents) {
  const embeddings = await fetch('https://api.embedroute.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer er_your_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'openai/text-embedding-3-small',
      input: documents.map(d => d.content),
    }),
  }).then(r => r.json())

  // Store in your vector database
  for (let i = 0; i < documents.length; i++) {
    await vectorDb.upsert({
      id: documents[i].id,
      vector: embeddings.data[i].embedding,
      metadata: { title: documents[i].title },
    })
  }
}

// Search
async function search(query) {
  const queryEmb = await getEmbedding(query)
  return vectorDb.search({ vector: queryEmb, limit: 5 })
}
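The search snippets above call a getEmbedding helper that isn't shown. Here's a minimal sketch, assuming the same hypothetical embeddings endpoint, key, and model used in indexDocuments:

```javascript
// Hypothetical helper: embeds a single query string and returns its vector.
async function getEmbedding(text) {
  const response = await fetch('https://api.embedroute.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer er_your_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      // Must be the same model used at indexing time, or distances
      // between query and document vectors are meaningless.
      model: 'openai/text-embedding-3-small',
      input: [text],
    }),
  })
  const json = await response.json()
  return json.data[0].embedding
}
```

The one rule that matters here: queries and documents must be embedded with the same model, since vectors from different models live in different spaces.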
Choosing Your Embedding Model
The model you choose affects search quality more than any other factor:
- General search: OpenAI text-embedding-3-small — reliable, cheap, good enough for most cases
- Technical docs: Voyage voyage-3 — better at understanding technical concepts
- Code search: Voyage voyage-code-3 — specifically trained on code
- Global audience: Cohere embed-multilingual-v3.0 — works across 100+ languages
Measuring Search Quality
Don't guess — measure. Create a test set of:
1. Queries — 50-100 real queries your users make
2. Relevant documents — For each query, which documents should be returned?
Then calculate:
- Recall@5 — What percentage of relevant documents appear in the top 5?
- MRR — Mean Reciprocal Rank — how high is the first relevant result?
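Both metrics take only a few lines. A sketch, assuming each test case pairs the set of relevant document ids with the ranked list of ids your search returned:

```javascript
// relevant: Set of relevant doc ids; ranked: doc ids in result order.
function recallAtK(relevant, ranked, k = 5) {
  const found = ranked.slice(0, k).filter(id => relevant.has(id)).length
  return found / relevant.size
}

// Reciprocal rank of the first relevant result (0 if none appears).
function reciprocalRank(relevant, ranked) {
  for (let i = 0; i < ranked.length; i++) {
    if (relevant.has(ranked[i])) return 1 / (i + 1)
  }
  return 0
}

// MRR: average reciprocal rank across the whole query test set.
function meanReciprocalRank(testSet) {
  const total = testSet.reduce(
    (sum, { relevant, ranked }) => sum + reciprocalRank(relevant, ranked), 0)
  return total / testSet.length
}

recallAtK(new Set(['a', 'b']), ['x', 'a', 'y', 'b', 'z']) // both found in top 5: 1.0
reciprocalRank(new Set(['a']), ['x', 'a', 'y'])           // first hit at rank 2: 0.5
```

Run the same test set against each candidate embedding model and let these numbers, not intuition, pick the winner.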
Conclusion
Most applications benefit from semantic search, but the best implementations combine both approaches. Start with semantic search for natural language queries, add keyword search as a fallback for exact matches, and use hybrid ranking to get the best of both worlds.
The embedding model is the most important choice in your semantic search stack. Test different models — what works for general text might not work for your specific domain.
Ready to try multiple embedding models?
Access OpenAI, Voyage, Cohere, and more with a single API.
Join the Waitlist