
What Are Embeddings? A Practical Guide for Developers

Learn what embeddings are, how they work, and why they matter for modern AI applications like RAG, search, and recommendations.

What Are Embeddings?

Embeddings are numerical representations of text (or images, audio, etc.) that capture semantic meaning. They convert human-readable content into vectors—arrays of numbers—that machines can process and compare.

Think of it like this: the sentence "I love pizza" might become a vector like [0.23, -0.45, 0.87, ...] with hundreds or thousands of dimensions. Similar sentences end up with similar vectors.

Why Do Embeddings Matter?

Embeddings power most modern AI applications:

  • Semantic Search: Find documents by meaning, not just keywords. "How do I fix a flat tire?" matches "Tire puncture repair guide" even though they share few words.
  • RAG (Retrieval-Augmented Generation): Feed relevant context to LLMs. Embeddings find the right documents to include in your prompt.
  • Recommendations: "Users who liked X also liked Y" works by finding items with similar embeddings.
  • Clustering & Classification: Group similar items together automatically.

How Do Embeddings Work?

1. Input: You send text to an embedding model
2. Processing: The model (a neural network) processes the text
3. Output: You get back a vector of floating-point numbers

const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "What is machine learning?"
})

// Returns something like:
// [0.0023, -0.0142, 0.0841, ...]
// (1536 dimensions for this model)
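The search snippets later in this post call a `getEmbedding` helper. A minimal sketch of that helper, wrapping the call above, might look like this (the `client` parameter is an addition for easy testing and stubbing; it defaults to the `openai` client already set up above):

```javascript
// Hypothetical helper assumed by the search examples below.
// Wraps embeddings.create and returns just the vector.
async function getEmbedding(text, client = openai) {
  const response = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  })
  // The API returns one result per input; take its embedding array
  return response.data[0].embedding
}
```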

Comparing Embeddings

The magic happens when you compare vectors. Similar meanings = similar vectors.

Cosine Similarity is the most common comparison method. It measures the angle between two vectors:
  • 1.0 = identical meaning
  • 0.0 = unrelated
  • -1.0 = opposite directions (in practice, text embeddings rarely score below 0)

// Cosine similarity = dot product divided by the product of the vector lengths
function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0)
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0))
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0))
  return dotProduct / (magnitudeA * magnitudeB)
}
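
You can see the behavior on toy vectors. The numbers below are made up for illustration; real embeddings have hundreds of dimensions, but the comparison works the same way (the function is repeated so the snippet runs standalone):

```javascript
function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0)
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0))
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0))
  return dotProduct / (magnitudeA * magnitudeB)
}

// Toy 3-dimensional vectors standing in for real embeddings
const pizza  = [0.9, 0.1, 0.0]
const pasta  = [0.8, 0.2, 0.1]
const stocks = [0.0, 0.1, 0.9]

console.log(cosineSimilarity(pizza, pasta))   // close to 1: related topics
console.log(cosineSimilarity(pizza, stocks))  // close to 0: unrelated
```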

Practical Example: Building Search

Here's how to build semantic search with embeddings:

1. Index your documents:
const documents = [
  "How to make pizza dough",
  "Italian cooking basics",
  "Best pizza toppings"
]

const embeddings = await Promise.all(
  documents.map(doc => getEmbedding(doc))
)

// Store embeddings in a vector database

2. Search by meaning:
const query = "I want to learn to cook Italian food"
const queryEmbedding = await getEmbedding(query)

// Find most similar documents
const results = embeddings
  .map((emb, i) => ({
    doc: documents[i],
    score: cosineSimilarity(queryEmbedding, emb)
  }))
  .sort((a, b) => b.score - a.score)
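
In practice you usually keep only the best few matches above a quality cutoff. A small sketch of that step (the 0.3 threshold is illustrative; tune it for your data):

```javascript
// Keep the best k matches, dropping anything below a minimum score
function topK(results, k = 3, minScore = 0.3) {
  return results
    .filter(r => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
}
```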

Choosing an Embedding Model

Key factors:

  • Dimensions: More dimensions = more detail, but higher cost/storage
  • Context length: How much text can you embed at once?
  • Performance: Benchmark scores on your type of content
  • Cost: Prices vary 10x between models
Popular options:
  • OpenAI text-embedding-3-small: Great balance of cost and quality
  • Voyage AI: Best for code and technical content
  • Cohere: Strong multilingual support

Vector Databases

Once you have embeddings, you need somewhere to store and search them:

  • Pinecone: Managed, easy to use
  • Weaviate: Open source, feature-rich
  • Qdrant: Open source, fast
  • pgvector: PostgreSQL extension (simple setup)
  • Chroma: Lightweight, good for prototypes

Getting Started

1. Pick an embedding model (start with OpenAI small)
2. Generate embeddings for your content
3. Store them in a vector database
4. Build your search/RAG/recommendation system

Or use EmbedRoute to access multiple embedding providers through one API and find what works best for your use case.

Ready to try multiple embedding models?

Access OpenAI, Voyage, Cohere, and more with a single API.

Join the Waitlist