Your users search with intent, not keywords. Here's how to add semantic understanding to Elasticsearch without ripping out your existing infrastructure.
Elasticsearch powers search for millions of applications. But keyword-based BM25 search has a fundamental limitation: it matches words, not meaning. A user searching for "comfortable shoes for running" won't find documents that only mention "cushioned athletic footwear." Semantic search solves this by matching on meaning through vector embeddings. The good news: you don't need to replace Elasticsearch. Since version 8.0, Elasticsearch supports native vector search alongside traditional BM25, giving you the best of both worlds.
Pure keyword search misses synonyms and intent. Pure semantic search misses exact terms and entity names (product IDs, error codes, brand names). Hybrid search combines both and consistently outperforms either approach alone.
| Search Type | Strengths | Weaknesses |
|---|---|---|
| BM25 (Keyword) | Exact matches, entity names, codes | No synonym understanding |
| Vector (Semantic) | Intent understanding, synonyms | Misses exact terms, higher latency |
| Hybrid | Both strengths combined | More complex to tune |
```
# Add embedding field to existing index mapping
PUT /products/_mapping
{
  "properties": {
    "description_embedding": {
      "type": "dense_vector",
      "dims": 1536,
      "index": true,
      "similarity": "cosine"
    }
  }
}
```
Use an embedding model (OpenAI text-embedding-3-small, Cohere embed-v3, or a self-hosted model) to generate vector representations of your documents. Index these alongside your existing text fields.
```python
# Backfill embeddings for existing documents
from elasticsearch import Elasticsearch, helpers
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")
openai_client = OpenAI()

def backfill_embeddings(index_name: str):
    # helpers.scan pages through every document via the scroll API
    for doc in helpers.scan(es, index=index_name,
                            query={"query": {"match_all": {}}}):
        embedding = openai_client.embeddings.create(
            model="text-embedding-3-small",
            input=doc["_source"]["description"],
        ).data[0].embedding
        es.update(index=index_name, id=doc["_id"],
                  body={"doc": {"description_embedding": embedding}})
```
```python
# Hybrid query combining BM25 + kNN vector search.
# query_embedding: the query text embedded with the same model used at index time.
query = {
    "query": {
        "match": {
            "description": "comfortable shoes for running"
        }
    },
    "knn": {
        "field": "description_embedding",
        "query_vector": query_embedding,
        "k": 10,
        "num_candidates": 100
    },
    "rank": {
        "rrf": {}  # Reciprocal Rank Fusion to combine scores
    }
}
```
Reciprocal Rank Fusion (RRF) is the recommended scoring strategy. Rather than trying to force BM25 scores and cosine similarities onto one comparable scale, it combines the rank positions from each result list, so no manual weight tuning is required.
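As a sketch of what RRF computes (a toy re-implementation, not Elasticsearch's internal code; the doc IDs are made up, and `k=60` is the rank constant commonly used in practice):

```python
def rrf(result_lists, k=60):
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

bm25_hits = ["doc_a", "doc_b", "doc_c"]   # ranked by keyword relevance
knn_hits = ["doc_b", "doc_d", "doc_a"]    # ranked by vector similarity
fused = rrf([bm25_hits, knn_hits])
# doc_b wins: it ranks high in both lists, even though neither list put it first
```

Because only ranks matter, a document that appears near the top of both lists beats one that tops a single list, which is exactly the behavior you want from hybrid search.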
For the highest search quality, add a cross-encoder re-ranker as a final stage. Elasticsearch retrieves candidates with hybrid search, then a cross-encoder (such as the Cohere reranker or a custom BERT model) re-scores the top 20-50 results for precise relevance ranking.
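A minimal sketch of that final stage, written around a generic `score_fn` callback; in production the callback would wrap a real cross-encoder (e.g. `sentence_transformers.CrossEncoder.predict` or Cohere's rerank API — both assumptions here, not shown wired up):

```python
def rerank(query, hits, score_fn, top_n=20):
    """Re-score the top hybrid-search candidates with a cross-encoder.

    score_fn(query, text) -> relevance score. Cross-encoders read the query
    and document together, so they are slower but more precise than the
    bi-encoder embeddings used for retrieval -- hence only the top_n hits.
    """
    candidates = hits[:top_n]
    scored = [(hit, score_fn(query, hit["description"])) for hit in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [hit for hit, _ in scored]
```

The two-stage design keeps latency bounded: the expensive model only ever sees a few dozen documents, regardless of index size.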
You don't need to reindex everything at once. Phase it in: add the dense_vector mapping, backfill embeddings in the background, then roll hybrid queries out once coverage is complete.
For a comparison of dedicated vector databases when you outgrow Elasticsearch's vector capabilities, see our vector DB comparison. For using semantic search as part of a RAG pipeline, read about fixing RAG failures with agentic AI.
Not necessarily. If you need full-text search, faceting, aggregations, and vector search in one system, Elasticsearch handles all of them. Replace it only if vector search is your primary use case and you need better performance at scale; in that case, look at Pinecone or Weaviate.
For English text: OpenAI text-embedding-3-small (good balance of quality and cost), Cohere embed-v3 (strong multilingual), or sentence-transformers/all-MiniLM-L6-v2 (free, self-hosted, fast). Match the model's dimension count to your dense_vector field configuration.
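One gotcha worth guarding against: indexing vectors whose size doesn't match the mapping. A small illustrative check (the dimension values below come from the vendors' published model specs; `check_dims` is a hypothetical helper, not a library function):

```python
# Output dimensions of the models mentioned above; the dense_vector "dims"
# in your index mapping must match the model's output exactly.
MODEL_DIMS = {
    "text-embedding-3-small": 1536,                    # OpenAI
    "embed-english-v3.0": 1024,                        # Cohere
    "sentence-transformers/all-MiniLM-L6-v2": 384,     # self-hosted
}

def check_dims(model_name: str, mapping_dims: int) -> None:
    """Fail fast before indexing instead of at query time."""
    actual = MODEL_DIMS[model_name]
    if actual != mapping_dims:
        raise ValueError(
            f"{model_name} outputs {actual}-d vectors, "
            f"but the index mapping expects {mapping_dims}"
        )
```

For example, swapping OpenAI embeddings for all-MiniLM-L6-v2 against the 1536-dim mapping defined earlier would fail this check, which is the point: you'd need to reindex with `"dims": 384`.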
Embeddings increase storage by approximately 6 KB per document (for 1536-dimensional vectors). kNN search adds 20-50ms per query. The embedding API costs (if using OpenAI) are approximately $0.02 per 1M tokens for the backfill, then per-query embedding costs. See our guide on reducing OpenAI costs for optimization strategies.
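Back-of-the-envelope, those numbers work out as follows (the 1M-document corpus size and 200-token average document length are illustrative assumptions):

```python
DIMS = 1536
BYTES_PER_FLOAT32 = 4
bytes_per_doc = DIMS * BYTES_PER_FLOAT32        # 6144 bytes, i.e. ~6 KB/doc

docs = 1_000_000
extra_storage_gb = docs * bytes_per_doc / 1e9   # ~6.1 GB of extra storage

avg_tokens_per_doc = 200                        # assumption for illustration
price_per_1m_tokens = 0.02                      # text-embedding-3-small, USD
backfill_cost = docs * avg_tokens_per_doc / 1e6 * price_per_1m_tokens  # $4.00
```

So for a million typical documents, the one-time backfill is a few dollars of API spend; the recurring costs are the extra storage and the per-query embedding call.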
We modernize search infrastructure with semantic capabilities, from Elasticsearch optimization to full RAG pipelines.
Modernize Your Search