
πŸ—“οΈ 07082025 1037
πŸ“Ž

vector_search

Vector search is a technique for finding similar items based on numerical vector representations (aka embeddings) of data, instead of doing keyword or exact-match lookups.

Traditional search:

  • Looks for exact keyword matches
  • Doesn't understand meaning or context

Vector search:

  • Finds items that are semantically similar, even if the words used are different

💡 How It Works (Simplified)

  1. Convert data to vectors (embeddings)
    • e.g. a sentence like "I love pizza" → [0.3, -0.7, 0.1, ...]
  2. Store all vectors in a special database or index
  3. When you query (e.g. "best food"),
    • it's also converted into a vector
    • the index returns the stored vectors closest to it (e.g. by cosine similarity)
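
The steps above can be sketched in plain Python. This is a minimal brute-force illustration, not how a production index works: the 3-dimensional vectors are hand-written stand-ins for what an embedding model would produce, and the names `cosine_similarity`, `index`, and `search` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Steps 1 and 2: store text alongside its embedding.
# ASSUMPTION: these tiny vectors are made up for illustration;
# a real embedding model returns hundreds of dimensions.
index = {
    "I love pizza": [0.9, 0.1, 0.0],
    "Pasta is delicious": [0.8, 0.2, 0.1],
    "The stock market fell": [0.0, 0.1, 0.9],
}

def search(query_vector, index, k=2):
    """Step 3: score every stored vector against the query, keep the top k."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]

# ASSUMPTION: pretend this is the embedding of the query "best food".
query_vector = [1.0, 0.0, 0.1]
results = search(query_vector, index)
print(results)
```

The query shares no words with the stored sentences, yet the two food sentences rank first because their vectors point in nearly the same direction as the query vector.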

🧭 Common Use Cases

| Use Case | Description |
| --- | --- |
| 🔍 Semantic Search | Search documents/images by meaning, not exact words |
| 🧑‍🤝‍🧑 Recommendation Systems | Find similar products/users based on embeddings |
| 🧠 AI Chatbots / RAG | Retrieve relevant knowledge chunks before answering |
| 🖼️ Image Search | Find visually similar images |
| 🧬 Genomics | Compare DNA embeddings |

Tools

| Tool | Description |
| --- | --- |
| FAISS (Meta) | Fast indexing of vectors (C++/Python) |
| Annoy (Spotify) | Approximate Nearest Neighbor library in C++ with Python bindings |
| Milvus / Qdrant | Scalable vector DBs with APIs |
| Weaviate | Full-featured vector DB with modules |
| Pinecone | Managed vector DB service |
| Elasticsearch + kNN | kNN vector search that can be combined with keyword queries for hybrid search |

✅ Pros

  • Understands semantics, not just keywords
  • Enables fuzzy, context-aware search
  • Great for unstructured data: text, images, audio, etc.

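⚠️ Cons

  • Usually slower and more resource-hungry than keyword search, though approximate nearest neighbor (ANN) indexes narrow the gap
  • Needs preprocessing: every item must be run through an embedding model before it can be indexed
  • Scaling the index and keeping embeddings fresh get harder as the dataset grows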
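
To make the speed trade-off concrete, here is a rough sketch of one ANN idea, random-hyperplane locality-sensitive hashing (LSH). It is an illustrative assumption, not necessarily the method any particular tool uses, and the names `make_hyperplanes`, `signature`, `build_buckets`, and `candidates` are hypothetical. Vectors that point in similar directions tend to land in the same bucket, so a query only compares against one bucket instead of the whole dataset, at the cost of sometimes missing a true neighbor.

```python
import random

def make_hyperplanes(dim, n_planes, seed=0):
    """Draw random hyperplanes (as normal vectors) with a fixed seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )

def build_buckets(vectors, planes):
    """Group item ids by signature; each group is one hash bucket."""
    buckets = {}
    for item_id, vec in vectors.items():
        buckets.setdefault(signature(vec, planes), []).append(item_id)
    return buckets

def candidates(query, planes, buckets):
    """Only items sharing the query's bucket are checked further."""
    return buckets.get(signature(query, planes), [])

# ASSUMPTION: toy 3-dimensional vectors standing in for real embeddings.
vectors = {
    "pizza": [0.9, 0.1, 0.0],
    "pasta": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}
planes = make_hyperplanes(dim=3, n_planes=4)
buckets = build_buckets(vectors, planes)
print(candidates([0.9, 0.1, 0.0], planes, buckets))
```

More hyperplanes mean smaller buckets and faster queries but a higher chance of missing a close neighbor; real systems usually tune this balance or combine several hash tables.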

Keyword Search vs. Vector Search

| | Keyword Search | Vector Search |
| --- | --- | --- |
| Based on | Exact words | Semantic meaning |
| Example Query | "red shoes" | "comfortable running gear" |
| Finds | Pages with "red shoes" | Pages about sneakers or running shoes |
| Under the hood | Inverted index | Vector similarity (ANN) |
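
For the keyword side of this comparison, a toy inverted index shows why the second example query finds nothing with exact matching. This is a simplified sketch (whitespace tokenization, AND semantics, no stemming or ranking); the documents and the names `build_inverted_index` and `keyword_search` are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: three made-up product pages.
docs = {
    1: "red shoes on sale",
    2: "lightweight sneakers for jogging",
    3: "running shoes with extra cushioning",
}

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def keyword_search(query, index):
    """Return ids of documents that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = None
    for token in tokens:
        postings = index.get(token, set())
        result = postings if result is None else result & postings
    return result

inverted = build_inverted_index(docs)
print(keyword_search("red shoes", inverted))
print(keyword_search("comfortable running gear", inverted))
```

The first query matches the page that literally contains both words. The second matches nothing, because "comfortable" and "gear" appear in no document, even though pages 2 and 3 are clearly relevant; closing that gap is what embedding-based similarity is for.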
