Vector search with cross-encoder re-ranking, hybrid BM25+vector retrieval, incremental index updates, and multiple LLM backends (Ollama local, OpenAI API).
A simple query to ChatGPT produced the following comparison of similarity metrics:
| Metric | Best For | Type | Notes |
|---|---|---|---|
| Cosine Similarity | L2-normalized vectors | Similarity | Scale-invariant |
| Dot Product | Transformer embeddings | Similarity | Fast, especially on GPUs |
| Euclidean Distance | Raw vectors with meaningful norms | Distance | Sensitive to scale |
| Jaccard | Sparse binary or set-based data | Similarity | Discrete features |
| Soft Cosine | Sparse with semantic overlap | Similarity | Better for text-term overlap |
| Learned Similarity | Fine-tuned deep models | Varies | Best accuracy, slowest retrieval |
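
To make the "Notes" column concrete, here is a minimal sketch of four of the metrics above in plain Python. The function names are illustrative assumptions and are not taken from this project's code. It shows why cosine similarity is scale-invariant while dot product and Euclidean distance are not, and that on L2-normalized vectors the dot product equals cosine similarity.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; grows with vector magnitude."""
    return sum(x * y for x, y in zip(a, b))


def l2_norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot_product(a, a))


def l2_normalize(a):
    """Scale a non-zero vector to unit length."""
    n = l2_norm(a)
    return [x / n for x in a]


def cosine_similarity(a, b):
    """Dot product divided by both norms, so scaling a or b has no effect."""
    return dot_product(a, b) / (l2_norm(a) * l2_norm(b))


def euclidean_distance(a, b):
    """Straight-line distance; sensitive to the scale of the inputs."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard_similarity(s, t):
    """Size of the intersection over size of the union of two sets."""
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(s & t) / len(s | t)


if __name__ == "__main__":
    a = [3.0, 4.0]
    b = [6.0, 8.0]  # same direction as a, twice the length

    print("cosine:", cosine_similarity(a, b))
    print("dot:", dot_product(a, b))
    print("euclidean:", euclidean_distance(a, b))
    print("dot on normalized:", dot_product(l2_normalize(a), l2_normalize(b)))
    print("jaccard:", jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))
```

Because `b` points in the same direction as `a`, the cosine similarity reports a perfect match, while the Euclidean distance is non-zero and the raw dot product reflects the vectors' lengths. Normalizing both vectors first makes the dot product agree with cosine similarity, which is why many vector indexes store normalized embeddings and use the cheaper dot product at query time.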