Showing posts with the label Performance Engineering

Crushing RAG Latency: 50% Faster Retrieval with HNSW Tuning & Hybrid Re-ranking

You’ve built a RAG pipeline and the answers are accurate, but the retrieval step alone is eating up 800 ms. In a recent project handling document search for a financial assistant, we faced exactly this…