Retrieval - Ragrails

embedder = rag.embedder(provider="voyage", model="voyage-3", input_type="query")

result = rag.retrieve(
    "How do I authenticate?",
    embedder=embedder,
    vector_db="qdrant", collection="docs", url="http://localhost:6333",
    top_k=10,
)

for chunk in result.items:
    print(chunk.score, chunk.text)

Tune it by symptom

Symptom	Fix
Too few or too many results	Adjust `top_k`
Right topic, wrong ranking	Turn on reranking
Conversational follow-ups miss	Turn on query rewriting

Reranking

Why: vector search is fast but approximate. A reranker rescores the top candidates against the query and reorders them by relevance. Use it when precision matters more than latency.

reranker = rag.reranker(provider="voyage", model="rerank-2-lite")

result = rag.retrieve(
    "How do I authenticate?",
    embedder=embedder,
    vector_db="qdrant", collection="docs", url="http://localhost:6333",
    top_k=20,                 # retrieve wide
    use_rerank=True,
    reranker=reranker,
    rerank_top_k=5,           # keep the best few
)

Retrieve wide (`top_k=20`), then rerank down (`rerank_top_k=5`). The reranker can only reorder the candidates it is given, so the pool should be larger than the number of results you keep.
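
To see why the size of the candidate pool matters, here is a minimal sketch of the two-stage pattern that does not use Ragrails at all. The names `retrieve_then_rerank`, `word_overlap`, and `phrase_match` are hypothetical, and the two toy scoring functions are assumptions that stand in for the vector search and the reranker only so the example runs on its own.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    cheap_score: Scorer,
    precise_score: Scorer,
    top_k: int = 20,
    rerank_top_k: int = 5,
) -> List[Tuple[float, str]]:
    """Two-stage retrieval: a wide, cheap pass followed by a narrow, precise pass."""
    # Stage 1: rank every document with the cheap score and keep a wide pool.
    pool = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:top_k]
    # Stage 2: rescore only the pool with the expensive score and keep the best few.
    rescored = [(precise_score(query, d), d) for d in pool]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored[:rerank_top_k]


def word_overlap(query: str, doc: str) -> float:
    """Toy first stage (assumption): number of words shared by query and doc."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_match(query: str, doc: str) -> float:
    """Toy reranker (assumption): word overlap plus a bonus for an exact phrase hit."""
    bonus = 10.0 if query.lower() in doc.lower() else 0.0
    return word_overlap(query, doc) + bonus


DOCS = [
    "key api reset reference index",
    "how to reset api key from the dashboard",
    "billing and invoices",
    "api rate limits",
]
QUERY = "reset api key"

# With a pool of one, the reranker has nothing to choose between.
narrow = retrieve_then_rerank(QUERY, DOCS, word_overlap, phrase_match, top_k=1, rerank_top_k=1)
# With a wider pool, the reranker can promote the better match.
wide = retrieve_then_rerank(QUERY, DOCS, word_overlap, phrase_match, top_k=3, rerank_top_k=1)

print("narrow:", narrow[0][1])
print("wide:  ", wide[0][1])
```

In this toy setup the first two documents tie on the cheap score, so a pool of one keeps whichever comes first. Only the wider pool lets the second stage surface the document that actually contains the query phrase.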

Query rewriting

Why: follow-ups like “what about the second one?” mean nothing to a search engine on their own. Query rewriting uses the session context to expand them into standalone questions.

rewrite_llm = rag.llm(provider="openai", model="gpt-4o-mini")

result = rag.retrieve(
    "What about the second step?",
    embedder=embedder,
    vector_db="qdrant", collection="docs", url="http://localhost:6333",
    use_query_rewrite=True,
    rewrite_llm=rewrite_llm,
    session_context="User is asking about the onboarding flow.",
)
result.search_query  # the rewritten query actually searched
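
The general idea behind query rewriting can be sketched independently of Ragrails: combine the session context and the follow-up into a prompt, ask a model for a standalone question, and search with the answer. Everything below is an assumption for illustration. `build_rewrite_prompt`, `rewrite_query`, and the prompt wording are hypothetical, `fake_llm` is a hard-coded stub rather than a real model call, and the prompt Ragrails actually sends may differ.

```python
from typing import Callable


def build_rewrite_prompt(question: str, session_context: str) -> str:
    """Assemble a rewrite prompt (hypothetical wording)."""
    return (
        "Rewrite the follow-up question as a standalone search query.\n"
        f"Context: {session_context}\n"
        f"Follow-up: {question}\n"
        "Standalone query:"
    )


def rewrite_query(question: str, session_context: str, llm: Callable[[str], str]) -> str:
    """Return a standalone query, falling back to the original question."""
    # Without context there is nothing to resolve the follow-up against.
    if not session_context.strip():
        return question
    rewritten = llm(build_rewrite_prompt(question, session_context)).strip()
    # Guard against an empty model reply.
    return rewritten or question


def fake_llm(prompt: str) -> str:
    """Stub model (assumption): a real LLM would generate this text."""
    return "What is the second step of the onboarding flow?"


search_query = rewrite_query(
    "What about the second step?",
    "User is asking about the onboarding flow.",
    fake_llm,
)
print(search_query)
```

Falling back to the original question when the context or the model reply is empty keeps retrieval working even if the rewrite step contributes nothing.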
