Chat is stateless. Ragrails never stores conversations. You pass
history in and get the updated history back, so you decide where it lives (session, database, memory). Safe for multi-user apps.Tuning knobs
Each is a config object. Use them by symptom:| Symptom | Config | What it does |
|---|---|---|
| Follow-ups miss context | QueryRewriteConfig | Expands follow-ups into standalone questions. |
| Long chats slow / costly | HistoryCompactionConfig | Summarizes old turns, keeps recent ones. |
| Small talk triggers search | IntentRoutingConfig | Skips retrieval for non-questions. |
| Model answers with weak context | ChatRetrievalQualityConfig | Sets confidence thresholds; answers with caution or refuses. |
Query rewriting
Why: “what about the second one?” is meaningless to search alone. Rewriting uses history to make it standalone.
History compaction
Why: long conversations overflow the context window and cost more every turn. Compaction summarizes old messages and keeps recent ones verbatim.
Intent routing
Why: “Thanks!” doesn’t need a database lookup. Routing skips retrieval for small talk, which is faster and avoids irrelevant context.
Retrieval quality
Why: sometimes your index just doesn’t have the answer. Rather than hallucinate, Ragrails scores the retrieved context and can answer with caution or refuse.
What you get back
| Field | Use it for |
|---|---|
answer | The grounded reply |
history | Pass to the next turn |
sources | Show citations |
intent | "rag" or "direct" |
answer_confidence | Flag low-confidence answers in your UI |

