This example builds a small support knowledge base, then chats over it. Run it as-is, then swap in your own content.

1. Install

pip install "ragrails[voyage,openai,qdrant]"

2. Set your keys and start Qdrant

export VOYAGE_API_KEY="..."
export OPENAI_API_KEY="..."
docker run -p 6333:6333 qdrant/qdrant
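Before moving on, it can help to confirm that both keys are set and that Qdrant is answering on port 6333. The following is a minimal preflight sketch using only the Python standard library. It is not part of ragrails: the helper names `missing_keys` and `qdrant_is_up` are hypothetical, and the check assumes Qdrant's REST endpoint for listing collections (`GET /collections`) at the URL used in this example.

```python
import os
import urllib.error
import urllib.request

# The two environment variables exported in the step above.
REQUIRED_KEYS = ("VOYAGE_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=os.environ):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def qdrant_is_up(url="http://localhost:6333", timeout=2.0):
    """Return True if Qdrant answers GET /collections with HTTP 200 (hypothetical helper)."""
    try:
        with urllib.request.urlopen(f"{url}/collections", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Missing keys:", missing_keys() or "none")
    print("Qdrant reachable:", qdrant_is_up())
```

If a key is reported missing, export it in the same shell you run the example from. If Qdrant is unreachable, check that the container is still running and that port 6333 is published.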

3. Build the knowledge base

from ragrails import RagRails

rag = RagRails()

rag.ingest(
    markdown=(
        "# Refund policy\n\n"
        "Customers can request a refund within 30 days of purchase. "
        "Refunds are returned to the original payment method within 5 business days."
    ),
    embedding={"provider": "voyage", "model": "voyage-3"},
    storage={"vector_db": "qdrant", "collection": "support", "url": "http://localhost:6333"},
)

ingest() chunks, embeds, and stores the content in one call. Swap markdown= for docs=, urls=, or api= to load real files, websites, or APIs.
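To make the "chunks" step concrete, here is a rough illustration of one common way to chunk Markdown: split on headings, then pack paragraphs into chunks up to a size limit. This is only a sketch of the general idea, not the chunker ragrails uses. The function name `chunk_markdown` and the `max_chars` parameter are assumptions made for illustration.

```python
import re


def chunk_markdown(text, max_chars=500):
    """Split Markdown into chunks: one section per heading, paragraphs packed up to max_chars.

    Illustrative only; this is not the ragrails chunking algorithm.
    """
    # Split just before each heading line so every section keeps its own heading.
    sections = [s.strip() for s in re.split(r"(?m)^(?=#{1,6} )", text) if s.strip()]
    chunks = []
    for section in sections:
        current = ""
        for para in re.split(r"\n\s*\n", section):
            para = para.strip()
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                # Adding this paragraph would exceed the limit: close the current chunk.
                chunks.append(current)
                current = para
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

With the refund-policy text above and the default limit, the whole section fits in a single chunk. Longer documents would yield one or more chunks per heading, each of which would then be embedded and stored.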

4. Chat over it

llm = rag.llm(provider="openai", model="gpt-4o-mini")
embedder = rag.embedder(provider="voyage", model="voyage-3", input_type="query")

result = rag.chat(
    "How long do I have to request a refund?",
    llm=llm,
    embedder=embedder,
    vector_db="qdrant",
    collection="support",
    url="http://localhost:6333",
    history=[],
)

print(result.answer)
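# Example output (exact wording may vary between runs):
# "You can request a refund within 30 days of purchase."

Chat is stateless: each call sees only the history you pass in. Save result.history and pass it as history= on the next turn to keep context.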
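Passing history forward can be sketched as a small loop. The helper below is hypothetical (`run_conversation` is not a ragrails function). It relies only on what this page shows: the chat call accepts `history=` and returns an object with `.answer` and `.history`. The history value is treated as opaque, since its exact shape is not described here.

```python
def run_conversation(chat_fn, questions, **chat_kwargs):
    """Ask each question in order, feeding the previous turn's history into the next call.

    chat_fn is any callable with the same shape as rag.chat in the example above.
    """
    history = []  # the first turn starts with an empty history, as in the example
    answers = []
    for question in questions:
        result = chat_fn(question, history=history, **chat_kwargs)
        answers.append(result.answer)
        history = result.history  # carry context forward; contents are treated as opaque
    return answers, history


# Usage with the objects defined in step 4 (requires the running setup above):
# answers, history = run_conversation(
#     rag.chat,
#     ["How long do I have to request a refund?", "And how is it paid back?"],
#     llm=llm,
#     embedder=embedder,
#     vector_db="qdrant",
#     collection="support",
#     url="http://localhost:6333",
# )
```

Taking the chat function as an argument keeps the loop independent of any particular backend, so it can be exercised with a stub before running against live services.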

Run it from the CLI

ragrails ingest \
  --markdown "# Refund policy. Customers can request a refund within 30 days of purchase." \
  --vector-db qdrant --collection support --url http://localhost:6333 \
  --provider voyage --model voyage-3

ragrails chat "How long do I have to request a refund?" \
  --vector-db qdrant --collection support --url http://localhost:6333 \
  --llm-provider openai --llm-model gpt-4o-mini

Next

How RAG works

Understand the pipeline.

Tune retrieval and chat

Reranking, rewriting, and more.