Chunking splits documents into small passages before embedding.
Why: search works best on focused passages, not whole files. Chunking lets a query match the one paragraph that answers it, not a 50-page document.
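result = rag.chunk(
    markdown=docs.outputs,   # ingestion output, or plain strings
    chunk_size=2000,         # target size of each chunk
    chunk_overlap=200,       # text repeated between adjacent chunks
    min_chunk_length=100,    # drop fragments shorter than this
)
result.items  # list of chunk dicts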

Choosing chunk size

chunk_size is the main lever. It’s a tradeoff:
| Too small | Too large |
| --- | --- |
| Loses surrounding context | Retrieves noise around the answer |
| More chunks, higher cost | Weaker similarity matching |
Start at chunk_size=2000, chunk_overlap=200. Lower the size for dense, factual content (specs, FAQs); raise it for narrative content where context matters.
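To get a rough feel for the cost side of this tradeoff, the sketch below estimates how many chunks a document yields at different sizes. It assumes a simple fixed-size sliding window measured in characters; `estimate_chunk_count` is a hypothetical helper, not part of the SDK, and the real chunker may count or split differently.

```python
import math


def estimate_chunk_count(doc_length: int, chunk_size: int, chunk_overlap: int) -> int:
    """Estimate chunks produced by a fixed-size sliding window.

    Assumption: lengths are in characters and each new chunk starts
    (chunk_size - chunk_overlap) after the previous one.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if doc_length <= 0:
        return 0
    if doc_length <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    # One full first chunk, then one more chunk per step needed to cover the rest.
    return 1 + math.ceil((doc_length - chunk_size) / step)


# Compare a few settings for a 100,000-character document, keeping overlap at 10%.
for size in (500, 2000, 4000):
    count = estimate_chunk_count(100_000, size, size // 10)
    print(f"chunk_size={size}: about {count} chunks")
```

Under these assumptions, dropping from 2000 to 500 roughly quadruples the number of chunks to embed and store.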
Overlap repeats a little text between adjacent chunks so a sentence split across a boundary isn’t lost. Keep it ~10% of chunk_size. min_chunk_length drops tiny fragments (stray headings, footers) that add noise.
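The way the three parameters interact can be illustrated with a minimal sliding-window chunker. This is a sketch under stated assumptions, not the SDK's implementation: `chunk_text` is a hypothetical function, sizes are treated as character counts, and splits ignore sentence or heading boundaries, which a real chunker may respect.

```python
def chunk_text(
    text: str,
    chunk_size: int = 2000,
    chunk_overlap: int = 200,
    min_chunk_length: int = 100,
) -> list[str]:
    """Split text into overlapping fixed-size chunks (character-based sketch)."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        # Drop tiny fragments, mirroring the role of min_chunk_length.
        if len(piece.strip()) >= min_chunk_length:
            chunks.append(piece)
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small numbers make the behavior easy to see.
sample = "abcdefghijklmnopqrst"  # 20 characters
print(chunk_text(sample, chunk_size=10, chunk_overlap=2, min_chunk_length=5))
# → ['abcdefghij', 'ijklmnopqr']
```

In the sample, "ij" appears at the end of the first chunk and the start of the second (the overlap), and the trailing 4-character fragment "qrst" is dropped because it is shorter than `min_chunk_length`.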

Reference

Full parameters: SDK chunking · CLI · REST.