The chunk command reads the JSON files in --input-dir (the ingestion output) and writes a chunks.json file to --output-dir.
ragrails chunk \
  --input-dir files/output/docs/ \
  --output-dir files/chunks/ \
  --chunk-size 2000 \
  --chunk-overlap 200 \
  --min-chunk-length 100

Options

Option               Default    Description
--input-dir          required   Folder containing ingestion output JSON files
--output-dir         required   Folder to write chunks.json to
--chunk-size         2000       Max characters per chunk
--chunk-overlap      200        Character overlap between chunks
--min-chunk-length   100        Minimum chunk length to keep
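
To make the interaction between the three numeric options concrete, here is a minimal Python sketch of character-based sliding-window chunking. It is an illustration only: the function name chunk_text and the exact splitting strategy are assumptions, and the actual ragrails implementation may split differently (for example on sentence or paragraph boundaries).

```python
def chunk_text(text, chunk_size=2000, chunk_overlap=200, min_chunk_length=100):
    """Split text into overlapping character windows.

    Hypothetical helper, not part of ragrails: the parameter names mirror
    the CLI options, but the real tool's algorithm is not documented here.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each window starts this many characters after the previous one,
    # so consecutive windows share chunk_overlap characters.
    step = chunk_size - chunk_overlap

    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        # Drop pieces shorter than the minimum length.
        if len(piece) >= min_chunk_length:
            chunks.append(piece)
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(4000))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

Under these assumptions, the default values advance the window by 1800 characters at a time (2000 minus 200), and any piece shorter than 100 characters is discarded.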