Skip to main content

POST /v1/ingest/url

Scrape one or more URLs.
{
  "url": "https://example.com/docs",
  "mode": "each",
  "max_depth": 3,
  "max_pages": 200,
  "output_format": "json"
}
Multiple URLs with per-URL config:
{
  "url": [
    "https://example.com/docs",
    {"url": "https://example.com/blog", "mode": "full", "max_depth": 1}
  ]
}

POST /v1/ingest/docs

Parse local document files or a folder.
{
  "folder": "files/docs/",
  "output_format": "json"
}

POST /v1/ingest/api

Fetch a REST API endpoint.
{
  "url": "https://api.example.com/posts",
  "title": "Blog posts",
  "headers": {"Authorization": "Bearer token"},
  "pagination": {"type": "page", "param": "page", "size_param": "per_page", "size": 100},
  "max_pages": 10
}

All ingestion responses follow this shape:
{
  "documents": 5,
  "failed": 0,
  "outputs": [...],
  "errors": []
}