Standard RAG pipelines treat documents as flat strings of text. They use "fixed-size chunking" (cutting a document every 500 ...
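
To make "fixed-size chunking" concrete, here is a minimal sketch of the approach described above. The source text is cut off after "every 500", so the unit is not stated; this example assumes characters, and the function name `fixed_size_chunks` is purely illustrative.

```python
def fixed_size_chunks(text: str, size: int = 500) -> list[str]:
    """Cut text into consecutive pieces of at most `size` characters.

    Assumption: the unit is characters. A real pipeline might count
    tokens or words instead.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    # Slice at every multiple of `size`, ignoring headings, sentences,
    # and any other structure in the document.
    return [text[i:i + size] for i in range(0, len(text), size)]


# Made-up sample document: a heading followed by repeated body text.
doc = "# Dosage\n" + "Take one tablet daily. " * 30
chunks = fixed_size_chunks(doc)
```

Because the cut points depend only on position, a chunk boundary can fall in the middle of a sentence or separate a heading from the text it introduces, which is the "flat string" behavior the paragraph describes.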
Just... Python 🐍: pure Python with zero dependencies. No C extensions or system libraries, easy to debug, and works anywhere Python runs, including PyPy and ...
A focused pipeline that parses medical guidelines (PDF/HTML) into structured JSON for downstream clinical RAG or summarization. It implements models, parsers, normalization utilities, and a CLI to ingest ...
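
As a rough illustration of the "model, normalize, emit JSON" flow described above, the sketch below uses only the standard library. Every name here (`GuidelineSection`, `normalize`, `to_json`) and the field layout are hypothetical; the project's actual schema and parser interfaces are not shown in this description, and the sample text is made up.

```python
import json
import re
from dataclasses import dataclass, asdict, field


@dataclass
class GuidelineSection:
    """Hypothetical model for one parsed section (not the project's schema)."""
    heading: str
    text: str
    recommendations: list[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Collapse whitespace runs of the kind PDF/HTML extraction leaves behind."""
    return re.sub(r"\s+", " ", text).strip()


def to_json(sections: list[GuidelineSection]) -> str:
    """Serialize parsed sections to a JSON array of objects."""
    return json.dumps([asdict(s) for s in sections], indent=2)


# Made-up sample input standing in for extracted guideline text.
section = GuidelineSection(
    heading=normalize("  Hypertension \n management "),
    text=normalize("Measure blood   pressure\nat every visit."),
)
output = to_json([section])
```

Keeping the model as a plain dataclass means the JSON shape follows directly from the field definitions, which makes the output easy to inspect before handing it to a retrieval or summarization step.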