How to configure backend settings of document ingestion #644
Hi, thanks for open-sourcing this awesome project! I have been playing around with an experimental RAG workflow, for which I want to customize the document ingestion behaviour of the [...]. However, it seems that the various sets of pipeline options used by the [...] Is this the correct workflow, or is there a simpler way to configure the ingestion settings to my liking? It seems a bit overkill to define a new subclass just to change a module's configuration. (Similarly, I would like to change the chunking behaviour of my document parser, which I believe would require a similar solution.)
Replies: 1 comment
Hi @djdameln! Sorry for the late response, I hadn't properly configured notifications for GitHub Discussions.
You are absolutely right, we should allow options to be passed to the Docling parser. I've created an issue for that, and we'll add this in the next release: #662
To answer the question: while DocumentParser is designed to be subclassed and extended by users, this particular case should be (and soon will be) possible without subclassing.
Thanks for contributing!
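As a rough illustration of the approach described above (options passed to the parser instead of a new subclass per configuration), here is a minimal, self-contained sketch. Every name in it (`PipelineOptions`, `ConfigurableParser`, the `do_ocr` and `do_table_structure` fields, and the stand-in `DocumentParser` base class) is hypothetical and invented for this example; it is not the project's actual API, and the interface tracked in #662 may look different.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineOptions:
    """Hypothetical ingestion settings; the field names are illustrative only."""

    do_ocr: bool = True
    do_table_structure: bool = True


class DocumentParser:
    """Hypothetical stand-in for a base class that users may subclass."""

    def parse(self, text: str) -> list[str]:
        raise NotImplementedError


class ConfigurableParser(DocumentParser):
    """Hypothetical parser that takes its settings at construction time."""

    def __init__(self, options: PipelineOptions | None = None) -> None:
        # Fall back to the defaults when the caller passes nothing.
        self.options = options if options is not None else PipelineOptions()

    def parse(self, text: str) -> list[str]:
        # Stub: report which processing steps would run for this document.
        steps = ["layout"]
        if self.options.do_ocr:
            steps.append("ocr")
        if self.options.do_table_structure:
            steps.append("tables")
        return steps


# Changing behaviour is a matter of passing different options,
# so no extra subclass is needed for each configuration.
default_parser = ConfigurableParser()
no_ocr_parser = ConfigurableParser(PipelineOptions(do_ocr=False))
```

Subclassing then stays reserved for genuinely new parsing behaviour, while plain configuration changes go through the options object.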