-
I think there is an example here of using an ingest pipeline to do chunking. So I'd expect something like the following to do the chunking for you:
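As a rough sketch (the pipeline name `fscrawler_chunker`, the target field `chunks`, and splitting on blank lines are my assumptions, not anything FSCrawler mandates): a pipeline with a script processor that splits the `content` field FSCrawler extracts into a paragraph array, something like:

```json
PUT _ingest/pipeline/fscrawler_chunker
{
  "description": "Sketch: split the text FSCrawler extracts into paragraph chunks",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.content != null) { ctx.chunks = Arrays.asList(ctx.content.splitOnToken('\\n\\n')); }"
      }
    }
  ]
}
```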
You can define the pipeline within the FSCrawler configuration. I'd be happy to hear if this works for you. If so, I'd love to get a documentation PR on it.
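For reference, here is a minimal sketch of that configuration, assuming a job named `my_job` (so the settings file lives at `~/.fscrawler/my_job/_settings.yaml`) and a placeholder documents path. FSCrawler's `elasticsearch.pipeline` setting tells Elasticsearch to run every indexed document through the pipeline, so the chunks are produced at index time without changing FSCrawler itself:

```yaml
name: "my_job"
fs:
  url: "/path/to/docs"   # placeholder: directory FSCrawler should crawl
elasticsearch:
  # Route every document through the ingest pipeline defined above
  pipeline: "fscrawler_chunker"
```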
-
Hi,
I'm trying to build a small POC of a GPT-enabled knowledge repository for my organisation, covering sensitive content in all file formats. I am using Elasticsearch with LangChain. The setup is working but with poor accuracy; one of the suggested remedies is to chunk the data. As part of that, I was wondering if there is any way in FSCrawler to split a file (at least for PDFs) by paragraph, for better semantic analysis.
Thanks