Enhancement: Upload Documents as input Context vs RAG Workflow #2755

achhabra2 · 2024-05-16T18:03:24Z

What features would you like to see added?

It would be great if we could upload PDFs or text documents and have them processed as input context, rather than through the current workflow, which uses the RAG API. Some models now have context windows of 200K to 1M tokens, and we could make use of that without pasting large blocks of text into the chat window.

More details

We may need a secondary upload button, or something else that signifies which type of workflow you are using. In the ChatGPT or Gemini web interfaces, for example, uploaded documents get processed as context.

Below added from #3791

In the Claude UI, if you paste a large piece of text, it automatically gets attached and treated like a document. This is very easy to use as I can just paste large pieces of text from different sources, they get treated as separate documents, and then chat with them.

But LibreChat is more like ChatGPT, where any amount of pasted text gets added in the text box like a normal message. So having the above behavior I think is beneficial in many ways, at least as a toggle switch in the settings.

Sorry if this is duplicated; I couldn't find anything like this in the Issues. Loving LibreChat so far; really great alternative to paying for ChatGPT, Claude, and Gemini separately. Thanks!

First, pasting a large amount of text (the threshold could perhaps be customizable) would upload it as a TXT file instead of placing it in the chat box.

Second, when clicking on such a file, a UI popup opens up where we can check the file.
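The paste behavior requested above could be sketched roughly as follows. This is only an illustration in Python, not LibreChat code: `handle_paste`, `PasteResult`, the `pasted-N.txt` naming, and the 4000-character default are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default; the request suggests making the threshold customizable.
DEFAULT_PASTE_THRESHOLD_CHARS = 4000


@dataclass
class PasteResult:
    """What the chat input should do with a pasted block of text."""
    as_attachment: bool
    filename: Optional[str]
    text: str


def handle_paste(text: str, index: int = 1,
                 threshold: int = DEFAULT_PASTE_THRESHOLD_CHARS) -> PasteResult:
    """Keep short pastes inline; turn long ones into a TXT attachment."""
    if len(text) > threshold:
        # Each long paste becomes its own document, as in the Claude UI.
        return PasteResult(True, f"pasted-{index}.txt", text)
    return PasteResult(False, None, text)
```

A settings toggle could simply set `threshold` to a very large value to restore the current behavior.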

Which components are impacted by your request?

No response

Pictures

This is what I'm referring to.

Code of Conduct

I agree to follow this project's Code of Conduct

danny-avila · 2024-05-16T18:14:38Z

This request already exists, but your specifications are more clear so I will close the other in favor of yours. Thanks for the write up!

Closing #2335

raphaelgurtner · 2024-06-10T07:22:41Z

+1, this would be a major improvement, especially for use with Gemini 1.5 models with their large context sizes. If it helps: supported MIME types for each model can be found here https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/send-multimodal-prompts#media_requirements

danny-avila · 2024-06-10T13:12:37Z

Google will get a lot of love soon due to their improved dev tools. May have something to do with @logankilpatrick joining the team 😊

wesselhuising · 2024-07-10T07:21:13Z

Hi, any update on this? I would like to send PDFs to the Gemini 1.5 Pro model instead of using the RAG API. Cheers.

marcelamsler · 2024-07-15T09:28:35Z

This would improve the value of LibreChat massively and we are looking forward to this. Is there any timeline you could share?

schnaker85 · 2024-07-16T06:56:45Z

upvote +1 🚀

amir-ghasemi · 2024-07-16T13:20:37Z

FYI everyone, LibreChat already supports this as it comes with a rag_use_full_context flag which puts the entire document into context. One just needs to control this via the .env or add a setting in the UI.

schnaker85 · 2024-07-16T14:56:31Z

> FYI everyone, LibreChat already supports this as it comes with a rag_use_full_context flag which puts the entire document into context. One just needs to control this via the .env or add a setting in the UI.

https://github.com/danny-avila/LibreChat/blob/main/api/app/clients/prompts/createContextHandlers.js#L25
I think it still requires RAG_API_URL to be set up.

amir-ghasemi · 2024-07-16T17:21:40Z

> > FYI everyone, LibreChat already supports this as it comes with a rag_use_full_context flag which puts the entire document into context. One just needs to control this via the .env or add a setting in the UI.
>
> https://github.com/danny-avila/LibreChat/blob/main/api/app/clients/prompts/createContextHandlers.js#L25 I think it still requires RAG_API_URL to be set up.

Yes, indeed. We still need RAG for question answering against knowledge bases consisting of thousands of documents, so that feature should not go away. This flag and the existing endpoint in the RAG API allow for including the full document in context with minimal changes.
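To illustrate the difference between the two modes discussed here, a minimal sketch follows. It is not the code in createContextHandlers.js: the environment variable name `RAG_USE_FULL_CONTEXT` is assumed from the flag name mentioned above, `build_context` and `is_enabled` are hypothetical helpers, and a naive keyword overlap stands in for real embedding similarity.

```python
import os


def is_enabled(value) -> bool:
    """Treat the string 'true' (any case) as on; everything else as off."""
    return str(value or "").strip().lower() == "true"


def build_context(chunks, query, top_k=2, env=None) -> str:
    """Return either the whole document or only the best-matching chunks."""
    env = os.environ if env is None else env
    if is_enabled(env.get("RAG_USE_FULL_CONTEXT")):  # assumed variable name
        # Full-context mode: every chunk goes into the prompt, in order.
        return "\n".join(chunks)
    # Retrieval mode: rank chunks by shared words with the query
    # (a stand-in for a vector similarity search) and keep the top_k.
    words = set(query.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(words & set(c.lower().split())),
                    reverse=True)
    return "\n".join(ranked[:top_k])
```

Retrieval mode suits large knowledge bases, while full-context mode suits tasks such as summarizing a single document, where every part of the text matters.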

raphaelgurtner · 2024-08-26T12:09:29Z

> FYI everyone, LibreChat already supports this as it comes with a rag_use_full_context flag which puts the entire document into context. One just needs to control this via the .env or add a setting in the UI.

A document included in the context in this way is, however, still subject to any preprocessing/text extraction on LibreChat's part, right? The idea of this feature request is to circumvent that (if the user so desires) and have the model (endpoint) deal with the document as-is. This would allow many different use cases and document types: not only PDF but also sound, video, CSV, etc., basically whatever is supported by the model endpoints.

Is that feature considered out of scope (since it's currently not on the roadmap at all) or just low priority? If it's just low priority, would help with implementing it still be appreciated?

danny-avila · 2024-08-26T13:20:20Z

> A document included in the context in this way is, however, still subject to any preprocessing/text extraction on LibreChat's part, right?

No, I think it would be nice to have a simple "use full text" option while uploading. If the file is text-based, the browser can handle it and the server never interacts with the file other than adding it to the AI request; the text would just get appended to the user message.
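As a rough sketch of what "appended to the user message" could look like, assuming the file text has already been read on the client: the function name and the `<file>` delimiter format below are hypothetical and only meant to show the idea.

```python
from typing import Mapping


def append_files_to_message(message: str, files: Mapping[str, str]) -> str:
    """Append each file's text to the user message under a labeled delimiter."""
    parts = [message]
    for name, text in files.items():
        # The delimiter lets the model tell separate documents apart.
        parts.append(f'<file name="{name}">\n{text.strip()}\n</file>')
    return "\n\n".join(parts)


prompt = append_files_to_message(
    "Summarize this.",
    {"notes.txt": "alpha\nbeta\n"},
)
```

No embedding or retrieval step is involved here, so the only limit is the model's context window.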

marcelamsler · 2024-09-10T09:24:47Z

Is there an update here? We would love to use LibreChat to compare contracts in PDF format with Gemini 1.5.

o42o · 2024-09-10T19:45:14Z

+1

banjavi · 2024-09-13T18:20:28Z

+1

bsu3338 · 2024-09-15T17:44:01Z

Could this change incorporate the option to send images to RAG instead of having the model process them? I'm thinking of images of newspaper articles or document scans in image format. It may be possible to do the OCR with the RAG API and then include the result in "use full text". I can put this in as a separate request: an option to choose between using RAG and having the model process the content.

hksitorus · 2024-10-07T03:48:26Z

This would be really great! Some of our colleagues asked if they can summarize a doc/PDF, which does not really work well with the RAG workflow. This would be a really great use case for Gemini.

> We may need a secondary upload button, or something else that signifies which type of workflow you are using. In the ChatGPT or Gemini web interfaces, for example, uploaded documents get processed as context.

I feel like a secondary upload button (only for models that support it) makes more sense, so the user can choose between the RAG workflow and input context.

Xtrah · 2024-10-07T08:38:15Z

This feature would be amazing and is so far the only crucial limitation I'm having using LibreChat.

danny-avila · 2024-10-22T14:17:32Z

A native implementation for this is planned and I will work on it soon, in order to send text from files as part of the context.

schnaker85 · 2024-10-22T14:32:58Z

> A native implementation for this is planned and I will work on it soon, in order to send text from files as part of the context.

The above PR for the closed issue (#4503) would allow sending complete files as base64-encoded strings in the requests, for example to Google models.

Is this what you mean, or only text files (e.g. no PDFs)?

marcelamsler · 2024-10-31T10:34:47Z

Why was the Pull Request closed? It contains a working implementation, which could be used as a base. We really need this feature, as the usage is so limited without the possibility to upload Files.

danny-avila · 2024-10-31T14:10:43Z

> Why was the Pull Request closed? It contains a working implementation, which could be used as a base. We really need this feature, as the usage is so limited without the possibility to upload Files.

The PR is open, but it doesn't address this issue. The point is to pass text-based files into the prompt, not to pass them as base64.
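The distinction drawn here can be made concrete with a small sketch. The dictionary shapes below are hypothetical and do not match any particular provider's request format; they only contrast putting file text into the prompt with attaching the raw file as a base64 string.

```python
import base64


def as_prompt_text(prompt: str, file_text: str) -> dict:
    """Text-based file: its content becomes part of the prompt itself."""
    return {"type": "text", "text": f"{prompt}\n\n{file_text}"}


def as_base64_part(file_bytes: bytes, mime_type: str) -> dict:
    """Raw file: bytes are base64-encoded and sent alongside the prompt."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {"type": "file", "mime_type": mime_type, "data": encoded}
```

The text route works with any model, since the model only ever sees text. The base64 route depends on the endpoint accepting that MIME type, and base64 encoding makes the payload about one third larger than the original file.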

helgster77 · 2024-11-25T16:33:39Z

Is there any way to completely skip the embedding process and just upload a PDF file straight to the LLM? I don't need RAG capabilities for my use case.

achhabra2 added the enhancement (New feature or request) label May 16, 2024

danny-avila mentioned this issue May 16, 2024: Enhancement: Better handle large amounts of pasted input text #2335 (Closed)

danny-avila mentioned this issue Aug 26, 2024: Enhancement: Add large pasted text as a file upload (similar to Claude UI) #3791 (Closed)

danny-avila mentioned this issue Sep 15, 2024: 📁 feat: Add C# Support for Native File Search #4058 (Merged)

schnaker85 mentioned this issue Oct 22, 2024: Enhancement: Upload of Documents/files to use as input in models (not using RAG) #4502 (Closed)

franperic mentioned this issue Nov 7, 2024: Enhancement: PDF to Image Transformation for Comprehensive Image Processing #4656 (Open)
