
Support GPT4All Server API #11870

Open
ThiloteE opened this issue Oct 1, 2024 · 12 comments · May be fixed by #12078
Assignees
Labels
AI Related to AI Chat/Summarization FirstTimeCodeContribution Triggers GitHub Greeter Workflow type: enhancement

Comments


ThiloteE commented Oct 1, 2024

Description of solution:

I want to use JabRef's AI feature locally. There are multiple applications out there that provide a server API, and they very often offer an API that resembles the OpenAI API.

GPT4All is such an application. Others are llama.cpp, Ollama, LM Studio, Jan, and KoboldCpp. I am sure there are more, but those are the most well-known ones.

The big advantage of these applications is that they offer more samplers, GPU acceleration, broader hardware support, and support for models that have not been added to JabRef.
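These servers all speak (a subset of) the OpenAI chat-completions protocol, so a client only needs a configurable base URL. As a hedged illustration (not JabRef's actual code), here is a minimal sketch of the request body such a server accepts; the base URL is GPT4All's default local endpoint from the curl commands later in this thread, and the model name is just the example used there:

```python
import json

# Illustrative sketch: build an OpenAI-style chat-completions request body
# for a local server such as GPT4All. Base URL and model name are examples.
BASE_URL = "http://localhost:4891/v1"  # GPT4All's default local endpoint

def build_chat_request(model: str, user_message: str) -> dict:
    """Return a body accepted by OpenAI-compatible /chat/completions endpoints."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }

body = build_chat_request(
    "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor",
    "What are the authors of the paper?",
)
print(json.dumps(body)[:60])
```

Swapping the base URL from https://api.openai.com/v1 to the local endpoint is, in principle, all that is needed; the issues below are about where that breaks down in practice.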

Problem

It kind of works with GPT4All already, but something is wrong. I believe the embeddings are not sent together with the prompt, and responses look like they are cut off in the middle.

GPT4All: (screenshot)

JabRef: (screenshot)

Additional context

JabRef preferences: (screenshot)

GPT4All preferences: (screenshot)

GPT4All model settings 1: (screenshot)

GPT4All model settings 2: (screenshot)

@ThiloteE ThiloteE added type: enhancement AI Related to AI Chat/Summarization labels Oct 1, 2024

ThiloteE commented Oct 1, 2024

I also often get the following errors/warnings on the command line when I try to send messages while connected to the GPT4All server API. Not sure if this is related.

2024-10-01 13:13:53 [pool-2-thread-4] org.jabref.logic.ai.chatting.AiChatLogic.execute()
INFO: Sending message to AI provider (https://api.openai.com/v1) for answering in entry CooperEtAl200708cah: What are the authors of the paper?
2024-10-01 13:13:53 [JavaFX Application Thread] org.jabref.gui.ai.components.aichat.AiChatComponent.lambda$onSendMessage$11()
ERROR: Got an error while sending a message to AI: io.github.stefanbratanov.jvm.openai.OpenAIException: 400 - message: Invalid 'messages[2].role': did not expect 'user' here, type: invalid_request_error, param: null, code: null
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.lambda$validateHttpResponse$6(OpenAIClient.java:129)
        at java.base/java.util.Optional.ifPresentOrElse(Optional.java:196)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.validateHttpResponse(OpenAIClient.java:127)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.sendHttpRequest(OpenAIClient.java:85)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.sendHttpRequest(OpenAIClient.java:78)
        at [email protected]/io.github.stefanbratanov.jvm.openai.ChatClient.createChatCompletion(ChatClient.java:37)
        at [email protected]/org.jabref.logic.ai.chatting.model.JvmOpenAiChatLanguageModel.generate(JvmOpenAiChatLanguageModel.java:65)
        at [email protected]/org.jabref.logic.ai.chatting.model.JabRefChatLanguageModel.generate(JabRefChatLanguageModel.java:142)
        at [email protected]/dev.langchain4j.chain.ConversationalRetrievalChain.execute(ConversationalRetrievalChain.java:85)
        at [email protected]/dev.langchain4j.chain.ConversationalRetrievalChain.execute(ConversationalRetrievalChain.java:32)
        at [email protected]/org.jabref.logic.ai.chatting.AiChatLogic.execute(AiChatLogic.java:168)
        at [email protected]/org.jabref.gui.ai.components.aichat.AiChatComponent.lambda$onSendMessage$9(AiChatComponent.java:204)
        at [email protected]/org.jabref.logic.util.BackgroundTask$1.call(BackgroundTask.java:73)
        at [email protected]/org.jabref.gui.util.UiTaskExecutor$1.call(UiTaskExecutor.java:191)
        at javafx.graphics@23/javafx.concurrent.Task$TaskCallable.call(Task.java:1401)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
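The 400 error above ("did not expect 'user' here") suggests the server's chat template enforces strict user/assistant alternation, while the client apparently sends the retrieved context as an extra user-role message. A hedged illustration of the constraint and one possible client-side workaround (merging consecutive same-role messages); this is not JabRef's actual code:

```python
def enforce_alternation(messages: list[dict]) -> list[dict]:
    """Merge consecutive messages that share a role, so the history strictly
    alternates user/assistant as llama.cpp-style chat templates often require.
    Illustrative workaround only, not JabRef's actual fix."""
    merged: list[dict] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            # Two user (or two assistant) messages in a row would trigger the
            # "did not expect 'user' here" 400 error; fold them into one.
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append({"role": msg["role"], "content": msg["content"]})
    return merged

history = [
    {"role": "user", "content": "Context: retrieved paper fragments ..."},
    {"role": "user", "content": "What are the authors of the paper?"},
]
print(enforce_alternation(history))  # single merged user message
```

Whether merging, or re-tagging the context as a system message, is the right fix depends on what the server's template accepts; the OpenAI cloud API tolerates consecutive user messages, which would explain why this only surfaces with local servers.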


ThiloteE commented Oct 1, 2024

I think there is also an issue when switching models.
(screenshot)
Even after clearing the chat history, it is not possible to get around this error message unless I switch to a different entry in JabRef. Then I can chat again.


ThiloteE commented Oct 1, 2024

@InAnYan


InAnYan commented Oct 1, 2024

Such an interesting issue and such interesting behaviour...

IMHO it is very wrong that there is no fully OpenAI-API-compatible mode in GPT4All (standardization makes people's lives much easier).

However, I will look into this issue in more detail, because GPT4All is a popular app and this is also important.

@FeiLi-lab

(screenshot)

This is so weird. I tried it last night and ran into the problem mentioned above, i.e. the returned content was truncated. But when I tried again this afternoon, the response came back complete, without truncation.

@FeiLi-lab

Ok... I reproduced it. I will try to fix it.


InAnYan commented Oct 15, 2024

Oh, @FeiLi-lab, @ThiloteE, when you get the truncated output, could you try clicking on the text area?

There is a bug in the UI where the text area is not expanded. Could this be the cause of the truncated output?

@ThiloteE
Member Author

I will try to have a look at this on the weekend.


ThiloteE commented Oct 17, 2024

This may also have been ggerganov/llama.cpp#9867 in upstream llama.cpp. The fix would need some time to reach downstream GPT4All.

@NoahXu718

I am also working on this issue. I think the problem of responses looking cut off in the middle may come from the request.

I ran the following two commands on my computer, one with max_tokens set and one without, and the result shows that the answer without max_tokens set was cut off.

curl -X POST http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor", "messages": [{"role": "user", "content": "could you please introduce more about your self?"}], "max_tokens": 2048, "temperature": 0.7}'

(screenshot)

curl -X POST http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor", "messages": [{"role": "user", "content": "could you please introduce more about your self?"}], "temperature": 0.7}'

(screenshot)

I set max_tokens in the code, and now the response looks complete.
Preferences: (screenshot)

JabRef: (screenshot)

@ThiloteE I will refine the code later if you can review it.
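The two curl calls above differ only in whether max_tokens is present, which matches the truncation symptom: without it, GPT4All apparently falls back to a small completion budget instead of generating until the model stops. A hedged sketch of the client-side workaround; the default of 2048 is just the value used in the curl command above, not a recommendation:

```python
DEFAULT_MAX_TOKENS = 2048  # value from the curl experiment above; tune as needed

def with_max_tokens(payload: dict, max_tokens: int = DEFAULT_MAX_TOKENS) -> dict:
    """Return a copy of an OpenAI-style request body that always carries
    max_tokens, since GPT4All's server truncated replies when it was absent.
    An explicit value in the payload is left untouched."""
    out = dict(payload)
    out.setdefault("max_tokens", max_tokens)
    return out

request = with_max_tokens({
    "model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor",
    "messages": [{"role": "user", "content": "could you please introduce more about your self?"}],
    "temperature": 0.7,
})
assert request["max_tokens"] == 2048
```

Whether to hard-code a default or expose it in JabRef's preferences is the open design question raised in the next comment.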

@ThiloteE
Member Author

Oh, nice! Good to know. Yes, a pull request would be nice; otherwise nobody can review it.
Do you think this is something we would need to add to the preferences?

@ThiloteE ThiloteE added the FirstTimeCodeContribution Triggers GitHub Greeter Workflow label Oct 19, 2024
Contributor

Welcome to the vibrant world of open-source development with JabRef!

Newcomers, we're excited to have you on board. Start by exploring our Contributing guidelines, and don't forget to check out our workspace setup guidelines to get started smoothly.

In case you encounter failing tests during development, please check our developer FAQs!

Having any questions or issues? Feel free to ask here on GitHub. Need help setting up your local workspace? Join the conversation on JabRef's Gitter chat. And don't hesitate to open a (draft) pull request early on to show the direction it is heading towards. This way, you will receive valuable feedback.

⚠ Note that this issue will become unassigned if it isn't closed within 30 days.

🔧 A maintainer can also add the Pinned label to prevent it from being unassigned automatically.

Happy coding! 🚀

@NoahXu718 NoahXu718 linked a pull request Oct 24, 2024 that will close this issue
NoahXu718 added a commit to ShunL12324/jabref that referenced this issue Oct 24, 2024
Projects
Status: Normal priority