
Updated prompt_style to be moved to the main LLM settings and included various settings into the openailike mode. #1835

Merged
4 commits merged into zylon-ai:main from openai-mode-new on Apr 30, 2024

Conversation

@icsy7867 (Contributor) commented Apr 4, 2024

I noticed, after trying new models and implementing new chat-style templates, that the templates did not seem to have any effect when using openailike.

I have moved "prompt_style" under the LLM: section in the settings.yaml file, since it should be usable universally across the various LLM modes.
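
A minimal sketch of what the shared settings block could look like after the move (the field names, defaults, and list of prompt styles here are assumptions for illustration, not copied from this PR's diff):

```python
from typing import Literal

from pydantic import BaseModel, Field


class LLMSettings(BaseModel):
    """Sketch of the shared llm settings block in settings.py (hypothetical field set)."""

    mode: Literal["llamacpp", "openailike", "openai", "sagemaker", "mock"]
    max_new_tokens: int = Field(
        256, description="Maximum number of new tokens the LLM is allowed to generate."
    )
    context_window: int = Field(
        3900, description="Maximum number of context tokens the model can use."
    )
    temperature: float = Field(
        0.1, description="Sampling temperature; higher values produce more varied output."
    )
    # prompt_style now lives here instead of under LlamaCPPSettings, so every
    # LLM mode (llamacpp, openailike, ...) can share the same chat formatting.
    prompt_style: Literal["default", "llama2", "tag", "mistral", "chatml"] = Field(
        "llama2",
        description="Prompt style used to turn chat messages into a single prompt string.",
    )
```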

However, since I am currently using vLLM to serve my model, I cannot test the other implementations. I have only changed the existing llamacpp and openailike modes to use the prompt_style (it is not currently being used anywhere else).

I also noticed that when using openailike I would easily exceed my various token limits. I believe this was due to max_tokens being set to None (i.e. unlimited). I have changed this to use the max_new_tokens value from the settings.yaml file.

I also included the temperature, context_window, messages_to_prompt, and completion_to_prompt values.
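
For context, here is roughly how those values might be passed to llama_index's OpenAILike client (a minimal sketch, not the exact diff from this PR; the endpoint, model name, numeric values, and prompt-formatting stand-ins are placeholders, and the import path varies across llama_index releases):

```python
from llama_index.llms.openai_like import OpenAILike  # import path differs in older llama_index releases


def completion_to_prompt(completion: str) -> str:
    # In this PR these two callables come from the shared prompt_style object;
    # trivial stand-ins are used here to keep the sketch self-contained.
    return f"<|user|>\n{completion}\n<|assistant|>\n"


def messages_to_prompt(messages) -> str:
    return "\n".join(f"<|{m.role}|>\n{m.content}" for m in messages) + "\n<|assistant|>\n"


# Values that would normally be read from the llm: and openai: sections of settings.yaml.
llm = OpenAILike(
    api_base="http://localhost:8000/v1",  # e.g. a vLLM OpenAI-compatible endpoint
    api_key="EMPTY",
    model="my-served-model",
    is_chat_model=True,
    # max_tokens previously defaulted to None (unlimited), which made it easy to
    # exceed token limits; it now reuses max_new_tokens from settings.yaml.
    max_tokens=512,
    temperature=0.1,
    context_window=3900,
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)
```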

In the future, rather than hardcoding the different chat templates/styles, llama_index could work with .jinja chat template formats, for example:
https://github.com/vllm-project/vllm/tree/main/examples

https://github.com/chujiezheng/chat_templates/tree/main/chat_templates

Instead of changing the code, it would be more effective to keep these templates in a directory and reference them by file name. That way, if you want a new template, you can simply drop it into a specific folder and reference its filename instead of having to change code, making the functionality dynamic without touching anything within pgpt. However, I don't have a PR for this yet.
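
To make that concrete, a template loader along these lines could work (a purely hypothetical sketch using jinja2; the directory name, function name, and template variables are invented for illustration and are not part of this PR):

```python
from pathlib import Path
from typing import Sequence

from jinja2 import Environment, FileSystemLoader

TEMPLATE_DIR = Path("chat_templates")  # hypothetical folder holding *.jinja files


def render_chat_prompt(template_name: str, messages: Sequence[dict]) -> str:
    """Render chat messages with a .jinja chat template chosen by file name."""
    env = Environment(loader=FileSystemLoader(TEMPLATE_DIR))
    template = env.get_template(f"{template_name}.jinja")
    # Most published chat templates expect a `messages` list of {"role", "content"}
    # dicts plus an add_generation_prompt flag; exact variables vary per template.
    return template.render(messages=messages, add_generation_prompt=True)


# Example usage: drop a new template file into chat_templates/ and reference it
# by name, with no code changes required.
prompt = render_chat_prompt(
    "chatml",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
```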

@imartinez (Collaborator) commented:

@icsy7867 I think this is a very good and balanced solution, and I agree with the proposed future steps.

In order to merge this PR please remove the prompt_style from LlamaCPPSettings (and the related entry in the settings-x.yaml files)

@imartinez (Collaborator) commented:

Also @icsy7867 please pull the latest changes of main into your branch. Otherwise CI tests won't work.

@icsy7867 (Contributor, Author) commented:

@imartinez
I have been away due to medical reasons, but I believe I have removed those bits from the settings.py and settings.yaml files.

imartinez merged commit e21bf20 into zylon-ai:main on Apr 30, 2024
6 checks passed
icsy7867 deleted the openai-mode-new branch on April 30, 2024 at 11:20
icsy7867 restored the openai-mode-new branch on April 30, 2024 at 11:20