cohere - command-r #1422
Comments
They now also released a larger, 104B-parameter model: C4AI Command R+
Hey, regular training should work as-is, except for sample_packing.
@Undi95, can you try preprocessing it separately first? Also, make sure trust_remote_code is on.
I always preprocess my dataset before launching a training run with `python -m axolotl.cli.preprocess config.yml --debug`
I forgot to mention that there's an untested PR for sample packing with Cohere: #1547. If anyone else is following, do you also get the same issue as Undi?
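As a rough illustration of the advice above (preprocess separately, with trust remote code enabled), here is a minimal tokenizer-level sanity check at the plain transformers level, not axolotl's actual preprocessing; the model id is an assumption:

```python
# Hypothetical sanity check, not part of axolotl: load the Command R tokenizer
# with trust_remote_code enabled and tokenize one sample, roughly mirroring
# what a separate preprocessing pass does before training starts.
from transformers import AutoTokenizer

MODEL_ID = "CohereForAI/c4ai-command-r-v01"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

sample = "Write a short haiku about retrieval-augmented generation."
ids = tokenizer(sample, return_tensors="pt").input_ids
print(ids.shape)
print(tokenizer.decode(ids[0]))
```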
🔖 Feature description
command-r has a new attention mechanism that is a bit different from llama2's.
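For reference, my reading of the Hugging Face Cohere modeling code is that the most visible structural difference from Llama-2 is the parallel residual block: attention and the MLP are computed from a single layer norm and both added to the same residual, instead of Llama's two sequential norm/residual steps. A simplified sketch with toy attention/MLP stand-ins, not the real implementations:

```python
import torch
import torch.nn as nn

class LlamaStyleBlock(nn.Module):
    """Llama-2 style: norm -> attention -> residual, then norm -> MLP -> residual."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn_norm = nn.LayerNorm(dim)  # Llama actually uses RMSNorm
        self.mlp_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        h = self.attn_norm(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        return x + self.mlp(self.mlp_norm(x))

class CohereStyleBlock(nn.Module):
    """Command R style (per the HF Cohere port): one norm, attention and MLP
    run in parallel on the same normed input, both added to one residual."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h)
        return x + attn_out + self.mlp(h)

x = torch.randn(2, 8, 64)
print(LlamaStyleBlock(64)(x).shape, CohereStyleBlock(64)(x).shape)
```

If that reading is right, it would also explain why Llama-specific patches (e.g. for sample packing) don't carry over directly.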
✔️ Solution
Implementation of QLoRA/LoRA training for Cohere models.
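Not axolotl's actual integration, but a minimal sketch of what QLoRA for Command R could look like at the transformers/peft level; the model id and target module names are assumptions (the HF Cohere port appears to use Llama-style projection names):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

MODEL_ID = "CohereForAI/c4ai-command-r-v01"  # assumed Hugging Face repo id

# 4-bit NF4 quantization config for the frozen base weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    trust_remote_code=True,  # as suggested in the thread above
    device_map="auto",
)

# LoRA adapters on the attention projections; module names are assumed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From there, training would presumably proceed with a regular Trainer/TRL loop, with sample_packing left off as noted in the thread above.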
❓ Alternatives
No response
📝 Additional Context
ggerganov/llama.cpp#6033 - it's already merged in llama.cpp.
It would be great if we could get a way to train this model, as it's amazing at writing / RAG / function calling.
Acknowledgements