Issues: sangmichaelxie/doremi
Issues list
program stuck (when “Loading cached shuffled indices for dataset at ...”)
#29
opened Mar 29, 2024 by
ccx06
Questions about the loss used for optimizing the proxy model
#25
opened Jan 9, 2024 by
clarkkent0618
Questions about directly applying the weights from paper or the repo to train main model
#23
opened Jan 4, 2024 by
clarkkent0618
ModuleNotFoundError: No module named 'flash_attn.models.falcon'
#22
opened Dec 26, 2023 by
Sniper970119
Cannot reproduce the results shown in GitHub repo with the 120M reference model on A800 (8*80G).
#20
opened Dec 18, 2023 by
kiseliu
How many rounds do we need to converge domain weights on The Pile?
#15
opened Sep 27, 2023 by
ouyangliqi
How do you get the model to be good at code if it downsamples code?
#13
opened Sep 14, 2023 by
teknium1