[Exlv2] Allow setting "num_experts_per_token" in the WebUI #4954

Closed
NiriProject opened this issue Dec 17, 2023 · 3 comments
Labels
enhancement New feature or request stale

Comments

@NiriProject

Description
For those who don't mind trading some speed and VRAM for slightly lower perplexity, this setting is great for squeezing that extra bit of quality out of Mixtral.

Additional Context

The setting can be found in "q_mlp.cu". It used to be hardcoded to 2, but is now a simple integer variable. I've been editing it manually, but it's probably possible to pass a user-defined setting from the WebUI.
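
For anyone unsure what this setting controls, here is a minimal Python sketch of Mixtral-style top-k expert routing. It is not the code in q_mlp.cu: the names `route_token` and `moe_forward`, the scalar toy experts, and the softmax-then-renormalize order are assumptions made purely for illustration.

```python
import math


def route_token(router_logits, num_experts_per_token=2):
    """Pick the top-k experts for one token and renormalize their weights.

    Hypothetical helper: it mirrors the general idea of top-k routing,
    not the actual ExLlamaV2 kernel.
    """
    if not 1 <= num_experts_per_token <= len(router_logits):
        raise ValueError("num_experts_per_token must be in [1, number of experts]")

    # Softmax over all experts (subtract the max for numerical stability).
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the k most probable experts.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = ranked[:num_experts_per_token]

    # Renormalize so the selected weights sum to 1.
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, router_logits, experts, num_experts_per_token=2):
    """Weighted sum of the selected experts' outputs for a scalar input x."""
    routing = route_token(router_logits, num_experts_per_token)
    return sum(weight * experts[i](x) for i, weight in routing)


# Toy setup: 8 "experts", each just scales its input by a fixed factor.
experts = [lambda x, s=s: s * x for s in range(8)]
logits = [0.0, 1.0, 3.0, 2.0, -1.0, 0.5, 0.2, -0.3]

for k in (1, 2, 3):
    print(k, route_token(logits, k), moe_forward(1.0, logits, experts, k))
```

Each increment of `num_experts_per_token` makes every token run one more expert MLP, which is where the extra compute cost comes from.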

@NiriProject NiriProject added the enhancement New feature or request label Dec 17, 2023
@oobabooga
Owner

oobabooga commented Dec 17, 2023

Added here #4955. Could you check if it works?

@github-actions github-actions bot added the stale label Jan 28, 2024

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

@Thireus
Contributor

Thireus commented Mar 12, 2024

@AncientZarko, do we have any metrics on performance and perplexity (PPL) improvements versus the num_experts_per_token value used?

Edit: found this Reddit thread: https://www.reddit.com/r/LocalLLaMA/comments/18la6ao/optimal_number_of_experts_per_token_in/
