Refactor quantized processing functions #509

Merged (3 commits) on Mar 28, 2023

sw
Contributor

@sw sw commented Mar 25, 2023

To avoid code duplication when implementing additional quantization formats (#456), refactor the forward_mul_mat and forward_get_rows functions to use a table of function pointers, indexed by ggml_type.
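As a rough illustration of the dispatch pattern described above (not the code from this PR), the sketch below indexes a table of function pointers by a type tag so that one generic routine serves every format. All names here (`demo_type`, `demo_quantize_fns`, `demo_get_rows`, `demo_dot`) and the two toy formats are hypothetical; the real `ggml_type` enum and row routines differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags standing in for ggml_type. */
enum demo_type { DEMO_TYPE_HALF = 0, DEMO_TYPE_QUARTER, DEMO_TYPE_COUNT };

/* Per-format routines; every format provides the same signatures. */
typedef struct {
    void  (*dequantize_row)(const signed char *src, float *dst, int n);
    float (*vec_dot)(const signed char *x, const float *y, int n);
} demo_quantize_fns;

/* Toy format 1: each stored byte b represents the value b * 0.5. */
static void dequantize_row_half(const signed char *src, float *dst, int n) {
    for (int i = 0; i < n; i++) dst[i] = src[i] * 0.5f;
}
static float vec_dot_half(const signed char *x, const float *y, int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; i++) sum += x[i] * 0.5f * y[i];
    return sum;
}

/* Toy format 2: each stored byte b represents the value b * 0.25. */
static void dequantize_row_quarter(const signed char *src, float *dst, int n) {
    for (int i = 0; i < n; i++) dst[i] = src[i] * 0.25f;
}
static float vec_dot_quarter(const signed char *x, const float *y, int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; i++) sum += x[i] * 0.25f * y[i];
    return sum;
}

/* The table: adding a format means adding one entry, not another
   copy of each generic function. */
static const demo_quantize_fns demo_fns[DEMO_TYPE_COUNT] = {
    [DEMO_TYPE_HALF]    = { dequantize_row_half,    vec_dot_half    },
    [DEMO_TYPE_QUARTER] = { dequantize_row_quarter, vec_dot_quarter },
};

/* Generic row extraction: one indirect call per row, not per element. */
void demo_get_rows(enum demo_type type, const signed char *src,
                   float *dst, int nrows, int ncols) {
    assert((int)type >= 0 && (int)type < DEMO_TYPE_COUNT);
    const demo_quantize_fns *fns = &demo_fns[type];
    for (int r = 0; r < nrows; r++) {
        fns->dequantize_row(src + (size_t)r * ncols,
                            dst + (size_t)r * ncols, ncols);
    }
}

/* Generic dot product of a quantized row with a float vector. */
float demo_dot(enum demo_type type, const signed char *x,
               const float *y, int n) {
    assert((int)type >= 0 && (int)type < DEMO_TYPE_COUNT);
    return demo_fns[type].vec_dot(x, y, n);
}
```

Calling `demo_get_rows(DEMO_TYPE_HALF, src, dst, nrows, ncols)` selects the format at run time. Calls through the table generally cannot be inlined, but each call processes a whole row.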

This makes some functions non-inlined, but I didn't see a performance regression on my machine.

I tried to fix the "unused variable" warnings without complicating things too much; some of the variables are used in asserts.

Owner

@ggerganov ggerganov left a comment

Wow - this is great!
Let me just double-check everything, and then I will merge it.

@slaren
Collaborator

slaren commented Mar 25, 2023

I think this is great. Considering the sizes of the rows (4096 in the smallest model), it shouldn't be an issue if these functions can no longer be inlined.

@anzz1 added the enhancement, performance, and hardware labels Mar 27, 2023
@sw
Contributor Author

sw commented Mar 27, 2023

@ggerganov did you want to look at this again or can we merge it?

@ggerganov
Owner

Please don't merge yet - it's top priority for merging but I need some time to take a closer look

@ggerganov ggerganov merged commit 99c5b27 into ggerganov:master Mar 28, 2023
@sw sw deleted the q-refactor branch March 28, 2023 19:26