
Support for quantization of convolutional layers #1414

Open
JohnnyRacer opened this issue Nov 13, 2024 · 0 comments

Labels: contributions-welcome, feature-request

Feature request

It would be great if the library could support more layer types for quantization, as torch.ao does; conv2d quantization already appears to be available there. Sadly, torch.ao does not currently seem to support CUDA as a backend. Would it be possible to implement the 8-bit and 4-bit kernels in Triton or CUDA to allow for the quantization of convolutional layers? A similar issue was raised here earlier.
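
For reference, here is a minimal sketch of what torch.ao's eager-mode static quantization of a conv2d layer looks like today (CPU-only via the fbgemm backend, which is exactly the CUDA gap described above); the module and tensor shapes are illustrative, not from this issue:

```python
import torch
import torch.ao.quantization as tq

class SmallConvNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks the float -> int8 boundary
        self.conv = torch.nn.Conv2d(3, 16, kernel_size=3)
        self.relu = torch.nn.ReLU()
        self.dequant = tq.DeQuantStub()  # marks the int8 -> float boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = SmallConvNet().eval()
# fbgemm targets x86 CPUs; there is no equivalent CUDA backend here.
model.qconfig = tq.get_default_qconfig("fbgemm")
tq.prepare(model, inplace=True)
model(torch.randn(1, 3, 32, 32))  # calibration pass with representative data
tq.convert(model, inplace=True)   # conv weights are now stored as int8
```

And as a rough illustration of the kind of kernel work the request implies, the sketch below dequantizes per-tensor int8 weights back to fp32 in Triton; a real conv path would fold this into the convolution itself. All names here (dequant_int8_kernel, dequant_int8) are hypothetical, not part of any existing API:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def dequant_int8_kernel(w_ptr, out_ptr, scale, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    w = tl.load(w_ptr + offs, mask=mask)  # int8 weight values
    tl.store(out_ptr + offs, w.to(tl.float32) * scale, mask=mask)

def dequant_int8(w_int8: torch.Tensor, scale: float) -> torch.Tensor:
    # Dequantize a contiguous, per-tensor-quantized int8 tensor on the GPU.
    out = torch.empty_like(w_int8, dtype=torch.float32)
    n = w_int8.numel()
    grid = (triton.cdiv(n, 1024),)
    dequant_int8_kernel[grid](w_int8, out, scale, n, BLOCK=1024)
    return out
```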

Motivation

Reduce the memory footprint of modules that use convolutional layers through quantization.

Your contribution

Yes, I am willing to work on implementing convolutional kernels if they can be integrated with this library.

matthewdouglas added the contributions-welcome and feature-request labels on Nov 29, 2024