Feature request
It would be great if the library could support quantization for more types of layers, as torch.ao does; conv2d quantization already appears to be available there. Sadly, torch.ao does not currently seem to support CUDA as a backend. Would it be possible to implement the 8-bit and 4-bit kernels in Triton or CUDA to allow quantization of convolutional layers? A similar issue has been raised here earlier.
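To make the request concrete, here is a minimal pure-Python sketch of the kind of round-trip an 8-bit conv kernel would perform on device: absmax quantization of a convolution weight tensor to int8 and back. This is an illustration only, not bitsandbytes' or torch.ao's actual implementation; a real kernel would do this per-block in Triton or CUDA.

```python
# Illustrative sketch (assumption: not the library's actual implementation):
# absmax 8-bit quantization of a flattened conv weight tensor.

def quantize_absmax(weights):
    """Map floats to int8 range [-127, 127] using the tensor's absolute maximum."""
    absmax = max(abs(w) for w in weights) or 1.0  # guard against all-zero input
    scale = absmax / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the stored scale."""
    return [v * scale for v in q]

# A flattened 2x2 conv filter as an example.
w = [0.5, -1.25, 0.031, 0.9]
q, scale = quantize_absmax(w)
w_hat = dequantize(q, scale)
# Per-element quantization error is bounded by scale / 2.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(w, w_hat))
```

The point of pushing this into a Triton/CUDA kernel is that the int8 weights stay quantized in GPU memory and are dequantized (or consumed directly by int8 matmul/conv instructions) on the fly, which is where the memory savings come from.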
Motivation
Reduce the memory footprint of modules that use convolutional layers through quantization.
Your contribution
Yes, I am willing to work on implementing convolutional kernels if it is possible to integrate with this library.