Weight compression via Lora Correction Algorithm #2816
nncf/quantization/algorithms/weight_compression/openvino_backend.py
a355df6 to 24b1f9e
for i in range(n_iters):
    VX = Vr @ X
    if not w_regularization:
        sol = fns.linalg.lstsq(fns.transpose(VX), fns.transpose(dY))
driver="gelsy" ?
Thanks, corrected.
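The call under review relies on a transposition identity: `lstsq(A, B)` minimizes `||A @ Z - B||`, so to solve `U @ VX ≈ dY` for `U`, one passes the transposes and transposes the result back. Below is a minimal NumPy sketch of that identity. The shapes and the noise-free data are made up for illustration, and `numpy.linalg.lstsq` stands in for `fns.linalg.lstsq`. NumPy's function does not expose a driver argument; SciPy's `scipy.linalg.lstsq` accepts `lapack_driver="gelsy"` if the driver choice matters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: out_features=6, rank=2, n_samples=50.
U_true = rng.standard_normal((6, 2))
VX = rng.standard_normal((2, 50))  # activations already projected to the low rank
dY = U_true @ VX                   # target residual output (noise-free in this toy case)

# lstsq(A, B) minimizes ||A @ Z - B||_F over Z.
# U @ VX = dY  <=>  VX.T @ U.T = dY.T, so pass the transposes and undo it afterwards.
sol, _, rank, _ = np.linalg.lstsq(VX.T, dY.T, rcond=None)
U_est = sol.T

print(rank, np.allclose(U_est @ VX, dY))
```

Because the toy data is noise-free and `VX.T` has full column rank, the recovered `U_est` reproduces `dY`; with real calibration data the fit would only be approximate.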
if not w_regularization:
    sol = fns.linalg.lstsq(fns.transpose(X), fns.transpose(dYU), driver="gelsy")
else:
    Ind = fns.eye(Vr.shape[1], backend=Vr.backend, dtype=Vr.dtype)
It may also be worth adding some math formulas in a comment.
Done. I also refactored the code to avoid too many transposes; please take a look at that as well.
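For readers following the thread, the least-squares problems behind these snippets can be written roughly as follows. This is a sketch inferred from the variable names in the diff (`Vr`, `X`, `dY`, `dYU`, `Ind`), not the comment that was actually added in the PR. In particular, reading `dYU` as $U^{+}\Delta Y$, the symbol $\Delta W$ for the weight residual, and the exact form of the regularized problem are all assumptions.

```latex
% Fit U with V fixed (first snippet): lstsq((VX)^T, dY^T) returns U^T
U^{*} = \arg\min_{U} \lVert U (V X) - \Delta Y \rVert_F^2
\quad\Longleftrightarrow\quad
(V X)^{\top} U^{\top} \approx \Delta Y^{\top}

% Fit V with U fixed (second snippet): lstsq(X^T, dYU^T) returns V^T,
% assuming dYU = U^{+} \Delta Y with U^{+} the pseudo-inverse of U
V^{*} = \arg\min_{V} \lVert V X - U^{+} \Delta Y \rVert_F^2
\quad\Longleftrightarrow\quad
X^{\top} V^{\top} \approx (U^{+} \Delta Y)^{\top}

% Assumed regularized variant: augment X with an identity block so that V
% also stays close to the projected weight residual
V^{*} = \arg\min_{V}
\bigl\lVert V \,[\, I \mid X \,] - [\, U^{+} \Delta W \mid U^{+} \Delta Y \,] \bigr\rVert_F^2
```

The identity block has as many rows as `Vr` has columns, which matches `fns.eye(Vr.shape[1], ...)` in the `else` branch; appending it keeps the system well-posed when the calibration activations `X` do not have full rank.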
nncf/quantization/algorithms/weight_compression/lora_correction.py
nncf/quantization/algorithms/weight_compression/openvino_backend.py
nncf/quantization/algorithms/weight_compression/weight_lowering.py
470fbd8 to a3b26c7
…nl/lora_correct_prod_squash
for i in range(n_gs):
    offset = i * gs
    denum = fns.sum(s[offset : offset + gs])
    s[offset : offset + gs] = s[offset : offset + gs] / denum
I think we can skip the first normalization.
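To make the loop under discussion concrete, here is a small pure-Python sketch of per-group normalization: each consecutive block of `gs` entries of `s` is divided by its own sum. The helper name `normalize_per_group` and the sample values are hypothetical, and the sketch assumes `s` holds positive values so that no group sums to zero.

```python
def normalize_per_group(s, gs):
    """Scale each consecutive group of `gs` values so that the group sums to 1."""
    if len(s) % gs != 0:
        raise ValueError("len(s) must be a multiple of the group size")
    out = list(s)  # work on a copy instead of mutating the input
    n_gs = len(out) // gs
    for i in range(n_gs):
        offset = i * gs
        denum = sum(out[offset : offset + gs])  # variable name kept from the snippet
        out[offset : offset + gs] = [v / denum for v in out[offset : offset + gs]]
    return out


s = [1.0, 3.0, 2.0, 2.0, 5.0, 5.0]
normalized = normalize_per_group(s, gs=2)
print(normalized)  # [0.25, 0.75, 0.5, 0.5, 0.5, 0.5]
```

Multiplying all of `s` by a constant beforehand leaves this result unchanged, since the constant cancels within each group.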
LGTM. Please take my comments into consideration.
Latest build of the conformance test with the lora test case:
Changes
Lora Correction algorithm for int4/nf4 weight compression.
Reason for changes
A method for improving accuracy by migrating quantization noise into “learnable” LoRA adapters.
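The idea can be illustrated with a toy NumPy sketch: quantize a weight matrix, take the quantization noise `W - W_q`, and approximate it with a low-rank product `U @ V` that plays the role of the adapters. Everything here is an assumption made for illustration: the row-wise symmetric 4-bit quantizer is not NNCF's int4/nf4 implementation, and the truncated SVD is only a plausible starting point. The review snippets above suggest the PR additionally refines the adapters on calibration activations with least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32)).astype(np.float32)

# Toy symmetric 4-bit round-to-nearest quantizer with one scale per output row
# (illustrative only; not NNCF's int4/nf4 scheme).
scale = np.abs(W).max(axis=1, keepdims=True) / 7.0
W_q = np.clip(np.round(W / scale), -8, 7) * scale

# Quantization noise, approximated by a rank-r product U @ V via truncated SVD.
rank = 4
dW = W - W_q
U_full, S, Vt = np.linalg.svd(dW, full_matrices=False)
U = U_full[:, :rank] * S[:rank]  # shape (16, rank)
V = Vt[:rank, :]                 # shape (rank, 32)

err_plain = np.linalg.norm(W - W_q)            # error of quantization alone
err_lora = np.linalg.norm(W - (W_q + U @ V))   # error with the low-rank correction
print(err_plain, err_lora)
```

The corrected error is smaller because the adapters absorb the largest singular directions of the noise; how much is gained depends on the rank and on how concentrated the noise spectrum is.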
Related tickets
135863
Tests