DOCS Update optimization docs with NNCF PTQ changes and deprecation of POT #17398
Changes from 84 commits
I think this picture is confusing in the document context.
There are two pictures one after another, which looks a bit like a draft, and it is not easy to distill the message from this picture.
Looking at it, I would conclude that training-time quantization has the best performance, since pruning and sparsity are shown higher.
Secondly, the methods are shown separately, so the perception people may get is that it is either one or the other.
I wonder if we should change the picture completely to show a performance vs. accuracy chart or something similar? We can probably discuss this verbally.
The pictures were updated in #17421, but I believe that doesn't address your comment. Let's discuss how to change these pictures.
This probably sounds nitpicky, but it seems the dataset should not return data samples that are feasible for inference, because they should not include the batch dimension (that is added by the data loader), and OpenVINO inference fails without it. I think it would be good to be explicit about this (and I would love an example that does exactly this: load a list of images).
Hi Helena, I absolutely agree with you that
"for example, the list of images"
is confusing. The main meaning of this sentence is that nncf.Dataset supports any iterable Python object over a dataset, so the user can implement or use any entity that implements the Python iterable interface. If a custom or framework dataset returns data samples that are feasible for inference, then a transformation function is not required. I would recommend reformulating this sentence.
I don't see any reason to add an example here, because it would amount to only a couple of lines of code.
Not sure if this would bring any additional information. If you think otherwise, we can do it.
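For context, the couple of lines in question might look like the sketch below. This is only an illustration: the image shapes, dtype, and the transform function are assumptions, not something specified in this thread, and the `nncf.Dataset` call itself is shown commented out since it requires NNCF to be installed.

```python
import numpy as np

# Hypothetical calibration data: a plain Python list of images is a valid
# data source because it is iterable. Shapes (3x224x224, float32) are
# assumed here purely for illustration.
images = [np.random.rand(3, 224, 224).astype(np.float32) for _ in range(10)]

def transform_fn(image):
    # Add the batch dimension the model input expects; without it,
    # running OpenVINO inference on a raw sample would fail.
    return np.expand_dims(image, axis=0)

# With NNCF installed, this would become:
#   calibration_dataset = nncf.Dataset(images, transform_fn)
# If the samples were already batched (feasible for inference as-is),
# the transform function could be omitted.
print(transform_fn(images[0]).shape)  # (1, 3, 224, 224)
```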
@helena-intel, if you have a suggestion on how to formulate it better, feel free to share it here.
I think an example of such an iterable would be very useful, also because the PTQ OpenVINO examples use an Ultralytics or PyTorch dataloader, and if you're not familiar with those, it may not be obvious what they return, how they handle batching, etc. I'll try to create the most basic NNCF PTQ example today. We can see how to incorporate it into either the docs or the examples, but that can be a future PR.
It would be great to learn more about the effects of setting this parameter. I recently noticed a tiny accuracy degradation on SPR compared to ICL. Will specifying target_device=CPU_SPR fix this? And will that cause lower accuracy on ICL? (Not asking for an answer, but that's what I wonder when I read this.) Does this influence accuracy, performance, or both?
The CPU_SPR option improves model performance on SPR devices by removing some FQs (FakeQuantize operations). It doesn't improve accuracy; at least, it doesn't solve the issue with bf16 being enabled by default, which I guess is the main reason for the accuracy degradation in your case.