
Some questions about the paper #1

Open
CaiShilv opened this issue Oct 19, 2021 · 4 comments

Comments

@CaiShilv

It's amazing to see this work. I'd like to ask a question:
What do the dashed and solid lines in Figure 6 (b) mean?
Thanks very much!

@micmic123
Owner

The dashed lines indicate the accuracies of the top-5 prediction results from the classifier, i.e., whether the ground truth is one of the 5 classes with the largest scores.
For each color, the reconstructed images fed to the classifier were the same for the dashed lines (measuring top-5 accuracy) and the solid lines (measuring top-1 accuracy).
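As an illustration, top-1 and top-5 accuracy can be computed from classifier logits like this (a minimal sketch, not the paper's evaluation code; `topk_accuracy` is a hypothetical helper):

```python
import torch

def topk_accuracy(logits, targets, k=1):
    # logits: (N, C) class scores, targets: (N,) ground-truth class indices.
    # A sample counts as correct if its target is among the k largest scores.
    topk = logits.topk(k, dim=1).indices            # (N, k)
    correct = (topk == targets.unsqueeze(1)).any(dim=1)
    return correct.float().mean().item()

# Toy example with 3 classes (k=3 stands in for "top-5" here).
logits = torch.tensor([[0.1, 0.7, 0.2],
                       [0.5, 0.3, 0.2],
                       [0.2, 0.3, 0.5]])
targets = torch.tensor([1, 2, 0])
top1 = topk_accuracy(logits, targets, k=1)  # only the first sample is correct
top3 = topk_accuracy(logits, targets, k=3)  # all samples are correct
```

The top-5 curve is always at or above the top-1 curve for the same reconstructions, which is why each dashed line sits above its solid counterpart.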

If you have any more questions, feel free to ask me again!
Thank you.

@CaiShilv
Copy link
Author

CaiShilv commented Oct 25, 2021

Thank you very much, I understand Figure 6 now. Does the PSNR computed by the code show NaN values when running eval.py? When I run eval.py, I find NaN values in the reconstructed image x_hat; they appear to be produced by x = self.g_s5(x) in the g_s decoder. Is there something I missed?

@micmic123
Owner

In my case, when training the model from scratch with learning rate 1e-4, NaN outputs appeared occasionally.
I tried to resolve this training instability, e.g., by improving the numerical stability of our model, but training remained unstable.
It might be due to some problem in the library (compressai).
Although it is not a fundamental solution, we recommend skipping the parameter update whenever NaN values appear in a training step.
For example, in train.py:

QmapCompression/train.py

Lines 117 to 119 in 8500f8b

# for stability
if out_criterion['loss'].isnan().any() or out_criterion['loss'].isinf().any() or out_criterion['loss'] > 10000:
continue

In fact, after decaying the learning rate to 1e-5 (at 1.4M iterations in our experiments), the nan values disappeared soon.
Meanwhile, you can test with the released pretrained model.
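On the evaluation side, one can guard the PSNR computation against NaN reconstructions in the same spirit (a sketch only; `psnr` here is a hypothetical helper, not the repository's eval.py code):

```python
import math
import torch

def psnr(x, x_hat, max_val=1.0):
    # If the reconstruction contains NaNs, PSNR is undefined: return NaN
    # explicitly instead of letting it silently propagate through the mean.
    if torch.isnan(x_hat).any():
        return float('nan')
    mse = torch.mean((x - x_hat) ** 2).item()
    return 10 * math.log10(max_val ** 2 / mse)

x = torch.zeros(1, 3, 4, 4)
good = torch.full((1, 3, 4, 4), 0.1)
bad = good.clone()
bad[0, 0, 0, 0] = float('nan')

print(psnr(x, good))  # ≈ 20 dB (mse ≈ 0.01)
print(psnr(x, bad))   # nan
```

This makes it easy to count or skip NaN images when averaging PSNR over a test set.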

@CaiShilv
Author

CaiShilv commented Nov 1, 2021

In compressai, the probability parameters fed to the entropy coder must be identical at encoding and decoding time. The h_s network uses nn.ConvTranspose2d(), and for the same input its two outputs can differ slightly, which is unreliable for the entropy coder. I'm not sure whether I understand this correctly. Would it be better to replace nn.ConvTranspose2d() with nn.PixelShuffle()?
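For reference, a PixelShuffle-based upsampling layer is usually built as a stride-1 convolution followed by nn.PixelShuffle, rather than nn.PixelShuffle alone. A minimal sketch of such a drop-in replacement for a stride-2 ConvTranspose2d (`subpel_conv` is a hypothetical helper name, not the repository's code):

```python
import torch
import torch.nn as nn

def subpel_conv(in_ch, out_ch, r=2):
    # Upsample by factor r: a 3x3 stride-1 convolution produces r*r times
    # the output channels, then PixelShuffle rearranges them into space.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),
    )

up = subpel_conv(8, 4, r=2)
x = torch.randn(1, 8, 16, 16)
y = up(x)       # (1, 4, 32, 32): spatial resolution doubled
y2 = up(x)      # repeated forward passes on the same input
```

Whether this actually removes the encode/decode mismatch depends on where the nondeterminism comes from (e.g., GPU kernel selection), so it is worth checking both layers under the same device and settings.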
