
Some questions about the paper #1

Open
CaiShilv opened this issue Oct 19, 2021 · 4 comments

Comments

@CaiShilv

It's amazing to see this work. I'd like to ask a question:
What do the dashed and solid lines in Figure 6 (b) mean?
Thanks very much!

@micmic123
Owner

The dashed lines indicate the accuracies of the top-5 prediction results from the classifier, i.e., whether the ground truth is one of the 5 classes with the largest scores.
For each color, the reconstructed images fed to the classifier were the same for the dashed lines (measuring top-5 accuracy) and the solid lines (measuring top-1 accuracy).
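As an illustration, top-1 and top-5 accuracy can be computed from classifier logits like this (a minimal sketch, not the paper's evaluation code; `topk_accuracy` is a hypothetical helper):

```python
import torch

def topk_accuracy(logits, targets, k=1):
    # logits: (N, C) class scores, targets: (N,) ground-truth class indices.
    # A sample counts as correct if its target is among the k largest scores.
    topk = logits.topk(k, dim=1).indices            # (N, k)
    correct = (topk == targets.unsqueeze(1)).any(dim=1)
    return correct.float().mean().item()

# Toy example with 3 classes (k=3 stands in for "top-5" here).
logits = torch.tensor([[0.1, 0.7, 0.2],
                       [0.5, 0.3, 0.2],
                       [0.2, 0.3, 0.5]])
targets = torch.tensor([1, 2, 0])
top1 = topk_accuracy(logits, targets, k=1)  # only the first sample is correct
top3 = topk_accuracy(logits, targets, k=3)  # all samples are correct
```

The top-5 curve is always at or above the top-1 curve for the same reconstructions, which is why each dashed line sits above its solid counterpart.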

If you have any more questions, feel free to ask me again!
Thank you.

@CaiShilv
Copy link
Author

CaiShilv commented Oct 25, 2021

Thank you very much, I understand Figure 6 now. Does the PSNR computed by the code show NaN values when running eval.py? When I run eval.py, I find NaN values in the reconstructed image x_hat; they appear to be produced by x = self.g_s5(x) in the g_s decoder. Is there something I missed?

@micmic123
Owner

In my case, when training the model from scratch with learning rate 1e-4, NaN outputs appeared occasionally.
I tried to resolve this training instability, e.g., by improving the numerical stability of our model, but training remained unstable.
It might be due to some problem in the library (compressai).
Although it is not a fundamental solution, we recommend skipping the parameter update whenever NaN values appear in a training step.
For example, in train.py:

QmapCompression/train.py

Lines 117 to 119 in 8500f8b

# for stability
if out_criterion['loss'].isnan().any() or out_criterion['loss'].isinf().any() or out_criterion['loss'] > 10000:
continue

In fact, after decaying the learning rate to 1e-5 (at 1.4M iterations in our experiments), the nan values disappeared soon.
Meanwhile, you can test with the released pretrained model.
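On the evaluation side, one can guard the PSNR computation against NaN reconstructions in the same spirit (a sketch only; `psnr` here is a hypothetical helper, not the repository's eval.py code):

```python
import math
import torch

def psnr(x, x_hat, max_val=1.0):
    # If the reconstruction contains NaNs, PSNR is undefined: return NaN
    # explicitly instead of letting it silently propagate through the mean.
    if torch.isnan(x_hat).any():
        return float('nan')
    mse = torch.mean((x - x_hat) ** 2).item()
    return 10 * math.log10(max_val ** 2 / mse)

x = torch.zeros(1, 3, 4, 4)
good = torch.full((1, 3, 4, 4), 0.1)
bad = good.clone()
bad[0, 0, 0, 0] = float('nan')

print(psnr(x, good))  # ≈ 20 dB (mse ≈ 0.01)
print(psnr(x, bad))   # nan
```

This makes it easy to count or skip NaN images when averaging PSNR over a test set.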

@CaiShilv
Author

CaiShilv commented Nov 1, 2021

In compressai, the probability parameters fed to the entropy coder must be identical at encoding and decoding time. The h_s network uses nn.ConvTranspose2d(), and for the same input its two outputs can differ slightly, which is unreliable for the entropy coder. I'm not sure whether I understand this correctly. Would it be better to replace nn.ConvTranspose2d() with nn.PixelShuffle()?
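For reference, a PixelShuffle-based upsampling layer is usually built as a stride-1 convolution followed by nn.PixelShuffle, rather than nn.PixelShuffle alone. A minimal sketch of such a drop-in replacement for a stride-2 ConvTranspose2d (`subpel_conv` is a hypothetical helper name, not the repository's code):

```python
import torch
import torch.nn as nn

def subpel_conv(in_ch, out_ch, r=2):
    # Upsample by factor r: a 3x3 stride-1 convolution produces r*r times
    # the output channels, then PixelShuffle rearranges them into space.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),
    )

up = subpel_conv(8, 4, r=2)
x = torch.randn(1, 8, 16, 16)
y = up(x)       # (1, 4, 32, 32): spatial resolution doubled
y2 = up(x)      # repeated forward passes on the same input
```

Whether this actually removes the encode/decode mismatch depends on where the nondeterminism comes from (e.g., GPU kernel selection), so it is worth checking both layers under the same device and settings.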
