What does "HO_weight" and "binary_weight" mean? #36

Closed
yeliudev opened this issue Dec 3, 2019 · 13 comments

@yeliudev

yeliudev commented Dec 3, 2019

Hi @DirtyHarryLYL ! Thanks a lot for your great work!

I noticed that in lib/networks/TIN_HICO.py you multiply the raw classification scores of both the HOI and binary classifiers by two extra weights, self.HO_weight and self.binary_weight, which differs from the iCAN code. May I ask why you multiply the raw classification scores by these weights, and how the weights are generated?

Thanks!

@DirtyHarryLYL
Owner

The two sets of weights are used for classification under a long-tailed data distribution. The more training samples a class has, the smaller its loss weight should be (the class gets more chances to improve, so each update can be small). Conversely, a rare HOI class with fewer samples needs a larger weight.
We use k/(n^i/N) to set the weights, where N is the total number of samples and n^i is the number of samples of class i, so n^i/N is the frequency of occurrence of that HOI. k is chosen empirically.
You could also try k*f(1/(n^i/N)), where f() is a non-linear function; we chose lg(), because it makes the curve smoother.
If you want a more convenient approach, focal loss (Lin et al.) is a good one to try, but it still requires hyper-parameter tuning.
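
The two weighting schemes above can be sketched roughly as follows. This is only an illustration, not the script that produced the weights shipped in the repository: the function names, the sample counts, and k=1.0 are made up, and lg() is assumed to mean log base 10.

```python
import math


def inverse_frequency_weights(counts, k=1.0):
    """Linear scheme: w_i = k / (n_i / N)."""
    total = sum(counts)
    return [k * total / n for n in counts]


def log_inverse_frequency_weights(counts, k=1.0):
    """Smoothed scheme: w_i = k * lg(N / n_i), with lg assumed to be log10."""
    total = sum(counts)
    return [k * math.log10(total / n) for n in counts]


# Hypothetical per-class sample counts, not HICO-DET statistics.
counts = [9000, 900, 90, 10]
linear = inverse_frequency_weights(counts)
smooth = log_inverse_frequency_weights(counts)

for n, w_lin, w_log in zip(counts, linear, smooth):
    print(f"n={n:5d}  linear={w_lin:8.2f}  log={w_log:5.2f}")
```

Both schemes give rarer classes larger weights, but the log version compresses the spread between the most and least frequent classes. Every count must be positive, otherwise the division fails.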

@yeliudev
Author

yeliudev commented Dec 3, 2019

Thank you for your reply.

It seems that these weights are used to handle class-wise imbalance, but focal loss is designed for heavy hard/easy or pos/neg imbalance. How can it be used for this problem?

I've tried applying focal loss and GHM loss to train the binary classifier, but the results are almost the same as training with BinaryCrossEntropy loss and balanced sampling.
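
For reference, a minimal sketch of the binary focal loss being discussed, in plain Python rather than the repository's framework. The function name is hypothetical; gamma=2.0 and alpha=0.25 are the commonly used defaults. The modulating factor depends on how confident the prediction is, not on how frequent the class is, which is why it targets hard/easy imbalance.

```python
import math


def binary_focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    # p_t is the probability assigned to the true label.
    p_t = prob if label == 1 else 1.0 - prob
    # alpha_t balances positives against negatives.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction is down-weighted far more than a wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy, hard)
```

With gamma=0 the modulating factor disappears and the loss reduces to an alpha-weighted binary cross entropy.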

@DirtyHarryLYL
Owner

Yep, the various loss tricks perform comparably on HICO-DET. In our experiments, the log loss weight performs best for HOI classification. For the extremely rare classes (of which there are many), all of these tricks contribute very little.

@DirtyHarryLYL
Owner

BTW, each HOI class uses a Sigmoid for binary classification because this is a multi-label problem (i.e., one person can perform multiple actions simultaneously). Thus we compute the HOI loss as the sum of the 600 Sigmoid cross entropies.
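
A rough sketch of such a summed multi-label loss, using 3 classes instead of 600 and plain Python instead of the repository's framework. The function names are hypothetical. The optional class_weights here scale each class's loss term, which is an assumption made for illustration; according to the question above, the repository multiplies the raw classification scores by the weights instead.

```python
import math


def sigmoid_cross_entropy(logit, label):
    """Numerically stable binary cross entropy computed from a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def hoi_loss(logits, labels, class_weights=None):
    """Sum of per-class sigmoid cross entropies for one human-object pair."""
    if class_weights is None:
        class_weights = [1.0] * len(logits)
    return sum(
        w * sigmoid_cross_entropy(z, y)
        for z, y, w in zip(logits, labels, class_weights)
    )


# One pair that is positive for classes 0 and 2 at the same time (multi-label).
logits = [2.0, -1.5, 0.3]
labels = [1.0, 0.0, 1.0]
print(hoi_loss(logits, labels))
print(hoi_loss(logits, labels, class_weights=[0.5, 1.0, 3.0]))
```

Because each class has its own independent Sigmoid, several labels can be active for the same pair, which a single softmax over all classes would not allow.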

@yeliudev
Author

yeliudev commented Dec 4, 2019

Thanks, your answer helps me a lot.

@yeliudev
Author

yeliudev commented Dec 4, 2019

Hi @DirtyHarryLYL, I'm still confused about the imbalance of pos and neg samples for each class.

Since you use an image-centric sampling strategy, all the candidate box pairs in each training batch come from the same image, and you update the whole model using SigmoidCrossEntropy loss. However, it may happen that, e.g., the first 10000 images in an epoch contain not even one sample of HOI class n (0<n<599), so the model would only learn from negative samples of class n and always predict 0 for this class, which may cause the death of the classifier. How did you deal with this imbalance problem?

Besides, I noticed that the model predicts a 1x2 vector to represent each binary label. Why not just use a single value between 0 and 1, since one number is enough to represent the probability of "interactiveness"?

@DirtyHarryLYL
Owner

DirtyHarryLYL commented Dec 5, 2019

In each mini-batch, i.e., the samples from one image, we feed a fixed number of pos and neg human-object pairs into the model (e.g., 15 pos pairs and 60 neg pairs).
Pos and neg are defined per class, i.e., a pair can be positive for HOI i but negative for HOI j.
Thus, we can maintain the pos:neg ratio during training by inputting curated samples.
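
The fixed-ratio sampling per image can be sketched like this. The helper name and the fallback for small pools (drawing with replacement) are assumptions for illustration; the thread does not say how the repository handles an image with fewer candidates than requested.

```python
import random


def sample_pairs(pos_pairs, neg_pairs, num_pos=15, num_neg=60, rng=random):
    """Draw a fixed number of pos and neg human-object pairs from one image."""

    def draw(pool, k):
        if not pool:
            return []
        if len(pool) >= k:
            # Enough candidates: sample without replacement.
            return rng.sample(pool, k)
        # Too few candidates: repeat some of them (assumed fallback).
        return [rng.choice(pool) for _ in range(k)]

    return draw(pos_pairs, num_pos), draw(neg_pairs, num_neg)


# Hypothetical pair ids for one image: 4 positives, 200 negatives.
rng = random.Random(0)
pos, neg = sample_pairs(list(range(4)), list(range(100, 300)), rng=rng)
print(len(pos), len(neg))
```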

We have tried both a single scalar in 0~1 and a 1x2 vector (two probabilities, for pos and neg). The results are comparable. In the initial version, we chose the 1x2 vector for convenience of analysis and did not change it in later versions. You could also try other output formats in your experiments.
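
One way to see why the two formats can behave similarly: if the two outputs are normalized with a softmax (an assumption, since the thread does not say how the 1x2 vector is normalized), the positive probability equals a Sigmoid applied to the difference of the two logits. The function names below are illustrative only.

```python
import math


def softmax2_positive(z_neg, z_pos):
    """Positive-class probability from a 2-way softmax over (neg, pos) logits."""
    m = max(z_neg, z_pos)  # subtract the max for numerical stability
    e_neg, e_pos = math.exp(z_neg - m), math.exp(z_pos - m)
    return e_pos / (e_neg + e_pos)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


z_neg, z_pos = 0.3, 1.7
print(softmax2_positive(z_neg, z_pos), sigmoid(z_pos - z_neg))
```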

@yeliudev
Author

yeliudev commented Dec 5, 2019

Thank you.

@DirtyHarryLYL
Owner

No problem~

@xxxzhi

xxxzhi commented Mar 28, 2020

Could you provide the formula for the weights in detail?

I found that k*f(1/(n^i/N)) cannot reproduce the weights in the file for HICO.
For example, labels 600 and 597 have the same frequency (2) in the HICO training set.

But the weights in your code are different: 9.609821 and 13.670264

@DirtyHarryLYL
Owner

DirtyHarryLYL commented Mar 28, 2020

The formula just uses the simple frequency as the probability, i.e., k*lg(1/frequency).
I think the problem comes from the sample numbers:
pos number = gt pair number (following the iCAN policy)

neg number has two parts:

  1. number of gt human boxes (iou>0.5) * number of gt obj boxes (iou>0.5) - number of gt pair boxes, that is, the wrong pairings;
  2. the number of pairs composed of inaccurate human and obj boxes (iou < 0.5).

To my knowledge, the 600 and 597 HOIs have the same gt pair numbers, so the results are different.
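
The negative count described in the two parts above amounts to the following arithmetic. This is only a sketch with a hypothetical function name and made-up numbers, not the repository's counting code.

```python
def count_negative_pairs(num_human_boxes, num_object_boxes,
                         num_gt_pairs, num_inaccurate_pairs):
    """Negatives = wrong pairings of accurate boxes + pairs of inaccurate boxes."""
    # Part 1: all human x object combinations minus the true pairs.
    wrong_pairings = num_human_boxes * num_object_boxes - num_gt_pairs
    # Part 2: pairs built from boxes with iou < 0.5.
    return wrong_pairings + num_inaccurate_pairs


# Made-up example: 2 humans, 3 objects, 2 true pairs, 5 inaccurate pairs.
print(count_negative_pairs(2, 3, 2, 5))
```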

@xxxzhi

xxxzhi commented Mar 28, 2020

Thanks for your reply. Yeah, I made a mistake. This weight affects the performance significantly.

@DirtyHarryLYL
Owner

DirtyHarryLYL commented Mar 28, 2020

No problem~ Yeah, learning from long-tailed data distributions is still an open question; the studies on losses, data sampling, and latent space learning are very interesting.
