
Confusion about num_classes #108

Closed
dvd42 opened this issue Jun 26, 2020 · 8 comments

Labels
question (Further information is requested)

Comments

@dvd42 commented Jun 26, 2020

Hi, I was looking through the code and other posted issues, and it is still not clear to me what the number of classes should be. For COCO it is set to 91 (90 + 1 for the no-object class), as explained here. However, as seen in the code that builds the model:

detr/models/detr.py

Lines 36 to 38 in 10a2c75

hidden_dim = transformer.d_model
self.class_embed = nn.Linear(hidden_dim, num_classes + 1)
self.bbox_embed = MLP(hidden_dim, hidden_dim, 4, 3)

A +1 is added in the classification layer for the no-object class. So if I have a dataset with X classes (not counting the background), what should I set num_classes to?

P.S: Thanks for this great project!!! :)

@alcinos (Contributor) commented Jun 26, 2020

Hi @dvd42
Thank you for your interest in DETR.

The explanation you're pointing at was slightly incorrect; I fixed it.
You should always use num_classes = max_id + 1, where max_id is the highest class ID in your dataset.
For example, if you have 4 classes with IDs 1, 23, 24, 56, then you will use num_classes=57. DETR will then reserve ID 57 for the "no-object" class.
In general, you should try to make your IDs consecutive if possible, but it doesn't really matter if there are a few "holes".
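
As a minimal sketch of this rule (the names here are illustrative, not part of DETR's API):

dataset_class_ids = [1, 23, 24, 56]  # class IDs actually used in your annotations
num_classes = max(dataset_class_ids) + 1  # 57 in this example
# DETR reserves ID num_classes (here 57) for the "no-object" class, and the
# classification head therefore outputs num_classes + 1 = 58 logits.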

I think I have answered your question, and as such I'm closing this. Feel free to reach out if you have further concerns.

alcinos closed this as completed on Jun 26, 2020
@fmassa (Contributor) commented Jun 26, 2020

Thanks for fixing my wrong answer, @alcinos!

@woctezuma commented Aug 17, 2020

Just to be clear about the +1 in the original question: I think it is only there because class labels are indexed starting from 1, as in:

>>> labels = torch.randint(1, 91, (4, 11))

So let us say that you have N labels, indexed from 1 to N (with no "hole"). You would feed num_classes equal to N+1 to DETR, so that DETR assigns the no-object class the ID num_classes = N+1. Then, in nn.Linear, the +1 is there so that the output has sufficient length, where:

  • the prediction for ID n°0 is dummy,
  • the predictions for IDs n°1 to n°num_classes match our convention (N object classes n°1...N, plus one no-object class n°N+1).

self.class_embed = nn.Linear(hidden_dim, num_classes + 1)

The good news is that the code should still work fine even if the user were to start indexing the classes at 0.
It is compatible with both conventions, as long as the parameter num_classes is actually max ID + 1, as explained above. The only issue is that the parameter name can be confusing.
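
To make the layout concrete, here is a small hypothetical sketch (toy sizes, not code from the repository):

import torch
import torch.nn as nn

N = 4                # object classes labelled 1..N, with no holes
num_classes = N + 1  # the value fed to DETR; also the "no-object" ID
hidden_dim = 256

class_embed = nn.Linear(hidden_dim, num_classes + 1)  # N + 2 = 6 logits
logits = class_embed(torch.randn(1, hidden_dim))
# logits[..., 0]      -> dummy slot (no class uses ID 0 in this convention)
# logits[..., 1:N+1]  -> the N real classes, IDs 1..N
# logits[..., N+1]    -> the "no-object" class
assert logits.shape[-1] == N + 2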

@yangsenius commented Sep 9, 2020

Hi @alcinos.

I have read many issues and your comments about the num_classes problem, but I still want to make sure my understanding is right.

The labels of the COCO dataset run from 1 to 90. So in DETR, num_classes = 90 + 1 = 91, and

self.class_embed = nn.Linear(hidden_dim, num_classes + 1)

My questions are:

  1. Does this mean self.class_embed will output a 92-dim class vector for each query? And is the first dim (i.e., index 0 of the vector) never used for any class, not even the no-object class (because the last dim, index 91, is for no-object)?

  2. If the answer to question 1 is yes, I would like to ask: can we use the first dim of the class vector as the no-object logit when the labels do not contain ID 0, such as dataset labels = [1, 2, 3, 4]?
    In this way, we would set num_classes = len(dataset labels) and only change self.num_classes to 0 in the following code:

target_classes_o = torch.cat([t["labels"][J] for t, (_, J) in zip(targets, indices)])
target_classes = torch.full(src_logits.shape[:2], 0,  # 0 instead of self.num_classes
                            dtype=torch.int64, device=src_logits.device)
target_classes[idx] = target_classes_o

Can it work well?

@woctezuma commented Sep 9, 2020

Not Alcinos, but:

  1. Yes.

  2. The convention used by DETR has no real downside, as far as I understand. Sure, it is not the most optimized solution, but:

  • the convention can deal with small gaps in the numbering of categories, without the need to keep a mapping of indices,
  • the convention works fine whether the first category is labelled with index n°0 or index n°1,
  • the network does not seem to suffer from a few dummy labels with zero examples in the training dataset.

In terms of minimizing time spent debugging and errors/issues encountered by other users, this convention is a good trade-off.

@yangsenius commented

Thanks @woctezuma.

The matcher, in

cost_class = -out_prob[:, tgt_ids]

always takes the predictions for the target object classes as the cost. So I think the change I propose above would not affect the matching either.
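
As a toy illustration of that indexing (random tensors, not the actual matcher code):

import torch

# 3 queries, 6 logits: ID 0 dummy, IDs 1..4 real classes, ID 5 no-object.
out_prob = torch.rand(3, 6).softmax(-1)
tgt_ids = torch.tensor([2, 4])  # ground-truth class IDs present in the image

# Only the columns of the target IDs enter the classification cost; the dummy
# column 0 and the no-object column 5 are never selected here.
cost_class = -out_prob[:, tgt_ids]
print(cost_class.shape)  # torch.Size([3, 2]): one cost per (query, target) pair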

@alcinos (Contributor) commented Sep 9, 2020

@yangsenius As noted by @woctezuma, there is no theoretical issue with using 0 as the "no-object" label, but I personally don't see any good reason to do it.
You'll run into issues if you forget about this and label a true class with label 0.
Also note that the assumption that the "no-object" class is the last one is used throughout the code, and I'm not sure I'd be able to list all the places where this assumption is made. Some examples that come to mind are the postprocessor and most of our visualization code. If you don't want to spend time debugging, and you don't have a strong, compelling reason to make this change, I'd suggest sticking to the current convention.
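
For instance, the postprocessor reads out class scores by dropping the last logit, which only works if "no-object" sits at the end (simplified from PostProcess in models/detr.py; the toy tensor below just stands in for real model output):

import torch
import torch.nn.functional as F

out_logits = torch.randn(2, 100, 92)  # (batch, queries, num_classes + 1) for COCO
# The slice [..., :-1] assumes the "no-object" logit is the LAST one;
# moving it to index 0 would silently break this readout.
prob = F.softmax(out_logits, -1)
scores, labels = prob[..., :-1].max(-1)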

Best of luck.

@yangsenius commented

Thank you very much for your suggestions!
