This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Add native CityScapes dataset #1090

Merged (4 commits, Oct 17, 2019)

Conversation

@botcs (Contributor) commented Sep 15, 2019

CityScapes Dataset

We have a long list of CityScapes issues here.

Many alternative solutions probably exist for the same problem, so having a canonical approach could save everyone some time.

As mentioned earlier in #466, training on this dataset involved quite a few unnecessary steps that can be eliminated by a native module which fetches CityScapes data into memory in the form maskrcnn-benchmark expects for inputs and target labels.
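In sketch form (illustrative names only, not this PR's actual class or API), such a native module is a torch `Dataset` whose `__getitem__` returns the image together with a target bundling boxes and labels:

```python
import torch
from torch.utils.data import Dataset


class CityScapesSketch(Dataset):
    """Illustrative skeleton of a native CityScapes-style dataset.

    Real file loading (images, instance masks) is omitted; the point is
    the contract: __getitem__ returns (image, target, index), where the
    target carries boxes and labels as tensors.
    """

    def __init__(self, samples):
        # samples: list of dicts with "image", "boxes", "labels"
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        s = self.samples[idx]
        target = {
            "boxes": torch.as_tensor(s["boxes"], dtype=torch.float32),
            "labels": torch.as_tensor(s["labels"], dtype=torch.int64),
        }
        return s["image"], target, idx


ds = CityScapesSketch([{"image": torch.zeros(3, 4, 4),
                        "boxes": [[0.0, 0.0, 2.0, 2.0]],
                        "labels": [1]}])
img, target, idx = ds[0]
```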

Scores:

Using the abstract-COCO extension from PR #1096 to evaluate on the cityscapes_poly_instance_val dataset; the scores match the results reported in the Mask R-CNN paper.

bbox: 
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.327
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.568
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.322
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.150
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.454
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.248
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.388
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.410
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.371
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.583

mask:
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.559
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.290
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.112
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.275
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.445
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.380
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.398
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.174
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.332
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.565

Using the CityScapes evaluation tool from PR #1104, the scores are as follows:

BBox

#################################################################
what           :             AP         AP_50%         AP_75%
#################################################################
__background__ :            nan            nan            nan
person         :          0.384          0.723          0.366
rider          :          0.472          0.809          0.486
car            :          0.555          0.843          0.583
truck          :          0.364          0.632          0.332
bus            :          0.562          0.818          0.632
caravan        :          0.413          0.758          0.331
trailer        :          0.394          0.774          0.231
train          :          0.228          0.576          0.129
motorcycle     :          0.320          0.704          0.211
bicycle        :          0.314          0.624          0.283
-----------------------------------------------------------------
average        :          0.401          0.726          0.358


Mask

#################################################################
what           :             AP         AP_50%         AP_75%
#################################################################
__background__ :            nan            nan            nan
person         :          0.334          0.696          0.285
rider          :          0.350          0.835          0.172
car            :          0.544          0.826          0.571
truck          :          0.394          0.649          0.411
bus            :          0.581          0.805          0.635
caravan        :          0.400          0.758          0.370
trailer        :          0.450          0.774          0.414
train          :          0.419          0.690          0.419
motorcycle     :          0.278          0.655          0.173
bicycle        :          0.222          0.567          0.138
-----------------------------------------------------------------
average        :          0.397          0.726          0.359

@facebook-github-bot added the "CLA Signed" label on Sep 15, 2019
@kHarshit commented Oct 9, 2019

I'm getting the following error while training:

2019-10-09 15:04:28,911 maskrcnn_benchmark.utils.miscellaneous WARNING: Dataset [CityScapesDataset] has no categories attribute, labels.json file won't be created
2019-10-09 15:04:28,911 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/train_net.py", line 200, in <module>
    main()
  File "tools/train_net.py", line 193, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 94, in train
    arguments,
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 72, in do_train
    for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/datasets/cityscapes.py", line 122, in __getitem__
    img, target = self.transforms(img, target)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/transforms/transforms.py", line 15, in __call__
    image, target = t(image, target)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/transforms/transforms.py", line 73, in __call__
    target = target.transpose(0)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 163, in transpose
    v = v.transpose(method)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 515, in transpose
    flipped_instances = self.instances.transpose(method)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 115, in transpose
    flipped_masks = self.masks.flip(dim)
RuntimeError: "flip_cpu" not implemented for 'Bool'

The error comes from the following method in segmentation_mask.py:

def transpose(self, method):
    dim = 1 if method == FLIP_TOP_BOTTOM else 2
    flipped_masks = self.masks.flip(dim)
    return BinaryMaskList(flipped_masks, self.size)
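A possible workaround for the crash above (a sketch of an assumption, not the fix that was merged): round-trip boolean masks through uint8 around the `flip`, since `flip_cpu` is not implemented for Bool in the affected PyTorch versions. `FLIP_TOP_BOTTOM` here is an illustrative constant standing in for the real one.

```python
import torch

FLIP_TOP_BOTTOM = 1  # illustrative; the real constant lives in maskrcnn_benchmark


def transpose_workaround(masks, method):
    """Flip an (N, H, W) mask stack, tolerating CPU bool tensors.

    Affected PyTorch builds lack flip_cpu for Bool, so boolean masks
    are cast to uint8 for the flip and cast back afterwards.
    """
    dim = 1 if method == FLIP_TOP_BOTTOM else 2
    if masks.dtype == torch.bool:
        return masks.to(torch.uint8).flip(dim).to(torch.bool)
    return masks.flip(dim)


masks = torch.zeros(2, 3, 4, dtype=torch.bool)
masks[:, :, 0] = True                     # set the left edge
flipped = transpose_workaround(masks, 2)  # method != FLIP_TOP_BOTTOM -> left/right flip
```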

@botcs (Contributor, Author) commented Oct 9, 2019

Oh wow, that's unfortunate: PyTorch >= 1.2 uses the bool dtype but has no CPU implementation of flip for it; maybe it only works on GPU.
For now, as a poor workaround, I would avoid bool and keep uint8 for the masks to preserve compatibility... or stick with PyTorch 1.0.

Sorry

@kHarshit commented Oct 10, 2019

Solved it by increasing swap space!

For context: I'm using 8 GB RAM with two GPUs (RTX 2080 8 GB, GTX 1060 Ti 6 GB). Although the GPUs work fine, the RAM fills up quickly, leading to a memory error after a few iterations (around 1000) on a dataset of around 8000 images. Is there a way to make it train?

I'm using python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file "configs/cityscapes/e2e_mask_rcnn_R_50_FPN_1x_cocostyle.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.005 SOLVER.MAX_ITER 30000

Error:

2019-10-10 12:05:19,457 maskrcnn_benchmark.trainer INFO: eta: 10:14:18  iter: 760  loss: 1.4275 (1.4549)  loss_box_reg: 0.1858 (0.2219)  loss_classifier: 0.3475 (0.3617)  loss_mask: 0.5325 (0.5562)  loss_objectness: 0.0434 (0.0640)  loss_rpn_box_reg: 0.2477 (0.2511)  time: 1.2313 (1.2605)  data: 0.0063 (0.0111)  lr: 0.005000  max mem: 2356
EMPTY ENTRY: /nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/datasets/cityscapes/gtFine/train/2/053923_gtFine_instanceids.png
2019-10-10 12:05:42,819 maskrcnn_benchmark.trainer INFO: eta: 10:12:43  iter: 780  loss: 1.2912 (1.4520)  loss_box_reg: 0.1729 (0.2207)  loss_classifier: 0.2848 (0.3598)  loss_mask: 0.5600 (0.5563)  loss_objectness: 0.0458 (0.0637)  loss_rpn_box_reg: 0.2582 (0.2515)  time: 1.1719 (1.2582)  data: 0.0058 (0.0110)  lr: 0.005000  max mem: 2356
2019-10-10 12:06:06,639 maskrcnn_benchmark.trainer INFO: eta: 10:11:29  iter: 800  loss: 1.2122 (1.4473)  loss_box_reg: 0.1670 (0.2196)  loss_classifier: 0.2999 (0.3584)  loss_mask: 0.5333 (0.5557)  loss_objectness: 0.0386 (0.0632)  loss_rpn_box_reg: 0.1877 (0.2504)  time: 1.2300 (1.2565)  data: 0.0054 (0.0109)  lr: 0.005000  max mem: 2356
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f9c2e4055f0>
...
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 127968000 bytes. Error code 12 (Cannot allocate memory)

[screenshot: memory usage, 2019-10-09 19-02-27]

@botcs (Contributor, Author) commented Oct 11, 2019

I think reducing NUM_WORKERS could ameliorate this issue.
You are probably loading the images with binary segmentation masks; try switching to polygons.

Reminder: the two mask representations differ slightly, so expect some differences in the results as well, but training will be much faster afterwards, since polygons are easier to manipulate than binary tensors covering the full image.

Let me know if this helps.
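To illustrate why polygons are cheaper to transform (a minimal sketch, not the library's actual code): a horizontal flip only touches each vertex's x coordinate, so it costs O(#vertices), whereas flipping a binary mask touches every pixel of an H x W tensor per instance.

```python
def flip_polygon_horizontally(polygon, image_width):
    """Mirror a flat [x0, y0, x1, y1, ...] polygon about the vertical axis.

    Cost is O(#vertices), independent of image resolution; flipping a
    full-resolution binary mask instead costs O(H * W) per instance.
    """
    flipped = list(polygon)
    # Even indices hold x coordinates; y coordinates are untouched.
    flipped[0::2] = [image_width - x for x in polygon[0::2]]
    return flipped


triangle = [10.0, 5.0, 30.0, 5.0, 20.0, 25.0]
print(flip_polygon_horizontally(triangle, image_width=100))
# → [90.0, 5.0, 70.0, 5.0, 80.0, 25.0]
```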

@kHarshit commented Oct 11, 2019

Thanks, using polygons significantly reduced the training time per image; RAM usage is around 5.2 GB and swap is empty. I'm using NUM_WORKERS=2 now.

[screenshot: memory usage, 2019-10-11 18-39-07]

@botcs (Contributor, Author) commented Oct 12, 2019

Yay! good news then :)

@botcs botcs merged commit 523ae86 into facebookresearch:master Oct 17, 2019