This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Add native CityScapes dataset #1090

Merged (4 commits, Oct 17, 2019)

Conversation

@botcs (Contributor) commented Sep 15, 2019

CityScapes Dataset

We have a long list of CityScapes issues here.

Many alternative solutions probably exist for the same problem, so having a canonical approach could save everyone some time.

As mentioned earlier in #466, training on this dataset involved quite a few unnecessary steps that can be eliminated by a native module which fetches CityScapes data into memory in the form maskrcnn-benchmark expects for inputs and target labels.
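In sketch form (illustrative names only, not this PR's actual class or API), such a native module is a torch `Dataset` whose `__getitem__` returns the image together with a target bundling boxes and labels:

```python
import torch
from torch.utils.data import Dataset


class CityScapesSketch(Dataset):
    """Illustrative skeleton of a native CityScapes-style dataset.

    Real file loading (images, instance masks) is omitted; the point is
    the contract: __getitem__ returns (image, target, index), where the
    target carries boxes and labels as tensors.
    """

    def __init__(self, samples):
        # samples: list of dicts with "image", "boxes", "labels"
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        s = self.samples[idx]
        target = {
            "boxes": torch.as_tensor(s["boxes"], dtype=torch.float32),
            "labels": torch.as_tensor(s["labels"], dtype=torch.int64),
        }
        return s["image"], target, idx


ds = CityScapesSketch([{"image": torch.zeros(3, 4, 4),
                        "boxes": [[0.0, 0.0, 2.0, 2.0]],
                        "labels": [1]}])
img, target, idx = ds[0]
```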

Scores:

Using the abstract-COCO extension from PR #1096 to evaluate on the cityscapes_poly_instance_val dataset; the scores match the results reported in the Mask R-CNN paper.

bbox: 
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.327
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.568
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.322
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.150
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.454
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.248
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.388
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.410
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.371
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.583

mask:
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.559
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.290
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.112
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.275
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.445
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.380
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.398
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.174
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.332
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.565

Using the CityScapes evaluation tool from PR #1104, the scores are as follows:

BBox

#################################################################
what           :             AP         AP_50%         AP_75%
#################################################################
__background__ :            nan            nan            nan
person         :          0.384          0.723          0.366
rider          :          0.472          0.809          0.486
car            :          0.555          0.843          0.583
truck          :          0.364          0.632          0.332
bus            :          0.562          0.818          0.632
caravan        :          0.413          0.758          0.331
trailer        :          0.394          0.774          0.231
train          :          0.228          0.576          0.129
motorcycle     :          0.320          0.704          0.211
bicycle        :          0.314          0.624          0.283
-----------------------------------------------------------------
average        :          0.401          0.726          0.358


Mask

#################################################################
what           :             AP         AP_50%         AP_75%
#################################################################
__background__ :            nan            nan            nan
person         :          0.334          0.696          0.285
rider          :          0.350          0.835          0.172
car            :          0.544          0.826          0.571
truck          :          0.394          0.649          0.411
bus            :          0.581          0.805          0.635
caravan        :          0.400          0.758          0.370
trailer        :          0.450          0.774          0.414
train          :          0.419          0.690          0.419
motorcycle     :          0.278          0.655          0.173
bicycle        :          0.222          0.567          0.138
-----------------------------------------------------------------
average        :          0.397          0.726          0.359

@facebook-github-bot added the "CLA Signed" label on Sep 15, 2019
@kHarshit commented Oct 9, 2019

I'm getting the following error while training:

2019-10-09 15:04:28,911 maskrcnn_benchmark.utils.miscellaneous WARNING: Dataset [CityScapesDataset] has no categories attribute, labels.json file won't be created
2019-10-09 15:04:28,911 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/train_net.py", line 200, in <module>
    main()
  File "tools/train_net.py", line 193, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 94, in train
    arguments,
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 72, in do_train
    for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/nfs/interns/kharshit/miniconda3/envs/pyenv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/datasets/cityscapes.py", line 122, in __getitem__
    img, target = self.transforms(img, target)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/transforms/transforms.py", line 15, in __call__
    image, target = t(image, target)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/data/transforms/transforms.py", line 73, in __call__
    target = target.transpose(0)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 163, in transpose
    v = v.transpose(method)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 515, in transpose
    flipped_instances = self.instances.transpose(method)
  File "/nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 115, in transpose
    flipped_masks = self.masks.flip(dim)
RuntimeError: "flip_cpu" not implemented for 'Bool'

The error comes from the following method in segmentation_mask.py:

def transpose(self, method):
    dim = 1 if method == FLIP_TOP_BOTTOM else 2
    flipped_masks = self.masks.flip(dim)
    return BinaryMaskList(flipped_masks, self.size)
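A possible workaround for the crash above (a sketch of an assumption, not the fix that was merged): round-trip boolean masks through uint8 around the `flip`, since `flip_cpu` is not implemented for Bool in the affected PyTorch versions. `FLIP_TOP_BOTTOM` here is an illustrative constant standing in for the real one.

```python
import torch

FLIP_TOP_BOTTOM = 1  # illustrative; the real constant lives in maskrcnn_benchmark


def transpose_workaround(masks, method):
    """Flip an (N, H, W) mask stack, tolerating CPU bool tensors.

    Affected PyTorch builds lack flip_cpu for Bool, so boolean masks
    are cast to uint8 for the flip and cast back afterwards.
    """
    dim = 1 if method == FLIP_TOP_BOTTOM else 2
    if masks.dtype == torch.bool:
        return masks.to(torch.uint8).flip(dim).to(torch.bool)
    return masks.flip(dim)


masks = torch.zeros(2, 3, 4, dtype=torch.bool)
masks[:, :, 0] = True                     # set the left edge
flipped = transpose_workaround(masks, 2)  # method != FLIP_TOP_BOTTOM -> left/right flip
```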

@botcs (Contributor, Author) commented Oct 9, 2019

Oh wow, that's unfortunate: PyTorch >= 1.2 uses the bool dtype but has no CPU implementation of flip for it; maybe it only works on GPU.
For now, as a poor workaround, I would avoid bool and keep uint8 for the masks to preserve compatibility... or stick with PyTorch 1.0.

Sorry

@kHarshit commented Oct 10, 2019

Solved it by increasing swap space!

For context: I'm using 8 GB RAM with two GPUs (RTX 2080 8 GB, GTX 1060 Ti 6 GB). Although the GPUs work fine, the RAM fills up quickly, leading to a memory error after a few iterations (around 1000) on a dataset of around 8000 images. Is there a way to make it train?

I'm using python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file "configs/cityscapes/e2e_mask_rcnn_R_50_FPN_1x_cocostyle.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.005 SOLVER.MAX_ITER 30000

Error:

2019-10-10 12:05:19,457 maskrcnn_benchmark.trainer INFO: eta: 10:14:18  iter: 760  loss: 1.4275 (1.4549)  loss_box_reg: 0.1858 (0.2219)  loss_classifier: 0.3475 (0.3617)  loss_mask: 0.5325 (0.5562)  loss_objectness: 0.0434 (0.0640)  loss_rpn_box_reg: 0.2477 (0.2511)  time: 1.2313 (1.2605)  data: 0.0063 (0.0111)  lr: 0.005000  max mem: 2356
EMPTY ENTRY: /nfs/interns/kharshit/Documents/temp/maskrcnn-benchmark/datasets/cityscapes/gtFine/train/2/053923_gtFine_instanceids.png
2019-10-10 12:05:42,819 maskrcnn_benchmark.trainer INFO: eta: 10:12:43  iter: 780  loss: 1.2912 (1.4520)  loss_box_reg: 0.1729 (0.2207)  loss_classifier: 0.2848 (0.3598)  loss_mask: 0.5600 (0.5563)  loss_objectness: 0.0458 (0.0637)  loss_rpn_box_reg: 0.2582 (0.2515)  time: 1.1719 (1.2582)  data: 0.0058 (0.0110)  lr: 0.005000  max mem: 2356
2019-10-10 12:06:06,639 maskrcnn_benchmark.trainer INFO: eta: 10:11:29  iter: 800  loss: 1.2122 (1.4473)  loss_box_reg: 0.1670 (0.2196)  loss_classifier: 0.2999 (0.3584)  loss_mask: 0.5333 (0.5557)  loss_objectness: 0.0386 (0.0632)  loss_rpn_box_reg: 0.1877 (0.2504)  time: 1.2300 (1.2565)  data: 0.0054 (0.0109)  lr: 0.005000  max mem: 2356
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f9c2e4055f0>
...
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 127968000 bytes. Error code 12 (Cannot allocate memory)

[screenshot: memory usage, 2019-10-09 19-02-27]

@botcs (Contributor, Author) commented Oct 11, 2019

I think reducing NUM_WORKERS could ameliorate this issue.
You are probably loading the images with binary segmentation masks; try switching to polygons.

Reminder: the two mask representations differ slightly, so expect some differences in the results as well, but training will be much faster afterwards, since polygons are easier to manipulate than binary tensors covering the full image.

Let me know if this helps.
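To illustrate why polygons are cheaper to transform (a minimal sketch, not the library's actual code): a horizontal flip only touches each vertex's x coordinate, so it costs O(#vertices), whereas flipping a binary mask touches every pixel of an H x W tensor per instance.

```python
def flip_polygon_horizontally(polygon, image_width):
    """Mirror a flat [x0, y0, x1, y1, ...] polygon about the vertical axis.

    Cost is O(#vertices), independent of image resolution; flipping a
    full-resolution binary mask instead costs O(H * W) per instance.
    """
    flipped = list(polygon)
    # Even indices hold x coordinates; y coordinates are untouched.
    flipped[0::2] = [image_width - x for x in polygon[0::2]]
    return flipped


triangle = [10.0, 5.0, 30.0, 5.0, 20.0, 25.0]
print(flip_polygon_horizontally(triangle, image_width=100))
# → [90.0, 5.0, 70.0, 5.0, 80.0, 25.0]
```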

@kHarshit commented Oct 11, 2019

Thanks, using polygons significantly reduced the training time per image; RAM usage is around 5.2 GB and swap is empty. I'm using NUM_WORKERS=2 now.

[screenshot: memory usage, 2019-10-11 18-39-07]

@botcs (Contributor, Author) commented Oct 12, 2019

Yay! good news then :)

@botcs botcs merged commit 523ae86 into facebookresearch:master Oct 17, 2019