[Feature] Support eval concate dataset and add tool to show dataset #833

Merged 13 commits, Sep 9, 2021

Contributor

@FreyWang FreyWang commented Aug 27, 2021

This PR:

  1. adds a tool to show masks
  2. supports evaluation and result formatting for concatenated datasets
  3. fixes a bug where the metric is NaN when pre_eval=False

Add a tool to show the original and augmented train sets

Adds the file tools/browse_dataset.py. ConcatDataset and RepeatDataset are also supported.

Usage

  1. python tools/browse_dataset.py {CONFIG}
    Saves the augmented training images to args.output-dir.
  2. python tools/browse_dataset.py {CONFIG} --show-origin
    Saves the original training images to args.output-dir.

Support evaluating concatenated datasets

This version is compatible with progressive evaluation (#PR709).

Usage

  1. Concatenated val and test datasets use the same format as the train dataset. If separate_eval=True, each sub-dataset is evaluated separately; otherwise all of them are evaluated as one dataset.
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir=['images1/validation',
                 'images2/validation'],
        ann_dir=['annotations1/validation',
                 'annotations2/validation'],
        separate_eval=False,
        pipeline=test_pipeline))
  2. CityscapesDataset is not supported in a concatenated dataset.
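
To make the config format above more concrete, here is a minimal sketch of how a list-valued config could be expanded into one config per subset. `expand_concat_cfg` is a hypothetical helper written for illustration, not the mmseg builder itself; it assumes only that `img_dir` and `ann_dir` are parallel lists and that `separate_eval` belongs to the wrapper rather than to the sub-datasets. The `type` value is just an example.

```python
import copy


def expand_concat_cfg(cfg):
    """Split one list-valued dataset config into per-subset configs.

    Hypothetical helper for illustration; not the mmseg implementation.
    Returns (list of sub-configs, separate_eval flag).
    """
    cfg = copy.deepcopy(cfg)  # leave the caller's config untouched
    # `separate_eval` configures the wrapper, so remove it before the
    # remaining keys are handed to the individual sub-datasets.
    separate_eval = cfg.pop('separate_eval', True)
    img_dirs = cfg['img_dir']
    ann_dirs = cfg.get('ann_dir')
    if not isinstance(img_dirs, (list, tuple)):
        img_dirs = [img_dirs]
    if ann_dirs is not None and not isinstance(ann_dirs, (list, tuple)):
        ann_dirs = [ann_dirs]
    if ann_dirs is not None:
        assert len(ann_dirs) == len(img_dirs), 'img_dir/ann_dir mismatch'

    sub_cfgs = []
    for i, img_dir in enumerate(img_dirs):
        sub_cfg = copy.deepcopy(cfg)
        sub_cfg['img_dir'] = img_dir
        if ann_dirs is not None:
            sub_cfg['ann_dir'] = ann_dirs[i]
        sub_cfgs.append(sub_cfg)
    return sub_cfgs, separate_eval


test_cfg = dict(
    type='ADE20KDataset',  # example value only
    img_dir=['images1/validation', 'images2/validation'],
    ann_dir=['annotations1/validation', 'annotations2/validation'],
    separate_eval=False)
sub_cfgs, separate_eval = expand_concat_cfg(test_cfg)
```

Each entry of `sub_cfgs` then describes a single ordinary dataset, while `separate_eval` is kept aside for the wrapper that concatenates them.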

Modification

mmseg/datasets/custom.py

  1. Add a 'gt_seg_maps' argument, used when evaluating a concatenated dataset.
  2. Assert that self.CLASSES is not None in test mode, to avoid calling the gt_seg_maps generator repeatedly.

mmseg/datasets/dataset_wrappers.py

  1. Add evaluate() to ConcatDataset. When separate_eval=True, each subset is evaluated separately. When separate_eval=False, the gt_seg_maps generators of all subsets are merged to compute a single overall result.
  2. Add pre_eval() for compatibility with progressive evaluation.
  3. Add format_results() to save each result. Image results from different subsets are saved to imgfile_prefix/{dataset_idx}.
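
The two evaluation modes described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in this PR: `ToySegDataset`, `evaluate_concat`, the pixel-accuracy metric and the `{idx}_{name}` key naming are all hypothetical, and images are represented as flat label lists.

```python
from itertools import chain


class ToySegDataset:
    """Stand-in dataset holding flat per-image label lists."""

    def __init__(self, gt_maps):
        self.gt_maps = gt_maps

    def __len__(self):
        return len(self.gt_maps)

    def get_gt_seg_maps(self):
        # A generator, so ground truth is produced lazily, one image at a time.
        for gt in self.gt_maps:
            yield gt

    def evaluate(self, results, gt_seg_maps=None):
        if gt_seg_maps is None:
            gt_seg_maps = self.get_gt_seg_maps()
        correct = total = 0
        for pred, gt in zip(results, gt_seg_maps):
            correct += sum(p == g for p, g in zip(pred, gt))
            total += len(gt)
        return {'aAcc': correct / total}


def evaluate_concat(datasets, results, separate_eval=True):
    """Evaluate each subset on its own, or all subsets as one dataset."""
    if separate_eval:
        scores, start = {}, 0
        for idx, dataset in enumerate(datasets):
            end = start + len(dataset)
            for name, value in dataset.evaluate(results[start:end]).items():
                scores[f'{idx}_{name}'] = value
            start = end
        return scores
    # Merge the per-subset generators without materialising them.
    merged_gt = chain(*[d.get_gt_seg_maps() for d in datasets])
    return datasets[0].evaluate(results, merged_gt)


subsets = [ToySegDataset([[0, 1, 1, 0]]), ToySegDataset([[1, 1, 0, 0]])]
predictions = [[0, 1, 0, 0], [1, 1, 0, 0]]
separate = evaluate_concat(subsets, predictions, separate_eval=True)
whole = evaluate_concat(subsets, predictions, separate_eval=False)
```

In the merged mode the first subset's `evaluate` is reused with the chained ground truth, which only makes sense when all subsets share the same type and classes.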

Some numerical results

Use configs/fcn/fcn_r50-d8_512x512_80k_ade20k.py and its trained checkpoint as a demo, with the test set repeated twice.

  1. separate_eval=True
zsh tools/dist_test.sh configs/fcn/fcn_r50-d8_512x512_80k_ade20k.py fcn_r50-d8_512x512_80k_ade20k_20200614_144016-f8ac5082.pth 8 --eval mIoU  --options data.test.img_dir="[images/validation,images/validation]" data.test.ann_dir="[annotations/validation,annotations/validation]"  data.test.separate_eval=True

It evaluates twice; the mIoU on each 2000-image subset is 35.94.
Subset 1:

+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 77.39 | 35.94 | 45.69 |
+-------+-------+-------+

Subset 2:

+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 77.39 | 35.94 | 45.69 |
+-------+-------+-------+
  2. separate_eval=False
zsh tools/dist_test.sh configs/fcn/fcn_r50-d8_512x512_80k_ade20k.py fcn_r50-d8_512x512_80k_ade20k_20200614_144016-f8ac5082.pth 8 --eval mIoU  --options data.test.img_dir="[images/validation,images/validation]" data.test.ann_dir="[annotations/validation,annotations/validation]"  data.test.separate_eval=False

It evaluates all 2000*2 images as a whole; the result is also 35.94.

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 4000/4000, 9.4 task/s, elapsed: 426s, ETA:     0s
per class results:

+---------------------+-------+-------+
|        Class        |  IoU  |  Acc  |
+---------------------+-------+-------+
Summary:

+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 77.39 | 35.94 | 45.69 |
+-------+-------+-------+
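
Identical numbers are what one would expect here: duplicating the test set doubles every per-class intersection and union count, so each ratio stays the same. A small sketch with made-up counts (hypothetical values, not taken from the run above):

```python
def mean_iou(intersect, union):
    """Mean over classes of intersection / union."""
    ious = [i / u for i, u in zip(intersect, union)]
    return sum(ious) / len(ious)


# Hypothetical per-class pixel counts for one copy of a test set.
intersect = [30, 45, 10]
union = [60, 50, 40]

single = mean_iou(intersect, union)
# Evaluating two identical copies as a whole doubles every count.
doubled = mean_iou([2 * i for i in intersect], [2 * u for u in union])
```

This equality relies on the two subsets being identical. For subsets with different content, the whole-dataset mIoU generally differs from the mean of the per-subset values.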

@Junjun2016
Collaborator

Hi @FreyWang
Thanks for updating.
We will review it ASAP.

@Junjun2016
Collaborator

Please use pre-commit to fix the lint error.

@Junjun2016
Collaborator

Please also run pytest to check for code errors and compatibility issues.

@@ -15,10 +20,107 @@ class ConcatDataset(_ConcatDataset):
         datasets (list[:obj:`Dataset`]): A list of datasets.
     """

-    def __init__(self, datasets):
+    def __init__(self, datasets, separate_eval=True):
Collaborator

Docstring for separate_eval.

Contributor Author

Hi, I have fixed the issue and added a unit test for it. Do I need to submit a new PR?

@@ -99,6 +99,9 @@ def __init__(self,
         self.label_map = None
         self.CLASSES, self.PALETTE = self.get_classes_and_palette(
             classes, palette)
+        if test_mode:
+            assert self.CLASSES is not None, \
+                '`cls.CLASSES` or `classes` should be specified when testing'
Collaborator

It seems that this change causes the GitHub CI to fail (checked).
Could you please add some unit tests and fix the failing ones?

Contributor Author

OK, I will find time to fix the issue above 😂

raise NotImplementedError(
'All the datasets should have same types when self.separate_eval=False')
else:
gt_seg_maps = chain(*[dataset.get_gt_seg_maps()
Collaborator

If the results are pre_eval results, we do not need gt_seg_maps.

Contributor Author

If the results are pre_eval results, we do not need gt_seg_maps.

Yes, but if pre_eval=False during training, it may cause an error.

evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)

Collaborator

I mean if the results are pre_eval results, we do not need gt_seg_maps and set gt_seg_maps=None.
We only need to collect gt_seg_maps when the results are eval results.

Contributor Author

got it 😯

def format_results(self, results, imgfile_prefix, indices=None, **kwargs):
"""format result for every sample of ConcatDataset """
ret_res = []
for i, indice in enumerate(indices):
Collaborator

What about indices=None?
Maybe we need to handle this case.

Contributor Author

What about indices=None?
Maybe we need to handle this case.

You are right, I will fix it.
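
One way the indices=None case could be handled is to fall back to every sample, then map each global index to its subset. The sketch below is an assumption-laden illustration, not the fix that was committed: `locate_samples` is a hypothetical helper, and it takes the subset sizes as a plain list.

```python
import bisect
from itertools import accumulate


def locate_samples(dataset_sizes, indices=None):
    """Map global sample indices to (dataset_idx, sample_idx) pairs.

    Hypothetical helper; when `indices` is None, every sample is covered.
    """
    cumulative = list(accumulate(dataset_sizes))  # e.g. [3, 5] for sizes 3, 2
    if indices is None:
        indices = range(cumulative[-1]) if cumulative else []
    located = []
    for index in indices:
        if not 0 <= index < cumulative[-1]:
            raise IndexError(f'sample index {index} out of range')
        # First subset whose cumulative size exceeds the global index.
        dataset_idx = bisect.bisect_right(cumulative, index)
        offset = cumulative[dataset_idx - 1] if dataset_idx > 0 else 0
        located.append((dataset_idx, index - offset))
    return located


pairs = locate_samples([3, 2])
```

With such a mapping, the result for a pair (dataset_idx, sample_idx) could be written under imgfile_prefix/{dataset_idx}, as described in the PR summary.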

@Junjun2016 Junjun2016 self-requested a review August 30, 2021 12:06
Comment on lines 79 to 80
gt_seg_maps = chain(*[dataset.get_gt_seg_maps()
for dataset in self.datasets])
Collaborator

Suggested change
-                gt_seg_maps = chain(*[dataset.get_gt_seg_maps()
-                                      for dataset in self.datasets])
+                if mmcv.is_list_of(results, np.ndarray) or mmcv.is_list_of(
+                        results, str):
+                    gt_seg_maps = chain(*[dataset.get_gt_seg_maps()
+                                          for dataset in self.datasets])
+                else:
+                    gt_seg_maps = None

Does this work?
Please check.
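
The idea behind the suggestion, deciding from the element type of `results` whether ground truth is needed at all, can be sketched without any dependencies. In this illustration `is_list_of` is a local stand-in rather than the mmcv function, plain lists replace np.ndarray predictions, and the tuple layout of the pre_eval results is an assumption.

```python
def is_list_of(seq, expected_type):
    """Return True if `seq` is a list whose items are all `expected_type`."""
    return isinstance(seq, list) and all(
        isinstance(item, expected_type) for item in seq)


def needs_gt_seg_maps(results):
    """Raw predictions need ground truth; pre_eval results do not.

    Assumption for this sketch: raw results are file paths (str) or
    per-pixel label lists, while pre_eval results are tuples of
    already-accumulated statistics.
    """
    return is_list_of(results, str) or is_list_of(results, list)


raw_paths = ['pred_0.npy', 'pred_1.npy']
raw_labels = [[0, 1], [1, 1]]
pre_eval_results = [(10, 20, 15, 15), (8, 16, 12, 12)]
```

Only when `needs_gt_seg_maps` is true would the ground-truth generators have to be chained; otherwise gt_seg_maps can stay None.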

Contributor Author

OK, I will.

@Junjun2016
Collaborator

Junjun2016 commented Sep 2, 2021 via email

@FreyWang
Contributor Author

FreyWang commented Sep 2, 2021

Update in this PR.

Done. Additionally, one bug fix has been committed (bf48690); please review it too.
Calling len(list(gt_seg_maps)) exhausts the generator, so the metric is NaN when pre_eval=False.

@codecov

codecov bot commented Sep 2, 2021

Codecov Report

Merging #833 (28e3bd2) into master (d35fbbd) will increase coverage by 0.10%.
The diff coverage is 97.33%.

@@            Coverage Diff             @@
##           master     #833      +/-   ##
==========================================
+ Coverage   88.90%   89.00%   +0.10%     
==========================================
  Files         110      110              
  Lines        5928     5992      +64     
  Branches      950      966      +16     
==========================================
+ Hits         5270     5333      +63     
- Misses        465      466       +1     
  Partials      193      193              
Flag | Coverage Δ
unittests | 88.98% <97.33%> (+0.10%) ⬆️

Impacted Files | Coverage Δ
mmseg/core/evaluation/metrics.py | 90.42% <ø> (-0.20%) ⬇️
mmseg/datasets/dataset_wrappers.py | 97.67% <97.01%> (-2.33%) ⬇️
mmseg/datasets/builder.py | 89.61% <100.00%> (+0.13%) ⬆️
mmseg/datasets/custom.py | 92.09% <100.00%> (-0.05%) ⬇️
mmseg/datasets/ade.py | 93.93% <0.00%> (+3.03%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -112,8 +112,6 @@ def total_intersect_and_union(results,
         ndarray: The prediction histogram on all classes.
         ndarray: The ground truth histogram on all classes.
     """
-    num_imgs = len(results)
-    assert len(list(gt_seg_maps)) == num_imgs
Collaborator

Why remove this assert?

Contributor Author

list(gt_seg_maps) consumes the generator, so gt_seg_maps is empty afterwards. The loop

for result, gt_seg_map in zip(results, gt_seg_maps):
then runs zero iterations, which leads to a NaN metric. I have added a unit test in bf48690.
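
The behaviour described above is easy to reproduce in isolation. The sketch below uses a toy generator (`gt_generator` and its data are made up for illustration) and shows that a length check via list() leaves nothing for the following zip loop.

```python
def gt_generator():
    """Yield toy ground-truth maps lazily, like a gt_seg_maps generator."""
    for gt in ([0, 1], [1, 1], [0, 0]):
        yield gt


results = [[0, 1], [1, 0], [0, 0]]

# Buggy pattern: the length check consumes the generator ...
gt_seg_maps = gt_generator()
assert len(list(gt_seg_maps)) == len(results)
# ... so the loop that should accumulate statistics sees no pairs at all.
pairs_after_check = list(zip(results, gt_seg_maps))

# Without the check, zip walks both sequences as intended.
pairs_without_check = list(zip(results, gt_generator()))

# With nothing accumulated, a ratio such as intersect / union is 0 / 0.
# Plain Python raises ZeroDivisionError for that; array libraries
# typically yield nan instead, which matches the nan metric reported here.
```

If a length check is still wanted, it has to be done on something that can be iterated more than once, such as a list, rather than on the generator itself.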

Collaborator

Some lines are still missing.
You can view them under Files changed.

Contributor Author

If the assert is not removed, the result of

eval_results = train_dataset.evaluate(pseudo_results, metric=['mIoU'])
will be NaN.

@Junjun2016
Collaborator

Could you please fix the lint error and add more unittests to improve the coverage?

@FreyWang
Contributor Author

FreyWang commented Sep 2, 2021

Could you please fix the lint error and add more unittests to improve the coverage?

OK, I will check again. Actually I did use pre-commit to refactor the code😕

@Junjun2016
Collaborator

Could you please fix the lint error and add more unittests to improve the coverage?

OK, I will check again. Actually I did use pre-commit to refactor the code


@@ -30,6 +30,7 @@ def _concat_dataset(cfg, default_args=None):
     img_dir = cfg['img_dir']
     ann_dir = cfg.get('ann_dir', None)
     split = cfg.get('split', None)
+    separate_eval = cfg.get('separate_eval', True)
Collaborator

We may pop separate_eval here?

Contributor Author

@FreyWang FreyWang Sep 3, 2021

let me check😢

@xvjiarui
Collaborator

xvjiarui commented Sep 2, 2021

Please fix the lint

Contributor Author

@FreyWang FreyWang left a comment

Could you please fix the lint error and add more unittests to improve the coverage?

Done

@@ -49,6 +50,9 @@ def _concat_dataset(cfg, default_args=None):
     datasets = []
     for i in range(num_dset):
         data_cfg = copy.deepcopy(cfg)
+        # pop 'separate_eval' since it is not a valid key for common datasets.
+        if 'separate_eval' in data_cfg:
+            data_cfg.pop('separate_eval')
Contributor Author

separate_eval has been popped here for every subset @xvjiarui

Collaborator

Can we use separate_eval = cfg.pop('separate_eval', True) in L33?

Contributor Author

Can we use separate_eval = cfg.pop('separate_eval', True) in L33?

Sure, I think it will be better

Contributor Author

updated 28e3bd2

Collaborator

@Junjun2016 Junjun2016 left a comment

Thank you for your efforts, LGTM.

@Junjun2016
Collaborator

Hi @xvjiarui
Please review it again.

@Junjun2016 Junjun2016 merged commit 872e544 into open-mmlab:master Sep 9, 2021
bowenroom pushed a commit to bowenroom/mmsegmentation that referenced this pull request Feb 25, 2022
…pen-mmlab#833)

* [Feature] Add tool to show origin or augmented train data

* [Feature] Support eval concate dataset

* Add docstring and modify evaluate of concate dataset

Signed-off-by: FreyWang <[email protected]>

* format concat dataset in subfolder of imgfile_prefix

Signed-off-by: FreyWang <[email protected]>

* add unittest of concate dataset

Signed-off-by: FreyWang <[email protected]>

* update unittest for eval dataset with CLASSES is None

Signed-off-by: FreyWang <[email protected]>

* [FIX] bug of generator,  which lead metric to nan when pre_eval=False

Signed-off-by: FreyWang <[email protected]>

* format code

Signed-off-by: FreyWang <[email protected]>

* add more unittest

* add more unittest

* optim concat dataset builder
@jason102811

Dear freywang,
First of all, we want to express our gratitude for your significant PR in the MMSeg project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR.
We are looking forward to continuing our collaboration with you. OpenMMLab has established a special contributors' organization called MMSIG, which provides contributors with open-source certificates, a recognition system, and exclusive rewards. You can contact us by adding our WeChat (if you have WeChat): openmmlabwx (please include the note mmsig + GitHub id), or by joining our Discord: https://discord.gg/qH9fysxPDW. We sincerely hope you will join us!
Best regards! @FreyWang
