Error while running classification regression test with DeiT-Tiny template #2567

Closed
yunchu opened this issue Oct 19, 2023 · 1 comment
Labels
BUG Something isn't working

Contributor

yunchu commented Oct 19, 2023

The following test case (TC) fails with an error:

  • tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train_cls_incr[Custom_Image_Classification_DeiT-Tiny]

https://github.com/openvinotoolkit/training_extensions/actions/runs/6570309598/job/17847515102

Use the command below to reproduce this issue on a local machine.

$ CI_DATA_ROOT=<absolute-path-to-ci-datasets> tox -vvv -e tests-cls-py310-pt1 -- tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train[Custom_Image_Classification_DeiT-Tiny] tests/regression/classification/test_classification.py::TestRegressionMultiClassClassification::test_otx_train_cls_incr[Custom_Image_Classification_DeiT-Tiny]

error capture

2023-10-19T05:44:33.8108903Z 2023-10-19 05:44:31,331 - mmcls - INFO - Epoch(val) [5][32] accuracy_top-1: 0.6125, accuracy_top-5: 0.9400, airplane accuracy: 0.7000, automobile accuracy: 0.6250, bird accuracy: 0.5750, cat accuracy: 0.3750, deer accuracy: 0.7000, dog accuracy: 0.5500, frog accuracy: 0.5250, horse accuracy: 0.6750, ship accuracy: 0.8250, truck accuracy: 0.5750, mean accuracy: 0.6125, accuracy: 0.6125, current_iters: 160
2023-10-19T05:44:33.8111028Z 2023-10-19 05:44:31,332 - mmcls - INFO - MemCacheHandlerBase uses 0 / 0 (0.0%) memory pool and store 0 items.
2023-10-19T05:44:33.8111701Z 2023-10-19 05:44:31,333 - mmcls - INFO -
2023-10-19T05:44:33.8112148Z Best Score: 0.6125, Current Score: 0.6125, Patience: 1 Count: 0
2023-10-19T05:44:33.8112617Z Process SpawnProcess-1:
2023-10-19T05:44:33.8112933Z Traceback (most recent call last):
2023-10-19T05:44:33.8119433Z File "/home/validation/actions-runner/_work/_tool/Python/3.10.13/x64/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
2023-10-19T05:44:33.8119937Z self.run()
2023-10-19T05:44:33.8120475Z File "/home/validation/actions-runner/_work/_tool/Python/3.10.13/x64/lib/python3.10/multiprocessing/process.py", line 108, in run
2023-10-19T05:44:33.8121113Z self._target(*self._args, **self._kwargs)
2023-10-19T05:44:33.8121841Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/utils/multi_gpu.py", line 269, in run_child_process
2023-10-19T05:44:33.8122436Z train_func()
2023-10-19T05:44:33.8123035Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 290, in train
2023-10-19T05:44:33.8123575Z task.train(
2023-10-19T05:44:33.8124217Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/task.py", line 213, in train
2023-10-19T05:44:33.8124830Z results = self._train_model(dataset)
2023-10-19T05:44:33.8125600Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/task.py", line 410, in _train_model
2023-10-19T05:44:33.8126298Z train_model(
2023-10-19T05:44:33.8126889Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/apis/train.py", line 233, in train_model
2023-10-19T05:44:33.8127455Z runner.run(data_loaders, cfg.workflow)
2023-10-19T05:44:33.8128128Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
2023-10-19T05:44:33.8128717Z epoch_runner(data_loaders[i], **kwargs)
2023-10-19T05:44:33.8129457Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/common/adapters/mmcv/runner.py", line 81, in train
2023-10-19T05:44:33.8130102Z self.run_iter(data_batch, train_mode=True, **kwargs)
2023-10-19T05:44:33.8131163Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
2023-10-19T05:44:33.8131871Z outputs = self.model.train_step(data_batch, self.optimizer,
2023-10-19T05:44:33.8132646Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 63, in train_step
2023-10-19T05:44:33.8133262Z output = self.module.train_step(*inputs[0], **kwargs[0])
2023-10-19T05:44:33.8134116Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 29, in train_step
2023-10-19T05:44:33.8134860Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:33.8135832Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 105, in train_step
2023-10-19T05:44:33.8136574Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:33.8137312Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 139, in train_step
2023-10-19T05:44:33.8137890Z losses = self(**data)
2023-10-19T05:44:34.5742421Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:34.5743463Z return forward_call(*input, **kwargs)
2023-10-19T05:44:34.5746123Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 149, in new_func
2023-10-19T05:44:34.5746887Z output = old_func(*new_args, **new_kwargs)
2023-10-19T05:44:34.5747809Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 83, in forward
2023-10-19T05:44:34.5748419Z return self.forward_train(img, **kwargs)
2023-10-19T05:44:34.5749327Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/custom_image_classifier.py", line 83, in forward_train
2023-10-19T05:44:34.5750118Z loss = self.head.forward_train(x, gt_label)
2023-10-19T05:44:34.5750921Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/heads/vision_transformer_head.py", line 122, in forward_train
2023-10-19T05:44:34.5751649Z losses = self.loss(cls_score, gt_label, **kwargs)
2023-10-19T05:44:34.5752532Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/heads/custom_vision_transformer_head.py", line 24, in loss
2023-10-19T05:44:34.5753340Z loss = self.compute_loss(cls_score, gt_label, feature=feature)
2023-10-19T05:44:34.5754077Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:34.5754669Z return forward_call(*input, **kwargs)
2023-10-19T05:44:34.5755469Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/losses/ib_loss.py", line 61, in forward
2023-10-19T05:44:34.5756240Z feature = torch.sum(torch.abs(feature), 1).reshape(-1, 1)
2023-10-19T05:44:34.5756622Z TypeError: abs(): argument 'input' (position 1) must be Tensor, not NoneType
2023-10-19T05:44:34.5756920Z Traceback (most recent call last):
2023-10-19T05:44:34.5757441Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/bin/otx", line 8, in <module>
2023-10-19T05:44:34.5757886Z sys.exit(main())
2023-10-19T05:44:34.5758476Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/cli.py", line 77, in main
2023-10-19T05:44:34.5759029Z results = globals()[f"otx_{name}"]()
2023-10-19T05:44:34.5759668Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 192, in main
2023-10-19T05:44:34.5760262Z return train(exit_stack)
2023-10-19T05:44:34.5760882Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/cli/tools/train.py", line 290, in train
2023-10-19T05:44:34.5761420Z task.train(
2023-10-19T05:44:34.5762077Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/task.py", line 213, in train
2023-10-19T05:44:34.5762681Z results = self._train_model(dataset)
2023-10-19T05:44:34.5763434Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/task.py", line 410, in _train_model
2023-10-19T05:44:34.5764105Z train_model(
2023-10-19T05:44:34.5764743Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/apis/train.py", line 233, in train_model
2023-10-19T05:44:34.5765310Z runner.run(data_loaders, cfg.workflow)
2023-10-19T05:44:34.5765989Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
2023-10-19T05:44:34.5766576Z epoch_runner(data_loaders[i], **kwargs)
2023-10-19T05:44:36.2499831Z
2023-10-19T05:44:36.2501500Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/common/adapters/mmcv/runner.py", line 81, in train
2023-10-19T05:44:36.2502702Z self.run_iter(data_batch, train_mode=True, **kwargs)
2023-10-19T05:44:36.2503929Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
2023-10-19T05:44:36.2505326Z outputs = self.model.train_step(data_batch, self.optimizer,
2023-10-19T05:44:36.2506559Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 63, in train_step
2023-10-19T05:44:36.2507599Z output = self.module.train_step(*inputs[0], **kwargs[0])
2023-10-19T05:44:36.2509023Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 29, in train_step
2023-10-19T05:44:36.2510281Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:36.2511711Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/mixin.py", line 105, in train_step
2023-10-19T05:44:36.2512971Z return super().train_step(data, optimizer, **kwargs)
2023-10-19T05:44:36.2514154Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 139, in train_step
2023-10-19T05:44:36.2515125Z losses = self(**data)
2023-10-19T05:44:36.2516161Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:36.2517127Z return forward_call(*input, **kwargs)
2023-10-19T05:44:36.2518205Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 149, in new_func
2023-10-19T05:44:36.2519391Z 2023-10-19 05:44:31,333 | INFO : Balanced sampler will select balanced samples 32 times
2023-10-19T05:44:36.2520296Z 2023-10-19 05:44:34,659 | WARNING : Some of child processes are terminated abnormally. process exits.
2023-10-19T05:44:36.2520905Z output = old_func(*new_args, **new_kwargs)
2023-10-19T05:44:36.2522035Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/classifiers/base.py", line 83, in forward
2023-10-19T05:44:36.2523017Z return self.forward_train(img, **kwargs)
2023-10-19T05:44:36.2524490Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/classifiers/custom_image_classifier.py", line 83, in forward_train
2023-10-19T05:44:36.2525802Z loss = self.head.forward_train(x, gt_label)
2023-10-19T05:44:36.2527163Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/mmcls/models/heads/vision_transformer_head.py", line 122, in forward_train
2023-10-19T05:44:36.2528266Z losses = self.loss(cls_score, gt_label, **kwargs)
2023-10-19T05:44:36.2529715Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/heads/custom_vision_transformer_head.py", line 24, in loss
2023-10-19T05:44:36.2531111Z loss = self.compute_loss(cls_score, gt_label, feature=feature)
2023-10-19T05:44:36.2532306Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
2023-10-19T05:44:36.2533267Z return forward_call(*input, **kwargs)
2023-10-19T05:44:36.2534690Z File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-cls-py310-pt1/lib/python3.10/site-packages/otx/algorithms/classification/adapters/mmcls/models/losses/ib_loss.py", line 61, in forward
2023-10-19T05:44:36.2536044Z feature = torch.sum(torch.abs(feature), 1).reshape(-1, 1)
2023-10-19T05:44:36.2536660Z TypeError: abs(): argument 'input' (position 1) must be Tensor, not NoneType

Contributor

wonjuleee commented Nov 2, 2023

The problem was that we did not pass the features from the DeiT model to the IB loss calculation during class-incremental learning. The fix, PR #2594, has been merged.
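
For anyone reading the traceback later, here is a minimal, torch-free sketch of the failure mode described above. It is not the OTX code and not the change made in the PR: `HypotheticalHead` and `feature_l1_norms` are made-up names for illustration, and plain Python lists stand in for tensors. It only shows why a head that drops `feature` before calling the loss fails at the line `torch.sum(torch.abs(feature), 1).reshape(-1, 1)`, and how forwarding the feature avoids it.

```python
from typing import List, Optional, Sequence


def feature_l1_norms(feature: Optional[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Per-sample L1 norm as an N x 1 column.

    Plain-list analogue of torch.sum(torch.abs(feature), 1).reshape(-1, 1),
    the expression that raised in the traceback.
    """
    if feature is None:
        # Fail with a clear message instead of a TypeError deep inside abs().
        raise ValueError("IB loss needs the feature, but the head passed None")
    return [[sum(abs(v) for v in row)] for row in feature]


class HypotheticalHead:
    """Illustrative stand-in for a classification head (not an OTX class)."""

    def __init__(self, forward_feature: bool) -> None:
        self.forward_feature = forward_feature

    def loss(self, cls_score, gt_label, feature=None):
        # The buggy path drops the feature before it reaches the loss;
        # the working path forwards it unchanged.
        passed = feature if self.forward_feature else None
        return feature_l1_norms(passed)


if __name__ == "__main__":
    feats = [[1.0, -2.0], [0.5, 0.5]]
    print(HypotheticalHead(forward_feature=True).loss(None, None, feature=feats))
    try:
        HypotheticalHead(forward_feature=False).loss(None, None, feature=feats)
    except ValueError as err:
        print(f"buggy path: {err}")
```

The explicit `None` check is a design choice in this sketch only: it surfaces the missing feature at the loss boundary rather than inside the tensor op.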
