Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable torch.complile to work with classification #3758

Merged
merged 3 commits into from
Jul 26, 2024

Conversation

harimkang
Copy link
Contributor

@harimkang harimkang commented Jul 23, 2024

Summary

    torch.compile Small   Medium     Large    
Task Model Enable Test/Accuracy Train/e2e_time Test/Accuracy Train/e2e_time Train/iter_time Test/Accuracy Train/e2e_time Train/iter_time
MULTI_CLASS_CLS EfficientNet-B0 X 1.0 10 0.7819 87 0.0724 0.8055 210 0.0839
    O 1.0 85 0.7842 273 0.0720 0.8425 513 0.0858
  DeiT-Tiny X 1.0 10 0.7998 89 0.0612 0.8633 336 0.0820
    O 1.0 36 0.8108 136 0.0635 0.8620 418 0.0814

How to test

otx train ~~~ --model.torch_compile True

Checklist

  • I have added unit tests to cover my changes.​
  • I have added integration tests to cover my changes.​
  • I have ran e2e tests and there is no issues.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).​
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

@github-actions github-actions bot added TEST Any changes in tests OTX 2.0 labels Jul 23, 2024
Copy link
Contributor

@eunwoosh eunwoosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work. LGTM, but how about adding an integration test for this?
And it seems that there is no gain for speed referring experiment result. Do you know why?

@harimkang
Copy link
Contributor Author

harimkang commented Jul 25, 2024

Thanks for your work. LGTM, but how about adding an integration test for this? And it seems that there is no gain for speed referring experiment result. Do you know why?

This will require more experimentation. Currently, it uses a lot of time at 0 epoch to do the model compile. In my opinion, this time is relatively high for the current use-case of OTX Classification. Also, custom models may not be optimized in most cases. I'll have to look into this further and experiment with improving it later. For now, this PR is about getting compile just to work.

@harimkang harimkang added this to the 2.2.0 milestone Jul 26, 2024
@harimkang harimkang enabled auto-merge July 26, 2024 01:17
Copy link
Contributor

@sungchul2 sungchul2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Related to following results, as you said, it seems to need more experiments because there is no gain for iter_time.
I know compiling time is required and it can be bottleneck for e2e time, but official results show improved training performance and I think very short iter_time is required to reduce e2e time as the official results said.
And I searched some custom experimental results and usually torch.compile showed enhanced results about large batch size or large input resolution.
As far as I know, we use (maybe?) large batch size (=64) but small input resolution (=224) for classification.
I'd like to suggest extensive experiments to find which case affects critical damage to training performance.

@harimkang harimkang added this pull request to the merge queue Jul 26, 2024
Merged via the queue into openvinotoolkit:develop with commit 4696991 Jul 26, 2024
22 checks passed
@harimkang harimkang deleted the harimkan/enable-compile branch July 26, 2024 01:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TEST Any changes in tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants