
build: Torch 2.0.1 conditional dependency specification #409

Closed
wants to merge 14 commits

Conversation

gtdang
Collaborator

@gtdang gtdang commented May 16, 2023

Description

Addressing issue #408

Type of change

  • build: Changes that affect the build system or external dependencies (e.g., poetry, conda)

Features (Optional)

  • bulleted list of features, describing the code contributions in this pull request

Questions (Optional)

  • potential questions for the code reviewer (e.g., if you want feedback on a certain method or class; any feedback required from other code contributors with their mentioning)

@gtdang gtdang marked this pull request as ready for review May 16, 2023 21:49
@gtdang gtdang requested a review from musslick as a code owner May 16, 2023 21:49
@gtdang gtdang changed the title Torch 2.0.1 fixes build: Torch 2.0.1 conditional dependency specification May 16, 2023
@gtdang
Collaborator Author

gtdang commented May 16, 2023

[Screenshot: 2023-05-16 at 6:00:36 PM]

I removed fail-fast in the gh actions pytest workflow and confirmed that this issue is only for Linux builds.

@gtdang gtdang marked this pull request as draft May 16, 2023 22:03
@gtdang gtdang requested a review from benwandrew May 16, 2023 22:04
@@ -10,7 +10,7 @@ on:
 jobs:
   test:
     strategy:
-      fail-fast: true
+      fail-fast: false
Collaborator

i know we touched on this, but to confirm, is this just for now or more generally going forward? once we've got the conditional solution is there any reason we wouldn't revert to fail-fast?

Collaborator Author

Yep! You got it! We're still in testing mode to figure out the right solution, so we'll leave it off for now. Once we figure it out we'll switch this back to true. Fail-fast is a cost-saving feature: you don't want to keep running (and paying for) compute resources for all the matrix instances once one has failed. However, it does obscure whether a problem is isolated to one type of build or occurs across all builds.
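For context, the strategy block under discussion looks roughly like this (a sketch; the actual matrix entries in the repo's workflow may differ):

```yaml
# Sketch of the test job's strategy block; the os/python-version
# matrix values here are illustrative, not the repo's actual ones.
jobs:
  test:
    strategy:
      # Left as false while debugging so every matrix instance runs to
      # completion; switch back to true once the conditional-dependency
      # fix lands, so one failure cancels the remaining instances.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.8", "3.9", "3.10"]
    runs-on: ${{ matrix.os }}
```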

pyproject.toml (review thread resolved)
@benwandrew benwandrew self-requested a review May 16, 2023 23:34
@gtdang
Copy link
Collaborator Author

gtdang commented May 17, 2023

@benwandrew Wow! That was a battle, but I got it working, so this is at least one solution. I'd like to get @hollandjg's perspective on this when he's back next week. So let's wait to merge?

@gtdang gtdang requested a review from hollandjg May 17, 2023 20:22
@gtdang gtdang marked this pull request as ready for review May 17, 2023 20:22
@benwandrew
Collaborator

@benwandrew Wow! That was a battle, but I got it working, so this is at least one solution. I'd like to get @hollandjg's perspective on this when he's back next week. So let's wait to merge?

@gtdang sounds good, i agree. thanks for fighting your way through!

@hollandjg
Member

This should be a problem which is isolated to autora-theorist-darts once we've done the reorganization.
Is it just because torch 2.0.1 doesn't work on linux yet? If so, we can just keep the current working version 1.x until this is fixed.

@gtdang
Collaborator Author

gtdang commented May 22, 2023

This should be a problem which is isolated to autora-theorist-darts once we've done the reorganization.
Is it just because torch 2.0.1 doesn't work on linux yet? If so, we can just keep the current working version 1.x until this is fixed.

It's actually because the default pip install behavior differs between operating systems. On Linux, pip defaults to a GPU (CUDA) build of torch, and you have to point it at the CPU wheel source explicitly; on macOS and Windows it defaults to the CPU build. This is an issue for our tests because GitHub runners don't have GPUs. The build works, but the tests fail when torch goes looking for GPU modules to compile.
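One way to express the platform-conditional dependency is Poetry's multiple-constraints syntax with an explicit wheel source (a sketch assuming a recent Poetry, 1.5 or later; the actual pyproject.toml changes in this PR may differ):

```toml
# Sketch of a platform-conditional torch dependency; not necessarily
# the exact specification used in this PR.
[tool.poetry.dependencies]
torch = [
    # On Linux, pull the CPU-only wheel from PyTorch's CPU index.
    { version = "2.0.1", source = "torch-cpu", markers = "sys_platform == 'linux'" },
    # macOS and Windows wheels on PyPI already default to the CPU build.
    { version = "2.0.1", markers = "sys_platform != 'linux'" },
]

[[tool.poetry.source]]
name = "torch-cpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "explicit"
```

With plain pip, the equivalent on Linux is pointing at the CPU index directly, e.g. `pip install torch --index-url https://download.pytorch.org/whl/cpu`.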

I'm not sure when this change happened, but the problem arose when trying to update to v2.0.1.

I brought this up at last week's maintenance meeting. We discussed that the CPU version should be the default dependency, with a warning about GPU resources shown when the user imports the Darts modules. I added an issue with more info to the autora-theorist-darts repo for this feature: AutoResearch/autora-theorist-darts#3
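The import-time warning discussed at the meeting could look something like this (a hypothetical sketch; `warn_if_no_gpu` is not a function in the actual codebase, and the message text is illustrative):

```python
import warnings


def warn_if_no_gpu() -> bool:
    """Return True if a CUDA device is available, warning otherwise.

    Hypothetical sketch of the import-time check discussed for the
    darts module; not the actual autora-theorist-darts implementation.
    """
    try:
        import torch
        has_gpu = torch.cuda.is_available()
    except ImportError:
        # torch not installed at all: treat as no GPU.
        has_gpu = False
    if not has_gpu:
        warnings.warn(
            "No CUDA device detected; DARTS will run on the CPU. "
            "Install a GPU-enabled torch build to use GPU acceleration.",
            RuntimeWarning,
        )
    return has_gpu
```

Calling `warn_if_no_gpu()` once at the top of the darts module would surface the situation to users without making the GPU build a hard dependency.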

@hollandjg
Member

this needs to be closed here and moved over to the relevant subpackages which use pytorch (darts, falsification)

@hollandjg hollandjg closed this Jun 2, 2023