
Add an environment variable to disable ort session cache if necessary | fix(atenlib) #685

Merged
merged 6 commits into microsoft:main on Apr 29, 2023

Conversation

fatcat-z
Contributor

Try disabling the ORT session cache to avoid the Windows memory exception raised during CI tests.
Fix #614

@fatcat-z fatcat-z added the topic: torch_lib Related to the torch/aten function lib in development label Apr 27, 2023
@justinchuby
Collaborator

How about removing the cache? Any cons?

@fatcat-z
Contributor Author

> How about removing the cache? Any cons?

@gramalingam , any comments about adding this cache in the beginning?

@@ -27,7 +27,7 @@
 from typing_extensions import TypeAlias

 from onnxscript import autocast, irbuilder, onnx_opset, tensor, utils, values
-from onnxscript._internal import param_manipulation
+from onnxscript._internal import feature_switch, param_manipulation

Check notice — Code scanning / CodeQL: Cyclic import

Import of module onnxscript._internal.param_manipulation begins an import cycle.
import os

# Enabled by default; set CACHE_ORT_SESSIONS=0 to disable.
CACHE_ORT_SESSIONS: bool = os.getenv("CACHE_ORT_SESSIONS", "1") != "0"
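For context, a minimal sketch of what a switch-guarded session cache might look like. This is a hypothetical illustration, not the actual onnxscript code: the cache shape and the `get_session` helper are assumptions.

```python
import os

# Hypothetical sketch of the switch under discussion: enabled by default,
# disabled by setting CACHE_ORT_SESSIONS=0 in the environment.
CACHE_ORT_SESSIONS: bool = os.getenv("CACHE_ORT_SESSIONS", "1") != "0"

_session_cache: dict = {}  # illustrative cache keyed by model identity


def get_session(model_key, create_session):
    """Return a session for model_key, caching it only when the switch is on."""
    if not CACHE_ORT_SESSIONS:
        # Fresh session on every call: nothing is retained between calls,
        # so per-test memory is released instead of accumulating.
        return create_session()
    if model_key not in _session_cache:
        _session_cache[model_key] = create_session()
    return _session_cache[model_key]
```

With the switch off, every call pays the session-creation cost but memory stays flat, which is the trade-off motivating this PR for the Windows CI runs.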
Collaborator


Suggested change:
- CACHE_ORT_SESSIONS: bool = os.getenv("CACHE_ORT_SESSIONS", "1") != "0"
+ cache_ort_sessions: bool = os.getenv("CACHE_ORT_SESSIONS", "1") != "0"

Sorry, I changed my mind: it looks like this is a flag that can be changed at runtime (not a constant), so snake case instead?

Collaborator


Is this in _internal by design? I can see arguments both ways. It should ideally be exposed to the user, but if we think this API will change, I can see a point in leaving it here.

Contributor Author

@fatcat-z fatcat-z Apr 29, 2023


> Is this in _internal by design? I can see arguments both ways. It should ideally be exposed to the user, but if we think this API will change, I can see a point in leaving it here.

My opinion: such switches should not be exposed to users who call onnx-script ops or torch_lib to construct a model, because they only control internal features of onnx-script, and those internal features are not something users need to know about.

Contributor Author

@fatcat-z fatcat-z Apr 29, 2023


> Sorry, I changed my mind: it looks like this is a flag that can be changed at runtime (not a constant), so snake case instead?

I think an upper-case variable reads as a constant that should not be changed at runtime. That is also aligned with the assumption that these switches are controlled only through environment variables, so we don't need to change them to snake case.
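One consequence of this design, sketched below for illustration (this is not the actual onnxscript code): a module-level switch is evaluated exactly once, when the module is imported, so the environment variable must be set before that import runs, e.g. `CACHE_ORT_SESSIONS=0 python run_tests.py` from the shell.

```python
import os

# Illustrative only: simulate setting the variable before the module-level
# expression runs. In practice this would be done in the shell or CI config
# before the Python process imports the module that defines the switch.
os.environ["CACHE_ORT_SESSIONS"] = "0"  # disable caching

# Re-evaluating the same expression the module would run at import time:
cache_ort_sessions = os.getenv("CACHE_ORT_SESSIONS", "1") != "0"
print(cache_ort_sessions)  # → False: caching disabled
```

Setting the variable after the import has already happened would have no effect, which is consistent with treating the switch as an import-time constant rather than a runtime flag.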

Collaborator


Consider naming it flags.py, as in "feature flags".

Contributor Author


In my past projects, values that determine whether a feature is enabled were named like "xxxx_switch", so I followed that convention.
In addition, I think 'switch' implies something that controls behavior, while 'flag' reads more like a static label.

Collaborator

@justinchuby justinchuby left a comment


Thanks! Last round of minor suggestions.

@fatcat-z fatcat-z merged commit 319ced6 into microsoft:main Apr 29, 2023
@fatcat-z fatcat-z deleted the remove_cache branch April 29, 2023 16:13
Successfully merging this pull request may close these issues.

Windows CI has MemoryError