As discussed in #78, there are (at least) two forms of optimisation that would be relatively straightforward to facilitate in some capacity, but which require more consideration and are unlikely to be the default options (which is why they are not included in the referenced PR):
**Model freezing**

- From our benchmarking (see the "FTorch with/without gradients and/or frozen models" sections), freezing the model can make more modest, but not insignificant, improvements (in most cases)
- Tests were carried out by replacing `scripted_model.save(filename)` with `frozen_model = torch.jit.freeze(scripted_model)`, and then `frozen_model.save(filename)`, in `pt2ts.py`
- While not a problem for the use of FTorch, it's also worth noting that running the same TorchScript via Forpy (on CPU or GPU) seemed to give similar errors to those `optimize_for_inference` (which is currently broken, unfortunately) can give, e.g. `AttributeError: 'RecursiveScriptModule' object has no attribute 'training'`
- Freezing the model appears to lead to numerical errors (~10^-6) for the ResNet benchmark, raising a `RuntimeError` when saving, but this doesn't seem to be the case for the cgdrag benchmark, and it is unclear why
- The "guidance" part of the title is perhaps most relevant here, as this is less about the main FTorch library and more about how we enable users to use tools like `pt2ts.py` as part of a workflow involving FTorch
- This is somewhat dependent on how familiar potential FTorch users typically are with the processes involved in saving to TorchScript
- Note: `trace_to_torchscript` currently uses model freezing. It would be preferable to have a shared setting and/or behaviour, unless there is a clear reason to use freezing in only one of the functions
- Any guidelines on `trace_to_torchscript` compared with `script_to_torchscript` may also be useful, as there is currently no clear motivation not to use the "default" `script_to_torchscript`
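The freezing change tested above can be sketched in isolation. The `Net` module and filename below are illustrative stand-ins for whatever `pt2ts.py` is actually saving:

```python
# Sketch of the pt2ts.py change: freeze the scripted model before saving.
import torch


class Net(torch.nn.Module):
    """Hypothetical stand-in for the model being exported."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)


# Freezing requires the module to be in eval mode, otherwise it raises
# a RuntimeError.
model = Net().eval()
scripted_model = torch.jit.script(model)

# Instead of scripted_model.save(filename):
frozen_model = torch.jit.freeze(scripted_model)
frozen_model.save("model_frozen.pt")

# The frozen module should produce the same outputs, up to small numerical
# differences (as seen with the ResNet benchmark).
x = torch.randn(3, 4)
assert torch.allclose(scripted_model(x), frozen_model(x), atol=1e-5)
```

Freezing inlines submodules, parameters, and attributes into the graph (which is also why the frozen module no longer has a `training` attribute, matching the Forpy error above).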
**`InferenceMode`**

- From our benchmarking (see the "FTorch with InferenceMode and NoGradMode" sections), the benefits were less clear, but in general it is expected to be at least as fast as `NoGradMode`
- Tests were carried out by replacing `torch::AutoGradMode enable_grad(requires_grad);` with `c10::InferenceMode guard(requires_grad);` in all `ctorch.cpp` functions, but ideally both options would be presented to users
- This mode was only added (as a beta) in PyTorch 1.9, so we would need to consider support for older versions
- The mode is also much stricter than `NoGradMode`, so it cannot be used in all cases
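The strictness difference can be illustrated from Python, where `torch.inference_mode()` and `torch.no_grad()` are the counterparts of the C++ guards above: tensors created under `InferenceMode` are "inference tensors" and cannot be used in autograd afterwards, whereas `NoGradMode` outputs can.

```python
# Sketch contrasting InferenceMode with NoGradMode (Python equivalents of
# the c10::InferenceMode / torch::AutoGradMode guards used in ctorch.cpp).
import torch

x = torch.ones(3)

with torch.no_grad():
    # Ordinary tensor: can still participate in autograd later.
    y_nograd = x * 2

with torch.inference_mode():
    # "Inference tensor": stricter, with less autograd bookkeeping.
    y_inf = x * 2

w = torch.ones(3, requires_grad=True)

# The NoGradMode output can later be used in a differentiated graph:
loss = (y_nograd * w).sum()
loss.backward()  # fine

# The InferenceMode output cannot: using it where autograd would need to
# record it raises a RuntimeError.
try:
    (y_inf * w).sum().backward()
except RuntimeError as err:
    print("inference tensor rejected:", err)
```

This is why `InferenceMode` cannot simply replace `NoGradMode` everywhere: it is only safe when the outputs are guaranteed never to re-enter autograd, which fits FTorch's inference-only call path but not every use case.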