
Add optimisation options/guidance #81

Closed
ElliottKasoar opened this issue Dec 15, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@ElliottKasoar (Contributor)

As discussed in #78, there are (at least) two forms of optimisation that would be relatively straightforward to facilitate in some capacity, but which require more consideration and are unlikely to be the default options (which is why they are not included in the referenced PR):

  1. Model freezing
  • See: torch.jit.freeze
  • Applies system-independent optimisation, as opposed to the system-dependent optimize_for_inference (which is currently broken, unfortunately)
  • Can give up to a 50% speedup
  • From our benchmarking (see the FTorch with/without gradients and/or frozen models sections), freezing the model gave more modest, but not insignificant, improvements in most cases
    • Tests were carried out by replacing scripted_model.save(filename) with frozen_model = torch.jit.freeze(scripted_model), followed by frozen_model.save(filename), in pt2ts.py (see the first sketch after this list)
  • While not a problem for the use of FTorch, it is also worth noting that running the same TorchScript via Forpy (on CPU or GPU) seemed to give errors similar to those optimize_for_inference can give, e.g. AttributeError: 'RecursiveScriptModule' object has no attribute 'training'
  • Freezing the model appears to lead to numerical errors (~10^-6) for the ResNet benchmark, raising a RuntimeError when saving, but this does not seem to be the case for the cgdrag benchmark, and it is unclear why
  • The guidance part of the title is perhaps most relevant here, as this is less about the main FTorch library, and more about how we enable users to use tools like pt2ts.py as part of a workflow involving FTorch
    • This is somewhat dependent on the typical familiarity of potential FTorch users with the processes involved in saving to TorchScript
    • Note: trace_to_torchscript currently uses model freezing. It would be preferable to have a shared setting and/or behaviour, unless there is a clear reason to use freezing in only one of the functions
    • Any guidelines on trace_to_torchscript compared with script_to_torchscript may also be useful, as currently there is no clear motivation not to use the "default" script_to_torchscript
  2. InferenceMode
  • See: inference mode, autograd mechanics and the dev podcast
  • From our benchmarking (see the FTorch with InferenceMode and NoGradMode sections), benefits were less clear, but in general InferenceMode is expected to be at least as fast as NoGradMode
    • Tests were carried out by replacing torch::AutoGradMode enable_grad(requires_grad); with c10::InferenceMode guard(requires_grad); in all ctorch.cpp functions, but ideally both options would be presented to users (see the second sketch after this list)
  • This mode was only added (as a beta) in PyTorch 1.9, so we would need to consider support for older versions
  • The mode is also much stricter than NoGradMode, so it cannot be used in all cases
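For concreteness, here is a minimal sketch of the pt2ts.py change described in point 1. The model and filename are illustrative placeholders, not FTorch's actual workflow:

```python
import torch
import torchvision

# Illustrative model standing in for whatever pt2ts.py handles;
# resnet18 mirrors the ResNet benchmark mentioned above.
model = torchvision.models.resnet18(weights=None)
model.eval()  # torch.jit.freeze requires a module in eval mode

scripted_model = torch.jit.script(model)

# The change under discussion: instead of scripted_model.save(filename),
# freeze the scripted model first, then save the frozen module.
frozen_model = torch.jit.freeze(scripted_model)
frozen_model.save("saved_model_frozen.pt")  # hypothetical filename
```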
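Similarly, a sketch of the InferenceMode vs NoGradMode distinction from point 2. The change tested above was in the C++ ctorch.cpp; this shows the equivalent Python-side guards (torch.inference_mode() corresponds to c10::InferenceMode), with an illustrative model and input:

```python
import torch

model = torch.nn.Linear(4, 2)  # illustrative model
x = torch.ones(1, 4)

# NoGradMode: gradient tracking is disabled, but the outputs are ordinary
# tensors that can still be used in autograd-tracked code later.
with torch.no_grad():
    y_no_grad = model(x)

# InferenceMode (PyTorch >= 1.9): stricter; skips version counters and
# autograd metadata entirely, so it can be faster, but tensors created
# here cannot be used in operations recorded by autograd afterwards.
with torch.inference_mode():
    y_inference = model(x)
```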
@ElliottKasoar added the enhancement label on Dec 15, 2023
@jatkinson1000 (Member)

Closing, as this has been broken down into #112 and #113
