
Model Freezing #113

Open

jatkinson1000 opened this issue Apr 8, 2024 · 2 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), Hacktoberfest (Issues open for Hacktoberfest contributions)

Comments

@jatkinson1000 (Member)

@ElliottKasoar Originally commented in #103:

  • Model freezing
    No changes are currently included. This is less directly applicable to the main FTorch library, although there are still interactions, e.g. freezing the model can allow InferenceMode to be enabled when the model is loaded.
    Freezing is currently the "default" when tracing in pt2ts.py, but not when scripting, despite potentially improving performance.
    Freezing appears to (sometimes) introduce numerical errors when saving and reloading a model (differences ~10^-6), and also seems to cause issues when loading with Forpy (see the sketch after this list).
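
A minimal sketch of the save/reload check implied above (this is not pt2ts.py itself; the toy model, filename, and tolerance are illustrative assumptions):

```python
# Sketch: freeze a scripted model, save and reload it, then compare outputs
# against the unfrozen model under InferenceMode. The toy model, filename and
# tolerance are assumptions for illustration, not FTorch code.
import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
example = torch.randn(1, 8)

scripted = torch.jit.script(model)
frozen = torch.jit.freeze(scripted)  # freezing requires a module in eval mode
frozen.save("frozen_model.pt")

reloaded = torch.jit.load("frozen_model.pt")
with torch.inference_mode():  # frozen models can be run under InferenceMode
    baseline = scripted(example)
    roundtrip = reloaded(example)

# Differences of ~10^-6 have been observed after freezing, so compare with a tolerance.
print(torch.allclose(baseline, roundtrip, atol=1e-6))
```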

Questions for @ElliottKasoar:

  • Is there any reason not to freeze the scripted models in the same way we do the traced ones?
  • Do you feel we should make this an option for the user to set?
  • Do you feel it is better to have frozen (for performance, but with a warning to check accuracy in Fortran) or unfrozen (to ensure reproducibility) as a default?
@jatkinson1000 added the enhancement (New feature or request) and good first issue (Good for newcomers) labels Apr 8, 2024
@jatkinson1000 (Member, Author)

Copying @ElliottKasoar's original comment from #81

  1. Model freezing
  • See: torch.jit.freeze
  • Applies system-independent optimization, as opposed to the system-dependent optimize_for_inference (which is currently broken, unfortunately)
  • Can give up to a 50% speedup
  • From our benchmarking (see the FTorch with/without gradients and/or frozen models sections), freezing the model can make more modest, but not insignificant, improvements in most cases
    • Tests were carried out by replacing scripted_model.save(filename) with frozen_model = torch.jit.freeze(scripted_model), followed by frozen_model.save(filename), in pt2ts.py (see the sketch after this list)
  • While not a problem for the use of FTorch, it is also worth noting that running the same TorchScript via Forpy (on CPU or GPU) seemed to give errors similar to those optimize_for_inference can give, e.g. AttributeError: 'RecursiveScriptModule' object has no attribute 'training'
  • Freezing the model appears to lead to numerical errors (~10^-6) for the ResNet benchmark, raising a RuntimeError when saving, but this does not seem to be the case for the cgdrag benchmark, and it is unclear why
  • The "guidance" part of the title is perhaps most relevant here, as this is less about the main FTorch library and more about how we enable users to use tools like pt2ts.py as part of a workflow involving FTorch
    • This is somewhat dependent on how familiar potential FTorch users are with the processes involved in saving to TorchScript
    • Note: trace_to_torchscript currently uses model freezing. It would be preferable to have a shared setting and/or behaviour, unless there is a clear reason to use freezing in only one of the functions
    • Any guidelines on trace_to_torchscript compared with script_to_torchscript would also be useful, as there is currently no clear motivation not to use the "default" script_to_torchscript
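
For concreteness, a minimal sketch of the test modification described above (the function name is an assumption for illustration; this is not the actual pt2ts.py code):

```python
# Sketch of the benchmark modification: instead of saving the scripted model
# directly, freeze it and save the frozen module. Not the actual pt2ts.py code.
import torch

def save_scripted(scripted_model: torch.jit.ScriptModule, filename: str) -> None:
    # Original behaviour:
    #   scripted_model.save(filename)
    # Modified behaviour used for the benchmarks:
    frozen_model = torch.jit.freeze(scripted_model)
    frozen_model.save(filename)
```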

@ElliottKasoar (Contributor)

ElliottKasoar commented Apr 9, 2024

  • Is there any reason not to freeze the scripted models in the same way we do the traced ones?
  • Do you feel we should make this an option for the user to set?
  • Do you feel it is better to have frozen (for performance, but with a warning to check accuracy in Fortran) or unfrozen (to ensure reproducibility) as a default?

I think generally we need to understand its behaviour a little better before answering most of this definitively, but my initial thoughts:

  1. Probably not, if it works
  2. Yes, I think so
  3. Probably unfrozen

My feeling is there are probably enough caveats/complications that unfrozen would be best as the default, so more or less anything a user throws at the library can work.

Ideally we'd probably then strongly encourage users looking to optimise inference to freeze their models, either via pt2ts, which is what I mostly used, or our interface (C++ docs are limited, and I'm not sure I tested that at all). Having the option built into FTorch seems like a useful thing to add though.
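
As a rough illustration of what exposing this as a user option with an unfrozen default might look like (the flag name, warning text, and script structure are assumptions, not an existing pt2ts.py option):

```python
# Hypothetical sketch of a user-facing freeze option with an unfrozen default.
import argparse
import warnings

import torch

def save_model(scripted_model: torch.jit.ScriptModule, filename: str, freeze: bool = False) -> None:
    if freeze:
        warnings.warn(
            "Saving a frozen model: check outputs against the original "
            "(differences of ~1e-6 have been observed after freezing)."
        )
        torch.jit.freeze(scripted_model).save(filename)
    else:
        scripted_model.save(filename)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--freeze", action="store_true",
                        help="Freeze the TorchScript model (may improve performance).")
    args = parser.parse_args()

    # Toy model standing in for a user-supplied one.
    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
    save_model(torch.jit.script(model), "saved_model.pt", freeze=args.freeze)
```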

There definitely seems to be some strange behaviour, e.g. trying to change the model device (see the discussion/linked issue here and the note at the end of this), the errors I had using the model with Forpy, the numerical differences, etc.

Some of those issues may be related, and none so far seem to be too challenging:

  • There is a work-around for moving the model (one possible approach is sketched after this list), and it may not be a problem for FTorch anyway
  • If it works for FTorch, we don't necessarily need to worry about forpy
  • Floating point arithmetic etc. shouldn't be expected to be exact
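
One possible way to sidestep moving a frozen model between devices is to load it directly onto the target device; this is an assumption for illustration and may not be the work-around referenced in the linked discussion:

```python
# Assumption/illustration: load the frozen TorchScript model directly onto the
# target device with map_location rather than moving it after loading. This may
# or may not be the work-around referenced in the linked discussion.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.load("frozen_model.pt", map_location=device)
```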

So arguably most of what we need to do is confirm and document these potential issues.

Once we have a handle on the limitations, I think generally we should be able to say that there's no reason not to freeze the model for inference.

@jatkinson1000 added the Hacktoberfest (Issues open for Hacktoberfest contributions) label Sep 30, 2024