
Model Freezing #113

Open

jatkinson1000 opened this issue Apr 8, 2024 · 2 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), Hacktoberfest (Issues open for Hacktoberfest contributions)

Comments

@jatkinson1000 (Member)

@ElliottKasoar Originally commented in #103:

  • Model freezing
    No changes are currently included. This is less directly applicable to the main FTorch library, although there are still interactions, e.g. freezing the model can allow InferenceMode to be enabled when the model is loaded.
    Freezing is currently the "default" when tracing in pt2ts.py, but not when scripting, despite potentially improving performance.
    Freezing appears to (sometimes) introduce numerical errors when saving and reloading a model (differences ~10^-6), and also seems to cause issues when loading with Forpy (see the sketch after this list).
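
A minimal sketch of the save/reload check implied above (this is not pt2ts.py itself; the toy model, filename, and tolerance are illustrative assumptions):

```python
# Sketch: freeze a scripted model, save and reload it, then compare outputs
# against the unfrozen model under InferenceMode. The toy model, filename and
# tolerance are assumptions for illustration, not FTorch code.
import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
example = torch.randn(1, 8)

scripted = torch.jit.script(model)
frozen = torch.jit.freeze(scripted)  # freezing requires a module in eval mode
frozen.save("frozen_model.pt")

reloaded = torch.jit.load("frozen_model.pt")
with torch.inference_mode():  # frozen models can be run under InferenceMode
    baseline = scripted(example)
    roundtrip = reloaded(example)

# Differences of ~10^-6 have been observed after freezing, so compare with a tolerance.
print(torch.allclose(baseline, roundtrip, atol=1e-6))
```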

Questions for @ElliottKasoar:

  • Is there any reason not to freeze the scripted models in the same way we do the traced ones?
  • Do you feel we should make this an option for the user to set?
  • Do you feel it is better to have frozen (for performance, but with a warning to check accuracy in Fortran) or unfrozen (to ensure reproducibility) as a default?
@jatkinson1000 added the enhancement (New feature or request) and good first issue (Good for newcomers) labels Apr 8, 2024
@jatkinson1000 (Member, Author)

Copying @ElliottKasoar's original comment from #81

  1. Model freezing
  • See: torch.jit.freeze
  • Applies system-independent optimization, as opposed to the system-dependent optimize_for_inference (which is currently broken, unfortunately)
  • Can give up to a 50% speedup
  • From our benchmarking (see the FTorch with/without gradients and/or frozen models sections), freezing the model can make more modest, but not insignificant, improvements in most cases
    • Tests were carried out by replacing scripted_model.save(filename) with frozen_model = torch.jit.freeze(scripted_model), followed by frozen_model.save(filename), in pt2ts.py (see the sketch after this list)
  • While not a problem for the use of FTorch, it is also worth noting that running the same TorchScript via Forpy (on CPU or GPU) seemed to give errors similar to those optimize_for_inference can give, e.g. AttributeError: 'RecursiveScriptModule' object has no attribute 'training'
  • Freezing the model appears to lead to numerical errors (~10^-6) for the ResNet benchmark, raising a RuntimeError when saving, but this does not seem to be the case for the cgdrag benchmark, and it is unclear why
  • The "guidance" part of the title is perhaps most relevant here, as this is less about the main FTorch library and more about how we enable users to use tools like pt2ts.py as part of a workflow involving FTorch
    • This is somewhat dependent on how familiar potential FTorch users are with the processes involved in saving to TorchScript
    • Note: trace_to_torchscript currently uses model freezing. It would be preferable to have a shared setting and/or behaviour, unless there is a clear reason to use freezing in only one of the functions
    • Any guidelines on trace_to_torchscript compared with script_to_torchscript would also be useful, as there is currently no clear motivation not to use the "default" script_to_torchscript
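
For concreteness, a minimal sketch of the test modification described above (the function name is an assumption for illustration; this is not the actual pt2ts.py code):

```python
# Sketch of the benchmark modification: instead of saving the scripted model
# directly, freeze it and save the frozen module. Not the actual pt2ts.py code.
import torch

def save_scripted(scripted_model: torch.jit.ScriptModule, filename: str) -> None:
    # Original behaviour:
    #   scripted_model.save(filename)
    # Modified behaviour used for the benchmarks:
    frozen_model = torch.jit.freeze(scripted_model)
    frozen_model.save(filename)
```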

@ElliottKasoar (Contributor)

ElliottKasoar commented Apr 9, 2024

  • Is there any reason not to freeze the scripted models in the same way we do the traced ones?
  • Do you feel we should make this an option for the user to set?
  • Do you feel it is better to have frozen (for performance, but with a warning to check accuracy in Fortran) or unfrozen (to ensure reproducibility) as a default?

I think generally we need to understand its behaviour a little better before answering most of this definitively, but my initial thoughts:

  1. Probably not, if it works
  2. Yes, I think so
  3. Probably unfrozen

My feeling is there are probably enough caveats/complications that unfrozen would be best as the default, so more or less anything a user throws at the library can work.

Ideally we'd probably then strongly encourage users looking to optimise inference to freeze their models, either via pt2ts, which is what I mostly used, or our interface (C++ docs are limited, and I'm not sure I tested that at all). Having the option built into FTorch seems like a useful thing to add though.
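
As a rough illustration of what exposing this as a user option with an unfrozen default might look like (the flag name, warning text, and script structure are assumptions, not an existing pt2ts.py option):

```python
# Hypothetical sketch of a user-facing freeze option with an unfrozen default.
import argparse
import warnings

import torch

def save_model(scripted_model: torch.jit.ScriptModule, filename: str, freeze: bool = False) -> None:
    if freeze:
        warnings.warn(
            "Saving a frozen model: check outputs against the original "
            "(differences of ~1e-6 have been observed after freezing)."
        )
        torch.jit.freeze(scripted_model).save(filename)
    else:
        scripted_model.save(filename)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--freeze", action="store_true",
                        help="Freeze the TorchScript model (may improve performance).")
    args = parser.parse_args()

    # Toy model standing in for a user-supplied one.
    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
    save_model(torch.jit.script(model), "saved_model.pt", freeze=args.freeze)
```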

There definitely seems to be some strange behaviour, e.g. trying to change the model device (see the discussion/linked issue here and the note at the end of this), the errors I had using the model with Forpy, the numerical differences, etc.

Some of those issues may be related, and none so far seem to be too challenging:

  • There is a work-around for moving the model (one possible approach is sketched after this list), and it may not be a problem for FTorch anyway
  • If it works for FTorch, we don't necessarily need to worry about forpy
  • Floating point arithmetic etc. shouldn't be expected to be exact
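
One possible way to sidestep moving a frozen model between devices is to load it directly onto the target device; this is an assumption for illustration and may not be the work-around referenced in the linked discussion:

```python
# Assumption/illustration: load the frozen TorchScript model directly onto the
# target device with map_location rather than moving it after loading. This may
# or may not be the work-around referenced in the linked discussion.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.load("frozen_model.pt", map_location=device)
```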

So arguably most of what we need to do is confirm and document these potential issues.

Once we have a handle on the limitations, I think generally we should be able to say that there's no reason not to freeze the model for inference.

@jatkinson1000 added the Hacktoberfest (Issues open for Hacktoberfest contributions) label Sep 30, 2024