-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For performance, consider accessing user supplied context by reference #298
Comments
Can you make a PR for the benches you've added (although I would keep the current That's a pretty huge rewrite of the renderer though and it would need to pass all the tests (currently 24 failing) before I take a good look at it though. Is rewriting the whole renderer necessary or is mostly related to for loops? |
Maybe a more skillful simple replacement could be done. I tried first to simply replace Value with &'a Value and just could not make it work. I think RefOrOwned concept, which may exist elsewhere in rust as standard, is required to stop deep cloning and that required an extra class. That is, the work could not be done in Renderer::render because you need that main function to hold ref to user context to ensure lifetime is long enough struct doing for real work, which I called AstProcessor. I understand your reservations and have them also. I'll be glad to try to get those tests working - I just wanted some thoughts before continuing. I will also make a PR for the benches leaving original |
I've got almost all the tests working. I believe the only ones not working are related to error processing where the tests look for specific error text. For some, I updated the required text to match slightly new output. I imagine the text of errors is not considered part of the api? Anyway, I never really acclimated to rust error_chain; Honestly I find it difficult to use/understand. I believe the updates will be faster for many/most renderings that involve any significant context data. For very small contexts in tight loops it might be worse. Here are new benchmarks vs current as run on my macbook. For my specific use case it takes from 2,000ms down to under 100ms - but I may be doing something atypical in my templates. @Keats - Let me know what you think: less_clone
Original
|
It's interesting that looking at the benchmarks it only really affects the loop so that's really the bottleneck, nothing else. I don't think I will have time this week but I should be able to try having a look on my end next week.
It is not. I'll probably switch from |
I think it is more than the loop. Maybe an example of "correlation is not causation"? I believe the issue really is the combination of (1) storing by value and (2) using Also, originally I had assumed it was Rocket that was the culprit. The reason is I was timing my rocket route handler and it was fast (millis) yet it took two seconds to make it to browser - so I assumed rocket middleware on transit. It turns out that the request to render in rocket really only deferred the call to tera rendering until after return of the route. Anyway, I was discussing some with Rocket community and in the process I created this gist: https://gist.github.com/patefacio/d861650e7297b4d27c6e0c591a43cb12 If you look at the last file you'll see an image in the call stack and when I rummaged through it a very large percentage of time was in clone. That is why I think it is more than looping. Of course, with looping it will show big because it does clone there as well. tera_less_clone
original
|
Let me know when everything is working and i'll have a look, that does look promising |
Sorry for long delay - vacation and stuff. All tests pass and although some tests I changed slightly, I believe they capture the same intent. I look forward to comments/suggestions. |
no worries. |
The code has been released with 0.11.13, thanks a lot! |
In this branch (https://github.com/patefacio/tera/tree/less_clone) I've updated the Renderer code to, where possible, not do deep copies (ie clones). It required some refactor. The strategy was the Renderer keeps a borrow of user supplied context and passes that on to an AstProcessor. This way an instance of a class is not required to have reference to another field in same struct - which I think is impossible in safe rust. I added a RefOrOwned class to treat an immutable ref and a owned type transparently. Then the actual processor can clone those RefOrOwned instances and it is a large pointer copy. I have not finished all features yet so I have more work to do, but the initial results are promising.
Looking for comments, suggestions.
Perhaps my usage is pathological; I have a large context to generate this html: (https://patefacio.github.io/plusauri.github.io/html/forecast/this_just_in_bogle_sample.plusauri.as_modeled.uai_true.html)
For my particular template the performance change is stark: (durations in milliseconds).
7 matches for "took" in buffer: plus_release_server.out
2400:Render (less_clone) for template took 67: For template (forecast) with parents ([])
4867:Render (less_clone) for template took 62: For template (forecast) with parents ([])
7332:Render (less_clone) for template took 56: For template (forecast) with parents ([])
9797:Render (less_clone) for template took 59: For template (forecast) with parents ([])
12262:Render (less_clone) for template took 65: For template (forecast) with parents ([])
14727:Render (less_clone) for template took 59: For template (forecast) with parents ([])
17192:Render (less_clone) for template took 58: For template (forecast) with parents ([])
8 matches for "took" in buffer: plus_release_server.out
2:Render (tera_ref) for template took 1909: For template (forecast) with parents ([])
70:Render (tera_ref) for template took 2105: For template (forecast) with parents ([])
137:Render (tera_ref) for template took 1999: For template (forecast) with parents ([])
204:Render (tera_ref) for template took 2003: For template (forecast) with parents ([])
271:Render (tera_ref) for template took 1931: For template (forecast) with parents ([])
338:Render (tera_ref) for template took 2062: For template (forecast) with parents ([])
405:Render (tera_ref) for template took 2073: For template (forecast) with parents ([])
472:Render (tera_ref) for template took 2034: For template (forecast) with parents ([])
The text was updated successfully, but these errors were encountered: