-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor memory optimisations for ArC #18355
Conversation
@Sanne formatting issue in one of your commit:
|
thanks :) I guess that will teach me from adding javadoc last minute :D |
hopefully fixed |
Failing Jobs - Building 88a8f6e
Full information is available in the Build summary check run. Test Failures⚙️ JVM Tests - JDK 11 #📦 extensions/vertx-http/deployment✖ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 from me, but ultimately up to @mkouba
I need to take a look at those changes, hopefully this afternoon. Speaking of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like most of the changes, except for the RemovedBean
modifications and some LinkedList
replacements that remind me premature optimizations ;-).
I'll try to cherry-pick some commits from this PR and propose an alternative PR so that we could compare the results using Sanne's test and keep the previous version API if possible.
*/ | ||
public Set<Type> getTypes(); | ||
@Deprecated(forRemoval = true) | ||
Set<Type> getTypes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to disagree with this modification. A Set is much more useful than an Iterable and corresponds to the javax.enterprise.inject.spi.BeanAttributes.getTypes()
methods.
Maybe we could remove the qualifiers()
and types()
and change the boolean matchesType(Type requiredType)
signature to boolean matches(Type requiredType, Annotation... qualifiers)
. This method could be optimized and used in ArcContainer
whereas the getTypes()
and getQualifiers()
could possibly create a new set for every invocation from any data structure used internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well but think of your clients.. this code is only used by that little utility which dumps a nice error message when a removed bean shouldn't have been removed: we surely can handle a little extra complexity there since it's not "public" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW regarding style.. ArC is consistently exposing Set(s) which often requires wrapping in immutable sets, sometimes it requires full copies. And iterators aren't free either.
In general it would be more efficient to internalize the logic into the host which controls the collection, like I did by introducing the specialized matchesType
method here.
The drawback is of course the specialization.. far less flexible and it only works if you know very well what the purpose of the iteration is going to be.
So I tend to make such changes for internal/private components, but sure I understand you can't do it for the ArC public APIs as a general rule. Still useful to consider exposing some specialized methods though, for the most common cases can be implemented more efficiently.
private final Set<Type> types; | ||
private final Set<Annotation> qualifiers; | ||
private final Type[] types; | ||
private final Annotation[] qualifiers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that most of the beans use the default qualifiers (aka the shared set), although many of the removed beans in any quickstart will be [@Any, @ConfigProperty]
(because SR Config registers a dozen of config producer methods).
I believe that we could modify the io.quarkus.arc.processor.ComponentsProviderGenerator
to share all sets of qualifiers with no runtime penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sharing sets would be nice, but is it worth it for the removed beans? By using an array we already reduce the retained memory quite a bit, not sure if it's worth spending more time in trying to safe some KBs as this is all just references overhead.
If you have such ideas, would be nice to apply the optimisation to the non-removed beans? I focused on the removed ones as I felt it was odd to spend significant memory just to provide that error message (and because it was simpler for me to get a full picture of the use cases), but there's certainly more to be won in the area of the actually active beans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have such ideas, would be nice to apply the optimisation to the non-removed beans?
+1
@@ -155,13 +157,14 @@ private void putContext(InjectableContext context) { | |||
if (values == null) { | |||
contexts.put(context.getScope(), Collections.singleton(context)); | |||
} else { | |||
List<InjectableContext> multi = new LinkedList<>(values); | |||
List<InjectableContext> multi = new ArrayList<>(values.size() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1. This branch is only used if multiple contexts are registered for the same scope annotation, which is possible but rather rare in case of quarkus. Also what's wrong with a LinkedList
? I seriously doubt that this change would cause a noticable difference...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LinkedLists are a very poor choice, especially if we care about the memory usage: you need an additional object to wrap each element in the list. And it just doesn't have any benefits.
@@ -610,17 +613,16 @@ private static Integer getAlternativePriority(InjectableBean<?> bean) { | |||
} | |||
|
|||
List<InjectableBean<?>> getMatchingBeans(Resolvable resolvable) { | |||
List<InjectableBean<?>> matching = new LinkedList<>(); | |||
List<InjectableBean<?>> matching = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, what's wrong with the LinkedList
here? The list of matching beans is expected to be very small in 95% of cases. Moreover, the result of this method is cached for majority of usages (incl. programmatic lookup), only BeanManager.getBeans(Type, Annotation...)
does not leverage the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure the benefit is really small here, but there's a benefit. More importantly, nobody else is using LinkedList so this code is forcing us to a load a class which is otherwise dead.
Besides, the ArrayList does have better iteration performance and also consumes less memory.
I would certainly not have bothered you with this change alone - not worth it. But in the bigger picture it makes sense to consistently use the better lists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More importantly, nobody else is using LinkedList so this code is forcing us to a load a class which is otherwise dead.
Are you sure? I found many references in many projects and libs used in quarkus (jackson, jaeger, graphql, neo4j, vertx, etc.). That said I'm not a big fan of this data structure but it obviously has some pros and cons.
Besides, the ArrayList does have better iteration performance and also consumes less memory.
Agreed but given the small number of elements I'd guess that the foreach
loop would take O(n)
for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure?
Well no, of course, and it would depend on user code as well.
But if I diagnose the CRUD quickstart, there's only this one an a single use in Narayana. It made me think that it was within reach to avoid it completely in a good amount of cases :)
And of course I won't spend an hour to fix Narayana for this purpose alone, but I'm making a list of optimisations for Narayana and will then send a PR fixing them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ArrayList
is almost always better than a LinkedList
. If you look at memory footprint, forward/backward iteration, random access, and even modifications. Modifying a linked list (especially prepending or inserting into the middle) is theoretically faster than an array in certain computation models (random-access memory, where pointer chasing is not an issue), but in the real world (with memory hierarchy and highly optimized memory copying), arrays win there as well. There may be some cases where a linked list is better in practice, but I believe it's safe to always start with an ArrayList
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LinkedList
doesn't make a lot of sense to use all you plan to do is append entries and later on iterate through the results. You are paying the cost of creating new nodes and then traversing them for no benefit whatsoever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, guys I get your point! ;-). Nobody questions that ArrayList
is the best "default choice". My point is that changes like these look like nitpicking and have no real impact on the performance/memory consumption in this particular use case. Well, we could save few hundreds of bytes in some special cases... Is worth the effort? That said, I'm going to incorporate these LinkedList
changes in the "cumulative" PR...
to be clear I meant ~30% less for representing this specific data structure, not of the whole application. In terms of tests, when I worked on this I was looking at the |
This doesn't quite resolve #18345 yet but it's a good step in that direction, and already reduces the memory cost a fair bit (~30% in my test)