Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor memory optimisations for ArC #18355

Closed
wants to merge 7 commits into from
Closed

Conversation

Sanne
Copy link
Member

@Sanne Sanne commented Jul 2, 2021

This doesn't quite resolve #18345 yet but it's a good step in that direction, and already reduces the memory cost a fair bit (~30% in my test)

@Sanne Sanne added the kind/enhancement New feature or request label Jul 2, 2021
@Sanne Sanne requested a review from mkouba July 2, 2021 14:55
@quarkus-bot
Copy link

quarkus-bot bot commented Jul 2, 2021

This workflow status is outdated as a new workflow run has been triggered.

Failing Jobs - Building 6f262c9

Status Name Step Test failures Logs Raw logs
Initial JDK 11 Build Build ⚠️ Check → Logs Raw logs

@gsmet
Copy link
Member

gsmet commented Jul 2, 2021

@Sanne formatting issue in one of your commit:

Error: Failed to execute goal net.revelc.code.formatter:formatter-maven-plugin:2.16.0:validate (default) on project arc: File '/home/runner/work/quarkus/quarkus/independent-projects/arc/runtime/src/main/java/io/quarkus/arc/RemovedBean.java' has not been previously formatted. Please format file and commit before running validation! -> [Help 1]

@Sanne
Copy link
Member Author

Sanne commented Jul 2, 2021

thanks :) I guess that will teach me from adding javadoc last minute :D

@Sanne Sanne force-pushed the ArCRemovedBeans branch from 6f262c9 to 88a8f6e Compare July 2, 2021 19:29
@quarkus-bot quarkus-bot bot added area/arc Issue related to ARC (dependency injection) area/vertx labels Jul 2, 2021
@Sanne
Copy link
Member Author

Sanne commented Jul 2, 2021

hopefully fixed

@quarkus-bot
Copy link

quarkus-bot bot commented Jul 2, 2021

Failing Jobs - Building 88a8f6e

Status Name Step Test failures Logs Raw logs
JVM Tests - JDK 11 Build Test failures Logs Raw logs
✔️ JVM Tests - JDK 16

Full information is available in the Build summary check run.

Test Failures

⚙️ JVM Tests - JDK 11 #

📦 extensions/vertx-http/deployment

io.quarkus.vertx.http.testrunner.TestChangeTrackingWhenStartFailsTestCase.testChangeTrackingOnStartupFailure line 69 - More details - Source on GitHub

Copy link
Contributor

@geoand geoand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 from me, but ultimately up to @mkouba

@mkouba
Copy link
Contributor

mkouba commented Jul 7, 2021

I need to take a look at those changes, hopefully this afternoon. Speaking of ~30% in my test... that's quite a number, @Sanne could you share your test as well or at least some setup info (number of beans, etc.) so that we can guess what impact those individual optimizations have?

Copy link
Contributor

@mkouba mkouba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like most of the changes, except for the RemovedBean modifications and some LinkedList replacements that remind me premature optimizations ;-).

I'll try to cherry-pick some commits from this PR and propose an alternative PR so that we could compare the results using Sanne's test and keep the previous version API if possible.

*/
public Set<Type> getTypes();
@Deprecated(forRemoval = true)
Set<Type> getTypes();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to disagree with this modification. A Set is much more useful than an Iterable and corresponds to the javax.enterprise.inject.spi.BeanAttributes.getTypes() methods.

Maybe we could remove the qualifiers() and types() and change the boolean matchesType(Type requiredType) signature to boolean matches(Type requiredType, Annotation... qualifiers). This method could be optimized and used in ArcContainer whereas the getTypes() and getQualifiers() could possibly create a new set for every invocation from any data structure used internally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well but think of your clients.. this code is only used by that little utility which dumps a nice error message when a removed bean shouldn't have been removed: we surely can handle a little extra complexity there since it's not "public" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW regarding style.. ArC is consistently exposing Set(s) which often requires wrapping in immutable sets, sometimes it requires full copies. And iterators aren't free either.

In general it would be more efficient to internalize the logic into the host which controls the collection, like I did by introducing the specialized matchesType method here.

The drawback is of course the specialization.. far less flexible and it only works if you know very well what the purpose of the iteration is going to be.

So I tend to make such changes for internal/private components, but sure I understand you can't do it for the ArC public APIs as a general rule. Still useful to consider exposing some specialized methods though, for the most common cases can be implemented more efficiently.

private final Set<Type> types;
private final Set<Annotation> qualifiers;
private final Type[] types;
private final Annotation[] qualifiers;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that most of the beans use the default qualifiers (aka the shared set), although many of the removed beans in any quickstart will be [@Any, @ConfigProperty] (because SR Config registers a dozen of config producer methods).

I believe that we could modify the io.quarkus.arc.processor.ComponentsProviderGenerator to share all sets of qualifiers with no runtime penalty.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sharing sets would be nice, but is it worth it for the removed beans? By using an array we already reduce the retained memory quite a bit, not sure if it's worth spending more time in trying to safe some KBs as this is all just references overhead.

If you have such ideas, would be nice to apply the optimisation to the non-removed beans? I focused on the removed ones as I felt it was odd to spend significant memory just to provide that error message (and because it was simpler for me to get a full picture of the use cases), but there's certainly more to be won in the area of the actually active beans.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have such ideas, would be nice to apply the optimisation to the non-removed beans?

+1

@@ -155,13 +157,14 @@ private void putContext(InjectableContext context) {
if (values == null) {
contexts.put(context.getScope(), Collections.singleton(context));
} else {
List<InjectableContext> multi = new LinkedList<>(values);
List<InjectableContext> multi = new ArrayList<>(values.size() + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1. This branch is only used if multiple contexts are registered for the same scope annotation, which is possible but rather rare in case of quarkus. Also what's wrong with a LinkedList? I seriously doubt that this change would cause a noticable difference...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LinkedLists are a very poor choice, especially if we care about the memory usage: you need an additional object to wrap each element in the list. And it just doesn't have any benefits.

@@ -610,17 +613,16 @@ private static Integer getAlternativePriority(InjectableBean<?> bean) {
}

List<InjectableBean<?>> getMatchingBeans(Resolvable resolvable) {
List<InjectableBean<?>> matching = new LinkedList<>();
List<InjectableBean<?>> matching = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, what's wrong with the LinkedList here? The list of matching beans is expected to be very small in 95% of cases. Moreover, the result of this method is cached for majority of usages (incl. programmatic lookup), only BeanManager.getBeans(Type, Annotation...) does not leverage the cache.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure the benefit is really small here, but there's a benefit. More importantly, nobody else is using LinkedList so this code is forcing us to a load a class which is otherwise dead.

Besides, the ArrayList does have better iteration performance and also consumes less memory.

I would certainly not have bothered you with this change alone - not worth it. But in the bigger picture it makes sense to consistently use the better lists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More importantly, nobody else is using LinkedList so this code is forcing us to a load a class which is otherwise dead.

Are you sure? I found many references in many projects and libs used in quarkus (jackson, jaeger, graphql, neo4j, vertx, etc.). That said I'm not a big fan of this data structure but it obviously has some pros and cons.

Besides, the ArrayList does have better iteration performance and also consumes less memory.

Agreed but given the small number of elements I'd guess that the foreach loop would take O(n) for both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure?

Well no, of course, and it would depend on user code as well.

But if I diagnose the CRUD quickstart, there's only this one an a single use in Narayana. It made me think that it was within reach to avoid it completely in a good amount of cases :)

And of course I won't spend an hour to fix Narayana for this purpose alone, but I'm making a list of optimisations for Narayana and will then send a PR fixing them all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArrayList is almost always better than a LinkedList. If you look at memory footprint, forward/backward iteration, random access, and even modifications. Modifying a linked list (especially prepending or inserting into the middle) is theoretically faster than an array in certain computation models (random-access memory, where pointer chasing is not an issue), but in the real world (with memory hierarchy and highly optimized memory copying), arrays win there as well. There may be some cases where a linked list is better in practice, but I believe it's safe to always start with an ArrayList.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LinkedList doesn't make a lot of sense to use all you plan to do is append entries and later on iterate through the results. You are paying the cost of creating new nodes and then traversing them for no benefit whatsoever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, guys I get your point! ;-). Nobody questions that ArrayList is the best "default choice". My point is that changes like these look like nitpicking and have no real impact on the performance/memory consumption in this particular use case. Well, we could save few hundreds of bytes in some special cases... Is worth the effort? That said, I'm going to incorporate these LinkedList changes in the "cumulative" PR...

@Sanne
Copy link
Member Author

Sanne commented Jul 8, 2021

I need to take a look at those changes, hopefully this afternoon. Speaking of ~30% in my test... that's quite a number, @Sanne could you share your test as well or at least some setup info (number of beans, etc.) so that we can guess what impact those individual optimizations have?

to be clear I meant ~30% less for representing this specific data structure, not of the whole application. In terms of tests, when I worked on this I was looking at the hibernate-orm-quickstart in our quickstarts repository.

@mkouba mkouba mentioned this pull request Jul 9, 2021
@Sanne Sanne closed this Jul 16, 2021
@quarkus-bot quarkus-bot bot added the triage/invalid This doesn't seem right label Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/arc Issue related to ARC (dependency injection) area/vertx kind/enhancement New feature or request triage/invalid This doesn't seem right
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid loading the list of ArC removed beans into the heap
5 participants