Performance improvements to queue(s) management in Webserver #2704
Signed-off-by: Santiago Pericasgeertsen <[email protected]>
…e connections. Some copyright fixes. Signed-off-by: Santiago Pericasgeertsen <[email protected]>
IndirectReference(T referent, ReferenceQueue<? super T> q, R otherRef) {
    super(referent, q);
    this.otherRef.lazySet(otherRef);
Why lazySet?
Then it is a normal store (no expensive full barrier); think mov mm, r vs lock: mov mm, r. In this case it is sufficient, because IndirectReference is published safely (it becomes reachable from other threads only as a result of some operation that has full barriers).
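A minimal sketch of the point above, not code from this PR: Holder and SafePublication are hypothetical names, and a ConcurrentLinkedQueue stands in for "some operation that has full barriers". AtomicReference.lazySet is documented to have the memory effects of VarHandle.setRelease, so the constructor store is ordered but skips the trailing fence that set() implies.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder whose field is initialized with lazySet in the constructor. */
final class Holder<R> {
    private final AtomicReference<R> value = new AtomicReference<>();

    Holder(R initial) {
        // Release store: ordered after earlier writes, but without the
        // full fence that a volatile set() would add.
        value.lazySet(initial);
    }

    R get() {
        return value.get();
    }
}

final class SafePublication {
    /** Builds a Holder on one thread and reads it on another via a concurrent queue. */
    static String publishAndRead(String resource) {
        ConcurrentLinkedQueue<Holder<String>> shared = new ConcurrentLinkedQueue<>();
        // offer() is the publishing operation (an assumption of this sketch);
        // the Holder is unreachable from other threads until it completes.
        Thread producer = new Thread(() -> shared.offer(new Holder<>(resource)));
        producer.start();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Holder<String> holder = shared.poll();
        return holder == null ? null : holder.get();
    }
}
```

If the object could escape before construction finished, or were handed over through a data race, the weaker store would no longer be enough on its own.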
Actually, I am not sure that the atomicity of the reference is even required here.
All accesses are meant to be single-threaded in the intended use. It is atomic only for the purpose of potential other uses.
Maybe not needed, but seems like the safer option.
I understand all apart from the removal of ChannelHandlerContext ctx from HttpRequestScopedPublisher.
Also, I would do the handling of failPublisher for 4xx errors as a separate commit, but separating that is not essential.
As far as the reference queue handling is concerned, it is good to ship.
That context is no longer used in the publisher.
Cool, thx.
… DataChunk. Signed-off-by: Santiago Pericasgeertsen <[email protected]>
 * this collection that cannot be fully released (some buffers still in
 * use) will be added to {@code unreleasedQueues} for later retries.
 */
private final ReferenceQueue<Object> queues = new ReferenceQueue<>();
Just being careful: is Object the type argument you want to pass here?
Correct. That type-parameter is silly. (Reads: I don't get it) It is meant to represent the type of element returned by References enqueued in ReferenceQueue. But by the time a reference is enqueued the element is no longer reachable, as determined by GC. So...
What would make more sense as a type parameter for the ReferenceQueue is the class of Reference that is going to be returned by poll.
Well, I don't know about silly. Would ? super Reference<?> or something like that work better here?
No.
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/ref/ReferenceQueue.html#poll() - ReferenceQueue<T>.poll() returns Reference<? extends T>. I am not interested in what T is, because that value is no longer accessible. I am only interested in the type of Reference. If it is something that has a method to release whatever resource it is associated with (i.e. instanceof IndirectReference in this case), I am going to call it - end of story.
I'd rather have a guarantee that I will not encounter any other type of Reference here, so that I don't have to type-cast. :)
I'd prefer:
public class ReferenceQueue<T extends Reference> {
    ...
    public T poll() {...}
}
Then I'd be able to declare ReferenceQueue<IndirectReference<?, ReferenceQueue<IndirectReference<?, DataChunk>>>> queues and get IndirectReference<?, ReferenceQueue<IndirectReference<?, DataChunk>>> r = queues.poll() and ReferenceQueue<IndirectReference<?, DataChunk>> rq = r.acquire(), and then IndirectReference<?, DataChunk> rr = rq.poll() and even DataChunk dc = rr.acquire() and dc.release() (gosh, this is actually what this is for).
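For illustration only, the wished-for typing can be approximated today with a thin wrapper that performs the cast in one place. This is a sketch, not part of the JDK or of this PR: TypedReferenceQueue and ReleasableRef are hypothetical names, and the wrapper only narrows the static type. It cannot stop other code from registering a different Reference class on the raw queue, in which case poll() throws ClassCastException.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

/** Hypothetical reference type that knows how to release its resource. */
final class ReleasableRef extends PhantomReference<Object> {
    private boolean released;

    ReleasableRef(Object owner, ReferenceQueue<? super Object> q) {
        super(owner, q);
    }

    void release() {
        released = true;
    }

    boolean isReleased() {
        return released;
    }
}

/** Hypothetical wrapper: parameterized by the Reference class, not by the referent. */
final class TypedReferenceQueue<X extends Reference<?>> {
    private final ReferenceQueue<Object> delegate = new ReferenceQueue<>();
    private final Class<X> type;

    TypedReferenceQueue(Class<X> type) {
        this.type = type;
    }

    /** The underlying queue, to be passed to Reference constructors. */
    ReferenceQueue<Object> raw() {
        return delegate;
    }

    /** Like ReferenceQueue.poll(), but the result is already cast to X. */
    X poll() {
        Reference<?> ref = delegate.poll();
        return ref == null ? null : type.cast(ref);
    }
}
```

In a test, Reference.enqueue() can stand in for the garbage collector, so the round trip through the queue can be checked deterministically.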
Think of it this way: IndirectReference is about two things: some owner of type T that actually does not matter, and a resource R that is temporarily handed off to the owner of type T. The owner has the obligation to always terminate in a predictable path where it hands off the resource R back to some resource pool.
IndirectReference then is a safety net catching the cases when the owner T fails to fulfil the obligation (Entscheidungsproblem and a Turing-complete language), and dies before returning resource R. In this case IndirectReference ends up in the ReferenceQueue, and the queue processing routine is able to return resource R back to the pool. Note that only the owner of type T is referenced phantomly. The resource R remains strongly reachable, at least through IndirectReference - unless, of course, the owner does fulfil the obligation, and returns the resource, first reclaiming it through IndirectReference.acquire().
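The safety-net behaviour described above can be sketched as follows. The constructor mirrors the snippet quoted earlier in this review; everything else is an assumption made for illustration: the body of acquire() (here a getAndSet(null) so the resource is handed out at most once), the Reclaimer helper, and the Consumer standing in for the resource pool.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Phantom reference to an owner T that keeps a resource R strongly reachable. */
final class IndirectReference<T, R> extends PhantomReference<T> {
    private final AtomicReference<R> otherRef = new AtomicReference<>();

    IndirectReference(T referent, ReferenceQueue<? super T> q, R otherRef) {
        super(referent, q);
        this.otherRef.lazySet(otherRef);
    }

    /** Assumed semantics: hands out the resource once, then returns null. */
    R acquire() {
        return otherRef.getAndSet(null);
    }
}

/** Hypothetical queue-processing routine. */
final class Reclaimer {
    /** Drains the queue and gives every resource still held back to the pool. */
    @SuppressWarnings("unchecked")
    static <R> int reclaim(ReferenceQueue<Object> queue, Consumer<R> pool) {
        int reclaimed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            if (ref instanceof IndirectReference) {
                R resource = ((IndirectReference<?, R>) ref).acquire();
                // null means the owner already returned the resource itself.
                if (resource != null) {
                    pool.accept(resource);
                    reclaimed++;
                }
            }
        }
        return reclaimed;
    }
}
```

A well-behaved owner calls acquire() and returns the resource itself, so the queue routine finds nothing left to do for it. Only an owner that died without doing so leaves a non-null resource behind.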
Having explained all this, I think more work is needed.
The focus has been to reduce the cost of maintaining resources for well-terminating responses. But we need to consider what happens to strong references when the channel is closed (channelInactive fired) and there are unfinished responses.
    }
    unreleasedQueues.removeIf(ReferenceHoldingQueue::release);
} finally {
    clearLock.lazySet(false);
Just being careful: lazySet is designed for very specialized use cases. Are you sure set isn't the right choice here?
Yes, this is one case where lazySet makes sense.
We only have a lock that can either be taken immediately, or the contender goes away:
https://github.com/oracle/helidon/pull/2704/files/06c459f90972ba6a14a500bfaea7cd4b91beb3cb#diff-a0cac19f24ae85fbc99afd8c0b4e8375d3c036a0ad949c68befe56fb7346461dR100 - that yellow line
In this case we only need to ensure the get() and compareAndSet on that line synchronize-with this lazySet, which they do.
We don't need to "publish" any other changes, as queue poll and removeIf are thread-safe in their own right.
In this case you can even go as weak as VarHandle.setOpaque.
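A minimal sketch of that take-it-or-leave pattern, assuming clearLock is an AtomicBoolean as the snippet suggests. The class name TryLockCleaner and the Runnable parameter are hypothetical, and the linked line is not reproduced here.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper around a non-blocking, non-reentrant try-lock. */
final class TryLockCleaner {
    private final AtomicBoolean clearLock = new AtomicBoolean();

    /**
     * Runs the cleanup if nobody else is running it.
     * A contender never waits: it returns false immediately.
     */
    boolean clearQueues(Runnable cleanup) {
        // Cheap read first, then the CAS that actually takes the lock.
        if (clearLock.get() || !clearLock.compareAndSet(false, true)) {
            return false;
        }
        try {
            cleanup.run();
            return true;
        } finally {
            // Release store: a later get()/compareAndSet that observes
            // false also observes the work done while the lock was held.
            clearLock.lazySet(false);
        }
    }
}
```

Because a losing thread simply leaves, nobody spins on the flag. A slightly delayed visibility of the release only means a later caller may skip one cleanup pass.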
The Webserver keeps track of a queue of queues to release buffers back to Netty. Every new request that comes in creates a queue to track the Netty buffers. The cost of removing one of these queues when no longer needed is O(N) where N is the number of active connections. When N is large (e.g. 16K) this housekeeping operation can be costly.
This PR uses a different approach that avoids the O(N) removal operation. It uses phantom references and lets the GC gather the queues when the associated publisher becomes ready for collection. A clearQueues method is called periodically to clean up the queue of queues. Thanks to @olotenko.
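The description above, together with the unreleasedQueues comment quoted in the review, suggests roughly the following shape. This is a sketch under stated assumptions, not the PR's implementation: RequestQueue is a hypothetical stand-in for ReferenceHoldingQueue (only a boolean release() is assumed), QueueRef stands in for IndirectReference, and the buffer counter is invented for the example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical per-request queue; release() reports whether all buffers are free. */
final class RequestQueue {
    private int buffersInUse;

    RequestQueue(int buffersInUse) {
        this.buffersInUse = buffersInUse;
    }

    void bufferReturned() {
        if (buffersInUse > 0) {
            buffersInUse--;
        }
    }

    boolean release() {
        return buffersInUse == 0;
    }
}

/** Phantom reference to a publisher that keeps its request queue strongly reachable. */
final class QueueRef extends PhantomReference<Object> {
    final RequestQueue queue;

    QueueRef(Object publisher, ReferenceQueue<Object> q, RequestQueue queue) {
        super(publisher, q);
        this.queue = queue;
    }
}

final class QueueOfQueues {
    final ReferenceQueue<Object> queues = new ReferenceQueue<>();
    final ConcurrentLinkedQueue<RequestQueue> unreleasedQueues = new ConcurrentLinkedQueue<>();

    /** Processes only what the GC has enqueued; no scan over live connections. */
    void clearQueues() {
        Reference<?> ref;
        while ((ref = queues.poll()) != null) {
            RequestQueue q = ((QueueRef) ref).queue;
            if (!q.release()) {
                unreleasedQueues.add(q); // some buffers still in use: retry later
            }
        }
        unreleasedQueues.removeIf(RequestQueue::release);
    }
}
```

The cost of a pass is proportional to the number of references the GC has enqueued plus the queues still waiting for a retry, rather than to the number of active connections. One detail the sketch leaves out is who keeps each QueueRef reachable until it is enqueued. A Reference object that is itself collected is never enqueued.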