Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve seednode monitoring #6432

Conversation

HenrikJannsen
Copy link
Collaborator

This is the seed node side part of the new monitoring concept.

Currently the monitor requests all data from seed nodes which is up to 10 MB and cause considerable load on the seeds as well as makes the monitoring less reliable.

Now we send from the seed nodes per clearnet the relevant reporting data (not the full data set, just the info we want to know, like number of messages) to a monitoring server.

The monitoring part will come in another PR for the monitor repo.

@ripcurlx ripcurlx added this to the v1.9.7 milestone Dec 1, 2022
@HenrikJannsen HenrikJannsen force-pushed the improve-seednode-monitoring branch from bc8059b to 3127649 Compare December 5, 2022 18:23
Provide full hash without truncating

Signed-off-by: HenrikJannsen <[email protected]>
Helps to speed up the build.

Signed-off-by: HenrikJannsen <[email protected]>
Signed-off-by: HenrikJannsen <[email protected]>
Signed-off-by: HenrikJannsen <[email protected]>
Remove log, change log level.
Add getters

Signed-off-by: HenrikJannsen <[email protected]>
Signed-off-by: HenrikJannsen <[email protected]>
@HenrikJannsen HenrikJannsen force-pushed the improve-seednode-monitoring branch from 3276730 to f4775f8 Compare December 7, 2022 18:23
ghost
ghost previously approved these changes Dec 8, 2022
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

Use newCachedThreadPool instead of getThreadPoolExecutor

Utilities.getThreadPoolExecutor use a BlockingQueue which prevents the intended
behaviour to increase the pool size to the max value.

Signed-off-by: HenrikJannsen <[email protected]>
Copy link
Contributor

@alejandrogarcia83 alejandrogarcia83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@alejandrogarcia83 alejandrogarcia83 merged commit 67f4d34 into bisq-network:master Dec 9, 2022
@HenrikJannsen HenrikJannsen deleted the improve-seednode-monitoring branch December 11, 2022 16:13
@alvasw
Copy link
Contributor

alvasw commented Dec 11, 2022

Hi @HenrikJannsen,

I was asked to review this PR. First, thank you for your awesome PR! This patch looks promising and will certainly helps us improve Bisq's reliability. I found a theoretical bug that is unlikely to happen.

My comments:

  1. You decided to directly handoff the tasks to available threads by passing a SynchronousQueue to the ThreadPoolExecutor:

public static ExecutorService newCachedThreadPool(int maximumPoolSize) {
return new ThreadPoolExecutor(0, maximumPoolSize,
60, TimeUnit.SECONDS,
new SynchronousQueue<>());
}

A queue attempt could fail if no threads are immediately available because the maximumPoolSize is set to 5. You could handle this case by passing a RejectedExecutionHandler to the ThreadPoolExecutor.

JDK docs [1]:
"Direct handoffs. [...] an attempt to queue a task will fail if no threads are immediately available to run it [...] Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. [...]"

The cached thread pool in the JDK sets its maximumPoolSize to Integer.MAX_VALUE to bypass this.
OpenJDK 11 - Executors.java:

public static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue());
    }
  1. I think it makes sense to keep the connection open by specifiying the Connection: Keep-Alive header [2]. The HttpClient in the JDK tries cache connection [3]. Another advantage is that the webserver doesn't need to guess it anymore.

  2. The HttpClient caches/reuses connections already (default executor) and you can pass the Executor to it by calling HttpClient.Builder.executor​(Executor executor). Afterwards, you can call client.sendAsync(request, BodyHandlers.ofString()) to get CompletableFuture<HttpResponse<T>> back.

  3. What do you think about increasing the keep alive of the ThreadPoolExecutor to 3 - 8 minutes? Now, a new thread gets created almost each time and the ThreadPoolExecutor is not really used. The seed node is sending a heart beat every minute, a data report every 5 minutes, and a block report roughly every 10 minutes.

  4. I'm not sure about this one. Renaming the classes in bisq.core.monitor from DoubleValueItem to DoubleValueItemTypeor DataStatisticType would make the code easier to read. When I started my review, I thought that aDoubleValueItem is a generic Double holder until I realized it contains all types of messages (sentBytesPerSec, receivedBytesPerSec, ...). The same applies to remaining classe in the package.

  5. The key and group field in DoubleValueItem can be final. Same applies to the other classes.

[1] https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Keep-Alive
[3] https://docs.oracle.com/javase/8/docs/technotes/guides/net/http-keepalive.html

@HenrikJannsen
Copy link
Collaborator Author

HenrikJannsen commented Dec 11, 2022

@alvasw

Great thanks for the review und suggestions! Will make a new PR for it. Yes agree the naming of the valueItems is not good. Will try to find a better name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants