Slow clearance level aggregation #2171

Closed
originalsouth opened this issue Dec 20, 2023 · 2 comments
Labels
backend, bug (Something isn't working)

Comments

@originalsouth
Contributor

On an up-to-date Arch Linux (with the linux-zen kernel), perform the following:

  1. Fresh install (make reset && make kat)
  2. Add 'Test/0' organization
  3. Enable the Wappalizer boefje
  4. Add url-ooi pointing to https://mispo.es (this should create a Hostname and HostnameURL ooi)
  5. Set the clearance level of the url-ooi to two paws
  6. Bug: it takes ~10 minutes before the Hostname and HostnameURL oois inherit the set clearance level

Expected behavior
The Hostname and HostnameURL oois should inherit the set clearance level in less than a minute.
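To put a number on the gap between the observed (~10 minutes) and expected (under a minute) delay, a small polling timer can help. This is only a sketch: `time_until` is a hypothetical helper, and the predicate you pass in (for example, a function that checks whether the Hostname ooi has the expected clearance level) is something you would have to write yourself against your own OpenKAT instance; no OpenKAT API is assumed here.

```python
import time
from typing import Callable


def time_until(
    predicate: Callable[[], bool],
    timeout: float = 900.0,
    interval: float = 5.0,
) -> float:
    """Poll `predicate` until it returns True and return the elapsed seconds.

    `predicate` is a caller-supplied, hypothetical check (e.g. "has the
    Hostname ooi inherited the clearance level yet?"). Raises TimeoutError
    if the condition does not hold within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        if predicate():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

Started right after step 5, the returned value is the inheritance delay in seconds, which makes it easier to compare runs before and after a fix.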

Initial analysis
It seems that the waiting time is due to Celery taking up a whole CPU after the KAT install (which is possibly related to this upstream bug).
During this time, ooi clearance level inheritance seems to be suspended.
image
In the logs obtained by running docker compose logs -f octopoes_api_worker rabbitmq, RabbitMQ reports messages like:

rabbitmq-1             | 2023-12-20 12:54:30.921841+00:00 [warning] <0.1379.0> closing AMQP connection <0.1379.0> (172.28.0.11:46772 -> 172.28.0.6:5672, vhost: 'kat', user: 'eb61296883e76fcdf6af97438ab1263c88fa2f1e2c5ccbcf87'):
rabbitmq-1             | 2023-12-20 12:54:30.921841+00:00 [warning] <0.1379.0> client unexpectedly closed TCP connection
rabbitmq-1             | 2023-12-20 12:56:15.805856+00:00 [error] <0.1224.0> closing AMQP connection <0.1224.0> (172.28.0.12:57620 -> 172.28.0.6:5672):
rabbitmq-1             | 2023-12-20 12:56:15.805856+00:00 [error] <0.1224.0> missed heartbeats from client, timeout: 60s
rabbitmq-1             | 2023-12-20 12:56:15.816879+00:00 [error] <0.1248.0> closing AMQP connection <0.1248.0> (172.28.0.12:57636 -> 172.28.0.6:5672):
rabbitmq-1             | 2023-12-20 12:56:15.816879+00:00 [error] <0.1248.0> missed heartbeats from client, timeout: 60s
rabbitmq-1             | 2023-12-20 12:56:16.095885+00:00 [error] <0.1273.0> closing AMQP connection <0.1273.0> (172.28.0.12:57654 -> 172.28.0.6:5672):
rabbitmq-1             | 2023-12-20 12:56:16.095885+00:00 [error] <0.1273.0> missed heartbeats from client, timeout: 60s

This keeps going until Celery calms down (which takes ~10 minutes), after which the inheritance properly commences.
This eventually shows up in the logs as well (after many errors), but that output does not fit in this report.
The wait eventually yields the desired behavior:
image
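For anyone trying to reproduce this, the RabbitMQ lines quoted above can be tallied with a short script to see how often connections are dropped and why. This is a minimal sketch that assumes the docker compose log format shown in this report (service prefix, timestamp, level, Erlang pid, message); `LINE_RE` and `summarize` are illustrative names, not part of OpenKAT.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this report:
# rabbitmq-1 | 2023-12-20 12:56:15.805856+00:00 [error] <0.1224.0> missed heartbeats from client, timeout: 60s
LINE_RE = re.compile(
    r"^\S+\s+\|\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2})\s+"
    r"\[(?P<level>\w+)\]\s+"
    r"<(?P<pid>[\d.]+)>\s+"
    r"(?P<msg>.*)$"
)


def summarize(lines):
    """Count (level, reason) pairs, skipping the 'closing AMQP connection' header lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a line in the assumed format
        msg = match.group("msg")
        if msg.startswith("closing AMQP connection"):
            continue  # the actual reason is reported on the following line
        counts[(match.group("level"), msg)] += 1
    return counts
```

Feeding it the captured output of the docker compose logs command (one line per list element) gives a count per reason, so a run dominated by "missed heartbeats from client" entries stands out immediately.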

OpenKAT version
Tested on 18ee8e6 and older versions going back to October 2023.

@originalsouth added the bug and backend labels on Dec 20, 2023
@originalsouth
Contributor Author

Found the root cause, see celery/billiard#399

@originalsouth
Contributor Author

Addressed in #2327.
