This repository has been archived by the owner on Oct 17, 2022. It is now read-only.

Logs fill up with 'available tasks is 0' upon network saturation #759

Closed
velvia opened this issue Aug 12, 2022 · 0 comments · Fixed by #763
Assignees
Labels
bug Something isn't working

Comments

@velvia
Contributor

velvia commented Aug 12, 2022

Description

2022-08-12T20:50:17.765387Z  WARN network::primary: Executor in network:primary and module:primary_core available tasks is 0 for client address: /dns/validator-3/tcp/8381/http

The log files fill up rapidly with repeated copies of the message above.

Explanation from @huitseeker:

It means that the network has reached its maximum number of concurrent network messages. Further messages are queued and will be sent when more semaphore tickets become available.

A network sending attempt can move off the semaphore in three ways (a sketch of this pattern follows the list):

  • the message send finishes,
  • the message send fails (some failures are rescheduled with exponential backoff), or
  • the message is canceled because the network has progressed past the point where it would still be relevant (that cancellation also cancels the retries for nodes being retried).

Steps to reproduce

Run a docker-compose setup with multiple Narwhal nodes and saturate the network so that the network queue fills up.

Possible solutions:

Have the network throttling code stop spamming warnings every time it is full -- for example, set a counter and warn only once out of every N invocations (see the sketch below).

@velvia added the bug label on Aug 12, 2022
huitseeker added a commit to huitseeker/narwhal that referenced this issue Aug 14, 2022
We operate an executor with a bound on the concurrent number of messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706).

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes MystenLabs#759
huitseeker added a commit to huitseeker/narwhal that referenced this issue Aug 14, 2022
We operate an executor with a bound on the concurrent number of messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706).
PR MystenLabs#472 added logging for the bound being hit.

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes MystenLabs#759
@huitseeker self-assigned this on Aug 14, 2022
huitseeker added a commit that referenced this issue Aug 15, 2022

We operate an executor with a bound on the concurrent number of messages (see #463, #559, #706).
PR #472 added logging for the bound being hit.

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes #759
huitseeker added a commit to huitseeker/narwhal that referenced this issue Aug 16, 2022
…ystenLabs#763)

We operate an executor with a bound on the concurrent number of messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706).
PR MystenLabs#472 added logging for the bound being hit.

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes MystenLabs#759
huitseeker added a commit that referenced this issue Aug 16, 2022

We operate an executor with a bound on the concurrent number of messages (see #463, #559, #706).
PR #472 added logging for the bound being hit.

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes #759
mwtian pushed a commit to mwtian/sui that referenced this issue Sep 30, 2022
…ystenLabs/narwhal#763)

We operate an executor with a bound on the concurrent number of messages (see MystenLabs/narwhal#463, MystenLabs/narwhal#559, MystenLabs/narwhal#706).
PR MystenLabs/narwhal#472 added logging for the bound being hit.

We expect the executors to operate for a long time at this limit (e.g. in a recovery situation).
The spammy logging is not useful.

This removes the logging of the concurrency bound being hit.
Fixes MystenLabs/narwhal#759