This repository has been archived by the owner on Oct 17, 2022. It is now read-only.
[executor] Route the executor requests without the worker_to_worker interface #706
Labels
bug
Something isn't working
Comments
huitseeker changed the title from "[executor] Route the executor requests differently than through the worker_to_worker interface" to "[executor] Route the executor requests without the worker_to_worker interface" on Aug 7, 2022
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 7, 2022:
Context: The `BatchLoader` is establishing a primary->worker communication using a public address, which adds latency and competes with outbound traffic. The proper fix is to use the `BlockWaiter`.
The issue: We would like a mitigation to be deployed and effective sooner.
The fix: We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
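The mitigation described in this commit message (rewrite the worker's hostname to localhost when it matches the local primary) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: addresses are modeled as plain `host:port` strings, `127.0.0.1` stands in for localhost, and the helper names `split_host_port` and `rewrite_to_loopback` are hypothetical.

```rust
/// Hypothetical helper: splits a "host:port" string at its last ':'.
/// Assumption: addresses are plain "host:port" strings; the real code base
/// may use a different address type.
fn split_host_port(addr: &str) -> Option<(&str, &str)> {
    addr.rsplit_once(':')
}

/// Hypothetical sketch of the mitigation: if the worker's host is the same
/// as the local primary's host, return the worker address with the host
/// replaced by the loopback address; otherwise return it unchanged.
fn rewrite_to_loopback(worker_addr: &str, primary_addr: &str) -> String {
    match (split_host_port(worker_addr), split_host_port(primary_addr)) {
        (Some((worker_host, worker_port)), Some((primary_host, _)))
            if worker_host == primary_host =>
        {
            // Same machine as the primary: keep the port, go through loopback.
            format!("127.0.0.1:{}", worker_port)
        }
        // Different host, or an address we cannot parse: leave it untouched.
        _ => worker_addr.to_string(),
    }
}
```

With this sketch, a worker at `10.0.0.5:4001` next to a primary at `10.0.0.5:4000` would be contacted at `127.0.0.1:4001`, while a worker on any other host keeps its public address.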
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 7, 2022, with the same commit message as above.
@huitseeker totally onboard in swapping the |
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 8, 2022, with the same commit message as above.
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 8, 2022, with the same commit message as above.
huitseeker added a commit that referenced this issue on Aug 8, 2022, with the same commit message as above.
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 8, 2022, with the same commit message as above.
huitseeker added a commit that referenced this issue on Aug 8, 2022, with the same commit message as above.
huitseeker added a commit that referenced this issue on Aug 12, 2022, with the same commit message as above.
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 14, 2022:
We operate an executor with a bound on the number of concurrent messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706). We expect the executors to operate for a long time at this limit (e.g. in a recovery situation). The spammy logging is not useful. This removes the logging of the concurrency bound being hit. Fixes MystenLabs#759
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 14, 2022:
We operate an executor with a bound on the number of concurrent messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706). PR MystenLabs#472 added logging for the bound being hit. We expect the executors to operate for a long time at this limit (e.g. in a recovery situation). The spammy logging is not useful. This removes the logging of the concurrency bound being hit. Fixes MystenLabs#759
huitseeker added a commit that referenced this issue on Aug 15, 2022, with the same commit message as above (references written as #463, #559, #706, #472, #759).
huitseeker added a commit to huitseeker/narwhal that referenced this issue on Aug 16, 2022 (title truncated: "…ystenLabs#763)"), with the same commit message as above.
huitseeker added a commit that referenced this issue on Aug 16, 2022, with the same commit message as above (references written as #463, #559, #706, #472, #759).
mwtian pushed a commit to mwtian/sui that referenced this issue on Sep 30, 2022, with the same `BatchLoader` commit message as above.
mwtian pushed a commit to mwtian/sui that referenced this issue on Sep 30, 2022 (title truncated: "…ystenLabs/narwhal#763)"), with the same commit message about removing the concurrency-bound logging as above (references written as MystenLabs/narwhal#463, MystenLabs/narwhal#559, MystenLabs/narwhal#706, MystenLabs/narwhal#472, MystenLabs/narwhal#759).
The `Executor`'s `BatchLoader` is using the network to retrieve batches from its own worker.

There is also a `BlockWaiter` that allows a primary to retrieve the payload of any particular block, feeding itself from both its own and others' workers.

The `BatchLoader` is using the `WorkerToWorkerClient` to issue these batch requests, which means that the primary is impersonating a worker in order to contact its own worker, and is using the public network interface of its worker (as opposed to the loopback `primary_to_worker` interface).

[...] "Failed to receive batch reply from worker".

Conclusion: I think we should excise the `WorkerToWorker` network from the `BatchLoader`. A way to do that seems to be to mimic some of the behavior of the `BlockWaiter` to retrieve solely batches (rather than blocks). /cc @akichidis for better ideas.