Skip to content
This repository has been archived by the owner on Oct 17, 2022. It is now read-only.

fix: Mitigate issue #706 #715

Merged
merged 1 commit into from
Aug 8, 2022
Merged

Conversation

huitseeker
Copy link
Contributor

@huitseeker huitseeker commented Aug 7, 2022

Context:

The BatchLoader is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the BlockWaiter, see #706.

The issue:

We would like a mitigation to be deployed and effective sooner.

The mitigation:

We inspect the worker addresses used by the BatchLoader, and rewrite their hostname to localhost when it matches the local primary.

@tharbert
Copy link

tharbert commented Aug 7, 2022

@huitseeker this connectivity should work as expected:

root@sui-validator-0:/sui# host localhost
localhost has address 127.0.0.1
localhost has IPv6 address ::1
Host localhost not found: 3(NXDOMAIN)

root@sui-validator-0:/sui# nc localhost 8084
@
^C

@ is indicative of a grpc service listening as expected

we will verify again in staging

Copy link
Contributor

@gdanezis gdanezis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good hack for the moment indeed!

Copy link
Contributor

@akichidis akichidis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that @huitseeker ! 👍

@@ -169,3 +200,73 @@ impl Executor {
])
}
}

fn replace_distant_by_localhost(target: &mut Multiaddr, hostname_pattern: &Protocol) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe logging the successful replace can give us in staging/devnet a quick feedback whether we have successfully overwritten the address

Context:
The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the `BlockWaiter`.

The issue:
We would like a mitigation to be deployed and effective sooner.

The fix:
We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
@huitseeker huitseeker merged commit 53b6549 into MystenLabs:main Aug 8, 2022
huitseeker added a commit to huitseeker/narwhal that referenced this pull request Aug 8, 2022
Context:
The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the `BlockWaiter`.

The issue:
We would like a mitigation to be deployed and effective sooner.

The fix:
We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
huitseeker added a commit that referenced this pull request Aug 8, 2022
Context:
The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the `BlockWaiter`.

The issue:
We would like a mitigation to be deployed and effective sooner.

The fix:
We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
huitseeker added a commit that referenced this pull request Aug 12, 2022
Context:
The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the `BlockWaiter`.

The issue:
We would like a mitigation to be deployed and effective sooner.

The fix:
We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
mwtian pushed a commit to mwtian/sui that referenced this pull request Sep 30, 2022
Context:
The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic.
The proper fix us to use the `BlockWaiter`.

The issue:
We would like a mitigation to be deployed and effective sooner.

The fix:
We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
@huitseeker huitseeker deleted the mitigate-706 branch January 21, 2023 13:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants