Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2] node/relay bug fixes #908

Merged
merged 1 commit into from
Nov 20, 2024
Merged

Conversation

ian-shim
Copy link
Contributor

Why are these changes needed?

A number of bug fixes in relay and v2 node.
Also, it adds a feature flag for enabling v2 features on the node.

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

Comment on lines +50 to +56
initOnce *sync.Map
// conns maps relay key to the gRPC connection: `map[corev2.RelayKey]*grpc.ClientConn`
conns sync.Map
logger logging.Logger

// grpcClients maps relay key to the gRPC client: `map[corev2.RelayKey]relaygrpc.RelayClient`
grpcClients sync.Map
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All maps need to be thread safe

@@ -119,8 +127,7 @@ func (m *metadataProvider) GetMetadataForBlobs(keys []v2.BlobKey) (metadataMap,
}()
}

mMap := make(metadataMap)
for len(mMap) < len(keys) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hangs if there are duplicates in keys

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a unit test for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ian-shim ian-shim marked this pull request as ready for review November 18, 2024 22:06
@@ -119,8 +127,7 @@ func (m *metadataProvider) GetMetadataForBlobs(keys []v2.BlobKey) (metadataMap,
}()
}

mMap := make(metadataMap)
for len(mMap) < len(keys) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a unit test for this

config.EnableV2 = false
c := newTestComponents(t, config)
_, err := c.server.StoreChunks(context.Background(), &pbv2.StoreChunksRequest{})
require.Error(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These block of code may be benefited from a small helper function (parse error and assert the error status), as they repeat many times in this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

logger logging.Logger

// grpcClients maps relay key to the gRPC client: `map[corev2.RelayKey]relaygrpc.RelayClient`
grpcClients sync.Map
}

var _ RelayClient = (*relayClient)(nil)

// NewRelayClient creates a new RelayClient that connects to the relays specified in the config.
// It keeps a connection to each relay and reuses it for subsequent requests, and the connection is lazily instantiated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why lazy init? It's not that expensive to create a connection once per Node creation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just preventing a case that creates a large number of connections. This client is not necessarily only used by the node as it's meant to be used as a library

@@ -83,6 +83,7 @@ func makeConfig(t *testing.T) *node.Config {
DbPath: t.TempDir(),
ID: opID,
NumBatchValidators: runtime.GOMAXPROCS(0),
EnableV2: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes the test coverage for the existing v1, we need to keep testing it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Defaulted it to false and overwrote it to true in server_v2_test

@ian-shim ian-shim force-pushed the v2-node-relay-fixes branch from 1fe0be4 to 2647dec Compare November 19, 2024 00:36
@ian-shim ian-shim force-pushed the v2-node-relay-fixes branch from 2647dec to 1958f6d Compare November 20, 2024 00:25
@ian-shim ian-shim force-pushed the v2-node-relay-fixes branch from 1958f6d to 00c7c50 Compare November 20, 2024 00:32
@ian-shim ian-shim merged commit 88b1ac3 into Layr-Labs:master Nov 20, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants