
Add cache wrapper that handles parallel access. #861

Merged (5 commits), Nov 6, 2024

Conversation

cody-littley (Contributor) commented:
Why are these changes needed?

This PR adds a wrapper around lru.Cache. The purpose of this wrapper is to facilitate efficient lookup if two RPCs request the same data from the relay concurrently. Ideally we only want to fetch the data once in order to serve both requests.

Example scenario without cache wrapper:

  • at t=0, client A requests data X. X is not present in the cache, so the relay requests it from S3.
  • at t=50, client B requests data X. X is not present in the cache, so the relay requests it from S3.
  • at t=200, the request initiated by client A is completed. X is now present in the cache.
  • at t=250, the request initiated by client B is completed.

Example scenario with cache wrapper:

  • at t=0, client A requests data X. X is not present in the cache, so the relay requests it from S3.
  • at t=50, client B requests data X. X is not present in the cache, but since the lookup is already in progress, B's request is paused until A's request is completed.
  • at t=200, the request initiated by client A is completed. X is now present in the cache.
  • at t=201, the request initiated by client B wakes up and is served with the same data that was fetched for client A.

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@cody-littley cody-littley self-assigned this Nov 4, 2024
relay/cached_accessor.go (two resolved review threads)
Signed-off-by: Cody Littley <[email protected]>
@ian-shim (Contributor) left a comment:

💯

relay/cached_accessor.go (outdated; resolved review thread)
```go
// is written into the channel when it is eventually fetched. If a key is requested more than once while a
// lookup is in progress, the second (and following) requests will wait for the result of the first lookup
// to be written into the channel.
lookupsInProgress *sync.Map
```
@ian-shim (Contributor):

nit: you don't need to initialize lookupsInProgress if it's typed as sync.Map

@cody-littley (Contributor, Author):

neat, fixed

@cody-littley (Contributor, Author):

I ended up having to back out this change in favor of using a mutex. The core issue is that reading the lookupsInProgress map needs to be atomic with respect to reading from the cache. I was able to provoke a race condition in a unit test that caused an unnecessary cache miss.
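One plausible interleaving behind that race (an illustrative trace, not code from the PR): when the cache and lookupsInProgress are guarded independently, a goroutine can observe both structures in their "miss" state even though a fetch has just completed.

```
G1: cache lookup for X             -> miss (result not yet recorded)
G2: fetch of X completes
G2: cache.Add(X, value)
G2: lookupsInProgress.Delete(X)
G1: lookupsInProgress check for X  -> miss (entry already deleted)
G1: starts a second, unnecessary fetch of X
```

Holding a single mutex across both checks makes the pair atomic and closes this window.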

```go
// Wait for the goroutines to start. We want to give the goroutines a chance to do naughty things if they want.
// Eliminating this sleep will not cause the test to fail, but it may cause the test not to exercise the
// desired race condition.
time.Sleep(100 * time.Millisecond)
```
@ian-shim (Contributor):

Sleep is generally not reliable in tests: there is no guarantee that a fixed sleep is the right amount of time, and that can result in flaky tests.
Maybe we can use something like a buffered channel to wait for all goroutines to trigger?

@cody-littley (Contributor, Author):
In this scenario, I'm not relying on the sleep to make the test pass. The purpose of the sleep statement is to give the background goroutines a chance to misbehave if they are going to.

In order to show that test stability does not depend on the sleep, I've set the test up to run twice: once with the sleep and once without it. Are you ok with this approach? If you'd still prefer to avoid having a sleep, we should chat; one option might just be to remove the sleep entirely.

relay/cached_accessor.go (resolved review thread)
@cody-littley cody-littley merged commit ae8ccaa into Layr-Labs:master Nov 6, 2024
6 checks passed
@cody-littley cody-littley deleted the cached-accessor branch November 6, 2024 14:33