Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

[LW] Rely on locally written values instead of the cache #5871

Merged
merged 8 commits into from
Jan 25, 2022

Conversation

Jolyon-S
Copy link
Contributor

@Jolyon-S Jolyon-S commented Jan 21, 2022

Goals (and why):

==COMMIT_MSG==
The lock watch cache no longer attempts to cache writes - instead, these will read from the maps tracked within the transaction (with similar performance but more consistent behaviour around concurrent writes).
==COMMIT_MSG==

Implementation Description (bullets):

  • Writes in the TCVSI now just block cached reads, and force "remote" reads (realistically, these are local to the transaction).
  • Modifying local updates now uses putIfAbsent for non-writes.

Testing (What was existing testing like? What have you done to improve it?):
Modified current tests, and added a few test cases in various places

Concerns (what feedback would you like?):
Are there still race conditions that we should worry about?

Where should we start reviewing?:
TCVSI

Priority (whenever / two weeks / yesterday):
ASAP - we want to get this rolling.

@changelog-app
Copy link

changelog-app bot commented Jan 21, 2022

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

The lock watch cache no longer attempts to cache writes - instead, these will read from the maps tracked within the transaction (with similar performance but more consistent behaviour around concurrent writes).

Check the box to generate changelog(s)

  • Generate changelog entry

@@ -63,15 +63,13 @@ static TransactionScopedCache create(ValueCacheSnapshot snapshot, CacheMetrics m
@Override
public synchronized void write(TableReference tableReference, Map<Cell, byte[]> values) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could make this take Set<Cell>; I've left it as-is as that's nicer for the callers, but maybe we should just change that anyway.

eventCache.processStartTransactionsUpdate(ImmutableSet.of(TIMESTAMP_2), LOCK_WATCH_LOCK_SUCCESS);
valueCache.processStartTransactions(ImmutableSet.of(TIMESTAMP_2));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the absence of this line meant that cache 2 was always a no op cache (oops)

@Jolyon-S Jolyon-S changed the title [WIP] [LW] Rely on locally written values instead of the cache [LW] Rely on locally written values instead of the cache Jan 24, 2022
Comment on lines 1622 to +1623
getCache().delete(tableRef, cells);
putInternal(tableRef, Cells.constantValueMap(cells, PtBytes.EMPTY_BYTE_ARRAY));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, there'd still be a race condition:

Thread 1:
-> put internal
-> pause (say GC)

Thread 2 (after put internal from above):
-> read from cache

this would of course fail validation. However, swapping them is safe, as it just invalidates; at the absolute worst this means that we'd read from the remote, so preserves correctness (but in practice this is likely extremely rare).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we swap, and a thread barges in between, we end up seeing in the cache that there has been a local write, but when we try to read it we instead read remote (since it's not cached in the transaction yet). This is ok because we can just pretend that the read happened before the write in this case?
The problem with the old impl is that we would potentially update the cache claiming we read this write remotely because we get confused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the problem with the old impl is that we could update the KVS before the cache, and the read in the middle would fail validation. By swapping the order and changing the write semantics, we guarantee that we either are forced to read remotely or read the locally written value, so it's correct in both cases.

Copy link
Contributor

@gmaretic gmaretic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct, though I have a few questions to validate that I'm following correctly


// Read values from the snapshot. For the hits, mark as hit in the local map.
Map<Cell, CacheValue> snapshotCachedValues = getSnapshotValues(table, remainingCells);
snapshotCachedValues.forEach(
(cell, value) -> localUpdates.put(CellReference.of(table, cell), LocalCacheEntry.hit(value)));
snapshotCachedValues.forEach((cell, value) -> cacheHitInternal(table, cell, value));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only functional change in this method, right? Reading writes and reads from the cache separately does not seem to do anything special

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only change is just that we're filtering out the writes entirely (because filtering them out from localUpdates isn't enough). The cacheHitInternal is just an extra

Comment on lines 1622 to +1623
getCache().delete(tableRef, cells);
putInternal(tableRef, Cells.constantValueMap(cells, PtBytes.EMPTY_BYTE_ARRAY));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we swap, and a thread barges in between, we end up seeing in the cache that there has been a local write, but when we try to read it we instead read remote (since it's not cached in the transaction yet). This is ok because we can just pretend that the read happened before the write in this case?
The problem with the old impl is that we would potentially update the cache claiming we read this write remotely because we get confused?

@Jolyon-S
Copy link
Contributor Author

This looks correct, though I have a few questions to validate that I'm following correctly

You do raise an interesting point: I think we're actually going to have more misses with this approach. Specifically, the read-only cache created will not have local writes, but also won't have any cached values, and so will have to hit the KVS for those. What's stupid is that we filter those out anyway, so really we should just optimise the SerializableTransaction to not read them in the first place (especially since they're actually the wrong values!)

@gmaretic gmaretic self-requested a review January 25, 2022 14:30
Copy link
Contributor

@gmaretic gmaretic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thi smakes sense, though yes on the point above -- that is not ins the scope of this PR necessarily, but we should probably do it in a follow-up when you have time

@bulldozer-bot bulldozer-bot bot merged commit 7906838 into develop Jan 25, 2022
@bulldozer-bot bulldozer-bot bot deleted the jshah/lock-watch-glory branch January 25, 2022 14:36
@svc-autorelease
Copy link
Collaborator

Released 0.527.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants