Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

[LWMeta] TimeLock API changes #6660

Merged
merged 26 commits into from
Jul 19, 2023

Conversation

Toemmsche
Copy link
Contributor

@Toemmsche Toemmsche commented Jul 10, 2023

General

Before this PR:
ConjureLockRequest uses set<ConjureLockDescriptor>
After this PR:

  • Conjure excludeEmptyOptionals = true
  • New field optional<ConjureLockRequestMetadata> in ConjureLockRequest
  • ConjureLockRequest uses list<ConjureLockDescriptor> instead
  • If metadata is empty after lock watch filtering, it is omitted

==COMMIT_MSG==
[LWMeta] Add metadata to ConjureLockRequest
==COMMIT_MSG==

Priority:
P1
Concerns / possible downsides (what feedback would you like?):
Looking at some serialized ConjureLockRequest objects, there is an argument to be made to simplify the ChangeMetadata format to always include the old and new value. The conjure encoding for union types is out of our control and - frankly - quite poor.

Is documentation needed?:
If we have documentation for the TimeLock API, yes!

Compatibility

Does this PR create any API breaks (e.g. at the Java or HTTP layers) - if so, do we have compatibility?:
Due to us not using --strictObjects and using excludeEmptyOptionals = true as of this PR, we have compatibility.
Does this PR change the persisted format of any data - if so, do we have forward and backward compatibility?:
No
The code in this PR may be part of a blue-green deploy. Can upgrades from previous versions safely coexist? (Consider restarts of blue or green nodes.):
Yes, as long as nodes on newer versions do no forward requests to nodes of older versions.
Does this PR rely on statements being true about other products at a deployment - if so, do we have correct product dependencies on these products (or other ways of verifying that these statements are true)?:
TBD
Does this PR need a schema migration?
no

Testing and Correctness

What, if any, assumptions are made about the current state of the world? If they change over time, how will we find out?:

What was existing testing like? What have you done to improve it?:
I have added tests to the existing LockWatchEventIntegrationTest to verify that metadata passed to TimeLock is visible to transactions.
If this PR contains complex concurrent or asynchronous code, is it correct? The onus is on the PR writer to demonstrate this.:
N/A
If this PR involves acquiring locks or other shared resources, how do we ensure that these are always released?:
N/A

Execution

Metrics for TimeLock are added in #6665
If this PR does not work as expected, how do I fix that state? Would rollback be straightforward?:
Yes
If the above plan is more complex than “recall and rollback”, please tag the support PoC here (if it is the end of the week, tag both the current and next PoC):

Scale

Would this PR be expected to pose a risk at scale? Think of the shopping product at our largest stack.:
This PR does introduce some additional conversions from List<LockDescriptor> to Set<LockDescriptor> and back. These are necessary to facilitate the index encoding format for metadata.
Would this PR be expected to perform a large number of database calls, and/or expensive database calls (e.g., row range scans, concurrent CAS)?:
No
Would this PR ever, with time and scale, become the wrong thing to do - and if so, how would we know that we need to do something differently?:
If the aforementioned conversions are performance bottlenecks, we should consider a separate endpoint for lock requests with metadata.

Development Process

Where should we start reviewing?:

If this PR is in excess of 500 lines excluding versions lock-files, why does it not make sense to split it?:
We exceed the 500 line threshold because of adding an old version of a conjure-generated class

@Toemmsche Toemmsche changed the base branch from develop to tpapke/metadata/wire-format-lockevent-serdes July 10, 2023 15:00
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch 2 times, most recently from 224c741 to 7182be2 Compare July 11, 2023 15:04
@Toemmsche Toemmsche force-pushed the tpapke/metadata/wire-format-lockevent-serdes branch from b1ca4c2 to 80d9689 Compare July 11, 2023 15:23
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from 7182be2 to 42bce2e Compare July 12, 2023 09:54
@Toemmsche Toemmsche force-pushed the tpapke/metadata/wire-format-lockevent-serdes branch from 80d9689 to 37e17e5 Compare July 12, 2023 09:55
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this class still in use?

Copy link
Contributor Author

@Toemmsche Toemmsche Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is merely for compatibility + serializability checking so we don't have to test too many edge cases here. Anything related to actual index encoding (e.g. correctness checks) is handled by the conjure utilities.

@@ -139,13 +139,14 @@ types:
requestId:
type: uuid
safety: safe
lockDescriptors: set<ConjureLockDescriptor>
lockDescriptors: list<ConjureLockDescriptor>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed during the RFC review: This change is compatible on the wire

acquireTimeoutMs:
type: integer
safety: safe
clientDescription:
type: optional<string>
safety: unsafe
metadata: optional<ConjureLockRequestMetadata>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly enough, even though https://github.com/palantir/conjure/blob/master/docs/spec/wire.md#42-servers-reject-unknown-fields dictates that servers must reject unknown fields, all conjure object builder are actually annotated with @JsonIgnoreProperties(ignoreUnknown = true) meaning that they ignore all unknown properties. If we trust the spec, this change would require a product dependency, but the generated conjure code says otherwise.

IdentifiedLockRequest lockRequest = ImmutableIdentifiedLockRequest.builder()
.lockDescriptors(fromConjureLockDescriptors(request.getLockDescriptors()))
.lockDescriptors(lockDescriptors)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we are passing a list to field that expects a set, immutable types always build an immutable version of the container type anyways (to guarantee immutability)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor because we moved some code to the LockWatchIntegrationTestUtilities class

@Toemmsche Toemmsche requested a review from mdaudali July 12, 2023 10:38
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from 41d266c to a361c32 Compare July 12, 2023 12:29
return Optional.empty();
}
}
private enum AssertSuccessVisitor implements LockWatchStateUpdate.Visitor<LockWatchStateUpdate.Success> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much duplicated from UpdateVisitors, but that class is in another test module. We could move it to a non-test module (ugly because test code now lives among production code) or move this test class over (but then we couldn't access LockWatchIntegrationTestUtilities). I am not too offended by this just being here.

@Toemmsche Toemmsche force-pushed the tpapke/metadata/wire-format-lockevent-serdes branch from d02537d to 6902515 Compare July 12, 2023 17:50
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from 24ca170 to 2510b74 Compare July 12, 2023 17:51
@Toemmsche Toemmsche force-pushed the tpapke/metadata/wire-format-lockevent-serdes branch from 6902515 to a4685ab Compare July 13, 2023 08:53
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from 2510b74 to 2ec8eef Compare July 13, 2023 09:40
@Toemmsche Toemmsche force-pushed the tpapke/metadata/wire-format-lockevent-serdes branch 2 times, most recently from 4e93e58 to 4dd4296 Compare July 13, 2023 15:11
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch 2 times, most recently from d5e8260 to 19dfcda Compare July 13, 2023 15:53
@Toemmsche Toemmsche marked this pull request as ready for review July 14, 2023 10:37
@Toemmsche Toemmsche requested a review from Sam-Kramer July 14, 2023 15:45
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from 12d352c to b63000e Compare July 18, 2023 14:06
Change lock descriptor container type to list
Add test for backwards compatibility of conjure lock request deserialization in TimeLock
Extend LockWatchEventIntegrationTest with tests for metadata (it should be visible for transactions)
Toemmsche and others added 2 commits July 18, 2023 15:29
Refactor test file naming
@Toemmsche Toemmsche force-pushed the tpapke/metadata/timelock-api-changes branch from b63000e to e5cbcf9 Compare July 18, 2023 14:29
@Sam-Kramer Sam-Kramer changed the base branch from tpapke/metadata/wire-format-lockevent-serdes to develop July 18, 2023 14:45
@changelog-app
Copy link

changelog-app bot commented Jul 18, 2023

Generate changelog in changelog/@unreleased

What do the change types mean?
  • feature: A new feature of the service.
  • improvement: An incremental improvement in the functionality or operation of the service.
  • fix: Remedies the incorrect behaviour of a component of the service in a backwards-compatible way.
  • break: Has the potential to break consumers of this service's API, inclusive of both Palantir services
    and external consumers of the service's API (e.g. customer-written software or integrations).
  • deprecation: Advertises the intention to remove service functionality without any change to the
    operation of the service itself.
  • manualTask: Requires the possibility of manual intervention (running a script, eyeballing configuration,
    performing database surgery, ...) at the time of upgrade for it to succeed.
  • migration: A fully automatic upgrade migration task with no engineer input required.

Note: only one type should be chosen.

How are new versions calculated?
  • ❗The break and manual task changelog types will result in a major release!
  • 🐛 The fix changelog type will result in a minor release in most cases, and a patch release version for patch branches. This behaviour is configurable in autorelease.
  • ✨ All others will result in a minor version release.

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

[LWMeta] Add metadata to ConjureLockRequest

Check the box to generate changelog(s)

  • Generate changelog entry

Copy link
Contributor

@Sam-Kramer Sam-Kramer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! mostly minor stuff

@@ -97,24 +99,26 @@ public LeaderTime leaderTime(AuthHeader authHeader, String namespace) {

@Override
public ConjureLockResponse lock(AuthHeader authHeader, String namespace, ConjureLockRequest request) {
Set<ConjureLockDescriptor> lockDescriptors = Set.copyOf(request.getLockDescriptors());
Copy link
Contributor

@Sam-Kramer Sam-Kramer Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we now "copying" the request lock descriptors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Downstream methods like ClientLockDiagnosticCollector#collect require a set, but ConjureLockRequest.lockDescriptors is now a list. Since this class is deprecated, I don't think it is worth the effort to refactor all the downstream methods and immutable objects to accept lists. Also, having set semantics makes more sense in this context and we merely have to do the conversion because of the wire format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, we should use ImmutableSet.copyOf here

@@ -62,6 +73,14 @@ private static Set<ConjureLockDescriptor> toConjure(Set<LockDescriptor> lockDesc
.collect(Collectors.toSet());
}

private static List<ConjureLockDescriptor> toConjure(List<LockDescriptor> lockDescriptors) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectMappers.newServerObjectMapper().enable(SerializationFeature.INDENT_OUTPUT);
// This mapper is used to ensure that two JSONs are equal excluding indentation
private static final ObjectMapper VERIFYING_MAPPER = new ObjectMapper();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given our conversation around "dependency ordering", should we write a test proving that the old ConjureLockRequest will be able to handle metadata, whether or not it's specified? A bit ugly, as we'll need to copy the old conjure-generated file, but probably worth it to as our proof it won't break things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought about this and found it acceptable in the LockEvent case since copying the class was trivial, but copying the generated class is a different beast. Given that we will not have product dependencies, though, I think it is acceptable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's the worst thing in the world for us to have a copy in the test folder here of the old-generated class. Granted, this is something we probably want anyways, as we have an invariant here which we must maintain, albeit super hard to break (our new format must be serializable by the old format).

Even though this becomes out-dated after this PR, we still allow the upgrade path from a really old version of timelock to a really new version of timelock in the future, so it's useful we have this test to ensure we don't break anything until we checkpoint timelock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable, I noticed that the test class is not really in the right module (should be co-located with timelock-api.yml), will move it

}

private static Path getJsonPath(String jsonFileName) {
return Paths.get(BASE + jsonFileName + ".json");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, but I think you can also do Paths.get(BASE, jsonFileName + ".json")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will backport this suggestion for related files

private static Set<LockDescriptor> fromConjureLockDescriptors(Set<ConjureLockDescriptor> lockDescriptors) {
Set<LockDescriptor> descriptors = Sets.newHashSetWithExpectedSize(lockDescriptors.size());
private static List<LockDescriptor> fromConjureLockDescriptors(List<ConjureLockDescriptor> lockDescriptors) {
List<LockDescriptor> descriptors = new ArrayList<>(lockDescriptors.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to make this List mutable? Otherwise I saw let's opt for ImmutableList.


conjure {
java {
excludeEmptyOptionals = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌶️

@@ -359,6 +387,30 @@ private Set<LockDescriptor> getDescriptors(TableReference tableRef, Cell... cell
.collect(Collectors.toSet());
}

private List<Optional<LockRequestMetadata>> lockAndGetAllLockEventMetadataForNewTransaction(
LockRequest lockRequest) {
txnManager.runTaskReadOnly(txn -> null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to actually run a transaction, but we need the currentVersion (line below)

LockResponse response = txnManager.getTimelockService().lock(lockRequest);
// We need to clean up the lock, otherwise it will be held forever (the TimeLock cluster is static)!
txnManager.getTimelockService().unlock(ImmutableSet.of(response.getToken()));
OpenTransaction dummyTxn = startSingleTransaction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe a better name than dummyTxn

@@ -359,6 +387,30 @@ private Set<LockDescriptor> getDescriptors(TableReference tableRef, Cell... cell
.collect(Collectors.toSet());
}

private List<Optional<LockRequestMetadata>> lockAndGetAllLockEventMetadataForNewTransaction(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth just returning the Optional<LockRequestMetadata>, seeing as we only expect a single event anyways?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lockAndGetEventMetadataFromLockWatchLog ?

@@ -26,6 +26,7 @@
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Iterables;
import com.google.common.util.concurrent.Uninterruptibles;
import com.palantir.atlasdb.encoding.PtBytes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to be a bit more aggressive here with our testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any test we add will already be mostly by some other test. But given that this is the ultimate integration test, I would be fine with replicating/bundling some of them here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a large-scale random test might make sense.

Copy link
Contributor Author

@Toemmsche Toemmsche Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, I would rather use this class for advanced testing using actual transactions once we have the necessary changes in Atlas. The two tests are added don't really fit the purpose of this class, but are good sanity checks. Thoughts on this approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense! Plus, this api is not really used anyways until that happens, so it's fine for this to merge beforehand.

@Toemmsche Toemmsche requested a review from Sam-Kramer July 19, 2023 09:18
Comment on lines 204 to 206
return LockRequestMetadata.of(filteredLockMetadata);
return filteredLockMetadata.isEmpty()
? Optional.empty()
: Optional.of(LockRequestMetadata.of(filteredLockMetadata));
Copy link
Contributor Author

@Toemmsche Toemmsche Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary for correctness, but we should have this behavior for performance reasons

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than ternary I personally bias towards

if(filteredLockMetadata.isEmpty()) {
   return Optional.empty();`
}
return Optional.of(...) 

BUT -- that said, I think ternary is also fine!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from the latest develop , but I renamed it to avoid confusion

@Unsafe
@JsonDeserialize(builder = OldConjureLockRequest.Builder.class)
@Generated("com.palantir.conjure.java.types.BeanGenerator")
public final class OldConjureLockRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old is fine, but Legacy is a bit cleaner

assertDeserializedEquals(OLD_CONJURE_LOCK_REQUEST_FILE, OLD_CONJURE_LOCK_REQUEST, OldConjureLockRequest.class);
}

// Requires excludeEmptyOptionals = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe reframe the comment to be more specific on what this means, i.e. we exclude optionals during serialization rather than marking them as null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of frustrating, as this is the ideal of as, so you can say "we expect the old and new conjure lock request to be identical as we do not serialize empty optionals".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth removing the as and to return the assertThat, to allow for these types of cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an improved comment is fine. Trying to get the assert here is especially ugly for assertSerializedEquals because we re-read the serialized form using a different mapper (so we can ignore any indentation differences).

}
}

private static <T> void assertDeserializedEquals(String jsonFileName, T object, Class<T> clazz) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also only other style nit, is that we have arguments reversed from assertSerializedEquals. I'd expect assertDeserializedEquals(T object, String jsonFileName, Class<T> clazz)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was deliberate to maintain the convention that we do assertThat(actual).isEqualTo(expected). The deserialized JSON object is the actual value and the object is the expected value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In assertSerializedEquals, the JSON file is the expected value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, then that makes sense!

private static Set<LockDescriptor> fromConjureLockDescriptors(Set<ConjureLockDescriptor> lockDescriptors) {
Set<LockDescriptor> descriptors = Sets.newHashSetWithExpectedSize(lockDescriptors.size());
private static List<LockDescriptor> fromConjureLockDescriptors(List<ConjureLockDescriptor> lockDescriptors) {
ImmutableList.Builder<LockDescriptor> descriptors = ImmutableList.builder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we know the size here, let's use builderWithExpectedSize.

@Toemmsche Toemmsche requested a review from Sam-Kramer July 19, 2023 15:31
Copy link
Contributor

@Sam-Kramer Sam-Kramer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bulldozer-bot bulldozer-bot bot merged commit bfe7051 into develop Jul 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants