Extend relative hash support to whole API v2 #7308

Merged
merged 12 commits into from
Aug 1, 2023
Resolve hashes against DETACHED
adutra committed Aug 1, 2023

adutra Alexandre Dutra
commit 6f4a1bcca5ad05ebd28dd8774ecf7568e4afe6a2
Original file line number Diff line number Diff line change
@@ -278,9 +278,7 @@ additionalRefName, defaultBranch(), invalidHash))
"ASSIGN BRANCH %s TO %s AT %s IN nessie",
additionalRefName, defaultBranch(), unknownHash))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessage(
- String.format(
- "Could not find commit '%s' in reference '%s'.", unknownHash, defaultBranch()));
+ .hasMessage(String.format("Commit '%s' not found", unknownHash));
assertThatThrownBy(
() ->
sql(
@@ -329,9 +327,7 @@ additionalRefName, defaultBranch(), invalidHash))
"ASSIGN TAG %s TO %s AT %s IN nessie",
additionalRefName, defaultBranch(), unknownHash))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessage(
- String.format(
- "Could not find commit '%s' in reference '%s'.", unknownHash, defaultBranch()));
+ .hasMessage(String.format("Commit '%s' not found", unknownHash));
assertThatThrownBy(
() ->
sql(
@@ -458,8 +454,7 @@ void useShowReferencesAtWithFailureConditions()

assertThatThrownBy(() -> sql("USE REFERENCE %s AT %s IN nessie ", refName, randomHash))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessage(
- String.format("Could not find commit '%s' in reference '%s'.", randomHash, refName));
+ .hasMessage(String.format("Commit '%s' not found", randomHash));

assertThatThrownBy(() -> sql("USE REFERENCE %s AT `%s` IN nessie ", refName, invalidTimestamp))
.isInstanceOf(NessieNotFoundException.class)
@@ -469,8 +464,7 @@ void useShowReferencesAtWithFailureConditions()

assertThatThrownBy(() -> sql("USE REFERENCE %s AT %s IN nessie ", refName, invalidHash))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessageStartingWith(
- String.format("Could not find commit '%s' in reference '%s'", invalidHash, refName));
+ .hasMessage(String.format("Commit '%s' not found", invalidHash));
}

@Test
Original file line number Diff line number Diff line change
@@ -79,7 +79,6 @@
import org.projectnessie.error.NessieNamespaceNotFoundException;
import org.projectnessie.error.NessieNotFoundException;
import org.projectnessie.error.NessieReferenceConflictException;
- import org.projectnessie.error.NessieReferenceNotFoundException;
import org.projectnessie.error.ReferenceConflicts;
import org.projectnessie.model.Branch;
import org.projectnessie.model.CommitMeta;
@@ -374,36 +373,10 @@ public void references() throws Exception {
api().deleteTag().tag(tag).delete();
}

- // We cannot delete a branch if the expected hash is not reachable from its HEAD. Here,
- // the expected hash is not reachable anymore, because the branch was reassigned to main
- // previously.
- // In such cases we expect a not-found error.
AbstractThrowableAssert<?, ? extends Throwable> deleteConflict =
isV2()
? soft.assertThatThrownBy(() -> api().deleteBranch().branch(branch).getAndDelete())
: soft.assertThatThrownBy(() -> api().deleteBranch().branch(branch).delete());
deleteConflict
- .isInstanceOf(NessieReferenceNotFoundException.class)
- .hasMessageContaining("Could not find commit");
-
- // Move the HEAD of the branch from main to a new commit.
- Branch branchWithNewCommit =
- prepCommit(
- branchAssigned,
- "commit",
- Put.of(ContentKey.of("key"), Namespace.of("key")),
- dummyPut("key", "foo"))
- .commit();
-
- // We cannot delete a branch if the expected hash is reachable from HEAD, but is not HEAD. Here,
- // the expected hash is not the HEAD anymore, because the branch was updated to
- // branchWithNewCommit above.
- // In such cases we expect a conflict.
- deleteConflict =
- isV2()
- ? soft.assertThatThrownBy(
- () -> api().deleteBranch().branch(branchAssigned).getAndDelete())
- : soft.assertThatThrownBy(() -> api().deleteBranch().branch(branchAssigned).delete());
- deleteConflict
Contributor Author

This is the only test that failed after the changes. It is testing an edge case: the branch is at c1, but the provided expected hash is c2.

Previously there was no hash resolution, so the endpoint would throw a CONFLICT error since c2 is not the expected hash.

With the new hash resolution in place, the endpoint now throws NOT_FOUND because c2 is not reachable from c1 at all.

I added another test case below that generates a CONFLICT as before.
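
A minimal sketch of the edge case described in this comment, using a toy commit graph rather than Nessie's actual classes. `DeleteOutcomeSketch`, `Outcome`, and both `delete...` methods are hypothetical names chosen for illustration; the only thing taken from the thread is the before/after behavior (CONFLICT without hash resolution, NOT_FOUND once the expected hash must be reachable from HEAD).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Nessie code: each commit has at most one parent, a branch is just a HEAD hash. */
public class DeleteOutcomeSketch {

  public enum Outcome { OK, CONFLICT, NOT_FOUND }

  // Maps a commit hash to its parent hash; root commits map to null.
  private final Map<String, String> parents = new HashMap<>();

  public void addCommit(String hash, String parent) {
    parents.put(hash, parent);
  }

  /** Returns true if 'hash' is 'head' itself or one of its ancestors. */
  public boolean reachable(String head, String hash) {
    for (String c = head; c != null; c = parents.get(c)) {
      if (c.equals(hash)) {
        return true;
      }
    }
    return false;
  }

  /** Assumed "before" behavior: no resolution, the expected hash is only compared to HEAD. */
  public Outcome deleteWithoutResolution(String head, String expected) {
    return head.equals(expected) ? Outcome.OK : Outcome.CONFLICT;
  }

  /** Assumed "after" behavior: the expected hash must first be found on the branch. */
  public Outcome deleteWithOnBranchResolution(String head, String expected) {
    if (!reachable(head, expected)) {
      return Outcome.NOT_FOUND;
    }
    return head.equals(expected) ? Outcome.OK : Outcome.CONFLICT;
  }
}
```

With a branch at c1 and an expected hash c2 that shares only a common ancestor with c1, the first method reports a conflict and the second reports not-found, which is the difference the failing test exposed.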

Member

I'm not sure about the new behavior here. In this case, the hash is valid, but it is not in the current commit chain of the reference. This is exactly the case when one client assigns the reference, while another client attempts to delete it using the old HEAD. I tend to think that a "conflict" is more appropriate from the high-level perspective.

We seem to have two general approaches. Read operations throw "not found" when the hash is not found on the branch. Write operations throw "conflict" when the "expected" hash does not match server-side expectations. The relative hash case kind of mixes both cases.

Since write operations now require a concrete base hash, could we make the resolution more lenient and defer the (final) exception throwing to the highest-level method that knows the operation's end-user semantics?

On the other hand, all non-trivial unambiguous relative hash references resolve to something other than HEAD. Perhaps we could simplify the parsing of the expected hash parameter for assign/deleteReference requests and respond with a "bad request" if a relative spec is used. If the hash is absolute, we could avoid checking it against the history and instead return a "conflict" by comparing it to the current HEAD right away (this also requires less backend work).
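
The last suggestion could be sketched roughly as follows. This is an illustration only: `ExpectedHashCheckSketch`, `Result`, and `check` are hypothetical names, and treating any spec that contains `~`, `^`, or `*` as relative is an assumption about the relative-hash syntax made for this sketch, not something stated in the thread.

```java
/** Illustrative only: validates the "expected hash" of an assign/delete request. */
public class ExpectedHashCheckSketch {

  public enum Result { OK, CONFLICT, BAD_REQUEST }

  /** Assumption: a spec is "relative" if it carries one of these modifier characters. */
  public static boolean isRelative(String spec) {
    return spec.indexOf('~') >= 0 || spec.indexOf('^') >= 0 || spec.indexOf('*') >= 0;
  }

  /**
   * Rejects relative specs up front, and compares absolute hashes to the current HEAD
   * without walking the commit history.
   */
  public static Result check(String currentHead, String expectedSpec) {
    if (isRelative(expectedSpec)) {
      return Result.BAD_REQUEST;
    }
    return currentHead.equals(expectedSpec) ? Result.OK : Result.CONFLICT;
  }
}
```

The trade-off in this variant is that a stale absolute hash always yields a conflict, at the cost of rejecting relative specs for these two operations.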

Contributor Author

Your comments are 100% spot on. I hesitated a long time myself before picking one solution.

To summarize, we'd have 3 ways to move forward with this issue:

  1. Accept the behavior change: if c2 is not in main, then main@c2 should legitimately throw not-found.
  2. Use DETACHED to resolve relative hashes.
    a. Using DETACHED, VersionStore.hashOnReference() will simply look up the commit but won't validate that it belongs to any branch.
  3. Respond with a "bad request" if a relative spec is used in assignReference or deleteReference.
    a. This would work too, but it might not solve the issue completely because instead of conflict, we would throw a bad-request error.

Regarding option 2: this was actually my first idea. It solves the issue at hand, and doesn't seem to have any concerning downsides. I gave up on it, thinking that conceptually, option 1 was better than 2... but now I am not sure anymore :-)

Let's try option 2 first and see how it goes. If it doesn't look good, then I will go with your suggestion (option 3).
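
A rough sketch of what option 2 amounts to, again on a toy commit graph and not on Nessie's real `VersionStore`. `DetachedResolutionSketch`, `resolveDetached`, and `delete` are hypothetical names, and the relative part is simplified to "walk back N parents", which is an assumption made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model, not Nessie code: resolve a hash without checking branch membership. */
public class DetachedResolutionSketch {

  // Maps a commit hash to its parent hash; root commits map to null.
  private final Map<String, String> parents = new HashMap<>();

  public void addCommit(String hash, String parent) {
    parents.put(hash, parent);
  }

  /**
   * Detached-style resolution: only requires that the start commit exists and has
   * enough ancestors. It does not ask whether the commit is on any particular branch.
   */
  public Optional<String> resolveDetached(String start, int stepsBack) {
    if (!parents.containsKey(start)) {
      return Optional.empty();
    }
    String current = start;
    for (int i = 0; i < stepsBack; i++) {
      current = parents.get(current);
      if (current == null) {
        return Optional.empty();
      }
    }
    return Optional.of(current);
  }

  /** The write operation decides the final outcome after resolution. */
  public String delete(String head, String start, int stepsBack) {
    Optional<String> resolved = resolveDetached(start, stepsBack);
    if (!resolved.isPresent()) {
      return "NOT_FOUND"; // the commit does not exist at all
    }
    return resolved.get().equals(head) ? "OK" : "CONFLICT";
  }
}
```

In this model an existing commit that is not on the branch produces a conflict again, while a hash that matches no commit still produces not-found.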

Contributor Author

@dimas-b Seems like CI is happy with option 2!

.isInstanceOf(NessieReferenceConflictException.class)
.asInstanceOf(type(NessieReferenceConflictException.class))
@@ -412,12 +385,11 @@ public void references() throws Exception {
.extracting(Conflict::conflictType)
.containsExactly(ConflictType.UNEXPECTED_HASH);

- // Delete branch with new commit as expected HEAD: OK
if (isV2()) {
- Branch deleted = api().deleteBranch().branch(branchWithNewCommit).getAndDelete();
- soft.assertThat(deleted).isEqualTo(branchWithNewCommit);
+ Branch deleted = api().deleteBranch().branch(branchAssigned).getAndDelete();
+ soft.assertThat(deleted).isEqualTo(branchAssigned);
} else {
- api().deleteBranch().branch(branchWithNewCommit).delete();
+ api().deleteBranch().branch(branchAssigned).delete();
}

soft.assertThat(api().getAllReferences().get().getReferences())
Original file line number Diff line number Diff line change
@@ -22,7 +22,6 @@

import java.util.Collections;
import java.util.List;
- import java.util.Objects;
import java.util.Optional;
import java.util.function.BiConsumer;
import javax.annotation.Nullable;
@@ -153,8 +152,12 @@ public ResolvedHash resolveHashOnRef(
List<RelativeCommitSpec> relativeParts =
parsed.map(ParsedHash::getRelativeParts).orElse(Collections.emptyList());
Optional<Hash> resolved = Optional.of(startOrHead);
- if (!Objects.equals(startOrHead, currentHead) || !relativeParts.isEmpty()) {
- resolved = Optional.ofNullable(store.hashOnReference(ref, resolved, relativeParts));
+ if (!relativeParts.isEmpty()) {
+ // Resolve the hash against DETACHED because we are only interested in
+ // resolving the hash, not checking if it is on the branch. This will
+ // be done later on.
+ resolved =
+ Optional.ofNullable(store.hashOnReference(DetachedRef.INSTANCE, resolved, relativeParts));
}
return ResolvedHash.of(ref, resolved, Optional.ofNullable(currentHead));
}
Original file line number Diff line number Diff line change
@@ -37,29 +37,21 @@ public void testUnknownHashesOnValidNamedRefs() throws BaseNessieClientServerExc
createCommits(branch, 1, commits, currentHash);
assertThatThrownBy(() -> commitLog(branch.getName(), MINIMAL, null, invalidHash, null))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessageContaining(
- String.format(
- "Could not find commit '%s' in reference '%s'.", invalidHash, branch.getName()));
+ .hasMessageContaining(String.format("Commit '%s' not found", invalidHash));
Contributor Author

@dimas-b FYI this is the only downside so far of resolving hashes against DETACHED: the error returned is still not-found but the message changes slightly.

Member

LGTM 👍


assertThatThrownBy(() -> entries(branch.getName(), invalidHash))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessageContaining(
- String.format(
- "Could not find commit '%s' in reference '%s'.", invalidHash, branch.getName()));
+ .hasMessageContaining(String.format("Commit '%s' not found", invalidHash));

assertThatThrownBy(() -> contents(branch.getName(), invalidHash, ContentKey.of("table0")))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessageContaining(
- String.format(
- "Could not find commit '%s' in reference '%s'.", invalidHash, branch.getName()));
+ .hasMessageContaining(String.format("Commit '%s' not found", invalidHash));

assertThatThrownBy(
() ->
contentApi()
.getContent(ContentKey.of("table0"), branch.getName(), invalidHash, false))
.isInstanceOf(NessieNotFoundException.class)
- .hasMessageContaining(
- String.format(
- "Could not find commit '%s' in reference '%s'.", invalidHash, branch.getName()));
+ .hasMessageContaining(String.format("Commit '%s' not found", invalidHash));
}
}