Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

[Timelock Corruption]: Final Wiring part 1 #5071

Merged
merged 7 commits into from
Oct 23, 2020
Merged

[Timelock Corruption]: Final Wiring part 1 #5071

merged 7 commits into from
Oct 23, 2020

Conversation

sudiksha27
Copy link
Contributor

@sudiksha27 sudiksha27 commented Oct 22, 2020

Goals (and why):
Wire corruption components

Implementation Description (bullets):

  • Basically connects all pieces together

Testing (What was existing testing like? What have you done to improve it?):
Modified existing tests

Concerns (what feedback would you like?):

  • Is it correct to initialize the PaxosHistoryProvider in PaxosResourceFactory?

  • Did I miss something?

Where should we start reviewing?:

  • PaxosResourceFactory

Priority (whenever / two weeks / yesterday):
ASAP 🏃‍♀️

@sudiksha27 sudiksha27 marked this pull request as ready for review October 22, 2020 15:00
@sudiksha27 sudiksha27 changed the title Wire through - part 1 [Timelock Corruption]: Wire through Oct 22, 2020
@sudiksha27 sudiksha27 changed the title [Timelock Corruption]: Wire through [Timelock Corruption]: Final Wiring part 1 Oct 22, 2020
Copy link
Contributor

@Jolyon-S Jolyon-S left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just picked up some nits in a drive-by.

import java.util.List;

public class CorruptionHealthCheck {
private final List<CorruptionDetector> corruptionDetectors;
private final LocalCorruptionDetector localCorruptionDetector;
private final List<CorruptionDetector> localAndRemoteCorruptionDetectors;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels a bit weird (although maybe it's an Atlas thing?) - that is, using a List to represent two and exactly two distinct things here. I'd prefer if you just made a small Immutables to hold the local and remote, then you only need one field and can call something like corruptionDetectors.local() (although naturally this doesn't lend itself trivially to your streams).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at LocalAndRemotes? It's a bit more powerful than what you need here, but has a nice way of accessing all or specifically the local.

@@ -31,46 +32,63 @@

private final ScheduledExecutorService executor = PTExecutors.newSingleThreadScheduledExecutor(
new NamedThreadFactory(CORRUPTION_DETECTOR_THREAD_PREFIX, true));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary diff.

@@ -32,7 +32,7 @@ public JerseyCorruptionFilter(CorruptionHealthCheck healthCheck) {

@Override
public void filter(ContainerRequestContext requestContext) {
if (!healthCheck.isHealthy()) {
if (healthCheck.shootTimeLock()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really a fan of shootTimelock - it feels pretty ambiguous (particularly as this is a boolean!)

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct. Main things:

  • shootTimeLock() looks like a command, I'd suggest making it clearer that it is a predicate (e.g. shouldRejectUserRequests())
  • I don't really like the findFirst() when identifying rogue clients, it seems we should just report all failing checks when multiple are unhappy.

@@ -17,5 +17,5 @@
package com.palantir.timelock.corruption.detection;

public interface CorruptionDetector {
boolean hasDetectedCorruption();
boolean shootTimeLock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe shouldRejectRequests()? The impression I get if I come across a method like this is that it actually makes the server stop requests and returns true if successful/false if not

import java.util.List;

public class CorruptionHealthCheck {
private final List<CorruptionDetector> corruptionDetectors;
private final LocalCorruptionDetector localCorruptionDetector;
private final List<CorruptionDetector> localAndRemoteCorruptionDetectors;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at LocalAndRemotes? It's a bit more powerful than what you need here, but has a nice way of accessing all or specifically the local.

HistoryAnalyzer::extractNamespaceAndUseCase,
HistoryAnalyzer::violatedCorruptionChecksForNamespaceAndUseCase));

SetMultimap<CorruptionCheckViolation, NamespaceAndUseCase> entryEntrySetMultimap = KeyedStream.stream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's name this based on what the map actually does: namespacesExhibitingViolations or something like this.

@VisibleForTesting
static List<CorruptionCheckViolation> violatedCorruptionChecksForNamespaceAndUseCase(
static CorruptionCheckViolation violatedCorruptionChecksForNamespaceAndUseCase(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should return List or some Collection given the name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the issue is that (and I missed this while writing #5057) these checks are to be executed in an order, e.g. the check for acceptor quorum for a learned value does not hold value if the learners have diverged as we anyway do not have a learner value to compare accepted values against.

Comment on lines 77 to 84
Function<CompletePaxosHistoryForNamespaceAndUseCase, CorruptionCheckViolation> divergedLearnedCheck =
HistoryAnalyzer::divergedLearners;
Function<CompletePaxosHistoryForNamespaceAndUseCase, CorruptionCheckViolation> learnedValueWithoutQuorum =
HistoryAnalyzer::learnedValueWithoutQuorum;
Function<CompletePaxosHistoryForNamespaceAndUseCase, CorruptionCheckViolation> greatestAcceptedValueNotLearned =
HistoryAnalyzer::greatestAcceptedValueNotLearned;

return Stream.of(divergedLearnedCheck, learnedValueWithoutQuorum, greatestAcceptedValueNotLearned)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth extracting a constant for the set of checks


private void processLocalHealthReport() {
localCorruptionState = getLocalCorruptionState(localCorruptionReport);
if (localCorruptionState.shootTimeLock()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as flagged elsewhere, this gets confusing, I think because it looks like a command, but it's not

assertThat(HistoryAnalyzer.corruptionHealthReportForHistory(historyForAll)
.statusesToNamespaceAndUseCase()
.size())
.isEqualTo(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Prefer isEmpty()

.isEqualTo(CorruptionCheckViolation.NONE);

assertThat(HistoryAnalyzer.corruptionHealthReportForHistory(historyForAll)
.statusesToNamespaceAndUseCase()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we rename this violatingStatuses... or something like that?

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed: yeah ordered checks make a lot of sense!

@sudiksha27 sudiksha27 merged commit ce7ec74 into develop Oct 23, 2020
@delete-merged-branch delete-merged-branch bot deleted the wire-ete-2 branch October 23, 2020 11:37
@svc-autorelease
Copy link
Collaborator

Released 0.257.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants