Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Fix some unjustified disputes #6103

Merged
merged 10 commits into from
Jan 22, 2023
Merged
16 changes: 10 additions & 6 deletions node/core/candidate-validation/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -527,8 +527,9 @@ async fn validate_candidate_exhaustive(
Err(e) => {
gum::info!(target: LOG_TARGET, ?para_id, err=?e, "Invalid candidate (validation code)");

// If the validation code is invalid, the candidate certainly is.
return Ok(ValidationResult::Invalid(InvalidCandidate::CodeDecompressionFailure))
// Code already passed pre-checking, if decompression fails now this most likley means
// some local corruption happened.
return Err(ValidationFailed("Code decompression failed".to_string()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does every network now have pre-checking turned on?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the node side, as soon as people upgrade to 0.9.30 (#5977). In the runtime, Kusama and Polkadot still don't.

pvfCheckingEnabled: false

Someone needs to submit a motion I suppose.

},
};

Expand Down Expand Up @@ -560,7 +561,6 @@ async fn validate_candidate_exhaustive(

match result {
Err(ValidationError::InternalError(e)) => Err(ValidationFailed(e)),

Err(ValidationError::InvalidCandidate(WasmInvalidCandidate::HardTimeout)) =>
Ok(ValidationResult::Invalid(InvalidCandidate::Timeout)),
Err(ValidationError::InvalidCandidate(WasmInvalidCandidate::WorkerReportedError(e))) =>
Expand All @@ -569,9 +569,13 @@ async fn validate_candidate_exhaustive(
Ok(ValidationResult::Invalid(InvalidCandidate::ExecutionError(
"ambiguous worker death".to_string(),
))),
Err(ValidationError::InvalidCandidate(WasmInvalidCandidate::PrepareError(e))) =>
Ok(ValidationResult::Invalid(InvalidCandidate::ExecutionError(e))),

Err(ValidationError::InvalidCandidate(WasmInvalidCandidate::PrepareError(e))) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually now that I revisit this, this should be fixed at one layer down already: WasmInvalidCandidate is already the wrong error. The candidate is not at fault if preparation failed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is merged, actually more cleanup will be needed:

PrecheckOutcome::Failed and the related PrepareError.

Reason we no longer differentiate. During pre-checking we always assume invalid, just to be on the safe side. During preparation, we never want to raise a dispute if preparation fails (as it previously passed pre-checking), so we never want it to be of state InvalidCandidate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differentiation between the two error types (deterministic and non deterministic) still makes sense. For deterministic errors happening in preparation, we should print a big fat error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differentiation between the two error types (deterministic and non deterministic) still makes sense. For deterministic errors happening in preparation, we should print a big fat error.

You can call is_deterministic() on PrepareError. (Semantically it made sense to me to have separate error types, but others didn't think it was necessary.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I arrived at the conclusion that the distinction is very useful. Non spurious errors, suggest some major issue like a hardware fault - the very least, they should have a higher log level.

// In principle if preparation of the `WASM` fails, the current candidate can not be the
// reason for that. So we can't say whether it is invalid or not in addition with
// pre-checking enabled only valid runtimes should ever get enacted, so we can be
// reasonably sure that this is some local problem on the current node.
Err(ValidationFailed(e))
},
Ok(res) =>
if res.head_data.hash() != candidate_receipt.descriptor.para_head {
gum::info!(target: LOG_TARGET, ?para_id, "Invalid candidate (para_head)");
Expand Down
4 changes: 2 additions & 2 deletions node/core/candidate-validation/src/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -671,7 +671,7 @@ fn compressed_code_works() {
}

#[test]
fn code_decompression_failure_is_invalid() {
fn code_decompression_failure_is_error() {
let validation_data = PersistedValidationData { max_pov_size: 1024, ..Default::default() };
let pov = PoV { block_data: BlockData(vec![1; 32]) };
let head_data = HeadData(vec![1, 1, 1]);
Expand Down Expand Up @@ -714,7 +714,7 @@ fn code_decompression_failure_is_invalid() {
&Default::default(),
));

assert_matches!(v, Ok(ValidationResult::Invalid(InvalidCandidate::CodeDecompressionFailure)));
assert_matches!(v, Err(_));
}

#[test]
Expand Down
131 changes: 67 additions & 64 deletions node/core/dispute-coordinator/src/initialized.rs
Original file line number Diff line number Diff line change
Expand Up @@ -198,59 +198,62 @@ impl Initialized {
gum::trace!(target: LOG_TARGET, "Waiting for message");
let mut overlay_db = OverlayedBackend::new(backend);
let default_confirm = Box::new(|| Ok(()));
let confirm_write =
eskimor marked this conversation as resolved.
Show resolved Hide resolved
match MuxedMessage::receive(ctx, &mut self.participation_receiver).await? {
MuxedMessage::Participation(msg) => {
gum::trace!(target: LOG_TARGET, "MuxedMessage::Participation");
let ParticipationStatement {
session,
let confirm_write = match MuxedMessage::receive(ctx, &mut self.participation_receiver)
.await?
{
MuxedMessage::Participation(msg) => {
gum::trace!(target: LOG_TARGET, "MuxedMessage::Participation");
let ParticipationStatement {
session,
candidate_hash,
candidate_receipt,
outcome,
} = self.participation.get_participation_result(ctx, msg).await?;
if let Some(valid) = outcome.validity() {
gum::trace!(
target: LOG_TARGET,
?session,
?candidate_hash,
?valid,
"Issuing local statement based on participation outcome."
);
self.issue_local_statement(
ctx,
&mut overlay_db,
candidate_hash,
candidate_receipt,
outcome,
} = self.participation.get_participation_result(ctx, msg).await?;
if let Some(valid) = outcome.validity() {
gum::trace!(
target: LOG_TARGET,
?session,
?candidate_hash,
?valid,
"Issuing local statement based on participation outcome."
);
self.issue_local_statement(
ctx,
&mut overlay_db,
candidate_hash,
candidate_receipt,
session,
valid,
clock.now(),
)
.await?;
}
session,
valid,
clock.now(),
)
.await?;
} else {
gum::warn!(target: LOG_TARGET, ?outcome, "Dispute participation failed");
}
default_confirm
},
MuxedMessage::Subsystem(msg) => match msg {
FromOrchestra::Signal(OverseerSignal::Conclude) => return Ok(()),
FromOrchestra::Signal(OverseerSignal::ActiveLeaves(update)) => {
gum::trace!(target: LOG_TARGET, "OverseerSignal::ActiveLeaves");
self.process_active_leaves_update(
ctx,
&mut overlay_db,
update,
clock.now(),
)
.await?;
default_confirm
},
MuxedMessage::Subsystem(msg) => match msg {
FromOrchestra::Signal(OverseerSignal::Conclude) => return Ok(()),
FromOrchestra::Signal(OverseerSignal::ActiveLeaves(update)) => {
gum::trace!(target: LOG_TARGET, "OverseerSignal::ActiveLeaves");
self.process_active_leaves_update(
ctx,
&mut overlay_db,
update,
clock.now(),
)
.await?;
default_confirm
},
FromOrchestra::Signal(OverseerSignal::BlockFinalized(_, n)) => {
gum::trace!(target: LOG_TARGET, "OverseerSignal::BlockFinalized");
self.scraper.process_finalized_block(&n);
default_confirm
},
FromOrchestra::Communication { msg } =>
self.handle_incoming(ctx, &mut overlay_db, msg, clock.now()).await?,
FromOrchestra::Signal(OverseerSignal::BlockFinalized(_, n)) => {
gum::trace!(target: LOG_TARGET, "OverseerSignal::BlockFinalized");
self.scraper.process_finalized_block(&n);
default_confirm
},
};
FromOrchestra::Communication { msg } =>
self.handle_incoming(ctx, &mut overlay_db, msg, clock.now()).await?,
},
};

if !overlay_db.is_empty() {
let ops = overlay_db.into_write_ops();
Expand Down Expand Up @@ -391,19 +394,19 @@ impl Initialized {
CompactStatement::Valid(_) =>
ValidDisputeStatementKind::BackingValid(relay_parent),
};
debug_assert!(
SignedDisputeStatement::new_checked(
debug_assert!(
SignedDisputeStatement::new_checked(
DisputeStatement::Valid(valid_statement_kind),
candidate_hash,
session,
validator_public.clone(),
validator_signature.clone(),
).is_ok(),
"Scraped backing votes had invalid signature! candidate: {:?}, session: {:?}, validator_public: {:?}",
candidate_hash,
session,
validator_public,
);
).is_ok(),
"Scraped backing votes had invalid signature! candidate: {:?}, session: {:?}, validator_public: {:?}",
candidate_hash,
session,
validator_public,
);
let signed_dispute_statement =
SignedDisputeStatement::new_unchecked_from_trusted_source(
DisputeStatement::Valid(valid_statement_kind),
Expand Down Expand Up @@ -485,20 +488,20 @@ impl Initialized {
})
.cloned()?;

debug_assert!(
SignedDisputeStatement::new_checked(
debug_assert!(
SignedDisputeStatement::new_checked(
dispute_statement.clone(),
candidate_hash,
session,
validator_public.clone(),
validator_signature.clone(),
).is_ok(),
"Scraped dispute votes had invalid signature! candidate: {:?}, session: {:?}, dispute_statement: {:?}, validator_public: {:?}",
candidate_hash,
session,
).is_ok(),
"Scraped dispute votes had invalid signature! candidate: {:?}, session: {:?}, dispute_statement: {:?}, validator_public: {:?}",
candidate_hash,
session,
dispute_statement,
validator_public,
);
validator_public,
);

Some((
SignedDisputeStatement::new_unchecked_from_trusted_source(
Expand Down
2 changes: 1 addition & 1 deletion node/core/dispute-coordinator/src/participation/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -357,7 +357,7 @@ async fn participate(
err,
);

send_result(&mut result_sender, req, ParticipationOutcome::Invalid).await;
send_result(&mut result_sender, req, ParticipationOutcome::Error).await;
},

Ok(Ok(ValidationResult::Invalid(invalid))) => {
Expand Down
4 changes: 0 additions & 4 deletions node/malus/src/variants/common.rs
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,6 @@ pub enum FakeCandidateValidationError {
ParamsTooLarge,
/// Code size is over the limit.
CodeTooLarge,
/// Code does not decompress correctly.
CodeDecompressionFailure,
/// PoV does not decompress correctly.
POVDecompressionFailure,
/// Validation function returned invalid data.
Expand All @@ -88,8 +86,6 @@ impl Into<InvalidCandidate> for FakeCandidateValidationError {
FakeCandidateValidationError::Timeout => InvalidCandidate::Timeout,
FakeCandidateValidationError::ParamsTooLarge => InvalidCandidate::ParamsTooLarge(666),
FakeCandidateValidationError::CodeTooLarge => InvalidCandidate::CodeTooLarge(666),
FakeCandidateValidationError::CodeDecompressionFailure =>
InvalidCandidate::CodeDecompressionFailure,
FakeCandidateValidationError::POVDecompressionFailure =>
InvalidCandidate::PoVDecompressionFailure,
FakeCandidateValidationError::BadReturn => InvalidCandidate::BadReturn,
Expand Down
2 changes: 0 additions & 2 deletions node/primitives/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -219,8 +219,6 @@ pub enum InvalidCandidate {
ParamsTooLarge(u64),
/// Code size is over the limit.
CodeTooLarge(u64),
/// Code does not decompress correctly.
eskimor marked this conversation as resolved.
Show resolved Hide resolved
CodeDecompressionFailure,
/// PoV does not decompress correctly.
PoVDecompressionFailure,
/// Validation function returned invalid data.
Expand Down