-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch deal activations #1310
Batch deal activations #1310
Conversation
9cae3bb
to
3b6e862
Compare
3b6e862
to
12a4ae8
Compare
e4e5bab
to
97f07d0
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #1310 +/- ##
==========================================
- Coverage 90.75% 90.75% -0.01%
==========================================
Files 137 137
Lines 27296 27348 +52
==========================================
+ Hits 24772 24819 +47
- Misses 2524 2529 +5
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks good.
I would like to explore reducing the verbosity of the intra-batch error handling and re-use of existing batch result patterns. Inside batch_activate_deals
, the match blocks and return None
are very noisy compared with propagating ?
. How about producing a Vec<Result<ActivateDealsResult, ActorError>>
instead of a Vec<Option<...>>
, and then you can use .with_context(...)?
inside the map closure? Then convert this vector into a BatchReturn
and array of only the successful ActivateDealsResult
. This will return more info to the caller about what went wrong (tho initially will probably be ignored). You mentioned that the None
results were to ease parallel iteration, but I think adding a method to BatchReturn
to iterate the exit codes alongside a vector of values corresponding only to the OKs
will not be hard, and will be re-usable elsewhere. The caller can use this to get the same parallel results.
FYI @ZenGround0 for opinions.
actors/market/src/lib.rs
Outdated
.sectors | ||
.iter() | ||
.map(|p| { | ||
if p.deal_ids.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be filtered out on the caller side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method would actually work fine without this check - it's just a minor optimisation. This method could either:
- check and fail if a provided sector has no deals
- just work and activate no deals
I think the latter is more robust here, and this early return not worth the lines of code. Everything below should handle empty collections too (and most likely already does).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just work and activate no deals
The shortcut path in the miner actor previously simulated this behaviour. The market actor is robust against sectors empty of deals so passing them through the usual path is ok. Though the batch_return pattern works well elsewhere in the PR for handling real errors, this is a valid use case and handling it in the miner actor will add a bit of code complexity to filter then re-interleave the sectors that either succeeded via shortcut or succeeded via market activation.
actors/market/src/lib.rs
Outdated
} | ||
}; | ||
|
||
let deal_spaces = match validate_and_return_deal_space( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's potentially a major security issue here.
validate_and_return_deal_spaces
needs access to all deals at once to make sure we're not double activating deals and over claiming QA power. But as I understand this change this is being called once per sector. A concrete attack on this is a batch of n deal activations all over the same deal ids giving the caller an amplification of n times their qa power.
It might be that verified registry now saves us here, will have to look more.
We should have a test of this case as part of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok it looks like the duplicates will be fed to CLAIM_ALLOCATIONS
which doesn't have the same bug and will just fail the whole thing ✅
That said we should still avoid double activations on market actor or convince ourselves that activation is idempotent and then remove all duplication checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's get_proposals
that used to check for duplicates, but does't achieve it any more. It's the verified registry's problem to ensure QA power isn't amplified – this actor shouldn't be trusted for that any more and I'm glad to hear it's not.
But we should still avoid double-activating deals. I think it would be pretty confusing for this method to return deal space info for deals that were not activated (because they were already active). So I propose we check and fail the whole thing here if a duplicate deal is detected between sectors. Accumulate deal IDs similar to get_proposals
(there's some duplication of work but I don't think worth fixing).
I note that verify_deals_for_activation
(called at pre-commit and, redundantly, replica update) won't catch this either, and might be worth fixing in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made changes so that if a duplicate deal is detected between sectors, the subsequent activations are skipped and marked as failed. This preserves the behaviour checked in replica_update_test::dealincluded_in_multiple_sectors_failure
Yes I think this will be a lot better
It would be ideal to use this as it was an attempt to standardize batch return data types for ease of use
If using BatchReturn is being unnecessarily clunky then we can perhaps improve it here so that it is more likely to stick as a standard solution. |
pub sector_expiry: ChainEpoch, | ||
#[serde(transparent)] | ||
pub struct BatchActivateDealsParams { | ||
pub sectors: Vec<SectorDeals>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sector deals introduces sector_type: RegisteredSealProof
which is unecessary in this PR but will be used in #1312 to optimise ProveReplicaUpdates
actors/market/src/lib.rs
Outdated
.sectors | ||
.iter() | ||
.map(|p| { | ||
if p.deal_ids.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just work and activate no deals
The shortcut path in the miner actor previously simulated this behaviour. The market actor is robust against sectors empty of deals so passing them through the usual path is ok. Though the batch_return pattern works well elsewhere in the PR for handling real errors, this is a valid use case and handling it in the miner actor will add a bit of code complexity to filter then re-interleave the sectors that either succeeded via shortcut or succeeded via market activation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change itself looks good, only minor comments.
This new batch functionality needs testing. Please add a few market actor unit tests for batch sizes >1, including the basics like:
- multiple sectors succeeding
- one deal causing one sector to fail
- multiple sectors failing
- condition that causes whole call to fail
- a sector with no deals passing through alongside another with deals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case of same deal across different sectors now LGTM. Can look in more detail tomorrow at the rest of the change.
1541e83
to
2ff38ab
Compare
2ff38ab
to
ddc6934
Compare
Running a final benchmark on this gets gas down to: 213,060,403 from 239,773,152 or a ~11% reduction in gas cost. |
Addressing #1278
A small tidying refactor and reduction of messaging overhead costs is seen here. Gas savings are not as significant as #1304 since the costliness of the state updates themselves have not been reduced per se.
Design caveats:
TODOs
map
to enforce parallel return structure