fix:proving:post check sector handles snap deals replica faults #8177

Merged
merged 1 commit into from
Feb 24, 2022

ZenGround0
Contributor

@ZenGround0 ZenGround0 commented Feb 23, 2022

Related Issues

#8148

Proposed Changes

  • Handle fault checking for snap deals update replicas
  • In particular, check for faults in the update replicas even when the sector key replica and cache are still stored
  • Update storage miner api CheckProvable accordingly

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

  • All commits have a clear commit message.
  • The PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, INTERFACE BREAKING CHANGE, CONSENSUS BREAKING, build, chore, ci, docs, perf, refactor, revert, style, test
    • area: api, chain, state, vm, data transfer, market, mempool, message, block production, multisig, networking, paychan, proving, sealing, wallet, deps
  • This PR has tests for new functionality or change in behaviour
  • If new user-facing features are introduced, clear usage guidelines and / or documentation updates should be included in https://lotus.filecoin.io or Discussion Tutorials.
  • CI is green

@ZenGround0 ZenGround0 requested a review from a team as a code owner February 23, 2022 17:00
@codecov

codecov bot commented Feb 23, 2022

Codecov Report

Merging #8177 (abe04c3) into master (ba65d1e) will decrease coverage by 0.05%.
The diff coverage is 35.29%.

@@            Coverage Diff             @@
##           master    #8177      +/-   ##
==========================================
- Coverage   39.92%   39.87%   -0.06%     
==========================================
  Files         666      666              
  Lines       72560    72571      +11     
==========================================
- Hits        28973    28936      -37     
- Misses      38551    38586      +35     
- Partials     5036     5049      +13     
Impacted Files                                 Coverage Δ
api/api_storage.go                             0.00% <ø> (ø)
api/version.go                                 80.00% <ø> (ø)
cmd/lotus-miner/proving.go                     30.07% <0.00%> (-0.24%) ⬇️
node/impl/storminer.go                         22.40% <0.00%> (-0.33%) ⬇️
extern/sector-storage/faults.go                30.63% <33.33%> (+0.82%) ⬆️
extern/sector-storage/mock/mock.go             60.62% <100.00%> (ø)
storage/wdpost_run.go                          69.91% <100.00%> (-0.70%) ⬇️
markets/retrievaladapter/client_blockstore.go  62.50% <0.00%> (-6.25%) ⬇️
storage/wdpost_sched.go                        77.45% <0.00%> (-3.93%) ⬇️
miner/miner.go                                 55.08% <0.00%> (-2.63%) ⬇️
... and 14 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jennijuju jennijuju added P1 P1: Must be resolved release/backport labels Feb 24, 2022
@ZenGround0
Contributor Author

I've reproduced the original error from #8148 in a 2k devnet by deleting the p_aux file in an update cache for a sector that has not made it out of UpdateActivating.

INFO	storageminer	storage/wdpost_run.go:647	computing window post	{"batch": 0, "elapsed": 0.00035817}
2022-02-24T08:36:03.936-0700	ERROR	storageminer	storage/wdpost_run.go:649	error generating window post: could not read from path="/Users/zenground0/.lotusminer/update-cache/s-t01041-5/p_aux"

After reproducing this error, I shut down the devnet, rebuilt with the code on this branch, and restarted. The spinning error in the logs then went away. Checking the status of the sectors in this deadline, I found that the sector in question is correctly marked as faulted and the other sector in the partition is properly posted.

deadline  partition  sector  status
4         0          4       good
4         0          5       bad (stat /Users/zenground0/.lotusminer/update-cache/s-t01041-5/p_aux: no such file or directory)
