Storage deals can become stuck in state StorageDealStaged when MaxWaitDealsSectors is low #6019

Closed
neondragon opened this issue Apr 11, 2021 · 5 comments

Comments

@neondragon

neondragon commented Apr 11, 2021

Problem

When a miner publishes storage deals, the deals are allocated to WaitDeals sectors. If the existing WaitDeals sectors have insufficient space and MaxWaitDealsSectors prevents creation of a new WaitDeals sector, the storage deal remains in state StorageDealStaged.

When the number of WaitDeals sectors later drops below MaxWaitDealsSectors, Lotus does not always create new WaitDeals sectors, and the storage deals remain in state StorageDealStaged, apparently indefinitely.

Steps to reproduce

Version: lotus-miner version 1.6.0+mainnet+git.3cfb6b09d.dirty

Configure MaxWaitDealsSectors=4
Publish a batch of 16 32GiB storage deals.

Expected result

Four AddPiece tasks run and four WaitDeals sectors are created. Those four sectors may or may not move immediately to PC1 depending on the logic intended in Lotus right now. However, after each sector's AddPiece->WaitDeals->PC1 transition, the remaining 12 storage deals waiting in StorageDealStaged should start an AddPiece task into a WaitDeals sector.

Actual result

The remaining 12 storage deals stay in state StorageDealStaged and are never packed into sectors.

After restarting lotus-miner, they are packed.

Example of staged deals on my miner right now (edited for privacy):

# lotus-miner storage-deals list | grep StorageDealStaged
...ht4ukwlu  1696082  StorageDealStaged   f3...   32GiB
...sxa47h5u  1696084  StorageDealStaged   f3...   32GiB
...ygahhagy  1696085  StorageDealStaged   f3...   32GiB
...4sr2q73e  1700059  StorageDealStaged   f3...   32GiB
...3uwdkeqe  1700065  StorageDealStaged   f3...   32GiB
...lys6q3l4  1700061  StorageDealStaged   f3...   32GiB
...dgtvruh4  1696089  StorageDealStaged   f3...   32GiB
...yu4ln5cm  1700066  StorageDealStaged   f3...   32GiB
...aiisbdg4  1700063  StorageDealStaged   f3...   32GiB
...lcnpxmti  1700060  StorageDealStaged   f3...   32GiB
...rrnrwg7e  1700069  StorageDealStaged   f3...   32GiB
...lbmklwqi  1700368  StorageDealStaged   f3...   32GiB
...rurft45u  1700067  StorageDealStaged   f3...   32GiB
...blcohsay  1700071  StorageDealStaged   f3...   32GiB
...bj3ez62u  1700367  StorageDealStaged   f3...   32GiB
...6vhwjc6y  1700068  StorageDealStaged   f3...   32GiB
...3bwxyqvi  1700369  StorageDealStaged   f3...   32GiB
...azp3r5kq  1700073  StorageDealStaged   f3...   32GiB
...hhkpki54  1700070  StorageDealStaged   f3...   32GiB
...i5t4h3vq  1700072  StorageDealStaged   f3...   32GiB
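
For anyone who wants to watch for this condition, here is a minimal Python sketch that picks the staged deals out of saved `lotus-miner storage-deals list` output. It assumes the whitespace-separated column order seen in the redacted excerpt above (proposal CID, deal ID, state, client, size); the real output may have more columns, and the `StorageDealActive` row in the sample is made up for illustration.

```python
def staged_deals(listing: str) -> list[tuple[str, str]]:
    """Return (deal_id, size) for every row whose state is StorageDealStaged."""
    rows = []
    for line in listing.splitlines():
        cols = line.split()
        # Assumed layout (from the excerpt): CID, deal ID, state, client, size.
        if len(cols) >= 5 and cols[2] == "StorageDealStaged":
            rows.append((cols[1], cols[4]))
    return rows


# Two rows copied from the excerpt plus one hypothetical non-staged row.
sample = """\
...ht4ukwlu  1696082  StorageDealStaged   f3...   32GiB
...sxa47h5u  1696084  StorageDealStaged   f3...   32GiB
...aaaaaaaa  1696000  StorageDealActive   f3...   32GiB
"""

stuck = staged_deals(sample)
print(len(stuck), "deals staged:", [deal_id for deal_id, _ in stuck])
```

A staged count that stays non-zero across repeated checks while no AP task is running would match the symptom described here.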

Sector list -- no WaitDeals/PC1 despite staged deals waiting

# lotus-miner sectors list | tail -n 10
1794  Proving         YES      NO      1742584 (in 1 year 1 week)      1            31.56GiB
1795  Proving         YES      NO      1742584 (in 1 year 1 week)      1            31.56GiB
1796  WaitSeed        NO       NO      n/a                             1
1797  Committing      NO       NO      n/a                             1
1798  Committing      NO       NO      n/a                             1
1799  WaitSeed        NO       NO      n/a                             1
1800  WaitSeed        NO       NO      n/a                             1
1801  WaitSeed        NO       NO      n/a                             1
1802  WaitSeed        NO       NO      n/a                             1
1803  WaitSeed        NO       NO      n/a                             1
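
The same check can be done on the sector side. This is a rough Python sketch that tallies sector states from saved `lotus-miner sectors list` output, assuming the sector number is the first column and the state is the second, as in the excerpt above. Header lines and anything that does not start with a number are skipped.

```python
from collections import Counter


def sector_states(listing: str) -> Counter:
    """Count sectors per state from `sectors list` style text."""
    counts = Counter()
    for line in listing.splitlines():
        cols = line.split()
        # Assumed layout: sector number first, state second.
        if len(cols) >= 2 and cols[0].isdigit():
            counts[cols[1]] += 1
    return counts


# Three rows copied from the excerpt above.
sample = """\
1795  Proving         YES      NO      1742584 (in 1 year 1 week)      1            31.56GiB
1796  WaitSeed        NO       NO      n/a                             1
1797  Committing      NO       NO      n/a                             1
"""

states = sector_states(sample)
# A Counter returns 0 for states that never appear, so this is safe.
print("WaitDeals sectors:", states["WaitDeals"])
```

Zero WaitDeals sectors together with a non-empty list of staged deals is the combination reported in this issue.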

Sealing jobs

# lotus-miner sealing jobs
ID        Sector  Worker    Hostname                Task  State    Time
d4126189  1798    cb9cb786  gb-wlv1-filecoin-seal2  C2    running  2m55.1s
7f4f3ae6  1797    f8de7cdb  gb-wlv1-filecoin-seal3  C2    running  2m23.5s

config.toml section [Sealing]

[Sealing]
  MaxWaitDealsSectors = 4
  MaxSealingSectors = 21
  MaxSealingSectorsForDeals = 0
  WaitDealsDelay = "12h0m0s"
  AlwaysKeepUnsealedCopy = false

Sealing jobs (immediately after miner restarted). Note: 4x AP now started as expected.

# lotus-miner sealing jobs
ID        Sector  Worker    Hostname                Task  State     Time
0f99369a  1799    00000000  gb-wlv1-filecoin-seal3  C2    ret-wait  8m16.7s
59c95d33  1805    a3b6e3bb  gb-wlv1-filecoin-seal4  AP    running   1.2s
41dcbb42  1806    a3b6e3bb  gb-wlv1-filecoin-seal4  AP    running   1.2s
1414a03d  1807    a3b6e3bb  gb-wlv1-filecoin-seal4  AP    running   1.2s
2a489fa1  1804    a3b6e3bb  gb-wlv1-filecoin-seal4  AP    running   1.2s
5307668d  1801    a3b6e3bb  gb-wlv1-filecoin-seal4  C1    running   1.2s
99fefbb0  1802    a3b6e3bb  gb-wlv1-filecoin-seal4  C1    running   1.2s
e0371b19  1799    a3b6e3bb  gb-wlv1-filecoin-seal4  C1    running   1.2s
bde968d6  1796    a3b6e3bb  gb-wlv1-filecoin-seal4  C1    running   1.2s

Wild Speculation

I suspect the logic that implements the shortcut of transitioning a full (32GiB) sector immediately to PC1 may not be performing the full AddPiece->WaitDeals->PC1 transition. If the check to move StorageDealStaged into new WaitDeals sectors is done at the WaitDeals->PC1 transition of the state machine, and in the case of 32GiB deals into 32GiB sectors we take the shortcut of AddPiece -> PC1, that could explain this. Total guess. Haven't checked the code.
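
To make the guess above concrete, here is a toy Python model of it. This is not Lotus code and does not reflect how the sealing state machine is actually implemented; the function, the `shortcut_skips_recheck` flag, and the queue are all hypothetical. It only shows that if the staged queue were re-checked solely on the WaitDeals->PC1 transition, and full sectors bypassed that transition, the numbers would come out the way they do in this report.

```python
from collections import deque


def simulate(num_deals: int, max_wait_deals: int, shortcut_skips_recheck: bool) -> int:
    """Return how many deals are still staged once the toy pipeline goes idle."""
    staged = deque(range(num_deals))
    wait_deals = 0  # sectors currently open for deals

    def try_assign() -> int:
        """Open one sector per staged deal while the limit allows it."""
        nonlocal wait_deals
        opened = 0
        while staged and wait_deals < max_wait_deals:
            staged.popleft()
            wait_deals += 1
            opened += 1
        return opened

    # Deals are published: fill sectors up to MaxWaitDealsSectors.
    opened = try_assign()
    while opened:
        # Each sector holds one 32GiB deal, so it is full and starts sealing.
        wait_deals -= opened
        # Hypothesised bug: the direct AddPiece -> PC1 path never
        # re-checks the staged queue, so nothing new is opened.
        opened = 0 if shortcut_skips_recheck else try_assign()
    return len(staged)


print(simulate(16, 4, shortcut_skips_recheck=False))  # expected behaviour
print(simulate(16, 4, shortcut_skips_recheck=True))   # behaviour under the guess
```

With 16 deals and a limit of 4, the model drains the queue completely when the re-check runs and leaves 12 deals staged when it is skipped, which is the count seen in the actual result.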

I (Slack @NeonixAF f019551), Slack @TippyFlits, and Slack @stuberman have been unable to narrow it down further.

TippyFlits has been importing (mostly) 1GiB offline deals and doesn't experience this issue right now. Stuberman and I have been importing (mostly) 32GiB deals and do experience it.

@kernelogic

I think I ran into something similar #6010 (comment)

That deal is stuck in StorageDealStaged and there is no way to move it forward.

@kaptinlin

We have the same issue. Any solution?

@stuberman

Looks like it was fixed in this commit - not yet merged into master

#6041

@Shekelme

Shekelme commented Apr 26, 2021

Huh. In my case no WaitDeals sectors appeared, even though there was not a single WaitDeals sector at the time the deals were made.
My current parameters:

MaxWaitDealsSectors = 3
MaxSealingSectorsForDeals = 5

And restarting the miner did not solve this problem.

dkkapur added this to the 🤝 Deal Success milestone May 19, 2021
@rjan90
Contributor

rjan90 commented Nov 25, 2021

I think this issue has been solved now and can be closed? #rengjøring (Norwegian for "cleanup")
