Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

on restart, all older active deals error out on 1.11.2-rc1 #7214

Closed
8 of 18 tasks
stuberman opened this issue Aug 28, 2021 · 9 comments
Closed
8 of 18 tasks

on restart, all older active deals error out on 1.11.2-rc1 #7214

stuberman opened this issue Aug 28, 2021 · 9 comments
Assignees
Labels
area/markets/storage kind/bug Kind: Bug P1 P1: Must be resolved team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs

Comments

@stuberman
Copy link

Checklist

  • This is not a security-related bug/issue. If it is, please follow please follow the security policy.
  • This is not a question or a support request. If you have any lotus related questions, please ask in the lotus forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file a improvement suggestion instead.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the Latest release, or the most recent RC(release canadiate) for the upcoming release or the dev branch(master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus miner - mining and block production
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt)
  • lotus miner/market - storage deal
  • lotus miner/market - retrieval deal
  • lotus miner/market - data transfer
  • lotus client
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

lotus-miner version
Daemon:  1.11.2-rc1+mainnet+git.80624845c.dirty+api1.2.0
Local: lotus-miner version 1.11.2-rc1+mainnet+git.80624845c.dirty

Formerly
lotus-worker info
Worker version:  1.1.0
CLI version: lotus-worker version 1.11.3-dev+mainnet+git.714635c75.dirty

Describe the Bug

I was running on a current Master pull, when I was asked to downgrade to 1.11.2-rc1 for testing.
(I have also been using various builds such as 'back-to-ocl-nemo' at the request of nemo.)
I have more than 2500 active deals, but now when I run lotus-miner info I only see 11 active deals and storage-deals list shows most deals are errored now (except deals processed today)

Logging Information

lotus-miner info

Enabled subsystems (from miner API): [Mining Sealing SectorStorage]
Enabled subsystems (from markets API): [Markets]
Chain: [sync ok] [basefee 476.002 pFIL]
Miner: f01278 (32 GiB sectors)
Power: 119 Ti / 9.83 Ei (0.0012%)
	Raw: 45.53 TiB / 9.828 EiB (0.0004%)
	Committed: 45.53 TiB
	Proving: 45.53 TiB
Projected average block win rate: 1.17/week (every 144h7m4s)
Projected block win with 99.9% probability every 995h30m5s
(projections DO NOT account for future network and miner growth)

Miner Balance:    1609.534 FIL
      PreCommit:  20.324 mFIL
      Pledge:     924.519 FIL
      Vesting:    255.157 FIL
      Available:  429.839 FIL
Market Balance:   100.645 FIL
       Locked:    11.563 FIL
       Available: 89.082 FIL
Worker Balance:   18.273 FIL
       Control:   555.095 FIL
Total Spendable:  1092.288 FIL

Sectors:
	Total: 1591
	Proving: 1458
	PreCommit1: 1
	Removed: 132

Storage Deals: 11, 151 GiB
      Active:            7  85.5 GiB (Verified: 7 85.5 GiB)
      AwaitingPreCommit: 3  33.5 GiB (Verified: 3 33.5 GiB)
      Publish:           1  32 GiB   (Verified: 1 32 GiB)

Retrieval Deals (complete): 191, 1.04 TiB

Example of a recent deal that is in an active Sector but now showing an error:

lotus-miner storage-deals list -v |grep 2330919
Aug 24 10:43:43  true   bafyreicnkrvl2tqvdgk2w5c3c43qohbpgesqys7vb6jldj5krlguenzrkq  2330919  StorageDealError              f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi                                               32GiB   0 FIL             1468800                                                                                                                                 error waiting for deal completion: awaiting deal expired: getting deal info: looking for publish deal message bafy2bzacecgcet5fztcwxealexgofak2cn7gsymuod4io6ciuhy3pvriv2a76: search msg failed: failed to load message: blockstore: block not found


lotus-miner sectors status --on-chain-info --log 1555
SectorID:	1555
Status:		Proving
CIDcommD:	baga6ea4seaqbt2nhxrnju7wip3thbnrnaksvhagkavum7mvesimuqrsayg7duka
CIDcommR:	bagboea4b5abcbnyi2wpc6rdbftozygcccxem32r6hudkeh7dqzaojdva3z6ikqbc
Ticket:		ce1a3fd48f5a5179eb7f712ea03d768c385860c05851b1e22f682bf335139c97
TicketH:	1049817
Seed:		dbc7c9673f777702d493482da90716d06d2aa6fec273904017838c46059270d6
SeedH:		1052490
Precommit:	bafy2bzacebdaek5m22pigdhimpnhkyhcvv432f64e4ld2kum52chijcixpna2
Commit:		bafy2bzaceaust7whtddpcv4e6alatgtuef5oavclawfj4qkjti4odar7hqxie
Proof:		81f4e3b42ae636bc78655eb6f742573e94949461bf6debecddd4fea4ae3c6f64e4ccfc82539c9aee3fc10e6adb8ab054a603f5924a7ad2e62cbd823ce97d20f56103a027760f5f4e0ff6447578098f9afa37892d14f83bb46f91c41f5b21ca45178d5a5028845f99969d67c474084c07b9421285a9c72f88bb54d617923f807a6d9728480dc4d5bd575fb40bfedaa2a3a89291cfabbf797d0791dcb4cf1010316c32b7f88d36299147c54bbaa70d6936c9a885a65c0b359064321b96f2bcc39db762532e37683ea946d9a56f8488e1cd4db8578fe29cd8dfa89fa6e73dcce84f81eaa9729166a9149bb5e4a533bd025d870d48bc18e24470d51b66778b98e71d5458181f695d8f973ef536ab76360384c39f7ac900669ba830d89baccf4c6f6207815daef96f5ecbc05cd4e8af3b63bc9e6c25da1ed3342be0b82fbac1bff9cd6e33920f6ac0c680424fb47cbf1e12dc99499d59e555235aabb912c5d8c3f9d45bf1120c8e1c697a3dd0a0169fe5b7ee91d030ee514414e305469f6dc977c22da9736d6bd11cb7f4138eed454f0e6fe99a65c417adba30f5fac9c70a28cb668d16f9a641993640e0ecb7294b772c355ab4b47f678ac3147dd9f1c2b3974186178462289f1d8ac608e9ae9fc7d593783044fcd4b719817c18e0f848b196259c1e00b8d0919f5dc9d4931f580c2f0ad05dc9f4a12a63ba75e002a6b006b77df7f283c47edfee0198c1d712d9f5b1618e649085e43e057e6d7c3f026d8cb46bdd2d492224e6b1db4e66410bce876133657df565c66108979a090c480ef6ec0ca421b15a5b219121facfd9b738aab27a9740d77c053276dceedc7c9e8063f3076600d63a99cecebc6953d7f97fe0fc1139d7908e07666a24da5783072be6d250d60a5cefb98689d2850011fd50e5b9856a8a19c20e0d94d4f4ca4af8c9c857e2fc021622fdd2f005ecaafdc27573ae3030c332922b240811419c7c31f72755595d81b4a2ed20aec35e473581190b5430715482dbc851ccaaa8053433cadc92620a1aa831e682dfe8069555b444bcfaf08124f0422ebc4ca7ee942fae43ef02880e6388ba654dff326a1e34bf66405f898ff8c33eb16f815c61c6c63da29ed499db866718ff07d81eebcb091b1412e02e2e0aad3108b74ed6b25965ad9c0dc76c214748378b48ba826d397a29029edb8c1a561421aa0a58d4599e387c3377cb871d650f2cf5070afbb565c717e28a6bf3bebbcd5e65511d26290f871e044332fdd784b475b8f5b5f57f3adedf42ef07f37f34aa7d452f97fe8d6d40c7522777dd9225d67cd8bb8416e7813b55a140e8dc35e751850e3b63fd4752b2af0517730d7651b4e7665d8993c75bc5a7fa9c69f76d4f962c2d1754bcbbf2a2f7c4642db65e912c9dbd4195673e8cec827a26d109a3469898d9aee80cdfaf105a59deb60263326d1652c647b88a6b1d5786463064199e442007e5c545df9a0b372cc662fe54e40fa399df5ffda6eca112f9830e07624b555df6f07ddc8b60401f85530ebe4182055f36a708488c4364a9d221f6afde90af102fcc0c9adf65ce9d389846e0466f8122e98a24975f526936f2f6938b89ee13766fcfcdac1228925df650f1190a9db2519b0a1998f8f75b1b5bb0182294d5e90784db0f4f3f9349b4fe113cfa4861a7fc936f4466159cdf1998512cfd77bdb06af127ab5e16a05d4d54e6a8c4f591d3241f5df4e08e191558ed7fca03e82c4f54f8dd63fc19728a4bfe92fd72eeb514fad53ff6de1c2e9b3aff10a0cdf4174eaa653e6205f25ac11701361074c96b78d90c15a4d3d12a3300b01f35aa13668e26cd8214a74fa621782e7585ae3a9b06aec08d7374ce75f8b5dbb9b1ae1697c2a9efae449d9047fd89ddf94b2eefd6ad5248674cbfd08cea17ac3d93e2f2722292d46ee1e908804497e4f9ad5707afbcf5035317b45edb29f7c2dedb590990a1ce726f037823329e36fbda944ac4dc21386d1bc1591d77417c6e697cdcc8a374fe60dea5a61c03ae322e9161eb950911f8dfd2e106727ca9b9b066b1c33ffc4a810e6354c9425c5121a2477b29b227178115d8438658af40dcc5ed32a71d6291e26205598b0be42088973964cd7dfea568f35e41cd9b1a7aa9962a0a2915b114748cb577066a0b02514bf6822fa7a185a04124079e3565416d7924ff157ccc9f2e6ad95c236599962d0f0b765b5f8dc82d2aeeae4e675860c168a2fc865dec928035f2247604fe4ea6fc4e6330ae0f550dedbe33960fa93ae642031867f5d53ebf0d9bf0a577653619041e0899732817e6f14b76f4374a6d9533b8a64786a700d70e6d488f27f968931c748d5f889d1d7a7e310d804822edb15e70bf7547c4aaffd0836d987139a9b734f14f1e897fd1c4f63f632abd15427f68444da510f040d36e41fd3c270db66b92478afa49f9aaf99219e9df3374d55015f7da3a088a303dca3f3d0d9e42ee4390b8989c98e292643780ee481d15e17eb0b9f404af588decbc52a80a60006cb5788bc03c111e6b02ec7dbea12fd1cf093e04993bfed363172b01b95a2f5b5a0d358c6e4a8669183df41430fdee787bea2df4d847b2902080d363115e860162134a6c8d95de9c0f0056c33f7ebcff4f0f861bea3a70b7844ab6ec13721cce44b9ade1f61c47e6bb0bc418e34c4dbab3de9e746d489782deadd707e77a543d159e4c6f7e7a724d
Deals:		[2330919]
Retries:	0

Sector On Chain Info
SealProof:		8
Activation:		1052662
Expiration:		2524396
DealWeight:		0
VerifiedDealWeight:		50467583714918400
InitialPledge:		1709434922106567767

Expiration Info
OnTime:		2524968
Early:		0
--------
Event Log:
0.	2021-08-24 16:55:04 +0000 UTC:	[event;sealing.SectorStart]	{"User":{"ID":1555,"SectorType":8}}
1.	2021-08-24 16:55:04 +0000 UTC:	[event;sealing.SectorAddPiece]	{"User":{}}
2.	2021-08-24 17:58:53 +0000 UTC:	[event;sealing.SectorPieceAdded]	{"User":{"NewPieces":[{"Piece":{"Size":34359738368,"PieceCID":{"/":"baga6ea4seaqbt2nhxrnju7wip3thbnrnaksvhagkavum7mvesimuqrsayg7duka"}},"DealInfo":{"PublishCid":{"/":"bafy2bzacecgcet5fztcwxealexgofak2cn7gsymuod4io6ciuhy3pvriv2a76"},"DealID":2330919,"DealProposal":{"PieceCID":{"/":"baga6ea4seaqbt2nhxrnju7wip3thbnrnaksvhagkavum7mvesimuqrsayg7duka"},"PieceSize":34359738368,"VerifiedDeal":true,"Client":"f144zep4gitj73rrujd3jw6iprljicx6vl4wbeavi","Provider":"f01278","Label":"mAXASIDa4BLZd38rNtDUh3z/d2KDXg409LaADotlhHzhtSiEF","StartEpoch":1055596,"EndEpoch":2524396,"StoragePricePerEpoch":"0","ProviderCollateral":"6105248445433132","ClientCollateral":"0"},"DealSchedule":{"StartEpoch":1055596,"EndEpoch":2524396},"KeepUnsealed":true}}]}}
3.	2021-08-24 17:58:53 +0000 UTC:	[event;sealing.SectorStartPacking]	{"User":{}}
4.	2021-08-24 17:58:53 +0000 UTC:	[event;sealing.SectorPacked]	{"User":{"FillerPieces":null}}
5.	2021-08-24 17:58:53 +0000 UTC:	[event;sealing.SectorTicket]	{"User":{"TicketValue":"zho/1I9aUXnrf3EuoD12jDhYYMBYUbHiL2gr8zUTnJc=","TicketEpoch":1049817}}
6.	2021-08-24 22:19:27 +0000 UTC:	[event;sealing.SectorPreCommit1]	{"User":{"PreCommit1Out":"eyJyZWdpc3RlcmVkX3Byb29mIjoiU3RhY2tlZERyZzMyR2lCVjFfMSIsImxhYmVscyI6eyJTdGFja2VkRHJnMzJHaUJWMSI6eyJsYWJlbHMiOlt7InBhdGgiOiIvc2VhbC93b3JrZXIvY2FjaGUvcy10MDEyNzgtMTU1NSIsImlkIjoibGF5ZXItMSIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL3NlYWwvd29ya2VyL2NhY2hlL3MtdDAxMjc4LTE1NTUiLCJpZCI6ImxheWVyLTIiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9zZWFsL3dvcmtlci9jYWNoZS9zLXQwMTI3OC0xNTU1IiwiaWQiOiJsYXllci0zIiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvc2VhbC93b3JrZXIvY2FjaGUvcy10MDEyNzgtMTU1NSIsImlkIjoibGF5ZXItNCIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL3NlYWwvd29ya2VyL2NhY2hlL3MtdDAxMjc4LTE1NTUiLCJpZCI6ImxheWVyLTUiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9zZWFsL3dvcmtlci9jYWNoZS9zLXQwMTI3OC0xNTU1IiwiaWQiOiJsYXllci02Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvc2VhbC93b3JrZXIvY2FjaGUvcy10MDEyNzgtMTU1NSIsImlkIjoibGF5ZXItNyIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N30seyJwYXRoIjoiL3NlYWwvd29ya2VyL2NhY2hlL3MtdDAxMjc4LTE1NTUiLCJpZCI6ImxheWVyLTgiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9zZWFsL3dvcmtlci9jYWNoZS9zLXQwMTI3OC0xNTU1IiwiaWQiOiJsYXllci05Iiwic2l6ZSI6MTA3Mzc0MTgyNCwicm93c190b19kaXNjYXJkIjo3fSx7InBhdGgiOiIvc2VhbC93b3JrZXIvY2FjaGUvcy10MDEyNzgtMTU1NSIsImlkIjoibGF5ZXItMTAiLCJzaXplIjoxMDczNzQxODI0LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LHsicGF0aCI6Ii9zZWFsL3dvcmtlci9jYWNoZS9zLXQwMTI3OC0xNTU1IiwiaWQiOiJsYXllci0xMSIsInNpemUiOjEwNzM3NDE4MjQsInJvd3NfdG9fZGlzY2FyZCI6N31dLCJfaCI6bnVsbH19LCJjb25maWciOnsicGF0aCI6Ii9zZWFsL3dvcmtlci9jYWNoZS9zLXQwMTI3OC0xNTU1IiwiaWQiOiJ0cmVlLWQiLCJzaXplIjoyMTQ3NDgzNjQ3LCJyb3dzX3RvX2Rpc2NhcmQiOjd9LCJjb21tX2QiOlsyNSwyMzMsMTY3LDE4OCw5MCwxNTQsMTI2LDIwMCwxMjYsMjMwLDExMiwxODIsNDUsMiwxNjUsODMsMTI4LDIwMiw1LDEwNCwyMDcsMTc4LDE2NCwxNDYsMjUsNzIsNzAsNjQsMTkzLDE5MCw1OCw0MF19"}}
7.	2021-08-25 05:34:45 +0000 UTC:	[event;sealing.SectorPreCommit2]	{"User":{"Sealed":{"/":"bagboea4b5abcbnyi2wpc6rdbftozygcccxem32r6hudkeh7dqzaojdva3z6ikqbc"},"Unsealed":{"/":"baga6ea4seaqbt2nhxrnju7wip3thbnrnaksvhagkavum7mvesimuqrsayg7duka"}}}
8.	2021-08-25 05:34:45 +0000 UTC:	[event;sealing.SectorPreCommitBatch]	{"User":{}}
9.	2021-08-25 07:29:25 +0000 UTC:	[event;sealing.SectorPreCommitBatchSent]	{"User":{"Message":{"/":"bafy2bzacebdaek5m22pigdhimpnhkyhcvv432f64e4ld2kum52chijcixpna2"}}}
10.	2021-08-25 07:33:00 +0000 UTC:	[event;sealing.SectorPreCommitLanded]	{"User":{"TipSet":"AXGg5AIglOI9dfp5owwRA1kqBe7kddTHTULT6+HG/pv2fMzsyjABcaDkAiAI73VFQHiXDdLOe+5avZJcEmGfFXYrQ/NsK3B17mu8hQFxoOQCIDwsvJGYGXcuzuMhsunOxKg36wOIcW3F7vueRWHVHlW9AXGg5AIgetk8dKYwSOnJ1UNthOBhet+YPzU9BPh+W4hUQXb96Wo="}}
11.	2021-08-25 08:48:00 +0000 UTC:	[event;sealing.SectorSeedReady]	{"User":{"SeedValue":"28fJZz93dwLUk0gtqQcW0G0qpv7Cc5BAF4OMRgWScNY=","SeedEpoch":1052490}}
12.	2021-08-25 09:32:59 +0000 UTC:	[event;sealing.SectorProofReady]	{"User":{"Proof":"gfTjtCrmNrx4ZV6290JXPpSUlGG/bevs3dT+pK48b2TkzPyCU5ya7j/BDmrbirBUpgP1kkp60uYsvYI86X0g9WEDoCd2D19OD/ZEdXgJj5r6N4ktFPg7tG+RxB9bIcpFF41aUCiEX5mWnWfEdAhMB7lCEoWpxy+Iu1TWF5I/gHptlyhIDcTVvVdftAv+2qKjqJKRz6u/eX0Hkdy0zxAQMWwyt/iNNimRR8VLuqcNaTbJqIWmXAs1kGQyG5byvMOdt2JTLjdoPqlG2aVvhIjhzU24V4/inNjfqJ+m5z3M6E+B6qlykWapFJu15KUzvQJdhw1IvBjiRHDVG2Z3i5jnHVRYGB9pXY+XPvU2q3Y2A4TDn3rJAGabqDDYm6zPTG9iB4FdrvlvXsvAXNTorztjvJ5sJdoe0zQr4LgvusG/+c1uM5IPasDGgEJPtHy/HhLcmUmdWeVVI1qruRLF2MP51FvxEgyOHGl6PdCgFp/lt+6R0DDuUUQU4wVGn23Jd8ItqXNta9Ect/QTju1FTw5v6ZplxBetujD1+snHCijLZo0W+aZBmTZA4Oy3KUt3LDVatLR/Z4rDFH3Z8cKzl0GGF4RiKJ8disYI6a6fx9WTeDBE/NS3GYF8GOD4SLGWJZweALjQkZ9dydSTH1gMLwrQXcn0oSpjunXgAqawBrd99/KDxH7f7gGYwdcS2fWxYY5kkIXkPgV+bXw/Am2MtGvdLUkiJOax205mQQvOh2EzZX31ZcZhCJeaCQxIDvbsDKQhsVpbIZEh+s/ZtziqsnqXQNd8BTJ23O7cfJ6AY/MHZgDWOpnOzrxpU9f5f+D8ETnXkI4HZmok2leDByvm0lDWClzvuYaJ0oUAEf1Q5bmFaooZwg4NlNT0ykr4ychX4vwCFiL90vAF7Kr9wnVzrjAwwzKSKyQIEUGcfDH3J1VZXYG0ou0grsNeRzWBGQtUMHFUgtvIUcyqqAU0M8rckmIKGqgx5oLf6AaVVbREvPrwgSTwQi68TKfulC+uQ+8CiA5jiLplTf8yah40v2ZAX4mP+MM+sW+BXGHGxj2intSZ24ZnGP8H2B7rywkbFBLgLi4KrTEIt07WslllrZwNx2whR0g3i0i6gm05eikCntuMGlYUIaoKWNRZnjh8M3fLhx1lDyz1Bwr7tWXHF+KKa/O+u81eZVEdJikPhx4EQzL914S0dbj1tfV/Ot7fQu8H8380qn1FL5f+jW1Ax1Ind92SJdZ82LuEFueBO1WhQOjcNedRhQ47Y/1HUrKvBRdzDXZRtOdmXYmTx1vFp/qcafdtT5YsLRdUvLvyovfEZC22XpEsnb1BlWc+jOyCeibRCaNGmJjZrugM368QWlnetgJjMm0WUsZHuIprHVeGRjBkGZ5EIAflxUXfmgs3LMZi/lTkD6OZ31/9puyhEvmDDgdiS1Vd9vB93ItgQB+FUw6+QYIFXzanCEiMQ2Sp0iH2r96QrxAvzAya32XOnTiYRuBGb4Ei6Yokl19SaTby9pOLie4Tdm/PzawSKJJd9lDxGQqdslGbChmY+PdbG1uwGCKU1ekHhNsPTz+TSbT+ETz6SGGn/JNvRGYVnN8ZmFEs/Xe9sGrxJ6teFqBdTVTmqMT1kdMkH1304I4ZFVjtf8oD6CxPVPjdY/wZcopL/pL9cu61FPrVP/beHC6bOv8QoM30F06qZT5iBfJawRcBNhB0yWt42QwVpNPRKjMAsB81qhNmjibNghSnT6YheC51ha46mwauwI1zdM51+LXbubGuFpfCqe+uRJ2QR/2J3flLLu/WrVJIZ0y/0IzqF6w9k+LyciKS1G7h6QiARJfk+a1XB6+89QNTF7Re2yn3wt7bWQmQoc5ybwN4IzKeNvvalErE3CE4bRvBWR13QXxuaXzcyKN0/mDepaYcA64yLpFh65UJEfjf0uEGcnypubBmscM//EqBDmNUyUJcUSGiR3spsicXgRXYQ4ZYr0Dcxe0ypx1ikeJiBVmLC+QgiJc5ZM19/qVo815BzZsaeqmWKgopFbEUdIy1dwZqCwJRS/aCL6ehhaBBJAeeNWVBbXkk/xV8zJ8uatlcI2WZli0PC3ZbX43ILSrurk5nWGDBaKL8hl3skoA18iR2BP5OpvxOYzCuD1UN7b4zlg+pOuZCAxhn9dU+vw2b8KV3ZTYZBB4ImXMoF+bxS3b0N0ptlTO4pkeGpwDXDm1Ijyf5aJMcdI1fiJ0den4xDYBIIu2xXnC/dUfEqv/Qg22YcTmptzTxTx6Jf9HE9j9jKr0VQn9oRE2lEPBA025B/TwnDbZrkkeK+kn5qvmSGenfM3TVUBX32joIijA9yj89DZ5C7kOQuJicmOKSZDeA7kgdFeF+sLn0BK9Yjey8UqgKYABstXiLwDwRHmsC7H2+oS/Rzwk+BJk7/tNjFysBuVovW1oNNYxuSoZpGD30FDD97nh76i302EeykCCA02MRXoYBYhNKbI2V3pwPAFbDP368/08PhhvqOnC3hEq27BNyHM5Eua3h9hxH5rsLxBjjTE26s96edG1Il4LerdcH53pUPRWeTG9+enJN"}}
13.	2021-08-25 10:09:54 +0000 UTC:	[event;sealing.SectorFinalized]	{"User":{}}
14.	2021-08-25 10:09:55 +0000 UTC:	[event;sealing.SectorCommitSubmitted]	{"User":{"Message":{"/":"bafy2bzaceaust7whtddpcv4e6alatgtuef5oavclawfj4qkjti4odar7hqxie"}}}
15.	2021-08-25 10:14:00 +0000 UTC:	[event;sealing.SectorProving]	{"User":{}}
16.	2021-08-25 10:14:00 +0000 UTC:	[event;sealing.SectorFinalized]	{"User":{}}

Repo Steps

  1. Run '...'
  2. Do '...'
  3. See error '...'
    ...
@stuberman
Copy link
Author

lotus-miner storage-deals list -v |grep "search msg failed" |wc -l
2487

@jennijuju
Copy link
Member

I'm suspecting the issue is caused by this pr #5431.
Something changed in stage.GetStorageDeal with the new manager introduced but is not backward compatible with v1.11.3-dev and below

@jennijuju jennijuju added P1 P1: Must be resolved area/markets/storage team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs and removed need/triage labels Aug 28, 2021
@stuberman
Copy link
Author

Market node logs for an example of a deal that was active and then on downgrade went to error state

grep bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti /market/logs/markets.log
2021-08-23T23:29:59.394Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-23T23:51:45.029Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-26T11:32:51.459Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-26T20:26:32.569Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-27T01:42:38.256Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-27T04:01:42.006Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-27T04:11:10.597Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-27T04:51:57.183Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-27T23:54:54.579Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-28T02:00:11.749Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventRestart", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealActive", "message": ""}
2021-08-28T02:00:12.247Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventDealCompletionFailed", "proposal CID": "bafyreibscuk7oqvapys76r7iyo5aabu23o5alqmyuh6c35awhna2hgfkti", "state": "StorageDealError", "message": "error waiting for deal completion: awaiting deal expired: getting deal info: looking for publish deal message bafy2bzaceadhxnjzn2uudy4cw5quafsdsq4m3bkwmlimgbskzbgft7g7wkjvq: search msg failed: failed to load message: blockstore: block not found"}

@aarshkshah1992
Copy link
Contributor

@dirkmc Please can you verify if this is caused by #5431 ?

@dirkmc
Copy link
Contributor

dirkmc commented Aug 30, 2021

The deal ID assigned to a deal on-chain can change if there is a chain re-org.
So #5431 introduced a change such that when the deal reaches the active state, instead of looking up the deal by ID, we look the deal up by the publish message CID (which does not change if there is a chain re-org).

We do this lookup

  • when the deal becomes active
  • whenever the node is restarted, for all active deals

The problem is that chain data gets GCed, so if the publish message is in a block that has been GCed from the local node's chain data, we will fail when trying to look up the deal by publish message CID.

I think the correct solution is to

  • check the deal start epoch
  • if the deal start epoch is recent, look up the deal by publish message CID
  • if the deal start epoch is not recent, look up the deal by deal ID

The definition of "recent" in this context is determined by

  • how many epochs to wait before assuming a re-org is not possible
  • how many epochs we can assume that chain data is protected from GC

We need to choose a value between these two numbers

@dirkmc dirkmc changed the title storage-deals list show all deal errors after downgrading from master to 1.11.2-rc1 on restart, all older active deals error out after downgrading from master to 1.11.2-rc1 Aug 30, 2021
@dirkmc dirkmc changed the title on restart, all older active deals error out after downgrading from master to 1.11.2-rc1 on restart, all older active deals error out on 1.11.2-rc1 Aug 30, 2021
@magik6k
Copy link
Contributor

magik6k commented Aug 30, 2021

how many epochs to wait before assuming a re-org is not possible

That's a finality, but realistically even in the 20~100 epoch range you'll see basically no reorgs

how many epochs we can assume that chain data is protected from GC

Not sure about splitstore (cc @vyzo), but after importing from a snapshot you may only have one finality.

We can also just special-case the deal message not being found, and gracefully fallback to looking at the deal ID

@stuberman
Copy link
Author

Updating to latest master does not resolve this.

lotus-miner version
Daemon: 1.11.3-dev+mainnet+git.3e450b881+api1.2.0
Local: lotus-miner version 1.11.3-dev+mainnet+git.3e450b881

lotus-miner info
Enabled subsystems (from miner API): [Mining Sealing SectorStorage]
Enabled subsystems (from markets API): [Markets]
Chain: [sync ok] [basefee 211.611 pFIL]
Miner: f01278 (32 GiB sectors)
Power: 121 Ti / 9.93 Ei (0.0012%)
Raw: 45.75 TiB / 9.93 EiB (0.0004%)
Committed: 46 TiB
Proving: 45.75 TiB
Projected average block win rate: 1.17/week (every 143h17m46s)
Projected block win with 99.9% probability every 989h49m35s
(projections DO NOT account for future network and miner growth)

Miner Balance: 1633.652 FIL
PreCommit: 410.976 mFIL
Pledge: 945.825 FIL
Vesting: 251.299 FIL
Available: 436.117 FIL
Market Balance: 101.986 FIL
Locked: 11.683 FIL
Available: 90.304 FIL
Worker Balance: 17.853 FIL
Control: 530.753 FIL
Total Spendable: 1075.027 FIL

Sectors:
Total: 1614
Proving: 1473
WaitDeals: 1
AddPiece: 1
PreCommit1: 1
PreCommit2: 4
Committing: 1
CommitFinalize: 1
Removed: 132

Storage Deals: 68, 760 GiB
Active: 39 485.5 GiB (Verified: 39 485.5 GiB)
AwaitingPreCommit: 23 186.5 GiB (Verified: 23 186.5 GiB)
Publish: 1 8 GiB (Verified: 1 8 GiB)
Transferring: 1 8 GiB (Verified: 1 8 GiB)
Sealing: 3 64 GiB (Verified: 3 64 GiB)
Staged: 1 8 GiB (Verified: 1 8 GiB)

Retrieval Deals (complete): 191, 1.04 TiB

@jennijuju
Copy link
Member

@stuberman thanks for reporting the issue and we have identified the bug. @dirkmc will work on a script to help bring back the right deal status locally!

@stuberman
Copy link
Author

Issue resolved by @dirkmc
Thanks You!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/markets/storage kind/bug Kind: Bug P1 P1: Must be resolved team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs
Projects
None yet
Development

No branches or pull requests

5 participants