
epoch stale after mainchain rpc massive restart #2559

Closed
532910 opened this issue Sep 8, 2023 · 10 comments
Labels
blocked (Can't be done because of something), bug (Something isn't working), I4 (No visible changes), neofs-ir (Inner Ring node application issues), S4 (Routine), U2 (Seriously planned)

Comments


532910 commented Sep 8, 2023

To reproduce: stop all mainchain RPC nodes (for an upgrade, for example), then start them again.

Workaround: restart all IR nodes.

532910 added the triage label Sep 8, 2023
roman-khimov added the bug (Something isn't working) and neofs-ir (Inner Ring node application issues) labels and removed the triage label Dec 7, 2023
roman-khimov added this to the v0.40.0 milestone Dec 7, 2023
@roman-khimov (Member)

Containers can't be created in this case either, until all nodes are restarted.

@carpawell (Member)

> stop all mainchain rpc

@532910, how long were the RPCs down? Seconds? Minutes?

> until all nodes are restarted

@roman-khimov, which nodes exactly? IR? SN?

532910 (Author) commented Dec 18, 2023

> how long

Let's start with minutes.

> what exact nodes? IR? SN?

IR, I believe.

roman-khimov added the U2 (Seriously planned), S4 (Routine), and I4 (No visible changes) labels Dec 21, 2023
@carpawell (Member)

No progress so far: a single-node local consensus setup does not allow reproducing this. The best I can do here is wait until the next update and check the logs/profiles. If that is unacceptable, I may try a local 4-out-of-7 node consensus setup.

carpawell added a commit that referenced this issue Jan 17, 2024
Scenario:
0. at least one subscription has been performed
1. another subscription is being done
2. a notification from one of the `0.` point's subs is received

If `2.` happens between `0.` and `1.`, a deadlock appears: the notification
routing process is blocked on the subscription lock, while that lock cannot
be released because the subscription RPC cannot complete before the
just-arrived notification is handled (read from the neo-go subscription
channel).

Relates #2559.

Signed-off-by: Pavel Karpy <[email protected]>
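
To illustrate the cycle described in this commit message, here is a minimal, hypothetical Go sketch of the locking pattern; the type and function names are illustrative and are not the actual neofs-node identifiers:

```go
package subscriber

import "sync"

// subs is a hypothetical stand-in for the subscription manager; it only
// models the locking pattern described in the commit message above.
type subs struct {
	mu       sync.Mutex
	channels map[string]chan any // subscription ID -> notification channel
}

// routeNotifications runs in the single goroutine that also reads RPC
// responses from the websocket connection to the RPC node.
func (s *subs) routeNotifications(in <-chan any) {
	for n := range in {
		s.mu.Lock() // (a) blocks while subscribe() below holds the lock
		for _, ch := range s.channels {
			ch <- n
		}
		s.mu.Unlock()
	}
}

// subscribe performs a subscription RPC while holding the lock.
func (s *subs) subscribe(id string, rpc func() error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	// (b) the RPC response is read by the same goroutine that is stuck at
	// (a), so this call never returns and the lock is never released.
	if err := rpc(); err != nil {
		return err
	}
	s.channels[id] = make(chan any, 1)
	return nil
}
```

A mass RPC restart forces every client to resubscribe while old notifications may still be in flight, which plausibly makes exactly this overlap much more likely.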
carpawell added a commit that referenced this issue Jan 23, 2024
Scenario:
0. at least one subscription has been performed
1. another subscription is being done
2. a notification from one of the `0.` point's subs is received

If `2.` happens between `0.` and `1.`, a deadlock appears: the notification
routing process is blocked on the subscription lock, while that lock cannot
be released because the subscription RPC cannot complete before the
just-arrived notification is handled (read from the neo-go subscription
channel).

`switchLock` does the same thing for `routeNotifications`: it ensures that no
routine is changing (or will be changing) the subscription channels, even
though `subs`'s lock was created for this purpose initially.

Relates #2559.

Signed-off-by: Pavel Karpy <[email protected]>
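
Continuing the hypothetical sketch above, the general way to break such a cycle is to avoid holding the routing lock across the RPC round-trip; this is only an illustration of the idea, not the actual `switchLock` patch:

```go
// subscribeFixed performs the blocking RPC before taking the lock, so
// routeNotifications stays free to drain pending notifications while the
// subscription request is in flight.
func (s *subs) subscribeFixed(id string, rpc func() error) error {
	if err := rpc(); err != nil {
		return err
	}
	s.mu.Lock()
	s.channels[id] = make(chan any, 1)
	s.mu.Unlock()
	return nil
}
```

The remaining gap (notifications for the new subscription arriving before its channel is registered) is presumably why a dedicated mechanism such as `switchLock` is needed rather than this naive reordering.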
carpawell added a commit that referenced this issue Jan 24, 2024
roman-khimov added the blocked (Can't be done because of something) label Jan 29, 2024
@roman-khimov (Member)

Can't be reproduced at this stage; waiting for another case in some network.

roman-khimov modified the milestones: v0.40.0 → v0.41.0 Jan 29, 2024
roman-khimov modified the milestones: v0.41.0 → v0.42.0 Mar 22, 2024
@roman-khimov (Member)

Seems to be reproducible if 2/3 RPC nodes go offline for some time.

roman-khimov modified the milestones: v0.42.0 → v0.43.0 May 22, 2024
@evgeniiz321

Got a stable reproduction during the new payment tests (tests.payment.test_container_payments.TestContainerPayments#test_container_payments): https://rest.fs.neo.org/HXSaMJXk2g8C14ht8HSi7BBaiYZ1HeWh2xnWPGQCg4H6/477-1722817756/index.html#suites/44e11ced39071e8d0cfc11c5b94622ba/d3f9a85ef9ecb46a/.

roman-khimov modified the milestones: v0.43.0 → v0.44.0 Aug 20, 2024
carpawell (Member) commented Sep 27, 2024

@evgeniiz321, as I understand it, in your case you want to make the test faster, so you change the epoch duration to 20 blocks, right? I do not see any epoch ticks after the change. It is impossible to apply a new epoch duration immediately: we have no notifications about config changes, so we cannot recalculate the next block for epoch handling. As I understand it, you wait for a new epoch for no longer than 60 seconds; a new epoch will probably not happen in that window if a 240-second epoch was in effect before (1-second blocks with the default 240-block epoch duration). Can you either tick the epoch manually after tuning the network setting, or increase the maximum wait to 240 seconds?
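
To make the timing concrete, here is a back-of-the-envelope sketch; the constants are the values assumed in this comment (1-second blocks, 240-block default epoch), not read from any real config:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const (
		blockTime        = 1 * time.Second  // assumed block interval
		oldEpochDuration = 240              // assumed old epoch length, in blocks
		testTimeout      = 60 * time.Second // the test's maximum wait
	)
	// Worst case: the epoch-duration setting is changed right after a tick,
	// but the next tick is still scheduled on the old 240-block cadence.
	nextTick := time.Duration(oldEpochDuration) * blockTime
	fmt.Printf("next epoch tick in up to %v, but the test waits only %v\n",
		nextTick, testTimeout)
}
```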

And yes, this is not related to the original flapping problem.

roman-khimov modified the milestones: v0.44.0 → v0.45.0 Nov 8, 2024
carpawell (Member) commented Nov 11, 2024

We have not seen this exact issue for a long time, so I am closing it; it can be reopened once it happens again.

carpawell closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 11, 2024
carpawell removed this from the v0.45.0 milestone Nov 11, 2024
532910 reopened this Dec 4, 2024
@carpawell (Member)

Still not the same case. See #3007, which is more related to this situation.

carpawell closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 4, 2024