ImmutableDB is leaking file handles #1543
Comments
It seems to be fixed by adding … As an alternative, perhaps we could instead add the file handle to a registry ASAP, instead of waiting for it to make it into an iterator which then gets added to a registry?
FYI, @mrBliss I pushed up a branch.
Fixes #1543. The ChainDB.Iterators and ChainDB.Readers use ImmutableDB.Iterators internally. Both take a ResourceRegistry to allocate the ImmutableDB.Iterator in, so that the ImmutableDB.Iterator and the open handle it contains are closed when an exception occurs. If an exception occurs at the wrong time while opening an ImmutableDB.Iterator, after the handle has been opened but before the iterator is registered, we might leak the handle. Fix this by directly allocating the handle in the ResourceRegistry instead of the ImmutableDB.Iterator that refers to the handle.
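As a rough illustration of the fix described above, here is a minimal sketch with made-up types (this is not the real ResourceRegistry or ImmutableDB.Iterator API): the release action for the raw file handle is registered immediately, under an exception mask, before the iterator wrapping the handle is even constructed, so there is no window in which an exception can strike between opening the handle and recording its cleanup.

```haskell
-- Minimal sketch with hypothetical types; the real ResourceRegistry and
-- ImmutableDB.Iterator APIs differ.
import Control.Exception (mask_)
import Data.IORef

newtype Registry = Registry (IORef [IO ()])  -- pending release actions

data FileHandle = FileHandle
data Iterator   = Iterator FileHandle

-- Record a release action; the registry runs it on close or on exception.
register :: Registry -> IO () -> IO ()
register (Registry ref) release = modifyIORef' ref (release :)

openHandle :: IO FileHandle
openHandle = pure FileHandle

closeHandle :: FileHandle -> IO ()
closeHandle _ = pure ()

-- Before: open the handle, build the iterator, and only then register the
-- iterator; an exception between opening and registering leaks the handle.
--
-- After (sketched here): allocate the handle in the registry directly, so
-- its cleanup is tracked before anything else can go wrong.
openIterator :: Registry -> IO Iterator
openIterator registry = do
  handle <- mask_ $ do
    h <- openHandle
    register registry (closeHandle h)
    pure h
  pure (Iterator handle)
```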
Thanks for that! (After force-pushing my fix, that commit is now gone.)
1547: ImmutableDB.Iterator: allocate handles in the ResourceRegistry r=mrBliss a=mrBliss

Fixes #1543. The ChainDB.Iterators and ChainDB.Readers use ImmutableDB.Iterators internally. Both take a ResourceRegistry to allocate the ImmutableDB.Iterator in, so that the ImmutableDB.Iterator and the open handle it contains are closed when an exception occurs. If an exception occurs at the wrong time while opening an ImmutableDB.Iterator, after the handle has been opened but before the iterator is registered, we might leak the handle. Fix this by directly allocating the handle in the ResourceRegistry instead of the ImmutableDB.Iterator that refers to the handle.

Co-authored-by: Thomas Winant <[email protected]>
1506: Fix cloneBlockchainTime r=nfrisby a=nfrisby

Fixes #1489. Fixes #1524.

Fixing Issue #1489 (let nodes lead when they join) and letting k vary in the range [2 .. 10] (since the dual-ledger tests do that now) revealed several Issues.

Issues in the library, not just the test infrastructure:

* Issue #1505 -- `removeTxs` cannot use the fast path when validating after removing, or else we might have dangling tx input references. Was fixed by #1565.
* Issue #1511, bullet 1 (closed) -- The `Empty` cases in `prevPointAndBlockNo` were wrong. Recent PRs have addressed this: #1544 and #1589.
* Issue #1543 (closed) -- A bracket in `registeredStream` was spoiled by an interruptible operation (see the sketch below). Thomas' PR re-designed the vulnerability away. I think this is unrelated to the other changes; it was lurking and happened to pop up here just because I've been running hundreds of thousands of tests.

Issues only in the test infrastructure:

* Issue #1489: let nodes lead when they join. This bug slipped in recently, when I added cloning of `BlockchainTime`s as part of the restarting/rekeying loop in the test infrastructure.
* _PBFT reference simulator_. (Was masked by 1489.) Model competing 1-block chains in Ref.PBFT and use its results where applicable, instead of the .Expectations module. Check that the PBFT threadnet and Ref.PBFT.simulate results agree on `Ref.Nominal` slots. The Ref.PBFT module had been making assumptions that were accurate given the guards in the RealPBFT generators in the `k >= n` regime. But outside of that regime, when the node count exceeds the security parameter, it wasn't enough. Also, it couldn't be compared against the PBFT threadnet because of those assumptions.
* _PBFT reference simulator_. (Cascade of the above.) Add `definitelyEnoughBlocks` to confirm the "at least k blocks in 2k slots" invariant in `genRealPBFTNodeJoinPlan`. The existing guards in the RealPBFT generators are intentionally insufficient by themselves; this way we can optimize them to avoid O(n^2) complexity without risking divergence from the `suchThat`s.
* _Origin corner-case_. (Was masked by 1489.) Discard DualPBFT tests that forge in slot 0. The current `cardano-ledger-specs` doesn't allow for that. My hypothesis is that `cvsLastSlot` would need to be able to represent origin.
* _Dlg cert tx_. Adjust `genNodeRekeys` to respect "If node X rekeys in slot S and Y leads slot S+1, then either the topology must connect X and Y directly, or Y must join before slot S."
* _Dlg cert tx_. (Was (statistically?) masked by relatively large k.) Add `rekeyOracle` and use it to determine which epoch number to record in the dlg cert. The correct value depends on which block the dlg cert tx will end up in, not on when we first add it to our mempool.
* _Dlg cert tx_. (Was (statistically?) masked by relatively large k.) Add the dlg cert tx to the mempool each time the ledger changes.

Co-authored-by: Nicolas Frisby <[email protected]>
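To make the #1543 bullet above concrete, here is a hedged sketch of how a bracket can be spoiled by an interruptible operation; all names are made up, and this is not the actual `registeredStream` code.

```haskell
-- 'bracket' masks async exceptions around its acquire action, but
-- 'takeMVar' is an *interruptible* operation, so an async exception can
-- still be delivered while it blocks. If that happens after 'openHandle'
-- has already succeeded, the acquire action throws before it returns,
-- 'bracket' never installs the release action, and the handle leaks.
import Control.Concurrent.MVar (MVar, takeMVar)
import Control.Exception (bracket)

data FileHandle = FileHandle

openHandle :: IO FileHandle
openHandle = pure FileHandle

closeHandle :: FileHandle -> IO ()
closeHandle _ = pure ()

leakyAcquire :: MVar () -> IO FileHandle
leakyAcquire lock = do
  h <- openHandle
  takeMVar lock  -- interruptible point: the already-open handle can leak here
  pure h

-- The release action only protects handles whose acquire completed.
leakyStream :: MVar () -> (FileHandle -> IO a) -> IO a
leakyStream lock = bracket (leakyAcquire lock) closeHandle
```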
On master (commit 62d8e66), this test case fails with `ImmutableDB is leaking file handles`. With some instrumentation, I determined that node c0 leaks one ImmDB file handle, for path @00000.epoch@, read only, offset at 0. It was opened by a `ChainSyncServer` thread. The test case seems somewhat fragile, since the 'slotLengths' value seems to matter.
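For context, a minimal sketch of the kind of leak-detection instrumentation described above; the tracker and the `openTracked`/`closeTracked`/`reportLeaks` names are hypothetical, not the instrumentation actually used in the test.

```haskell
-- Hypothetical leak tracker: each open gets a fresh key, each close removes
-- it, and whatever is still tracked at node shutdown is a leaked handle.
import           Data.IORef
import qualified Data.Map.Strict as Map
import           System.IO (Handle, IOMode, hClose, openFile)

data Tracker = Tracker
  { nextKey :: IORef Int
  , live    :: IORef (Map.Map Int (FilePath, IOMode, String))
  }

newTracker :: IO Tracker
newTracker = Tracker <$> newIORef 0 <*> newIORef Map.empty

-- Open a file and record which path, mode, and label (e.g. the opening
-- thread) it belongs to.
openTracked :: Tracker -> String -> FilePath -> IOMode -> IO (Int, Handle)
openTracked t label path mode = do
  h   <- openFile path mode
  key <- atomicModifyIORef' (nextKey t) (\k -> (k + 1, k))
  modifyIORef' (live t) (Map.insert key (path, mode, label))
  pure (key, h)

closeTracked :: Tracker -> Int -> Handle -> IO ()
closeTracked t key h = do
  hClose h
  modifyIORef' (live t) (Map.delete key)

-- Called at shutdown: anything still live is reported as a leak.
reportLeaks :: Tracker -> IO ()
reportLeaks t = readIORef (live t) >>= mapM_ print . Map.toList
```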