Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swingstore.close() to close transcripts for tests #3437

Closed
warner opened this issue Jun 30, 2021 · 8 comments
Closed

swingstore.close() to close transcripts for tests #3437

warner opened this issue Jun 30, 2021 · 8 comments
Labels
enhancement New feature or request SwingSet package: SwingSet
Milestone

Comments

@warner
Copy link
Member

warner commented Jun 30, 2021

What is the Problem Being Solved?

I'm writing a swingset test that wants to simulate running two sequential+discontiguous swingsets from a single state directory (like running a validator for a while, shutting it down, and then starting it back up again). I want create a hostStorage object, use it with one controller, do controller.shutdown() to kill it, then start it up with a new controller.

The problem I'm running into is that the streamStore is holding onto a transcript stream (for writing, from the first instance). controller.shutdown() doesn't release it, so when I pass the hostStorage to the second controller, it's not allowed to read back the transcripts. I can work around this by doing controller.dump(), which happens to do a closeVatTranscript() on each, so it can read the transcript contents and write then into the dump object.

But I think I want something like controller.shutdown() to also close any transcript streams it's currently holding. Or maybe the hostStorage.streamStore object should have a .closeAll() or .shutdown() method which would do the same thing.

@FUDCo any thoughts?

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Jun 30, 2021
@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

I'm puzzled by this. It should already work more or less as you describe, unless something is not actually calling close when it should.

@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

I'm thinking something may have gotten lost in translation when we switched over to sqlite.

@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

Looking at the code, it's definitely an issue rooted in the switch to sqlite. In the file-based stream store, an open stream was associated with an open file descriptor for the individual transcript stream file. This fd would be automatically swept up and closed when operations (i.e., read or write) completed. Now all the transcripts share a single sqlite database file, so all that's tracked for the individual streams is their status (i.e., read vs write vs not in use). But there's nothing that closes the database file itself. It does get closed automagically when the program exits, but if you try to open a second swing store in the same process using the same path (as in the test scenario that was described) you'll get a failure. Closing a swing store needs to invoke some kind of close call on the sqlStore which in turn needs to do a sqlite close call on its open database.

That's a long winded way of saying that this is just a bug. I think the fix is straightforward.

@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

A further thought, that might kinda sorta invalidate some of what I just said (although the underlying issue is the same and the bug I flagged is a real one that still needs to be fixed): in your test, are you provisioning the swing store by calling provideHostStorage or by calling openLMDBSwingStore? If the former (which is what I'd bet), then you're using the in-memory store, which (upon examination) has essentially the identical bug, though I think what's happening is also that the stream statuses are being left in whatever state they were in (typically write mode) when they need to be cleared.

@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

Actually, the in-memory store is not designed to be closed and reopened. HOWEVER, it provides two special methods for testing, getAllState and setAllState, which allow you to capture the state at an arbitrary point and reset to an arbitrary captured state, which should do what you want. So depending on your test you may not need a fix here, you just need to use the appropriate API (though we still should file a bug on the close-the-sqlite-database thing).

@FUDCo
Copy link
Contributor

FUDCo commented Jun 30, 2021

In the course of doing the database refcount audit, I encountered a similar problem when I execute swingset-runner in database dump mode (it dies after the first crank complaining that one of the transcript streams is already in use), so there's clearly a problem with the swing store stepping on its own toes.

FUDCo added a commit that referenced this issue Jun 30, 2021
The stream store API evolved since the read/write mode interlock logic was
originally coded, such each write has become a singular, stand-alone
operation. The read/write mode interlock logic was thus overly aggressive in the
face of read-after-write use cases, breaking some uses, notably the problem
raised in issue #3437 as well as the swingset-runner "dump after each crank"
debug feature.  This fixes that.
FUDCo added a commit that referenced this issue Jun 30, 2021
The stream store API evolved since the read/write mode interlock logic was
originally coded, such each write has become a singular, stand-alone
operation. The read/write mode interlock logic was thus overly aggressive in the
face of read-after-write use cases, breaking some uses, notably the problem
raised in issue #3437 as well as the swingset-runner "dump after each crank"
debug feature.  This fixes that.
FUDCo added a commit that referenced this issue Jun 30, 2021
The stream store API evolved since the read/write mode interlock logic was
originally coded, such each write has become a singular, stand-alone
operation. The read/write mode interlock logic was thus overly aggressive in the
face of read-after-write use cases, breaking some uses, notably the problem
raised in issue #3437 as well as the swingset-runner "dump after each crank"
debug feature.  This fixes that.
FUDCo added a commit that referenced this issue Jun 30, 2021
The stream store API evolved since the read/write mode interlock logic was
originally coded, such each write has become a singular, stand-alone
operation. The read/write mode interlock logic was thus overly aggressive in the
face of read-after-write use cases, breaking some uses, notably the problem
raised in issue #3437 as well as the swingset-runner "dump after each crank"
debug feature.  This fixes that.
@Tartuffo
Copy link
Contributor

@FUDCo When I first started working on our ZH board, I marked this as being for the Mainnet 1 release because it was already in the Review/QA pipeline. This seems like it is an open question, not really In Review or In QA. Should this be moved back to the MN-1 backlog, or taken out of the Mainnet 1 release altogether? If this needs to get done for MN-1 please move to appropriate pipeline and give it an estimate (and hopefully an assignee!)

@FUDCo
Copy link
Contributor

FUDCo commented Feb 10, 2022

This was fixed and somehow did not get closed. Closing it now.

@FUDCo FUDCo closed this as completed Feb 10, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
@Tartuffo Tartuffo modified the milestones: Mainnet 1, RUN Protocol RC0 Apr 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet
Projects
None yet
Development

No branches or pull requests

3 participants