Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/rawdb: implement resettable freezer #26324

Merged
merged 3 commits into from
Dec 19, 2022

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Dec 7, 2022

This PR implements resettable freezer by adding a ResettableFreezer wrapper.

Freezer can truncate data from head or tail via truncation methods. Once items
are deleted from tail, a flag virtualTail will be updated in the corresponding
metadata file to indicate these items are not available anymore.

However, tail truncation is irreversible. In other words, these positions corresponding
to tail truncated can no longer store new data. The only way is to reset the entire
freezer.

In PBSS, freezer is used to restore trie history(reverse diff). These history items
will be truncated from tail. But it's possible that blockchain will be required to
rollback to genesis, which needs to reset the entire trie history(drop all of them).
So this function is needed in this case.

Implementation wise, it relies on the os.Rename and os.RemoveAll to atomically
delete the original freezer data and re-create a new one from scratch. The reason
is it's very hard to ensure the atomicity if we implement the reset in freezerTable level.

To be precise, a freezer will contain multiple tables inside. If a table is reset and
other are not(crash happens), then when the freezer is opened next time, all
tables will be trimmed to the same length, that is, the highest tail is selected as
the tail of the freezer, but it's impossible for the reset'd one.

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one nitpick

// only resettable if the passed file directory is exclusively occupied
// by the freezer. The reset function will delete directory atomically
// and re-create the freezer from scratch.
func NewResettableFreezer(datadir string, namespace string, readonly bool, maxTableSize uint32, tables map[string]bool) (*ResettableFreezer, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't things be simplified by having this look like :

Suggested change
func NewResettableFreezer(datadir string, namespace string, readonly bool, maxTableSize uint32, tables map[string]bool) (*ResettableFreezer, error) {
func NewResettableFreezer(f *Freezer) (*ResettableFreezer, error) {

It can pick out the datadir etc from f, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need these parameters to re-create a new freezer instance once Reset happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you can get them from the f, no?

Copy link
Member Author

@rjl493456442 rjl493456442 Dec 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically yes, but in practice it's really inconvenient.
E.g. namespace is the prefix for metrics which is not stored in freezer
E.g. maxTableSize is passed to freezerTable which is also not stored in freezer

I would still prefer to keep all parameters passed in. I mean in implementation complexity wise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@rjl493456442 rjl493456442 force-pushed the reset-freezer branch 2 times, most recently from 8101e7e to 99e3596 Compare December 8, 2022 06:05
// recreate the freezer from scratch. The atomicity of directory deletion
// is guaranteed by the rename operation, the leftover directory will be
// cleaned up in next startup in case crash happens after rename.
func (f *ResettableFreezer) Reset() error {
Copy link
Contributor

@zhiqiangxu zhiqiangxu Dec 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rjl493456442 where is this method called?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in my pbss code. Just open the PR for collecting some feedbacks.

@rjl493456442
Copy link
Member Author

--- FAIL: TestResetFreezer (1.80s)
    testing.go:1097: TempDir RemoveAll cleanup: remove C:\Users\appveyor\AppData\Local\Temp\1\TestResetFreezer4134227925\001\FLOCK: The process cannot access the file because it is being used by another process.

Need fix

@holiman holiman added this to the 1.11.0 milestone Dec 14, 2022
@holiman
Copy link
Contributor

holiman commented Dec 14, 2022

Triage discussion:

We need to ensure that the user-configurable datadir is not the folder which gets renamed during a reset. This is because it might be a mount, and when we rename it, it might become a local folder, causing a copy of hundreds of gigabyte.

As long as this is used for pbss, which already reside in a folder of the ancientdir, this is fine.

@karalabe
Copy link
Member

I'd also like the constructor / data struct doc to contain some note of this limitation and that if we ever want to use it for ancients or configurable/mountable freezers, we need to swap from datadir atomic reset to some file based solution.

@holiman holiman merged commit 79a478b into ethereum:master Dec 19, 2022
shekhirin pushed a commit to shekhirin/go-ethereum that referenced this pull request Jun 6, 2023
This PR implements resettable freezer by adding a ResettableFreezer wrapper.

The resettable freezer wraps the original freezer in a way that makes it possible to ensure atomic resets. Implementation wise, it relies on the os.Rename and os.RemoveAll to atomically delete the original freezer data and re-create a new one from scratch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants