AccountStorageEntry.count_and_status is serialized but not used when deserialized #7442

Closed
ryoqun opened this issue Dec 12, 2019 · 8 comments · Fixed by #9980
Labels: good first issue (Good for newcomers)

Comments

@ryoqun (Member) commented Dec 12, 2019

Problem

Previously, the count in AccountStorageEntry.count_and_status was serialized and consumed when deserialized. However, this is no longer the case after the earlier snapshot stability PRs (specifically this change).

At first, the new behavior of not using it was kept tentative, to see whether the change was really the right implementation direction. After a while, and with much improved stability, it became clear that we can reconstruct this information while ingesting snapshots without problems.

Proposed Solution

Stop serializing the count in count_and_status; it's not needed. And get the resulting ABI-incompatible change out of the way before ABI changes become costly. We may even be able to stop serializing the status as well.

For the record: this comes from here.
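For illustration, a minimal sketch of the proposed shape, assuming the field and type names used in this issue; the reconstruction helper below is hypothetical and not the actual solana-runtime code:

```rust
use std::sync::RwLock;

#[derive(Clone, Copy, Debug, PartialEq)]
enum AccountStorageStatus {
    Available,
    Full,
}

// Hypothetical sketch: count_and_status is runtime-only state and is
// not written into the snapshot at all.
struct AccountStorageEntry {
    count_and_status: RwLock<(usize, AccountStorageStatus)>,
}

// On snapshot ingestion, the count is rebuilt from the accounts index
// instead of being read from the serialized stream.
fn reconstruct_count(entry: &AccountStorageEntry, counted_accounts: usize) {
    let mut guard = entry.count_and_status.write().unwrap();
    guard.0 = counted_accounts;
}

fn main() {
    let entry = AccountStorageEntry {
        count_and_status: RwLock::new((0, AccountStorageStatus::Available)),
    };
    reconstruct_count(&entry, 42);
    assert_eq!(entry.count_and_status.read().unwrap().0, 42);
}
```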

@svenski123 (Contributor)

Hi - I was looking at "good first issue" tagged issues and came across this one, which caught my interest: the requested scope is straightforward (perhaps deceptively so), but the relevant code also sits nicely in the heart of things.

I've spent some time getting to grips with BankForks, Banks, Accounts, AccountInfos, AccountStorageEntries, (de)serialisation, initialisation from snapshots, access patterns as well as Rust (which is something new for me) - and am ready to give this a go.

One question I have is how important is backwards compatibility with the serialised format now that mainnet is live (albeit beta)? Is breaking compatibility acceptable, or should the space occupied by count_and_status be filled with zero bytes (or some sentinel value, e.g. 0xDEADBEEF) when serialising and ignored when reading it in? I note the snapshot_version could be bumped from "1.1.0", but adding support for reading both the old and new versions would be an invasive change. Or is it acceptable to assume that users can be persuaded to upgrade en masse?

I also noticed that the structs (StoredMeta, AccountMeta, etc.) used to access the mmap'd memory in AppendVec don't have any repr attributes, and thus I assume use the default representation in Rust, for which the Rust docs say "There are no guarantees of data layout made by this representation." That is less than ideal given that the files are exchanged between nodes. (In fact, for StoredMeta I discovered that data_len was placed before pubkey in the account files, which is not the order they appear in append_vec.rs; coming from C/C++ that is surprising, though Rust also has #[repr(C)].)
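For illustration only (this is not the actual StoredMeta definition from append_vec.rs; the field names and order here are hypothetical), a repr attribute would pin the on-disk layout:

```rust
// Hypothetical sketch: #[repr(C)] guarantees the declared field order
// and a C-compatible layout, so a memory-mapped file written by one
// build is laid out the same way for every other build.
#[repr(C)]
struct StoredMeta {
    write_version: u64,
    data_len: u64,
    pubkey: [u8; 32], // stand-in for solana_sdk::pubkey::Pubkey
}

fn main() {
    // With #[repr(C)] the size and field offsets are stable across
    // compilations, unlike the default Rust representation.
    println!("size_of::<StoredMeta>() = {}", std::mem::size_of::<StoredMeta>());
}
```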

In any event I'd welcome any comments / thoughts for this first intended PR.

@mvines (Member) commented Apr 24, 2020

Thanks for your interest @svenski123!

> One question I have is how important is backwards compatibility with the serialised format now that mainnet is live (albeit beta)? Is breaking compatibility acceptable, or should the space occupied by count_and_status be filled with zero bytes (or some sentinel value, e.g. 0xDEADBEEF) when serialising and ignored when reading it in? I note the snapshot_version could be bumped from "1.1.0", but adding support for reading both the old and new versions would be an invasive change. Or is it acceptable to assume that users can be persuaded to upgrade en masse?

It would be very much preferred to support a rolling update, where the old and new versions are supported for a time. Having to "cold start" the cluster is doable, but I'd consider that a last resort. We can exploit the release model when coding up support for old/new, though -- the master branch only needs to support the new format, and we can limit the old/new adaptation code to the current release branch. Since we generally ship a new major version on a monthly cadence, even if some hacky adaptation changes are required, they'll be limited to a branch that will be obsolete in a month or so.

> I also noticed that the structs (StoredMeta, AccountMeta, etc.) used to access the mmap'd memory in AppendVec don't have any repr attributes, and thus I assume use the default representation in Rust, for which the Rust docs say "There are no guarantees of data layout made by this representation." That is less than ideal given that the files are exchanged between nodes. (In fact, for StoredMeta I discovered that data_len was placed before pubkey in the account files, which is not the order they appear in append_vec.rs; coming from C/C++ that is surprising, though Rust also has #[repr(C)].)

Yeah that's a little sloppy right now. Any cleanup here would be much appreciated!

@svenski123 (Contributor)

I've opened a work-in-progress PR, #9790, for comments, as this is the first substantial bit of Rust I've written.

I was able to get macro output out of rustc and have implemented the Serialize / Deserialize traits for AccountStorageEntry (ASE) - so it is now straightforward to arbitrarily change how ASEs are read in/written out, separately from the object itself. Getting it to support two different formats at the same time is rather more difficult as it's all entwined with serde.

However, it did allow me to split ASE.count_and_status into two atomic fields and get rid of the RwLock (there's still the mutex in AppendVec). Aside from running the tests, I've also back-ported this to v1.1.7 and am running it on my TdS validator to test that it interoperates (the serialisation changes) and works (the lock-free stuff).

As for the multi-version serialisation stuff, having spent some time with serde now, I think it would make more sense to have separate, dedicated structs just for serialisation/deserialisation which contain only persistent state (not runtime/process state) and can be multi-versioned. This way the lower-level data (pubkeys, accounts, indices, etc.) can be handled with standard serde (i.e. #[derive(...)] trait implementations) and version differences can be handled at the top.
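A rough sketch of that shape, assuming serde with the derive feature; the names below are illustrative and not taken from PR #9790 or #9980:

```rust
use serde::{Deserialize, Serialize};

// Hypothetical, persistence-only mirror of AccountStorageEntry: it
// carries no runtime/process state, so it can derive serde directly.
#[derive(Serialize, Deserialize)]
struct SerializableAccountStorageEntry {
    id: usize,
    current_len: usize,
}

// Version differences are handled once, at the top of the snapshot,
// while the lower-level data keeps its plain #[derive(...)] impls.
#[derive(Serialize, Deserialize)]
enum VersionedSnapshotStorage {
    // Corresponds to snapshot_version "1.1.0".
    V1(Vec<SerializableAccountStorageEntry>),
    // Corresponds to a bumped snapshot_version, e.g. "1.2.0".
    V2(Vec<SerializableAccountStorageEntry>),
}

fn main() {
    let snapshot = VersionedSnapshotStorage::V2(vec![SerializableAccountStorageEntry {
        id: 0,
        current_len: 1024,
    }]);
    // bincode, serde_json, etc. could serialize `snapshot` here; the
    // enum tag is what distinguishes the versions on the wire.
    let _ = snapshot;
}
```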

@sakridge (Member) commented Apr 29, 2020

I think we need to re-think this. The solution seems to be more complicated than the problem it's trying to solve. Who cares whether the count/status values are dummies or not if we are including them anyway? It seems like more work for not a lot of gain.

The repr(C) thing seems like it might be a good change.

I'm afraid you will introduce races in count/status by switching to atomics that can be updated independently. The code assumes in a couple of places that the two are updated together atomically.
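To make the concern concrete, a hedged illustration (the names are hypothetical, not the PR code): with a single RwLock around the pair, readers always see a consistent (count, status) snapshot, whereas two independently updated atomics can be observed mid-update.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical split of count_and_status into independent atomics.
struct CountAndStatusSplit {
    count: AtomicUsize,
    status: AtomicUsize, // 0 = Available, 1 = Full, for illustration
}

impl CountAndStatusSplit {
    // A reader interleaving between these two stores can observe the new
    // count paired with the old status (or vice versa), which the original
    // RwLock<(count, status)> made impossible.
    fn set(&self, count: usize, status: usize) {
        self.count.store(count, Ordering::SeqCst);
        self.status.store(status, Ordering::SeqCst);
    }
}

fn main() {
    let cs = CountAndStatusSplit {
        count: AtomicUsize::new(0),
        status: AtomicUsize::new(0),
    };
    cs.set(10, 1);
}
```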

@svenski123 (Contributor) commented Apr 29, 2020 via email

@svenski123 (Contributor)

I've added a new PR, #9980, against master which addresses the count and status serialisation issue in a more flexible way, allowing for backward compatibility and an easier path for introducing newer versions.

I also have a version of this against the v1.1 branch (I originally developed it against that branch, as that's what TdS runs); let me know if you'd like a PR against that branch to look at.

Note the CAS count/status stuff is not in this new PR.

@ryoqun (Member, Author) commented May 22, 2020

This is closed by the new snapshot version (1.2.0), which is introduced by the multi-version snapshot PR: #9980

@ryoqun closed this as completed May 22, 2020
@ryoqun (Member, Author) commented May 22, 2020

@svenski123 Thanks so much for working on this!

@mvines modified the milestones: The Future!, v1.2.0 (May 22, 2020)