Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use protobufs to store confirmed blocks in BigTable #12526

Merged
merged 6 commits into from
Sep 30, 2020

Conversation

jstarry
Copy link
Member

@jstarry jstarry commented Sep 28, 2020

Problem

Data is stored in BigTable using bincode serialization which is not versioned and does not allow extending nested structs.

Summary of Changes

  • Extend storage-bigtable to store protobuf serialized data
  • Store confirmed blocks with protobuf serialization
  • Add .proto file for confirmed block storage
  • Add new cell name for protobuf serialized data: "proto"
  • Fall back to "bin" cell data if "proto" was not found

Fixes #12494

@jstarry
Copy link
Member Author

jstarry commented Sep 28, 2020

TODO:

  • Check data storage increase
  • Move code generation out of build.rs
  • Clean up unwraps

@jstarry jstarry added the noCI Suppress CI on this Pull Request label Sep 28, 2020
@jstarry jstarry force-pushed the bigtable-proto branch 3 times, most recently from eb51be5 to 39f5ade Compare September 29, 2020 17:33
@jstarry jstarry removed the noCI Suppress CI on this Pull Request label Sep 29, 2020
@jstarry jstarry marked this pull request as ready for review September 29, 2020 17:34
@jstarry jstarry added v1.3 CI Pull Request is ready to enter CI labels Sep 29, 2020
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label Sep 29, 2020
@jstarry
Copy link
Member Author

jstarry commented Sep 29, 2020

Here are the bigtable rows read out using cbt

----------------------------------------
0000000000000035
  x:bin                                    @ 2020/09/30-02:03:52.415000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xcd\x0e\x00\xb4\x1a,\x00AqnGhHxx35bjFBHyYJb1oR3wzy8uYRPkuc6FihtB127j,\x00Gqpa5nXT1CsxYHXAqPjT3JA3GEfffaGDo4rJ1gsVWxNk4\x00\x01\x00\x01t^G\xee\xe0G9\xf2\xb6u5*\xf1\xb8\x91#ˌ\x13\xb4e\x9e\x99\x92\xb8-BV\xd88\xac\xc0\xcf\xcb\x06\x88|\xbe\x181\x06\x8a\xa4\xac\b3\xc0\x95\xa1x\x93uWX\bu\x88\x8d\x1eN\xaa\xc4\xd9\x03\x01\x00\x03\x05z\xf8|\xac\xd1(\v\x11@\x93\xa2Cy\xfe\xaaף\xf4{y\x95\xa5\x00\xd3\xe4K\x8f\x83j,@\x8a9\xac\xe0+\x04t\x99#\xc53\xa91\x89\xf5\x00~\xa72])/7\x04\x9c\xd22\xd2q\x9b4\x9f\x04\x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x06\xa7\xd5\x17\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\x00\x00\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x00\x00\x00\x00\x928?\xb1r\xc778?\xbecu(c\x90\xe7\xc7Cp2\xb3\xceM\x12\x15\x8eK\x92>\x1ft\xaa\x01\x04\x04\x01\x02\x03\x00=\x02\xd1E\xcb%X\xf7\xcd.\xad\x97\xb6+\xaa\xf7\x84'\xb38}\xf0!\x93\xf2\xbfb\xbe%3\xfb^\xf2h\x01\xc1vs_\x00\x88\x13\x00\x00\x05\x00\x00\xf4\x95Pjt\x00\x00\x00\x90ԙ\x01\x05l\x82\x00\x12\x00\aS\x00\x90\x03D8\xd5C\x90y\x85\x18d\xf6\x82,\xb9u\x1f\xf4\x00\v\x9b\xabx\x96%\x80\x03\xc72\x13\fl0\x8a\xc1\xd0\x03h\x03"
----------------------------------------
0000000000000036
  x:bin                                    @ 2020/09/30-02:03:53.436000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xcd\x0e\x00\xb4\x1a,\x00Gqpa5nXT1CsxYHXAqPjT3JA3GEfffaGDo4rJ1gsVWxNk,\x008UpLvMmG4FqcWMQcCtkXDV4eZjon5aSK96PBmP7NHz6d5\x00\x01\x00\x013\x85[j\x14\x88\xb1'\xd1\x1f;\xec\"\x80\xebxF]1k=\xfa!\xafN\x9e(c\x80\xc5q\x98\f\x1fE\x03\xc0\x84<\x0f,(\xb1MP\bF\x92\b\x12\xff\xdb$\x19\ns\x88Rd\xd2\xeb\xd4S\v\x01\x00\x03\x05z\xf8|\xac\xd1(\v\x11@\x93\xa2Cy\xfe\xaaף\xf4{y\x95\xa5\x00\xd3\xe4K\x8f\x83j,@\x8a9\xac\xe0+\x04t\x99#\xc53\xa91\x89\xf5\x00~\xa72])/7\x04\x9c\xd22\xd2q\x9b4\x9f\x04\x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x06\xa7\xd5\x17\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\x00\x00\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x00\x00\x00\x00\xeb`\xf8\x82j,\xa3=^2%Y\xa3\x1fD \x9e\x8f\xe2\x1dl\x1e~\xe3hӤZ\xf9\xf4\xef\xf1\x01\x04\x04\x01\x02\x03\x00=\x02\\\n4\xe2\xb9Q¬\x9e\xf6\xd3*ᶞ2\xba\xa7\x12ş\xdb!\xc54\xf8G\xc3\xea\x01\xe8\n\x01\xc1vs_\x00\x88\x13\x00\x00\x05\x00\x000\x8cPjt\x00\x00\x00\x90ԙ\x01\x05\xa8x\x00\x12\x00\aS\x00\x90\x03D8\xd5C\x90y\x85\x18d\xf6\x82,\xb9u\x1f\xf4\x00\v\x9b\xabx\x96%\x80\x03\xc72\x13\fl0\x8a\xc1\xd0\x03h\x03"
----------------------------------------
0000000000000037
  x:bin                                    @ 2020/09/30-02:03:53.438000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xcd\x0e\x00\xb4\x1a,\x008UpLvMmG4FqcWMQcCtkXDV4eZjon5aSK96PBmP7NHz6d,\x00BNUL2oL5SyyYrSaR3qf2hEudDn63qoEkSd6L3w8vCHsL6\x00\x01\x00\x01\x8eX˖o\xbf\xaa\xd3\x135\xd0\tP\xef\xfe\"c\x8b\xee\x88N\xb7\xc0\xf9\xa7\xff2B\f\xe7\xfe=!\xcaZ\xe4\xa4@\\C\x8d.\x91+\xb3\xb6N\x101\x8e\xdc\x10\xff0\xf2c\x03\xa2\xd0\xc3\xd4\xe3\xfd\x03\x01\x00\x03\x05z\xf8|\xac\xd1(\v\x11@\x93\xa2Cy\xfe\xaaף\xf4{y\x95\xa5\x00\xd3\xe4K\x8f\x83j,@\x8a9\xac\xe0+\x04t\x99#\xc53\xa91\x89\xf5\x00~\xa72])/7\x04\x9c\xd22\xd2q\x9b4\x9f\x04\x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x06\xa7\xd5\x17\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\x00\x00\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x00\x00\x00\x00o\"\x05\x90\x84\xfa<\xb0\xf4\xbbH\xd7\xfb\xb7x}\xc6C\xbf\xaa\xc9Wj\xee\t\xfbK\xbf\xbd\xa5\xd6Z\x01\x04\x04\x01\x02\x03\x00=\x02n\xca\xd7\"\x8a\xcc,\xe6v\xc9\xee\x8b\xc5|\xf518\x04\x1ec\xb4\xe2{\x1eZ\xf8\xb1g5O{\xb0\x01\xc2vs_\x00\x88\x13\x00\x00\x05\x00\x00l\x82Pjt\x00\x00\x00\x90ԙ\x01\x05\xe4n\x00\x12\x00\aS\x00\x90\x03D8\xd5C\x90y\x85\x18d\xf6\x82,\xb9u\x1f\xf4\x00\v\x9b\xabx\x96%\x80\x03\xc72\x13\fl0\x8a\xc1\xd0\x03h\x03"


----------------------------------------
0000000000000035
  x:proto                                  @ 2020/09/30-01:57:34.435000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xd5\x0e\x00D\x1c\n,3aGnCY7kyRDyDq4XkVSCu5jGwo1s2CMQjcFxJ8R3kFvd\x12,3n78sweXL3h872rxdYAhKkRFvaBX4rBDHkqvNM9V1DPE\x184\"\x86\x03\n\xe0\x02\n@\xd5\x1c/a\x87nG\a\xb1Y\xb9\x81\xab\x83\x1ci\x9c\xdf\xf8\xf1_\x16\xfb^\xc9\x1cZ\xaf\xa8\xe0щ\xe5\x11\xacB\x82J\xffg\xee\xa8\xcf\xc2\xe0\x10\x9a\xb4+#]\xd0\xf7\xa0\xa9F\xd5Ui.5\x1e\xa7\x04\x12\x9b\x02\n\x04\b\x01\x18\x03\x12 My\xd5\f\xfa\x8e\xa5jh\xc7k(\xa9R\xe7\xa2\xcb\xf4\x82\\ĞQ)$\"[/L\x01\xea-\x12 dz:?c\u007f\xc6\x13\xca\xee\xdf\x02\xf1\x1dn\r1\xad\xd1\xe1,Coc\xf5\x93\xfa\x06S _E\x12 \x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x1a &=a\xdbd\x9d\xd6s#Ss\x9a\xa3\xba\xce\x16\xb1\bc\xa0\x80\xe8\xe3\x10\xa4\x05K\x01\x16\b\xf76\"G\b\x04\x12\x04\x01\x02\x03\x00\x1a=\x02\x00\x00\x00\x01\x004\x00\xf0+\x86\xcb\xf7k\xfb\xd7P>\xb9;\x83Zi\xa9#\xf8\xf9\x8c\x03:Ѷ\xb6[\xf7T+\xdb\xd1\x05\x01Bts_!\x10\x88'\x1a\r\xf4\xab\xc2\xd2\xc6\x0e\x90\xa9\xe7\f\x01\x01\x01\"\r\xec\x842\x06\b\xc2\xe8\xcd\xfb\x05\a\x00e\xac\x15\xf2\xee\x03\x03\x13n  \\\xe0\x8e\x15ePS"
----------------------------------------
0000000000000036
  x:proto                                  @ 2020/09/30-01:57:34.437000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xdd\x0e\x00T\x1c\n,3n78sweXL3h872rxdYAhKkRFvaBX4rBDHkqvNM9V1DPE\x12,Gp5yzgV4jrRCYBSjTS8qU6GLpv1rFuGD6zBTS4sdyYEE\x185\"\x86\x03\n\xe0\x02\n@\xc6]\x11C7\\'/A@\xd8)\x8f\xbbj\x05\a\xf7\xee\xed\x10\xe1+\x19b\xb4~v\xd1\t\xf2\xe7\xa8\x1d\v\x9d\x10\xf5\xcc\xf3\x19,\xa4?6X\xd6]\xfe\xddł%Y\xe1\x18O\xcf\x1b\xaa\xf9,\x8c\f\x12\x9b\x02\n\x04\b\x01\x18\x03\x12 My\xd5\f\xfa\x8e\xa5jh\xc7k(\xa9R\xe7\xa2\xcb\xf4\x82\\ĞQ)$\"[/L\x01\xea-\x12 dz:?c\u007f\xc6\x13\xca\xee\xdf\x02\xf1\x1dn\r1\xad\xd1\xe1,Coc\xf5\x93\xfa\x06S _E\x12 \x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x1a )Er\xb4\xaf\xdf\xe1\xe8;\xbbM\x1002\x81vK\xb0\xf3\fXkN\xee\x97G)Y\x02\x147\xf9\"G\b\x04\x12\x04\x01\x02\x03\x00\x1a=\x02\x00\x00\x00\x01\x005\x00\x84ׄ\xdaF\xfc\xc8\ajQ\x93\x05\xa8s\"SN\xd8q\xb9\x94\xb9:\x0e\x02\xb3\xa9\x98\xa3\xd2.\xa8\x01Bts_!\x10\x88'\x1a\r\xb0\x98\xc2\xd2\xc6\x0e\x90\xa9\xe7\f\x01\x01\x01\"\r\xa8\xf1\xc12\x06\b\xc2\xe8\xcd\xfb\x05\a\x00\xc8\xec(\xe45\x02p#\xdcl@\xe4\xc0\xc5+ʠ\xa6"
----------------------------------------
0000000000000037
  x:proto                                  @ 2020/09/30-01:57:34.439000
    "\x03\x00\x00\x00(\xb5/\xfd\x00X\xdd\x0e\x00T\x1c\n,Gp5yzgV4jrRCYBSjTS8qU6GLpv1rFuGD6zBTS4sdyYEE\x12,3TTj6AFyHrFF7tsKeYdhUyuXN5jJPNuw5mGXdi2Qe9a7\x186\"\x86\x03\n\xe0\x02\n@\x80~}[l^\x90+\xe0\xd8\xf3Yy\xbf\x86i\x9cj\t\x0e\xfbq\x1bAN\xe3,f6D\x15q\x88\xc1X|\x89\x90LG\x01\xcao\xc0#\xcfV\x91\x14r\xa8\x95ב\x91\xc8Y\xbdT/\xd6\t\xac\x0f\x12\x9b\x02\n\x04\b\x01\x18\x03\x12 My\xd5\f\xfa\x8e\xa5jh\xc7k(\xa9R\xe7\xa2\xcb\xf4\x82\\ĞQ)$\"[/L\x01\xea-\x12 dz:?c\u007f\xc6\x13\xca\xee\xdf\x02\xf1\x1dn\r1\xad\xd1\xe1,Coc\xf5\x93\xfa\x06S _E\x12 \x06\xa7\xd5\x17\x19/\n\xaf\xc6\xf2e\xe3\xfbw\xcczڂ\xc5)о;\x13n-\x00U \x00\x00\x00\x18\xc7t\xc9(Vc\x98i\x1d^\xb6\x8b^\xb8\xa3\x9bKm\\sU[!\x00\aaH\x1d5tt\xbb|Mv$\xebӽ\xb3\xd85^s\xd1\x10C\xfc\r\xa3S\x80\x1a \xea\xef<_\xdb\x12\xd0\xe3\xa2\xf5\xb46G\xd3pNB\xd4\x02+\xaa\x9dJ3\xe5\xe5\x00c\xd1>\xc5\xdb\"G\b\x04\x12\x04\x01\x02\x03\x00\x1a=\x02\x00\x00\x00\x01\x006\x00B?\xb9{\x9aC\x9d\x86\x12\xf7t\x81\xd9\x14e\xe5P\xaeB\xcf\xf2P\xe9/\xfbˠ\xcf\xcdP\xa0\xe8\x01Cts_!\x10\x88'\x1a\r\xec\x84\xc2\xd2\xc6\x0e\x90\xa9\xe7\f\x01\x01\x01\"\r\xe4\xdd\xc12\x06\b\xc3\xe8\xcd\xfb\x05\a\x00\xc8\xec(\xe45\x02p#\xdcl@\xe4\xc0\xc5+ʠ\xa6"

I'm not sure how to translate this to raw bytes, but length of the string encoded values is roughly the same 🤷

@@ -0,0 +1,67 @@
syntax = "proto3";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer this layout:

storage-bigtable/src/confirmed_block.proto
storage-bigtable/proto/confirmed_block.rs
  1. keeping generated files out of src/ in general feels best, as we unleash linters and formatters on src/.
  2. No need for src/proto/blah.proto in this case, src/blah.proto is just as good since the number of files in this crate is small
  3. I'm not a superfan of storage-bigtable/proto/... as the location for generated protobuf source files, but I guess it works and it's not src/


message Reward {
string pubkey = 1;
int64 lamports = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fun fact: I'm already thinking about adding a new field to this struct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post_balance from this PR: #12561

@codecov
Copy link

codecov bot commented Sep 29, 2020

Codecov Report

Merging #12526 into master will increase coverage by 0.0%.
The diff coverage is 74.8%.

@@           Coverage Diff            @@
##           master   #12526    +/-   ##
========================================
  Coverage    82.0%    82.0%            
========================================
  Files         356      357     +1     
  Lines       83140    83390   +250     
========================================
+ Hits        68197    68461   +264     
+ Misses      14943    14929    -14     

@CriesofCarrots
Copy link
Contributor

CriesofCarrots commented Sep 30, 2020

@mvines , I think this is looking pretty decent. Can you take a look?
With #12587, the deserialization compatibility of StoredConfirmedBlock should be fixed; I've confirmed with various local ledgers and BigTable emulator.

mvines
mvines previously approved these changes Sep 30, 2020
Copy link
Member

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing overall. Super happy to see protobuf sneaking in more!

@@ -0,0 +1,291 @@
use solana_sdk::{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generic file/directory names like "utils" are pretty triggering for me. Maybe this can be called from.rs? convert.rs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way better. Thanks!

@mergify mergify bot dismissed mvines’s stale review September 30, 2020 16:30

Pull request has been modified.

@CriesofCarrots CriesofCarrots added the automerge Merge this Pull Request automatically once CI passes label Sep 30, 2020
@mergify mergify bot merged commit ce598c5 into solana-labs:master Sep 30, 2020
mergify bot pushed a commit that referenced this pull request Sep 30, 2020
* Use protobufs to store confirmed blocks in BigTable

* Cleanup

* Reorganize proto

* Clean up use statements

* Split out function for unit testing

* s/utils/convert

Co-authored-by: Tyera Eulberg <[email protected]>
(cherry picked from commit ce598c5)
mergify bot added a commit that referenced this pull request Sep 30, 2020
* Use protobufs to store confirmed blocks in BigTable

* Cleanup

* Reorganize proto

* Clean up use statements

* Split out function for unit testing

* s/utils/convert

Co-authored-by: Tyera Eulberg <[email protected]>
(cherry picked from commit ce598c5)

Co-authored-by: Justin Starry <[email protected]>
@jstarry jstarry deleted the bigtable-proto branch November 29, 2021 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Merge this Pull Request automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v1.3: Bigtable inner transaction plumbing is incompatible with v1.2
4 participants