fix(bigtable/bttest): make table gc release memory #3930
We had a problem where we were unable to run the emulator for extended periods in dev environments without giving it huge amounts of RAM, even if we had quite aggressive GC policies applied to the tables. This fixes that by making GC delete empty rows and columns.
Ah, actually, scratch that. It's still not safe. I'm not convinced it was safe before, as GC was changing the cell slices, but this has certainly made it more obviously unsafe. The easy option would be to hold the write lock over the whole GC cycle, whether or not it changes anything. A harder option would be to do the whole thing in two phases.
Ah, no, wrong again. I think this version is good. The row lock protects us.
Wrong a third time! We need to re-check the rows as we delete them, with a lock held. It could be that cells have been added to the row in the meantime. Fix incoming.
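For readers following the locking discussion, here is a minimal sketch of the "re-check under the lock" idea. It is not the PR's code: the `table` and `row` types, the `trim` and `deleteIfStillEmpty` helpers, and the max-age style cutoff are all hypothetical simplifications chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// row is a simplified, hypothetical stand-in: column name -> cell timestamps.
type row struct {
	mu    sync.Mutex
	cells map[string][]int64
}

// table is a hypothetical stand-in that uses a plain map instead of a B-tree.
type table struct {
	mu   sync.RWMutex
	rows map[string]*row
}

// trim drops cells older than cutoff and returns the keys of rows left empty.
// It holds only the table read lock plus each row's own lock.
func (t *table) trim(cutoff int64) []string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	var empty []string
	for key, r := range t.rows {
		r.mu.Lock()
		for col, tss := range r.cells {
			var kept []int64
			for _, ts := range tss {
				if ts >= cutoff {
					kept = append(kept, ts)
				}
			}
			if len(kept) == 0 {
				delete(r.cells, col) // remove the key, not just its contents
			} else {
				r.cells[col] = kept
			}
		}
		if len(r.cells) == 0 {
			empty = append(empty, key)
		}
		r.mu.Unlock()
	}
	return empty
}

// deleteIfStillEmpty removes candidate rows under the table write lock,
// re-checking each one because a write may have landed since trim ran.
func (t *table) deleteIfStillEmpty(keys []string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, key := range keys {
		r, ok := t.rows[key]
		if !ok {
			continue
		}
		r.mu.Lock()
		if len(r.cells) == 0 {
			delete(t.rows, key)
		}
		r.mu.Unlock()
	}
}

func main() {
	t := &table{rows: map[string]*row{
		"a": {cells: map[string][]int64{"col": {1}}},
		"b": {cells: map[string][]int64{"col": {1}}},
	}}
	candidates := t.trim(10)
	// Simulate a write to row "a" arriving between the two phases.
	t.rows["a"].cells["col"] = []int64{20}
	t.deleteIfStillEmpty(candidates)
	fmt.Println(len(t.rows)) // row "a" survives, row "b" is removed
}
```

The point of the second phase is that a row found empty during trimming is only a candidate; the decision to delete is made again once the write lock is held.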
This looks good to me too. I have asked some other folks who are more familiar with the emulator to take a look as well, in case there is a hidden gotcha.
Any chance of getting this merged?
LGTM, but needs at least a basic test.
@jimfulton do you have any suggestions about what the test would look like and what it would test? Presumably there are already tests that the GC operates correctly in terms of what's visible in the database. There are already tests that catch issues with --race (I know because I broke them!). Are you suggesting a test that counts memory allocations?
Reviewing
The changes LGTM
You can add a test with a simple GC rule such as maxNumVersions and verify that empty rows, cells, and column families do get deleted.
Added the tests.
How does the emulator store rows?
Each table in the emulator contains a B-tree that stores the rows:
Each object in this B-tree is of type row. A row has a families map. A family has a cells map keyed by column name.
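The layout described above can be sketched roughly as follows. This is a simplified illustration, not the emulator's actual definitions: the field sets are trimmed down, the helper methods `getOrCreateRow` and `setCell` are hypothetical, and a sorted slice stands in for the B-tree so the example needs no third-party package.

```go
package main

import (
	"fmt"
	"sort"
)

// cell is one timestamped value in a column.
type cell struct {
	ts    int64
	value []byte
}

// family holds the cells of one column family, keyed by column name.
type family struct {
	cells map[string][]cell
}

// row holds the families of one row.
type row struct {
	key      string
	families map[string]*family
}

// table stands in for the emulator's table. The emulator keeps rows in a
// B-tree; a slice sorted by key is an assumption made to avoid dependencies.
type table struct {
	rows []*row
}

// getOrCreateRow returns the row with the given key, inserting it in
// sorted position if it does not exist yet.
func (t *table) getOrCreateRow(key string) *row {
	i := sort.Search(len(t.rows), func(i int) bool { return t.rows[i].key >= key })
	if i < len(t.rows) && t.rows[i].key == key {
		return t.rows[i]
	}
	r := &row{key: key, families: map[string]*family{}}
	t.rows = append(t.rows, nil)
	copy(t.rows[i+1:], t.rows[i:])
	t.rows[i] = r
	return r
}

// setCell appends a cell to the given family and column of a row,
// creating the family on first use.
func (r *row) setCell(fam, col string, c cell) {
	f, ok := r.families[fam]
	if !ok {
		f = &family{cells: map[string][]cell{}}
		r.families[fam] = f
	}
	f.cells[col] = append(f.cells[col], c)
}

func main() {
	t := &table{}
	r := t.getOrCreateRow("row-1")
	r.setCell("fam-1", "col-1", cell{ts: 1000, value: []byte("a")})
	r.setCell("fam-1", "col-2", cell{ts: 2000, value: []byte("b")})
	fmt.Println(len(t.rows), len(r.families["fam-1"].cells))
}
```

The nesting is what matters for the memory discussion: every level (row, family, column) is a container that can outlive its contents unless something removes the key.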
A sample row would be:
Many fields have been omitted from the above sample for brevity.
Now, when cells col-1 and col-2 are deleted, the emulator just sets "col-1" and "col-2" to an empty array, i.e. the row becomes:
In this PR, this behaviour is being changed to remove the key itself, i.e. the row will become:
Similarly, when all the families are deleted, the empty row will be deleted as well.
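The change in behaviour can be illustrated with a small sketch. The types below are simplified stand-ins, and `pruneEmpty`, "fam-1", "fam-2" and "col-3" are hypothetical names introduced for this example; only "col-1" and "col-2" come from the description above.

```go
package main

import "fmt"

type cell struct {
	ts    int64
	value []byte
}

type family struct {
	cells map[string][]cell // keyed by column name
}

type row struct {
	families map[string]*family
}

// pruneEmpty removes columns that have no cells and families that have no
// columns. It reports whether the row is left with no families, in which
// case the caller can remove the row itself.
func (r *row) pruneEmpty() bool {
	for famName, f := range r.families {
		for col, cs := range f.cells {
			if len(cs) == 0 {
				delete(f.cells, col)
			}
		}
		if len(f.cells) == 0 {
			delete(r.families, famName)
		}
	}
	return len(r.families) == 0
}

func main() {
	// State after GC under the old behaviour: keys remain, slices are empty.
	r := &row{families: map[string]*family{
		"fam-1": {cells: map[string][]cell{
			"col-1": {},
			"col-2": {},
			"col-3": {{ts: 1000, value: []byte("v")}},
		}},
		"fam-2": {cells: map[string][]cell{"col-1": {}}},
	}}
	rowEmpty := r.pruneEmpty()
	fmt.Println(rowEmpty, len(r.families), len(r.families["fam-1"].cells))
}
```

After pruning, only "fam-1" with "col-3" remains; had every family been emptied, `pruneEmpty` would return true and the row could be dropped from the table.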
Closes: #6102
Description by PR and issue original author:
We had a problem where we were unable to run the Bigtable emulator for extended periods in dev environments without giving it huge amounts of RAM, even if we had quite aggressive GC policies applied to the tables. This fixes that by making GC in the emulator actually delete empty rows and columns.