Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize put by reuse memory of kvData before #379

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kingeasternsun
Copy link

optimize the put

the next node of the previous node is current node, so we can use  node to replace p.nodeData[m];
@kingeasternsun kingeasternsun changed the title Kingeasternsun op put 1 optimize put by reuse memory before Nov 10, 2021
@kingeasternsun kingeasternsun changed the title optimize put by reuse memory before optimize put by reuse memory of kvData before Nov 10, 2021
Copy link
Author

@kingeasternsun kingeasternsun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good advise

leveldb/memdb/memdb.go Show resolved Hide resolved
Copy link

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought some more about this. I think it violates the API. There are some methods in the memdb, e.g. Find, and Get, which returns the value slice from the kvData buffer.

They are documented with The caller should not modify the contents of the returned slice, implying that the returned value has not been copied, but is a direct reference to the backing data.

With this PR, a data-race is introduced, since the readlock is only held during the Get/Find, but not afterwards.

  • Routine 1: Get(aaa) -> returns 'bbb'
  • Routine 2: Put(aaa, cc) -> overwrites 'bbb' with cc.

And now, when Routine 1 looks at the data, it sees bbc. Hence why the kvData needs to remain append-only, and cannot be modified in-place.

@holiman
Copy link

holiman commented Dec 16, 2021

About that scenario ^ it would be good with a testcase for that.

Put(aaa, bbb)
val = Get(aaa)
Put(aaa, cc)
assert( val == bbb)

It should hold true for both disk-db and memdb

@holiman
Copy link

holiman commented Dec 16, 2021

It might be that this actually already works, since leveldb uses sequence numbers overlaid on memdb. So this change does make memdb api unsafe, but it won't matter when the memdb is used through a regular db -- it won't Put the exact key another time, but only a key with the same ukey prefix.

Which still is a bit dangerous, because it's a very subtle bug which will be hard to find. However, due to this usage of sequence numbers, I think the optimization is probably moot aswell.

@kingeasternsun
Copy link
Author

moot aswe

very great thoughts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants