Improve Iterator Performance of Seeking with Prefix #1719
Closed
zzyalbert wants to merge 78 commits into dgraph-io:main from zzyalbert:feature/improve_iterator_prefix_seek
zstd version bump
zstd is not set by default even when cgo is enabled.
Add a Builder type in skiplist package which can be used to insert sorted keys efficiently. Add a test and benchmark for it.
This change lets the skiplist grow when it is built with the sorted skiplist builder. The normal skiplist still cannot grow. Note: The growing skiplist is not thread safe. Co-authored-by: Ahsan Barkati
…aph-io#1696) In Dgraph, we already use Raft write-ahead log. Also, when we commit transactions, we update tens of thousands of keys in one go. To optimize this write path, this PR introduces a way to directly hand over Skiplist to Badger, short circuiting Badger's Value Log and WAL. This feature allows Dgraph to generate Skiplists while processing mutations and just hand them over to Badger during commits. It also accepts a callback which can be run when Skiplist is written to disk. This is useful for determining when to create a snapshot in Dgraph.
…isher (dgraph-io#1697) When a skip-list is handed over to badger we should also send the entries in skiplist to the publisher so that all the subscribers get notified.
This PR adds the DropPrefixNonBlocking and DropPrefixBlocking APIs, which can be used to logically delete the data for specified prefixes. DropPrefix now chooses between them based on the Badger option AllowStopTheWorld, whose default is to use DropPrefixBlocking. With DropPrefixNonBlocking the data is not cleared from the LSM tree immediately; it is deleted eventually through compactions. Co-authored-by: Rohan Prasad
Add benchmark tool for picktable benchmarking.
Fixes DOC-303
dgraph-io#1700) This PR adds a FullCopy option to Stream. This allows sending tables to the writer in their entirety. If this option is set to true, we directly copy over the tables from the last 2 levels. This option increases the stream speed while also lowering the memory consumption on the DB that is streaming the KVs. For a 71GB compressed and encrypted DB we observed a 3x improvement in speed. The DB contained ~65GB in the last 2 levels, with the remainder in the levels above.
To use this option, the following options should be set in Stream:
    stream.KeyToList = nil
    stream.ChooseKey = nil
    stream.SinceTs = 0
    db.managedTxns = true
If we use the stream writer for receiving the KVs, the encryption mode has to be the same in sender and receiver. This restricts db.StreamDB() to using the same encryption mode in both the input and output DB. Added a TODO for allowing different encryption modes.
…-io#1706) Use https://github.com/klauspost/compress ZSTD compression when CGO is not enabled. Related to dgraph-io#1383
…io#1705) immudb has its own store since version 0.9
Remove "GitHub issues" reference. (we use discuss now)
This PR adds support for stream writing incrementally to the DB. Adds an API: StreamWriter.PrepareIncremental. Co-authored-by: Manish R Jain
…h-io#1723) While doing an incremental stream write, we should look at the first level on which there is no data. Earlier, due to a bug we were writing to a level that already has some tables.
I propose this simple fix for detecting conflicts in managed mode. Addresses https://discuss.dgraph.io/t/fatal-error-when-writing-conflicting-keys-in-managed-mode/14784. When a write conflict exists for a managed DB, an internal assert can fail. This occurs because a detected conflict is indicated with commitTs of 0, but handling the error is skipped for managed DB instances. Rather than conflate conflict detection with a timestamp of 0, it can be indicated with another return value from hasConflict.
…raph-io#1721) The introduction of SinceTs (dgraph-io#1653) also introduced a bug that skips the pending entries. The default value of SinceTs is zero, and for a transaction made at readTs 0 the pending entries have their version set to 0, so they were skipped as well.
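The managed-mode conflict fix described in the commit list above can be illustrated with a minimal sketch. The `oracle` and `txn` types, their fields, and the `newCommitTs` helper below are hypothetical simplifications written for this illustration; they are not Badger's actual types or signatures. The point is only that a separate boolean lets a caller tell "conflict" apart from "commit timestamp is 0".

```go
package main

import "fmt"

// oracle is a hypothetical, simplified commit tracker (not Badger's real type).
type oracle struct {
	managed    bool              // true when the caller supplies commit timestamps
	nextTs     uint64            // last timestamp handed out in unmanaged mode
	lastCommit map[string]uint64 // key -> timestamp of its latest commit
}

// txn is a hypothetical transaction record.
type txn struct {
	readTs   uint64
	commitTs uint64 // caller-supplied in managed mode
	reads    []string
	writes   []string
}

// hasConflict reports whether any key read by t was committed after t.readTs.
func (o *oracle) hasConflict(t *txn) bool {
	for _, k := range t.reads {
		if ts, ok := o.lastCommit[k]; ok && ts > t.readTs {
			return true
		}
	}
	return false
}

// newCommitTs returns the commit timestamp and a separate conflict flag,
// so a timestamp of 0 never has to double as the conflict signal.
func (o *oracle) newCommitTs(t *txn) (uint64, bool) {
	if o.hasConflict(t) {
		return 0, true
	}
	ts := t.commitTs
	if !o.managed {
		o.nextTs++
		ts = o.nextTs
	}
	for _, k := range t.writes {
		o.lastCommit[k] = ts
	}
	return ts, false
}

func main() {
	o := &oracle{managed: true, lastCommit: map[string]uint64{}}
	first := &txn{readTs: 1, commitTs: 5, reads: []string{"k"}, writes: []string{"k"}}
	second := &txn{readTs: 1, commitTs: 6, reads: []string{"k"}, writes: []string{"k"}}
	ts, conflict := o.newCommitTs(first)
	fmt.Println(ts, conflict)
	ts, conflict = o.newCommitTs(second)
	fmt.Println(ts, conflict)
}
```

With this shape the caller checks the flag first and only then uses the timestamp, which keeps the managed path from skipping conflict handling.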
joshua-goldstein added area/performance (Performance related issues.) and removed skip/stale (Skip stalebot) labels Nov 4, 2022
Latest runner tag now uses ubuntu-22.04. We pin to ubuntu 20.04.
Currently [appveyor tests](https://ci.appveyor.com/project/manishrjain/badger/builds/42502297) are failing in multiple places on Windows.
## Problem
Currently we only deploy amd64 badger CLI tool builds. We would like arm64 builds too.
## Solution
Use an arm64 self-hosted runner to build arm64 badger CLI tool.
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Jan 18, 2023
Signed-off-by: thomassong
joshua-goldstein force-pushed the feature/improve_iterator_prefix_seek branch from 2ee9c97 to 2ec98c3 February 6, 2023 22:12
joshua-goldstein requested review from akon-dey, billprovince, joshua-goldstein and skrdgraph as code owners February 6, 2023 22:12
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Feb 13, 2023
Signed-off-by: thomassong
This PR has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.
Hey, I am trying to test this diff. Would it be possible for you to rebase it again? Otherwise I will have to create a new diff.
harshil-goel added a commit that referenced this pull request Aug 14, 2024
Copy of #1719 Co-authored-by: Ziyuan Zhong
This has been merged.
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Sep 14, 2024
Signed-off-by: thomassong
When I was trying to switch the DB engine to badgerdb in some of my projects, I found that iterator Seek with a prefix was pretty slow in the following situation:
Then I used pprof and found that the iterator was still running parseItem even when the current key did not match the prefix. So I fixed this by skipping the parseItem step when the current key does not match the prefix.
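The idea behind the fix can be sketched in a few lines of self-contained Go. This is only an illustration under stated assumptions: `entry`, `item`, `prefixIterator` and `collect` are hypothetical names invented here, and the `parseItem` method is a stand-in for the costly per-key work, not Badger's actual implementation. The sketch checks the prefix with `bytes.HasPrefix` before doing any parsing and counts how many times the parse step runs.

```go
package main

import (
	"bytes"
	"fmt"
)

// entry is a hypothetical stand-in for a raw stored key/value pair.
type entry struct {
	key   []byte
	value []byte
}

// item is the hypothetical parsed form handed back to the caller.
type item struct {
	key   string
	value string
}

// prefixIterator walks entries in order and records how often parsing runs.
type prefixIterator struct {
	entries []entry
	prefix  []byte
	pos     int
	parsed  int // number of parseItem calls, for illustration only
}

// parseItem stands in for the expensive per-key work.
func (it *prefixIterator) parseItem(e entry) item {
	it.parsed++
	return item{key: string(e.key), value: string(e.value)}
}

// next returns the next item whose key has the prefix.
// Keys without the prefix are skipped before any parsing happens.
func (it *prefixIterator) next() (item, bool) {
	for it.pos < len(it.entries) {
		e := it.entries[it.pos]
		it.pos++
		if !bytes.HasPrefix(e.key, it.prefix) {
			continue // cheap check first: no parseItem for non-matching keys
		}
		return it.parseItem(e), true
	}
	return item{}, false
}

// collect drains the iterator and reports the items plus the parse count.
func collect(entries []entry, prefix []byte) ([]item, int) {
	it := &prefixIterator{entries: entries, prefix: prefix}
	var out []item
	for {
		v, ok := it.next()
		if !ok {
			break
		}
		out = append(out, v)
	}
	return out, it.parsed
}

func main() {
	entries := []entry{
		{key: []byte("a/1"), value: []byte("x")},
		{key: []byte("b/1"), value: []byte("y")},
		{key: []byte("b/2"), value: []byte("z")},
		{key: []byte("c/1"), value: []byte("w")},
	}
	items, parsed := collect(entries, []byte("b/"))
	fmt.Println(len(items), parsed)
}
```

In this sketch the parse count equals the number of matching keys rather than the number of keys visited, which is the saving the description refers to.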