client/db/bolt: omit copy of buffer and upgrade in separate db txns #1070
Conversation
// {"testnetbot", v1Upgrade, verifyV1Upgrade, "dexbot-testnet.db.gz", 3}, // only for TestUpgradeDB
{"upgradeFromV0", v1Upgrade, verifyV1Upgrade, "v0.db.gz", 1},
{"upgradeFromV1", v2Upgrade, verifyV2Upgrade, "v1.db.gz", 2},
{"upgradeFromV2", v3Upgrade, verifyV3Upgrade, "v2.db.gz", 3},
Test a big upgrade by commenting out these three and uncommenting the other one, which is only for TestUpgradeDB. Then see how it goes with go test -v -count 1 -run TestUpgradeDB.
If it succeeds, you can try a mem profile:
go test -v -count 1 -run TestUpgradeDB -memprofile mem.out
go tool pprof bolt.test mem.out
# top and web are easiest
This PR only changes two things that could help memory: it elides two unneeded copies inside db txns (the buffer from Get is not used outside the db tx), and it performs the upgrades in separate db txns. EDIT: plus a couple of other optimizations and shortcuts with cancel matches.
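The lifetime rule behind the copy elision can be sketched without bbolt itself. Below is a stdlib-only Go simulation (the view helper and its buffer recycling are stand-ins for a bbolt read tx and its memory-mapped pages, not dcrdex code): a slice read only inside the tx needs no copy, while a slice that escapes must be copied getCopy-style.

```go
package main

import (
	"bytes"
	"fmt"
)

// view simulates a bbolt read transaction: the []byte it hands to fn is only
// valid while fn runs, like the slice returned by (*bbolt.Bucket).Get. When
// the "tx" ends, the backing storage is recycled (overwritten here).
func view(fn func(valueB []byte)) {
	buf := []byte("match proof v2") // storage owned by the tx
	fn(buf)
	for i := range buf { // tx over: pages recycled
		buf[i] = 0xff
	}
}

// readInsideTx uses the value only inside the tx, so no copy is needed —
// the pattern this PR switches to (mBkt.Get instead of getCopy).
func readInsideTx() bool {
	var isV2 bool
	view(func(valueB []byte) {
		isV2 = bytes.HasSuffix(valueB, []byte("v2"))
	})
	return isV2
}

// escapeFromTx shows why getCopy exists: a slice kept past the tx is
// clobbered, while a defensive copy survives.
func escapeFromTx() (escaped, copied []byte) {
	view(func(valueB []byte) {
		escaped = valueB                        // aliases tx-owned memory
		copied = append([]byte(nil), valueB...) // getCopy-style copy
	})
	return escaped, copied
}

func main() {
	fmt.Println(readInsideTx())
	esc, cop := escapeFromTx()
	fmt.Println(string(cop), bytes.Equal(esc, cop))
}
```

The elided copies in this PR are exactly the readInsideTx case: the proof bytes never outlive the db.Tx.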
Does not OOM for me.
Thanks for checking. I suspect breaking each upgrade into different db txns allowed cleanup (GC or other OS business). So I guess this PR is needed then.
(pprof) top
Showing nodes accounting for 3810.92MB, 92.13% of 4136.67MB total
Dropped 21 nodes (cum <= 20.68MB)
Showing top 10 nodes out of 45
flat flat% sum% cum cum%
678.63MB 16.41% 16.41% 678.63MB 16.41% go.etcd.io/bbolt.Open.func1
575.58MB 13.91% 30.32% 575.58MB 13.91% go.etcd.io/bbolt.(*node).dereference
457.83MB 11.07% 41.39% 457.83MB 11.07% go.etcd.io/bbolt.(*node).read
447.71MB 10.82% 52.21% 447.71MB 10.82% go.etcd.io/bbolt.(*node).put
417.71MB 10.10% 62.31% 417.71MB 10.10% decred.org/dcrdex/dex/encode.ExtractPushes
380.76MB 9.20% 71.51% 380.76MB 9.20% go.etcd.io/bbolt.(*Bucket).write
310.18MB 7.50% 79.01% 359.18MB 8.68% go.etcd.io/bbolt.(*Bucket).openBucket
215.01MB 5.20% 84.21% 215.01MB 5.20% go.etcd.io/bbolt.(*Cursor).search
183.50MB 4.44% 88.64% 641.33MB 15.50% go.etcd.io/bbolt.(*Bucket).node
144.02MB 3.48% 92.13% 144.02MB 3.48% decred.org/dcrdex/dex/encode.BuildyBytes.AddData
Looks to work well for me. Is it using too much memory? Possibly related? boltdb/bolt#253
@JoeGruffins could I trouble you for a re-review after having split the upgrades into separate db txns?
@@ -848,7 +848,7 @@ func (db *BoltDB) DEXOrdersWithActiveMatches(dex string) ([]order.OrderID, error
 }

 // Inactive if refunded.
-proofB := getCopy(mBkt, proofKey)
+proofB := mBkt.Get(proofKey)
In this method, DEXOrdersWithActiveMatches, the proof bytes are only used inside the db.Tx to check revoked and refunded status, never outside the tx, so eliding the copy is safe.
@@ -197,7 +209,7 @@ func reloadMatchProofs(tx *bbolt.Tx) error {
 if mBkt == nil {
 	return fmt.Errorf("match %x bucket is not a bucket", k)
 }
-proofB := getCopy(mBkt, proofKey)
+proofB := mBkt.Get(proofKey)
In reloadMatchProofs, the proof bytes are decoded, re-encoded, and re-saved inside the transaction, so it is also safe to elide the copy.
archive.Close()
dbFile.Close()
We're done with both the archive and the extracted db here. The NewDB call back in the test loads the DB by its file name.
client/db/bolt/upgrades.go
Outdated
// Each upgrade gets its own tx.
for i, upgrade := range upgrades[version:] {
	err = db.Update(func(tx *bbolt.Tx) error {
		return doUpgrade(tx, upgrade, version+uint32(i)+1)
	})
	if err != nil {
		return err
	}
}
return nil
// All upgrades in a single tx.
// return db.Update(func(tx *bbolt.Tx) error {
// 	// Execute all necessary upgrades in order.
// 	for i, upgrade := range upgrades[version:] {
// 		err := doUpgrade(tx, upgrade, version+uint32(i)+1)
// 		if err != nil {
// 			return err
// 		}
// 	}
// 	return nil
// })
Will remove this commented chunk. I feel it's not horrible if a rollback only goes one version up instead of all the way back to the starting version, because at least the DB isn't hosed (there's also a file backup created at the initial version). It does have an impact on memory, so I don't think we have a choice. See the TestUpgradeDB test with the dexbot-testnet.db.gz archive.
All in 1 txn:
File: bolt.test
Type: alloc_space
Time: Apr 27, 2021 at 2:41pm (CDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 3861.84MB, 92.23% of 4187.32MB total
Dropped 8 nodes (cum <= 20.94MB)
Showing top 10 nodes out of 45
flat flat% sum% cum cum%
673.61MB 16.09% 16.09% 673.61MB 16.09% go.etcd.io/bbolt.Open.func1
583.08MB 13.92% 30.01% 583.08MB 13.92% go.etcd.io/bbolt.(*node).dereference
464.23MB 11.09% 41.10% 464.23MB 11.09% go.etcd.io/bbolt.(*node).put
451.83MB 10.79% 51.89% 451.83MB 10.79% go.etcd.io/bbolt.(*node).read
442.72MB 10.57% 62.46% 442.72MB 10.57% decred.org/dcrdex/dex/encode.ExtractPushes
407.78MB 9.74% 72.20% 407.78MB 9.74% go.etcd.io/bbolt.(*Bucket).write
314.68MB 7.52% 79.72% 369.18MB 8.82% go.etcd.io/bbolt.(*Bucket).openBucket
211.51MB 5.05% 84.77% 211.51MB 5.05% go.etcd.io/bbolt.(*Cursor).search
165.38MB 3.95% 88.72% 617.21MB 14.74% go.etcd.io/bbolt.(*Bucket).node
147.02MB 3.51% 92.23% 147.02MB 3.51% decred.org/dcrdex/dex/encode.BuildyBytes.AddData
Separate txns:
File: bolt.test
Type: alloc_space
Time: Apr 27, 2021 at 2:38pm (CDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 3212.76MB, 91.10% of 3526.49MB total
Dropped 24 nodes (cum <= 17.63MB)
Showing top 10 nodes out of 45
flat flat% sum% cum cum%
453.32MB 12.85% 12.85% 453.32MB 12.85% go.etcd.io/bbolt.(*node).read
435.18MB 12.34% 25.20% 435.18MB 12.34% go.etcd.io/bbolt.(*node).put
420.21MB 11.92% 37.11% 420.21MB 11.92% decred.org/dcrdex/dex/encode.ExtractPushes
398.27MB 11.29% 48.40% 398.27MB 11.29% go.etcd.io/bbolt.(*Bucket).write
364.41MB 10.33% 58.74% 364.41MB 10.33% go.etcd.io/bbolt.Open.func1
318.06MB 9.02% 67.76% 318.06MB 9.02% go.etcd.io/bbolt.(*node).dereference
311.17MB 8.82% 76.58% 356.18MB 10.10% go.etcd.io/bbolt.(*Bucket).openBucket
201.51MB 5.71% 82.30% 201.51MB 5.71% go.etcd.io/bbolt.(*Cursor).search
174.11MB 4.94% 87.23% 627.43MB 17.79% go.etcd.io/bbolt.(*Bucket).node
136.51MB 3.87% 91.10% 136.51MB 3.87% decred.org/dcrdex/dex/encode.BuildyBytes.AddData
With the current changes:
top
Showing nodes accounting for 3515.76MB, 91.35% of 3848.73MB total
Dropped 16 nodes (cum <= 19.24MB)
Showing top 10 nodes out of 45
flat flat% sum% cum cum%
648.01MB 16.84% 16.84% 648.01MB 16.84% go.etcd.io/bbolt.Open.func1
463.85MB 12.05% 28.89% 463.85MB 12.05% go.etcd.io/bbolt.(*node).read
451.74MB 11.74% 40.63% 451.74MB 11.74% go.etcd.io/bbolt.(*node).put
444.22MB 11.54% 52.17% 444.22MB 11.54% decred.org/dcrdex/dex/encode.ExtractPushes
381.26MB 9.91% 62.07% 381.26MB 9.91% go.etcd.io/bbolt.(*Bucket).write
319.06MB 8.29% 70.36% 319.06MB 8.29% go.etcd.io/bbolt.(*node).dereference
273.66MB 7.11% 77.47% 330.16MB 8.58% go.etcd.io/bbolt.(*Bucket).openBucket
206.01MB 5.35% 82.83% 206.01MB 5.35% go.etcd.io/bbolt.(*Cursor).search
172.94MB 4.49% 87.32% 636.78MB 16.55% go.etcd.io/bbolt.(*Bucket).node
155.02MB 4.03% 91.35% 155.02MB 4.03% decred.org/dcrdex/dex/encode.BuildyBytes.AddData
Your second top is missing go.etcd.io/bbolt.Open.func1, I wonder why?
I wonder why too. That profile looks more like the profile with a single Update as it was before. 😕
With d977162, now I'm seeing:
$ go test -v -count 1 -run TestUpgradeDB -memprofile=mem.out
=== RUN TestUpgradeDB
2021-04-28 11:58:11.281 [INF] db_TEST: Upgrading database from version 1 to 3
2021-04-28 11:58:11.513 [DBG] db_TEST: Upgrading to version 2...
2021-04-28 11:58:15.132 [DBG] db_TEST: Upgrading to version 3...
--- PASS: TestUpgradeDB (6.90s)
PASS
ok decred.org/dcrdex/client/db/bolt 7.221s
$ go tool pprof bolt.test mem.out
File: bolt.test
Type: alloc_space
Time: Apr 28, 2021 at 11:58am (CDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top15
Showing nodes accounting for 2627.65MB, 99.25% of 2647.48MB total
Dropped 14 nodes (cum <= 13.24MB)
Showing top 15 nodes out of 43
flat flat% sum% cum cum%
458.21MB 17.31% 17.31% 458.21MB 17.31% go.etcd.io/bbolt.(*node).put
364.92MB 13.78% 31.09% 364.92MB 13.78% go.etcd.io/bbolt.Open.func1
335.56MB 12.67% 43.77% 335.56MB 12.67% go.etcd.io/bbolt.(*node).dereference
298.68MB 11.28% 55.05% 361.68MB 13.66% go.etcd.io/bbolt.(*Bucket).openBucket
247.69MB 9.36% 64.40% 247.69MB 9.36% go.etcd.io/bbolt.(*node).read
216.66MB 8.18% 72.59% 216.66MB 8.18% go.etcd.io/bbolt.(*Bucket).write
163.01MB 6.16% 78.74% 163.01MB 6.16% go.etcd.io/bbolt.(*Cursor).search
154.58MB 5.84% 84.58% 154.58MB 5.84% decred.org/dcrdex/dex/encode.ExtractPushes
127.05MB 4.80% 89.38% 127.05MB 4.80% decred.org/dcrdex/client/db.decodeMatchProof_v2
92.26MB 3.49% 92.87% 339.96MB 12.84% go.etcd.io/bbolt.(*Bucket).node
81.67MB 3.08% 95.95% 537.36MB 20.30% go.etcd.io/bbolt.(*Bucket).Bucket
63MB 2.38% 98.33% 63MB 2.38% go.etcd.io/bbolt.newBucket (inline)
10MB 0.38% 98.71% 1049.60MB 39.65% go.etcd.io/bbolt.(*Bucket).spill
7.50MB 0.28% 98.99% 759.83MB 28.70% go.etcd.io/bbolt.(*Bucket).Put
6.86MB 0.26% 99.25% 709.17MB 26.79% go.etcd.io/bbolt.(*Tx).allocate
I left some suggestions here: chappjc#2. I am getting this by updating in batches.
The peak allocation is clearly reduced by doing match updates in chunks @JoeGruffins. My only comment on that approach was regarding rollback not necessarily leaving the DB in the state it was in prior to starting the upgrade. chappjc#2 (review) I think this may be acceptable since the first Update in a chunk is likely to error if any of them error (although it could be a disk-full error on the last chunk), and we have a backup db file created before starting... Opinions welcome on the issue. It is clear that the larger the txn, the higher the peak memory utilization...
My suspicion now is that we've made sub-buckets with values that are simply too large and should have been split into multiple sub-buckets, namely the MatchProof being serialized and written into a single sub-bucket of a match bucket. If we had each field of MatchProof in a separate sub-bucket, we likely wouldn't have this memory issue just to update/add one field. Another reason why splitting more could be good is being able to check just one field in the db, such as RedeemSig, without loading and decoding the entire MatchProof. This is a current issue with checking for match completeness (long load times as we cycle through all matches to determine if they are active), and it is the reason for the proposed addition of an archived orders bucket in #1071.

Making the above change is a challenge though. It's not just a big change; it still requires dealing with an upgrade that needs to load all the match proofs just to split them into more buckets. EDIT: smaller sub-buckets wouldn't explain the memory use from v2Upgrade though, because it just adds a key.
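The split-the-proof idea can be sketched with plain maps. In this stdlib-only sketch, bucket and the proofRedeemSig/proofAuthSig key names are hypothetical stand-ins for a bbolt match bucket and a possible per-field schema, not the actual dcrdex layout:

```go
package main

import "fmt"

// bucket is a stand-in for a bbolt match bucket: a flat key/value namespace.
type bucket map[string][]byte

// Today the whole MatchProof is one serialized blob under a single proof key,
// so checking any one field means decoding all of its pushes. With one key
// per field, a status check becomes a single small Get.
func putProofFields(b bucket, redeemSig, authSig []byte) {
	b["proofRedeemSig"] = redeemSig // hypothetical per-field keys
	b["proofAuthSig"] = authSig
}

// hasRedeemSig answers a completeness-style question without loading or
// decoding the rest of the proof.
func hasRedeemSig(b bucket) bool {
	return len(b["proofRedeemSig"]) > 0
}

func main() {
	b := bucket{}
	putProofFields(b, nil, []byte("authsig"))
	fmt.Println(hasRedeemSig(b))
	putProofFields(b, []byte("redeemsig"), []byte("authsig"))
	fmt.Println(hasRedeemSig(b))
}
```

The trade-off noted above still applies: migrating existing DBs to such a schema would itself need to load and decode every stored proof once.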
@@ -3923,7 +3925,7 @@ func (c *Core) dbTrackers(dc *dexConnection) (map[order.OrderID]*trackedTrade, e
 trackers[dbOrder.Order.ID()] = tracker

 // Get matches.
-dbMatches, err := c.db.MatchesForOrder(oid)
+dbMatches, err := c.db.MatchesForOrder(oid, excludeCancelMatches)
Note a few lines down how we skip cancel matches:

// Only trade matches are added to the matches map. Detect and skip
// cancel order matches, which have an empty Address field.
if dbMatch.Address == "" {
	// tracker.cancel is set from LinkedOrder with cancelTrade.
	continue
}

Any cancel info comes from the order's data (tracker.metaData.LinkedOrder).
reProof, ver, err := db.DecodeMatchProof(proofB)
if err != nil {
	t.Fatalf("match decode error: %v", err)
}
MustCompareMatchProof(t, proof, reProof)
if ver != db.MatchProofVer {
	t.Errorf("wanted match proof ver %d, got %d", db.MatchProofVer, ver)
}
I'm actually not getting why either the v1 or v3 upgrades that just call reloadMatchProofs are needed at all, given that MatchProofs will always load the same regardless of the serialization version in the stored data.
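The version tolerance behind that question can be illustrated with a stdlib-only sketch. Here proof and decodeProof are hypothetical stand-ins for db.MatchProof and db.DecodeMatchProof (the real encoding and field set differ); the point is that any known stored version decodes to the same current struct:

```go
package main

import (
	"errors"
	"fmt"
)

// proof stands in for db.MatchProof with a field added in a later
// serialization version.
type proof struct {
	Sig           []byte
	ServerRevoked bool // added in v2; v1 encodings simply lack the push
}

// decodeProof accepts any known version and always yields the current
// struct, zero-filling fields an older encoding did not carry. If decoding
// is version-tolerant like this, re-saving proofs just to bump the stored
// version buys nothing at load time.
func decodeProof(ver byte, pushes [][]byte) (*proof, error) {
	switch ver {
	case 1: // v1: signature only
		if len(pushes) != 1 {
			return nil, errors.New("v1 wants 1 push")
		}
		return &proof{Sig: pushes[0]}, nil
	case 2: // v2: signature + revoked flag
		if len(pushes) != 2 {
			return nil, errors.New("v2 wants 2 pushes")
		}
		return &proof{Sig: pushes[0], ServerRevoked: len(pushes[1]) == 1 && pushes[1][0] == 1}, nil
	}
	return nil, fmt.Errorf("unknown version %d", ver)
}

func main() {
	old, _ := decodeProof(1, [][]byte{[]byte("sig")})
	cur, _ := decodeProof(2, [][]byte{[]byte("sig"), {1}})
	fmt.Println(old.ServerRevoked, cur.ServerRevoked)
}
```

Under this model, a reload-and-rewrite pass only changes what is on disk, not what a decode produces, which is why the reloadMatchProofs upgrades look skippable.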
 func (c *Core) coreOrderFromMetaOrder(mOrd *db.MetaOrder) (*Order, error) {
 	corder := coreOrderFromTrade(mOrd.Order, mOrd.MetaData)
 	oid := mOrd.Order.ID()
-	matches, err := c.db.MatchesForOrder(oid)
+	excludeCancels := false // maybe don't include cancel order matches?
+	matches, err := c.db.MatchesForOrder(oid, excludeCancels)
This is used by (*Core).Order and (*Core).Orders, and the consumers may be looking at cancel matches too, so I didn't change this result. apiOrders does seem to need the cancel matches so it can display them, although it is otherwise very unlikely to need them, and we have had reports of the order history page taking extremely long to pull new orders... So coreOrderFromMetaOrder really does need to get a MetaOrder that includes even cancel order matches.
With 68a6cb3 just added, the v2 upgrade is faster and lighter since it no longer incorrectly sets maxFeeRate for cancel orders. Timing and memprofile now, at 2ba2aeb:
Looks like this reduced the memory usage; that db must have a lot of cancel orders?
// No need to rewrite this if it was loaded from the current version.
if ver == dexdb.MatchProofVer {
	return nil
}
This logic would make resuming from failed batches easily doable for this upgrade, if we did batches.
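A resumable batched upgrade built on that skip could look like the following stdlib-only sketch. upgradeInBatches and rec are hypothetical; in the real code each batch would be one bbolt db.Update and the skip would be the ver == dexdb.MatchProofVer check above:

```go
package main

import "fmt"

const currentVer = 2

// rec stands in for a stored match proof with its serialization version.
type rec struct{ ver int }

// upgradeInBatches re-saves records in fixed-size batches, committing after
// each batch. Records already at currentVer are skipped, so a run that
// failed midway can simply be restarted: it fast-forwards past finished
// batches and only commits where work remains.
func upgradeInBatches(recs []*rec, batchSize int) (commits int) {
	for start := 0; start < len(recs); start += batchSize {
		end := start + batchSize
		if end > len(recs) {
			end = len(recs)
		}
		dirty := false
		for _, r := range recs[start:end] {
			if r.ver == currentVer {
				continue // already upgraded; resume support
			}
			r.ver = currentVer
			dirty = true
		}
		if dirty {
			commits++ // commit this batch's tx
		}
	}
	return commits
}

func main() {
	recs := make([]*rec, 10)
	for i := range recs {
		recs[i] = &rec{ver: 1}
	}
	// Simulate a prior run that finished the first batch before failing.
	recs[0].ver, recs[1].ver, recs[2].ver = currentVer, currentVer, currentVer
	fmt.Println(upgradeInBatches(recs, 3))
}
```

Small batches bound the per-tx dirty page set, which is the same peak-memory lever as splitting the upgrades into separate txns, just finer-grained.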
Thanks for the follow-up Joe. We've had a good amount of feedback and testing, so I think this is good to go.
try each upgrade in its own tx

excludeCancels option for db.MatchesForOrder and reloadMatchProofs

Cancel matches can be the lion's share of matches in some DBs, and in most cases it is not necessary to process these.
- In db upgrades, there is no reason to re-encode the MatchProof. It's not clear there is a reason to do this for trade matches either, since any match proof version decodes automatically to the current MatchProof.
- In (*Core).dbTrackers, cancel order matches are not used.
- (*Core).coreOrderFromMetaOrder might be able to exclude cancel order matches, but the consumers are exported Core methods, so no change.

This also includes a prealloc optional arg to ExtractPushes and DecodeBlob, which is mainly helpful for the MatchProof buckets that have 22 pushes.

do not set maxFeeRate for cancels in the v2 upgrade
This attempts to address high memory utilization (both VIRT and RES) during a db upgrade with the following changes:
- Eliding copies of the buffers returned by Get because they are not used outside the db.Tx and thus do not need to be copies.
- Skipping cancel matches in MatchesForOrder, and not re-saving cancel matches in reloadMatchProofs since they will always lack both revoke bools and txdata fields that are added in these upgrades. d977162 However, I'm uncertain now if reloadMatchProofs really needs to be done at all, since DecodeMatchProof ensures old encodings load the same way as new encodings.
- Not setting maxFeeRate in the v2 upgrade for cancel orders (tryCancelTrade), so ^uint8(0) was incorrect here previously. But the maxFeeRate is correctly ignored for cancel matches in (*dexConnection).parseMatches. Skipping these updates to cancel orders further improves memory utilization and speed. 68a6cb3

These changes will be needed for the 0.2 upgrade since there are some rather large client DBs out there that are going to have similar troubles.
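The prealloc hint mentioned above can be sketched as follows. extractPushes here is a simplified, hypothetical stand-in for dex/encode.ExtractPushes (the real push encoding differs); the point is only the optional-capacity argument that avoids repeated slice growth when the caller knows the push count, e.g. 22 for a MatchProof:

```go
package main

import "fmt"

// extractPushes splits a blob of length-prefixed pushes. The variadic
// prealloc lets callers that know the push count size the result slice up
// front instead of growing it push by push. Single-byte length prefixes are
// a simplification of the real encoding.
func extractPushes(b []byte, prealloc ...int) ([][]byte, error) {
	alloc := 0
	if len(prealloc) > 0 {
		alloc = prealloc[0]
	}
	pushes := make([][]byte, 0, alloc)
	for len(b) > 0 {
		n := int(b[0])
		b = b[1:]
		if n > len(b) {
			return nil, fmt.Errorf("corrupt push: need %d bytes, have %d", n, len(b))
		}
		pushes = append(pushes, b[:n])
		b = b[n:]
	}
	return pushes, nil
}

func main() {
	blob := []byte{3, 'a', 'b', 'c', 2, 'x', 'y'}
	pushes, err := extractPushes(blob, 22) // hint: expect up to 22 pushes
	if err != nil {
		panic(err)
	}
	fmt.Println(len(pushes), string(pushes[0]), string(pushes[1]))
}
```

During an upgrade that decodes every stored proof, skipping the incremental append growth for each of those 22-push blobs is a small but repeated allocation win.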