Improve IdentityStore Invalidate performance #27184

Merged

marcboudreau

The Invalidate method of the IdentityStore struct used a simplistic algorithm to synchronize the MemDB records (entities, groups, local entity aliases) with those in the storage bucket. Whenever the storage bucket contained a large number of records, this algorithm produced a large number of MemDB operations within a single transaction. That many operations caused MemDB to use a much slower comparer function, so Invalidate took a long time to complete. A node could then fall so far behind in processing WALs sent by the primary cluster that its replication state would transition to merkle-sync.

The simplistic approach deleted everything from MemDB that was associated with the invalidated storage bucket and re-inserted those resources from the state contained in the storage bucket. Since an invalidation usually signals that a single resource has been changed, added, or deleted, a bucket that also holds a large number of unchanged resources caused a lot of unnecessary work (deleting and re-adding them).

These changes replace the simplistic approach for entities and local entity aliases, since those are the resources most likely to exist in large numbers where this problem occurs.

The new approach compares the contents of the invalidated storage bucket with the set of resources in MemDB associated with that storage bucket. Resources that match in both systems are left alone, and only the differences are rectified in MemDB.
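As a rough illustration of the compare-and-rectify idea described above, here is a minimal sketch. It is not the code from this PR: `Entity`, `Changes`, and `diffBucket` are hypothetical names, plain slices stand in for the storage bucket and the MemDB query result, and no MemDB transaction is involved. It only computes which records would need to be written or removed.

```go
package main

import (
	"fmt"
	"reflect"
)

// Entity is a hypothetical stand-in for an identity record.
type Entity struct {
	ID       string
	Name     string
	Policies []string
}

// Changes lists the operations needed to make MemDB match the bucket.
type Changes struct {
	Upsert []Entity // new or modified in the bucket
	Delete []string // IDs held in MemDB but absent from the bucket
}

// diffBucket compares the bucket contents with the records MemDB holds
// for that bucket and returns only the differences.
func diffBucket(bucket, memDB []Entity) Changes {
	// Work on a copy so the caller's slice is not modified.
	remaining := append([]Entity(nil), memDB...)

	var c Changes
	for _, b := range bucket {
		// Linear scan for the MemDB record with the same ID.
		found := -1
		for i, m := range remaining {
			if m.ID == b.ID {
				found = i
				break
			}
		}
		if found < 0 {
			// Only in the bucket: needs to be inserted.
			c.Upsert = append(c.Upsert, b)
			continue
		}
		if !reflect.DeepEqual(remaining[found], b) {
			// In both, but different: needs to be updated.
			c.Upsert = append(c.Upsert, b)
		}
		// Matched records are dropped from the remaining set.
		remaining = append(remaining[:found], remaining[found+1:]...)
	}
	// Whatever is left exists only in MemDB and must be deleted.
	for _, m := range remaining {
		c.Delete = append(c.Delete, m.ID)
	}
	return c
}

func main() {
	memDB := []Entity{{ID: "a", Name: "alice"}, {ID: "b", Name: "bob"}, {ID: "c", Name: "carol"}}
	bucket := []Entity{{ID: "a", Name: "alice"}, {ID: "b", Name: "robert"}, {ID: "d", Name: "dave"}}

	c := diffBucket(bucket, memDB)
	fmt.Printf("upsert=%d delete=%v\n", len(c.Upsert), c.Delete) // upsert=2 delete=[c]
}
```

In this example the unchanged record `a` produces no operation at all, which is the saving the description is after when a bucket holds many unchanged records.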

@marcboudreau marcboudreau added this to the 1.17.0-rc milestone May 22, 2024
@marcboudreau marcboudreau requested a review from mpalmi May 22, 2024 20:43
@github-actions github-actions bot added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label May 22, 2024

github-actions bot commented May 22, 2024

CI Results:
All Go tests succeeded! ✅

github-actions bot commented May 22, 2024

Build Results:
All builds succeeded! ✅

vault/identity_store.go
// bucketLocalAlias.
var memDBLocalAlias *identity.Alias
for i, localAlias := range memDBLocalAliases {
if localAlias.ID == bucketLocalAlias.ID {

@mpalmi mpalmi May 23, 2024

Discussed out of band about the possibility of creating a new MemDBMapIDToEntityByBucketKeyInTxn (yuck). It would be really nice if we could build up a map[ID]identity.Entity and map[ID]identity.Alias and use those to trim duplicates by direct lookup, thus replacing the inner loop.

I think this could potentially be a significant improvement in clarity and performance (though the current patch has already proven a significant improvement over the prior implementation).

Since this code has already been tested and provides the results we want, we can defer this change as follow-up work after the Code Freeze.
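For reference, the map-based lookup suggested above might look roughly like the following. This is only a sketch under assumptions: `Alias`, `indexByID`, and `staleIDs` are hypothetical names, not functions from Vault or from this PR, and the slices stand in for whatever a MemDB query would return.

```go
package main

import (
	"fmt"
	"sort"
)

// Alias is a hypothetical stand-in for a local entity alias.
type Alias struct {
	ID   string
	Name string
}

// indexByID builds a lookup table from alias ID to alias so that each
// match is a single map access instead of a scan over a slice.
func indexByID(aliases []Alias) map[string]Alias {
	idx := make(map[string]Alias, len(aliases))
	for _, a := range aliases {
		idx[a.ID] = a
	}
	return idx
}

// staleIDs returns, in sorted order, the IDs that MemDB holds but that
// no longer appear in the bucket.
func staleIDs(bucket, memDB []Alias) []string {
	idx := indexByID(memDB)
	for _, b := range bucket {
		// Direct lookup and removal replaces the inner loop.
		delete(idx, b.ID)
	}
	stale := make([]string, 0, len(idx))
	for id := range idx {
		stale = append(stale, id)
	}
	sort.Strings(stale) // map iteration order is random
	return stale
}

func main() {
	memDB := []Alias{{ID: "x", Name: "one"}, {ID: "y", Name: "two"}, {ID: "z", Name: "three"}}
	bucket := []Alias{{ID: "y", Name: "two"}}

	fmt.Println(staleIDs(bucket, memDB)) // [x z]
}
```

With an index like this, matching n bucket records against m MemDB records takes about n + m steps rather than n × m, at the cost of allocating the map.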


@mpalmi mpalmi left a comment

Looks good to me @marcboudreau! I know we've done significant testing and validation of the improvement, which helps us have confidence in the change.

The comment wording nit and the map optimization we discussed can be addressed as future work, so I'll go ahead and approve!

edit: It looks like we need a godoc to pass the linter check. Aside from that, CI appears to be happy.

Comment on lines +727 to +730
// We've considered the use of github.com/google/go-cmp here,
// but opted for sticking with reflect.DeepEqual because go-cmp
// is intended for testing and is able to panic in some
// situations.

Suggested change
// We've considered the use of github.com/google/go-cmp here,
// but opted for sticking with reflect.DeepEqual because go-cmp
// is intended for testing and is able to panic in some
// situations.
// Though DeepEqual relies on == equality for underlying comparison,
// this is perfectly safe for all compared fields. Timestamps are all in
// unix epoch time and embedded structs contain no `.Equals`.
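
To show why the timestamp representation matters for reflect.DeepEqual, here is a small standalone sketch. It is not taken from the PR: the `record` type and both helper functions are hypothetical. It relies only on standard library behavior: DeepEqual compares the internal fields of a `time.Time`, including its location, while an integer epoch value is compared with plain `==`.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// record is a hypothetical struct whose timestamp is stored as Unix
// epoch seconds rather than as a time.Time.
type record struct {
	ID          string
	LastUpdated int64
}

// sameInstantDifferentZone builds two time.Time values for the same
// instant in different locations and reports how Equal and DeepEqual
// judge them.
func sameInstantDifferentZone() (equal, deepEqual bool) {
	a := time.Unix(1716500000, 0).UTC()
	b := a.In(time.FixedZone("UTC+1", 3600))
	return a.Equal(b), reflect.DeepEqual(a, b)
}

// epochRecordsEqual compares two records that carry the same instant as
// an int64, where DeepEqual reduces to ordinary integer equality.
func epochRecordsEqual() bool {
	a := record{ID: "e1", LastUpdated: 1716500000}
	b := record{ID: "e1", LastUpdated: 1716500000}
	return reflect.DeepEqual(a, b)
}

func main() {
	equal, deep := sameInstantDifferentZone()
	fmt.Println(equal, deep)         // true false
	fmt.Println(epochRecordsEqual()) // true
}
```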

@marcboudreau marcboudreau merged commit d309176 into main May 24, 2024
83 checks passed
@marcboudreau marcboudreau deleted the marcboudreau/VAULT-27060/identity-store-invalidation branch May 24, 2024 17:48
marcboudreau pushed a commit that referenced this pull request May 24, 2024
* improve identitystore invalidate performance

* add changelog

* adding test to cover invalidation of entity bucket keys within IdentityStore

* minor clean ups

* adding tests

* add missing godoc for tests
marcboudreau pushed a commit that referenced this pull request May 24, 2024
…/1.16.x (#27230)

* Improve IdentityStore Invalidate performance (#27184)

* improve identitystore invalidate performance

* add changelog

* adding test to cover invalidation of entity bucket keys within IdentityStore

* minor clean ups

* adding tests

* add missing godoc for tests

* fix incorrect merge resolution

---------

Co-authored-by: Marc Boudreau <[email protected]>