Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage,server: disable separated intents #60557

Merged
merged 1 commit into from
Feb 13, 2021

Conversation

sumeerbhola
Copy link
Collaborator

They are disabled by default, and the setting is
not randomized on the testing path for testserver.go.

This was causing slowness in TestLogic/*/aggregate
resulting in TestLogic sometimes exceeding the
test timeout.

This is due to an unforseen slowness in intent
resolution caused by the effectively randomized ordering
of the latest and older intents.

Informs #41720

Release note: None

They are disabled by default, and the setting is
not randomized on the testing path for testserver.go.

This was causing slowness in TestLogic/*/aggregate
resulting in TestLogic sometimes exceeding the
test timeout.

This is due to an unforseen slowness in intent
resolution caused by the effectively randomized ordering
of the latest and older intents.

Informs cockroachdb#41720

Release note: None
@sumeerbhola sumeerbhola requested a review from a team as a code owner February 13, 2021 03:30
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@sumeerbhola sumeerbhola removed the request for review from a team February 13, 2021 03:30
@sumeerbhola
Copy link
Collaborator Author

TFTR!

@sumeerbhola
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 13, 2021

Build succeeded:

@craig craig bot merged commit 197d0cc into cockroachdb:master Feb 13, 2021
@nvanbenschoten
Copy link
Member

This is due to an unforseen slowness in intent resolution caused by the effectively randomized ordering of the latest and older intents.

Could you explain this further? Is this due to the key encoding of LockTableKey, which uses a /strength/txnID key suffix that does not allow us to efficiently find a given key's lock table key? If so, I'm surprised this would be so pronounced. We haven't yet implemented soft lock-table resolution (finalized but not removed), so we should only have a single lock table key version at a time. So are LSM tombstones the issue? If so, do we have the option to perform more targetted seeks when performing point (i.e. not ranged) intent resolution, where we know the txn whose intents we want to resolve?

Do you have thoughts on mitigating this? I imagine we don't want to rethink the lock table key encoding to order by timestamp.

@sumeerbhola
Copy link
Collaborator Author

Yes, it's the LSM tombstones. When the intent resolution path also needs to read the latest version, the interleaved intents case would be as bad (or worse) since the pebble.Iterator.Next call would have to iterate over all the deleted ones. But the common path for committed intent resolution only needs to read the intent.
And yes, using the txn ID is exactly what I was planning to do for the single intent resolution path, but it doesn't help with range intent resolution.
I've opened #60622

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants