-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE #40205
Comments
40206: sql: add row lock modes; make FOR UPDATE a no-op r=jordanlewis a=jordanlewis This commit adds parsing for the row locking modes: FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR KEY UPDATE. All return unimplemented errors, except for FOR UPDATE, which behaves as a no-op. Closes #6583. Added #40205 to track the remaining kinds of row locking. Release note (sql change): support parsing the FOR UPDATE modifier on SELECT clauses, treating it as a no-op, since CockroachDB's transactions only operate in SERIALIZABLE mode. Co-authored-by: Jordan Lewis <[email protected]>
The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep up to date. If we write down blacklists in code, then we can use an approach like this to always have an up to date aggregation. So far it seems like there's just a lot of unknowns to categorize still. The output today: ``` === RUN TestBlacklists 648: unknown (unknown) 493: cockroachdb#5807 (sql: Add support for TEMP tables) 151: cockroachdb#17511 (sql: support stored procedures) 86: cockroachdb#26097 (sql: make TIMETZ more pg-compatible) 56: cockroachdb#10735 (sql: support SQL savepoints) 55: cockroachdb#32552 (multi-dim arrays) 55: cockroachdb#26508 (sql: restricted DDL / DML inside transactions) 52: cockroachdb#32565 (sql: support optional TIME precision) 39: cockroachdb#243 (roadmap: Blob storage) 33: cockroachdb#26725 (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea)) 31: cockroachdb#27793 (sql: support custom/user-defined base scalar (primitive) types) 24: cockroachdb#12123 (sql: Can't drop and replace a table within a transaction) 24: cockroachdb#26443 (sql: support user-defined schemas between database and table) 20: cockroachdb#21286 (sql: Add support for geometric types) 18: cockroachdb#6583 (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait})) 17: cockroachdb#22329 (Support XA distributed transactions in CockroachDB) 16: cockroachdb#24062 (sql: 32 bit SERIAL type) 16: cockroachdb#30352 (roadmap:when CockroachDB will support cursor?) 12: cockroachdb#27791 (sql: support RANGE types) 8: cockroachdb#40195 (pgwire: multiple active result sets (portals) not supported) 8: cockroachdb#6130 (sql: add support for key watches with notifications of changes) 5: Expected Failure (unknown) 5: cockroachdb#23468 (sql: support sql arrays of JSONB) 5: cockroachdb#40854 (sql: set application_name from connection string) 4: cockroachdb#35879 (sql: `default_transaction_read_only` should also accept 'on' and 'off') 4: cockroachdb#32610 (sql: can't insert self reference) 4: cockroachdb#40205 (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE) 4: cockroachdb#35897 (sql: unknown function: pg_terminate_backend()) 4: cockroachdb#4035 (sql/pgwire: missing support for row count limits in pgwire) 3: cockroachdb#27796 (sql: support user-defined DOMAIN types) 3: cockroachdb#3781 (sql: Add Data Type Formatting Functions) 3: cockroachdb#40476 (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`) 3: cockroachdb#35882 (sql: support other character sets) 2: cockroachdb#10028 (sql: Support view queries with star expansions) 2: cockroachdb#35807 (sql: INTERVAL output doesn't match PG) 2: cockroachdb#35902 (sql: large object support) 2: cockroachdb#40474 (sql: support `SELECT ... FOR UPDATE OF` syntax) 1: cockroachdb#18846 (sql: Support CIDR column type) 1: cockroachdb#9682 (sql: implement computed indexes) 1: cockroachdb#31632 (sql: FK options (deferrable, etc)) 1: cockroachdb#24897 (sql: CREATE OR REPLACE VIEW) 1: pass? (unknown) 1: cockroachdb#36215 (sql: enable setting standard_conforming_strings to off) 1: cockroachdb#32562 (sql: support SET LOCAL and txn-scoped session variable changes) 1: cockroachdb#36116 (sql: psychopg: investigate how `'infinity'::timestamp` is presented) 1: cockroachdb#26732 (sql: support the binary operator: <int> / <float>) 1: cockroachdb#23299 (sql: support coercing string literals to arrays) 1: cockroachdb#36115 (sql: psychopg: investigate if datetimetz is being returned instead of datetime) 1: cockroachdb#26925 (sql: make the CockroachDB integer types more compatible with postgres) 1: cockroachdb#21085 (sql: WITH RECURSIVE (recursive common table expressions)) 1: cockroachdb#36179 (sql: implicity convert date to timestamp) 1: cockroachdb#36118 (sql: Cannot parse '24:00' as type time) 1: cockroachdb#31708 (sql: support current_time) ``` Release justification: non-production change Release note: None
The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep up to date. If we write down blacklists in code, then we can use an approach like this to always have an up to date aggregation. So far it seems like there's just a lot of unknowns to categorize still. The output today: ``` === RUN TestBlacklists 648: unknown (unknown) 493: cockroachdb#5807 (sql: Add support for TEMP tables) 151: cockroachdb#17511 (sql: support stored procedures) 86: cockroachdb#26097 (sql: make TIMETZ more pg-compatible) 56: cockroachdb#10735 (sql: support SQL savepoints) 55: cockroachdb#32552 (multi-dim arrays) 55: cockroachdb#26508 (sql: restricted DDL / DML inside transactions) 52: cockroachdb#32565 (sql: support optional TIME precision) 39: cockroachdb#243 (roadmap: Blob storage) 33: cockroachdb#26725 (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea)) 31: cockroachdb#27793 (sql: support custom/user-defined base scalar (primitive) types) 24: cockroachdb#12123 (sql: Can't drop and replace a table within a transaction) 24: cockroachdb#26443 (sql: support user-defined schemas between database and table) 20: cockroachdb#21286 (sql: Add support for geometric types) 18: cockroachdb#6583 (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait})) 17: cockroachdb#22329 (Support XA distributed transactions in CockroachDB) 16: cockroachdb#24062 (sql: 32 bit SERIAL type) 16: cockroachdb#30352 (roadmap:when CockroachDB will support cursor?) 12: cockroachdb#27791 (sql: support RANGE types) 8: cockroachdb#40195 (pgwire: multiple active result sets (portals) not supported) 8: cockroachdb#6130 (sql: add support for key watches with notifications of changes) 5: Expected Failure (unknown) 5: cockroachdb#23468 (sql: support sql arrays of JSONB) 5: cockroachdb#40854 (sql: set application_name from connection string) 4: cockroachdb#35879 (sql: `default_transaction_read_only` should also accept 'on' and 'off') 4: cockroachdb#32610 (sql: can't insert self reference) 4: cockroachdb#40205 (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE) 4: cockroachdb#35897 (sql: unknown function: pg_terminate_backend()) 4: cockroachdb#4035 (sql/pgwire: missing support for row count limits in pgwire) 3: cockroachdb#27796 (sql: support user-defined DOMAIN types) 3: cockroachdb#3781 (sql: Add Data Type Formatting Functions) 3: cockroachdb#40476 (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`) 3: cockroachdb#35882 (sql: support other character sets) 2: cockroachdb#10028 (sql: Support view queries with star expansions) 2: cockroachdb#35807 (sql: INTERVAL output doesn't match PG) 2: cockroachdb#35902 (sql: large object support) 2: cockroachdb#40474 (sql: support `SELECT ... FOR UPDATE OF` syntax) 1: cockroachdb#18846 (sql: Support CIDR column type) 1: cockroachdb#9682 (sql: implement computed indexes) 1: cockroachdb#31632 (sql: FK options (deferrable, etc)) 1: cockroachdb#24897 (sql: CREATE OR REPLACE VIEW) 1: pass? (unknown) 1: cockroachdb#36215 (sql: enable setting standard_conforming_strings to off) 1: cockroachdb#32562 (sql: support SET LOCAL and txn-scoped session variable changes) 1: cockroachdb#36116 (sql: psychopg: investigate how `'infinity'::timestamp` is presented) 1: cockroachdb#26732 (sql: support the binary operator: <int> / <float>) 1: cockroachdb#23299 (sql: support coercing string literals to arrays) 1: cockroachdb#36115 (sql: psychopg: investigate if datetimetz is being returned instead of datetime) 1: cockroachdb#26925 (sql: make the CockroachDB integer types more compatible with postgres) 1: cockroachdb#21085 (sql: WITH RECURSIVE (recursive common table expressions)) 1: cockroachdb#36179 (sql: implicity convert date to timestamp) 1: cockroachdb#36118 (sql: Cannot parse '24:00' as type time) 1: cockroachdb#31708 (sql: support current_time) ``` Release justification: non-production change Release note: None
41252: roachtest: add test that aggregates orm blacklist failures r=jordanlewis a=jordanlewis The spreadsheet we discussed is unwieldy - hard to edit and impossible to keep up to date. If we write down blacklists in code, then we can use an approach like this to always have an up to date aggregation. So far it seems like there's just a lot of unknowns to categorize still. The output today: ``` === RUN TestBlacklists 648: unknown (unknown) 493: #5807 (sql: Add support for TEMP tables) 151: #17511 (sql: support stored procedures) 86: #26097 (sql: make TIMETZ more pg-compatible) 56: #10735 (sql: support SQL savepoints) 55: #32552 (multi-dim arrays) 55: #26508 (sql: restricted DDL / DML inside transactions) 52: #32565 (sql: support optional TIME precision) 39: #243 (roadmap: Blob storage) 33: #26725 (sql: support postgres' API to handle blob storage (incl lo_creat, lo_from_bytea)) 31: #27793 (sql: support custom/user-defined base scalar (primitive) types) 24: #12123 (sql: Can't drop and replace a table within a transaction) 24: #26443 (sql: support user-defined schemas between database and table) 20: #21286 (sql: Add support for geometric types) 18: #6583 (sql: explicit lock syntax (SELECT FOR {SHARE,UPDATE} {skip locked,nowait})) 17: #22329 (Support XA distributed transactions in CockroachDB) 16: #24062 (sql: 32 bit SERIAL type) 16: #30352 (roadmap:when CockroachDB will support cursor?) 12: #27791 (sql: support RANGE types) 8: #40195 (pgwire: multiple active result sets (portals) not supported) 8: #6130 (sql: add support for key watches with notifications of changes) 5: Expected Failure (unknown) 5: #23468 (sql: support sql arrays of JSONB) 5: #40854 (sql: set application_name from connection string) 4: #35879 (sql: `default_transaction_read_only` should also accept 'on' and 'off') 4: #32610 (sql: can't insert self reference) 4: #40205 (sql: add non-trivial implementations of FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR NO KEY SHARE) 4: #35897 (sql: unknown function: pg_terminate_backend()) 4: #4035 (sql/pgwire: missing support for row count limits in pgwire) 3: #27796 (sql: support user-defined DOMAIN types) 3: #3781 (sql: Add Data Type Formatting Functions) 3: #40476 (sql: support `FOR {UPDATE,SHARE} {SKIP LOCKED,NOWAIT}`) 3: #35882 (sql: support other character sets) 2: #10028 (sql: Support view queries with star expansions) 2: #35807 (sql: INTERVAL output doesn't match PG) 2: #35902 (sql: large object support) 2: #40474 (sql: support `SELECT ... FOR UPDATE OF` syntax) 1: #18846 (sql: Support CIDR column type) 1: #9682 (sql: implement computed indexes) 1: #31632 (sql: FK options (deferrable, etc)) 1: #24897 (sql: CREATE OR REPLACE VIEW) 1: pass? (unknown) 1: #36215 (sql: enable setting standard_conforming_strings to off) 1: #32562 (sql: support SET LOCAL and txn-scoped session variable changes) 1: #36116 (sql: psychopg: investigate how `'infinity'::timestamp` is presented) 1: #26732 (sql: support the binary operator: <int> / <float>) 1: #23299 (sql: support coercing string literals to arrays) 1: #36115 (sql: psychopg: investigate if datetimetz is being returned instead of datetime) 1: #26925 (sql: make the CockroachDB integer types more compatible with postgres) 1: #21085 (sql: WITH RECURSIVE (recursive common table expressions)) 1: #36179 (sql: implicity convert date to timestamp) 1: #36118 (sql: Cannot parse '24:00' as type time) 1: #31708 (sql: support current_time) ``` Release justification: non-production change Release note: None Co-authored-by: Jordan Lewis <[email protected]>
The issue lists four lock types, which are documented in PG's docs here. Here are a few takeways from thinking about how these will play into CRDB in the future:
In addition to supporting these locking modes, we'll also want to use the same approach that Postgres uses for using these locking modes in UPDATE and DELETE statements. From Postgres' docs:
CockroachDB implements serializability, so this is not needed for correctness, but acquiring these locks implicitly in UPDATE and DELETE statements has the potential to:
|
Relates to cockroachdb#40205. This change updates optbuilder to propagate row-level locking modes through the *scope object to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the ScanExpr level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in cockroachdb#43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Release note (sql change): More invalid usages of FOR UPDATE locking clauses are now rejected by the SQL optimizer.
Relates to cockroachdb#40205. This change updates optbuilder to propagate row-level locking modes through the *scope object to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the ScanExpr level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in cockroachdb#43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Release note (sql change): More invalid usages of FOR UPDATE locking clauses are now rejected by the SQL optimizer.
nit: I assume you meant
I'm not clear about how not "worth worrying about" is not contradicted by bullet 3 -- if upgrading |
Yes, updated.
I had the same thought when writing this. There is a bit of a contradiction in this reasoning, as you pointed out. We don't want to weaken these locking modes any further because then they become no-ops and not useful to expose. However, we don't want to strengthen them because then that presents the possibility of creating unexpected deadlocks for unsuspecting users. We also likely don't want to actually support them properly as that would be a large undertaking. The way I've been reconciling this is that in my mind there are two primary row-level locking classes ( So I think we have two choices here:
I think a strong argument could be made for both approached. @rafiss I'm curious if you have an opinion here. Also, to add a little more context here, |
I did a very unscientific Github search to see how often the various syntaxes appear in public codebases:
That is useful data that suggests that many people probably would not be affected by either choice. But apart from that, another consideration is to figure out if any ORMs generate queries with the KEY/NO KEY specifiers. I would guess none do it by default, and those that do support it probably see similar proportions of usage as above. For what it's worth, looking at some of the ORMs we are focusing on right now: Hibernate does not support it but has an open issue for it, and Django also does not support it but has an open issue/PR. For both of these we are creating a CockroachDB dialect/adapter, so if these issues ever do get implemented we should still be able to avoid using the KEY/NO KEY specifiers with CockroachDB, if necessary. So I think my vote is for option 1 -- rejecting the KEY/NO KEY specifiers, and also adding telemetry on how often the unimplemented syntax is used. I worry that ignoring them (even with documentation elsewhere) will not be obvious to users, and I would claim that deadlocks happening randomly at runtime are worse than syntax errors happening deterministically at runtime. Though I'm definitely open to being convinced otherwise, and my position isn't held very strongly. I do want other stakeholders to weigh in. My understanding is that we usually only go with the "accept the syntax but ignore it" approach if the syntax in question has no meaning in CockroachDB and causes no behavior change... which arguably does apply in this case. |
Thanks for gathering those numbers and finding references to this in Hibernate and Django Rafi! I also have conflicted opinions here, but I'm actually leaning towards option 2 for the following reasons:
Also, keep in mind that for the next release we won't even perform any FOR SHARE locking at all, so we don't need to resolve this immediately. |
Relates to cockroachdb#40205. This change updates optbuilder to propagate row-level locking modes on the stack to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the ScanExpr level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in cockroachdb#43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Release note (sql change): More invalid usages of FOR UPDATE locking clauses are now rejected by the SQL optimizer.
Relates to cockroachdb#40205. This change updates optbuilder to propagate row-level locking modes on the stack to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the ScanExpr level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in cockroachdb#43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Release note (sql change): More invalid usages of FOR UPDATE locking clauses are now rejected by the SQL optimizer.
Relates to cockroachdb#40205. This change updates optbuilder to propagate row-level locking modes on the stack to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the ScanExpr level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in cockroachdb#43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Release note (sql change): More invalid usages of FOR UPDATE locking clauses are now rejected by the SQL optimizer.
44015: sql/opt/optbuilder: propagate FOR UPDATE locking in scope to ScanExprs r=nvanbenschoten a=nvanbenschoten Relates to #40205. This change updates optbuilder to propagate row-level locking modes through the `*scope` object to achieve the same locking clause scoping that Postgres contains. The change propagates the locking mode all the way down to the `ScanExpr` level, where a single locking mode becomes an optional property of a Scan. With proper scoping rules, the change also improves upon the locking validation introduced in #43887. This completes the SELECT FOR [KEY] SHARE/UPDATE optbuilder interactions. The next step will be to hook the ScanExpr locking property into the execbuilder. Co-authored-by: Nathan VanBenschoten <[email protected]>
Relates to cockroachdb#40205. This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions` that causes reads to throw an error if they observe an MVCC version at a timestamp above their read timestamp. Specifically, when the option is enabled, a `WriteTooOldError` will be returned if a scan observes an MVCC version with a timestamp above the read. Similarly, a `WriteIntentError` will be returned if a scan observes another transaction's intent, even if that intent has a timestamp above the scan's read timestamp. This option will be used in the future by `ScanRequests` and `ReverseScanRequests` that intend to acquire unreplicated key-level locks, which will power `SELECT FOR UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps like traditional scans do. Instead, they want to throw errors and force their transaction to increase its timestamp through either a refresh or a retry. This was previously prototyped in 4a8e8dc. Interestingly, this is not new logic to the MVCC layer. This behavior is exactly the same as that of the initial key-value lookup performed during MVCC writes (see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE` would mirror that already exhibited by the read portion of read-write operations. This also hints at an opportunity to potentially use this option to merge the two implementations and get rid of custom logic like `mvccGetInternal` that only exists on the write path. We'd need to be careful about doing so though, as this code is heavily tuned. Release note: None
Relates to cockroachdb#40205. This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions` that causes reads to throw an error if they observe an MVCC version at a timestamp above their read timestamp. Specifically, when the option is enabled, a `WriteTooOldError` will be returned if a scan observes an MVCC version with a timestamp above the read. Similarly, a `WriteIntentError` will be returned if a scan observes another transaction's intent, even if that intent has a timestamp above the scan's read timestamp. This option will be used in the future by `ScanRequests` and `ReverseScanRequests` that intend to acquire unreplicated key-level locks, which will power `SELECT FOR UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps like traditional scans do. Instead, they want to throw errors and force their transaction to increase its timestamp through either a refresh or a retry. This was previously prototyped in 4a8e8dc. Interestingly, this is not new logic to the MVCC layer. This behavior is exactly the same as that of the initial key-value lookup performed during MVCC writes (see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE` would mirror that already exhibited by the read portion of read-write operations. This also hints at an opportunity to potentially use this option to merge the two implementations and get rid of custom logic like `mvccGetInternal` that only exists on the write path. We'd need to be careful about doing so though, as this code is heavily tuned. Release note: None
44473: storage/engine: expose FailOnMoreRecent MVCC{Get/Scan}Option r=nvanbenschoten a=nvanbenschoten Relates to #40205. This change introduces a new option to `MVCCGetOptions` and `MVCCScanOptions` that causes reads to throw an error if they observe an MVCC version at a timestamp above their read timestamp. Specifically, when the option is enabled, a `WriteTooOldError` will be returned if a scan observes an MVCC version with a timestamp above the read. Similarly, a `WriteIntentError` will be returned if a scan observes another transaction's intent, even if that intent has a timestamp above the scan's read timestamp. This option will be used in the future by `ScanRequests` and `ReverseScanRequests` that intend to acquire unreplicated key-level locks, which will power `SELECT FOR UPDATE`. These scans do not want to ignore existing MVCC versions at higher timestamps like traditional scans do. Instead, they want to throw errors and force their transaction to increase its timestamp through either a refresh or a retry. This was previously prototyped in 4a8e8dc. Interestingly, this is not new logic to the MVCC layer. This behavior is exactly the same as that of the initial key-value lookup performed during MVCC writes (see `mvccPutInternal`). It's fitting that behavior needed for `SELECT FOR UPDATE` would mirror that already exhibited by the read portion of read-write operations. This also hints at an opportunity to potentially use this option to merge the two implementations and get rid of custom logic like `mvccGetInternal` that only exists on the write path. We'd need to be careful about doing so though, as this code is heavily tuned. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]>
Relates to cockroachdb#40205. This change adds FOR UPDATE locking mode information to the EXPLAIN output of scanNodes. This is useful for testing that locking modes are being correctly propagated when a FOR [KEY] UPDATE/SHARE locking clause is specified. It will be even more useful for testing that locking clauses are being implicitly applied to UPDATE, UPSERT, and DELETE statements when we start performing that implicit transformation to their underlying scans.
44790: sql: add FOR UPDATE locking modes to EXPLAIN output r=nvanbenschoten a=nvanbenschoten Relates to #40205. This change adds FOR UPDATE locking mode information to the EXPLAIN output of scanNodes. This is useful for testing that locking modes are being correctly propagated when a FOR [KEY] UPDATE/SHARE locking clause is specified. It will be even more useful for testing that locking clauses are being implicitly applied to UPDATE, UPSERT, and DELETE statements when we start performing that implicit transformation to their underlying scans. We currently reject both non-standard locking wait-policies, so we can't easily test that code. I'm open to suggestions to address that, but I also don't think it's a very big issue at the moment and I don't think it makes sense to hold off on making the change while we're already here. If/when we let non-standard locking wait policies through the optimizer, we'll be able to add similar tests for them. Co-authored-by: Nathan VanBenschoten <[email protected]>
Related to cockroachdb#41720. Related to cockroachdb#44976. This commit integrates the new concurrency package into the storage package. Each Replica is given a concurrency manager, which replaces its existing latch manager and txn wait queue. The change also uses the concurrency manager to simplify the role of the intent resolver. The intent resolver no longer directly handles WriteIntentErrors. As a result, we are able to delete the contention queue entirely. With this change, all requests are now sequenced through the concurrency manager. When sequencing, latches are acquired and conflicting locks are detected. If any locks are found, the requests wait in lock wait-queues for the locks to be resolved. This is a major deviation from how things currently work because today, even with the contention queue, requests end up waiting for conflicting transactions to commit/abort in the txnWaitQueue after at least one RPC. Now, requests wait directly next to the intents/locks that they are waiting on and react directly to the resolution of these intents/locks. Once requests are sequenced by the concurrency manager, they are theoretically fully isolated from all conflicting requests. However, this is not strictly true today because we have not yet pulled all replicated locks into the concurrency manager's lock table. We will do so in a future change. Until then, the concurrency manager maintains a notion of "intent discovery", which is integrated into the Replica-level concurrency retry loop. Performance numbers will be published shortly. This will be followed by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205) improvements that this change enables.
Related to cockroachdb#41720. Related to cockroachdb#44976. This commit integrates the new concurrency package into the storage package. Each Replica is given a concurrency manager, which replaces its existing latch manager and txn wait queue. The change also uses the concurrency manager to simplify the role of the intent resolver. The intent resolver no longer directly handles WriteIntentErrors. As a result, we are able to delete the contention queue entirely. With this change, all requests are now sequenced through the concurrency manager. When sequencing, latches are acquired and conflicting locks are detected. If any locks are found, the requests wait in lock wait-queues for the locks to be resolved. This is a major deviation from how things currently work because today, even with the contention queue, requests end up waiting for conflicting transactions to commit/abort in the txnWaitQueue after at least one RPC. Now, requests wait directly next to the intents/locks that they are waiting on and react directly to the resolution of these intents/locks. Once requests are sequenced by the concurrency manager, they are theoretically fully isolated from all conflicting requests. However, this is not strictly true today because we have not yet pulled all replicated locks into the concurrency manager's lock table. We will do so in a future change. Until then, the concurrency manager maintains a notion of "intent discovery", which is integrated into the Replica-level concurrency retry loop. Performance numbers will be published shortly. This will be followed by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205) improvements that this change enables.
Related to cockroachdb#41720. Related to cockroachdb#44976. This commit integrates the new concurrency package into the storage package. Each Replica is given a concurrency manager, which replaces its existing latch manager and txn wait queue. The change also uses the concurrency manager to simplify the role of the intent resolver. The intent resolver no longer directly handles WriteIntentErrors. As a result, we are able to delete the contention queue entirely. With this change, all requests are now sequenced through the concurrency manager. When sequencing, latches are acquired and conflicting locks are detected. If any locks are found, the requests wait in lock wait-queues for the locks to be resolved. This is a major deviation from how things currently work because today, even with the contention queue, requests end up waiting for conflicting transactions to commit/abort in the txnWaitQueue after at least one RPC. Now, requests wait directly next to the intents/locks that they are waiting on and react directly to the resolution of these intents/locks. Once requests are sequenced by the concurrency manager, they are theoretically fully isolated from all conflicting requests. However, this is not strictly true today because we have not yet pulled all replicated locks into the concurrency manager's lock table. We will do so in a future change. Until then, the concurrency manager maintains a notion of "intent discovery", which is integrated into the Replica-level concurrency retry loop. Performance numbers will be published shortly. This will be followed by performance numbers using the SELECT FOR UPDATE locking (cockroachdb#40205) improvements that this change enables.
45269: importccl: Parallelize avro import r=miretskiy a=miretskiy Parallelize avro importer to improve its throughput (2.8x improvement). Touches #40374. Fixes #45097. Release notes (performance): Faster avro import 45482: storage: integrate Concurrency Manager into Replica request path r=nvanbenschoten a=nvanbenschoten Related to #41720. Related to #44976. This commit integrates the new concurrency package into the storage package. Each Replica is given a concurrency manager, which replaces its existing latch manager and txn wait queue. The change also uses the concurrency manager to simplify the role of the intent resolver. The intent resolver no longer directly handles WriteIntentErrors. As a result, we are able to delete the contention queue entirely. With this change, all requests are now sequenced through the concurrency manager. When sequencing, latches are acquired and conflicting locks are detected. If any locks are found, the requests wait in lock wait-queues for the locks to be resolved. This is a major deviation from how things currently work because today, even with the contention queue, requests end up waiting for conflicting transactions to commit/abort in the txnWaitQueue after at least one RPC. Now, requests wait directly next to the intents/locks that they are waiting on and react directly to the resolution of these intents/locks. Once requests are sequenced by the concurrency manager, they are theoretically fully isolated from all conflicting requests. However, this is not strictly true today because we have not yet pulled all replicated locks into the concurrency manager's lock table. We will do so in a future change. Until then, the concurrency manager maintains a notion of "intent discovery", which is integrated into the Replica-level concurrency retry loop. Performance numbers will be published shortly. This will be followed by performance numbers using the SELECT FOR UPDATE locking (#40205) improvements that this change enables. 45484: sql: simplify connection state machine - stop tracking retry intent r=andreimatei a=andreimatei Before this patch, the SQL connection state machine had an optimization: if a transaction that hadn't used "SAVEPOINT cockroach_restart" encountered a retriable error that we can't auto-retry, then we'd release the txn's locks eagerly and enter the Aborted state. As opposed to transactions that had used the "SAVEPOINT cockroach_restart", which go to RestartWait. This optimization is a significant complication for the state machine, so this patch is removing it. All transactions now go to RestartWait, and wait for a ROLLBACK to release the locks. On the flip side, doing "RELEASE SAVEPOINT cockroach_restart" and "ROLLBACK SAVEPOINT cockroach_restart" now works even for transactions that haven't explicitly declared that savepoint, which is nice. Although I don't promise I'll keep it working. Release note: None Co-authored-by: Yevgeniy Miretskiy <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Andrei Matei <[email protected]>
Closes cockroachdb#40205. Informs cockroachdb#41720. This change teaches the KV client and the KV API about unreplicated locks. It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which allows their users to select the locking strength that they would like the scan to use. This locking strength defaults to None, which corresponds to the current behavior. However, some users will want to acquire locks on each row scanned, which is now possible by setting the locking strength to a stronger level. For now, only the Exclusive strength is supported. The change then revisits SQL's row-level locking support, which is supported all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit (e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new key-locking functionality in the KV API to hook up row-level locking, completing the integration of SELECT FOR UPDATE with the KV layer and, in particular, the new lock-table structure. cockroachdb#43775 described the three main benefits of this change: - higher throughput under contention - lower latency and improved fairness under contention - a reduction in transaction retries under contention I've revisited those results a few times in the last two months and seen that the results continue to hold, and in some cases they have improved. I intend to update this PR with a more complete analysis of its impact on those three areas. Release note (sql change): SELECT FOR UPDATE now hooks into a new leaseholder-only locking mechanism. This allows the feature to be used to improve performance of transactional that read, modify, and write contended to rows. Similarly, UPDATE statements now use this new mechanism by default, meaning that their performance under contention is improved.
Closes cockroachdb#40205. Informs cockroachdb#41720. This change teaches the KV client and the KV API about unreplicated locks. It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which allows their users to select the locking strength that they would like the scan to use. This locking strength defaults to None, which corresponds to the current behavior. However, some users will want to acquire locks on each row scanned, which is now possible by setting the locking strength to a stronger level. For now, only the Exclusive strength is supported. The change then revisits SQL's row-level locking support, which is supported all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit (e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new key-locking functionality in the KV API to hook up row-level locking, completing the integration of SELECT FOR UPDATE with the KV layer and, in particular, the new lock-table structure. cockroachdb#43775 described the three main benefits of this change: - higher throughput under contention - lower latency and improved fairness under contention - a reduction in transaction retries under contention I've revisited those results a few times in the last two months and seen that the results continue to hold, and in some cases they have improved. I intend to update this PR with a more complete analysis of its impact on those three areas. Release note (sql change): SELECT FOR UPDATE now hooks into a new leaseholder-only locking mechanism. This allows the feature to be used to improve performance of transactional that read, modify, and write contended to rows. Similarly, UPDATE statements now use this new mechanism by default, meaning that their performance under contention is improved.
Closes cockroachdb#40205. Informs cockroachdb#41720. This change teaches the KV client and the KV API about unreplicated locks. It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which allows their users to select the locking strength that they would like the scan to use. This locking strength defaults to None, which corresponds to the current behavior. However, some users will want to acquire locks on each row scanned, which is now possible by setting the locking strength to a stronger level. For now, only the Exclusive strength is supported. The change then revisits SQL's row-level locking support, which is supported all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit (e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new key-locking functionality in the KV API to hook up row-level locking, completing the integration of SELECT FOR UPDATE with the KV layer and, in particular, the new lock-table structure. cockroachdb#43775 described the three main benefits of this change: - higher throughput under contention - lower latency and improved fairness under contention - a reduction in transaction retries under contention I've revisited those results a few times in the last two months and seen that the results continue to hold, and in some cases they have improved. I intend to update this PR with a more complete analysis of its impact on those three areas. Release note (sql change): SELECT FOR UPDATE now hooks into a new leaseholder-only locking mechanism. This allows the feature to be used to improve performance of transactional that read, modify, and write contended to rows. Similarly, UPDATE statements now use this new mechanism by default, meaning that their performance under contention is improved. This is only enabled for UPDATE statements that can push their filter all the way into their key-value scan. To determine whether an UPDATE statement is implicitly using SELECT FOR UPDATE locking, look for a "locking strength" field in the EXPLAIN output for the statement.
Closes cockroachdb#40205. Informs cockroachdb#41720. This change teaches the KV client and the KV API about unreplicated locks. It then adds a KeyLocking mode to ScanRequest and ReverseScanRequest, which allows their users to select the locking strength that they would like the scan to use. This locking strength defaults to None, which corresponds to the current behavior. However, some users will want to acquire locks on each row scanned, which is now possible by setting the locking strength to a stronger level. For now, only the Exclusive strength is supported. The change then revisits SQL's row-level locking support, which is supported all the way down to the row fetcher for implicit (e.g. UPDATE) and explicit (e.g. SELECT ... FOR UPDATE) upgrade locking. The change uses the new key-locking functionality in the KV API to hook up row-level locking, completing the integration of SELECT FOR UPDATE with the KV layer and, in particular, the new lock-table structure. cockroachdb#43775 described the three main benefits of this change: - higher throughput under contention - lower latency and improved fairness under contention - a reduction in transaction retries under contention I've revisited those results a few times in the last two months and seen that the results continue to hold, and in some cases they have improved. I intend to update this PR with a more complete analysis of its impact on those three areas. Release note (sql change): SELECT FOR UPDATE now hooks into a new leaseholder-only locking mechanism. This allows the feature to be used to improve performance of transactions that read, modify, and write contended to rows. Similarly, UPDATE statements now use this new mechanism by default, meaning that their performance under contention is improved. This is only enabled for UPDATE statements that can push their filter all the way into their key-value scan. To determine whether an UPDATE statement is implicitly using SELECT FOR UPDATE locking, look for a "locking strength" field in the EXPLAIN output for the statement.
#40206 added parsing for the various row-level lock modes, and treats them like no-ops. This is acceptable for CockroachDB, which only supports
SERIALIZABLE
transactions, because from a client's perspective there's no way to observe whether any particular transaction needed to be restarted because of a conflict on the rows that were declared asFOR UPDATE
, or because of some other conflict - the transaction will have to be restarted in either case.This issue tracks improving this behavior for performance reasons. There's ongoing work in core to support at least some of these semantics.
The text was updated successfully, but these errors were encountered: