Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update documents of lock view for sprint 4 #6179

Merged
merged 14 commits into from
Aug 25, 2021
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions functions-and-operators/tidb-functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -257,3 +257,73 @@ Check Table Before Drop: false
### MySQL compatibility

The `TIDB_VERSION` function is TiDB-specific and not compatible with MySQL. If MySQL compatibility is required, you can also use `VERSION` to get version information, but the result does not contain build details.

## TIDB_DECODE_SQL_DIGESTS

The `TIDB_DECODE_SQL_DIGESTS` function is used to query the normalized SQL statements (a form without formats and arguments) corresponding to the set of SQL digests in the cluster. This function accepts 1 or 2 arguments:

* `digests`: A string. This parameter is in the format of a JSON string array, and each string in the array is a SQL digest.
* `stmtTruncateLength`: An integer (optional). It is used to limit the length of each SQL statement in the returned result. If a SQL statement exceeds the specified length, the statement is truncated. `0` means that the length is unlimited.

This function returns a string, which is in the format of a JSON string array. The *i*-th item in the array is the normalized SQL statement corresponding to the *i*-th element in the `digests` parameter. If an element in the `digests` parameter is not a valid SQL digest or the system cannot find the corresponding SQL statement, the corresponding item in the returned result is `NULL`. If the truncation length is specified (`stmtTruncateLength > 0`), for each statement in the returned result that exceeds this length, the first `stmtTruncateLength` characters are retained and the suffix `"..."` is added at the end to indicate the truncation. If the `digests` parameter is `NULL`, the returned value of the function is `NULL`.
TomShawn marked this conversation as resolved.
Show resolved Hide resolved

> **Note:**
>
> * Only users with the [PROCESS](https://dev.mysql.com/doc/refman/8.0/en/privileges-provided.html#priv_process) privilege can use this function.
> * When `TIDB_DECODE_SQL_DIGESTS` is executed, TiDB queries the statement corresponding to each SQL digest from the statement summary tables, so there is no guarantee that the corresponding statement can always be found for any SQL digest. Only the statements that have been executed in the cluster can be found, and whether these SQL statements can be queried or not is also affected by the related configuration of the statement summary tables. For the detailed description of the statement summary table, see [Statement Summary Tables](/statement-summary-tables.md).
> * This function has a high overhead. In queries with a large number of rows (for example, querying the full table of `information_schema.cluster_tidb_trx` on a large and busy cluster), using this function might cause the queries to run for too long. Use it with caution.
> * This function has a high overhead because every time it is called, it internally queries the `STATEMENTS_SUMMARY`, `STATEMENTS_SUMMARY_HISTORY`, `CLUSTER_STATEMENTS_SUMMARY`, and `CLUSTER_STATEMENTS_SUMMARY_HISTORY` tables, and the query involves the `UNION` operation. This function currently does not support vectorization, that is, when calling this function for multiple rows of data, the above query is performed separately for each row.

### Synopsis
TomShawn marked this conversation as resolved.
Show resolved Hide resolved

```ebnf+diagram
DecodeSQLDigestsExpr ::=
"TIDB_DECODE_SQL_DIGESTS" "(" digests ( "," stmtTruncateLength )? ")"
```

### Example

{{< copyable "sql" >}}

```sql
set @digests = '["e6f07d43b5c21db0fbb9a31feac2dc599787763393dd5acbfad80e247eb02ad5","38b03afa5debbdf0326a014dbe5012a62c51957f1982b3093e748460f8b00821","e5796985ccafe2f71126ed6c0ac939ffa015a8c0744a24b7aee6d587103fd2f7"]';

select tidb_decode_sql_digests(@digests);
```

```sql
+------------------------------------+
| tidb_decode_sql_digests(@digests) |
+------------------------------------+
| ["begin",null,"select * from `t`"] |
+------------------------------------+
1 row in set (0.00 sec)
```

In the above example, the parameter is a JSON array containing 3 SQL digests, and the corresponding SQL statements are the three items in the query results. But the SQL statement corresponding to the second SQL digest cannot be found from the cluster, so the second item in the result is `null`.

{{< copyable "sql" >}}

```sql
select tidb_decode_sql_digests(@digests, 10);
```

```sql
+---------------------------------------+
| tidb_decode_sql_digests(@digests, 10) |
+---------------------------------------+
| ["begin",null,"select * f..."] |
+---------------------------------------+
1 row in set (0.01 sec)
```

The above call specifies the second parameter (that is, the truncation length) as 10, and the length of the third statement in the query result is greater than 10. Therefore, only the first 10 characters are retained, and `"..."` is added at the end, which indicates the truncation.

### MySQL compatibility

`TIDB_DECODE_SQL_DIGESTS` is a TiDB-specific function and not compatible with MySQL.

### See also

- [`Statement Summary Tables`](/statement-summary-tables.md)
- [`INFORMATION_SCHEMA.TIDB_TRX`](/information-schema/information-schema-tidb-trx.md)
82 changes: 43 additions & 39 deletions information-schema/information-schema-data-lock-waits.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,6 @@ summary: Learn the `DATA_LOCK_WAITS` information_schema table.

The `DATA_LOCK_WAITS` table shows the ongoing pessimistic locks waiting on all TiKV nodes in the cluster.

> **Warning:**
>
> Currently, this is an experimental feature. The definition and behavior of the table structure might have major changes in future releases.

{{< copyable "sql" >}}

```sql
Expand All @@ -22,24 +18,55 @@ DESC data_lock_waits;
+------------------------+---------------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+---------------------+------+------+---------+-------+
| KEY | varchar(64) | NO | | NULL | |
| KEY | text | NO | | NULL | |
| KEY_INFO | text | YES | | NULL | |
| TRX_ID | bigint(21) unsigned | NO | | NULL | |
| CURRENT_HOLDING_TRX_ID | bigint(21) unsigned | NO | | NULL | |
| SQL_DIGEST | varchar(64) | YES | | NULL | |
| SQL_DIGEST_TEXT | text | YES | | NULL | |
+------------------------+---------------------+------+------+---------+-------+
```

The meaning of each column field in the `DATA_LOCK_WAITS` table is as follows:

* `KEY`: The KEY that is waiting for the lock and displayed in the form of hexadecimal string.
* `KEY`: The key that is waiting for the lock and in the hexadecimal form.
* `KEY_INFO`: The detailed information of `KEY`. See the [KEY_INFO](#key_info) section.
* `TRX_ID`: The ID of the transaction that is waiting for the lock. This ID is also the `start_ts` of the transaction.
* `CURRENT_HOLDING_TRX_ID`: The ID of the transaction that currently holds the lock. This ID is also the `start_ts` of the transaction.
* `SQL_DIGEST`: The digest of the SQL statement that is currently blocked in the lock-waiting transaction.
* `SQL_DIGEST_TEXT`: The normalized SQL statement (the SQL statement without arguments and formats) that is currently blocked in the lock-waiting transaction. It corresponds to `SQL_DIGEST`.

> **Warning:**
>
> * The information in this table is obtained in real time from all TiKV nodes during the query. Currently, even if the `WHERE` condition is added, TiDB might still collect information from all TiKV nodes. If the cluster is large and the load is high, querying this table might cause a potential risk of performance jitter. Therefore, use this table according to your actual situation.
> * The information from different TiKV nodes is NOT guaranteed to be the snapshot at the same point in time.
> * Only the users with the [PROCESS](https://dev.mysql.com/doc/refman/8.0/en/privileges-provided.html#priv_process) privilege can query this table.
> * The information in the `DATA_LOCK_WAITS` table is obtained in real time from all TiKV nodes during the query. Currently, even if a query has the `WHERE` condition, the information collection is still performed on all TiKV nodes. If your cluster is large and the load is high, querying this table might cause potential risk of performance jitter. Therefore, use it according to your actual situation.
> * Information from different TiKV nodes is NOT guaranteed to be snapshots of the same time.
> * The information (SQL digest) in the `SQL_DIGEST` column is the hash value calculated from the normalized SQL statement. The information in the `SQL_DIGEST_TEXT` column is internally queried from statements summary tables, so it is possible that the corresponding statement cannot be found internally. For the detailed description of SQL digests and the statements summary tables, see [Statement Summary Tables](/statement-summary-tables.md).

## `KEY_INFO`

The `KEY_INFO` column shows the detailed information of the `KEY` column. The information is shown in the JSON format. The description of each field is as follows:

* `"db_id"`: The ID of the schema to which the key belongs.
* `"db_name"`: The name of the schema to which the key belongs.
* `"table_id"`: The ID of the table to which the key belongs.
* `"table_name"`: The name of the table to which the key belongs.
* `"partition_id"`: The ID of the partition where the key is located.
* `"partition_name"`: The name of the partition where the key is located.
* `"handle_type"`: The handle type of the row key (that is, the key that stores a row of data). The possible values ​​are as follows:
* `"int"`: The handle type is int, which means that the handle is the row ID.
* `"common"`: The handle type is not int64. This type is shown in the non-int primary key when clustered index is enabled.
* `"unknown"`: The handle type is currently not supported.
* `"handle_value"`: The handle value.
* `"index_id"`: The index ID to which the index key (the key that stores the index) belongs.
* `"index_name"`: The name of the index to which the index key belongs.
* `"index_values"`: The index value in the index key.

In the above fields, if the information of a field is not applicable or currently unavailable, the field is omitted in the query result. For example, the row key information does not contain `index_id`, `index_name`, and `index_values`; the index key does not contain `handle_type` and `handle_value`; non-partitioned tables do not display `partition_id` and `partition_name`; the key information in the deleted table cannot obtain schema information such as `table_name`, `db_id`, `db_name`, and `index_name`, and it is unable to distinguish whether the table is a partitioned table.

> **Note:**
>
> If a key comes from a table with partitioning enabled, and the information of the schema to which the key belongs cannot be queried due to some reasons (for example, the table to which the key belongs has been deleted) during the query, the ID of the partition to which the key belongs might be appear in the `table_id` field. This is because TiDB encodes the keys of different partitions in the same way as it encodes the keys of several independent tables. Therefore, when the schema information is missing, TiDB cannot confirm whether the key belongs to an unpartitioned table or to one partition of a table.

## Example

Expand All @@ -51,36 +78,13 @@ select * from information_schema.data_lock_waits\G

```sql
*************************** 1. row ***************************
KEY: 7480000000000000355f728000000000000002
TRX_ID: 425405024158875649
CURRENT_HOLDING_TRX_ID: 425405016242126849
SQL_DIGEST: f7530877a35ae65300c42250abd8bc731bbaf0a7cabc05dab843565230611bb22
2 rows in set (0.01 sec)
```

The above query result shows that the transaction of the ID `425405024158875649` was trying to obtain the pessimistic lock on the key `7480000000000000355f728000000000000002` when the statement with digest `"f7530877a35ae65300c42250abd8bc731bbaf0a7cabc05dab843565230611bb22"` was being executed, but the lock on this key was held by the transaction of the ID `425405016242126849`.

## SQL Digest

The `DATA_LOCK_WAITS` table records the SQL digest but not the original SQL statement.

SQL digest is the hash value of the normalized SQL statement. To find the original SQL statement corresponding to the SQL digest, perform one of the following operations:

- For the statements executed on the current TiDB node in the recent period of time, you can find the corresponding original SQL statement in the `STATEMENTS_SUMMARY` or `STATEMENTS_SUMMARY_HISTORY` table according to the SQL digest.
- For the statements executed on all TiDB nodes in the entire cluster in the recent period of time, you can find the corresponding SQL statement in the `CLUSTER_STATEMENTS_SUMMARY` or `CLUSTER_STATEMENTS_SUMMARY_HISTORY` table according to the SQL digest.

{{< copyable "sql" >}}

```sql
select digest, digest_text from information_schema.statements_summary where digest = "f7530877a35ae65300c42250abd8bc731bbaf0a7cabc05dab843565230611bb2";
```

```sql
+------------------------------------------------------------------+---------------------------------------+
| digest | digest_text |
+------------------------------------------------------------------+---------------------------------------+
| f7530877a35ae65300c42250abd8bc731bbaf0a7cabc05dab843565230611bb2 | update `t` set `v` = ? where `id` = ? |
+------------------------------------------------------------------+---------------------------------------+
KEY: 7480000000000000355F728000000000000001
KEY_INFO: {"db_id":1,"db_name":"test","table_id":53,"table_name":"t","handle_type":"int","handle_value":"1"}
TRX_ID: 426790594290122753
CURRENT_HOLDING_TRX_ID: 426790590082449409
SQL_DIGEST: 38b03afa5debbdf0326a014dbe5012a62c51957f1982b3093e748460f8b00821
SQL_DIGEST_TEXT: update `t` set `v` = `v` + ? where `id` = ?
1 row in set (0.01 sec)
```

For detailed description of SQL digest, `STATEMENTS_SUMMARY`, `STATEMENTS_SUMMARY_HISTORY`, `CLUSTER_STATEMENTS_SUMMARY`, and `CLUSTER_STATEMENTS_SUMMARY_HISTORY` tables, see [Statement Summary Tables](/statement-summary-tables.md).
The above query result shows that the transaction of the ID `425405024158875649` is trying to obtain the pessimistic lock on the key `"7480000000000000355f728000000000000002"` when executing a statement that has digest `"f7530877a35ae65300c42250abd8bc731bbaf0a7cabc05dab843565230611bb22"` and is in the form of ``update `t` set `v` = `v` + ? where `id` = ?``, but the lock on this key was held by the transaction of the ID `425405016242126849`.
TomShawn marked this conversation as resolved.
Show resolved Hide resolved
Loading