Skip to content

Commit

Permalink
Update execution summary related changes
Browse files Browse the repository at this point in the history
Signed-off-by: yibin <[email protected]>
  • Loading branch information
yibin87 committed Dec 11, 2024
1 parent 5e9142e commit 228fa32
Show file tree
Hide file tree
Showing 3 changed files with 25 additions and 9 deletions.
10 changes: 7 additions & 3 deletions explain-overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@ An operator is a particular step that is executed as part of returning query res
TiDB aggregates the data or calculation results scanned from TiKV/TiFlash. The data aggregation operators can be divided into the following categories:
- **TableReader**: Aggregates the data obtained by the underlying operators like `TableFullScan` or `TableRangeScan` in TiKV.
- **TableReader**: Aggregates the data obtained by the underlying operators in TiKV or TiFlash.
- **IndexReader**: Aggregates the data obtained by the underlying operators like `IndexFullScan` or `IndexRangeScan` in TiKV.
- **IndexLookUp**: First aggregates the RowID (in TiKV) scanned by the `Build` side. Then at the `Probe` side, accurately reads the data from TiKV based on these RowIDs. At the `Build` side, there are operators like `IndexFullScan` or `IndexRangeScan`; at the `Probe` side, there is the `TableRowIDScan` operator.
- **IndexMerge**: Similar to `IndexLookUp`. `IndexMerge` can be seen as an extension of `IndexLookupReader`. `IndexMerge` supports reading multiple indexes at the same time. There are many `Build`s and one `Probe`. The execution process of `IndexMerge` the same as that of `IndexLookUp`.
Expand All @@ -166,9 +166,13 @@ In the `WHERE`/`HAVING`/`ON` conditions, the TiDB optimizer analyzes the result
### Task overview
Currently, calculation tasks of TiDB can be divided into two categories: cop tasks and root tasks. A `cop[tikv]` task indicates that the operator is performed inside the TiKV coprocessor. A `root` task indicates that it will be completed inside of TiDB.
Currently, calculation tasks of TiDB can be divided into four categories: root task, cop task, batchCop task and mpp task:
- root task indicates that it will be completed inside of TiDB.
- cop task indicates that it will be executed using Coprocessor in TiKV or TiFlash.
- batchCop task is an optimization of TiFlash cop tasks, which can execute queries on multiple regions in one task.
- mpp task indicates that it will be executed using TiFlash's [MPP mode](/explain-mpp.md).
One of the goals of SQL optimization is to push the calculation down to TiKV as much as possible. The Coprocessor in TiKV supports most of the built-in SQL functions (including the aggregate functions and the scalar functions), SQL `LIMIT` operations, index scans, and table scans.
One of the goals of SQL optimization is to push the calculation down to TiKV or TiFlash as much as possible. The Coprocessor in TiKV supports most of the built-in SQL functions (including the aggregation functions and the scalar functions), SQL 'LIMIT' operations, index scans, and table scans. The Coprocessor in TiFlash is similar to TiKV in functionality, but does not support index scans.

Check warning on line 175 in explain-overview.md

View workflow job for this annotation

GitHub Actions / vale

[vale] reported by reviewdog 🐶 [PingCAP.Ambiguous] Consider using a clearer word than 'much' because it may cause confusion. Raw Output: {"message": "[PingCAP.Ambiguous] Consider using a clearer word than 'much' because it may cause confusion.", "location": {"path": "explain-overview.md", "range": {"start": {"line": 175, "column": 92}}}, "severity": "INFO"}
### Operator info overview
Expand Down
22 changes: 17 additions & 5 deletions sql-statements/sql-statement-explain-analyze.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,10 +38,10 @@ Different from `EXPLAIN`, `EXPLAIN ANALYZE` executes the corresponding SQL state

| attribute name | description |
|:----------------|:---------------------------------|
| actRows | Number of rows output by the operator. |
| execution info | Execution information of the operator. `time` represents the total `wall time` from entering the operator to leaving the operator, including the total execution time of all sub-operators. If the operator is called many times by the parent operator (in loops), then the time refers to the accumulated time. `loops` is the number of times the current operator is called by the parent operator. |
| memory | Memory space occupied by the operator. |
| disk | Disk space occupied by the operator. |
| `actRows` | Number of rows output by the operator. |
| `execution info` | Execution information of the operator. `time` represents the total `wall time` from entering the operator to leaving the operator, including the total execution time of all sub-operators. If the operator is called many times by the parent operator (in loops), then the time refers to the accumulated time. `loops` is the number of times the current operator is called by the parent operator. 'open' represents the time required for operator initialization. 'close' represents the time from the operator processing all data to the operator ending. The time counted by 'time' includes the time of 'open' and 'close'. When the operator is executed concurrently, the execution information will show the sum of all used 'wall time'. At this time, 'time', 'open' and 'close' will be replaced with 'total_time', 'total_open' and 'total_close'. |

Check warning on line 42 in sql-statements/sql-statement-explain-analyze.md

View workflow job for this annotation

GitHub Actions / vale

[vale] reported by reviewdog 🐶 [PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion. Raw Output: {"message": "[PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion.", "location": {"path": "sql-statements/sql-statement-explain-analyze.md", "range": {"start": {"line": 42, "column": 237}}}, "severity": "INFO"}
| `memory` | Max memory space occupied by the operator. |
| `disk` | Max disk space occupied by the operator. |

## Examples

Expand Down Expand Up @@ -99,7 +99,7 @@ EXPLAIN ANALYZE SELECT * FROM t1;

## Execution information of operators

In addition to the basic `time` and `loop` execution information, `execution info` also contains operator-specific execution information, which mainly includes the time consumed for the operator to send RPC requests and the duration of other steps.
In addition to the basic `time`, `open`, `close` and `loop` execution information, `execution info` also contains operator-specific execution information, which mainly includes the time consumed for the operator to send RPC requests and the duration of other steps.

### Point_Get

Expand Down Expand Up @@ -339,6 +339,18 @@ after key/value request is processed:

For writes and batch gets, the calculation is similar with different base costs.

### tiflash_wait information

When a query involves MPP tasks, the execution time is also affected by various tiflash_wait times, for example:

```
tiflash_wait: {minTSO_wait: 425ms, pipeline_breaker_wait: 133ms, pipeline_queue_wait: 512ms}
```

- `minTSO_wait`: Records the time spent waiting for an MPP Task to be scheduled by the [TiFlash MinTSO Scheduler](/tiflash/tiflash-mintso-scheduler.md).
- 'pipeline_breaker_wait': When TiFlash uses the [Pipeline Execution Model](/tiflash/tiflash-pipeline-model.md), it records the time it takes for the pipeline containing the pipeline breaker operator to wait for all data in the upstream pipeline. Currently only used to display the time it takes for the pipeline containing the 'Join' operator to wait for all hash table builds to complete.
- `pipeline_queue_wait`: When TiFlash uses the Pipeline Execution Model(/tiflash/tiflash-pipeline-model.md), it records the waiting time in the CPU Task Thread Pool and IO Task Thread Pool during the execution of the pipeline.

### Other common execution information

The Coprocessor operators usually contain two parts of execution time information: `cop_task` and `tikv_task`. `cop_task` is the time recorded by TiDB, and it is from the moment that the request is sent to the server to the moment that the response is received. `tikv_task` is the time recorded by TiKV Coprocessor itself. If there is much difference between the two, it might indicate that the time spent waiting for the response is too long, or the time spent on gRPC or network is too long.
Expand Down
2 changes: 1 addition & 1 deletion sql-statements/sql-statement-explain.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ Currently, `EXPLAIN` in TiDB outputs 5 columns: `id`, `estRows`, `task`, `access
|:----------------|:----------------------------------------------------------------------------------------------------------|
| id | The operator ID is the unique identifier of the operator in the entire execution plan. In TiDB 2.1, the ID is formatted to display the tree structure of the operator. Data flows from the child node to the parent node. One and only one parent node for each operator. |
| estRows | The number of rows that the operator is expected to output. This number is estimated according to the statistics and the operator's logic. `estRows` is called `count` in the earlier versions of TiDB 4.0. |
| task | The type of task the operator belongs to. Currently, the execution plans are divided into two tasks: **root** task, which is executed on tidb-server, and **cop** task, which is performed in parallel on TiKV or TiFlash. The topology of the execution plan at the task level is that a root task followed by many cop tasks. The root task uses the output of cop tasks as input. The cop tasks refer to tasks that TiDB pushes down to TiKV or TiFlash. Each cop task is distributed in the TiKV cluster or the TiFlash cluster, and is executed by multiple processes. |
| task | The type of task the operator belongs to. Currently, the execution plans are divided into four tasks: **root** task, which is executed on tidb-server; **cop** task, which is performed in parallel on TiKV or TiFlash; ** batchCop ** task, executed in parallel on TiFlash; ** mpp ** task, executed in parallel on TiFlash. The topology of the execution plan at the task level is that a root task followed by many other tasks. The root task uses the output of other tasks as input. The other tasks refer to tasks that TiDB pushes down to TiKV or TiFlash. Each pushed down task is distributed in the TiKV cluster or the TiFlash cluster, and is executed by multiple processes. |

Check failure on line 55 in sql-statements/sql-statement-explain.md

View workflow job for this annotation

GitHub Actions / vale

[vale] reported by reviewdog 🐶 [PingCAP.Spacing] 'h. T' should have one space. Raw Output: {"message": "[PingCAP.Spacing] 'h. T' should have one space.", "location": {"path": "sql-statements/sql-statement-explain.md", "range": {"start": {"line": 55, "column": 335}}}, "severity": "ERROR"}

Check warning on line 55 in sql-statements/sql-statement-explain.md

View workflow job for this annotation

GitHub Actions / vale

[vale] reported by reviewdog 🐶 [PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion. Raw Output: {"message": "[PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion.", "location": {"path": "sql-statements/sql-statement-explain.md", "range": {"start": {"line": 55, "column": 424}}}, "severity": "INFO"}
| access object | Data item information accessed by the operator. The information includes `table`, `partition`, and `index` (if any). Only operators that directly access the data have such information. |
| operator info | Other information about the operator. `operator info` of each operator is different. You can refer to the following examples. |

Expand Down

0 comments on commit 228fa32

Please sign in to comment.