Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Pushdown Limit for Bitmap Scans without ORDER BY clause #21155

Open
1 task done
timothy-e opened this issue Feb 22, 2024 · 0 comments
Open
1 task done

[YSQL] Pushdown Limit for Bitmap Scans without ORDER BY clause #21155

timothy-e opened this issue Feb 22, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@timothy-e
Copy link
Contributor

timothy-e commented Feb 22, 2024

Jira Link: DB-10089

Description

Generally, a scan with a LIMIT should also have an ORDER BY clause. Bitmap scans return unordered results, so an ORDER BY clause requires all the rows to be fetched, sorted, and then the limit can be applied. So the optimization outlined here is not very applicable to the general case.

If a bitmap scan is used in a subquery used for a WHERE EXISTS or WHERE NOT EXISTS clause, then we are only interested in collecting one row, with no order information. This is a case where a LIMIT is present without an ORDER BY, so it allows for the following optimization to be applied:

If the top level node in a Bitmap Scan is a Bitmap Or or a Bitmap Index Scan, we know that any ybctid collected by it (or its immediate children) will be returned to the client.

If a request is sent with a LIMIT 10, then we know that the first 10 ybctids that we collect are all that we need.

CREATE TABLE test_limit (a INT, b INT, c INT);
CREATE INDEX ON test_limit (a ASC);
INSERT INTO test_limit SELECT i, i * 2, i * 3 FROM generate_series(1, 1000) i;
SET yb_fetch_row_limit = 100;

Consider the query:

/*+ BitmapScan(test_limit) */ EXPLAIN (ANALYZE, DIST, COSTS OFF) SELECT * FROM test_limit WHERE a < 200 LIMIT 10;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Limit (actual time=19.238..19.265 rows=10 loops=1)
   ->  YB Bitmap Table Scan on test_limit (actual time=19.234..19.249 rows=10 loops=1)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 3.605 ms
         Storage Table Rows Scanned: 100
         ->  Bitmap Index Scan on test_limit_a_idx (actual time=13.769..13.769 rows=199 loops=1)
               Index Cond: (a < 200)
               Storage Index Read Requests: 3
               Storage Index Read Execution Time: 8.752 ms
               Storage Index Rows Scanned: 199
 Planning Time: 0.442 ms
 Execution Time: 19.768 ms
 Storage Read Requests: 4
 Storage Read Execution Time: 12.357 ms
 Storage Rows Scanned: 299
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 12.357 ms
 Peak Memory Usage: 68 kB
(21 rows)

Here, we collected 199 ybctids from the index even though we only needed to get 10. Then we requested 100 rows from the main table, even though we also only needed 10.

Using the limit for the YB Bitmap Table Scan node should be easy.

For the Bitmap Index Scans, if the only nodes between them and the YB Bitmap Table Scan node are Bitmap Or nodes, then they can also have a limit of 10.

For example:

CREATE INDEX ON test_limit (b ASC);
/*+ BitmapScan(test_limit) */ EXPLAIN (ANALYZE, DIST, COSTS OFF) SELECT * FROM test_limit WHERE a < 200 OR b < 200 LIMIT 10;
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Limit (actual time=13.837..13.857 rows=10 loops=1)
   ->  YB Bitmap Table Scan on test_limit (actual time=13.834..13.845 rows=10 loops=1) <- should have a limit of 10
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 2.586 ms
         Storage Table Rows Scanned: 100
         ->  BitmapOr (actual time=9.984..9.984 rows=199 loops=1) <- even this node could have a limit of 10 to avoid spending extra work on the OR operation
               ->  Bitmap Index Scan on test_limit_a_idx (actual time=6.432..6.432 rows=199 loops=1) <- should have a limit of 10
                     Index Cond: (a < 200)
                     Storage Index Read Requests: 3
                     Storage Index Read Execution Time: 4.193 ms
                     Storage Index Rows Scanned: 199
               ->  Bitmap Index Scan on test_limit_b_idx (actual time=3.546..3.546 rows=99 loops=1) <- should have a limit of 10
                     Index Cond: (b < 200)
                     Storage Index Read Requests: 2
                     Storage Index Read Execution Time: 2.141 ms
                     Storage Index Rows Scanned: 99
 Planning Time: 15.139 ms
 Execution Time: 14.154 ms
 Storage Read Requests: 6
 Storage Read Execution Time: 8.921 ms
 Storage Rows Scanned: 398
 Storage Write Requests: 0
 Catalog Read Requests: 6
 Catalog Read Execution Time: 19.862 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 28.782 ms
 Peak Memory Usage: 119 kB
(28 rows)

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@timothy-e timothy-e added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Feb 22, 2024
@timothy-e timothy-e self-assigned this Feb 22, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 22, 2024
@sushantrmishra sushantrmishra removed the status/awaiting-triage Issue awaiting triage label Feb 27, 2024
@timothy-e timothy-e changed the title [YSQL] Pushdown Limit for Bitmap Scans [YSQL] Pushdown Limit for Bitmap Scans without ORDER BY clause Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

3 participants