-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL][Query Diagnostics] Handle intermediate flushing for explain plans #23720
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/new-feature
This is a request for a completely new feature
priority/medium
Medium priority issue
Comments
IshanChhangani
added
area/ysql
Yugabyte SQL (YSQL)
status/awaiting-triage
Issue awaiting triage
labels
Aug 29, 2024
yugabyte-ci
added
kind/new-feature
This is a request for a completely new feature
priority/medium
Medium priority issue
and removed
status/awaiting-triage
Issue awaiting triage
labels
Aug 29, 2024
IshanChhangani
added a commit
that referenced
this issue
Sep 5, 2024
…queryDiagnostics Summary: - This diff adds functionality for `explain_analyze` and `explain_dist` params. - Note: This diff does not include code for supporting EXPLAIN(ANALYZE, DIST, DEBUG) - Explain.txt looks like:- ``` duration: 6.407 ms plan: Sort (cost=550.02..550.85 rows=333 width=116) (actual time=6.390..6.390 rows=0 loops=1) Sort Key: (sum(o.total_amount)) DESC, (count(DISTINCT o.order_id)) DESC Sort Method: quicksort Memory: 25kB -> GroupAggregate (cost=496.90..536.07 rows=333 width=116) (actual time=6.377..6.377 rows=0 loops=1) Group Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Filter: (count(DISTINCT o.order_id) > 0) -> Sort (cost=496.90..499.40 rows=1000 width=178) (actual time=6.374..6.374 rows=0 loops=1) Sort Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Sort Method: quicksort Memory: 25kB -> YB Batched Nested Loop Left Join (cost=117.50..447.07 rows=1000 width=178) (actual time=6.362..6.362 rows=0 loops=1) Join Filter: (oi.product_id = p.product_id) -> YB Batched Nested Loop Join (cost=117.50..328.19 rows=1000 width=268) (actual time=6.359..6.359 rows=0 loops=1) Join Filter: (o.customer_id = c.customer_id) -> Hash Right Join (cost=117.50..225.69 rows=1000 width=32) (actual time=6.354..6.355 rows=0 loops=1) Hash Cond: (oi.order_id = o.order_id) -> Seq Scan on order_items oi (cost=0.00..100.00 rows=1000 width=12) (never executed) -> Hash (cost=105.00..105.00 rows=1000 width=24) (actual time=6.348..6.348 rows=0 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 8kB -> Seq Scan on orders o (cost=0.00..105.00 rows=1000 width=24) (actual time=6.345..6.345 rows=0 loops=1) Storage Filter: ((order_date >= '2023-04-01'::date) AND (order_date <= '2023-05-31'::date)) Storage Table Read Requests: 1 Storage Table Read Execution Time: 0.000 ms -> Index Scan using customers_pkey on customers c (cost=0.00..0.12 rows=1 width=240) (never executed) Index Cond: (customer_id = ANY (ARRAY[o.customer_id, $1, $2, ..., $1023])) Storage Filter: (registration_date >= '2023-01-01'::date) -> Index Scan using products_pkey on products p (cost=0.00..0.11 rows=1 width=122) (never executed) Index Cond: (product_id = ANY (ARRAY[oi.product_id, $1025, $1026, ..., $2047])) duration: 6.559 ms plan: Sort (cost=550.02..550.85 rows=333 width=116) (actual time=6.537..6.537 rows=0 loops=1) Sort Key: (sum(o.total_amount)) DESC, (count(DISTINCT o.order_id)) DESC Sort Method: quicksort Memory: 25kB -> GroupAggregate (cost=496.90..536.07 rows=333 width=116) (actual time=6.526..6.526 rows=0 loops=1) Group Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Filter: (count(DISTINCT o.order_id) > 0) -> Sort (cost=496.90..499.40 rows=1000 width=178) (actual time=6.523..6.523 rows=0 loops=1) Sort Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Sort Method: quicksort Memory: 25kB -> YB Batched Nested Loop Left Join (cost=117.50..447.07 rows=1000 width=178) (actual time=6.515..6.515 rows=0 loops=1) Join Filter: (oi.product_id = p.product_id) -> YB Batched Nested Loop Join (cost=117.50..328.19 rows=1000 width=268) (actual time=6.511..6.511 rows=0 loops=1) Join Filter: (o.customer_id = c.customer_id) -> Hash Right Join (cost=117.50..225.69 rows=1000 width=32) (actual time=6.506..6.506 rows=0 loops=1) Hash Cond: (oi.order_id = o.order_id) -> Seq Scan on order_items oi (cost=0.00..100.00 rows=1000 width=12) (never executed) -> Hash (cost=105.00..105.00 rows=1000 width=24) (actual time=6.497..6.497 rows=0 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 8kB -> Seq Scan on orders o (cost=0.00..105.00 rows=1000 width=24) (actual time=6.494..6.494 rows=0 loops=1) Storage Filter: ((order_date >= '2023-04-01'::date) AND (order_date <= '2023-05-31'::date)) Storage Table Read Requests: 1 Storage Table Read Execution Time: 0.000 ms -> Index Scan using customers_pkey on customers c (cost=0.00..0.12 rows=1 width=240) (never executed) Index Cond: (customer_id = ANY (ARRAY[o.customer_id, $1, $2, ..., $1023])) Storage Filter: (registration_date >= '2023-01-01'::date) -> Index Scan using products_pkey on products p (cost=0.00..0.11 rows=1 width=122) (never executed) Index Cond: (product_id = ANY (ARRAY[oi.product_id, $1025, $1026, ..., $2047])) ``` Jira: DB-11078 Test Plan: ./yb_build.sh --java-test TestYbQueryDiagnostics#checkExplainData Testing with a large query that generates an EXPLAIN plan exceeding 16KB, causes output termination and loss of remaining data. This is expected and will be handled later [[ #23720 | #23720 ]] Reviewers: asaha, hbhanawat, telgersma Reviewed By: telgersma Subscribers: yql, ybase Differential Revision: https://phorge.dev.yugabyte.com/D35566
IshanChhangani
added a commit
that referenced
this issue
Sep 12, 2024
…ANALYZE, DIST) support for queryDiagnostics Summary: No merge conflict - yb_query_diagnostics.c - YbQueryDiagnostics_ExecutorStart - Resolved compile errors by replacing `MAX_REMOVE_VALUE` with pg15 counterpart function `pg_prng_double(&pg_global_prng_state)`. Upstream PG 3804539e48e794781c6145c7f988f5d507418fa8 replaced random function with pg_prng_double. YB master 40689bc - Imported this function `include "common/pg_prng.h"` - same original summary - This diff adds functionality for `explain_analyze` and `explain_dist` params. - Note: This diff does not include code for supporting EXPLAIN(ANALYZE, DIST, DEBUG) - Explain.txt looks like:- ``` duration: 6.407 ms plan: Sort (cost=550.02..550.85 rows=333 width=116) (actual time=6.390..6.390 rows=0 loops=1) Sort Key: (sum(o.total_amount)) DESC, (count(DISTINCT o.order_id)) DESC Sort Method: quicksort Memory: 25kB -> GroupAggregate (cost=496.90..536.07 rows=333 width=116) (actual time=6.377..6.377 rows=0 loops=1) Group Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Filter: (count(DISTINCT o.order_id) > 0) -> Sort (cost=496.90..499.40 rows=1000 width=178) (actual time=6.374..6.374 rows=0 loops=1) Sort Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Sort Method: quicksort Memory: 25kB -> YB Batched Nested Loop Left Join (cost=117.50..447.07 rows=1000 width=178) (actual time=6.362..6.362 rows=0 loops=1) Join Filter: (oi.product_id = p.product_id) -> YB Batched Nested Loop Join (cost=117.50..328.19 rows=1000 width=268) (actual time=6.359..6.359 rows=0 loops=1) Join Filter: (o.customer_id = c.customer_id) -> Hash Right Join (cost=117.50..225.69 rows=1000 width=32) (actual time=6.354..6.355 rows=0 loops=1) Hash Cond: (oi.order_id = o.order_id) -> Seq Scan on order_items oi (cost=0.00..100.00 rows=1000 width=12) (never executed) -> Hash (cost=105.00..105.00 rows=1000 width=24) (actual time=6.348..6.348 rows=0 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 8kB -> Seq Scan on orders o (cost=0.00..105.00 rows=1000 width=24) (actual time=6.345..6.345 rows=0 loops=1) Storage Filter: ((order_date >= '2023-04-01'::date) AND (order_date <= '2023-05-31'::date)) Storage Table Read Requests: 1 Storage Table Read Execution Time: 0.000 ms -> Index Scan using customers_pkey on customers c (cost=0.00..0.12 rows=1 width=240) (never executed) Index Cond: (customer_id = ANY (ARRAY[o.customer_id, $1, $2, ..., $1023])) Storage Filter: (registration_date >= '2023-01-01'::date) -> Index Scan using products_pkey on products p (cost=0.00..0.11 rows=1 width=122) (never executed) Index Cond: (product_id = ANY (ARRAY[oi.product_id, $1025, $1026, ..., $2047])) duration: 6.559 ms plan: Sort (cost=550.02..550.85 rows=333 width=116) (actual time=6.537..6.537 rows=0 loops=1) Sort Key: (sum(o.total_amount)) DESC, (count(DISTINCT o.order_id)) DESC Sort Method: quicksort Memory: 25kB -> GroupAggregate (cost=496.90..536.07 rows=333 width=116) (actual time=6.526..6.526 rows=0 loops=1) Group Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Filter: (count(DISTINCT o.order_id) > 0) -> Sort (cost=496.90..499.40 rows=1000 width=178) (actual time=6.523..6.523 rows=0 loops=1) Sort Key: c.customer_id, ((((c.first_name)::text || ' '::text) || (c.last_name)::text)) Sort Method: quicksort Memory: 25kB -> YB Batched Nested Loop Left Join (cost=117.50..447.07 rows=1000 width=178) (actual time=6.515..6.515 rows=0 loops=1) Join Filter: (oi.product_id = p.product_id) -> YB Batched Nested Loop Join (cost=117.50..328.19 rows=1000 width=268) (actual time=6.511..6.511 rows=0 loops=1) Join Filter: (o.customer_id = c.customer_id) -> Hash Right Join (cost=117.50..225.69 rows=1000 width=32) (actual time=6.506..6.506 rows=0 loops=1) Hash Cond: (oi.order_id = o.order_id) -> Seq Scan on order_items oi (cost=0.00..100.00 rows=1000 width=12) (never executed) -> Hash (cost=105.00..105.00 rows=1000 width=24) (actual time=6.497..6.497 rows=0 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 8kB -> Seq Scan on orders o (cost=0.00..105.00 rows=1000 width=24) (actual time=6.494..6.494 rows=0 loops=1) Storage Filter: ((order_date >= '2023-04-01'::date) AND (order_date <= '2023-05-31'::date)) Storage Table Read Requests: 1 Storage Table Read Execution Time: 0.000 ms -> Index Scan using customers_pkey on customers c (cost=0.00..0.12 rows=1 width=240) (never executed) Index Cond: (customer_id = ANY (ARRAY[o.customer_id, $1, $2, ..., $1023])) Storage Filter: (registration_date >= '2023-01-01'::date) -> Index Scan using products_pkey on products p (cost=0.00..0.11 rows=1 width=122) (never executed) Index Cond: (product_id = ANY (ARRAY[oi.product_id, $1025, $1026, ..., $2047])) ``` Jira: DB-11078 Original commit: 40689bc / D35566 Test Plan: ./yb_build.sh --java-test TestYbQueryDiagnostics#checkExplainData Testing with a large query that generates an EXPLAIN plan exceeding 16KB, causes output termination and loss of remaining data. This is expected and will be handled later [[ #23720 | #23720 ]] Reviewers: jason, tfoucher Reviewed By: jason Subscribers: ybase, yql Differential Revision: https://phorge.dev.yugabyte.com/D37980
IshanChhangani
changed the title
[YSQL][Query Diagnostics] Handle oversized explain plans
[YSQL][Query Diagnostics] Handle intermediate flushing for explain plans
Oct 22, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/new-feature
This is a request for a completely new feature
priority/medium
Medium priority issue
Jira Link: DB-12628
Description
Currently, when a query's explain plan exceeds the shared memory allocated for explain plans (YB_QD_MAX_EXPLAIN_PLAN_LEN), the output is trimmed, leading to potential data loss. We aim to improve this behavior to ensure no data is lost.
Currently, the logic copies as much of the plan as possible, even if the data is incomplete. We want to change this to copy the plan only if the entire plan fits within the buffer, similar to how bind variables are handled in this scenario.
Additionally, we propose the following improvements:
GUC Configuration: Convert the hard limits on buffer sizes into configurable GUCs (Grand Unified Configuration) for more flexibility.
User Warnings: Print warnings if a user exceeds the buffer limit within a single background worker interval.
Overflow Flag: Introduce a flag to indicate if data was missed due to buffer overflow. If this flag is set, trigger a buffer flush, regardless of how much the buffer is filled.
These changes will prevent data loss and provide better control over buffer management for explain plans.
Issue Type
kind/new-feature
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: