-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
distsql: support window functions in distsql #24425
Comments
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. A doc describing all the details of the implementation will be out soon. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: cockroachdb#24425. Release note: None
27140: distsql: support window functions in distsql r=yuzefovich a=yuzefovich Adds support of window functions in DistSQL. A stage of windower processors is added for a particular PARTITION BY scheme, all window functions with the same partitioning are processed by that stage. We also cache sorted partitions if more than one window function has the same ORDER BY clause. A doc describing the details of the implementation is out. Note: memory accounting is not done, I plan on adding that after this PR. Resolves: #24425. Release note: None Co-authored-by: yuzefovich <[email protected]>
Boom!!! |
Adds memory accounting and stats collection to windowers. Related to: cockroachdb#24425. Release note: None
Using slices instead of maps gives significant performance improvement. On several quiries I ran with profiling, time spent while populating output rows reduced from 10s or to 1-2s (in terms of performance degradation: the difference went down from 90-100% to about 20%). Related to: cockroachdb#24425. Release note: None
27978: sql: improve handling of decimal zeroes r=mjibson a=mjibson Previously we had inconsistent handling of decimal zeroes. Various forms (-0, -0.00, +0.00) were correctly handled from a "what is this equal to" perspective, but we were not consistent in keeping the digits after the decimal point. Furthermore, postgres always removes negatives when making -0 for ints and decimals (not floats). This behavior (removing the negative from -0 automatically) existed in the higher level apd tests but we were subverting it by using apd.Decimal.Neg directly. apd has been updated to do this for Decimals now. Our constant evaluation has also caused us some problems. We use the go/constant package, but it evaluates numbers as floats. When we applied the + operator our NumVal type would remove its OrigString representation and thus use only the constant's Value, which, as a float, didn't contain the trailing digits after the decimal (in cases like +0.000). Now the + operator is a pass through. The - operator remains the same, but has special logic to detect -0 and leave the original string The added logic tests have been compared to postgres and now we match them exactly for many more things. Of note is that, in postgres, `SELECT -0::float` returns -0 (in opposition to the decimal behavior). This is why the test in TestParseDatumStringAs removed only the -0 test for decimals and not floats. Release note (sql change): More correct handling of decimal 0s. Specifically, -0 is coerced to 0 and values like 0.00 retain the digits after the decimal point. 28145: sql: hopefully reduce the memory usage of TestDropWhileBackfill r=vivekmenezes a=vivekmenezes related to #28084 Release note: None 28229: rangefeed: add resolved timestamp initialization scan and registration catch-up scan r=nvanbenschoten a=nvanbenschoten This PR includes a number of small improvements to the `rangefeed` package, culminating in the introduction of two scans that a rangefeed will need to perform. The first is a scan over the entire key range from the previously known resolved timestamp onward to learn about all existing unresolved intents. The second is a scan over a new registration's key range to backfill in all values from before it was registered but after its starting timestamp. Both of these scans will be able to take advantage of TimeBound iterators. However, the way they're represented in the `rangefeed` package is pretty abstract, which made testing pretty easy. The first commit and the last two are substantial, but the rest are just small logical changes so I hope this is easy to review. @benesch thanks for offering to take a look at this while @tschottdorf is out! There's only been one PR merged so far so it shouldn't take much to catch up to speed: #28072. 28313: workload: revert --init idempotency if initial rows are loaded r=nvanbenschoten a=nvanbenschoten Fixes #27981. Release note: None 28326: cli: set a timeout delay for SQL connections r=knz a=knz Fixes #20097. This patch leverages a feature in the underlying SQL driver `lib/pq` to set a connection timeout for all CLI commands that use a SQL connection. This includes `cockroach sql`/`demo`, `cockroach node ls`, `cockroach zone set`, etc. The timeout is set to 5 seconds. If the client cannot connect within 5 seconds it fails with the following error: ``` Error: cannot dial server. Is the server running? If the server is running, check --host client-side and --advertise server-side. read tcp [::1]:60205->[::1]:26257: i/o timeout ``` Release note (cli change): the various client commands that use a SQL connection (including e.g. `cockroach sql`, `cockroach node`, `cockroach user`) now produce an error if a connection could not be established within 5 seconds instead of waiting forever. 28341: distsql: use slices for populating output of windowers r=yuzefovich a=yuzefovich Using slices instead of maps gives significant performance improvement. On several quiries I ran with profiling, time spent while populating output rows reduced from 10s or to 1-2s (in terms of performance degradation: the difference went down from 90-100% to about 20%). Related to: #24425. Release note: None Co-authored-by: Matt Jibson <[email protected]> Co-authored-by: Vivek Menezes <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: yuzefovich <[email protected]>
Adds memory accounting and stats collection to windowers. Related to: cockroachdb#24425. Release note: None
28328: distsql: add memory accounting and stats to windowers r=yuzefovich a=yuzefovich Adds memory accounting and stats collection to windowers. Related to: #24425. Release note: None Co-authored-by: yuzefovich <[email protected]>
Here are benchmarks on 4 node roachprod cluster with TPCH dataset (after adding memory accounting and fixing some inefficiencies during population of output rows). Performance improvements are 2x and more. The numbers are even higher if a query has several window functions with the same partitioning scheme. Please note that on 1 node cluster, the performance of DistSQL window functions will be worse than of local execution which is expected (decrease in performance is usually around 20%). Comparison on orders table (1500000 rows)
Comparison on lineitem table (6001215 rows)
|
@yuzefovich is a stud. thank you so much for this. |
We currently don't support window functions in DistSQL queries. This means that any query which includes a window function will be forced into local SQL execution. This is a problem for OLAP-style workloads because these often include window functions when performing complex analytics.
We should fix this. The design of window function processing in DistSQL would need to be carefully thought out. To start, we would want to consider how they could be distributed and decomposed. To get an intuition about this, we can look at how window functions are implemented in local SQL. Each function begins with a scan over all input rows in order to compute partitions. This is similar to the partitioning step in a hash join, which DistSQL already knows how to do. After this, each partition can be operated on in isolation. The per-partition work can be decomposed into two steps - a sort and an aggregation. It would be ideal if these two steps could re-use existing distSQL processors. This should be possible but will require that aggregation processors be taught about window frames. Finally, we'll need to merge all of the partitions together. This again could rely on some existing processors like the UNION processor and the HASH JOIN processor (depending on how many columns in each row are propagated through the window function computation steps).
One complication of window functions is that a given query can have multiple window functions. This is similar to how a query can have multiple aggregation functions, but can get a lot more complicated because each window function can specify a different partitioning and sorting scheme. This means that we can't always combine window function aggregations. Instead, it's probably easiest to run each window function serially and not worry about multiple partitioning phases.
The text was updated successfully, but these errors were encountered: