Extend LEAD/LAG to work with non-fixed-width types #8062
Looks to be a transient test failure:
Codecov Report

| | branch-0.20 | #8062 | +/- |
|---|---|---|---|
| Coverage | 82.88% | 82.90% | +0.01% |
| Files | 103 | 103 | |
| Lines | 17668 | 17877 | +209 |
| Hits | 14645 | 14821 | +176 |
| Misses | 3023 | 3056 | +33 |
1. Renamed NULL_INDEX to _null_index. 2. Removed an unused variable. 3. Used memory resource and stream to construct output column.
1. Struct -> Class 2. Uniform class member prefixes.
Rerun tests.
Rerun tests.
1. Removed unnecessary headers. 2. Switched gather_map to mark nulls with `size` instead of `size+1`. 3. Minor header rearrangement.
Thanks for the reviews, chaps! I'll wait for CI to pass before checking this in.
@gpucibot merge
This commit extends the LEAD()/LAG() offset window functions introduced in #6277 to work with more than just fixed-width data types. The functions are defined as follows:
`LEAD(N)`: Returns the row from the input column at the specified offset past the current row. If the offset crosses the grouping boundary or column boundary for a given row, a "default" value is returned if available. The "default" value is null, by default.

`LAG(N)`: Returns the row from the input column at the specified offset preceding the current row. If the offset crosses the grouping boundary or column boundary for a given row, a "default" value is returned if available. The "default" value is null, by default.
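
The two definitions can be sketched in plain Python. This is only an illustration of the semantics described above, not the libcudf implementation: the helper names (`shift_within_groups`, `lead`, `lag`), the `group_ids` list, and the sample string values are all hypothetical, `None` stands in for null, and rows of the same group are assumed to be contiguous.

```python
def shift_within_groups(values, group_ids, offset):
    """Return values[i + offset] if that row lies in the same group as row i, else None."""
    size = len(values)
    out = []
    for i in range(size):
        j = i + offset
        # The target row must be inside the column AND inside the current group.
        same_group = 0 <= j < size and group_ids[j] == group_ids[i]
        out.append(values[j] if same_group else None)
    return out


def lead(values, group_ids, n):
    """LEAD(n): the row n positions past the current row."""
    return shift_within_groups(values, group_ids, n)


def lag(values, group_ids, n):
    """LAG(n): the row n positions preceding the current row."""
    return shift_within_groups(values, group_ids, -n)


# Hypothetical variable-width (string) column with two contiguous groups.
values = ["a", "bb", "ccc", "dddd", "e", "ff", "ggg", "hhhh"]
group_ids = [1, 1, 1, 1, 2, 2, 2, 2]

lead_1 = lead(values, group_ids, 1)
lag_2 = lag(values, group_ids, 2)
print(lead_1)
print(lag_2)
```

Strings are used here only because they are a simple non-fixed-width type; the same logic applies to any element type.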
As an illustration, consider the following example input column, with two groups (`G1` and `G2`):
`LEAD(col, 1)` yields:
`LAG(input_col, 2)` yields:
If a `defaults` column is specified with contents:
then, `LEAD(col, 1, defaults)` yields:
Note that in the cases where the offset (`1`) would cross the column/group boundary (i.e. indices `3` and `7`), the corresponding entry from the `defaults` column is returned instead of `∅` (i.e. `null`).
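
The defaults behaviour can be sketched as a two-step gather in plain Python. The sentinel index `size` for out-of-range rows echoes the commit note about the gather map; how the real code combines that map with the `defaults` column is not shown in this description, so the structure below is an assumption. The function names and the sample data (two groups of four rows, chosen so that indices 3 and 7 cross a boundary) are hypothetical.

```python
def lead_gather_map(group_ids, offset):
    """Build a gather map for LEAD(offset); rows with no valid source get the sentinel `size`."""
    size = len(group_ids)
    gather_map = []
    for i in range(size):
        j = i + offset
        in_group = 0 <= j < size and group_ids[j] == group_ids[i]
        gather_map.append(j if in_group else size)  # `size` marks "no source row"
    return gather_map


def gather_with_defaults(values, gather_map, defaults=None):
    """Gather values by index; sentinel entries take defaults[i], or None if no defaults given."""
    size = len(values)
    out = []
    for i, j in enumerate(gather_map):
        if j < size:
            out.append(values[j])
        elif defaults is not None:
            out.append(defaults[i])
        else:
            out.append(None)  # None stands in for null
    return out


# Hypothetical data: two contiguous groups of four rows each.
values = ["a", "bb", "ccc", "dddd", "e", "ff", "ggg", "hhhh"]
group_ids = [1, 1, 1, 1, 2, 2, 2, 2]
defaults = ["d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7"]

gather_map = lead_gather_map(group_ids, 1)
with_defaults = gather_with_defaults(values, gather_map, defaults)
without_defaults = gather_with_defaults(values, gather_map)
print(gather_map)
print(with_defaults)
```

With this sample data, only rows 3 and 7 pick up their entry from `defaults`; every other row takes its value from the input column.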