Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement range/generate_series func #8140

Merged
merged 7 commits into from
Nov 14, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions datafusion/expr/src/built_in_function.rs
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,8 @@ pub enum BuiltinScalarFunction {
MakeArray,
/// Flatten
Flatten,
/// Range
Range,

// struct functions
/// struct
Expand Down Expand Up @@ -406,6 +408,7 @@ impl BuiltinScalarFunction {
BuiltinScalarFunction::ArrayToString => Volatility::Immutable,
BuiltinScalarFunction::ArrayIntersect => Volatility::Immutable,
BuiltinScalarFunction::ArrayUnion => Volatility::Immutable,
BuiltinScalarFunction::Range => Volatility::Immutable,
BuiltinScalarFunction::Cardinality => Volatility::Immutable,
BuiltinScalarFunction::MakeArray => Volatility::Immutable,
BuiltinScalarFunction::Ascii => Volatility::Immutable,
Expand Down Expand Up @@ -588,6 +591,9 @@ impl BuiltinScalarFunction {
BuiltinScalarFunction::ArrayToString => Ok(Utf8),
BuiltinScalarFunction::ArrayIntersect => Ok(input_expr_types[0].clone()),
BuiltinScalarFunction::ArrayUnion => Ok(input_expr_types[0].clone()),
BuiltinScalarFunction::Range => {
Ok(List(Arc::new(Field::new("item", Int64, true))))
}
BuiltinScalarFunction::Cardinality => Ok(UInt64),
BuiltinScalarFunction::MakeArray => match input_expr_types.len() {
0 => Ok(List(Arc::new(Field::new("item", Null, true)))),
Expand Down Expand Up @@ -902,6 +908,14 @@ impl BuiltinScalarFunction {
// 0 or more arguments of arbitrary type
Signature::one_of(vec![VariadicAny, Any(0)], self.volatility())
}
BuiltinScalarFunction::Range => Signature::one_of(
vec![
Exact(vec![Int64]),
Exact(vec![Int64, Int64]),
Exact(vec![Int64, Int64, Int64]),
],
self.volatility(),
),
BuiltinScalarFunction::Struct => Signature::variadic(
struct_expressions::SUPPORTED_STRUCT_TYPES.to_vec(),
self.volatility(),
Expand Down Expand Up @@ -1533,6 +1547,7 @@ fn aliases(func: &BuiltinScalarFunction) -> &'static [&'static str] {
BuiltinScalarFunction::MakeArray => &["make_array", "make_list"],
BuiltinScalarFunction::ArrayIntersect => &["array_intersect", "list_intersect"],
BuiltinScalarFunction::OverLay => &["overlay"],
BuiltinScalarFunction::Range => &["range", "generate_series"],

// struct functions
BuiltinScalarFunction::Struct => &["struct"],
Expand Down
6 changes: 6 additions & 0 deletions datafusion/expr/src/expr_fn.rs
Original file line number Diff line number Diff line change
Expand Up @@ -737,6 +737,12 @@ scalar_expr!(
"Returns an array of the elements in the intersection of array1 and array2."
);

nary_scalar_expr!(
Veeupup marked this conversation as resolved.
Show resolved Hide resolved
Range,
gen_range,
Veeupup marked this conversation as resolved.
Show resolved Hide resolved
"Returns a list of values in the range between start and stop with step."
);

// string functions
scalar_expr!(Ascii, ascii, chr, "ASCII code value of the character");
scalar_expr!(
Expand Down
52 changes: 52 additions & 0 deletions datafusion/physical-expr/src/array_expressions.rs
Original file line number Diff line number Diff line change
Expand Up @@ -643,6 +643,58 @@ fn general_append_and_prepend(
)?))
}

/// Generates an array of integers from start to stop with a given step.
///
/// This function takes 1 to 3 ArrayRefs as arguments, representing start, stop, and step values.
/// It returns a `Result<ArrayRef>` representing the resulting ListArray after the operation.
///
/// # Arguments
///
/// * `args` - An array of 1 to 3 ArrayRefs representing start, stop, and step(step value can not be zero.) values.
///
/// # Examples
///
/// gen_range(3) => [0, 1, 2]
/// gen_range(1, 4) => [1, 2, 3]
/// gen_range(1, 7, 2) => [1, 3, 5]
pub fn gen_range(args: &[ArrayRef]) -> Result<ArrayRef> {
Veeupup marked this conversation as resolved.
Show resolved Hide resolved
let (start_array, stop_array, step_array) = match args.len() {
1 => (None, as_int64_array(&args[0])?, None),
2 => (
Some(as_int64_array(&args[0])?),
as_int64_array(&args[1])?,
None,
),
3 => (
Some(as_int64_array(&args[0])?),
as_int64_array(&args[1])?,
Some(as_int64_array(&args[2])?),
),
_ => return internal_err!("gen_range expects 1 to 3 arguments"),
};

let mut values = vec![];
let mut offsets = vec![0];
for (idx, stop) in stop_array.iter().enumerate() {
let stop = stop.unwrap_or(0);
let start = start_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(0);
let step = step_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(1);
Veeupup marked this conversation as resolved.
Show resolved Hide resolved
if step == 0 {
return exec_err!("step can't be 0 for function range(start [, stop, step]");
}
let value = (start..stop).step_by(step as usize);
values.extend(value);
offsets.push(values.len() as i32);
}
let arr = Arc::new(ListArray::try_new(
Arc::new(Field::new("item", DataType::Int64, true)),
OffsetBuffer::new(offsets.into()),
Arc::new(Int64Array::from(values)),
None,
)?);
Ok(arr)
}

/// Array_append SQL function
pub fn array_append(args: &[ArrayRef]) -> Result<ArrayRef> {
let list_array = as_list_array(&args[0])?;
Expand Down
3 changes: 3 additions & 0 deletions datafusion/physical-expr/src/functions.rs
Original file line number Diff line number Diff line change
Expand Up @@ -401,6 +401,9 @@ pub fn create_physical_fun(
BuiltinScalarFunction::ArrayIntersect => Arc::new(|args| {
make_scalar_function(array_expressions::array_intersect)(args)
}),
BuiltinScalarFunction::Range => {
Arc::new(|args| make_scalar_function(array_expressions::gen_range)(args))
}
BuiltinScalarFunction::Cardinality => {
Arc::new(|args| make_scalar_function(array_expressions::cardinality)(args))
}
Expand Down
1 change: 1 addition & 0 deletions datafusion/proto/proto/datafusion.proto
Original file line number Diff line number Diff line change
Expand Up @@ -637,6 +637,7 @@ enum ScalarFunction {
ArrayIntersect = 119;
ArrayUnion = 120;
OverLay = 121;
Range = 122;
}

message ScalarFunctionNode {
Expand Down
3 changes: 3 additions & 0 deletions datafusion/proto/src/generated/pbjson.rs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions datafusion/proto/src/generated/prost.rs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 9 additions & 2 deletions datafusion/proto/src/logical_plan/from_proto.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,8 +49,8 @@ use datafusion_expr::{
concat_expr, concat_ws_expr, cos, cosh, cot, current_date, current_time, date_bin,
date_part, date_trunc, decode, degrees, digest, encode, exp,
expr::{self, InList, Sort, WindowFunction},
factorial, flatten, floor, from_unixtime, gcd, isnan, iszero, lcm, left, ln, log,
log10, log2,
factorial, flatten, floor, from_unixtime, gcd, gen_range, isnan, iszero, lcm, left,
ln, log, log10, log2,
logical_plan::{PlanType, StringifiedPlan},
lower, lpad, ltrim, md5, nanvl, now, nullif, octet_length, overlay, pi, power,
radians, random, regexp_match, regexp_replace, repeat, replace, reverse, right,
Expand Down Expand Up @@ -488,6 +488,7 @@ impl From<&protobuf::ScalarFunction> for BuiltinScalarFunction {
ScalarFunction::ArrayToString => Self::ArrayToString,
ScalarFunction::ArrayIntersect => Self::ArrayIntersect,
ScalarFunction::ArrayUnion => Self::ArrayUnion,
ScalarFunction::Range => Self::Range,
ScalarFunction::Cardinality => Self::Cardinality,
ScalarFunction::Array => Self::MakeArray,
ScalarFunction::NullIf => Self::NullIf,
Expand Down Expand Up @@ -1409,6 +1410,12 @@ pub fn parse_expr(
parse_expr(&args[0], registry)?,
parse_expr(&args[1], registry)?,
)),
ScalarFunction::Range => Ok(gen_range(
args.to_owned()
.iter()
.map(|expr| parse_expr(expr, registry))
.collect::<Result<Vec<_>, _>>()?,
)),
ScalarFunction::Cardinality => {
Ok(cardinality(parse_expr(&args[0], registry)?))
}
Expand Down
1 change: 1 addition & 0 deletions datafusion/proto/src/logical_plan/to_proto.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1495,6 +1495,7 @@ impl TryFrom<&BuiltinScalarFunction> for protobuf::ScalarFunction {
BuiltinScalarFunction::ArrayToString => Self::ArrayToString,
BuiltinScalarFunction::ArrayIntersect => Self::ArrayIntersect,
BuiltinScalarFunction::ArrayUnion => Self::ArrayUnion,
BuiltinScalarFunction::Range => Self::Range,
BuiltinScalarFunction::Cardinality => Self::Cardinality,
BuiltinScalarFunction::MakeArray => Self::Array,
BuiltinScalarFunction::NullIf => Self::NullIf,
Expand Down
36 changes: 36 additions & 0 deletions datafusion/sqllogictest/test_files/array.slt
Original file line number Diff line number Diff line change
Expand Up @@ -240,6 +240,13 @@ AS VALUES
(make_array(31, 32, 33, 34, 35, 26, 37, 38, 39, 40), 34, 4, 'ok', [8,9])
;

statement ok
CREATE TABLE arrays_range
AS VALUES
(3, 10, 2),
(4, 13, 3)
;

statement ok
CREATE TABLE arrays_with_repeating_elements
AS VALUES
Expand Down Expand Up @@ -2662,6 +2669,32 @@ select list_has_all(make_array(1,2,3), make_array(4,5,6)),
----
false true false true

query ???
select range(column2),
range(column1, column2),
range(column1, column2, column3)
from arrays_range;
----
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [3, 4, 5, 6, 7, 8, 9] [3, 5, 7, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] [4, 5, 6, 7, 8, 9, 10, 11, 12] [4, 7, 10]

query ?????
select range(5),
range(2, 5),
range(2, 10, 3),
range(1, 5, -1),
range(1, -5, 1)
;
----
[0, 1, 2, 3, 4] [2, 3, 4] [2, 5, 8] [1] []

query ???
select generate_series(5),
generate_series(2, 5),
generate_series(2, 10, 3)
;
----
[0, 1, 2, 3, 4] [2, 3, 4] [2, 5, 8]

### Array operators tests

Expand Down Expand Up @@ -2969,6 +3002,9 @@ drop table array_intersect_table_3D;
statement ok
drop table arrays_values_without_nulls;

statement ok
drop table arrays_range;

statement ok
drop table arrays_with_repeating_elements;

Expand Down
1 change: 1 addition & 0 deletions docs/source/user-guide/expressions.md
Original file line number Diff line number Diff line change
Expand Up @@ -236,6 +236,7 @@ Unlike to some databases the math functions in Datafusion works the same way as
| array_union(array1, array2) | Returns an array of the elements in the union of array1 and array2 without duplicates. `array_union([1, 2, 3, 4], [5, 6, 3, 4]) -> [1, 2, 3, 4, 5, 6]` |
| cardinality(array) | Returns the total number of elements in the array. `cardinality([[1, 2, 3], [4, 5, 6]]) -> 6` |
| make_array(value1, [value2 [, ...]]) | Returns an Arrow array using the specified input expressions. `make_array(1, 2, 3) -> [1, 2, 3]` |
| range(start [, stop, step]) | Returns an Arrow array between start and stop with step. `SELECT range(2, 10, 3) -> [2, 5, 8]` |
| trim_array(array, n) | Deprecated |

## Regular Expressions
Expand Down
15 changes: 15 additions & 0 deletions docs/source/user-guide/sql/scalar_functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -1560,6 +1560,7 @@ from_unixtime(expression)
- [string_to_array](#string_to_array)
- [string_to_list](#string_to_list)
- [trim_array](#trim_array)
- [range](#range)

### `array_append`

Expand Down Expand Up @@ -2481,6 +2482,20 @@ trim_array(array, n)
Can be a constant, column, or function, and any combination of array operators.
- **n**: Element to trim the array.

### `range`
Veeupup marked this conversation as resolved.
Show resolved Hide resolved

Returns an Arrow array between start and stop with step. `SELECT range(2, 10, 3) -> [2, 5, 8]`

The range start..end contains all values with start <= x < end. It is empty if start >= end.

Step can not be 0 (then the range will be nonsense.).

#### Arguments

- **start**: start of the range
- **end**: end of the range (not included)
- **step**: increase by step (can not be 0)

## Struct Functions

- [struct](#struct)
Expand Down