-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-23906][SQL] Add built-in UDF TRUNCATE(number) #22419
Conversation
Test build #96067 has finished for PR 22419 at commit
|
Test build #96068 has finished for PR 22419 at commit
|
btw, in the title, not |
/** | ||
* Returns double type input truncated to scale decimal places. | ||
*/ | ||
def trunc(input: Double, scale: Int): Double = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you put this function in a separate file? Any plan to reuse this?
I just linked to the previous discussion: #18106 |
override def right: Expression = scale | ||
|
||
override def inputTypes: Seq[AbstractDataType] = | ||
Seq(TypeCollection(DoubleType, DecimalType), IntegerType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to support FloatType
?
|
||
checkEvaluation(Truncate(Literal.create(1D, DoubleType), | ||
NonFoldableLiteral.create(null, IntegerType)), | ||
null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only NonFoldableLiteral
? What if Literal.create(null, IntegerType)
for scale?
null) | ||
checkEvaluation(Truncate(Literal.create(null, DoubleType), | ||
NonFoldableLiteral.create(null, IntegerType)), | ||
null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add tests for DecimalType
, and FloatType
if we need to support?
number.dataType match { | ||
case DoubleType => MathUtils.trunc(input1.asInstanceOf[Double], truncScale) | ||
case DecimalType.Fixed(_, _) => | ||
MathUtils.trunc(input1.asInstanceOf[Decimal].toJavaBigDecimal, truncScale) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we have to return Decimal
instead of java.math.BigDecimal
?
select truncate(1234567891.1234567891, -4), truncate(1234567891.1234567891, 0), truncate(1234567891.1234567891, 4); | ||
select truncate(cast(1234567891.1234567891 as decimal), -4), truncate(cast(1234567891.1234567891 as decimal), 0), truncate(cast(1234567891.1234567891 as decimal), 4); | ||
select truncate(cast(1234567891.1234567891 as long), -4), truncate(cast(1234567891.1234567891 as long), 0), truncate(cast(1234567891.1234567891 as long), 4); | ||
select truncate(cast(1234567891.1234567891 as long), 9.03) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test omitting scale?
*/ | ||
def truncate(number: Column, scale: Int): Column = withExpr { | ||
Truncate(number.expr, Literal(scale)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need def truncate(number: Column)
to support omitting scale?
override def checkInputDataTypes(): TypeCheckResult = { | ||
super.checkInputDataTypes() match { | ||
case TypeCheckSuccess => | ||
if (scale.foldable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same to RoundBase
. only support foldable:
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
Line 1076 in c715694
if (scale.foldable) { |
Test build #96242 has finished for PR 22419 at commit
|
On second thoughts, I'm wondering whether we can reuse case class Truncate(child: Expression, scale: Expression)
extends RoundBase(child, scale, BigDecimal.RoundingMode.DOWN, "ROUND_DOWN")
with Serializable with ImplicitCastInputTypes {
def this(child: Expression) = this(child, Literal(0))
} If we want to round negative values towards negative infinity instead of towards zero, we should use Btw, could you add test cases for negative value child as well? |
# Conflicts: # sql/core/src/test/resources/sql-tests/results/operators.sql.out
@ueshin Thanks a lot! |
Test build #96448 has finished for PR 22419 at commit
|
retest this please |
@@ -413,6 +413,7 @@ object Decimal { | |||
val ROUND_HALF_EVEN = BigDecimal.RoundingMode.HALF_EVEN | |||
val ROUND_CEILING = BigDecimal.RoundingMode.CEILING | |||
val ROUND_FLOOR = BigDecimal.RoundingMode.FLOOR | |||
val ROUND_DOWN = BigDecimal.RoundingMode.DOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need this change, otherwise:
Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 138: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 138: A method named "ROUND_DOWN" is not declared in any enclosing class nor any supertype, nor through a static import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we need to modify Decimal.changePrecision()
as well? Could you add it to DecimalSuite.scala#L207 to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it and seem do not need change longVal
. This test case can passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this?
Btw, could you add test cases for negative value child as well?
@@ -413,6 +413,7 @@ object Decimal { | |||
val ROUND_HALF_EVEN = BigDecimal.RoundingMode.HALF_EVEN | |||
val ROUND_CEILING = BigDecimal.RoundingMode.CEILING | |||
val ROUND_FLOOR = BigDecimal.RoundingMode.FLOOR | |||
val ROUND_DOWN = BigDecimal.RoundingMode.DOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we need to modify Decimal.changePrecision()
as well? Could you add it to DecimalSuite.scala#L207 to check?
Test build #96460 has finished for PR 22419 at commit
|
Test build #96466 has finished for PR 22419 at commit
|
retest this please |
Test build #96469 has finished for PR 22419 at commit
|
* Returns the value of the column `e` truncated to 0 places. | ||
* | ||
* @group math_funcs | ||
* @since 2.4.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@since 2.5.0
now.
* Scale can be negative to truncate (make zero) scale digits left of the decimal point. | ||
* | ||
* @group math_funcs | ||
* @since 2.4.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
#22419 (comment) |
Test build #96587 has finished for PR 22419 at commit
|
retest this please |
Test build #96596 has finished for PR 22419 at commit
|
retest this please |
Test build #96628 has finished for PR 22419 at commit
|
1234567891 | ||
""") | ||
// scalastyle:on line.size.limit | ||
case class Truncate(child: Expression, scale: Expression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still preferring to extend trunc
. Not straightforward to know the difference between truncate
and trunc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @wangyum, can you still extend trunc
?
If not, what were the major reasons you decided to separate these? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't know we should trunc StringType
to number or trunc to date.
For example:
SELECT trunc('2.5');
SELECT trunc('2009-02-12');
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, its ok to handle the string as date. How about only accepting float, double, and decimal for number truncation?
Test build #97730 has finished for PR 22419 at commit
|
Test build #97709 has finished for PR 22419 at commit
|
Test build #97727 has finished for PR 22419 at commit
|
Test build #97797 has finished for PR 22419 at commit
|
What changes were proposed in this pull request?
Add UDF
TRUNCATE(number)
:It's similar to MySQL TRUNCATE(X, D)
How was this patch tested?
unit tests