ARROW-13095: [C++] Implement trig compute functions #10544

lidavidm · 2021-06-16T20:42:05Z

Adds sin/cos/tan and their inverses. Checked variants check for what would be domain errors (this does not apply to atan/atan2).

github-actions · 2021-06-16T20:42:23Z

https://issues.apache.org/jira/browse/ARROW-13095

edponce · 2021-06-17T04:25:22Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+
+TRIG_NO_INF(Sin, std::sin);
+TRIG_NO_INF(Cos, std::cos);
+TRIG_NO_INF(Tan, std::tan);


I suggest not using macros to define compute functions and instead expanding the definition of each function explicitly. I understand the macros are there to reduce code duplication, but the same argument could be made for kernel registration code and others.

edponce · 2021-06-17T04:26:37Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+    return std::atan2(y, x);
+  }
+};
+


Why use the Trig prefix on these compute functions and not their standalone name?

edponce · 2021-06-17T04:28:03Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+    static_assert(std::is_same<Arg0, Arg1>::value, "");
+    return std::atan2(y, x);
+  }
+};


Are we missing the TrigAtanChecked and TrigAtan2Checked?

I was going by cppreference which states that these functions don't have domain errors. The SEI CERT C coding standard disagrees in the case of atan2 (it states x and y must not be 0), but after digging up IEEE754, it states that atan2(0, 0) is defined (as 0).

These functions do have range (overflow) errors though so I can add checked versions for that.

Actually, I take that back: looking at IEEE754, neither of them is specified to overflow. They can both underflow, but this isn't undefined behavior, so I don't think we need to check for that?

But do we want to add the ArithmeticOptions to the signature anyways for consistency?

C++ treats atan2(0, 0) as a possible domain error, and the returned value is then implementation-specific. So even if it returns 0 in our tests, we should trigger a domain error, since it maps to the 0/0 expression.

Well, probably there is no need to check for underflow, because it seems that for all trigonometric functions, C++ applies rounding to the underflow cases and returns a correct value.

Ah, ok. I do see that it'll set a floating-point exception that you can test for, but that may indeed be overkill.
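For illustration, testing for that floating-point exception could look roughly like the sketch below. `SinRaisesUnderflow` is a made-up helper, not something from the PR, and whether the flag is actually raised for a given input depends on the implementation and compiler settings.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper: computes std::sin(val) and reports whether the
// FE_UNDERFLOW flag was set afterwards. The outcome is implementation-dependent.
inline bool SinRaisesUnderflow(double val, double* result) {
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double input = val;  // discourage constant folding of the call
  *result = std::sin(input);
  return std::fetestexcept(FE_UNDERFLOW) != 0;
}

inline void CheckSubnormalInput() {
  // Half of the smallest normal double is a subnormal value.
  const double tiny = std::numeric_limits<double>::min() / 2;
  double result = 0.0;
  (void)SinRaisesUnderflow(tiny, &result);  // the flag may or may not be set
  // Regardless of the flag, the result is an ordinary finite value.
  assert(std::isfinite(result));
  assert(result >= 0.0);
}
```

The result is usable either way, which matches the conclusion that checking for underflow is probably overkill.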

I've fixed the docs and updated atan2 to explicitly check for the 0, 0 case (including signed zeroes which are specified).

One last fix: std::copysign should also be applied to the 0.0 value, i.e. return std::copysign(static_cast<T>(0.0), y).

Whoops, good catch - thanks!
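The signed-zero handling discussed in this thread can be sketched as follows. This is an illustration rather than the code from the PR: `Atan2WithZeroCheck` is a hypothetical name, and the values follow the IEEE 754 rule that atan2(±0, +0) is ±0 and atan2(±0, -0) is ±π.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not the PR's code): handles the (0, 0) inputs
// explicitly so the result matches IEEE 754 even if std::atan2 does not.
template <typename T>
T Atan2WithZeroCheck(T y, T x) {
  if (y == 0 && x == 0) {
    // == 0 is true for both +0.0 and -0.0; std::signbit tells them apart.
    const T magnitude =
        std::signbit(x) ? std::acos(static_cast<T>(-1)) : static_cast<T>(0);
    // Apply the sign of y to the result, including when the result is zero.
    return std::copysign(magnitude, y);
  }
  return std::atan2(y, x);
}

inline void CheckAtan2Zeros() {
  assert(Atan2WithZeroCheck(0.0, 0.0) == 0.0);
  assert(std::signbit(Atan2WithZeroCheck(-0.0, 0.0)));
  assert(Atan2WithZeroCheck(0.0, -0.0) > 3.14);
  assert(Atan2WithZeroCheck(-0.0, -0.0) < -3.14);
}
```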

edponce · 2021-06-17T04:58:25Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+  template <typename T, typename Arg0>
+  static enable_if_floating_point<Arg0, T> Call(KernelContext*, Arg0 val, Status* st) {
+    static_assert(std::is_same<T, Arg0>::value, "");
+    if (ARROW_PREDICT_FALSE((val < -1.0 || val > 1.0) && !std::isnan(val))) {


I suggest reordering the domain error checks as follows:

if (!std::isnan(val) && (val > 1.0 || val < -1.0)) { ... }

The reasoning is that a NaN input will exit immediately and a non-NaN input will be correctly compared to |val| > 1 (C++ does not guarantee that NaN comparisons are false). Putting the positive check before the negative check can let the compiler order instructions more efficiently; see examples of all the combinations for clang and GCC.
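In isolation, the suggested ordering might look like the sketch below. `AsinChecked` and its `bool` out-parameter are stand-ins assumed for illustration; the real kernel reports errors through Arrow's `Status`, which is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the checked kernel: signals a domain error
// through `ok` instead of Arrow's Status.
inline double AsinChecked(double val, bool* ok) {
  // NaN short-circuits the range test and is passed on to std::asin.
  if (!std::isnan(val) && (val > 1.0 || val < -1.0)) {
    *ok = false;  // domain error: |val| > 1
    return val;
  }
  *ok = true;
  return std::asin(val);
}

inline void CheckAsinDomain() {
  bool ok = true;
  AsinChecked(2.0, &ok);
  assert(!ok);
  const double r = AsinChecked(std::numeric_limits<double>::quiet_NaN(), &ok);
  assert(ok && std::isnan(r));
}
```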


edponce · 2021-06-17T19:01:57Z

LGTM, thanks for working on this.

edponce · 2021-06-17T19:19:56Z

The Windows MinGW jobs are failing due to an undefined M_PI. The reason is that M_PI and friends are not part of the C++ standard. To be portable, we need to add our own definitions of these constants.

#if defined(_USE_MATH_DEFINES) && !defined(_MATH_DEFINES_DEFINED)
#define _MATH_DEFINES_DEFINED

/* Define _USE_MATH_DEFINES before including math.h to expose these macro
 * definitions for common math constants.  These are placed under an #ifdef
 * since these commonly-defined names are not part of the C/C++ standards.
 */

/* Definitions of useful mathematical constants
 * M_E        - e
 * M_LOG2E    - log2(e)
 * M_LOG10E   - log10(e)
 * M_LN2      - ln(2)
 * M_LN10     - ln(10)
 * M_PI       - pi
 * M_PI_2     - pi/2
 * M_PI_4     - pi/4
 * M_1_PI     - 1/pi
 * M_2_PI     - 2/pi
 * M_2_SQRTPI - 2/sqrt(pi)
 * M_SQRT2    - sqrt(2)
 * M_SQRT1_2  - 1/sqrt(2)
 */

#define M_E        2.71828182845904523536
#define M_LOG2E    1.44269504088896340736
#define M_LOG10E   0.434294481903251827651
#define M_LN2      0.693147180559945309417
#define M_LN10     2.30258509299404568402
#define M_PI       3.14159265358979323846
#define M_PI_2     1.57079632679489661923
#define M_PI_4     0.785398163397448309616
#define M_1_PI     0.318309886183790671538
#define M_2_PI     0.636619772367581343076
#define M_2_SQRTPI 1.12837916709551257390
#define M_SQRT2    1.41421356237309504880
#define M_SQRT1_2  0.707106781186547524401

#endif  /* _USE_MATH_DEFINES */

lidavidm · 2021-06-17T19:20:48Z

Yup, I just saw that - I ended up using acos, I can add a define instead too.
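The `acos` workaround presumably relies on the identity acos(-1) = π. A minimal sketch, where the name `kPi` is invented for illustration and is not taken from the PR:

```cpp
#include <cassert>
#include <cmath>

// Portable alternative to the non-standard M_PI macro: acos(-1) equals pi.
// `kPi` is an illustrative name only.
static const double kPi = std::acos(-1.0);

inline void CheckPi() {
  // sin(pi/2) and cos(pi) come out at 1 and -1 to within rounding.
  assert(std::fabs(std::sin(kPi / 2) - 1.0) < 1e-15);
  assert(std::fabs(std::cos(kPi) + 1.0) < 1e-15);
}
```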

edponce · 2021-06-17T19:23:39Z

I think it would be good to add these macros somewhere accessible for the compute layer. Should we open a JIRA for this or not consider this now?

lidavidm · 2021-06-17T19:24:29Z

I can put them in util_internal.h perhaps?

edponce · 2021-06-17T19:31:51Z

I agree with putting the math definitions in arrow/cpp/src/arrow/compute/kernels/util_internal.h; this will make them visible to arithmetic kernel implementations and tests.

edponce · 2021-06-17T20:37:05Z

cpp/src/arrow/compute/kernels/util_internal.h

+#define M_SQRT2 1.41421356237309504880
+#define M_SQRT1_2 0.707106781186547524401
+#endif
+


I do not understand the issue with _USE_MATH_DEFINES and unity builds. These defines should be guarded with

#if defined(_USE_MATH_DEFINES) && !defined(_MATH_DEFINES_DEFINED)
#define _MATH_DEFINES_DEFINED
...
#endif

to prevent multiple definitions.

It's the opposite: in a unity build, if any translation unit forgets to add that define before including cmath, you'll get a build error even if your own file does add it, since cmath will already have been included.

Ok, I see. You are guarding all the defines based on the existence of M_E. I think it would be "safer" to guard each macro independently. Safer in the sense that if at some point only M_E is defined explicitly in another translation unit, then the remaining macros will be undefined. Alternatively, we can use _USE_ARROW_MATH_DEFINES and require it before including util_internal.h, but this seems overkill.
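Guarding each macro independently could look like the sketch below. Only three of the constants are shown, and the layout is an assumption about how the suggestion would be applied, not the final contents of util_internal.h.

```cpp
#include <cassert>
#include <cmath>

// Each constant has its own guard, so a platform or translation unit that
// already defines some of them does not leave the others undefined.
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
#ifndef M_PI_2
#define M_PI_2 1.57079632679489661923
#endif
#ifndef M_PI_4
#define M_PI_4 0.785398163397448309616
#endif

inline void CheckConstants() {
  // The three constants agree with each other regardless of who defined them.
  assert(std::fabs(2 * M_PI_2 - M_PI) < 1e-15);
  assert(std::fabs(4 * M_PI_4 - M_PI) < 1e-15);
}
```

With this layout it does not matter whether `<cmath>` already provided some of the names.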

jorisvandenbossche · 2021-06-22T11:50:12Z

docs/source/cpp/compute.rst

 | add                      | Binary     | Numeric            | Numeric (1)         |
 +--------------------------+------------+--------------------+---------------------+
 | add_checked              | Binary     | Numeric            | Numeric (1)         |
 +--------------------------+------------+--------------------+---------------------+
+| asin                     | Unary      | Numeric            | Numeric             |


I would put them in a separate table with the trigonometric kernels grouped together (IMO that will make it easier to keep an overview on this page).

cyb70289

~~LGTM.~~

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

cyb70289 · 2021-06-25T03:10:35Z

@edponce @jorisvandenbossche do you have other comments?

edponce · 2021-06-25T03:38:15Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+  template <typename T, typename Arg0>
+  static enable_if_floating_point<Arg0, T> Call(KernelContext*, Arg0 val, Status*) {
+    static_assert(std::is_same<T, Arg0>::value, "");
+    return sin(val);


Use std::sin and friends to be consistent and be clear that this is not an Arrow function (or is it?).

Whoops, not sure why it even lets me call that without std::. Fixed.
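A likely explanation, which the thread does not spell out: the standard leaves it unspecified whether `<cmath>` also declares the C functions in the global namespace, and common standard libraries do so. Qualifying with `std::` guarantees the overload set that covers `float` and `long double`. The sketch below checks the result types at compile time; `Sine` is a made-up wrapper for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// std::sin is overloaded per floating-point type, so the result type
// follows the argument type instead of always widening to double.
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(std::sin(1.0)), double>::value,
              "double in, double out");
static_assert(std::is_same<decltype(std::sin(1.0L)), long double>::value,
              "long double in, long double out");

// Illustrative wrapper that keeps the argument type.
template <typename T>
T Sine(T val) {
  return std::sin(val);
}

inline void CheckSine() {
  assert(Sine(0.0f) == 0.0f);
  assert(Sine(0.0) == 0.0);
}
```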

edponce · 2021-06-25T03:58:24Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+    static_assert(std::is_same<T, double>::value, "");
+    static_assert(std::is_same<Arg0, Arg1>::value, "");
+    // Explicitly mimic what IEEE754 does (atan(0, 0) == 0) if needed
+    if (!std::numeric_limits<T>::is_iec559 && y == 0 && x == 0) return 0;


After thinking about it more thoroughly, I consider it safe to remove the "zero check" condition because 99% of CPUs adhere to 99% of IEEE 754, so based on the C++ standard the resulting value from atan2(0,0) will be correct. If you feel that we should keep it, I am fine with that too. Apologies for the back and forth on this detail.

I've decided to remove it then. The tests will catch if a platform doesn't support that.

docs/source/cpp/compute.rst

pitrou · 2021-06-28T14:28:12Z

cpp/src/arrow/compute/api_scalar.h

+
+/// \brief Compute the inverse tangent (arctangent) of the array
+/// values, using the argument signs to determine the correct
+/// quadrant.


This docstring should be more explicit about what y and x are for.
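To make the roles of the two arguments concrete: `atan2(y, x)` returns the angle of the point (x, y) measured from the positive x-axis, which is why it can distinguish quadrants that `atan(y / x)` cannot. A small sketch using only the standard library (`AngleOf` is a made-up name):

```cpp
#include <cassert>
#include <cmath>

// Angle in radians of the point (x, y); note that y comes first in atan2.
inline double AngleOf(double x, double y) { return std::atan2(y, x); }

inline void CheckQuadrants() {
  const double pi = std::acos(-1.0);
  // (-1, 1) lies in the second quadrant: 3*pi/4.
  assert(std::fabs(AngleOf(-1.0, 1.0) - 3 * pi / 4) < 1e-12);
  // (1, -1) lies in the fourth quadrant: -pi/4.
  assert(std::fabs(AngleOf(1.0, -1.0) + pi / 4) < 1e-12);
  // atan(y / x) sees the same ratio for both points and cannot tell them apart.
  assert(std::fabs(std::atan(1.0 / -1.0) - std::atan(-1.0 / 1.0)) < 1e-12);
}
```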


pitrou · 2021-06-28T14:33:10Z

cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc

+                              -M_PI_2, 0, M_PI));
+}
+
+TYPED_TEST(TestUnaryArithmeticIntegral, Trig) {


Is this testing an implicitly cast version of the kernel? It doesn't seem very useful to compute trigonometric functions on integers.

pitrou · 2021-06-28T14:34:54Z

docs/source/python/api/compute.rst

+-----------------------
+
+Trigonometric functions are also supported, and also offer ``_checked``
+variants which detect domain and range errors where appropriate.


Are range errors actually detected?

They are not, because none of these functions can raise a range error, except in case of underflow, in which case they're defined to return a correctly rounded result, so I'll revise the docs.


pitrou · 2021-06-28T14:39:08Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

+  template <typename T, typename Arg0>
+  static enable_if_floating_point<Arg0, T> Call(KernelContext*, Arg0 val, Status* st) {
+    static_assert(std::is_same<T, Arg0>::value, "");
+    if (ARROW_PREDICT_FALSE(!std::isnan(val) && (val < -1.0 || val > 1.0))) {


Is the NaN check necessary? NaNs should typically fail those comparisons, AFAIU.

As Eduardo points out above, C++ doesn't guarantee any particular behavior. But we're essentially already assuming IEEE754 conformance here, in which case the check is redundant.
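The IEEE 754 assumption can be made explicit at compile time. The sketch below uses a hypothetical `OutOfAsinDomain` helper to show that, under that assumption, the plain range test already leaves NaN alone:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// States the IEEE 754 assumption explicitly instead of relying on it silently.
static_assert(std::numeric_limits<double>::is_iec559,
              "IEEE 754 doubles assumed");

// Range test without an explicit NaN check. Under IEEE 754 every ordered
// comparison involving NaN is false, so NaN is never flagged as out of range.
inline bool OutOfAsinDomain(double val) { return val < -1.0 || val > 1.0; }

inline void CheckNaNComparison() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(!OutOfAsinDomain(nan));
  assert(OutOfAsinDomain(1.5));
  assert(!OutOfAsinDomain(-1.0));
}
```

This holds only when the compiler is not asked to relax floating-point semantics (for example with fast-math style options).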

pitrou · 2021-06-28T14:41:57Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

@@ -454,6 +462,191 @@ struct PowerChecked {
  }
 };

+struct Sin {
+  template <typename T, typename Arg0>
+  static enable_if_integer<Arg0, T> Call(KernelContext*, Arg0 val, Status*) {


As mentioned elsewhere, I don't think it's useful to expose integer-accepting versions of these kernels.

I've gone ahead and removed them.

pitrou · 2021-06-28T14:43:10Z

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

@@ -820,6 +1081,80 @@ const FunctionDoc pow_checked_doc{
    ("An error is returned when integer to negative integer power is encountered,\n"
     "or integer overflow is encountered."),
    {"base", "exponent"}};
+
+const FunctionDoc sin_doc{"Computes the sine of the elements argument-wise",


Other docstrings use infinitive / imperative instead of present, such as "Compute the sine of the arguments element-wise".

pitrou

+1, thank you @lidavidm

Adds ln, log10, and log2. We could add a log1e and/or a logN if useful (probably not?). Has some code from/will conflict with #10544.
Closes #10567 from lidavidm/arrow-13096
Authored-by: David Li
Signed-off-by: Yibo Cai

github-actions bot added the Component: C++ label Jun 16, 2021

edponce reviewed Jun 17, 2021

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

edponce reviewed Jun 17, 2021


lidavidm force-pushed the arrow-13095 branch from d8441c9 to 07124c6 Compare June 21, 2021 12:32

lidavidm mentioned this pull request Jun 21, 2021

ARROW-13096: [C++] Implement logarithm compute functions #10567

Closed

jorisvandenbossche reviewed Jun 22, 2021


cyb70289 approved these changes Jun 24, 2021


cyb70289 requested changes Jun 24, 2021

cpp/src/arrow/compute/kernels/scalar_arithmetic.cc

lidavidm force-pushed the arrow-13095 branch from b639054 to db75ffe Compare June 24, 2021 12:59

cyb70289 approved these changes Jun 25, 2021


edponce reviewed Jun 25, 2021

docs/source/cpp/compute.rst

lidavidm force-pushed the arrow-13095 branch from db75ffe to 1addb64 Compare June 25, 2021 12:52

pitrou reviewed Jun 28, 2021


ARROW-13095: [C++] Implement trig compute functions

7ba5db9

lidavidm force-pushed the arrow-13095 branch from 9e787c9 to 7ba5db9 Compare June 30, 2021 12:35

pitrou approved these changes Jun 30, 2021


pitrou closed this in 01f3338 Jun 30, 2021

asfimport mentioned this pull request Jun 30, 2021

[C++] Implement trigonometric compute functions #28800

Closed
