Add lowering for the DOT_PRODUCT intrinsic #914

banach-space · 2021-07-08T19:05:55Z

Hi, I still have a few cases left to implement in Fortran::lower::genDotProduct. Any pointers would be much appreciated!

I'm sending this for early feedback to make sure that this is the right approach.

banach-space · 2021-07-08T19:08:30Z

flang/lib/Lower/ReductionRuntime.cpp

+    func =
+        Fortran::lower::getRuntimeFunc<mkRTKey(DotProductReal8)>(loc, builder);
+
+  // REQUIRES LONG_DOUBLE=80


I got a bit confused here by https://github.com/flang-compiler/f18-llvm-project/blob/fir-dev/flang/runtime/dot-product.cpp#L141-L145. Should I just use #if LONG_DOUBLE == 80? But then I won't be able to test this code-path.

No. Take a look at "struct ForcedProductReal10", "struct ForcedProductReal16", "struct ForcedProductInteger16", "struct ForcedProductComplex10", and "struct ForcedProductComplex16" in ReductionRuntime.cpp. Also, see how those are used in the genProduct() function in the same file. That way, real*10, real*16, complex*10, complex*16, integer*16 will just work if/when they are supported by the target.

Ah, thank you for pointing this out!

How can I test that it works? This is what I get for real(kind=10):

'fir.convert' op invalid type conversion

That's on an X86-64 machine. I would hope that we can force the frontend to generate this code regardless of whether the host systems supports extended real numbers.

You should not get a fir error. The code should compile and link if the runtime routine is present for the target (although you may not get correct answers if the width is not supported natively by the target), or you should get a link time error. We have some regression tests in flang/test/Lower/intrinsic-procedures.f90 that use these higher precision data types (i.e., sum_test4 and product_test4 use complex(10), and spacing_test2 uses real*10). But we are just testing the fir in intrinsic-procedures.f90; not whether the code links and runs.

If you test something like real*16, you should get a link time error on X86-64 for the missing routine. For example, consider the following SUM code that operates on real*16. It compiles successfully but then you get a link time error.

program p
real*16:: x(2)

x(1) = 1.
x(2) = 9.

print *, sum(x)

end

When I compile it, I get the expected link time error on X86-64:

sum16.f90.o: In function _QQmain': /local/home/mleair/fir-dev/tests/sum/<stdin>:26: undefined reference to _FortranASumReal16'
collect2: error: ld returned 1 exit status

Thank you for confirming this and for all the references. There was a bug in my implementation - I was using incorrect function type in ForcedDotProductReal{N}. This is now fixed and tested!

flang/lib/Lower/IntrinsicCall.cpp

flang/test/Lower/intrinsic-procedures.f90

flang/lib/Lower/IntrinsicCall.cpp

banach-space · 2021-07-09T17:00:59Z

Thank you for taking a look @mleair ! I still would like to add tests for real10, real16 and integer*16.

Lowering for complex does not work yet. What should I be returning in this case? A 2-element array?

I am also missing lowering for logical vectors. Are there any examples for these?

mleair · 2021-07-09T21:52:16Z

Thank you for taking a look @mleair ! I still would like to add tests for real_10, real_16 and integer*16.

Lowering for complex does not work yet. What should I be returning in this case? A 2-element array?

I am also missing lowering for logical vectors. Are there any examples for these?

I suggest taking a look at some of the implemented intrinsics that handle these data types. For example, sum and product both handle complex results. The results are handled in a result descriptor. You can see how it is handled in the genSum() and genProduct() routines in ReductionRuntime.cpp. The key is that they need a result descriptor while the other scalar data type versions do not. The function genProdOrSum() in IntrinsicCall.cpp creates this result descriptor when the result is Complex.

banach-space · 2021-07-15T15:53:53Z

@mleair,intrinsic-procedures.f90 is quite large and covers both Fortran 90 and 95. How about splitting it into multiple files? Perhaps in a subdirectory in flang/test/Lower/? I would be very much in favour of moving the tests for dot_product into a dedicated file. But I don't want to get ahead of myself - so whatever you think works best!

mleair · 2021-07-16T17:48:22Z

@mleair,intrinsic-procedures.f90 is quite large and covers both Fortran 90 and 95. How about splitting it into multiple files? Perhaps in a subdirectory in flang/test/Lower/? I would be very much in favour of moving the tests for dot_product into a dedicated file. But I don't want to get ahead of myself - so whatever you think works best!

Fortan 95 is just a small incremental change to Fortran 90, so I don't think there's much benefit for making a split between these two languages. Maybe a split can happen when we start working on intrinsics that were introduced in Fortran 2003. @schweitzpgi and @jeanPerier might have some ideas here.

banach-space · 2021-07-17T19:07:00Z

@mleair,intrinsic-procedures.f90 is quite large and covers both Fortran 90 and 95. How about splitting it into multiple files? Perhaps in a subdirectory in flang/test/Lower/? I would be very much in favour of moving the tests for dot_product into a dedicated file. But I don't want to get ahead of myself - so whatever you think works best!

Fortan 95 is just a small incremental change to Fortran 90, so I don't think there's much benefit for making a split between these two languages. Maybe a split can happen when we start working on intrinsics that were introduced in Fortran 2003. @schweitzpgi and @jeanPerier might have some ideas here.

My main rationale for splitting it would be the fact that it is so large. There are benefits to having fewer cases per test file:

it is easier to triage failures in CI (LLVM BuildBots report files that fail, fewer tests per file means less work to identify the failing case)
if one test case fails, only a small number of other tests in our CI are affected (currently, if one case from intrinsic-procedures.f90 is failing, nothing else is being tested from that file while the bug is being fixed)
LIT variables (e.g. %[[x:arg0]] ) are never shared between cases (currently, one test case pollutes all other cases with its variables)
it's a bit easier for humans to parse tests (this is very subjective, but definitely the case for me)

There are good reasons to keep everything in one file too and a lot of it is just a matter of preference.

The actual reason to bring this up in this patch is that I'm adding 244 lines of tests. That's quite a lot and personally I'd prefer to move it to a separate file. But that's nothing major and I'm happy to keep things as is.

jeanPerier · 2021-07-19T08:34:35Z

My main rationale for splitting it would be the fact that it is so large. There are benefits to having fewer cases per test file:

it is easier to triage failures in CI (LLVM BuildBots report files that fail, fewer tests per file means less work to identify the failing case)

if one test case fails, only a small number of other tests in our CI are affected (currently, if one case from intrinsic-procedures.f90 is failing, nothing else is being tested from that file while the bug is being fixed)

LIT variables (e.g. %[[x:arg0]] ) are never shared between cases (currently, one test case pollutes all other cases with its variables)

it's a bit easier for humans to parse tests (this is very subjective, but definitely the case for me)

There are good reasons to keep everything in one file too and a lot of it is just a matter of preference.

The actual reason to bring this up in this patch is that I'm adding 244 lines of tests. That's quite a lot and personally I'd prefer to move it to a separate file. But that's nothing major and I'm happy to keep things as is.

I agree with the points you are bringing, splitting it sounds reasonable to me. One test file per intrinsic may be a bit overkill, especially for the tiny ones. But I am not sure what would be the good split. I would like a split that is easy to reason about, so that it's clear for people where they must add their tests, and it is easy to audit which intrinsics have a test. The standard version split may be a bit annoying to follow and may still lead to big test files.

mleair · 2021-07-19T11:25:39Z

My main rationale for splitting it would be the fact that it is so large. There are benefits to having fewer cases per test file:

it is easier to triage failures in CI (LLVM BuildBots report files that fail, fewer tests per file means less work to identify the failing case)

if one test case fails, only a small number of other tests in our CI are affected (currently, if one case from intrinsic-procedures.f90 is failing, nothing else is being tested from that file while the bug is being fixed)

LIT variables (e.g. %[[x:arg0]] ) are never shared between cases (currently, one test case pollutes all other cases with its variables)

it's a bit easier for humans to parse tests (this is very subjective, but definitely the case for me)

There are good reasons to keep everything in one file too and a lot of it is just a matter of preference.

The actual reason to bring this up in this patch is that I'm adding 244 lines of tests. That's quite a lot and personally I'd prefer to move it to a separate file. But that's nothing major and I'm happy to keep things as is.

It might make sense to split along some alphabetical order. For example, intrinsics that start with the letters 'A' through 'M' in one file and 'N' through 'Z' in another file. Or depending on the number of intrinsics per letter, split into more than two files.

psteinfeld · 2021-07-19T13:44:41Z

It would be nice to have the same file structure (as much as practical) across all of the tests -- semantics, lowering, runtime. Is there an intern who might be willing to take this on?

banach-space · 2021-07-19T20:57:13Z

I think that having one file per intrinsic would be the most intuitive approach. There are multiple precedents for that in llvm-project (e.g. SVE intrinsics in Clang). This approach makes triaging failures very easy. It also speeds-up testing (LIT will run tests in parallel).

It's not too much work to implement: #927. By the way, shall we move this discussion to #927 ? I've created that PR to make it easier for us to see the possible options. Please reject it if you don't like my suggestion.

As for dot_product, does this look ready for you?

kiranchandramohan · 2021-07-18T20:49:06Z

flang/include/flang/Lower/ReductionRuntime.h

@@ -120,6 +120,12 @@ mlir::Value genProduct(Fortran::lower::FirOpBuilder &builder,
                       mlir::Location loc, mlir::Value arrayBox,
                       mlir::Value maskBox, mlir::Value resultBox);

+/// Generate call to Product intrinsic runtime routine. This is the version


Nit: Product -> DotProduct

kiranchandramohan · 2021-07-20T16:10:01Z

flang/lib/Lower/ReductionRuntime.cpp

+  auto sourceLine =
+      Fortran::lower::locationToLineNo(builder, loc, fTy.getInput(3));
+  auto args = Fortran::lower::createArguments(
+      builder, loc, fTy, vectorABox, vectorBBox, sourceFile, sourceLine);


I see the same in genSum and genProd, but why is the resultBox not included here?

In this case, it's similar for product: https://github.com/flang-compiler/f18-llvm-project/blob/fir-dev/flang/lib/Lower/ReductionRuntime.cpp#L712-L713. resultBox is included when if (fir::isa_complex(eleTy)) returns true (L863).

Correct. resultBox is only used with the complex case in these functions.

mleair

Looks good. Please address Kiran's nit and squash your commits before merging. Thanks!

banach-space · 2021-07-22T13:24:24Z

Rabased, squashed, [nit] addressed.

All tests from flang/test/Lower/intrinsic-procedures.f90 are moved to dedicated files (with the exception of `lge`, `lgt`, `lle` and `llt`). With this change, every filename clearly defines which intrinsic procedure is being tested. After the initial split, we discovered that some tests were invalid (e.g. `! CHECK` was used instead of `! CHECK:`). These tests are now fixed. Some tests are additionally hardened (e.g. some `CHECK-DAG` directives was replaced with `CHECK`). Some tests are updated with checks generated with mlir/utils/generate-test-checks.py. In other words, this patch _moves_ and _reformats_ the tests. The documentation is updated accordingly (with some extra clarifications, as suggested by the reviewers). Please see the following PRs for a complete discussion: * #927 * #914

banach-space requested review from jeanPerier and schweitzpgi July 8, 2021 19:06

banach-space commented Jul 8, 2021

View reviewed changes

schweitzpgi requested review from mleair, psteinfeld and vdonaldson July 8, 2021 19:48