Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Add runtime interface for NUM_IMAGES #1076

Open
wants to merge 1,158 commits into
base: fir-dev
Choose a base branch
from

Conversation

rasmussn
Copy link

NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER. This
patch provides an interface for only the optional TEAM_NUMBER
dummy argument. It is likely that a different function will
be required to support TEAM arguments.

Reviewed By: PeteSteinfeld, klausler

Differential Revision: https://reviews.llvm.org/D109547

psteinfeld and others added 30 commits July 22, 2021 10:41
…project into fir-dev

Getting ready to merge changes for the CSHIFT intrinsic.
This change implements the lowering runtime portion of the CSHIFT
intrinsic.  The implementation of the runtime was done in revision
D106292.
[flang] Implement the lowering portion of the CSHIFT intrinsic
…roduct

Add lowering for the DOT_PRODUCT intrinsic

I confirmed that it builds and runs without issue (regression tests pass and an end-to-end dot product test runs correctly too). Thanks @banach-space !
Initial implementation with assumption that rank will be always 2
In processing the UNPACK intrinsic, my implementation had an unneeded
extra call to createBox().  This change removes it.
-> Handle case where rank of one the matrices is 1.
-> Add test for the above case with logical type.
-> Remove return value from the matmul call.
This change fixes a bug in  the runtime portion of the CSHIFT intrinsic
that happens when the value of the SHIFT argument is negative.

Differential Revision: https://reviews.llvm.org/D106292
[flang] Remove an unneeded call to createBox
When instantiating variables, identify the ones that must be default
initialized at runtime (locals and intent(out) dummies). Call the
default initialization runtime routine to perform the initialization.

Global variables that have no explicit initialization and have a default
initialization are left TODO. The runtime cannot be used for them, the
fir.global generation will need to be updated for these variables (it was
generating an undefined undefined until now). For now, simply catch and
add a hard TODO for these cases. For information, it is firing 13 times
in NAG F95 tests.

Runtime interface for finalization is added, but nothing is done with it
yet (Final procedures are not F95).
…-type-init-2

Lower default initialization of local and dummy entities
[OpenMP][Flang]Add support for modifiers monotonic and non-monotonic
Also insert casts in fir.array_coor op (error revealed by the first fix).
…issue

Fix fir.alloca codegen for array with dynamic type parameters
It does not make sense to embox a box. Instead we can convert the box's
signature with a fir.convert, which is idempotent on the boxed value.
…faces

Since BOZ literal arguments are typeless, we cannot know how to pass them as
actual arguments to procedures with implicit interfaces.  This change avoids
the problem by emitting an error message in such situations.

This change stemmed from the following issue --
  flang-compiler#794

Differential Revision: https://reviews.llvm.org/D106831
According to C7109, "A boz-literal-constant shall appear only as a
data-stmt-constant in a DATA statement, or where explicitly allowed in
16.9 as an actual argument of an intrinsic procedure."  This change
enforces that constraint for output list items.

I also added a general interface to determine if an expression is a BOZ
literal constant and changed all of the places I could find where it
could be used.

I also added a test.

This change stemmed from the following issue --
  https://gitlab-master.nvidia.com/fortran/f18-stage/issues/108

Differential Revision: https://reviews.llvm.org/D106893
When a WRITE overwrites an endfile record, we need to forget
that there was an endfile record.  When doing a BACKSPACE
after an explicit ENDFILE statement, the position afterwards
must be upon the endfile record.

Attempts to join list-directed delimited character input across
record boundaries was due to a bad reading of the standard
and has been deleted, now that the requirements are better understood.
This problem would cause a read attempt past EOF if a delimited
character input value was at the end of a record.

It turns out that delimited list-directed (and NAMELIST) character
output is required to emit contiguous doubled instances of the
delimiter character when it appears in the output value.  When
fixed-size records are being emitted, as is the case with internal
output, this is not possible when the problematic character falls
on the last position of a record.  No two other Fortran compilers
do the same thing in this situation so there is no good precedent
to follow.

Because it seems least wrong, with this patch we now emit one copy
of the delimiter as the last character of the current record and
another as the first character of the next record.  (The
second-least-wrong alternative might be to flag a runtime error,
but that seems harsh since it's not an explicit error in the standard,
and the output may not have to be usable later as input anyway.)
Consequently, the output is not suitable for use as list-directed or
NAMELIST input.

If a later standard were to clarify this case, this behavior will of
course change as needed to conform.

Differential Revision: https://reviews.llvm.org/D106695
NAMELIST I/O formatting uses the runtime infrastructure for
list-directed I/O.  List-directed input processing has same state
that requires reinitialization for each successive NAMELIST input
item.  This patch fixes bugs with "null" items and repetition counts
on NAMELIST input items after the first in the group.

Differential Revision: https://reviews.llvm.org/D106694
cherry pick runtime I/O fixes for namelist array input, backspace, rewind
All tests from flang/test/Lower/intrinsic-procedures.f90 are moved to
dedicated files (with the exception of `lge`, `lgt`, `lle` and `llt`).
With this change, every filename clearly defines which intrinsic
procedure is being tested.

After the initial split, we discovered that some tests were invalid
(e.g.  `! CHECK` was used instead of `! CHECK:`). These tests are now
fixed. Some tests are additionally hardened (e.g. some `CHECK-DAG`
directives was replaced with `CHECK`). Some tests are updated with
checks generated with mlir/utils/generate-test-checks.py. In other
words, this patch _moves_ and _reformats_ the tests.

The documentation is updated accordingly (with some extra
clarifications, as suggested by the reviewers).

Please see the following PRs for a complete discussion:
* flang-compiler#927
* flang-compiler#914
…ate_tests

[flang][nfc] Move tests from intrinsic-procedures.f90 to dedicated files
Delete code that was redundantly removing the elided extents before t…
…r with

a scalar result.

See the following Phabricator review for more information:

https://reviews.llvm.org/D106820
schweitzpgi and others added 9 commits September 16, 2021 13:29
Fix evaluation semantics of FORALL constructs per 10.2.4.2.4.
The added pass works when mlir::LLVM::LLVMDialect, mlir::acc::OpenACCDialect,
and mlir::omp::OpenMPDialect are present, which adds a dependency on those
dialect libraries.

https://github.com/flang-compiler/f18-llvm-project/blob/bb1ab876a5fec1eb3f2ab0603756f9124515a68a/flang/include/flang/Optimizer/Transforms/Passes.td#L189
Define a VectorSubscriptBox class that allow representing and working
with a lowered Designator containing vector subscripts while ensuring
all the subscripts expression are only lowered once.

It's a bit of a super ExtendedValue (but is is not added as such because
it is heavy and its use case is only restricted to IO input so far).

The key point of this class is that it has members functions
`loopOverElements` and `loopOverElementsWhile` that allow creating loops
over each element of the designator, and call a provided callback with
the address of the element.

This class is used in input IO to create an IO runtime call for each
element of the designator, since it is not possible to build a
descriptor for it. The `loopOverElementsWhile` version is required when
error recovery is enabled in IO.

A hidden VectorSubscriptBoxBuilder is in charge of making a custom
Designator<T> visit and create the VectorSubscriptBox. Once this is
done, the VectorSubscriptBox does not equire any front-end data
structures to work with.

The motivation for creating such tool is that the current lowering
array lowering infrastructure is centered around assignment (to a
variable or temporary), and could not be used here to created the loops
over the IO runtime calls without some non trivial modification to it.
Adding complexity to the array expression lowering framework to cover
a corner case did not appear a good idea.

The option of creating a temp, passing it to the runtime, and copying it
back was explored, but it was not possible to guarantee that the
subscript would be evaluated only once given there was no way to "keep
the lowered representation" of the designator between the temp creation
and the copy back (the temp creation requires evaluating the susbcripts
to compute the shape in general). However, note that with the added
VectorSubscriptBox, it would actually now also be possible to deploy a
temp + copy back mechanism. This can be done later if it appeared
beneficial in real world program.

The only cases left TODOs are the cases when one of the Components
is a parent type reference (not yet handled properly in the general case),
the coarray case, and the PDT case (not sure exactly how type params
will be threaded in fir.field_index). All other cases are covered, and
I tried to add exhaustive tests since I do not expect real world program
to be very harsh on this utility (most will just do READ(*, *) x(y)).
…-io-2

Lower IO input with vector subscripts
Also fix some tests that were walls of CHECK-DAG checks.
Changes so that temporaries do not use uniq_name.
- Some file name size made it into transfer.f90 test in flang-compiler#1070 test update
through the fir.char<?,90> type. This makes the test dependent on where
LLVM directory is. Replace fir.char<?,90> by fir.char<?,{{.*}}>

- vector-subscript-io.f90 was not updated with the uniq_name ->
bindc_name change in flang-compiler#1070.
Use the array lowering framework to lower elemental subroutine.
This required:

- Splitting the part of genIterSpace that creates the
  implicit loops into a new genImplicitLoops function. genIterspace
  deals with the destination and other explicit contexts set-ups that
  do not matter in the subroutine case (there is no destination).
  Other than having no destination, a big difference is that there is
  no threaded innerArg. genImplicitLoops handles this variation.

- Fixing a bug in the iteration shape determination. The code was
  assuming there would always be at least one array_load, which is
  not true if there are only RefTransparent operands. This is always
  true in the subroutine case, but was actually also an issue for things
  like `print *, elem_func(array)` with the previous code (compile crash).
  Add a new ArrayOperands structure to keep track of what matters to
  later deduce the shape of the array expression instead of keeping
  track of array_loads for that purpose.

- Elemental subroutine with intent(out)/intent(inout) arguments must be
  applied "in array element order" (15.8.3). It turns out this is also
  the case for impure elemental functions (10.1.4 p5). Instead of always
  creating loops as "unordered=true" add an unordered field to the ArrayExprLowering
  class. It is set to true by default, but will be set to false if any
  impure elemental function is lowered, or in case of
  intent(out)/intent(inout) arguments in elemental subroutines.

- Last, instead of using createFirExpr to lowering subroutine calls, use
  a new createSubroutineCall that deals with the array/scalar
  dispatching.

A TODO is added for the user defined elemental assignment, because
overlap analysis between RHS/LHS must be done, and this require somehow
plugging this special and very restricted case of elemental calls into
the array_load/array_update/array_merge framework or some ad-hoc
lowering.
…ub-2

Lower elemental subroutine calls (outside of user assignment case)
@jeanPerier
Copy link
Collaborator

This interface looks good to me, however it looks like it did not make it in LLVM as part of https://reviews.llvm.org/D109547 (I see the runtime discussed in the thread, but I cannot see the runtime change itself in the last diff or in https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev/flang/include/flang/Runtime). Was this intended ?

NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER.  This
patch provides an interface for only the optional TEAM_NUMBER
dummy argument.  It is likely that a different function will
be required to support TEAM arguments.

Reviewed By: PeteSteinfeld, klausler

Differential Revision: https://reviews.llvm.org/D109547
@rasmussn
Copy link
Author

This interface looks good to me, however it looks like it did not make it in LLVM as part of https://reviews.llvm.org/D109547 (I see the runtime discussed in the thread, but I cannot see the runtime change itself in the last diff or in https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev/flang/include/flang/Runtime). Was this intended ?

I don't recall at this point why I thought I had to commit it to fir-dev. Probably because I have the implementation for NUM_IMAGES done and this will definitely have to go in fir-dev. However, I previously pushed

[flang] Make 'this_image()' an intrinsic function

on Thu Sep 16 17:21:40 and this isn't if fir-dev yet. So perhaps I thought that by submitting a pull request to fir dev I could use the the file as soon as it was accepted rather that waiting around for the file to show up in fir-dev.

I'm a bit confused at the moment how this all works. I have several intrinsic functions I've been working on that I'd like to get into fir-dev and I'm finding that the coding is the easiest part. Now that I have commit privileges on the llvm-project itself it is straight forward for me to push stuff there.

I think I'll just go ahead and submit a pull request for the whole implementation of NUM_IMAGES since the interface has already been accepted.

@rasmussn rasmussn reopened this Sep 23, 2021
This commit provides a single-image implementation of NUM_IMAGES.
It returns 1.
@rasmussn
Copy link
Author

I've added the implementation for NUM_IMAGES. It will likely be easier to understand the interface for num_images() in coarray.h in context of the implementation. The implementation is for execution on a single image only, in which case the function returns 1.

@schweitzpgi
Copy link

Hi Craig,

At this time we are typically cherry picking changes from the main branch onto the fir-dev branch when and where it makes sense to do so. Lowering code is not upstreamed, so changes to that part of the compiler come in as PRs like this one. It looks like this PR could've cherry picked over a change or two, but don't worry about it at this point.

Thanks for the help with coarrays.


int RTNAME(NumImages)(int team,
const char *sourceFile, int sourceLine) {
std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr);
Copy link
Author

@rasmussn rasmussn Sep 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should go away along with the include of C++ header file.

const char *sourceFile, int sourceLine) {
std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr);
// return value for single image (default coarray runtime)
return 1;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error checking needs to be added. I need to think through what values team can take on a single image.

@rasmussn
Copy link
Author

This pull request should wait until https://reviews.llvm.org/D110356 is merged into the main branch. It makes several changes related to TEAM_TYPE in flang/lib/Evaluate/intrinsics.cpp that will be useful here.

@jeanPerier
Copy link
Collaborator

Hi, I have just rebased fir-dev on LLVM 85bf221 from last week.
That means you will need to update this PR with the rebased fir-dev.
You cannot simply rebase your PR with fir-dev (it will try to reapply all fir-dev changes).

You can update the PR as follow (assuming your branch is called my-branch):

  • backup your branch into my-branch-bckup (just to be safe :) )
  • create a new branch my-branch-new following the new fir-dev head
  • cherry-pick all your commits from this PR into my-branch-new
  • once it looks good, force push my-branch-new into the upstream my-branch (git push origin my-branch-new:my-branch -f )

Sorry for the inconvenience, I could not wait for every PR to get in and solve all new conflicts on my side, the rebase
is meant to help upstreaming so that we can avoid having this too many times again in the future.

Copy link

@schweitzpgi schweitzpgi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR requires a rebase.

Copy link

@schweitzpgi schweitzpgi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@clementval clementval removed their request for review July 17, 2024 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.