Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Canonicalize IfNode to ConditionalNode for variable assignment. #884

Merged
merged 1 commit into from
Jun 2, 2020

Conversation

XiaohongGong
Copy link

@XiaohongGong XiaohongGong commented Dec 25, 2018

This patch here wants to canonicalize IfNode to ConditionalNode. We want to use "csel" to implement the conditional value selecting operation for more conditions like Math.min/max, instead of using "cmp + b.cond".

The trueSuccessor and falseSuccessor blocks of IfNode must meet the following conditions:

  1. The nodes in block only contain a start and an end one except for the constant and parameter node.
  2. The merge node only contains a phi member.

All the checking jobs must begin after the final schedule phase is finished, so that the
nodes in each block will not be changed anymore.

Change-Id: I4aa0ae791e05f34e6af3dd1da035e42a97815b9b

@XiaohongGong
Copy link
Author

Hi there,
Could anyone please take a look at this patch? Thanks!

@tkrodriguez
Copy link
Member

This seems like a somewhat intrusive way of accomplishing this. Performing an extra canonicalization round is fairly expensive. Also if I read the code in LowTier correctly the final round shouldn't be performing GVN because of the optimizations performed in FixReadsPhase. If I understand this correctly, the use of a schedule to detect empty blocks is attempting to handle cases where the conditional is the only user of a complex true or false expression? I think that's the reason we only perform the transformation for constants and parameters since otherwise the transformation can introduce extra unconditional work. @gilles-duboscq maybe examining the number of users of the values would work just as well and could done all the time instead of only during the final canonicalizer phase. It also suggests that sometimes it might make sense to convert a ConditionalNode into an explicit if for the same reason we don't perform the reverse transform.

@XiaohongGong
Copy link
Author

@tkrodriguez Thanks for your reviewing and comments! We need to make sure the blocks are all empty, that means in the true and false blocks, the conditional is the only user of the values and the values are not created there, like Math.min/max. The reason that we need to do the check after a final schedule is that we must make sure the blocks will not be changed due to some optimizations. Also the transformation for parameters are not supported yet. Due to the reasons that it's not appropriate to accomplish this transformation here, can we just add the support for parameters or just implement the transformation for Math.min/max functions?

@tkrodriguez
Copy link
Member

I agree that this kind of transform should be done relatively late so we can make sure we don't introduce extra unconditional work but it doesn't have to be done at the very end. We could do it in the canonicalize phase that runs after the FixReadsPhase in LowTier. The schedule right before that isn't a FINAL_SCHEDULE but I don't think that really makes any difference for the cases we care about. The interesting cases are diamonds so the only thing which should be in the true or false branches are values being produced by the conditional. So we could add a flag on CanoniclizerPhase to control this, kind of like canonicalizeReads, and propagate it through CanonicalizerTool and use that to control this optimization. Structurally I think that would be more acceptable. Optimization of conditionals is a real rathole but I agree this specific pattern should always be a win.

@XiaohongGong
Copy link
Author

XiaohongGong commented Jan 9, 2019

I checked that the StructuredGraph has a field isAfterFixedReadPhase, which will be setted true when running FixReadPhase. So I think we can use this flag to directly control this optimization, can we?

@tkrodriguez
Copy link
Member

That seems like a reasonable way to go.

@XiaohongGong
Copy link
Author

@tkrodriguez I updated my patch according to the discussion above. Could you please take a look at it again? Thank you!

@tkrodriguez
Copy link
Member

I'm running it through out internal gate and may do some benchmark runs since using the conditional operators can have surprising performance effects, at least partly because it makes larger blocks where local scheduling can become more important.

@XiaohongGong XiaohongGong force-pushed the csel branch 3 times, most recently from 46e90f2 to 331c5d9 Compare January 21, 2019 03:49
@XiaohongGong
Copy link
Author

@tkrodriguez how is the test going now? Is there any update? Thank you!

@tkrodriguez
Copy link
Member

Sorry I went silent on you. Can you rebase onto the latest master? There's a conflict with some of your changes in IfNode that needs to be resolved and I have some style comments and minor bug fixes.

@tkrodriguez
Copy link
Member

When you merge with latest master the new LLVM based SVM backend is now in this repo and I think these changes have exposed a problem with conditional lowering there. I have a fix for it and if you don't mind I'll just include that fix when I merge your changes. Otherwise I think these changes look ok.

@XiaohongGong XiaohongGong force-pushed the csel branch 2 times, most recently from faa5a8c to 2d58d25 Compare March 11, 2019 07:26
@XiaohongGong
Copy link
Author

When you merge with latest master the new LLVM based SVM backend is now in this repo and I think these changes have exposed a problem with conditional lowering there. I have a fix for it and if you don't mind I'll just include that fix when I merge your changes. Otherwise I think these changes look ok.

What will the problem be? What is the relate backend difference between SVM backend and the current one?

@XiaohongGong XiaohongGong force-pushed the csel branch 4 times, most recently from e2e03d5 to ae66241 Compare March 11, 2019 10:14
@tkrodriguez
Copy link
Member

Basically the LLVM backend in the context of SVM ends up mixing a pointer and a long in a conditional and needs to add a little type conversion to make LLVM happy. It does this in other places but seems to be missing it for conditionals.

@tkrodriguez
Copy link
Member

Your comment about build problems is gone so I assume you got that resolved?

@XiaohongGong
Copy link
Author

Your comment about build problems is gone so I assume you got that resolved?

I resolved it locally to make sure my patch is compiled ok. But I didn't commit the changes here. I wonder whether someone will fix it for aarch64.

@tkrodriguez
Copy link
Member

tkrodriguez commented Mar 12, 2019

I think something like this is probably what's required. Could you try that out?

diff --git a/compiler/mx.compiler/suite.py b/compiler/mx.compiler/suite.py
index d76db6d3f65..26848f3ec2a 100644
--- a/compiler/mx.compiler/suite.py
+++ b/compiler/mx.compiler/suite.py
@@ -240,6 +240,10 @@ suite = {
               "classifier": "linux-x86_64"
             },
           },
+          "aarch64": {
+            "path": "",
+            "optional": True
+          }
         },
         "darwin": {
           "amd64": {

@XiaohongGong
Copy link
Author

XiaohongGong commented Mar 12, 2019

I think something like this is probably what's required. Could you try that out?

diff --git a/compiler/mx.compiler/suite.py b/compiler/mx.compiler/suite.py
index d76db6d3f65..26848f3ec2a 100644
--- a/compiler/mx.compiler/suite.py
+++ b/compiler/mx.compiler/suite.py
@@ -240,6 +240,10 @@ suite = {
               "classifier": "linux-x86_64"
             },
           },
+          "aarch64": {
+            "path": "",
+            "optional": True
+          }
         },
         "darwin": {
           "amd64": {

Yes. Similar but I use "<others>" while not "aarch64".

@XiaohongGong
Copy link
Author

Hi @tkrodriguez , how is this patch going now?

@dougxc
Copy link
Member

dougxc commented Dec 10, 2019

@tkrodriguez can you please give this PR some attention.

@XiaohongGong
Copy link
Author

Hi @tkrodriguez , how is this PR going now? Thank you!

@XiaohongGong
Copy link
Author

Hi @tkrodriguez , how is this PR going now? If it's controversial for this mid-end optimization, so how do you think if we just limit the optimization to Math.min/max? If so, we can simply add the intrinsics for Math on AArch64. Thank you!

@tkrodriguez
Copy link
Member

I'll look at this today and see where it stands. I can't remember what the performance looks like which is my main concern about this change.

@XiaohongGong
Copy link
Author

I'll look at this today and see where it stands. I can't remember what the performance looks like which is my main concern about this change.

Thank you so much that you could look at it again!

@tkrodriguez
Copy link
Member

Could you rebase this branch? I've done a local merge for testing but it would be better to have a clean rebase as there are conflicts.

Brief: We want to use "csel" to implement the conditional value selecting operation
for more conditions like Math.min/max, instead of using "cmp + b.cond".

The trueSuccessor and falseSuccessor blocks of IfNode must meet the following conditions:
1. The nodes in block only contain a start and an end one except for the constant and parameter nodes.
2. The merge node only contains a phi member.

All the checking jobs must begin after the FixReadPhase, so that the nodes in blocks will not be changed.

Change-Id: I4aa0ae791e05f34e6af3dd1da035e42a97815b9b
@XiaohongGong
Copy link
Author

Could you rebase this branch? I've done a local merge for testing but it would be better to have a clean rebase as there are conflicts.

Sure! I'v rebased and updated my patch. Thanks a lot!

@graalvmbot graalvmbot merged commit 37ab446 into oracle:master Jun 2, 2020
@XiaohongGong XiaohongGong deleted the csel branch June 2, 2020 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants