[AArch64] Reflection calls from JRuby are slow on GraalVM vs. HotSpot #6600

Open
lewurm opened this issue May 12, 2023 · 4 comments

@lewurm
Member

lewurm commented May 12, 2023

As reported by @headius (see #2666 (comment)), GraalVM executes this JRuby snippet much slower on AArch64 than HotSpot:

Runtime = java.lang.Runtime
loop {
  t = Time.new
  i = 0
  while i < 100_000_000
    i += 1
    Runtime.runtime
  end
  puts Time.now - t
}

I observed a slowdown of ~2x on linux-aarch64 and ~6x on darwin-aarch64 compared to HotSpot (using the same JAVA_HOME that was used to build GraalVM). I did not see any slowdown on darwin-amd64. I could not verify this on linux-amd64; it would be great if someone could check that platform, so we know whether this issue is specific to AArch64.

Instructions on darwin-aarch64:

$ brew install jruby
$ JAVA_HOME=$labs_jdk_of_your_choice jruby -v -Xcompile.invokedynamic -e 'Runtime = java.lang.Runtime; loop { t = Time.new; i = 0; while i < 100_000_000; i += 1; Runtime.runtime; end; puts Time.now - t }'
jruby 9.4.2.0 (3.1.0) 2023-03-08 90d2913fda Java HotSpot(TM) 64-Bit Server VM 20.0.1+9-jvmci-23.0-b10 on 20.0.1+9-jvmci-23.0-b10 +indy +jit [arm64-darwin]
2.379717
2.2683880000000003
2.267346
^C
$ cd $graal_repo/vm
$ JAVA_HOME=$labs_jdk_of_your_choice mx --env ce-aarch64-darwin build
$ JAVA_HOME=`mx --env ce-aarch64-darwin graalvm-home` jruby -v -Xcompile.invokedynamic -e 'Runtime = java.lang.Runtime; loop { t = Time.new; i = 0; while i < 100_000_000; i += 1; Runtime.runtime; end; puts Time.now - t }'
13.344807000000001
13.256523
13.325128
^C
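
As a rough sanity check of the ~6x figure on darwin-aarch64, the per-round timings printed above can be averaged and compared. This is only a sketch: the `mean` helper and the variable names are made up for illustration, and three samples per VM is a very small sample.

```ruby
# Timings (seconds per 100M-iteration round) copied from the darwin-aarch64 runs above.
hotspot = [2.379717, 2.2683880000000003, 2.267346]
graalvm = [13.344807000000001, 13.256523, 13.325128]

# Hypothetical helper: arithmetic mean of a list of samples.
def mean(samples)
  samples.sum / samples.size
end

# Ratio of mean round times; values above 1 mean GraalVM is slower.
slowdown = mean(graalvm) / mean(hotspot)
puts format("GraalVM is %.1fx slower than HotSpot", slowdown)
```

The ratio comes out a little under 6, which matches the "~6x" estimate.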
@dougxc
Member

dougxc commented May 22, 2023

@teshull could you please have a look into this?

@dougxc
Member

dougxc commented May 24, 2023

I've looked into this a bit and it seems to be a problem with the CE inliner. When I run the example with GraalVM EE, the performance is better than C2.

EE:

jruby 9.4.2.0 (3.1.0) 2023-03-08 90d2913fda Java HotSpot(TM) 64-Bit Server VM 21+23-jvmci-23.1-b04 on 21+23-jvmci-23.1-b04 +indy +jit [arm64-darwin]
2.504992
2.017036
1.6439199999999998
1.643141
1.6402519999999998
1.640962
^C⏎

C2 (GraalVM EE with -J-XX:-UseJVMCICompiler):

jruby 9.4.2.0 (3.1.0) 2023-03-08 90d2913fda Java HotSpot(TM) 64-Bit Server VM 21+23-jvmci-23.1-b04 on 21+23-jvmci-23.1-b04 +indy +jit [arm64-darwin]
2.569087
2.3188429999999998
2.241689
2.3638049999999997
2.258704
^C⏎

EE with CE inliner (-J-Dgraal.UsePriorityInlining=false):

jruby 9.4.2.0 (3.1.0) 2023-03-08 90d2913fda Java HotSpot(TM) 64-Bit Server VM 21+23-jvmci-23.1-b04 on 21+23-jvmci-23.1-b04 +indy +jit [arm64-darwin]
13.685284
13.430108
13.498757000000001
^C⏎
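
To put the three configurations side by side, the timings above can be normalized against the C2 run. This is a sketch under one assumption that is not from the measurements themselves: that the last three samples of each run approximate steady state (for the CE-inliner run that is all three samples, so warm-up is not excluded there). The `steady_state` helper is hypothetical.

```ruby
# Timings (seconds per round) copied from the three runs above.
runs = {
  "EE"              => [2.504992, 2.017036, 1.6439199999999998, 1.643141, 1.6402519999999998, 1.640962],
  "C2"              => [2.569087, 2.3188429999999998, 2.241689, 2.3638049999999997, 2.258704],
  "EE + CE inliner" => [13.685284, 13.430108, 13.498757000000001]
}

# Assumption: the trailing `tail` samples are representative of steady state.
def steady_state(samples, tail = 3)
  last = samples.last(tail)
  last.sum / last.size
end

# Express each configuration's steady-state round time as a multiple of C2's.
baseline = steady_state(runs["C2"])
relative = runs.transform_values { |samples| steady_state(samples) / baseline }
relative.each { |name, ratio| puts format("%-16s %.2fx of C2 time", name, ratio) }
```

With these numbers, EE takes less time per round than C2, while EE with the CE inliner takes several times longer, which is consistent with the inliner being the relevant difference.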

@dougxc
Member

dougxc commented May 24, 2023

Use of -Dgraal.TraceInlining=true may reveal which important inlining decisions are going the wrong way.

@dougxc
Member

dougxc commented May 25, 2023

More experimentation shows that this is most likely a profiling and compilation timing issue. In some cases, the relevant invokes do not seem to have a mature profile. Playing around with flags such as -XX:ProfileMaturityPercentage revealed this. Also, just enabling Graal IGV dumping seems to "fix" the performance problem, probably because it delays compilation of the relevant methods long enough for their profiles to mature. The EE inliner is less prone to this because it inlines aggressively even with less profile info.

oubidar-Abderrahim removed their assignment Jul 17, 2023