Ignore JavaInteropSpeedTest. #112
Conversation
b01d27c to 16f5898
Is there anyone who knows how to "convert this into a microbenchmark"? Who will be responsible for that?
@jtulach I can do that (actually I have already done it and wanted to add it to this PR, but then we stumbled over #113). But there are two questions. First, the benchmark only makes sense if executed with the Graal Truffle implementation, right? If so, I would put the benchmark inside the Graal source tree; the infrastructure is already there. (If we put it into Truffle, we need to add a JMH dependency.) Second, and even more important: who is going to monitor the benchmark? The unit-test variant was flaky for weeks, and it turned out it was not doing anything (because of #113).
As far as monitoring goes, I'd suggest following the work done in #101: it allows downstream projects to run the tests they need and still provide feedback to each Truffle pull request.
Sounds promising! But we need to keep in mind that this is not a test that says pass or fail; it is a performance number. Deciding automatically whether there is a regression is not trivial, IMHO.
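To make the "deciding automatically is not trivial" point concrete, here is a minimal sketch of one possible decision rule: compare the medians of two sample sets and only flag a slowdown beyond a noise tolerance. This is purely illustrative; the class and method names (`Benchstat`, `isRegression`) are hypothetical and not part of Truffle or Graal, and real infrastructure would want a proper statistical test rather than a fixed threshold.

```java
import java.util.Arrays;

// Hypothetical sketch of an automated regression decision: compare the
// median timing of a candidate run against a baseline run, tolerating a
// fixed amount of measurement noise. All names here are illustrative.
public class Benchstat {

    // Median is preferred over mean because it is robust against the
    // occasional outlier (GC pause, background load) typical of timing data.
    static double median(double[] samples) {
        double[] s = samples.clone();
        Arrays.sort(s);
        int n = s.length;
        return n % 2 == 1 ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    /**
     * Returns true if the candidate's median time exceeds the baseline's
     * median by more than the given tolerance (e.g. 0.05 = 5% slower).
     */
    static boolean isRegression(double[] baseline, double[] candidate, double tolerance) {
        return median(candidate) > median(baseline) * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        double[] baseline = {10, 10, 11, 10, 10};   // ns/op samples, made up
        double[] candidate = {12, 13, 12, 12, 13};  // clearly slower run
        System.out.println(isRegression(baseline, candidate, 0.05)); // prints "true"
    }
}
```

The weakness the comment alludes to is visible even here: the tolerance is arbitrary, and noisy machines can push results past any fixed threshold, which is exactly why a flaky unit test asserting on timings keeps failing.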
The alternative is to have the test code in the Truffle repository and only execute it if the optimizing runtime is present, plus configure Travis to get the latest binary of graal-core and run …
Well, it is not a unit test, it is an …
The Truffle repository contains various distributions. Adding the JMH dependency to …
…/truffle:fix_dsl_for_eclipse_neon to master * commit '9d6708adf9d1a57d32f15ecd5bf2063790d83244': Fix DSL JDT compiler support for Eclipse Neon.
…inter. A trapping nullcheck might generate two uncompress instructions for the same compressed oop in Graal. One is inserted by the backend when it emits the nullcheck: if the pointer is a compressed object, it is uncompressed before the nullcheck is emitted. The other is generated by the normal uncompressing operation, so the two instructions duplicate each other. The generated code on AArch64 looks like:

    ldr w0, [x0,#112]
    lsl x2, x0, #3     ; uncompressing (first)
    ldr xzr, [x2]      ; implicit exception: deoptimizes
    ......             ; fixed operations
    lsl x0, x0, #3     ; uncompressing (second)
    str w1, [x0,#12]

A simple way to avoid this is to apply the nullcheck to the uncompressed result, if it exists, instead of to the compressed pointer when generating the trapping nullcheck. With the modification, the code above can be optimized to:

    ldr w0, [x0,#112]
    lsl x0, x0, #3     ; uncompressing
    ldr xzr, [x0]      ; implicit exception: deoptimizes
    ......             ; fixed operations
    str w1, [x0,#12]

Change-Id: Iabfe47bbf984ed11c42555f84bdd0ccf2a5bdddb
A trapping nullcheck might generate two uncompress instructions for the same compressed oop on AArch64. One is inserted by the backend when it emits the nullcheck: if the object is a compressed pointer, it is uncompressed before the nullcheck is emitted. The other is generated by the uncompression node used for memory access, so the two instructions duplicate each other. The generated code on AArch64 looks like:

    ldr w0, [x0,#112]
    lsl x2, x0, #3     ; uncompressing (first)
    ldr xzr, [x2]      ; implicit exception: deoptimizes
    ......             ; fixed operations
    lsl x0, x0, #3     ; uncompressing (second)
    str w1, [x0,#12]

A simple way to avoid this is to create a new uncompression node for the nullcheck and let value numbering remove the duplicate if possible. Since the address lowering on AMD64 can handle the uncompressing computation for an address, the created uncompression node is wrapped in an address node and the nullcheck is finally applied to the address. With the modification, the code above can be optimized to:

    ldr w0, [x0,#112]
    lsl x0, x0, #3     ; uncompressing
    ldr xzr, [x0]      ; implicit exception: deoptimizes
    ......             ; fixed operations
    str w1, [x0,#12]

Change-Id: Iabfe47bbf984ed11c42555f84bdd0ccf2a5bdddb
…ontext-per-example Bugfix: use nested context for each example
The test tries to find performance regressions in the Java interop implementation by comparing invocation times, which are highly system dependent, and it fails regularly on different machines. Doing this in a unit test is suboptimal; it should be converted into a microbenchmark.
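The discussion above proposes JMH as the proper tool for this. To show what "convert into a microbenchmark" involves in principle, here is a hand-rolled sketch: warm up so the JIT compiles the hot path, then measure several iterations and report per-operation times instead of asserting a hard pass/fail threshold. Everything here is illustrative, assuming a stand-in workload (`workload`) in place of the real interop invocation; JMH does all of this far more rigorously (forked JVMs, dead-code elimination guards, statistics) and is what the commenters actually suggest adding.

```java
// Illustrative sketch of a microbenchmark harness. Class and method names
// are hypothetical; the real fix would use JMH rather than hand-rolled timing.
public class InteropMicrobench {

    // Stand-in for the interop call whose speed the original unit test checked.
    static long workload(long x) {
        return x * 31 + 7;
    }

    /** Runs the workload `ops` times and returns the average time in ns/op. */
    static double measure(int ops) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            sink += workload(i);
        }
        long elapsed = System.nanoTime() - start;
        // Consume the result so the JIT cannot eliminate the loop as dead code.
        if (sink == 42) System.out.println(sink);
        return (double) elapsed / ops;
    }

    public static void main(String[] args) {
        // Warmup iterations: let the JIT compile and optimize the hot loop
        // before any measurement is recorded.
        for (int i = 0; i < 5; i++) {
            measure(1_000_000);
        }
        // Measurement iterations: report numbers, do not assert thresholds.
        for (int i = 0; i < 3; i++) {
            System.out.printf("iteration %d: %.2f ns/op%n", i, measure(1_000_000));
        }
    }
}
```

The key contrast with the flaky unit test is that this produces performance numbers for a monitoring system to track over time, rather than a binary verdict tied to one machine's timing behavior.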