Compiler is failing to track source position for NewInstance nodes #3620

adinn · 2021-07-23T13:43:04Z

The current GraalVM head is manifesting a problem with tracking source positions for compiled Java methods that affects debugging in native images.

During compilation a NewInstance node is often replaced by a VirtualInstance node.The source position (NodeSourcePosition) associated with the NewInstance is not copied across to the VirtualInstance. As a consquence the caller information in the SourceMapping for the compiled method that relates to the inlined allocation instructions ends in a root call that has no associated file or line number.

Manifesting this problem without the latest in progress debuginfo update is complicated. It can only be done by using a debugger to inspect the SourceMapping of a CompilationResult for a method that includes a new operation. The mapping will include one or more NodeSourcePosition objects for the inlined allocation. These will have caller chains that have valid inline positions for methods like GenScavengeAllocationSnippets::readTlabTop, AllocationSnippets.allocateInstanceImpl etc but the inlined caller chains all end with a root NodeSourcePosiiton for the top level method that is marked as a placeholder and has bci = -1. This root called should actually identify the bci for the new operation in the outer compiled method.

The latest debug info code manifests the problem more immediately. At the point where a new is inlinedin the Hello debugger test the code produces this backtrace:

#0  com.oracle.svm.core.genscavenge.graal.GenScavengeAllocationSnippets::readTlabTop () at com/oracle/svm/core/genscavenge/graal/GenScavengeAllocationSnippets.java:129
#1  org.graalvm.compiler.replacements.AllocationSnippets::allocateInstanceImpl () at org/graalvm/compiler/replacements/AllocationSnippets.java:59
#2  com.oracle.svm.core.graal.snippets.SubstrateAllocationSnippets::allocateInstance () at com/oracle/svm/core/graal/snippets/SubstrateAllocationSnippets.java:124
#3  0x0000000000405087 in Hello$Greeter::greeter(java.lang.String[] *) ()
#4  0x0000000000405593 in Hello::main(java.lang.String[] *) () at Hello.java:43
#5  0x000000000041b9e5 in com.oracle.svm.core.JavaMainWrapper::runCore () at com/oracle/svm/core/JavaMainWrapper.java:146
#6  com.oracle.svm.core.JavaMainWrapper::run () at com/oracle/svm/core/JavaMainWrapper.java:182
#7  0x000000000041b9e5 in com.oracle.svm.core.code.IsolateEnterStub::JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b(int, org.graalvm.nativeimage.c.type.CCharPointerPointer *) ()

A proposed patch that simply copies the the node source position across from the NewInstance to the VirtualInstance at the point of replacement corrects the backtrace as follows

#0  com.oracle.svm.core.genscavenge.graal.GenScavengeAllocationSnippets::readTlabTop () at com/oracle/svm/core/genscavenge/graal/GenScavengeAllocationSnippets.java:129
#1  org.graalvm.compiler.replacements.AllocationSnippets::allocateInstanceImpl () at org/graalvm/compiler/replacements/AllocationSnippets.java:59
#2  com.oracle.svm.core.graal.snippets.SubstrateAllocationSnippets::allocateInstance () at com/oracle/svm/core/graal/snippets/SubstrateAllocationSnippets.java:124
#3  Hello$Greeter::greeter(java.lang.String[] *) () at Hello.java:12
#4  0x0000000000405593 in Hello::main(java.lang.String[] *) () at Hello.java:43
#5  0x000000000041b9e5 in com.oracle.svm.core.JavaMainWrapper::runCore () at com/oracle/svm/core/JavaMainWrapper.java:146
#6  com.oracle.svm.core.JavaMainWrapper::run () at com/oracle/svm/core/JavaMainWrapper.java:182
#7  0x000000000041b9e5 in com.oracle.svm.core.code.IsolateEnterStub::JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b(int, org.graalvm.nativeimage.c.type.CCharPointerPointer *) ()

Note that in the second backtrace the root of the inline tree for pc 0x405087 at frame #3 is showing the correct file and line number, Hello.java:12.

The text was updated successfully, but these errors were encountered:

adinn · 2021-07-23T14:06:25Z

@christianwimmer @peter-hofer

I have a simple patch for this issue which resolves the problem that manifests in the latest debugger code. It is clearly along the right lines. However, I am not sure if the proposed code is all that is needed or that it is being executed at the right place/in the right way.

Could one or other of you please link the patch to this issue and either review it or get someone in the compiler team to review it?

Also, I am concerned that there may be other points (now or, perhaps, in the future) where the compiler fails to carry across source position info when it is doing graph transformations and that this will impact on the quality of debugging. I'd prefer to try to find some way of automatically testing that substitutions and plugin replacements don't do that. Perhaps we can add this to our discussion agenda?

adinn · 2021-07-23T14:08:27Z

@zakkak A fix to this issue is needed to ensure that the gdb displays the correct file and line for the root method of an inline method hiererachy when using the latest version of the DWARF inline method info generation code.

peter-hofer · 2021-07-26T08:40:33Z

@dougxc @tkrodriguez please let us know how we can best improve node source tracking for debug information generation and have a look at #3621.

dougxc · 2021-07-26T09:19:50Z

I'm not sure that are many strategies for improving source node position info apart from finding places where it is missing and fixing them.
@tkrodriguez maybe there's some extra verification code we can add to detect missing propagation of source position info?

adinn · 2021-07-26T09:33:30Z

@dougxc So does that mean the patch in #3621 is ok to push? I ask because I was hoping there might be some way to ensure source position propagation happens as a side-effect of the graph node replacement (e.g. in this case the call to setSourceNodePosition would happen in the call to tool.replaceWithVirtual). If not then I guess we just have a game of whack-a-mole on our hands.

dougxc · 2021-07-26T09:39:06Z

I like the suggestion of tool.replaceWithVirtual and other graph transforming methods in VirtualizerTool taking responsibility for source node info propagation. What do you think @tkrodriguez ?
As a short term fix, #3621 is ok to merge.

adinn · 2021-07-26T10:45:13Z

@dougxc sounds like a good idea to me. Would you or @tkrodriguez be able to review #3621and shephered it through the internal gate. I would normally ask @olpaw or @christianwimmer but they are both on holiday at present.

dougxc · 2021-07-26T10:54:53Z

I'll shepherd it through once @tkrodriguez signs off on it.

adinn · 2021-07-26T11:19:48Z

@dougxc thanks very much!

peter-hofer · 2021-07-26T13:27:04Z

@dougxc thank you!

tkrodriguez · 2021-07-26T18:06:53Z

Yes to support more accurate information we'd need to tracked it through the PEA process. Currently it's somewhat incomplete I think. It's also a little complicated to get the right answer in all cases since PEA allows merging of allocations so we'd end up just picking one arbitrarily in that case. A more systematic change would be to require passing in a NodeSourcePosition to the VirtualObjectNode constructor and then ensuring that a valid position is correctly passed down or some other piece of code takes care of setting an initial value.

christianwimmer · 2021-07-26T18:16:37Z

@adinn It is a good question what stack trace we actually want to see when we are in the slow path of an allocation. Is it useful to show the stack trace where the original new bytecode was? Or do we want to show the stack trace that is "closest" to the point where escape analysis actually placed the allocation?

tkrodriguez · 2021-07-26T18:47:28Z

I think the answer for something like profiling is that we want to properly attribute the work as a new bytecode. For debugging you might be able to argue that closest makes sense but I'm not sure it's helpful. We will happily float operations like loads, stores and arithmetic out of loops but keep the source position within the loop. I'm not sure it makes sense to treat the allocation differently. I'm not even sure how you express the notion of closest since it may be near other operations that floated to their current location.

adinn · 2021-07-27T09:21:23Z

@christianwimmer I think for this case it makes most sense to keep the inlined allocation calls tree rooted at the source bytecode/line which led to the inline code. That's very much in line with what gcc tries to do when generating debug info for -g -O3 compiled code. With such code the debugger jumps backwards and forwards in the source to reflect the re-ordering of integral source level operations into separate, interleaved strands of generated machine code.

The utility of having debug info in optimized code is not to support debugging of operations in source order -- that's just not an option. It is to help whoever is debugging the machine code to relate that code to the original source in as coherent a manner as possible, wherever that is possible. In this case, rooting the inline tree at some other bytecode/line would just make it harder to work out why this specific machine code was generated.

As @tkrodriguez says there is not always going to be a clear mapping. But that's ok -- propagating source bytecode/line info through to machine code addresses is simply a convenience. It does not need to be completely accurate. It merely needs to reflect some more or less tenuous connection between the final code and an originating point in the source. The imprecision and incompleteness of any such mapping is no reason to give up on providing source position info. We just need to find places where we can retain some sort of link that is helpful, like this one.

Of course, this information is not going to help every developer. The target audience for debug info cannot realistically be people who have no understanding of what the compiler is going to do -- you need some basic knowledge even if you compile with the economy tier. The most important use of debuginfo is not to diagnose and fix errors in the original Java code (we have the JVM and JVM debuggers for that) but to diagnose and fix errors in the way the code has been translated to a compiled native image if/when a disparity between JVM and native execution arises. Whoever is debugging will need, and indeed need to be able, to apply both judgement and a knowledge of the compile process to whatever relation between source and generated code the debug info records.

adinn added the bug label Jul 23, 2021

adinn mentioned this issue Jul 23, 2021

ensure source position for new instance node is tracked #3621

Merged

munishchouhan linked a pull request Jul 26, 2021 that will close this issue

ensure source position for new instance node is tracked #3621

Merged

munishchouhan assigned christianwimmer Jul 26, 2021

munishchouhan assigned munishchouhan and unassigned christianwimmer Jul 30, 2021

graalvmbot closed this as completed in #3621 Aug 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compiler is failing to track source position for NewInstance nodes #3620

Compiler is failing to track source position for NewInstance nodes #3620

adinn commented Jul 23, 2021

adinn commented Jul 23, 2021

adinn commented Jul 23, 2021

peter-hofer commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

peter-hofer commented Jul 26, 2021

tkrodriguez commented Jul 26, 2021

christianwimmer commented Jul 26, 2021

tkrodriguez commented Jul 26, 2021

adinn commented Jul 27, 2021

Compiler is failing to track source position for NewInstance nodes #3620

Compiler is failing to track source position for NewInstance nodes #3620

Comments

adinn commented Jul 23, 2021

adinn commented Jul 23, 2021

adinn commented Jul 23, 2021

peter-hofer commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

dougxc commented Jul 26, 2021

adinn commented Jul 26, 2021

peter-hofer commented Jul 26, 2021

tkrodriguez commented Jul 26, 2021

christianwimmer commented Jul 26, 2021

tkrodriguez commented Jul 26, 2021

adinn commented Jul 27, 2021