Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward declared Phi instructions - with dodgy workaround ;) #89

Closed
justinclift opened this issue Jun 8, 2019 · 7 comments
Closed

Forward declared Phi instructions - with dodgy workaround ;) #89

justinclift opened this issue Jun 8, 2019 · 7 comments
Milestone

Comments

@justinclift
Copy link
Contributor

justinclift commented Jun 8, 2019

Stumbled on an interesting case with Phi instructions, and figured out a dodgy workaround that is probably good enough for now.

Hoping there's a better way, just in case. 😄

Scenario: attempting to generate some LLVM IR with the following structure:

entry:
  [stuff]

for.loop:
  %2 = phi i32 [ 0, %entry ], [ %6, %for.body ]
  [stuff]

for.body:
  [stuff]
  %6 = add nuw i32 %2, 1

Note that %2 is referenced by the %6 creation line, and the %6 is referenced by the %2 creation line.

It seems like just declaring the Go container for %6 before it's called would work:

var val6 *ir.InstAdd

entryBlk := x.NewBlock("entry")
// ...

forLoopBlk := x.NewBlock("entry")
val2 := forLoopBlk.NewPhi(ir.NewIncoming(i32zero, entryBlk), ir.NewIncoming(val6, forBodyBlk))
// ...

forBodyBlk := x.NewBlock("entry")
val6 = forBodyBlk.NewAdd(val2, constant.NewInt(types.I32, 1))

Indeed, this compiles.

However, fmt.Println() on the module fails:

PANIC=String method: runtime error: invalid memory address or nil pointer dereference

After some investigation, the problem turned out to be the String() method here:

return fmt.Sprintf("[ %s, %s ]", inc.X.Ident(), inc.Pred.Ident())

It assumes the Phi Incoming has already had it's contents set, thus barfing when that's not the case (like above). In the above case, X is still nil at this point. 😉

The dodgy workaround was to add some bogus contents to the variable (without adding them to the block), so String() works on it ok. eg:

val6 = &ir.InstAdd{X: i32zero}
val6.LocalID = 6
val2 := forLoopBlk.NewPhi(ir.NewIncoming(i32zero, entryBlk), ir.NewIncoming(val6, forBodyBlk))

Is there a better approach? 😄

@justinclift
Copy link
Contributor Author

Gah, ignore this. After coming back to it, the problem was obvious:

var val6 *ir.InstAdd

should be:

val6 := &ir.InstAdd{}

Oops. 😉

@mewmew
Copy link
Member

mewmew commented Jun 8, 2019

It seems like just declaring the Go container for

val6 = forBodyBlk.NewAdd(val2, constant.NewInt(types.I32, 1))

The reason that this fails is that NewAdd allocates a new &ir.InstAdd{...} reference. Thus, the "old" pointer value is never used and it will panic during printing, most likely since the "old" pointer value var val6 *ir.InstAdd is nil. Even if the old pointer value was instantiated to an allocated struct, e.g. var val6 = &ir.InstAdd{}, the pointer to this allocated value would be different from that allocated by NewAdd, and moreover, the fields of var val6 = &ir.InstAdd{} are all zero, thus it's X field would also be nil (which is used to compute the type of the instruction), and produce a panic later on.

Unfortunately. solving these kind of circular reference cases is tricky.

I would probably do something along the lines of:

entryBlk := x.NewBlock("entry")
// ...

forLoopBlk := x.NewBlock("entry")
forBodyBlk := x.NewBlock("entry")
val6 := forBodyBlk.NewAdd(nil, constant.NewInt(types.I32, 1)) // <<<--- notice that we use nil as a dummy value here.

val2 := forLoopBlk.NewPhi(ir.NewIncoming(i32zero, entryBlk), ir.NewIncoming(val6, forBodyBlk))
val6.X = val2 // <<--- and set the actual value of val6.X here.
// ...

@mewmew
Copy link
Member

mewmew commented Jun 8, 2019

Gah, ignore this. After coming back to it, the problem was obvious:

var val6 *ir.InstAdd

should be:

val6 := &ir.InstAdd{}

Oops. wink

That would solve the first panic, but, the value would be the wrong one. Since you would effectively and up with the different *ir.InstAdd pointers. The one created at val6 := &ir.InstAdd{} (that is added as operand to ir.NewIncoming, and the other one created by the NewAdd method. As such, these values would not be kept in sync. Say you remove an instruction and want to recompute the IDs for unnamed instructions. Then one would have %5, which the other would quite likely still have %6.

So, as we would say in Swedish "Håll tungan rätt i munnen" when working with circular references in data structures. ;) The translation of which for those not native in Swedish would be along the lines of Keep the tongue straight in the mouth, for added precision.

@mewmew
Copy link
Member

mewmew commented Jun 8, 2019

After some investigation, the problem turned out to be the String() method here:

Oh, and a quick tip. When you get panics like the following:

PANIC=String method: runtime error: invalid memory address or nil pointer dereference

An easy way to find the line number causing the crash is by simply replacing fmt.Println(m) with m.String().

-fmt.Println(m) // produces `PANIC=String method: runtime error: invalid memory address or nil pointer dereference` message, but no stack trace.
+m.String()     // produces stack trace with source file name and line numbers.

@justinclift
Copy link
Contributor Author

I would probably do something along the lines of ...

Yeah, it moves things out of order though, so kills readability. 😦

That would solve the first panic, but, the value would be the wrong one.

The easy workaround for that was to just set it directly, as I know in advance the value it needs to be:

val6.LocalID = 6 

Cheating? Yep.

In cases where the value isn't known in advance though, such as generating new LLVM IR instead of trying to match existing IR... the correct approach you gave above would probably be better. 😄

@justinclift
Copy link
Contributor Author

An easy way to find the line number causing the crash is by simply replacing fmt.Println(m) with m.String().

Thanks, just tried it and that helps heaps. 😄

@mewmew
Copy link
Member

mewmew commented Jun 8, 2019

The easy workaround for that was to just set it directly, as I know in advance the value it needs to be:

val6.LocalID = 6

Hehe, yeah. That approach works for producing LLVM IR assembly :p However, if you want to use the in-memory representation of the IR, the value should really be one and the same (as in the same pointer). Otherwise, data flow analysis is going to get quite messed up.

In the future, we will start working on control flow and data flow analysis packages (see #19, note that this is an old issue, and the API is not likely to stay as proposed), and given a function e.g. ReplaceAllUses, a call to dfa.ReplaceAllUses(val6, constant.NewInt(types.I32, 42)) would only replace one of the uses of val6, not the other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants