mirpasses: make call-argument fixup a MIR pass #818
Summary
=======

Replace the `PNode`-based fixup for call arguments in `ccgcalls` with a MIR pass. This removes a dependency on `PNode`-based analysis from `cgen`, and also makes it possible to, in the future, enable the fixup when the JS or VM backends are used (both are also affected by the issue).

While the used analysis stays mostly the same, an observable evaluation-order violation is fixed. Injecting (shallow) copies for values passed to both immutable and mutable parameters now considers *all* parameters, instead of only parameters to the right of immutable ones. For example, given:

```nim
f(a, a) # proc f(x: var T, y: T)
```

this fixes mutations through `x` inside `f` being visible on `y`.

Details
=======

The analysis used by the MIR pass works mostly the same as the one previously used in `ccgcalls`: for each argument value that is not explicitly passed by-reference, it is analyzed whether:

- the value is potentially mutated *after* it is bound to the parameter but *before* the procedure is called
- the value is potentially also passed to a `var` parameter

If either is the case, the argument is shallow-copied to a temporary that is then passed to the parameter instead.

The differences compared to the previous analysis are that:

- checking whether the argument value (or something that potentially overlaps with it in memory) is also passed to a mutable parameter now also considers preceding parameters. Previously, only the following parameters were checked
- testing for overlapping values considers the whole path now, instead of only the root. In effect, this means that for `f(a.x, a.y)`, where the second parameter is mutable, no temporary is (unnecessarily) injected for the first parameter

For overlap testing, the `maybeSameMutableLocation` procedure is introduced, which mirrors the behaviour of `dfa.aliases` (the routine used by the previous `PNode`-based analysis). Since dereferences of pointer-like values are treated like a normal field access, calls like `f(a[].x, b[].y)`, where one of the parameters is mutable and `a` and `b` point to the same location, still cause observable evaluation-order violations.

Finally, the analysis and temporary injection in `ccgcalls` is removed, and a test is added for the fixed issue.
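To make the two rules above concrete, here is a minimal sketch of path-based overlap testing and of the "check all parameters" decision. It is not the code from this PR: `Path`, `PathStep`, `Arg`, `maybeOverlaps` and `needsTemp` are hypothetical names that only model the behaviour described in the text (the real pass works on MIR nodes and uses `maybeSameMutableLocation`).

```nim
type
  PathStepKind = enum
    pskField            # named field access, e.g. `.x`
    pskDeref            # dereference, e.g. `[]`
  PathStep = object
    kind: PathStepKind
    name: string        # only meaningful for pskField
  Path = object         # hypothetical model of an argument expression
    root: string        # name of the root location, e.g. `a`
    steps: seq[PathStep]
  Arg = object
    path: Path
    isVar: bool         # true if bound to a `var` parameter

proc fieldStep(name: string): PathStep =
  PathStep(kind: pskField, name: name)

proc derefStep(): PathStep =
  PathStep(kind: pskDeref)

proc maybeOverlaps(a, b: Path): bool =
  ## Conservative test: true unless the two paths provably diverge.
  ## Dereferences are compared like any other step, so paths with
  ## different roots are treated as disjoint (the limitation noted above).
  if a.root != b.root:
    return false
  for i in 0 ..< min(a.steps.len, b.steps.len):
    if a.steps[i].kind != b.steps[i].kind:
      return true       # cannot tell, stay conservative
    if a.steps[i].kind == pskField and a.steps[i].name != b.steps[i].name:
      return false      # different fields of the same location
  result = true         # one path is a prefix of (or equal to) the other

proc needsTemp(args: seq[Arg], i: int): bool =
  ## An immutable argument needs a temporary if *any* other argument,
  ## preceding or following, is passed to a `var` parameter and may overlap.
  if args[i].isVar:
    return false
  for j in 0 ..< args.len:
    if j != i and args[j].isVar and maybeOverlaps(args[i].path, args[j].path):
      return true
  result = false
```

Under this model, `f(a, a)` with `proc f(x: var T, y: T)` requests a temporary for the second argument, `f(a.x, a.y)` with a mutable second parameter does not request one for the first, and `f(a[].x, b[].y)` is not flagged because the roots differ.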
## specified range, rather it means that the value *could* be mutated.
var i = start
while i <= last:
  # all ``mnkTag`` nodes currently imply some sort of mutation/change
I'm increasingly starting to think that generalizing the original `mnkModify` into `mnkTag` was a mistake. As the comment mentions, all current value tags represent some form of mutation, but the name `mnkTag` doesn't really convey that.
I'll think about it some more, but my current opinion is that `mnkTag` should become `mnkMutation` (or similar).
Unless the intention is to generalize across other effects, I agree; I think a name indicating mutation/memory makes sense.
very cool
compiler/mir/mirpasses.nim
## Do note that due to the placement of this pass (it happens after the
## ``injectdestructors`` pass), only *shallow*, non-owning copies of the
## affected arguments are made, meaning that there's the issue of resource-
## like values (refs, seqs, strings, everything else that has a destructor)
I'm increasingly convinced that destructors, as realized, are a misfeature, or one that should be deemphasized. Instead we should trigger ops on the containing type, which would be better for data-oriented programming and allow for better composition. The op interface might be: `=receiveCopy`, `=takeOwnership`, `=endOfLife`, etc.
What does "instead we should trigger ops on the containing type" mean concretely, since the current hooks are attached to the respective type?
type
  seq[T] = object # for discussion assume this definition
    data: UncheckedArray[T]
    len: int

proc `=takeOwnership`[T](s: seq[T], t: T) = # ... whatever default impl we want
  # `seq` is a container and its 'container ops' are triggered based on a `T`'s lifetime (like current type ops), but call `seq[T]`'s 'container ops'
  # this also opens the door on specialization based on the contained type and overriding might be feasible

proc `=takeOwnership`(s: seq[Uri], u: Uri) = # ... specialized impl
I'm not sure about the actual events/hooks we want, but what I'm somewhat certain about is that type ops triggered and called on the same type is 'off'. To clarify, `Uri` could have container ops of its own under this scheme, but they'd apply to the pile of `string` fields that are in its type definition and not the `Uri` type itself.
I had separately started thinking about triggered ops on containers before, coming at it from a DOD perspective. That further corroborated that type ops on the actual type aren't as useful wrt correctness or saving work. Below is a snippet I was using to think about this.
Using the below example, if we specialized the `Component` types with type ops we'd end up with something "wrong". Those types would start having to know where and how they're stored, plus others couldn't reuse the components and associated systems easily (poor composition).
type
  Position* = object
    x*, y*: int64
  Movement* = object
    dx*, dy*, ddx*, ddy*: float32
  Component* = Position | Movement
  EntityData = object
  Store* = object
    allocator: Allocator
    entities*: seq[EntityData]
    positions*: seq[Position]
    movements*: seq[Movement]

proc `=alloc`[T: Component or EntityData](s: Store, t: typedesc[T])
proc `=dealloc`[T: Component or EntityData](s: Store, a: T)
proc `=transferInto`[T: Component or EntityData](s: Store, a: T)
proc `=copyInto`[T: Component or EntityData](s: Store, a: T)
The last bit I should add is that we effectively do this sort of logic for the stack, the push/pop of activation records, lifting of envs and associated allocs/deallocs, etc. All that is more call stack/container focused, that's what's being bumped, that's the ever present context. If we were actually modelling a call stack, we wouldn't want to put the type ops on each call frame type (that's a lot of object types 😆), we'd want activation record lifecycle events to trigger container ops on the call stack (container).
Anyhow, I hope that makes some sense.
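The idea can be approximated in current Nim with ordinary procs on the container. The sketch below is only an illustration under that assumption: `copyInto` and the `copies` counter are hypothetical stand-ins for the proposed `=copyInto` container op, not an existing hook, and the types are trimmed copies of the ones in the snippet above.

```nim
type
  Position = object
    x, y: int64
  Movement = object
    dx, dy: float32
  Store = object
    positions: seq[Position]
    movements: seq[Movement]
    copies: int   # bookkeeping owned by the container, not by the components

proc copyInto(s: var Store, p: Position) =
  ## Stand-in for a container op: the store decides how a `Position` is kept.
  s.positions.add p
  inc s.copies

proc copyInto(s: var Store, m: Movement) =
  ## Same operation for a different component type; `Movement` itself
  ## carries no knowledge of where or how it is stored.
  s.movements.add m
  inc s.copies

var store = Store()
store.copyInto Position(x: 1, y: 2)
store.copyInto Movement(dx: 0.5, dy: 0.0)
```

The component types stay plain data, so another container could store them differently without touching `Position` or `Movement`.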
var i = C(i: 1)
when nimvm: # XXX: doesn't work yet
  discard
this gave me an idea, maybe we should add a `doAssertKnownIssue` that'll work with inverted behaviour.
I like it, I've come across multiple cases where that would have been useful.
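One possible shape for such a helper, as a rough sketch only: no `doAssertKnownIssue` exists in the standard library or the test suite as far as this thread shows, so the name, signature and message below are assumptions.

```nim
template doAssertKnownIssue(cond: untyped, msg = "") =
  ## Inverted assertion: passes while `cond` is false (the known issue is
  ## still present) and fails once `cond` holds, which signals that the
  ## issue was fixed and the test should be turned into a normal `doAssert`.
  doAssert(not (cond), "known issue appears to be fixed, update the test: " & msg)

# stand-in for behaviour that is currently wrong
proc brokenAdd(a, b: int): int = a + b + 1

doAssertKnownIssue(brokenAdd(1, 1) == 2, "brokenAdd is off by one")
```

Compared to wrapping the check in `when nimvm: discard`, the test keeps running and starts failing as soon as the behaviour is corrected.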
One bug was that the wrong nodes were compared (`^1` was used instead of `^i`), and while this doesn't necessarily lead to incorrect behaviour, it does lead to overlap being detected where there is none. The second was with the node position for array element comparisons being wrong.
Co-authored-by: Clyybber <[email protected]>
/merge
Merge requested by: @saem