Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interactive: improvements for chrony story #724
Interactive: improvements for chrony story #724
Changes from 26 commits
0709d82
30c1d7a
11cc869
dd43200
1ba9590
f5c4e89
752d16c
180ed23
653d0bb
a4be262
f753df5
e10102e
4a8aa85
ed6b2e4
c5cf526
2b60af7
ea05ea8
59f7343
577f6f3
76f9b29
84a98bf
b09b1fd
f2b8673
d5a0c27
a4db1a0
e6e33b2
6016835
22d6da8
a482bd1
67211c2
92c7ea4
94fd11f
c491dbb
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe comment on the impact that this has now? Are we tracking equalities involving pointer arithmetic? If yes, how confident are we that we do useful things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These additions aren't changing the equalities that are tracked, but just allows them to be used with array indexing via pointer addition (which CIL sometimes inserts instead of actual
Index
offset that the analysis already handles).It's needed for the following pattern from one of the added tests:
The symb_locks analysis uses the
EqualSet
query to look up equalities forentry[0].refs
, but the old behavior meant that an empty set was returned, so we couldn't infer that it has the same prefix as the locking expression.A less trivial case of it comes from the minimized chrony test, where the locking actually happens outside the function where the access is, so there's already an equality
ip_addrs = inst->addresses
known to the analysis and there's a symbolic lockinst->mutex
, but then the accessip_addrs[0]
isn't found to be protected because to find that out, the equality needs to be used inside the index expression.Note that the index itself is completely irrelevant because all that's needed is the common
inst
prefix.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are a bit too eager to declare access as non-racy here, consider the example I added:
These accesses at
entry[1]
are racy, as the protecting mutex for that element is not held (I fixed the loop to only run until 9, so it's not UB). Also passing at*
and then using array access on it is a very common idiom in C afaik.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite an evil example, but I'm rather skeptical that we can do much about this. The function in isolation, as presented here, just gets a pointer and then accesses
entry[1]
. For an arbitrary pointer that may point to any memory and therefore a write to it may invalidate any memory, including any other program variable (any global, any local of any stack frame, any dynamically allocated memory). Of course in this particular case it doesn't, but in Goblint we have no way of knowing that, because for addresses (like the pointer here) we don't keep track of their context (whether they're pointing into an array or not).I think our much broader assumption is that pointer arithmetic stays "within the direct object" (for the lack of a better definition) regardless of its context, because we simply don't know about the broader memory layout (even if in this case it seems obvious). To make weaker assumptions than that, we would either need some crazy possible memory layout analysis or, lacking that, assume the worst and invalidate everything (not just globals!).
Is it though? If you want to use it as an array, you might as well pass it as an array (which is a pointer under the hood anyway), to make it clear that it's an array (usually with another length argument), not a pointer to an isolated object.
Now thinking about bad things, if the argument were an array and we copy the array domain into the function and mutated it, do those writes even propagate back to the caller's array domain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the assumption we should be making is that
a
toa
in the next element of the array unlessa
is the first struct element.That seems not too strong but also not as radical as setting everything to top. (And should be possible as our addresses maintain offsets and have abstract values at the granularity of variables anyway).
I think if we assume anything more, we cannot seriously claim we are sound anymore. I personally would be extremely unhappy with Goblint accepting this program as race-free! (How do @vesalvojdani @stilscher @jerhard feel here?)
Should we put this on the agenda at the Gobcon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd hope so. We pass a pointer to the array, hence the entire(!) array will go to the callee, we update things there and should then copy the value over in combine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe this is somehow also related to #434? Because the thing is, this change simply allows the
entry[0]
orentry[1]
expression to preserve its prefix (previously we gave up on index access of pointer). Index access on array type has been previously supported for a long time and this shouldn't be different: symbolic locking happens on the basis of common prefix between the mutex and the access (and we have at least one paper on this!). So I'm quite surprised how this can suddenly be going wrong.cc: @vesalvojdani
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't dug into this now, but this is exactly where I'm worried. Sure, if you take a pointer of the array and pass that to a function, it gets passed along as reachable and
combine
d back. But what if you pass the array directly (without making a pointer to it first). In the caller, the local domain contains an array abstraction directly under the array variable name, there isn't an address domain containing an address to the array contents. This is analogous to how an integer variable contains the integer abstraction directly. And when they are passed as arguments, they are copied — there's a copy of the integer's abstraction directly under the callee's argument and copy of the array's abstraction directly under the callee's argument. Oncombine
, the internal integer abstraction is simply removed and so would the callee's (modified) array one. Or do we have some special logic somewhere there specifically for arrays? I don't recall at least.Indeed in the non-UB test you added, we should not. It's just the symbolic locks that then might be assuming something stronger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized there's a granularity difference at play here as well:
cache_entry
tests in our suite use an array of structs, where each struct has a mutex protecting its data.DNS_Async_Instance
struct, for which this is needed, is a standalone struct (not one in an array), where the single mutex protects an entire array inside the struct.Because they are structurally different like that, I'd hope we can actually be sound for the tests you added and the story in chrony. I'll have to dig into how exactly these different granularity locking schemes, and per-field vs per-index symbolic locks are supposed to work, because these cases involve both at the same time (but in different order!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case for chrony is easier than the more general one, where the unsoundness problem came in. I now fixed that using very ad-hoc handling that's still sufficient for chrony.