Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlawful compare implementations #99

Closed
sim642 opened this issue Jan 29, 2024 · 5 comments
Closed

Unlawful compare implementations #99

sim642 opened this issue Jan 29, 2024 · 5 comments

Comments

@sim642
Copy link
Contributor

sim642 commented Jan 29, 2024

The compare functions which are exposed to OCaml, either directly or via custom_operations structs and Stdlib.compare, do not satisfy the usual order laws. In turn, this means that many Apron types cannot (at least easily) be used for Set.Make and Map.Make.
Since the operations are somehow defined, doing so doesn't immediately fail (e.g. by exception), but leads to some nasty Heisenbugs down the line.

Examples

Environment

Environment.compare is explicitly exposed to OCaml:

quote(MLI,"\n(** Compare two environment. [compare env1 env2] return [-2] if the environments are not compatible (a variable has different types in the 2 environments), [-1] if [env1] is a subset of env2, [0] if equality, [+1] if env1 is a superset of env2, and [+2] otherwise (the lce exists and is a strict superset of both) *)")
int ap_environment_compare(ap_environment_ptr env1,
ap_environment_ptr env2);

The documentation correctly describes its behavior, which, however, is not the compare suitable for Set.OrderedType.
For example (in pseudo-syntax), Environment.compare {var1} {var2} > 0 and Environment.compare {var2} {var1} > 0 both return true (because compare returns 2 both ways).

What's worse, the same implementation is specified for the Environment.t custom OCaml block via custom_operations struct, which is to be used by Stdlib.compare (and its standard operator-derivatives). The problem is that this doesn't fit the usually-understood properties of that function as documented:

compare x y returns 0 if x is equal to y, a negative integer if x is less than y, and a positive integer if x is greater than y.
[...]
The compare function can be used as the comparison function required by the Set.Make and Map.Make functors.

So, for example, {var1} > {var2} and {var2} > {var1} both return true as well.

Abstract0 (and Abstract1)

Abstract0 (and Abstract1, which is built on top of it) don't directly expose and document a compare. However, one is still defined for Abstract0.t's custom block:

int camlidl_apron_abstract0_ptr_compare(value v1, value v2)
{
ap_abstract0_ptr* p1 = (ap_abstract0_ptr *) Data_custom_val(v1);
ap_abstract0_ptr* p2 = (ap_abstract0_ptr *) Data_custom_val(v2);
ap_abstract0_t* a1 = *p1;
ap_abstract0_t* a2 = *p2;
ap_dimension_t dim1,dim2;
int res;
if (v1==v2 || p1==p2 || a1==a2)
res=0;
else {
dim1 = ap_abstract0_dimension(a1->man,a1);
dim2 = ap_abstract0_dimension(a2->man,a2);
res = dim1.intdim-dim2.intdim;
if (!res){
res = dim1.realdim-dim2.realdim;
if (!res){
if (ap_abstract0_is_eq(a1->man,a1,a2))
res=0;
else
res = a1 > a2 ? 1 : (-1);
if (a1->man->result.exn!=AP_EXC_NONE) camlidl_apron_manager_check_exception(a1->man,NULL);
}
}
}
return res;
}

At first glance, this appears somewhat sound: first dimensions are compared, then ap_abstract0_is_eq is used to check for semantic equality (which is also exposed as is_eq to OCaml). But the final fallback comparison is of pointers themselves (which are arbitrary memory addresses). On its own, that's a valid total order, but not when combined with the semantic equality check.

For example, suppose a1 = {x>=0}, a2 = {-x>=0} and a3 = {x>=0} are allocated in order and their addresses also happen to be in the same order. Then a1 < a2 and a2 < a3 per pointer comparison, but a1 = a3 per semantic equality check.

Other types

Other Apron types might have similar issues, I didn't check everything. Var.compare at least is sound because it's just strcmp.

Conclusion

Even though polymorphic comparison and Stdlib.compare aren't generally recommended, they have been defined for Apron's custom blocks and that's the only way to compare most of them. Even if there's no plan to fix it, this issue hopefully serves as a warning to anyone trying to use Set.Make/Map.Make on Apron data types, which appears to work at first glance but can surprisingly break down.

@sim642
Copy link
Contributor Author

sim642 commented Jan 29, 2024

Apparently Frama-C realized this at some point as well:

      (* BIGTODO: this function is not quite a total order, because [is_leq] is
         only a partial order. Using the hash as a first comparison is only
         a doubtful hack. *)
      let compare a b =
        if equal a b then 0
        else
          let cmp = compare (hash a) (hash b) in
          if cmp <> 0 then cmp
          else if Abstract1.is_leq man a b then 1 else -1

(https://git.frama-c.com/pub/frama-c/-/blob/c13260097f092e405ccb1131668fbc4c94eb1644/src/plugins/eva/domains/apron/apron_domain.ml#L390-398).

@antoinemine
Copy link
Owner

Thanks for reporting this issue. I try to address them in #108, by basically disabling the problematic features. I currently see no efficient and robust solution to implement a total order on these data-structures and make them usable as keys in sets and maps. In particular, the order cannot rely on the internal representation of an abstract element as the library is free to update it with another representation of the same concrete set without notice.

The Frama-C code shows that they did not find a good solution either.

@sim642
Copy link
Contributor Author

sim642 commented May 28, 2024

For Environment at least a total order should be possible: they seem to be normalized such that variables (which are totally ordered) are sorted in the underlying array. Environment.equal also compares them under that assumption, without any semantic considerations like Environment.cmp now.
However, also implementing Environment.compare with the lexicographic total order would be an even more confusing breaking change: existing uses of it (which likely depend on the semantic results it previously had) would start behaving unexpectedly.

@antoinemine
Copy link
Owner

Yes Environment are represented as ordered lists of variables. This is important to ensure that, for environments with the same variables, a given variable is always assigned the same index (vector space dimension). What complicates the matter is that we distinguish integers and float variables. Environments where the a variable is integer in one and float in the other are not compatible.

In theory, a total order could be possible for Environment but also possibly other data-types. Ideally, they should be extensions of the natural partial order, so that ⊆ ⇒ ≤. However, they are probably complex to implement and costly to run, in addition to also breaking compatibility with the incorrect implementation of compare we wish to abandon. I fear that people would continue to use ≤ assuming it means (or implies) ⊆. I suggest we remove compare and not replace it at all (at least, not yet). The drawback is that we can longer use the polymorphic = to test for equality.

@antoinemine
Copy link
Owner

Fixed in #108

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants