Deprecate `==` for refs and procs #224

Araq · 2020-05-15T17:21:30Z

Equality is hard and sometimes there is not one kind of equality, but multiple. The Nim compiler is a good example of this, whenever you write a == b where a, b: PNode it's likely a bug. Instead some tree structure comparsion should be used.

Motivation

Problem: .error doesn't work that well

The default ref equality is error prone and even .error doesn't work all that well:

# module A

proc `==`(a, b: PNode): bool {.error: "use sameTrees or similar instead".}

# module B

echo (n == m) # problem: B didn't import A and so there is no error here

Problem: `==` for refs is inconsistent with `hash`

Because the standard library knows that equality is hard, no hash for ref T is provided in hashes.nim, that's good but it's inconsistent with ==.

Problem: Function merging is less effective

Generic instantiations can often lead to the same binary code (for example fac[int] and fac[BiggestInt] can be identical) but the optimizer cannot blindly merge the identical functions if the language spec requires that f != g for proc f; proc g. At the same time, hardly any valid usages for function equality are known; it suffices if isNil(f) works.

Proposal: Deprecate `==` for `ref` and procs

Instead of opt-out you have to opt-in in order to get reference equality for refs. There is a new operation, system.equalRefs for the pointer-like reference equality. Then it's particularly easy to
implement a custom == operator:

proc `==`(a, b: PNode): bool = equalRefs(a, b)

Alternatives considered

Conceptually equality is tied to a type much like assignments and destructors. If we bind equality to the type internally, the .error solution starts to work too. But then the same applies to hash and $ and we need to support type bound operations properly. So it would add a new concept to Nim or at least expand an existing concept. Removing a very questionable default seems to be the better solution. (But I personally want type bound operations too...)

The text was updated successfully, but these errors were encountered:

RSDuck · 2020-05-15T18:54:43Z

I agree that there has to be done something about this.

Adding new operators for different kinds of equality is common practice in scripting languages (though mostly as a fixup for bad language design think of === 😛).

Maybe new operators for value comparison could be added? Another idea would be for example for hash tables, to keep the default behaviour (hash and compare the pointer not the value it stands for), but add some kind of wrapper type, like this:

type CompareDeref[T: ref] = object
  reference: T

converter getRef[T](refVal: CompareDeref[T]) = refVal.reference

proc `==`[T](a, b: CompareDeref[T]) =
  a[] == b[]
...

These are just some ideas I have on this topic. Though the utility of ref values as keys is debatable considering the dangers (making the datastructure invalid by changing the value externally...).

disruptek · 2020-05-16T16:01:34Z

Two ref variables that point to the same region of memory are not merely equal, they are identical. Why can't we just have is?

Araq · 2020-05-17T13:15:04Z

is is already used for when x is SomeType but yes, we can overload is to make it work.

RSDuck · 2020-05-17T13:49:45Z

I think that wouldn't be a very good idea, since it breaks consistency.

(in an alternative reality where programming languages were german, das gleiche und das selbe would solve this problem quite well)

Skrylar · 2020-05-21T07:53:49Z

I don't really like it, since if i say two stack instances of an object are == then it means they (by default) have identical contents. If two refs point to the same object then the ref has identical contents (as in addresses.) It's probably not what was meant but it is consistent.
Maybe an === to represent bit-equality and == to mean semantic equality. Semantic equality on a ref would then just be the same as a[] == b[] since that is most likely what someone means anyway.

planetis-m · 2020-05-21T14:34:55Z

I usually overload == to this:

proc `==`(self, other:MyRefT): bool =
   if self.isNil: other.isNil
   elif other.isNil: false
   else: self[] == other[]

not everyone may want this, so deprecating is fine.

timotheecour · 2020-05-21T15:20:59Z

Problem: .error doesn't work that well

that has everything to do with lack of custom type bound operations, == for ref types is just a consequence of it; eg: you can make exact same argument for proc $*(info: TLineInfo): string {.error.} = discard + others

Type bound operations are already supported for =destroy and = so it wouldn't really be a new language feature, more of a generalization of an existing language feature. That's the right approach and we should do that instead (but definitely with an RFC before; eg there needs to be a way to optin/optout)

your proposal would imply that t == nil (or t != nil) doesn't work anymore; breaking lots of valid code
why allow == for ptr types but not ref types?

Semantic equality on a ref would then just be the same as a[] == b[] since that is most likely what someone means anyway.

that would be a large breaking change; == for ref types is (unless overridden) not the same as a[] == b[] and is also not what I mean when I write a == b.

they are identical. Why can't we just have is?
is is already used for when x is SomeType but yes, we can overload is to make it work.

is should keep its meaning. is is not a symmetrical relationship (eg int is SomeInt is true but SomeInt is int is false), whereas == is symmetrical, for all practical purposes.

Furthermore, that'd be dangerous because of converters, or concepts (which interchange type(x) and x) or a recent change which allows foo.x is bar to mean type(foo.x) is bar; even if it would cause no actual ambiguity (which remains to be proven), it would likely cause unintended ambiguities. To paraphrase your quote, "when all types are compatible, typechecking can't help you find bugs"

Problem: Function merging is less effective

that RFC is not needed for that; see instead #229 which instead deprecates implicit conversion procvar => pointer and ptr T => pointer, which as a side effect will disallow == for incompatible procvars or ptr T, but keep == for compatible procvars, ptr T, and ref T

hardly any valid usages for function equality are known

== nil may be more common for procs but I can't agree there are no valid usages, in particular for callback heavy code, debugging, intercepting etc; also cast[int](myproc) doesn't (yet) work in VM so can't be used as a workaround; all we need is #229

proposal

== stays as-is and means semantic equality, which can be overloaded
we implement RFC: deprecate implicit conversions from procvar => pointer and ptr T => pointer; enable function merging #229
we write an RFC for type bound operations followed by implementation. That will solve many existing problems, far beyond {.error.}
we introduce bitEquals (or === + !==, happy with that too) that works with any type (ref, ptr, cstring, proc, object, float etc) and means bit equality; which is resilient to == being overloaded

there are many differences with semantic equality, as shown below:

# in std/bitequals.nim and re-exported from std/bitops
when true:
  proc bitEqualsUntyped[A, B](a: A, b: B): bool =
    const na = A.sizeof
    const nb = B.sizeof
    return na == nb and equalMem(a.unsafeAddr, b.unsafeAddr, na)

  template bitEquals[T](a, b: T): bool = bitEqualsUntyped(a, b)
  
  doAssert 0.0 == -0.0
  doAssert not bitEquals(0.0, -0.0)
  doAssert bitEquals(-0.0, -0.0)
  doAssert bitEqualsUntyped(-0.0, cast[uint64](-0.0))

  var a = 0/0 # nan
  doAssert a != a
  doAssert bitEquals(a, a)

  block:
    var a: cstring = "abc"
    var b1 = "abc"
    var b = b1.cstring
    doAssert a == b
    doAssert not bitEquals(a, b)

(see also nim-lang/Nim#8436 for additional motivation + use cases for bitEquals, including for optimizing in several places)

Araq · 2020-05-21T15:39:38Z

your proposal would imply that t == nil (or t != nil) doesn't work anymore; breaking lots of valid code

Correct, t.isNil has to be used instead.

why allow == for ptr types but not ref types?

I consider ptr T low level but you can argue either way and for consistency I wouldn't mind to treat ptr the same as ref.

RSDuck · 2020-05-21T15:43:54Z

there's another situation where the current == semantic is useful. Say you have some kind of collection and you want to find the entry of a ref.

mratsim · 2020-05-21T17:31:32Z

I don't really like it, since if i say two stack instances of an object are == then it means they (by default) have identical contents. If two refs point to the same object then the ref has identical contents (as in addresses.) It's probably not what was meant but it is consistent.

Maybe an === to represent bit-equality and == to mean semantic equality. Semantic equality on a ref would then just be the same as a[] == b[] since that is most likely what someone means anyway.

I am repeating @Skrylar but:

a value object is a value, == compares the value.
a ref object is just an address, == compares the address, and like every other languages, this means comparing the instance.

What is wrong with that?

I think the change is bad for 2 reasons:

It breaks code
When I opt for reference semantics, it's usually because I want reference semantics, i.e. no copy but sharing instances and comparing instances.

Furthermore a[] == b[] can be used for content comparison.

The issue is not that == is error prone but that reference semantics are tricky.

The solution is to improve knowledge, i.e. what is the difference between a value and a reference
not have ref object be half reference (on copy) and half value (on equality).

Araq · 2020-05-21T17:45:07Z

When I opt for reference semantics, it's usually because I want reference semantics, i.e. no copy but sharing instances and comparing instances.

There is little connection between "I want no copy" and "I want to compare identities and the machine address is fine as the identity". You can opt into comparisons but why should there be an error-prone default that is inconsistent with hash?

mratsim · 2020-05-21T23:07:23Z

Now you said the forbidden consistent word ;).

People can have their custom hash function for their types.

I don't know of any other language with types with reference semantics where ==:

is not defined
does not compare instances instead of content.

Also refs are often used for unique resources and you don't hash a socket or a database connection, you would need to write a custom hash anyway.

There are good reasons to depart from status quo in language design but for this specific case, consistency between hash and comparison doesn't seem enough for me.

If anything it's even worse than not being able to print $ref by default in the past as we had. Now to compare instances we would need to create a cast[pointer](a) = cast[pointer](b) even though we want to strongly discourage casts.

Araq · 2020-05-22T05:24:29Z

I don't know of any other language with types with reference semantics where ==:

Well that's why Nim has it but copying bad design is not good design.

Also refs are often used for unique resources and you don't hash a socket or a database connection, you would need to write a custom hash anyway.

And again, I see if socket == nil in ordinary code, but never if socket == otherSocket.

Now to compare instances we would need to create a castpointer = castpointer even though we want to strongly discourage casts.

No and how to do it is covered in my RFC.

timotheecour · 2020-05-22T10:11:23Z

I don't see the inconsistency with hash

by default, == uses address (and hashes.nim should be updated IMO to allo hash(ref|ptr) as shown below using address, which is consistent with how it already works for proc pointer)
custom code can customize == and hash for a given type (yes, this loops back to the need for type bound procs, to make code more robust to missing imports)

and btw that's what other languages do too (eg D, C++); custom == goes hand in hand with custom hash; all that's needed wrt to this point is clarify in docs that hash should be consistent with == so that if you overload == you should overload hash, unless you know what you're doing.

eg: https://dlang.org/spec/hash-map.html

If toHash must consistently be the same value when opEquals returns true. In other words, two objects that are considered equal should always have the same hash value. Otherwise, undefined behavior will result.

(ditto with java equasl vs hashCode, or C++)

import tables, hashes
proc hash(a: ref|ptr|pointer): Hash = cast[Hash](a)

type Foo = ref object
  id: int

type Bar = ref object
  id: int

proc `==`(a, b: Bar): bool = a.id == b.id
proc hash(a: Bar): Hash = a.id

block: # default case: identity = address
  var t: Table[Foo, int]
  let a = Foo(id: 3)
  let b = Foo(id: 3)
  t[a] = a.id
  doAssert a in t
  doAssert b notin t
block: # special case: identity = id
  var t: Table[Bar, int]
  let a = Bar(id: 3)
  let b = Bar(id: 3)
  t[a] = a.id
  doAssert a in t
  doAssert b in t

And again, I see if socket == nil in ordinary code, but never if socket == otherSocket.

i see node1 == node2 all the time with code involving graphs, loopy structures, compilers etc; types requiring == via some other mean (eg same .id field, like PIdent in compiler) are handled via custom procs + in future, type bound procs.

note

D also uses == based on address for classes, (vs content for struct); that's not something ppl complain about nor find confusing in D
ditto for java, to the extent it makes sense talking about addresses

summary

we'd just have:

=== (aka bitEquals) for bit equality via memcmp (see above for typical cases where it differs from ==); no special case or overloading here.
in particular, for ptr-like types (ref|ptr|pointer|cstring|proc of compatible types) it uses address comparison
== for semantic equality; it defaults to === comparison except for cstring where it uses strcmp; can be typically overloaded in which case overloading hash is recommended

mratsim · 2020-05-22T12:31:48Z

@timotheecour I agree with your points, though I don't see the need for ===, let's leave that as an open symbol for libraries. bitEquals can be implemented in bitops as a full name.

Araq · 2020-05-22T15:10:44Z

I consider this RFC as rejected and I'm waiting for more type-bound operators as the alternative solution.

treeform · 2020-05-23T18:22:17Z

I would not mind == doing a deep tree equality, same with the hashes module doing a deep tree hash. It turns language gotches into performance gotches. It pushes the complexity of writing code into complexities of profiling.

I live in the profiler anyways and would stop performance issues quickly, people that don't don't live in the profile don't care as much about performance. I think its a win-win for both groups.

Araq · 2023-03-23T19:56:30Z

ref object vs object is commonly used as a performance hack but these have very different semantics and ref uses address/pointer semantics and IMO should continue to do so. In an ideal world we would maybe have object {.asref.} instead to get the performance of ref object for passing it around while keeping the value smemantics, I don't know.

al6x · 2023-03-23T20:17:32Z

Wouldn't the ideal case be when the compiler could detect that the copied value weren't changed so instead of copy it pass it as reference?

Araq · 2023-03-23T21:45:39Z

Yeah, yeah, yeah...

Araq changed the title ~~Deprecate == for refs and closures~~ Deprecate == for refs and procs May 16, 2020

timotheecour mentioned this issue May 21, 2020

RFC: deprecate implicit conversions from procvar => pointer and ptr T => pointer; enable function merging #229

Closed

Araq added the Rejected RFC label May 22, 2020

Araq closed this as completed May 22, 2020

timotheecour mentioned this issue May 23, 2020

bitEquals timotheecour/Nim#243

Open

timotheecour mentioned this issue Dec 2, 2020

prefer x == nil to x.isNil nim-lang/Nim#16191

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate `==` for refs and procs #224

Deprecate `==` for refs and procs #224

Araq commented May 15, 2020 •

edited

Loading

RSDuck commented May 15, 2020 •

edited

Loading

disruptek commented May 16, 2020

Araq commented May 17, 2020

RSDuck commented May 17, 2020 •

edited

Loading

Skrylar commented May 21, 2020

planetis-m commented May 21, 2020

timotheecour commented May 21, 2020 •

edited

Loading

Araq commented May 21, 2020

RSDuck commented May 21, 2020

mratsim commented May 21, 2020 •

edited

Loading

Araq commented May 21, 2020

mratsim commented May 21, 2020

Araq commented May 22, 2020

timotheecour commented May 22, 2020 •

edited

Loading

mratsim commented May 22, 2020 •

edited

Loading

Araq commented May 22, 2020

treeform commented May 23, 2020

Araq commented Mar 23, 2023

al6x commented Mar 23, 2023

Araq commented Mar 23, 2023

Deprecate == for refs and procs #224

Deprecate == for refs and procs #224

Comments

Araq commented May 15, 2020 • edited Loading

Motivation

Problem: .error doesn't work that well

Problem: == for refs is inconsistent with hash

Problem: Function merging is less effective

Proposal: Deprecate == for ref and procs

Alternatives considered

RSDuck commented May 15, 2020 • edited Loading

disruptek commented May 16, 2020

Araq commented May 17, 2020

RSDuck commented May 17, 2020 • edited Loading

Skrylar commented May 21, 2020

planetis-m commented May 21, 2020

timotheecour commented May 21, 2020 • edited Loading

proposal

Araq commented May 21, 2020

RSDuck commented May 21, 2020

mratsim commented May 21, 2020 • edited Loading

Araq commented May 21, 2020

mratsim commented May 21, 2020

Araq commented May 22, 2020

timotheecour commented May 22, 2020 • edited Loading

note

summary

mratsim commented May 22, 2020 • edited Loading

Araq commented May 22, 2020

treeform commented May 23, 2020

Araq commented Mar 23, 2023

al6x commented Mar 23, 2023

Araq commented Mar 23, 2023

Deprecate `==` for refs and procs #224

Deprecate `==` for refs and procs #224

Araq commented May 15, 2020 •

edited

Loading

Problem: `==` for refs is inconsistent with `hash`

Proposal: Deprecate `==` for `ref` and procs

RSDuck commented May 15, 2020 •

edited

Loading

RSDuck commented May 17, 2020 •

edited

Loading

timotheecour commented May 21, 2020 •

edited

Loading

mratsim commented May 21, 2020 •

edited

Loading

timotheecour commented May 22, 2020 •

edited

Loading

mratsim commented May 22, 2020 •

edited

Loading