-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend the notion of a "type bound" operation #380
Comments
Questions:
Comments:
|
For "generic" methods we already attach them to the underlying type so that during instantiation the method is instantiated too -- so I think it's safe to say that there are nothing but pleasant surprises, more code simply works.
It would be more complex but not by much -- maybe 100 lines added to the compiler. It would be slightly slower, however, there is a great potential here for speeding up the compiler with a couple of refactorings. For example: For a comparison like
I share this worry; however:
|
Here is how Rust does it:
(from https://doc.rust-lang.org/book/ch10-02-traits.html ) Now the question is, "what is a crate?":
So ... it's pretty much the same solution. You cannot compile only a "subset" of the crate so the trait implementations for type |
Go's language specification dictates something similar - essentially, what would be called a "set of type-bound routines" here, is called a method set in Go. Which leads to another question - if the proposal is to essentially take the same approach as Rust and Go, would this make it impossible to implement a concept for an external type without first creating a child type? |
What's the reasoning for not allowing |
The current type-bound operators don't seem to follow this rule (e.g. |
That is a solution, yes, however, the best solution seems to me to extend the concepts spec; when it talks about how a proc must be "in scope", it can also suffice that instead the proc is attached to the type. |
I'm cautious. We can always extend the idea later. |
It would make some sense I guess but I fail to see the advantage: If you cannot iterate over the type explicitly (as your |
The "same module" requirement is not enough to solve the sandwich problems in the typical scenarios in which they arise. As a typical example, consider serialization functions such as To solve this problem in more general way, another solution is possible: you can expand the definition of what a "method" is. A method declaration using a generic type can be used to introduce a type bound operation which can be specified for types that are not part of an hierarchy: # serialization_lib.nim
method writeValue[T](x: Stream, value: T) # Only declared here Then one or more modules can attempt to "bound" an implementation of the method for a particular type: # options_impl.nim
import
serialization_lib
# This is a definition since it has a body and the declaration is in scope
method writeValue[T](x: Stream, value: Option[T]) =
... The user of the serialization lib can compile calls to |
But then you don't have what Rust calls "coherence" either as you can easily write code that never imports json_serialization/std/options.nim, not even indirectly. And even if you say "in practice it's easy enough to ensure the helper module got picked up" there is some very spooky action at a distance involved. And finally, std/options does not have to know about JSON serialization, a design based on #247 would work. |
Instead of attaching syms to types, I think we should attach scopes (the declaration scope) to syms, so that symbols available at declaration will always be available at instantiation. |
This doesn't solve the problem: from objs import Obj
import tables Inside tables.nim there is the []= proc and it doesn't have objs.hash in scope! |
I'm entirely unconvinced that this notion of "coherence" is valuable. My own example demonstrates a clear and natural division of responsibilities that violates the proposed principles, so I'll hold this as a limitation and a weakness of Rust in comparisons to Nim. Clearly, a module defining Clearly, a serialization library defining a common interface should not be aware of all types that can be serialized in user programs because this set in infinite. So, it's perfectly natural to have modules that bridge the two islands. A user program is likely to select some existing bridges and to build some new ones connecting its own land. |
True but the serializer must know about Option[T] or about a Optional concept because it's its job of doing so. Optionals are special for JSON and SQL and likely for most other formats we care about.
True and it seems to work this way in Rust, with the enforcement of coherence. |
How I see the problem:
I see this as a good design, and means we could choose: import data definition import data definition and behavior import all Other view on problem: isn't this showing that design of Table isn't good, when it defines some default behavior or isn't strictly a pure function? Increment to proposed solution, create another kind of import explicit as "import X and its behaviors" |
The default behavior is from system.nim and hashes.nim, tables are not too blame.
But what's the point? That should be the default for |
Well, no. Would you argue that anyone importing |
Well a Vector3 is a type that hides little if anything so if the serializers can perform But tell me: Would this proposal without the restriction "r is declared in the same module as T" have your approval? |
I think this is a pretty interesting. Concepts are existential types, "there exists a thing that satisfies these conditions". From the perspective of the receiver that's perfectly ok, but from the perspective of the provider this is where I think things diverge. In one case I want to be able to define a data type and set of proc and they happen to be taken as fulfilling some requirement, the definition of which is composed not in a single module. This would be where I'm faced with a concept receiver and I just need to define the missing procs on some type I happen to have until it fits. These cases will become more rare, but still useful stuff. The other case is one of "coherence" wherein I have a type and I wish to package with it a set of rules that travel along with it no matter where it goes. This would be Aside: I never really liked the restriction of not being able to change the memory management semantics of a type after the fact, as it feels very unlike Nim. Taking that aside and inspiration from what @Clyybber mentioned about declaring scope. I would take a page from effect types, especially the tagging. I think the destructors and move semantics should be, pardon me as I stretch definitions, memory effects and the default ones should be attached to the data type from the module they're defined in. So now a data type has effects/slots where we can attach other types and they travel around with the type, so only importing Obj will correctly bring asking with it any destructors on the type. None of which are hidden but now first class in the type system. So how does that help here? Effects are basically tagging types, and this gives us key value storage. Tables should work on any type that has type tags (using the term "tags" as I'm not simply referencing effects) that have equality and hash tags defined. A tag being a key and the type the right hand side should take/match the value. In this case the tag definition should be in the hashes module and the tables module should insist on it for any V as a type constraint. When a data type is expected to fulfill tag requirements we look inside the module it's defined in to fulfill those first and then in the environment. The ability to overwrite these would be the last bit of control. For the receiver the tags just act as guaranteed to resolve procs. I need to think about the syntax a bit, but I was thinking proc foo(a: Whatever): Bar {.pragmas.}
where a: ... =
# proc code here
Kotlin's where is used for generics, here is an example and link: fun <T> copyWhenGreater(list: List<T>, threshold: T): List<String>
where T : CharSequence,
T : Comparable<T> {
return list.filter { it > threshold }.map { it.toString() }
} |
Don't think about the syntax, think about the implications of a system where operations are not bound to the type. "Coherence" is only one of them, if you detach destructors from types you cannot do code generation for "partial" Nim programs, you have to delay code generation until the full Nim program was analysed. Maybe that's worth it, but it's different from how every other production language works. (And hence, very risky.) I don't even have to read Kotlin's spec to know that |
Great, I'd rather stay at the conceptual level right now, because the thinking needs cleaning up.
Good prompt, revisiting and revising my thinking:
An implication I've not quite worked my way through is that the same type with different tags isn't exactly a different type. I'm not sufficiently familiar with type theory and the best I can reason about it is that tags are like row polymorphism. Extra tags don't matter and structural sub-typing apply.
Hmm, I must have miscommunicated here, but this fuzziness is only true within the definition module. As what I'm proposing suggests that destructors and their addition to a type should follow the same declaration order rule as everything else in the language. I'm admittedly a stickler about these sorts of symmetry.
Check the revisions above, I think the path my brain is taking me down suggests stick with static dispatch as much as possible as Nim does, but allow for a "static vtable" or "static interfaces" that's editable. For when you care not just about the data type, but other associated behaviours. Something like use all the usual static dispatch we do today, but once tags come into play: A tag on a type is void if not defined or some type that forces the resolution to whatever occupies that slot. If a tag is void and that tag is demanded then it's lazily resolved. The case for where one wants to prohibit or close these tags, my current thought is why would one want to and if one must then why not the
Yup, they're definitely in the class and static extension methods are going to do nothing to help. |
I know, "copying bad design is not good design" but if every old and new PL agrees on this subject and it's also a reasonable thing to do from first principles then maybe we can do the same? I fail to see the innovation in "Nim is better because it's unsound". |
I hope Nim will not follow Rust too closely regarding coherence: there's an issue at a documentation repo about Rusts's idea of coherence which demonstrates that the Rust people use the term in a somewhat undifferentiated and extreme way. The links inside the issue show Haskell's and Scala's take on the topic: there's a difference between local and global coherence (the Haskell related link seems to call the former "coherence" and the latter "global uniqueness of instances"). My short take on it:
Following the Haskell road would mean that different "typeclasses" are allowed to introduce ambiguity to the overloading hierarchy of their procs. But this only explodes when and where somebody tries to actually use these procs in one context. Just like overload resolution works in Nim right now and maybe also resolvable like now: by qualifying the proc (with a module / concept / "associated implementation type" ?). I really like this. IMHO, the Rust compiler is being an overly restrictive smartass (surprise..) and sacrifices composability for it: "However, the problem with global uniqueness of instances is that they are inherently nonmodular: you might find yourself unable to compose two components because they accidentally defined the same type class instance, even though these instances are plumbed deep in the implementation details of the components." |
I don't mind this as much as I mind "the code compiled without module M but now with M it changed its behavior" (at runtime). |
The "use all type" syntax in Ada 2012 was introduced to address a similar issue in Ada after the "use type" syntax, which only attached operators to the type (added in Ada 95), was found wanting. See http://www.ada-auth.org/standards/12rat/html/Rat12-4-5.html |
I've read the discussion a few times and the following points have all been brought up one way or another, but I figure I can add some commentary as someone who works on a generic C++ library (the standard library), given that C++ has a similar feature. I don't think this proposal will really suffer from the problems that C++'s ADL has. Basically ADL creates .... problems for generics. It's probably better in nim than in C++ since modules are constrained to a single file, whereas namespaces tend to be quite large, but still, the problems with ADL in C++ can effect nim too, especially modules that use include and/or reexport a bunch of other stuff (if we allow ADL in that case, and it would be nice, being able to "implement a trait" via type bound procedures in a separate module to the type itself is useful for example if you have optional support for that functionality). Firstly in C++ generics function calls are only "open" (in nim terminology) if the call is unqualified and there is at least one overload in the definition context. Thus if you want "open" symbol binding you must also opt into ADL. This causes lots of workarounds like "customization point objects" (structs defined at global scope with an operator(), basically to enforce concept constraints (nim's typeclasses) and then do ADL. Then, you realize that well, actually you only wanted the call to be open, you didn't actually want ADL (since that essentially "reserves" the semantics of that function name globally if everything is going to work), so you invent Another extremely destructive thing about what C++ does is that for function calls where some of the arguments are class templates (generic objects) C++ considers functions defined in the namespace of the template parameters in addition to the namespace of the class template itself. This is a ... questionable design in and of itself (and, to be honest is so subtle that until recently compilers almost never got it "right"), it also makes writing generic libraries (especially ones with multiple implementations) a nightmare because any helper types (like Neither of these problems are present in this proposal, and http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0934r0.pdf proposes fixing them by doing almost exactly what this proposal says (except you can trigger ADL with any argument position, not just the first). Actually being able to use any parameter position could be good for binary operators. Just wanted to pipe up in case anyone has any fears this will create the same problems C++ has. |
Why limit the attaching to the first parameter? |
I don't understand why change the rules of overload lookup/disambiguation if the same effect can be had just changing what is imported to the current scope in the first place and is in my head much simpler. Isn't Araq proposal equivalent to: "when a type is imported from a module, import all routines that are bound to a type T?" Or is there some subtle difference that I'm not seeing? We could also optionally also import all routines whose first parameter includes that type in subsequent imported modules while keeping the parsing linear. This would give more power through reordering, but on the other hand can be confusing and lead to those bugs we want to avoid in the first place (although they would be much more rare). For full backwards compatibility, create a That syntax is proposed in #426 but unlike that RFC I don't propose any changes outside the import. However |
@xigoi You're right, the mechanism works better when every parameter is taken into account. |
@ReneSac This proposal differs from your solution as soon as an intermediate module that does not import the type at all is involved. |
implements nim-lang/RFCs#380
implements nim-lang/RFCs#380
implements nim-lang/RFCs#380
Why it works only for names with overloads? From the PR: "If no overloads with a name are in scope, type bound ops are not triggered" Is it planned to remove this limitation? |
Good question, I think the spec is buggy here. |
This is mostly a consequence of the implementation, but after looking into it more, I can't think of a clean way to implement this. Right off the bat I think this is how it works in Rust, a trait has to be imported to use its methods, but in this case we are dealing with a much less restricted version of a trait. For example, doing As for why it's the case in the current implementation: Firstly, the type bound overloads are added as the arguments get typechecked, which is done lazily during overloading depending on which parameters each overload needs to be typed. An early version of the PR just considered every single type bound operation with the same name straight away, which causes problems for things like (a failing test case used the name # a.nim
type Foo* = object
proc foo*(x: Foo) = discard # b.nim
import a # compile module a # c.nim
import b
# a not imported
template foo(body: untyped) =
let abc {.inject.} = 123
foo: # this considers a.foo, which tries to type the argument:
echo abc # undeclared identifier: 'abc' If no overloads exist, we can't perform typechecking for any argument, or at least trying to do so would give confusing errors as in cases like nim-lang/Nim#7857. It's hard to come up with something predictable that the compiler could do when it has no information about the provided name. Still, the workaround I gave in the PR description with |
In principle (ignoring
|
The Problem
"Generic sandwiches" (nim-lang/Nim#11225), error-prone behavior.
For a simple example let's look at this program consisting of two separate modules:
This program fails at runtime with a
KeyError
exception. What happened?Since
Obj
's hash and equality implementation is not in scope, the defaultsfrom
hashes.nim
have been picked up instead! Now, while this particular example seemscontrived (why only import
Obj
from the moduleobjs
?) this problem canshow up in all sort of ways and I personally consider it Nim's biggest design
mistake. We prefer the style of keeping data and computation separate.
But that doesn't mean not to bind a routine to the type it "naturally" belongs to.
Note: A
routine
is any callable entity, in Nim's case one of:proc
,func
,method
,iterator
,template
,macro
,converter
.The solution
Let's attach routines to types.
Proposal
Routines are attached to a type, much like in more classical OOP languages.
If a routine
r
is attached to typeT
it participates in overload resolution,even if the routine is not in the current scope.
This can be seen as a scope extension or "argument dependent" lookup.
A routine
r
is bound to a typeT
if the following is true:r
is a toplevel routine and declared in the same module asT
andr
is declared as public andT
is anenum
,distinct
orobject
type or a generic type thereof andr
is of typeT2
andT2
"is attachable to"T
.A type
T2
"is attachable to"T
iff[var|sink] T
(thevar|sink
modifier is optional) or[var|sink] T
.A routine that is attached to a type participates in overload resolution even if it
is not in the instantiation scope at all. With this additional language rule
our "Table[Obj, int]" example program simply works.
The text was updated successfully, but these errors were encountered: