Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement type bound operation RFC #24315

Merged
merged 25 commits into from
Oct 25, 2024
Merged

implement type bound operation RFC #24315

merged 25 commits into from
Oct 25, 2024

Conversation

metagn
Copy link
Collaborator

@metagn metagn commented Oct 15, 2024

closes nim-lang/RFCs#380, fixes #4773, fixes #14729, fixes #16755, fixes #18150, fixes #22984, refs #11167 (only some comments fixed), refs #12620 (needs tiny workaround)

The compiler gains a concept of root "nominal" types (i.e. objects, enums, distincts, direct Foo = ref objects, generic versions of all of these). Exported top-level routines in the same module as the nominal types that their parameter types derive from (i.e. with var/sink/typedesc/generic constraints) are considered attached to the respective type, as the RFC states. This happens for every argument regardless of placement.

When a call is overloaded and overload matching starts, for all arguments in the call that already have a type, we add any operation with the same name in the scope of the root nominal type of each argument (if it exists) to the overload match. This also happens as arguments gradually get typed after every overload match. This restricts the considered overloads to ones attached to the given arguments, as well as preventing untyped arguments from being forcefully typed due to unrelated overloads. There are some caveats:

  • If no overloads with a name are in scope, type bound ops are not triggered, i.e. if foo is not declared, foo(x) will not consider a type bound op for x.
  • If overloads in scope do not have enough parameters up to the argument which needs its type bound op considered, then type bound ops are also not added. For example, if only foo() is in scope, foo(x) will not consider a type bound op for x.

In the cases of "generic interfaces" like hash, $, items etc. this is not really a problem since any code using it will have at least one typed overload imported. For arbitrary versions of these though, as in the test case for #12620, a workaround is to declare a temporary "template" overload that never matches:

# neither have to be exported, just needed for any use of `foo`:
type Placeholder = object
proc foo(_: Placeholder) = discard

I don't know what a "proper" version of this could be, maybe something to do with the new concepts.

Possible directions:

A limitation with the proposal is that parameters like a: ref Foo are not attached to any type, even if Foo is nominal. Fixing this for just ptr/ref would be a special case, parameters like seq[Foo] would still not be attached to Foo. We could also skip any structural type but this could produce more than one nominal type, i.e. (Foo, Bar) (not that this is hard to implement, it just might be unexpected).

Converters do not use type bound ops, they still need to be in scope to implicitly convert. But maybe they could also participate in the nominal type consideration: if Generic[T] = distinct T has a converter to T, both Generic and T can be considered as nominal roots.

The other restriction in the proposal, being in the same scope as the nominal type, could maybe be worked around by explicitly attaching to the type, i.e.: proc foo(x: T) {.attach: T.}, similar to class extensions in newer OOP languages. The given type T needs to be obtainable from the type of the given argument x however, i.e. something like proc foo(x: ref T) {.attach: T.} doesn't work to fix the ref issue since the compiler never obtains T from a given ref T argument. Edit: Since the module is queried now, this is likely not possible.

@Araq
Copy link
Member

Araq commented Oct 15, 2024

It's better to do it lazily. If f(arg1, arg2) requires the semcheck of arg1 add arg1's f to the sym choice, likewise for arg2. This only fails for untyped arguments which we simply ignore.

@metagn metagn changed the title test type bound ops per name test type bound ops Oct 16, 2024
@metagn metagn changed the title test type bound ops type bound ops Oct 18, 2024
@metagn metagn marked this pull request as ready for review October 19, 2024 14:52
@metagn metagn changed the title type bound ops implement type bound operation RFC Oct 19, 2024
compiler/modulegraphs.nim Outdated Show resolved Hide resolved
compiler/semcall.nim Outdated Show resolved Hide resolved
@Araq
Copy link
Member

Araq commented Oct 22, 2024

How does this affect compile times?

@metagn
Copy link
Collaborator Author

metagn commented Oct 23, 2024

Temporarily enabled the option globally to see since new code only runs with it on now. Also tests that the new changes work.

nominalRoot and isAttachableRoutineTo should have negligible runtime by themselves since they are like skipTypes and use simple ID checks.

The performance in addTypeBoundSymbols depends on how many call arguments have top-level nominal types, and how long it takes to search the callee name in the module of such types. This is not going to be slower than, for example, creating an extra symchoice on every call. Hence I don't think this impacts performance much but I can't speak too much for the performance of identifier searches.

There are a few possible optimizations I can think of but they'll add complexity for little gain:

  • Cache searched module/type ID pairs per call so we don't do the search again
  • Pre-calculate the nominal types per proc

@Araq
Copy link
Member

Araq commented Oct 23, 2024

Sure but it's easy to measure:

bootstrap time with type bound ops disabled: 5.2s, with enabled: 5.3s (for example!)

@metagn
Copy link
Collaborator Author

metagn commented Oct 23, 2024

Ran nim check compiler/nim 15 times each with release booted compiler and experimental switch on or off.

Times with experimental switch off:

5.306s
5.584s
5.356s
5.112s
5.384s
5.422s
5.235s
5.361s
5.270s
5.297s
5.322s
5.219s
5.106s
5.296s
5.185s

peakmem all 533.637MiB

Times with it on:

5.230s
6.535s
5.957s
5.751s
5.306s
5.308s
5.110s
5.104s
5.440s
5.571s
5.356s
5.365s
5.151s
5.429s
5.271s

peakmem all 533.82MiB

So 5.297 ± 0.118 seconds with it off and 5.459 ± 0.365 seconds with it on. Removing the 2 worst and 2 best times for each, it gives 5.294 ± 0.059 vs 5.380 ± 0.159. That is about 90 milliseconds or a 2% impact at worst but other codebases might be impacted more than the compiler.

The bootstrap times in the CI seem to have a similar difference.

@arnetheduck
Copy link
Contributor

nimbus-eth2/beacon_chain/nimbus_beacon_node takes several minutes to compile, if you're looking for a compiler speed stress test - it's quite slow and not getting any faster

#result = nominalRoot(t.skipModifier)
result = nil
of tyStatic:
# ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow-up PR we should treat static just like var and sink here.

@Araq Araq merged commit 2864830 into nim-lang:devel Oct 25, 2024
12 of 19 checks passed
Copy link
Contributor

Thanks for your hard work on this PR!
The lines below are statistics of the Nim compiler built from 2864830

Hint: mm: orc; opt: speed; options: -d:release
175458 lines; 8.616s; 654.488MiB peakmem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment