
Change interpreter calling conventions to be more direct #5542

Merged: 9 commits merged into trunk from topic/calling-conventions on Jan 15, 2025

Conversation


@dolio dolio commented Jan 14, 2025

This change set reworks the calling conventions of foreign functions to avoid various encoding schemes that wasted time by forcing extra passes through the interpreter loop. Instead, the ForeignConvention class has become a way of specifying how unison values are parsed into their Haskell equivalents, and each foreign operation simply takes a number of unison values that are parsed this way. Similarly, single results are written back directly, rather than being encoded on the stack. The result is that most wrappers are now trivial (and should be subject to inlining).
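Roughly, the idea looks like this. This is a minimal illustrative sketch with made-up names; `Stack`, `readArg`, `writeRes`, and `foreign2` are placeholders, not the actual unison-runtime API:

```haskell
-- Hypothetical sketch of the direct convention: a class describes how to
-- decode a unison argument into a Haskell value and how to write a Haskell
-- result back as a unison value, so a foreign wrapper is just "read the
-- args, call the Haskell function, write the result".
module ForeignConventionSketch where

import Data.Word (Word64)

-- Stand-in for the interpreter's argument/result area.
newtype Stack = Stack [Word64]

class ForeignConvention a where
  -- Decode the argument in the given slot into a Haskell value.
  readArg  :: Stack -> Int -> a
  -- Write a Haskell result back directly as a unison value.
  writeRes :: a -> Stack -> Stack

instance ForeignConvention Word64 where
  readArg (Stack xs) i  = xs !! i
  writeRes x (Stack xs) = Stack (x : xs)

-- A two-argument wrapper is now trivial: parse the arguments, call the raw
-- Haskell operation, and write the single result back, with no extra trips
-- through the interpreter loop.
foreign2
  :: (ForeignConvention a, ForeignConvention b, ForeignConvention c)
  => (a -> b -> c) -> Stack -> Stack
foreign2 f stk = writeRes (f (readArg stk 0) (readArg stk 1)) stk
```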

One additional thing I did was add support for foreign functions that raise an Exception in the unison sense. The convention for these is that foreign functions return booleans indicating whether the result on the stack is valid or exceptional. In the latter case the stack will have a Failure value, and the interpreter will call the current Exception handler. This means that e.g. the array operations are also just trivial wrappers, and don't need to return their results wrapped in Either.
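Schematically, that convention might look like the following sketch. The types and names here are illustrative only, not the real machine code:

```haskell
-- Hypothetical sketch of the exception convention: a foreign op returns a
-- Bool saying whether the value it leaves behind is a normal result (True)
-- or a Failure that the interpreter should pass to the current Exception
-- handler (False).
module ForeignExceptionSketch where

data Value
  = Nat Word
  | Failure String        -- stand-in for unison's Failure value
  deriving Show

-- e.g. a checked array read: leave the element, or leave a Failure.
checkedIndex :: [Value] -> Int -> (Bool, Value)
checkedIndex xs i
  | i >= 0 && i < length xs = (True, xs !! i)
  | otherwise               = (False, Failure "array index out of bounds")

-- The interpreter's side of the convention, schematically: on False,
-- invoke the installed Exception handler with the Failure value.
dispatch :: (Value -> r) -> (Value -> r) -> (Bool, Value) -> r
dispatch continue raise (ok, v)
  | ok        = continue v
  | otherwise = raise v
```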

Haven't gotten to benchmarks yet, but the tests I've run are passing (@unison/base tests, unison-runtime tests, transcripts, and interpreter-tests). I haven't run any cloud tests; if Cody could arrange for that, that'd be helpful.

Also haven't compared performance with any of our benchmarks yet.

Making this a draft because it seems to indicate I need to merge in trunk or something.

dolio added 7 commits January 9, 2025 10:43
The new conventions are not aimed at encoding structure on the stack,
but instead directly produce corresponding unison values. This
should allow the raw foreign operations to be called directly in
most cases, without the wrapper having to mediate between the
convention encoding and the unison value. This both allows more
efficient direct calls and avoids what might be considerable
overhead from building up the unison values via multiple iterations
through the interpreter loop.

The one portion where a 'flattened' encoding is retained is for
tuples, but this is just for supporting multi-argument functions,
not for actually encoding tuples on the stack.
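Continuing the illustrative sketch from the PR description, a flattened tuple instance might look like this (still hypothetical names, with the per-component slot accounting elided):

```haskell
-- A pair "flattens" into consecutive argument slots, so a Haskell function
-- that conceptually takes a tuple can be invoked as a multi-argument foreign
-- op. No tuple value is ever built on the unison stack; real code would also
-- need to track how many slots each component consumes.
instance (ForeignConvention a, ForeignConvention b)
      => ForeignConvention (a, b) where
  readArg stk i       = (readArg stk i, readArg stk (i + 1))
  writeRes (x, y) stk = writeRes x (writeRes y stk)
```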

Most wrappers became obsolete for foreigns, and we can just specify
how many arguments they take.

The foreign functions now have an option for indicating an
exceptional result, in which case the returned value is a `Failure`,
and the machine will directly call the `Exception` handler using that
value. This allows e.g. array operations to have less interpreter
overhead.

ceedubs commented Jan 15, 2025

I'll run the cloud integration tests against this today.


@ceedubs ceedubs left a comment


I haven't reviewed the code, but I have verified that the Unison Cloud test suite passes with this change.


dolio commented Jan 15, 2025

Here are some benchmark numbers:

| Benchmark | Before | After |
| --- | --- | --- |
| fib1 | 186.674µs | 183.472µs |
| fib2 | 2.802898ms | 2.983028ms |
| fib3 | 3.295707ms | 3.379658ms |
| Decode Nat | 228ns | 176ns |
| Generate 100 random numbers | 180.031µs | 188.523µs |
| List.foldLeft | 2.326404ms | 2.313983ms |
| Count to 1 million | 44.410122ms | 43.79875ms |
| Json parsing (per document) | 276.48µs | 263.727µs |
| Count to N (per element) | 110ns | 110ns |
| Count to 1000 | 111.101µs | 110.738µs |
| Mutate a Ref 1000 times | 163.826µs | 162.988µs |
| CAS an IO.ref 1000 times | 193.437µs | 190.413µs |
| List.range (per element) | 232ns | 241ns |
| List.range 0 1000 | 277.155µs | 307.623µs |
| Set.fromList (range 0 1000) | 1.659985ms | 1.857961ms |
| Map.fromList (range 0 1000) | 1.128239ms | 1.10596ms |
| NatMap.fromList (range 0 1000) | 4.12972ms | 3.531125ms |
| Map.lookup (1k element map) | 1.91µs | 1.621µs |
| Map.insert (1k element map) | 4.726µs | 4.822µs |
| List.at (1k element list) | 205ns | 211ns |
| Text.split / | 48.83µs | 47.647µs |

Most of these aren't really meaningful, though, because they don't exercise anything I changed. The real one is Decode Nat, since that is actually a foreign call. Even though the Set and (Nat)Map numbers look like significant differences, I don't think they are; you can see that much variation just between runs.

I also wrote a test that bubble sorts byte arrays, and it is about twice as fast on the new branch (that is the sort of builtin I would expect to be most improved).

@dolio dolio marked this pull request as ready for review January 15, 2025 20:49

dolio commented Jan 15, 2025

I think this should be good to go. I fiddled around with inlining/strictness on some of the new stuff related to the sorting test case I wrote, but none of it seemed to make any difference.

@dolio dolio merged commit 0a98cc9 into trunk Jan 15, 2025
32 checks passed
@dolio dolio deleted the topic/calling-conventions branch January 15, 2025 22:06