Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opt-in return type checks #8502

Merged
merged 16 commits into from
Dec 19, 2023
Merged

Conversation

radeusgd
Copy link
Member

@radeusgd radeusgd commented Dec 8, 2023

Pull Request Description

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@radeusgd radeusgd linked an issue Dec 9, 2023 that may be closed by this pull request
3 tasks
@radeusgd
Copy link
Member Author

The first approach implements the checks with a simple 'trick' - I 'just' wrap the body of a function that has the return type check with a type ascription (when creating the IR). This then ensures that the type is checked.

At first I thought that this will break TCO (since technically the ascription is an additional operation, so the tail-call would no longer be the last operation in a branch - but apparently this is ignored so TCO still works - I added test cases to verify it).

However, I'm not sure if this is the right solution.


We still need to somehow pass the information about the return type signature into our IR. One way is as implemented in this PR, another one that I considered (and briefly started implementing) is extending the family of IR nodes Method.Binding (and related, Function.Binding etc.) with an optional return type specification.

With the latter approach, the return type is specified more 'directly' and then we can perform an IR transformation adding the ascriptions or do the translation slightly differently. For example this could allow us to add better error messages for when the type is wrong (currently it is 'expression has wrong type', it could be better to clarify that it is a 'return type' of a particular function that was violated).

It also seems that the latter approach may be more friendly to other IR passes that perform any kind of type inference - the type check is part of the function definition so it should then be easier to turn this information into type inference metadata associated with that function. Whereas with the current approach, we'd need to match the body of the function for such ascriptions (but that is still certainly doable and may actually be preferred for more advanced cases (example below)).

Example where actually analyzing such ascriptions is more flexible:

foo x = case x of
    1 -> (x+x : Integer)
    _ : Integer -> x

technically code that is able to analyze the kind of ascriptions, with some additional extensions, should also be able to infer that foo always returns Integer. So for longer term maybe that current solution could actually be good.

@radeusgd
Copy link
Member Author

radeusgd commented Dec 11, 2023

Added the ascriptions to Sieve benchmarks and scheduled a run of benchmarks:

https://github.com/enso-org/enso/actions/runs/7195218562

@radeusgd radeusgd force-pushed the wip/radeusgd/8240-check-return-type branch from c0877f8 to 74aac1a Compare December 12, 2023 11:09
Base automatically changed from wip/kw/parse-inline-signatures to develop December 12, 2023 14:48
@radeusgd radeusgd force-pushed the wip/radeusgd/8240-check-return-type branch 3 times, most recently from f8e03ac to 578ade0 Compare December 14, 2023 13:02
@radeusgd
Copy link
Member Author

The CI did not run my added benchmark (perhaps I need to try again), but I've ran them on my PC.

JMH

[info] Benchmark                          Mode  Cnt     Score      Error  Units
[info] Sieve.Ascribed                     avgt    3   212,305 ±  161,777  ms/op
[info] Sieve.Ascribed_With_Return_Checks  avgt    3   225,027 ±  611,798  ms/op
[info] Sieve.Java_Script_All              avgt    3   595,150 ±  966,628  ms/op
[info] Sieve.Java_Script_File             avgt    3    44,876 ±  144,381  ms/op
[info] Sieve.Java_Script_Filter           avgt    3  1482,088 ±  385,427  ms/op
[info] Sieve.Java_Script_Natural          avgt    3  1381,269 ± 3863,190  ms/op
[info] Sieve.Original                     avgt    3   201,776 ±  165,169  ms/op
[info] Sieve.Without_Types                avgt    3   202,058 ±  178,725  ms/op

Enso Runner

Label Measurement Warmup
Sieve.Original 216.6 218.4
Sieve.Without_Types 293.4 225.2
Sieve.Ascribed 214.8 278.6
Sieve.Ascribed_With_Return_Checks 207.8 223.4
Sieve.Java_Script_All 607.5 453.0
Sieve.Java_Script_File 40.0 41.7
Sieve.Java_Script_Filter 1545.3 1295.8
Sieve.Java_Script_Natural 1442.9 1453.7

image

Interestingly the JMH and Enso Runner results differ a bit - Enso Runner seems to favor the type checks more while on JMH they get worse results relative to the one without.

Looking at the graph with marked std-dev we can see that all tests have similar performance - if we want to get any more detailed results we may need to increase the warmup time/iterations. Within the current setting it seems that the overhead is rather small, if any.

@radeusgd
Copy link
Member Author

I did one more run with extended warmup and measurement iterations and times:

-options = Bench.options . set_warmup (Bench.phase_conf 3 5) . set_measure (Bench.phase_conf 3 3)
+options = Bench.options . set_warmup (Bench.phase_conf 3 7) . set_measure (Bench.phase_conf 5 7)
Label Measurement Warmup
Sieve.Original 196.0 215.6
Sieve.Without_Types 201.1 220.6
Sieve.Ascribed 216.5 217.2
Sieve.Ascribed_With_Return_Checks 217.6 251.7

image

Here we can see that the variants without type checks are slightly faster, even including the variance of measurements, the difference is by about 8%.

@@ -322,7 +322,7 @@ case object TypeSignatures extends IRPass {
*
* @param signature the expression for the type signature
*/
case class Signature(signature: Expression) extends IRPass.IRMetadata {
case class Signature(signature: Expression, comment: Option[String] = None) extends IRPass.IRMetadata {
Copy link
Member

@JaroslavTulach JaroslavTulach Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will likely change the cache serialization format - you'd rather assign new number for the @Persistable(Signature.class) definition.

@@ -322,7 +322,7 @@ case object TypeSignatures extends IRPass {
*
* @param signature the expression for the type signature
*/
case class Signature(signature: Expression) extends IRPass.IRMetadata {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need comment now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed if for now.

* 2. for `foo a : Integer -> Integer`, this results in a compile error currently.
*/
@Test
public void weirdReturnTypeSignature1() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the test called weird? I'd expect that something fails then, but apparently it works. Calling nonIntuitiveForRadek ;-)

acceptAndAdd : Integer
acceptAndAdd self n =
iterate f = case f of
acceptAndAdd : Integer -> Filter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next : Gen
next self -> Gen = Gen.Generator self.n+1

natural : Gen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you can remove the natural : Gen definition now.

@radeusgd radeusgd force-pushed the wip/radeusgd/8240-check-return-type branch from 98457d7 to 78efd1f Compare December 16, 2023 14:04
@radeusgd radeusgd marked this pull request as ready for review December 16, 2023 14:13
@radeusgd
Copy link
Member Author

I decided to scratch the better error messages for now, and try to get this PR merged with the current state - return type mismatch resulting in:

Type error: expected `expression` to be Integer, but got Text.

I will create a second followup PR where I will try to improve it to be:

Type error: expected `function_name` result to be Integer, but got Text.

I decided to keep the PRs separate, as I see 2 possible approaches for implementing the error messages and I feel like at least one of them may be controversial - so I want to discuss it in a separate PR, without blocking merging of the basic feature that already works.

@radeusgd radeusgd self-assigned this Dec 19, 2023
@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Dec 19, 2023
@mergify mergify bot merged commit f0c2a5f into develop Dec 19, 2023
32 of 34 checks passed
@mergify mergify bot deleted the wip/radeusgd/8240-check-return-type branch December 19, 2023 15:32
mergify bot pushed a commit that referenced this pull request Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check return type of functions (opt-in)
3 participants