Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime regression from Scala 2: A non-closure lambda literal is no longer a constant value and is not equal to itself #19224

Closed
neko-kai opened this issue Dec 7, 2023 · 17 comments · Fixed by #19251

Comments

@neko-kai
Copy link
Contributor

neko-kai commented Dec 7, 2023

Compiler version

3.3.1

Minimized code

object app extends App {
  def x(): Int => String = (i: Int) => i.toString

  locally {
    println(x() == x()) // true on Scala 2, false on Scala 3...
    println(x().hashCode -> x().hashCode) // same on Scala 2, different on Scala 3
  }
}

Output

% coursier launch scala:2.13.12 -- example.scala
true
(1240730624,1240730624)
% coursier launch scala:3.3.1 -- example.scala
false
(580212331,390994841)

Expectation

Expected repeated invocations of x() to produce the exact same lambda instance, as before in Scala 2 and currently in Java:

class app {
    static java.util.function.Function<Integer, String> x() {
        return ((Integer i) -> i.toString());
    }

    public static void main(String[] args) {
        System.out.println(x() == x());
        System.out.println(x().equals(x()));
        System.out.println(x().hashCode());
        System.out.println(x().hashCode());
    }
}

Outputs:

true
true
707610042
707610042

We rely on this property very heavily in https://github.com/7mind/izumi (since Java behaves the same way, we assumed this property couldn't be broken in the future)

@neko-kai neko-kai added itype:bug stat:needs triage Every issue needs to have an "area" and "itype" label labels Dec 7, 2023
@mbovel
Copy link
Member

mbovel commented Dec 11, 2023

I could reproduce:

  ~/scala-snippets-4 scala-cli run -S 2.13 closuresEquality.scala     
true
(729864207,729864207)
  ~/scala-snippets-4 scala-cli run -S 3.nightly closuresEquality.scala
false
(128526626,1265210847)

Seems like Scala 2 hoist closures that don't depend on their environment while Dotty doesn't.

Compare ASTs
  ~/scala-snippets-4 scala-cli run --server=false -S 2.13 -Xprint:delambdafy closuresEquality.scala
[[syntax trees at end of                delambdafy]] // closuresEquality.scala
package <empty> {
  object app extends Object with App {
    final protected def args(): Array[String] = app.super.args();
    @deprecated(message = "the delayedInit mechanism will disappear", since = "2.11.0") override def delayedInit(body: Function0): Unit = app.super.delayedInit(body);
    final def main(args: Array[String]): Unit = app.super.main(args);
    final override <stable> <accessor> def executionStart(): Long = (app.this.executionStart: Long);
    private[this] var executionStart: Long = _;
    override <accessor> def _args(): Array[String] = (app.this._args: Array[String]);
    private[this] var _args: Array[String] = _;
    override <accessor> def _args_=(x$1: Array[String]): Unit = app.this._args = (x$1: Array[String]);
    override <stable> <accessor> def initCode(): scala.collection.mutable.ListBuffer = (app.this.initCode: scala.collection.mutable.ListBuffer);
    private[this] var initCode: scala.collection.mutable.ListBuffer = _;
    final override <accessor> protected[this] def scala$App$_setter_$executionStart_=(x$1: Long): Unit = app.this.executionStart = (x$1: Long);
    final override <accessor> protected[this] def initCode_=(x$1: scala.collection.mutable.ListBuffer): Unit = app.this.initCode = (x$1: scala.collection.mutable.ListBuffer);
    def x(): Function1 = $anonfun();
    final <static> <artifact> def $anonfun$x$1(i: Int): String = java.lang.Integer.toString(i);
    final <synthetic> def delayedEndpoint$app$1: Unit = {
      scala.Predef.locally({
        scala.Predef.println(scala.Boolean.box(app.this.x().==(app.this.x())));
        {
          scala.Predef.println(scala.Predef$ArrowAssoc.->$extension(scala.Predef.ArrowAssoc(scala.Int.box(app.this.x().hashCode())), scala.Int.box(app.this.x().hashCode())));
          scala.runtime.BoxedUnit.UNIT
        }
      });
      ()
    };
    def <init>(): app.type = {
      app.super.<init>();
      app.super./*App*/$init$();
      app.this.delayedInit(new app$delayedInit$body(app.this));
      ()
    };
    final <static> <artifact> def $anonfun$x$1$adapted(i: Object): String = app.this.$anonfun$x$1(unbox(i))
  };
  final <synthetic> class app$delayedInit$body extends runtime.AbstractFunction0 {
    <paramaccessor> private[this] val $outer: app.type = _;
    final def apply(): Object = {
      app$delayedInit$body.this.$outer.delayedEndpoint$app$1();
      scala.runtime.BoxedUnit.UNIT
    };
    def <init>($outer: app.type): app$delayedInit$body = {
      if ($outer.eq(null))
        throw null
      else
        app$delayedInit$body.this.$outer = $outer;
      app$delayedInit$body.super.<init>();
      ()
    }
  }
}

 scala-cli compile -S 3.nightly -Xprint:genBCode closuresEquality.scala 
Compiling project (Scala 3.4.0-RC1-bin-20231209-fdd06ce-NIGHTLY, JVM)
[[syntax trees at end of                  genBCode]] // /Users/mbovel/scala-snippets-4/closuresEquality.scala
package <empty> {
  @SourceFile("closuresEquality.scala") final module class app extends Object, 
    App {
    def <init>(): Unit =
      {
        super()
        super[App].$init()
        locally(
          {
            println(Boolean.box(app.x().==(app.x())))
            {
              println(
                {
                  val ev$1: Integer =
                    ArrowAssoc(Int.box(app.x().hashCode())).asInstanceOf[Integer
                      ]
                  ArrowAssoc.->$extension(ev$1, Int.box(app.x().hashCode()))
                }
              )
              scala.runtime.BoxedUnit.UNIT
            }
          }
        )
        scala.runtime.Statics.releaseFence()
        ()
      }
    private <static> var executionStart: Long
    final def executionStart(): Long = app.executionStart
    private <static> var scala$App$$_args: String[]
    def scala$App$$_args(): String[] = app.scala$App$$_args
    private <static> var scala$App$$initCode:
      scala.collection.mutable.ListBuffer
    def scala$App$$initCode(): scala.collection.mutable.ListBuffer =
      app.scala$App$$initCode
    def scala$App$$_args_=(x$1: String[]): Unit = app.scala$App$$_args = x$1
    def scala$App$_setter_$executionStart_$eq(x$0: Long): Unit =
      app.executionStart = x$0
    def scala$App$_setter_$scala$App$$initCode_$eq(
      x$0: scala.collection.mutable.ListBuffer): Unit =
      app.scala$App$$initCode = x$0
    protected final def args(): String[] = super[App].args()
    @deprecated(message = "the delayedInit mechanism will disappear",
      since = "2.11.0") override def delayedInit(body: Function0): Unit =
      super[App].delayedInit(body)
    final def main(args: String[]): Unit = super[App].main(args)
    private def writeReplace(): Object =
      new scala.runtime.ModuleSerializationProxy(classOf[app])
    def x(): Function1 =
      {
        {
          closure(this.x$$anonfun$adapted$1)
        }
      }
    private final def x$$anonfun$1(i: Int): String = Int.box(i).toString()
    private final def x$$anonfun$adapted$1(i: Object): String =
      this.x$$anonfun$1(Int.unbox(i))
  }
  final lazy module val app: app = new app()
}

@mbovel mbovel added area:backend and removed stat:needs triage Every issue needs to have an "area" and "itype" label labels Dec 11, 2023
@mbovel
Copy link
Member

mbovel commented Dec 11, 2023

@sjrd does the Spec say anything about that?

@sjrd
Copy link
Member

sjrd commented Dec 11, 2023

The spec doesn't say anything about this. I don't think the Java spec says anything about it either. Relying on this behavior is just playing with chance.

@mbovel mbovel removed the itype:bug label Dec 11, 2023
@neko-kai
Copy link
Contributor Author

The spec doesn't say anything about this. I don't think the Java spec says anything about it either. Relying on this behavior is just playing with chance.

@sjrd Technically yes, of course. But, in this answer by Brian Goetz he does say that you can 'conclude' this property at least for Oracle's Java implementation.

From an implementation perspective, you can conclude a little more. ... if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.

@odersky
Copy link
Contributor

odersky commented Dec 11, 2023

I verified that the following does work (i.e. Scala 2 and Scala 3 both return true):

class App {
  def x(): Int => String = (i: Int) => i.toString

  locally {
    println(x() == x()) // true on Scala 2, false on Scala 3...
    println(x().hashCode -> x().hashCode) // same on Scala 2, different on Scala 3
  }
}

@main def Test = App()

This could be a simple fix in LambdaLift. I'll investigate.

@pshirshov
Copy link
Contributor

pshirshov commented Dec 11, 2023

@sjrd : even if this is not specified, it might be a very good idea to explicitly specify this. Otherwise we consider lambdas to be second-class citizens and we suddenly get unsoundness where it can be easily avoided.

Also I think that we should also define sound equality rules for for the lambdas with non-empty closures, I think we should create a separate issue for that. I don't see any obvious performance penalties in that. If I understand correctly, both equals and hashCode rely on default implementations right now and, effectively, are useless for bound lambdas.

@SethTisue
Copy link
Member

SethTisue commented Dec 11, 2023

the main Scala 2 PR is #19224 scala/scala#5099 by @retronym — perhaps it could be raided for implementation ideas and/or test cases. not sure if there were any sequels/followups

as the PR description indicates, the primary motivation at the time was serialization (see also scala/scala#4652):

allows lambdas that call the local methods to be serializable, regardless of whether or not the enclosing class is serializable

but also performance, and compatibility with Scala 2 does seem like a reasonable additional motivation for forward-porting

@odersky
Copy link
Contributor

odersky commented Dec 11, 2023

@SethTisue #19224 is this issue. What is the Scala 2 PR that addressed it?

@som-snytt
Copy link
Contributor

The only question is did retronym use the time machine to go to the future to cause this ticket to be created, or did he go to the past to report scala/bug#9408? The PR is as Seth showed scala/scala#4652 IIUC.

Also if we follow his WIP at the distantly related ticket scala/bug#9414 will we find commits from 2025?

@SethTisue
Copy link
Member

SethTisue commented Dec 11, 2023

gah, sorry, the main one I meant to link to is scala/scala#5099 (though yes, 4652 may also be of interest)

@som-snytt
Copy link
Contributor

Now I see why it's better to link to PRs than to commits.

"This change is similar in spirit to SI-9408 / scala/scala@93bee55." also "As of SI-9408 / scala/scala@93bee55" which has a ticket link that was not converted FSR.

odersky added a commit to dotty-staging/dotty that referenced this issue Dec 12, 2023
An anonymous function in a static object was previously mapped to a member
of that object. We now map it to a static member of the toplevel class instead.
This causes the backend to memoize the function, which fixes scala#19224. On the
other hand, we don't do that for anonymous functions nested in the object
constructor, since that can cause deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors
to be static, too, which can cause deadlocks.

Fixes scala#19924
odersky added a commit to dotty-staging/dotty that referenced this issue Dec 12, 2023
An anonymous function in a static object was previously mapped to a member
of that object. We now map it to a static member of the toplevel class instead.
This causes the backend to memoize the function, which fixes scala#19224. On the
other hand, we don't do that for anonymous functions nested in the object
constructor, since that can cause deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors
to be static, too, which can cause deadlocks.

Fixes scala#19924
@odersky
Copy link
Contributor

odersky commented Dec 12, 2023

I have no insight what these PRs do, they don't seem to correspond with anything we have in Dotty. I made a tweak that seems to fix the problem. I also noted that Scala 2 runs into a deadlock for some lifted lambdas where Scala 3 does not.
@retronym, maybe you could take a look at my PR?

@retronym
Copy link
Member

@odersky

There were two motivating reasons for the change.

First, as per @neko-kai's test case, it shares a single instance of a non-capturing lambda, as per Java 8's precedent. Even though most such lambda are themselves tiny and short lived, it can be still beneficial to avoid allocation pressure to let the GC keep as many other short lived objects entirely managed within the eden spaces.

Second, it avoids needlessly capturing an unused outer class. This can make the outer class eligible for GC sooner in some uses cases. It can also avoid "poisoning" a lambda wrt Serializability by keeping a reference to some non-Serializable outer class.

The Java experience here is a little different as Java lambas are only serizable if the function interface is explicitly Serializable (or is intersected with Serialiazable). The standard set of interfaces in java.util.function._ are not themselves serialziable. But when one has opted into a serializable lambda, the static nature of the lambda is equally important to avoid the outer reference.

@neko-kai has wrapped the test case in an object but Scala 2's static lambda transform is more general and also works for:

object Test {
  val t1 = new C
  val t2 = new C

  def main(args: Array[String]): Unit = {
    println(t1.x() == t2.x()) // true on Scala 2, false on Scala 3...
    println(t1.x().hashCode -> t2.x().hashCode) // same on Scala 2, different on Scala 3
  }
}

class C {
  def x(): Int => String = (i: Int) => i.toString
}
import java.util.List;

public class Test {
  private static C t1 = new C();
  private static C t2 = new C();

  public static void main(String[] args) {
    System.out.println(t1.x() == t2.x()); // true on Java false on Scala 3...
    System.out.println(List.of(t1.x().hashCode(), t2.x().hashCode())); // same on Java different on Scala 3
  }
}

class C {
  java.util.function.IntFunction x() { return (int i) -> "" + i; }  
}

To get such examples working in Scala 2 in the linked PR, I found that I had to do a late transform after all phases that could introduce this references (e.g. outer accessors) had run. The late transform was implemented as part of the delambdafy phase, but probably could live in a dedicated phase. It's a little ugly because we have to introduce the idea of STATIC members of classes that do not correspond to scala objects.

@retronym
Copy link
Member

Here's the spot in javac that makes eligible lambda methods static, based on an capture analysis: https://github.com/openjdk/jdk/blob/5463c9cd9a0a6f95f90787c330679b2ea78690c6/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/LambdaToMethod.java#L2220

@odersky
Copy link
Contributor

odersky commented Dec 13, 2023

@retronym: In fact with this PR, Scala 3 and 2 are in alignment also for the code you showed. I added it to the test.

I believe the Scala 3 mechanism is a bit different from Scala 2's. We do everything in LambdaLift. For any method, we tentatively place it as far out as we can, either in the enclosing package (that would make it static) or in the enclosing top-level object. My fix was a tweak where anonymous functions go in this step. Then, we traverse the method's code and note any dependencies on fields. Any such dependency (in the method itself or its callees) causes the method to be moved in the outermost scope that sees the dependency via a this-type.

odersky added a commit to dotty-staging/dotty that referenced this issue Dec 13, 2023
An anonymous function in a static object was previously mapped to a member
of that object. We now map it to a static member of the toplevel class instead.
This causes the backend to memoize the function, which fixes scala#19224. On the
other hand, we don't do that for anonymous functions nested in the object
constructor, since that can cause deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors
to be static, too, which can cause deadlocks.

Fixes scala#19224
@retronym
Copy link
Member

Oh great, that looks like it should doing the right thing then!

@retronym
Copy link
Member

I like the approach of pushing these out as far as possible, avoiding the need to deference long chains of outer pointers from nested lambdas (as are typical in for comprehensions).

👏

odersky added a commit to dotty-staging/dotty that referenced this issue Dec 13, 2023
An anonymous function in a static object was previously mapped to a member
of that object. We now map it to a static member of the toplevel class instead.
This causes the backend to memoize the function, which fixes scala#19224. On the
other hand, we don't do that for anonymous functions nested in the object
constructor, since that can cause deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors
to be static, too, which can cause deadlocks.

Fixes scala#19224
odersky added a commit that referenced this issue Dec 14, 2023
An anonymous function in a static object was previously mapped to a
member of that object. We now map it to a static member of the toplevel
class instead. This causes the backend to memoize the function, which
fixes #19224. On the other hand, we don't do that for anonymous
functions nested in the object constructor, since that can cause
deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors to
be static, too, which can cause deadlocks.

Fixes #19224
WojciechMazur pushed a commit that referenced this issue Jun 25, 2024
An anonymous function in a static object was previously mapped to a member
of that object. We now map it to a static member of the toplevel class instead.
This causes the backend to memoize the function, which fixes #19224. On the
other hand, we don't do that for anonymous functions nested in the object
constructor, since that can cause deadlocks (see run/deadlock.scala).

Scala 2's behavior is different: it does lift lambdas in constructors
to be static, too, which can cause deadlocks.

Fixes #19224

[Cherry-picked 22a959a]
neko-kai added a commit to 7mind/izumi that referenced this issue Jul 29, 2024
neko-kai added a commit to 7mind/izumi that referenced this issue Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants