Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing type annotations #655

Closed
wants to merge 1 commit into from
Closed

Missing type annotations #655

wants to merge 1 commit into from

Conversation

dubinsky
Copy link
Contributor

@dubinsky dubinsky commented Mar 27, 2023

  • introduce Scala-version-specific MiMa exclusions
  • add type annotations that break binary compatibility except for the unapply() methods for the Node subclasses;
  • add appropriate MiMa exclusions

@SethTisue as discussed in #649:

could we have one set of MiMa exclusions for Scala 2 and one for Scala 3, so that we can know for sure that any potential bincompat breakage is only on the Scala 3 side?

done; only Scala 3-specific exclusions are added in this pull request
NOTE: no longer true; I added all the remaining missing type annotations except for the unapply() methods: since even with the Scala 3-only exclusions this pull request is considered too risky (although not even one example of the breakage it can cause is known), I think it is better to record all the related changes in one pull request.

do you understand why MiMa considers the more specific types to break binary compatibility?

probably because MiMa can not take subtyping into account since JVM class linking does not...

I understand the reluctance to break binary compatibility between the current and the next versions of this library just to add some type annotations, but please consider that because those type annotations are not there, binary compatibility is already broken between the builds of the same version of this library for different versions of Scala: code that uses the methods with the type annotations missing that is built with Scala 2 will stop compiling - or will start behaving differently - when switched to Scala 3 with the same version of this library...

@dubinsky dubinsky marked this pull request as ready for review March 27, 2023 03:17
@SethTisue SethTisue requested a review from ashawley March 27, 2023 06:55
@SethTisue SethTisue mentioned this pull request Mar 27, 2023
@dubinsky dubinsky changed the title missing type annotations in scala.xml.Null: Missing type annotations Mar 27, 2023
@SethTisue
Copy link
Member

probably because MiMa can not take subtyping into account since JVM class linking does not

Seems like a good general description, but I was hoping we could understand the possible failure scenarios more precisely in order to gauge the danger level here. We can be (very nearly) sure that MiMa is technically correct here that failure scenarios exist, but I'm wondering how far-fetched they are. For example, do they only involve users extending scala-xml classes that are rarely or never extended in practice? Or is there a possibility of more ordinary usage of the library running into trouble?

I'm asking not only in order to help us make an up/down decision about this PR, but also because even in the case of an "up" decision, we'll need to document the situation in the release notes.

@SethTisue
Copy link
Member

One possibility to try to give us some hope that downstream breakage wouldn't be widespread and immediate would be to try this out in the Scala 3 community build — not the big, “open” community build maintained by VirtusLab, which only tests varying the Scala 3 version, but the smaller community build in the lampepfl/dotty repo. If I understand correctly, in that smaller build an updated scala-xml would be used by downstream libraries.

@SethTisue
Copy link
Member

Another thing that could be help would be to put out an RC — 2.2.0-RC1, I guess it would be — and ask some important downstream users to try it out.

@SethTisue
Copy link
Member

SethTisue commented Mar 28, 2023

While I'm throwing random ideas out there, here's one more:

We could declare a version 3 but move all the code to a new package, so that the new version and the old version could coexist on the same classpath and there wouldn't be a riot of eviction warnings in everyone's builds, ecosystem-wide, as we all only just got (mostly) done doing with the 1->2 upgrade.

:sigh:

cc @ckipp01

@ckipp01
Copy link
Member

ckipp01 commented Mar 28, 2023

as we all only just got (mostly) done doing with the 1->2 upgrade.

:sigh:

cc @ckipp01

shivvers

I don't really have many thoughts on this. I just know that if we knowing break compat again people will probably lose their minds. The update to 2.x was a way bigger deal than I thought it would be.

Since we are throwing wild ideas out there, if we were to break, coinciding with he sbt 2.x release could be the best time.... but again, I really really think we should not break again in the same manner that we did before.

@ashawley
Copy link
Member

The collections library update for 2.13 caused some changes for scala-xml, but they were deemed compatible enough and to only require a 1.3.0 release. The 2.0 release series of scala-xml was intended to help land any significant changes for Scala 3 "dotty". Those changes never came. Most of the changes for Scala 3 were required upstream in dotty. Unlike the Scala 2.13 effort, I never studied the dotty artifacts closely. It's interesting that we missed these signature differences between the two archives. Not a lot of people probably interface with these low-level bits of of scala-xml to notice. It seems fine to try fixing these type signatures as a bug fix. If the experiment fails we can rollback.

@dubinsky
Copy link
Contributor Author

dubinsky commented Mar 28, 2023

@SethTisue

probably because MiMa can not take subtyping into account since JVM class linking does not

Seems like a good general description, but I was hoping we could understand the possible failure scenarios more precisely in order to gauge the danger level here. We can be (very nearly) sure that MiMa is technically correct here that failure scenarios exist, but I'm wondering how far-fetched they are. For example, do they only involve users extending scala-xml classes that are rarely or never extended in practice? Or is there a possibility of more ordinary usage of the library running into trouble?

I'm asking not only in order to help us make an up/down decision about this PR, but also because even in the case of an "up" decision, we'll need to document the situation in the release notes.

I suspect that since method invocation on JVM references a method descriptor that specifies the method's return type, ordinary usage of the methods affected by this change will cause breakage. I'll try to tabulate some of the failure scenarios illustrating this in the coming days...

@ashawley

Unlike the Scala 2.13 effort, I never studied the dotty artifacts closely. It's interesting that we missed these signature differences between the two archives. Not a lot of people probably interface with these low-level bits of of scala-xml to notice.

It's easy to miss these signature differences without either

  • an automated binary compatibility check across the Scala versions for the same version of the library (which we do not have)
  • or - better! - having all methods annotated with the return types, so that there are no signature differences ;)

The collections library update for 2.13 caused some changes for scala-xml, but they were deemed compatible enough and to only require a 1.3.0 release.

There is a bunch of unapply() methods which return a Seq[Node] as a part of their result, and since they do not have type annotations, their signatures are inferred by the Scala compiler, resulting in the signatures being different not only between Scala 2.13 and Scala 3, but also between Scala 2.12 and Scala 2.13. Those differences were missed too, and they affect the run-time behavior of pattern-matching on Node subclasses - something that a lot of people probable do interface with.

@ashawley

It seems fine to try fixing these type signatures as a bug fix. If the experiment fails we can rollback.

@ckipp01

I really really think we should not break again in the same manner that we did before.

I am going to:

  • look into the specifics of the breakage resulting from this change
  • create another pull request that covers the unapply() methods (and binary compatibility breakage for Scala 2)

so that we can have a full(er) picture.

@dubinsky
Copy link
Contributor Author

dubinsky commented Mar 29, 2023

@SethTisue @ckipp01 @ashawley my findings so far:

Proposed change affects return type of all the involved methods in a uniform way:

  • in the companion object, return type changes to a more specific one;
  • in the class itself, a variant of the method with the same parameters but more specific return type appears alongside what existed before the change. For example:
public static scala.xml.MetaData next(); // before the change
public static scala.runtime.Null$ next(); // after the change

public scala.xml.MetaData next(); // before and after the change
public scala.runtime.Null$ next(); // appears after the change

Reflection on the method continues to return the non-specific variant even after the change, contrary to what I expected after reading this comment on the java.lang.Class.getMethod():

The result of this algorithm is chosen arbitrarily from the methods with most specific return type among all selected methods...
There may be more than one method with matching name and parameter types in a class because while the Java language forbids a class to declare multiple methods with the same signature but different return types, the Java virtual machine does not. This increased flexibility in the virtual machine can be used to implement various language features. For example, covariant returns can be implemented with bridge methods; the bridge method and the overriding method would have the same signature but different return types. This method would return the overriding method as it would have a more specific return type.

Turns out, when I said the "MiMa can not take subtyping into account since JVM class linking does not", I was wrong - at least about the JVM: even reflection takes subtyping into account, let alone invokevirual JVM instruction (being virtual and all :)). As a result, the code that uses scala-xml-2.1.0 with Scala 3, where the compiler infers less specific types, and invokes a method affected by this change with such pre-change less specific type - for instance val next: MetaData = Null.next - works just fine with the version of scala-xml where the type changes to a more specific one (scala.Null), notwithstanding the fact that the less specific type used is encoded in the classfile (invokevirtual // Method scala/xml/Null$.next:()Lscala/xml/MetaData;).

So far I was not able to see any breakage introduced by this change, nor figure out what breakage MiMa is warning about from its code. It is clear that that code was written by people who understand this stuff much better than me (to put it mildly), but it is possible that the potential breakage MiMa is worried about materializes only in situations where different methods are called depending on how the calling code "sees" the class or some such(scala/bug#11804, scala/scala#9141) - concerns that does not seem to apply to this change.

@SethTisue

I was hoping we could understand the possible failure scenarios more precisely in order to gauge the danger level here

Maybe you know somebody to consult on this? It would be a shame to not apply this change, since unlike the breakage it could cause, we do know what existing breakage its absence does cause: the code that uses scala-xml-2.1.0 with Scala 2.12 or 2.13, where the compiler infers more specific types, and invokes a method with such more specific type - for instance val next: scala.Null = Null.next - does not work with scala-xml-2.1.0 with Scala 3, where the compiler infers less specific types (scala.xml.MetaData): java.lang.NoSuchMethodError: 'scala.runtime.Null$ scala.xml.Null$.getNamespace(scala.xml.Node)'; it doesn't even compile using scala-xml-2.1.0 with Scala 3 either (Found: xml.MetaData, Required: Null), forcing the developers facing this breakage to weaken the types...

In the next pull request in this series, the one addressing the unapply() methods, pre-existing breakage involved is more far-reaching (it affects pattern-matching on Node subclasses) and more insidious (silent code behavior changes); it would be nice to be able to take care of that...

@SethTisue
Copy link
Member

@dwijnand can we get your input here?

@dubinsky
Copy link
Contributor Author

@SethTisue do you think there is a way to get some input on this?

@lrytz
Copy link
Member

lrytz commented May 16, 2023

@SethTisue and me discussed this today.

IMO breaking binary compatibility of Scala 3 artifacts over this is not worth the risk (or the effort to release a new major version of xml).

To get explicit return types in the source, you can add a single top-level object with type aliases, with separate sources for 2 and 3. Something like this:

// For Scala 2
private[xml] object ReturnTypes {
  type NullNext = scala.Null
  type NullKey = scala.Null
  type NullValue = scala.Null
  ...
}
// For Scala 3
private[xml] object ReturnTypes {
  type NullNext = MetaData
  type NullKey = String
  type NullValue = Seq[Node]
  ...
}

and then change the definitions to use these aliases

case object Null extends MetaData {
...
  override def next: NullNext = null
  override def key: NullKey = null
  override def value: NullValue = null
...
}

@SethTisue
Copy link
Member

I understand the reluctance to break binary compatibility between the current and the next versions of this library just to add some type annotations, but please consider that because those type annotations are not there, binary compatibility is already broken between the builds of the same version of this library for different versions of Scala: code that uses the methods with the type annotations missing that is built with Scala 2 will stop compiling - or will start behaving differently - when switched to Scala 3 with the same version of this library...

I would consider that source compatibility, not binary compatibility, as switching to Scala 3 involves recompiling the code. Binary compatibility is about what happens when you swap in a new binary artifact without recompiling.

I don't think we should be bothered by minor source incompatibilities between the Scala 2 and Scala 3 versions. We might have avoided it if we'd seen it coming, but now that it's happened, I don't think we should feel bothered. Users haven't complained. Yes people are still in the process of migrating to Scala 3, but many have already migrated, too and we aren't getting user complaints/questions about the discrepancies in the types.

Also, I think the Scala 3 types are actually fine. Having different types in 2 and 3 is a bit awkward, but I don't think the Scala 3 types are actually bad — the less specific types seem fine to me.

Maybe you know somebody to consult on this? It would be a shame to not apply this change, since unlike the breakage it could cause, we do know what existing breakage its absence does cause: the code that uses scala-xml-2.1.0 with Scala 2.12 or 2.13, where the compiler infers more specific types, and invokes a method with such more specific type - for instance val next: scala.Null = Null.next - does not work with scala-xml-2.1.0 with Scala 3, where the compiler infers less specific types (scala.xml.MetaData): java.lang.NoSuchMethodError: 'scala.runtime.Null$ scala.xml.Null$.getNamespace(scala.xml.Node)'; it doesn't even compile using scala-xml-2.1.0 with Scala 3 either (Found: xml.MetaData, Required: Null), forcing the developers facing this breakage to weaken the types...

When something doesn't even compile, that's actually good! The compiler is letting the user know there's a problem they need to address. It's far, far less pernicious than runtime breakage.

I'm not 100% sure I follow what you're saying about NoSuchMethodError, though. That's a runtime error, so that worries me. Can you be more specific about what scenario you anticipate NoSuchMethodError occurring...?

@ashawley
Copy link
Member

I wonder if we determine we can't make these signature fixes or any binary changes in 2.0 really, since it is too risky, maybe we need to just fork the 2.0 branch and make a 3.0 RC version for Leonid and others to make progress with? Or make a forked library and repo, call it "scala-xml3" or such?

@dubinsky
Copy link
Contributor Author

@lrytz

To get explicit return types in the source, you can ...

Thank you for your suggestion; it allows to make the status-quo explicit, but not to make types to be the same for Scala 2 and Scala 3... Also, the cost of just making the current types explicit - declaring them indirectly via ReturnTypes - makes it not worth it IMO.

@SethTisue

code that uses the methods with the type annotations missing built with Scala 2 will stop compiling with Scala 3

I would consider that source compatibility, not binary compatibility,

You are right, of course :)

@SethTisue

I don't think we should be bothered by minor source incompatibilities between the Scala 2 and Scala 3 versions. We might have avoided it if we'd seen it coming, but now that it's happened, I don't think we should feel bothered. Users haven't complained.
Also, I think the Scala 3 types are actually fine. Having different types in 2 and 3 is a bit awkward, but I don't think the Scala 3 types are actually bad — the less specific types seem fine to me.

Maybe the users do not complain because in the midst of the migration from Scala 2 to Scala 3, minor changes like this are the least of their problems ;)

Forcing the users that are trying to use the most specific types available to weaken them bothers me, and so does having different types for Scala 2 and Scala 3 for no good reason.

@SethTisue

When something doesn't even compile, that's actually good! The compiler is letting the user know there's a problem they need to address. It's far, far less pernicious than runtime breakage.

Absolutely! In this case, the problem is spurious though :(

@SethTisue

I'm not 100% sure I follow what you're saying about NoSuchMethodError, though. That's a runtime error, so that worries me.

Worry not: it happens when the Scala XML library built for Scala 2 is swapped for the one built for Scala 3 without recompiling the code using it - a scenario not covered by any compatibility guarantees.

@lrytz

IMO breaking binary compatibility of Scala 3 artifacts over this is not worth the risk

@SethTisue

do you understand why MiMa considers the more specific types to break binary compatibility?
We can be (very nearly) sure that MiMa is technically correct here that failure scenarios exist, but I'm wondering how far-fetched they are.

Is there a way to gauge this risk? So far, we are not aware of any scenarios where those changes lead to breakage...

@ashawley

I wonder if we determine we can't make these signature fixes or any binary changes in 2.0 really, since it is too risky, maybe we need to just fork the 2.0 branch and make a 3.0 RC version for Leonid and others to make progress with?

I think we are at that point, yes: I have more radical changes planned, and it would be nice to have a place for them; on the other hand, if chances of those changes ever being released are slim (as seems to be the case) - is there any point?

@dubinsky dubinsky marked this pull request as draft May 22, 2023 01:23
- introduce Scala-version-specific  MiMa exclusions
- add type annotations that break binary compatibility except for the `unapply()` methods for the `Node` subclasses;
- add appropriate MiMa exclusions
@lrytz
Copy link
Member

lrytz commented Jun 16, 2023

@dubinsky I apologize I missed your reply.

it allows to make the status-quo explicit, but not to make types to be the same for Scala 2 and Scala 3

I understand, and I believe this is the right thing to do here.

Of course it's not ideal to have a mismatch between the signatures of a method in Scala 2 vs 3. But:

  • the other solutions (breaking binary compatibility, or releasing scala-xml 3.0) are risky (binary compatibility is very hard to reason about, we failed more than once in the past) or very costly
  • we are talking about a couple of overrides that narrow to Null / Nil / Nothing. User code will not depend on these signatures, both the Scala 2 and Scala 3 versions will work equally well in ordinary user code (people are not going to write Null.next). It doesn't complicate cross-building / source compatibility.

the cost of just making the current types explicit - declaring them indirectly via ReturnTypes - makes it not worth it IMO.

Why not? Result types of public methods should be explicit, as we are just witnessing here.

@dubinsky
Copy link
Contributor Author

dubinsky commented Jun 19, 2023

@dubinsky I apologize I missed your reply.

No problem @lrytz , thank you for the clarifications!

the cost of just making the current types explicit - declaring them indirectly via ReturnTypes - makes it not worth it IMO.

Why not? Result types of public methods should be explicit, as we are just witnessing here.

Type annotations serve to 1) make the intent known to the compiler and 2) to document the types for the developer. What we are witnessing here is that result types of the public methods should have been explicit to begin with to make the intent clear and avoid the not ideal signature mismatch between Scala versions. Making explicit the accidental status quo breaks 1) by codifying the situation that is not aligned with the intent, and ReturnTypes.NullNext is more confusing than scala.Null.

Above is the reasoning behind my reluctance to make the types explicit but version specific; now that I understand that there is no chance to remedy the situation and roll the accident back, this reluctance is gone: any documentation of the types is better than no documentation... So, I shall make a pull request that follows your suggestion and add type annotations wherever they are missing. Thank you!!

By the way, signature mismatch is not limited to obscure things like Null.next; it seems that Scala 2.13 and Scala 3 infer scala.collection.immutable.Seq for repeated parameters and sequences returned by the unapplySeq, whereas Scala 2.12 infers scala.collection.Seq. There is nothing I can do about the repeated parameters, but I will need to annotate the return types of the unapplySeq methods in the three scala.xml.Node subclasses that have them in a Scala-version-specific way too...

@dubinsky
Copy link
Contributor Author

Pull request that adds all the missing type annotations using Scala-version-dependent return types is up: #675.
Closing this one. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants