Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add toggle for ijar in java_import #14967

Closed
wants to merge 1 commit into from

Conversation

cpsauer
Copy link
Contributor

@cpsauer cpsauer commented Mar 5, 2022

Hi, wonderful Bazel folks. Here's another PR for your consideration :)

While fixing ProGuards for java_import in #14966, I ran across #4549, wherein we were fragmenting efforts to import jars because the java_import rule had been broken for Kotlin interfaces for so long.

It seemed from the discussion there that adding a toggle for ijar on java_import was the right interim move, and was likely to unblock things with few downsides. Plus, I'd just added a (private) attribute to java import in #14966, so I figured I'd toss up a quick PR.

Like the previous PR, this isn't something I immediately need it for myself or anything, I just figured I'd be a good ex-Googler and toss in a fix when I saw an easy opportunity and what seemed like a burning need (from the length of discussion and effort on alternative implementations). I do think I'd better stop the yak shave here, though, rather than going on to fix ijar.

As always, please feel free to modify the PR as you see fit, especially if that'd make things faster overall! This bundle of PRs is my first time touching a lot of these parts of the codebase--so I'm hoping you'll be patient with me.

More details from the commit message:

Problem: java_import has been unusably broken for years for JARs with Kotlin interfaces, since ijar strips out important information. This has caused multiple dependent projects (including official Bazel ones) to abandon java_import in favor of rolling their own versions, which themselves contain issues that are getting fixed in java_import. Fragmentation is bad, and fragmentation of bugs and fixes is worse.
For more, see
https://github.com/bazelbuild/rules_jvm_external/blob/master/private/rules/jvm_import.bzl
#4549
#14966
bazel-contrib/rules_jvm_external#672

Temporary solution: Until such time as ijar is fixed for Kotlin, this adds a toggle that enables/disables ijar on java_import. This should unblock use of java_import for libraries that might contain Kotlin, so implementations can reunify.

It also restores java_import to a state where it works correctly by default. Per the user manual, ijars are just a build performance optimization to allow caching of actions that use JARs whose implementations change frequently [1]. Imported (externally compiled) JARs shouldn't be changing very often, meaning that the build performance cost of disabling ijars for these prebuilt JARs should be relatively low. Therefore, the ijar toggle is off by default, so the build is correct by default. But ijar is still made available though the toggle, just in case someone is importing a Java-interface-only JAR that they change all the time.

[1] https://docs.bazel.build/versions/main/user-manual.html#flag--use_ijars


Thanks for your consideration!
Chris

Problem: java_import has been unusably broken for years for JARs with Kotlin interfaces, since ijar strips out important information. This has caused multiple dependent projects (includng official Bazel ones) to abandon java_import in favor of rolling their own versions, which themselves contain issues that are getting fixed in java_import. Fragmentation is bad, and fragmentation of bugs and fixes is worse.
For more, see
https://github.com/bazelbuild/rules_jvm_external/blob/master/private/rules/jvm_import.bzl
bazelbuild#4549
bazelbuild#14966
bazel-contrib/rules_jvm_external#672

Temporary solution: Until such time as ijar is fixed for Kotlin, this adds a toggle that enables/disables ijar on java_import. This should unblock use of java_import for libraries that might contain Kotlin, so implementations can reunify.

It also restores java_import to a state where it works correctly by default. Per the user manual, ijars are a build performance optimization to allow caching of actions that use JARs whose implementations change frequenly [1]. Imported (externally compiled) JARs shouldn't be changing very often, meaning that the build performance cost of disabling ijars for these prebuilt JARs should be relatively low. Therefore, the ijar toggle is off by default, so the build is correct by default. But ijar is still made available though the toggle, just in case someone is importing a Java-interface-only JAR that they change all the time.

[1] https://docs.bazel.build/versions/main/user-manual.html#flag--use_ijars
@gregestren gregestren added the team-Rules-Java Issues for Java rules label Mar 7, 2022
@comius
Copy link
Contributor

comius commented Mar 8, 2022

This PR introduces a new public attribute on java_import. The problem with that is how "temporary" the solution is. Once new public attribute gets used in Bazel, we need to migrate/wean users of of it - by a warning and error and finally removing it. Which takes 1 to 2 LTS releases - 1 to 2 years.

Another problem I see if I'm not mistaken, this is changing the default behaviour of use_ijar from true to false.

@comius comius requested review from comius and removed request for lberki March 8, 2022 08:42
@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 9, 2022

Thanks for your efforts and for taking a look, @comius!

Re ijar disabled by default: Yes, definitely! That's intentional and a big part of the point of the commit.
For my reasoning, please see middle/last bit of the commit message and description:

It also restores java_import to a state where it works correctly by default. Per the user manual, ijars are just a build performance optimization to allow caching of actions that use JARs whose implementations change frequently [1]. Imported (externally compiled) JARs shouldn't be changing very often, meaning that the build performance cost of disabling ijars for these prebuilt JARs should be relatively low. Therefore, the ijar toggle is off by default, so the build is correct by default. But ijar is still made available though the toggle, just in case someone is importing a Java-interface-only JAR that they change all the time.

Again, ijar seems to be broken for such a large chunk of use cases that people are reimplementing this rule from scratch many times over just to avoid having ijar on. (See ~100 messages in #4549 about the breakage, outstanding for more than 4 years, and reimplementations in bazelbuild/rules_jvm_external, kotlin rules, square's rules, etc.) Basically, it seemed like the right call to stop the bleeding and have the rule work out of the box without sneaky, subtle failures. (And with minimal cost, it seems, since the broken part is just a performance optimization that doesn't really apply here, and doesn't seem to have even been bothered with for the parallel aar_import rule...) If any of that's wrong and you'd rather solve things another way, though, I'd certainly like to hear it!

(Obviously, fixing ijar is the best option if folks with enough JVM expertise are down to do it. Hence the prioritization request in that other PR you responded to. But given how many years it's been broken, given the continuing, mounting cost to Bazel of not fixing, and given that its just a performance optimization that doesn't seem to really apply here, I figured I'd toss up this PR for consideration.)

If we leave ijar on by default (i.e. broken by default), then all those users are going to have to set the temporary attribute to get things to work, which would make it harder to get rid of and the rule harder to use. That didn't seem like a good tradeoff, but I'd be happy to change it if you'd require.

That gets us to your other point, the temporary attribute. Another reasonable option would be to not introduce a temporary attribute, and just disable ijar completely for now. Sure! But it seemed like if there were ever a time for a temporary option, this would be it. When ijar is fixed, and there's no reason to not use ijar, just make the option a no op. Moving off of it is as easy as it gets; people can just delete the calls. And setting the option should be so infrequent (like probably only for testing?) and never necessary that I think you could just disclaim that it could disappear at any time and be done with it, as is already don for a bunch of other Bazel APIs. Plus, it's kinda nice to keep the codepath and tests around if the goal is to reenable it someday. And it's the interface users explicitly requested. Up to you, though!

To summarize and recap the thought process:
Rule sufficiently broken that people are abandoning it and writing multiple versions with other problems
&& no fix for user pain for >4 years because the perfect fix is hard
&& the broken part appears to be a minimal performance optimization that users don't care about in their reimplementations
-> Gah! That's really bad. Even as a bystander, the former Googler in me can't stand by.
Hence, PR to disable the broken performance optimization to stop the pain and fragmentation--at least until the perfect fix arrives. Can't let the perfect be the enemy of the good!

Totally disable ijar in java_import? Or give people the option to turn it back on if they're using one of the rare use cases where it's a performance win? Ehh 🤷🏻‍♂️. But probably better to give people the option, since it's beneficial and probably less effort than disabling and reenabling when things are eventually fixed.

Thanks again for your consideration!
Chris
(ex-Googler)

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 9, 2022

[cc'ing also @kevin1e100, because you'd tagged him on the prioritization request that this impacts.]

@kevin1e100
Copy link
Contributor

kevin1e100 commented Mar 9, 2022

This is certainly a well-known issue that regularly leads to user confusion. To me personally, the value/benefit of using ijar in java_import has never been very clear, and I'd support turning that off for that particular rule (even without attribute to turn it back on). As an alternative, I suspect it might be reasonably straightforward to make the ijar tool itself at least optionally (ie, behind a flag) not strip any Kotlin classes (i.e., classes with @kotlin.Metadata annotations), and to set that flag when java_import invokes ijar.

I'll point out that this change alone doesn't necessarily make it straightforward to use java_import for Kotlin Jars, since those Jars generally need a dependency on the Kotlin standard library. While it's possible to declare that in the java_import's deps, I suspect that users will still struggle to correctly use java_import for Kotlin Jars even with this change. rules_kotlin's kt_jvm_import correctly adds the needed dependency, I believe, so even with this change, kt_jvm_import is still beneficial to use, but of course users need to know about it.

While the same caveat applies to aar_import, that rule has logic built-in to make sure any needed deps are declared, IIRC, so it helps users import AARs containing Kotlin code correctly. Ideally, java_import would do that as well, but it would likely be a struggle to make it do that, at least by default.

Even without tooling to ensure that deps are declared, it's probably still nicer for java_import to be usable for Kotlin Jars, even if it requires declaring deps for the imported to function correctly.

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 9, 2022

Wow! Thanks for an awesome, super knowledgable response, @kevin1e100. (🔥 Google username, btw)
Also appreciate the backup on the user need.

Where do you want to go from here?

  • Seems like you're also firmly in favor of us doing something to help out here in the near term.
  • Would you prefer landing this, or do you think (or others) would have time to fix ijar so soon that we shouldn't?
    • (Ties back into the prioritization question in that other PR.)
  • If landing this: Do you actively want me to remove the attribute before we land?
    • My read was that you were neutral, but I wanted to make sure.
    • I'd be weakly in favor of keeping it--which is why I created it--but only weakly. Probably you already got that from the discussion above.

Finally, one thing re the Kotlin standard library dependency that I'd like to check with you on:
I'm hoping that this'll change would at least unblock the use of java_import for maven/rules_jvm_external, where JAR importing is so heavily used, and where Kotlin support is needed. There, I'd think the Kotlin standard library would be declared as a Maven dependency in the POM, and thus picked up automatically and work. Does that seem right to you, too?

(For manual use cases, kt_jvm_import is totally fine, but maybe they'd want to leverage java_import to help with the implementation, e.g., to properly support ProGuard as in that other PR #14966.)

@benjaminp
Copy link
Collaborator

Interface jars do have some value for java_import targets because interface jars are substantially smaller than the full jars. That means remote execution systems have to shuffle less bytes around to compile the reverse dependencies of an ijared java_import target.

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 9, 2022

Ooh! Good thought. I don't have much experience thinking about remote exec.

@benjaminp, do you think that is bad enough that we should alter our plan here, or just a good thing to consider?

Naive question: Would the remote executors already be pulling down the full jars from maven when setting up the workspace? In my mind that rules_jvm_external use case seems like probably the majority use of java_import?

Edit: heh, quick search indicated that you're also a former Dropboxer! That was the first company I worked for. Fun times.

@benjaminp
Copy link
Collaborator

Ooh! Good thought. I don't have much experience thinking about remote exec.

@benjaminp, do you think that is bad enough that we should alter our plan here, or just a good thing to consider?

It's something to measure.

Naive question: Would the remote executors already be pulling down the full jars from maven when setting up the workspace? In my mind that rules_jvm_external use case seems like probably the majority use of java_import?

The full jar will be involved at some points naturally. You only need it to actually run a binary, though. So, only creating the deploy jars or running the tests at the top-level will have to pull down the full jar.

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 10, 2022

Thanks! You don't have a good way of measuring by any chance, do you, Benjamin?

@Bencodes
Copy link
Contributor

Until such time as ijar is fixed for Kotlin, this adds a toggle that enables/disables ijar on java_import. This should unblock use of java_import for libraries that might contain Kotlin, so implementations can reunify.

Given some of the drawbacks mentioned here and the performance wins that ijar provides, fixing ijar definitely seems like a better path forward long term that doesn't complicate LTS releases. Is there anything blocking us from making the necessary changes to ijar so that it's compatible with Kotlin class files, or has anyone scoped out this work?

@Bencodes
Copy link
Contributor

@hsyed wrote some test cases back in 2018 that might be useful here #4632.

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 11, 2022

Hey, @Bencodes! Good to see you here, too. I'd written a more full answer to your similar questions re ijar over in bazelbuild/rules_kotlin#697

But to summarize for this discussion:
I don't have more info on what's hung up a fix to ijar (for 4+ years!) than is in it's issue thread #4549. But the only blocker seemed to be someone with Kotlin/JVM expertise taking it on, plus enough C++ fluency to modify ijar.

I, too, would certainly prefer to see ijar fixed for Kotlin. @kevin1e100 was triaged in originally to see if it might merit a higher priority. But it seems like it requires enough Kotlin/JVM expertise that someone else would have comparative advantage over me there. My bet is that a Kotlin-capable ijar might be quite useful more generally for Kotlin, but I'm guessing you know more about that that I do :)

As above, this work is just proposing an interim fix that gets java_import back working for people until we get ijar fixed. When I drafted this, I'd figured ijar wasn't going to get fixed quickly if it'd been broken so long. But I would love to be proven wrong. Literally zero qualms about closing this PR right down because there's an ijar fix that supplants it! No ponies in this race except trying to get fixes out to people because brokenness, pain, and fragmentation make me sad.

Still, IMO it's well worth landing in some form iff someone isn't going to fix ijar right away. In hindsight, don't we wish that 4 years ago they'd done this quick fix, as they'd proposed, rather than letting the perfect be the enemy of the good? That would have saved a bunch of rules that imported jars from abandoning java_import as too broken, fragmenting implementations every which way. I'm really hoping we can undo some of that with this, especially in rules_jvm_external.

I also think we might be overdoing the performance (micro) optimization and option concerns a bit. Again, most jar importing (rules_jvm_external) has abandoned this rule in favor of rules without the performance optimization, presumably bc it didn't much matter. (Again, it doesn't seem to much apply for prebuilt jars.) And if you'd like it to have no changes to the user interface, we can totally do that, though it does seems like this is basically the easiest temporary option, migration wise. Happy either way.

Chris

@fmeum
Copy link
Collaborator

fmeum commented Mar 11, 2022

I briefly looked into making ijar preserve Kotlin classes, but it's not that easy: AFAIU, ijar doesn't even parse the part of the class file that contains the actual method implementations - after all, it is designed not to emit them. However, for classes with the kotlin.Metadata annotation, it would have to emit method bodies. @hsyed also identified some issues with the use of typealias which are not fully understood yet.

A simpler workaround could be to add functionality to ijar behind a flag that iterates over the paths in the JAR once at startup and simply emits the JAR unchanged if it finds a something.kotlin_module under META-INF. That would definitely be doable, I just don't know enough about Kotlin internals to determine whether it's sufficient.

@comius
Copy link
Contributor

comius commented Mar 16, 2022

cc @cushon

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 18, 2022

I'd just really like to avoid having improvements stall like #4549. Could we do a thing where we call time after, say, max 1 month of discussing, and then ship the best improvement anyone's written? (Not, of course, stopping further improvements from happening thereafter.)

(Recall, we're dealing with a problematic perf optimization that doesn't really apply to java_import and doesn't work for a substantial chunk of cases. Its benefit is minimal enough that parallel rules don't bother. Our goal is to scale it back so that things work. If someone want to land something else that keeps more of the (minimal seeming) benefits, that's great. But it's well worth making a change, because the long-running status quo is trading brokenness for questionable performance gains.)

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 18, 2022

@fmeum, I like your proposal, making ijar safe by auto-disabling on Kotlin, leaving Java unchanged.
If it gives confidence, it looks like someone proposed the same sentinel in #4549 (comment)
[No complaints there. It seems it only didn't get done then bc they didn't feel like they needed ijar.]
Would you (or someone else) want to take a crack at it? (Maybe not behind a flag?).
Happy to fold this PR in favor of that if so.

@fmeum
Copy link
Collaborator

fmeum commented Mar 18, 2022

I have this implemented and will submit a PR once I have added some test JARs.

@kevin1e100
Copy link
Contributor

kevin1e100 commented Mar 18, 2022 via email

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 18, 2022

Oh! Sad times. Okay. Thanks for raising that new use case, @kevin1e100 and @cushon, here and on the issue. Sounds like maybe there was some internal discussion that caused a switch in views. (Had contemplated, but guessed it was rare compared to breaking kotlin, rather than common. Hence the toggle and default :)

How would you guys like to fix #4549?

@comius
Copy link
Contributor

comius commented Mar 22, 2022

Unfortunately disabling ijar on java_import by default is not acceptable for Java rules.

I believe there are others working on a more appropriate solution for Kotlin.

I this light I'll close this PR now.

@comius comius closed this Mar 22, 2022
@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 22, 2022

Huh. Are there people working on a better fix?

Delighted if we're actually closing this in favor of something better and imminent, but my read of the above is more that we've dismissed all the options people have offered to do, and that there might be considerably more constructive brainstorming or action possible!

Also happy to change this to ijar-by-default if that's the way to get things unbroken!

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 29, 2022

...hey all, is someone actually working on this, or are we making a mistake killing efforts here? I'm pretty concerned we're going to fall back into the trap that's left this in the state it is. If someone's actually working on it, that's one thing, but otherwise, let's at least make some incremental improvements!

(@kevin1e100, happy to change the default to ijar-on to match the reversal of opinion.)

@cushon
Copy link
Contributor

cushon commented Mar 29, 2022

let's at least make some incremental improvements!

I summarized the issues with disabling ijar by default in #4549 (comment). It makes the rule easier to use for kotlin jars, but would be a regression for the existing uses of the rule, so it isn't the right tradeoff overall.

I appreciate the enthusiasm here, if you're interested in fixing this I think the alternative in #4549 (comment) is a better direction.

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 29, 2022

Thanks for appreciating! And replying @cushon
(This was really just an earnest attempt to quickly help people. It's gotten a little un-fun, so it feels good to have someone appreciate the effort and enthusiasm. In reverse, I definitely appreciate all you guys do!)

Re the comment, I've definitely read and remember. As offered in the above few comments, how would you feel about leaving ijar on-by-default but adding the option to have things be Kotlin safe? That'd address your good concerns, unblock reunification with rules that don't/can't currently use ijar, and unblock other kotlin usage, making things incrementally better and paving the way for future improvements.

I also think it's great you added to the issue what the higher-effort-but-better next steps would be for getting ijar working with Kotlin. (Commented back another loose end there, just in case it's helpful for future work.) I don't think I'd be your man on those changes; as above, you all have the advantage over me on ijar and JVM internals, I'm sure, and we've rather exhausted my time budget here. This whole thing started as @kevin1e100 getting tagged by @cushon to see if #4549 might be worth more priority. But I'll continue to try to help improve Bazel in other ways! Looking forward to working together on those other (I think quite clear cut) improvements that we're on together.

All the best, Chris

@cpsauer
Copy link
Contributor Author

cpsauer commented Mar 29, 2022

In that vein, I just authored a quick commit in case you want it, attempting to address the above by leaving ijar on-by-default, while still allowing the option to make things Kotlin safe for those that need it.

[It won't update this PR automatically bc it's been closed, but perhaps if we reopen...]

@cpsauer
Copy link
Contributor Author

cpsauer commented Apr 4, 2022

@cushon and @kevin1e100, would you support a solution that gives users the option to have java_import not break Kotlin, while also leaving ijar on-by-default, as above?

@kevin1e100
Copy link
Contributor

kevin1e100 commented Apr 4, 2022 via email

@cushon
Copy link
Contributor

cushon commented Apr 4, 2022

I don't think this is the best option:

  • the implementation effort seems comparable to having ijar not strip classes annotated with kotlin.Metadata
  • there's a cost to increasing the surface area of the core rules with additional attributes
  • the hard part here seems like realizing that a build is broken by using a Kotlin jar in a java_import, and this doesn't solve that problem: by the time you've figured that out that using java_import.use_ijar would help, it isn't that much easier than using kt_jvm_import
  • it's already possible to get something like java_import.use_ijar with java_common.stamp_jar, which I think is a better level of abstraction to provide that choice at
  • the proposal disables ijar for the entire jar, not just the Kotlin parts of it, which I'd like to avoid

@cpsauer
Copy link
Contributor Author

cpsauer commented Apr 5, 2022

Okay! Closed it remains, then... Still, some regrets in my heart, but only out of wanting to help.

(I will say, it seems like we're talking past each other somehow, and perhaps got a bit at odds, and that I really regret, since we're nothing if not on the same team.)

Before I sign off, I also want to celebrate @fmeum, because I think we haven't enough. Pretty great that, out of the goodness of his heart, he dove into ijar, investigated some fixes, spun up a better one, and evaluated the other one being proposed. We haven't said too much about it, but I think that's pretty great. I'm continually impressed by him, every time I run into him on the internet.

Thanks all, for your time and care.

-Chris

@cushon
Copy link
Contributor

cushon commented Apr 5, 2022

Thanks for the note Chris. I'm sorry this ended up being a frustrating experience, and I really do appreciate the desire to help and the positivity here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Rules-Java Issues for Java rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants