Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Add a new "non-transitive" java strict deps enum value #4821

Closed
ittaiz opened this issue Mar 10, 2018 · 29 comments
Closed
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Java Issues for Java rules type: feature request

Comments

@ittaiz
Copy link
Member

ittaiz commented Mar 10, 2018

Description of the problem / feature request:

Add a new strict_java_deps value- "non-transitive"

Feature requests: what underlying problem are you trying to solve with this feature?

The underlying problem I'm trying to resolve is that ijar is less than optimal for scala code which means using strict-deps as it's used in java rules (always pass the full transitive classpath as inputs) might have performance issues in terms of many redundant triggers (again due to problems with ijar and its applicability to scala).
Solving the root cause with ijar itself is not on anyone's horizon (and sounds like it's a very big undertaking).
Currently rules_scala defaults to only passing the direct dependencies and it's overloading the strict_java_deps values to mean that off\default is only pass the direct dependencies.

This will probably come at a surprise to some users who use both java and scala rules since "off" means completely the other way around for java rules.

My suggestion is to add a new value "non-transitive" which we can use to only pass the direct dependencies.

Our other alternative is to still overload an existing value "strict" and decide that it means the above.
It's only relevant if "strict" is not planned to be removed since it's doc says "Transition to strict by default".

Have you found anything relevant by searching the web?

Further discussion in bazelbuild/rules_scala#423

cc @johnynek @pauldraper

@lberki
Copy link
Contributor

lberki commented Mar 20, 2018

Not ever adding the transitive dependencies to Java compile actions doesn't work, because javac needs to read the class files in transitive dependencies in some cases (unfortunately, it escapes me what those are :( )

In addition to that, I'd much rather not complicate our Java rules with yet another strict deps mode for something Scala specific. How about making a separate flag for Scala? Even better, it can be just a --define, in which case this is doable without changing Bazel at all.

@ittaiz
Copy link
Member Author

ittaiz commented Mar 20, 2018 via email

@lberki
Copy link
Contributor

lberki commented Mar 21, 2018

The --define Bazel command line option; for better or worse, it allows one to define $(VAR)-style variables in BUILD files. Mind you, I consider that as a misfeature, but I thought --define was sufficiently well entrenched so that another use for it is preferable to adding a command line option to Bazel for something that's implemented in Skylark. However, I just realized that it's not available in Skylark.

Let's ask @laurentlb what he prefers -- adding a bespoke command line option to Bazel specifically for Scala (even though Scala is implemented in Skylark) or adding a key-value map? In the long term, the latter is much more preferable, since we'll eventually need to define configuration fragments in Skylark, but we may not want to commit to any specific option on that. And in that case, a single command-line option is much simpler to eventually deprecate.

@ittaiz
Copy link
Member Author

ittaiz commented Mar 21, 2018 via email

@lberki
Copy link
Contributor

lberki commented Mar 21, 2018

Sure :) Would you be okay with a separate flag, however it is implemented, for Scala strict deps?

@ittaiz
Copy link
Member Author

ittaiz commented Mar 21, 2018 via email

@hmemcpy
Copy link

hmemcpy commented Apr 1, 2018

Sorry to hijack the original request, but I've also found a use for --define in Skylark. We would like to use it to implement bazelbuild/intellij#247 - the ability to generate a temp binary target for any arbitrary main, and --define looks like a good way of passing additional arguments. Any chance you could consider implementing it for Skylark as well?

@lberki
Copy link
Contributor

lberki commented Apr 3, 2018

@laurentlb @gregestren FYI for the new flag specific to a Skylark rule set in Bazel core. If you don't answer, I assume you are okay with adding something like ctx.fragments.scala.stuffto Bazel.

@tomlu -- before I send a change to implement this, are you okay with adding the "scala strict deps" flag to Bazel core?

@lberki
Copy link
Contributor

lberki commented Apr 3, 2018

@hmemcpy -- it appears that we won't go the --define route (unless @laurentlb objects to the current plan), so let's continue the discussion on bazelbuild/intellij#247 .

@laurentlb
Copy link
Contributor

Sorry for the delay. Can you send a message to bazel-dev or follow this process: https://bazel.build/designs/skylark/skylark-design-process.html

Having a flag or anything hardcoded, used only in a Skylark rule, seems wrong (or at least, a temporary workaround). I think this should be discussed first. We should at least have an idea of where we're going.

(sorry, I'll be away this week, but other people can comment on the plan)

@tomlu
Copy link
Contributor

tomlu commented Apr 3, 2018

I don't understand this request. Are you saying you want to pass only direct jars to your scalac invocations? You can just do that straight in the rule implementation, can you not?

I don't think it works, anyway, because compilation requires the full transitive classpath even with strict deps.

Consider three rules, A, B, and C, each with a single class. This is a Java example, but I imagine Scala will be the same.

class A {
}

class B extends A {
}

class C {
B b = new B();
}

Even though C only has a direct dep on B, compilation still requires A because of the inheritance.

Incremental compilations can potentially use a reduced classpath if you instrument the compiler to see what it is truly using, but that's not what we're talking about is it?

@johnynek
Copy link
Member

johnynek commented Apr 3, 2018

Scala does not require the full transitive path. It sometimes requires some transitive jars, but not all. Sadly, the user experience is not great: we just get an error sometimes that a jar we need is missing, but we don’t have an message which tells us where to find it.

The full transitive classpath could be very bad for scala performance because ijar is substantially less effective for scala and scalac is so much slower than javac.

@tomlu
Copy link
Contributor

tomlu commented Apr 3, 2018

This:

Scala does not require the full transitive path

Seems to be directly at odds with this:

It sometimes requires some transitive jars

Yes, and as I explained, you cannot know in advance which transitive jars are needed, and the set wouldn't be stable. This is why I think this request cannot really work. Changing code can cause cascading non-local compilation failures, as the transitive jars required further up the build graph can shift.

The full transitive classpath could be very bad for scala performance because ijar is substantially less effective for scala and scalac is so much slower than javac.

Right :( Just to be clear, this is not directly a bazel problem, right? You'd have the same problem in any build system, if you use the same number of targets?

ijar is substantially less effective for scala and scalac is so much slower than javac

Is scalac slower at cracking open the classpath jars? ijar really only buys you smaller (in bytes) jars on the classpath. It doesn't reduce the number of jars on the classpath.

I suppose it does help incremental cache hit rate, in case you modify method bodies only.

@cushon
Copy link
Contributor

cushon commented Apr 3, 2018

Java compilations doesn't use the full transitive classpath either, but the hard part is knowing up-front which subset is needed. How do you handle that for scala?

ijar is substantially less effective for scala

I thought ijar was disabled for scala because of macros. Or are there places where ijar is still used with scala, and other reasons why it's less effective than for Java?

@johnynek
Copy link
Member

johnynek commented Apr 3, 2018

We use ijar on all non macro targets. We require users to declare if a target has macros. For external jars, which we assume don’t change much and may have macros, we don’t use ijar in most cases (again since scala jars are just jars, users can circumvent this and use different options in scala_import).

Why is ijar less effective with scala? Because the scala signature, an annotation in the jar, is preserved in its entirety. This has meta data which will effectively change on almost any change a user can make.

Lastly, while it is true that an upstream target can make a change that causes downstream targets to fail, that is also true of the transitive deps world: an upstream target could remove a dependency that a downstream target was using directly. Without that dependency transitively showing up downstream fails. You may have some fancy plugins to prevent this in your java rules interaction with javac but we don’t have that now for scala.

So in practice, requiring users to name all jars works fairly well.

In my view, the main missing thing is an error when you have an explicit dependency that is not used.

@cushon
Copy link
Contributor

cushon commented Apr 3, 2018

Why is ijar less effective with scala? Because the scala signature, an annotation in the jar, is preserved in its entirety. This has meta data which will effectively change on almost any change a user can make.

Can we fix that?

You may have some fancy plugins to prevent this in your java rules interaction with javac but we don’t have that now for scala.

There's also a bytecode-based implementation of strict dependency enforcement that's used for aar_import, and could be generalized to other rules that produce bytecode.

@pauldraper
Copy link
Contributor

pauldraper commented Apr 3, 2018

This has meta data which will effectively change on almost any change a user can make.
Can we fix that?

It's a good as can be without removing macros. Though the premise is subjective. The types of all members are represented in the ScalaSignature, even privates because they can be used by macros. But implementations of methods are not in the signature. So the accuracy of the premise depends if you alter member types in "almost any change".

In addition to that, I'd much rather not complicate our Java rules with yet another strict deps mode for something Scala specific.

It's not necessary Scala-specific. Java jars can have dependencies that ijars do not. I assume this case is low-value enough that Java just has not done it.

Having a flag or anything hardcoded, used only in a Skylark rule, seems wrong

More generally, having non-Skylark rules seems wrong, and a violation of abstraction. Unfortunately, the blessed Google Languages remain in Bazel core, and their useful abilities also remain there.


I think this should not be flag, but rather an attribute. Typically, build code is made to work with many flag options; the flags make relatively minor adjustments to overall behavior.

Transitive vs non-transitive is a fundamental difference in the build code that is written.

@ittaiz
Copy link
Member Author

ittaiz commented Apr 4, 2018

@tomlu

This:

Scala does not require the full transitive path

Seems to be directly at odds with this:

It sometimes requires some transitive jars

This might already be clear from the thread but just in case it isn't: The above doesn't contradict if you take into account that usually scalac requires what people expect but sometimes it also needs (transitive) jars that people don't expect. Stipe people have been using this form for some time and are happy with it (@johnynek you are happy with it with respect to the alternative right?)

@johnynek

Lastly, while it is true that an upstream target can make a change that causes downstream targets to fail, that is also true of the transitive deps world: an upstream target could remove a dependency that a downstream target was using directly. Without that dependency transitively showing up downstream fails. You may have some fancy plugins to prevent this in your java rules interaction with javac but we don’t have that now for scala.

I'm not sure I understand what you mean. If you use strict-deps error the above scenario you describe won't happen because the downstream target can't directly use a dependency and not declare it. Additionally we have support exactly for that in rules_scala so I'm not sure I understand what you're referring to.
What we need to improve is the granularity of "transitive usage" (what we talked about with @gkk-stripe as smart mode) and introduce an "unused deps" feature which is orthogonal.

@pauldraper

I think this should not be flag, but rather an attribute. Typically, build code is made to work with many flag options; the flags make relatively minor adjustments to overall behavior.

Transitive vs non-transitive is a fundamental difference in the build code that is written.

I seriously disagree since I think this should be a global decision and I'd rather not have everyone copy paste this setting in their BUILD files

@pauldraper
Copy link
Contributor

I'd rather not have everyone copy paste this setting in their BUILD files

Flags reduce verbosity at the expense of sharing code with any project that uses a different set of flags.

Classic global pros + cons.

@johnynek
Copy link
Member

johnynek commented Apr 4, 2018

to the question of "can we fix ijar to work better for scala" posed here:
#4821 (comment)

we have a ticket here:
bazelbuild/rules_scala#125

The problem is that fixing ijar for scala is really tightly coupled to the scala compiler, and even the version of scala (2.11 vs 2.12, etc...). Speaking to scala compiler and sbt contributors, they have been somewhat pessimistic about doing this. SBT tries to do this but has had many bugs over the years and still is probably not totally bug free. Even then, sbt extracts a fingerprint to detect change, if things have not changed, we use the old jar, if it has changed, we may need to recompile. Bazel does not allow this: we can't tell bazel to only rerun an action if this function of the inputs has changed vs requiring a re-run any time any input has changed. I have requested the ability to define "cache keys" for inputs (buck supports this). So far I have not had much success getting bazel developers excited about this.

@gregestren
Copy link
Contributor

@laurentlb
Re: native-defined / Skyark-consumed flags, a doc proposing the way forward is on schedule to be shared today or tomorrow.

@ittaiz
Copy link
Member Author

ittaiz commented Apr 15, 2018

I've recently watched this talk by a pants developer. It actually pushed me much more towards trying to work with non-transitive class-paths for performance reasons.
The TL;DR is that they're using non-transitive class-paths (actually allowing one level up) with great success of convenience and performance.
She claims there that the smaller class-path contributed to 10% more perf in compilation and 20% less cache thrashing which one can argue is solved in bazel via ijar but then we go back to the whole ijar-scala discussion.

@pauldraper
Copy link
Contributor

FYI I created #5021 for removal of transitivity of --strict_java_deps, because it makes portable BUILDs impossible.

@johnynek
Copy link
Member

@ittaiz as I have mentioned: we use non-transitive classpaths at Stripe, and I continue to be happy with the outcome. The main thing we need is an error when we have a jar that we never use so we don't accumulate cruft. Caching in this case is working very well.

@cushon
Copy link
Contributor

cushon commented Apr 16, 2018

The thing pants does with direct deps + direct deps of direct deps is similar to --java_classpath=javabuilder. The latter compiles against direct deps + the dependencies those direct deps used at compile-time, as reported by SJD. The main difference is that --java_classpath=javabuilder silently falls back in the cases that doesn't succeed instead of requiring additional direct deps be added, which happens rarely. --java_classpath=javabuilder achieves a similar speedup for individual compilations (since they read less of the classpath) , but doesn't improve caching (since the transitive jars are still inputs to the action).

Without the fallback, only using direct deps ends up making builds fragile. The compiler needs to load transitive classes for reasons that are unobvious and easy to perturb. For example, adding an overload of a method being called which references a type that wasn't already being loaded could break the caller and force them to add additional direct deps.

@johnynek are you compiling against direct deps only, or direct deps + direct deps of direct deps, like the pants feature? What has your experience been with the fragility issue I mentioned?

@helenalt
Copy link
Contributor

@lberki what would be the next step in addressing this issue?

@iirina
Copy link
Contributor

iirina commented Sep 6, 2018

I'm trying to understand the situation here.

@ittaiz How are you currently using --strict_java_deps in rules_scala? I was curious about two things: 1. How does the overload of the off/default values impact the interaction with the java rules? Do you pass a different value (error?) to java_common.compile when the flag value is off/default?
2. Do you have a similar SJD plugin that interacts with rules_scala?

By looking at the documentation, the --strict_java_deps flag is strictly related to the SJD plugin. It doesn't control what ends up on the compiletime classpath and its purpose is only to point out unused dependencies.

FWIU this FR is about globally controlling what ends up on the scalc compiletime classpath. Is that correct? If so, it shouldn't impact javac compilations.

Can you explain why rules_scala can not use the current values of --strict_java_deps and pass the direct jars accordingly (e.g. when equals to default/strict/error)?

Anyhow, I think what this FR is about is conceptually different than the --strict_java_deps flag. Since you want to control what ends up on the scalac compile time, I suggest a new flag --jvm_classpath. It would be up to the JVM rules how they implement it (the implementation should be similar to what @cushon described above about --java_classpath=javabuilder).

@lberki lberki added team-Rules-Java Issues for Java rules P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed category: rules > java labels Dec 3, 2018
@github-actions
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2.5 years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Mar 16, 2023
@github-actions
Copy link

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Java Issues for Java rules type: feature request
Projects
None yet
Development

No branches or pull requests