-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strict scala deps and unused_deps #235
Comments
this seems really nice. note related post on bazel blog for java The strategy here looks good to me. Some comments:
|
The blog post and the issue are both products of Carmi and me meeting a few
days ago :) Some of the wording in the issue are copied from the
explanation of how the plugin works.
Re #1 I think having "dumb" strict mode via Bazel and not scalac should be
easy. By "dumb" I mean we warn/error for every indirection. What might be
more complicated is the "smarter" mode where we are able to identify which
indirect usages are legitimate (compiler's limitations) and which aren't
(you're just using that dep yourself)
Re #2 honestly that's not the problem I need to solve now. Many of our
generated BUILD files have ~100 dependencies in their "deps". If I get
strict mode with warnings then I'm able to add that into the migration
phase and output minimal BUILD files. Unused is for longer down the road
where you want to keep BUILD files cleaner and IINM the current convention
in Google is to move the unused dependencies to runtime under the
assumption something might need it.
Re #3 after I added around 30 transitive targets without understanding why
the compiler is angry I decided to automate it. Note that it wasn't all of
them. For this one repo I'd guess we're talking about several hundreds of
noise dependencies.
…On Thu, 29 Jun 2017 at 5:06 P. Oscar Boykin ***@***.***> wrote:
this seems really nice.
note related post on bazel blog for java
https://blog.bazel.build/2017/06/28/sjd-unused_deps.html
The strategy here looks good to me. Some comments:
1. can we get the reliability so this is the default, or will we need
to be opt in for a while (I imagine the latter).
2. should we have a version that just warns on unused dependencies
first (kind of the reverse problem as this).
3. I'm surprised the strict dependencies became a show-stopper for
you. Our dependencies are probably over-broad, but it hasn't been
debilitating.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIF0si-kg_1lWXezRPcevNm5oLGICOks5sIwaJgaJpZM4OHkDK>
.
|
@gkossakowski Since Oscar pulled you to the PR I thought I'd pull you to the issue and ask you if you have an intiuition or an idea on how "smart" will the plugin be able to be? What we really want to enable is to warn/error the user when their code uses a transitive dependency. This is to allow for upstream targets freedom in changing implementation decisions. I don't know how feasible this distinction (I use vs compiler needs for transitive reasons) is in the compiler/compiler-plugins. Thanks! |
) First step (experimental) in the introduction of strict scala deps feature (see issue #235). Currently this feature is turned off (due to noticeable impact on performance) switching ‘dependency_analyzer_mode_soon_to_be_removed’ to “on” means that ANY API change in a target’s transitive dependencies will trigger a recompilation of that target, on the other hand any internal change (i.e. on code that ijar omits) WON’T trigger recompilation by transitive dependencies A few implementation details: A new scalac plugin informs the user on every target that has missing transitive dependencies that should be added as direct dependencies. For convenience a buildozer command is provided to make the change in the BUILD file easily. before the compilation, transitive compile time jars are collected along with a mapping to the originating target label.
@gkossakowski Do you have an intuition or an idea on how we can advance the "smart" plugin mode? What we really want to enable is to warn/error the user when their code uses a transitive dependency as opposed to warning the user for cases where the compiler needs this symbol lookup just for the its own needs. This is to allow for upstream targets freedom in changing implementation decisions. Thanks! |
Yes, I mulled over this problem in the context of zinc's incremental compilation decision. In zinc, you want to approximate the dependencies scalac would ever need. The short answer is: the dependencies introduced by reference do not require a transitive closure but the one introduced by inheritance does require a transitive closure. When a program mentions a symbol, e.g. You can capture both kinds of dependencies by traversing the trees and collecting the information in the right context. Zinc implements this: https://github.com/sbt/zinc/blob/1.x/internal/compiler-bridge/src/main/scala/xsbt/Dependency.scala#L326 |
Are you sure? |
Actually I think that my example above shows that there are more areas the plugin can improve in but what you wrote IIUC is that at least in that aspect (of inheritance) the plugin can be smarter and not warn/error. Did I understand you correctly? |
My comment is what should happen given Scala's typesystem design to shed light what directionally to expect. From there you can derive whether the behavior you're observing is a bug or a design choice you have to work with. I just checked that what I said about dependencies by reference not forcing transitive dependencies is true for pure Scala:
I'm not sure what's different about your example but I think it should work. Perhaps you can check whether you can reproduce the problem in pure Scala setting? I suspect that loading Java-originating symbols from a class file is overeager. It's a different code path than loading Scala symbols in the compiler. |
Would the use-case where a I have built a contrived example here: promiseofcake/bazel-scala-test#2 that runs with Travis to show the current issue. I think it belongs here because for our java targets we have not had to perform similar adjustments. |
Yes. This is under the purview of this issue. |
Regarding the newly documented
If it were to become the default behavior, would we want to follow the proposed solution before making it mandated? |
Hi Lucas,
First I'd like to apologize for not responding previously, I somehow missed
your reply.
When you say "the proposed solution" you mean your proposed solution of
"whitelisting" the scalatest/scala library dependencies?
If so I agree we should. Can you open an issue about it?
If you meant something else I'd appreciate the clarification
…On Thu, 9 Nov 2017 at 9:37 Lucas Kacher ***@***.***> wrote:
Regarding the newly documented strict-deps,
Note: Currently strict-deps is protected by a feature toggle but we're
strongly considering making it the default behavior as java_* rules do.
If it were to become the default behavior, would we want to follow the
proposed solution before making it mandated?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIFwUNp2bp4KAQdctKA5os_ivpkzK3ks5s0zhHgaJpZM4OHkDK>
.
|
No worries @ittaiz, the proposed solution I was referring to was:
With the gist of it being that dependency resolution could perhaps mimic the way it works for java targets? Here is another example: promiseofcake/pubref-java-proto-scala#1 which shows that |
Ok,
This proposal is implemented (minus the smart mode which we actually
started working on yesterday).
It's very possible that we have bugs like mandating the scala test jars or
not supporting exports properly.
The best solution IMHO is for other organizations to start using it and
flush out problems which we'll fix.
For each of those we should probably have an issue and a repro and we'll
try to iron them out.
…On Thu, 9 Nov 2017 at 10:05 Lucas Kacher ***@***.***> wrote:
No worries @ittaiz <https://github.com/ittaiz>, the proposed solution I
was referring to was:
Pass direct and indirect jars to scalac's classpath (like java compilation
does).
Add a scala compiler plugin which essentially will walk the AST after the
type annotation phase, will stop at each 'type expression' and check that
the jar from which the type was loaded is a direct_dependency. If not, it
is an --indirect_dependency, and the plugin generates an error message. The
message includes an 'add_dep' command to add the missing direct dep.
With the gist of it being that dependency resolution could perhaps mimic
the way it works for java targets? Here is another example:
promiseofcake/pubref-java-proto-scala#1
<promiseofcake/pubref-java-proto-scala#1> which
shows that exports from java_ targets aren't pulled in for scala_ rules,
and I am fairly certain those targets *are* exported because the
--strict_java_deps=ERROR does not error out for the java_ targets.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIF6bqKfXJADrCimr9DujLxqbXP4PJks5s0z70gaJpZM4OHkDK>
.
|
Is this implemented? I'm running at current tip, and am getting errors that seem to indicate that it isn't: I get a term not found error that I can resolve by manually adding transitive dependencies to the target. |
This is implemented. Are you running the build with |
Ah, I didn't realize I had to pass that flag. That does indeed fix it, and allows me to remove the extra dependencies (or equivalently the exports on the direct dependencies). Thanks! (There's some weirdness with the error messages mentioning unknown labels, but I'm pretty sure I saw that in another issue.) |
Are you using scala_import for your dependencies?
Currently the only rules that propagates labels correctly are scala_import
and scala_* (library, binary, etc).
If for example you have a scala_library (a) which depends on java_library
(b) which in turn also depends on java_library (c) and `a` uses `c` without
declaring it the labels will be off.
We use only scala_library and scala_import so this specific thing is less
of a blocker for us.
I hope to tackle it in the next few months but it's not a blocker to making
it the default since we're talking about info you also don't have today
…On Sun, Dec 10, 2017 at 11:01 AM Kamal Marhubi ***@***.***> wrote:
Ah, I didn't realize I had to pass that flag. That does indeed fix it, and
allows me to remove the extra dependencies (or equivalently the exports on
the direct dependencies). Thanks!
(There's some weirdness with the error messages mentioning unknown labels,
but I'm pretty sure I saw that in another issue.)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIFyiIy5_TmdspvtqnrBH4sHaqsKT_ks5s-53ngaJpZM4OHkDK>
.
|
I'm experimenting with Thinking on how to solve this: it seems like it should be possible to apply an aspect to the deps of scala_* to collect this info. Is that the approach you were considering? |
No actually. |
As far as I know, strict mode for rules_scala does not yet catch extra dependencies. @ittaiz discussed writing a compiler plugin to deal with this part. I ran across this compiler plugin from the Scala Center https://github.com/scalacenter/classpath-shrinker which seems to do what we are interested in doing. I played around with using the plugin to clean up some BUILD files that were auto generated using sbt -> pom -> Bazel: https://docs.bazel.build/versions/master/migrate-maven.html I ran into an issue where all of the transitive dependencies are on the classpath at compile time, so if you have scala targets that depend on other scala targets, then the compiler plugin prints a bunch of warnings. I'm curious if this plugin has been explored before or if any work has been done on detecting extra dependencies. |
Well, I feel silly. I didn't read closely enough. Looks like that plugin is already in rules_scala :D https://github.com/bazelbuild/rules_scala/blob/master/third_party/plugin/src/main/io/bazel/rulesscala/dependencyanalyzer/DependencyAnalyzer.scala I'm still curious if the original plugin detected extra jars if rules_scala can do that. I'll dive in a bit more. Sorry to waste anyone's time. |
Okay, I quickly hacked together a really rough implementation of |
Will there be a way in the future to use both unused_deps and strict_deps at once? Currently readme suggests not |
It would be nice... that said, they work across purposes a bit. At Stripe, we minimize the compiler classpath to reduce cache invalidation. So, we don't use strict_deps (which actually puts all your transitive deps on the classpath, but then errors if you use them). Instead, we just don't have them, and we get errors (although harder to diagnose) if you use them. Really, I start to wonder if compiler plugins are the right path here. As long as scalac is not trying to do separate compilation, the experience will always be a bit second rate. I'd like to see separate compilation with minimal classpaths be a supported feature of scalac, but right now it is kind of tolerated, but not really strongly supported. |
We've been using both together at Lucid for a few months with Annex and it has worked quite well. Zinc's analysis is used to determine unneeded and unused deps and outputs nothing, warnings, or errors with buildozer commands depending on the configuration. Combined with ibazel's |
I hope/plan to implement Twitter's solution which AFAIU goes something like
this:
Use a minimal classpath (like the default one in rules_scala today).
Have something (ibazel?) that sees the compilation failures
Knows how to extracts from it the class name (class name might be too
simple)
Sends it to an index of class names to labels which returns a response
If it's one label it adds it to the relevant target (like the buildozer one)
I think the hard parts are the index (we'll probably have that ready in a
couple of weeks) and extracting a class name from the cryptic compiler
errors.
WDYT?
…On Wed, Oct 3, 2018 at 11:00 PM James Judd ***@***.***> wrote:
We've been using both together at Lucid for a few months with Annex and it
has worked quite well. Zinc's analysis is used to determine unneeded and
unused deps and outputs nothing, warnings, or errors with buildozer
commands depending on the configuration. Combined with ibazel's run_output
and run_output_interactive, you can get an interactive N/y prompt to
execute a buildozer command to remove or add a dependency when it violates
strict/unused deps. This has made it pretty easy to maintain minimal
classpaths.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIF4C646e2OzYLgcXvrLYs6e9PDN-zks5uhRdhgaJpZM4OHkDK>
.
|
There would need to be one index per |
I don't think we're talking about the same thing.
The index is of class-name to labels that it belongs to (meaning it was in
the sources of that target) (multiple labels can declare the same FQN).
If the FQN belongs to one label it's easy. Otherwise the user needs to
decide.
…On Thu, Oct 4, 2018 at 12:07 AM James Judd ***@***.***> wrote:
There would need to be one index per scala_ target, right? The index
would contain the mapping of class to label for the transitive closure of
the current target? If it is global, it would be constantly churning and
class name collisions could cause problems.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#235 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABUIF7KQeQimd1PTPKFCcXAj4-Hf9yRdks5uhScTgaJpZM4OHkDK>
.
|
So while unused_deps and strict_deps do work a bit at cross purposes it still seems useful to have both. Because without strict_deps, you have to have dependencies in your classpath that don't really make sense (eg types you never use in any way yet a base package class uses). so you may well decide that the tradeoff in increased compile time in some cases is worthwhile. And unused_deps of course, because you don't want to have unnecessary dependencies regardless of what dependency model you are using. |
So,
I met with @cgrushko a few days ago and apparently the way we (rules_scala) currently do strict deps (allowing a target to only use what it directly depends on) is very different from the way java rules do it and IMHO not as good.
Problem:
We currently pass to scalac only the direct jars and let scalac emit an error to the user.
These errors are quite often cryptic and unuseful for developers to be able to easily fix the problem.
Moreover because of how the scala compiler works there are many cases where it needs transitive dependencies even if they aren't mentioned in the source code of the unit we're compiling[1]. This leads to having to add many transitive targets without any reasoning just to appease the compiler or worse to start using
exports
loosely around the codebase.We suffered from the above transitive dependencies so much that we're just adding the whole transitive closure to the
deps
of a target. This is in the onboarding of a project so we're ok with it but this solution can help greatly minimize it.Proposed solution:
Pass direct and indirect jars to scalac's classpath (like java compilation does).
Add a scala compiler plugin which essentially will walk the AST after the type annotation phase, will stop at each 'type expression' and check that the jar from which the type was loaded is a direct_dependency. If not, it is an --indirect_dependency, and the plugin generates an error message. The message includes an 'add_dep' command to add the missing direct dep.
We will also support a lenient mode where this error is outputted as a warning.
add_dep is a command which buildozer can take and apply it to your BUILD file.
I think (not sure) that ibazel can pick up these warnings and automatically apply them to your BUILD file. Probably rebazel will be able to do this as well if they'll want to.
Value:
Separation of concerns (scalac won't enforce graph concerns)
Allows higher level messages and so call to actions
Might allow finer grain modeling of when indirect dependencies are ok (for an example where scalac "messes" up but we might be able to improve see this repo)
Better cycle for developers since they can run with lenient mode and iterate quickly and fix the deps before pushing (or use ibazel/rebazel). The caveat is obviously the fact people can have a local green build which fails on CI.
Open questions:
It's unclear how smart the compiler plugin can be given scalac's limitations and so unclear on how much noise will still be added to a target's
deps
.Small implementation detail:
To do this we'll need to pass to the compiler plugin a list of direct dependencies and a list of indirect dependencies where at least the latter needs to have both jars and the targets so that the error/warning message can be useful (add
//src/main/foo:baz
to yourdeps
)[1]Improve user experience for builds that use only direct dependencies
The text was updated successfully, but these errors were encountered: