cquery doesn't filter ruleInputs according to used configurations #14610

illicitonion · 2022-01-20T17:34:51Z

Description of the problem / feature request:

Cquery follows selects across the build graph, but does not appear to perform this filtering to the Rule.rule_input field.

Accordingly, when trying to ascertain what files may affect a target in the current configuration, irrelevant files may be returned.

For example, when trying to query the transitive affecting files of this kind of target:

java_test(
    name = "ExampleTest",
    srcs = ["ExampleTest.java"],
)

Following the rule_input fields when evaluating bazel cquery 'deps(//java/example:ExampleTest) --output=jsonproto, there's the chain:

//java/example:ExampleTest -> @bazel_tools//tools/test:xml_writer -> @bazel_tools//tools/test:xml.exe

However, when the currently configured platform is not Windows, there should be no dependency on xml.exe, as that's Window-specific.

I think it would be reasonable to have to perform some logic to do this filtering (though it just being presented completely up-front would be great), but right now I don't think it's possible to follow any algorithm to decide that xml.exe isn't actually a dependency in the current configuration...

Feature requests: what underlying problem are you trying to solve with this feature?

Coming up with a hash for each target so that we can diff graphs across different commits of source code. See also #7962.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

With the above BUILD file, run a bazel cquery for that target, and try determine whether @bazel_tools//tools/test:xml.exe is depended on by the target.

What's the output of `bazel info release`?

release 4.2.1

/cc @gregestren

The text was updated successfully, but these errors were encountered:

illicitonion · 2022-01-28T17:48:37Z

Relatedly, it feels like these ruleInputs should be ConfiguredTargets not just labels - it's impossible to chase the ruleInput graph if a label is present under multiple configurations. For instance, say have a repo containing both a target using Java in the exec configuration:

java_binary(
    name = "jbin",
    srcs = ["JBin.java"],
    main_class = "JBin",
)
 
genrule(
    name = "run_jbin",
    tools = [":jbin"],
    cmd = "$(execpath :jbin) hello world > $@",
    outs = ["out"],
)

As well as a java_test which depends on a Java toolchain in the target configuration - these will both list @bazel_tools//tools/jdk:current_java_runtime as a ruleInput, but without specifying which configuration each is for, it's impossible to know which is affected by changes to host or target java toolchains. We can over-estimate and say that both are affected, but this may lead to significant over-building in the CI case.

sdtwigg · 2022-01-31T09:13:12Z

So, looking specifically at the ruleInputs issue, this is as-intended. It is dumping the (potential) inputs for a Rule where Rule is a loading phase (post-BUILD file processing) concept and thus has no notion of a configuration. (A ConfiguredTarget is what results from taking a Rule and applying a configuration to it so that select statements can be resolved and an implementation executed.)

Can you clarify the underlying problem? I am not very familiar with the JSON proto output format. On first glance, for cquery, it looks like it just dumps the Rule info and configuration info for each target, which may not be sufficiently disambiguating for your use case.

illicitonion · 2022-01-31T15:20:32Z

Can you clarify the underlying problem?
The root problem we're trying to solve here is to determine which targets Bazel thinks could have changed between two commits, so that we know what tests to run on CI, what binaries to re-deploy, etc - in order to do that, we need a list of files which, if they've changed, cause the target to have changed.

Taking the example from the original issue here, this dep chain:

//java/example:ExampleTest -> @bazel_tools//tools/test:xml_writer -> @bazel_tools//tools/test:xml.exe

By the time cquery is running, when not on Windows, Bazel knows that xml.exe is not a relevant ruleInput, because it has resolved this select:

filegroup(
  name = "xml_writer",
  srcs = select({
    "@bazel_tools//src/conditions:windows": ["@bazel_tools//tools/test:xml.exe"],
    "//conditions:default": ["@bazel_tools//tools/test:xml"],
  }),
)

Right now the code I'm writing to fingerprint each target does more work than it needs (hashing xml.exe) and will return false positives (if xml.exe has changed, it will say that the target has changed and needs re-running) - I'm trying to leverage information that Bazel already knows to avoid these two issues.

When the existing `--transitions` flag is set to a non-None value, cquery will report the configuration its dependencies are configured in. This allows for pruning away values from ruleInputs which are not required in the current configuration. Example output from running: `bazel cquery --transitions=lite --output=jsonproto 'deps(...)'`: ``` [...] "ruleInput": ["@bazel_tools//src/conditions:host_windows", "@bazel_tools//src/tools/launcher:launcher", "@bazel_tools//tools/launcher:launcher_windows"], "configuredRuleInput": [{ "label": "@bazel_tools//src/tools/launcher:launcher", "configurationChecksum": "01ec5513a9fc41b2a15570123817f3c2200ad9aeb21b1181d588a4b4f91d5693", "configurationId": 3 }] [...] ``` Previously on a non-Windows platform it was not possible to determine that the two windows-specific ruleInputs are not actually required - now we can see from the configuredRuleInput section that the windows-specific dependencies have been pruned by selection, and we can see that a transition has re-configured the ruleInput into the exec configuration. For dependencies which were not subject to transition (e.g. because they're in a non-transition attribute), or which had no configuration (e.g. because they're a source file), we add the label as a ConfiguredRuleInput _without_ any configuration information. This indicates that the dependency has not been pruned, but that the caller should determine the correct configuration from context (probably be determining whether it's a source file, and if so, considering it un-configured, otherwise propagating the contextual ConfiguredTarget's configuration). Fixes bazelbuild#14610 Fixes bazelbuild#14617 Implementation-wise, this took roughly the following shape: 1. Extract TransitionResolver from being inline in TransitionsOutputFormatterCallback 2. Extract KnownTargetsDependencyResolver from TransitionsOutputFormatterCallback.FormatterDependencyResolver 3. Conditionally call these from ProtoOutputFormatterCallback depending on the --transitions flag.

purkhusid · 2022-04-08T11:57:51Z

@illicitonion I'm curious if the tool you are building is open source? We currently use https://github.com/Tinder/bazel-diff but I'm interested in seeing what approaches people are using to solve the Bazel diff problem.

illicitonion · 2022-04-08T15:01:16Z

@illicitonion I'm curious if the tool you are building is open source? We currently use https://github.com/Tinder/bazel-diff but I'm interested in seeing what approaches people are using to solve the Bazel diff problem.

It's not currently, but we're hoping to open source it soon. Roughly our approach is to, at both the before and after commit, cquery deps(...) and follow all of the ruleInputs edges, hashing rules and files appropriately. Watch this space :) (and very happy to chat, e.g. on the Bazel slack)

gibfahn · 2022-05-16T15:02:14Z

I'm curious if the tool you are building is open source? We currently use Tinder/bazel-diff but I'm interested in seeing what approaches people are using to solve the Bazel diff problem.

We have now open-sourced this, it's at https://github.com/bazel-contrib/target-determinator

illicitonion mentioned this issue Jan 21, 2022

cquery returns rule_input targets which are missing from deps list #14617

Closed

aiuto added team-Configurability platforms, toolchains, cquery, select(), config transitions untriaged labels Jan 23, 2022

gregestren assigned sdtwigg Jan 26, 2022

gregestren added the type: bug label Feb 1, 2022

illicitonion mentioned this issue Mar 14, 2022

Cquery can report transitions #15038

Closed

sdtwigg added type: feature request P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed type: bug untriaged labels Mar 25, 2022

bazel-io closed this as completed in 9994c32 May 2, 2022

ckolli5 mentioned this issue May 12, 2022

Cquery can report transitions #15475

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cquery doesn't filter ruleInputs according to used configurations #14610

cquery doesn't filter ruleInputs according to used configurations #14610

illicitonion commented Jan 20, 2022

illicitonion commented Jan 28, 2022

sdtwigg commented Jan 31, 2022

illicitonion commented Jan 31, 2022

purkhusid commented Apr 8, 2022

illicitonion commented Apr 8, 2022

gibfahn commented May 16, 2022

cquery doesn't filter ruleInputs according to used configurations #14610

cquery doesn't filter ruleInputs according to used configurations #14610

Comments

illicitonion commented Jan 20, 2022

Description of the problem / feature request:

Feature requests: what underlying problem are you trying to solve with this feature?

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

What's the output of bazel info release?

illicitonion commented Jan 28, 2022

sdtwigg commented Jan 31, 2022

illicitonion commented Jan 31, 2022

purkhusid commented Apr 8, 2022

illicitonion commented Apr 8, 2022

gibfahn commented May 16, 2022

What's the output of `bazel info release`?