Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Action conflicts when new Python version transition semantics are enabled #7655

Closed
brandjon opened this issue Mar 6, 2019 · 7 comments
Closed
Assignees
Labels
P1 I'll work on this now. (Assignee required) team-Rules-Python Native rules for Python type: bug

Comments

@brandjon
Copy link
Member

brandjon commented Mar 6, 2019

There've been multiple reports of action conflicts when --incompatible_allow_python_version_transitions is enabled. Here's a minimal repro:

cc_binary(
    name = "foo",
    srcs = ["foo.cc"],
)

py_binary(
    name = "bar",
    srcs = ["bar.py"],
    data = [":foo"],
)
bazel build --nobuild :foo :bar --incompatible_allow_python_version_transitions

Note that :foo is built in two configurations: the top-level configuration, and a config transitioned by the py_binary rule. The transition serves two purposes:

  1. It changes the Python version if necessary (not the case here).

  2. It changes the value of the physical --force_python flag (above it's null since it's absent) to be consistent with the Python version, so that select()s on "force_python" will work. This is done to improve backward compatibility, but once code is migrated for --incompatible_remove_old_python_version, selecting on "force_python" is actually prohibited anyway and this is moot.

The logic for whether or not the transition occurs is here. If I prevent the transition from happening, the action conflict goes away. This can be done by passing either an explicit (but semantically redundant) --force_python=PY2 flag, or the --incompatible_remove_old_python_version_api flag.

The question is what are the exact conflicting actions, and why did the two nearly identical configurations produce different results? It can't be due to hidden select()s on "force_python" (via implicit dependencies), because they would be errors under --incompatible_remove_old_python_version_api.

@brandjon brandjon added type: bug P1 I'll work on this now. (Assignee required) team-Rules-Python Native rules for Python labels Mar 6, 2019
@brandjon brandjon self-assigned this Mar 6, 2019
@brandjon
Copy link
Member Author

brandjon commented Mar 7, 2019

@hlopko I'm at an impasse trying to understand the command lines of the conflicting actions. Or rather, how the output artifacts differ when they have the same path (root relative, and I believe also exec path from when I tried dumping that with a custom build). Any ideas on how I can dump more useful debug information? Here's the output I get from bazel for the above repro:

ERROR: file 'pkg/_objs/foo/foo.pic.o' is generated by these conflicting actions:
Label: //pkg:foo
RuleClass: cc_binary rule
Configuration: ee8984798ff2d0f986eb1a80976f95db, bac1659b951e8d8a395e8e53af0e4908
Mnemonic: CppCompile
Action key: 691c80693f9f9874ea6b586799f99563
Progress message: Compiling pkg/foo.cc
PrimaryInput: File:[/usr/local/google/home/brandjon/src/foo[source]]pkg/foo.cc
PrimaryOutput: File:[[<execution_root>]bazel-out/k8-fastbuild/bin]pkg/_objs/foo/foo.pic.o
Primary outputs are different: 70578622, 148677802
Owner information: //pkg:foo BuildConfigurationValue.Key[ee8984798ff2d0f986eb1a80976f95db] false, //pkg:foo BuildConfigurationValue.Key[bac1659b951e8d8a395e8e53af0e4908] false
MandatoryInputs: are equal
Outputs: Attempted action contains artifacts not in previous action (first 5): 
        pkg/_objs/foo/foo.pic.o
        pkg/_objs/foo/foo.pic.d
Previous action contains artifacts not in attempted action (first 5): 
        pkg/_objs/foo/foo.pic.o
        pkg/_objs/foo/foo.pic.d
ERROR: com.google.devtools.build.lib.actions.MutableActionGraph$ActionConflictException: for pkg/_objs/foo/foo.pic.o, previous action: action 'Compiling pkg/foo.cc', attempted action: action 'Compiling pkg/foo.cc'

@benjaminp
Copy link
Collaborator

The output "differences" are spurious; see #7658. I expect the problem is that c++ compile actions are generally unshareable. Information about the shareability is not included in that error message. See if this fixes it:

diff --git a/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileActionBuilder.java b/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileActionBuilder.java
index ecd0b78dbb..5cde841d34 100644
--- a/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileActionBuilder.java
+++ b/src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileActionBuilder.java
@@ -113,7 +113,7 @@ public class CppCompileActionBuilder {
       CcToolchainProvider ccToolchain,
       @Nullable Artifact grepIncludes) {
     this.owner = actionOwner;
-    this.shareable = false;
+    this.shareable = true;
     this.configuration = configuration;
     this.cppConfiguration = configuration.getFragment(CppConfiguration.class);
     this.mandatoryInputsBuilder = NestedSetBuilder.stableOrder();

@brandjon
Copy link
Member Author

brandjon commented Mar 7, 2019

Here I thought shareable actions were the exception, not the rule, but AbstractAction#isShareable says otherwise.

@brandjon
Copy link
Member Author

brandjon commented Mar 7, 2019

@lberki (since you're on 772fdfe) and @gregestren (for configurability): What is the intended plan for shareable / non-shareable actions? If even trivial differences in the configuration will produce action conflicts in the transitive closure (at least, without fragment pruning -- heck let's add @serynth to the thread too!), this seems problematic for defining any transition that doesn't produce its own output root.

For my use case I can tell users to enable --incompatible_remove_old_python_api to disable this particular transition, but I worry about what this means down the line for other transitions.

@brandjon
Copy link
Member Author

brandjon commented Mar 7, 2019

Confirmed that setting shareable=True for CppCompileAction suppresses the error. Thanks for the diagnosis Benjamin.

@gregestren
Copy link
Contributor

@brandjon and I talked about this offline for a bit today.

Both the work @serynth and I are doing (trimming and output path rewriting) could provide long-term generic solutions. I think the "plan" for shareable actions depends on who you ask. :) My hope is we can eliminate shared action risk completely, although that might be optimistic. Include scanning (which is why I believe C++ compile actions are exempted) do make life more complicated.

@brandjon
Copy link
Member Author

brandjon commented Mar 7, 2019

The conclusion from my POV as a native rule author is that it is generally incorrect to ever transition the configuration without changing the output root, no matter how trivial the config change is. This situation should hopefully not arise again in Python rules, since any significant future feature work should take place in Starlark code rather than native.

The resolution for users affected by this particular action conflict is to enable --incompatible_remove_old_python_version_api. If you're not able to do that yet in your build, you can instead add an explicit --force_python=... to your command line.

@brandjon brandjon closed this as completed Mar 7, 2019
bazel-io pushed a commit that referenced this issue Mar 13, 2019
These transitions were intended to bring --force_python in sync with the actual Python version to make select() less broken. This ended up causing more problems than it fixed (#7655). It's not really needed anyway since the select() issue becomes moot once --incompatible_remove_old_python_version_api is enabled.

RELNOTES: None
PiperOrigin-RevId: 238281325
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-Rules-Python Native rules for Python type: bug
Projects
None yet
Development

No branches or pull requests

3 participants