Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActionConflictException for SolibSymlink action created for dynamic_runtime_lib #14826

Closed
glukasiknuro opened this issue Feb 16, 2022 · 7 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-CPP Issues for C++ rules type: bug

Comments

@glukasiknuro
Copy link
Contributor

glukasiknuro commented Feb 16, 2022

Description of the problem

ActionConflictException shows up when using combination of CC toolchains with runtime libs, and rules that require specific execution platforms - rules_docker in the reproduction below, but it can be reproduced for e.g. tensorflow as well.

Looks like the problematic action is SolibSymlink action. Changing the action to ignore execution platforms and exec properties fixes the issue (WIP: #14827). Similar change was made to Middleman in 4926955, possibly it can be applied for SolibSymlink as well?

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Code to reproduce as well and instructions: https://github.com/glukasiknuro/bazel_solibsymlink_conflict_issue

Error message- executionPlatformLabel seems to be the problem:

./bazel-5.0.0-linux-x86_64 build //:test_py_image

Starting local Bazel server and connecting to it...
ERROR: file '_solib__toolchain_Ck8_Ucc_Utoolchain/foo.so' is generated by these conflicting actions:
Label: //toolchain:k8_cc_toolchain
RuleClass: cc_toolchain rule
Configuration: 30ef7fa7561d68e72a06426dbdb1ff29fe154ce431958a511b45458ec146af56
Mnemonic: SolibSymlink
Action key: fb9103b4f577c25c225b84492070aef8fb18733871a75006b27c69ee7311572c, 1fb5751a4d74877adaf30c2a75ff8ae43324906fb1e7f94f6bf94b426e3566d6
Progress message: (null)
PrimaryInput: File:[/(...)/bazel_solibsymlink_conflict_issue[source]]toolchain/libs/foo.so
PrimaryOutput: File:[[<execution_root>]bazel-out/k8-py2-opt-ST-4a519fd6d3e4/bin]_solib__toolchain_Ck8_Ucc_Utoolchain/foo.so
Owner information: 

ToolchainDependencyConfiguredTargetKey{
label=//toolchain:k8_cc_toolchain,
config=BuildConfigurationValue.Key[30ef7fa7561d68e72a06426dbdb1ff29fe154ce431958a511b45458ec146af56], 
executionPlatformLabel=@io_bazel_rules_docker//platforms:local_container_platform}, 

ConfiguredTargetKey{
label=//toolchain:k8_cc_toolchain, 
config=BuildConfigurationValue.Key[30ef7fa7561d68e72a06426dbdb1ff29fe154ce431958a511b45458ec146af56]}

MandatoryInputs: are equal
Outputs: are equal
ERROR: com.google.devtools.build.lib.skyframe.ArtifactConflictFinder$ConflictException: com.google.devtools.build.lib.actions.MutableActionGraph$ActionConflictException: for _solib__toolchain_Ck8_Ucc_Utoolchain/foo.so, previous action: action 'SolibSymlink _solib__toolchain_Ck8_Ucc_Utoolchain/foo.so', attempted action: action 'SolibSymlink _solib__toolchain_Ck8_Ucc_Utoolchain/foo.so'

What operating system are you running Bazel on?

Ubuntu 20.04

What's the output of bazel info release?

Issue is present for released version of bazel 5, as well as one built from the current HEAD.
It does not show up on bazel 4.2

Through bisect, found this commit as triggering this behavior:
a436297

Have you found anything relevant by searching the web?

#14521
4926955

@glukasiknuro
Copy link
Contributor Author

@katre Do you know why a436297 could trigger this change of behavior?

I am not sure whether it's expected that those two SolibSymlink actions have same output file in the same directory (namely bazel-out/k8-py2-opt-ST-4a519fd6d3e4/bin above). I have seen Middleman action was tuned in 4926955 to ignore platforms, but not sure whether my change to do the same for SolibSymlink is appropriate.

The fix definitely helped with our code-based, but would like to verify whether the fix will not break something else in the future by just plastering over symptoms instead of fixing core issue.

@katre
Copy link
Member

katre commented Feb 17, 2022

This looks similar to an issue that we've been looking at recently.

@sdtwigg Is this also fixed by your changes to the exec platform output hash?

@glukasiknuro
Copy link
Contributor Author

Is there anything I can do to help with resolving this issue? Not sure what the fix for exec platform output hash is and whether I can help with modifying the example provided in the issue description.

@sdtwigg
Copy link
Contributor

sdtwigg commented Feb 23, 2022

I don't expect this to be fixed by the changes to my exec platform hash. In this case, it looks like cc_toolchain is creating actions itself. When this happens, approaching the action twice with different execution platforms will cause issues unless the output file name manually incorporates the execution platform name (since that is the only way those ConfiguredTargetKeysWithToolchainDependency diverge).

@sdtwigg
Copy link
Contributor

sdtwigg commented Feb 23, 2022

For reference the actions is being created by getDynamicLibrarySymlinkInternal in //src/main/java/com/google/devtools/build/lib/rules/cpp/SolibSymlinkAction.java via getCcToolchainProviderHelper in //src/main/java/com/google/devtools/build/lib/rules/cpp/CcToolchainProviderHelper.java.

This is a 'bug' in cc_toolchain. [lang]_toolchain rules are not generally supposed to register actions. This is because, internally, for all rules (even/especially after my changes), the computed output directories is a function of the given BuildOptions. For rules this is usually fine as, when a rule is analyzed twice, it is almost always because the BuildOptions changed and thus the computed output directory is different.

However, [lang]_toolchain is an exception. The selected execution platform exists outside of the BuildOptions as a third field of the ConfiguredTargetKey. Thus, we are doing the analysis multiple times but with the same BuildOptions (and thus the same output directory) but different execution platforms. 'Even worse' the action itself looks dependent of the selected execution platform (since the action command hash is different) and thus last-ditch attempts to register the new action as a duplicate 'shared' action fail.

I don't know enough about the C++ rules to give a full fix. A reasonable-looking short-term fix would be to (inside the Bazel internal implementation of cc_toolchain) incorporate the selected execution platform into the name of the generated output file. In the long-term, I think it would be best if cc_toolchain (or any other [lang]_toolchain] did not register actions.

@aiuto aiuto added team-Rules-CPP Issues for C++ rules untriaged labels Feb 26, 2022
@oquenchil oquenchil added type: bug P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Apr 6, 2022
bazel-io pushed a commit that referenced this issue May 11, 2022
ckolli5 added a commit that referenced this issue May 12, 2022
#14826

Closes #14827.

PiperOrigin-RevId: 447948011

Co-authored-by: Grzegorz Lukasik <[email protected]>
@github-actions
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Jun 14, 2023
@github-actions
Copy link

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-CPP Issues for C++ rules type: bug
Projects
None yet
Development

No branches or pull requests

5 participants