Bazel should re-execute actions when switching between sandboxed/non-sandboxed runs #2765

kchodorow · 2017-03-31T13:15:20Z

Right now, if I do bazel build //foo; bazel build --spawn_strategy=standalone //foo, the second build just reuses the cached results from the first. However, I wouldn't have turned off sandboxing if I expected it to give the same results as the first build.

Right now, the only workaround I know is to manually delete outputs or run a full bazel clean.

ittaiz · 2018-05-07T07:37:38Z

Wow, isn't this a really big issue correctness-wise? We encountered it internally and were really surprised.

ittaiz · 2018-07-09T05:17:11Z

@philwo any thoughts?

philwo · 2018-07-09T09:24:01Z

@ittaiz The reasoning behind why changing the execution strategy doesn't invalidate any actions is that we assume that all execution strategies yield the same results (just via different ways of executing actions). If they don't, it's a bug.

This also means that tweaking the execution strategy shouldn't be used as a normal way to get different build results.

We could revisit this assumption and make the execution strategy used for an action part of its cache key, of course. :) But I'm not sure if it's the right thing to do - we should definitely discuss this on the mailing list first, as it would be a pretty big change.
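To make the trade-off concrete, here is a minimal Python sketch of the two options. It is a toy model and not Bazel's implementation: `action_key`, `ActionCache`, `include_strategy`, and `flaky_test` are hypothetical names invented for illustration, and the key is assumed to be a hash of the command line and input digests only.

```python
import hashlib


def action_key(argv, input_digests, strategy=None):
    """Toy cache key: a hash of the command line and the input digests.

    If `strategy` is given it is mixed into the key (the proposed change);
    by default it is left out (the behavior described in this thread).
    """
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode() + b"\0")
    for path in sorted(input_digests):
        h.update(path.encode() + b"\0" + input_digests[path].encode() + b"\0")
    if strategy is not None:
        h.update(b"strategy=" + strategy.encode())
    return h.hexdigest()


class ActionCache:
    def __init__(self, include_strategy=False):
        self.include_strategy = include_strategy
        self.entries = {}
        self.executions = []  # strategies that actually ran, in order

    def run(self, argv, input_digests, strategy, execute):
        key_strategy = strategy if self.include_strategy else None
        key = action_key(argv, input_digests, key_strategy)
        if key not in self.entries:
            self.executions.append(strategy)
            self.entries[key] = execute(strategy)
        return self.entries[key]


def flaky_test(strategy):
    # Hypothetical action that depends on an undeclared input, so it only
    # succeeds when it runs outside the sandbox.
    return "PASSED" if strategy == "standalone" else "FAILED"


argv = ["run_test", "//foo:test"]
inputs = {"foo/test.sh": "abc123"}

current = ActionCache()  # strategy is not part of the key
first = current.run(argv, inputs, "standalone", flaky_test)
second = current.run(argv, inputs, "sandboxed", flaky_test)  # served from cache
```

In this model the second, sandboxed run never executes: it returns the result recorded by the standalone run. Constructing `ActionCache(include_strategy=True)` instead makes the two runs use different keys, so the sandboxed run executes again.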

@dslomov @lberki WDYT?

lberki · 2018-07-09T09:46:54Z

That's one tough question. We should definitely not make all execution strategies part of the action key because that would make some interesting use cases (e.g. do full build remotely to benefit from increased parallelism, then incremental builds locally to benefit from shorter round-trip times on the critical path) impossible.

On balance, I'd rather we don't add the knob of "are these two execution strategies equivalent". If this use case is important, maybe we could implement a partial bazel clean so that one can re-run a specific set of actions. Actually, that sorta already works by just deleting their output files (of course, you have to know what those files are, but that's why we are adding a way to query the action graph, /cc @meisterT )
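The "delete the output files" workaround could be scripted roughly as follows. This is a sketch under stated assumptions: you already know the output paths of the actions you want to re-run (for example from an action graph query), and `delete_outputs` and the paths in the demo are hypothetical, not part of Bazel.

```python
import os
import tempfile


def delete_outputs(output_root, relative_outputs):
    """Remove the listed files under output_root and return those removed."""
    removed = []
    root = os.path.realpath(output_root)
    for rel in relative_outputs:
        path = os.path.realpath(os.path.join(root, rel))
        # Refuse to touch anything that resolves outside the output root.
        if os.path.commonpath([root, path]) != root:
            raise ValueError(f"{rel!r} escapes {output_root!r}")
        if os.path.isfile(path):
            os.remove(path)
            removed.append(rel)
    return removed


# Demo against a throwaway directory standing in for an output tree.
with tempfile.TemporaryDirectory() as out:
    for rel in ("bin/foo/foo.o", "bin/foo/libfoo.a", "bin/bar/bar.o"):
        full = os.path.join(out, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    removed = delete_outputs(out, ["bin/foo/foo.o", "bin/foo/libfoo.a"])
    bar_survives = os.path.isfile(os.path.join(out, "bin/bar/bar.o"))
```

Only the listed files are removed, so the actions that produce them should be the only ones re-executed on the next build; everything else stays cached.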

ittaiz · 2018-07-09T11:33:30Z

I see (and agree). On the other hand, I think that defining some strategies as “less safe” might be the wise choice, or allowing users to decide which strategies are “less safe”. We hit this a lot with tests which pass without sandboxing but fail with sandboxing.

ittaiz · 2018-07-10T18:01:13Z

Two use cases:

1. I run a test and it fails. I suspect it's related to sandboxing, so I run it as standalone and it indeed passes. I now try to re-run it in the sandbox to better triage the error, but now it passes.
2. I run a build of a library which fails. I suspect I may have an undeclared dependency (@aehlig mentioned compiler?), so I run it as standalone and it indeed passes. I now try to re-run it in the sandbox to better triage the error, but now it's already built.

@talya is the above complete enough? From this it sounds like we can make do with some kind of flag which will not write to the local cache for this specific run (it will still read from the cache). Am I missing something?
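A rough Python sketch of what such a flag would mean, assuming a simple key-to-result cache. `LocalActionCache` and `write_results` are hypothetical names used only for illustration; no such Bazel flag is implied to exist.

```python
class LocalActionCache:
    """Toy cache whose writes can be switched off for a single run."""

    def __init__(self):
        self._entries = {}

    def get_or_run(self, key, execute, write_results=True):
        # Reads always consult the cache, even when writes are disabled.
        if key in self._entries:
            return self._entries[key], True  # (result, was_cache_hit)
        result = execute()
        if write_results:
            self._entries[key] = result
        return result, False


cache = LocalActionCache()

# Diagnostic standalone run: executes, but leaves the cache untouched.
diagnostic = cache.get_or_run("//foo:test", lambda: "PASSED", write_results=False)

# A later sandboxed run therefore executes again instead of hitting the cache.
sandboxed = cache.get_or_run("//foo:test", lambda: "FAILED")
```

Because the diagnostic run records nothing, the sandboxed run that follows still executes, which is what both use cases above need.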

cc @dslomov

talya · 2018-07-10T19:06:30Z

Use case 1 is indeed what brought us to this thread in the first place.

@philwo - re "we assume that all execution strategies yield the same results" - imo in the case of sandbox vs standalone strategies this does not always hold.

@ittaiz - A flag which will not write to the local cache for this specific run could be nice, but it could also be easy to forget to add it. And if you ARE paying attention to this, then --cache_test_results=false is already available.
The direction suggested by @lberki of supplying "a partial bazel clean so that one can re-run a specific set of actions" sounds good.

helenalt · 2018-09-05T14:19:09Z

@philwo please triage/respond

philwo · 2018-10-17T12:43:51Z

I really don't see any good way to improve this. If you want to ensure a clean build, run bazel clean before your build. If you want to benefit from the "clean build remotely, then incremental builds locally" use case, then this behavior is WAI. If you want to partially rebuild, you can delete the relevant output files from bazel-out and it should just work.

Supporting bazel clean //my:target sounds nice. I'd be happy to review a pull request that implements this.

jmmv · 2019-03-14T10:47:25Z

Agree with @philwo's assessment in the latest comment. I'll close this in favor of #1035, which also talks about having better mechanisms of cleaning the cache -- and implementing a bazel clean //target would fall in that category.

kchodorow added the category: sandboxing label Mar 31, 2017

helenalt added the team-Execution label Sep 5, 2018

benjaminp mentioned this issue Sep 28, 2018

Should the state of --incompatible_symlinked_sandbox_expands_tree_artifacts_in_runfiles_tree affect the hashes of artifacts? #6204

Closed

philwo added the type: feature request, P2 (We'll consider working on this in future. Assignee optional), and category: remote execution / caching labels and removed the category: sandboxing label Oct 17, 2018

talya mentioned this issue Nov 6, 2018

Tests run without (network) sandboxing when --java_debug is set #6379

Closed

jin added the team-Local-Exec label (Issues and PRs for the Execution (Local) team) and removed the team-Execution label Jan 14, 2019

buchgr removed the category: remote execution / caching label Feb 5, 2019

jmmv mentioned this issue Mar 14, 2019

Cache management facilities #1035

Closed

jmmv closed this as completed Mar 14, 2019
