Make comparisons in is_cached independent of order #3731
Merged
In `is_cached`, we used to compare two dicts: one for the current stage in memory, created by `run`, and one that is already written to the file. Because `outs` and `deps` are lists (generated by `dumpd`), the comparison depended on the order of outs and deps. So an earlier `run` and the current one can list them in a different order, and the comparison would fail.

Also, given that we can compare without regard to order, we can make pipeline files sorted by outs and deps.
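
For illustration, here is a minimal sketch of an order-independent comparison. It is not the code in this PR: the helper names `normalize_stage` and `same_stage` are hypothetical, and it assumes each entry in `deps` and `outs` is a dict with a `path` key.

```python
from typing import Any, Dict, Tuple


def normalize_stage(
    stage: Dict[str, Any], keys: Tuple[str, ...] = ("deps", "outs")
) -> Dict[str, Any]:
    """Return a shallow copy of a stage dict with its deps/outs lists sorted.

    Hypothetical helper: assumes every entry is a dict with a "path" key.
    """
    normalized = dict(stage)
    for key in keys:
        entries = normalized.get(key)
        if entries:
            # Sort by path so the order produced by `run` no longer matters.
            normalized[key] = sorted(entries, key=lambda entry: entry["path"])
    return normalized


def same_stage(in_memory: Dict[str, Any], on_disk: Dict[str, Any]) -> bool:
    """Compare two stage dicts while ignoring the order of deps and outs."""
    return normalize_stage(in_memory) == normalize_stage(on_disk)
```

Writing the normalized dict to the pipeline file would also keep outs and deps sorted on disk, so repeated runs produce the same file.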

Plus, in the post-pipeline-file world, we don't even need `is_cached`. We could just throw `DuplicateStage` or `OutputDuplicationError`. It's convenient, of course, to be able to `dvc run` the same command many times, but I'm not sure it's worth the hassle of having `is_cached` just for that.

❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.
❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible. 🙏