run: at least unprotect existing outputs #1786

Closed
efiop opened this issue Mar 26, 2019 · 4 comments

Contributor

efiop commented Mar 26, 2019

As @pared noticed, when you run two sequential dvc run commands with the same output:

dvc run -o out 'echo 1 >> out'
dvc run -o out 'echo 2 >> out'

you actually corrupt the cache file for the first out, because out is still linked to the cache when the second command writes to it. We should at least unprotect outputs in this case, or maybe even consider removing them by default. That would play nicely with the new --outs-persist flag.
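
The failure mode can be reproduced outside DVC. The following is a minimal sketch, assuming the output is a hardlink to its cache entry; the names `cache_entry` and `out` are illustrative stand-ins and do not reflect DVC's real cache layout or code.

```python
import os
import tempfile


def append_through_hardlink():
    """Show that appending to a hardlinked output also changes the cached copy."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_entry = os.path.join(tmp, "cache_entry")  # stands in for a cache file
        out = os.path.join(tmp, "out")                  # stands in for the workspace output

        # First "run": produce the data and link the output to the cache entry.
        with open(cache_entry, "w") as f:
            f.write("1\n")
        os.link(cache_entry, out)  # both names now share one inode

        # Second "run": append to the output without unprotecting it first.
        with open(out, "a") as f:
            f.write("2\n")

        # The cache entry was never opened for writing, yet its content changed.
        with open(cache_entry) as f:
            return f.read()


print(repr(append_through_hardlink()))
```

The call returns `'1\n2\n'`: the cache entry no longer matches the content it was stored under, which is the corruption described above.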

@efiop efiop added p1-important Important, aka current backlog of things to do bug Did we break something? c3-small-fix labels Mar 26, 2019
Contributor

pared commented Mar 27, 2019

There is no "fix" for this issue per se, because duplicated outputs are already unprotected via

stage.unprotect_outs()
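
What unprotecting amounts to can be sketched as follows. This illustrates the general idea (replace the link with an independent copy before writing) and is not the body of `stage.unprotect_outs()`; the helper name `unprotect` and the `.tmp` suffix are hypothetical.

```python
import os
import shutil
import tempfile


def unprotect(path):
    """Replace a possibly hardlinked file with an independent copy."""
    tmp_copy = path + ".tmp"         # hypothetical temporary name
    shutil.copyfile(path, tmp_copy)  # copy the data into a new inode
    os.replace(tmp_copy, path)       # swap the copy in place of the link


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        cache_entry = os.path.join(tmp, "cache_entry")
        out = os.path.join(tmp, "out")
        with open(cache_entry, "w") as f:
            f.write("1\n")
        os.link(cache_entry, out)

        unprotect(out)               # break the link before writing
        with open(out, "a") as f:
            f.write("2\n")

        with open(cache_entry) as f:
            cached = f.read()
        with open(out) as f:
            workspace = f.read()
        return cached, workspace


print(demo())
```

Here `demo()` returns `('1\n', '1\n2\n')`: the workspace file picks up the appended line while the cache entry keeps its original content.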

Contributor Author

efiop commented Mar 27, 2019

Fixed by #1789

@efiop efiop closed this as completed Mar 27, 2019

ghost commented Mar 28, 2019

@efiop, I tried to replicate the problem on my machine, but it doesn't overwrite anything.
I'm currently on XFS, with hardlinks set in my dvc config:

dvc init --no-scm
dvc run -o out 'echo 1 > out'
dvc run -o out 'echo 2 > out'
cat .dvc/cache/**/*
───────┬────────────────────────────────────────────────────
       │ File: .dvc/cache/c1/57a79031e1c40f85931829bc5fc552
───────┼────────────────────────────────────────────────────
   1   │ bar
───────┴────────────────────────────────────────────────────
───────┬────────────────────────────────────────────────────
       │ File: .dvc/cache/d3/b07384d113edec49eaa6238ad5ff00
───────┼────────────────────────────────────────────────────
   1   │ foo
───────┴────────────────────────────────────────────────────
❯ exa --tree --inode .dvc/cache/* out
     inode Permissions Size User    Date Modified Name
1077495202 drwxr-xr-x     - mroutis 27 Mar 19:29  .dvc/cache/c1
1077494138 .rw-r--r--     4 mroutis 27 Mar 19:29  └── 57a79031e1c40f85931829bc5fc552
 540589766 drwxr-xr-x     - mroutis 27 Mar 19:29  .dvc/cache/d3
1077494140 .rw-r--r--     4 mroutis 27 Mar 19:29  └── b07384d113edec49eaa6238ad5ff00
1077494138 .rw-r--r--     4 mroutis 27 Mar 19:29  out

Am I missing something? 🤔
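
The inode column in the listing above is what shows which cache entry `out` is linked to. The same check can be scripted; this is a sketch only, where `find_linked` is a hypothetical helper and the directory layout in `demo_find_linked` is made up for illustration.

```python
import os
import tempfile


def find_linked(path, cache_dir):
    """Return the cache files that are the same underlying file as `path`."""
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            candidate = os.path.join(root, name)
            # samefile compares device and inode numbers of the two paths.
            if os.path.samefile(path, candidate):
                matches.append(candidate)
    return matches


def demo_find_linked():
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = os.path.join(tmp, "cache")
        os.makedirs(cache_dir)
        first = os.path.join(cache_dir, "first")
        second = os.path.join(cache_dir, "second")
        for path, text in ((first, "foo\n"), (second, "bar\n")):
            with open(path, "w") as f:
                f.write(text)

        out = os.path.join(tmp, "out")
        os.link(second, out)  # the output is linked to the second entry only

        return [os.path.basename(p) for p in find_linked(out, cache_dir)]


print(demo_find_linked())
```

An empty result would mean the output is an independent copy, so writing to it cannot touch the cache.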

Contributor

pared commented Mar 28, 2019

@MrOutis As I mentioned before, there was already an unprotect step in Stage.create, so cache corruption would not occur.
