In contrived circumstances, repro of an added file with hardlinks enabled results in a not-read-only file #3668

Closed
charlesbaynham opened this issue Apr 23, 2020 · 2 comments
Labels
bug Did we break something? p2-medium Medium priority, should be done, but less important

Comments

Contributor

charlesbaynham commented Apr 23, 2020

DVC 0.93.0, Windows 10

This is quite a contrived issue, so please feel free to close it on the grounds that anyone messing around with .dvc files manually deserves what's coming to them. However:

Doing the following steps works fine:

git init
dvc init
dvc config cache.type reflink,hardlink,copy

echo Hello world > out.txt

ls -l out.txt
# returns -rw-r--r--
dvc add out.txt
ls -l out.txt
# returns -r--r--r--
dvc repro out.txt.dvc
ls -l out.txt
# returns -r--r--r--

Once out.txt is added to the cache, it's made read-only as it should be.
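For anyone reproducing this, comparing `ls -l` output by eye is error-prone, so here is a minimal sketch of a check that prints the protection state directly. `perm_state` is a hypothetical helper name, not part of DVC, and it assumes `ls -ld` prints the mode string as its first ten characters:

```shell
# Hypothetical helper (not part of DVC): prints "read-only" when no write bit
# is set for user, group, or other, and "writable" otherwise.
perm_state() {
    # ls -ld prints the mode string first, e.g. -r--r--r--
    mode=$(ls -ld -- "$1" | cut -c1-10)
    case "$mode" in
        *w*) echo "writable" ;;
        *)   echo "read-only" ;;
    esac
}
```

Running `perm_state out.txt` after each step above would then show at which point the file stops being protected.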

However, if you then alter out.txt.dvc, e.g. by deleting the md5 entry, running dvc repro out.txt.dvc results in out.txt being restored, but now it is unprotected:

# Now manually alter the .dvc file, removing the md5 entry but leaving the outs: section untouched
tail -n +2 out.txt.dvc > tmp
mv tmp out.txt.dvc

dvc repro out.txt.dvc
ls -l out.txt
# returns -rw-r--r--

Like I said, a bit contrived, but might point to something funny going on in repro when used for added files.

I bumped into this problem because I generated .dvc files with only the outs filled in, then used dvc checkout to load these files from the cache. I noticed that these .dvc files didn't have the md5 fields filled in, so I ran dvc repro to make a fully formed file. This worked, but made the files writable as described.

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Apr 23, 2020
Contributor

pared commented Apr 24, 2020

#!/bin/bash
rm -rf repo
mkdir repo

pushd repo
git init --quiet
dvc init -q
dvc config cache.type reflink,hardlink,copy

echo data >> data
ls -l | grep data

dvc add data -f stage.dvc -q
ls -l | grep data

dvc repro stage.dvc
ls -l | grep data

sed -i "s/6137cde4893c59f76f005a8123d8e8e6//g" stage.dvc
cat stage.dvc

dvc repro stage.dvc
ls -l | grep data

It seems that on Linux the permissions do not change. I think there should be no discrepancy between systems. Marking as a bug.
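To make the comparison between systems less dependent on reading `ls -l` output, the script above could be extended with a check like the following. This is only a sketch: `mode_of` and `check_mode_unchanged` are hypothetical helpers, not DVC commands, and they assume the first ten characters of `ls -ld` output are the mode string.

```shell
# Hypothetical helpers (not part of DVC).
# mode_of prints the mode string of a file, e.g. -r--r--r--
mode_of() { ls -ld -- "$1" | cut -c1-10; }

# check_mode_unchanged FILE CMD [ARGS...]
# Runs CMD and reports whether FILE's mode string differs afterwards.
# Returns non-zero when the mode changed.
check_mode_unchanged() {
    file=$1; shift
    before=$(mode_of "$file")
    "$@" >/dev/null 2>&1    # output of the wrapped command is discarded
    after=$(mode_of "$file")
    if [ "$before" = "$after" ]; then
        echo "unchanged: $after"
    else
        echo "changed: $before -> $after"
        return 1
    fi
}
```

For example, `check_mode_unchanged data dvc repro stage.dvc` after the `sed` step should report a change on a system that shows the behaviour from the original report, and no change otherwise.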

@pared pared added the bug Did we break something? label Apr 24, 2020
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Apr 24, 2020
@pared pared added the p2-medium Medium priority, should be done, but less important label Apr 24, 2020
Contributor

efiop commented Dec 8, 2023

Closing as stale.

@efiop efiop closed this as not planned Won't fix, can't repro, duplicate, stale Dec 8, 2023