Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc: refactor cache saving #5314

Merged
merged 1 commit into from
Jan 22, 2021
Merged

dvc: refactor cache saving #5314

merged 1 commit into from
Jan 22, 2021

Conversation

efiop
Copy link
Contributor

@efiop efiop commented Jan 22, 2021

Moving cache-related modes from the tree to the cache itself. On our way to unify transfer/push/pull/save functionality into one, so that we could use them in the 3.0 cache format.

Pre-requisite for #5255

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@efiop efiop changed the title dvc: refactor cache saving [WIP] dvc: refactor cache saving Jan 22, 2021
@efiop efiop added refactoring Factoring and re-factoring skip-changelog Skips changelog labels Jan 22, 2021
@efiop efiop force-pushed the fix-5255 branch 8 times, most recently from a0bb3bb to bb62098 Compare January 22, 2021 14:55
@efiop efiop changed the title [WIP] dvc: refactor cache saving dvc: refactor cache saving Jan 22, 2021
shared = tree.config.get("shared")
if shared:
self._file_mode = 0o664
self._dir_mode = 0o2775
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as git, using S_ISGID sticky bit, this is a more proper way to handle shared cache scenario.

self._dir_mode = 0o2775
else:
self._file_mode = 0o666 & ~umask
self._dir_mode = 0o777 & ~umask
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we've used hardcoded modes, which was not right. Now properly using umask, same as git.

Comment on lines +461 to +470
if self.tree.scheme == "local":
with self.tree.state:
for checksum in named_cache.scheme_keys("local"):
cache_file = self.tree.hash_to_path_info(checksum)
if self.tree.exists(cache_file):
hash_info = HashInfo(
self.tree.PARAM_CHECKSUM, checksum
)
self.tree.state.save(cache_file, hash_info)
self.cache.protect(cache_file)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as "trust" logic in pull. This really emphasizes that push and pull are the same thing: transering objects from one cache to another, so the receiving cache should be able to save objects as it pleases. This and the "trust" logic below will move to cache.save() in the future.

tmp = tempfile.NamedTemporaryFile(delete=False, dir=path.parent).name
assert os.path.exists(path.parent)
assert os.path.isdir(path.parent)
dump_yaml(tmp, cache)
self.tree.move(PathInfo(tmp), path)
self.repo.cache.local.move(PathInfo(tmp), path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic related to saving stage cache object, should really be in the cache itself.

@efiop efiop changed the title dvc: refactor cache saving [WIP] dvc: refactor cache saving Jan 22, 2021
@efiop efiop changed the title [WIP] dvc: refactor cache saving dvc: refactor cache saving Jan 22, 2021
@efiop efiop merged commit cbd7472 into iterative:master Jan 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Factoring and re-factoring skip-changelog Skips changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant