remote: use string paths over PathInfo for performance reasons #3672
- cloud remotes still default to using PathInfo objects
- if a path is neither an abspath nor a relpath from cwd, the POSIX lstat() syscall runtime doubles (from calculating relpath from cwd)
@@ -67,6 +67,13 @@ def cache_dir(self, value):
     def supported(cls, config):
         return True

     @cached_property
     def cache_path(self):
         return os.path.abspath(self.cache_dir)
If a path is neither an abspath nor a relpath from the current working directory, the runtime of the POSIX lstat() syscall doubles. PathInfo.__str__ uses the relpath from cwd, but since cwd may change during runtime (as in some of our erepo tests), we can't cache relpath(self.cache_dir), and we don't want to compute it each time, so we just use abspath.
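The reasoning in this comment can be illustrated with a small sketch. It is not DVC's code: `CacheDirHolder` and `relpath_before_and_after_chdir` are hypothetical names, and the standard library's `functools.cached_property` stands in for whichever `cached_property` decorator the project actually imports. It shows that an absolute path can be cached safely, while a path relative to cwd goes stale once the process changes directory.

```python
import os
import tempfile
from functools import cached_property  # stdlib stand-in (assumption)


class CacheDirHolder:
    """Hypothetical stand-in for the remote class; not DVC's implementation."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir

    @cached_property
    def cache_path(self):
        # Computed once. An absolute path stays valid even if the
        # process later changes its working directory.
        return os.path.abspath(self.cache_dir)


def relpath_before_and_after_chdir(target, other_dir):
    """Return relpath(target) as seen from the current cwd and from other_dir."""
    original_cwd = os.getcwd()
    before = os.path.relpath(target)
    os.chdir(other_dir)
    try:
        after = os.path.relpath(target)
    finally:
        # Always restore the original working directory.
        os.chdir(original_cwd)
    return before, after


if __name__ == "__main__":
    tmp = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(tmp, "cache")
    print(CacheDirHolder(target).cache_path)
    print(relpath_before_and_after_chdir(target, tmp))
```

Because the relative form depends on where the process currently is, caching it would require invalidation on every `chdir`, which is the cost the comment wants to avoid.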
@efiop the solution is more generalized now.
PR iterative#3672 (6d8499e) extended `LocalRemote::_get_plans` to also return `checksums`. Because all of the values returned by `_get_plans` were passed down to `download()`, the extra checksum argument was interpreted as `no_progress_bar`, which therefore evaluated as true and stopped the progress bar from showing at all. Fix iterative#3874
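A minimal sketch of how this kind of regression can arise, under stated assumptions: `download`, `get_plans`, `run_buggy`, and `run_fixed` below are hypothetical stand-ins written for illustration, and the parameter order of `download` is assumed rather than copied from DVC. The point is only that forwarding every planned value positionally lets a newly added value land in an unrelated parameter.

```python
def download(from_info, to_info, name=None, no_progress_bar=False):
    """Hypothetical stand-in; reports which options it received."""
    return {"name": name, "no_progress_bar": bool(no_progress_bar)}


def get_plans():
    # Hypothetical plan: parallel lists, with checksums as the newly
    # added fourth element.
    return (["remote/a"], ["cache/a"], ["a"], ["abc123"])


def run_buggy():
    # Every element is forwarded positionally, so the checksum string
    # ends up in the no_progress_bar slot and is treated as true.
    return [download(*args) for args in zip(*get_plans())]


def run_fixed():
    # Unpack explicitly and pass only what download() expects.
    results = []
    for from_info, to_info, name, _checksum in zip(*get_plans()):
        results.append(download(from_info, to_info, name=name))
    return results


if __name__ == "__main__":
    print(run_buggy())
    print(run_fixed())
```

Passing optional arguments by keyword, as in `run_fixed`, is one way to keep such a call site robust when the producer grows extra return values.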
- I have followed the Contributing to DVC checklist.
- If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.
- I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible.
Should close #3635.
(`status` and `LocalRemote.cache_exists`)
The `_process`/`upload`/`download` pipeline has been left alone for now. If we are pushing or pulling enough files that using PathInfo objects becomes noticeably slow, the operation is already expected to be slow, given that we will be uploading/downloading a lot of data.