Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: don't set default cache dir #9037

Merged
merged 1 commit into from
Feb 17, 2023
Merged

Conversation

efiop
Copy link
Contributor

@efiop efiop commented Feb 16, 2023

Setting default values is very untypical for all other applications and the info config has is not enough to make a decision on the default cache dir, especially when dealing with gitfs.

Fixes #8705

dvc/cachemgr.py Outdated
Comment on lines 33 to 49
default = None
if repo and repo.dvc_dir and not isinstance(repo.fs, GitFileSystem):
default = repo.fs.path.join(repo.dvc_dir, self.CACHE_DIR)
elif repo:
dvc_dir = os.path.join(
repo.scm.root_dir,
*repo.fs.path.relparts(repo.root_dir, "/"),
repo.DVC_DIR,
)
if os.path.exists(dvc_dir):
default = os.path.join(dvc_dir, self.CACHE_DIR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the best look and there are lots of isinstance(repo.fs, GitFileSystem) across the codebase elsewhere, but the proper solution will have to wait. Will probably get back to this when messing with repo.tmp_dir for persistent index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the whole Repo() constructor taking a "git" filesystem to be a wrong approach, needlessly complicating things. It should require a physical filesystem during instantiation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skshetry That would not cover bare repos. We need to be able to operate on virtual filesystems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just need to clearly handle "workspace-fs" (e.g. where the scm is, odb is, etc) from "vfs" (where dvcfiles are, some configs are, etc). It will get better from here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bare repos aren’t that important, we could easily drop support (or, clone if needed).

I am not saying dvc not to support virtual filesystems, but loading config always requires a working filesystem. So, Repo.__init__ could always require a physical filesystem.

The other cases could be handled through different classmethods: Repo.from_git() etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, re-read your message. Yeah, I agree, the alternative is to just pass a rev and scm or something. Also looked at it a bit. Might get back to it later (external_repo is also annoying).

@efiop efiop force-pushed the config-cache branch 2 times, most recently from e5d721e to 340e352 Compare February 16, 2023 06:47
@codecov
Copy link

codecov bot commented Feb 16, 2023

Codecov Report

Base: 93.04% // Head: 92.98% // Decreases project coverage by -0.06% ⚠️

Coverage data is based on head (5b2056e) compared to base (d7ea7c3).
Patch coverage: 95.83% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9037      +/-   ##
==========================================
- Coverage   93.04%   92.98%   -0.06%     
==========================================
  Files         456      456              
  Lines       36758    36773      +15     
  Branches     5322     5208     -114     
==========================================
- Hits        34201    34193       -8     
- Misses       2030     2049      +19     
- Partials      527      531       +4     
Impacted Files Coverage Δ
dvc/config.py 98.51% <ø> (-0.02%) ⬇️
dvc/cachemgr.py 94.91% <92.85%> (-0.83%) ⬇️
dvc/commands/cache.py 92.68% <100.00%> (+1.77%) ⬆️
tests/remotes/git_server.py 33.33% <0.00%> (-9.10%) ⬇️
tests/conftest.py 76.51% <0.00%> (-6.82%) ⬇️
dvc/testing/fixtures.py 77.63% <0.00%> (-6.58%) ⬇️
tests/unit/test_analytics.py 95.71% <0.00%> (-4.29%) ⬇️
dvc/analytics.py 88.73% <0.00%> (-4.23%) ⬇️
dvc/dependency/__init__.py 98.21% <0.00%> (-1.79%) ⬇️
dvc/prompt.py 70.83% <0.00%> (-1.17%) ⬇️
... and 4 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@efiop efiop force-pushed the config-cache branch 2 times, most recently from db6215c to cc81681 Compare February 16, 2023 21:27
Setting default values is very untypical for all other applications and the
info config has is not enough to make a decision on the default cache dir,
especially when dealing with gitfs.

Fixes iterative#8705
@efiop efiop merged commit 4f49023 into iterative:main Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dvc.api.DVCFileSystem not working with rev=...
2 participants