guide: Using a shared cache in networked filesystems (NAS/NFS, etc) #4303
Comments
@skshetry I think it can be part of #103? @jorgeorpinel Can we include this after we do the major cloud providers?
More discussions, links here: https://discord.com/channels/485586884165107732/1077945387136073869
Good idea (and it's a task in #2866). I can certainly start it, not sure how it will look as it's not technically a remote storage type but we'll see.
Good point, @jorgeorpinel. On second thought, I don't think it makes sense to mix it with #103 since it's not about remote storage, and NAS/NFS are simpler as remote storage (don't have to worry about checkouts, temp directories, sqlite databases, etc.). I think we need a page in the data management guide about caching, which could include:
A cache guide sounds more appropriate 👍🏼 NFS is already mentioned in the remote guide.
Now that it looks like we no longer need to recommend any specific setup for NFS, I'm not so sure this is needed. Closing for now, although maybe coming back to a cache guide in the future would make sense.
One item that might be important still for this scenario is using symlinks, not sure it's worth creating a separate page for this though.
@shcheklein One thought was to move https://dvc.org/doc/user-guide/how-to/share-a-dvc-cache to the bottom of https://dvc.org/doc/user-guide/data-management/large-dataset-optimization since I think they are both related to how to handle large amounts of data in the cache and cover most of these topics, but I wasn't sure it's worth the effort to move more stuff around (edit: actually, the concern is not really about effort but about whether it's impactful). WDYT?
Yep, not a huge priority. Also, we can always just add a link from one page to the other, it's faster. I'm just worried that even in basic scenarios people might not realize that DVC is copying files. We used to have a warning in such cases in DVC but I don't think it exists anymore.
Report
It seems that there have been more questions regarding NAS/NFS being slow. And the common fix that we suggest is to set index.dir and state.dir to a directory in a non-networked filesystem.
It'd be nice to have a page to mention this, say why we need to do this, etc.
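For reference, a minimal sketch of what that fix could look like, written as a small Python helper that builds the two `dvc config` invocations. The helper name, the `/var/tmp/dvc` root, and the `index`/`state` subdirectory names are all hypothetical; the only things taken from this issue are the `index.dir` and `state.dir` option names. Using `--local` here is an assumption that the setting should stay per-machine rather than be committed to the repo.

```python
import shlex
from pathlib import PurePosixPath


def dvc_local_tmp_commands(local_root):
    """Build `dvc config` commands pointing index.dir and state.dir
    at subdirectories of a local (non-networked) directory.

    `local_root` and the subdirectory names are illustrative choices,
    not something DVC requires.
    """
    root = PurePosixPath(local_root)
    return [
        # --local keeps the setting out of the shared, committed config
        ["dvc", "config", "--local", "index.dir", str(root / "index")],
        ["dvc", "config", "--local", "state.dir", str(root / "state")],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in dvc_local_tmp_commands("/var/tmp/dvc"):
        print(shlex.join(argv))
```

The commands are only printed, not executed, so the paths can be checked against the machine's actual local disk before anything is changed.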