Introducing cache types: data, metrics and plots, run-cache and per-file #4040
Labels: feature request, p2-medium, question, research
Today we use a single remote to store all the data (which can be overridden with the --remote option). However, the cache holds different types of information, such as data, metrics and plots, and the run-cache. There might be several reasons (data sensitivity or optimizations) to store these artifacts in different remotes.
It would be great to introduce "types" of remote to group some of the cache types.
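To make the idea of grouping cache types by remote more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the remote names (storage, small-files, ci-cache), the cache type labels, and the functions remote_for and group_by_remote are hypothetical and are not part of DVC.

```python
# Hypothetical sketch only: none of these names exist in DVC.
from collections import defaultdict

# Assumed name of the single remote used when no type-specific remote is set.
DEFAULT_REMOTE = "storage"

# Assumed mapping from cache type to the remote that should hold it.
REMOTE_BY_TYPE = {
    "data": "storage",
    "metrics": "small-files",
    "plots": "small-files",
    "run-cache": "ci-cache",
}


def remote_for(cache_type, mapping=REMOTE_BY_TYPE, default=DEFAULT_REMOTE):
    """Return the remote name for a cache type, falling back to the default."""
    return mapping.get(cache_type, default)


def group_by_remote(artifacts, mapping=REMOTE_BY_TYPE, default=DEFAULT_REMOTE):
    """Group (path, cache_type) pairs by the remote they would be pushed to."""
    groups = defaultdict(list)
    for path, cache_type in artifacts:
        groups[remote_for(cache_type, mapping, default)].append(path)
    return dict(groups)


if __name__ == "__main__":
    artifacts = [
        ("data/train.csv", "data"),
        ("metrics.json", "metrics"),
        ("plots/loss.csv", "plots"),
    ]
    for remote, paths in group_by_remote(artifacts).items():
        print(remote, paths)
```

With a mapping like this, a single push could iterate over the groups and send each one to its own remote, while any cache type without an entry would still go to the default remote as it does today.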
Proposal ideas
This is the very first iteration on the subject and I don't have a clear proposal yet. We need to take a look at some analogs and possible solutions. How I see it for now:
Related subjects
Remote per file
This use case can be extended to per-file scenarios. Sometimes a specific file requires a special remote (one bucket for data sources, another for the data the pipeline derives from them). An extreme case is imported data sources. It would be great to keep the information about remotes and have a single dvc push command use all of them, instead of specifying the --remote option every time.
Change remotes globally
See #2960
External workspaces
It might also be related to #3920
Random ideas
Does it make sense to use the new dvc.yaml for storing all or part of this information?