pipeline: show: doesn't show all DAGs in the projects when ran without arguments #2392
Comments
Some context We need to either:
|
In case of first approach we need to consider what to do in case like this:
|
I think that the most natural behavior is:
If this approach will be picked, then the case mentioned by @pared should yield:
assuming the target |
@drorata showing all pipelines seems reasonable in case of no target |
Maybe I'm confused, but, consider the following example. Assume we have a data file
and we build the following steps:
dvc run -f split.dvc -d data.txt -o first_2.txt -o last_2.txt "head -n 2 data.txt > first_2.txt && tail -n 2 data.txt > last_2.txt"
dvc run -f add_ts.dvc -d first_2.txt -o first_2_w_ts.txt "cp first_2.txt first_2_w_ts.txt && echo $(date) | tee -a first_2_w_ts.txt"
dvc run -f add_sig.dvc -d last_2.txt -o last_2_w_sig.txt "cp last_2.txt last_2_w_sig.txt && echo 'this is my signature' >> last_2_w_sig.txt"
dvc run -f combine.dvc -d first_2_w_ts.txt -d last_2_w_sig.txt -o combined.txt "cat first_2_w_ts.txt last_2_w_sig.txt > combined.txt"
dvc run -f lines_count.dvc -d combined.txt -m lines_count "wc -l combined.txt > lines_count"
Now the result of
dvc pipeline show --ascii lines_count.dvc | less
dvc pipeline show --ascii combine.dvc | less
dvc pipeline show --ascii split.dvc | less
is always the same: the whole pipeline is shown. This is yet another behavior, independent of the specified |
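To make the difference concrete, here is a minimal sketch in plain Python (not DVC's implementation) that models the five stages above as a dependency map and computes which stages each target depends on. The `DEPS` mapping and the `upstream` helper are illustrative names; the edges are read off the `-d` and `-o` flags of the commands above, and `data.txt` is assumed to be a plain file rather than a stage.

```python
# Hypothetical model of the example: each stage maps to the stages
# that produce its dependencies (read off the -d / -o flags above).
DEPS = {
    "split.dvc": [],
    "add_ts.dvc": ["split.dvc"],
    "add_sig.dvc": ["split.dvc"],
    "combine.dvc": ["add_ts.dvc", "add_sig.dvc"],
    "lines_count.dvc": ["combine.dvc"],
}


def upstream(target, deps=DEPS):
    """Return the target plus every stage it transitively depends on."""
    seen = set()
    stack = [target]
    while stack:
        stage = stack.pop()
        if stage in seen:
            continue
        seen.add(stage)
        # Follow the edges towards the sources of the DAG.
        stack.extend(deps[stage])
    return seen


for target in ("split.dvc", "combine.dvc", "lines_count.dvc"):
    print(target, "->", sorted(upstream(target)))
```

Under an "up to the target" behavior, `split.dvc` would render a single stage and `combine.dvc` four stages; only `lines_count.dvc` would render all five.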
@drorata |
Related #2391 |
Maybe I'm confused as well, but I think it's reasonable to change the behavior to show only the part of the pipeline up to the target. What problems can we expect here, @efiop? |
@shcheklein yep, we could do that by default and have |
Let's, for the time being, forget about the discrepancy between the actual behavior and the docs and focus on the desired behavior. I also agree that if a target is provided, then the DAG containing this target should be rendered from the target (including it) all the way up to (all) the source(s). Obviously, if the target is a "last" step of the DAG, then (almost, see the next item) the whole DAG is to be rendered. BTW, this raises the question of what to do with "siblings" (a term which is not well defined in a DAG's context) of a target. I'd suggest sticking to the concise approach I described above. If a target is specified and the flag Another case is when a target is not provided at all. What would be rendered then? I believe it is desirable to return all the possible DAGs. Indeed, there could be several disconnected DAGs, but it is still worth having them. Maybe, in this case, it won't be very helpful to actually render them, but having an explicit description of them (like
would yield an error saying something like:
An additional issue which was raised is the |
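For the no-target case, one way to picture "all the possible DAGs" is to group stages into disconnected components. The following is a sketch under stated assumptions, not DVC's code: `split_into_pipelines` is a hypothetical helper, the input is a plain mapping from each stage to the stages it depends on, and the stage names in the example are made up.

```python
from collections import defaultdict


def split_into_pipelines(deps):
    """Group stages into disconnected DAGs (weakly connected components)."""
    # Build an undirected view so edge direction does not matter for grouping.
    neighbours = defaultdict(set)
    for stage, parents in deps.items():
        neighbours[stage]  # make sure isolated stages are present
        for parent in parents:
            neighbours[stage].add(parent)
            neighbours[parent].add(stage)

    pipelines, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            stage = stack.pop()
            if stage in component:
                continue
            component.add(stage)
            stack.extend(neighbours[stage] - component)
        seen |= component
        pipelines.append(sorted(component))
    return pipelines


# Hypothetical stage names, for illustration only.
example = {
    "prepare.dvc": [],
    "train.dvc": ["prepare.dvc"],
    "report.dvc": [],
}
print(split_into_pipelines(example))
```

Each returned list could then be rendered, or simply listed, as a separate pipeline.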
Isn't this related to #2453? |
i would love this feature :) |
from dvc.repo import Repo as DVCRepo
dvcrepo = DVCRepo('.')
Gs = dvcrepo.pipelines
nodes = []
for G in Gs:
    nodes = nodes + [n for n in G.nodes()]
self.args.targets = nodes
Right? |
@mastaer Yes, that line is the cause. But the solution is a bit deeper, and probably would look like making |
Related #3661 |
In this comment, @efiop was not sure about the behavior of version 0.66. Here is a small example. Build the following DAG (in a fresh environment with version 0.66 installed):
dvc run -d my_data.txt -o head.txt "head -n 2 my_data.txt > head.txt"
dvc run -d my_data.txt -o tail.txt "tail -n 3 my_data.txt > tail.txt"
dvc run -d tail.txt -o tail_count "wc -l tail.txt > tail_count"
dvc run -d head.txt -o head_count "wc -l head.txt > head_count"
Then:
Now, if you update to version 0.93 then:
is the output of |
As per discussion with @pared, we realized that the behavior of dvc pipeline show --ascii contradicts the documentation. The current behavior looks for Dvcfile and fails if it is not found.
The expected behavior of dvc pipeline show --ascii is that it will find all .dvc files in the workspace and plot all pipelines. It does make sense IMHO that if a .dvc file is provided, then the yielded graph shows the steps up to the file's corresponding stage.
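As an illustration of the expected "find all .dvc files in the workspace" step, here is a rough sketch using only the standard library. It is not how DVC collects stages; `find_stage_files` is a hypothetical helper, and skipping the `.dvc/` and `.git/` directories is an assumption made for the example.

```python
from pathlib import Path


def find_stage_files(workspace="."):
    """List every *.dvc file and Dvcfile under workspace, as relative paths."""
    root = Path(workspace)
    found = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Assumption: ignore anything inside the .dvc/ and .git/ directories.
        if any(part in {".dvc", ".git"} for part in rel.parts[:-1]):
            continue
        if path.is_file() and (path.suffix == ".dvc" or path.name == "Dvcfile"):
            found.append(rel.as_posix())
    return sorted(found)


if __name__ == "__main__":
    for stage_file in find_stage_files("."):
        print(stage_file)
```

With such a list in hand, running the command without arguments would no longer depend on a Dvcfile being present.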