Pipeline variables from params file #3633

dmpetrov · 2020-04-14T10:41:38Z

With the introduction of the new multiple-stage pipeline, we will need to find a way of defining variables in the pipeline. For example, the intermediate file name cleansed.csv is used from two stages in the following pipeline and it needs to be defined into a variable:

stages:
    process:
        cmd: "./process.bin --input data --output cleansed.csv"
        deps:
             - path: data/
        outs:
             - path: cleansed.csv

    train:
        cmd: "python train.py"
        deps:
             - path: cleansed.csv
             - path: train.py
             - path: params.yaml
                params:
                     lr: 0.042
                     layers: 8
                     classes: 4
        outs:
             - path: model.pkl
             - path: log.csv
                cache: true
             - path: summary.json

We need to solve two problems here:

Define a variable in one place and reuse it from multiple places/stages.
Often users prefer to read file names from config files (like in the train stage), not from the command line (like in the process stage).

We can solve both of the problems using a single abstraction - parameters file variable:

stages:
    process:
        cmd: ./process.bin
        outs:
             - path: "params.yaml:cleansed_file_name"
        ....
    train:
        cmd: "python train.py"
        deps:
             - path: "params.yaml:cleansed_file_name"

This feature is useful in the current DVC design as well. It is convenient to read file names from params file and still define dependency properly like dvc run -d params.yaml:input_file -o params.yaml:model.pkl

The text was updated successfully, but these errors were encountered:

karajan1001 · 2020-04-18T04:38:11Z

In my case I stored these global variables (vocabulary file, cleaned data file, label name file) in a configuration file as default. And the processing cli can accept all these variables to override the defualt one. But my configuration file parsing and merging is coded inside the program, and had to be re-coded in every program.

tall-josh · 2020-04-18T06:55:04Z

Before I knew about multiple-stage pipelines I used the params functionality and hacked together a script to build a richer dvc run command. I used the same dot.notation you can use in the params command line args ie: dvc run -p params.yaml:data.output,train ... In the end my params.yaml looks like:

data:
    url: s3://mydata.npz
    out_path: data/mydata.npz
data/-o:  # Each element in this list will be
          # added as a  -o <arg>
    # For access strings within the current stage
    # you can also use '.out_path'
    - data.out_path
data/script:
    - python src/cli/data_download.py
    - --data-size
    - .url  # equivalent to data.url
    - --out-path
    - .out_path # equivalent to data.out_path

train:
  lr: 0.001
  ckpts: arteftacts/ckpts
train/-d:
  - data.out_path
train/-o:
  - artefacts/tflogs
train/script:
  - python train.py
  - --data
  - data.out_path
  - --ckpts
  - .ckpts
  - --lr
  - .lr

I'm not suggesting the multi-stage pipeline changes. I guess all I'm saying is:

I like that dot.notation and,
I like the idea of self-referencing inside the same yaml file.

After Thought

After sleeping on it I thought, 'I can use jinja2 for this':

stages:
  data:
    cmd: python src/cli/data_download.py --data-size {{ args.data.size }} --out-path {{ args.data.out_path }}
    wdir: ..
    outs:
    - {{ args.data.out_path }}
  train:
    cmd: python src/cli/train.py -d {{ args.data.out_path }} --params {{ args.train.params }} --log-dir {{ args.train.lodir }} --metrics-path {{ args.train.metrics }} --ckpts {{ args.train.ckpt_path }}
    wdir: ..
    deps:
    - {{ args.data.out_path }}
    - {{ args.train.params }}
    metrics:
    - {{ args.train.metrics }}
    outs:
    - {{ args.train.logdir }}
    - {{ args.train.ckpt_dir }}

I then load my args.yaml file and convert it into a namedtuple (so I can use the dot.notation). Bingo bango, we have our filled in .dvc file

dsuess · 2020-04-22T00:56:52Z

Unfortunately, YAML anchors can only solve part of this: You can use them to fill in repeating dependencies like this:

stages:
    process:
        outs:
             - path: &cleansed cleansed.csv
       ...

    train:
        deps:
             - path: *cleansed
        ...

However, there doesn't seem to be an easy way to also "paste" them into a full command, unless we write some custom YAML parsing.

dmpetrov · 2020-04-22T06:08:54Z

@dsuess can jinja2-like approach from @tall-josh potentially solve this issue?

dsuess · 2020-04-22T06:50:23Z

@dmpetrov I like the idea, but I see two drawbacks:

the necessary information is split across two files
allowing full jinja-templating on the configuration file might be too much

To alleviate the first, maybe we can add a special section variables to the Dvcfile like this:

variables:
  data:
    size: 10
    out_path: ./data/
  ...
  
stages:
  data:
    cmd: python src/cli/data_download.py --data-size {{ args.data.size }} --out-path {{ args.data.out_path }}
    wdir: ..
    outs:
    - {{ args.data.out_path }}
  train:
     ...

and then only template the stages section with those variables. No templating would be allowed for that first section.

For the second one, we could either look into not allowing certain functionality (like conditionals or loops) in jinja2. Alternatively, we need to implement our own simple templating engine that only supports a very limited sets of features. So far, I only see variable substitution to be necessary.

Alternatively, we could implement our own PyYaml tag to support pasting like this

karajan1001 · 2020-04-22T11:27:04Z

@dsuess, good idea, but there is a questionvariables need to be shared among DVCfiles, many of them ( for me embedding size, vocabulary file, sequence length and etc.) must stay the same in all stages. Maybe we should make the DVCfile inherit parameters from its ancestors.

tall-josh · 2020-04-22T11:57:54Z

There are definitely use cases for both in-file variables and external variable files. I agree with @dsuess in that full templating is probably not the way to go as a proper solution.
Maybe it's just a terminology thing but @karajan1001, if your thinking full blown inheritance then I think things could get messy. A simple way of importing from 1 or more files cloud be nice.

vars:
  IMPORT:
    somevars: some/vars.yaml
    morevars: more/vars.yaml

  data:
    size: 10
    out_path: ./data/
  ...
  
stages:
  data:
    cmd: python src/cli/data_download.py --data-size {{ vars.data.size }} --out-path {{ vars.data.out_path }}
    wdir: ..
    outs:
    - {{ args.data.out_path }}
  train:
     ...
    outs:
    - {{ somevars.something.x }}
    - {{ morevars.someotherthing.y }}

dmpetrov · 2020-04-22T12:21:16Z

@dsuess the variables section makes total sense. I see that as a simple form of variables with no param file associated. We definitely need it and this part has a higher priority.

It looks like all of us on the same page regarding Jinja - full support is too much.

Also, I like @tall-josh's idea of importing which unifies the explicit vars definition:

vars:
  IMPORT:
    somevars: some/vars.yaml
    morevars: more/vars.yaml

It looks like the high-level design is ready :) and we need to agree on the syntax and then a templating engine.

On the syntax side, I see these options:

Special patter in strings like "$vars.somevar"
jinja2-style variables: {{ vars.somevar }}
PyYaml tag: !var [*vars.somevar]
other options

What are your thoughts on this?

(3) seems like a more formalized way of doing (1)

PS: dvc run can use a bit different syntax or we can decide not to support variables on the command level - let's keep this discussion separately.

tall-josh · 2020-04-22T21:49:31Z

Hi @dmpetrov
Yep, we're on the same page. I like the first syntax simply because it's concise.

dsuess · 2020-04-22T22:43:51Z

@karajan1001 Agreed. I think this change make most sense with #3584

@dmpetrov If we want to share variables between files, I think @tall-josh solution is better than just implicitly passing variables through from the dependencies. That would also cover the use case of having a single variables-file that's shared throughout the repo. However, I also think that not passing them between files wouldn't be a problem, if the multi-stage files from #3584 are the new standard. This way, files would scope variables nicely.

Regarding the syntax: I don't have any preference as long as the files are still valid YAML. The first one will interfere with standard shell variable substitution in the command, which might be a good thing as using non-tracked variables in the command will break reproducibility. It might also be a bad thing if there's a valid use case for having shell-variables in the command.

tall-josh · 2020-04-22T23:53:03Z

@dsuess I'm not sure multi-stage is quite at the point of being the new standard, though I think it will get there soon. Someone correct me if I'm wrong, but as of a few days ago multi-stage ( #3584 ) did not support --params :-( . Something I'm been meaning to bring up their thread. I'll do that now I guess.

dmpetrov · 2020-04-23T07:59:32Z

@dsuess agree, defining variables from command line is not the high priority feature. We should have a great solution in the pipeline file. Also, a good point about env variables.

@tall-josh it would be great to hear your params ideas! You can post it on the multi-stage issue or just here.

tall-josh · 2020-04-23T10:28:12Z

@dmpetrov. No probs, I've raised the issue ( #3584 ), but it's not really an idea. It's more along the lines of keeping the multi-stage consistent with the regular single-stage .dvc files. ie:

dvc run --params myparams.yaml:train "cat myparams.yaml"

Produces a dvc file:

md5: d978fc54ad4ee1ad65e241f0bd122482
cmd: cat myparams.yaml
deps:
- path: myparams.yaml
  params:
    train:
      epochs: 10
      lr: 0.001

Whereas using multi-stage

dvc run -n stage1 --params myparams.yaml:train "cat myparams.yaml"

Results in a DvcFile that looks like this. Plus a lock file. Neither of which refer to the train key in myparams.yaml

stages:
  stage1:
    cmd: cat myparams.yaml
    deps:
    - myparams.yaml

I've directed @skshetry to this thread too as I think the two issues would benefit from one and other.

skshetry · 2020-05-04T07:04:44Z

Hi @tall-josh. We now have params support for the pipeline file (as we are calling it at the moment).
We are planning to release it soon (with a public beta in this week).

You have already seen the file format, so, I'll just enter to the discussions of variables. I'd propose
not having an indirection of reading variables from the params file, but from the pipeline file itself.
The pipeline file is very readable and clean, and we'll strive to make it so in the future.

And, we can introduce a variables section in the pipeline file, which can be used for reading params by the dvc.
This way, there can be clear separation between params required by your scripts vs required by
dvc so as not to duplicate information (but, again for read-only purpose, you can just add to
the pipeline file, no problem on that).

So, it should look something like:

variables:
  data:
    size: 10
    out_path: ./data/
stages:
  data:
    cmd: python src/cli/data_download.py --data-size {{ data.size }} --out-path {{ data.out_path }}
    wdir: ..
    outs:
    - {{ data.out_path }}

Note that, I'm not focusing on templating formats (as it's likely going to be tied with the implementation), but we should try to make it logic-less. Providing logic will only make it complicated, and ugly.

cc @karajan1001 @dsuess

elgehelge · 2020-05-07T20:13:40Z

Seems that most of you agree that reusable yaml-variables should be defined directly in the pipeline file (I do too). So maybe you should rename the issue @dmpetrov ?

tall-josh · 2020-05-09T09:19:54Z

@skshetry Amazing news! I can't wait to give it a crack!

I agree logic-less is a good call
I'm in two minds about whether to allow loading vars from separate files. Leaning more towards as you've described above, but I could be convinced.

tall-josh · 2020-05-29T07:05:41Z

@skshetry
I have finally got some overhead to give the params support for the pipeline file. Do you have an example somewhere?

EDIT:

Ahhh I misunderstood your comment. If I now understand correctly; There is now --params support for multistage dvc files, but you're still working on templating variables.

dmpetrov · 2020-05-31T03:00:19Z

@tall-josh yes, the templating is still in process. Now we are busy by releasing DVC 1.0 https://dvc.org/blog/dvc-3-years-and-1-0-release The parameter is a good feature for the next step. DVC 1.1 I think.

@elgehelge I think there is no need for renaming this issue. It was created specifically to find a solution for using params.yaml in the pipeline. It is a useful approach when you need to localize all the configuration in a single file - a very common pattern. Also, it might be useful when some external tools can properly read and modify simple parameters files params.yaml but do not understand the more complicated format of dvc.yaml. And we already good a lot of good ideas on how to implement it.

But I agree that the "static" params is a super useful tool and should be the first priority (I had no doubt with that). I can create a separate issue for the "static" parameters case.

I like how simple variables can be implemented:

variables:
  data:
    size: 10
    out_path: ./data/

Why don't we use the same approach for importing all the params:

params_variables: [params.yaml]

stages:
  mytrain:
    cmd: python train.py --batch-size {{ train.batch_size }} --out-path {{ train.out_path }}
    outs:
        - {{ train.out_path }}

As an option, we can import by param name or param section name:

params_variables:
    - params.yaml: [process.threshold, train]

In this case, dvc.lock should include all the params that were used in this stage like:

    params:
         train.batch_size: 2048
         train.out_path: "out/model.p"

What do you think folks?

tall-josh · 2020-05-31T06:03:50Z

@dmpetrov Thanks for the clarification. In the meantime, I had a crack at implementing a vars section at the top of the multistage dvc.yaml file using Jinja2. It's a bit janky, but it appears to have the desired effect.

https://github.com/tall-josh/dvc/tree/master/TEMPLATING_DEMO

The dvc.yaml file looks like:

vars:
    data: data.txt
    base_dir: base_dir
    stage1_out: stage1_output.txt
    stage2_out: stage2_output.txt
    splits:
        train: 0.89
        eval: 0.3

stages:
  st1:
    deps:
    - {{ vars.data }}
    outs:
    - {{ vars.base_dir }}/{{ vars.stage1_out }}
    cmd: >-
        mkdir {{ vars.base_dir }};
        cat {{ vars.data }} > {{ vars.base_dir }}/{{ vars.stage1_out }}
  st2:
    deps:
    - {{ vars.base_dir }}/{{ vars.stage1_out }}
    outs:
    - {{ vars.base_dir }}/{{ vars.stage2_out }}
    cmd: >-
        echo "train: {{ vars.splits['train'] }}" > {{ vars.base_dir }}/{{ vars.stage2_out }};
        cat {{ vars.base_dir }}/{{ vars.stage1_out }} >>
        {{ vars.base_dir }}/{{ vars.stage2_out }}

jcpsantiago · 2020-08-20T06:35:13Z

Hello everyone! I'm a bit late to the party, but here are some ideas about using variables for looping (original discussion started in https://discuss.dvc.org/t/using-dvc-to-keep-track-of-multiple-model-variants/471).

In a nutshell, I want to use the same pipeline to train models using different groups or sets of data. The steps are the same, but the data is slightly different and the outputs have different paths.

Having vars or a params.yaml file would be great! I like the idea of declaring how the DAG should look like and what parameters it needs to run

From the original post:

params:
  group1:
    name: "group 1"
    something_else: "foo"
  group2:
    name: "group 2"
    something_else: "meh"
  group3:
    name: "group 3"
    something_else: "gwah"

stages:
  get_some_data:
    cmd: ./get_data.sh {{params.name}}
    deps:
      - get_data.sh
      - query.sql
    outs:
      - data.csv
  train_model:
    cmd: Rscript train.R {{params.name}} {{params.something_else}}
    deps:
      - data.csv
    outs:
      - models/{{params.name}}/model.xgb
    metrics:
      - metrics/{{params.name}}/model_eval.json

No ideas at the moment regarding how to declare that some parameters should be looped on.

I imagine dvc repro would then run group-by-group the whole DAG from start to finish (as opposed to stage-by-stage).

tall-josh · 2020-08-24T00:24:27Z

I've been mulling this over too. Would be super handy!!

johnnychen94 · 2020-12-01T12:00:47Z

The biggest improvement of the parametrization feature I've experienced is that I can now scale the workflow by only modifying params.yaml:

This defines the minimal repro workflow (For simplicity I've removed a lot of params here), I use this for my gitlab CI setting:

# params.yaml

models:
  gaussian_15:
    name: gaussian_15
    data:
      noise_level: 15
    train:
      epochs: 2

datasets:
  train:
    name: train
    src: data/DnCNN.zip
    path: data/train/rawdata
  Set12:
    name: Set12
    src: data/DnCNN.zip
    path: data/test/Set12


testcases:
  set12_gaussian_15:
    model_name: gaussian_15
    noise_level: 15
    dataset: Set12

In this setting, I trained one model, and test the model on one dataset (Set12).

the corresponding dag is

I could easily repeat some of the contents in params.yml. In my case, I now trained three models, and test each model on two datasets(Set12 and Set68).

# params.yaml
models:
  gaussian_15:
    name: gaussian_15
    data:
      noise_level: 15
    train:
      epochs: 50

  gaussian_25:
    name: gaussian_25
    data:
      noise_level: 25
    train:
      epochs: 50

  gaussian_50:
    name: gaussian_50
    data:
      noise_level: 50
    train:
      epochs: 50

datasets:
  train:
    name: train
    src: data/DnCNN.zip
    path: data/train/rawdata
  Set12:
    name: Set12
    src: data/DnCNN.zip
    path: data/test/Set12
  Set68:
    name: Set68
    src: data/DnCNN.zip
    path: data/test/Set68


testcases:
  set12_gaussian_15:
    model_name: gaussian_15
    noise_level: 15
    dataset: Set12
  set12_gaussian_25:
    model_name: gaussian_25
    noise_level: 25
    dataset: Set12
  set12_gaussian_50:
    model_name: gaussian_50
    noise_level: 50
    dataset: Set12

  set68_gaussian_15:
    model_name: gaussian_15
    noise_level: 15
    dataset: Set68
  set68_gaussian_25:
    model_name: gaussian_25
    noise_level: 25
    dataset: Set68
  set68_gaussian_50:
    model_name: gaussian_50
    noise_level: 50
    dataset: Set68

🎉 the workflow gets scaled automatically without touching any of the other files:

It's still unclear to me what's the best practice to parameterized the workflow. But I'm pretty satisfied with how it makes the code clean.

I'm still playing with dvc and the parametrization. When I feel ready, I might summary them up as a blog.

dvc.yaml

stages:
  extract_data:
    foreach: ${datasets}
    do:
      cmd: mkdir -p ${item.path} && unzip -q ${item.src} "${item.name}/*" -d tmp && mv tmp/${item.name}/* ${item.path}
      deps:
      - ${item.src}
      outs:
      - ${item.path}
  prepare:
    foreach: ${models}
    do:
      cmd: >-
        julia --color=yes --startup=no --project=prepare -e "using Pkg; Pkg.instantiate()" &&
        julia --color=yes --startup=no --project=prepare \
          prepare/main.jl data/train/rawdata/ data/train/prepared \
          --noise-level ${item.data.noise_level} \
          --patch-stride ${item.data.patch_stride} \
          --patch-size ${item.data.patch_size} \
          --max-num-patches ${item.data.max_num_patches}
      deps:
      - data/train/rawdata
      - prepare
      outs:
      - data/train/prepared/${item.name}.h5
  train:
    foreach: ${models}
    do:
      cmd: >-
        julia --color=yes --startup=no --project=train -e "using Pkg; Pkg.instantiate()" &&
        julia --color=yes --startup=no --project=train \
          train/main.jl data/train/prepared/${item.name}.h5 models/${item.name} \
          --epochs ${item.train.epochs} \
          --batch-size ${item.train.batch_size} \
          --initial-lr ${item.train.initial_learning_rate} \
          --lr-decay-ratio ${item.train.learning_rate_decay_ratio} \
          --lr-decay-freq ${item.train.learning_rate_decay_frequency} \
          --model-width ${item.train.model_width} \
          --model-depth ${item.train.model_depth} \
          --padding-size ${item.train.padding_size} \
          --kernel-size ${item.train.kernel_size} \
          --batch-size ${item.train.batch_size} \
          --log-freq ${item.train.log_frequency} \
          --use-gpu ${item.train.use_gpu}
      deps:
      - data/train/prepared/${item.name}.h5
      - train
      outs:
      - models/${item.name}
  evaluate:
    foreach: ${testcases}
    do:
      cmd: >-
        julia --color=yes --startup=no --project=evaluate -e "using Pkg; Pkg.instantiate()" &&
        julia --color=yes --startup=no --project=evaluate evaluate/main.jl \
          models/${item.model_name} \
          data/test/${item.dataset} \
          results/${item.model_name}/${item.dataset} \
          ${item.noise_level}
      deps:
      - data/test/${item.dataset}
      - models/${item.model_name}
      - evaluate
      outs:
      - results/${item.model_name}/${item.dataset}/processed_images
      plots:
      - results/${item.model_name}/${item.dataset}/checkpoints.csv
      - results/${item.model_name}/${item.dataset}/metrics.csv
  summary:
    cmd: >-
      julia --color=yes --startup=no --project=evaluate -e "using Pkg; Pkg.instantiate()" &&
      julia --color=yes --startup=no --project=evaluate evaluate/summary.jl results metrics.json
    deps:
    - results
    - evaluate
    metrics:
    - metrics.json

Even simpler `params.yaml`

The params.yaml can be further simplified if dvc borrows travis build matrix semantics, e.g.:

stages:
  build:
    foreach-matrix:
          - ${model}
          - ${dataset}
    do:
      ...

or ansible filter semantics, e.g.,:

stages:
  build:
    # item[0] := ${model}
    # item[1] := ${dataset}
    foreach: ${model} | zip(${dataset})
    do:
      ...

I don't have a preference here, I like both specifications.

Travis build matrix is very easy to use
ansible filter is more flexible, it also allows custom filters

For normal data science projects, it might be better to adopt the Travis way because data scientists can adapt it more easily; they normally are not good programmers 😨 I'm not sure of it, though.

running concurrent jobs

As the workflow can now scale up very easily, it becomes a pain to queue all jobs into one dvc repro sequentially. I still don't have a good sense of how this can be done programmatically. But having a loose read lock(#4979) sounds like a good start.

~~I haven't followed all the discussions in #755 yet~~; it can be quite challenging to let DVC figure out what is the best parallelization strategy. So my intuition is that, with the current dvc repro -s functionality, it might be easier to put into another YAML files:

Edit: This seems to be option 1 that @dmpetrov mentioned in #755 (comment)

# jobs.yaml
jobs:
  - name: extract_data@*
    limit: 1 # 1 means no concurrency, this can be set as default value
  - name: prepare@*
    limit: 4 # at most 4 concurrent jobs
    env:
      JULIA_NUM_THREADS: 8
  - train@*
    limit: 4
    env:
      # if it's a scalar, apply to all jobs
      JULIA_NUM_THREADS: 8
      # if it's a list, iterate one by one
      CUDA_VISIBLE_DEVICES:
        - 0
        - 1
        - 2
        - 3
  - name: evaluate@*
    limit: 3
    env:
      JULIA_NUM_THREADS: 8
      CUDA_VISIBLE_DEVICES:
        - 0
        - 1
        - 2
  - name: summary

extract_data@* follows the same glob syntax in #4976

P.S. I don't have access to slurm cluster so my ideas on parallelization are basically on a local machine. It's not very clear to me whether distributed computation should be handled by dvc or by concrete languages/toolboxes that dvc calls.

skshetry · 2020-12-02T06:04:36Z

Hi, everyone. The parametrization feature is complete, at least the things that we planned for the next release. We are hoping to release a pre-release version this week. So, it'd be really helpful if you guys could provide feedback.

tall-josh · 2020-12-08T02:34:53Z

Are there docs or examples for this? Does it follow @johnnychen94 's latest comment? If there are no docs but I can get a nudge in the right direction I could put together a Notebook or something.

tall-josh · 2020-12-08T02:47:24Z

Ahhh think I found what I'm looking for in tests/func/test_run_multistage.py:test_run_params_default()

skshetry · 2020-12-08T03:50:23Z

Are there docs or examples for this?

@tall-josh, it's in the wiki as mentioned here: https://github.com/iterative/dvc/wiki/Parametrization

@johnnychen94's example also works.

Ahhh think I found what I'm looking for in tests/func/test_run_multistage.py:test_run_params_default()

What are you looking for exactly? I don't think it has anything to do with parameterization.

tall-josh · 2020-12-08T04:13:44Z

Thanks @skshetry that wiki perfect. Yeah, the test_run_params_default() did not give me what I want.

dmpetrov added the feature request Requesting a new feature label Apr 14, 2020

weekly-digest bot mentioned this issue Apr 19, 2020

Weekly Digest (12 April, 2020 - 19 April, 2020) #3651

Closed

tall-josh mentioned this issue Apr 23, 2020

dvc: implement multi-stage dvcfile #3584

Merged

3 tasks

tall-josh mentioned this issue May 31, 2020

fill_from_lock called when stage.cmd_chaged == True #3918

Closed

jorgeorpinel mentioned this issue Jun 18, 2020

ML experiments and hyperparameters tuning #2799

Closed

tall-josh mentioned this issue Aug 27, 2020

WIP dvc: implement vars templating in multistage dvc file #4463

Closed

5 tasks

dmpetrov mentioned this issue Sep 5, 2020

Using parameters on pipeline commands #4525

Closed

efiop assigned skshetry Sep 8, 2020

This was referenced Oct 18, 2020

dvc: consider introducing build matrix #1018

Closed

How to manage repetitive dvc run commands (like unpacking of many zip files)? #1119

Closed

Allow pipeline to create pipeline #4213

Closed

skshetry mentioned this issue Dec 1, 2020

Boolean value might get unnecessarily uppercased #4996

Closed

johnnychen94 mentioned this issue Dec 1, 2020

repro: add scheduler for parallelising execution jobs #755

Open

skshetry unpinned this issue Dec 5, 2020

skshetry closed this as completed Dec 8, 2020

pared mentioned this issue Dec 14, 2020

live logs initial #4902

Merged

2 tasks

skshetry mentioned this issue Dec 29, 2020

Nested Looping in Parameterization #5172

Closed

skshetry added the A: templating Related to the templating feature label Dec 30, 2020

skshetry mentioned this issue Feb 2, 2021

2.0 meta release plan #5367

Closed

31 tasks

jorgeorpinel mentioned this issue Feb 10, 2021

dvc.yaml: future of foreach stages #5440

Closed

skshetry mentioned this issue Jun 23, 2023

Stage-level vars alternative in DVC 3.X #9639

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pipeline variables from params file #3633

Pipeline variables from params file #3633

dmpetrov commented Apr 14, 2020 •

edited

Loading

karajan1001 commented Apr 18, 2020

tall-josh commented Apr 18, 2020 •

edited

Loading

dsuess commented Apr 22, 2020 •

edited

Loading

dmpetrov commented Apr 22, 2020

dsuess commented Apr 22, 2020

karajan1001 commented Apr 22, 2020 •

edited

Loading

tall-josh commented Apr 22, 2020 •

edited

Loading

dmpetrov commented Apr 22, 2020

tall-josh commented Apr 22, 2020

dsuess commented Apr 22, 2020

tall-josh commented Apr 22, 2020

dmpetrov commented Apr 23, 2020

tall-josh commented Apr 23, 2020

skshetry commented May 4, 2020

elgehelge commented May 7, 2020 •

edited

Loading

tall-josh commented May 9, 2020

tall-josh commented May 29, 2020 •

edited

Loading

dmpetrov commented May 31, 2020

tall-josh commented May 31, 2020

jcpsantiago commented Aug 20, 2020

tall-josh commented Aug 24, 2020 •

edited

Loading

johnnychen94 commented Dec 1, 2020 •

edited

Loading

skshetry commented Dec 2, 2020

tall-josh commented Dec 8, 2020

tall-josh commented Dec 8, 2020

skshetry commented Dec 8, 2020

tall-josh commented Dec 8, 2020

Pipeline variables from params file #3633

Pipeline variables from params file #3633

Comments

dmpetrov commented Apr 14, 2020 • edited Loading

karajan1001 commented Apr 18, 2020

tall-josh commented Apr 18, 2020 • edited Loading

dsuess commented Apr 22, 2020 • edited Loading

dmpetrov commented Apr 22, 2020

dsuess commented Apr 22, 2020

karajan1001 commented Apr 22, 2020 • edited Loading

tall-josh commented Apr 22, 2020 • edited Loading

dmpetrov commented Apr 22, 2020

tall-josh commented Apr 22, 2020

dsuess commented Apr 22, 2020

tall-josh commented Apr 22, 2020

dmpetrov commented Apr 23, 2020

tall-josh commented Apr 23, 2020

skshetry commented May 4, 2020

elgehelge commented May 7, 2020 • edited Loading

tall-josh commented May 9, 2020

tall-josh commented May 29, 2020 • edited Loading

dmpetrov commented May 31, 2020

tall-josh commented May 31, 2020

jcpsantiago commented Aug 20, 2020

tall-josh commented Aug 24, 2020 • edited Loading

johnnychen94 commented Dec 1, 2020 • edited Loading

Even simpler params.yaml

running concurrent jobs

skshetry commented Dec 2, 2020

tall-josh commented Dec 8, 2020

tall-josh commented Dec 8, 2020

skshetry commented Dec 8, 2020

tall-josh commented Dec 8, 2020

dmpetrov commented Apr 14, 2020 •

edited

Loading

tall-josh commented Apr 18, 2020 •

edited

Loading

dsuess commented Apr 22, 2020 •

edited

Loading

karajan1001 commented Apr 22, 2020 •

edited

Loading

tall-josh commented Apr 22, 2020 •

edited

Loading

elgehelge commented May 7, 2020 •

edited

Loading

tall-josh commented May 29, 2020 •

edited

Loading

tall-josh commented Aug 24, 2020 •

edited

Loading

johnnychen94 commented Dec 1, 2020 •

edited

Loading

Even simpler `params.yaml`