From ed71b6626211babc66b72adfc845b233a1e62a99 Mon Sep 17 00:00:00 2001
From: sarthakforwet
Date: Sat, 4 Jul 2020 20:40:03 +0530
Subject: [PATCH 01/10] Updated versioning use case tutorial to stop the use of Dvcfile

---
 .../tutorial.md | 23 +++++++++----------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 9630c790c7..cfedaada4a 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -317,26 +317,25 @@ When you have a script that takes some data as an input and produces other data
 > ```
 
 ```dvc
-$ dvc run -f Dvcfile \
+$ dvc run -n train \
           -d train.py -d data \
           -M metrics.csv \
           -o model.h5 -o bottleneck_features_train.npy -o bottleneck_features_validation.npy \
           python train.py
 ```
 
-Similar to `dvc add`, `dvc run` creates a
-[DVC-file](/doc/user-guide/dvc-files-and-directories) named `Dvcfile` (specified
-using the `-f` option). It tracks all outputs (`-o`) the same way as `dvc add`
-does. Unlike `dvc add`, `dvc run` also tracks dependencies (`-d`) and the
-command (`python train.py`) that was run to produce the result. We call such a
-DVC-file a "stage file".
+`dvc run` creates a pipeline stage named `train` (specified using the `-n`
+option) in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file)
+file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike `dvc
+add`, `dvc run` also tracks dependencies (`-d`) and the command (`python
+train.py`) that was run to produce the result.
 
-> At this point you could run `git add .` and `git commit` to save the `Dvcfile`
-> stage file and its changed outputs to the repository.
+> At this point you could run `git add .` and `git commit` to save the updated
+> stage and its changed outputs to the repository.
 
-`dvc repro` will run `Dvcfile` if any of its dependencies (`-d`) changed. For
-example, when we added new images to built the second version of our model, that
-was a dependency change. It also updates outputs and puts them into the
+`dvc repro` will run `train` stage if any of its dependencies (`-d`) changed.
+For example, when we added new images to built the second version of our model,
+that was a dependency change. It also updates outputs and puts them into the
 cache.
 
 To make things a little simpler: if `dvc add` and `dvc checkout` provide a basic

From 15af2143b3294d6e459e4c9cdfb85823a92bd935 Mon Sep 17 00:00:00 2001
From: sarthakforwet
Date: Sat, 4 Jul 2020 20:44:04 +0530
Subject: [PATCH 02/10] Restyled tutorials.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index cfedaada4a..7c35b0f969 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -326,9 +326,9 @@ $ dvc run -n train \
 
 `dvc run` creates a pipeline stage named `train` (specified using the `-n`
 option) in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file)
-file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike `dvc
-add`, `dvc run` also tracks dependencies (`-d`) and the command (`python
-train.py`) that was run to produce the result.
+file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
+`dvc add`, `dvc run` also tracks dependencies (`-d`) and the command
+(`python train.py`) that was run to produce the result.
 
 > At this point you could run `git add .` and `git commit` to save the updated
 > stage and its changed outputs to the repository.

From e2ffd14ac1dfe1dff6f2b42ab253c13ba577de94 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sun, 5 Jul 2020 19:00:18 -0500
Subject: [PATCH 03/10] Update content/docs/use-cases/versioning-data-and-model-files/tutorial.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 7c35b0f969..b3846936a5 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -320,7 +320,9 @@ When you have a script that takes some data as an input and produces other data
 $ dvc run -n train \
           -d train.py -d data \
           -M metrics.csv \
-          -o model.h5 -o bottleneck_features_train.npy -o bottleneck_features_validation.npy \
+          -o model.h5 \
+          -o bottleneck_features_train.npy \
+          -o bottleneck_features_validation.npy \
           python train.py
 ```
 

From e768a2efad19e172e248309fddc2be23b327519e Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sun, 5 Jul 2020 19:02:50 -0500
Subject: [PATCH 04/10] Update content/docs/use-cases/versioning-data-and-model-files/tutorial.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index b3846936a5..e2d9b6ea7e 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -317,9 +317,7 @@ When you have a script that takes some data as an input and produces other data
 > ```
 
 ```dvc
-$ dvc run -n train \
-          -d train.py -d data \
-          -M metrics.csv \
+$ dvc run -n train -d train.py -d data \
           -o model.h5 \
           -o bottleneck_features_train.npy \
           -o bottleneck_features_validation.npy \

From 69d802772bdc6e0dca4149d3e37764416de76a2f Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sun, 5 Jul 2020 19:03:14 -0500
Subject: [PATCH 05/10] Update content/docs/use-cases/versioning-data-and-model-files/tutorial.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index e2d9b6ea7e..07589b8c10 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -318,9 +318,8 @@ When you have a script that takes some data as an input and produces other data
 
 ```dvc
 $ dvc run -n train -d train.py -d data \
-          -o model.h5 \
-          -o bottleneck_features_train.npy \
-          -o bottleneck_features_validation.npy \
+          -o model.h5 -o bottleneck_features_train.npy \
+          -o bottleneck_features_validation.npy -M metrics.csv \
           python train.py
 ```
 

From a249875e96a626f3af3a99d7e4a5344eb874d65f Mon Sep 17 00:00:00 2001
From: Sarthak khandelwal
Date: Mon, 6 Jul 2020 10:35:12 +0530
Subject: [PATCH 06/10] Update content/docs/use-cases/versioning-data-and-model-files/tutorial.md

Co-authored-by: Jorge Orpinel
---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 07589b8c10..1c60f0bba1 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -329,8 +329,8 @@ file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
 `dvc add`, `dvc run` also tracks dependencies (`-d`) and the command
 (`python train.py`) that was run to produce the result.
 
-> At this point you could run `git add .` and `git commit` to save the updated
-> stage and its changed outputs to the repository.
+> At this point you could run `git add .` and `git commit` to save the `train`
+> stage and its outputs to the repository.
 
 `dvc repro` will run `train` stage if any of its dependencies (`-d`) changed.
 For example, when we added new images to built the second version of our model,

From e64a44cbfccb460c28828d1223c2efcfb1b15ab1 Mon Sep 17 00:00:00 2001
From: Sarthak khandelwal
Date: Mon, 6 Jul 2020 10:37:22 +0530
Subject: [PATCH 07/10] Corrected grammar of tutorials.md

Co-authored-by: Jorge Orpinel
---
 .../docs/use-cases/versioning-data-and-model-files/tutorial.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 1c60f0bba1..89e1fec4de 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -332,7 +332,7 @@ file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
 > At this point you could run `git add .` and `git commit` to save the `train`
 > stage and its outputs to the repository.
 
-`dvc repro` will run `train` stage if any of its dependencies (`-d`) changed.
+`dvc repro` will run the `train` stage if any of its dependencies (`-d`) changed.
 For example, when we added new images to built the second version of our model,
 that was a dependency change. It also updates outputs and puts them into the
 cache.

From 75588fb7f9f17726a53d321cf6426df42eb8fe0d Mon Sep 17 00:00:00 2001
From: sarthakforwet
Date: Mon, 6 Jul 2020 10:45:59 +0530
Subject: [PATCH 08/10] Updated styling of tutorial.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 89e1fec4de..8c2123fa09 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -332,10 +332,10 @@ file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
 > At this point you could run `git add .` and `git commit` to save the `train`
 > stage and its outputs to the repository.
 
-`dvc repro` will run the `train` stage if any of its dependencies (`-d`) changed.
-For example, when we added new images to built the second version of our model,
-that was a dependency change. It also updates outputs and puts them into the
-cache.
+`dvc repro` will run the `train` stage if any of its dependencies (`-d`)
+changed. For example, when we added new images to built the second version of
+our model, that was a dependency change. It also updates outputs and puts them
+into the cache.
 
 To make things a little simpler: if `dvc add` and `dvc checkout` provide a basic
 mechanism to version control large data files or models, `dvc run` and

From 6f066b89ece170658d5b008e19ad9deb1db58a33 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Mon, 6 Jul 2020 13:40:21 -0500
Subject: [PATCH 09/10] Update content/docs/use-cases/versioning-data-and-model-files/tutorial.md

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index 8c2123fa09..a1675b6caa 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -323,9 +323,9 @@ $ dvc run -n train -d train.py -d data \
           python train.py
 ```
 
-`dvc run` creates a pipeline stage named `train` (specified using the `-n`
-option) in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file)
-file. It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
+`dvc run` writes a pipeline stage named `train` (specified using the `-n`
+option) in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file).
+It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
 `dvc add`, `dvc run` also tracks dependencies (`-d`) and the command
 (`python train.py`) that was run to produce the result.

From 7be9a518bf2ec67340fb2aa168851bf6824b6fc9 Mon Sep 17 00:00:00 2001
From: "Restyled.io"
Date: Mon, 6 Jul 2020 18:40:31 +0000
Subject: [PATCH 10/10] Restyled by prettier

---
 .../use-cases/versioning-data-and-model-files/tutorial.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
index a1675b6caa..84427155fd 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md
@@ -325,9 +325,9 @@ $ dvc run -n train -d train.py -d data \
 
 `dvc run` writes a pipeline stage named `train` (specified using the `-n`
 option) in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file).
-It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike
-`dvc add`, `dvc run` also tracks dependencies (`-d`) and the command
-(`python train.py`) that was run to produce the result.
+It tracks all outputs (`-o`) the same way as `dvc add` does. Unlike `dvc add`,
+`dvc run` also tracks dependencies (`-d`) and the command (`python train.py`)
+that was run to produce the result.
 
 > At this point you could run `git add .` and `git commit` to save the `train`
 > stage and its outputs to the repository.
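
For reference, the `train` stage that the `dvc run -n train ...` command in this series writes to `dvc.yaml` would look roughly like the sketch below. This assumes the DVC 1.x stage schema; the file is not part of the patches, and the exact ordering of entries and the `cache: false` setting for the `-M` metrics file are assumptions based on that schema rather than output captured from this tutorial.

```yaml
# Hypothetical dvc.yaml after running the command from the tutorial
# (sketch only; generated content may differ in ordering or details)
stages:
  train:
    cmd: python train.py          # the command given after the options
    deps:                         # from -d
    - data
    - train.py
    outs:                         # from -o, cached by DVC
    - bottleneck_features_train.npy
    - bottleneck_features_validation.npy
    - model.h5
    metrics:                      # from -M, assumed to be stored uncached
    - metrics.csv:
        cache: false
```

With a stage like this in place, `dvc repro` compares the listed `deps` against their last recorded state and reruns `cmd` only when one of them changed, which is the behavior the patched paragraph describes.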