From 193e0432c38077802209c9ca7fd48f486ba051e5 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Sun, 3 Jan 2021 19:08:03 -0600 Subject: [PATCH 1/8] cache: centralize concept explanation in tooltip --- content/docs/command-reference/add.md | 8 +++--- content/docs/command-reference/cache/index.md | 11 +++----- content/docs/command-reference/config.md | 7 ++---- .../user-guide/basic-concepts/dvc-cache.md | 6 ++--- .../user-guide/dvc-files-and-directories.md | 25 +++++++++++-------- .../user-guide/large-dataset-optimization.md | 9 ++----- 6 files changed, 29 insertions(+), 37 deletions(-) diff --git a/content/docs/command-reference/add.md b/content/docs/command-reference/add.md index d020bffdd0..e020dcdb4f 100644 --- a/content/docs/command-reference/add.md +++ b/content/docs/command-reference/add.md @@ -76,10 +76,10 @@ A `dvc add` target can be either a file or a directory. In the latter case, a `.dvc` file is created for the top of the hierarchy (with default name `<dir_name>.dvc`). -Every file inside is stored in the cache (unless the `--no-commit` option is -used), but DVC does not produce individual `.dvc` files for each file in the -entire tree. Instead, the single `.dvc` file references a special JSON file in -the cache (with `.dir` extension), that in turn points to the added files. +Every file in the dir is cached normally (unless the `--no-commit` option is +used), but DVC does not produce individual `.dvc` files for each one. Instead, +the single `.dvc` file references a special JSON file in the cache (with `.dir` +extension), that in turn points to the added files. 
> Refer to > [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory) diff --git a/content/docs/command-reference/cache/index.md b/content/docs/command-reference/cache/index.md index 0117e2b4f7..de2144d346 100644 --- a/content/docs/command-reference/cache/index.md +++ b/content/docs/command-reference/cache/index.md @@ -15,15 +15,12 @@ positional arguments: ## Description -The DVC Cache is where your data files, models, etc. (anything you want to -version with DVC) are actually stored. The data files and directories visible in -the <abbr>workspace</abbr> are links\* to (or copies of) the ones in cache. -Learn more about it's -[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory). +Tracked files and directories visible in the <abbr>workspace</abbr> are links\* +to the ones in the project's <abbr>cache</abbr>. -> \* Refer to +> \* Or copies. Refer to > [File link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache) -> for more information on file links on different platforms. +> for more information on supported linking on different platforms. For cache configuration options, refer to `dvc config cache`. diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md index 8ef1c166a9..654c574526 100644 --- a/content/docs/command-reference/config.md +++ b/content/docs/command-reference/config.md @@ -131,11 +131,8 @@ remote. See `dvc remote` for more information. ### cache -A DVC project <abbr>cache</abbr> is the hidden storage (by default located in -the `.dvc/cache` directory) for files that are tracked by DVC, and their -different versions. (See `dvc cache` and -[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory) -for more details.) 
This section contains the following options: +This section contains the following options, which affect the project's +<abbr>cache</abbr>: - `cache.dir` - set/unset cache directory location. A correct value is either an absolute path, or a path **relative to the config file location**. The default diff --git a/content/docs/user-guide/basic-concepts/dvc-cache.md b/content/docs/user-guide/basic-concepts/dvc-cache.md index 0c4febcb59..0fbe9fd10b 100644 --- a/content/docs/user-guide/basic-concepts/dvc-cache.md +++ b/content/docs/user-guide/basic-concepts/dvc-cache.md @@ -3,7 +3,7 @@ name: 'DVC Cache' match: ['DVC cache', cache, caches, cached, 'cache directory'] --- -The DVC cache is a hidden storage (by default located in the `.dvc/cache` -directory) for files that are tracked by DVC, and their different versions. -Learn more about it's +The DVC cache is a hidden storage (by default in `.dvc/cache`) for files and +directories tracked by DVC, and their different versions. Data is cached into a +special flattened [structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory). diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md index 5d687dafb4..059c9a48b0 100644 --- a/content/docs/user-guide/dvc-files-and-directories.md +++ b/content/docs/user-guide/dvc-files-and-directories.md @@ -256,7 +256,7 @@ Full <abbr>parameters</abbr> (key and value) are listed separately under - `.dvc/cache`: The <abbr>cache</abbr> directory will store your data in a special [structure](#structure-of-the-cache-directory). The data files and directories in the <abbr>workspace</abbr> will only contain links to the data - files in the cache. (Refer to + files in the cache (refer to [Large Dataset Optimization](/doc/user-guide/large-dataset-optimization). See `dvc config cache` for related configuration options. 
@@ -297,13 +297,17 @@ Full <abbr>parameters</abbr> (key and value) are listed separately under ## Structure of the cache directory -The DVC cache is a +The DVC cache is a hidden [content-addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage) -(by default in `.dvc/cache`), which adds a layer of indirection between code and +(by default in `.dvc/cache`). It adds a layer of indirection between code and data. -There are two ways in which the data is <abbr>cached</abbr>: As a single file -(eg. `data.csv`), or as a directory. +There are two ways in which the data is <abbr>cached</abbr>, depending on +whether it's a single file, or a directory (which may contain multiple files). + +Note that files are renamed, reorganized, and directory trees are flattened in the +cache, which always has exactly one depth level with 2-character directories +(based on hashes of the data contents, as explained next). ### Files @@ -331,9 +335,7 @@ data/images/ $ dvc add data/images ``` -The directory is cached as a JSON file with `.dir` extension. The files it -contains are stored in the cache regularly, as explained earlier. It looks like -this: +The resulting cache dir looks like this: ```dvc .dvc/cache/ @@ -345,8 +347,9 @@ this: └── 0b40427ee0998e9802335d98f08cd98f ``` -The `.dir` file contains the mapping of files in `data/images` (as a JSON -array), including their hash values: +The files in the directory are cached normally. The directory itself gets a +similar entry, with the `.dir` extension. It contains the mapping of files +inside (as a JSON array), identified by their hash values: ```dvc $ cat .dvc/cache/19/6a322c107c2572335158503c64bfba.dir @@ -354,4 +357,4 @@ $ cat .dvc/cache/19/6a322c107c2572335158503c64bfba.dir {"md5": "29a6c8271c0c8fbf75d3b97aecee589f", "relpath": "index.jpeg"}] ``` -That's how DVC knows that the other two cached files belong in the directory. +That's how DVC knows that those two cached files belong in the directory. 
diff --git a/content/docs/user-guide/large-dataset-optimization.md b/content/docs/user-guide/large-dataset-optimization.md index 7bdeb4e102..9ce6fa9bb5 100644 --- a/content/docs/user-guide/large-dataset-optimization.md +++ b/content/docs/user-guide/large-dataset-optimization.md @@ -1,12 +1,7 @@ # Large Dataset Optimization -In order to track the data files and directories added with `dvc add` or -`dvc run`, DVC moves all these files to the <abbr>cache</abbr>. A -<abbr>project</abbr>'s cache is the hidden storage (by default located in -`.dvc/cache`) for files that are tracked by DVC, and their different versions. -(See `dvc cache` and -[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) for more -details.) +In order to track the data files and directories added with `dvc add`, +`dvc repro`, etc. DVC moves all these files to the project's <abbr>cache</abbr>. However, the versions of the tracked files that [match the current code](/doc/tutorials/get-started/data-pipelines) are also From 3b0c7e728571fa4735ec3e6c55dbca7af5b815a2 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Sun, 3 Jan 2021 19:27:28 -0600 Subject: [PATCH 2/8] cmd: match metrics show usage block to actual cmd --- content/docs/command-reference/metrics/show.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/metrics/show.md b/content/docs/command-reference/metrics/show.md index 70e4e020e6..8da82aacbe 100644 --- a/content/docs/command-reference/metrics/show.md +++ b/content/docs/command-reference/metrics/show.md @@ -5,8 +5,8 @@ Print [metrics](/doc/command-reference/metrics), with optional formatting. 
## Synopsis ```usage -usage: dvc metrics show [-h] [-q | -v] [-a] [-T] [--all-commits] [-R] - [--show-json] +usage: dvc metrics show [-h] [-q | -v] [-a] [-T] [--all-commits] + [--show-json] [-R] [targets [targets ...]] positional arguments: From 049204c03fca0dcc5b62ffa94134ea694f99ec34 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Sun, 3 Jan 2021 19:52:08 -0600 Subject: [PATCH 3/8] cmd: add git-archive-like example to list per #1521 --- content/docs/command-reference/list.md | 28 ++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/content/docs/command-reference/list.md b/content/docs/command-reference/list.md index 9544e26bc3..4a24c0c962 100644 --- a/content/docs/command-reference/list.md +++ b/content/docs/command-reference/list.md @@ -125,3 +125,31 @@ images/dvc-logo-outlines.png.dvc images/owl_sticker.png ... ``` + +## Example: Create an archive of your DVC project + +Just like you can use `git archive` to make a quick bundle (ZIP) file of the +current code, `dvc list` can be easily complemented with simple archive tools to +bundle the current data files in the project. + +For example, here's a TAR archive of the entire <abbr>workspace</abbr> +(Linux/GNU): + +```dvc +$ dvc list . -R | tar -cvf project.tar -T - +``` + +Or separate ZIP archives of code and DVC-tracked data (POSIX terminal with +`zip`): + +``` +$ git archive -o code.zip HEAD +$ dvc list . -R --dvc-only | zip -@ data.zip +``` + +ZIP alternative for [POSIX on Windows](/doc/user-guide/running-dvc-on-windows) +(Python installed): + +```dvc +$ dvc list . 
-R --dvc-only | xargs python -m zipfile -c data.zip +``` From 667550a3973de04e13b0be42dddfc98c81abbc7b Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Sun, 3 Jan 2021 20:11:14 -0600 Subject: [PATCH 4/8] run: review -d usage and consistency t/o docs per #1700 --- .../docs/command-reference/params/index.md | 10 ++++----- content/docs/command-reference/run.md | 22 +++++++++---------- content/docs/start/data-pipelines.md | 2 +- .../use-cases/shared-development-server.md | 8 ++++--- .../how-to/add-deps-or-outs-to-a-stage.md | 8 +++---- 5 files changed, 26 insertions(+), 24 deletions(-) diff --git a/content/docs/command-reference/params/index.md b/content/docs/command-reference/params/index.md index 02a92c0e0f..b1e493e313 100644 --- a/content/docs/command-reference/params/index.md +++ b/content/docs/command-reference/params/index.md @@ -87,7 +87,7 @@ Define a [stage](/doc/command-reference/run) that depends on params `lr`, specify `layers` and `epochs` from the `train` group: ```dvc -$ dvc run -n train -d users.csv -o model.pkl \ +$ dvc run -n train -d train.py -d users.csv -o model.pkl \ -p lr,train.epochs,train.layers \ python train.py ``` @@ -130,7 +130,7 @@ Alternatively, the entire group of parameters `train` can be referenced, instead of specifying each of the group parameters separately: ```dvc -$ dvc run -n train -d users.csv -o model.pkl \ +$ dvc run -n train -d train.py -d users.csv -o model.pkl \ -p lr,train \ python train.py ``` @@ -139,7 +139,7 @@ In the examples above, the default parameters file name `params.yaml` was used. 
This file name can be redefined with a prefix in the `-p` argument: ```dvc -$ dvc run -n train -d logs/ -o users.csv \ +$ dvc run -n train -d train.py -d logs/ -o users.csv -f \ -p parse_params.yaml:threshold,classes_num \ python train.py ``` @@ -182,7 +182,7 @@ The following [stage](/doc/command-reference/run) depends on params `BOOL`, `INT`, as well as `TrainConfig`'s `EPOCHS` and `layers`: ```dvc -$ dvc run -n train -d users.csv -o model.pkl \ +$ dvc run -n train -d train.py -d users.csv -o model.pkl \ -p params.py:BOOL,INT,TrainConfig.EPOCHS,TrainConfig.layers \ python train.py ``` @@ -227,7 +227,7 @@ can be referenced supported), instead of the parameters in it: ```dvc -$ dvc run -n train -d users.csv -o model.pkl \ +$ dvc run -n train -d train.py -d users.csv -o model.pkl \ -p params.py:BOOL,INT,TestConfig \ python train.py ``` diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md index a835e36b53..a9e68ae5a5 100644 --- a/content/docs/command-reference/run.md +++ b/content/docs/command-reference/run.md @@ -73,7 +73,7 @@ so on (see `dvc dag`). This graph can be restored by DVC later to modify or ```dvc $ dvc run -n printer -d write.sh -o pages ./write.sh -$ dvc run -n scanner -d read.sh -d pages -o signed.pdf ./read.sh +$ dvc run -n scanner -d read.sh -d pages -o signed.pdf ./read.sh pages ``` Stage dependencies can be any file or directory, either untracked, or more @@ -151,7 +151,7 @@ variables in it that should be evaluated dynamically. 
Examples: ```dvc $ dvc run -n my_stage "./my_script.sh > /dev/null 2>&1" -$ dvc run -n my_stage './my_script.sh $MYENVVAR' +$ dvc run -n my_stage -f './my_script.sh $MYENVVAR' ``` ## Options @@ -317,17 +317,17 @@ dataset (`20180226` is a seed value): ```dvc $ dvc run -n train \ - -d matrix-train.p -d train_model.py \ - -o model.p \ - python train_model.py matrix-train.p 20180226 model.p + -d train_model.py -d matrix-train.p -o model.p \ + python train_model.py 20180226 model.p ``` To update a stage that is already defined, the `-f` (`--force`) option is needed. Let's update the seed for the `train` stage: ```dvc -$ dvc run -n train -f -d matrix-train.p -d train_model.py -o model.p \ - python train_model.py matrix-train.p 18494003 model.p +$ dvc run -n train --force \ + -d train_model.py -d matrix-train.p -o model.p \ + python train_model.py 18494003 model.p ``` ## Example: Separate stages in a subdirectory @@ -341,7 +341,7 @@ $ cd more_stages/ $ dvc run -n process_data \ -d data.in \ -o result.out \ - ./my_script.sh data.in result.out + ./my_script.sh --in data.in --out result.out $ tree .. . 
├── dvc.yaml @@ -379,7 +379,7 @@ Execute an R script that parses the XML file: $ dvc run -n parse \ -d parsingxml.R -d data/Posts.xml \ -o data/Posts.csv \ - Rscript parsingxml.R data/Posts.xml data/Posts.csv + Rscript parsingxml.R --in data/Posts.xml --out data/Posts.csv ``` To visualize how these stages are connected into a pipeline (given their outputs @@ -421,9 +421,9 @@ Define a stage with both regular dependencies as well as parameter dependencies: ```dvc $ dvc run -n train \ - -d matrix-train.p -d train_model.py -o model.p \ + -d train_model.py -d matrix-train.p -o model.p \ -p seed,train.lr,train.epochs - python train_model.py matrix-train.p model.p + python train_model.py 20200105 model.p ``` `train_model.py` will include some code to open and parse the parameters: diff --git a/content/docs/start/data-pipelines.md b/content/docs/start/data-pipelines.md index 076158130c..e3ee12a236 100644 --- a/content/docs/start/data-pipelines.md +++ b/content/docs/start/data-pipelines.md @@ -72,7 +72,7 @@ $ dvc run -n prepare \ ``` A `dvc.yaml` file is generated. It includes information about the command we ran -(`python src/prepare.py`), its <abbr>dependencies</abbr>, and +(`python src/prepare.py data/data.xml`), its <abbr>dependencies</abbr>, and <abbr>outputs</abbr>. <details> diff --git a/content/docs/use-cases/shared-development-server.md b/content/docs/use-cases/shared-development-server.md index 5874bdab7c..ecce0e815c 100644 --- a/content/docs/use-cases/shared-development-server.md +++ b/content/docs/use-cases/shared-development-server.md @@ -80,8 +80,9 @@ Let's say you are cleaning up raw data for later stages: ```dvc $ dvc add raw -$ dvc run -n clean_data -d raw -o clean ./cleanup.py raw clean - # The data is cached in the shared location. +$ dvc run -n clean_data -d cleanup.py -d raw -o clean \ + ./cleanup.py raw clean +# The data is cached in the shared location. 
$ git add raw.dvc dvc.yaml dvc.lock .gitignore $ git commit -m "cleanup raw data" $ git push @@ -97,7 +98,8 @@ manually. After this, they could decide to continue building this $ git pull $ dvc checkout A raw # Data is linked from cache to workspace. -$ dvc run -n process_clean_data -d clean -o processed ./process.py clean process +$ dvc run -n process_clean_data -d process.py -d clean -o processed \ + ./process.py clean processed $ git add dvc.yaml dvc.lock $ git commit -m "process clean data" $ git push diff --git a/content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md b/content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md index 159a760b88..950118bd2b 100644 --- a/content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md +++ b/content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md @@ -39,13 +39,13 @@ output. To add a missing dependency (`data/raw.csv`) as well as a missing output > dependency/output to the stage: > > ```dvc -> $ dvc run -f --no-exec \ -> -n prepare \ -> -d data/raw.csv \ +> $ dvc run -n prepare \ +> -f --no-exec \ > -d src/prepare.py \ +> -d data/raw.csv \ > -o data/train \ > -o data/validate \ -> python src/prepare.py +> python src/prepare.py data/raw.csv > ``` > > `-f` overwrites the stage in `dvc.yaml`, while `--no-exec` updates the stage From f5978df3a1cb52320f51b0430e6fa02136f310ab Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Sun, 3 Jan 2021 20:51:56 -0600 Subject: [PATCH 5/8] dvc.yaml: mention that there can be more than one per project per #1641 --- content/docs/command-reference/repro.md | 8 ++++---- content/docs/command-reference/root.md | 5 ++++- content/docs/user-guide/dvc-files-and-directories.md | 6 +++++- 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md index c571646d45..f0370af7db 100644 --- a/content/docs/command-reference/repro.md +++ b/content/docs/command-reference/repro.md @@ 
-24,11 +24,11 @@ positional arguments: `dvc repro` provides a way to regenerate data pipeline results, by restoring the dependency graph (a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) -implicitly defined by the stages listed in `dvc.yaml`. The commands defined in -these stages are then executed in the correct order, reproducing pipeline -results. +implicitly defined by the stages listed in `dvc.yaml` files. The commands +defined in these stages are then executed in the correct order, reproducing +pipeline results. -> Pipeline stages are defined in a `dvc.yaml` file (either manually or by using +> Pipeline stages are defined in `dvc.yaml` (either manually or by using > `dvc run`) while initial data dependencies can be registered with `dvc add`. This command is similar to [Make](https://www.gnu.org/software/make/) in diff --git a/content/docs/command-reference/root.md b/content/docs/command-reference/root.md index 0c3d4151c9..47bc6ed6d8 100644 --- a/content/docs/command-reference/root.md +++ b/content/docs/command-reference/root.md @@ -27,11 +27,14 @@ Use this command to build fixed paths to dependencies, files, or stage - `-v`, `--verbose` - displays detailed tracing information. -## Example: Basic output +## Examples + +Basic demonstration: ```dvc $ dvc root . + $ mkdir subdir $ cd subdir $ dvc root diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md index 059c9a48b0..e62f632b93 100644 --- a/content/docs/user-guide/dvc-files-and-directories.md +++ b/content/docs/user-guide/dvc-files-and-directories.md @@ -187,7 +187,11 @@ the following fields: `dvc.yaml` files also support `# comments`. -💡 We maintain a `dvc.yaml` +💡 Keep in mind that there may be more than one `dvc.yaml` file in each +<abbr>DVC project</abbr>. DVC checks all of them for consistency during +operations that require rebuilding DAGs (like `dvc dag`). 
+ +Note that we maintain a `dvc.yaml` [schema](https://github.com/iterative/dvcyaml-schema) that can be used by editors like [VSCode](/doc/install/plugins#visual-studio-code) or [PyCharm](/doc/install/plugins#pycharmintellij) to enable automatic syntax From 4781b64cfddab8a89dce4b444e941fe91f0e0a84 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Mon, 4 Jan 2021 15:21:10 -0600 Subject: [PATCH 6/8] cmd: use diff names/scripts in run examples per https://github.com/iterative/dvc.org/pull/2075#pullrequestreview-562344646 --- content/docs/command-reference/run.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md index a9e68ae5a5..bcceae3bf6 100644 --- a/content/docs/command-reference/run.md +++ b/content/docs/command-reference/run.md @@ -150,8 +150,8 @@ like `|` (pipe) or `<`, `>` (redirection), otherwise they would apply to variables in it that should be evaluated dynamically. 
Examples: ```dvc -$ dvc run -n my_stage "./my_script.sh > /dev/null 2>&1" -$ dvc run -n my_stage -f './my_script.sh $MYENVVAR' +$ dvc run -n first_stage "./a_script.sh > /dev/null 2>&1" +$ dvc run -n second_stage './another_script.sh $MYENVVAR' ``` ## Options From 785f8843e07855db648fc34eacbc9c8bac63ac01 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Mon, 4 Jan 2021 15:38:57 -0600 Subject: [PATCH 7/8] cmd: no need for --in/out in run cmd examples per https://github.com/iterative/dvc.org/pull/2075#pullrequestreview-562344027 --- content/docs/command-reference/run.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md index bcceae3bf6..df57eb69b9 100644 --- a/content/docs/command-reference/run.md +++ b/content/docs/command-reference/run.md @@ -341,7 +341,7 @@ $ cd more_stages/ $ dvc run -n process_data \ -d data.in \ -o result.out \ - ./my_script.sh --in data.in --out result.out + ./my_script.sh data.in result.out $ tree .. . 
├── dvc.yaml @@ -379,7 +379,7 @@ Execute an R script that parses the XML file: $ dvc run -n parse \ -d parsingxml.R -d data/Posts.xml \ -o data/Posts.csv \ - Rscript parsingxml.R --in data/Posts.xml --out data/Posts.csv + Rscript parsingxml.R data/Posts.xml data/Posts.csv ``` To visualize how these stages are connected into a pipeline (given their outputs From 56376d41388bd2cbd0570c8dac3d539dd0437dfb Mon Sep 17 00:00:00 2001 From: Jorge Orpinel <jorge@orpinel.com> Date: Mon, 4 Jan 2021 15:43:15 -0600 Subject: [PATCH 8/8] concept: don't emphasize cache structure is flat per https://github.com/iterative/dvc.org/pull/2075#pullrequestreview-562345357 --- content/docs/user-guide/basic-concepts/dvc-cache.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/basic-concepts/dvc-cache.md b/content/docs/user-guide/basic-concepts/dvc-cache.md index 0fbe9fd10b..e74d83c8f6 100644 --- a/content/docs/user-guide/basic-concepts/dvc-cache.md +++ b/content/docs/user-guide/basic-concepts/dvc-cache.md @@ -4,6 +4,6 @@ match: ['DVC cache', cache, caches, cached, 'cache directory'] --- The DVC cache is a hidden storage (by default in `.dvc/cache`) for files and -directories tracked by DVC, and their different versions. Data is cached into a -special flattened -[structure](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory). +directories tracked by DVC, and their different versions. Learn more about its +structure +[here](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory).