how-to: further improvements #1976

Merged
merged 2 commits into from
Nov 27, 2020
Changes from 1 commit
6 changes: 3 additions & 3 deletions content/docs/sidebar.json
@@ -88,7 +88,6 @@
"source": "what-is-dvc.md"
},
"dvc-files-and-directories",
"merge-conflicts",
{
"slug": "dvcignore",
"tutorials": {
@@ -101,8 +100,9 @@
"source": false,
"children": [
"stop-tracking-data",
"update-tracked-files",
"add-deps-or-outs-to-a-stage"
"update-tracked-data",
"add-deps-or-outs-to-a-stage",
"merge-conflicts"
]
},
"setup-google-drive-remote",
Contributor Author

Feels like this lonely guide could also be a how-to, at least until we expand on all the other types and figure out where to put them.

Contributor Author

@jorgeorpinel jorgeorpinel Nov 28, 2020

You can put pretty much anything into How to this way ... How to manage external data, How to contribute, How to optimize or deal with large data, How to run on Windows, etc ...

I think there should be something else that defines this section - closer to FAQ? some specific workflow questions?

In case of Google Drive - we'll have one page per remote. Even size in this case makes a bad fit. Eventually it might become bigger than everything else.

Putting "how to" before the title doesn't make it a how-to, the content has to be short recipes to address specific problems. The ones you mention are 50% + explanation (and not very short), the remaining part is examples. That doesn't really count (:

How to Setup Remotes also includes some explanation, but mostly how-to stuff.

Contributor Author

> How to Setup Remotes also includes some explanation, but mostly how-to stuff.

I guess you could argue it's a lot of explanation actually. Ok, let's leave it as a guide!

15 changes: 8 additions & 7 deletions content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md
@@ -1,20 +1,21 @@
---
title: 'How to Add Dependencies or Outputs to a Stage '
description: 'We have executed a stage, but later notice that some of the
dependencies or outputs are missing...'
title: 'How to Add Dependencies or Outputs to a Stage'
description: 'It is possible to add files or directories to existing stages, with
or without executing them.'
---

# How to Add Dependencies or Outputs

To add <abbr>dependencies</abbr> or <abbr>outputs</abbr> to a stage, edit the
`dvc.yaml` file (by hand or using `dvc run` with the `-f --no-exec` flags).
`dvc repro` will execute it and <abbr>cache</abbr> the output files when ready.
To add <abbr>dependencies</abbr> or <abbr>outputs</abbr> to a
[stage](/doc/command-reference/run), edit the `dvc.yaml` file (by hand or using
`dvc run` with the `-f --no-exec` flags). `dvc repro` will execute it and
<abbr>cache</abbr> the output files when ready.

If the stage has already been executed and the desired outputs are present in
the <abbr>workspace</abbr>, you can avoid `dvc repro` (which can be expensive
and is unnecessary) and use `dvc commit` instead.

> Both alternatives update `dvc.lock` accordingly.
> Note that both alternatives update `dvc.lock` too.

## Example

@@ -1,8 +1,14 @@
# Merge Conflicts
---
title: 'How to Merge Conflicts'
description: 'Git merge conflicts can happen in DVC metafiles when combining
changes from multiple team members.'
---

# How to Merge Conflicts in DVC Metafiles

Sometimes multiple members of a team might work on the same DVC-tracked
data. And when the time comes to combine their changes, merge conflicts can
arise in Git-tracked [metafiles](/doc/user-guide/dvc-files-and-directories),
happen in Git-tracked [metafiles](/doc/user-guide/dvc-files-and-directories),
which need to be resolved.

## `dvc.yaml`
48 changes: 48 additions & 0 deletions content/docs/user-guide/how-to/stop-tracking-data.md
@@ -0,0 +1,48 @@
---
title: 'How to Stop Tracking Data'
description: 'You can "un-track" files or directories added in error.'
---

# How to Stop Tracking Data

There are situations where you may want to "un-track" files or directories added
in error to DVC.

<details>

## Expand to add a sample `data.csv` file

`dvc add` creates a `.dvc` file to track the file, and lists it in `.gitignore`:

```dvc
$ dvc add data.csv

$ ls
data.csv data.csv.dvc
$ cat .gitignore
/data.csv
```

</details>

Let's undo `dvc add` with `dvc remove`. This deletes the `.dvc` file (and the
corresponding `.gitignore` entry), so the data file is no longer tracked:

```dvc
$ dvc remove data.csv.dvc

$ git status
Untracked files:
data.csv
```

You can run `dvc gc` with the `-w` option to remove the data (and all of its
previous versions, if any) from the <abbr>cache</abbr>:

```dvc
$ dvc gc -w
```

> Note that a very similar procedure works for `dvc.yaml` stages and their
> outputs.
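
As a rough way to confirm the effect of `dvc remove` described in this how-to, the sketch below checks the two side effects the page names: the `.dvc` file is gone and the `/data.csv` entry is no longer listed in `.gitignore`. The helper name `is_untracked` is hypothetical and not part of DVC; it only inspects files and assumes the layout shown in the `dvc add` example.

```python
from pathlib import Path


def is_untracked(workspace, name):
    """Return True if neither `<name>.dvc` nor a `/<name>` .gitignore entry remains.

    Hypothetical helper: it does not call DVC, it only looks at the workspace.
    """
    root = Path(workspace)
    dvc_file = root / f"{name}.dvc"
    gitignore = root / ".gitignore"
    # A missing .gitignore is treated as having no entries.
    entries = gitignore.read_text().splitlines() if gitignore.exists() else []
    return not dvc_file.exists() and f"/{name}" not in entries


if __name__ == "__main__":
    # Run from the project root after `dvc remove data.csv.dvc`.
    print(is_untracked(".", "data.csv"))
```

The check says nothing about the cache; whether old versions are still stored there depends on `dvc gc -w` having been run.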
@@ -1,8 +1,14 @@
# How to Update Tracked Files
---
title: 'How to Update Tracked Data'
description: 'Updating files or directories may mean either modifying some of
the data contents, or completely replacing them.'
---

Updating a tracked data file (or directory) may mean either
[modifying](#modifying-content) some of its contents, or completely
[replacing](#replacing-file) it with a new one (same file name).
# How to Update Tracked Data

Updating tracked files or directories may mean either
[modifying](#modifying-content) some of the data contents, or completely
[replacing](#replacing-file) them (under the same file name).

When the `cache.type` config option is set to `symlink` or `hardlink` (not the
default; see `dvc config cache` for more info), updating tracked files has to
4 changes: 3 additions & 1 deletion redirects-list.json
@@ -29,7 +29,9 @@

"^/doc/use-cases/data-and-model-files-versioning/?$ /doc/use-cases/versioning-data-and-model-files",
"^/doc/user-guide/dvc-file-format$ /doc/user-guide/dvc-files-and-directories",
"^/doc/user-guide/updating-tracked-files$ /doc/user-guide/how-to/update-tracked-files",
"^/doc/user-guide/updating-tracked-files$ /doc/user-guide/how-to/update-tracked-data",
"^/doc/user-guide/how-to/update-tracked-files$ /doc/user-guide/how-to/update-tracked-data",
"^/doc/user-guide/merge-conflicts$ /doc/user-guide/how-to/merge-conflicts",
Comment on lines +33 to +34
Contributor Author

Note for self:

  • Remember to request re-indexing of these 2 new URLs in search.google.com/u/1/search-console

Contributor Author

Oops, can't right now: https://support.google.com/webmasters/answer/6211453#url_inspection

Will try in a few days...

"^/doc/understanding-dvc(/.*)?$ /doc/user-guide/what-is-dvc",
"^/doc/commands-reference(/.*)?$ /doc/command-reference$1",
"^/doc/command-reference/plot$ /doc/command-reference/plots",
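
To sanity-check redirect entries like the ones in this hunk before requesting re-indexing, a minimal sketch follows. It assumes each entry has the form `<regex> <target>` separated by a single space and that `$1` in the target refers to the first capture group, as the entries suggest; the site's actual redirect engine is not shown in this PR, and the `resolve` function is a hypothetical helper.

```python
import re

# Rules copied from the redirects-list.json hunk: "<regex> <target>".
RULES = [
    "^/doc/user-guide/updating-tracked-files$ /doc/user-guide/how-to/update-tracked-data",
    "^/doc/user-guide/how-to/update-tracked-files$ /doc/user-guide/how-to/update-tracked-data",
    "^/doc/user-guide/merge-conflicts$ /doc/user-guide/how-to/merge-conflicts",
    "^/doc/commands-reference(/.*)?$ /doc/command-reference$1",
]


def resolve(path, rules=RULES):
    """Return the redirect target for `path`, or None if no rule matches."""
    for rule in rules:
        pattern, target = rule.split(" ", 1)
        match = re.search(pattern, path)
        if match is None:
            continue
        # Assumption: "$N" in the target stands for capture group N;
        # a group that did not participate in the match becomes "".
        return re.sub(
            r"\$(\d+)",
            lambda ref: match.group(int(ref.group(1))) or "",
            target,
        )
    return None


if __name__ == "__main__":
    for old in (
        "/doc/user-guide/how-to/update-tracked-files",
        "/doc/user-guide/merge-conflicts",
        "/doc/commands-reference/add",
    ):
        print(old, "->", resolve(old))
```

Rules are tried in order and the first match wins, which mirrors how the list reads but is itself an assumption about the real matcher.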