This repository has been archived by the owner on Jul 5, 2022. It is now read-only.

Commit

Merge pull request #40 from iterative/iesahin/fix-minor-issues-in-acc…
…essing

Command and typo fixes in Accessing Data scenario
iesahin authored Mar 10, 2021
2 parents 23d608f + 33f7b2b commit 5ceaed3
Showing 5 changed files with 12 additions and 9 deletions.
2 changes: 1 addition & 1 deletion get-started/accessing/01-download.md
@@ -1,6 +1,6 @@
# Download

-We can download any file in a DVC repository:
+We can download any file from a DVC repository:

```
dvc get \
8 changes: 4 additions & 4 deletions get-started/accessing/02-discovering-files.md
@@ -1,9 +1,9 @@
# Discovering files

-As we mentioned, if you look at the [repository][dr], you won't see
-`data/data.xml` or `model.pkl`, or any DVC-tracked files. They are not stored
-in Git. We can `dvc get` them, but how do we even know what data is tracked in a
-remote DVC repo before accessing it?
+If you look at the [repository][dr], you won't see `data/data.xml` or
+`model.pkl`, or any DVC-tracked files. They are not stored in Git. We can
+`dvc get` them, but how do we even know what data is tracked in a remote DVC
+repo before accessing it?

[dr]: https://github.com/iterative/dataset-registry

6 changes: 4 additions & 2 deletions get-started/accessing/03-python-api.md
@@ -2,13 +2,15 @@

Besides using DVC commands in the command line, we can also access any
DVC-tracked artifact "natively" from Python with
-[the API](https://dvc.org/doc/api-reference):
+[the API](https://dvc.org/doc/api-reference). Please click the below link to open the Python script:

`process.py`{{open}}

The script downloads the data like `dvc get` and counts the number of lines in it:

-`python3 process.py`{{execute}}
+`python3 ~/process.py`{{execute}}
+
+Note that the script doesn't download the data to a file before counting the lines.

The interface of [`dvc.api.open`][apiopen] is similar to the one we've
seen already. It receives Git repo URL and path as arguments, and works
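Reviewer note on the `03-python-api.md` hunk above: the tutorial text says the script streams the data and counts its lines without first saving it to a file. The actual `process.py` is not shown in this diff, so the following is only a minimal sketch of that idea. The helper names `count_lines` and `count_remote_lines` are hypothetical. `dvc.api.open` is used as a context manager that takes a path and a `repo` URL, as the text describes.

```python
def count_lines(stream):
    """Count lines by iterating the stream, so the full content
    is never held in memory at once."""
    return sum(1 for _ in stream)


def count_remote_lines(path, repo):
    """Hypothetical wrapper: stream a DVC-tracked file and count its lines.

    Assumes the `dvc` package is installed and the repo is reachable.
    The import is done lazily so `count_lines` works without DVC.
    """
    import dvc.api

    # dvc.api.open yields a file-like object; nothing is written to disk here.
    with dvc.api.open(path, repo=repo) as f:
        return count_lines(f)
```

With DVC installed, a call such as `count_remote_lines("get-started/data.xml", "https://github.com/iterative/dataset-registry")` would exercise the same repo and path that appear elsewhere in this commit.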
3 changes: 2 additions & 1 deletion get-started/accessing/04-reusing-data-or-models.md
@@ -11,6 +11,7 @@ A DVC repository and the `dvc import` command are enough to export data and mode
reuse them, track upstream changes, etc. Let's give it a try:

```
+mkdir data
dvc import \
https://github.com/iterative/dataset-registry \
get-started/data.xml -o data/data.xml
@@ -19,7 +20,7 @@ dvc import \
`dvc import` command creates `data/data.xml.dvc` to track the dependency. You
can view this file in the editor:
-`data/data.xml.dvc`{{open}}
+`project/data/data.xml.dvc`{{open}}
The `url` and `rev_lock` subfields under `repo` are used to save the origin and
the version of the dependency, respectively:
2 changes: 1 addition & 1 deletion get-started/accessing/05-congrats.md
@@ -5,7 +5,7 @@ download model and data files with `dvc get` or import them to DVC repositories
with `dvc import`. DVC also has an API that streams large files directly into
the memory with `dvc.api.open`.

-Our vision is to have a central registry for all the data and model files and
+DVC allows to have a central registry for all the data and model files and
using them in different projects. It's based on Git, and provides flexibility
without requiring additional infrastructure.
