diff --git a/README.Rmd b/README.Rmd
index 0765a3c..9d5685a 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -23,6 +23,8 @@ here::i_am("README.Rmd")
 [![R-CMD-check](https://github.com/terminological/dtrackr/workflows/R-CMD-check/badge.svg)](https://github.com/terminological/dtrackr/actions)
 [![DOI](https://zenodo.org/badge/335974323.svg)](https://zenodo.org/badge/latestdoi/335974323)
 [![dtrackr status badge](https://terminological.r-universe.dev/badges/dtrackr)](https://terminological.r-universe.dev)
+[![metacran downloads](https://cranlogs.r-pkg.org/badges/dtrackr)](https://cran.r-project.org/package=dtrackr)
+[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/dtrackr)](https://cran.r-project.org/package=dtrackr)
@@ -30,7 +32,7 @@ here::i_am("README.Rmd")
 Accurate documentation of a data pipeline is a first step to reproducibility,
 and a flow chart describing the steps taken to prepare data is a useful part of
-this documentation. In analyses that relies on data that is frequently updated,
+this documentation. In analyses that rely on data that is frequently updated,
 documenting a data flow by copying and pasting row counts into flowcharts in
 PowerPoint becomes quickly tedious. With interactive data analysis, and
 particularly using RMarkdown, code execution sometimes happens in a non-linear
@@ -47,7 +49,7 @@ also a visual check of the actual data processing.
 ## Installation
-In general use `dtrackr` is expected to be installed alongside the `idyverse`
+In general use `dtrackr` is expected to be installed alongside the `tidyverse`
 set of packages. It is recommended to install `tidyverse` first.
 Binary packages of `dtrackr` are available on CRAN and r-universe for `macOS`
@@ -164,7 +166,8 @@ of `pdf`, `png`, `svg` or `ps`), ready for submission to Nature.
 This is a trivial example, but the more complex the pipeline, the bigger benefit
 you will get.
-Check out the [main documentation for detailed examples](https://terminological.github.io/dtrackr/)
+Check out the [main documentation for more details](https://terminological.github.io/dtrackr),
+and in particular the [getting started vignette](https://terminological.github.io/dtrackr/articles/dtrackr.html).
 ## Testing and integration
diff --git a/README.md b/README.md
index fa38a5e..4ec85c8 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,9 @@
 [![DOI](https://zenodo.org/badge/335974323.svg)](https://zenodo.org/badge/latestdoi/335974323)
 [![dtrackr status
 badge](https://terminological.r-universe.dev/badges/dtrackr)](https://terminological.r-universe.dev)
+[![metacran
+downloads](https://cranlogs.r-pkg.org/badges/dtrackr)](https://cran.r-project.org/package=dtrackr)
+[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/dtrackr)](https://cran.r-project.org/package=dtrackr)
@@ -14,7 +17,7 @@ badge](https://terminological.r-universe.dev/badges/dtrackr)](https://terminolog
 Accurate documentation of a data pipeline is a first step to
 reproducibility, and a flow chart describing the steps taken to prepare
-data is a useful part of this documentation. In analyses that relies on
+data is a useful part of this documentation. In analyses that rely on
 data that is frequently updated, documenting a data flow by copying and
 pasting row counts into flowcharts in PowerPoint becomes quickly
 tedious. With interactive data analysis, and particularly using
@@ -35,7 +38,7 @@ visual check of the actual data processing.
 ## Installation
 In general use `dtrackr` is expected to be installed alongside the
-`idyverse` set of packages. It is recommended to install `tidyverse`
+`tidyverse` set of packages. It is recommended to install `tidyverse`
 first.
 Binary packages of `dtrackr` are available on CRAN and r-universe for
@@ -148,8 +151,10 @@ Nature.
 This is a trivial example, but the more complex the pipeline, the bigger
 benefit you will get.
-Check out the [main documentation for detailed
-examples](https://terminological.github.io/dtrackr/)
+Check out the [main documentation for more
+details](https://terminological.github.io/dtrackr), and in particular
+the [getting started
+vignette](https://terminological.github.io/dtrackr/articles/dtrackr.html).
 ## Testing and integration
diff --git a/docs/index.html b/docs/index.html
index 5d60d0b..9981c30 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -109,13 +109,13 @@
-Accurate documentation of a data pipeline is a first step to reproducibility, and a flow chart describing the steps taken to prepare data is a useful part of this documentation. In analyses that relies on data that is frequently updated, documenting a data flow by copying and pasting row counts into flowcharts in PowerPoint becomes quickly tedious. With interactive data analysis, and particularly using RMarkdown, code execution sometimes happens in a non-linear fashion, and this can lead to, at best, confusion and at worst erroneous analysis. Basing such documentation on what the code does when executed sequentially can be inaccurate when the data has being analysed interactively.
+Accurate documentation of a data pipeline is a first step to reproducibility, and a flow chart describing the steps taken to prepare data is a useful part of this documentation. In analyses that rely on data that is frequently updated, documenting a data flow by copying and pasting row counts into flowcharts in PowerPoint becomes quickly tedious. With interactive data analysis, and particularly using RMarkdown, code execution sometimes happens in a non-linear fashion, and this can lead to, at best, confusion and at worst erroneous analysis. Basing such documentation on what the code does when executed sequentially can be inaccurate when the data has being analysed interactively.
 The goal of dtrackr is to take away this pain by instrumenting and monitoring a dataframe through a dplyr pipeline, creating a step-by-step summary of the important parts of the wrangling as it actually happened to the dataframe, right into dataframe metadata itself. This metadata can be used to generate documentation as a flowchart, and allows both a quick overview of the data and also a visual check of the actual data processing.
-In general use dtrackr is expected to be installed alongside the idyverse set of packages. It is recommended to install tidyverse first.
+In general use dtrackr is expected to be installed alongside the tidyverse set of packages. It is recommended to install tidyverse first.
 Binary packages of dtrackr are available on CRAN and r-universe for macOS and Windows. dtrackr can be installed from source on Linux. dtrackr has been tested on R versions 3.6, 4.0, 4.1 and 4.2.
 You can install the released version of dtrackr from CRAN with:
@@ -190,7 +190,7 @@
 And your publication ready data pipeline, with any assumptions you care to document, is creates in a format of your choice (as long as that choice is one of png, svg or ps), ready for submission to Nature. This is a trivial example, but the more complex the pipeline, the bigger benefit you will get.
-Check out the main documentation for detailed examples
+Check out the main documentation for more details, and in particular the getting started vignette.
 #> $paths
 #> $paths$dot
-#> [1] "/tmp/Rtmp7DmOBj/file6ad543f97d0c.dot"
+#> [1] "/tmp/RtmpbhV7Go/file1d192458707f.dot"
 #>
 #> $paths$png
-#> [1] "/tmp/Rtmp7DmOBj/file6ad543f97d0c.png"
+#> [1] "/tmp/RtmpbhV7Go/file1d192458707f.png"
 #>
 #> $paths$pdf
-#> [1] "/tmp/Rtmp7DmOBj/file6ad543f97d0c.pdf"
+#> [1] "/tmp/RtmpbhV7Go/file1d192458707f.pdf"
 #>
 #> $paths$svg
-#> [1] "/tmp/Rtmp7DmOBj/file6ad543f97d0c.svg"
+#> [1] "/tmp/RtmpbhV7Go/file1d192458707f.svg"
 #>
 #>
 #> $svg