doc updates; bump to 0.9.2
mschubert committed Dec 7, 2023
1 parent b8427e9 commit a8d57a2
Showing 4 changed files with 31 additions and 27 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: clustermq
Title: Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)
-Version: 0.9.1
+Version: 0.9.2
Authors@R: c(
person('Michael', 'Schubert', email='[email protected]',
role = c('aut', 'cre', 'cph'),
3 changes: 2 additions & 1 deletion NEWS.md
@@ -1,8 +1,9 @@
-# git head
+# clustermq 0.9.2

* Fix a bug where SSH proxy would not cache data properly (#320)
* Fix a bug where `max_calls_worker` was not respected (#322)
* Local parallelism (`multicore`, `multiprocess`) again uses local IP (#321)
+* Pool `info()` now also returns current worker and number of calls

# clustermq 0.9.1

13 changes: 10 additions & 3 deletions README.md
@@ -146,14 +146,23 @@ Use [`batchtools`](https://github.com/mllg/batchtools) if you:
* don't mind that there's no load-balancing at run-time

Use [Snakemake](https://snakemake.readthedocs.io/en/latest/) or
-[`drake`](https://github.com/ropensci/drake) if:
+[`targets`](https://github.com/ropensci/targets) if:

* you want to design and run a workflow on HPC

Don't use [`batch`](https://cran.r-project.org/web/packages/batch/index.html)
(last updated 2013) or [`BatchJobs`](https://github.com/tudo-r/BatchJobs)
(issues with SQLite on network-mounted storage).

+Questions
+---------
+
+You are welcome to ask questions if something is not clear in the [User
+guide](https://mschubert.github.io/clustermq/articles/userguide.html).
+
+Please use [GitHub
+Discussions](https://github.com/mschubert/clustermq/discussions) for this.

Contributing
------------

@@ -162,8 +171,6 @@ to coordinate development of `clustermq`. Contributions are welcome and they
come in many different forms, shapes, and sizes. These include, but are not
limited to:

-* Questions: You are welcome to ask questions if something is not clear in the
-  [User guide](https://mschubert.github.io/clustermq/articles/userguide.html).
* Bug reports: Let us know if something does not work as expected. Be sure to
include a self-contained [Minimal Reproducible
Example](https://stackoverflow.com/help/minimal-reproducible-example) and set
40 changes: 18 additions & 22 deletions vignettes/userguide.Rmd
@@ -35,7 +35,7 @@ Install the `clustermq` package in R from CRAN. This will automatically detect
if [ZeroMQ](https://github.com/zeromq/libzmq) is installed and otherwise use
the bundled library:

-```r
+```{r eval=FALSE}
# Recommended:
# If your system has `libzmq` installed but you want to enable the worker crash
# monitor, set the following environment variable to enable compilation of the
@@ -47,7 +47,8 @@ install.packages('clustermq')
Alternatively, you can use the `remotes` package to install directly from
GitHub. Note that this version needs `autoconf`/`automake` for compilation:

-```r
+```{r eval=FALSE}
+# Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0)
# install.packages('remotes')
remotes::install_github('mschubert/clustermq')
```
@@ -59,6 +60,7 @@ However, [feedback is very
welcome](https://github.com/mschubert/clustermq/issues/new).

```{r eval=FALSE}
+# Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0)
# install.packages('remotes')
remotes::install_github('mschubert/clustermq', ref="develop")
```
@@ -69,6 +71,7 @@ Choose your preferred parallelism using:

```{r eval=FALSE}
options(clustermq.scheduler = "your scheduler here")
+# this may require additional setup; see below for details
```

There are three kinds of schedulers:
@@ -106,7 +109,7 @@ To set up a scheduler explicitly, see the following links:
* [SGE](#SGE) - *should work without setup*
* [SLURM](#SLURM) - *should work without setup*
* [PBS](#PBS)/[Torque](#TORQUE) - *needs* `options(clustermq.scheduler="PBS"/"Torque")`
-* if you want another scheduler, [open an
+* you can suggest another scheduler by [opening an
issue](https://github.com/mschubert/clustermq/issues/new)
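
For illustration, explicit setup on a SLURM system might look like the
following in `~/.Rprofile`. This is a sketch only: the template file path is
hypothetical, and a custom template is needed only when the packaged default
does not fit your cluster.

```r
# Sketch: explicit scheduler setup in ~/.Rprofile (SLURM assumed;
# the template file path below is hypothetical)
options(
    clustermq.scheduler = "slurm",
    clustermq.template = "~/slurm.tmpl"
)
```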

Default submission templates [are
@@ -212,7 +215,7 @@ Q(fx, x=1:3, export=list(y=10), n_jobs=1)
```

If we want to use a package function we need to load it on the worker using the
-`pkg` argument or referencing it with `package_name::`:
+`pkg` argument or referencing it with `package_name::`.

```{r}
fx = function(x) {
@@ -259,29 +262,22 @@ register(DoparParam()) # after register_dopar_cmq(...)
bplapply(1:3, sqrt)
```
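
The `register_dopar_cmq()` registration referenced above can be sketched end
to end; this assumes a local `multicore` setup so that no scheduler is
required to try it.

```r
# Sketch: clustermq as a foreach backend (local multicore assumed)
library(foreach)
options(clustermq.scheduler = "multicore")
clustermq::register_dopar_cmq(n_jobs = 2)  # accepts the same options as Q()
foreach(i = 1:3) %dopar% sqrt(i)           # dispatched to clustermq workers
```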

-### With `drake`
+### With `targets`

-The [`drake`](https://github.com/ropensci/drake) package enables users to
+The [`targets`](https://github.com/ropensci/targets) package enables users to
define a dependency structure of different function calls, and only evaluate
them if the underlying data changed.

-> drake — or, Data Frames in R for Make — is a general-purpose workflow manager
-> for data-driven tasks. It rebuilds intermediate data objects when their
-> dependencies change, and it skips work when the results are already up to
-> date. Not every runthrough starts from scratch, and completed workflows have
-> tangible evidence of reproducibility. drake also supports scalability,
-> parallel computing, and a smooth user experience when it comes to setting up,
-> deploying, and maintaining data science projects.
+> The `targets` package is a [Make](https://www.gnu.org/software/make/)-like
+> pipeline tool for statistics and data science in R. The package skips costly
+> runtime for tasks that are already up to date, orchestrates the necessary
+> computation with implicit parallel computing, and abstracts files as R
+> objects. If all the current output matches the current upstream code and
+> data, then the whole pipeline is up to date, and the results are more
+> trustworthy than otherwise.
-It can use `clustermq` to perform calculations as jobs:
-
-```{r eval=FALSE}
-library(drake)
-load_mtcars_example()
-# clean(destroy = TRUE)
-# options(clustermq.scheduler = "multicore")
-make(my_plan, parallelism = "clustermq", jobs = 2, verbose = 4)
-```
+It can use `clustermq` to [perform calculations as
+jobs](https://books.ropensci.org/targets/hpc.html#clustermq).
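
As a minimal sketch of that integration (assuming a `targets` version that
still ships `tar_make_clustermq()`, as was the case at the time of this
commit; later `targets` versions recommend `crew` instead):

```r
# Sketch: run a targets pipeline on clustermq workers
# (tar_make_clustermq() availability assumed)
options(clustermq.scheduler = "multicore")
targets::tar_make_clustermq(workers = 2)
```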

## Options

