update scripts (#9)
* update scripts

* remove gitignore

* add output state to script

* add contrib guidelines

* get rid of the findStatesTemp workaround

* move methods and metrics one level higher

* fix commands

* bump version

* update workflow scripts

* fix script
rcannood authored Sep 4, 2024
1 parent 1f0e346 commit 334b59f
Showing 26 changed files with 460 additions and 260 deletions.
14 changes: 7 additions & 7 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
-resources
-resources_test
-work
-.nextflow*
-target
+/resources
+/resources_test
+/work
+/.nextflow*
+/target
 .vscode
 .DS_Store
-output
+/output
 trace-*
-.ipynb_checkpoints
+.ipynb_checkpoints
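
The leading slash in patterns such as `/output` anchors them to the directory that holds the `.gitignore`, whereas a bare `output` matches at any depth. A minimal sketch for checking this in a throwaway repository, assuming `git` is on the `PATH` (the function name `is_ignored_with` and the example paths are illustrative, not part of this commit):

```bash
# Illustrative check: succeed if PATH would be ignored under PATTERN.
# A scratch repository is created so the real working tree is untouched.
is_ignored_with() {
  local pattern="$1" path="$2" tmp status
  tmp="$(mktemp -d)"
  git -C "$tmp" init --quiet >/dev/null 2>&1
  printf '%s\n' "$pattern" > "$tmp/.gitignore"
  # check-ignore exits 0 when the path matches an ignore pattern.
  if git -C "$tmp" check-ignore --quiet "$path"; then
    status=0
  else
    status=1
  fi
  rm -rf "$tmp"
  return "$status"
}

# Usage:
#   is_ignored_with '/output' 'output'      && echo "top-level output is ignored"
#   is_ignored_with '/output' 'src/output'  || echo "nested output is not ignored"
```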
148 changes: 148 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,148 @@
# Contribution guidelines

We encourage contributions from the community. To contribute:

* Star this repository: Click the star button in the top-right corner of the repository to show your support.
* Fork the repository: Start by forking this repository to your account.
* Develop your component: Create your Viash component, ensuring it aligns with our best practices (detailed below).
* Submit a pull request: After testing your component, submit a pull request for review.

## Installation

You need to have Docker, Java, and Viash installed. Follow
[these instructions](https://openproblems.bio/documentation/fundamentals/requirements)
to install the required dependencies.

## Getting started

### Cloning the repository

To get started, fork the repository and clone it to your local machine:

```bash
git clone <git url to fork> --recursive

cd <name of the repository>
```

NOTE: If you cloned the repository without the `--recursive` flag, run the following command to initialize and update the submodules:

```bash
git submodule update --init --recursive
```
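
To see whether any submodule still needs initializing, one option is to inspect `git submodule status`, which prefixes each uninitialized submodule with `-`. A minimal sketch (the helper name `uninitialized_submodules` is hypothetical and not part of this repository; it assumes submodule paths contain no spaces):

```bash
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints the path of every submodule that has not been initialized.
# Git marks such entries with a leading "-" before the commit hash.
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Usage, from the root of the clone:
#   git submodule status --recursive | uninitialized_submodules
```

If the helper prints nothing, every listed submodule is already initialized.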

### Downloading the test resources

Next, you should download the test resources:

```bash
scripts/sync_resources.sh
```

You may need to run this script again if the resources are updated.

## Good first steps

### Creating a new method

To create a new method, you can use the following command:

```bash
# in Python:
common/scripts/create_component \
--name my_method \
--language python \
--type method

# or in R:
common/scripts/create_component \
--name my_method \
--language r \
--type method
```

This will create a new method in `src/methods/my_method`. Next, you should:

* Fill in the component's metadata
* Specify dependencies
* Implement the method's code
* Run the unit test

Please review our documentation on [creating a new method](https://openproblems.bio/documentation/create_component/add_a_method) for more information on how to do this.
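
As a rough sketch of the last step, the unit test can be pointed at the new component's config file. This assumes the generated component stores its config as `config.vsh.yaml` inside `src/methods/my_method` (check the generated directory); the helper `component_config` is hypothetical:

```bash
# Hypothetical helper: print the assumed config path for a component.
# $1 = component type ("method" or "metric"), $2 = component name.
component_config() {
  local type="$1" name="$2"
  echo "src/${type}s/${name}/config.vsh.yaml"
}

# Usage:
#   viash test "$(component_config method my_method)"
```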


### Creating a new metric

Creating a new metric is similar to creating a new method. You can use the following command:

```bash
# in Python:
common/scripts/create_component \
--name my_metric \
--language python \
--type metric

# or in R:
common/scripts/create_component \
--name my_metric \
--language r \
--type metric
```

This will create a new metric in `src/metrics/my_metric`. Next, you should:

* Fill in the component's metadata
* Specify dependencies
* Implement the metric's code
* Run the unit test

Please review our documentation on [creating a new metric](https://openproblems.bio/documentation/create_component/add_a_metric) for more information.


## Frequently used commands

The following commands cover the most common tasks:

### Sync the test data

To sync the test data, run the following command:

```bash
scripts/sync_resources.sh
```

### Building the components

To run the benchmark, you first need to build the components.

```bash
viash ns build --parallel --setup cachedbuild
```

For each of the components, this will:

* Build a Docker image
* Build an executable at `target/executable/<component_name>`
* Build a Nextflow module at `target/nextflow/<component_name>`
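
A quick way to confirm that a build produced both artifacts for a component is to test for the two paths listed above. A minimal sketch, assuming `<component_name>` may include a namespace prefix such as `methods/knn` (the helper `has_build_outputs` is hypothetical):

```bash
# Hypothetical helper: succeed only if both build artifacts exist.
# $1 = component name as it appears below target/executable and
#      target/nextflow, $2 = target directory (defaults to "target").
has_build_outputs() {
  local component="$1" target_dir="${2:-target}"
  [ -e "$target_dir/executable/$component" ] &&
    [ -e "$target_dir/nextflow/$component" ]
}

# Usage:
#   has_build_outputs methods/knn || echo "methods/knn has not been built yet"
```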

### Running the unit tests

To run the unit test of one component, you can use the following command:

```bash
viash test src/path/to/config.vsh.yaml
```

To run the unit tests for all components, you can use the following command:

```bash
viash ns test --parallel
```

### Running the benchmark

To run the benchmark, you can use the following command:

```bash
scripts/run_benchmark/run.sh
```
2 changes: 1 addition & 1 deletion _viash.yaml
@@ -1,4 +1,4 @@
-viash_version: 0.9.0-RC7
+viash_version: 0.9.0

# Step 1: Change the name of the task.
# example: task_name_of_this_task
3 changes: 0 additions & 3 deletions scripts/.gitignore

This file was deleted.

2 changes: 2 additions & 0 deletions scripts/create_component/.gitignore
@@ -0,0 +1,2 @@
# if users change the scripts, the changes should not be committed.
/create_*_*.sh
8 changes: 8 additions & 0 deletions scripts/create_component/create_python_method.sh
@@ -0,0 +1,8 @@
#!/bin/bash

set -e

common/scripts/create_component \
--name my_python_method \
--language python \
--type method
8 changes: 8 additions & 0 deletions scripts/create_component/create_python_metric.sh
@@ -0,0 +1,8 @@
#!/bin/bash

set -e

common/scripts/create_component \
--name my_python_metric \
--language python \
--type metric
8 changes: 8 additions & 0 deletions scripts/create_component/create_r_method.sh
@@ -0,0 +1,8 @@
#!/bin/bash

set -e

common/scripts/create_component \
--name my_r_method \
--language r \
--type method
8 changes: 8 additions & 0 deletions scripts/create_component/create_r_metric.sh
@@ -0,0 +1,8 @@
#!/bin/bash

set -e

common/scripts/create_component \
--name my_r_metric \
--language r \
--type metric
34 changes: 34 additions & 0 deletions scripts/create_datasets/resources.sh
@@ -0,0 +1,34 @@
#!/bin/bash

# get the root of the repository
REPO_ROOT=$(git rev-parse --show-toplevel)

# ensure that the command below is run from the root of the repository
cd "$REPO_ROOT"

# remove this when you have implemented the script
echo "TODO: once the 'process_datasets' workflow is implemented, update this script to use it."
echo " Step 1: replace 'task_template' with the name of the task in the following command."
echo " Step 2: replace the rename keys parameters to fit your process_dataset inputs"
echo " Step 3: replace the settings parameter to fit your process_dataset outputs"
echo " Step 4: remove this message"
exit 1

cat > /tmp/params.yaml << 'HERE'
input_states: s3://openproblems-data/resources/datasets/**/state.yaml
rename_keys: 'input:output_dataset'
output_state: '$id/state.yaml'
settings: '{"output_train": "$id/train.h5ad", "output_test": "$id/test.h5ad"}'
publish_dir: s3://openproblems-data/resources/task_template/datasets/
HERE

tw launch https://github.com/openproblems-bio/task_template.git \
--revision build/main \
--pull-latest \
--main-script target/nextflow/workflows/process_datasets/main.nf \
--workspace 53907369739130 \
--compute-env 6TeIFgV5OY4pJCk8I0bfOh \
--params-file /tmp/params.yaml \
--entry-name auto \
--config common/nextflow_helpers/labels_tw.config \
--labels task_template,process_datasets
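
Quoting the heredoc delimiter (`'HERE'`) keeps `$id` literal in the params file, presumably so that the workflow can substitute it per dataset rather than the shell expanding it to an empty string. A minimal sketch of the difference (the function names and the `id` value are illustrative only):

```bash
id="pancreas"  # illustrative value only

# Quoted delimiter: the body is written verbatim, so "$id" stays literal.
quoted_heredoc() {
  cat << 'HERE'
output_state: '$id/state.yaml'
HERE
}

# Unquoted delimiter: the shell expands "$id" before cat sees the text.
# Single quotes inside a heredoc body do not prevent expansion.
unquoted_heredoc() {
  cat << HERE
output_state: '$id/state.yaml'
HERE
}
```

Calling `quoted_heredoc` prints the line with `$id` intact, while `unquoted_heredoc` prints it with the shell variable's value filled in.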
49 changes: 49 additions & 0 deletions scripts/create_datasets/test_resources.sh
@@ -0,0 +1,49 @@
#!/bin/bash

# get the root of the repository
REPO_ROOT=$(git rev-parse --show-toplevel)

# ensure that the command below is run from the root of the repository
cd "$REPO_ROOT"

# remove this when you have implemented the script
echo "TODO: replace the commands in this script with the sequence of components that you need to run to generate test_resources."
echo " Inside this script, you will need to place commands to generate example files for each of the 'src/api/file_*.yaml' files."
exit 1

set -e

RAW_DATA=resources_test/common
DATASET_DIR=resources_test/task_template

mkdir -p "$DATASET_DIR"

# process dataset
echo Running process_dataset
nextflow run . \
-main-script target/nextflow/workflows/process_datasets/main.nf \
-profile docker \
--publish_dir "$DATASET_DIR" \
--id "pancreas" \
--input "$RAW_DATA/pancreas/dataset.h5ad" \
--output_train '$id/train.h5ad' \
--output_test '$id/test.h5ad' \
--output_solution '$id/solution.h5ad' \
--output_state '$id/state.yaml'

# run one method
viash run src/methods/knn/config.vsh.yaml -- \
--input_train "$DATASET_DIR/pancreas/train.h5ad" \
--input_test "$DATASET_DIR/pancreas/test.h5ad" \
--output "$DATASET_DIR/pancreas/prediction.h5ad"

# run one metric
viash run src/metrics/accuracy/config.vsh.yaml -- \
--input_prediction "$DATASET_DIR/pancreas/prediction.h5ad" \
--input_solution "$DATASET_DIR/pancreas/solution.h5ad" \
--output "$DATASET_DIR/pancreas/score.h5ad"

# only run this if you have access to the openproblems-data bucket
aws s3 sync --profile op \
"$DATASET_DIR" s3://openproblems-data/resources_test/task_template \
--delete --dryrun
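
With `--dryrun`, `aws s3 sync` only reports the operations it would perform, which matters here because `--delete` removes remote files that are absent locally. One possible way to make the switch explicit is sketched below; the `DRY_RUN` variable and the `sync_command` helper are hypothetical and not part of this commit. The helper only prints the command so it can be reviewed before being run:

```bash
# Hypothetical helper: print the sync command that would be run.
# DRY_RUN defaults to "true"; set DRY_RUN=false to drop --dryrun.
sync_command() {
  local src="$1" dest="$2"
  local args=(aws s3 sync --profile op "$src" "$dest" --delete)
  if [ "${DRY_RUN:-true}" = "true" ]; then
    args+=(--dryrun)
  fi
  printf '%s\n' "${args[*]}"
}

# Usage:
#   sync_command "$DATASET_DIR" s3://openproblems-data/resources_test/task_template
```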
2 changes: 2 additions & 0 deletions scripts/create_readme.sh
@@ -1,3 +1,5 @@
#!/bin/bash

set -e

common/scripts/create_task_readme
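
The added `set -e` makes the script stop at the first command that exits with a non-zero status instead of carrying on. A small sketch of the effect, run in child shells so the calling shell is unaffected (the function names are illustrative):

```bash
# Without `set -e`: the failing command is ignored and the script continues.
without_errexit() {
  bash -c 'false; echo "still running"'
}

# With `set -e`: the script stops at `false`, so nothing is printed
# and the child shell exits with a non-zero status.
with_errexit() {
  bash -c 'set -e; false; echo "still running"'
}
```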
38 changes: 0 additions & 38 deletions scripts/create_test_resources.sh

This file was deleted.

9 changes: 0 additions & 9 deletions scripts/download_resources.sh

This file was deleted.

6 changes: 6 additions & 0 deletions scripts/project/build_all_components.sh
@@ -0,0 +1,6 @@
#!/bin/bash

set -e

# Build all components in a namespace (see https://viash.io/reference/cli/ns_build.html)
viash ns build --parallel
7 changes: 7 additions & 0 deletions scripts/project/build_all_docker_containers.sh
@@ -0,0 +1,7 @@
#!/bin/bash

set -e

# Build all components in a namespace (see https://viash.io/reference/cli/ns_build.html)
# and set up the container via a cached build
viash ns build --parallel --setup cachedbuild
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
#!/bin/bash

set -e

# Test all components in a namespace (see https://viash.io/reference/cli/ns_test.html)
viash ns test --parallel