Provides a single command-line interface to generate data sets for use with ADRIA and the RRAP program.
TODO: Standalone executable.
pip install rrap-dg
pip install git+https://github.com/open-AIMS/rrap-dg
Clone the repository, and navigate to the project folder.
git clone https://github.com/open-AIMS/rrap-dg
cd rrap-dg
It is recommended that any development work be done in a separate environment.
Here, mamba is used to create a local conda environment.
# Create a new environment called rrap-dg
$ mamba create -n rrap-dg python=3.11
# Don't forget to activate the environment
$ mamba activate rrap-dg
# Install local development copy of rrap-dg
(rrap-dg) $ pip install -e .
Note: The first time rrapdg is run, it will go through an initial setup process.
Run the help command to trigger the setup.
(rrap-dg) $ rrapdg --help
Alternatively, you can use a traditional Python venv. For example:
python -m venv .venv
source .venv/bin/activate
pip install -e .
This project uses Pydantic's BaseSettings to manage configuration. While default values are provided, you can override these settings using a .env file.
- Create a file named .env in the root directory of the project.
- Add your custom settings to this file using the following format, noting all values are optional:
PROVENA_DOMAIN=your.provena.domain.com
PROVENA_REALM_NAME=provena
PROVENA_CLIENT_ID=automated-access
The following settings can be overridden:
- PROVENA_DOMAIN: The Provena deployment to target (default: "mds.gbrrestoration.org")
- PROVENA_REALM_NAME: The Keycloak realm name (default: "rrap")
- PROVENA_CLIENT_ID: The Keycloak client ID (default: "automated-access")
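The override behaviour described above can be illustrated with a minimal standard-library sketch. This is not the project's actual Pydantic model: the function name load_settings is hypothetical, and the code only mimics the idea that values in a .env file replace the documented defaults.

```python
from pathlib import Path

# Defaults as documented above.
DEFAULTS = {
    "PROVENA_DOMAIN": "mds.gbrrestoration.org",
    "PROVENA_REALM_NAME": "rrap",
    "PROVENA_CLIENT_ID": "automated-access",
}


def load_settings(env_path=".env"):
    """Return the defaults, overridden by any KEY=VALUE lines found in env_path.

    Hypothetical helper for illustration only; rrap-dg itself relies on
    Pydantic's BaseSettings for this.
    """
    settings = dict(DEFAULTS)
    path = Path(env_path)
    if not path.is_file():
        return settings  # no .env file: every default applies

    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        if key in settings:  # ignore keys that are not known settings
            settings[key] = value.strip().strip("\"'")
    return settings
```

Because every value is optional, a .env file containing only PROVENA_DOMAIN leaves the realm name and client ID at their defaults.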
The rrap-dg Data Packages are used as inputs by DHW and Cyclone Mortality data generators.
To generate DHW data cubes, the folders MIROC5, NOAA, RECOM and spatial are required. To generate the coral mortality projections due to cyclones, the folder cyclone_mortality is required.
The data package should be named with the following convention:
[cluster name]_rrapdg_[YYYY-MM-DD]
An example for a hypothetical Moore dataset:
Moore_rrapdg_2023-01-24
│ datapackage.json
│ README.md
│
├───MIROC5
│ GBR_maxDHW_MIROC5_rcp26_2021_2099.csv
│ GBR_maxDHW_MIROC5_rcp45_2021_2099.csv
│ GBR_maxDHW_MIROC5_rcp60_2021_2099.csv
│ GBR_maxDHW_MIROC5_rcp85_2021_2099.csv
│
├───NOAA
│ GBR_dhw_hist_noaa.nc
│
├───RECOM
│ Moore_2015_585_dhw_exp.nc
│ Moore_2016_586_dhw_exp.nc
│ Moore_2017_599_dhw_exp.nc
│
└───spatial
│ list_gbr_reefs.csv
│ Moore.gpkg
│
└───cyclones
│ coral_cover_cyclone.csv
The most recent data package is available on the RRAP IS Data store: https://hdl.handle.net/102.100.100/481718
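Before running a generator it can be handy to check a data package against the naming convention and folder requirements above. The sketch below is not part of rrap-dg: the function names parse_package_name and missing_dhw_folders are hypothetical, and only the DHW folders listed above are checked.

```python
import re
from datetime import date
from pathlib import Path

# [cluster name]_rrapdg_[YYYY-MM-DD]
NAME_PATTERN = re.compile(r"^(?P<cluster>.+)_rrapdg_(?P<date>\d{4}-\d{2}-\d{2})$")

# Folders required to generate DHW data cubes.
DHW_FOLDERS = ("MIROC5", "NOAA", "RECOM", "spatial")


def parse_package_name(name):
    """Split a package name into (cluster name, date); raise ValueError if malformed."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not match [cluster name]_rrapdg_[YYYY-MM-DD]")
    # date.fromisoformat also rejects impossible dates such as 2023-02-30
    return match["cluster"], date.fromisoformat(match["date"])


def missing_dhw_folders(package_dir):
    """List the folders needed for DHW generation that are absent from package_dir."""
    root = Path(package_dir)
    return [name for name in DHW_FOLDERS if not (root / name).is_dir()]
```

For the example above, parse_package_name("Moore_rrapdg_2023-01-24") yields the cluster name "Moore" and the date 24 January 2023, and an empty list from missing_dhw_folders means the package has every folder needed for DHW generation.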
Create an empty ADRIA Domain to be filled with data.
(rrap-dg) $ rrapdg template generate [directory]
TODO: Package an ADRIA Domain with data from the M&DS data store.
(rrap-dg) $ rrapdg template package [directory] [spec]
Where spec points to a json file defining handle IDs for each dataset to be downloaded from the M&DS data store.
Generate Degree Heating Week projections using combinations of
- NOAA Coral Reef Watch (CRW version 3.1) satellite data
- MIROC5 RCP projections (2021 - 2099)
- RECOM spatial multi-marine heat wave patterns
This work was ported to Python from the original MATLAB code developed by Dr. Veronique Lago and modified by Chinenye Ani.
Usage:
(rrap-dg) $ rrapdg dhw generate [cluster name] [input data directory] [output directory] [optional settings...]
For example, with default values shown for optional settings:
(rrap-dg) $ rrapdg dhw generate Moore C:/data_package_location C:/temp --n-sims 50 --rcps "2.6 4.5 6.0 8.5" --gen-year "2025 2100"
Note that the output directory is assumed to already exist.
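Since the output directory must exist beforehand, a small wrapper script can create it and assemble the command in one step. The following is a sketch only: build_dhw_command is a hypothetical helper, not part of rrap-dg, and it uses just the arguments and default values shown in the example above.

```python
from pathlib import Path


def build_dhw_command(cluster, input_dir, output_dir,
                      n_sims=50, rcps="2.6 4.5 6.0 8.5", gen_year="2025 2100"):
    """Create the output directory if needed and return the rrapdg argument list.

    Hypothetical helper; the defaults mirror the example command above.
    """
    # rrapdg assumes the output directory already exists, so create it here.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return [
        "rrapdg", "dhw", "generate",
        cluster, str(input_dir), str(output_dir),
        "--n-sims", str(n_sims),
        "--rcps", rcps,
        "--gen-year", gen_year,
    ]
```

The returned list can be passed to subprocess.run(..., check=True) from within the activated rrap-dg environment. Passing a list rather than a single string keeps the space-separated RCP and year values together as single arguments.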
The expected data package is detailed in the data package description above.
Initial coral cover data is downscaled from ReefMod Engine (RME) data. The current process is compatible with ReefMod or RME v1.0.x datasets or the rrap-dg data package.
(rrap-dg) $ rrapdg coral-cover downscale-icc [rrap-dg datapackage path] [target geopackage] [output path]
For example, to downscale RME data for the Moore cluster defined by a geopackage:
(rrap-dg) $ rrapdg coral-cover downscale-icc C:/example/rrapdg ./Moore.gpkg ./coral_cover.nc
(rrap-dg) $ rrapdg coral-cover downscale-icc C:/example/rme_dataset ./Moore.gpkg ./coral_cover.nc
A set of initial cover files can be created using a TOML file:
(rrap-dg) $ rrapdg coral-cover downscale-icc [rrap-dg datapackage path] [target geopackage] [output directory] [TOML file]
The output path is assumed to exist.
(rrap-dg) $ rrapdg coral-cover bin-edge-icc C:/example/rrapdg ./Moore.gpkg ./icc_files ./bin_edges.toml
This will create a set of netCDFs in the icc_files directory using the bin edges defined in the TOML file.
The format of the TOML file is:
name_of_file = [
[values, for, each, size class],
[rows, are, functional, groups],
[cols, are size, classes]
]
Note that ReefMod represents arborescent Acropora, whereas CoralBlocks does not. Hence the first line is set to 0.0.
A full example:
bin_edge_1 = [
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 100.0, 150.0],
[5.0, 7.5, 10.0, 20.0, 35.0, 50.0, 100.0],
[5.0, 7.5, 10.0, 15.0, 20.0, 40.0, 50.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0]
]
bin_edge_2 = [
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[4.0, 7.5, 10.0, 20.0, 40.0, 100.0, 150.0],
[4.0, 7.5, 10.0, 20.0, 35.0, 50.0, 100.0],
[4.0, 7.5, 10.0, 15.0, 20.0, 40.0, 50.0],
[4.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0],
[4.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0]
]
bin_edge_3 = [
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 100.0, 100.0],
[5.0, 7.5, 10.0, 20.0, 35.0, 50.0, 120.0],
[5.0, 7.5, 10.0, 15.0, 20.0, 40.0, 60.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 110.0],
[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 120.0]
]
Using the above will create files named bin_edge_1, bin_edge_2, etc.
Generate Cyclone Mortality projections using data from
- Fabricius, Katharina E., et al. "Disturbance gradients on inshore and offshore coral reefs caused by a severe tropical cyclone." Limnology and Oceanography 53.2 (2008): 690-704.
- ReefMod Engine data set
The mortality regression model was ported from an R script written by Dr. Vanessa Haller, intended for use with the C~Scape coral ecosystem model.
Usage:
(rrap-dg) $ rrapdg cyclones generate [rrapdg datapackage path] [reefmod engine datapackage path] [output directory path]
The output directory is assumed to already exist.
Download data from the M&DS data store.
(rrap-dg) $ rrapdg data-store download [dataset id] [output directory]
For example, to download and save the dataset with id "102.100.100/602432" in the current directory:
(rrap-dg) $ rrapdg data-store download 102.100.100/602432 .
Semantically, the command is to download from a source to a destination.
TODO: Uploading/submitting datasets.
TODO
Assign each location in a geopackage file to a cluster using k-means clustering.
The cluster a location is a member of is indicated by a new column named cluster_id.
Results are written to a new geopackage file saved to the user-indicated location.
The number of clusters is determined by optimizing for a high Silhouette score with Adaptive Differential Evolution (adaptive_de_rand_1_bin_radiuslimited() in BlackBoxOptim.jl).
(rrap-dg) $ rrapdg domain cluster [geopackage path] [output geopackage path]
# Example
(rrap-dg) $ rrapdg domain cluster "C:/example/example.gpkg" "./test.gpkg"
The method reports a "Best candidate", the floor of which indicates the identified optimal number of clusters.
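To make the optimization target more concrete, the sketch below computes a Silhouette score for a given cluster assignment in plain Python and shows how a reported "Best candidate" maps to a cluster count. It is an illustration only: rrap-dg performs the search with BlackBoxOptim.jl, the function silhouette_score here is a hypothetical stand-in using Euclidean distance, and the value 3.87 is made up.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient of a clustering, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a coefficient of 0
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it does not affect the sum)
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical value reported by the optimizer as "Best candidate".
best_candidate = 3.87
n_clusters = math.floor(best_candidate)  # 3 clusters
```

Scores close to 1 indicate compact, well-separated clusters, while negative scores suggest locations have been assigned to the wrong cluster.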
rrap-dg is distributed under the terms of the MIT license.