Commit

Merge pull request #9 from CRC-FONDA/dev
Addressing review suggestions for Version 1.0.0
Felix-Kummer authored Nov 26, 2024
2 parents 6fe422b + a651b55 commit caee766
Showing 49 changed files with 3,349 additions and 544 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ testing/
testing*
*.pyc
null/
.nf-test*
1 change: 1 addition & 0 deletions .nf-core.yml
@@ -2,6 +2,7 @@ bump_version: null
lint:
files_exist:
- conf/igenomes.config
- conf/igenomes_ignored.config
nf_core_version: 3.0.2
org_path: null
repository_type: pipeline
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## v1.0.0 - [date]

Initial release of nf-core/rangeland, created with the [nf-core](https://nf-co.re/) template.
First release of `nf-core/rangeland`.
This work is a continuation, and nf-core port, of the [original version of this pipeline](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland).

### `Added`

21 changes: 13 additions & 8 deletions README.md
@@ -19,7 +19,8 @@

## Introduction

**nf-core/rangeland** is a geographical best-practice analysis pipeline for remotely sensed imagery. The pipeline processes satellite imagery alongside auxiliary data in multiple steps to arrive at a set of trend files related to land-cover changes. The main pipeline steps are:
**nf-core/rangeland** is a geographical best-practice analysis pipeline for remotely sensed imagery.
The pipeline processes satellite imagery alongside auxiliary data in multiple steps to arrive at a set of trend files related to land-cover changes. The main pipeline steps are:

1. Read satellite imagery, digital elevation model, endmember definition, water vapor database and area of interest definition
2. Generate allow list and analysis mask to determine which pixels from the satellite data can be used
@@ -28,20 +29,22 @@
5. Time series analyses to obtain trends in vegetation dynamics
6. Create mosaic and pyramid visualizations of the results

7. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
8. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
7. Present QC results ([`MultiQC`](http://multiqc.info/))

## Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow.
> Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
To run the pipeline on real data, input data needs to be acquired. Concretely, satellite imagery, water vapor data, a digital elevation model, endmember definitions, a datacube specification, and an area-of-interest specification are required. Please refer to the [usage documentation](https://nf-co.re/rangeland/usage) for details on the input structure.
To run the pipeline on real data, input data needs to be acquired.
Concretely, satellite imagery, water vapor data, a digital elevation model, endmember definitions, a datacube specification, and an area-of-interest specification are required.
Please refer to the [usage documentation](https://nf-co.re/rangeland/usage) for details on the input structure.

Now, you can run the pipeline using:

```bash
nextflow run nf-core/rangeland/main.nf \
nextflow run nf-core/rangeland \
-profile <docker/singularity/.../institute> \
--input <SATELLITE IMAGES> \
--dem <DIGITAL ELEVATION MODEL> \
@@ -72,7 +75,8 @@ The rangeland workflow was originally written by:

The original workflow can be found on [github](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland).

Transformation to nf-core/rangeland was conducted by [Felix Kummer](https://github.com/Felix-Kummer). nf-core alignment started on the [nf-core branch of the original repository](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland/tree/nf-core).
Transformation to nf-core/rangeland was conducted by [Felix Kummer](https://github.com/Felix-Kummer).
nf-core alignment started on the [nf-core branch of the original repository](https://github.com/CRC-FONDA/FORCE2NXF-Rangeland/tree/nf-core).

We thank the following people for their extensive assistance in the development of this pipeline:

@@ -114,7 +118,8 @@ You can cite the `nf-core` publication as follows:
>
> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).
This pipeline is based on the publication listed below. The publication can be cited as follows:
This pipeline is based on the publication listed below.
The publication can be cited as follows:

> **FORCE on Nextflow: Scalable Analysis of Earth Observation Data on Commodity Clusters**
>
70 changes: 36 additions & 34 deletions bin/merge_boa.r
@@ -1,44 +1,46 @@
#!/usr/bin/env Rscript

args = commandArgs(trailingOnly=TRUE)
## Originally written by Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script for merging bottom of atmosphere (boa) .tif raster files.
# This can improve the performance of downstream tasks.

if (length(args) < 3) {
stop("\nthis program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

fout <- args[1]
finp <- args[2:length(args)]
nf <- length(finp)

require(raster)


img <- brick(finp[1])
nc <- ncell(img)
nb <- nbands(img)
require(terra)

args <- commandArgs(trailingOnly = TRUE)

sum <- matrix(0, nc, nb)
num <- matrix(0, nc, nb)

for (i in 1:nf){

data <- brick(finp[i])[]

num <- num + !is.na(data)

data[is.na(data)] <- 0
sum <- sum + data

if (length(args) < 3) {
stop("\nError: this program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

mean <- sum/num
img[] <- mean

fout <- args[1]
finp <- args[2:length(args)]

writeRaster(img, filename = fout, format = "GTiff", datatype = "INT2S",
options = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", img@file@blockcols[1]),
sprintf("BLOCKYSIZE=%s", img@file@blockrows[1])))
# Load input rasters
rasters <- lapply(finp, rast)

# Calculate the sum of non-NA values across all rasters
sum_rasters <- Reduce("+", lapply(rasters, function(x) {
    x[is.na(x)] <- 0
    return(x)
}))

# Calculate the number of non-NA values for each cell
count_rasters <- Reduce("+", lapply(rasters, function(x) {
    return(!is.na(x))
}))

# Calculate the mean raster
mean_raster <- sum_rasters / count_rasters

# Write the mean raster
writeRaster(mean_raster,
    filename = fout,
    datatype = "INT2S",
    filetype = "GTiff",
    gdal = c("COMPRESS=LZW", "PREDICTOR=2",
        "NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
        sprintf("BLOCKXSIZE=%s", ncol(mean_raster)),
        sprintf("BLOCKYSIZE=%s", nrow(mean_raster))))
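
For readers who want to check the merge logic of `bin/merge_boa.r` without a GDAL/terra installation, the per-cell arithmetic can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline's implementation: each raster is modelled as a flat list of floats, `NaN` stands in for NA, and the function name `nan_mean_merge` is hypothetical.

```python
import math


def nan_mean_merge(layers):
    """Per-cell mean over equally sized layers, ignoring NaN (missing) cells.

    Mirrors the sum / count approach of the R script: missing values
    contribute 0 to the sum and are excluded from the count.
    """
    if not layers:
        raise ValueError("need at least one layer")
    n_cells = len(layers[0])
    if any(len(layer) != n_cells for layer in layers):
        raise ValueError("all layers must have the same number of cells")

    sums = [0.0] * n_cells
    counts = [0] * n_cells
    for layer in layers:
        for i, value in enumerate(layer):
            if not math.isnan(value):
                sums[i] += value
                counts[i] += 1

    # Cells that are missing in every layer stay missing (NaN).
    return [s / c if c > 0 else math.nan for s, c in zip(sums, counts)]
```

Calling `nan_mean_merge([[1.0, math.nan], [3.0, 4.0]])` averages the first cell over both layers and takes the second cell from the only layer that has a value there.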
59 changes: 31 additions & 28 deletions bin/merge_qai.r
@@ -1,38 +1,41 @@
#!/usr/bin/env Rscript

args = commandArgs(trailingOnly=TRUE)
## Originally written by Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script for merging quality information (qai) .tif raster files.
# This can improve the performance of downstream tasks.

if (length(args) < 3) {
stop("\nthis program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

fout <- args[1]
finp <- args[2:length(args)]
nf <- length(finp)

require(raster)


img <- raster(finp[1])
nc <- ncell(img)
require(terra)

args <- commandArgs(trailingOnly = TRUE)

last <- rep(1, nc)

for (i in 1:nf){

data <- raster(finp[i])[]

last[!is.na(data)] <- data[!is.na(data)]

if (length(args) < 3) {
stop("\nError: this program needs at least 3 inputs\n1: output filename\n2-*: input files", call.=FALSE)
}

img[] <- last

fout <- args[1]
finp <- args[2:length(args)]

writeRaster(img, filename = fout, format = "GTiff", datatype = "INT2S",
options = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
"NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
sprintf("BLOCKXSIZE=%s", img@file@blockcols[1]),
sprintf("BLOCKYSIZE=%s", img@file@blockrows[1])))
# Load raster files into a single SpatRaster
rasters <- rast(finp)

# Merge rasters by maintaining the last non-NA value
merged_raster <- app(rasters, function(x) {
    non_na_values <- na.omit(x)
    if (length(non_na_values) == 0) {
        return(1)
    }
    return(tail(non_na_values, 1)[1])
})

# Write merged raster
writeRaster(merged_raster,
    filename = fout,
    filetype = "GTiff",
    datatype = "INT2S",
    gdal = c("INTERLEAVE=BAND", "COMPRESS=LZW", "PREDICTOR=2",
        "NUM_THREADS=ALL_CPUS", "BIGTIFF=YES",
        sprintf("BLOCKXSIZE=%s", ncol(merged_raster)),
        sprintf("BLOCKYSIZE=%s", nrow(merged_raster))))
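
The "last non-NA value wins" rule used in `bin/merge_qai.r` can be sketched the same way in plain Python. Again, this is a hedged illustration rather than the pipeline's code: rasters are modelled as flat lists, `NaN` stands in for NA, and the name `last_valid_merge` and its `fill` parameter are hypothetical. The default of `1` matches the fallback value the R script returns for cells with no valid input.

```python
import math


def last_valid_merge(layers, fill=1):
    """Per-cell merge that keeps the value from the last layer with data.

    Cells that are NaN (missing) in every layer receive `fill`.
    """
    if not layers:
        raise ValueError("need at least one layer")
    n_cells = len(layers[0])
    if any(len(layer) != n_cells for layer in layers):
        raise ValueError("all layers must have the same number of cells")

    merged = [fill] * n_cells
    # Later layers overwrite earlier ones wherever they hold a valid value.
    for layer in layers:
        for i, value in enumerate(layer):
            if not math.isnan(value):
                merged[i] = value
    return merged
```

Because quality flags are categorical bit fields, averaging them would be meaningless, which is presumably why the script selects a single value per cell instead of computing a mean.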
20 changes: 5 additions & 15 deletions bin/test.R
@@ -1,5 +1,10 @@
#!/usr/bin/env Rscript

## Originally written by David Frantz and Felix Kummer and released under the MIT license.
## See git repository (https://github.com/nf-core/rangeland) for full license text.

# Script to verify pipeline results from test and test_full profiles.

args = commandArgs(trailingOnly=TRUE)


@@ -116,21 +121,6 @@ peak_year_of_change <- peak_rast["YEAR-OF-CHANGE"]



# FOR REFERENCE: SAVE RASTERS
#######################################################################

#writeRaster(woody_cover_changes, "woody_cover_chg_ref.tif")
#writeRaster(woody_cover_year_of_change, "woody_cover_yoc_ref.tif")

#writeRaster(herbaceous_cover_changes, "herbaceous_cover_chg_ref.tif")
#writeRaster(herbaceous_cover_year_of_change, "herbaceous_cover_yoc_ref.tif")

#writeRaster(peak_changes, "peak_chg_ref.tif")
#writeRaster(peak_year_of_change, "peak_yoc_ref.tif")




# COMPARE TESTRUN WITH REFERENCE EXECUTION
#######################################################################
failure <- FALSE
1 change: 0 additions & 1 deletion conf/base.config
@@ -10,7 +10,6 @@

process {

// TODO nf-core: Check the defaults for all processes
cpus = { 1 * task.attempt }
memory = { 6.GB * task.attempt }
time = { 4.h * task.attempt }