'init' not passed on to `bpiterate` in .reduceByYield_iterate #5

teunbrand · 2021-02-09T15:37:02Z

Hello everyone,

I was trying to use reduceByYield(..., init = DF, iterate = TRUE, parallel = TRUE), but the init argument does not seem to be passed on to the downstream reduce function. Adapting an example from the documentation, I can show the problem as follows. The setup below is identical to the example:

suppressPackageStartupMessages({
    library(Rsamtools)
    library(GenomicFiles)
})

fl <- system.file(package="Rsamtools", "extdata", "ex1.bam")
bf <- BamFile(fl, yieldSize=500)

YIELD <- function(X, ...) {
    flag = scanBamFlag(isUnmappedQuery=FALSE)
    param = ScanBamParam(flag=flag, what="seq")
    scanBam(X, param=param, ...)[[1]][['seq']]
}
MAP <- function(value, ...) {
    requireNamespace("Biostrings", quietly=TRUE)
    Biostrings::alphabetFrequency(value, collapse=TRUE)
}
REDUCE <- `+`

Suppose we then want to offset every count by +100 and try to do so through the init parameter.

init <- alphabetFrequency(DNAStringSet())
init <- setNames(rep(100, ncol(init)), colnames(init))
print(init)
#>   A   C   G   T   M   R   W   S   Y   K   V   H   D   B   N   -   +   . 
#> 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

When we do this, the output is identical to the output we'd get if we had not set the init argument.

outcome <- reduceByYield(bf, YIELD, MAP, REDUCE, parallel=TRUE, init = init)

print(outcome)
#>     A     C     G     T     M     R     W     S     Y     K     V     H     D 
#> 39904 23195 20477 31681     0     0     0     0     0     0     0     0     0 
#>     B     N     -     +     . 
#>     0    29     0     0     0

The following is the outcome I had expected, and is also the outcome when setting parallel = FALSE.

print(outcome + 100)
#>     A     C     G     T     M     R     W     S     Y     K     V     H     D 
#> 40004 23295 20577 31781   100   100   100   100   100   100   100   100   100 
#>     B     N     -     +     . 
#>   100   129   100   100   100
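
For comparison, base R's `Reduce()` shows the fold semantics I would expect `init` to have. This is only a toy sketch: `chunks` holds made-up per-chunk counts standing in for the MAP output (not data from ex1.bam), and it says nothing about how GenomicFiles implements the reduction internally.

```r
# Toy per-chunk counts standing in for MAP() output (hypothetical values).
chunks <- list(
    c(A = 3, C = 1, G = 0, T = 2),
    c(A = 1, C = 4, G = 2, T = 0)
)
init <- c(A = 100, C = 100, G = 100, T = 100)

# Fold without a seed: just the element-wise sum of the chunks.
without_init <- Reduce(`+`, chunks)

# Fold with a seed: the seed enters the reduction exactly once.
with_init <- Reduce(`+`, chunks, init)

print(without_init)
print(with_init)
```

The seeded fold differs from the unseeded one by exactly the seed, which is the offset I expected to see in the parallel result as well.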

I think the line referenced below doesn't pass the init parameter on to bpiterate, but I don't know whether this is intended.

GenomicFiles/R/reduceByYield.R

Line 16 in f17056c

result <- bpiterate(ITER, FUN=MAP, REDUCE=REDUCE, ...)
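
In the meantime, one possible workaround is to fold init in by hand after the parallel call. The helper below, `apply_init()`, is hypothetical (it is not part of GenomicFiles), `parallel_result` is a made-up stand-in for the value returned by the parallel call, and the approach only matches the serial result when REDUCE is associative, as `+` is here.

```r
# Hypothetical helper: combine a seed with an already-reduced result.
apply_init <- function(result, REDUCE, init) {
    if (missing(init)) {
        # No seed supplied: leave the reduced result untouched.
        return(result)
    }
    REDUCE(init, result)
}

# Stand-in for the value returned by the parallel call (made-up numbers).
parallel_result <- c(A = 4, C = 5, G = 2, T = 2)
init <- c(A = 100, C = 100, G = 100, T = 100)

fixed <- apply_init(parallel_result, `+`, init)
print(fixed)
```

With the objects from the example above, this would be called as `apply_init(reduceByYield(bf, YIELD, MAP, REDUCE, parallel=TRUE), REDUCE, init)`.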

I assumed this is a bug because changing the parallel parameter shouldn't affect the outcome, but it does, so I thought I'd report it here.

Thanks for reading!
