Fix whitelist logic for Drop-seq #263

Closed
FelixKrueger opened this issue Sep 5, 2023 · 0 comments · Fixed by #268
Labels
bug Something isn't working enhancement New feature or request

Comments

@FelixKrueger
Description of the bug

We tried to run the scrnaseq workflow on some Drop-seq data, but it crashed with this error:

gzip: invalid magic

Looking at this a bit more closely, this was the command that caused it:

# run simpleaf quant
gzip -dcf  > whitelist.uncompressed.txt

Before running simpleaf quant, the SIMPLEAF_QUANT script tries to decompress a whitelist file, but no file name is passed to gzip. In other words, for any protocol that does not provide a whitelist of possible barcodes (such as Drop-seq), the scrnaseq workflow as it stands will always fail.

if (params.barcode_whitelist) {
    ch_barcode_whitelist = file(params.barcode_whitelist)
} else if (params.protocol.contains("10X")) {
    ch_barcode_whitelist = file("$baseDir/assets/whitelist/10x_${chemistry}_barcode_whitelist.txt.gz", checkIfExists: true)
} else {
    ch_barcode_whitelist = [] // THIS LOGIC NEEDS FIXING
}
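One way the decompression step could be guarded is sketched below in shell. This is only an illustration, not the change that was merged: the function name `prepare_whitelist` and its argument are hypothetical, and the idea is simply to skip gzip when no whitelist path was supplied so that the caller can leave the whitelist out of the simpleaf quant command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): "$1" stands in for the
# whitelist path the pipeline would interpolate; it is empty for Drop-seq.
prepare_whitelist() {
    local whitelist="$1"
    if [ -n "$whitelist" ]; then
        # -d: decompress, -c: write to stdout, -f: force
        # (with GNU gzip, -f also passes uncompressed input through unchanged)
        gzip -dcf "$whitelist" > whitelist.uncompressed.txt
        echo "whitelist.uncompressed.txt"
    fi
    # With no whitelist, nothing is printed and no file is created,
    # so the caller can fall back to a method such as --knee.
}
```

Calling `prepare_whitelist ""` prints nothing and never invokes gzip, whereas `prepare_whitelist some_whitelist.txt.gz` writes the decompressed barcodes and prints the output file name.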

TEMPORARY REMEDY

As the gzip command is hard-coded into the script block (see above), the only way to keep it from failing is to stage a gzip-compressed file via the whitelist option (I uploaded an empty file to S3):

gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
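For anyone wanting to reproduce the workaround, an empty gzip-compressed file can be created locally along these lines. This is a minimal sketch; the file name is the one used above, and uploading the result (e.g. to S3) is left out.

```shell
# Create an empty file and gzip-compress it.
: > empty_gzip_file.txt
gzip -f empty_gzip_file.txt     # replaces it with empty_gzip_file.txt.gz

# The hard-coded decompression step now succeeds and yields an empty whitelist.
gzip -dcf empty_gzip_file.txt.gz > whitelist.uncompressed.txt
```

The resulting whitelist.uncompressed.txt exists but contains no barcodes, which is why the barcodes still need to be inferred by simpleaf itself.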

One can then get simpleaf_quant to infer the confident barcodes, e.g. via the --knee method (thanks to @rob-p for advice!). This will then skip adding the (non-existent) whitelist to the simpleaf_quant command, achieved here:

// check if users are using one of the mutually excludable parameters:

Lastly, I had to pass extra arguments to simpleaf_quant to select the knee method as well as a resolution method that determines how near-duplicate UMIs are resolved.
(quoting Rob:

The cr-like method is a safe default I think. That is, it’s not just meant for chromium chemistries, but is a general algorithm. In general, I think the specific method by which similar UMIs should be allocated is still an active area of research.)

Adding the following to the Nextflow config allowed the job to complete successfully:

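process {
    // Extra arguments for simpleaf quant: knee-based barcode filtering and cr-like UMI resolution
    withName: 'SIMPLEAF_QUANT' {
        ext.args = "--knee -r cr-like"
    }
}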

Thanks, Felix

Command used and terminal output

No response

Relevant files

No response

System information

No response

@FelixKrueger FelixKrueger added bug Something isn't working enhancement New feature or request labels Sep 5, 2023
This was referenced Sep 21, 2023
@grst grst closed this as completed in #268 Oct 2, 2023