I'm trying to process the 10x 1.3 Million Brain Cells from E18 Mice dataset (https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.3.0/1M_neurons) with Alevin, using salmon 0.12.0 compiled from source and gencode.vM19.pc_transcripts.fa.gz as the reference. The chemistry is 10x-v2. I have divided the fastqs into the 133 libraries and am trying to run Alevin on each library's fastqs (~140 R1 fastqs per library). The dataset was processed with the longranger demux program, which outputs one fastq containing both the UMI+barcode and the read sequence. I split these fastqs to match Alevin's expected input (i.e. the UMI+barcode in one fastq and the read sequence in the other).

However, Alevin seems to get stuck while processing the barcodes: no error is produced, and it simply stops making progress, with only "processed X Million barcodes" printed on the screen. Are you aware of such a problem with many fastq files, or is there something I'm not taking into account? Is there a limit on how many files can be used as input? I tested Alevin with 60 fastqs (120 in total, R1+R2) and it ran through, but with more than 60 fastqs it seems to get stuck on processing the barcodes. If it is not possible to run all of a library's fastqs at once, do you recommend running them in smaller batches and then combining the resulting count matrices?
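To make the batching question concrete, here is a minimal sketch of what "combining the resulting count matrices" could mean. It is only an illustration: the nested-dict layout, the function name `merge_count_matrices`, and the example barcodes and gene names are hypothetical and are not Alevin's output format. Note that a plain sum is only exact if no molecule is split across batches, because UMI deduplication would run separately in each batch.

```python
from collections import defaultdict


def merge_count_matrices(batches):
    """Sum per-batch counts into one {barcode: {gene: count}} mapping.

    Each batch is a hypothetical in-memory matrix: {barcode: {gene: count}}.
    """
    merged = defaultdict(lambda: defaultdict(float))
    for batch in batches:
        for barcode, genes in batch.items():
            for gene, count in genes.items():
                merged[barcode][gene] += count
    # Convert back to plain dicts so missing keys raise instead of appearing.
    return {barcode: dict(genes) for barcode, genes in merged.items()}


# Made-up example data: one cell appears in both batches.
batch_a = {"AAACCTGAGATAGGAG": {"Gene1": 2.0, "Gene2": 1.0}}
batch_b = {
    "AAACCTGAGATAGGAG": {"Gene1": 1.0},
    "AAACCTGAGCGGCTTC": {"Gene2": 4.0},
}
combined = merge_count_matrices([batch_a, batch_b])
```

Because the same cell barcode can occur in several batches of one library, reads from one molecule may be counted once per batch, so the summed values should be treated as an upper bound rather than an exact UMI count.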
Hi @mariaolaaksonen,
Thanks for raising the issue and for using Alevin with the 1.3M dataset.
Can you check whether your issue shows the same behavior as #329, i.e. Alevin gets stuck after processing a number of barcodes that is a multiple of 4?
We have already fixed that issue, but the fix is not yet in master or in the v0.12.0 release of salmon.
As a quick workaround, we recommend compiling salmon from source using the develop branch. If you can wait a little, we will release a new version with the hotfix soon.
Command used: salmon alevin -l ISR -1 R1_fastqs -2 R2_fastqs --chromium -i index -p 20 -o alevin_output --tgMap txp2gene_mouse.tsv --dumpCsvCounts --whitelist barcode_whitelist.txt --minScoreFraction 0.7
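For reference, running the same command in batches of at most 60 fastq pairs (the size that ran through) could be scripted roughly as below. This is a sketch under assumptions: it assumes `-1` and `-2` accept several space-separated files, the helper names `chunk` and `build_alevin_commands` are made up, and the fastq file names are placeholders. All flags are copied from the command above; nothing is executed here, the commands are only built as argument lists.

```python
def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_alevin_commands(r1_files, r2_files, batch_size, out_prefix):
    """Build one `salmon alevin` argument list per batch of fastq pairs."""
    if len(r1_files) != len(r2_files):
        raise ValueError("R1 and R2 lists must have the same length")
    commands = []
    batches = zip(chunk(r1_files, batch_size), chunk(r2_files, batch_size))
    for n, (r1, r2) in enumerate(batches):
        commands.append([
            "salmon", "alevin", "-l", "ISR",
            "-1", *r1,
            "-2", *r2,
            "--chromium", "-i", "index", "-p", "20",
            "-o", f"{out_prefix}_{n}",  # one output directory per batch
            "--tgMap", "txp2gene_mouse.tsv",
            "--dumpCsvCounts",
            "--whitelist", "barcode_whitelist.txt",
            "--minScoreFraction", "0.7",
        ])
    return commands


# Placeholder file names for one library with 140 fastq pairs.
r1 = [f"lib1_{i:03d}_R1.fastq.gz" for i in range(140)]
r2 = [f"lib1_{i:03d}_R2.fastq.gz" for i in range(140)]
commands = build_alevin_commands(r1, r2, batch_size=60, out_prefix="alevin_output")
```

Each entry in `commands` can be passed to `subprocess.run` once the paths point at real files. Keeping R1 and R2 chunked in the same order preserves the pairing between the barcode and read files.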
The barcode whitelist was extracted from the HDF5 file that holds the original data as a filtered matrix (i.e. it has already been run through cellranger).
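Since Alevin is run per library, the whitelist may need to be split per library as well. The sketch below assumes that the barcodes in the HDF5 file carry a numeric `-N` suffix identifying the library (for example `AAACCTGAGATAGGAG-2`); that is an assumption about the file, not something confirmed here, and should be checked against the actual barcode strings. The function name and the example barcodes are hypothetical.

```python
from collections import defaultdict


def split_whitelist_by_library(barcodes):
    """Group barcodes by their assumed '-N' library suffix and strip it.

    Barcodes without a suffix are placed in library 1.
    """
    libraries = defaultdict(list)
    for barcode in barcodes:
        sequence, sep, suffix = barcode.rpartition("-")
        if not sep:
            # No '-' present: treat the whole string as the sequence.
            sequence, suffix = barcode, "1"
        libraries[int(suffix)].append(sequence)
    return dict(libraries)


# Made-up barcodes standing in for the list read from the HDF5 file.
example = [
    "AAACCTGAGATAGGAG-1",
    "AAACCTGAGCGGCTTC-1",
    "AAACCTGAGATAGGAG-2",
    "TTTGTCATCTGAAAGA",
]
per_library = split_whitelist_by_library(example)
```

Each list in `per_library` can then be written one barcode per line to give a per-library whitelist file containing bare sequences without the suffix.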