Segmentation fault when using --soloFeatures Velocyto #1602

GeorgetteTanner · 2022-07-07T15:22:46Z

Hi Alex

Thanks for the great program.

I've been getting a segmentation fault error which happens with both the newest patch (2.7.10a_alpha_220601) as well as previous patches for 2.7.10a_alpha and version 2.7.9a. I've narrowed the problem down to when I include --soloFeatures Velocyto as it works fine when I use "--soloFeatures Gene GeneFull SJ". Just noticed this was also mentioned as a side issue in another thread: #1366

As a workaround, am I right that if I subtract Gene counts from GeneFull counts I would effectively end up with unspliced counts, with Gene counts representing spliced+ambiguous counts? If so, this may be the best approach anyway, as the He et al. 2022 Alevin-fry paper (https://www.nature.com/articles/s41592-022-01408-3) shows that RNA velocity may be improved by combining ambiguous with spliced reads. Or are ambiguous reads not counted in Gene counts?

Thanks
Georgette

alexdobin · 2022-07-08T14:10:10Z

Hi Georgette,

could you please send me the Log.out file from your run?
I agree that the GeneFull-Gene counts could be a reasonable approximation for unspliced counts.
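
For anyone wanting to try the GeneFull-minus-Gene approximation, here is a minimal sketch rather than anything built into STAR. It assumes the two `matrix.mtx` files are plain three-column MatrixMarket files (feature index, barcode index, count) and that both index the same features and barcodes in the same order, which is why the unfiltered `raw` directories are the safer choice. The function name `unspliced_approx` and the output file name are made up for illustration, and the directory layout in the usage comment is an assumption that depends on `--outFileNamePrefix`. Differences are clamped at zero in case a Gene count exceeds the GeneFull count for some entry.

```shell
# Approximate unspliced counts as GeneFull - Gene, entry by entry.
# Usage: unspliced_approx <Gene matrix.mtx> <GeneFull matrix.mtx>
# Prints tab-separated "feature_index barcode_index count" for non-zero results.
unspliced_approx() {
    awk '
        BEGIN      { OFS = "\t" }
        FNR == 1   { dims_seen = 0 }              # new file: expect a dimension line
        /^%/       { next }                       # skip MatrixMarket comment lines
        !dims_seen { dims_seen = 1; next }        # skip the "rows cols entries" line
        NR == FNR  { gene[$1 FS $2] = $3; next }  # first file: store Gene counts
        {
            diff = $3 - gene[$1 FS $2]            # missing Gene entry counts as 0
            if (diff > 0) print $1, $2, diff      # drop zero and negative values
        }
    ' "$1" "$2"
}

# Example call (paths are assumptions, adjust to your output prefix):
# unspliced_approx Solo.out/Gene/raw/matrix.mtx Solo.out/GeneFull/raw/matrix.mtx > unspliced_approx.tsv
```

The output is a bare triplet list without the MatrixMarket header, so a header and dimension line would still need to be added before loading it as an `.mtx` file.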

GeorgetteTanner · 2022-07-09T16:09:15Z

Hi Alex

This is the log file for the failed run: Log.out.txt

Thanks
Georgette

ms-gx · 2022-07-26T20:32:43Z

I also get a segfault when setting Velocyto.

Command:
$star_path --runThreadN 63 --genomeDir $ref_annotation_path --readFilesIn "$workdir_path"FILTERED_"$id"_Aligned.sortedByCoord.out.bam --readFilesType SAM SE --readFilesCommand samtools view -F 0x100 --soloInputSAMattrBarcodeSeq CR UR --soloInputSAMattrBarcodeQual CY UY --soloType CB_UMI_Simple --outFileNamePrefix "$workdir_path"MAPPING2_"$id"_ --soloUMIlen 12 --soloCBwhitelist <(zcat cellranger/barcodes/3M-february-2018.txt.gz) --outSAMattributes CB UB cN --outSAMtype BAM SortedByCoordinate --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts --soloUMIfiltering MultiGeneUMI_CR --soloUMIdedup 1MM_CR --clipAdapterType CellRanger4 --outFilterScoreMin 30 --soloCellFilter EmptyDrops_CR --limitBAMsortRAM $sort_memory --outSAMunmapped Within --soloFeatures Gene GeneFull SJ Velocyto > "$workdir_path"MAPPING2_"$id".mainlog.txt

EDIT:
Same as said above: as soon as I remove Velocyto (so only Gene GeneFull SJ) there is no segfault.

alexdobin · 2022-08-18T18:05:17Z

Hi Georgette, Michael,

I tried to reproduce these seg-faults with both sets of parameters, but they did not occur on my test sets.
Could you please check whether it happens on a smaller subset of reads (<100k) and send me such a subset?
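
One way to produce such a subset from gzipped FASTQ input is to keep the first N reads (4 lines each). This is only a sketch: the function name `subset_fastq` and the file names in the usage comment are placeholders, and for BAM input a different approach (for example via samtools) would be needed. Taking the first reads may also miss the read that triggers the crash, so it is worth confirming the seg-fault still occurs on the subset before sending it.

```shell
# Keep the first N reads of a gzipped FASTQ file (4 lines per read).
# Usage: subset_fastq <in.fastq.gz> <out.fastq.gz> [n_reads]
subset_fastq() {
    n_reads="${3:-100000}"                     # default: 100k reads
    # head stops reading after n_reads * 4 lines; the rest of the input is ignored
    gzip -dc "$1" | head -n "$((n_reads * 4))" | gzip > "$2"
}

# For paired files, use the same read count for both mates so pairs stay in sync:
# subset_fastq R1.fastq.gz R1.sub.fastq.gz 100000
# subset_fastq R2.fastq.gz R2.sub.fastq.gz 100000
```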

alexdobin · 2022-08-18T19:34:56Z

I have found another potential problem that may be causing a seg-fault and fixed it:
https://github.com/alexdobin/STAR/releases/tag/2.7.10a_alpha_220818
If you could test this patch, it would be great!

ms-gx · 2022-08-19T10:01:28Z

Dear Alex

I tested 2.7.10a_alpha_220818 on the dataset which used to trigger the segfault... and voilà it works now and there is no segfault.

Also, I tested again with the exact same conditions (except STAR version obviously) on 2.7.10a with the same dataset and there I could reproduce the segfault again.

So the problem is gone for me. Thanks much Alex!

alexdobin · 2022-08-22T18:27:27Z

Hi Michael,
thanks a lot for testing it!

johnchamberlin · 2023-11-06T20:39:50Z

Hello, I am also having a velocyto-induced segfault issue with STAR version 2.7.9a. This is with paired-end alignment and option '--peOverlapNbasesMin 5'

This is with 3' 10x genomics 150x150bp sequencing which I assume is not a normal use case. Do you know if it works with 5' assay data? See also #1366.

Thanks.

STAR --genomeDir GRCh38_ensg104.filtered --peOverlapNbasesMin 5 --soloType CB_UMI_Simple --soloBarcodeMate 2 --clip5pNbases 0 60 --soloCBwhitelist 3M-february-2018.txt --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 12 --soloBarcodeReadLength 150 --readFilesIn R2.fastq.gz R1.fastq.gz --soloFeatures Gene GeneFull SJ Velocyto --soloUMIfiltering MultiGeneUMI --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts --outFileNamePrefix AG1. --outSAMtype BAM SortedByCoordinate --outSAMattributes NH HI nM AS CR UR CB UB GX GN sS sQ sM --runThreadN 16 --readFilesCommand zcat

alexdobin · 2023-11-17T19:48:14Z

Hi John,

Velocyto calculation may not work with 5' protocol, and --peOverlapNbasesMin may not work properly with any solo options.
If you need to merge the reads, I would recommend doing it with another tool before mapping, but keeping the merged cDNA sequence and barcode sequence as separate reads - this way you will be able to use solo 3' options.

alexdobin added the "issue: code" label (Likely to be an issue with STAR code) on Jul 8, 2022
