Try to use the reads rather than UMIs for counting SNPs in spatial transcriptome (10x Visium) #133

wJDKnight · 2024-08-02T01:30:41Z

I am working on calling variants in spatial transcriptomics data (10x Visium). Because the sequencing depth of spatial transcriptomics data is lower than that of single-cell data, I want to treat all reads in the BAM independently. Therefore, I used --UMItag None. That is, I changed the command from this (using the default UMI tag)

cellsnp-lite -s $OUT_BAM -b $BARCODE -O $OUT_DIR -R $REGION_VCF -p ${n_processes} --minMAF 0.05 --minCOUNT 20 --gzip --genotype

to this (using UMItag None)

cellsnp-lite -s $OUT_BAM -b $BARCODE --UMItag None -O $OUT_DIR -R $REGION_VCF -p ${n_processes} --minMAF 0.05 --minCOUNT 20 --gzip --genotype

I expected a higher sequencing depth (DP) in the output VCF, but instead the overall DP decreased.

Could it be because of some filtering criteria? When should I use --countORPHAN?


hxj5 · 2024-08-02T01:57:59Z

Hi, thanks for the detailed feedback. The --exclFLAG option probably matters in this case. It filters reads based on BAM FLAGs: reads with any of the mask bits set are skipped. The default is UNMAP,SECONDARY,QCFAIL (when UMIs are used) or UNMAP,SECONDARY,QCFAIL,DUP (otherwise).

In other words, when you set --UMItag None, by default the reads marked as duplicates in FLAG will be filtered. To keep these reads, you can manually set "--exclFLAG", e.g., to --exclFLAG 772.
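For reference, the value 772 is the sum of the SAM FLAG bits for UNMAP, SECONDARY and QCFAIL. The sketch below only reproduces that arithmetic with shell integer operations; the variable names are illustrative and are not part of cellsnp-lite.

```shell
# SAM FLAG bit values, as defined in the SAM specification.
UNMAP=4          # 0x4   read unmapped
SECONDARY=256    # 0x100 secondary alignment
QCFAIL=512       # 0x200 failed quality checks
DUP=1024         # 0x400 PCR or optical duplicate

# Default mask when UMIs are used: reads marked DUP are kept.
MASK_UMI=$(( UNMAP | SECONDARY | QCFAIL ))
# Default mask otherwise (e.g. --UMItag None): reads marked DUP are skipped too.
MASK_NO_UMI=$(( MASK_UMI | DUP ))

echo "with UMI:    $MASK_UMI"
echo "without UMI: $MASK_NO_UMI"
```

Passing --exclFLAG 772 together with --UMItag None therefore restores the mask without the DUP bit.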

It is not recommended to use --countORPHAN with paired-end sequencing. You can find the details of all the read filtering options in the manual.

wJDKnight · 2024-08-02T02:12:14Z

Thank you very much for such a quick response. I will check the usage of "--exclFLAG" and report back later.

wJDKnight · 2024-08-02T10:19:41Z

With that flag, the overall DP increases to three times what it was before. It seems to be working well. Thanks a lot. cellsnp-lite is really a very nice tool.

wJDKnight · 2024-08-16T12:41:11Z

Though I got a larger DP by including DUP, I am wondering why excluding DUP decreases the DP. Here is an example of a locus with 4 reads in one UMI group.

In scenario A, the DP for that locus will be 1. In C, it will be 4. I think B should be 2, am I right?
But in the real data, I found that the DP of B is smaller than that of A at every locus detected in both. How does that happen?

hxj5 · 2024-08-17T02:45:11Z

Hi, the --exclFLAG option simply filters reads by checking whether the DUP bit is set in the SAM FLAG. In the above example, if all three "blue" reads are marked as DUP, they will all be filtered out and their count is 0 instead of 1. Following this rule, the DP for scenario B will be between 0 (if all 4 reads are marked DUP) and 4 (if none of the reads is marked DUP), depending on their FLAGs.

Cellsnp-lite relies entirely on the FLAG set by the upstream alignment tool. You may further check the FLAGs of the reads to investigate the DP difference between the three scenarios.
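The filtering rule described above can be sketched in shell as follows. This is an illustration only, not cellsnp-lite's actual code: `count_kept` is a hypothetical helper, and the FLAG values passed to it are made up for a UMI group of four reads.

```shell
# Count reads that survive a FLAG mask.
# A read is skipped if any bit of the mask is set in its FLAG.
# Usage: count_kept MASK FLAG [FLAG ...]
count_kept() {
    mask=$1
    shift
    kept=0
    for flag in "$@"; do
        if [ $(( flag & mask )) -eq 0 ]; then
            kept=$(( kept + 1 ))
        fi
    done
    echo "$kept"
}

# Hypothetical FLAGs for four reads in one UMI group (1024 = DUP bit set).
count_kept 1796 0 1024 1024 1024      # DUP masked, one read not marked DUP
count_kept 1796 1024 1024 1024 1024   # DUP masked, all reads marked DUP
count_kept 772 0 1024 1024 1024       # DUP not masked (--exclFLAG 772)
```

On real data, the FLAG is the second column of each SAM record, so `samtools view` can be used to inspect it; for example, `samtools view -c -f 1024` on the BAM should report how many reads have the DUP bit set.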

wJDKnight closed this as completed Aug 2, 2024

wJDKnight reopened this Aug 16, 2024
