OTH ratio 1 #93
Comments
Hi, thanks for the feedback. Some SNPs can have an OTH ratio of 1 in mode 1 because the REF and ALT alleles are not inferred from the data but taken from the input VCF specified by the user. Sometimes neither of the two specified alleles is "expressed" at a SNP. This can be real, because other alleles are expressed at that SNP, or it can be an artifact of sequencing or alignment errors, especially for SNPs with low coverage.
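To make the point above concrete, here is a minimal Python sketch (not cellsnp-lite's actual implementation) of how reads at one SNP would be split when REF and ALT are fixed by the input VCF. The function names `count_alleles` and `oth_ratio` and the toy reads are hypothetical. The convention that DP holds the REF + ALT counts while OTH holds everything else is an assumption taken from the ratio used in this issue.

```python
from collections import Counter

def count_alleles(bases, ref, alt):
    """Split the bases observed at one SNP into AD, DP and OTH counts.

    Assumed convention: AD = ALT count, DP = REF + ALT count,
    OTH = count of every base that is neither REF nor ALT.
    """
    counts = Counter(bases)
    ad = counts[alt]
    dp = counts[ref] + counts[alt]
    oth = sum(counts.values()) - dp
    return ad, dp, oth

def oth_ratio(dp, oth):
    """Fraction of all counted bases that fall into OTH."""
    total = dp + oth
    return oth / total if total else float("nan")

# The VCF says REF=A and ALT=G, but the three reads covering this
# low-coverage SNP all carry T, so every count lands in OTH.
ad, dp, oth = count_alleles(["T", "T", "T"], ref="A", alt="G")
print(ad, dp, oth, oth_ratio(dp, oth))  # 0 0 3 1.0
```

With only a handful of reads, a single mismatched or misaligned base is enough to push the ratio to 1, which is why low-coverage SNPs are the most affected.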
Hi @hxj5, many thanks for your help. The AD, DP, and OTH matrices can be used to compute the REF, ALT, and OTHER frequencies. But the annotation of the matrices is based on the reference VCF file and is not derived from the single-cell data. For example, an allele that is ALT in the reference could be the REF allele in my own data. Therefore, to pass the minMAF filter, either the REF or the ALT frequency needs to be greater than the threshold. But you ignore the OTHER frequency, is that correct? Also, if AD and DP are based on the reference VCF file and not on the single-cell data, wouldn't that be a problem for Vireo? Best wishes,
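The frequency calculation described in this comment can be sketched as follows in Python. This is an illustration only: `allele_frequencies` is a hypothetical helper, the counts are made up, and it assumes that AD is the ALT count and DP the REF + ALT count. How cellsnp-lite itself applies minMAF in mode 1 is not confirmed in this thread.

```python
def allele_frequencies(ad, dp, oth):
    """Return REF, ALT and OTH frequencies for one SNP from pooled counts.

    Assumes AD = ALT count and DP = REF + ALT count, so REF = DP - AD.
    Returns None when the SNP has no coverage at all.
    """
    total = dp + oth
    if total == 0:
        return None
    ref = dp - ad
    return {"REF": ref / total, "ALT": ad / total, "OTH": oth / total}

# Toy SNP: 8 of 10 reads carry the allele that the VCF labels ALT,
# so the VCF's "ALT" is actually the major allele in this data set.
freq = allele_frequencies(ad=8, dp=10, oth=0)
print(freq)

# Minor allele frequency among the two VCF alleles, ignoring OTH
# (an assumption about the filter, not a documented rule).
minor = min(freq["REF"], freq["ALT"])
print(minor)
```

Taking the smaller of the REF and ALT frequencies makes the check independent of which allele the VCF happens to call REF, which is the concern raised above.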
Thanks for the good question.
Dear @hxj5, I was on vacation last week, sorry for the delay. Many thanks for the detailed answer, it is very helpful! I managed to write a script that sorts and filters the cellsnp output into vartrix format. Here the REF and ALT matrices are based on the single-cell data. I will do some tests and update/close the thread afterwards. Again, many thanks for your time! Best wishes, Florian
Dear @hxj5 and @huangyh09,

Many thanks for helping me with this issue and the related #90 and #62. I wrote a script to convert the cellsnp output to vartrix format and to filter the vartrix matrix by minMAF, that is, to filter the matrix by the ALT frequency according to the single-cell data. For the plots below I show the cellsnp output with minMAF=0 and varied the minMAF for the vartrix output.

For each patient we have samples from different time points, which are either host-only or mixed host/donor samples. I pooled those per patient and ran cellsnp mode 1 with minMAF=0. Vireo was run with -M 150. The Vireo output can then be broken down by time point to test if only donor samples were annotated correctly (not shown). Below I show only the results for the pooled samples per patient.

Some quick information on the patients: Vireo is capable of handling samples where only one genotype is present (patient 2) and seems quite sensitive if one genotype is overrepresented (patient 1). In that case smaller minMAF values give better results. If patient and donor are unrelated (patients 3 and 4), the results are independent of minMAF. If they are closely related, the analysis yields mostly unassigned cells, and changing minMAF has either minor effects (patient_5 and patient_7) or very strong effects (patient 9). From the second figure I conclude that for closely related patients I could try to relax the min_prob filter to reduce the number of unassigned cells (#24).

I hope that is interesting for you, and many thanks again for all your help! Best wishes,
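The conversion and filtering step mentioned in this comment could look roughly like the Python sketch below. It is not the script referred to above, which is not shown in the thread. The helper names `to_ref_alt` and `filter_by_alt_freq`, the toy matrices, and the exact filter rule (pooled ALT frequency over REF + ALT counts at or above the threshold) are all assumptions made for illustration.

```python
def to_ref_alt(ad, dp):
    """Derive per-cell REF counts as DP - AD; ALT counts are AD itself.

    Matrices are lists of rows (SNPs) by columns (cells), with
    DP assumed to hold REF + ALT counts.
    """
    ref = [[d - a for a, d in zip(ad_row, dp_row)]
           for ad_row, dp_row in zip(ad, dp)]
    return ref, ad

def filter_by_alt_freq(ref, alt, min_maf):
    """Return indices of SNPs whose pooled ALT frequency is >= min_maf."""
    keep = []
    for i, (ref_row, alt_row) in enumerate(zip(ref, alt)):
        alt_sum = sum(alt_row)
        total = alt_sum + sum(ref_row)
        # SNPs without any REF/ALT coverage are dropped.
        if total > 0 and alt_sum / total >= min_maf:
            keep.append(i)
    return keep

# Toy data: 3 SNPs (rows) x 2 cells (columns).
AD = [[0, 1], [3, 2], [0, 0]]
DP = [[9, 11], [4, 6], [0, 0]]

REF, ALT = to_ref_alt(AD, DP)
kept = filter_by_alt_freq(REF, ALT, min_maf=0.1)
print(REF)   # [[9, 10], [1, 4], [0, 0]]
print(kept)  # [1]
```

For real cellsnp-lite output the matrices are sparse Matrix Market files, so a sparse matrix library would be the practical choice; plain lists are used here only to keep the example self-contained.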
Hello cellsnp-lite team,
I ran cellsnp-lite with default parameters in mode 1 and got SNPs with an OTH ratio of 1, computed as:
OTH_ratio <- Matrix::rowSums(OTH) / Matrix::rowSums(DP+OTH)
Does that make sense? I would highly appreciate your input. Best wishes,
Florian