A large number of cells are unassigned & very few n_vars #96

Open
nansne opened this issue Apr 3, 2024 · 3 comments

nansne commented Apr 3, 2024

Hi,
I ran vireo model 2 on my 10X multiome data (joint ATAC+RNA, a pool of 23 samples), but got a large number of unassigned cells and very few n_vars per cell. I'm not sure which step went wrong. Thank you!
[Two screenshots attached]

huangyh09 (Collaborator) commented

Hi, your output says that only 1102 of the 149K variants were matched to the donor VCF file and can be used for demultiplexing. You may try the following:

  1. Check whether your donor VCF has been imputed. Imputation, e.g., via the Sanger Imputation Server, will give you more SNPs to use.
  2. If 1) doesn't help much, consider re-running cellsnp-lite with the donor VCF from 1), keeping only the SNPs at which the donors have different genotypes.
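
The filtering in point 2 could be sketched roughly as below. This is only an illustration, not part of cellsnp-lite or vireo: the function names (`gt_of`, `informative_lines`, `filter_vcf`) are hypothetical, the code assumes a standard multi-sample VCF with a `GT` entry in the FORMAT column, and it treats phased and unphased calls as the same genotype.

```python
import gzip
from typing import Iterable, Iterator


def gt_of(sample_field: str, gt_index: int) -> str:
    """Return a phase-insensitive genotype such as '0/1' from one sample column."""
    parts = sample_field.split(":")
    if gt_index >= len(parts):
        return "."  # truncated or missing sample entry
    alleles = parts[gt_index].replace("|", "/").split("/")
    return "/".join(sorted(alleles))


def informative_lines(vcf_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines plus records where at least two donors differ in GT."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt_index = fmt_keys.index("GT")
        genotypes = {gt_of(sample, gt_index) for sample in fields[9:]}
        genotypes.discard("./.")  # ignore missing calls
        genotypes.discard(".")
        if len(genotypes) > 1:
            yield line


def filter_vcf(in_path: str, out_path: str) -> None:
    """Write the filtered VCF. Output is plain gzip; re-compress with bgzip
    if a downstream tool needs an indexed file."""
    with gzip.open(in_path, "rt") as fin, gzip.open(out_path, "wt") as fout:
        fout.writelines(informative_lines(fin))
```

A call such as `filter_vcf("data_pool1.vcf.gz", "data_pool1.informative.vcf.gz")` would then produce a smaller donor VCF; the output file name here is made up for the example.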

Yuanhua

nansne commented Apr 6, 2024

Thank you, Yuanhua. The donor VCF had been imputed before.
So I followed your second suggestion and ran cellsnp-lite with the donor VCF, using:

    cellsnp-lite \
      -s ./gex_possorted_bam.bam \
      -b ./barcodes.tsv.gz \
      -O ./testvkha_poolvcf_ref \
      -R ./data_pool1.vcf.gz \
      -p 16 --minMAF 0.1 --minCOUNT 20 --gzip --UMItag None

It took a long time, about 7.7 hours ([I::main] time spent: 27613 seconds.)
Next, I ran vireo with:

    vireo \
      -c ./testvkha_poolvcf_ref \
      -d ./data_pool1.vcf.gz \
      -o ./testvkha_poolvcf_ref \
      -p 16 --randSeed 2 --genoTag GT

However, there were still many unassigned cells:
[Two images attached]

Looking forward to hearing from you!
Best wishes,
Nan

huangyh09 (Collaborator) commented

Hi Nan, good to hear it's been imputed already. However, the results are still quite concerning, particularly the estimated allelic rate means: they should be near 0, 0.5, and 1.0, while yours are 0.133, 0.846, and 0.156.
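
To make the size of that gap concrete, here is a small arithmetic sketch. It assumes the three reported values are listed in the same order as the expected ones (0, 0.5, 1.0), and the 0.1 tolerance is an arbitrary choice for illustration, not a vireo setting; `deviations` and `far_off` are hypothetical helper names.

```python
from typing import List, Sequence

EXPECTED = (0.0, 0.5, 1.0)  # expected allelic rate means quoted above


def deviations(estimated: Sequence[float],
               expected: Sequence[float] = EXPECTED) -> List[float]:
    """Absolute gap between each estimated mean and its expected value."""
    return [round(abs(e - x), 3) for e, x in zip(estimated, expected)]


def far_off(estimated: Sequence[float], tolerance: float = 0.1) -> List[bool]:
    """Flag each mean whose gap exceeds the (arbitrary) tolerance."""
    return [d > tolerance for d in deviations(estimated)]


reported = [0.133, 0.846, 0.156]  # values from this thread
gaps = deviations(reported)
flags = far_off(reported)
```

Under that ordering assumption, all three means sit more than 0.1 away from where they should be, which points to a systematic problem rather than noise.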

You may want to double-check that your scRNA BAM and your donor VCF use the same genome build, e.g., both GRCh38 (hg38) or both GRCh37 (hg19).
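
One rough way to run that check is to compare the contig names and lengths declared in the two file headers. The sketch below assumes you have dumped the headers to text first (for example with `samtools view -H` for the BAM and `bcftools view -h` for the VCF) and that the VCF header carries `##contig` lines with a `length` field, which is not always the case. The function names are hypothetical, and only the chromosome 1 length is used to guess the build.

```python
import re
from typing import Dict, Optional

# Length of chromosome 1 in the two common human assemblies.
CHR1_LENGTHS = {248956422: "GRCh38/hg38", 249250621: "GRCh37/hg19"}


def bam_contigs(header_text: str) -> Dict[str, int]:
    """Parse @SQ lines of a SAM/BAM header into {name: length}."""
    contigs = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def vcf_contigs(header_text: str) -> Dict[str, int]:
    """Parse ##contig lines of a VCF header into {name: length}."""
    contigs = {}
    for line in header_text.splitlines():
        m = re.match(r"##contig=<(.*)>", line)
        if not m:
            continue
        id_m = re.search(r"(?:^|,)ID=([^,]+)", m.group(1))
        len_m = re.search(r"(?:^|,)length=(\d+)", m.group(1))
        if id_m and len_m:  # contig lines without a length are skipped
            contigs[id_m.group(1)] = int(len_m.group(1))
    return contigs


def guess_build(contigs: Dict[str, int]) -> Optional[str]:
    """Guess the assembly from the chromosome 1 length, or None if unknown."""
    for name in ("chr1", "1"):
        if name in contigs:
            return CHR1_LENGTHS.get(contigs[name])
    return None


def compare(bam_header: str, vcf_header: str) -> dict:
    """Summarise build guesses and how many contig names the files share."""
    bam, vcf = bam_contigs(bam_header), vcf_contigs(vcf_header)
    return {
        "bam_build": guess_build(bam),
        "vcf_build": guess_build(vcf),
        "shared_contig_names": len(set(bam) & set(vcf)),
    }
```

If the two build guesses differ, or if no contig names are shared (for instance `chr1` in one file and `1` in the other), few variants can match between the BAM-derived calls and the donor VCF.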

Yuanhua
