Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueError: column index exceeds matrix dimensions #104

Open
hsymoon opened this issue Jul 8, 2024 · 3 comments
Open

ValueError: column index exceeds matrix dimensions #104

hsymoon opened this issue Jul 8, 2024 · 3 comments

Comments

@hsymoon
Copy link

hsymoon commented Jul 8, 2024

Hi eveyone,
I want to verify the best donor numbers of my vcf-data,which is the outputs of cellsnp-lite. I tried the demo command developers provided as follows:
`cell_vcf = vireoSNP.load_VCF('batch1_cellSNP.cells.vcf.gz', biallelic_only=True)

cell_dat = vireoSNP.vcf.read_sparse_GeneINFO(cell_vcf['GenoINFO'], keys=['AD', 'DP'])

AD = cell_dat['AD']

DP = cell_dat['DP']

n_donor_list = np.arange(2, 7)

ELBO_list_all = []

for _n_don in n_donor_list:
res = vireoSNP.vireo_wrap(AD, DP, n_donor=_n_don, learn_GT=True,
n_extra_donor=0, ASE_mode=False, fix_beta_sum=False,
n_init=50, check_doublet=True, random_seed=1)
ELBO_list_all.append(res['LB_list']) `

but I got the reply as follows:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/public/home/syhou/miniconda3/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 17, in _model_fit
_model.fit(AD, DP, min_iter=5, max_iter=max_iter,
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_model.py", line 313, in fit
ELBO += np.sum(get_binom_coeff(AD, DP))
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_base.py", line 15, in get_binom_coeff
_AD = AD[idx].astype(np.int64)
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 44, in getitem
row, col = self._validate_indices(key)
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 147, in _validate_indices
row, col = _unpack_index(key)
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 272, in _unpack_index
return index.nonzero()
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_base.py", line 826, in nonzero
A = self.tocoo()
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_compressed.py", line 1040, in tocoo
return self._coo_container(
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_coo.py", line 204, in init
self._check()
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_coo.py", line 293, in _check
raise ValueError('column index exceeds matrix dimensions')
ValueError: column index exceeds matrix dimensions
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 2, in
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 83, in vireo_wrap
_models_all = [res.get() for res in result]
File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 83, in
_models_all = [res.get() for res in result]
File "/public/home/syhou/miniconda3/lib/python3.10/multiprocessing/pool.py", line 774, in get
raise self._value
ValueError: column index exceeds matrix dimensions

It is so sad,here is my information of vcf_data used :
Number of samples: 9809
Number of SNPs: 14131
Number of INDELs: 0
Number of MNPs: 0
Number of others: 0
Number of sites: 14131

Can you help me , Thanks for replying

@huangyh09
Copy link
Collaborator

Thanks for sharing the issue. While you shared the info in the VCF file, can you still try print(cell_dat['AD'].shape)?

@hsymoon
Copy link
Author

hsymoon commented Jul 9, 2024

print(cell_dat['AD'].shape)

Hello,here is the command output:
print(cell_dat['AD'].shape) (14131, 9809)
Thanks for helping me

@huangyh09
Copy link
Collaborator

Thanks. This looks normal. Can you further try?

from vireoSNP.utils.vireo_base import get_binom_coeff, logbincoeff

np.sum(get_binom_coeff(AD, DP))

np.sum(logbincoeff(DP, AD, is_sparse=True))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants