XClone BAF xclone.pp.xclonedata > cell_anno.index.name = None #24

stublakemore · 2024-12-04T08:48:52Z

Dear Xianjie,

After spending a lot of time trying to resolve this issue myself, I wonder whether you can help me (again)!

Using Python 3.9 and XClone v0.3.8, as in my other (now resolved) issues, I tried to create the BAF_adata object with the xclone.pp.xclonedata function, following the standard API documentation (https://xclone-cnv.readthedocs.io/en/latest/API.html#baf-module). Running exactly this code gives the following error:

BAF_adata = xclone.pp.xclonedata([AD_file, DP_file], 'BAF', mtx_barcodes_file, "hg19_genes", "Sample_6_BAF")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/xclone/preprocessing/_data.py", line 251, in xclonedata
    cell_anno.index.name = None
UnboundLocalError: local variable 'cell_anno' referenced before assignment
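
A side note on the call above: all five arguments are passed positionally, so each value binds to whichever parameter sits at that position in the function signature. The sketch below shows one way to check such a binding with inspect.signature. Note that load_counts is a hypothetical stand-in, and its parameter names and order are assumptions for illustration only, not XClone's confirmed signature.

```python
import inspect


# Hypothetical stand-in for a loader function. The parameter names and
# their order are assumptions for illustration, not XClone's actual API.
def load_counts(mtx_files, data_mode, barcodes_file,
                regions_anno_file=None, genome_mode="hg38_genes",
                data_notes=None):
    return None


def show_binding(func, *args, **kwargs):
    """Return a dict of parameter name -> value, as Python would bind them."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


# All-positional call, in the style of the call that fails at line 251.
positional = show_binding(load_counts, ["AD.mtx", "DP.mtx"], "BAF",
                          "barcodes.tsv", "hg19_genes", "Sample_6_BAF")

# Same values, but with the last two passed by keyword.
keyword = show_binding(load_counts, ["AD.mtx", "DP.mtx"], "BAF",
                       "barcodes.tsv", genome_mode="hg19_genes",
                       data_notes="Sample_6_BAF")

print("positional:", positional["regions_anno_file"], positional["genome_mode"])
print("keyword:   ", keyword["regions_anno_file"], keyword["genome_mode"])
```

Running inspect.signature(xclone.pp.xclonedata) in the affected environment would show the real parameter order; if it differs from the order assumed in the call, passing the later arguments by keyword avoids the ambiguity.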

Maybe there's something amiss with my .mtx files, because I have three rather than two (cellSNP.tag.AD.mtx, cellSNP.tag.DP.mtx and cellSNP.tag.OTH.mtx), but running an adapted call to xclone.pp.xclonedata with all three leads to the same error:

BAF_adata = xclone.pp.xclonedata([AD_file, DP_file, OTH_file], 'BAF', mtx_barcodes_file, "hg19_genes", "Sample_6_BAF")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/xclone/preprocessing/_data.py", line 251, in xclonedata
    cell_anno.index.name = None
UnboundLocalError: local variable 'cell_anno' referenced before assignment

I wondered whether I first needed to load the features.tsv from the RDR analysis as cell_anno, but that also raises the same error.
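
For context, an UnboundLocalError generally means that a local variable is assigned only inside a conditional branch and then used on a path where that branch never ran. The following is a minimal sketch of that pattern in plain Python. The function names and the condition_met flag are hypothetical; this is not XClone's code, and which condition is not being met inside xclonedata is not known from the traceback alone.

```python
# Hypothetical functions written only to illustrate the failure pattern.
def build_annotation(barcodes, condition_met):
    if condition_met:
        cell_anno = list(barcodes)
    # If condition_met is False, cell_anno was never assigned, so the
    # next line raises UnboundLocalError.
    return len(cell_anno)


def safe_build_annotation(barcodes, condition_met):
    """Same logic, but failing early with an explicit message."""
    if not condition_met:
        raise ValueError("cell annotation was not loaded; check the inputs")
    cell_anno = list(barcodes)
    return len(cell_anno)


try:
    build_annotation(["cell_1", "cell_2"], condition_met=False)
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
```

Defining cell_anno outside the function beforehand does not help in this situation, because the name inside the function is local to it.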

Following your earlier advice to look at the log files, I inspected the pileup log but could not find anything suggesting that my resolved xcltk issue is not actually resolved. Please find the log file attached; you may well spot something I can't!

Many thanks,

Stuart
pileup.log

stublakemore · 2024-12-11T11:39:17Z

Dear Xianjie,

I would really appreciate it if you could give me your insight on this! I don't have any further developments to report regarding either a solution or the cause of the issue.

Cheers,

Stuart

hxj5 · 2024-12-13T03:26:36Z

Do you have any suggestions @Rongtingting?

Rongtingting · 2024-12-22T07:07:57Z

Hi Stuart,
@stublakemore
Could you try the steps at https://xclone-cnv.readthedocs.io/en/latest/preprocessing.html#baf-load, or attach the code you used?
I suspect that the paths to the data files may not be specified correctly.
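
A quick way to rule out a path problem is to confirm that every input file exists before calling the loader. This is a minimal sketch using only the standard library. The missing_files helper is hypothetical, and the directory below is a placeholder to be replaced with the real pileup directory.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# Placeholder directory; the file names are those mentioned in this thread.
data_dir = Path("path/to/pileup")
inputs = [
    data_dir / "cellSNP.tag.AD.mtx",
    data_dir / "cellSNP.tag.DP.mtx",
    data_dir / "cellSNP.samples.tsv",
]

for path in missing_files(inputs):
    print("not found:", path)
```

An empty result only shows that the files exist; it says nothing about whether their contents match what the loader expects.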

Bests,
Rongting

stublakemore · 2024-12-23T08:46:50Z

Dear Rongting,

Thanks for getting in touch. The exact code I used is in my initial post at the top of this thread. I wonder whether it's a problem with the xcltk baf preprocessing: my files are named cellSNP.tag.AD.mtx, cellSNP.tag.DP.mtx and cellSNP.tag.OTH.mtx rather than "AD.mtx" and "DP.mtx" as in the readthedocs example, and I have cellSNP.samples.tsv rather than "barcodes.tsv". I've also tried using the barcodes.tsv file from the RDR pre-processing output, without success. My code is below:

Attempt 1:

data_dir = "/projects/mpi-sclc/sblakemo/Sample_6/Sample_6_baf/pileup/"
AD_file = data_dir + "cellSNP.tag.AD.mtx"
DP_file = data_dir + "cellSNP.tag.DP.mtx"
mtx_barcodes_file = data_dir + "cellSNP.samples.tsv"
BAF_adata = xclone.pp.xclonedata([AD_file, DP_file], 'BAF',
                                 mtx_barcodes_file,
                                 genome_mode="hg19_genes")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/xclone/preprocessing/_data.py", line 262, in xclonedata
    Xadata = AnnData(AD, obs=cell_anno, var=regions_anno) # dtype='int32'
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/anndata.py", line 254, in __init__
    self._init_as_actual(
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/anndata.py", line 428, in _init_as_actual
    self._var = _gen_dataframe(
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/functools.py", line 888, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/aligned_df.py", line 65, in _gen_dataframe_df
    raise _mk_df_error(source, attr, length, len(anno))
ValueError: Observations annot. var must have as many rows as X has columns (1680), but has 32696 rows.

Attempt 2:

data_dir = "/projects/mpi-sclc/sblakemo/Sample_6/Sample_6_baf/pileup/"
AD_file = data_dir + "cellSNP.tag.AD.mtx"
DP_file = data_dir + "cellSNP.tag.DP.mtx"
mtx_barcodes_file = data_dir + "barcodes.tsv"
BAF_adata = xclone.pp.xclonedata([AD_file, DP_file], 'BAF',
                                 mtx_barcodes_file,
                                 genome_mode="hg19_genes")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/xclone/preprocessing/_data.py", line 262, in xclonedata
    Xadata = AnnData(AD, obs=cell_anno, var=regions_anno) # dtype='int32'
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/anndata.py", line 254, in __init__
    self._init_as_actual(
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/anndata.py", line 428, in _init_as_actual
    self._var = _gen_dataframe(
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/functools.py", line 888, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/projects/mpi-sclc/sblakemo/anaconda3/envs/xclone/lib/python3.9/site-packages/anndata/_core/aligned_df.py", line 65, in _gen_dataframe_df
    raise _mk_df_error(source, attr, length, len(anno))
ValueError: Observations annot. var must have as many rows as X has columns (1680), but has 32696 rows.

Here is also the head of the AD matrix, which displays as 4 columns but contains only 3 columns' worth of data:
head cellSNP.tag.AD.mtx
%%MatrixMarket matrix coordinate integer general
%
1680 757 74066
1 6 1
1 42 1
1 43 1
1 51 1
1 90 1
1 118 1
1 120 1

The same is true of the DP matrix:
%%MatrixMarket matrix coordinate integer general
%
1680 757 151948
1 6 1
1 12 2
1 25 1
1 42 1
1 43 1
1 49 1
1 51 1
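
The first non-comment line of a Matrix Market coordinate file gives the number of rows, the number of columns and the number of stored entries, so both matrices here are 1680 x 757. The ValueError above reports that X has 1680 columns while the var annotation has 32696 rows, which suggests (as an assumption inferred from that message) that the 1680 rows are features and the 757 columns are cells. Below is a minimal sketch for comparing those dimensions with the annotation sizes before loading. The helper names are hypothetical, and the barcode count of 757 in the example is a placeholder, since the number of lines in the barcode file is not shown in this thread.

```python
def read_mtx_shape(lines):
    """Return (n_rows, n_cols, n_entries) from Matrix Market coordinate text.

    `lines` is any iterable of strings, for example an open file handle.
    Lines starting with '%' are header or comment lines; the first other
    non-empty line is the size line.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        n_rows, n_cols, n_entries = (int(x) for x in line.split())
        return n_rows, n_cols, n_entries
    raise ValueError("no size line found")


def check_annotations(mtx_shape, n_features, n_barcodes):
    """List mismatches, assuming rows are features and columns are cells."""
    n_rows, n_cols, _ = mtx_shape
    problems = []
    if n_rows != n_features:
        problems.append(
            f"matrix has {n_rows} rows but feature annotation has {n_features}")
    if n_cols != n_barcodes:
        problems.append(
            f"matrix has {n_cols} columns but barcode file has {n_barcodes}")
    return problems


# Header lines copied from the AD matrix shown above.
sample = """%%MatrixMarket matrix coordinate integer general
%
1680 757 74066
1 6 1
"""

shape = read_mtx_shape(sample.splitlines())
# 32696 is the annotation size reported in the ValueError; 757 barcodes
# is a placeholder value for illustration.
for problem in check_annotations(shape, n_features=32696, n_barcodes=757):
    print(problem)
```

With real files, the same function can be called on an open handle, for example read_mtx_shape(open(AD_file)), and the annotation sizes can be taken from the line counts of the feature and barcode files produced by the same pileup run.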

If you need anything else from me to be able to resolve the issue, let me know and I'll look to provide further information!

Cheers,

Stuart
