Error: "invalid class dgRMatrix object" #18
Is this maybe falsely assuming that I am passing a sparse matrix object? That would explain why it works for a different case on my local machine. |
Anndata2ri only calls anndata2ri/anndata2ri/scipy2ri/py2r.py Lines 60 to 62 in 3fe0c7d
So I’m pretty sure it’s impossible that it “falsely assumes” that you pass a sparse matrix object. I’m pretty sure you actually pass a Lines 70 to 71 in 3fe0c7d
So I assume there’s a sparse matrix in |
It's in the case study of the best practices notebook that worked before the recent update (not exactly a minimal example). I will see if there's a sparse matrix in there, and otherwise check when this breaks. |
You are correct. There are the following sparse matrices in
Wasn't it the case before that you ignored sparse matrices in Would it be possible to only ignore sparse matrices in |
I think it’s a case of “couldn’t handle sparse matrices before, can do that now”. I don’t think SCEs have a canonical location for those, so I just put it into the generic “metadata” list. I don’t think ignoring this is the right way; there seems to be a bug either in scipy or here. My code looks pretty OK I think: anndata2ri/anndata2ri/scipy2ri/py2r.py Lines 62 to 70 in 3fe0c7d
So why does R say “invalid class “dgRMatrix” object: slot j is not increasing inside a column”? Did whatever code created the |
The sparse matrices in |
(one of) those specific sparse matrices’ memory layout seems to be the problem, not any sparse matrix stored in .uns |
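For readers unfamiliar with the error message: `dgRMatrix` is the row-compressed sparse class of R's Matrix package, and its `j` slot holds the column indices of the stored entries. As far as I can tell, the validity check complains when those indices are not increasing within one row (the message says "column", but the grouping in a row-compressed matrix is by row). The following is a pure-Python sketch of that kind of check, not the actual R code; the function name `rows_with_unsorted_indices` is made up for illustration.

```python
from typing import List, Sequence


def rows_with_unsorted_indices(indptr: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Return the rows whose column indices are not strictly increasing."""
    bad_rows = []
    for row in range(len(indptr) - 1):
        # indptr[row]:indptr[row + 1] is the slice of entries stored for this row.
        start, end = indptr[row], indptr[row + 1]
        row_cols = indices[start:end]
        # Strictly increasing also rules out duplicate column entries.
        if any(a >= b for a, b in zip(row_cols, row_cols[1:])):
            bad_rows.append(row)
    return bad_rows


# Two rows, three columns. Row 0 stores columns (2, 0): valid data, wrong order.
indptr = [0, 2, 3]
unsorted_indices = [2, 0, 1]
sorted_indices = [0, 2, 1]

print(rows_with_unsorted_indices(indptr, unsorted_indices))  # [0]
print(rows_with_unsorted_indices(indptr, sorted_indices))    # []
```

The point is that both index arrays describe the same matrix once the data array is permuted accordingly; only the storage order differs, which matches the "memory layout" diagnosis above.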
I have an idea what it could be. In this case I subset an anndata object. I assume this doesn't subset the sparse matrices in |
So it has to do with subsetting. Although the
This works:
This doesn't work:
|
I added the |
Maybe the subsetting results in Anyway, I can’t reproduce this:
scanpy==1.4.3 anndata==0.6.20 umap==0.3.0 numpy==1.16.4 scipy==1.3.0 pandas==0.24.2 scikit-learn==0.21.2 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1 |
Just linking to the anndata issue report for this: scverse/anndata#165 Also, in case anyone else sees this issue: It has to do with scipy version 1.2.1 |
I also encountered the same error, it is about subsetting but it has nothing to do with |
I just encountered the same error. It only occurs on the subsetted matrices. However my scipy and statsmodels packages are current. These are all the versions that I am using: I did try and downgrade to scipy==1.3.0 and statsmodels=0.10.0 but I am still getting the same error. Any idea what else to try? Thanks!!!! Eva |
And your |
Version for anndata2ri is 1.0.1 |
I recently had an error with ordering of sparse matrix indices after subsetting. R doesn't like it if the indices aren't contiguous anymore, so it gives an error when converting. You could check if all of your sparse matrices have |
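A sortedness check of this kind could look roughly like the sketch below. It uses scipy's `has_sorted_indices` attribute and `sort_indices()` method on CSR/CSC matrices; the helper `find_unsorted` and the example dictionary are hypothetical stand-ins for whatever containers (layers, `.uns` entries, and so on) hold your matrices.

```python
import numpy as np
from scipy import sparse


def find_unsorted(matrices):
    """Return the keys of sparse CSR/CSC matrices whose indices are not sorted."""
    return [
        key
        for key, m in matrices.items()
        if sparse.issparse(m) and m.format in ("csr", "csc") and not m.has_sorted_indices
    ]


# Build a CSR matrix by hand with row 0 storing its columns in the order (2, 0).
data = np.array([1.0, 2.0, 3.0])
indices = np.array([2, 0, 1])
indptr = np.array([0, 2, 3])
unsorted = sparse.csr_matrix((data, indices, indptr), shape=(2, 3))

matrices = {"unsorted": unsorted, "sorted": sparse.csr_matrix(np.eye(3)), "dense": np.eye(3)}
print(find_unsorted(matrices))

# sort_indices() reorders the stored entries in place; the represented values do not change.
before = unsorted.toarray()
unsorted.sort_indices()
print(find_unsorted(matrices))
```

Dense arrays and other sparse formats are skipped here on purpose, since only the compressed formats carry a per-row or per-column index order.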
Just ran |
At this point, we need a minimal reproducible example to see what’s happening. Just a code block that can be executed in a fresh session of python and triggers the issue in as little code as possible. |
I think I have resolved the issue with a workaround that is probably more ideal anyway: doing everything in a Docker container. It seems to be an OS-specific issue that occurs while saving a subsetted .h5ad file. Thanks for all your help.
These are my original commands (did not work).
This was done on Windows 8.1:
adata = sc.read('./write/previouslysaved_adata.h5ad')
adata[
    (adata.obs['sample']=='ct') |
    (adata.obs['sample']=='treatment_1')
].write('/data/subset_adata.h5ad')
This was done in the Docker container:
adata = sc.read('./data/subset_adata.h5ad')
%%R -i adata
adata
This works (all done in the Docker container):
adata = sc.read('./data/previouslysaved_adata.h5ad')
adata[
    (adata.obs['sample']=='ct') |
    (adata.obs['sample']=='treatment_1')
].write('/data/subset_adata.h5ad')
adata = sc.read('./data/subset_adata.h5ad')
%%R -i adata
adata
|
So this sounds like it is a windows issue, no? |
I'm finding this same error in a project I'm working on with @LuckyMD (https://github.com/singlecellopenproblems/SingleCellOpenProblems), also involving subsetting a sparse matrix:
from openproblems.tasks.label_projection.datasets import zebrafish_labels
import scIB.preprocessing
adata = zebrafish_labels(test=True) # This uses scanpy.pp.subsample()
scIB.preprocessing.normalize(adata)
raises
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-1-6c0cfb2c1423> in <module>
3
4 adata = zebrafish_labels(test=True) # This uses scanpy.pp.subsample()
----> 5 scIB.preprocessing.normalize(adata)
~/.local/lib/python3.8/site-packages/scIB/preprocessing.py in normalize(adata, min_mean)
166 sc.tl.louvain(adata_pp, key_added='groups', resolution=0.5)
167
--> 168 ro.globalenv['data_mat'] = adata.X.T
169 ro.globalenv['input_groups'] = adata_pp.obs['groups']
170 size_factors = ro.r(f'computeSumFactors(data_mat, clusters = input_groups, min.mean = {min_mean})')
/usr/lib/python3.8/site-packages/rpy2/robjects/environments.py in __setitem__(self, item, value)
30
31 def __setitem__(self, item: str, value: typing.Any) -> None:
---> 32 robj = conversion.converter.py2rpy(value)
33 super(Environment, self).__setitem__(item, robj)
34
/usr/lib/python3.8/functools.py in wrapper(*args, **kw)
873 '1 positional argument')
874
--> 875 return dispatch(args[0].__class__)(*args, **kw)
876
877 funcname = getattr(func, '__name__', 'singledispatch function')
~/.local/lib/python3.8/site-packages/anndata2ri/scipy2ri/py2r.py in wrapper(obj)
41
42 with localconverter(default_converter + numpy2ri.converter):
---> 43 return f(obj)
44
45 return wrapper
~/.local/lib/python3.8/site-packages/anndata2ri/scipy2ri/py2r.py in csr_to_rmat(csr)
63 def csr_to_rmat(csr: sparse.csr_matrix):
64 t, conv_data, _ = get_type_conv(csr.dtype)
---> 65 return methods.new(
66 f"{t}gRMatrix",
67 j=as_integer(csr.indices),
/usr/lib/python3.8/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
195 v = kwargs.pop(k)
196 kwargs[r_k] = v
--> 197 return (super(SignatureTranslatedFunction, self)
198 .__call__(*args, **kwargs))
199
/usr/lib/python3.8/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
123 else:
124 new_kwargs[k] = conversion.py2rpy(v)
--> 125 res = super(Function, self).__call__(*new_args, **new_kwargs)
126 res = conversion.rpy2py(res)
127 return res
/usr/lib/python3.8/site-packages/rpy2/rinterface_lib/conversion.py in _(*args, **kwargs)
42 def _cdata_res_to_rinterface(function):
43 def _(*args, **kwargs):
---> 44 cdata = function(*args, **kwargs)
45 # TODO: test cdata is of the expected CType
46 return _cdata_to_rinterface(cdata)
/usr/lib/python3.8/site-packages/rpy2/rinterface.py in __call__(self, *args, **kwargs)
622 error_occured))
623 if error_occured[0]:
--> 624 raise embedded.RRuntimeError(_rinterface._geterrmessage())
625 return res
626
RRuntimeError: Error in validObject(.Object) :
invalid class “dgRMatrix” object: slot j is not increasing inside a column
I have the following package versions installed
|
So |
I think this is happening during the call to |
Seems like that’s the correct thing to do. I think correctly constructed matrices do have the property R expects. |
So I think that this has to do with sparse matrices in R expecting contiguous indices. After subsetting, that is no longer the case. Thus, whenever I move a sparse index into R, I do the following:
That should solve the problem I think. |
Ah, nice! I didn't see the update. Does this fix affect passing back and forth between R and python? For example, if you have a subsetted python matrix, pass it to R and then back to python, would it still be seen as similar/equal? |
Comparison operations should test if the represented data is equal, not implementation details. Therefore I’d say that if they aren’t, that could be considered a bug in scipy’s |
fair point. |
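To make the distinction between represented data and storage layout concrete, here is a small sketch with two hand-built scipy CSR matrices. They hold the same values but store row 0 in a different order; the variable names are only for illustration. Comparing the dense views is used here as the simplest layout-independent check, which is fine for toy sizes but not for real count matrices.

```python
import numpy as np
from scipy import sparse

# Same logical 2x3 matrix, stored with two different orderings inside row 0.
indptr = np.array([0, 2, 3])
a = sparse.csr_matrix((np.array([1.0, 2.0, 3.0]), np.array([2, 0, 1]), indptr), shape=(2, 3))
b = sparse.csr_matrix((np.array([2.0, 1.0, 3.0]), np.array([0, 2, 1]), indptr), shape=(2, 3))

# The raw index arrays differ, but the matrices they describe are identical.
same_layout = np.array_equal(a.indices, b.indices)
same_values = np.array_equal(a.toarray(), b.toarray())
print(same_layout, same_values)
```

So a round trip through R that returns sorted indices would change `indices` and `data` as raw arrays while leaving the matrix itself unchanged.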
I'm seeing this issue again with latest versions of scipy and anndata2ri
if issparse(adata.X):
    if not adata.X.has_sorted_indices:
        adata.X.sort_indices()
ro.globalenv["adata"] = adata
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[49], line 4
2 if not adata.X.has_sorted_indices:
3 adata.X.sort_indices()
----> 4 ro.globalenv["adata"] = adata
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/rpy2/robjects/environments.py:35, in Environment.__setitem__(self, item, value)
34 def __setitem__(self, item: str, value: typing.Any) -> None:
---> 35 robj = conversion.get_conversion().py2rpy(value)
36 super(Environment, self).__setitem__(item, robj)
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
885 if not args:
886 raise TypeError(f'{funcname} requires at least '
887 '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/anndata2ri/py2r.py:56, in py2rpy_anndata(obj)
54 # TODO: sparse
55 x = {} if obj.X is None else dict(X=mat_converter.py2rpy(obj.X.T))
---> 56 layers = {k: mat_converter.py2rpy(v.T) for k, v in obj.layers.items()}
57 assays = ListVector({**x, **layers})
59 row_args = {k: pandas2ri.py2rpy(v) for k, v in obj.var.items()}
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/anndata2ri/py2r.py:56, in <dictcomp>(.0)
54 # TODO: sparse
55 x = {} if obj.X is None else dict(X=mat_converter.py2rpy(obj.X.T))
---> 56 layers = {k: mat_converter.py2rpy(v.T) for k, v in obj.layers.items()}
57 assays = ListVector({**x, **layers})
59 row_args = {k: pandas2ri.py2rpy(v) for k, v in obj.var.items()}
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
885 if not args:
886 raise TypeError(f'{funcname} requires at least '
887 '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/anndata2ri/scipy2ri/py2r.py:88, in py2r_context.<locals>.wrapper(obj)
36 importr('Matrix') # make class available
37 matrix = SignatureTranslatedAnonymousPackage(
38 """
39 sparse_matrix <- function(x, conv_data, dims, ...) {
(...)
85 'matrix',
86 )
---> 88 return f(obj)
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/anndata2ri/scipy2ri/py2r.py:97, in csc_to_rmat(csc)
93 @converter.py2rpy.register(sparse.csc_matrix)
94 @py2r_context
95 def csc_to_rmat(csc: sparse.csc_matrix):
96 csc.sort_indices()
---> 97 conv_data = get_type_conv(csc.dtype)
98 with localconverter(default_converter + numpy2ri.converter):
99 return matrix.from_csc(i=csc.indices, p=csc.indptr, x=csc.data, dims=list(csc.shape), conv_data=conv_data)
File ~/opt/anaconda3/envs/velocyto/lib/python3.10/site-packages/anndata2ri/scipy2ri/py2r.py:28, in get_type_conv(dtype)
26 return base.as_logical
27 else:
---> 28 raise ValueError(f'Unknown dtype {dtype!r} cannot be converted to ?gRMatrix.')
ValueError: Unknown dtype dtype('uint16') cannot be converted to ?gRMatrix.
anndata2ri==1.1, scipy==1.10.0, rpy2==3.5.8, scanpy==1.9.1 |
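Note that this traceback ends in a `ValueError` about the `uint16` dtype of a layer, not in the sorted-indices check from earlier in the thread. One possible workaround, sketched below, is to cast such matrices to `float64` before conversion. Which dtypes the converter accepts is an assumption here (floating point and boolean, guessed from the visible `as_logical` branch and the "d" for double in `dgRMatrix`), and the helper `as_float64_if_unsupported` is hypothetical, not part of anndata2ri.

```python
import numpy as np
from scipy import sparse


def as_float64_if_unsupported(m, supported_kinds=("f", "b")):
    """Return m unchanged if its dtype kind is in supported_kinds, else a float64 copy.

    supported_kinds is an assumption about the converter: "f" is floating point,
    "b" is boolean (numpy dtype.kind codes).
    """
    if m.dtype.kind in supported_kinds:
        return m
    return m.astype(np.float64)


# A small count matrix stored as uint16, similar in spirit to the failing layer.
counts = sparse.csc_matrix(np.array([[0, 3], [7, 0]], dtype=np.uint16))
converted = as_float64_if_unsupported(counts)
print(counts.dtype, converted.dtype)
```

Applied to an AnnData object, this would presumably mean replacing each affected entry of `adata.layers` with the cast matrix before assigning `adata` into the R environment; every `uint16` value is exactly representable as a `float64`, so no counts are altered by the cast.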
Hi Phil,
I'm just updating my case study notebook and updated anndata2ri for that. However, when converting my anndata object via:
I get the following error:
I'm pretty sure this has to do with versions in my conda environment as anndata2ri conversion worked fine when I worked locally on my laptop. Any idea if there are any R version dependencies that might not be met?