Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in clusterData function #95

Open
aletolia opened this issue Sep 7, 2024 · 3 comments
Open

Error in clusterData function #95

aletolia opened this issue Sep 7, 2024 · 3 comments

Comments

@aletolia
Copy link

aletolia commented Sep 7, 2024

感谢您的精彩工作!我正在遵循您公众号 https://mp.weixin.qq.com/s?__biz=MzkyMTI1MTYxNA==&mid=2247509312&idx=1&sn=5ab254679b6934cbe6c7a325154e1143&chksm=c1849f31f6f31627cc1148df6eb33e1e757a455c27d4822e864a28987aa2af95e9688b8061fd&token=553825481&lang=zh_CN#rd ‘听说你想画个 monocle3 的拟时序热图?’ 中的内容在我自己的数据集上进行复现,我首先使用

genes <- row.names(subset(modulated_genes, q_value == 0 & morans_I > 0.24))
mat <- pre_pseudotime_matrix(cds_obj = cds,
                             gene_list = genes)

得到了六组基因,但接下来如果我使用

> # kmeans
> ck <- clusterData(exp = mat,
+                   cluster.method = "kmeans",
+                   scaleData = TRUE,
+                   cluster.num = 6)

会产生错误:

0 genes excluded.
Error: number of cluster centres must lie between 1 and nrow(x)`

而如果将其修改成:

> ck <- clusterData(exp = mat,
+                   cluster.method = "kmeans",
+                   scaleData = TRUE,
+                   cluster.num = 5)

则会出现:

0 genes excluded.
Error in hclust(get_dist(t(submat), distance), method = method) : 
  size cannot be NA nor exceed 65536

以下是加载的包的和此前的monocle3代码:

library(Seurat)
library(SeuratDisk)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
library(monocle3)
library(SeuratWrappers)
library(clusterProfiler)
library(org.Hs.eg.db)
library(enrichplot)
library(ClusterGVis)
# 加载Seurat对象
sce <- LoadH5Seurat("adata_seurat_bbknn_annotated.h5Seurat")

# 将Seurat对象转换为Monocle3对象
cds <- as.cell_data_set(sce)

# 添加分群信息到Monocle3对象
colData(cds)$cluster <- sce@meta.data$leiden_res0_5

# 从Seurat对象中提取UMAP坐标,并添加到Monocle3对象
umap_coords <- Embeddings(sce, "UMAP")
reducedDims(cds)$UMAP <- umap_coords

# 确保使用UMAP进行降维
cds <- cluster_cells(cds, reduction_method = "UMAP")

# 使用现有的UMAP进行降维和拟时序分析
cds <- learn_graph(cds)
cds <- order_cells(cds)
plot_cells(cds,
           color_cells_by = "pseudotime",
           label_cell_groups=FALSE,
           label_leaves=FALSE,
           label_branch_points=FALSE,
           graph_label_size=1.5)

modulated_genes <- graph_test(cds, neighbor_graph = "principal_graph", cores = 6)
genes <- row.names(subset(modulated_genes, q_value == 0 & morans_I > 0.24))

library(ClusterGVis)
mat <- pre_pseudotime_matrix(cds_obj = cds,
                             gene_list = genes)
head(mat[1:6,1:6])

希望您可以抽空解答这个问题,谢谢!

@junjunlab
Copy link
Owner

基因有65536那么多吗?人的基因也就2万多点

@aletolia
Copy link
Author

基因有65536那么多吗?人的基因也就2万多点

实际上是 mat 是一个形状为 num [1:6, 1:72486] 的矩阵,其中 72486 是细胞的数目,我认为这在形状上没有问题——我实际检查这个数据时发现它的纵列是被选择的 6 个基因,横行则是按照拟时序大小排列的细胞编号(可能是?)内容则是表达量,如下图所示
image
但是让我不解的是,在我尝试将 mat 转置之后,运行成功并得到了如下的结果:
image
但显然这个结果存在问题,正如您说的一样:人类的功能基因实际上没那么多。看起来这里右侧的数目实际上应该指的是细胞的数目而非基因的数目。我使用了一个从 anndata 对象转换来的 seurat 对象然后利用 monocle3 直接转换成 cds 进行分析,这会是问题所在吗?
读取的代码如上所述:

# 加载Seurat对象
sce <- LoadH5Seurat("adata_seurat_bbknn_annotated.h5Seurat")

# 将Seurat对象转换为Monocle3对象
cds <- as.cell_data_set(sce)

# 添加分群信息到Monocle3对象
colData(cds)$cluster <- sce@meta.data$leiden_res0_5

# 从Seurat对象中提取UMAP坐标,并添加到Monocle3对象
umap_coords <- Embeddings(sce, "UMAP")
reducedDims(cds)$UMAP <- umap_coords

# 确保使用UMAP进行降维
cds <- cluster_cells(cds, reduction_method = "UMAP")

# 使用现有的UMAP进行降维和拟时序分析
cds <- learn_graph(cds)
cds <- order_cells(cds)
plot_cells(cds,
           color_cells_by = "pseudotime",
           label_cell_groups=FALSE,
           label_leaves=FALSE,
           label_branch_points=FALSE,
           graph_label_size=1.5)

在您的微信推文中,您似乎是从 rds 中读取了原始数据然后直接构建了 cds 对象

@aletolia
Copy link
Author

基因有65536那么多吗?人的基因也就2万多点

这是 cds 的具体情况:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants