Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

带有生物学重复的wgcna结果能直接引用到clustergvis里面的同时合并生物学重复吗? #110

Open
Ziwei-Liu opened this issue Dec 3, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@Ziwei-Liu
Copy link

我的wgcna输入数据是每个时间点有3个生物学重复的,在运行cw <- clusterData(exp = datExpr0, cluster.method = "wgcna", object = net)之后得到的cw的结构是

> str(cw)
List of 5
 $ wide.res:'data.frame':       15475 obs. of  33 variables:
  ..$ ERR_E1   : num [1:15475] 1.158 -2.138 -0.572 0.821 -0.332 ...
  ..$ ERR_E2   : num [1:15475] 0.989 0.493 1.535 -1.205 0.105 ...
  ..$ ERR_E3   : num [1:15475] 0.843 0.263 -1.492 0.61 2.111 ...
  ..$ ERR_NI1  : num [1:15475] -1.554 1.79 -0.954 1.122 0.244 ...
  ..$ ERR_NI2  : num [1:15475] -1.665 0.417 0.584 1.084 -0.504 ...
  ..$ ERR_NI3  : num [1:15475] 0.876 2.232 2.267 -0.656 0.236 ...
  ..$ ERR_NV1  : num [1:15475] 0.3469 -0.0634 0.5386 0.3853 -1.126 ...
  ..$ ERR_NV2  : num [1:15475] -1.02 -0.908 0.735 0.403 -1.126 ...
  ..$ ERR_NV3  : num [1:15475] 0.7875 -0.3503 -1.0401 -0.0251 -0.4062 ...
  ..$ ERR_ZI1  : num [1:15475] -1.6543 0.3853 0.0608 -1.0349 -1.126 ...
  ..$ ERR_ZI2  : num [1:15475] 0.3174 -0.2661 -0.0398 -0.5856 0.5682 ...
  ..$ ERR_ZI3  : num [1:15475] 0.507 1.21 1.047 0.163 1.984 ...
  ..$ ERR_ZII1 : num [1:15475] 0.3346 0.5296 -0.0917 0.4836 1.359 ...
  ..$ ERR_ZII2 : num [1:15475] -0.0773 -1.29 0.7879 -0.6568 -1.126 ...
  ..$ ERR_ZII3 : num [1:15475] 1.0002 -0.0152 -2.5131 0.971 -0.2035 ...
  ..$ ERR_ZIII1: num [1:15475] -1.67 -1.27 0.37 0.789 -0.457 ...
  ..$ ERR_ZIII2: num [1:15475] -1.5062 -0.6577 -0.7353 -0.8668 -0.0317 ...
  ..$ ERR_ZIII3: num [1:15475] 0.271 -0.331 0.456 1.046 0.787 ...
  ..$ ERR_MI1  : num [1:15475] -0.144 1.036 -0.791 -2.097 0.566 ...
  ..$ ERR_MI2  : num [1:15475] -0.389 0.286 -0.104 -2.061 -1.126 ...
  ..$ ERR_MI3  : num [1:15475] -0.161 0.293 -0.752 0.961 -0.647 ...
  ..$ ERR_MII1 : num [1:15475] 0.737 0.904 0.31 1.053 -1.126 ...
  ..$ ERR_MII2 : num [1:15475] -0.1691 0.0709 1.4271 -1.9131 1.1729 ...
  ..$ ERR_MII3 : num [1:15475] -1.65576 -0.51909 0.32075 0.65915 -0.00877 ...
  ..$ ERR_MIII1: num [1:15475] -0.867 -0.753 -1.037 0.776 -0.418 ...
  ..$ ERR_MIII2: num [1:15475] 1.183 -0.68 -0.291 0.322 -1.126 ...
  ..$ ERR_MIII3: num [1:15475] 0.311 0.717 -0.664 -1.046 -1.126 ...
  ..$ ERR_P1   : num [1:15475] 1.347 0.832 -0.538 0.148 1.642 ...
  ..$ ERR_P2   : num [1:15475] 1.045 -2.108 1.235 -0.584 1.332 ...
  ..$ ERR_P3   : num [1:15475] 0.4782 -0.1068 -0.0578 0.934 -0.0932 ...
  ..$ gene     : chr [1:15475] "LG26G000047" "LG44G000174" "LG13G000245" "LG10G000718" ...
  ..$ cluster  : num [1:15475] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ modulecol: chr [1:15475] "grey" "grey" "grey" "grey" ...
 $ long.res:'data.frame':       464250 obs. of  6 variables:
  ..$ cluster     : num [1:464250] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ gene        : chr [1:464250] "LG26G000047" "LG44G000174" "LG13G000245" "LG10G000718" ...
  ..$ modulecol   : chr [1:464250] "grey" "grey" "grey" "grey" ...
  ..$ cell_type   : Factor w/ 30 levels "ERR_E1","ERR_E2",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..$ norm_value  : num [1:464250] 1.158 -2.138 -0.572 0.821 -0.332 ...
  ..$ cluster_name: Factor w/ 18 levels "cluster 1 (2071 grey)",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ type    : chr "wgcna"
 $ geneMode: chr "none"
 $ geneType: chr "none"

然而我实际的数据分组情况是

> traitData
          E NI NV ZI ZII ZIII MI MII MIII P
ERR_E1    1  0  0  0   0    0  0   0    0 0
ERR_E2    1  0  0  0   0    0  0   0    0 0
ERR_E3    1  0  0  0   0    0  0   0    0 0
ERR_NI1   0  1  0  0   0    0  0   0    0 0
ERR_NI2   0  1  0  0   0    0  0   0    0 0
ERR_NI3   0  1  0  0   0    0  0   0    0 0
ERR_NV1   0  0  1  0   0    0  0   0    0 0
ERR_NV2   0  0  1  0   0    0  0   0    0 0
ERR_NV3   0  0  1  0   0    0  0   0    0 0
ERR_ZI1   0  0  0  1   0    0  0   0    0 0
ERR_ZI2   0  0  0  1   0    0  0   0    0 0
ERR_ZI3   0  0  0  1   0    0  0   0    0 0
ERR_ZII1  0  0  0  0   1    0  0   0    0 0
ERR_ZII2  0  0  0  0   1    0  0   0    0 0
ERR_ZII3  0  0  0  0   1    0  0   0    0 0
ERR_ZIII1 0  0  0  0   0    1  0   0    0 0
ERR_ZIII2 0  0  0  0   0    1  0   0    0 0
ERR_ZIII3 0  0  0  0   0    1  0   0    0 0
ERR_MI1   0  0  0  0   0    0  1   0    0 0
ERR_MI2   0  0  0  0   0    0  1   0    0 0
ERR_MI3   0  0  0  0   0    0  1   0    0 0
ERR_MII1  0  0  0  0   0    0  0   1    0 0
ERR_MII2  0  0  0  0   0    0  0   1    0 0
ERR_MII3  0  0  0  0   0    0  0   1    0 0
ERR_MIII1 0  0  0  0   0    0  0   0    1 0
ERR_MIII2 0  0  0  0   0    0  0   0    1 0
ERR_MIII3 0  0  0  0   0    0  0   0    1 0
ERR_P1    0  0  0  0   0    0  0   0    0 1
ERR_P2    0  0  0  0   0    0  0   0    0 1
ERR_P3    0  0  0  0   0    0  0   0    0 1

E1、E2、E3是E期的三个重复,NI1、NI2、NI3是NI期的三个重复,以此类推。
我有30列的输入数据,但实际只有10个分期,每个分期3个生物学重复。
如果直接用cw画图,那么每个生物学重复都会被当作一个时间点,这明显是不合理的。clusterData在引入wgcna数据的时候,是否可以增加一个参数,用来载入描述数据分组的traitData,以便在从net中引入wgcna数据的同时合并生物学重复?

p.s.我目前解决这个问题的方式是:

###datExpr0是使用blockwiseModules计算net的时候的输入数据矩阵
datExprm <- as.data.frame(t(datExpr0))

###输出标准化之后的表达量矩阵为csv文件
write.csv(datExprm, 'Embryo_gene.fpkm.matrix.normalized.csv', quote = FALSE)

###在excel中操作,将导出的标准化后的表达量,按生物学重复重新合并
#即原本存在E1、E2、E3三个生物学重复的E阶段表达量,要通过取平均合并为一个E阶段的表达量
#excel的每N列求平均公式(N=3):
=AVERAGE(OFFSET($B2,,3*(COLUMN(B2)-2),,3))

###重新读入合并生物学重复之后的标准化表达量
datExprmVar = read.csv('Embryo_gene.fpkm.matrix.normalized.BioDups_Merged.csv', header = TRUE, row.names = 1)

#从net中获取module label和对应的颜色信息
#net来自wgcna包的blockwiseModules的计算结果
moduleLabels = net$colors
moduleColors = labels2colors(moduleLabels)

#将module分组信息转化为dataframe
modLab <- t(data.frame(as.list(moduleLabels)))
colnames(modLab) <- "Module"

#将moduleColors先转化为named num再转化为dataframe
modCol <- moduleColors
names(modCol) <- names(moduleLabels)
modCol <- t(data.frame(as.list(modCol)))
colnames(modCol) <- "ModuleColor"

#合并module分组信息与normalized biodup merged expression file
expNorM <- transform(merge(datExprmVar, modLab, by = 0, all = TRUE), row.names=Row.names, Row.names=NULL)
expNorM <- transform(merge(expNorM, modCol, by = 0, all = TRUE), row.names=Row.names, Row.names=NULL)

#按特定module作图,聚类模式选择kmeans,聚类类别数选择1(聚类模式为mfuzz时,聚类类别数选1会报错)(因为这一步操作是按module选取基因输入clustergvis画图,输入clusterData的基因不需要进行进一步的聚类,因此要保证cluster.num = 1)
for (i in names(table(moduleColors))){
print(i)
expPatterns = subset(expNorM[expNorM$ModuleColor == i,], select = -c(Module, ModuleColor))
cm <- clusterData(exp = expPatterns,cluster.method = "kmeans",cluster.num = 1,seed = 42)
visCluster(object = cm,plot.type = "line")
}
@junjunlab junjunlab added the enhancement New feature or request label Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants