column_order is ignored when column_split is used #1166

Open
dcarbajo opened this issue Feb 23, 2024 · 5 comments

dcarbajo commented Feb 23, 2024

I have a heatmap like this, where I apply column_order to get the column order I want (the NoT group before the LTR group):

[Image: Screenshot 2024-02-23 at 12 09 43]

Now I just want to apply a column gap between the NoT and the LTR columns. For that I use column_split with a variable that maps my matrix column names to their group, in the same order.

This worked for me before because my groups were A and B, so the A-B order was kept. However, now I want to plot these groups as B-A, and when I apply column_split the group order is reversed, ignoring my column_order.

[Image: Screenshot 2024-02-23 at 12 10 34]

How can I overcome this? I tried changing the order of my split variable, but to no avail. Thanks!

@dcarbajo dcarbajo changed the title column_order is ignored when column_split is used Potential bug: column_order is ignored when column_split is used Feb 28, 2024

dcarbajo commented Mar 4, 2024

I posted a full MWE here, and I am pasting it below as well:

Check the following MWE with the iris data.

1- Create my data matrix (small subset of iris for just setosa and virginica Species), and a meta information data frame with just sample ID and Species (my grouping variable):

data(iris)
my_setosa=subset(iris, Species=="setosa")
my_virginica=subset(iris, Species=="virginica")
set.seed(123)
rows_used1 <- sort(sample(1:nrow(my_setosa), 5, replace = FALSE))
rows_used2 <- sort(sample(1:nrow(my_virginica), 5, replace = FALSE))
my_iris=rbind(my_setosa[rows_used1,], my_virginica[rows_used2,])
#
heat_mat=t(as.matrix(my_iris[,-ncol(my_iris)]))
meta_df=data.frame(ID=paste0("id",rownames(my_iris)), Species=my_iris[,ncol(my_iris)])
meta_df$Species=droplevels(meta_df$Species)
colnames(heat_mat)=meta_df$ID

These look like this:

> heat_mat
             id3 id14 id15 id31 id42 id114 id125 id137 id143 id150
Sepal.Length 4.7  4.3  5.8  4.8  4.5   5.7   6.7   6.3   5.8   5.9
Sepal.Width  3.2  3.0  4.0  3.1  2.3   2.5   3.3   3.4   2.7   3.0
Petal.Length 1.3  1.1  1.2  1.6  1.3   5.0   5.7   5.6   5.1   5.1
Petal.Width  0.2  0.1  0.2  0.2  0.3   2.0   2.1   2.4   1.9   1.8
> meta_df
      ID   Species
1    id3    setosa
2   id14    setosa
3   id15    setosa
4   id31    setosa
5   id42    setosa
6  id114 virginica
7  id125 virginica
8  id137 virginica
9  id143 virginica
10 id150 virginica

2- Define heatmap and Species grouping colors, column annotation (Species groups), column order and column split (gap between Species groups):

palette=grDevices::colorRampPalette(c("green","black","red"))(11)
col_vec=c("red","blue")
col_vec=stats::setNames(col_vec, levels(meta_df$Species))
column_ha <- ComplexHeatmap::HeatmapAnnotation(
               Species = meta_df$Species,
               col = list(Species = col_vec),
               show_legend = TRUE)
column_order=meta_df[order(meta_df$Species, decreasing=TRUE),]$ID
column_split <- meta_df$Species

Note here (and this is the problem) that I want the virginica group on the left and the setosa group on the right, so my column order is the following:

> column_order
 [1] "id114" "id125" "id137" "id143" "id150" "id3"   "id14"  "id15"  "id31"
[10] "id42"

3- Make the heatmap without column_split:

complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
                                        row_dend_width = grid::unit(2, "inch"),
                                        rect_gp = grid::gpar(col = "white", lwd = 2),
                                        col = palette,
                                        top_annotation = column_ha,
                                        column_order = column_order,
                                        #column_split = column_split,
                                        column_gap = grid::unit(0.1, "inch"),
                                        border = TRUE)
grDevices::png(filename="heatmap.png", height=400, width=600)
ComplexHeatmap::draw(complex_heat)
grDevices::dev.off()

This is all good: the heatmap produced below has my sample IDs ordered correctly, with virginica on the left (in blue, as specified by col_vec) and setosa on the right (in red, as specified by col_vec):

[Image: heatmap]

4- Make the heatmap with column_split. If I just uncomment the column_split line, which splits the columns by the Species variable, I would expect the exact same heatmap with just a gap between the two Species groups. However, the column_order is ignored, and the setosa samples appear on the left:

complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
                                        row_dend_width = grid::unit(2, "inch"),
                                        rect_gp = grid::gpar(col = "white", lwd = 2),
                                        col = palette,
                                        top_annotation = column_ha,
                                        column_order = column_order,
                                        column_split = column_split,
                                        column_gap = grid::unit(0.1, "inch"),
                                        border = TRUE)

[Image: heatmap]

@dcarbajo dcarbajo changed the title Potential bug: column_order is ignored when column_split is used column_order is ignored when column_split is used Mar 7, 2024

dcarbajo commented Mar 7, 2024

Found the answer: column_split should be a factor whose values are in the same order as meta_df$Species (one value per matrix column), but whose levels are specified in the order we want the groups plotted.

The solution is to define column_split like this:

column_split=factor(as.character(meta_df$Species), levels=c('virginica','setosa'))
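To see why this helps, here is a minimal base-R sketch. It assumes, as the behaviour reported in this thread suggests, that the slices are drawn in the order of the factor levels and that column_order only arranges columns within each slice. The split() call is just an emulation of that grouping for illustration, not ComplexHeatmap's actual code.

```r
# Toy metadata mirroring the MWE above (IDs and Species as shown in the thread).
meta_df <- data.frame(
  ID = paste0("id", c(3, 14, 15, 31, 42, 114, 125, 137, 143, 150)),
  Species = factor(rep(c("setosa", "virginica"), each = 5)),
  stringsAsFactors = FALSE
)

# Default factor levels are sorted alphabetically, so setosa comes first.
default_split <- meta_df$Species
print(levels(default_split))

# Same value for every column, but levels listed in the desired plotting order.
column_split <- factor(as.character(meta_df$Species),
                       levels = c("virginica", "setosa"))
print(levels(column_split))

# Emulate the assumed slicing: groups come out in level order,
# while the IDs inside each group keep their original relative order.
slice_ids <- split(meta_df$ID, column_split)
print(names(slice_ids))
print(slice_ids)
```

Passing this column_split to the Heatmap() call from step 4 should then put the virginica slice on the left, with column_order still applied inside each slice.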


hookoop commented Jul 4, 2024

Hey,
I'm writing this for anyone who stumbles upon this thread, because I could not get this solution to work. I could get the columns to split the way I wanted, but the samples get shuffled so that they no longer correspond to their assigned groups. I have specified the levels of my factor, and factor(meta_df$treat, levels = unique(meta_df$treat)) has always worked for me before.

The only solution I could find was column_split = factor(c(2,2,2,1)), as my object had levels "PRE" (pre-treatment) and "POST" (post-treatment), corresponding to 1 and 2 respectively. If I try to set column_split to any variation of c("PRE", "POST", "POST", "POST"), I do not get my desired outcome. This is still confusing, because on the heatmap the order shows up as 1, 2, 2, 2 (as I wanted), yet I had to specify it as 2, 2, 2, 1 in the code.
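One thing worth ruling out in a situation like this is a mismatch between the order of the matrix columns and the order of the metadata rows. This is only a guess, since the comment does not show the data. The sketch below uses made-up sample names (s1 to s4) and a hypothetical matrix whose columns are stored in a different order than the metadata. It looks up each column's group by name, so that the i-th value of column_split describes the i-th matrix column.

```r
# Hypothetical data: the matrix columns are NOT in the same order as the metadata rows.
heat_mat <- matrix(
  1:8, nrow = 2,
  dimnames = list(c("g1", "g2"), c("s2", "s3", "s4", "s1"))
)
meta_df <- data.frame(
  ID = c("s1", "s2", "s3", "s4"),
  treat = c("PRE", "POST", "POST", "POST"),
  stringsAsFactors = FALSE
)

# Look up each matrix column's group by name instead of relying on row position.
idx <- match(colnames(heat_mat), meta_df$ID)
stopifnot(!anyNA(idx))  # every column must have a metadata row

# Values follow the matrix column order; levels give the desired slice order.
column_split <- factor(meta_df$treat[idx], levels = c("PRE", "POST"))

# Inspect the mapping before plotting.
print(data.frame(column = colnames(heat_mat), group = column_split))
```

In this toy case the values come out as POST, POST, POST, PRE while the levels stay PRE, POST, which would be consistent with needing 2, 2, 2, 1 in the code to see 1, 2, 2, 2 on the plot.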

@vetmohit89

I tried the method suggested above to reorder the columns, but it did not help.

fa = rep(c("WT", "DB", "DB_ACE", "DB_IW"), times = c(9, 7, 4, 5))
fa_col <- c("WT" = 2, "DB" = 3, "DB_ACE" = 4, "DB_IW" = 5)
dend2 = cluster_within_group(Actinobacteria_counts_scaled_matrix_clean, fa)

Heatmap(Actinobacteria_counts_scaled_matrix_clean_sorted, cluster_columns = dend2,
        column_split = factor(as.character(meta_df$Species), levels=c('WT','db/db', 'db+ACE2', 'db+IW')),
        column_order = column_order,
        cluster_row_slices = TRUE,
        cluster_column_slices = TRUE,
        top_annotation = HeatmapAnnotation(foo = fa, col = list(foo = fa_col)))

Error: When cluster_columns is a dendrogram, column_split can only be a single number.
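The error message itself points at one way forward: when cluster_columns is a dendrogram, column_split has to be a single number, which is the number of slices to cut the dendrogram into. Below is a minimal sketch of that pattern, reusing fa and fa_col from the comment. The matrix `mat` is random placeholder data, not the Actinobacteria matrix, and the ComplexHeatmap calls run only if the package is installed. The sketch builds the dendrogram from the same matrix that is plotted and leaves out column_order, on the assumption that the dendrogram already determines the column arrangement.

```r
# Grouping vector and annotation colours as given in the comment above.
fa <- rep(c("WT", "DB", "DB_ACE", "DB_IW"), times = c(9, 7, 4, 5))
fa_col <- c("WT" = 2, "DB" = 3, "DB_ACE" = 4, "DB_IW" = 5)

# Number of slices to request: one per group.
n_groups <- length(unique(fa))

# Placeholder data (assumption): 10 rows, one column per entry of `fa`.
set.seed(1)
mat <- matrix(rnorm(10 * length(fa)), nrow = 10,
              dimnames = list(NULL, paste0("sample", seq_along(fa))))

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  # Cluster columns within each group, then join the groups into one dendrogram.
  dend2 <- ComplexHeatmap::cluster_within_group(mat, fa)

  # With a dendrogram, column_split is a single number rather than a factor.
  ht <- ComplexHeatmap::Heatmap(
    mat,
    cluster_columns = dend2,
    column_split = n_groups,
    top_annotation = ComplexHeatmap::HeatmapAnnotation(
      foo = fa, col = list(foo = fa_col)
    )
  )
  # ComplexHeatmap::draw(ht)  # uncomment to render the plot
}
```

With this approach the left-to-right order of the groups follows the dendrogram rather than factor levels, so the factor-level trick from earlier in the thread does not apply directly.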
