Using Bulk RNASeq as reference #17

Rohit-Satyam · 2022-04-14T20:06:28Z

Hi Developers!!

I have Time-series bulk data (with two biological replicates for each time) and now I wish to use it to annotate my single-cell data. My Single-cell data contains two samples, One control and one drug-treated.

Your vignette is quite straight forward but can you tell me how I combine two biological replicates per time point into one column so that I have one column for each time point and use it for assigning labels. Will log normalizing both replicates and then take average per gene per time point work? Do you have any guidelines for using Bulk RNASeq datasets to label single-cell data? Can you check if the code I used below is valid?

Note: My cells are highly synchronized at 10 hours post-infection and I wish to confirm it by the time tag transfer using Bulk RNASeq.

library(Seurat)
library(dplyr)
c <- read.csv("counts.csv", header = T, row.names = 1)
ribo <- read.csv("ribosomal_genes.csv", header = F)
## removing ribo genes 225
c <- c[!rownames(c) %in% ribo$V1,]
> c %>% head()
T0_rep1 T0_rep2 T2_rep1 T2_rep2 T4_rep1 T4_rep2 T6_rep1
Gene1      46      48     202     298     357     428     125
Gene2     130     137     693     844    1713    1504     630
Gene3       0       0       0       0       1       0       0
Gene4       1       1       1       0       0       1       1
Gene5     412     524     994    1569    1959    2095     771
Gene6   16909   16966   16490   23660   30364   30484   10008

meta <- read.csv("meta.csv", header = T)
> meta %>% head()
Sample_id Prefix Replicate Time timetag
T0_rep1   T0_rep1     T0      rep1    4    4hpi
T0_rep2   T0_rep2     T0      rep2    4    4hpi
T2_rep1   T2_rep1     T2      rep1    6    6hpi
T2_rep2   T2_rep2     T2      rep2    6    6hpi
T4_rep1   T4_rep1     T4      rep1    8    8hpi
T4_rep2   T4_rep2     T4      rep2    8    8hpi

rownames(meta) <- meta$Sample_id
bulk <- CreateSeuratObject(counts = c, project = "Time-Series")
bulk$Sample <- "T-series"
bulk$batch <- "T-series"
bulk$batch2 <- "T-series"
bulk@meta.data <- cbind(bulk@meta.data,meta)
bulk <- NormalizeData(bulk)
prefix <- unique(meta$Prefix)
mat.temp <- data.frame(GetAssayData(bulk, slot="data"))
l <- list()
biorepavg <- lapply(1:length(prefix),function(x){
  l[[x]] <- mat.temp[,grep(prefix[x], colnames(mat.temp))] %>% rowMeans()
  
})

df <- data.frame(sapply(biorepavg,c))
colnames(df) <- prefix
newmeta <- meta[,c(2,4,5)] %>% unique() %>% `rownames<-`(.$Prefix)
> newmeta %>% head
Prefix Time timetag
T0      T0    4    4hpi
T2      T2    6    6hpi
T4      T4    8    8hpi
T6      T6   10   10hpi
T8      T8   12   12hpi
T10    T10   14   14hpi
reference <- SingleCellExperiment(list(counts=df),colData=DataFrame(celltypes =newmeta$timetag))

library(CHETAH)
input <- SingleCellExperiment(assays = list(counts = GetAssayData(p10,slot = "counts")),colData=DataFrame(p10@meta.data))
input <- CHETAHclassifier(input = input, ref_cells = reference)
> table(input$celltype_CHETAH)

  4hpi  Node1 Node14 Node15 Node16 Node18  Node2  Node3  Node4 
  1398      9      3      3     17      3     56   9113    336 
> classify_all = Classify(input, 0)
> table(classify_all$celltype_CHETAH)

10hpi 32hpi 42hpi  4hpi  8hpi 
 9458     3    17  1454     6

The bulk reference correlation map is as follows:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using Bulk RNASeq as reference #17

Using Bulk RNASeq as reference #17

Rohit-Satyam commented Apr 14, 2022 •

edited

Loading

Using Bulk RNASeq as reference #17

Using Bulk RNASeq as reference #17

Comments

Rohit-Satyam commented Apr 14, 2022 • edited Loading

Rohit-Satyam commented Apr 14, 2022 •

edited

Loading