-
Notifications
You must be signed in to change notification settings - Fork 1
/
Bioc2021_workshop.Rmd
549 lines (419 loc) · 20.1 KB
/
Bioc2021_workshop.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
---
title: "Bioc2021: Visualisation of highly-multiplexed imaging data in R"
date: "`r BiocStyle::doc_date()`"
author:
- name: Nils Eling
affiliation:
- Department for Quantitative Biomedicine, University of Zurich
- Institute for Molecular Health Sciences, ETH Zurich
email: [email protected]
- name: Nicolas Damond
affiliation:
- Department for Quantitative Biomedicine, University of Zurich
- Institute for Molecular Health Sciences, ETH Zurich
- name: Tobias Hoch
affiliation:
- Department for Quantitative Biomedicine, University of Zurich
- Institute for Molecular Health Sciences, ETH Zurich
- name: Bernd Bodenmiller
affiliation:
- Department for Quantitative Biomedicine, University of Zurich
- Institute for Molecular Health Sciences, ETH Zurich
output:
BiocStyle::html_document:
toc_float: yes
pandoc_args: [
"--output=index.html"
]
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_file = paste0(dirname(inputFile),'/index.html')) })
abstract: |
Highly multiplexed imaging acquires the single-cell expression of
RNA, metabolites or proteins in a spatially-resolved fashion. These measurements can be
visualized across multiple length-scales. First, pixel-level intensities
represent the spatial distributions of feature expression with highest
resolution. Second, after segmentation, expression values or cell-level
metadata (e.g. cell-type information) can be visualized on segmented cell
areas. This workflow describes the use of the [cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html)
Bioconductor package to demonstrate functions for the visualization of multiplexed
read-outs and cell-level information obtained by multiplexed imaging.
The main functions of this package allow 1. the visualization of
pixel-level information across multiple channels, 2. the display of
cell-level information (expression and/or metadata) on segmentation masks
and 3. gating and visualisation of single cells.
vignette: |
%\VignetteIndexEntry{"Visualization of imaging cytometry data in R"}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
`r fontawesome::fa(name = "github", fill = "#333")` <a href="https://github.com/nilseling">@nilseling</a>
`r fontawesome::fa(name = "twitter", fill = "#1DA1F2")` <a href="https://twitter.com/NilsEling">@NilsEling</a>
# Data and code availability
To follow this tutorial, please visit
[https://github.com/BodenmillerGroup/cytomapper_demos/tree/main/docs](https://github.com/BodenmillerGroup/cytomapper_demos/tree/main/docs).
The compiled .html of this workshop is hosted at:
[https://bodenmillergroup.github.io/cytomapper_demos](https://bodenmillergroup.github.io/cytomapper_demos).
The
[cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html)
package can be installed via:
```{r installation, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("cytomapper")
```
To reproduce the analysis, clone the repository:
```
git clone https://github.com/BodenmillerGroup/cytomapper_demos.git
```
and open the `Bioc2021_workshop.Rmd` file in the `docs` folder.
We provide three images, their segmentation masks and the quantified single-cell
data in form of a `SingleCellExperiment` object in the `data` folder. The data
is taken from [A map of human type 1 diabetes progression by imaging mass
cytometry](https://www.sciencedirect.com/science/article/pii/S1550413118306910).
# Introduction
The analysis and visualization of highly-multiplexed imaging data relies on three
types of data structures:
* Multi-channel images realized as three dimensional arrays or dimensions `x`,
`y`, `c` where the numeric entry of each voxel represents the intensity of the
pixel at position `x` and `y` for channel `c`.
* Single-channel segmentation masks realized as matrices where sets of pixels
with the same integer ID represent individual objects (here, these are segmented
cells)
* A table containing the quantified features per cell and channel (e.g. mean
pixel intensity)
The `cytomapper` package handles these data types using objects of the following
S4 classes:
* Multiple multi-channel images are stored in form of `EBImage::Image` objects
within a `cytomapper::CytoImageList` object
* Multiple single-channel segmentation masks are stored in form of
`EBImage::Image` objects within a `cytomapper::CytoImageList` object
* Per-cell intensity measures and cell-/channel-specific metadata are stored in
a `SingleCellExperiment::SingleCellExperiment` container
![cytomapper overview figure. A) The plotCells function combines a SingleCellExperiment and CytoImageList object to visualize marker expression or cell-specific metadata on segmentation masks. B) The plotPixels function requires a CytoImageList object to visualize the combined expression of up to six markers as composite images](imgs/Overview.png)
The `cytomapper` package contains three broad functionalities:
* Visualization of pixel-intensities as composites of up to six channels (`plotPixels`)
* Visualization of cell-specific features on segmentation masks (`plotCells`)
* Interactive gating of cells and visualization of gated cells on images (`cytomapperShiny`)
The follwing demonstration gives an overview on data handling and processing,
data visualization and interactive analyses.
We use [imaging mass cytometry](https://www.nature.com/articles/nmeth.2869) data
to highlight the functionality of the `cytomapper` package. However, any imaging
technology is supported as long as the data can be read into R (memory
restrictions and file type restrictions.)
The raw data has been processed using the [ImcSegmentationPipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline).
In this tutorial, we assume that image segmentation has been performed beforehand.
# Reading in the data
The following section describes how to read in images (e.g. in `.tiff` format)
into `CytoImageList` objects and how to generate `SingleCellExperiment`
objects from these images.
## Reading in images
The `cytomapper::loadImages` function reads in multi-channel images and segmentation masks into `CytoImageList` objects.
```{r, reading-in-data-1, message=FALSE}
library(cytomapper)
# Read in 32-bit multi-channel images
(images <- loadImages("../data/images/", pattern = ".tiff"))
# Read in 16-bit unsigned integer segmentation masks
(masks <- loadImages("../data/masks/", pattern = ".tiff"))
```
It is always recommended to observe the numeric pixel values to make sure that
images were read in correctly:
```{r histograms}
# multi-channel images - first image, first channel
hist(log10(images[[1]][,,1] + 1))
# Segmentation mask
masks[[1]]
```
We notice, that the segmentation masks were not read in as integer images.
This behavior arises due to issues with accessing the `tiff` metadata after
pre-processing using `CellProfiler`.
In these cases, the `cytomapper::scaleImages` function can be used to rescale
segmentation masks to only contain integer IDs. Here, the scaling factor
is `2 ^ 16 - 1 = 65535` accounting for the 16-bit unsigned integer encoding.
```{r scaleImages}
masks <- scaleImages(masks, 2 ^ 16 - 1)
masks[[1]]
```
As an alternative, while reading in the images, the `as.is` option can be set to `TRUE`.
```{r as.is}
masks <- loadImages("../data/masks/", pattern = ".tiff", as.is = TRUE)
masks[[1]]
```
We can already visualize the segmentation mask to get an idea of the tissue
structure:
```{r mask-viz}
plotCells(masks)
```
### Reading data to disk
To increase scalability of the `cytomapper` package, images can be stored on
disk making use of the `HDF5Array` and `DelayedArray` package. Reading in the
data can also be parallelised, which is only recommended for large images.
```{r read-in-to-disk}
format(object.size(images), units = "Kb")
images_ondisk <- loadImages("../data/images/", pattern = ".tiff",
on_disk = TRUE, h5FilesPath = "../data/images",
BPPARAM = BiocParallel::bpparam())
format(object.size(images_ondisk), units = "Kb")
```
All `cytomapper` functions support images and masks stored on disk.
## The `SingleCellExperiment` object
In the original analysis of this dataset, single-cell features have been
extracted using `CellProfiler`. For this tutorial, we provide a
`SingleCellExperiment` object, which already contains the mean intensities per
cell and channel and all relevant metadata (e.g. cell-type annotation).
The `SingleCellExperiment` can be read-in from the `data` folder:
```{r read-in-sce}
(sce <- readRDS("../data/sce.rds"))
colData(sce)[,c("ImageName", "CellNumber", "CellType")]
```
# Image pre-processing
The following section will discuss setting image metadata and image normalization.
## Setting metadata
Before image visualization, there are a few metadata entries that need to be set.
We will need to set the channel names of the images via the `channelNames`
getter/setter function. The channel order here is the same as the row order of
the `SingleCellExperiment` object. We will also need to synchronise the image
IDs across the multi-channel images and segmentation masks by storing a
DataFrame in the elementMetadata slot of the CytoImageList object.
```{r, format-the-data, message=FALSE}
# Add channel names
channelNames(images) <- rownames(sce)
# Add image name to metadata
(mcols(images) <- mcols(masks) <- DataFrame(ImageName = c("E30", "G23", "J01")))
```
## Normalization
Channel normalization is a crucial step for image visualization to enhance
the visibility of biological signals. One option is to transform (e.g. `sqrt`)
the images. A more widely used alternative is to perform a range-scaling
and clipping of channel intensities.
The `cytomapper` package exports the `normalize` function that scales channels
between 0 and 1 across all images (by default). Clipping is performed by setting
the `inputRange` parameter in the `normalize` function.
```{r normalization}
# Min-max normalization
images_norm <- normalize(images)
# Clip the images
images_norm <- normalize(images_norm, inputRange = c(0, 0.2))
```
## Measuring object features
The `cytomapper::measureObjects` function takes the segmentation masks and
multi- channel image objects and computes morphological (e.g. cell shape, size
and location) and intensity features (default: mean intensity per channel and
object/cell).
```{r, measure-features}
sce_measured <- measureObjects(mask = masks, image = images,
img_id = "ImageName")
sce_measured
```
The pixel intensities per cell can be summarized in different ways (e.g. as
quantiles). Furthermore, parallelization is possible by setting `BPPARAM = bpparam()`.
Cell-specific morphological features are stored in `colData(sce)` while the
mean pixel intensities per cell and channel are stored in `counts(sce)`.
```{r show-internal-structure}
colData(sce_measured)
counts(sce_measured)[1:10, 1:10]
```
# Data visualization
After having formatted the data, we can now move on to data visualization.
## Visualizing pixel intensities
The `cytomapper::plotPixels` function is the main function for visualization
of pixel intensities.
The easiest way of visualizing images is to just specify channel names:
```{r plotPixels-1}
plotPixels(images,
colour_by = c("PIN", "CDH"))
```
Here, the intensities appear weak. For this, we have normalized the images:
```{r plotPixels-2}
plotPixels(images_norm,
colour_by = c("PIN", "CDH"))
```
We can see a decline in pro-insulin expression while E-cadherin expression stays the same.
Instead of normalizing the images, you can also specify the background (b),
contrast (c) and gamma (g) levels of the images:
```{r plotPixels-3}
plotPixels(images,
colour_by = c("PIN", "CDH"),
bcg = list(PIN = c(0, 5, 1),
CDH = c(0, 5, 1)))
```
The `cytomapper` package allows flexible adjustments of all visual attributes of the images:
```{r plotPixels-4}
plotPixels(
image = images,
colour_by = c("PIN", "CD4", "CD8a"),
colour = list(PIN = c("black", "yellow"),
CD4 = c("black", "blue"),
CD8a = c("black", "red")),
bcg = list(PIN = c(0, 10, 1),
CD4 = c(0, 8, 1),
CD8a = c(0, 10, 1)),
image_title = list(
text = c("Non-diabetic",
"Recent onset T1D",
"Long duration T1D")
),
scale_bar = list(
length = 100,
label = expression("100 " ~ mu * "m")
))
```
The `colour_by` parameter defines the channel names by which to colour the
composite. Per channel, a colour scale is generated by setting `colour`. The
attributes of the image titles can be set via the parameter `image_title` and
attributes of the scale bar are set via the parameter `scale_bar`.
To see all available parameter options, refer to `?plotting-param`
## Visualizing cells
In the next step, we can visualize mean pixel intensities and cell-specific metadata
directly on the segmentation masks.
For this, we will use the `cytomapper::plotCells` function while combining the
`SingleCellExperiment` object and the `CytoImageList` storing the segmentation
masks:
```{r plotCells-1}
plotCells(mask = masks, object = sce,
cell_id = "CellNumber", img_id = "ImageName",
colour_by = c("PIN", "CD8a", "CD4"),
exprs_values = "exprs")
```
The `assay(sce, "exprs")` slot stores the asinh-transformed mean pixel intensities
per channel and cell. The `cell_id` entry needs to be stored in the `colData(sce)`
slot to link each individual cell to their corresponding pixels in the segmentation mask.
The `img_id` parameter specifies the `colData(sce)` entry storing the name/ID
of the image to which each cell belongs. This should also be an entry in
`mcols(masks)`.
The `SingleCellExperiment` object is subsettable to highlight only certain cells:
```{r plotCells-2}
cur_sce <- sce[,sce$CellType %in%
c("beta", "alpha", "delta", "Tc", "Th")]
plotCells(mask = masks, object = cur_sce,
cell_id = "CellNumber", img_id = "ImageName",
colour_by = c("PIN", "CD8a", "CD4"),
exprs_values = "exprs",
missing_colour = "white")
```
Here, `missing_colour` defines the colour of cells missing in the `SingleCellExperiment` object.
This can also be useful when visualizing cell-specific metadata:
```{r plotCells-3}
plotCells(
mask = masks,
object = cur_sce,
cell_id = "CellNumber",
img_id = "ImageName",
colour_by = "CellType",
image_title = list(
text = c("Non-diabetic",
"Recent onset T1D",
"Long duration T1D"),
colour = "black"),
scale_bar = list(
length = 100,
label = expression("100 " ~ mu * "m"),
colour = "black"),
missing_colour = "white",
background_colour = "gray")
```
## Visualizing cells and pixel intensities
Finally, all three objects (multi-channel images, segmentation masks and
the single-cell data) can be combined to outline cells on composite images.
As a first demonstration, we select only pancreatic beta cells and visualize
their corresponing marker pro-insulin:
```{r outline-1}
cur_sce <- sce[,sce$CellType == "beta"]
plotPixels(images_norm,
mask = masks,
object = cur_sce,
cell_id = "CellNumber",
img_id = "ImageName",
colour_by = c("PIN", "H3"),
bcg = list(H3 = c(0, 2, 1)),
colour = list(PIN = c("black", "yellow"),
H3 = c("black", "blue"),
CellType = c(beta = "red")),
outline_by = "CellType")
```
One can now also visualize multiple cell types and multiple markers together:
```{r outlineCells-2, echo=FALSE}
cur_sce <- sce[,sce$CellType %in%
c("beta", "alpha", "delta", "Tc", "Th")]
plotPixels(image = images,
object = cur_sce,
mask = masks,
cell_id = "CellNumber",
img_id = "ImageName",
colour_by = c("PIN", "CD4", "CD8a"),
outline_by = "CellType",
colour = list(PIN = c("black", "yellow"),
CD4 = c("black", "blue"),
CD8a = c("black", "red")),
bcg = list(PIN = c(0, 10, 1),
CD4 = c(0, 8, 1),
CD8a = c(0, 10, 1)),
image_title = list(text = c("Non-diabetic",
"Recent onset T1D",
"Long duration T1D")),
scale_bar = list(length = 100,
label = expression("100 " ~ mu * "m")),
thick = TRUE)
```
# Interactive visualization
We have developed a `shiny` application to gate cells based on their mean
pixel intensity with additional visualization on the images.
The idea behind this gating strategy is to generate ground truth cell type labels,
which can be used to train a classifier and classify cells rather than
relying on clustering.
The shiny app can be opened using the `cytomapper::cytomapperShiny` function:
```{r cytomapperShiny}
if (interactive()) {
cytomapperShiny(sce, mask = masks, image = images,
img_id = "ImageName", cell_id = "CellNumber")
}
```
**Using the Shiny application**
The help page provides a recommended workflow on how to most efficiently use the
app. The workflow is solely a recommendation - the app provides full flexibility
to change settings during each step. To see the full documentation, please refer
to the help page found at `?cytomapperShiny`
1. Select the number of plots
The slider under "General controls" can be used to
specify the number of plots on which to perform gating. Up to two markers can be
visualized per plot.
2. Select the sample
The assay dropdown selection under "General controls" allows the user to specify
on which assay entry to perform gating. In most cases, a log- or
arcsinh-transformation can help to distinguish between 'positive' and 'negative'
populations.
3. Select the markers
For each plot, up to two markers can be specified. If selecting a single marker,
please specify this marker in the first of the two dropdown menus. A violin plot
is used to visualize the expression of a single marker while a scatter plot is
used to visualize the expression of two markers.
4. Gate cells
When selecting cells in one plot, only those cells are visualized on the
following plot. Once markers, the assay or the number of plots are changed,
gates are cleared.
5. Observe the selected cells
After gating, the selected cells are visualized on the corresponding images by
switching to the "Images tab". By default, the first marker is selected. The
user can change the displayed marker or press reset marker to switch to the
markers used for gating. If a multi-channel image object is provided, the
contrast of the image can be changed. The right panel visualizes the selected
cells either by filling in the segmentation masks or by outlining the cells on
the images.
6. Change samples
Samples can now be iteratively changed using the dropdown menu under General
controls . The gates will remain on the plots and can be adjusted for each
sample.
7. Save the selected cells
Finally, the selected cells can be saved by clicking the download button next to
the '?' symbol. The selected cells will be stored as a `SingleCellExperiment`
object in .rds format. Per selection, the user can provide a Cell label that
will be stored in the `colData` under the `cytomapper_CellLabel` entry of the
downloaded object.
# Further resources
For pre-processing multiplexed imaging data, please refer to the [ImcSegmentationPipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline) and/or the
[steinbock](https://github.com/BodenmillerGroup/steinbock) package.
To test the `cytomapper` package on different datasets, check out the [imcdatasets](https://bioconductor.org/packages/release/data/experiment/html/imcdatasets.html) Bioconductor package.
The [imcRtools](https://github.com/BodenmillerGroup/imcRtools) package is currently being developed
to facilitate handling of multiplexed imaging data and spatial analysis.
# Session info {.unnumbered}
```{r sessionInfo, echo=FALSE}
sessionInfo()
```