commit 3e99102
Showing 195 changed files with 22,497 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,185 @@ | ||
# Introduction {#intro} | ||
|
||
Highly multiplexed imaging (HMI) enables the simultaneous detection of dozens of | ||
biological molecules (e.g., proteins, transcripts; also referred to as | ||
“markers”) in tissues. Recently established multiplexed tissue imaging | ||
technologies rely on cyclic staining with fluorescently-tagged antibodies | ||
[@Lin2018; @Gut2018], or the use of oligonucleotide-tagged [@Goltsev2018; | ||
@Saka2019] or metal-tagged [@Giesen2014; @Angelo2014] antibodies, among others. | ||
The key strength of these technologies is that they allow in-depth analysis of | ||
single cells within their spatial tissue context. As a result, these methods | ||
have enabled analysis of the spatial architecture of the tumor microenvironment | ||
[@Lin2018; @Jackson2020; @Ali2020; @Schurch2020], determination of nucleic acid | ||
and protein abundances for assessment of spatial co-localization of cell types | ||
and chemokines [@Hoch2022] and spatial niches of virus-infected cells [@Jiang2022], | ||
and characterization of pathological features during COVID-19 infection | ||
[@Rendeiro2021; @Mitamura2021], Type 1 diabetes progression [@Damond2019] and | ||
autoimmune disease [@Ferrian2021]. | ||
|
||
Imaging mass cytometry (IMC) utilizes metal-tagged antibodies to detect over 40 | ||
proteins and other metal-tagged molecules in biological samples. IMC can be used | ||
to perform highly multiplexed imaging and is particularly suited to profiling | ||
selected areas of tissues across many samples. | ||
|
||
![IMC_workflow](img/IMC_workflow.png) | ||
*Overview of imaging mass cytometry data acquisition. Taken from [@Giesen2014]* | ||
|
||
IMC was first published in 2014 [@Giesen2014] and has been commercialized by | ||
Standard BioTools<sup><font size="1">TM</font></sup>, which distributes it as the Hyperion Imaging | ||
System<sup><font size="1">TM</font></sup> (documentation is available | ||
[here](https://www.fluidigm.com/products-services/instruments/hyperion)). | ||
Similar to other HMI technologies such as MIBI [@Angelo2014], CyCIF [@Lin2018], | ||
4i [@Gut2018], CODEX [@Goltsev2018] and SABER [@Saka2019], IMC captures the spatial | ||
expression of multiple proteins in parallel. With a nominal 1 μm resolution, | ||
IMC is able to detect cytoplasmic and nuclear localization of proteins. The | ||
current ablation frequency of IMC is 200 Hz, meaning that a 1 mm$^2$ area | ||
can be imaged within about 2 hours. | ||
|
||
## Technical details of IMC | ||
|
||
Technical aspects of how data acquisition works can be found in the original | ||
publication [@Giesen2014]. Briefly, antibodies to detect targets in biological | ||
material are labeled with heavy metals (e.g., lanthanides) that do not occur in | ||
biological systems and thus can be used upon binding to their target as a | ||
readout similar to fluorophores in fluorescence microscopy. Thin sections of the | ||
biological sample on a glass slide are stained with an antibody cocktail. | ||
Stained microscopy slides are mounted on a precise motor-driven stage inside the | ||
ablation chamber of the IMC instrument. A high-energy UV laser is focused on the | ||
tissue, and each individual laser shot ablates tissue from an area of roughly 1 | ||
μm$^2$. The energy of the laser is absorbed by the tissue resulting | ||
in vaporization followed by condensation of the ablated material. The ablated | ||
material from each laser shot is transported in the gas phase into the plasma of | ||
the mass cytometer, where first atomization of the particles and then ionization | ||
of the atoms occurs. The ion cloud is then transferred into a vacuum, and all | ||
ions with a mass-to-charge ratio (m/z) below 80 are filtered out using a quadrupole mass filter. The | ||
remaining ions (mostly those used to tag antibodies) are analyzed in a | ||
time-of-flight mass spectrometer to ultimately obtain an accumulated mass | ||
spectrum from all ions that correspond to a single laser shot. One can regard | ||
this spectrum as the information underlying a 1 μm$^2$ pixel. With | ||
repetitive laser shots (e.g., at 200 Hz) and a simultaneous lateral sample | ||
movement, a tissue can be ablated pixel by pixel. Ultimately, an image is | ||
reconstructed from the mass spectra of all pixels. | ||
|
||
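The pixel-by-pixel reconstruction described above can be sketched in a few lines of Python. This is only an illustration, not the instrument software: it assumes one laser shot per pixel, delivered in plain row-major order, and the function name `reconstruct_image` and the example counts are hypothetical.

```python
def reconstruct_image(shot_counts, width, height):
    """Arrange per-shot ion counts of one channel into a 2D image.

    Assumption: one laser shot per pixel, delivered in row-major order
    (left to right, top to bottom).
    """
    if len(shot_counts) != width * height:
        raise ValueError("number of shots must equal width * height")
    # Slice the flat shot sequence into consecutive rows of `width` pixels.
    return [shot_counts[row * width:(row + 1) * width] for row in range(height)]


# Six shots for a single metal channel, acquired over a 3 x 2 pixel region.
counts = [0, 5, 2, 7, 1, 0]
image = reconstruct_image(counts, width=3, height=2)
print(image)
```

Repeating this for every measured isotope would give one image plane per channel, which together form the multi-channel image.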
In principle, IMC can be applied to the same type of samples as conventional | ||
fluorescence microscopy. The largest distinction from fluorescence microscopy is | ||
that for IMC, primary-labeled antibodies are commonly used, whereas in | ||
fluorescence microscopy secondary antibodies carrying fluorophores are widely | ||
applied. Additionally, for IMC, samples are dried before acquisition and can be | ||
stored for years. Formalin-fixed and paraffin-embedded (FFPE) samples are widely | ||
used for IMC. The FFPE blocks are cut into 2-5 μm thick sections, which are | ||
stained, dried, and analyzed with IMC. | ||
|
||
### Metal-conjugated antibodies and staining | ||
|
||
Metal-labeled antibodies are used to stain molecules in tissues, enabling the | ||
delineation of tissue structures, cells, and subcellular structures. Metal-conjugated | ||
antibodies can either be purchased directly from Standard BioTools<sup><font size="1">TM</font></sup> ([MaxPar IMC Antibodies](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MaxparAntibodies?cclcl=en_US)), | ||
or antibodies can be purchased and labeled individually ([MaxPar Antibody | ||
Labeling](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MaxparAntibodyLabelingKits?cclcl=en_US)). | ||
Antibody labeling using the MaxPar kits is performed via TCEP antibody reduction | ||
followed by crosslinking with sulfhydryl-reactive maleimide-bearing metal | ||
polymers. For each antibody, it is essential to validate its functionality and | ||
specificity and to optimize its usage to provide an optimal signal-to-noise ratio. To | ||
facilitate antibody handling, a database is highly useful. | ||
[Airlab](https://github.com/BodenmillerGroup/airlab-web) is such a platform; it | ||
allows antibody lot tracking, validation data uploads, and panel generation for | ||
subsequent upload to the IMC acquisition software from Standard BioTools<sup><font size="1">TM</font></sup>. | ||
|
||
Depending on the sample type, different staining protocols can be used. | ||
Generally, once antibodies of choice have been conjugated to a metal tag, | ||
titration experiments are performed to identify the optimal staining | ||
concentration. For FFPE samples, different staining protocols have been | ||
described, and different antibodies show variable staining with different | ||
protocols. Protocols such as the one provided by Standard BioTools<sup><font size="1">TM</font></sup> or the one described by | ||
[@Ijsselsteijn2019] are recommended. Briefly, for FFPE tissues, a dewaxing | ||
step is performed to remove the paraffin used to embed the material, followed by | ||
a graded re-hydration of the samples. Thereafter, heat-induced epitope retrieval | ||
(HIER), a step aiming at the reversal of formalin-based fixation, is used to | ||
unmask epitopes within tissues and make them accessible to antibodies. Epitope | ||
unmasking is generally performed in either basic, EDTA-based buffers (pH 9.2) or | ||
acidic, citrate-based buffers (pH 6). Next, a buffer containing bovine serum | ||
albumin (BSA) is used to block non-specific binding. This buffer is also used to | ||
dilute antibody stocks for the actual antibody staining. Staining time and | ||
temperature may vary and optimization must be performed to ensure that each | ||
single antibody performs well. However, overnight staining at 4°C or staining for 3-5 | ||
hours at room temperature seems to be suitable in many cases. | ||
|
||
Following antibody incubation, unbound antibodies are washed away and a | ||
counterstain comparable to DAPI is applied to enable the identification of | ||
nuclei. The [Iridium intercalator](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MassCytometryReagents/Cell-ID%E2%84%A2%20Intercalator-Ir%E2%80%94125%20%C2%B5M) | ||
from Standard BioTools<sup><font size="1">TM</font></sup> is a reagent of choice and is applied in a brief 5-minute staining step. | ||
Finally, the samples are washed again and then dried under an airflow. Once | ||
dried, the samples are ready for analysis using IMC and are | ||
usually stable for a long period of time (at least one year). | ||
|
||
### Data acquisition | ||
|
||
Data is acquired using the CyTOF software from Standard BioTools<sup><font size="1">TM</font></sup> (see manuals | ||
[here](https://go.fluidigm.com/hyperion-support-documents)). | ||
|
||
The regions of interest are selected by providing coordinates for ablation. To | ||
determine the region to be imaged, so-called "panoramas" can be generated. These | ||
are stitched images of single fields of view of about 200 μm in diameter. | ||
Panoramas provide an optical overview of the tissue with a resolution similar to | ||
10x in microscopy and are intended to help with the selection of regions of | ||
interest for ablation. The tissue should be centered on the glass slide, since | ||
the imaging mass cytometer cannot access roughly 5 mm from each of the slide | ||
edges. Currently, the instruments can process one slide at a time and usually one MCD | ||
file per sample slide is generated. | ||
|
||
Many regions of interest can be defined on a single slide and acquisition | ||
parameters such as channels to acquire, acquisition speed (100 Hz or 200 Hz), | ||
ablation energy, and other parameters are user-defined. It is recommended that | ||
all isotope channels are recorded. This will result in larger raw data files but valuable information such as | ||
potential contamination of the argon gas (e.g., Xenon) or of the samples (e.g., | ||
lead, barium) is stored. | ||
|
||
To process a large number of slides or to select regions on whole-slide samples, | ||
panoramas may not provide sufficient information. If this is the case, | ||
multi-color immunofluorescence of the same slide prior to staining with | ||
metal-labeled antibodies may be performed. To allow for region selection based | ||
on immunofluorescence images and to align those images with a panorama of the | ||
same or consecutive sections of the sample, we developed | ||
[napping](https://github.com/BodenmillerGroup/napping). | ||
|
||
Acquisition time is directly proportional to the total ablated area, and run | ||
times for samples of large area or for large sample numbers can roughly be calculated by | ||
dividing the ablation area in square micrometers by the ablation frequency (e.g., | ||
200 Hz), which gives the run time in seconds because each laser shot ablates roughly 1 μm$^2$. In addition to the proprietary MCD file format, TXT files can also | ||
be generated for each region of interest. This is recommended as a back-up | ||
option in case of errors that may corrupt MCD files but not TXT files. | ||
|
||
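The run-time estimate described above can be written as a small helper. This is a rough sketch under stated assumptions: each laser shot ablates about 1 μm$^2$ (as described in the technical details), and the result covers pure ablation time only, without stage movement, panorama generation, or other overhead. The function name `estimate_acquisition_hours` is hypothetical.

```python
def estimate_acquisition_hours(area_um2, ablation_hz=200.0, pixel_area_um2=1.0):
    """Estimate pure ablation time in hours for a given total area.

    Assumption: one laser shot ablates `pixel_area_um2` square micrometers,
    so the number of shots equals area / pixel area.
    """
    if ablation_hz <= 0 or pixel_area_um2 <= 0:
        raise ValueError("ablation_hz and pixel_area_um2 must be positive")
    n_shots = area_um2 / pixel_area_um2
    seconds = n_shots / ablation_hz
    return seconds / 3600.0


one_mm2 = 1000 * 1000  # 1 mm^2 expressed in square micrometers
hours = estimate_acquisition_hours(one_mm2)
print(f"{hours:.2f} h")  # 1,000,000 shots / 200 Hz = 5,000 s
```

At 200 Hz, 1 mm$^2$ corresponds to 5,000 seconds (about 1.4 hours) of ablation, so this figure is best read as a lower bound when planning instrument time.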
## IMC data format {#data-format} | ||
|
||
Upon completion of the acquisition, an MCD file of variable size is generated. A | ||
single MCD file can hold raw acquisition data for multiple regions of interest, | ||
optical images providing a slide level overview of the sample ("panoramas"), and | ||
detailed metadata about the experiment. Additionally, for each acquisition a | ||
TXT file is generated which holds the same pixel information as the matched | ||
acquisition in the MCD file. | ||
|
||
The Hyperion Imaging System<sup><font size="1">TM</font></sup> produces files in the following folder structure: | ||
|
||
``` | ||
. | ||
+-- {XYZ}_ROI_001_1.txt | ||
+-- {XYZ}_ROI_002_2.txt | ||
+-- {XYZ}_ROI_003_3.txt | ||
+-- {XYZ}.mcd | ||
``` | ||
|
||
Here, `{XYZ}` defines the filename, `ROI_001`, `ROI_002`, `ROI_003` are | ||
user-defined names (descriptions) for the selected regions of interest (ROI), | ||
and `1`, `2`, `3` indicate the unique acquisition identifiers. The ROI | ||
description entry can be specified in the Standard BioTools software when | ||
selecting ROIs. The MCD file contains the raw imaging data and the full metadata | ||
of all acquired ROIs, while each TXT file contains data of a single ROI without | ||
metadata. To follow a consistent naming scheme and to bundle all metadata, we | ||
recommend zipping the folder. Each ZIP file should only contain data from a | ||
single MCD file, and the name of the ZIP file should match the name of the MCD | ||
file. | ||
|
||
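As an illustration of the naming scheme above, the following sketch splits a TXT file name into its ROI description and acquisition identifier and derives the matching ZIP file name. It uses only the Python standard library; the file names and the helper `parse_txt_name` are hypothetical. Because both `{XYZ}` and the ROI description may contain underscores, the MCD file name is used as a known prefix rather than guessing the split point.

```python
from pathlib import Path


def parse_txt_name(txt_name, mcd_stem):
    """Split '{XYZ}_{description}_{id}.txt' into (description, id).

    `mcd_stem` is the MCD file name without extension, i.e. '{XYZ}'.
    """
    stem = Path(txt_name).stem
    prefix = mcd_stem + "_"
    if not stem.startswith(prefix):
        raise ValueError(f"{txt_name} does not belong to {mcd_stem}.mcd")
    # The acquisition identifier is the part after the last underscore.
    description, _, acq_id = stem[len(prefix):].rpartition("_")
    if not description or not acq_id.isdigit():
        raise ValueError(f"unexpected TXT file name: {txt_name}")
    return description, int(acq_id)


# Hypothetical folder content following the structure shown above.
files = ["sample1_ROI_001_1.txt", "sample1_ROI_002_2.txt", "sample1.mcd"]
mcd_stem = next(Path(f).stem for f in files if f.endswith(".mcd"))
rois = {f: parse_txt_name(f, mcd_stem) for f in files if f.endswith(".txt")}
zip_name = mcd_stem + ".zip"  # ZIP name matches the MCD file name
print(rois, zip_name)
```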
We refer to these data as raw data; their further | ||
processing is described in Section \@ref(processing). | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
# Multi-channel image processing {#processing} | ||
|
||
This book focuses on common analysis steps of spatially-resolved single-cell data | ||
**after** image segmentation and feature extraction. In this chapter, the sections | ||
describe the processing of multiplexed imaging data, including file type | ||
conversion, image segmentation, feature extraction and data export. To obtain | ||
more detailed information on the individual image processing approaches, please | ||
visit their repositories: | ||
|
||
[steinbock](https://github.com/BodenmillerGroup/steinbock): The `steinbock` | ||
toolkit offers tools for multi-channel image processing using the command-line | ||
or Python code [@Windhager2021]. Supported tasks include IMC data pre-processing, | ||
multi-channel image segmentation, object quantification and data | ||
export to a variety of file formats. It offers functionality similar to that | ||
of the IMC Segmentation Pipeline (see below) and further allows deep-learning enabled image | ||
segmentation. The toolkit is available as a platform-independent Docker | ||
container, ensuring reproducibility and user-friendly installation. Read more in | ||
the [Docs](https://bodenmillergroup.github.io/steinbock/latest/). | ||
|
||
[IMC Segmentation | ||
Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline): The IMC | ||
segmentation pipeline offers a rather manual way of segmenting multi-channel | ||
images using a pixel classification-based approach. We continue to maintain the | ||
pipeline but recommend the use of the `steinbock` toolkit for multi-channel | ||
image processing. Raw IMC data pre-processing is performed using the | ||
[readimc](https://github.com/BodenmillerGroup/readimc) Python package to convert | ||
raw MCD files into OME-TIFF and TIFF files. After image cropping, an | ||
[Ilastik](https://www.ilastik.org/) pixel classifier is trained for image | ||
classification prior to image segmentation using | ||
[CellProfiler](https://cellprofiler.org/). Features (i.e., mean pixel intensity) | ||
of segmented objects (i.e., cells) are quantified and exported. Read more in the | ||
[Docs](https://bodenmillergroup.github.io/ImcSegmentationPipeline/). | ||
|
||
## Image pre-processing (IMC specific) | ||
|
||
Image pre-processing is technology dependent. While most multiplexed imaging | ||
technologies generate TIFF or OME-TIFF files, which can be directly segmented | ||
using the `steinbock` toolkit, IMC produces data in the proprietary | ||
MCD data format. | ||
|
||
To facilitate IMC data pre-processing, the | ||
[readimc](https://github.com/BodenmillerGroup/readimc) open-source Python | ||
package allows extracting the multi-modal (IMC acquisitions, panoramas), | ||
multi-region, multi-channel information contained in raw IMC images. Both the | ||
IMC Segmentation Pipeline and the `steinbock` toolkit use the `readimc` | ||
package for IMC data pre-processing. Starting from IMC raw data and a "panel" | ||
file, individual acquisitions are extracted as TIFF files and, when using the | ||
IMC Segmentation Pipeline, also as OME-TIFF files. The panel contains information on the | ||
antibodies used in the experiment, and the user can specify which channels to | ||
keep for downstream analysis. When using the IMC Segmentation Pipeline, random | ||
tiles are cropped from images for convenience of pixel labelling. | ||
|
||
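The channel selection via the panel file can be pictured as follows. This is a sketch only: it assumes the panel is a CSV file with the columns `channel`, `name` and `keep` (with `1` marking channels to keep), and the panel rows are made-up examples rather than a real antibody panel. The exact panel layout expected by each tool is described in its documentation.

```python
import csv
import io

# Hypothetical panel content; column names and rows are assumptions.
PANEL_CSV = """channel,name,keep
Ir191,DNA1,1
Ir193,DNA2,1
Xe131,Xenon,0
Yb173,CD45,1
"""


def channels_to_keep(panel_text):
    """Return (channel, name) pairs for all panel rows flagged with keep == 1."""
    reader = csv.DictReader(io.StringIO(panel_text))
    return [
        (row["channel"], row["name"])
        for row in reader
        if row["keep"].strip() == "1"
    ]


kept = channels_to_keep(PANEL_CSV)
print(kept)
```

Only the kept channels would then be written to the extracted image files used for segmentation and downstream analysis.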
## Image segmentation | ||
|
||
The IMC Segmentation Pipeline supports pixel classification-based image | ||
segmentation while `steinbock` supports pixel classification-based and deep | ||
learning-based segmentation. | ||
|
||
**Pixel classification-based** image segmentation is performed by training a | ||
random forest classifier using [Ilastik](https://www.ilastik.org/) on the | ||
randomly extracted image crops and selected image channels. Pixels are | ||
classified as nuclear, cytoplasmic, or background. Employing a customizable | ||
[CellProfiler](https://cellprofiler.org/) pipeline, the probabilities are then | ||
thresholded for segmenting nuclei, and nuclei are expanded into cytoplasmic | ||
regions to obtain cell masks. | ||
|
||
**Deep learning-based** image segmentation is performed as presented by | ||
[@Greenwald2021]. Briefly, `steinbock` first aggregates user-defined | ||
image channels to generate two-channel images representing nuclear and | ||
cytoplasmic signals. Next, the | ||
[DeepCell](https://github.com/vanvalenlab/intro-to-deepcell) Python package is | ||
used to run `Mesmer`, a deep learning-enabled segmentation algorithm pre-trained | ||
on `TissueNet`, to automatically obtain cell masks without any further user | ||
input. | ||
|
||
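The channel aggregation step that precedes deep learning-based segmentation can be illustrated with a small, dependency-free sketch. It assumes that the channels of each group are combined by a pixel-wise mean; the actual aggregation and normalization options are defined by `steinbock` and are not reproduced here. The function name `aggregate_channels` and the example values are hypothetical.

```python
def aggregate_channels(image, channel_groups):
    """Combine channels into one plane per group by pixel-wise mean.

    image: dict mapping channel name -> 2D list of intensities
    channel_groups: dict mapping group name -> list of channel names
    """
    aggregated = {}
    for group, names in channel_groups.items():
        planes = [image[name] for name in names]
        height, width = len(planes[0]), len(planes[0][0])
        aggregated[group] = [
            [sum(p[y][x] for p in planes) / len(planes) for x in range(width)]
            for y in range(height)
        ]
    return aggregated


# Toy 2 x 2 image with two nuclear channels and one cytoplasmic channel.
image = {
    "DNA1": [[4.0, 0.0], [2.0, 0.0]],
    "DNA2": [[6.0, 0.0], [4.0, 0.0]],
    "CD45": [[1.0, 3.0], [0.0, 5.0]],
}
groups = {"nuclear": ["DNA1", "DNA2"], "cytoplasmic": ["CD45"]}
two_channel = aggregate_channels(image, groups)
print(two_channel)
```

The resulting two planes correspond to the nuclear and cytoplasmic inputs that a segmentation model such as `Mesmer` expects.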
Segmentation masks are single-channel images that match the input images in | ||
size, with non-zero grayscale values indicating the IDs of segmented objects | ||
(e.g., cells). These masks are written out as TIFF files after segmentation. | ||
|
||
## Feature extraction {#feature-extraction} | ||
|
||
Using the segmentation masks together with their corresponding multi-channel | ||
images, the IMC Segmentation Pipeline as well as the `steinbock` toolkit extract | ||
object-specific features. These include the mean pixel intensity per object and | ||
channel, morphological features (e.g., object area) and the objects' locations. | ||
Object-specific features are written out as CSV files where rows represent | ||
individual objects and columns represent features. | ||
|
||
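To make the relationship between masks, images and features concrete, here is a minimal sketch of per-object feature extraction in plain Python. It is not the implementation used by either tool; the function name `extract_features`, the feature names and the toy data are assumptions made for illustration.

```python
def extract_features(mask, channels):
    """Compute area, centroid and mean intensity per object.

    mask: 2D list of ints, 0 = background, non-zero = object ID
    channels: dict mapping channel name -> 2D list matching the mask size
    """
    # Collect the pixel coordinates belonging to each object ID.
    pixels = {}
    for y, row in enumerate(mask):
        for x, obj in enumerate(row):
            if obj != 0:
                pixels.setdefault(obj, []).append((y, x))

    features = {}
    for obj, coords in sorted(pixels.items()):
        area = len(coords)
        entry = {
            "area": area,
            "centroid_y": sum(y for y, _ in coords) / area,
            "centroid_x": sum(x for _, x in coords) / area,
        }
        for name, plane in channels.items():
            entry[f"mean_{name}"] = sum(plane[y][x] for y, x in coords) / area
        features[obj] = entry
    return features


mask = [
    [1, 1, 0],
    [0, 2, 2],
    [0, 2, 2],
]
channels = {"DNA1": [[2.0, 4.0, 0.0], [0.0, 1.0, 3.0], [0.0, 5.0, 7.0]]}
features = extract_features(mask, channels)
print(features)
```

Each entry of `features` corresponds to one row of the object-level table, with one column per feature.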
Furthermore, the IMC Segmentation Pipeline and the `steinbock` toolkit compute | ||
_spatial object graphs_, in which nodes correspond to objects, and nodes in | ||
spatial proximity are connected by an edge. These graphs serve as a proxy for | ||
interactions between neighboring cells. They are stored as edge lists in the form of | ||
one CSV file per image. | ||
|
||
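A spatial object graph stored as an edge list can be sketched as follows. The text does not state how "spatial proximity" is defined, so this example assumes one simple choice: two objects are connected if their centroids lie within a maximum Euclidean distance. The function name `build_edge_list` and the coordinates are hypothetical.

```python
import math
from itertools import combinations


def build_edge_list(centroids, max_distance):
    """Connect all object pairs whose centroids are within `max_distance`.

    centroids: dict mapping object ID -> (y, x) coordinates
    Returns a list of (id_a, id_b) tuples with id_a < id_b.
    """
    edges = []
    for (a, pos_a), (b, pos_b) in combinations(sorted(centroids.items()), 2):
        if math.dist(pos_a, pos_b) <= max_distance:
            edges.append((a, b))
    return edges


# Three objects: 1 and 2 are close together, 3 is far away.
centroids = {1: (0.0, 0.0), 2: (0.0, 3.0), 3: (10.0, 10.0)}
edges = build_edge_list(centroids, max_distance=5.0)
print(edges)
```

Writing one such edge per row yields the kind of per-image CSV edge list described above.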
Both approaches also write out image-specific metadata (e.g., width and height) | ||
as a CSV file. | ||
|
||
## Data export | ||
|
||
To further facilitate compatibility with downstream analysis, `steinbock` | ||
exports data to a variety of file formats such as OME-TIFF for images, FCS for | ||
single-cell data, the _anndata_ format [@Virshup2021] for data analysis in Python, | ||
and various graph file formats for network analysis using software such as | ||
[Cytoscape](https://cytoscape.org/) [@Shannon2003]. For export to OME-TIFF, | ||
`steinbock` uses [xtiff](https://github.com/BodenmillerGroup/xtiff), a Python | ||
package developed for writing multi-channel TIFF stacks. | ||
|
||
## Data import into R | ||
|
||
In Section \@ref(read-data), we will highlight the use of the | ||
[imcRtools](https://github.com/BodenmillerGroup/imcRtools) and | ||
[cytomapper](https://github.com/BodenmillerGroup/cytomapper) R/Bioconductor | ||
packages to read spatially resolved single-cell data and images as generated by the | ||
IMC Segmentation Pipeline and the `steinbock` toolkit into the statistical | ||
programming language R. All further downstream analyses are performed in R and | ||
detailed in the following sections. | ||
|
||
|
||
|
||
|
||
|
||
|