diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 00000000..e69de29b diff --git a/01-intro.md b/01-intro.md new file mode 100644 index 00000000..3ba53dfe --- /dev/null +++ b/01-intro.md @@ -0,0 +1,185 @@ +# Introduction {#intro} + +Highly multiplexed imaging (HMI) enables the simultaneous detection of dozens of +biological molecules (e.g., proteins, transcripts; also referred to as +“markers”) in tissues. Recently established multiplexed tissue imaging +technologies rely on cyclic staining with fluorescently-tagged antibodies +[@Lin2018; @Gut2018], or the use of oligonucleotide-tagged [@Goltsev2018; +@Saka2019] or metal-tagged [@Giesen2014; @Angelo2014] antibodies, among others. +The key strength of these technologies is that they allow in-depth analysis of +single cells within their spatial tissue context. As a result, these methods +have enabled analysis of the spatial architecture of the tumor microenvironment +[@Lin2018; @Jackson2020; @Ali2020; @Schurch2020], determination of nucleic acid +and protein abundances for assessment of spatial co-localization of cell types +and chemokines [@Hoch2022] and spatial niches of virus infected cells [@Jiang2022], +and characterization of pathological features during COVID-19 infection +[@Rendeiro2021; @Mitamura2021], Type 1 diabetes progression [@Damond2019] and +autoimmune disease [@Ferrian2021]. + +Imaging mass cytometry (IMC) utilizes metal-tagged antibodies to detect over 40 +proteins and other metal-tagged molecules in biological samples. IMC can be used +to perform highly multiplexed imaging and is particularly suited to profiling +selected areas of tissues across many samples. + +![IMC_workflow](img/IMC_workflow.png) +*Overview of imaging mass cytometry data acquisition. 
Taken from [@Giesen2014]* + +IMC was first published in 2014 [@Giesen2014] and has been commercialized by +Standard BioTools™ as the Hyperion Imaging +System™ (documentation is available +[here](https://www.fluidigm.com/products-services/instruments/hyperion)). +Similar to other HMI technologies such as MIBI [@Angelo2014], CyCIF [@Lin2018], +4i [@Gut2018], CODEX [@Goltsev2018] and SABER [@Saka2019], IMC captures the spatial +expression of multiple proteins in parallel. With a nominal 1 μm resolution, +IMC is able to detect cytoplasmic and nuclear localization of proteins. The +current ablation frequency of IMC is 200 Hz, meaning that a 1 mm$^2$ area +can be imaged within about 2 hours. + +## Technical details of IMC + +Technical aspects of how data acquisition works can be found in the original +publication [@Giesen2014]. Briefly, antibodies to detect targets in biological +material are labeled with heavy metals (e.g., lanthanides) that do not occur in +biological systems and thus can be used upon binding to their target as a +readout similar to fluorophores in fluorescence microscopy. Thin sections of the +biological sample on a glass slide are stained with an antibody cocktail. +Stained microscopy slides are mounted on a precise motor-driven stage inside the +ablation chamber of the IMC instrument. A high-energy UV laser is focused on the +tissue, and each individual laser shot ablates tissue from an area of roughly 1 +μm$^2$. The energy of the laser is absorbed by the tissue, resulting +in vaporization followed by condensation of the ablated material. The ablated +material from each laser shot is transported in the gas phase into the plasma of +the mass cytometer, where first atomization of the particles and then ionization +of the atoms occurs. The ion cloud is then transferred into a vacuum, and all +ions below a mass of 80 m/z are filtered using a quadrupole mass filter.
The +remaining ions (mostly those used to tag antibodies) are analyzed in a +time-of-flight mass spectrometer to ultimately obtain an accumulated mass +spectrum from all ions that correspond to a single laser shot. One can regard +this spectrum as the information underlying a 1 μm$^2$ pixel. With +repetitive laser shots (e.g., at 200 Hz) and a simultaneous lateral sample +movement, a tissue can be ablated pixel by pixel. Ultimately, an image is +reconstructed from the per-pixel mass spectra. + +In principle, IMC can be applied to the same type of samples as conventional +fluorescence microscopy. The largest distinction from fluorescence microscopy is +that for IMC, primary-labeled antibodies are commonly used, whereas in +fluorescence microscopy secondary antibodies carrying fluorophores are widely +applied. Additionally, for IMC, samples are dried before acquisition and can be +stored for years. Formalin-fixed and paraffin-embedded (FFPE) samples are widely +used for IMC. The FFPE blocks are cut to 2-5 μm thick sections and are +stained, dried, and analyzed with IMC. + +### Metal-conjugated antibodies and staining + +Metal-labeled antibodies are used to stain molecules in tissues, enabling the +delineation of tissue structures, cells, and subcellular structures. Metal-conjugated +antibodies can either be purchased directly from Standard BioTools™ ([MaxPar IMC Antibodies](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MaxparAntibodies?cclcl=en_US)), +or antibodies can be purchased and labeled individually ([MaxPar Antibody +Labeling](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MaxparAntibodyLabelingKits?cclcl=en_US)). +Antibody labeling using the MaxPar kits is performed via TCEP antibody reduction +followed by crosslinking with sulfhydryl-reactive maleimide-bearing metal +polymers. For each antibody it is essential to validate its functionality and +specificity and to optimize its usage to provide an optimal signal-to-noise ratio.
To +facilitate antibody handling, a database is highly useful. +[Airlab](https://github.com/BodenmillerGroup/airlab-web) is such a platform; it +allows antibody lot tracking, validation data uploads, and panel generation for +subsequent upload to the IMC acquisition software from Standard BioTools™. + +Depending on the sample type, different staining protocols can be used. +Generally, once antibodies of choice have been conjugated to a metal tag, +titration experiments are performed to identify the optimal staining +concentration. For FFPE samples, different staining protocols have been +described, and different antibodies show variable staining with different +protocols. Protocols such as the one provided by Standard BioTools™ or the one described by +[@Ijsselsteijn2019] are recommended. Briefly, for FFPE tissues, a dewaxing +step is performed to remove the paraffin used to embed the material, followed by +a graded re-hydration of the samples. Thereafter, heat-induced epitope retrieval +(HIER), a step aiming at the reversal of formalin-based fixation, is used to +unmask epitopes within tissues and make them accessible to antibodies. Epitope +unmasking is generally performed in either basic, EDTA-based buffers (pH 9.2) or +acidic, citrate-based buffers (pH 6). Next, a buffer containing bovine serum +albumin (BSA) is used to block non-specific binding. This buffer is also used to +dilute antibody stocks for the actual antibody staining. Staining time and +temperature may vary, and optimization must be performed to ensure that each +individual antibody performs well. However, overnight staining at 4°C or 3-5 +hours at room temperature seems to be suitable in many cases. + +Following antibody incubation, unbound antibodies are washed away and a +counterstain comparable to DAPI is applied to enable the identification of +nuclei.
The [Iridium intercalator](https://store.fluidigm.com/Cytometry/ConsumablesandReagentsCytometry/MassCytometryReagents/Cell-ID%E2%84%A2%20Intercalator-Ir%E2%80%94125%20%C2%B5M) +from Standard BioTools™ is a reagent of choice and is applied in a brief 5-minute staining step. +Finally, the samples are washed again and then dried under an airflow. Once +dried, the samples are ready for analysis using IMC and are +usually stable for a long period of time (at least one year). + +### Data acquisition + +Data is acquired using the CyTOF software from Standard BioTools™ (see manuals +[here](https://go.fluidigm.com/hyperion-support-documents)). + +The regions of interest are selected by providing coordinates for ablation. To +determine the region to be imaged, so-called "panoramas" can be generated. These +are stitched images of single fields of view of about 200 μm in diameter. +Panoramas provide an optical overview of the tissue with a resolution similar to +10x magnification in microscopy and are intended to help with the selection of regions of +interest for ablation. The tissue should be centered on the glass slide, since +the imaging mass cytometer cannot access roughly 5 mm from each of the slide +edges. Currently, the instruments can process one slide at a time and usually one MCD +file per sample slide is generated. + +Many regions of interest can be defined on a single slide and acquisition +parameters such as channels to acquire, acquisition speed (100 Hz or 200 Hz), +ablation energy, and other parameters are user-defined. It is recommended that +all isotope channels are recorded. This will result in larger raw data files, but valuable information such as +potential contamination of the argon gas (e.g., xenon) or of the samples (e.g., +lead, barium) is stored. + +To process a large number of slides or to select regions on whole-slide samples, +panoramas may not provide sufficient information.
If this is the case, +multi-color immunofluorescence of the same slide prior to staining with +metal-labeled antibodies may be performed. To allow for region selection based +on immunofluorescence images and to align those images with a panorama of the +same or consecutive sections of the sample, we developed +[napping](https://github.com/BodenmillerGroup/napping). + +Acquisition time is directly proportional to the total ablated area, and run +times for samples of large area or for large sample numbers can roughly be calculated by +dividing the ablation area in square micrometers (one pixel per 1 μm$^2$) by the ablation frequency (e.g., +200 Hz), which gives the run time in seconds. For example, a 1 mm$^2$ region corresponds to 10$^6$ pixels and thus to about 5,000 s (roughly 1.4 hours) of pure ablation time at 200 Hz. In addition to the proprietary MCD file format, TXT files can also +be generated for each region of interest. This is recommended as a back-up +option in case of errors that may corrupt MCD files but not TXT files. + +## IMC data format {#data-format} + +Upon completion of the acquisition, an MCD file of variable size is generated. A +single MCD file can hold raw acquisition data for multiple regions of interest, +optical images providing a slide-level overview of the sample ("panoramas"), and +detailed metadata about the experiment. Additionally, for each acquisition a +TXT file is generated which holds the same pixel information as the matched +acquisition in the MCD file. + +The Hyperion Imaging System™ produces files in the following folder structure: + +``` +. ++-- {XYZ}_ROI_001_1.txt ++-- {XYZ}_ROI_002_2.txt ++-- {XYZ}_ROI_003_3.txt ++-- {XYZ}.mcd +``` + +Here, `{XYZ}` defines the filename, `ROI_001`, `ROI_002`, `ROI_003` are +user-defined names (descriptions) for the selected regions of interest (ROI), +and `1`, `2`, `3` indicate the unique acquisition identifiers. The ROI +description entry can be specified in the Standard BioTools software when +selecting ROIs. The MCD file contains the raw imaging data and the full metadata +of all acquired ROIs, while each TXT file contains data of a single ROI without +metadata.
To follow a consistent naming scheme and to bundle all metadata, we +recommend zipping the folder. Each ZIP file should only contain data from a +single MCD file, and the name of the ZIP file should match the name of the MCD +file. + +We refer to this data as raw data and the further +processing of this data is described in Section \@ref(processing). + + diff --git a/02-processing.md b/02-processing.md new file mode 100644 index 00000000..b34aea3b --- /dev/null +++ b/02-processing.md @@ -0,0 +1,122 @@ +# Multi-channel image processing {#processing} + +This book focuses on common analysis steps of spatially-resolved single-cell data +**after** image segmentation and feature extraction. The sections of this chapter +describe the processing of multiplexed imaging data, including file type +conversion, image segmentation, feature extraction and data export. To obtain +more detailed information on the individual image processing approaches, please +visit their repositories: + +[steinbock](https://github.com/BodenmillerGroup/steinbock): The `steinbock` +toolkit offers tools for multi-channel image processing using the command line +or Python code [@Windhager2021]. Supported tasks include IMC data pre-processing, +multi-channel image segmentation, object quantification and data +export to a variety of file formats. It offers functionality similar to that +of the IMC Segmentation Pipeline (see below) and further allows deep learning-enabled image +segmentation. The toolkit is available as a platform-independent Docker +container, ensuring reproducibility and user-friendly installation. Read more in +the [Docs](https://bodenmillergroup.github.io/steinbock/latest/). + +[IMC Segmentation +Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline): The IMC +segmentation pipeline offers a rather manual way of segmenting multi-channel +images using a pixel classification-based approach.
We continue to maintain the +pipeline but recommend the use of the `steinbock` toolkit for multi-channel +image processing. Raw IMC data pre-processing is performed using the +[readimc](https://github.com/BodenmillerGroup/readimc) Python package to convert +raw MCD files into OME-TIFF and TIFF files. After image cropping, an +[Ilastik](https://www.ilastik.org/) pixel classifier is trained for pixel +classification prior to image segmentation using +[CellProfiler](https://cellprofiler.org/). Features (e.g., mean pixel intensity) +of segmented objects (e.g., cells) are quantified and exported. Read more in the +[Docs](https://bodenmillergroup.github.io/ImcSegmentationPipeline/). + +## Image pre-processing (IMC specific) + +Image pre-processing is technology dependent. While most multiplexed imaging +technologies generate TIFF or OME-TIFF files which can be directly segmented +using the `steinbock` toolkit, IMC produces data in the proprietary +data format MCD. + +To facilitate IMC data pre-processing, the +[readimc](https://github.com/BodenmillerGroup/readimc) open-source Python +package allows extracting the multi-modal (IMC acquisitions, panoramas), +multi-region, multi-channel information contained in raw IMC images. Both the +IMC Segmentation Pipeline and the `steinbock` toolkit use the `readimc` +package for IMC data pre-processing. Starting from IMC raw data and a "panel" +file, individual acquisitions are extracted as TIFF files and, when +using the IMC Segmentation Pipeline, also as OME-TIFF files. The panel contains information on the +antibodies used in the experiment and the user can specify which channels to +keep for downstream analysis. When using the IMC Segmentation Pipeline, random +tiles are cropped from images for convenience of pixel labelling. + +## Image segmentation + +The IMC Segmentation Pipeline supports pixel classification-based image +segmentation while `steinbock` supports pixel classification-based and deep +learning-based segmentation.
+ +**Pixel classification-based** image segmentation is performed by training a +random forest classifier using [Ilastik](https://www.ilastik.org/) on the +randomly extracted image crops and selected image channels. Pixels are +classified as nuclear, cytoplasmic, or background. Employing a customizable +[CellProfiler](https://cellprofiler.org/) pipeline, the probabilities are then +thresholded for segmenting nuclei, and nuclei are expanded into cytoplasmic +regions to obtain cell masks. + +**Deep learning-based** image segmentation is performed as presented by +[@Greenwald2021]. Briefly, `steinbock` first aggregates user-defined +image channels to generate two-channel images representing nuclear and +cytoplasmic signals. Next, the +[DeepCell](https://github.com/vanvalenlab/intro-to-deepcell) Python package is +used to run `Mesmer`, a deep learning-enabled segmentation algorithm pre-trained +on `TissueNet`, to automatically obtain cell masks without any further user +input. + +Segmentation masks are single-channel images that match the input images in +size, with non-zero grayscale values indicating the IDs of segmented objects +(e.g., cells). These masks are written out as TIFF files after segmentation. + +## Feature extraction {#feature-extraction} + +Using the segmentation masks together with their corresponding multi-channel +images, the IMC Segmentation Pipeline as well as the `steinbock` toolkit extract +object-specific features. These include the mean pixel intensity per object and +channel, morphological features (e.g., object area) and the objects' locations. +Object-specific features are written out as CSV files where rows represent +individual objects and columns represent features. + +Furthermore, the IMC Segmentation Pipeline and the `steinbock` toolkit compute +_spatial object graphs_, in which nodes correspond to objects, and nodes in +spatial proximity are connected by an edge. 
These graphs serve as a proxy for +interactions between neighboring cells. They are stored as an edge list in the form of +one CSV file per image. + +Both approaches also write out image-specific metadata (e.g., width and height) +as a CSV file. + +## Data export + +To further facilitate compatibility with downstream analysis, `steinbock` +exports data to a variety of file formats such as OME-TIFF for images, FCS for +single-cell data, the _anndata_ format [@Virshup2021] for data analysis in Python, +and various graph file formats for network analysis using software such as +[Cytoscape](https://cytoscape.org/) [@Shannon2003]. For export to OME-TIFF, +`steinbock` uses [xtiff](https://github.com/BodenmillerGroup/xtiff), a Python +package developed for writing multi-channel TIFF stacks. + +## Data import into R + +In Section \@ref(read-data), we will highlight the use of the +[imcRtools](https://github.com/BodenmillerGroup/imcRtools) and +[cytomapper](https://github.com/BodenmillerGroup/cytomapper) R/Bioconductor +packages to read spatially resolved single-cell data and images as generated by the +IMC Segmentation Pipeline and the `steinbock` toolkit into the statistical +programming language R. All further downstream analyses are performed in R and +detailed in the following sections. + + + + + + diff --git a/03-prerequisites.md b/03-prerequisites.md new file mode 100644 index 00000000..5028980a --- /dev/null +++ b/03-prerequisites.md @@ -0,0 +1,524 @@ +# Prerequisites {#prerequisites} + +The analysis presented in this book requires a basic understanding of the +`R` programming language. An introduction to `R` can be found [here](https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf) and +in the book [R for Data Science](https://r4ds.hadley.nz/). + +Furthermore, it is beneficial to be familiar with single-cell data analysis +using the [Bioconductor](https://www.bioconductor.org/) framework.
The +[Orchestrating Single-Cell Analysis with Bioconductor](https://bioconductor.org/books/release/OSCA/) book +gives an excellent overview on data containers and basic analysis that are being +used here. + +An overview on IMC as technology and necessary image processing steps can be +found on the [IMC workflow website](https://bodenmillergroup.github.io/IMCWorkflow/). + +Before we get started on IMC data analysis, we will need to make sure that +software dependencies are installed and the example data is downloaded. + +## Obtain the code + +This book provides R code to perform single-cell and spatial data analysis. +You can copy the individual code chunks into your R scripts or you can obtain +the full code of the book via: + +``` +git clone https://github.com/BodenmillerGroup/IMCDataAnalysis.git +``` + +## Software requirements + +The R packages needed to execute the presented workflow can either be manually +installed (see section \@ref(manual-install)) or are available within a provided +Docker container (see section \@ref(docker)). The Docker option is useful if you +want to exactly reproduce the presented analysis across operating systems; +however, the manual install gives you more flexibility for exploratory data +analysis. + +### Using Docker {#docker} + +For reproducibility purposes, we provide a Docker container [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/pkgs/container/imcdataanalysis). + +1. After installing [Docker](https://docs.docker.com/get-docker/) you can first pull the container via: + +``` +docker pull ghcr.io/bodenmillergroup/imcdataanalysis:latest +``` + +and then run the container: + +``` +docker run -v /path/to/IMCDataAnalysis:/home/rstudio/IMCDataAnalysis \ + -e PASSWORD=bioc -p 8787:8787 \ + ghcr.io/bodenmillergroup/imcdataanalysis:latest +``` + +Here, the `/path/to/` needs to be adjusted to where you keep the code and data +of the book. 
+ +**Of note: it is recommended to use a date-tagged version of the container to ensure reproducibility**. +This can be done by appending the date tag of the desired container version to the image name: + +``` +docker pull ghcr.io/bodenmillergroup/imcdataanalysis: +``` + +2. An RStudio server session can be accessed via a browser at `localhost:8787` using `Username: rstudio` and `Password: bioc`. +3. Navigate to `IMCDataAnalysis` and open the `IMCDataAnalysis.Rproj` file. +4. Code in the individual files can now be executed or the whole workflow can be built by entering `bookdown::render_book()`. + +### Manual installation {#manual-install} + +The following section describes how to manually install all needed R packages +when not using the provided Docker container. +To install all R packages needed for the analysis, please run: + + +```r +if (!requireNamespace("BiocManager", quietly = TRUE)) + install.packages("BiocManager") + +BiocManager::install(c("rmarkdown", "bookdown", "pheatmap", "viridis", "zoo", + "devtools", "testthat", "tiff", "distill", "ggrepel", + "patchwork", "mclust", "RColorBrewer", "uwot", "Rtsne", + "harmony", "Seurat", "SeuratObject", "cowplot", "kohonen", + "caret", "randomForest", "ggridges", + "gridGraphics", "scales", "Matrix", + "CATALYST", "scuttle", "scater", "dittoSeq", + "tidyverse", "BiocStyle", "batchelor", "bluster", "scran", + "lisaClust", "spicyR", "iSEE", "imcRtools", "cytomapper", + "imcdatasets", "cytoviewer")) + +# GitHub dependencies +devtools::install_github("i-cyto/Rphenograph") +``` + + + +### Major package versions + +Throughout the analysis, we rely on different R software packages. +This section lists the most commonly used packages in this workflow.
+ +Data containers: + +* [SpatialExperiment](https://bioconductor.org/packages/release/bioc/html/SpatialExperiment.html) version 1.10.0 +* [SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) version 1.22.0 + +Data analysis: + +* [CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) version 1.24.0 +* [imcRtools](https://bioconductor.org/packages/release/bioc/html/imcRtools.html) version 1.6.5 +* [scuttle](https://bioconductor.org/packages/release/bioc/html/scuttle.html) version 1.10.2 +* [scater](https://bioconductor.org/packages/release/bioc/html/scater.html) version 1.28.0 +* [batchelor](https://www.bioconductor.org/packages/release/bioc/html/batchelor.html) version 1.16.0 +* [bluster](https://www.bioconductor.org/packages/release/bioc/html/bluster.html) version 1.10.0 +* [scran](https://www.bioconductor.org/packages/release/bioc/html/scran.html) version 1.28.2 +* [harmony](https://github.com/immunogenomics/harmony) version 1.0.1 +* [Seurat](https://satijalab.org/seurat/index.html) version 4.4.0 +* [lisaClust](https://www.bioconductor.org/packages/release/bioc/html/lisaClust.html) version 1.8.1 +* [caret](https://topepo.github.io/caret/) version 6.0.94 + +Data visualization: + +* [cytomapper](https://bioconductor.org/packages/release/bioc/html/cytomapper.html) version 1.12.0 +* [cytoviewer](https://bioconductor.org/packages/release/bioc/html/cytoviewer.html) version 1.0.1 +* [dittoSeq](https://bioconductor.org/packages/release/bioc/html/dittoSeq.html) version 1.12.1 + +Tidy R: + +* [tidyverse](https://www.tidyverse.org/) version 2.0.0 + +## Image processing {#image-processing} + +The analysis presented here fully relies on packages written in the programming +language `R` and primarily focuses on analysis approaches downstream of image +processing. 
The example data available at +[https://zenodo.org/record/7575859](https://zenodo.org/record/7575859) were +processed (file type conversion, image segmentation, feature extraction as +explained in Section \@ref(processing)) using the +[steinbock](https://bodenmillergroup.github.io/steinbock/latest/) toolkit. The +exact command line interface calls to process the raw data are shown below: + + + + +```bash +#!/usr/bin/env bash +BASEDIR=$(cd -- "$(dirname "${BASH_SOURCE[0]}")" && pwd -P) +cd "${BASEDIR}" + +# raw data collection +mkdir raw +wget https://zenodo.org/record/6449127/files/IMCWorkflow.ilp +wget https://zenodo.org/record/6449127/files/analysis.zip +unzip analysis.zip +rm analysis.zip +rm -r analysis/cpinp +rm -r analysis/cpout +rm -r analysis/histocat +rm -r analysis/ilastik +rm -r analysis/ometiff +cd raw +wget https://zenodo.org/record/5949116/files/panel.csv +wget https://zenodo.org/record/5949116/files/Patient1.zip +wget https://zenodo.org/record/5949116/files/Patient2.zip +wget https://zenodo.org/record/5949116/files/Patient3.zip +wget https://zenodo.org/record/5949116/files/Patient4.zip +cd ${BASEDIR} + +# steinbock alias setup +shopt -s expand_aliases +alias steinbock="docker run -v ${BASEDIR}:/data -u $(id -u):$(id -g) ghcr.io/bodenmillergroup/steinbock:0.16.0" + +# raw data preprocessing +steinbock preprocess imc panel --namecol Clean_Target +steinbock preprocess imc images --hpf 50 + +# random forest-based segmentation using Ilastik/CellProfiler +steinbock classify ilastik prepare --cropsize 500 --seed 123 +rm pixel_classifier.ilp && mv IMCWorkflow.ilp pixel_classifier.ilp +rm -r ilastik_crops && mv analysis/crops ilastik_crops +steinbock classify ilastik fix --no-backup +steinbock classify ilastik run +steinbock segment cellprofiler prepare +steinbock segment cellprofiler run -o masks_ilastik + +# deep learning-based whole-cell segmentation using DeepCell/Mesmer +steinbock segment deepcell --app mesmer --minmax -o masks_deepcell + +# single-cell 
feature extraction +steinbock measure intensities --masks masks_deepcell +steinbock measure regionprops --masks masks_deepcell +steinbock measure neighbors --masks masks_deepcell --type expansion --dmax 4 + +# data export +steinbock export ome +steinbock export histocat --masks masks_deepcell +steinbock export csv intensities regionprops -o cells.csv +steinbock export csv intensities regionprops --no-concat -o cells_csv +steinbock export fcs intensities regionprops -o cells.fcs +steinbock export fcs intensities regionprops --no-concat -o cells_fcs +steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors -o cells.h5ad +steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors --no-concat -o cells_h5ad +steinbock export graphs --data intensities + +# archiving +zip -r img.zip img +zip -r ilastik_img.zip ilastik_img +zip -r ilastik_crops.zip ilastik_crops +zip -r ilastik_probabilities.zip ilastik_probabilities +zip -r masks_ilastik.zip masks_ilastik +zip -r masks_deepcell.zip masks_deepcell +zip -r intensities.zip intensities +zip -r regionprops.zip regionprops +zip -r neighbors.zip neighbors +zip -r ome.zip ome +zip -r histocat.zip histocat +zip -r cells_csv.zip cells_csv +zip -r cells_fcs.zip cells_fcs +zip -r cells_h5ad.zip cells_h5ad +zip -r graphs.zip graphs +``` + +## Download example data {#download-data} + +Throughout this tutorial, we will access a number of different data types. +To declutter the analysis scripts, we will already download all needed data here. + +To highlight the basic steps of IMC data analysis, we provide example data that +were acquired as part of the **I**ntegrated i**MMU**noprofiling of large adaptive +**CAN**cer patient cohorts projects ([immucan.eu](https://immucan.eu/)). The +raw data of 4 patients can be accessed online at +[zenodo.org/record/7575859](https://zenodo.org/record/7575859). 
We will only +download the sample/patient metadata information here: + + +```r +download.file("https://zenodo.org/record/7575859/files/sample_metadata.csv", + destfile = "data/sample_metadata.csv") +``` + +### Processed multiplexed imaging data + +The IMC raw data was either processed using the +[steinbock](https://github.com/BodenmillerGroup/steinbock) toolkit or the +[IMC Segmentation Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline). +Image processing included file type conversion, cell segmentation and feature +extraction. + +**steinbock output** + +This book uses the output of the `steinbock` framework when applied to process +the example data. The processed data includes the single-cell mean intensity +files, the single-cell morphological features and spatial locations, spatial +object graphs in form of edge lists indicating cells in close proximity, hot +pixel filtered multi-channel images, segmentation masks, image metadata and +channel metadata. All these files will be downloaded here for later use. The +commands which were used to generate this data can be found in the shell script +above. 
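The download calls that follow write into `data/steinbock` and `data/ImcSegmentationPipeline`. R's `download.file()` does not create missing directories, so a working directory without these folders would fail at the first call. A minimal sketch, assuming the commands are run from the project root and that the folders may not exist yet:

```bash
# Abort on the first error instead of continuing with missing folders.
set -euo pipefail

# Target folders referenced by the download chunks in this chapter.
# `mkdir -p` also creates the shared parent folder `data`.
for dir in data/steinbock data/ImcSegmentationPipeline; do
    mkdir -p "${dir}"
done
```

`mkdir -p` leaves existing folders untouched, so running this more than once is harmless.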
+ + +```r +# download intensities +url <- "https://zenodo.org/record/7624451/files/intensities.zip" +destfile <- "data/steinbock/intensities.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/steinbock", overwrite=TRUE) +unlink(destfile) + +# download regionprops +url <- "https://zenodo.org/record/7624451/files/regionprops.zip" +destfile <- "data/steinbock/regionprops.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/steinbock", overwrite=TRUE) +unlink(destfile) + +# download neighbors +url <- "https://zenodo.org/record/7624451/files/neighbors.zip" +destfile <- "data/steinbock/neighbors.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/steinbock", overwrite=TRUE) +unlink(destfile) + +# download images +url <- "https://zenodo.org/record/7624451/files/img.zip" +destfile <- "data/steinbock/img.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/steinbock", overwrite=TRUE) +unlink(destfile) + +# download masks +url <- "https://zenodo.org/record/7624451/files/masks_deepcell.zip" +destfile <- "data/steinbock/masks_deepcell.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/steinbock", overwrite=TRUE) +unlink(destfile) + +# download individual files +download.file("https://zenodo.org/record/7624451/files/panel.csv", + "data/steinbock/panel.csv") +download.file("https://zenodo.org/record/7624451/files/images.csv", + "data/steinbock/images.csv") +download.file("https://zenodo.org/record/7624451/files/steinbock.sh", + "data/steinbock/steinbock.sh") +``` + +**IMC Segmentation Pipeline output** + +The example data was also processed using the +[IMC Segmentation Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline) (version 3). +To highlight the use of the reader function for this type of output, we will need +to download the `cpout` folder which is part of the `analysis` folder. The `cpout` +folder stores all relevant output files of the pipeline.
For a full description +of the pipeline, please refer to the [docs](https://bodenmillergroup.github.io/ImcSegmentationPipeline/). + + +```r +# download analysis folder +url <- "https://zenodo.org/record/7997296/files/analysis.zip" +destfile <- "data/ImcSegmentationPipeline/analysis.zip" +download.file(url, destfile) +unzip(destfile, exdir="data/ImcSegmentationPipeline", overwrite=TRUE) +unlink(destfile) + +unlink("data/ImcSegmentationPipeline/analysis/cpinp/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/crops/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/histocat/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/ilastik/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/ometiff/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/cpout/images/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/cpout/probabilities/", recursive=TRUE) +unlink("data/ImcSegmentationPipeline/analysis/cpout/masks/", recursive=TRUE) +``` + +### Files for spillover matrix estimation + +To highlight the estimation and correction of channel-spillover as described by +[@Chevrier2017], we can access an example spillover-acquisition from: + + +```r +download.file("https://zenodo.org/record/7575859/files/compensation.zip", + "data/compensation.zip") +unzip("data/compensation.zip", exdir="data", overwrite=TRUE) +unlink("data/compensation.zip") +``` + +### Gated cells + +In Section \@ref(classification), we present a cell type classification approach +that relies on previously gated cells. This ground truth data is available +online at [zenodo.org/record/8095133](https://zenodo.org/record/8095133) and +will be downloaded here for later use: + + +```r +download.file("https://zenodo.org/record/8095133/files/gated_cells.zip", + "data/gated_cells.zip") +unzip("data/gated_cells.zip", exdir="data", overwrite=TRUE) +unlink("data/gated_cells.zip") +``` + +## Software versions {#sessionInfo} + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] cytoviewer_1.0.1 caret_6.0-94 +## [3] lattice_0.21-8 lisaClust_1.8.1 +## [5] scran_1.28.2 bluster_1.10.0 +## [7] lubridate_1.9.3 forcats_1.0.0 +## [9] stringr_1.5.0 dplyr_1.1.3 +## [11] purrr_1.0.2 readr_2.1.4 +## [13] tidyr_1.3.0 tibble_3.2.1 +## [15] tidyverse_2.0.0 dittoSeq_1.12.1 +## [17] cytomapper_1.12.0 EBImage_4.42.0 +## [19] imcRtools_1.6.5 scater_1.28.0 +## [21] ggplot2_3.4.3 scuttle_1.10.2 +## [23] SpatialExperiment_1.10.0 CATALYST_1.24.0 +## [25] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 +## [27] Biobase_2.60.0 GenomicRanges_1.52.0 +## [29] GenomeInfoDb_1.36.3 IRanges_2.34.1 +## [31] S4Vectors_0.38.2 BiocGenerics_0.46.0 +## [33] MatrixGenerics_1.12.3 matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] R.methodsS3_1.8.2 vroom_1.6.3 +## [3] tiff_0.1-11 nnet_7.3-19 +## [5] goftest_1.2-3 DT_0.29 +## [7] HDF5Array_1.28.1 TH.data_1.1-2 +## [9] vctrs_0.6.3 spatstat.random_3.1-6 +## [11] digest_0.6.33 png_0.1-8 +## [13] shape_1.4.6 proxy_0.4-27 +## [15] ggrepel_0.9.3 spicyR_1.12.2 +## [17] deldir_1.0-9 parallelly_1.36.0 +## [19] magick_2.8.0 MASS_7.3-60 +## [21] reshape2_1.4.4 httpuv_1.6.11 +## 
[23] foreach_1.5.2 withr_2.5.1 +## [25] xfun_0.40 ggpubr_0.6.0 +## [27] ellipsis_0.3.2 survival_3.5-5 +## [29] RTriangle_1.6-0.12 ggbeeswarm_0.7.2 +## [31] RProtoBufLib_2.12.1 drc_3.0-1 +## [33] systemfonts_1.0.4 zoo_1.8-12 +## [35] GlobalOptions_0.1.2 gtools_3.9.4 +## [37] R.oo_1.25.0 promises_1.2.1 +## [39] rstatix_0.7.2 globals_0.16.2 +## [41] rhdf5filters_1.12.1 rhdf5_2.44.0 +## [43] rstudioapi_0.15.0 miniUI_0.1.1.1 +## [45] archive_1.1.6 units_0.8-4 +## [47] generics_0.1.3 concaveman_1.1.0 +## [49] zlibbioc_1.46.0 ScaledMatrix_1.8.1 +## [51] ggraph_2.1.0 polyclip_1.10-6 +## [53] GenomeInfoDbData_1.2.10 fftwtools_0.9-11 +## [55] xtable_1.8-4 doParallel_1.0.17 +## [57] evaluate_0.21 S4Arrays_1.0.6 +## [59] hms_1.1.3 bookdown_0.35 +## [61] irlba_2.3.5.1 colorspace_2.1-0 +## [63] spatstat.data_3.0-1 magrittr_2.0.3 +## [65] later_1.3.1 viridis_0.6.4 +## [67] spatstat.geom_3.2-5 future.apply_1.11.0 +## [69] XML_3.99-0.14 cowplot_1.1.1 +## [71] class_7.3-22 svgPanZoom_0.3.4 +## [73] pillar_1.9.0 nlme_3.1-162 +## [75] iterators_1.0.14 compiler_4.3.1 +## [77] beachmat_2.16.0 shinycssloaders_1.0.0 +## [79] stringi_1.7.12 gower_1.0.1 +## [81] sf_1.0-14 tensor_1.5 +## [83] minqa_1.2.6 ClassifyR_3.4.11 +## [85] plyr_1.8.8 crayon_1.5.2 +## [87] abind_1.4-5 locfit_1.5-9.8 +## [89] sp_2.0-0 graphlayouts_1.0.1 +## [91] bit_4.0.5 terra_1.7-46 +## [93] sandwich_3.0-2 codetools_0.2-19 +## [95] multcomp_1.4-25 recipes_1.0.8 +## [97] BiocSingular_1.16.0 bslib_0.5.1 +## [99] e1071_1.7-13 GetoptLong_1.0.5 +## [101] mime_0.12 MultiAssayExperiment_1.26.0 +## [103] splines_4.3.1 circlize_0.4.15 +## [105] Rcpp_1.0.11 sparseMatrixStats_1.12.2 +## [107] knitr_1.44 utf8_1.2.3 +## [109] clue_0.3-65 lme4_1.1-34 +## [111] listenv_0.9.0 nnls_1.5 +## [113] DelayedMatrixStats_1.22.6 ggsignif_0.6.4 +## [115] Matrix_1.6-1.1 scam_1.2-14 +## [117] statmod_1.5.0 tzdb_0.4.0 +## [119] svglite_2.1.1 tweenr_2.0.2 +## [121] pkgconfig_2.0.3 pheatmap_1.0.12 +## [123] tools_4.3.1 cachem_1.0.8 +## [125] 
viridisLite_0.4.2 DBI_1.1.3 +## [127] numDeriv_2016.8-1.1 fastmap_1.1.1 +## [129] rmarkdown_2.25 scales_1.2.1 +## [131] grid_4.3.1 shinydashboard_0.7.2 +## [133] broom_1.0.5 sass_0.4.7 +## [135] carData_3.0-5 rpart_4.1.19 +## [137] farver_2.1.1 tidygraph_1.2.3 +## [139] mgcv_1.8-42 yaml_2.3.7 +## [141] cli_3.6.1 lifecycle_1.0.3 +## [143] mvtnorm_1.2-3 lava_1.7.2.1 +## [145] backports_1.4.1 DropletUtils_1.20.0 +## [147] BiocParallel_1.34.2 cytolib_2.12.1 +## [149] timechange_0.2.0 gtable_0.3.4 +## [151] rjson_0.2.21 ggridges_0.5.4 +## [153] parallel_4.3.1 pROC_1.18.4 +## [155] limma_3.56.2 colourpicker_1.3.0 +## [157] jsonlite_1.8.7 edgeR_3.42.4 +## [159] bitops_1.0-7 bit64_4.0.5 +## [161] Rtsne_0.16 FlowSOM_2.8.0 +## [163] spatstat.utils_3.0-3 BiocNeighbors_1.18.0 +## [165] flowCore_2.12.2 jquerylib_0.1.4 +## [167] metapod_1.8.0 dqrng_0.3.1 +## [169] R.utils_2.12.2 timeDate_4022.108 +## [171] shiny_1.7.5 ConsensusClusterPlus_1.64.0 +## [173] htmltools_0.5.6 distances_0.1.9 +## [175] glue_1.6.2 XVector_0.40.0 +## [177] RCurl_1.98-1.12 classInt_0.4-10 +## [179] jpeg_0.1-10 gridExtra_2.3 +## [181] boot_1.3-28.1 igraph_1.5.1 +## [183] R6_2.5.1 cluster_2.1.4 +## [185] Rhdf5lib_1.22.1 ipred_0.9-14 +## [187] nloptr_2.0.3 DelayedArray_0.26.7 +## [189] tidyselect_1.2.0 vipor_0.4.5 +## [191] plotrix_3.8-2 ggforce_0.4.1 +## [193] raster_3.6-23 car_3.1-2 +## [195] future_1.33.0 ModelMetrics_1.2.2.2 +## [197] rsvd_1.0.5 munsell_0.5.0 +## [199] KernSmooth_2.23-21 data.table_1.14.8 +## [201] htmlwidgets_1.6.2 ComplexHeatmap_2.16.0 +## [203] RColorBrewer_1.1-3 rlang_1.1.1 +## [205] spatstat.sparse_3.0-2 spatstat.explore_3.2-3 +## [207] lmerTest_3.1-3 colorRamps_2.3.1 +## [209] ggnewscale_0.4.9 fansi_1.0.4 +## [211] hardhat_1.3.0 beeswarm_0.4.0 +## [213] prodlim_2023.08.28 +``` +
+
+
+
diff --git a/04-read_data.md b/04-read_data.md
new file mode 100644
index 00000000..b0b94638
--- /dev/null
+++ b/04-read_data.md
@@ -0,0 +1,797 @@
+# Read in the data {#read-data}
+
+This section describes how to read single-cell data and images into `R`
+**after** image processing and segmentation (see Section \@ref(processing)).
+
+To illustrate IMC data analysis, we provide processed example data at
+[10.5281/zenodo.6043599](https://zenodo.org/record/6043599).
+This data has already been downloaded in Section \@ref(download-data) and can
+be accessed in the folder `data`.
+
+We use the [imcRtools](https://github.com/BodenmillerGroup/imcRtools) package to
+read in single-cell data extracted using the `steinbock` framework or the IMC
+Segmentation Pipeline. Both image processing approaches also generate
+multi-channel images and segmentation masks that can be read into `R` using the
+[cytomapper](https://github.com/BodenmillerGroup/cytomapper) package.
+
+
+```r
+library(imcRtools)
+library(cytomapper)
+```
+
+## Read in single-cell information
+
+For single-cell data analysis in `R` the
+[SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html)
+[@Amezquita2019] data container is commonly used within the Bioconductor
+framework. It allows standardized access to (i) expression data, (ii) cellular
+metadata (e.g., cell type), (iii) feature metadata (e.g., marker name) and (iv)
+experiment-wide metadata. For an in-depth introduction to the `SingleCellExperiment`
+container, please refer to the [SingleCellExperiment class](https://bioconductor.org/books/release/OSCA.intro/the-singlecellexperiment-class.html).
+
+The [SpatialExperiment](https://bioconductor.org/packages/release/bioc/html/SpatialExperiment.html)
+class [@Righelli2022] is an extension of the `SingleCellExperiment` class. It
+was developed to store spatial data in addition to single-cell data, and an
+extended introduction is accessible
+[here](https://bioconductor.org/packages/release/bioc/vignettes/SpatialExperiment/inst/doc/SpatialExperiment.html).
+
+To read in single-cell data generated by the `steinbock` framework or the IMC
+Segmentation Pipeline, the `imcRtools` package provides the `read_steinbock` and
+`read_cpout` functions, respectively. By default, the data is read into a
+`SpatialExperiment` object; however, data can be read in as a
+`SingleCellExperiment` object by setting `return_as = "sce"`. All functions
+presented in this book are applicable to both data containers.
+
+### steinbock generated data
+
+The downloaded example data (Section \@ref(download-data)) processed with the [steinbock](https://github.com/BodenmillerGroup/steinbock) framework can be read in with the `read_steinbock` function provided by `imcRtools`. For more information, please refer to
+`?read_steinbock`.
+
+
+```r
+spe <- read_steinbock("data/steinbock/")
+spe
+```
+
+```
+## class: SpatialExperiment
+## dim: 40 47859
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(12): channel name ... Final.Concentration...Dilution
+##   uL.to.add
+## colnames: NULL
+## colData names(8): sample_id ObjectNumber ... width_px height_px
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+## spatialCoords names(2) : Pos_X Pos_Y
+## imgData names(1): sample_id
+```
+
+By default, single-cell data is read in as a `SpatialExperiment` object.
+The summarized pixel intensities per channel and cell (here mean intensity) are
+stored in the `counts` slot. Columns represent cells and rows represent channels.
+
+
+```r
+counts(spe)[1:5,1:5]
+```
+
+```
+##                [,1]       [,2]      [,3]      [,4]      [,5]
+## MPO       0.5751064  0.4166667 0.4975494  0.890154 0.1818182
+## HistoneH3 3.1273082 11.3597883 2.3841440  7.712961 1.4512715
+## SMA       0.2600939  1.6720383 0.1535190  1.193948 0.2986703
+## CD16      2.0347747  2.5880536 2.2943074 15.629083 0.6084220
+## CD38      0.2530137  0.6826669 1.1902979  2.126060 0.2917793
+```
+
+Metadata associated with individual cells are stored in the `colData` slot. After
+initial image processing, these metadata include the numeric identifier (`ObjectNumber`),
+the area, and morphological features of each cell. In addition, `sample_id` stores
+the name of the image from which each cell was extracted, and `width_px` and
+`height_px` store the width and height of the corresponding image.
+
+
+```r
+head(colData(spe))
+```
+
+```
+## DataFrame with 6 rows and 8 columns
+##      sample_id ObjectNumber area axis_major_length axis_minor_length
+##
+## 1 Patient1_001            1   12           7.40623           1.89529
+## 2 Patient1_001            2   24          16.48004           1.96284
+## 3 Patient1_001            3   17           9.85085           1.98582
+## 4 Patient1_001            4   24           8.08290           3.91578
+## 5 Patient1_001            5   22           8.79367           3.11653
+## 6 Patient1_001            6   25           9.17436           3.46929
+##   eccentricity width_px height_px
+##
+## 1     0.966702      600       600
+## 2     0.992882      600       600
+## 3     0.979470      600       600
+## 4     0.874818      600       600
+## 5     0.935091      600       600
+## 6     0.925744      600       600
+```
+
+The main difference between the `SpatialExperiment` and the
+`SingleCellExperiment` data container is the way spatial
+locations of all cells are stored. For the `SingleCellExperiment` container, the
+locations are stored in the `colData` slot while the `SpatialExperiment`
+container stores them in the `spatialCoords` slot:
+
+
+```r
+head(spatialCoords(spe))
+```
+
+```
+##      Pos_X     Pos_Y
+## 1 468.5833 0.4166667
+## 2 515.8333 0.4166667
+## 3 587.2353 0.4705882
+## 4 192.2500 1.2500000
+## 5 231.7727 0.9090909
+## 6 270.1600 1.0400000
+```
+
+The _spatial object graphs_ generated by steinbock (see Section
+\@ref(feature-extraction)) are read into a `colPair` slot with the name
+`neighborhood` of the `SpatialExperiment` (or `SingleCellExperiment`) object.
+Cell-cell interactions (cells in close spatial proximity) are represented as an
+"edge list" (stored as a `SelfHits` object). Here, the left side represents the
+column indices of the `SpatialExperiment` object of the "from" cells and the
+right side represents the column indices of the "to" cells. For visualization of
+the _spatial object graphs_, please refer to Section \@ref(spatial-viz).
+
+
+```r
+colPair(spe, "neighborhood")
+```
+
+```
+## SelfHits object with 257116 hits and 0 metadata columns:
+##                 from        to
+##
+##        [1]         1        27
+##        [2]         1        55
+##        [3]         2        10
+##        [4]         2        44
+##        [5]         2        81
+##        ...       ...       ...
+##   [257112]     47858     47836
+##   [257113]     47859     47792
+##   [257114]     47859     47819
+##   [257115]     47859     47828
+##   [257116]     47859     47854
+##   -------
+##   nnode: 47859
+```
+
+Finally, metadata regarding the channels are stored in the `rowData` slot. This
+information is extracted from the `panel.csv` file.
+
+Channels have the same order as the rows in the `panel.csv` file for which the
+`keep` column is set to 1, and match the order of channels in the multi-channel
+images (see Section \@ref(read-images)). For the example data, channels are
+ordered by isotope mass.
+
+
+```r
+head(rowData(spe))
+```
+
+```
+## DataFrame with 6 rows and 12 columns
+##           channel      name keep ilastik deepcell cellpose
+##
+## MPO           Y89       MPO    1      NA       NA       NA
+## HistoneH3   In113 HistoneH3    1       1        1       NA
+## SMA         In115       SMA    1      NA       NA       NA
+## CD16        Pr141      CD16    1      NA       NA       NA
+## CD38        Nd142      CD38    1      NA       NA       NA
+## HLADR       Nd143     HLADR    1      NA       NA       NA
+##           Tube.Number              Target Antibody.Clone Stock.Concentration
+##
+## MPO              2101 Myeloperoxidase MPO Polyclonal MPO                 500
+## HistoneH3        2113          Histone H3           D1H2                 500
+## SMA              1914                 SMA            1A4                 500
+## CD16             2079                CD16       EPR16784                 500
+## CD38             2095                CD38        EPR4106                 500
+## HLADR            2087              HLA-DR        TAL 1B5                 500
+##           Final.Concentration...Dilution uL.to.add
+##
+## MPO                              4 ug/mL       0.8
+## HistoneH3                        1 ug/mL       0.2
+## SMA                           0.25 ug/mL      0.05
+## CD16                             5 ug/mL         1
+## CD38                           2.5 ug/mL       0.5
+## HLADR                            1 ug/mL       0.2
+```
+
+### IMC Segmentation Pipeline generated data
+
+The [IMC Segmentation Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline) offers an
+alternative approach to multiplexed image processing and segmentation. The
+default pipeline is also available via `steinbock`. The IMC Segmentation
+Pipeline is based on [Ilastik](https://www.ilastik.org/) pixel classification
+and image segmentation using [CellProfiler](https://cellprofiler.org/). We recommend
+becoming familiar with the pipeline as it allows flexible extension to more
+complicated image analysis and segmentation tasks. For standard image analysis
+and segmentation, `steinbock` is the preferred choice. Please refer to
+the [documentation](https://bodenmillergroup.github.io/ImcSegmentationPipeline/)
+for an overview of the pipeline.
+
+All relevant [output](https://bodenmillergroup.github.io/ImcSegmentationPipeline/output.html)
+storing single-cell data is contained in the `cpout` folder.
+To read in the single-cell measurements, the `imcRtools` package offers the
+`read_cpout` function:
+
+
+```r
+spe2 <- read_cpout("data/ImcSegmentationPipeline/analysis/cpout/")
+rownames(spe2) <- rowData(spe2)$Clean_Target
+spe2
+```
+
+```
+## class: SpatialExperiment
+## dim: 40 43796
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(11): Tube.Number Metal.Tag ... ilastik deepcell
+## colnames: NULL
+## colData names(12): sample_id ObjectNumber ... Metadata_acid
+##   Metadata_description
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+## spatialCoords names(2) : Pos_X Pos_Y
+## imgData names(1): sample_id
+```
+
+Similar to the `steinbock` output, cell morphological features and image-level
+metadata are stored in the `colData(spe2)` slot, the interaction information
+is contained in `colPair(spe2, type = "neighborhood")` and the mean intensity
+per channel and cell is stored in `counts(spe2)`.
+
+### Reading custom files
+
+When not using `steinbock` or the `ImcSegmentationPipeline`, the single-cell
+information has to be read in from custom files. We now demonstrate how
+to generate a `SpatialExperiment` object from single-cell data contained
+in individual files. As an example, we use files generated by `CellProfiler`
+as part of the `ImcSegmentationPipeline`.
+
+First, we will read in the single-cell features stored in a CSV file:
+
+
+```r
+library(readr)
+
+cur_features <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/cell.csv")
+
+dim(cur_features)
+```
+
+```
+## [1] 43796   941
+```
+
+```r
+head(colnames(cur_features))
+```
+
+```
+## [1] "ImageNumber"                    "ObjectNumber"
+## [3] "AreaShape_Area"                 "AreaShape_BoundingBoxArea"
+## [5] "AreaShape_BoundingBoxMaximum_X" "AreaShape_BoundingBoxMaximum_Y"
+```
+
+This file contains a large number of single-cell features including the cell
+identifier (`ObjectNumber`), the image identifier (`ImageNumber`), morphological
+features (`AreaShape_*`), the cells' locations (`Location_Center_*`) and the
+mean pixel intensity per cell and per channel (`Intensity_MeanIntensity_FullStack_*`).
+
+Now, we split the features into intensity features, cell-specific metadata and
+the physical location of the cells:
+
+
+```r
+counts <- cur_features[,grepl("Intensity_MeanIntensity_FullStack",
+                              colnames(cur_features))]
+
+meta <- cur_features[,c("ImageNumber", "ObjectNumber", "AreaShape_Area",
+                        "AreaShape_Eccentricity", "AreaShape_MeanRadius")]
+
+coords <- cur_features[,c("Location_Center_X", "Location_Center_Y")]
+```
+
+`CellProfiler` writes out the mean pixel intensities after scaling them
+by a scaling factor that depends on the bit encoding of the images. The images to which
+the IMC Segmentation Pipeline was applied were saved with 16-bit encoding.
+This means for the example data, the mean pixel intensities need to
+be multiplied by a factor of `2 ^ 16 - 1 = 65535`.
+
+
+```r
+counts <- counts * 65535
+```
+
+In addition, `CellProfiler` does not order the channels numerically but rather
+as character strings: `1, 10, 2, 3, ...` rather than `1, 2, 3, ...`. Therefore we
+will need to reorder the channels.
+
+
+```r
+library(stringr)
+# extract the channel index (4th "_"-separated field) and drop the "c" prefix
+cur_ch <- str_split(colnames(counts), "_", simplify = TRUE)[,4]
+cur_ch <- sub("c", "", cur_ch)
+
+# reorder the columns by numeric channel index
+counts <- counts[,order(as.numeric(cur_ch))]
+```
+
+From these features we can now construct the `SpatialExperiment` object.
+
+
+```r
+spe3 <- SpatialExperiment(assays = list(counts = t(counts)),
+                          colData = meta,
+                          sample_id = as.character(meta$ImageNumber),
+                          spatialCoords = as.matrix(coords))
+```
+
+Next, we can store the spatial cell graph generated by `CellProfiler` in the
+`colPairs` slot of the object. Spatial cell graphs are usually stored as an edge
+list in the form of a CSV file. The `colPairs` slot requires a `SelfHits` entry
+storing an edge list where numeric entries represent the index of the `from` and
+`to` cell in the `SpatialExperiment` object. To generate such an edge list, we
+need to match the cell IDs contained in the CSV against the cell IDs in the
+`SpatialExperiment` object.
+
+
+```r
+cur_pairs <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/Object relationships.csv")
+
+cur_from <- paste(cur_pairs$`First Image Number`, cur_pairs$`First Object Number`)
+cur_to <- paste(cur_pairs$`Second Image Number`, cur_pairs$`Second Object Number`)
+
+edgelist <- SelfHits(from = match(cur_from,
+                                  paste(spe3$ImageNumber, spe3$ObjectNumber)),
+                     to = match(cur_to,
+                                paste(spe3$ImageNumber, spe3$ObjectNumber)),
+                     nnode = ncol(spe3))
+
+colPair(spe3, "neighborhood") <- edgelist
+```
+
+For further downstream analysis, we will use the `steinbock` results.
+
+## Single-cell processing {#cell-processing}
+
+After reading in the single-cell data, a few further processing steps need to be
+taken.
+
+**Add additional metadata**
+
+We can set the `colnames` of the object to generate unique identifiers per cell:
+
+
+```r
+colnames(spe) <- paste0(spe$sample_id, "_", spe$ObjectNumber)
+```
+
+It is also often the case that sample-specific metadata are available externally.
+For the current data, we need to link the cancer type (also referred to as "Indication")
+to each sample. This metadata is available as an external CSV file:
+
+
+```r
+library(tidyverse)
+
+# Read patient metadata
+meta <- read_csv("data/sample_metadata.csv")
+
+# Extract patient id and ROI id from sample name
+spe$patient_id <- str_extract(spe$sample_id, "Patient[1-4]")
+spe$ROI <- str_extract(spe$sample_id, "00[1-8]")
+
+# Store cancer type in SPE object
+spe$indication <- meta$Indication[match(spe$patient_id, meta$`Sample ID`)]
+
+unique(spe$patient_id)
+```
+
+```
+## [1] "Patient1" "Patient2" "Patient3" "Patient4"
+```
+
+```r
+unique(spe$ROI)
+```
+
+```
+## [1] "001" "002" "003" "004" "005" "006" "007" "008"
+```
+
+```r
+unique(spe$indication)
+```
+
+```
+## [1] "SCCHN" "BCC"   "NSCLC" "CRC"
+```
+
+The selected patients were diagnosed with different cancer types:
+
+* SCCHN - head and neck cancer
+* BCC - breast cancer
+* NSCLC - lung cancer
+* CRC - colorectal cancer
+
+**Transform counts**
+
+The distribution of expression counts across cells is often right-skewed,
+meaning that many cells display low counts and a few
+cells have high counts. To avoid analysis biases from these high-expressing
+cells, the expression counts are commonly transformed or clipped.
+
+Here, we perform counts transformation using an inverse hyperbolic sine
+function. This transformation is commonly applied to [flow cytometry
+data](https://support.cytobank.org/hc/en-us/articles/206148057-About-the-Arcsinh-transform).
+The `cofactor` defines the expression range over which the transformation is
+approximately linear. While the `cofactor` for CyTOF data is often set to `5`, IMC data
+usually display much lower counts. We therefore apply a `cofactor` of `1`.
+
+However, other transformations such as `log(counts(spe) + 0.01)` should be
+tested when analysing IMC data.
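+
+For reference, the inverse hyperbolic sine transformation of a raw count $x$
+with cofactor $c$ can be written out as below. The symbols $x$ and $c$ are
+introduced here only for illustration and do not appear elsewhere in the text.
+
+$$
+\mathrm{asinh}\left(\frac{x}{c}\right) = \ln\left(\frac{x}{c} + \sqrt{\left(\frac{x}{c}\right)^2 + 1}\right)
+$$
+
+For $x \ll c$ this is approximately $x / c$ (close to linear), whereas for
+$x \gg c$ it approaches $\ln(2x / c)$ (logarithmic). With $c = 1$, counts well
+below 1 are therefore left almost unchanged, while larger counts are compressed.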
+
+
+```r
+library(dittoSeq)
+dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "counts") +
+    ggtitle("CD3 - before transformation")
+```
+
+
+
+```r
+assay(spe, "exprs") <- asinh(counts(spe)/1)
+dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "exprs") +
+    ggtitle("CD3 - after transformation")
+```
+
+
+
+**Define interesting channels**
+
+For downstream analysis such as visualization, dimensionality reduction and
+clustering, only a subset of markers should be used. As a convenience, we can
+store an additional entry in the `rowData` slot that specifies the markers of
+interest. Here, we deselect the nuclear markers, which were primarily used for
+cell segmentation, and keep all other biological targets. However, more informed
+marker selection should be performed to exclude lowly expressed markers or
+markers with low signal-to-noise ratio.
+
+
+```r
+rowData(spe)$use_channel <- !grepl("DNA|Histone", rownames(spe))
+```
+
+**Define color schemes**
+
+We will define color schemes for different metadata entries of the data and
+conveniently store them in the `metadata` slot of the `SpatialExperiment`, which
+will be helpful for downstream data visualizations. We will use colors from the
+`RColorBrewer` and `dittoSeq` packages but any other coloring package will
+suffice.
+
+
+```r
+library(RColorBrewer)
+color_vectors <- list()
+
+ROI <- setNames(brewer.pal(length(unique(spe$ROI)), name = "BrBG"),
+                unique(spe$ROI))
+patient_id <- setNames(brewer.pal(length(unique(spe$patient_id)), name = "Set1"),
+                       unique(spe$patient_id))
+sample_id <- setNames(c(brewer.pal(6, "YlOrRd")[3:5],
+                        brewer.pal(6, "PuBu")[3:6],
+                        brewer.pal(6, "YlGn")[3:5],
+                        brewer.pal(6, "BuPu")[3:6]),
+                      unique(spe$sample_id))
+indication <- setNames(brewer.pal(length(unique(spe$indication)), name = "Set2"),
+                       unique(spe$indication))
+
+color_vectors$ROI <- ROI
+color_vectors$patient_id <- patient_id
+color_vectors$sample_id <- sample_id
+color_vectors$indication <- indication
+
+metadata(spe)$color_vectors <- color_vectors
+```
+
+## Read in images {#read-images}
+
+The `cytomapper` package allows multi-channel image handling and visualization
+within the Bioconductor framework. The most common data format for multi-channel
+images or segmentation masks is the TIFF file format, which is used by `steinbock`
+and the IMC Segmentation Pipeline to save images.
+
+Here, we will read in multi-channel images and segmentation masks into a
+[CytoImageList](https://www.bioconductor.org/packages/release/bioc/vignettes/cytomapper/inst/doc/cytomapper.html#5_The_CytoImageList_object)
+data container. It allows storing multiple multi-channel images and requires
+matched channels across all images within the object.
+
+The `loadImages` function is used to read in processed multi-channel images and
+their corresponding segmentation masks. **Of note**: the multi-channel images
+generated by `steinbock` are saved as 32-bit images while the segmentation masks
+are saved as 16-bit images. To correctly scale pixel values of the segmentation
+masks when reading them in, we will need to set `as.is = TRUE`.
+
+
+```r
+images <- loadImages("data/steinbock/img/")
+```
+
+```
+## All files in the provided location will be read in.
+```
+
+```r
+masks <- loadImages("data/steinbock/masks_deepcell/", as.is = TRUE)
+```
+
+```
+## All files in the provided location will be read in.
+```
+
+In the case of multi-channel images, it is beneficial to set the `channelNames`
+for easy visualization. Using the `steinbock` framework, the channel order of
+the single-cell data matches the channel order of the multi-channel images.
+However, it is recommended to make sure that the channel order is identical
+between the single-cell data and the images.
+
+
+```r
+channelNames(images) <- rownames(spe)
+images
+```
+
+```
+## CytoImageList containing 14 image(s)
+## names(14): Patient1_001 Patient1_002 Patient1_003 Patient2_001 Patient2_002 Patient2_003 Patient2_004 Patient3_001 Patient3_002 Patient3_003 Patient4_005 Patient4_006 Patient4_007 Patient4_008
+## Each image contains 40 channel(s)
+## channelNames(40): MPO HistoneH3 SMA CD16 CD38 HLADR CD27 CD15 CD45RA CD163 B2M CD20 CD68 Ido1 CD3 LAG3 / LAG33 CD11c PD1 PDGFRb CD7 GrzB PDL1 TCF7 CD45RO FOXP3 ICOS CD8a CarbonicAnhydrase CD33 Ki67 VISTA CD40 CD4 CD14 Ecad CD303 CD206 cleavedPARP DNA1 DNA2
+```
+
+For visualization shown in Section \@ref(image-visualization) we will need to
+add additional metadata to the `elementMetadata` slot of the `CytoImageList`
+objects. This slot is easily accessible using the `mcols` function.
+
+Here, we will store the matched `sample_id`, `patient_id` and `indication`
+information within the `elementMetadata` slot of the multi-channel images and
+segmentation masks objects. It is crucial that the order of the images in
+both `CytoImageList` objects is the same.
+
+
+```r
+all.equal(names(images), names(masks))
+```
+
+```
+## [1] TRUE
+```
+
+```r
+# Extract patient id from image name
+patient_id <- str_extract(names(images), "Patient[1-4]")
+
+# Retrieve cancer type per patient from metadata file
+indication <- meta$Indication[match(patient_id, meta$`Sample ID`)]
+
+# Store patient and image level information in elementMetadata
+mcols(images) <- mcols(masks) <- DataFrame(sample_id = names(images),
+                                           patient_id = patient_id,
+                                           indication = indication)
+```
+
+## Generate single-cell data from images
+
+An alternative way of generating a `SingleCellExperiment` object directly
+from the multi-channel images and segmentation masks is supported by the
+[measureObjects](https://bodenmillergroup.github.io/cytomapper/reference/measureObjects.html)
+function of the `cytomapper` package. For each cell present in the `masks`
+object, the function computes the mean pixel intensity per channel as well as
+morphological features (area, radius, major axis length, eccentricity) and the
+location of cells:
+
+
+```r
+cytomapper_sce <- measureObjects(masks, image = images, img_id = "sample_id")
+
+cytomapper_sce
+```
+
+```
+## class: SingleCellExperiment
+## dim: 40 47859
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(0):
+## colnames: NULL
+## colData names(10): sample_id object_id ... patient_id indication
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+```
+
+## Accessing publicly available IMC datasets
+
+The [imcdatasets](https://github.com/BodenmillerGroup/imcdatasets)
+R/Bioconductor package provides a number of publicly available IMC datasets. For
+a complete introduction to the package, please refer to the
+[documentation](https://bioconductor.org/packages/release/data/experiment/vignettes/imcdatasets/inst/doc/imcdatasets.html).
+Here, we can read in example data of [@Damond2019] taken from patients diagnosed
+with Type I Diabetes. The example here consists of a `CytoImageList` object of
+100 images, a `CytoImageList` object of 100 segmentation masks and a
+`SingleCellExperiment` object containing 252059 cells. **Of note:** downloading the
+images takes quite some time and uses 8 GB of memory.
+
+
+```r
+library(imcdatasets)
+
+pancreasImages <- Damond_2019_Pancreas(data_type = "images")
+pancreasMasks <- Damond_2019_Pancreas(data_type = "masks")
+pancreasSCE <- Damond_2019_Pancreas(data_type = "sce")
+```
+
+## Save objects
+
+Finally, the generated data objects can be saved for further downstream
+processing and analysis.
+
+
+```r
+saveRDS(spe, "data/spe.rds")
+saveRDS(images, "data/images.rds")
+saveRDS(masks, "data/masks.rds")
+```
+
+
+
+
+
+
+
+
+
+
+
+## Session Info
+
+
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 RColorBrewer_1.1-3 +## [3] dittoSeq_1.12.1 lubridate_1.9.3 +## [5] forcats_1.0.0 dplyr_1.1.3 +## [7] purrr_1.0.2 tidyr_1.3.0 +## [9] tibble_3.2.1 ggplot2_3.4.3 +## [11] tidyverse_2.0.0 stringr_1.5.0 +## [13] readr_2.1.4 cytomapper_1.12.0 +## [15] EBImage_4.42.0 imcRtools_1.6.5 +## [17] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 +## [19] SummarizedExperiment_1.30.2 Biobase_2.60.0 +## [21] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 +## [23] IRanges_2.34.1 S4Vectors_0.38.2 +## [25] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 +## [27] matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] later_1.3.1 bitops_1.0-7 +## [3] R.oo_1.25.0 svgPanZoom_0.3.4 +## [5] polyclip_1.10-6 lifecycle_1.0.3 +## [7] sf_1.0-14 rprojroot_2.0.3 +## [9] edgeR_3.42.4 lattice_0.21-8 +## [11] vroom_1.6.3 MASS_7.3-60 +## [13] magrittr_2.0.3 limma_3.56.2 +## [15] sass_0.4.7 rmarkdown_2.25 +## [17] jquerylib_0.1.4 yaml_2.3.7 +## [19] httpuv_1.6.11 sp_2.0-0 +## [21] cowplot_1.1.1 DBI_1.1.3 +## [23] pkgload_1.3.3 abind_1.4-5 +## [25] zlibbioc_1.46.0 R.utils_2.12.2 +## [27] ggraph_2.1.0 RCurl_1.98-1.12 +## [29] tweenr_2.0.2 
GenomeInfoDbData_1.2.10 +## [31] ggrepel_0.9.3 RTriangle_1.6-0.12 +## [33] terra_1.7-46 pheatmap_1.0.12 +## [35] units_0.8-4 dqrng_0.3.1 +## [37] svglite_2.1.1 DelayedMatrixStats_1.22.6 +## [39] codetools_0.2-19 DropletUtils_1.20.0 +## [41] DelayedArray_0.26.7 DT_0.29 +## [43] scuttle_1.10.2 ggforce_0.4.1 +## [45] tidyselect_1.2.0 raster_3.6-23 +## [47] farver_2.1.1 viridis_0.6.4 +## [49] jsonlite_1.8.7 BiocNeighbors_1.18.0 +## [51] e1071_1.7-13 ellipsis_0.3.2 +## [53] tidygraph_1.2.3 ggridges_0.5.4 +## [55] systemfonts_1.0.4 tools_4.3.1 +## [57] Rcpp_1.0.11 glue_1.6.2 +## [59] gridExtra_2.3 xfun_0.40 +## [61] HDF5Array_1.28.1 shinydashboard_0.7.2 +## [63] withr_2.5.1 fastmap_1.1.1 +## [65] rhdf5filters_1.12.1 fansi_1.0.4 +## [67] digest_0.6.33 timechange_0.2.0 +## [69] R6_2.5.1 mime_0.12 +## [71] colorspace_2.1-0 jpeg_0.1-10 +## [73] R.methodsS3_1.8.2 utf8_1.2.3 +## [75] generics_0.1.3 data.table_1.14.8 +## [77] class_7.3-22 graphlayouts_1.0.1 +## [79] htmlwidgets_1.6.2 S4Arrays_1.0.6 +## [81] pkgconfig_2.0.3 gtable_0.3.4 +## [83] XVector_0.40.0 brio_1.1.3 +## [85] htmltools_0.5.6 bookdown_0.35 +## [87] fftwtools_0.9-11 scales_1.2.1 +## [89] png_0.1-8 knitr_1.44 +## [91] rstudioapi_0.15.0 tzdb_0.4.0 +## [93] rjson_0.2.21 proxy_0.4-27 +## [95] cachem_1.0.8 rhdf5_2.44.0 +## [97] KernSmooth_2.23-21 parallel_4.3.1 +## [99] vipor_0.4.5 concaveman_1.1.0 +## [101] desc_1.4.2 pillar_1.9.0 +## [103] grid_4.3.1 vctrs_0.6.3 +## [105] promises_1.2.1 distances_0.1.9 +## [107] beachmat_2.16.0 xtable_1.8-4 +## [109] archive_1.1.6 beeswarm_0.4.0 +## [111] evaluate_0.21 magick_2.8.0 +## [113] cli_3.6.1 locfit_1.5-9.8 +## [115] compiler_4.3.1 rlang_1.1.1 +## [117] crayon_1.5.2 labeling_0.4.3 +## [119] classInt_0.4-10 ggbeeswarm_0.7.2 +## [121] stringi_1.7.12 viridisLite_0.4.2 +## [123] BiocParallel_1.34.2 nnls_1.5 +## [125] munsell_0.5.0 tiff_0.1-11 +## [127] Matrix_1.6-1.1 hms_1.1.3 +## [129] sparseMatrixStats_1.12.2 bit64_4.0.5 +## [131] Rhdf5lib_1.22.1 shiny_1.7.5 +## [133] 
igraph_1.5.1 bslib_0.5.1 +## [135] bit_4.0.5 +``` +
+ diff --git a/04-read_data_files/figure-html/transform-counts-1.png b/04-read_data_files/figure-html/transform-counts-1.png new file mode 100644 index 00000000..604d3783 Binary files /dev/null and b/04-read_data_files/figure-html/transform-counts-1.png differ diff --git a/04-read_data_files/figure-html/transform-counts-2.png b/04-read_data_files/figure-html/transform-counts-2.png new file mode 100644 index 00000000..103ce62b Binary files /dev/null and b/04-read_data_files/figure-html/transform-counts-2.png differ diff --git a/05-spillover_matrix.md b/05-spillover_matrix.md new file mode 100644 index 00000000..cd8a6952 --- /dev/null +++ b/05-spillover_matrix.md @@ -0,0 +1,637 @@ +# Spillover correction + +**Original scripts:** *Vito Zanotelli*, **adapted/maintained by:** *Nils Eling* + +This section highlights how to generate a spillover matrix from individually +acquired single metal spots on an agarose slide. Each spot needs to be imaged as +its own acquisition/ROI and individual TXT files containing the pixel +intensities per spot need to be available. For complete details on the spillover +correction approach, please refer to [the original +publication](https://www.cell.com/cell-systems/fulltext/S2405-4712(18)30063-2) [@Chevrier2017]. + +**Spillover slide preparation:** + +* Prepare 2% agarose in double distilled H$_2$O in a beaker and melt it in a microwave until well dissolved. +* Dip a blank superfrost plus glass microscope slide into the agarose and submerge it until the label. +* Remove the slide and prop it up against a support to allow the excess agarose to run off onto paper towels. +* Allow the slide to dry completely (at least 30 minutes). +* Retrieve all the antibody conjugates used in the panel for which the spillover matrix is to be generated and place them on ice. +* Arrange them in a known order (e.g., mass of the conjugated metal). +* Pipette 0.3 µl spots of 0.4% trypan blue dye into an array on the slide. 
Prepare one spot per antibody, and make sure the spots are well separated. +* Pipette 0.3 µl of each antibody conjugate (usually at 0.5 mg/ml) onto a unique blue spot, taking care to avoid different antibodies bleeding into each other. Note the exact location of each conjugate on the slide. +* Let the spots dry completely, at least 1 hour. + +**Spillover slide acquisition:** + +* Create a JPEG or PNG image of the slide using a mobile phone camera or flat-bed scanner. +* In the CyTOF software, create a new file and import the slide image into it. +* Create a panorama across all the spots to visualize their locations. +* Within each spot, create a region of interest (ROI) with a width of 200 pixels and a height of 10 pixels. +* Name each ROI with the mass and name of the metal conjugate contained in the spot, e.g., "Ir193" or "Ho165". This will be how each TXT file is named. +* Set the profiling type of each ROI to "Local". +* Apply the antibody panel to all the ROIs. This panel should contain all (or more) of the isotopes in the panel, with the correct metal specified. For example, if the metal used is Barium 138, make sure this, rather than Lanthanum 138, is selected. +* Save the file, make sure "Generate Text File" is selected, and start the acquisition. + +This procedure will generate an MCD file similar to the one available on Zenodo: +[10.5281/zenodo.5949115](https://doi.org/10.5281/zenodo.5949115) + +The original code of the spillover correction manuscript is available on GitHub +[here](https://github.com/BodenmillerGroup/cyTOFcompensation); however, due to +changes in the +[CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) +package, users were not able to reproduce the analysis using the newest software +versions. The following workflow uses the newest package versions to generate a +spillover matrix and perform spillover correction. + +In brief, the highlighted workflow comprises 9 steps: + +1. Reading in the data +2.
Quality control +3. (Optional) pixel binning +4. "Debarcoding" for pixel assignment +5. Pixel selection for spillover matrix estimation +6. Spillover matrix generation +7. Saving the results +8. Single-cell compensation +9. Image compensation + +## Generate the spillover matrix + +In the first step, we will generate a spillover matrix based on the single-metal +spots and save it for later use. + +### Read in the data + +Here, we will read in the individual TXT files into a `SingleCellExperiment` +object. This object can be used directly by the `CATALYST` package to estimate +the spillover. + +For this to work, the TXT file names need to contain the spotted metal isotope +name. By default, the first occurrence of the isotope in the format `(mt)(mass)` +(e.g. `Sm152` for Samarium isotope with the atomic mass 152) will be used as +spot identifier. Alternatively, a named list of already read-in pixel intensities +can be provided. For more information, please refer to the man page `?readSCEfromTXT`. + +For further downstream analysis, we will asinh-transform the data using a +cofactor of 5; a common transformation for CyTOF data [@Bendall2011]. +As the pixel intensities are larger than the cell intensities, the cofactor +here is larger than the cofactor when transforming the mean cell intensities. 
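
The effect of the cofactor can be illustrated with a minimal base R sketch. The count values below are hypothetical toy numbers, not taken from the dataset, and the variable names are chosen only for illustration. Dividing by a larger cofactor keeps a wider range of low counts in the near-linear part of the `asinh` curve, which is why a larger cofactor is reasonable for the higher pixel intensities.

```r
# Hypothetical toy counts spanning several orders of magnitude
toy_counts <- c(0, 1, 5, 50, 500, 5000)

# asinh transformation with the two cofactors mentioned in the text
exprs_cof5 <- asinh(toy_counts / 5)  # cofactor used here for pixel intensities
exprs_cof1 <- asinh(toy_counts / 1)  # cofactor used for mean cell intensities

# asinh(x / c) is close to x / c for x << c
# and close to log(2 * x / c) for x >> c
data.frame(toy_counts, exprs_cof5, exprs_cof1)
```

The actual data are read in and transformed in the chunk that follows.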
+ + +```r +library(imcRtools) + +# Create SingleCellExperiment from TXT files +sce <- readSCEfromTXT("data/compensation/") +``` + +``` +## Spotted channels: Y89, In113, In115, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176 +## Acquired channels: Ar80, Y89, In113, In115, Xe131, Xe134, Ba136, La138, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176, Ir191, Ir193, Pt196, Pb206 +## Channels spotted but not acquired: +## Channels acquired but not spotted: Ar80, Xe131, Xe134, Ba136, La138, Ir191, Ir193, Pt196, Pb206 +``` + +```r +assay(sce, "exprs") <- asinh(counts(sce)/5) +``` + +### Quality control + +In the next step, we will observe the median pixel intensities per spot and +threshold on medians < 200 counts. +These types of visualization serve two purposes: + +1. Small median pixel intensities (< 200 counts) might hinder the robust +estimation of the channel spillover. In that case, consecutive pixels can be +summed (see [Optional pixel binning](#pixel_binning)). + +2. Each spotted metal (row) should show the highest median pixel intensity in its +corresponding channel (column). If this is not the case, either the naming of the +TXT files was incorrect or the incorrect metal was spotted. + + +```r +# Log10 median pixel counts per spot and channel +plotSpotHeatmap(sce) +``` + + + +```r +# Thresholded on 200 pixel counts +plotSpotHeatmap(sce, log = FALSE, threshold = 200) +``` + + + +As we can see, nearly all median pixel intensities are > 200 counts for each spot. +We also observe acquired channels for which no spot was placed (e.g., Xe134, Ir191, Ir193). 
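
The two checks described in this section can also be sketched numerically. The following base R example uses simulated counts for three hypothetical spots (the object names `spots`, `pixel_counts` and `med` are assumptions made for this illustration and are not part of `imcRtools`). It only mirrors the logic behind the heatmaps and is not a replacement for `plotSpotHeatmap`.

```r
set.seed(1)

# Hypothetical toy data: 3 spotted metals with 50 pixels each, 3 acquired channels
channels <- c("Dy161", "Dy162", "Dy163")
spots <- rep(channels, each = 50)

# Simulate counts: strong signal in the matching channel, weak signal elsewhere
pixel_counts <- sapply(channels, function(ch) {
  rpois(length(spots), lambda = ifelse(spots == ch, 500, 5))
})

# Median pixel intensity per spot (rows) and channel (columns)
med <- t(sapply(unique(spots), function(s) {
  apply(pixel_counts[spots == s, , drop = FALSE], 2, median)
}))

# Check 1: the median in the matching channel exceeds the 200-count threshold
diag(med) > 200

# Check 2: each spot has its highest median in its own channel
colnames(med)[apply(med, 1, which.max)] == rownames(med)
```

If either check returned `FALSE` for a spot in real data, this would point to low signal (consider pixel binning) or to a naming or spotting error, respectively.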
+ +### Optional pixel binning {#pixel_binning} + +In cases where median pixel intensities are low (< 200 counts), consecutive +pixels can be summed to increase the robustness of the spillover estimation. +The `imcRtools` package provides the `binAcrossPixels` function, +which performs aggregation for each channel across `bin_size` consecutive pixels +per spotted metal. + + +```r +# Define grouping +bin_size = 10 + +sce2 <- binAcrossPixels(sce, bin_size = bin_size) + +# Log10 median pixel counts per spot and channel +plotSpotHeatmap(sce2) +``` + + + +```r +# Thresholded on 200 pixel counts +plotSpotHeatmap(sce2, log = FALSE, threshold = 200) +``` + + + +Here, we can see an increase in the median pixel intensities and accumulation of +off-diagonal signal. Due to already high original pixel intensities, we will +refrain from aggregating across consecutive pixels for this demonstration. + +### Filtering incorrectly assigned pixels + +The following step uses functions provided by the `CATALYST` package to +"debarcode" the pixels. Based on the intensity distribution of all channels, +pixels are assigned to their corresponding barcode; here this is the already +known metal spot. This procedure serves the purpose to identify pixels that +cannot be robustly assigned to the spotted metal. Pixels of such kind can be +regarded as "noisy", "background" or "artefacts" that should be removed prior to +spillover estimation. + +We will also need to specify which channels were spotted (argument `bc_key`). +This information is directly contained in the `colData(sce)` slot. +To facilitate visualization, we will order the `bc_key` by mass. + +The general workflow for pixel debarcoding is as follows: + +1. assign a preliminary metal mass to each pixel +2. for each pixel, estimate a cutoff parameter for the distance between +positive and negative pixel sets +3. 
apply the estimated cutoffs to identify truly positive pixels + + +```r +library(CATALYST) + +bc_key <- as.numeric(unique(sce$sample_mass)) +bc_key <- bc_key[order(bc_key)] + +sce <- assignPrelim(sce, bc_key = bc_key) +sce <- estCutoffs(sce) +sce <- applyCutoffs(sce) +``` + +The obtained `SingleCellExperiment` now contains the additional `bc_id` entry. +For each pixel, this vector indicates the assigned mass (e.g., `161`) or +`0`, meaning unassigned. + +This information can be visualized in the form of a heatmap: + + +```r +library(pheatmap) +cur_table <- table(sce$bc_id, sce$sample_mass) + +# Visualize the correctly and incorrectly assigned pixels +pheatmap(log10(cur_table + 1), + cluster_rows = FALSE, + cluster_cols = FALSE) +``` + + + +```r +# Compute the fraction of unassigned pixels per spot +cur_table["0",] / colSums(cur_table) +``` + +``` +## 113 115 141 142 143 144 145 146 147 148 149 +## 0.1985 0.1060 0.2575 0.3195 0.3190 0.3825 0.3545 0.4280 0.3570 0.4770 0.4200 +## 150 151 152 153 154 155 156 158 159 160 161 +## 0.4120 0.4025 0.4050 0.4630 0.4190 0.4610 0.3525 0.4020 0.4655 0.4250 0.5595 +## 162 163 164 165 166 167 168 169 170 171 172 +## 0.4340 0.4230 0.4390 0.4055 0.5210 0.3900 0.3285 0.3680 0.5015 0.4900 0.5650 +## 173 174 175 176 89 +## 0.3125 0.4605 0.4710 0.2845 0.3015 +``` + +We can see here that all assigned pixels were assigned to the correct mass and +that all pixel sets are made up of > 800 pixels. + +However, in cases where incorrect assignment occurred or where few pixels were +measured for some spots, the `imcRtools` package exports a simple helper +function to exclude pixels based on these criteria: + + +```r +sce <- filterPixels(sce, minevents = 40, correct_pixels = TRUE) +``` + +In the `filterPixels` function, the `minevents` parameter specifies the threshold +under which correctly assigned pixel sets are excluded from spillover +estimation.
The `correct_pixels` parameter indicates whether pixels that were +assigned to masses other than the spotted mass should be excluded from spillover +estimation. The default values often result in sufficient pixel filtering; +however, if very few pixels (~100) are measured per spot, the `minevents` +parameter value needs to be lowered. + +### Compute spillover matrix + +Based on the single-positive pixels, we use the `CATALYST::computeSpillmat()` +function to compute the spillover matrix and `CATALYST::plotSpillmat()` to +visualize it. The `plotSpillmat` function checks the spotted and acquired +metal isotopes against the pre-defined `CATALYST::isotope_list` object. In this dataset, +the `Ar80` channel was additionally acquired to check for deviations in signal +intensity. `Ar80` needs to be added to a custom `isotope_list` object for +visualization. + + +```r +sce <- computeSpillmat(sce) + +isotope_list <- CATALYST::isotope_list +isotope_list$Ar <- 80 + +plotSpillmat(sce, isotope_list = isotope_list) +``` + +``` +## Warning: The `guide` argument in `scale_*()` cannot be `FALSE`. This was deprecated in +## ggplot2 3.3.4. +## ℹ Please use "none" instead. +## ℹ The deprecated feature was likely used in the CATALYST package. +## Please report the issue at . +## This warning is displayed once every 8 hours. +## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was +## generated. +``` + + + +```r +# Save spillover matrix in variable +sm <- metadata(sce)$spillover_matrix +``` + +**Of note: the spillover matrix plot produced by CATALYST currently does not +display spillover between the higher-mass channels.** In this case, the +displayed matrix is clipped at Yb171. + +As we can see, the largest spillover appears in `In113 --> In115` and we also +observe the `+16` oxide impurities, e.g., for `Nd148 --> Dy164`. + +We can save the spillover matrix for external use.
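
To make concrete what such a matrix encodes, the following base R sketch applies a small, entirely hypothetical spillover matrix to toy signals and then reverses the effect. The values and the object names `sm_toy`, `true_signal`, `observed` and `compensated` are assumptions for illustration only. The sketch uses plain matrix inversion; `compCytof` may use a different solver (see `?compCytof`), so this is not a description of the CATALYST implementation.

```r
# Hypothetical spillover matrix
# (rows: emitting channel, columns: receiving channel)
chs <- c("Yb172", "Yb173", "Yb174")
sm_toy <- matrix(c(1.00, 0.02, 0.00,
                   0.01, 1.00, 0.03,
                   0.00, 0.00, 1.00),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(chs, chs))

# True (unobserved) signal of two toy cells
true_signal <- matrix(c(100,   0,  0,
                          0, 200, 10),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("cell1", "cell2"), chs))

# Observed signal: each channel leaks a fraction of its signal into others
observed <- true_signal %*% sm_toy

# Compensation by inverting the spillover matrix
compensated <- observed %*% solve(sm_toy)

# The compensated values recover the true signal up to numerical precision
all.equal(unname(compensated), unname(true_signal))
```

The matrix `sm` estimated above is written to disk in the next chunk.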
+ + +```r +write.csv(sm, "data/sm.csv") +``` + +## Single-cell data compensation + +The `CATALYST` package can be used to perform spillover compensation on the +**single-cell mean intensities**. Here, the `SpatialExperiment` object generated +in Section \@ref(read-data) is read in. The `CATALYST` package requires an entry +to `rowData(spe)$channel_name` for the `compCytof` function to run. This entry +should contain the metal isotopes in the form (mt)(mass)Di (e.g., `Sm152Di` for +Samarium isotope with the atomic mass 152). + +The `compCytof` function performs channel spillover compensation on the mean +pixel intensities per channel and cell. Here, we will not overwrite the assays +in the `SpatialExperiment` object to later highlight the effect of compensation. +As shown in Section \@ref(read-data), also the compensated counts are +asinh-transformed using a cofactor of 1. + + +```r +spe <- readRDS("data/spe.rds") +rowData(spe)$channel_name <- paste0(rowData(spe)$channel, "Di") + +spe <- compCytof(spe, sm, + transform = TRUE, cofactor = 1, + isotope_list = isotope_list, + overwrite = FALSE) +``` + +To check the effect of channel spillover compensation, the expression of markers +that are affected by spillover (e.g., E-cadherin in channel Yb173 and CD303 in +channel Yb174) can be visualized in form of scatter plots before and after +compensation. + + +```r +library(dittoSeq) +library(patchwork) +before <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303", + assay.x = "exprs", assay.y = "exprs") + + ggtitle("Before compensation") + +after <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303", + assay.x = "compexprs", assay.y = "compexprs") + + ggtitle("After compensation") +before + after +``` + + + +We observe that the spillover Yb173 --> Yb174 was successfully corrected. 
+To facilitate further downstream analysis, the non-compensated assays can now be +replaced by their compensated counterparts: + + +```r +assay(spe, "counts") <- assay(spe, "compcounts") +assay(spe, "exprs") <- assay(spe, "compexprs") +assay(spe, "compcounts") <- assay(spe, "compexprs") <- NULL +``` + +## Image compensation + +The [cytomapper](https://github.com/BodenmillerGroup/cytomapper) package allows channel +spillover compensation directly on **multi-channel images**. +The `compImage` function takes a `CytoImageList` object and the estimated +spillover matrix as input. More info on how to work with `CytoImageList` +objects can be seen in Section \@ref(image-visualization). + +At this point, we can read in the `CytoImageList` object containing multi-channel +images as generated in Section \@ref(read-data). +The `channelNames` need to be set according to their metal isotope in the form +(mt)(mass)Di and therefore match `colnames(sm)`. + + +```r +library(cytomapper) + +images <- readRDS("data/images.rds") +channelNames(images) <- rowData(spe)$channel_name +``` + +The CATALYST package provides the `adaptSpillmat` function that corrects the +spillover matrix in a way that rows and columns match a predefined set of +metals. Please refer to `?compCytof` for more information how metals in the +spillover matrix are matched to acquired channels in the `SingleCellExperiment` +object. + +The spillover matrix can now be adapted to exclude channels that were not kept +for downstream analysis. + + +```r +adapted_sm <- adaptSpillmat(sm, channelNames(images), + isotope_list = isotope_list) +``` + +``` +## Compensation is likely to be inaccurate. +## Spill values for the following interactions +## have not been estimated: +``` + +``` +## Ir191Di -> Ir193Di +``` + +``` +## Ir193Di -> Ir191Di +``` + +The adapted spillover matrix now matches the `channelNames` of the +`CytoImageList` object and can be used to perform pixel-level spillover +compensation. 
Here, we parallelise the image compensation on all available cores minus 2. When +working on Windows, you will need to use the `SnowParam` function instead of +`MulticoreParam`. + + +```r +library(BiocParallel) + +images_comp <- compImage(images, adapted_sm, + BPPARAM = MulticoreParam()) +``` + +As a sanity check, we will visualize the image before and after compensation: + + +```r +# Before compensation +plotPixels(images[5], colour_by = "Yb173Di", + image_title = list(text = "Yb173 (Ecad) - before", position = "topleft"), + legend = NULL, bcg = list(Yb173Di = c(0, 4, 1))) +``` + + + +```r +plotPixels(images[5], colour_by = "Yb174Di", + image_title = list(text = "Yb174 (CD303) - before", position = "topleft"), + legend = NULL, bcg = list(Yb174Di = c(0, 4, 1))) +``` + + + +```r +# After compensation +plotPixels(images_comp[5], colour_by = "Yb173Di", + image_title = list(text = "Yb173 (Ecad) - after", position = "topleft"), + legend = NULL, bcg = list(Yb173Di = c(0, 4, 1))) +``` + + + +```r +plotPixels(images_comp[5], colour_by = "Yb174Di", + image_title = list(text = "Yb174 (CD303) - after", position = "topleft"), + legend = NULL, bcg = list(Yb174Di = c(0, 4, 1))) +``` + + + +For convenience, we will re-set the `channelNames` to their biological targets: + + +```r +channelNames(images_comp) <- rownames(spe) +``` + + + +## Write out compensated images + +In the final step, the compensated images are written out as 16-bit TIFF +files: + + +```r +library(tiff) +dir.create("data/comp_img") +lapply(names(images_comp), function(x){ + writeImage(as.array(images_comp[[x]])/(2^16 - 1), + paste0("data/comp_img/", x, ".tiff"), + bits.per.sample = 16) +}) +``` + +## Save objects + +For further downstream analysis, the compensated `SpatialExperiment` and +`CytoImageList` objects are saved, replacing the former objects: + + +```r +saveRDS(spe, "data/spe.rds") +saveRDS(images_comp, "data/images.rds") +``` + + + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 tiff_0.1-11 +## [3] BiocParallel_1.34.2 cytomapper_1.12.0 +## [5] EBImage_4.42.0 patchwork_1.1.3 +## [7] dittoSeq_1.12.1 ggplot2_3.4.3 +## [9] pheatmap_1.0.12 CATALYST_1.24.0 +## [11] imcRtools_1.6.5 SpatialExperiment_1.10.0 +## [13] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 +## [15] Biobase_2.60.0 GenomicRanges_1.52.0 +## [17] GenomeInfoDb_1.36.3 IRanges_2.34.1 +## [19] S4Vectors_0.38.2 BiocGenerics_0.46.0 +## [21] MatrixGenerics_1.12.3 matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] bitops_1.0-7 sf_1.0-14 +## [3] RColorBrewer_1.1-3 doParallel_1.0.17 +## [5] tools_4.3.1 backports_1.4.1 +## [7] utf8_1.2.3 R6_2.5.1 +## [9] DT_0.29 HDF5Array_1.28.1 +## [11] rhdf5filters_1.12.1 GetoptLong_1.0.5 +## [13] withr_2.5.1 sp_2.0-0 +## [15] gridExtra_2.3 cli_3.6.1 +## [17] archive_1.1.6 sandwich_3.0-2 +## [19] labeling_0.4.3 sass_0.4.7 +## [21] nnls_1.5 mvtnorm_1.2-3 +## [23] readr_2.1.4 proxy_0.4-27 +## [25] ggridges_0.5.4 systemfonts_1.0.4 +## [27] colorRamps_2.3.1 svglite_2.1.1 +## [29] R.utils_2.12.2 scater_1.28.0 +## [31] plotrix_3.8-2 limma_3.56.2 +## [33] flowCore_2.12.2 
rstudioapi_0.15.0 +## [35] generics_0.1.3 shape_1.4.6 +## [37] gtools_3.9.4 vroom_1.6.3 +## [39] car_3.1-2 dplyr_1.1.3 +## [41] Matrix_1.6-1.1 RProtoBufLib_2.12.1 +## [43] ggbeeswarm_0.7.2 fansi_1.0.4 +## [45] abind_1.4-5 R.methodsS3_1.8.2 +## [47] terra_1.7-46 lifecycle_1.0.3 +## [49] multcomp_1.4-25 yaml_2.3.7 +## [51] edgeR_3.42.4 carData_3.0-5 +## [53] rhdf5_2.44.0 Rtsne_0.16 +## [55] grid_4.3.1 promises_1.2.1 +## [57] dqrng_0.3.1 crayon_1.5.2 +## [59] shinydashboard_0.7.2 lattice_0.21-8 +## [61] beachmat_2.16.0 cowplot_1.1.1 +## [63] magick_2.8.0 pillar_1.9.0 +## [65] knitr_1.44 ComplexHeatmap_2.16.0 +## [67] RTriangle_1.6-0.12 rjson_0.2.21 +## [69] codetools_0.2-19 glue_1.6.2 +## [71] data.table_1.14.8 vctrs_0.6.3 +## [73] png_0.1-8 gtable_0.3.4 +## [75] cachem_1.0.8 xfun_0.40 +## [77] S4Arrays_1.0.6 mime_0.12 +## [79] DropletUtils_1.20.0 tidygraph_1.2.3 +## [81] ConsensusClusterPlus_1.64.0 survival_3.5-5 +## [83] iterators_1.0.14 cytolib_2.12.1 +## [85] units_0.8-4 ellipsis_0.3.2 +## [87] TH.data_1.1-2 bit64_4.0.5 +## [89] rprojroot_2.0.3 bslib_0.5.1 +## [91] irlba_2.3.5.1 svgPanZoom_0.3.4 +## [93] vipor_0.4.5 KernSmooth_2.23-21 +## [95] colorspace_2.1-0 DBI_1.1.3 +## [97] raster_3.6-23 tidyselect_1.2.0 +## [99] bit_4.0.5 compiler_4.3.1 +## [101] BiocNeighbors_1.18.0 desc_1.4.2 +## [103] DelayedArray_0.26.7 bookdown_0.35 +## [105] scales_1.2.1 classInt_0.4-10 +## [107] distances_0.1.9 stringr_1.5.0 +## [109] digest_0.6.33 fftwtools_0.9-11 +## [111] rmarkdown_2.25 XVector_0.40.0 +## [113] htmltools_0.5.6 pkgconfig_2.0.3 +## [115] jpeg_0.1-10 sparseMatrixStats_1.12.2 +## [117] fastmap_1.1.1 rlang_1.1.1 +## [119] GlobalOptions_0.1.2 htmlwidgets_1.6.2 +## [121] shiny_1.7.5 DelayedMatrixStats_1.22.6 +## [123] farver_2.1.1 jquerylib_0.1.4 +## [125] zoo_1.8-12 jsonlite_1.8.7 +## [127] R.oo_1.25.0 BiocSingular_1.16.0 +## [129] RCurl_1.98-1.12 magrittr_2.0.3 +## [131] scuttle_1.10.2 GenomeInfoDbData_1.2.10 +## [133] Rhdf5lib_1.22.1 munsell_0.5.0 +## [135] Rcpp_1.0.11 
ggnewscale_0.4.9 +## [137] viridis_0.6.4 stringi_1.7.12 +## [139] ggraph_2.1.0 brio_1.1.3 +## [141] zlibbioc_1.46.0 MASS_7.3-60 +## [143] plyr_1.8.8 parallel_4.3.1 +## [145] ggrepel_0.9.3 graphlayouts_1.0.1 +## [147] splines_4.3.1 hms_1.1.3 +## [149] circlize_0.4.15 locfit_1.5-9.8 +## [151] igraph_1.5.1 ggpubr_0.6.0 +## [153] ggsignif_0.6.4 pkgload_1.3.3 +## [155] ScaledMatrix_1.8.1 reshape2_1.4.4 +## [157] XML_3.99-0.14 drc_3.0-1 +## [159] evaluate_0.21 tzdb_0.4.0 +## [161] foreach_1.5.2 tweenr_2.0.2 +## [163] httpuv_1.6.11 tidyr_1.3.0 +## [165] purrr_1.0.2 polyclip_1.10-6 +## [167] clue_0.3-65 ggforce_0.4.1 +## [169] rsvd_1.0.5 broom_1.0.5 +## [171] xtable_1.8-4 e1071_1.7-13 +## [173] rstatix_0.7.2 later_1.3.1 +## [175] viridisLite_0.4.2 class_7.3-22 +## [177] tibble_3.2.1 FlowSOM_2.8.0 +## [179] beeswarm_0.4.0 cluster_2.1.4 +## [181] concaveman_1.1.0 +``` +
+ diff --git a/05-spillover_matrix_files/figure-html/QC-heatmap-1.png b/05-spillover_matrix_files/figure-html/QC-heatmap-1.png new file mode 100644 index 00000000..ff628516 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/QC-heatmap-1.png differ diff --git a/05-spillover_matrix_files/figure-html/QC-heatmap-2.png b/05-spillover_matrix_files/figure-html/QC-heatmap-2.png new file mode 100644 index 00000000..1abc16a0 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/QC-heatmap-2.png differ diff --git a/05-spillover_matrix_files/figure-html/assignment-heatmap-1.png b/05-spillover_matrix_files/figure-html/assignment-heatmap-1.png new file mode 100644 index 00000000..41aeb129 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/assignment-heatmap-1.png differ diff --git a/05-spillover_matrix_files/figure-html/binning-1.png b/05-spillover_matrix_files/figure-html/binning-1.png new file mode 100644 index 00000000..c4eb13da Binary files /dev/null and b/05-spillover_matrix_files/figure-html/binning-1.png differ diff --git a/05-spillover_matrix_files/figure-html/binning-2.png b/05-spillover_matrix_files/figure-html/binning-2.png new file mode 100644 index 00000000..7a8d45da Binary files /dev/null and b/05-spillover_matrix_files/figure-html/binning-2.png differ diff --git a/05-spillover_matrix_files/figure-html/compute-spillover-1.png b/05-spillover_matrix_files/figure-html/compute-spillover-1.png new file mode 100644 index 00000000..e5093aa1 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/compute-spillover-1.png differ diff --git a/05-spillover_matrix_files/figure-html/image-visualization-1.png b/05-spillover_matrix_files/figure-html/image-visualization-1.png new file mode 100644 index 00000000..f4107a28 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/image-visualization-1.png differ diff --git a/05-spillover_matrix_files/figure-html/image-visualization-2.png 
b/05-spillover_matrix_files/figure-html/image-visualization-2.png new file mode 100644 index 00000000..959c8d27 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/image-visualization-2.png differ diff --git a/05-spillover_matrix_files/figure-html/image-visualization-3.png b/05-spillover_matrix_files/figure-html/image-visualization-3.png new file mode 100644 index 00000000..b85c0d04 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/image-visualization-3.png differ diff --git a/05-spillover_matrix_files/figure-html/image-visualization-4.png b/05-spillover_matrix_files/figure-html/image-visualization-4.png new file mode 100644 index 00000000..6cc8f054 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/image-visualization-4.png differ diff --git a/05-spillover_matrix_files/figure-html/visualize-single-cell-spillover-1.png b/05-spillover_matrix_files/figure-html/visualize-single-cell-spillover-1.png new file mode 100644 index 00000000..c8273459 Binary files /dev/null and b/05-spillover_matrix_files/figure-html/visualize-single-cell-spillover-1.png differ diff --git a/06-quality_control.md b/06-quality_control.md new file mode 100644 index 00000000..c5471fe7 --- /dev/null +++ b/06-quality_control.md @@ -0,0 +1,708 @@ +# Image and cell-level quality control + +The following section discusses possible quality indicators for data obtained +by IMC and other highly multiplexed imaging technologies. Here, we will focus +on describing quality metrics on the single-cell as well as image level. + +## Read in the data + +We will first read in the data processed in previous sections: + + +```r +images <- readRDS("data/images.rds") +masks <- readRDS("data/masks.rds") +spe <- readRDS("data/spe.rds") +``` + +## Segmentation quality control {#seg-quality} + +The first step after image segmentation is to observe its accuracy. 
+Without having ground-truth data readily available, a common approach to +segmentation quality control is to overlay segmentation masks on composite images +displaying channels that were used for segmentation. +The [cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html) +package supports exactly this task by using the `plotPixels` function. + +Here, we select 3 random images and perform image- and channel-wise +normalization (channels are first min-max normalized and scaled to a range of +0-1 before clipping the maximum intensity to 0.2). + + +```r +library(cytomapper) +set.seed(20220118) +img_ids <- sample(seq_along(images), 3) + +# Normalize and clip images +cur_images <- images[img_ids] +cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE) +cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2)) + +plotPixels(cur_images, + mask = masks[img_ids], + img_id = "sample_id", + missing_colour = "white", + colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"), + colour = list(CD163 = c("black", "yellow"), + CD20 = c("black", "red"), + CD3 = c("black", "green"), + Ecad = c("black", "cyan"), + DNA1 = c("black", "blue")), + image_title = NULL, + legend = list(colour_by.title.cex = 0.7, + colour_by.labels.cex = 0.7)) +``` + + + +We can see that nuclei are centered within the segmentation masks and all cell +types are correctly segmented (note: to zoom into the image you can right-click +and select `Open Image in New Tab`). A common challenge here is to segment large (e.g., +epithelial cells - in cyan) _versus_ small (e.g., B cells - in red) cells. However, the +segmentation approach here appears to correctly segment cells across different +sizes. + +An easier and interactive way of observing segmentation quality is to use the +interactive image viewer provided by the +[cytoviewer](https://github.com/BodenmillerGroup/cytoviewer) R/Bioconductor +package [@Meyer2023].
Under "Image-level" > "Basic controls", up to six markers +can be selected for visualization. The contrast of each marker can be adjusted. +Under "Image-level" > "Advanced controls", click the "Show cell outlines" box +to outline segmented cells on the images. + + +```r +library(cytoviewer) + +app <- cytoviewer(image = images, + mask = masks, + cell_id = "ObjectNumber", + img_id = "sample_id") + +if (interactive()) { + shiny::runApp(app, launch.browser = TRUE) +} +``` + +An additional approach to observe cell segmentation quality and potentially also +antibody specificity issues is to visualize single-cell expression in form of a +heatmap. Here, we sub-sample the dataset to 2000 cells for visualization +purposes and overlay the cancer type from which the cells were extracted. + + +```r +library(dittoSeq) +library(viridis) +cur_cells <- sample(seq_len(ncol(spe)), 2000) + +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", + cluster_cols = TRUE, + scale = "none", + heatmap.colors = viridis(100), + annot.by = "indication", + annotation_colors = list(indication = metadata(spe)$color_vectors$indication)) +``` + + + +We can differentiate between epithelial cells (Ecad+) and immune cells +(CD45RO+). Some of the markers are detected in specific cells (e.g., Ki67, CD20, +Ecad) while others are more broadly expressed across cells (e.g., HLADR, B2M, +CD4). + +## Image-level quality control {#image-quality} + +Image-level quality control is often performed using tools that offer a +graphical user interface such as [QuPath](https://qupath.github.io/), +[FIJI](https://imagej.net/software/fiji/) and the previously mentioned +[cytoviewer](https://github.com/BodenmillerGroup/cytoviewer) package. Viewers +that were specifically developed for IMC data can be seen +[here](https://bodenmillergroup.github.io/IMCWorkflow/viewers.html). In this +section, we will specifically focus on quantitative metrics to assess image +quality. 
+ +It is often of interest to calculate the signal-to-noise ratio (SNR) for +individual channels and markers. Here, we define the SNR as: + +$$SNR = I_s/I_n$$ + +where $I_s$ is the intensity of the signal (mean intensity of pixels with true +signal) and $I_n$ is the intensity of the noise (mean intensity of pixels +containing noise). This definition of the SNR is just one of many and other +measures can be applied. Finding a threshold that separates pixels containing +signal and pixels containing noise is not trivial and different approaches can +be chosen. Here, we use the `otsu` thresholding approach to find pixels of the +"foreground" (i.e., signal) and "background" (i.e., noise). The SNR is then +defined as the mean intensity of foreground pixels divided by the mean intensity +of background pixels. We compute this measure as well as the mean signal +intensity per image. The plot below shows the average SNR _versus_ the average +signal intensity across all images. + + +```r +library(tidyverse) +library(ggrepel) +library(EBImage) + +cur_snr <- lapply(names(images), function(x){ + img <- images[[x]] + mat <- apply(img, 3, function(ch){ + # Otsu threshold + thres <- otsu(ch, range = c(min(ch), max(ch)), levels = 65536) + # Signal-to-noise ratio + snr <- mean(ch[ch > thres]) / mean(ch[ch <= thres]) + # Signal intensity + ps <- mean(ch[ch > thres]) + + return(c(snr = snr, ps = ps)) + }) + t(mat) %>% as.data.frame() %>% + mutate(image = x, + marker = colnames(mat)) %>% + pivot_longer(cols = c(snr, ps)) +}) + +cur_snr <- do.call(rbind, cur_snr) + +cur_snr %>% + group_by(marker, name) %>% + summarize(log_mean = log2(mean(value))) %>% + pivot_wider(names_from = name, values_from = log_mean) %>% + ggplot() + + geom_point(aes(ps, snr)) + + geom_label_repel(aes(ps, snr, label = marker)) + + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + + xlab("Signal intensity [log2]") +``` + + + +We observe PD1, LAG3 and cleaved PARP to have high SNR but low signal 
intensity +meaning that in general these markers are not abundantly expressed. The Iridium +intercalator (here marked as DNA1 and DNA2) has the highest signal intensity +but low SNR. This might be due to staining differences between individual nuclei +where some nuclei are considered as background. We do however observe high +SNR and sufficient signal intensity for the majority of markers. + +Otsu thresholding and SNR calculation do not perform well if the markers are +lowly abundant. In the next code chunk, we will remove markers that have +a positive signal below 2 per image. + + +```r +cur_snr <- cur_snr %>% + pivot_wider(names_from = name, values_from = value) %>% + filter(ps > 2) %>% + pivot_longer(cols = c(snr, ps)) + +cur_snr %>% + group_by(marker, name) %>% + summarize(log_mean = log2(mean(value))) %>% + pivot_wider(names_from = name, values_from = log_mean) %>% + ggplot() + + geom_point(aes(ps, snr)) + + geom_label_repel(aes(ps, snr, label = marker)) + + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + + xlab("Signal intensity [log2]") +``` + + + +This visualization shows a reduced SNR for PD1, LAG3 and cleaved PARP which was +previously inflated due to low signal. + +Another quality indicator is the image area covered by cells (or biological +tissue). This metric identifies ROIs where few cells are present, possibly +hinting at incorrect selection of the ROI.
We can compute the percentage of +covered image area using the metadata contained in the `SpatialExperiment` +object: + + +```r +cell_density <- colData(spe) %>% + as.data.frame() %>% + group_by(sample_id) %>% + # Compute the number of pixels covered by cells and + # the total number of pixels + summarize(cell_area = sum(area), + no_pixels = mean(width_px) * mean(height_px)) %>% + # Divide the number of pixels covered by cells + # by the total number of pixels + mutate(covered_area = cell_area / no_pixels) + +# Visualize the image area covered by cells per image +ggplot(cell_density) + + geom_point(aes(reorder(sample_id,covered_area), covered_area)) + + theme_minimal(base_size = 15) + + theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 15)) + + ylim(c(0, 1)) + + ylab("% covered area") + xlab("") +``` + + + +We observe that two of the 14 images show unusually low cell coverage. These +two images can now be visualized using `cytomapper`. + + +```r +# Normalize and clip images +cur_images <- images[c("Patient4_005", "Patient4_007")] +cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE) +cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2)) + +plotPixels(cur_images, + mask = masks[c("Patient4_005", "Patient4_007")], + img_id = "sample_id", + missing_colour = "white", + colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"), + colour = list(CD163 = c("black", "yellow"), + CD20 = c("black", "red"), + CD3 = c("black", "green"), + Ecad = c("black", "cyan"), + DNA1 = c("black", "blue")), + legend = list(colour_by.title.cex = 0.7, + colour_by.labels.cex = 0.7)) +``` + + + +These two images display less dense tissue structure but overall the images are +intact and appear to be segmented correctly. + +Finally, it can be beneficial to visualize the mean marker expression per image +to identify images with outlying marker expression. This check does not +indicate image quality _per se_ but can highlight biological differences.
Here, +we will use the `aggregateAcrossCells` function of the +*[scuttle](https://bioconductor.org/packages/3.17/scuttle)* package to compute the mean expression per +image. For visualization purposes, we again `asinh` transform the mean expression +values. + + +```r +library(scuttle) + +image_mean <- aggregateAcrossCells(spe, + ids = spe$sample_id, + statistics="mean", + use.assay.type = "counts") +assay(image_mean, "exprs") <- asinh(counts(image_mean)) + +dittoHeatmap(image_mean, genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", cluster_cols = TRUE, scale = "none", + heatmap.colors = viridis(100), + annot.by = c("indication", "patient_id", "ROI"), + annotation_colors = list(indication = metadata(spe)$color_vectors$indication, + patient_id = metadata(spe)$color_vectors$patient_id, + ROI = metadata(spe)$color_vectors$ROI), + show_colnames = TRUE) +``` + + + +We observe extensive biological variation across the 14 images specifically for +some of the cell phenotype markers including the macrophage marker CD206, the B +cell marker CD20, the neutrophil marker CD15, and the proliferation marker Ki67. +These differences will be further studied in the following chapters. + +## Cell-level quality control {#cell-quality} + +In the following paragraphs we will look at different metrics and visualization +approaches to assess data quality (as well as biological differences) on the +single-cell level. + +Related to the signal-to-noise ratio (SNR) calculated above on the pixel-level, +a similar measure can be derived on the single-cell level. Here, we will use +a two component Gaussian mixture model for each marker to find cells +with positive and negative expression. The SNR is defined as: + +$$SNR = I_s/I_n$$ + +where $I_s$ is the intensity of the signal (mean intensity of cells with +positive signal) and $I_n$ is the intensity of the noise (mean intensity of +cells lacking expression). 
To define cells with positive and negative marker +expression, we fit the mixture model across the transformed counts of all cells +contained in the `SpatialExperiment` object. Next, for each marker we calculate +the mean of the non-transformed counts for the positive and the negative cells. +The SNR is then the ratio between the mean of the positive signal and the mean +of the negative signal. + + +```r +library(mclust) + +set.seed(220224) +mat <- sapply(seq_len(nrow(spe)), function(x){ + cur_exprs <- assay(spe, "exprs")[x,] + cur_counts <- assay(spe, "counts")[x,] + + cur_model <- Mclust(cur_exprs, G = 2) + mean1 <- mean(cur_counts[cur_model$classification == 1]) + mean2 <- mean(cur_counts[cur_model$classification == 2]) + + signal <- ifelse(mean1 > mean2, mean1, mean2) + noise <- ifelse(mean1 > mean2, mean2, mean1) + + return(c(snr = signal/noise, ps = signal)) +}) + +cur_snr <- t(mat) %>% as.data.frame() %>% + mutate(marker = rownames(spe)) + +cur_snr %>% ggplot() + + geom_point(aes(log2(ps), log2(snr))) + + geom_label_repel(aes(log2(ps), log2(snr), label = marker)) + + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + + xlab("Signal intensity [log2]") +``` + + + +Next, we inspect the distributions of cell size across the individual images. +Differences in cell size distributions can indicate segmentation biases due to +differences in cell density or can indicate biological differences due to cell +type compositions (tumor cells tend to be larger than immune cells). + + +```r +dittoPlot(spe, var = "area", + group.by = "sample_id", + plots = "boxplot") + + ylab("Cell area") + xlab("") +``` + + + +```r +summary(spe$area) +``` + +``` +## Min. 1st Qu. Median Mean 3rd Qu. Max. +## 3.00 47.00 70.00 76.38 98.00 466.00 +``` + +The median cell size is 70 pixels with a median major axis +length of 11.3. The largest cell +has an area of 466 pixels, which corresponds to a diameter of about +24.4 pixels assuming a circular shape ($d = 2\sqrt{A/\pi}$), or a side length of +21.6 pixels assuming a square shape.
+Overall, the distribution of cell sizes is similar across images with images from +`Patient4_005` and `Patient4_007` showing a reduced average cell size. These +images contain fewer tumor cells which can explain the smaller average cell size. + +We detect very small cells in the dataset and will remove them. +The chosen threshold is arbitrary and needs to be adjusted per dataset. + + +```r +sum(spe$area < 5) +``` + +``` +## [1] 65 +``` + +```r +spe <- spe[,spe$area >= 5] +``` + +Another quality indicator can be an absolute measure of cell density often +reported in cells per mm$^2$. + + +```r +cell_density <- colData(spe) %>% + as.data.frame() %>% + group_by(sample_id) %>% + summarize(cell_count = n(), + no_pixels = mean(width_px) * mean(height_px)) %>% + mutate(cells_per_mm2 = cell_count/(no_pixels/1000000)) + +ggplot(cell_density) + + geom_point(aes(reorder(sample_id,cells_per_mm2), cells_per_mm2)) + + theme_minimal(base_size = 15) + + theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8)) + + ylab("Cells per mm2") + xlab("") +``` + + + +The number of cells per mm$^2$ varies across images which also depends on the +number of tumor/non-tumor cells. As we can see in the following sections, some +immune cells appear in cell dense regions while other stromal regions are less +dense. + +The data presented here originate from samples from different locations with +potential differences in pre-processing and each sample was stained individually. +These (and other) technical aspects can induce staining differences between +samples or batches of samples. Observing potential staining differences can be +crucial to assess data quality. 
We will use ridgeline visualizations to check +differences in staining patterns: + + +```r +multi_dittoPlot(spe, vars = rownames(spe)[rowData(spe)$use_channel], + group.by = "patient_id", plots = "ridgeplot", + assay = "exprs", + color.panel = metadata(spe)$color_vectors$patient_id) +``` + + + +We observe variations in the distributions of marker expression across patients. +These variations may arise partly from different abundances of cells in +different images (e.g., Patient3 may have higher numbers of CD11c+ and PD1+ +cells) as well as staining differences between samples. While most of the +selected markers are specifically expressed in immune cell subtypes, we can see +that E-Cadherin (a marker for epithelial (tumor) cells) shows a similar +expression range across all patients. + +Finally, we will use non-linear dimensionality reduction methods to project +cells from a high-dimensional (40) down to a low-dimensional (2) space. For this +the *[scater](https://bioconductor.org/packages/3.17/scater)* package provides the `runUMAP` and +`runTSNE` functions. Different seeds and different parameter settings (e.g., the `perplexity` +parameter in the `runTSNE` function) need to be tested to avoid +over-interpretation of visualization artefacts. For dimensionality reduction, we +will use all channels that show biological variation across the dataset. +However, marker selection can be performed with different biological questions +in mind. Both the `runUMAP` and `runTSNE` functions are non-deterministic, +meaning they produce different results across different runs. We therefore +set a seed in this chunk for reproducibility purposes.
+ + +```r +library(scater) + +set.seed(220225) +spe <- runUMAP(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") +spe <- runTSNE(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") +``` + +After dimensionality reduction, the low-dimensional embeddings are stored in the +`reducedDim` slot. + + +```r +reducedDims(spe) +``` + +``` +## List of length 2 +## names(2): UMAP TSNE +``` + +```r +head(reducedDim(spe, "UMAP")) +``` + +``` +## UMAP1 UMAP2 +## Patient1_001_1 -4.810167 -3.777362 +## Patient1_001_2 -4.397347 -3.456036 +## Patient1_001_3 -4.369883 -3.445561 +## Patient1_001_4 -4.081614 -3.162119 +## Patient1_001_5 -6.234012 -2.433976 +## Patient1_001_6 -5.666597 -3.428058 +``` + +Visualization of the low-dimensional embedding facilitates assessment of +potential "batch effects". The `dittoDimPlot` +function allows flexible visualization. It returns `ggplot` objects which +can be further modified. + + +```r +library(patchwork) + +# visualize patient id +p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP") +p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "TSNE", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on TSNE") + +# visualize region of interest id +p3 <- dittoDimPlot(spe, var = "ROI", reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$ROI) + + ggtitle("ROI ID on UMAP") +p4 <- dittoDimPlot(spe, var = "ROI", reduction.use = "TSNE", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$ROI) + + ggtitle("ROI ID on TSNE") + +# visualize indication +p5 <- dittoDimPlot(spe, var = "indication", reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$indication) + + ggtitle("Indication on UMAP") +p6 <- dittoDimPlot(spe, var =
"indication", reduction.use = "TSNE", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$indication) + + ggtitle("Indication on TSNE") + +(p1 + p2) / (p3 + p4) / (p5 + p6) +``` + + + + +```r +# visualize marker expression +p1 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "UMAP", + assay = "exprs", size = 0.2) + + scale_color_viridis(name = "Ecad") + + ggtitle("E-Cadherin expression on UMAP") +p2 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "UMAP", + assay = "exprs", size = 0.2) + + scale_color_viridis(name = "CD45RO") + + ggtitle("CD45RO expression on UMAP") +p3 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "TSNE", + assay = "exprs", size = 0.2) + + scale_color_viridis(name = "Ecad") + + ggtitle("E-Cadherin expression on TSNE") +p4 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "TSNE", + assay = "exprs", size = 0.2) + + scale_color_viridis(name = "CD45RO") + + ggtitle("CD45RO expression on TSNE") + +(p1 + p2) / (p3 + p4) +``` + + + +We observe a strong separation of tumor cells (Ecad+ cells) between the +patients. Here, each patient was diagnosed with a different tumor type. The +separation of tumor cells could be of biological origin, since tumor cells tend +to display differences in expression between patients and cancer types, and/or of +technical origin: the panel only contains a single tumor marker (E-Cadherin) and +therefore slight technical differences in staining cause visible separation +between cells of different patients. Nevertheless, the immune compartment +(CD45RO+ cells) mixes between patients and we can rule out systematic staining +differences between patients. + +## Save objects + +The modified `SpatialExperiment` object is saved for further downstream analysis. + + +```r +saveRDS(spe, "data/spe.rds") +``` + + + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 patchwork_1.1.3 +## [3] scater_1.28.0 mclust_6.0.0 +## [5] scuttle_1.10.2 ggrepel_0.9.3 +## [7] lubridate_1.9.3 forcats_1.0.0 +## [9] stringr_1.5.0 dplyr_1.1.3 +## [11] purrr_1.0.2 readr_2.1.4 +## [13] tidyr_1.3.0 tibble_3.2.1 +## [15] tidyverse_2.0.0 viridis_0.6.4 +## [17] viridisLite_0.4.2 dittoSeq_1.12.1 +## [19] ggplot2_3.4.3 cytoviewer_1.0.1 +## [21] cytomapper_1.12.0 SingleCellExperiment_1.22.0 +## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0 +## [25] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 +## [27] IRanges_2.34.1 S4Vectors_0.38.2 +## [29] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 +## [31] matrixStats_1.0.0 EBImage_4.42.0 +## +## loaded via a namespace (and not attached): +## [1] RColorBrewer_1.1-3 rstudioapi_0.15.0 +## [3] jsonlite_1.8.7 magrittr_2.0.3 +## [5] ggbeeswarm_0.7.2 magick_2.8.0 +## [7] farver_2.1.1 rmarkdown_2.25 +## [9] zlibbioc_1.46.0 vctrs_0.6.3 +## [11] memoise_2.0.1 DelayedMatrixStats_1.22.6 +## [13] RCurl_1.98-1.12 terra_1.7-46 +## [15] svgPanZoom_0.3.4 htmltools_0.5.6 +## [17] S4Arrays_1.0.6 BiocNeighbors_1.18.0 +## [19] raster_3.6-23 Rhdf5lib_1.22.1 +## [21] rhdf5_2.44.0 sass_0.4.7 +## [23] 
bslib_0.5.1 desc_1.4.2 +## [25] htmlwidgets_1.6.2 fontawesome_0.5.2 +## [27] cachem_1.0.8 mime_0.12 +## [29] lifecycle_1.0.3 pkgconfig_2.0.3 +## [31] rsvd_1.0.5 colourpicker_1.3.0 +## [33] Matrix_1.6-1.1 R6_2.5.1 +## [35] fastmap_1.1.1 GenomeInfoDbData_1.2.10 +## [37] shiny_1.7.5 digest_0.6.33 +## [39] colorspace_2.1-0 shinycssloaders_1.0.0 +## [41] rprojroot_2.0.3 irlba_2.3.5.1 +## [43] dqrng_0.3.1 pkgload_1.3.3 +## [45] beachmat_2.16.0 labeling_0.4.3 +## [47] timechange_0.2.0 fansi_1.0.4 +## [49] nnls_1.5 abind_1.4-5 +## [51] compiler_4.3.1 withr_2.5.1 +## [53] tiff_0.1-11 BiocParallel_1.34.2 +## [55] HDF5Array_1.28.1 R.utils_2.12.2 +## [57] DelayedArray_0.26.7 rjson_0.2.21 +## [59] tools_4.3.1 vipor_0.4.5 +## [61] beeswarm_0.4.0 httpuv_1.6.11 +## [63] R.oo_1.25.0 glue_1.6.2 +## [65] rhdf5filters_1.12.1 promises_1.2.1 +## [67] grid_4.3.1 Rtsne_0.16 +## [69] generics_0.1.3 gtable_0.3.4 +## [71] tzdb_0.4.0 R.methodsS3_1.8.2 +## [73] hms_1.1.3 ScaledMatrix_1.8.1 +## [75] BiocSingular_1.16.0 sp_2.0-0 +## [77] utf8_1.2.3 XVector_0.40.0 +## [79] RcppAnnoy_0.0.21 pillar_1.9.0 +## [81] limma_3.56.2 later_1.3.1 +## [83] lattice_0.21-8 tidyselect_1.2.0 +## [85] locfit_1.5-9.8 miniUI_0.1.1.1 +## [87] knitr_1.44 gridExtra_2.3 +## [89] bookdown_0.35 edgeR_3.42.4 +## [91] svglite_2.1.1 xfun_0.40 +## [93] shinydashboard_0.7.2 brio_1.1.3 +## [95] DropletUtils_1.20.0 pheatmap_1.0.12 +## [97] stringi_1.7.12 fftwtools_0.9-11 +## [99] yaml_2.3.7 evaluate_0.21 +## [101] codetools_0.2-19 archive_1.1.6 +## [103] BiocManager_1.30.22 cli_3.6.1 +## [105] uwot_0.1.16 xtable_1.8-4 +## [107] systemfonts_1.0.4 munsell_0.5.0 +## [109] jquerylib_0.1.4 Rcpp_1.0.11 +## [111] png_0.1-8 parallel_4.3.1 +## [113] ellipsis_0.3.2 jpeg_0.1-10 +## [115] sparseMatrixStats_1.12.2 bitops_1.0-7 +## [117] SpatialExperiment_1.10.0 scales_1.2.1 +## [119] ggridges_0.5.4 crayon_1.5.2 +## [121] BiocStyle_2.28.1 rlang_1.1.1 +## [123] cowplot_1.1.1 +``` +
diff --git a/06-quality_control_files/figure-html/cell-density-1.png b/06-quality_control_files/figure-html/cell-density-1.png new file mode 100644 index 00000000..c42ff24a Binary files /dev/null and b/06-quality_control_files/figure-html/cell-density-1.png differ diff --git a/06-quality_control_files/figure-html/cell-size-1.png b/06-quality_control_files/figure-html/cell-size-1.png new file mode 100644 index 00000000..e0a5258f Binary files /dev/null and b/06-quality_control_files/figure-html/cell-size-1.png differ diff --git a/06-quality_control_files/figure-html/cell-snr-1.png b/06-quality_control_files/figure-html/cell-snr-1.png new file mode 100644 index 00000000..7052cc4d Binary files /dev/null and b/06-quality_control_files/figure-html/cell-snr-1.png differ diff --git a/06-quality_control_files/figure-html/image-snr-1.png b/06-quality_control_files/figure-html/image-snr-1.png new file mode 100644 index 00000000..e19367d5 Binary files /dev/null and b/06-quality_control_files/figure-html/image-snr-1.png differ diff --git a/06-quality_control_files/figure-html/low-density-images-1.png b/06-quality_control_files/figure-html/low-density-images-1.png new file mode 100644 index 00000000..8659639d Binary files /dev/null and b/06-quality_control_files/figure-html/low-density-images-1.png differ diff --git a/06-quality_control_files/figure-html/mean-expression-per-image-1.png b/06-quality_control_files/figure-html/mean-expression-per-image-1.png new file mode 100644 index 00000000..1974256c Binary files /dev/null and b/06-quality_control_files/figure-html/mean-expression-per-image-1.png differ diff --git a/06-quality_control_files/figure-html/no-cells-per-image-1.png b/06-quality_control_files/figure-html/no-cells-per-image-1.png new file mode 100644 index 00000000..0017b790 Binary files /dev/null and b/06-quality_control_files/figure-html/no-cells-per-image-1.png differ diff --git a/06-quality_control_files/figure-html/overlay-masks-1.png 
b/06-quality_control_files/figure-html/overlay-masks-1.png new file mode 100644 index 00000000..bb6710ca Binary files /dev/null and b/06-quality_control_files/figure-html/overlay-masks-1.png differ diff --git a/06-quality_control_files/figure-html/ridges-1.png b/06-quality_control_files/figure-html/ridges-1.png new file mode 100644 index 00000000..f26e9bf3 Binary files /dev/null and b/06-quality_control_files/figure-html/ridges-1.png differ diff --git a/06-quality_control_files/figure-html/segmentation-heatmap-1.png b/06-quality_control_files/figure-html/segmentation-heatmap-1.png new file mode 100644 index 00000000..b9d467b2 Binary files /dev/null and b/06-quality_control_files/figure-html/segmentation-heatmap-1.png differ diff --git a/06-quality_control_files/figure-html/snr-adjusted-1.png b/06-quality_control_files/figure-html/snr-adjusted-1.png new file mode 100644 index 00000000..a4e3cb92 Binary files /dev/null and b/06-quality_control_files/figure-html/snr-adjusted-1.png differ diff --git a/06-quality_control_files/figure-html/visualizing-dimred-1-1.png b/06-quality_control_files/figure-html/visualizing-dimred-1-1.png new file mode 100644 index 00000000..f89d129f Binary files /dev/null and b/06-quality_control_files/figure-html/visualizing-dimred-1-1.png differ diff --git a/06-quality_control_files/figure-html/visualizing-dimred-2-1.png b/06-quality_control_files/figure-html/visualizing-dimred-2-1.png new file mode 100644 index 00000000..43e70737 Binary files /dev/null and b/06-quality_control_files/figure-html/visualizing-dimred-2-1.png differ diff --git a/07-batch_correction.md b/07-batch_correction.md new file mode 100644 index 00000000..8c873e33 --- /dev/null +++ b/07-batch_correction.md @@ -0,0 +1,591 @@ +# Batch effect correction {#batch-effects} + +In Section \@ref(cell-quality) we observed staining/expression differences +between the individual samples. 
These can arise due to technical (e.g., +differences in sample processing) as well as biological (e.g., differential +expression between patients/indications) effects. However, the combination of these effects +hinders cell phenotyping via clustering as highlighted in Section \@ref(clustering). + +To integrate cells across samples, we can use computational +strategies developed for correcting batch effects in single-cell RNA sequencing +data. In the following sections, we will use functions of the +[batchelor](https://www.bioconductor.org/packages/release/bioc/html/batchelor.html), +[harmony](https://github.com/immunogenomics/harmony) and +[Seurat](https://satijalab.org/seurat/articles/integration_introduction.html) +packages to correct for such batch effects. + +Of note: the correction approaches presented here aim at removing any +differences between samples. This will also remove biological differences +between the patients/indications. Nevertheless, integrating cells across samples +can facilitate the detection of cell phenotypes via clustering. + +First, we will read in the `SpatialExperiment` object containing the single-cell +data. + + +```r +spe <- readRDS("data/spe.rds") +``` + +## fastMNN correction + +The `batchelor` package provides the `mnnCorrect` and `fastMNN` functions to +correct for differences between samples/batches. Both functions build upon +finding mutual nearest neighbors (MNN) among the cells of different samples and +correct expression differences between the batches [@Haghverdi2018]. The `mnnCorrect` function +returns corrected expression counts while the `fastMNN` function performs the +correction in reduced dimension space. As such, `fastMNN` returns integrated +cells in the form of a low-dimensional embedding.
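
To make the idea of mutual nearest neighbors concrete, here is a minimal base R sketch. It is an illustration only and not the `batchelor` implementation: the two toy batches and their values are made up, only a single nearest neighbor is considered, and no correction is applied. A pair of cells is kept when each is the closest cell to the other across batches.

```r
# Toy data: two batches of cells measured on two markers (made-up values)
batch_a <- matrix(c(0, 0,
                    1, 0,
                    5, 5), ncol = 2, byrow = TRUE,
                  dimnames = list(c("a1", "a2", "a3"), NULL))
batch_b <- matrix(c(0.2, 0.1,
                    5.1, 4.8,
                    9,   9), ncol = 2, byrow = TRUE,
                  dimnames = list(c("b1", "b2", "b3"), NULL))

# Euclidean distances between cells of batch A (rows) and batch B (columns)
n_a <- nrow(batch_a)
d <- as.matrix(dist(rbind(batch_a, batch_b)))[seq_len(n_a), -seq_len(n_a)]

# Nearest neighbor in the other batch, in both directions
nn_a_to_b <- apply(d, 1, which.min)
nn_b_to_a <- apply(d, 2, which.min)

# A pair is "mutual" if each cell is the other's nearest neighbor
is_mutual <- nn_b_to_a[nn_a_to_b] == seq_len(n_a)
mnn_pairs <- data.frame(a = rownames(batch_a)[is_mutual],
                        b = rownames(batch_b)[nn_a_to_b[is_mutual]])
mnn_pairs
```

In this toy example `a2` is not part of any pair: its nearest neighbor `b1` is closer to `a1`. Cells without a mutual partern in the other batch, such as `b3`, are likewise left unpaired.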
+ +Paper: [Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors](https://www.nature.com/articles/nbt.4091) +Documentation: [batchelor](https://www.bioconductor.org/packages/release/bioc/vignettes/batchelor/inst/doc/correction.html) + +### Perform sample correction + +Here, we apply the `fastMNN` function to integrate cells between +patients. By setting `auto.merge = TRUE` the function estimates the best +batch merging order by maximizing the number of MNN pairs at each merging step. +This is more time consuming than merging sequentially based on how batches appear in the +dataset (default). We again select the markers defined in Section \@ref(cell-processing) +for sample correction. + +The function returns a `SingleCellExperiment` object which contains corrected +low-dimensional coordinates for each cell in the `reducedDim(out, "corrected")` +slot. This low-dimensional embedding can be further used for clustering and +non-linear dimensionality reduction. We check that the order of cells is the +same between the input and output object and then transfer the corrected +coordinates to the main `SpatialExperiment` object. + + + + +```r +library(batchelor) +set.seed(220228) +out <- fastMNN(spe, batch = spe$patient_id, + auto.merge = TRUE, + subset.row = rowData(spe)$use_channel, + assay.type = "exprs") + +# Check that order of cells is the same +stopifnot(all.equal(colnames(spe), colnames(out))) + +# Transfer the correction results to the main spe object +reducedDim(spe, "fastMNN") <- reducedDim(out, "corrected") +``` + + + +The computational time of the `fastMNN` function call is +2.33 minutes. + +Of note, the warnings that the `fastMNN` function produces can be avoided as follows: + +1. 
The following warning can be avoided by setting `BSPARAM = BiocSingular::ExactParam()` + +``` +Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, : + You're computing too large a percentage of total singular values, use a standard svd instead. +``` + +2. The following warning can be avoided by requesting fewer singular values by setting `d = 30` + +``` +In check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) - : + more singular values/vectors requested than available +``` + +### Quality control of correction results + +The `fastMNN` function further returns outputs that can be used to assess the +quality of the batch correction. The `metadata(out)$merge.info` entry collects +diagnostics for each individual merging step. Here, the `batch.size` and +`lost.var` entries are important. The `batch.size` entry reports the relative +magnitude of the batch effect and the `lost.var` entry represents the percentage +of lost variance per merging step. A large `batch.size` and low `lost.var` +indicate sufficient batch correction. + + +```r +merge_info <- metadata(out)$merge.info + +merge_info[,c("left", "right", "batch.size")] +``` + +``` +## DataFrame with 3 rows and 3 columns +## left right batch.size +## +## 1 Patient4 Patient2 0.381635 +## 2 Patient4,Patient2 Patient1 0.581013 +## 3 Patient4,Patient2,Patient1 Patient3 0.767376 +``` + +```r +merge_info$lost.var +``` + +``` +## Patient1 Patient2 Patient3 Patient4 +## [1,] 0.000000000 0.031154864 0.00000000 0.046198914 +## [2,] 0.043363546 0.009772150 0.00000000 0.011931892 +## [3,] 0.005394755 0.003023119 0.07219394 0.005366304 +``` + +We observe that Patient4 and Patient2 are most similar with a low batch effect. +Merging cells of Patient3 into the combined batch of Patient1, +Patient2 and Patient4 resulted in the highest percentage of lost variance and +the detection of the largest batch effect. In the next paragraph we can +visualize the correction results. 
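
The lost variance can also be summarized per merge step. The sketch below re-enters the `lost.var` values printed above as a plain matrix so that it runs without the `out` object; with the real data, `metadata(out)$merge.info$lost.var` could be used instead. The `cutoff` of 0.1 is an assumed rule of thumb chosen for illustration, not a value defined in this chapter.

```r
# lost.var values copied from the output above
# (rows: merge steps, columns: batches)
lost_var <- matrix(
  c(0.000000000, 0.031154864, 0.00000000, 0.046198914,
    0.043363546, 0.009772150, 0.00000000, 0.011931892,
    0.005394755, 0.003023119, 0.07219394, 0.005366304),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("Patient1", "Patient2", "Patient3", "Patient4")))

# Largest fraction of variance lost in each merge step
max_lost <- apply(lost_var, 1, max)
# Batch in which that loss occurred
worst_batch <- colnames(lost_var)[apply(lost_var, 1, which.max)]

# Hypothetical cutoff, for illustration only
cutoff <- 0.1

data.frame(step = seq_len(nrow(lost_var)),
           worst_batch = worst_batch,
           max_lost = round(max_lost, 3),
           above_cutoff = max_lost > cutoff)
```

For the values shown above, the largest loss occurs for Patient3 in the third merge step (about 7%), and no step exceeds the assumed cutoff.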
+ +### Visualization + +The simplest way to check whether the sample effects were corrected is to use +non-linear dimensionality reduction techniques and observe the mixing of cells across +samples. We will recompute the UMAP embedding using the corrected +low-dimensional coordinates for each cell. + + +```r +library(scater) + +set.seed(220228) +spe <- runUMAP(spe, dimred = "fastMNN", name = "UMAP_mnnCorrected") +``` + +Next, we visualize the corrected UMAP while overlaying patient IDs. + + +```r +library(cowplot) +library(dittoSeq) +library(viridis) + +# visualize patient id +p1 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP before correction") +p2 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP_mnnCorrected", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP after correction") + +plot_grid(p1, p2) +``` + + + +We observe an imperfect merging of Patient3 into all other samples. This +was already seen when displaying the merging information above. +We now also visualize the expression of selected markers across all cells +before and after batch correction.
+ + +```r +markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", "SMA", "Ki67") + +# Before correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +```r +# After correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_mnnCorrected", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +We observe that immune cells across patients are merged after batch correction +using `fastMNN`. However, the tumor cells of different patients still cluster +separately. + +## harmony correction + +The `harmony` algorithm performs batch correction by iteratively clustering and +correcting the positions of cells in PCA space [@Korsunsky2019]. We will first +perform PCA on the asinh-transformed counts and then call the `RunHarmony` +function to perform data integration. + +Paper: [Fast, sensitive and accurate integration of single-cell data with Harmony](https://www.nature.com/articles/s41592-019-0619-0) +Documentation: [harmony](https://portals.broadinstitute.org/harmony/index.html) + +Similar to the `fastMNN` function, `harmony` returns the corrected +low-dimensional coordinates for each cell. These can be transferred to the +`reducedDim` slot of the original `SpatialExperiment` object.
+ + + + +```r +library(harmony) +library(BiocSingular) + +spe <- runPCA(spe, + subset_row = rowData(spe)$use_channel, + exprs_values = "exprs", + ncomponents = 30, + BSPARAM = ExactParam()) + +set.seed(230616) +out <- RunHarmony(spe, group.by.vars = "patient_id") + +# Check that order of cells is the same +stopifnot(all.equal(colnames(spe), colnames(out))) + +reducedDim(spe, "harmony") <- reducedDim(out, "HARMONY") +``` + + + +The computational time of the `RunHarmony` function call is +1.3 minutes. + +### Visualization + +We will now again visualize the cells in low dimensions after UMAP embedding. + + +```r +set.seed(220228) +spe <- runUMAP(spe, dimred = "harmony", name = "UMAP_harmony") +``` + + +```r +# visualize patient id +p1 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP before correction") +p2 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP_harmony", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP after correction") + +plot_grid(p1, p2) +``` + + + +We also visualize the expression of the selected markers defined above. + + +```r +# Before correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +```r +# After correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_harmony", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +We observe a more aggressive merging of cells from different patients compared +to the results after `fastMNN` correction.
Importantly, immune cell and epithelial +markers are expressed in distinct regions of the UMAP. + +## Seurat correction + +The `Seurat` package provides a number of functionalities to analyze single-cell +data. As such it also allows the integration of cells across different samples. +Conceptually, `Seurat` performs batch correction similarly to `fastMNN` by +finding mutual nearest neighbors (MNN) in low dimensional space before +correcting the expression values of cells [@Stuart2019]. + +Paper: [Comprehensive Integration of Single-Cell Data](https://www.cell.com/cell/fulltext/S0092-8674(19)30559-8) +Documentation: [Seurat](https://satijalab.org/seurat/index.html) + +To use `Seurat`, we will first create a `Seurat` object from the `SpatialExperiment` +object and add relevant metadata. The object also needs to be split by patient +prior to integration. + + +```r +library(Seurat) +library(SeuratObject) +seurat_obj <- as.Seurat(spe, counts = "counts", data = "exprs") +seurat_obj <- AddMetaData(seurat_obj, as.data.frame(colData(spe))) + +seurat.list <- SplitObject(seurat_obj, split.by = "patient_id") +``` + +To avoid long run times, we will use an approach that relies on reciprocal PCA +instead of canonical correlation analysis for dimensionality reduction and +initial alignment. For an extended tutorial on how to use `Seurat` for data +integration, please refer to their +[vignette](https://satijalab.org/seurat/articles/integration_rpca.html). + +We will first define the features used for integration and perform PCA on cells +of each patient individually. The `FindIntegrationAnchors` function detects MNNs between +cells of different patients and the `IntegrateData` function corrects the +expression values of cells. We slightly increase the number of neighbors to be +considered for MNN detection (the `k.anchor` parameter). This increases the integration +strength. 
+ + + + +```r +features <- rownames(spe)[rowData(spe)$use_channel] +seurat.list <- lapply(X = seurat.list, FUN = function(x) { + x <- ScaleData(x, features = features, verbose = FALSE) + x <- RunPCA(x, features = features, verbose = FALSE, approx = FALSE) + return(x) +}) + +anchors <- FindIntegrationAnchors(object.list = seurat.list, + anchor.features = features, + reduction = "rpca", + k.anchor = 20) + +combined <- IntegrateData(anchorset = anchors) +``` + + + +We now select the `integrated` assay and perform PCA dimensionality reduction. +The cell coordinates in PCA reduced space can then be transferred to the +original `SpatialExperiment` object. **Of note:** by splitting the object into +individual batch-specific objects, the ordering of cells in the integrated +object might not match the ordering of cells in the input object. In this case, +columns will need to be reordered. Here, we test if the ordering of cells in the +integrated `Seurat` object matches the ordering of cells in the main +`SpatialExperiment` object. + + +```r +DefaultAssay(combined) <- "integrated" + +combined <- ScaleData(combined, verbose = FALSE) +combined <- RunPCA(combined, npcs = 30, verbose = FALSE, approx = FALSE) + +# Check that order of cells is the same +stopifnot(all.equal(colnames(spe), colnames(combined))) + +reducedDim(spe, "seurat") <- Embeddings(combined, reduction = "pca") +``` + +The computational time of the `Seurat` function calls is +4.29 minutes. + +### Visualization + +As above, we recompute the UMAP embeddings based on `Seurat` integrated results +and visualize the embedding. + + +```r +set.seed(220228) +spe <- runUMAP(spe, dimred = "seurat", name = "UMAP_seurat") +``` + +Visualize patient IDs. 
+ + +```r +# visualize patient id +p1 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP before correction") +p2 <- dittoDimPlot(spe, var = "patient_id", + reduction.use = "UMAP_seurat", size = 0.2) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + ggtitle("Patient ID on UMAP after correction") + +plot_grid(p1, p2) +``` + + + +Visualization of marker expression. + + +```r +# Before correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +```r +# After correction +plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_seurat", + assay = "exprs", size = 0.2, list.out = TRUE) +plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) +plot_grid(plotlist = plot_list) +``` + + + +Similar to the methods presented above, `Seurat` integrates immune cells correctly. +When visualizing the patient IDs, slight patient-to-patient differences within tumor +cells can be detected. + +Choosing the correct integration approach is challenging without having ground truth +cell labels available. It is recommended to compare different techniques and different +parameter settings. Please refer to the documentation of the individual tools +to become familiar with the possible parameter choices. Furthermore, in the following +section, we will discuss clustering and classification approaches in light of +expression differences between samples. + +In general, it appears that MNN-based approaches are more conservative in terms +of merging compared to `harmony`. On the other hand, `harmony` could well merge +cells in a way that regresses out biological signals.
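
One way to make the comparison between integration approaches less dependent on visual inspection is to quantify how well patients are mixed in each embedding. The following is a minimal sketch on toy data using base R only; it is not part of the original workflow, and the function name `batch_mixing`, its arguments and the toy matrices are hypothetical. For each cell it computes the fraction of its `k` nearest neighbours that come from a different batch.

```r
# Fraction of each cell's k nearest neighbours that belong to another batch.
# Values near 0 indicate batch-specific neighbourhoods; larger values
# indicate stronger mixing. Hypothetical helper, base R only.
batch_mixing <- function(embedding, batch, k = 10) {
  d <- as.matrix(dist(embedding))
  diag(d) <- Inf  # a cell is not its own neighbour
  vapply(seq_len(nrow(d)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    mean(batch[nn] != batch[i])
  }, numeric(1))
}

# Toy embeddings: two batches that are either fully separated or fully mixed
set.seed(1)
batch <- rep(c("A", "B"), each = 50)
separated <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
                   matrix(rnorm(100, mean = 10), ncol = 2))
mixed <- matrix(rnorm(200), ncol = 2)

mean(batch_mixing(separated, batch))
mean(batch_mixing(mixed, batch))
```

Under these assumptions, the same helper could be applied to a random subset of cells from `reducedDim(spe, "fastMNN")`, `reducedDim(spe, "harmony")` and `reducedDim(spe, "seurat")` with `spe$patient_id` as the batch. The full distance matrix computed by `dist` does not scale to the complete dataset, so subsampling or a dedicated nearest-neighbour search would be needed. Note that a high mixing score alone does not show that biological signal has been preserved.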
+ +## Save objects + +The modified `SpatialExperiment` object is saved for further downstream analysis. + + +```r +saveRDS(spe, "data/spe.rds") +``` + + + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 SeuratObject_4.1.4 +## [3] Seurat_4.4.0 BiocSingular_1.16.0 +## [5] harmony_1.0.1 Rcpp_1.0.11 +## [7] viridis_0.6.4 viridisLite_0.4.2 +## [9] dittoSeq_1.12.1 cowplot_1.1.1 +## [11] scater_1.28.0 ggplot2_3.4.3 +## [13] scuttle_1.10.2 SpatialExperiment_1.10.0 +## [15] batchelor_1.16.0 SingleCellExperiment_1.22.0 +## [17] SummarizedExperiment_1.30.2 Biobase_2.60.0 +## [19] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 +## [21] IRanges_2.34.1 S4Vectors_0.38.2 +## [23] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 +## [25] matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] RcppAnnoy_0.0.21 splines_4.3.1 +## [3] later_1.3.1 bitops_1.0-7 +## [5] tibble_3.2.1 R.oo_1.25.0 +## [7] polyclip_1.10-6 lifecycle_1.0.3 +## [9] rprojroot_2.0.3 edgeR_3.42.4 +## [11] globals_0.16.2 lattice_0.21-8 +## [13] MASS_7.3-60 magrittr_2.0.3 +## [15] limma_3.56.2 plotly_4.10.2 +## [17] sass_0.4.7 rmarkdown_2.25 +## [19] jquerylib_0.1.4 yaml_2.3.7 +## [21] httpuv_1.6.11 sctransform_0.4.0 +## [23] spatstat.sparse_3.0-2 sp_2.0-0 +## [25] reticulate_1.32.0 pbapply_1.7-2 +## [27] RColorBrewer_1.1-3 ResidualMatrix_1.10.0 +## [29] 
pkgload_1.3.3 abind_1.4-5 +## [31] zlibbioc_1.46.0 Rtsne_0.16 +## [33] purrr_1.0.2 R.utils_2.12.2 +## [35] RCurl_1.98-1.12 GenomeInfoDbData_1.2.10 +## [37] ggrepel_0.9.3 irlba_2.3.5.1 +## [39] spatstat.utils_3.0-3 listenv_0.9.0 +## [41] pheatmap_1.0.12 goftest_1.2-3 +## [43] spatstat.random_3.1-6 dqrng_0.3.1 +## [45] fitdistrplus_1.1-11 parallelly_1.36.0 +## [47] DelayedMatrixStats_1.22.6 leiden_0.4.3 +## [49] codetools_0.2-19 DropletUtils_1.20.0 +## [51] DelayedArray_0.26.7 tidyselect_1.2.0 +## [53] farver_2.1.1 ScaledMatrix_1.8.1 +## [55] spatstat.explore_3.2-3 jsonlite_1.8.7 +## [57] BiocNeighbors_1.18.0 ellipsis_0.3.2 +## [59] progressr_0.14.0 ggridges_0.5.4 +## [61] survival_3.5-5 tools_4.3.1 +## [63] ica_1.0-3 glue_1.6.2 +## [65] gridExtra_2.3 xfun_0.40 +## [67] dplyr_1.1.3 HDF5Array_1.28.1 +## [69] withr_2.5.1 fastmap_1.1.1 +## [71] rhdf5filters_1.12.1 fansi_1.0.4 +## [73] digest_0.6.33 rsvd_1.0.5 +## [75] R6_2.5.1 mime_0.12 +## [77] colorspace_2.1-0 scattermore_1.2 +## [79] tensor_1.5 spatstat.data_3.0-1 +## [81] R.methodsS3_1.8.2 RhpcBLASctl_0.23-42 +## [83] utf8_1.2.3 tidyr_1.3.0 +## [85] generics_0.1.3 data.table_1.14.8 +## [87] httr_1.4.7 htmlwidgets_1.6.2 +## [89] S4Arrays_1.0.6 uwot_0.1.16 +## [91] pkgconfig_2.0.3 gtable_0.3.4 +## [93] lmtest_0.9-40 XVector_0.40.0 +## [95] brio_1.1.3 htmltools_0.5.6 +## [97] bookdown_0.35 scales_1.2.1 +## [99] png_0.1-8 knitr_1.44 +## [101] rstudioapi_0.15.0 reshape2_1.4.4 +## [103] rjson_0.2.21 nlme_3.1-162 +## [105] cachem_1.0.8 zoo_1.8-12 +## [107] rhdf5_2.44.0 stringr_1.5.0 +## [109] KernSmooth_2.23-21 parallel_4.3.1 +## [111] miniUI_0.1.1.1 vipor_0.4.5 +## [113] desc_1.4.2 pillar_1.9.0 +## [115] grid_4.3.1 vctrs_0.6.3 +## [117] RANN_2.6.1 promises_1.2.1 +## [119] beachmat_2.16.0 xtable_1.8-4 +## [121] cluster_2.1.4 waldo_0.5.1 +## [123] beeswarm_0.4.0 evaluate_0.21 +## [125] magick_2.8.0 cli_3.6.1 +## [127] locfit_1.5-9.8 compiler_4.3.1 +## [129] rlang_1.1.1 crayon_1.5.2 +## [131] future.apply_1.11.0 
labeling_0.4.3 +## [133] plyr_1.8.8 ggbeeswarm_0.7.2 +## [135] stringi_1.7.12 deldir_1.0-9 +## [137] BiocParallel_1.34.2 munsell_0.5.0 +## [139] lazyeval_0.2.2 spatstat.geom_3.2-5 +## [141] Matrix_1.6-1.1 patchwork_1.1.3 +## [143] sparseMatrixStats_1.12.2 future_1.33.0 +## [145] Rhdf5lib_1.22.1 shiny_1.7.5 +## [147] ROCR_1.0-11 igraph_1.5.1 +## [149] bslib_0.5.1 +``` +
diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-1-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-1-1.png new file mode 100644 index 00000000..5505b0d0 Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-1-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-1.png new file mode 100644 index 00000000..3a24688f Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-2.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-2.png new file mode 100644 index 00000000..947f601b Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-fastMNN-2-2.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-1-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-1-1.png new file mode 100644 index 00000000..c4ef840e Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-1-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-1.png new file mode 100644 index 00000000..3a24688f Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-2.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-2.png new file mode 100644 index 00000000..2ba1572f Binary files /dev/null and 
b/07-batch_correction_files/figure-html/visualizing-batch-correction-harmony-2-2.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-1-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-1-1.png new file mode 100644 index 00000000..bbee7307 Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-1-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-1.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-1.png new file mode 100644 index 00000000..3a24688f Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-1.png differ diff --git a/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-2.png b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-2.png new file mode 100644 index 00000000..4713a32b Binary files /dev/null and b/07-batch_correction_files/figure-html/visualizing-batch-correction-seurat-2-2.png differ diff --git a/08-phenotyping.md b/08-phenotyping.md new file mode 100644 index 00000000..53176007 --- /dev/null +++ b/08-phenotyping.md @@ -0,0 +1,1397 @@ +# Cell phenotyping + +A common step during single-cell data analysis is the annotation of cells based +on their phenotype. Defining cell phenotypes is often subjective and relies +on previous biological knowledge. The [Orchestrating Single Cell Analysis with Bioconductor](https://bioconductor.org/books/release/OSCA.basic/cell-type-annotation.html) book +presents a number of approaches to phenotype cells detected by single-cell RNA +sequencing based on reference datasets or gene set analysis. + +In highly-multiplexed imaging, target proteins or molecules are manually +selected based on the biological question at hand. 
This narrows down the feature +space and facilitates the manual annotation of clusters to derive cell +phenotypes. We will therefore discuss and compare a number of clustering +approaches to group cells based on their similarity in marker expression in +Section \@ref(clustering). + +Unlike single-cell RNA sequencing or CyTOF data, single-cell data derived from +highly-multiplexed imaging data often suffers from "lateral spillover" between +neighboring cells. This spillover caused by imperfect segmentation often hinders +accurate clustering to define specific cell phenotypes in multiplexed imaging +data. Tools have been developed to correct lateral spillover between cells +[@Bai2021] but the approach requires careful selection of the markers to +correct. In Section \@ref(classification) we will train and apply a random +forest classifier to classify cell phenotypes in the dataset as an alternative +approach to clustering-based cell phenotyping. This approach has been previously used to +identify major cell phenotypes in metastatic melanoma and avoids clustering of +cells [@Hoch2022]. + +## Load data + +We will first read in the previously generated `SpatialExperiment` object and +sample 2000 cells to visualize cluster membership. + + +```r +library(SpatialExperiment) +spe <- readRDS("data/spe.rds") + +# Sample cells +set.seed(220619) +cur_cells <- sample(seq_len(ncol(spe)), 2000) +``` + +## Clustering approaches {#clustering} + +In the first section, we will present clustering approaches to identify cellular +phenotypes in the dataset. These methods group cells based on their similarity +in marker expression or by their proximity in low dimensional space. A number of +approaches have been developed to cluster data derived from single-cell RNA +sequencing technologies [@Yu2022] or CyTOF [@Weber2016]. For demonstration +purposes, we will highlight common clustering approaches that are available in R +and have been used for clustering cells obtained from IMC.
Two approaches rely +on graph-based clustering and one approach uses self organizing maps (SOM). + +### Rphenograph + +The PhenoGraph clustering approach was first described to group cells of a CyTOF +dataset [@Levine2015]. The algorithm first constructs a graph by detecting the +`k` nearest neighbours based on Euclidean distance in expression space. In the +next step, edges between nodes (cells) are weighted by their overlap in nearest +neighbor sets. To quantify the overlap in shared nearest neighbor sets, the +Jaccard index is used. The Louvain modularity optimization approach is used to +detect connected communities and partition the graph into clusters of cells. +This clustering strategy was used by Jackson, Fischer _et al._ and Schulz _et +al._ to cluster IMC data [@Jackson2020; @Schulz2018]. + +There are several different PhenoGraph implementations available in R. Here, we +use the one available at +[https://github.com/i-cyto/Rphenograph](https://github.com/i-cyto/Rphenograph). +For large datasets, +[https://github.com/stuchly/Rphenoannoy](https://github.com/stuchly/Rphenoannoy) +offers a more performant implementation of the algorithm. + +In the following code chunk, we select the asinh-transformed mean pixel +intensities per cell and channel and subset the channels to the ones containing +biological variation. This matrix is transposed to store cells in rows. Within +the `Rphenograph` function, we select the 45 nearest neighbors for graph +building and Louvain community detection (default). The function returns a list +of length 2, the first entry being the graph and the second entry containing the +community object. Calling `membership` on the community object will return +cluster IDs for each cell. These cluster IDs are then stored within the +`colData` of the `SpatialExperiment` object. Cluster IDs are mapped on top of +the UMAP embedding and single-cell marker expression within each cluster is +visualized in the form of a heatmap.
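
The edge weighting step described above can be illustrated on toy data. The following is a minimal sketch in base R and not the `Rphenograph` implementation; the toy matrix `x`, the choice of `k` and the helper `jaccard` are hypothetical and only serve to show how the overlap between two nearest neighbour sets is turned into an edge weight.

```r
# Toy data: 20 "cells" measured on 2 "markers" (hypothetical values)
set.seed(42)
x <- matrix(rnorm(40), ncol = 2)
k <- 5

# k nearest neighbours per cell based on Euclidean distance
d <- as.matrix(dist(x))
diag(d) <- Inf  # a cell is not its own neighbour
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Jaccard index: shared neighbours divided by all distinct neighbours
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Weight of the edge between cell 1 and its closest neighbour
nearest <- knn[1, 1]
edge_weight <- jaccard(knn[1, ], knn[nearest, ])
edge_weight
```

Cells whose neighbourhoods overlap strongly receive edge weights close to 1, while cells that share few neighbours receive weights close to 0. Community detection then operates on this weighted graph.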
+ +It is recommended to test different inputs to `k` as shown in the next section. +Selecting larger values for `k` results in larger clusters. + + + + +```r +library(Rphenograph) +library(igraph) +library(dittoSeq) +library(viridis) + +mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,]) + +set.seed(230619) +out <- Rphenograph(mat, k = 45) + +clusters <- factor(membership(out[[2]])) + +spe$pg_clusters <- clusters + +dittoDimPlot(spe, var = "pg_clusters", + reduction.use = "UMAP", size = 0.2, + do.label = TRUE) + + ggtitle("Phenograph clusters on UMAP") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("pg_clusters", "patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters))], + metadata(spe)$color_vectors$patient_id)) +``` + + + + + +The `Rphenograph` function call took +2.31 minutes. + +We can observe that some of the clusters only contain cells of a single patient. +This can often be observed in the tumor compartment. In the next step, we +use the integrated cells (see Section \@ref(batch-effects)) in low dimensional +embedding for clustering. Here, the low dimensional embedding can +be directly accessed from the `reducedDim` slot. 
+ + +```r +mat <- reducedDim(spe, "fastMNN") + +set.seed(230619) +out <- Rphenograph(mat, k = 45) + +clusters <- factor(membership(out[[2]])) + +spe$pg_clusters_corrected <- clusters + +dittoDimPlot(spe, var = "pg_clusters_corrected", + reduction.use = "UMAP_mnnCorrected", size = 0.2, + do.label = TRUE) + + ggtitle("Phenograph clusters on UMAP, integrated cells") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("pg_clusters_corrected","patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters_corrected))], + metadata(spe)$color_vectors$patient_id)) +``` + + + +Clustering using the integrated embedding leads to clusters that contain cells +of different patients. Cluster annotation can now be performed by manually +labeling cells based on their marker expression (see Notes in Section +\@ref(clustering-notes)). + +### Shared nearest neighbour graph {#snn-graph} + +The [bluster](https://www.bioconductor.org/packages/release/bioc/html/bluster.html) +package provides a simple interface to cluster cells using a number of different +[clustering approaches](https://www.bioconductor.org/packages/release/bioc/vignettes/bluster/inst/doc/clusterRows.html) and different metrics to [assess cluster stability](https://www.bioconductor.org/packages/release/bioc/vignettes/bluster/inst/doc/diagnostics.html). + +For simplicity, we will focus on graph-based clustering as it is fast and the most +popular method for single-cell clustering. The `bluster` package +provides functionalities to build k-nearest neighbor (KNN) graphs and their weighted +version, shared nearest neighbor (SNN) graphs where nodes represent cells. +The user can choose the number of neighbors to consider (parameter `k`), +the edge weighting method (parameter `type`) and the community detection +function to use (parameter `cluster.fun`).
As all parameters affect the clustering +results, the `bluster` package provides the `clusterSweep` function to test +a number of parameter settings in parallel. In the following code chunk, +we select the asinh-transformed mean pixel intensities and subset the markers +of interest. The resulting matrix is transposed to fit the requirements of +the bluster package (cells in rows). + +We test two different settings for `k`, two for `type` and fix the `cluster.fun` +to `louvain` as this is one of the most common approaches for community detection. +This function call is parallelized by setting the `BPPARAM` parameter. + + + + +```r +library(bluster) +library(BiocParallel) +library(ggplot2) + +mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,]) + +combinations <- clusterSweep(mat, + BLUSPARAM=SNNGraphParam(), + k=c(10L, 20L), + type = c("rank", "jaccard"), + cluster.fun = "louvain", + BPPARAM = MulticoreParam(RNGseed = 220427)) +``` + +We next calculate two metrics to estimate cluster stability: the average +silhouette width and the neighborhood purity. + +We use the `approxSilhouette` function to compute the silhouette width for each +cell and compute the average across all cells per parameter setting. Please see +`?silhouette` for more information on how the silhouette width is computed for +each cell. A large average silhouette width indicates a cluster parameter +setting for which cells are well clustered. + +The `neighborPurity` function computes the fraction of cells around each cell +with the same cluster ID. Per parameter setting, we compute the average +neighborhood purity across all cells. A large average neighborhood purity +indicates a cluster parameter setting for which cells are well clustered.
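
To make the silhouette width more concrete, the following toy sketch computes it from its textbook definition in base R. It is meant as an illustration only: the function `silhouette_width` and the toy data are hypothetical, and the `approxSilhouette` function used in this section computes an approximation rather than the exact values shown here. For each observation, `a` is the mean distance to members of its own cluster, `b` is the mean distance to the closest other cluster, and the width is `(b - a) / max(a, b)`.

```r
# Exact silhouette width per observation (toy implementation, base R only)
silhouette_width <- function(x, clusters) {
  clusters <- as.character(clusters)
  d <- as.matrix(dist(x))
  vapply(seq_len(nrow(d)), function(i) {
    own <- clusters == clusters[i]
    own[i] <- FALSE
    if (!any(own)) return(0)  # singleton clusters get a width of 0
    a <- mean(d[i, own])      # mean distance within the own cluster
    others <- setdiff(unique(clusters), clusters[i])
    # mean distance to the closest other cluster
    b <- min(vapply(others, function(cl) mean(d[i, clusters == cl]),
                    numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
}

# Two well separated toy groups
set.seed(1)
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 8), ncol = 2))
good <- rep(c("A", "B"), each = 30)  # labels matching the groups
bad <- sample(good)                  # randomly shuffled labels

mean(silhouette_width(x, good))
mean(silhouette_width(x, bad))
```

Labels that match the structure of the data give an average width close to 1, whereas shuffled labels give an average width close to 0. This is why a larger average silhouette width is read as a sign of better separated clusters.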
+ + +```r +sil <- vapply(as.list(combinations$clusters), + function(x) mean(approxSilhouette(mat, x)$width), + 0) + +ggplot(data.frame(method = names(sil), + sil = sil)) + + geom_point(aes(method, sil)) + + theme_classic(base_size = 15) + + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + + xlab("Cluster parameter combination") + + ylab("Average silhouette width") +``` + + + +```r +pur <- vapply(as.list(combinations$clusters), + function(x) mean(neighborPurity(mat, x)$purity), + 0) + +ggplot(data.frame(method = names(pur), + pur = pur)) + + geom_point(aes(method, pur)) + + theme_classic(base_size = 15) + + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + + xlab("Cluster parameter combination") + + ylab("Average neighborhood purity") +``` + + + + + +The cluster parameter sweep took +8.81 minutes. + +Performing a cluster sweep takes some time as multiple function calls are run in parallel. +We do however recommend testing a number of different parameter settings to +assess clustering performance. + +Once parameter settings are known, we can either use the `clusterRows` function +of the `bluster` package to cluster cells or its convenient wrapper function +exported by the +[scran](https://bioconductor.org/packages/release/bioc/html/scran.html) package. +The `scran::clusterCells` function accepts a `SpatialExperiment` (or +`SingleCellExperiment`) object which stores cells in columns. By default, the +function detects the 10 nearest neighbours for each cell, performs rank-based +weighting of edges (see `?makeSNNGraph` for more information) and uses the +`cluster_walktrap` function to detect communities in the graph. + +As we can see above, the clustering approach in this dataset with `k` being 20 +and rank-based edge weighting leads to the highest silhouette width and highest +neighborhood purity. 
+ + + + +```r +library(scran) + +set.seed(220620) +clusters <- clusterCells(spe[rowData(spe)$use_channel,], + assay.type = "exprs", + BLUSPARAM = SNNGraphParam(k=20, + cluster.fun = "louvain", + type = "rank")) + +spe$nn_clusters <- clusters + +dittoDimPlot(spe, var = "nn_clusters", + reduction.use = "UMAP", size = 0.2, + do.label = TRUE) + + ggtitle("SNN clusters on UMAP") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("nn_clusters", "patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters))], + metadata(spe)$color_vectors$patient_id)) +``` + + + + + +The shared nearest neighbor graph clustering approach took +1.31 minutes. + +This function was used by [@Tietscher2022] to cluster cells obtained by IMC. Setting +`type = "jaccard"` performs clustering similar to `Rphenograph` above and [Seurat](https://satijalab.org/seurat/articles/pbmc3k_tutorial.html#cluster-the-cells-1). + +Similar to the results obtained by `Rphenograph`, some of the clusters are +patient-specific. 
We can now perform clustering of the integrated cells +by directly specifying which low-dimensional embedding to use: + + +```r +set.seed(220621) +clusters <- clusterCells(spe, + use.dimred = "fastMNN", + BLUSPARAM = SNNGraphParam(k = 20, + cluster.fun = "louvain", + type = "rank")) + +spe$nn_clusters_corrected <- clusters + +dittoDimPlot(spe, var = "nn_clusters_corrected", + reduction.use = "UMAP_mnnCorrected", size = 0.2, + do.label = TRUE) + + ggtitle("SNN clusters on UMAP, integrated cells") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("nn_clusters_corrected","patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters_corrected))], + metadata(spe)$color_vectors$patient_id)) +``` + + + +### Self organizing maps + +An alternative to graph-based clustering is offered by the +[CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) +package. The `cluster` function internally uses the +[FlowSOM](https://bioconductor.org/packages/release/bioc/html/FlowSOM.html) +package to group cells into 100 (default) clusters based on self organizing maps +(SOM). In the next step, the +[ConsensusClusterPlus](https://bioconductor.org/packages/release/bioc/html/ConsensusClusterPlus.html) +package is used to perform hierarchical consensus clustering of the previously +detected 100 SOM nodes into 2 to `maxK` clusters. Cluster stability for each `k` +can be assessed by plotting the `delta_area(spe)`. The optimal number +of clusters can be found by selecting the `k` at which a plateau is reached. +In the example below, an optimal `k` lies somewhere around 13. 
+ + + + +```r +library(CATALYST) + +# Run FlowSOM and ConsensusClusterPlus clustering +set.seed(220410) +spe <- cluster(spe, + features = rownames(spe)[rowData(spe)$use_channel], + maxK = 30) + +# Assess cluster stability +delta_area(spe) +``` + + + +```r +spe$som_clusters <- cluster_ids(spe, "meta13") + +dittoDimPlot(spe, var = "som_clusters", + reduction.use = "UMAP", size = 0.2, + do.label = TRUE) + + ggtitle("SOM clusters on UMAP") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("som_clusters", "patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters))], + metadata(spe)$color_vectors$patient_id)) +``` + + + + + +Running FlowSOM clustering took 0.22 minutes. + +The `CATALYST` package does not provide functionality to perform `FlowSOM` and +`ConsensusClusterPlus` clustering directly on the batch-corrected, integrated cells. As an +alternative to the `CATALYST` package, the `bluster` package provides SOM +clustering when specifying the `SomParam()` parameter. Similar to the `CATALYST` +approach, we will first cluster the dataset into 100 clusters (also called +"codes"). These codes are then further clustered into a maximum of 30 clusters +using `ConsensusClusterPlus` (using hierarchical clustering and euclidean +distance). The delta area plot can be accessed using the (not exported) +`.plot_delta_area` function from `CATALYST`. Here, it seems that the plateau is +reached at a `k` of 16 and we will store the final cluster IDs within the +`SpatialExperiment` object. 
+ + +```r +library(kohonen) +library(ConsensusClusterPlus) + +# Select integrated cells +mat <- reducedDim(spe, "fastMNN") + +# Perform SOM clustering +set.seed(220410) +som.out <- clusterRows(mat, SomParam(100), full = TRUE) + +# Cluster the 100 SOM codes into larger clusters +ccp <- ConsensusClusterPlus(t(som.out$objects$som$codes[[1]]), + maxK = 30, + reps = 100, + distance = "euclidean", + seed = 220410, + plot = NULL) +``` + + +```r +# Visualize delta area plot +CATALYST:::.plot_delta_area(ccp) +``` + + + +```r +# Link ConsensusClusterPlus clusters with SOM codes and save in object +som.cluster <- ccp[[16]][["consensusClass"]][som.out$clusters] +spe$som_clusters_corrected <- as.factor(som.cluster) + +dittoDimPlot(spe, var = "som_clusters_corrected", + reduction.use = "UMAP_mnnCorrected", size = 0.2, + do.label = TRUE) + + ggtitle("SOM clusters on UMAP, integrated cells") +``` + + + + +```r +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$use_channel], + assay = "exprs", scale = "none", + heatmap.colors = viridis(100), + annot.by = c("som_clusters_corrected","patient_id"), + annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters_corrected))], + metadata(spe)$color_vectors$patient_id)) +``` + + + +The `FlowSOM` clustering approach has been used by [@Hoch2022] to sub-cluster tumor +cells as measured by IMC. + +### Compare between clustering approaches + +Finally, we can compare the results of different clustering approaches. For +this, we visualize the number of cells that are shared between different +clustering results in a pairwise fashion. In the following heatmaps a high match +between clustering results can be seen for those clusters that are uniquely +detected in both approaches. + +First, we will visualize the match between the three different approaches +applied to the asinh-transformed counts. 
+ + +```r +library(patchwork) +library(pheatmap) +library(gridExtra) + +tab1 <- table(paste("Rphenograph", spe$pg_clusters), + paste("SNN", spe$nn_clusters)) +tab2 <- table(paste("Rphenograph", spe$pg_clusters), + paste("SOM", spe$som_clusters)) +tab3 <- table(paste("SNN", spe$nn_clusters), + paste("SOM", spe$som_clusters)) + +pheatmap(log10(tab1 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab2 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab3 + 10), color = viridis(100)) +``` + + + +We observe that `Rphenograph` and the shared nearest neighbor (SNN) approach by +`scran` show similar results (first heatmap above). For example, Rphenograph +cluster 20 (a tumor cluster) is perfectly captured by SNN cluster 12. On the +other hand, the Neutrophil cluster (SNN cluster 6) is split into Rphenograph +cluster 2 and Rphenograph cluster 6. A common approach +is to now merge clusters that contain similar cell types and annotate them by +hand (see below). + +Below, a comparison between the clustering results of the integrated cells +is shown. + + +```r +tab1 <- table(paste("Rphenograph", spe$pg_clusters_corrected), + paste("SNN", spe$nn_clusters_corrected)) +tab2 <- table(paste("Rphenograph", spe$pg_clusters_corrected), + paste("SOM", spe$som_clusters_corrected)) +tab3 <- table(paste("SNN", spe$nn_clusters_corrected), + paste("SOM", spe$som_clusters_corrected)) + +pheatmap(log10(tab1 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab2 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab3 + 10), color = viridis(100)) +``` + + + +In comparison to clustering on the non-integrated cells, the clustering results +of the integrated cells show higher overlap. The SNN approach resulted in fewer +clusters and therefore matches better with the SOM clustering approach. 
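
In addition to the pairwise heatmaps, the agreement between two clustering results can be summarised as a single number. The sketch below computes the adjusted Rand index from a contingency table in base R. It is not part of the original workflow; the function name `adjusted_rand` and the toy label vectors are hypothetical.

```r
# Adjusted Rand index between two label vectors (toy implementation).
# 1 indicates identical partitions; values around 0 indicate chance agreement.
adjusted_rand <- function(x, y) {
  tab <- table(x, y)
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))
  sum_rows <- sum(choose(rowSums(tab), 2))
  sum_cols <- sum(choose(colSums(tab), 2))
  expected <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

# Same partition with different cluster names
a <- c(1, 1, 2, 2, 3, 3)
b <- c("x", "x", "y", "y", "z", "z")
adjusted_rand(a, b)

# Unrelated random partitions
set.seed(1)
r1 <- sample(1:4, 500, replace = TRUE)
r2 <- sample(1:4, 500, replace = TRUE)
adjusted_rand(r1, r2)
```

Under these assumptions, a call such as `adjusted_rand(spe$pg_clusters_corrected, spe$nn_clusters_corrected)` would return one agreement score per pair of clustering approaches. The heatmaps remain useful for seeing which individual clusters are split or merged.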
+ +### Further clustering notes {#clustering-notes} + +The `bluster` package provides a number of metrics to assess cluster stability +[here](https://www.bioconductor.org/packages/release/bioc/vignettes/bluster/inst/doc/diagnostics.html). +For brevity, we only highlighted the use of the silhouette width and the +neighborhood purity, but different metrics should be tested to assess cluster +stability. + +To assign cell types to clusters, we manually annotate clusters based on their +marker expression. For example, SNN cluster 12 (clustering of the integrated +cells) shows high, homogeneous expression of CD20 and we might therefore label +this cluster as B cells. The next chapter \@ref(single-cell-visualization) will +highlight single-cell visualization methods that can be helpful for manual +cluster annotations. + +An example of how to label clusters can be seen below: + + +```r +library(dplyr) +cluster_celltype <- recode(spe$nn_clusters_corrected, + "1" = "Tumor_proliferating", + "2" = "Myeloid", + "3" = "Tumor", + "4" = "Tumor", + "5" = "Stroma", + "6" = "Proliferating", + "7" = "Myeloid", + "8" = "Plasma_cell", + "9" = "CD8", + "10" = "CD4", + "11" = "Neutrophil", + "12" = "Bcell", + "13" = "Stroma") + +spe$cluster_celltype <- cluster_celltype +``` + +## Classification approach {#classification} + +In this section, we will highlight a cell type classification approach based +on ground truth labeling and random forest classification. The rationale for +this supervised cell phenotyping approach is to use the information contained +in the pre-defined markers to detect cells of interest. This approach was +used by Hoch _et al._ to classify cell types in a metastatic melanoma IMC +dataset [@Hoch2022]. + +The antibody panel used in the example data set mainly focuses on immune cell +types and little on tumor cell phenotypes. 
Therefore we will label the following +cell types: + +* Tumor (E-cadherin positive) +* Stroma (SMA, PDGFRb positive) +* Plasma cells (CD38 positive) +* Neutrophil (MPO, CD15 positive) +* Myeloid cells (HLADR positive) +* B cells (CD20 positive) +* B next to T cells (CD20, CD3 positive) +* Regulatory T cells (FOXP3 positive) +* CD8+ T cells (CD3, CD8 positive) +* CD4+ T cells (CD3, CD4 positive) + +The "B next to T cell" phenotype (`BnTcell`) is commonly observed in immune +infiltrated regions measured by IMC. We include this phenotype to account for B +cell/T cell interactions where precise classification into B cells or T cells is +not possible. The exact gating scheme can be seen at +[img/Gating_scheme.pdf](img/Gating_scheme.pdf). + +As related approaches, [Astir](https://github.com/camlab-bioml/astir) and +[Garnett](https://cole-trapnell-lab.github.io/garnett/) use pre-defined panel +information to classify cell phenotypes based on their marker expression. + +### Manual labeling of cells + +The [cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html) +package provides the `cytomapperShiny` function that allows gating of cells +based on their marker expression and visualization of selected cells directly +on the images. + + +```r +library(cytomapper) +if (interactive()) { + + images <- readRDS("data/images.rds") + masks <- readRDS("data/masks.rds") + + cytomapperShiny(object = spe, mask = masks, image = images, + cell_id = "ObjectNumber", img_id = "sample_id") +} +``` + +The labeled cells for this data set can be accessed at +[10.5281/zenodo.6554544](https://zenodo.org/record/6554544) and were downloaded +in Section \@ref(prerequisites). Gating is performed per image and the +`cytomapperShiny` function allows the export of gated cells in form of a +`SingleCellExperiment` or `SpatialExperiment` object. The cell label is stored +in `colData(object)$cytomapper_CellLabel` and the gates are stored in +`metadata(object)`. 
In the next section, we will read in and consolidate the +labeled data. + +### Define color vectors + +For consistent visualization of cell types, we will now pre-define their colors: + + +```r +celltype <- setNames(c("#3F1B03", "#F4AD31", "#894F36", "#1C750C", "#EF8ECC", + "#6471E2", "#4DB23B", "grey", "#F4800C", "#BF0A3D", "#066970"), + c("Tumor", "Stroma", "Myeloid", "CD8", "Plasma_cell", + "Treg", "CD4", "undefined", "BnTcell", "Bcell", "Neutrophil")) + +metadata(spe)$color_vectors$celltype <- celltype +``` + +### Read in and consolidate labeled data + +Here, we will read in the individual `SpatialExperiment` objects containing the +labeled cells and concatenate them. In the process of concatenating the +`SpatialExperiment` objects along their columns, the `sample_id` entry is +appended by `.1, .2, .3, ...` due to replicated entries. + + +```r +library(SingleCellExperiment) +label_files <- list.files("data/gated_cells", + full.names = TRUE, pattern = ".rds$") + +# Read in SPE objects +spes <- lapply(label_files, readRDS) + +# Merge SPE objects +concat_spe <- do.call("cbind", spes) +``` + +In the following code chunk we will identify cells that were labeled multiple +times. This occurs when different cell phenotypes are gated per image and can +affect immune cells that are located inside the tumor compartment. + +We will first identify those cells that were uniquely labeled. In the next step, +we will identify those cells that were labeled twice AND were labeled as Tumor +cells. These cells will be assigned their immune cell label. Finally, we will +save the unique labels within the original `SpatialExperiment` object. + +**Of note:** this concatenation strategy is specific for cell phenotypes contained in this +example dataset. The gated cell labels might need to be processed in a slightly +different way when working with other samples. 
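The consolidation strategy described above can be sketched on a small toy example before it is applied to the real data. The cell IDs, labels and object names here are hypothetical and only serve to illustrate the logic: cells labeled once keep their label, and cells labeled twice keep their non-tumor label if exactly one such label remains after removing "Tumor".

```r
# Toy gating result (hypothetical cell IDs and labels): cell "c2" was gated
# twice, once as Tumor and once as CD8; cell "c4" was gated twice with two
# different immune labels
gated <- data.frame(
    cell = c("c1", "c2", "c2", "c3", "c4", "c4"),
    label = c("Tumor", "Tumor", "CD8", "Bcell", "CD4", "CD8")
)

# Step 1: keep cells that were labeled exactly once
n_labels <- table(gated$cell)
unique_cells <- names(n_labels)[n_labels == 1]
unique_labels <- setNames(gated$label[match(unique_cells, gated$cell)],
                          unique_cells)

# Step 2: among the remaining cells, drop the Tumor label and keep cells
# that are left with exactly one (immune) label
non_tumor <- gated[gated$label != "Tumor" & !gated$cell %in% unique_cells, ]
n_non_tumor <- table(non_tumor$cell)
rescued_cells <- names(n_non_tumor)[n_non_tumor == 1]
rescued_labels <- setNames(non_tumor$label[match(rescued_cells, non_tumor$cell)],
                           rescued_cells)

# Final labels: c1 = Tumor, c3 = Bcell, c2 = CD8; c4 stays unlabeled
final_toy_labels <- c(unique_labels, rescued_labels)
final_toy_labels
```

In this sketch, cell `c4` remains without a label because its two immune labels conflict, which mirrors how ambiguous cells end up as "unlabeled" in the workflow.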
+ +For these tasks, we will define a filter function: + + +```r +filter_labels <- function(object, + label = "cytomapper_CellLabel") { + cur_tab <- unclass(table(colnames(object), object[[label]])) + + cur_labels <- colnames(cur_tab)[apply(cur_tab, 1, which.max)] + names(cur_labels) <- rownames(cur_tab) + + cur_labels <- cur_labels[rowSums(cur_tab) == 1] + + return(cur_labels) +} +``` + +This function is now applied to all cells and then only non-tumor cells. + + +```r +labels <- filter_labels(concat_spe) + +cur_spe <- concat_spe[,concat_spe$cytomapper_CellLabel != "Tumor"] + +non_tumor_labels <- filter_labels(cur_spe) + +additional_cells <- setdiff(names(non_tumor_labels), names(labels)) + +final_labels <- c(labels, non_tumor_labels[additional_cells]) + +# Transfer labels to SPE object +spe_labels <- rep("unlabeled", ncol(spe)) +names(spe_labels) <- colnames(spe) +spe_labels[names(final_labels)] <- final_labels +spe$cell_labels <- spe_labels + +# Number of cells labeled per patient +table(spe$cell_labels, spe$patient_id) +``` + +``` +## +## Patient1 Patient2 Patient3 Patient4 +## Bcell 152 131 234 263 +## BnTcell 396 37 240 1029 +## CD4 45 342 167 134 +## CD8 60 497 137 128 +## Myeloid 183 378 672 517 +## Neutrophil 97 4 17 16 +## Plasma_cell 34 536 87 59 +## Stroma 84 37 85 236 +## Treg 139 149 49 24 +## Tumor 2342 906 1618 1133 +## unlabeled 7214 9780 7826 9580 +``` + +Based on these labels, we can now train a random forest classifier to classify +all remaining, unlabeled cells. + +### Train classifier + +In this section, we will use the +[caret](https://topepo.github.io/caret/index.html) framework for machine +learning in R. This package provides an interface to train a number of +regression and classification models in a coherent fashion. We use a random +forest classifier due to low number of parameters, high speed and an observed +high performance for cell type classification [@Hoch2022]. 
+ +In the following section, we will first split the `SpatialExperiment` object +into labeled and unlabeled cells. Based on the labeled cells, we split +the data into a training (75% of the data) and a test (25% of the data) dataset. +We currently do not provide an independently labeled validation dataset. + +The `caret` package provides the `trainControl` function, which specifies model +training parameters, and the `train` function, which performs the actual model +training. While training the model, we also want to estimate the best model +parameters. In the case of the chosen random forest model (`method = "rf"`), we +only need to estimate a single parameter (`mtry`), which corresponds to the +number of variables randomly sampled as candidates at each split. To estimate +the best parameter, we will perform a 5-fold cross-validation (set within +`trainControl`) over a tune length of 5 entries to `mtry`. In the following +code chunk, the `createDataPartition` and `train` functions are not deterministic, +meaning they return different results across different runs. We therefore set +a `seed` here for both functions. 
+ + + + +```r +library(caret) + +# Split between labeled and unlabeled cells +lab_spe <- spe[,spe$cell_labels != "unlabeled"] +unlab_spe <- spe[,spe$cell_labels == "unlabeled"] + +# Randomly split into train and test data +set.seed(221029) +trainIndex <- createDataPartition(factor(lab_spe$cell_labels), p = 0.75) + +train_spe <- lab_spe[,trainIndex$Resample1] +test_spe <- lab_spe[,-trainIndex$Resample1] + +# Define fit parameters for 5-fold cross validation +fitControl <- trainControl(method = "cv", + number = 5) + +# Select the arsinh-transformed counts for training +cur_mat <- t(assay(train_spe, "exprs")[rowData(train_spe)$use_channel,]) + +# Train a random forest classifier +rffit <- train(x = cur_mat, + y = factor(train_spe$cell_labels), + method = "rf", ntree = 1000, + tuneLength = 5, + trControl = fitControl) + +rffit +``` + +``` +## Random Forest +## +## 10049 samples +## 37 predictor +## 10 classes: 'Bcell', 'BnTcell', 'CD4', 'CD8', 'Myeloid', 'Neutrophil', 'Plasma_cell', 'Stroma', 'Treg', 'Tumor' +## +## No pre-processing +## Resampling: Cross-Validated (5 fold) +## Summary of sample sizes: 8040, 8039, 8038, 8038, 8041 +## Resampling results across tuning parameters: +## +## mtry Accuracy Kappa +## 2 0.9643726 0.9524051 +## 10 0.9780071 0.9707483 +## 19 0.9801973 0.9736577 +## 28 0.9787052 0.9716635 +## 37 0.9779095 0.9705890 +## +## Accuracy was used to select the optimal model using the largest value. +## The final value used for the model was mtry = 19. +``` + + + +Training the classifier took +11.77 minutes. + +### Classifier performance + +We next assess the accuracy of the classifier when predicting cell phenotypes +across the cross-validation and when applying the classifier to the test +dataset. 
+ +First, we can visualize the classification accuracy during parameter +tuning: + + +```r +ggplot(rffit) + + geom_errorbar(data = rffit$results, + aes(ymin = Accuracy - AccuracySD, + ymax = Accuracy + AccuracySD), + width = 0.4) + + theme_classic(base_size = 15) +``` + + + +The best value for `mtry` is 19 and is used when predicting new data. + +It is often recommended to visualize the variable importance of the +classifier. The following plot specifies which variables (markers) are +most important for classifying the data. + + +```r +plot(varImp(rffit)) +``` + + + +As expected, the markers that were used for gating (Ecad, CD3, CD20, HLADR, +CD8a, CD38, FOXP3) were important for classification. + +To assess the accuracy, sensitivity, specificity, among other quality measures of +the classifier, we will now predict cell phenotypes in the test data. + + +```r +# Select the arsinh-transformed counts of the test data +cur_mat <- t(assay(test_spe, "exprs")[rowData(test_spe)$use_channel,]) + +# Predict the cell phenotype labels of the test data +set.seed(231019) +cur_pred <- predict(rffit, newdata = cur_mat) +``` + +While the overall classification accuracy can appear high, we also want +to check if each cell phenotype class is correctly predicted. +For this, we will calculate the confusion matrix between predicted and actual +cell labels. This measure may highlight individual cell phenotype classes that +were not correctly predicted by the classifier. When setting `mode = "everything"`, +the `confusionMatrix` function returns all available prediction measures including +sensitivity, specificity, precision, recall and the F1 score per cell +phenotype class. 
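To make the per-class measures named above more concrete, the following minimal sketch derives them by hand from a confusion matrix in a one-versus-all fashion. The 3-class matrix and all object names are hypothetical and chosen only for illustration; they are not taken from the example dataset.

```r
# Hypothetical 3-class confusion matrix (rows: predicted, columns: reference)
conf_mat <- matrix(c(50,  2,  1,
                      3, 40,  5,
                      2,  3, 44),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(Prediction = c("A", "B", "C"),
                                   Reference = c("A", "B", "C")))

tp <- diag(conf_mat)             # correctly predicted cells per class
fp <- rowSums(conf_mat) - tp     # predicted as the class, other reference
fn <- colSums(conf_mat) - tp     # reference is the class, predicted otherwise
tn <- sum(conf_mat) - tp - fp - fn

sensitivity <- tp / (tp + fn)    # identical to recall
specificity <- tn / (tn + fp)
precision <- tp / (tp + fp)      # identical to the positive predictive value
f1 <- 2 * precision * sensitivity / (precision + sensitivity)

round(cbind(sensitivity, specificity, precision, f1), 3)
```

Each class is treated in turn as the "positive" class and all other classes as "negative", which is why a rare class can show a low sensitivity even when the overall accuracy is high.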
+ + +```r +cm <- confusionMatrix(data = cur_pred, + reference = factor(test_spe$cell_labels), + mode = "everything") + +cm +``` + +``` +## Confusion Matrix and Statistics +## +## Reference +## Prediction Bcell BnTcell CD4 CD8 Myeloid Neutrophil Plasma_cell Stroma +## Bcell 186 2 0 0 0 0 6 0 +## BnTcell 4 423 1 0 0 0 0 0 +## CD4 0 0 163 0 0 2 3 2 +## CD8 0 0 0 199 0 0 8 0 +## Myeloid 0 0 2 1 437 0 0 0 +## Neutrophil 0 0 0 0 0 30 0 0 +## Plasma_cell 1 0 3 2 0 0 158 0 +## Stroma 0 0 2 0 0 0 0 108 +## Treg 0 0 0 0 0 0 3 0 +## Tumor 4 0 1 3 0 1 1 0 +## Reference +## Prediction Treg Tumor +## Bcell 0 1 +## BnTcell 0 1 +## CD4 0 5 +## CD8 0 3 +## Myeloid 0 0 +## Neutrophil 0 0 +## Plasma_cell 1 0 +## Stroma 0 0 +## Treg 89 2 +## Tumor 0 1487 +## +## Overall Statistics +## +## Accuracy : 0.9806 +## 95% CI : (0.9753, 0.985) +## No Information Rate : 0.4481 +## P-Value [Acc > NIR] : < 2.2e-16 +## +## Kappa : 0.9741 +## +## Mcnemar's Test P-Value : NA +## +## Statistics by Class: +## +## Class: Bcell Class: BnTcell Class: CD4 Class: CD8 +## Sensitivity 0.95385 0.9953 0.94767 0.97073 +## Specificity 0.99714 0.9979 0.99622 0.99650 +## Pos Pred Value 0.95385 0.9860 0.93143 0.94762 +## Neg Pred Value 0.99714 0.9993 0.99716 0.99809 +## Precision 0.95385 0.9860 0.93143 0.94762 +## Recall 0.95385 0.9953 0.94767 0.97073 +## F1 0.95385 0.9906 0.93948 0.95904 +## Prevalence 0.05830 0.1271 0.05142 0.06129 +## Detection Rate 0.05561 0.1265 0.04873 0.05949 +## Detection Prevalence 0.05830 0.1283 0.05232 0.06278 +## Balanced Accuracy 0.97549 0.9966 0.97195 0.98361 +## Class: Myeloid Class: Neutrophil Class: Plasma_cell +## Sensitivity 1.0000 0.909091 0.88268 +## Specificity 0.9990 1.000000 0.99779 +## Pos Pred Value 0.9932 1.000000 0.95758 +## Neg Pred Value 1.0000 0.999095 0.99340 +## Precision 0.9932 1.000000 0.95758 +## Recall 1.0000 0.909091 0.88268 +## F1 0.9966 0.952381 0.91860 +## Prevalence 0.1306 0.009865 0.05351 +## Detection Rate 0.1306 0.008969 0.04723 +## Detection Prevalence 
0.1315 0.008969 0.04933 +## Balanced Accuracy 0.9995 0.954545 0.94024 +## Class: Stroma Class: Treg Class: Tumor +## Sensitivity 0.98182 0.98889 0.9920 +## Specificity 0.99938 0.99846 0.9946 +## Pos Pred Value 0.98182 0.94681 0.9933 +## Neg Pred Value 0.99938 0.99969 0.9935 +## Precision 0.98182 0.94681 0.9933 +## Recall 0.98182 0.98889 0.9920 +## F1 0.98182 0.96739 0.9927 +## Prevalence 0.03288 0.02691 0.4481 +## Detection Rate 0.03229 0.02661 0.4445 +## Detection Prevalence 0.03288 0.02810 0.4475 +## Balanced Accuracy 0.99060 0.99368 0.9933 +``` + +To easily visualize these results, we can now plot the true positive rate +(sensitivity) versus the false positive rate (1 - specificity). The size of the +point is determined by the number of true positives divided by the total number +of cells. + + +```r +library(tidyverse) + +data.frame(cm$byClass) %>% + mutate(class = sub("Class: ", "", rownames(cm$byClass))) %>% + ggplot() + + geom_point(aes(1 - Specificity, Sensitivity, + size = Detection.Rate, + fill = class), + shape = 21) + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + + theme_classic(base_size = 15) + + ylab("Sensitivity (TPR)") + + xlab("1 - Specificity (FPR)") +``` + + + +We observe high sensitivity and specificity for most cell types. Plasma cells +show the lowest true positive rate at 88%, which is still sufficiently high. + +Finally, to observe which cell phenotypes were wrongly classified, we can visualize +the distribution of classification probabilities per cell phenotype class: + + +```r +set.seed(231019) +cur_pred <- predict(rffit, + newdata = cur_mat, + type = "prob") +cur_pred$truth <- factor(test_spe$cell_labels) + +cur_pred %>% + pivot_longer(cols = Bcell:Tumor) %>% + ggplot() + + geom_boxplot(aes(x = name, y = value, fill = name), outlier.size = 0.5) + + facet_wrap(. 
~ truth, ncol = 1) + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + + theme(panel.background = element_blank(), + axis.text.x = element_text(angle = 45, hjust = 1)) +``` + + + +The boxplots indicate the classification probabilities per class. The classifier +is well trained if classification probabilities are only high for the one +specific class. + +### Classification of new data + +In the final section, we will now use the tuned and tested random forest +classifier to predict the cell phenotypes of the unlabeled data. + +First, we predict the cell phenotypes and extract their classification +probabilities. + + +```r +# Select the arsinh-transformed counts of the unlabeled data for prediction +cur_mat <- t(assay(unlab_spe, "exprs")[rowData(unlab_spe)$use_channel,]) + +# Predict the cell phenotype labels of the unlabeled data +set.seed(231014) +cell_class <- as.character(predict(rffit, + newdata = cur_mat, + type = "raw")) +names(cell_class) <- rownames(cur_mat) + +table(cell_class) +``` + +``` +## cell_class +## Bcell BnTcell CD4 CD8 Myeloid Neutrophil +## 817 979 3620 2716 6302 559 +## Plasma_cell Stroma Treg Tumor +## 2692 4904 1170 10641 +``` + +```r +# Extract prediction probabilities for each cell +set.seed(231014) +cell_prob <- predict(rffit, + newdata = cur_mat, + type = "prob") +``` + +Each cell is assigned to the class with highest probability. There are however +cases, where the highest probability is low meaning the cell can not be uniquely +assigned to a class. We next want to identify these cells and label them as +"undefined". Here, we select a maximum classification probability threshold +of 40% but this threshold needs to be adjusted for other datasets. The adjusted +cell labels are then stored in the `SpatialExperiment` object. 
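The thresholding logic described above can be illustrated on a small toy matrix of class probabilities. The probabilities, cell names and the object names `prob`, `toy_class` and `threshold` are hypothetical; only the 40% cutoff is taken from the text.

```r
# Hypothetical class probabilities for four cells (rows sum to 1)
prob <- matrix(c(0.90, 0.05, 0.05,
                 0.35, 0.33, 0.32,
                 0.10, 0.75, 0.15,
                 0.38, 0.24, 0.38),
               ncol = 3, byrow = TRUE,
               dimnames = list(paste0("cell", 1:4),
                               c("Bcell", "CD4", "Tumor")))

# Class with the highest probability per cell (ties resolved by first match)
toy_class <- colnames(prob)[max.col(prob, ties.method = "first")]
names(toy_class) <- rownames(prob)

# Cells whose highest probability falls below the threshold become "undefined"
threshold <- 0.4
toy_class[apply(prob, 1, max) < threshold] <- "undefined"

toy_class
```

Cells 2 and 4 have no class with a probability of at least 40% and are therefore set to "undefined", while cells 1 and 3 keep their most probable class.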
+ + +```r +library(ggridges) + +# Distribution of maximum probabilities +tibble(max_prob = rowMax(as.matrix(cell_prob)), + type = cell_class) %>% + ggplot() + + geom_density_ridges(aes(x = max_prob, y = cell_class, fill = cell_class)) + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + + theme_classic(base_size = 15) + + xlab("Maximum probability") + + ylab("Cell type") + + xlim(c(0,1.2)) +``` + +``` +## Picking joint bandwidth of 0.0238 +``` + + + +```r +# Label undefined cells +cell_class[rowMax(as.matrix(cell_prob)) < 0.4] <- "undefined" + +# Store labels in SpatialExperiment object +cell_labels <- spe$cell_labels +cell_labels[colnames(unlab_spe)] <- cell_class +spe$celltype <- cell_labels + +table(spe$celltype, spe$patient_id) +``` + +``` +## +## Patient1 Patient2 Patient3 Patient4 +## Bcell 179 527 431 458 +## BnTcell 416 586 594 1078 +## CD4 391 1370 699 1385 +## CD8 518 1365 479 1142 +## Myeloid 1369 2197 1723 2731 +## Neutrophil 348 9 148 176 +## Plasma_cell 650 2122 351 274 +## Stroma 633 676 736 3261 +## Treg 553 409 243 310 +## Tumor 5560 3334 5648 2083 +## undefined 129 202 80 221 +``` + +We can now compare the cell labels derived by classification to the different +clustering strategies. The first comparison is against the clustering results +using the asinh-transformed counts. + + +```r +tab1 <- table(spe$celltype, + paste("Rphenograph", spe$pg_clusters)) +tab2 <- table(spe$celltype, + paste("SNN", spe$nn_clusters)) +tab3 <- table(spe$celltype, + paste("SOM", spe$som_clusters)) + +pheatmap(log10(tab1 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab2 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab3 + 10), color = viridis(100)) +``` + + + +We can see that Tumor and Myeloid cells span multiple clusters while +Neutrophils are detected as an individual cluster by all clustering approaches. + +We next compare the cell classification against clustering results using the +integrated cells. 
+ + +```r +tab1 <- table(spe$celltype, + paste("Rphenograph", spe$pg_clusters_corrected)) +tab2 <- table(spe$celltype, + paste("SNN", spe$nn_clusters_corrected)) +tab3 <- table(spe$celltype, + paste("SOM", spe$som_clusters_corrected)) + +pheatmap(log10(tab1 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab2 + 10), color = viridis(100)) +``` + + + +```r +pheatmap(log10(tab3 + 10), color = viridis(100)) +``` + + + +We observe a high agreement between the shared nearest neighbor clustering +approach using the integrated cells and the cell phenotypes derived by +classification. + +In the next sections, we will highlight visualization strategies to verify the +correctness of the phenotyping approach. Specifically, Section +\@ref(outline-cells) shows how to outline identified cell phenotypes on +composite images. + + + +Finally, we save the updated `SpatialExperiment` object. + + +```r +saveRDS(spe, "data/spe.rds") +``` + + + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 ggridges_0.5.4 +## [3] lubridate_1.9.3 forcats_1.0.0 +## [5] stringr_1.5.0 purrr_1.0.2 +## [7] readr_2.1.4 tidyr_1.3.0 +## [9] tibble_3.2.1 tidyverse_2.0.0 +## [11] caret_6.0-94 lattice_0.21-8 +## [13] cytomapper_1.12.0 EBImage_4.42.0 +## [15] dplyr_1.1.3 gridExtra_2.3 +## [17] pheatmap_1.0.12 patchwork_1.1.3 +## [19] ConsensusClusterPlus_1.64.0 kohonen_3.0.12 +## [21] CATALYST_1.24.0 scran_1.28.2 +## [23] scuttle_1.10.2 BiocParallel_1.34.2 +## [25] bluster_1.10.0 viridis_0.6.4 +## [27] viridisLite_0.4.2 dittoSeq_1.12.1 +## [29] ggplot2_3.4.3 igraph_1.5.1 +## [31] Rphenograph_0.99.1.9003 SpatialExperiment_1.10.0 +## [33] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 +## [35] Biobase_2.60.0 GenomicRanges_1.52.0 +## [37] GenomeInfoDb_1.36.3 IRanges_2.34.1 +## [39] S4Vectors_0.38.2 BiocGenerics_0.46.0 +## [41] MatrixGenerics_1.12.3 matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] bitops_1.0-7 RColorBrewer_1.1-3 +## [3] doParallel_1.0.17 tools_4.3.1 +## [5] backports_1.4.1 utf8_1.2.3 +## [7] R6_2.5.1 HDF5Array_1.28.1 +## [9] rhdf5filters_1.12.1 GetoptLong_1.0.5 +## [11] withr_2.5.1 
sp_2.0-0 +## [13] cli_3.6.1 sandwich_3.0-2 +## [15] labeling_0.4.3 sass_0.4.7 +## [17] nnls_1.5 mvtnorm_1.2-3 +## [19] randomForest_4.7-1.1 proxy_0.4-27 +## [21] systemfonts_1.0.4 colorRamps_2.3.1 +## [23] svglite_2.1.1 R.utils_2.12.2 +## [25] scater_1.28.0 parallelly_1.36.0 +## [27] plotrix_3.8-2 limma_3.56.2 +## [29] flowCore_2.12.2 rstudioapi_0.15.0 +## [31] generics_0.1.3 shape_1.4.6 +## [33] gtools_3.9.4 car_3.1-2 +## [35] Matrix_1.6-1.1 RProtoBufLib_2.12.1 +## [37] waldo_0.5.1 ggbeeswarm_0.7.2 +## [39] fansi_1.0.4 abind_1.4-5 +## [41] R.methodsS3_1.8.2 terra_1.7-46 +## [43] lifecycle_1.0.3 multcomp_1.4-25 +## [45] yaml_2.3.7 edgeR_3.42.4 +## [47] carData_3.0-5 rhdf5_2.44.0 +## [49] recipes_1.0.8 Rtsne_0.16 +## [51] grid_4.3.1 promises_1.2.1 +## [53] dqrng_0.3.1 crayon_1.5.2 +## [55] shinydashboard_0.7.2 beachmat_2.16.0 +## [57] cowplot_1.1.1 magick_2.8.0 +## [59] pillar_1.9.0 knitr_1.44 +## [61] ComplexHeatmap_2.16.0 metapod_1.8.0 +## [63] rjson_0.2.21 future.apply_1.11.0 +## [65] codetools_0.2-19 glue_1.6.2 +## [67] data.table_1.14.8 vctrs_0.6.3 +## [69] png_0.1-8 gtable_0.3.4 +## [71] cachem_1.0.8 gower_1.0.1 +## [73] xfun_0.40 S4Arrays_1.0.6 +## [75] mime_0.12 prodlim_2023.08.28 +## [77] DropletUtils_1.20.0 survival_3.5-5 +## [79] timeDate_4022.108 iterators_1.0.14 +## [81] cytolib_2.12.1 hardhat_1.3.0 +## [83] lava_1.7.2.1 statmod_1.5.0 +## [85] ellipsis_0.3.2 TH.data_1.1-2 +## [87] ipred_0.9-14 nlme_3.1-162 +## [89] rprojroot_2.0.3 bslib_0.5.1 +## [91] irlba_2.3.5.1 svgPanZoom_0.3.4 +## [93] vipor_0.4.5 rpart_4.1.19 +## [95] colorspace_2.1-0 raster_3.6-23 +## [97] nnet_7.3-19 tidyselect_1.2.0 +## [99] compiler_4.3.1 BiocNeighbors_1.18.0 +## [101] desc_1.4.2 DelayedArray_0.26.7 +## [103] bookdown_0.35 scales_1.2.1 +## [105] tiff_0.1-11 digest_0.6.33 +## [107] fftwtools_0.9-11 rmarkdown_2.25 +## [109] XVector_0.40.0 htmltools_0.5.6 +## [111] pkgconfig_2.0.3 jpeg_0.1-10 +## [113] sparseMatrixStats_1.12.2 fastmap_1.1.1 +## [115] rlang_1.1.1 
GlobalOptions_0.1.2 +## [117] htmlwidgets_1.6.2 shiny_1.7.5 +## [119] DelayedMatrixStats_1.22.6 farver_2.1.1 +## [121] jquerylib_0.1.4 zoo_1.8-12 +## [123] jsonlite_1.8.7 ModelMetrics_1.2.2.2 +## [125] R.oo_1.25.0 BiocSingular_1.16.0 +## [127] RCurl_1.98-1.12 magrittr_2.0.3 +## [129] GenomeInfoDbData_1.2.10 Rhdf5lib_1.22.1 +## [131] munsell_0.5.0 Rcpp_1.0.11 +## [133] ggnewscale_0.4.9 pROC_1.18.4 +## [135] stringi_1.7.12 brio_1.1.3 +## [137] zlibbioc_1.46.0 MASS_7.3-60 +## [139] plyr_1.8.8 listenv_0.9.0 +## [141] parallel_4.3.1 ggrepel_0.9.3 +## [143] splines_4.3.1 hms_1.1.3 +## [145] circlize_0.4.15 locfit_1.5-9.8 +## [147] ggpubr_0.6.0 ggsignif_0.6.4 +## [149] pkgload_1.3.3 reshape2_1.4.4 +## [151] ScaledMatrix_1.8.1 XML_3.99-0.14 +## [153] drc_3.0-1 evaluate_0.21 +## [155] tzdb_0.4.0 foreach_1.5.2 +## [157] tweenr_2.0.2 httpuv_1.6.11 +## [159] RANN_2.6.1 polyclip_1.10-6 +## [161] future_1.33.0 clue_0.3-65 +## [163] ggforce_0.4.1 rsvd_1.0.5 +## [165] broom_1.0.5 xtable_1.8-4 +## [167] e1071_1.7-13 rstatix_0.7.2 +## [169] later_1.3.1 class_7.3-22 +## [171] FlowSOM_2.8.0 beeswarm_0.4.0 +## [173] cluster_2.1.4 timechange_0.2.0 +## [175] globals_0.16.2 +``` +
diff --git a/08-phenotyping_files/figure-html/accuracy-tuning-1.png b/08-phenotyping_files/figure-html/accuracy-tuning-1.png new file mode 100644 index 00000000..b9a1ac5e Binary files /dev/null and b/08-phenotyping_files/figure-html/accuracy-tuning-1.png differ diff --git a/08-phenotyping_files/figure-html/cluster-diagnostics-1.png b/08-phenotyping_files/figure-html/cluster-diagnostics-1.png new file mode 100644 index 00000000..bb54753b Binary files /dev/null and b/08-phenotyping_files/figure-html/cluster-diagnostics-1.png differ diff --git a/08-phenotyping_files/figure-html/cluster-diagnostics-2.png b/08-phenotyping_files/figure-html/cluster-diagnostics-2.png new file mode 100644 index 00000000..5ce7be2c Binary files /dev/null and b/08-phenotyping_files/figure-html/cluster-diagnostics-2.png differ diff --git a/08-phenotyping_files/figure-html/compare-corrected-1.png b/08-phenotyping_files/figure-html/compare-corrected-1.png new file mode 100644 index 00000000..52d62836 Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-corrected-1.png differ diff --git a/08-phenotyping_files/figure-html/compare-corrected-2.png b/08-phenotyping_files/figure-html/compare-corrected-2.png new file mode 100644 index 00000000..bc3abdfc Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-corrected-2.png differ diff --git a/08-phenotyping_files/figure-html/compare-corrected-3.png b/08-phenotyping_files/figure-html/compare-corrected-3.png new file mode 100644 index 00000000..1551ab02 Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-corrected-3.png differ diff --git a/08-phenotyping_files/figure-html/compare-raw-1.png b/08-phenotyping_files/figure-html/compare-raw-1.png new file mode 100644 index 00000000..c8e7c86a Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-raw-1.png differ diff --git a/08-phenotyping_files/figure-html/compare-raw-2.png b/08-phenotyping_files/figure-html/compare-raw-2.png new file mode 
100644 index 00000000..01b0f563 Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-raw-2.png differ diff --git a/08-phenotyping_files/figure-html/compare-raw-3.png b/08-phenotyping_files/figure-html/compare-raw-3.png new file mode 100644 index 00000000..849a7801 Binary files /dev/null and b/08-phenotyping_files/figure-html/compare-raw-3.png differ diff --git a/08-phenotyping_files/figure-html/corrected-1.png b/08-phenotyping_files/figure-html/corrected-1.png new file mode 100644 index 00000000..5b865b4b Binary files /dev/null and b/08-phenotyping_files/figure-html/corrected-1.png differ diff --git a/08-phenotyping_files/figure-html/corrected-2.png b/08-phenotyping_files/figure-html/corrected-2.png new file mode 100644 index 00000000..89e50d88 Binary files /dev/null and b/08-phenotyping_files/figure-html/corrected-2.png differ diff --git a/08-phenotyping_files/figure-html/corrected-3.png b/08-phenotyping_files/figure-html/corrected-3.png new file mode 100644 index 00000000..82055ce7 Binary files /dev/null and b/08-phenotyping_files/figure-html/corrected-3.png differ diff --git a/08-phenotyping_files/figure-html/flowSOM-1-1.png b/08-phenotyping_files/figure-html/flowSOM-1-1.png new file mode 100644 index 00000000..710c141a Binary files /dev/null and b/08-phenotyping_files/figure-html/flowSOM-1-1.png differ diff --git a/08-phenotyping_files/figure-html/flowSOM-1-2.png b/08-phenotyping_files/figure-html/flowSOM-1-2.png new file mode 100644 index 00000000..d7a4ddf3 Binary files /dev/null and b/08-phenotyping_files/figure-html/flowSOM-1-2.png differ diff --git a/08-phenotyping_files/figure-html/prediciton-probability-1.png b/08-phenotyping_files/figure-html/prediciton-probability-1.png new file mode 100644 index 00000000..28dce7d7 Binary files /dev/null and b/08-phenotyping_files/figure-html/prediciton-probability-1.png differ diff --git a/08-phenotyping_files/figure-html/raw-counts-1.png b/08-phenotyping_files/figure-html/raw-counts-1.png new file 
mode 100644 index 00000000..e6cef74b Binary files /dev/null and b/08-phenotyping_files/figure-html/raw-counts-1.png differ diff --git a/08-phenotyping_files/figure-html/raw-counts-2.png b/08-phenotyping_files/figure-html/raw-counts-2.png new file mode 100644 index 00000000..07b94cbc Binary files /dev/null and b/08-phenotyping_files/figure-html/raw-counts-2.png differ diff --git a/08-phenotyping_files/figure-html/raw-counts-3.png b/08-phenotyping_files/figure-html/raw-counts-3.png new file mode 100644 index 00000000..ba1b764b Binary files /dev/null and b/08-phenotyping_files/figure-html/raw-counts-3.png differ diff --git a/08-phenotyping_files/figure-html/rphenograph-1-1.png b/08-phenotyping_files/figure-html/rphenograph-1-1.png new file mode 100644 index 00000000..66787a04 Binary files /dev/null and b/08-phenotyping_files/figure-html/rphenograph-1-1.png differ diff --git a/08-phenotyping_files/figure-html/rphenograph-2-1.png b/08-phenotyping_files/figure-html/rphenograph-2-1.png new file mode 100644 index 00000000..89cf008c Binary files /dev/null and b/08-phenotyping_files/figure-html/rphenograph-2-1.png differ diff --git a/08-phenotyping_files/figure-html/snn-1-1.png b/08-phenotyping_files/figure-html/snn-1-1.png new file mode 100644 index 00000000..041d1d4a Binary files /dev/null and b/08-phenotyping_files/figure-html/snn-1-1.png differ diff --git a/08-phenotyping_files/figure-html/snn-2-1.png b/08-phenotyping_files/figure-html/snn-2-1.png new file mode 100644 index 00000000..1792da48 Binary files /dev/null and b/08-phenotyping_files/figure-html/snn-2-1.png differ diff --git a/08-phenotyping_files/figure-html/specificity-sensitivity-1.png b/08-phenotyping_files/figure-html/specificity-sensitivity-1.png new file mode 100644 index 00000000..12e81af6 Binary files /dev/null and b/08-phenotyping_files/figure-html/specificity-sensitivity-1.png differ diff --git a/08-phenotyping_files/figure-html/undefined-cells-1.png 
b/08-phenotyping_files/figure-html/undefined-cells-1.png new file mode 100644 index 00000000..21464d8d Binary files /dev/null and b/08-phenotyping_files/figure-html/undefined-cells-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-10-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-10-1.png new file mode 100644 index 00000000..815844a7 Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-10-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-12-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-12-1.png new file mode 100644 index 00000000..a879378b Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-12-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-14-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-14-1.png new file mode 100644 index 00000000..1794716d Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-14-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-14-2.png b/08-phenotyping_files/figure-html/unnamed-chunk-14-2.png new file mode 100644 index 00000000..8cb4a89b Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-14-2.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-15-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-15-1.png new file mode 100644 index 00000000..4807ef62 Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-15-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-2-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-2-1.png new file mode 100644 index 00000000..3596f2a4 Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-2-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-4-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-4-1.png new file mode 100644 index 00000000..36f3ac14 Binary files /dev/null and 
b/08-phenotyping_files/figure-html/unnamed-chunk-4-1.png differ diff --git a/08-phenotyping_files/figure-html/unnamed-chunk-8-1.png b/08-phenotyping_files/figure-html/unnamed-chunk-8-1.png new file mode 100644 index 00000000..76f36939 Binary files /dev/null and b/08-phenotyping_files/figure-html/unnamed-chunk-8-1.png differ diff --git a/08-phenotyping_files/figure-html/variable-importance-1.png b/08-phenotyping_files/figure-html/variable-importance-1.png new file mode 100644 index 00000000..6ee6ff08 Binary files /dev/null and b/08-phenotyping_files/figure-html/variable-importance-1.png differ diff --git a/09-singlecell_visualization.md b/09-singlecell_visualization.md new file mode 100644 index 00000000..cd1e5650 --- /dev/null +++ b/09-singlecell_visualization.md @@ -0,0 +1,979 @@ +# Single cell visualization {#single-cell-visualization} + +The following section describes typical approaches for visualizing +single-cell data. + +This chapter is divided into three parts. Section \@ref(cell-type-level) +will highlight visualization approaches downstream of cell type +classification from Section \@ref(classification). We will then focus on +visualization methods that relate single-cell data to the sample level +in Section \@ref(sample-level). Lastly, Section \@ref(rich-example) will +provide a more customized example on how to integrate various +single-cell and sample metadata into one heatmap using the +[ComplexHeatmap](https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html) +package [@Gu2016]. + +Visualization functions from popular R packages in single-cell research +such as +[scater](https://bioconductor.org/packages/release/bioc/html/scater.html), +[DittoSeq](https://bioconductor.org/packages/release/bioc/html/dittoSeq.html) +and +[CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) +will be utilized. We will recycle methods and functions that we have +used in previous sections, while also introducing new ones. 
+ +Please note that this chapter aims to provide an overview of **common** +visualization options and should be seen as a stepping-stone. However, +many more options exist and the user should customize the visualization +according to the biological question at hand. + +## Load data + +First, we will read in the previously generated `SpatialExperiment` +object. + + +```r +spe <- readRDS("data/spe.rds") +``` + +For visualization purposes, we will define markers that were used for +cell type classification and markers that can indicate a specific cell +state (e.g., Ki67 for proliferating cells). + + +```r +# Define cell phenotype markers +type_markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", + "SMA", "CD8a", "CD4", "HLADR", "CD15", "CD38", "PDGFRb") + +# Define cell state markers +state_markers <- c("CarbonicAnhydrase", "Ki67", "PD1", "GrzB", "PDL1", + "ICOS", "TCF7", "VISTA") + +# Add to spe +rowData(spe)$marker_class <- ifelse(rownames(spe) %in% type_markers, "type", + ifelse(rownames(spe) %in% state_markers, "state", + "other")) +``` + +## Cell-type level {#cell-type-level} + +In the first section of this chapter, the grouping-level for the +visualization approaches will be the cell type classification from +Section \@ref(classification). Other grouping levels (e.g., cluster +assignments from Section \@ref(clustering)) are possible and the user +should adjust depending on the chosen analysis workflow. + +### Dimensionality reduction visualization + +As seen before, we can visualize single cells in low-dimensional space. +Often, non-linear methods for dimensionality reduction such as tSNE and +UMAP are used. They aim to preserve the distances between each cell and its +neighbors in the high-dimensional space. + +Interpreting these plots is not trivial, but local neighborhoods in the +plot can suggest similarity in expression for given cells.
See +[Orchestrating Single-Cell Analysis with +Bioconductor](https://bioconductor.org/books/release/OSCA/) for more +details. + +Here, we will use `dittoDimPlot` from the +[DittoSeq](https://bioconductor.org/packages/release/bioc/html/dittoSeq.html) +package and `plotReducedDim` from the +[scater](https://bioconductor.org/packages/release/bioc/html/scater.html) package +to visualize the fastMNN-corrected UMAP colored by cell type and by +expression (using the asinh-transformed intensities). + +Both functions are highly flexible and return `ggplot` objects which can +be further modified. + + +```r +library(dittoSeq) +library(scater) +library(patchwork) +library(cowplot) +library(viridis) + +## UMAP colored by cell type and expression - dittoDimPlot +p1 <- dittoDimPlot(spe, + var = "celltype", + reduction.use = "UMAP_mnnCorrected", + size = 0.2, + do.label = TRUE) + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + theme(legend.title = element_blank()) + + ggtitle("Cell types on UMAP, integrated cells") + +p2 <- dittoDimPlot(spe, + var = "Ecad", + assay = "exprs", + reduction.use = "UMAP_mnnCorrected", + size = 0.2, + colors = viridis(100), + do.label = TRUE) + + scale_color_viridis() + +p1 + p2 +``` + + + +The `plotReducedDim` function of the `scater` package provides an alternative +way of visualizing cells in low dimensions. Here, we loop over all type +markers, generate one plot per marker and plot the individual plots side-by-side. + + +```r +# UMAP colored by expression for all markers - plotReducedDim +plot_list <- lapply(rownames(spe)[rowData(spe)$marker_class == "type"], function(x){ + p <- plotReducedDim(spe, + dimred = "UMAP_mnnCorrected", + colour_by = x, + by_exprs_values = "exprs", + point_size = 0.2) + return(p) + }) + +plot_grid(plotlist = plot_list) +``` + + + +### Heatmap visualization + +Next, it is often useful to visualize single-cell expression per cell +type in the form of a heatmap.
For this, we will use the `dittoHeatmap` +function from the +[DittoSeq](https://bioconductor.org/packages/release/bioc/html/dittoSeq.html) +package. + +We sub-sample the dataset to 4000 cells for ease of visualization and +overlay the cancer type and patient ID from which the cells were +extracted. + + +```r +set.seed(220818) +cur_cells <- sample(seq_len(ncol(spe)), 4000) + +# Heatmap visualization - DittoHeatmap +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$marker_class == "type"], + assay = "exprs", + cluster_cols = FALSE, + scale = "none", + heatmap.colors = viridis(100), + annot.by = c("celltype", "indication", "patient_id"), + annotation_colors = list(indication = metadata(spe)$color_vectors$indication, + patient_id = metadata(spe)$color_vectors$patient_id, + celltype = metadata(spe)$color_vectors$celltype)) +``` + + + +Similarly, we can visualize the mean marker expression per cell type for all +cells by first calculating the mean marker expression per cell type using the +`aggregateAcrossCells` function from the +[scuttle](https://bioconductor.org/packages/release/bioc/html/scuttle.html) +package and then using `dittoHeatmap`. We will annotate the heatmap with the +number of cells per cell type and we will use different options for feature +scaling.
+ + +```r +library(scuttle) + +## aggregate by cell type +celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), + ids = spe$celltype, + statistics = "mean", + use.assay.type = "exprs", + subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) + +# No scaling +dittoHeatmap(celltype_mean, + assay = "exprs", + cluster_cols = TRUE, + scale = "none", + heatmap.colors = viridis(100), + annot.by = c("celltype", "ncells"), + annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, + ncells = plasma(100))) +``` + + + +```r +# Scaled to max +dittoHeatmap(celltype_mean, + assay = "exprs", + cluster_cols = TRUE, + scaled.to.max = TRUE, + heatmap.colors.max.scaled = inferno(100), + annot.by = c("celltype", "ncells"), + annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, + ncells = plasma(100))) +``` + + + +```r +# Z score scaled +dittoHeatmap(celltype_mean, + assay = "exprs", + cluster_cols = TRUE, + annot.by = c("celltype", "ncells"), + annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, + ncells = plasma(100))) +``` + + + +As illustrated above for unscaled, max-scaled, and Z score-scaled expression values, +different ways of scaling can have strong effects on visualization +output and we encourage the user to test multiple options. + +Overall, we can observe cell-type specific marker expression (e.g., Tumor += Ecad high and B cells = CD20 high) in agreement with the gating scheme +of Section \@ref(classification). + +### Violin plot visualization + +The `plotExpression` function from the +[scater](https://bioconductor.org/packages/release/bioc/html/scater.html) package +allows plotting the distribution of expression values across cell types +for a chosen set of proteins. The output is a `ggplot` object which can be +modified further.
+ + +```r +# Violin Plot - plotExpression +plotExpression(spe[,cur_cells], + features = rownames(spe)[rowData(spe)$marker_class == "type"], + x = "celltype", + exprs_values = "exprs", + colour_by = "celltype") + + theme(axis.text.x = element_text(angle = 90))+ + scale_color_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +### Scatter plot visualization + +Moreover, a protein expression-based scatter plot can be generated with +`dittoScatterPlot` (returns a `ggplot` object). We overlay the plot with +the cell type information. + + +```r +# Scatter plot +dittoScatterPlot(spe, + x.var = "CD3", + y.var = "CD20", + assay.x = "exprs", + assay.y = "exprs", + color.var = "celltype") + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + ggtitle("Scatterplot for CD3/CD20 labelled by celltype") +``` + + + +We can nicely observe how the "B next to T cell" phenotype (`BnTcell`) +has high expression values for both CD20 and CD3. + +**Of note**, in a setting where the user aims to assign labels to +clusters based on marker genes/proteins, all of the above plots can be +particularly helpful. + +### Barplot visualization + +In order to display frequencies of cell types per sample/patient, the +`dittoBarPlot` function will be used. Data can be represented as +percentages or counts and again `ggplot` objects are returned.
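
The difference between the two representations can be sketched with base R. This is only an illustration on made-up labels: the vectors `celltype` and `patient_id` below are hypothetical toy data and not columns of the `spe` object, and the sketch does not reproduce how `dittoBarPlot` computes its values internally.

```r
# Hypothetical toy labels for eight cells from two patients
celltype <- c("Tumor", "Tumor", "Bcell", "Tcell",
              "Tumor", "Bcell", "Bcell", "Tcell")
patient_id <- c("P1", "P1", "P1", "P1",
                "P2", "P2", "P2", "P2")

# Counts of each cell type per patient
counts <- table(celltype, patient_id)

# Fractions per patient: each column sums to 1
fractions <- prop.table(counts, margin = 2)

counts
fractions
```

Counts retain the information on how many cells each patient contributes, while fractions make patients with different total cell numbers directly comparable.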
+ + +```r +# by sample_id - percentage +dittoBarPlot(spe, + var = "celltype", + group.by = "sample_id") + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +```r +# by patient_id - percentage +dittoBarPlot(spe, + var = "celltype", + group.by = "patient_id") + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +```r +# by patient_id - count +dittoBarPlot(spe, + scale = "count", + var = "celltype", + group.by = "patient_id") + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +We can see that cell type frequencies change between samples/patients +and that the highest proportion/counts of plasma cells and stromal +cells can be observed for Patient 2 and Patient 4, respectively. + +### CATALYST-based visualization + +In the following, we highlight some useful visualization +functions from the +[CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) +package. + +To this end, we will first convert the `SpatialExperiment` object into a +CATALYST-compatible format. + + +```r +library(CATALYST) + +# Save SPE in CATALYST-compatible object with renamed colData entries and +# new metadata information +spe_cat <- spe + +spe_cat$sample_id <- factor(spe$sample_id) +spe_cat$condition <- factor(spe$indication) +spe_cat$cluster_id <- factor(spe$celltype) + +# Add celltype information to metadata +metadata(spe_cat)$cluster_codes <- data.frame(celltype = factor(spe_cat$celltype)) +``` + +All of the `CATALYST` functions presented below return `ggplot` objects, +which allow flexible downstream adjustment. + +#### Pseudobulk-level MDS plot + +Pseudobulk-level multi-dimensional scaling (MDS) plots can be rendered +with the exported `pbMDS` function. + +Here, we will use `pbMDS` to highlight expression similarities between +cell types and subsequently for each celltype-sample-combination. 
+ + +```r +# MDS pseudobulk by cell type +pbMDS(spe_cat, + by = "cluster_id", + features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], + label_by = "cluster_id", + k = "celltype") + + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) +``` + + + +```r +# MDS pseudobulk by cell type and sample_id +pbMDS(spe_cat, + by = "both", + features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], + k = "celltype", + shape_by = "condition", + size_by = TRUE) + + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) +``` + + + +We can see that the pseudobulk-expression profile of neutrophils seems +markedly distinct from the other cell types, while comparable cell types +such as the T cell subtypes group together. Furthermore, pseudobulk +cell-type profiles from SCCHN appear different from the other +indications. + +#### Reduced dimension plot on CLR of proportions + +The `clrDR` function produces dimensionality reduction plots on centered +log-ratios (CLR) of sample/cell type proportions across cell +types/samples. + +As with `pbMDS`, the output plots aim to illustrate the degree of +similarity between cell types based on sample proportions. + + +```r +# CLR on cluster proportions across samples +clrDR(spe_cat, + dr = "PCA", + by = "cluster_id", + k = "celltype", + label_by = "cluster_id", + arrow_col = "sample_id", + point_pal = metadata(spe_cat)$color_vectors$celltype) +``` + + + +We can again observe that neutrophils have a divergent profile also in +terms of their sample proportions. + +#### Pseudobulk expression boxplot + +The `plotPbExprs` function generates combined box- and jitter-plots of aggregated marker +expression per cell type and sample (image). Here, we further split the data by +cancer type.
+ + +```r +plotPbExprs(spe_cat, + k = "celltype", + facet_by = "cluster_id", + ncol = 2, + features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) + + scale_color_manual(values = metadata(spe_cat)$color_vectors$indication) +``` + + + +Notably, CD15 levels are elevated in SCCHN in comparison to all other +indications for most cell types. + +## Sample-level {#sample-level} + +In the next section, we will shift the grouping-level focus from the +cell type to the sample level. The sample level will be further divided +into the sample (image) level and the patient level. + +Although we will mostly repeat the functions from the previous Section +\@ref(cell-type-level), sample- and patient-level centered visualization +can provide additional quality control and biological interpretation. + +### Dimensionality reduction visualization + +Visualization of low-dimensional embeddings, here comparing non-corrected and +fastMNN-corrected UMAPs, colored by sample-level variables is often used +for "batch effect" assessment as mentioned in Section +\@ref(cell-quality). + +We will again use `dittoDimPlot`.
+ + +```r +## UMAP colored by sample ID and patient ID - dittoDimPlot +p1 <- dittoDimPlot(spe, + var = "sample_id", + reduction.use = "UMAP", + size = 0.2, + colors = viridis(100), + do.label = FALSE) + + scale_color_manual(values = metadata(spe)$color_vectors$sample_id) + + theme(legend.title = element_blank()) + + ggtitle("Sample ID") + +p2 <- dittoDimPlot(spe, + var = "sample_id", + reduction.use = "UMAP_mnnCorrected", + size = 0.2, + colors = viridis(100), + do.label = FALSE) + + scale_color_manual(values = metadata(spe)$color_vectors$sample_id) + + theme(legend.title = element_blank()) + + ggtitle("Sample ID") + +p3 <- dittoDimPlot(spe, + var = "patient_id", + reduction.use = "UMAP", + size = 0.2, + do.label = FALSE) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + theme(legend.title = element_blank()) + + ggtitle("Patient ID") + +p4 <- dittoDimPlot(spe, + var = "patient_id", + reduction.use = "UMAP_mnnCorrected", + size = 0.2, + do.label = FALSE) + + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + + theme(legend.title = element_blank()) + + ggtitle("Patient ID") + +(p1 + p2) / (p3 + p4) +``` + + + +As illustrated in Section \@ref(batch-effects), we see that the fastMNN +approach (right side of the plot) leads to mixing of cells across +samples/patients and thus batch effect correction. + +### Heatmap visualization + +It can be beneficial to use a heatmap to visualize single-cell +expression per sample and patient. Such a plot, which we will create +using `dittoHeatmap`, can highlight biological differences across +samples/patients.
+ + +```r +# Heatmap visualization - DittoHeatmap +dittoHeatmap(spe[,cur_cells], + genes = rownames(spe)[rowData(spe)$marker_class == "type"], + assay = "exprs", + order.by = c("patient_id","sample_id"), + cluster_cols = FALSE, + scale = "none", + heatmap.colors = viridis(100), + annot.by = c("celltype", "indication", "patient_id", "sample_id"), + annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, + indication = metadata(spe)$color_vectors$indication, + patient_id = metadata(spe)$color_vectors$patient_id, + sample_id = metadata(spe)$color_vectors$sample_id)) +``` + + + +As in Section \@ref(image-quality), aggregated mean marker expression +per sample/patient allows identification of samples/patients with +outlying expression patterns. + +Here, we will focus on the patient level and use `aggregateAcrossCells` +and `dittoHeatmap`. The heatmap will be annotated with the number of +cells per patient and the cancer type, and displayed using two scaling +options. + + +```r +# mean expression by patient_id +patient_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), + ids = spe$patient_id, + statistics = "mean", + use.assay.type = "exprs", + subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) + +# No scaling +dittoHeatmap(patient_mean, + assay = "exprs", + cluster_cols = TRUE, + scale = "none", + heatmap.colors = viridis(100), + annot.by = c("patient_id","indication","ncells"), + annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id, + indication = metadata(spe)$color_vectors$indication, + ncells = plasma(100))) +``` + + + +```r +# Max expression scaling +dittoHeatmap(patient_mean, + assay = "exprs", + cluster_cols = TRUE, + scaled.to.max = TRUE, + heatmap.colors.max.scaled = inferno(100), + annot.by = c("patient_id","indication","ncells"), + annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id, + indication = metadata(spe)$color_vectors$indication, + ncells = plasma(100))) +``` + + +
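
What the scaling options do to the values can be illustrated on a toy matrix with base R. This is a minimal sketch and not the internal code of `dittoHeatmap`: the matrix `toy_mean` and its values are made up, and we assume that max scaling divides each marker (row) by its maximum across groups; row-wise z-scoring is shown for comparison.

```r
# Toy mean-expression matrix (hypothetical values):
# markers in rows, patients in columns
toy_mean <- matrix(c(0.5, 1.0, 2.0,
                     3.0, 3.0, 6.0),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("markerA", "markerB"),
                                   c("P1", "P2", "P3")))

# Max scaling: divide each marker by its maximum so that values fall in [0, 1]
max_scaled <- toy_mean / apply(toy_mean, 1, max)

# Row-wise z-score: center each marker at 0 and scale to unit standard deviation
z_scaled <- t(scale(t(toy_mean)))

max_scaled
z_scaled
```

After max scaling both toy markers peak at 1 in `P3`, so their six-fold difference in absolute intensity is no longer visible. This is one reason why it is worth comparing several scaling options before interpreting a heatmap.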
+As seen before, CD15 levels are elevated in Patient 4 (SCCHN), while SMA +levels are highest for Patient 4 (CRC). + +### Barplot visualization + +Complementary to displaying cell type frequencies per sample/patient, we +can use `dittoBarPlot` to display sample/patient frequencies per cell +type. + + +```r +dittoBarPlot(spe, + var = "patient_id", + group.by = "celltype") + + scale_fill_manual(values = metadata(spe)$color_vectors$patient_id) +``` + + + +```r +dittoBarPlot(spe, + var = "sample_id", + group.by = "celltype") + + scale_fill_manual(values = metadata(spe)$color_vectors$sample_id) +``` + + + +`Patient2` has the highest and lowest proportion of plasma cells and +neutrophils, respectively. + +### CATALYST-based visualization + +#### Pseudobulk-level MDS plot + +Expression-based pseudobulks for each sample can be compared with the +`pbMDS` function. + + +```r +# MDS pseudobulk by sample_id +pbMDS(spe_cat, + by = "sample_id", + color_by = "sample_id", + features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) + + scale_color_manual(values = metadata(spe_cat)$color_vectors$sample_id) +``` + + + +There are marked differences in pseudobulk-expression patterns between +samples and across patients, which can be driven by biological +differences and also technical aspects such as divergent region +selection. + +#### Reduced dimension plot on CLR of proportions + +The `clrDR` function can also be used to analyze similarity of samples +based on cell type proportions. + + +```r +# CLR on sample proportions across clusters +clrDR(spe_cat, + dr = "PCA", + by = "sample_id", + point_col = "sample_id", + k = "celltype", + point_pal = metadata(spe_cat)$color_vectors$sample_id) + + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) +``` + +``` +## Scale for colour is already present. +## Adding another scale for colour, which will replace the existing scale. 
+``` + + + +There are notable differences between samples based on their cell type +proportions. + +Interestingly, `Patient3_001`, `Patient1_003`, `Patient4_007` and +`Patient4_006` group together and the PC loadings indicate a strong +contribution of BnT and B cells, which could suggest the formation of +tertiary lymphoid structures (TLS). In Section \@ref(spatial-viz), we +will be able to confirm this hypothesis visually on the images. + +## Further examples {#rich-example} + +In the last section of this chapter, we will use the popular +[ComplexHeatmap](https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html) +package to create a visualization example that combines various +cell-type- and sample-level information. + +[ComplexHeatmap](https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html) +is highly versatile and was originally inspired by the +[pheatmap](https://cran.r-project.org/web/packages/pheatmap/index.html) +package. Therefore, many arguments have the same/similar names. + +For more details, we recommend reading the [reference +book](https://jokergoo.github.io/ComplexHeatmap-reference/book/). + +### Publication-ready ComplexHeatmap + +For this example, we will concatenate heatmaps and annotations +horizontally into one rich heatmap list. The grouping-level for the +visualization will again be the cell type information from Section +\@ref(classification). + +Initially, we will create two separate `Heatmap` objects for cell type +and state markers. + +Then, metadata information, including the cancer type proportion and +number of cells/patients per cell type, will be extracted into +`HeatmapAnnotation` objects. + +Notably, we will add spatial features per cell type, here the number of +neighbors extracted from `colPair(spe)` and cell area, in another +`HeatmapAnnotation` object. + +Ultimately, all objects are combined in a `HeatmapList` and visualized.
+ + +```r +library(ComplexHeatmap) +library(circlize) +library(tidyverse) +set.seed(22) + +### 1. Heatmap bodies ### + +# Heatmap body color +col_exprs <- colorRamp2(c(0,1,2,3,4), + c("#440154FF","#3B518BFF","#20938CFF", + "#6ACD5AFF","#FDE725FF")) + +# Create Heatmap objects +# By cell type markers +celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), + ids = spe$celltype, + statistics = "mean", + use.assay.type = "exprs", + subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) + +h_type <- Heatmap(t(assay(celltype_mean, "exprs")), + column_title = "type_markers", + col = col_exprs, + name= "mean exprs", + show_row_names = TRUE, + show_column_names = TRUE) + +# By cell state markers +cellstate_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), + ids = spe$celltype, + statistics = "mean", + use.assay.type = "exprs", + subset.row = rownames(spe)[rowData(spe)$marker_class == "state"]) + +h_state <- Heatmap(t(assay(cellstate_mean, "exprs")), + column_title = "state_markers", + col = col_exprs, + name= "mean exprs", + show_row_names = TRUE, + show_column_names = TRUE) + + +### 2. 
Heatmap annotation ### + +### 2.1 Metadata features + +anno <- colData(celltype_mean) %>% as.data.frame %>% select(celltype, ncells) + +# Proportion of indication per celltype +indication <- unclass(prop.table(table(spe$celltype, spe$indication), margin = 1)) + +# Number of contributing patients per celltype +cluster_PID <- colData(spe) %>% + as.data.frame() %>% + select(celltype, patient_id) %>% + group_by(celltype) %>% table() %>% + as.data.frame() + +n_PID <- cluster_PID %>% + filter(Freq>0) %>% + group_by(celltype) %>% + count(name = "n_PID") %>% + column_to_rownames("celltype") + +# Create HeatmapAnnotation objects +ha_anno <- HeatmapAnnotation(celltype = anno$celltype, + border = TRUE, + gap = unit(1,"mm"), + col = list(celltype = metadata(spe)$color_vectors$celltype), + which = "row") + +ha_meta <- HeatmapAnnotation(n_cells = anno_barplot(anno$ncells, width = unit(10, "mm")), + n_PID = anno_barplot(n_PID, width = unit(10, "mm")), + indication = anno_barplot(indication,width = unit(10, "mm"), + gp = gpar(fill = metadata(spe)$color_vectors$indication)), + border = TRUE, + annotation_name_rot = 90, + gap = unit(1,"mm"), + which = "row") + +### 2.2 Spatial features + +# Add number of neighbors to spe object (saved in colPair) +spe$n_neighbors <- countLnodeHits(colPair(spe, "neighborhood")) + +# Select spatial features and average over celltypes +spatial <- colData(spe) %>% + as.data.frame() %>% + select(area, celltype, n_neighbors) + +spatial <- spatial %>% + select(-celltype) %>% + aggregate(by = list(celltype = spatial$celltype), FUN = mean) %>% + column_to_rownames("celltype") + +# Create HeatmapAnnotation object +ha_spatial <- HeatmapAnnotation( + area = spatial$area, + n_neighbors = spatial$n_neighbors, + border = TRUE, + gap = unit(1,"mm"), + which = "row") + +### 3. 
Plot rich heatmap ### + +# Create HeatmapList object +h_list <- h_type + + h_state + + ha_anno + + ha_spatial + + ha_meta + +# Add customized legend for anno_barplot() +lgd <- Legend(title = "indication", + at = colnames(indication), + legend_gp = gpar(fill = metadata(spe)$color_vectors$indication)) + +# Plot +draw(h_list, annotation_legend_list = list(lgd)) +``` + + + +This plot summarizes most of the information we have seen in this +chapter previously. In addition, we can observe that tumor cells have +the largest mean cell area, a high number of neighbors and elevated Ki67 +expression. BnT cells have the highest number of neighbors on average, +which is biologically sound given their predominant location in highly +immune infiltrated regions (such as TLS). + +### Interactive visualization + +For interactive visualization of the single-cell data the +[iSEE](https://www.bioconductor.org/packages/release/bioc/html/iSEE.html) shiny +application can be used. For a comprehensive tutorial, please refer to the +[iSEE vignette](https://www.bioconductor.org/packages/release/bioc/vignettes/iSEE/inst/doc/basic.html). + + +```r +if (interactive()) { + library(iSEE) + + iSEE(spe) +} +``` + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] grid stats4 stats graphics grDevices utils datasets +## [8] methods base +## +## other attached packages: +## [1] lubridate_1.9.3 forcats_1.0.0 +## [3] stringr_1.5.0 dplyr_1.1.3 +## [5] purrr_1.0.2 readr_2.1.4 +## [7] tidyr_1.3.0 tibble_3.2.1 +## [9] tidyverse_2.0.0 circlize_0.4.15 +## [11] ComplexHeatmap_2.16.0 CATALYST_1.24.0 +## [13] viridis_0.6.4 viridisLite_0.4.2 +## [15] cowplot_1.1.1 patchwork_1.1.3 +## [17] scater_1.28.0 scuttle_1.10.2 +## [19] dittoSeq_1.12.1 ggplot2_3.4.3 +## [21] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 +## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0 +## [25] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 +## [27] IRanges_2.34.1 S4Vectors_0.38.2 +## [29] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 +## [31] matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] splines_4.3.1 bitops_1.0-7 +## [3] R.oo_1.25.0 polyclip_1.10-6 +## [5] XML_3.99-0.14 lifecycle_1.0.3 +## [7] rstatix_0.7.2 edgeR_3.42.4 +## [9] doParallel_1.0.17 lattice_0.21-8 +## [11] MASS_7.3-60 backports_1.4.1 +## [13] magrittr_2.0.3 limma_3.56.2 +## [15] sass_0.4.7 rmarkdown_2.25 +## [17] jquerylib_0.1.4 yaml_2.3.7 +## [19] plotrix_3.8-2 RColorBrewer_1.1-3 +## [21] ConsensusClusterPlus_1.64.0 multcomp_1.4-25 +## [23] 
abind_1.4-5 zlibbioc_1.46.0 +## [25] Rtsne_0.16 R.utils_2.12.2 +## [27] RCurl_1.98-1.12 TH.data_1.1-2 +## [29] tweenr_2.0.2 sandwich_3.0-2 +## [31] GenomeInfoDbData_1.2.10 ggrepel_0.9.3 +## [33] irlba_2.3.5.1 pheatmap_1.0.12 +## [35] dqrng_0.3.1 DelayedMatrixStats_1.22.6 +## [37] codetools_0.2-19 DropletUtils_1.20.0 +## [39] DelayedArray_0.26.7 ggforce_0.4.1 +## [41] tidyselect_1.2.0 shape_1.4.6 +## [43] farver_2.1.1 ScaledMatrix_1.8.1 +## [45] jsonlite_1.8.7 GetoptLong_1.0.5 +## [47] BiocNeighbors_1.18.0 ggridges_0.5.4 +## [49] survival_3.5-5 iterators_1.0.14 +## [51] foreach_1.5.2 tools_4.3.1 +## [53] ggnewscale_0.4.9 Rcpp_1.0.11 +## [55] glue_1.6.2 gridExtra_2.3 +## [57] xfun_0.40 HDF5Array_1.28.1 +## [59] withr_2.5.1 fastmap_1.1.1 +## [61] rhdf5filters_1.12.1 fansi_1.0.4 +## [63] digest_0.6.33 rsvd_1.0.5 +## [65] timechange_0.2.0 R6_2.5.1 +## [67] colorspace_2.1-0 Cairo_1.6-1 +## [69] gtools_3.9.4 R.methodsS3_1.8.2 +## [71] utf8_1.2.3 generics_0.1.3 +## [73] data.table_1.14.8 S4Arrays_1.0.6 +## [75] pkgconfig_2.0.3 gtable_0.3.4 +## [77] RProtoBufLib_2.12.1 XVector_0.40.0 +## [79] htmltools_0.5.6 carData_3.0-5 +## [81] bookdown_0.35 clue_0.3-65 +## [83] scales_1.2.1 png_0.1-8 +## [85] colorRamps_2.3.1 knitr_1.44 +## [87] rstudioapi_0.15.0 tzdb_0.4.0 +## [89] reshape2_1.4.4 rjson_0.2.21 +## [91] cachem_1.0.8 zoo_1.8-12 +## [93] rhdf5_2.44.0 GlobalOptions_0.1.2 +## [95] parallel_4.3.1 vipor_0.4.5 +## [97] pillar_1.9.0 vctrs_0.6.3 +## [99] ggpubr_0.6.0 car_3.1-2 +## [101] BiocSingular_1.16.0 cytolib_2.12.1 +## [103] beachmat_2.16.0 cluster_2.1.4 +## [105] beeswarm_0.4.0 evaluate_0.21 +## [107] magick_2.8.0 mvtnorm_1.2-3 +## [109] cli_3.6.1 locfit_1.5-9.8 +## [111] compiler_4.3.1 rlang_1.1.1 +## [113] crayon_1.5.2 ggsignif_0.6.4 +## [115] labeling_0.4.3 FlowSOM_2.8.0 +## [117] flowCore_2.12.2 plyr_1.8.8 +## [119] ggbeeswarm_0.7.2 stringi_1.7.12 +## [121] BiocParallel_1.34.2 nnls_1.5 +## [123] munsell_0.5.0 Matrix_1.6-1.1 +## [125] hms_1.1.3 sparseMatrixStats_1.12.2 
+## [127] Rhdf5lib_1.22.1 drc_3.0-1 +## [129] igraph_1.5.1 broom_1.0.5 +## [131] bslib_0.5.1 +``` +
diff --git a/09-singlecell_visualization_files/figure-html/barplot celltype-1.png b/09-singlecell_visualization_files/figure-html/barplot celltype-1.png new file mode 100644 index 00000000..b1b5400d Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/barplot celltype-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/barplot celltype-2.png b/09-singlecell_visualization_files/figure-html/barplot celltype-2.png new file mode 100644 index 00000000..bf215e41 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/barplot celltype-2.png differ diff --git a/09-singlecell_visualization_files/figure-html/barplot celltype-3.png b/09-singlecell_visualization_files/figure-html/barplot celltype-3.png new file mode 100644 index 00000000..e1d0227c Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/barplot celltype-3.png differ diff --git a/09-singlecell_visualization_files/figure-html/barplot sample-1.png b/09-singlecell_visualization_files/figure-html/barplot sample-1.png new file mode 100644 index 00000000..8ec4633f Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/barplot sample-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/barplot sample-2.png b/09-singlecell_visualization_files/figure-html/barplot sample-2.png new file mode 100644 index 00000000..aa48d822 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/barplot sample-2.png differ diff --git a/09-singlecell_visualization_files/figure-html/cell type umap 2-1.png b/09-singlecell_visualization_files/figure-html/cell type umap 2-1.png new file mode 100644 index 00000000..4a924b10 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/cell type umap 2-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/cell type umap-1.png b/09-singlecell_visualization_files/figure-html/cell type umap-1.png new file mode 100644 index 
00000000..ba642bb8 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/cell type umap-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype - clrDR-1.png b/09-singlecell_visualization_files/figure-html/celltype - clrDR-1.png new file mode 100644 index 00000000..8bdbb48f Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype - clrDR-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype heatmap-1.png b/09-singlecell_visualization_files/figure-html/celltype heatmap-1.png new file mode 100644 index 00000000..84a66dc7 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype heatmap-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-1.png b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-1.png new file mode 100644 index 00000000..8cacf40e Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-2.png b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-2.png new file mode 100644 index 00000000..f26f73a0 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-2.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-3.png b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-3.png new file mode 100644 index 00000000..c3a37837 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype mean-expression-per-cluster-3.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype pbExprs-1.png b/09-singlecell_visualization_files/figure-html/celltype pbExprs-1.png new file mode 100644 
index 00000000..00be99b4 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype pbExprs-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype pbmds-1.png b/09-singlecell_visualization_files/figure-html/celltype pbmds-1.png new file mode 100644 index 00000000..ee006c1e Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype pbmds-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype pbmds-2.png b/09-singlecell_visualization_files/figure-html/celltype pbmds-2.png new file mode 100644 index 00000000..d21d1df8 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype pbmds-2.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype scatter-1.png b/09-singlecell_visualization_files/figure-html/celltype scatter-1.png new file mode 100644 index 00000000..7b971a85 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype scatter-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/celltype violin-1.png b/09-singlecell_visualization_files/figure-html/celltype violin-1.png new file mode 100644 index 00000000..6210dd22 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/celltype violin-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/complex-heatmap-1.png b/09-singlecell_visualization_files/figure-html/complex-heatmap-1.png new file mode 100644 index 00000000..1396ebc9 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/complex-heatmap-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample heatmap-1.png b/09-singlecell_visualization_files/figure-html/sample heatmap-1.png new file mode 100644 index 00000000..d82f351d Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample heatmap-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample 
mean-expression-per-cluster-1.png b/09-singlecell_visualization_files/figure-html/sample mean-expression-per-cluster-1.png new file mode 100644 index 00000000..4df124e4 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample mean-expression-per-cluster-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample mean-expression-per-cluster-2.png b/09-singlecell_visualization_files/figure-html/sample mean-expression-per-cluster-2.png new file mode 100644 index 00000000..0b72d61d Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample mean-expression-per-cluster-2.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample umap-1.png b/09-singlecell_visualization_files/figure-html/sample umap-1.png new file mode 100644 index 00000000..dda31970 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample umap-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample-clrDR-1.png b/09-singlecell_visualization_files/figure-html/sample-clrDR-1.png new file mode 100644 index 00000000..63a5d988 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample-clrDR-1.png differ diff --git a/09-singlecell_visualization_files/figure-html/sample-pbmds-1.png b/09-singlecell_visualization_files/figure-html/sample-pbmds-1.png new file mode 100644 index 00000000..c0800d78 Binary files /dev/null and b/09-singlecell_visualization_files/figure-html/sample-pbmds-1.png differ diff --git a/10-image_visualization.md b/10-image_visualization.md new file mode 100644 index 00000000..d24a09a0 --- /dev/null +++ b/10-image_visualization.md @@ -0,0 +1,684 @@ +# Image visualization {#image-visualization} + +The following section describes how to visualize the abundance of biomolecules +(e.g., protein or RNA) as well as cell-specific metadata on images. 
Section +\@ref(pixel-visualization) focuses on visualizing pixel-level information +including the generation of pseudo-color composite images. Section +\@ref(mask-visualization) highlights the visualization of cell metadata (e.g., +cell phenotype) as well as summarized pixel intensities on cell segmentation +masks. + +The +[cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html) +R/Bioconductor package was developed to support the handling and visualization +of multiple multi-channel images and segmentation masks [@Eling2020]. The main +data object for image handling is the +[CytoImageList](https://www.bioconductor.org/packages/release/bioc/vignettes/cytomapper/inst/doc/cytomapper.html#5_The_CytoImageList_object) +container which we used in Section \@ref(read-data) to store multi-channel +images and segmentation masks. + +We will first read in the previously processed data and randomly select 3 images +for visualization purposes. + + +```r +library(SpatialExperiment) +library(cytomapper) +spe <- readRDS("data/spe.rds") +images <- readRDS("data/images.rds") +masks <- readRDS("data/masks.rds") + +# Sample images +set.seed(220517) +cur_id <- sample(unique(spe$sample_id), 3) +cur_images <- images[names(images) %in% cur_id] +cur_masks <- masks[names(masks) %in% cur_id] +``` + +## Pixel visualization {#pixel-visualization} + +The following section gives examples for visualizing individual channels or +multiple channels as pseudo-color composite images. For this the `cytomapper` +package exports the `plotPixels` function which expects a `CytoImageList` object +storing one or multiple multi-channel images. In the simplest use case, a +single channel can be visualized as follows: + + +```r +plotPixels(cur_images, + colour_by = "Ecad", + bcg = list(Ecad = c(0, 5, 1))) +``` + + + +The plot above shows the tissue expression of the epithelial tumor marker +E-cadherin on the 3 selected images. 
The `bcg` parameter (default `c(0, 1, 1)`) +stands for "background", "contrast", "gamma" and controls these attributes of +the image. This parameter takes a named list where each entry specifies these +attributes per channel. The first value of the numeric vector will be added to +the pixel intensities (background); pixel intensities will be multiplied by the +second entry of the vector (contrast); pixel intensities will be exponentiated +by the third entry of the vector (gamma). In most cases, it is sufficient to +adjust the second (contrast) entry of the vector. + +The following example highlights the visualization of 6 markers (maximum allowed +number of markers) at once per image. The markers indicate the spatial +distribution of tumor cells (E-cadherin), T cells (CD3), B cells (CD20), CD8+ T +cells (CD8a), plasma cells (CD38) and proliferating cells (Ki67). + + +```r +plotPixels(cur_images, + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), + bcg = list(Ecad = c(0, 5, 1), + CD3 = c(0, 5, 1), + CD20 = c(0, 5, 1), + CD8a = c(0, 5, 1), + CD38 = c(0, 8, 1), + Ki67 = c(0, 5, 1))) +``` + + + +### Adjusting colors + +The default colors for visualization are chosen by the additive RGB (red, green, +blue) color model. For six markers the default colors are: red, green, blue, +cyan (green + blue), magenta (red + blue), yellow (green + red). These colors +are the easiest to distinguish by eye. However, you can select other colors for +each channel by setting the `colour` parameter: + + +```r +plotPixels(cur_images, + colour_by = c("Ecad", "CD3", "CD20"), + bcg = list(Ecad = c(0, 5, 1), + CD3 = c(0, 5, 1), + CD20 = c(0, 5, 1)), + colour = list(Ecad = c("black", "burlywood1"), + CD3 = c("black", "cyan2"), + CD20 = c("black", "firebrick1"))) +``` + + + +The `colour` parameter takes a named list in which each entry specifies the +colors from which a color gradient is constructed via `colorRampPalette`. 
These +are usually vectors of length 2 in which the first entry is `"black"` and the +second entry specifies the color of choice. Although not recommended, you can +also specify more than two colors to generate a more complex color gradient. + +### Image normalization + +As an alternative to setting the `bcg` parameter, images can first be +normalized. Normalization here means to scale the pixel intensities per channel +between 0 and 1 (or a range specified by the `ft` parameter in the `normalize` +function). By default, the `normalize` function scales pixel intensities across +**all** images contained in the `CytoImageList` object (`separateImages = FALSE`). +Each individual channel is scaled independently (`separateChannels = TRUE`). + +After 0-1 normalization, maximum pixel intensities can be clipped to enhance the +contrast of the image (setting the `inputRange` parameter). In the following +example, the clipping to 0 and 0.2 is the same as multiplying the pixel +intensities by a factor of 5. + + +```r +# 0 - 1 channel scaling across all images +norm_images <- cytomapper::normalize(cur_images) + +# Clip channel at 0.2 +norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2)) + +plotPixels(norm_images, + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) +``` + + + +The default setting of scaling pixel intensities across all images ensures +comparable intensity levels across images. 
Pixel intensities can also be +scaled **per image**, thereby correcting for staining/expression differences +between images: + + +```r +# 0 - 1 channel scaling per image +norm_images <- cytomapper::normalize(cur_images, separateImages = TRUE) + +# Clip channel at 0.2 +norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2)) + +plotPixels(norm_images, + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) +``` + + + +As we can see, the marker Ki67 appears brighter on images 2 and 3 in comparison +to scaling the channel across all images. + +Finally, the `normalize` function also accepts a named list input for the +`inputRange` argument. In this list, the clipping range per channel can be set +individually: + + +```r +# 0 - 1 channel scaling per image +norm_images <- cytomapper::normalize(cur_images, + separateImages = TRUE, + inputRange = list(Ecad = c(0, 50), + CD3 = c(0, 30), + CD20 = c(0, 40), + CD8a = c(0, 50), + CD38 = c(0, 10), + Ki67 = c(0, 70))) + +plotPixels(norm_images, + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) +``` + + + +## Cell visualization {#mask-visualization} + +In the following section, we will show examples of how to visualize single +cells either as segmentation masks or outlined on composite images. This type +of visualization supports observing the spatial distribution of cell phenotypes, +visually assessing morphological features, and quality control of cell +segmentation and phenotyping. + +### Visualizing metadata + +The `cytomapper` package provides the `plotCells` function that accepts a +`CytoImageList` object containing segmentation masks. These are defined as +single channel images where sets of pixels with the same integer ID identify +individual cells. This integer ID can be found as an entry in the `colData(spe)` +slot and as pixel information in the segmentation masks.
The entry in +`colData(spe)` needs to be specified via the `cell_id` argument to the +`plotCells` function. In that way, data contained in the `SpatialExperiment` +object can be mapped to segmentation masks. For the current dataset, the cell +IDs are stored in `colData(spe)$ObjectNumber`. + +As cell IDs are only unique within a single image, `plotCells` also requires +the `img_id` argument. This argument specifies the `colData(spe)` as well as the +`mcols(masks)` entry that stores the unique image name from which each cell was +extracted. In the current dataset the unique image names are stored in +`colData(spe)$sample_id` and `mcols(masks)$sample_id`. + +Providing these two entries that allow mapping between the `SpatialExperiment` +object and segmentation masks, we can now color individual cells based on their +cell type: + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype") +``` + + + +For consistent visualization, the `plotCells` function takes a named list as +the `colour` argument. The entry name must match the value passed to `colour_by`. + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype", + colour = list(celltype = metadata(spe)$color_vectors$celltype)) +``` + + + +If only individual cell types should be visualized, the `SpatialExperiment` +object can be subsetted (e.g., to only contain CD8+ T cells). In the following +example CD8+ T cells are colored in red and all other cells, which are not +contained in the subsetted object, are colored in white (as set by the +`missing_colour` argument). + + +```r +CD8 <- spe[,spe$celltype == "CD8"] + +plotCells(cur_masks, + object = CD8, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype", + colour = list(celltype = c(CD8 = "red")), + missing_colour = "white") +``` + + + +In terms of visualizing metadata, any entry in the `colData(spe)` slot can be +visualized.
The `plotCells` function automatically detects if the entry +is continuous or discrete. In this fashion, we can now visualize the area of each +cell: + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "area") +``` + + + +### Visualizing expression + +Similar to visualizing single-cell metadata on segmentation masks, we can +use the `plotCells` function to visualize the aggregated pixel intensities +per cell. In the current dataset pixel intensities were aggregated by computing +the mean pixel intensity per cell and per channel. The `plotCells` function +accepts the `exprs_values` argument (default `counts`) that allows selecting +the assay which stores the expression values that should be visualized. + +In the following example, we visualize the asinh-transformed mean pixel +intensities of the epithelial marker E-cadherin on segmentation masks. + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "Ecad", + exprs_values = "exprs") +``` + + + +We will now visualize the maximum number of +allowed markers as composites on the segmentation masks. As above the markers +indicate the spatial distribution of tumor cells (E-cadherin), T cells (CD3), B +cells (CD20), CD8+ T cells (CD8a), plasma cells (CD38) and proliferating cells +(Ki67). + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), + exprs_values = "exprs") +``` + + + +While visualizing 6 markers on the pixel-level may still allow the distinction +of different tissue structures, observing single-cell expression levels is +difficult when visualizing many markers simultaneously due to often overlapping +expression.
+ +Similar to adjusting marker colors when visualizing pixel intensities, we +can change the color gradients per marker by setting the `colour` argument: + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = c("Ecad", "CD3", "CD20"), + exprs_values = "exprs", + colour = list(Ecad = c("black", "burlywood1"), + CD3 = c("black", "cyan2"), + CD20 = c("black", "firebrick1"))) +``` + + + +### Outlining cells on images {#outline-cells} + +The following section highlights the combined visualization of pixel- and +cell-level information at once. For this, besides the `SpatialExperiment` object, +the `plotPixels` function accepts two `CytoImageList` objects: one for the +multi-channel images and one for the segmentation masks. By specifying the +`outline_by` parameter, the outlines of cells can now be colored based on their +metadata. + +The following example first generates 3-channel composite images displaying +the expression of E-cadherin, CD3 and CD20 before coloring the cells' outlines +by their cell phenotype. + + +```r +plotPixels(image = cur_images, + mask = cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = c("Ecad", "CD3", "CD20"), + outline_by = "celltype", + bcg = list(Ecad = c(0, 5, 1), + CD3 = c(0, 5, 1), + CD20 = c(0, 5, 1)), + colour = list(celltype = metadata(spe)$color_vectors$celltype), + thick = TRUE) +``` + + + +Distinguishing individual cell phenotypes is nearly impossible in the images +above. + +However, the `SpatialExperiment` object can be subsetted to only contain cells +of a single or a few phenotypes. This allows the selective visualization of cell +outlines on composite images. + +Here, we select all CD8+ T cells from the dataset and outline them on a 2-channel +composite image displaying the expression of CD3 and CD8a.
+ + +```r +CD8 <- spe[,spe$celltype == "CD8"] + +plotPixels(image = cur_images, + mask = cur_masks, + object = CD8, + cell_id = "ObjectNumber", img_id = "sample_id", + colour_by = c("CD3", "CD8a"), + outline_by = "celltype", + bcg = list(CD3 = c(0, 5, 1), + CD8a = c(0, 5, 1)), + colour = list(celltype = c("CD8" = "white")), + thick = TRUE) +``` + + + +This type of visualization supports two quality-control checks: (1) the +segmentation quality of individual cell types and (2) the accuracy of cell +phenotyping, assessed visually against expected marker expression. + +## Adjusting plot annotations + +The `cytomapper` package provides a number of function arguments to adjust the +visual appearance of figures that are shared between the `plotPixels` and +`plotCells` functions. + +For a full overview of the arguments please refer to `?plotting-param`. + +We use the following example to highlight how to adjust the scale bar, the image +title, the legend appearance and the margin between images. + + +```r +plotPixels(cur_images, + colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), + bcg = list(Ecad = c(0, 5, 1), + CD3 = c(0, 5, 1), + CD20 = c(0, 5, 1), + CD8a = c(0, 5, 1), + CD38 = c(0, 8, 1), + Ki67 = c(0, 5, 1)), + scale_bar = list(length = 100, + label = expression("100 " ~ mu * "m"), + cex = 0.7, + lwidth = 10, + colour = "grey", + position = "bottomleft", + margin = c(5,5), + frame = 3), + image_title = list(text = mcols(cur_images)$indication, + position = "topright", + colour = "grey", + margin = c(5,5), + font = 2, + cex = 2), + legend = list(colour_by.title.cex = 0.7, + margin = 10), + margin = 40) +``` + + + +## Displaying individual images + +By default, all images are displayed on the same graphics device. This can be +useful when saving all images at once (see next section) to zoom into the +individual images instead of opening each image individually.
However, when +displaying images in a markdown document these are more accessible when +visualized individually. For this, the `plotPixels` and `plotCells` functions +accept the `display` parameter, which, when set to `"single"`, displays each +resulting image in its own graphics device: + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype", + colour = list(celltype = metadata(spe)$color_vectors$celltype), + display = "single", + legend = NULL) +``` + + + +## Saving and returning images + +The final section addresses how to save composite images and how to return them +for integration with other plots. + +The `plotPixels` and `plotCells` functions accept the `save_plot` argument which +takes a named list of the following entries: `filename` indicates the location +and file type of the image saved to disk; `scale` adjusts the resolution of the +saved image (this only needs to be adjusted for small images). + + +```r +plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype", + colour = list(celltype = metadata(spe)$color_vectors$celltype), + save_plot = list(filename = "data/celltype_image.png")) +``` + +The composite images (together with their annotation) can also be returned. In +the following code chunk we save two example plots to variables (`out1` and +`out2`).
+ + +```r +out1 <- plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = "celltype", + colour = list(celltype = metadata(spe)$color_vectors$celltype), + return_plot = TRUE) +``` + +```r +out2 <- plotCells(cur_masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id", + colour_by = c("Ecad", "CD3", "CD20"), + exprs_values = "exprs", + return_plot = TRUE) +``` + +The composite images are stored in `out1$plot` and `out2$plot` and can be +converted into a graph object recognized by the +[cowplot](https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html) +package. + +The final function call of the following chunk plots both objects next to each +other. + + +```r +library(cowplot) +library(gridGraphics) +p1 <- ggdraw(out1$plot, clip = "on") +p2 <- ggdraw(out2$plot, clip = "on") + +plot_grid(p1, p2) +``` + + + +## Interactive image visualization + +The +[cytoviewer](https://bioconductor.org/packages/release/bioc/html/cytoviewer.html) +package allows the interactive visualization of multi-channel images and +segmentation masks. It also allows mapping cellular metadata onto segmentation +masks and outlining cells on composite images. For a full introduction to the +package, please refer to [the vignette](https://bioconductor.org/packages/release/bioc/vignettes/cytoviewer/inst/doc/cytoviewer.html). + + +```r +library(cytoviewer) + +app <- cytoviewer(image = images, + mask = masks, + object = spe, + cell_id = "ObjectNumber", + img_id = "sample_id") + +if (interactive()) { + shiny::runApp(app, launch.browser = TRUE) +} +``` + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] grid stats4 stats graphics grDevices utils datasets +## [8] methods base +## +## other attached packages: +## [1] cytoviewer_1.0.1 gridGraphics_0.5-1 +## [3] cowplot_1.1.1 cytomapper_1.12.0 +## [5] EBImage_4.42.0 SpatialExperiment_1.10.0 +## [7] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 +## [9] Biobase_2.60.0 GenomicRanges_1.52.0 +## [11] GenomeInfoDb_1.36.3 IRanges_2.34.1 +## [13] S4Vectors_0.38.2 BiocGenerics_0.46.0 +## [15] MatrixGenerics_1.12.3 matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] splines_4.3.1 later_1.3.1 +## [3] bitops_1.0-7 tibble_3.2.1 +## [5] R.oo_1.25.0 svgPanZoom_0.3.4 +## [7] polyclip_1.10-6 XML_3.99-0.14 +## [9] lifecycle_1.0.3 rstatix_0.7.2 +## [11] edgeR_3.42.4 doParallel_1.0.17 +## [13] lattice_0.21-8 MASS_7.3-60 +## [15] backports_1.4.1 magrittr_2.0.3 +## [17] limma_3.56.2 sass_0.4.7 +## [19] rmarkdown_2.25 plotrix_3.8-2 +## [21] jquerylib_0.1.4 yaml_2.3.7 +## [23] httpuv_1.6.11 sp_2.0-0 +## [25] RColorBrewer_1.1-3 ConsensusClusterPlus_1.64.0 +## [27] multcomp_1.4-25 abind_1.4-5 +## [29] zlibbioc_1.46.0 Rtsne_0.16 +## [31] purrr_1.0.2 R.utils_2.12.2 +## [33] RCurl_1.98-1.12 TH.data_1.1-2 +## [35] tweenr_2.0.2 sandwich_3.0-2 +## [37] circlize_0.4.15 GenomeInfoDbData_1.2.10 
+## [39] ggrepel_0.9.3 irlba_2.3.5.1 +## [41] CATALYST_1.24.0 terra_1.7-46 +## [43] dqrng_0.3.1 svglite_2.1.1 +## [45] DelayedMatrixStats_1.22.6 codetools_0.2-19 +## [47] DropletUtils_1.20.0 DelayedArray_0.26.7 +## [49] scuttle_1.10.2 ggforce_0.4.1 +## [51] tidyselect_1.2.0 shape_1.4.6 +## [53] raster_3.6-23 farver_2.1.1 +## [55] ScaledMatrix_1.8.1 viridis_0.6.4 +## [57] jsonlite_1.8.7 BiocNeighbors_1.18.0 +## [59] GetoptLong_1.0.5 ellipsis_0.3.2 +## [61] scater_1.28.0 ggridges_0.5.4 +## [63] survival_3.5-5 iterators_1.0.14 +## [65] systemfonts_1.0.4 foreach_1.5.2 +## [67] tools_4.3.1 ggnewscale_0.4.9 +## [69] Rcpp_1.0.11 glue_1.6.2 +## [71] gridExtra_2.3 xfun_0.40 +## [73] dplyr_1.1.3 HDF5Array_1.28.1 +## [75] shinydashboard_0.7.2 withr_2.5.1 +## [77] fastmap_1.1.1 rhdf5filters_1.12.1 +## [79] fansi_1.0.4 rsvd_1.0.5 +## [81] digest_0.6.33 R6_2.5.1 +## [83] mime_0.12 colorspace_2.1-0 +## [85] gtools_3.9.4 jpeg_0.1-10 +## [87] R.methodsS3_1.8.2 utf8_1.2.3 +## [89] tidyr_1.3.0 generics_0.1.3 +## [91] data.table_1.14.8 htmlwidgets_1.6.2 +## [93] S4Arrays_1.0.6 pkgconfig_2.0.3 +## [95] gtable_0.3.4 ComplexHeatmap_2.16.0 +## [97] RProtoBufLib_2.12.1 XVector_0.40.0 +## [99] htmltools_0.5.6 carData_3.0-5 +## [101] bookdown_0.35 fftwtools_0.9-11 +## [103] clue_0.3-65 scales_1.2.1 +## [105] png_0.1-8 colorRamps_2.3.1 +## [107] knitr_1.44 rstudioapi_0.15.0 +## [109] reshape2_1.4.4 rjson_0.2.21 +## [111] cachem_1.0.8 zoo_1.8-12 +## [113] rhdf5_2.44.0 GlobalOptions_0.1.2 +## [115] stringr_1.5.0 shinycssloaders_1.0.0 +## [117] miniUI_0.1.1.1 parallel_4.3.1 +## [119] vipor_0.4.5 pillar_1.9.0 +## [121] vctrs_0.6.3 promises_1.2.1 +## [123] ggpubr_0.6.0 BiocSingular_1.16.0 +## [125] car_3.1-2 cytolib_2.12.1 +## [127] beachmat_2.16.0 xtable_1.8-4 +## [129] cluster_2.1.4 archive_1.1.6 +## [131] beeswarm_0.4.0 evaluate_0.21 +## [133] magick_2.8.0 mvtnorm_1.2-3 +## [135] cli_3.6.1 locfit_1.5-9.8 +## [137] compiler_4.3.1 rlang_1.1.1 +## [139] crayon_1.5.2 ggsignif_0.6.4 +## [141] 
FlowSOM_2.8.0 plyr_1.8.8 +## [143] flowCore_2.12.2 ggbeeswarm_0.7.2 +## [145] stringi_1.7.12 viridisLite_0.4.2 +## [147] BiocParallel_1.34.2 nnls_1.5 +## [149] munsell_0.5.0 tiff_0.1-11 +## [151] colourpicker_1.3.0 Matrix_1.6-1.1 +## [153] sparseMatrixStats_1.12.2 ggplot2_3.4.3 +## [155] Rhdf5lib_1.22.1 shiny_1.7.5 +## [157] fontawesome_0.5.2 drc_3.0-1 +## [159] memoise_2.0.1 igraph_1.5.1 +## [161] broom_1.0.5 bslib_0.5.1 +``` +
diff --git a/10-image_visualization_files/figure-html/6-channel-1.png b/10-image_visualization_files/figure-html/6-channel-1.png new file mode 100644 index 00000000..944425e6 Binary files /dev/null and b/10-image_visualization_files/figure-html/6-channel-1.png differ diff --git a/10-image_visualization_files/figure-html/6-channel-expression-1.png b/10-image_visualization_files/figure-html/6-channel-expression-1.png new file mode 100644 index 00000000..1826b9e0 Binary files /dev/null and b/10-image_visualization_files/figure-html/6-channel-expression-1.png differ diff --git a/10-image_visualization_files/figure-html/Ecad-expression-1.png b/10-image_visualization_files/figure-html/Ecad-expression-1.png new file mode 100644 index 00000000..6fa300e3 Binary files /dev/null and b/10-image_visualization_files/figure-html/Ecad-expression-1.png differ diff --git a/10-image_visualization_files/figure-html/adjusting-parameters-1.png b/10-image_visualization_files/figure-html/adjusting-parameters-1.png new file mode 100644 index 00000000..02d71735 Binary files /dev/null and b/10-image_visualization_files/figure-html/adjusting-parameters-1.png differ diff --git a/10-image_visualization_files/figure-html/area-1.png b/10-image_visualization_files/figure-html/area-1.png new file mode 100644 index 00000000..2863444d Binary files /dev/null and b/10-image_visualization_files/figure-html/area-1.png differ diff --git a/10-image_visualization_files/figure-html/celltype-1.png b/10-image_visualization_files/figure-html/celltype-1.png new file mode 100644 index 00000000..e705046a Binary files /dev/null and b/10-image_visualization_files/figure-html/celltype-1.png differ diff --git a/10-image_visualization_files/figure-html/default-normalization-1.png b/10-image_visualization_files/figure-html/default-normalization-1.png new file mode 100644 index 00000000..9ed9fda3 Binary files /dev/null and b/10-image_visualization_files/figure-html/default-normalization-1.png differ diff --git 
a/10-image_visualization_files/figure-html/individual-images-1.png b/10-image_visualization_files/figure-html/individual-images-1.png new file mode 100644 index 00000000..a93de02e Binary files /dev/null and b/10-image_visualization_files/figure-html/individual-images-1.png differ diff --git a/10-image_visualization_files/figure-html/individual-images-2.png b/10-image_visualization_files/figure-html/individual-images-2.png new file mode 100644 index 00000000..2d43ab03 Binary files /dev/null and b/10-image_visualization_files/figure-html/individual-images-2.png differ diff --git a/10-image_visualization_files/figure-html/individual-images-3.png b/10-image_visualization_files/figure-html/individual-images-3.png new file mode 100644 index 00000000..326bd3eb Binary files /dev/null and b/10-image_visualization_files/figure-html/individual-images-3.png differ diff --git a/10-image_visualization_files/figure-html/individual-normalization-1.png b/10-image_visualization_files/figure-html/individual-normalization-1.png new file mode 100644 index 00000000..28977016 Binary files /dev/null and b/10-image_visualization_files/figure-html/individual-normalization-1.png differ diff --git a/10-image_visualization_files/figure-html/outlining-CD8-1.png b/10-image_visualization_files/figure-html/outlining-CD8-1.png new file mode 100644 index 00000000..92b32d9c Binary files /dev/null and b/10-image_visualization_files/figure-html/outlining-CD8-1.png differ diff --git a/10-image_visualization_files/figure-html/outlining-all-cells-1.png b/10-image_visualization_files/figure-html/outlining-all-cells-1.png new file mode 100644 index 00000000..fb3d8192 Binary files /dev/null and b/10-image_visualization_files/figure-html/outlining-all-cells-1.png differ diff --git a/10-image_visualization_files/figure-html/selective-visualization-1.png b/10-image_visualization_files/figure-html/selective-visualization-1.png new file mode 100644 index 00000000..fed2cf44 Binary files /dev/null and 
b/10-image_visualization_files/figure-html/selective-visualization-1.png differ diff --git a/10-image_visualization_files/figure-html/setting-celltype-colors-1.png b/10-image_visualization_files/figure-html/setting-celltype-colors-1.png new file mode 100644 index 00000000..7f833518 Binary files /dev/null and b/10-image_visualization_files/figure-html/setting-celltype-colors-1.png differ diff --git a/10-image_visualization_files/figure-html/setting-colors-1.png b/10-image_visualization_files/figure-html/setting-colors-1.png new file mode 100644 index 00000000..88bafe9b Binary files /dev/null and b/10-image_visualization_files/figure-html/setting-colors-1.png differ diff --git a/10-image_visualization_files/figure-html/setting-expression-colors-1.png b/10-image_visualization_files/figure-html/setting-expression-colors-1.png new file mode 100644 index 00000000..053690c0 Binary files /dev/null and b/10-image_visualization_files/figure-html/setting-expression-colors-1.png differ diff --git a/10-image_visualization_files/figure-html/setting-inputRange-1.png b/10-image_visualization_files/figure-html/setting-inputRange-1.png new file mode 100644 index 00000000..1368b9aa Binary files /dev/null and b/10-image_visualization_files/figure-html/setting-inputRange-1.png differ diff --git a/10-image_visualization_files/figure-html/side-by-side-plot-1.png b/10-image_visualization_files/figure-html/side-by-side-plot-1.png new file mode 100644 index 00000000..69ffe0ef Binary files /dev/null and b/10-image_visualization_files/figure-html/side-by-side-plot-1.png differ diff --git a/10-image_visualization_files/figure-html/single-channel-1.png b/10-image_visualization_files/figure-html/single-channel-1.png new file mode 100644 index 00000000..679d2c1e Binary files /dev/null and b/10-image_visualization_files/figure-html/single-channel-1.png differ diff --git a/11-spatial_analysis.md b/11-spatial_analysis.md new file mode 100644 index 00000000..cd40f39f --- /dev/null +++ 
b/11-spatial_analysis.md @@ -0,0 +1,1131 @@ +# Performing spatial analysis + +Highly multiplexed imaging technologies measure the spatial distributions of +molecule abundances across tissue sections. As such, having the option to +analyze single cells in their spatial tissue context is a key strength of these +technologies. + +A number of software packages such as +[squidpy](https://squidpy.readthedocs.io/en/stable/), +[giotto](https://giottosuite.readthedocs.io/en/master/) and +[Seurat](https://satijalab.org/seurat/articles/spatial_vignette_2.html) have +been developed to analyse and visualize cells in their spatial context. The +following chapter will highlight the use of +[imcRtools](https://bioconductor.org/packages/release/bioc/html/imcRtools.html) +and other Bioconductor packages to visualize and analyse single-cell data +obtained from highly multiplexed imaging technologies. + +We will first read in the spatially-annotated single-cell data processed in the +previous sections. + + +```r +library(SpatialExperiment) +spe <- readRDS("data/spe.rds") +``` + +## Spatial interaction graphs + +Many spatial analysis approaches either compare the observed versus expected +number of cells around a given cell type (point process) or utilize interaction +graphs (spatial object graphs) to estimate clustering or interaction frequencies +between cell types. + +The [steinbock](https://bodenmillergroup.github.io/steinbock/latest/cli/measurement/) +framework allows the construction of these spatial graphs. During image +processing (see Section \@ref(image-processing)), we have constructed +a spatial graph by expanding the individual cell masks by 4 pixels. + +The `imcRtools` package further allows the *ad hoc* construction of spatial +graphs directly using a `SpatialExperiment` or `SingleCellExperiment` object +while considering the spatial location (centroids) of individual cells.
The +[buildSpatialGraph](https://bodenmillergroup.github.io/imcRtools/reference/buildSpatialGraph.html) +function allows constructing spatial graphs by detecting the k-nearest neighbors +in 2D (`knn`), by detecting all cells within a given distance to the center cell +(`expansion`) and by Delaunay triangulation (`delaunay`). + +When constructing a knn graph, the number of neighbors (`k`) needs to be set and +(optionally) the maximum distance to consider (`max_dist`) can be specified. +When constructing a graph via expansion, the distance to expand (`threshold`) +needs to be provided. For graphs constructed via Delaunay triangulation, +the `max_dist` parameter can be set to avoid unusually large connections at the +edge of the image. + + +```r +library(imcRtools) +``` + + +```r +spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20) +``` + +``` +## The returned object is ordered by the 'sample_id' entry. +``` + +```r +spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20) +``` + +``` +## The returned object is ordered by the 'sample_id' entry. +``` + +```r +spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 20) +``` + +``` +## The returned object is ordered by the 'sample_id' entry. +``` + +The spatial graphs are stored in `colPair(spe, name)` slots. These slots store +`SelfHits` objects representing edge lists in which the first column indicates +the index of the "from" cell and the second column the index of the "to" cell. +Each edge list is newly constructed when subsetting the object. 
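The edge-list idea behind the `knn` and `expansion` graph types can be illustrated with a minimal base R sketch. It does not reproduce the `imcRtools` implementation: the toy coordinates and the helper names `knn_edges` and `expansion_edges` are hypothetical and only mirror the roles of the `k`, `max_dist` and `threshold` parameters described above.

```r
# Toy centroids for six cells in one image (hypothetical coordinates)
coords <- cbind(x = c(0, 1, 2, 10, 11, 30),
                y = c(0, 0, 0,  0,  0,  0))

# Pairwise Euclidean distances between all centroids
d <- unname(as.matrix(dist(coords)))
diag(d) <- Inf  # a cell is never its own neighbor

# knn graph: connect each cell to its k nearest cells and
# optionally drop edges that are longer than max_dist
knn_edges <- function(d, k, max_dist = Inf) {
  edges <- lapply(seq_len(nrow(d)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    nn <- nn[d[i, nn] <= max_dist]
    data.frame(from = rep(i, length(nn)), to = nn)
  })
  do.call(rbind, edges)
}

# expansion graph: connect each cell to all cells within `threshold`
expansion_edges <- function(d, threshold) {
  idx <- which(d <= threshold, arr.ind = TRUE)
  edges <- data.frame(from = unname(idx[, "row"]),
                      to = unname(idx[, "col"]))
  edges[order(edges$from, edges$to), ]
}

knn        <- knn_edges(d, k = 2)                # every cell has 2 outgoing edges
knn_capped <- knn_edges(d, k = 2, max_dist = 5)  # isolated cell 6 loses its edges
expansion  <- expansion_edges(d, threshold = 5)  # only cells closer than 5 units

head(knn)
```

In this toy layout, the isolated sixth cell still receives two neighbors in the plain knn graph, whereas capping the edge length or using a distance threshold leaves it unconnected.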
+ + +```r +colPairNames(spe) +``` + +``` +## [1] "neighborhood" "knn_interaction_graph" +## [3] "expansion_interaction_graph" "delaunay_interaction_graph" +``` + +Here, `colPair(spe, "neighborhood")` stores the spatial graph constructed by +`steinbock`, `colPair(spe, "knn_interaction_graph")` stores the knn spatial +graph, `colPair(spe, "expansion_interaction_graph")` stores the expansion graph +and `colPair(spe, "delaunay_interaction_graph")` stores the graph constructed by +Delaunay triangulation. + +## Spatial visualization {#spatial-viz} + +Section \@ref(image-visualization) highlights the use of the +[cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html) +package to visualize multichannel images and segmentation masks. Here, we +introduce the +[plotSpatial](https://bodenmillergroup.github.io/imcRtools/reference/plotSpatial.html) +function of the [imcRtools](https://www.bioconductor.org/packages/release/bioc/html/imcRtools.html) package to visualize the cells' centroids and +cell-cell interactions as spatial graphs. + +In the following example, we select one image for visualization purposes. +Here, each dot (node) represents a cell and edges are drawn between cells +in close physical proximity as detected by `steinbock` or the `buildSpatialGraph` +function. Nodes are variably colored based on the cell type and edges are +colored in grey. 
+ + +```r +library(ggplot2) +library(viridis) + +# steinbock interaction graph +plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "celltype", + img_id = "sample_id", + draw_edges = TRUE, + colPairName = "neighborhood", + nodes_first = FALSE, + edge_color_fix = "grey") + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + ggtitle("steinbock interaction graph") +``` + + + +```r +# knn interaction graph +plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "celltype", + img_id = "sample_id", + draw_edges = TRUE, + colPairName = "knn_interaction_graph", + nodes_first = FALSE, + edge_color_fix = "grey") + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + ggtitle("knn interaction graph") +``` + + + +```r +# expansion interaction graph +plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "celltype", + img_id = "sample_id", + draw_edges = TRUE, + colPairName = "expansion_interaction_graph", + nodes_first = FALSE, + edge_color_fix = "grey") + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + ggtitle("expansion interaction graph") +``` + + + +```r +# delaunay interaction graph +plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "celltype", + img_id = "sample_id", + draw_edges = TRUE, + colPairName = "delaunay_interaction_graph", + nodes_first = FALSE, + edge_color_fix = "grey") + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + + ggtitle("delaunay interaction graph") +``` + + + +Nodes can also be colored based on the cells' expression levels (e.g., +E-cadherin expression) and their size can be adjusted (e.g., based on measured +cell area). 
+ + +```r +plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "Ecad", + assay_type = "exprs", + img_id = "sample_id", + draw_edges = TRUE, + colPairName = "expansion_interaction_graph", + nodes_first = FALSE, + node_size_by = "area", + directed = FALSE, + edge_color_fix = "grey") + + scale_size_continuous(range = c(0.1, 2)) + + ggtitle("E-cadherin expression") +``` + + + +Finally, the `plotSpatial` function allows displaying all images at once. This +visualization can be useful to quickly detect larger structures of interest. + + +```r +plotSpatial(spe, + node_color_by = "celltype", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +For full documentation of the `plotSpatial` function, please refer to +`?plotSpatial`. + +## Spatial community analysis + +The detection of spatial communities was proposed by [@Jackson2020]. Here, cells +are clustered solely based on their interactions as defined by the spatial +object graph. We can perform spatial community detection across all cells as +displayed in the next code chunk. Communities with fewer than 10 cells are +excluded. **Of note:** we set the seed outside of the function call for +reproducibility purposes because the function internally uses `louvain` modularity +optimization, which gives different results across runs. + + +```r +set.seed(230621) +spe <- detectCommunity(spe, + colPairName = "neighborhood", + size_threshold = 10) + +plotSpatial(spe, + node_color_by = "spatial_community", + img_id = "sample_id", + node_size_fix = 0.5) + + theme(legend.position = "none") + + ggtitle("Spatial tumor communities") + + scale_color_manual(values = rev(colors())) +``` + + + +The example shown above might not be of interest if different tissue structures +exist within which spatial communities should be computed. In the following +example, we perform spatial community detection separately for tumor and stromal +cells.
+ +The general procedure is as follows: + +1. create a `colData(spe)` entry that specifies if a cell is part of the tumor +or stroma compartment. + +2. use the `detectCommunity` function of the `imcRtools` +package to cluster cells within the tumor or stroma compartment solely based on +their spatial interaction graph as constructed by the `steinbock` package. + +Both tumor and stromal spatial communities are stored in the `colData` of +the `SpatialExperiment` object under the `spatial_community` identifier. + +**Of note:** Here, and in contrast to the function call above, we set the seed +argument within the `SerialParam` function for reproducibility +purposes. We need this here due to the way the `detectCommunity` function +is implemented when setting the `group_by` parameter. + + +```r +spe$tumor_stroma <- ifelse(spe$celltype == "Tumor", "Tumor", "Stroma") + +library(BiocParallel) +spe <- detectCommunity(spe, + colPairName = "neighborhood", + size_threshold = 10, + group_by = "tumor_stroma", + BPPARAM = SerialParam(RNGseed = 220819)) +``` + +We can now separately visualize the tumor and stromal communities. + + +```r +plotSpatial(spe[,spe$celltype == "Tumor"], + node_color_by = "spatial_community", + img_id = "sample_id", + node_size_fix = 0.5) + + theme(legend.position = "none") + + ggtitle("Spatial tumor communities") + + scale_color_manual(values = rev(colors())) +``` + + + +```r +plotSpatial(spe[,spe$celltype != "Tumor"], + node_color_by = "spatial_community", + img_id = "sample_id", + node_size_fix = 0.5) + + theme(legend.position = "none") + + ggtitle("Spatial non-tumor communities") + + scale_color_manual(values = rev(colors())) +``` + + + +The example data was acquired using a panel that mainly focuses on immune cells. +We are therefore unable to detect many tumor sub-phenotypes and will +focus on the stromal communities. + +In the next step, the fraction of cell types within each +spatial stromal community is displayed. 
+ + +```r +library(pheatmap) +library(viridis) + +cur_spe <- spe[,spe$celltype != "Tumor"] + +for_plot <- prop.table(table(cur_spe$spatial_community, + cur_spe$celltype), + margin = 1) + +pheatmap(for_plot, + color = colorRampPalette(c("dark blue", "white", "dark red"))(100), + show_rownames = FALSE, + scale = "column") +``` + + + +We observe that many spatial stromal communities are made up of myeloid cells or +"stromal" (non-immune) cells. Other communities are mainly made up of B cells +and BnT cells indicating tertiary lymphoid structures (TLS). While plasma cells and +CD4$^+$ or CD8$^+$ T cells tend to aggregate, only a few spatial stromal +communities consist mainly of neutrophils. + +## Cellular neighborhood analysis + +The following section highlights the use of the `imcRtools` package to +detect cellular neighborhoods. This approach has been proposed by +[@Goltsev2018] and [@Schurch2020] to group cells based on information +contained in their direct neighborhood. + +[@Goltsev2018] performed Delaunay triangulation-based graph construction, +neighborhood aggregation and then clustered cells. [@Schurch2020] on the +other hand constructed a 10-nearest neighbor graph before aggregating +information across neighboring cells. + +In the following code chunk we will use the 20-nearest neighbor graph as +constructed above to define the direct cellular neighborhood. The +[aggregateNeighbors](https://bodenmillergroup.github.io/imcRtools/reference/aggregateNeighbors.html) +function allows neighborhood aggregation in 2 different ways: + +1. For each cell the function computes the fraction of cells of a + certain type (e.g., cell type) among its neighbors. +2. For each cell it aggregates (e.g., mean) the expression counts + across all neighboring cells. + +Based on these measures, cells can now be clustered into cellular +neighborhoods.
We will first compute the fraction of the different cell +types among the 20-nearest neighbors and use kmeans clustering to group +cells into 6 cellular neighborhoods. + +**Of note:** constructing a 20-nearest neighbor graph and clustering +using kmeans with `k=6` is only an example. Similar to the analysis done +in Section \@ref(snn-graph), it is recommended to perform a parameter +sweep across different graph construction algorithms and different +parameters `k` for kmeans clustering. Finding the best CN detection +settings is also subject to the question at hand. Constructing graphs +with more neighbors usually results in larger CNs. + + +```r +# By celltypes +spe <- aggregateNeighbors(spe, + colPairName = "knn_interaction_graph", + aggregate_by = "metadata", + count_by = "celltype") + +set.seed(220705) + +cn_1 <- kmeans(spe$aggregatedNeighbors, centers = 6) +spe$cn_celltypes <- as.factor(cn_1$cluster) + +plotSpatial(spe, + node_color_by = "cn_celltypes", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_brewer(palette = "Set3") +``` + + + +The next code chunk visualizes the cell type compositions of the +detected cellular neighborhoods (CN). + + +```r +for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype), + margin = 1) + +pheatmap(for_plot, + color = colorRampPalette(c("dark blue", "white", "dark red"))(100), + scale = "column") +``` + + + +CN 1 and CN 6 are mainly composed of tumor cells with CN 6 forming the +tumor/stroma border. CN 3 is mainly composed of B and BnT cells +indicating TLS. CN 5 is composed of aggregated plasma cells and most T +cells. + +We will now detect cellular neighborhoods by computing the mean +expression across the 20 nearest neighbors prior to kmeans clustering +(k=6).
+ + +```r +# By expression +spe <- aggregateNeighbors(spe, + colPairName = "knn_interaction_graph", + aggregate_by = "expression", + assay_type = "exprs", + subset_row = rowData(spe)$use_channel) + +set.seed(220705) + +cn_2 <- kmeans(spe$mean_aggregatedExpression, centers = 6) +spe$cn_expression <- as.factor(cn_2$cluster) + +plotSpatial(spe, + node_color_by = "cn_expression", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_brewer(palette = "Set3") +``` + + + +Here, too, we can visualize the cell type composition of each cellular +neighborhood. + + +```r +for_plot <- prop.table(table(spe$cn_expression, spe$celltype), + margin = 1) + +pheatmap(for_plot, + color = colorRampPalette(c("dark blue", "white", "dark red"))(100), + scale = "column") +``` + + + +When clustering cells based on the mean expression within the direct +neighborhood, tumor cells are split across CN 6, CN 1 and CN 4 without +forming a clear tumor/stroma interface. This result reflects +patient-to-patient differences in the expression of tumor markers. + +CN 3 again contains B cells and BnT cells but also CD8 and undefined +cells, therefore it is less representative of TLS compared to CN 3 in +the previous CN approach. CN detection based on mean marker expression is +therefore sensitive to staining/expression differences between samples +as well as lateral spillover due to imperfect segmentation. + +An alternative to the `aggregateNeighbors` function is provided by the +[lisaClust](https://bioconductor.org/packages/release/bioc/html/lisaClust.html) +Bioconductor package [@Patrick2023]. In contrast to `imcRtools`, the +`lisaClust` package computes local indicators of spatial associations +(LISA) functions and clusters cells based on those. More precisely, the +package summarizes L-functions from a Poisson point process model to +derive numeric vectors for each cell which can then again be clustered +using kmeans.
All steps are supported by the `lisaClust` function which +can be applied to a `SingleCellExperiment` and `SpatialExperiment` object. + + +In the following example, we calculate the LISA curves within a 10µm, 20µm and +50µm neighborhood around each cell. Increasing these radii will lead to broader +and smoother spatial clusters. However, a number of parameter settings should be +tested to estimate the robustness of the results. + + +```r +library(lisaClust) + +set.seed(220705) +spe <- lisaClust(spe, + k = 6, + Rs = c(10, 20, 50), + spatialCoords = c("Pos_X", "Pos_Y"), + cellType = "celltype", + imageID = "sample_id") + +plotSpatial(spe, + node_color_by = "region", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_brewer(palette = "Set3") +``` + + + +Similar to the example above, we can now observe the cell type +composition per spatial cluster. + + +```r +for_plot <- prop.table(table(spe$region, spe$celltype), + margin = 1) + +pheatmap(for_plot, + color = colorRampPalette(c("dark blue", "white", "dark red"))(100), + scale = "column") +``` + + + +In this case, CN 1 and 4 contain tumor cells but no CN is forming the +tumor/stroma interface. CN 3 represents TLS. CN 2 indicates T cell +subtypes and plasma cells are aggregated to CN 5. + +As an alternative way of visualizing the enrichment of cell types within the +detected CNs, the `lisaClust` package provides the `regionMap` function. + + +```r +regionMap(spe, + cellType = "celltype", + region = "region") +``` + + + +## Spatial context analysis + +Downstream of CN assignments, we will analyze the spatial context (SC) +of each cell using three functions from the `imcRtools` package. + +While CNs can represent sites of unique local processes, the term SC was +coined by Bhate and colleagues [@Bhate2022] and describes tissue regions +in which distinct CNs may be interacting. Hence, SCs may be interesting +regions of specialized biological events. 
+ +Here, we will first detect SCs using the `detectSpatialContext` function. This +function relies on CN fractions for each cell in a spatial interaction +graph (originally a KNN graph), which we will calculate using +`buildSpatialGraph` and `aggregateNeighbors`. We will focus on the CNs +derived from cell type fractions but other CN assignments are possible. + +**Of note**, the window size (k for KNN) for `buildSpatialGraph` should +reflect a length scale on which biological signals can be exchanged and +depends, among others, on cell density and tissue area. In view of their +divergent functionality, we recommend using a larger window size for SC +(interaction between local processes) than for CN (local processes) +detection. Since we used a 20-nearest neighbor graph for CN assignment, +we will use a 40-nearest neighbor graph for SC detection. As before, +different parameters should be tested. + +Subsequently, the CN fractions are sorted from high-to-low and the SC of +each cell is assigned as the minimal combination of CNs that additively +surpass a user-defined threshold. The default threshold of 0.9 aims to +represent the dominant CNs, hence the most prevalent signals, in a given +window. + +For more details and biological validation, please refer to +[@Bhate2022].
+ + +```r +library(circlize) +library(RColorBrewer) + +# Construct a 40-nearest neighbor graph +spe <- buildSpatialGraph(spe, + img_id = "sample_id", + type = "knn", + name = "knn_spatialcontext_graph", + k = 40) + +# Compute the fraction of cellular neighborhoods around each cell +spe <- aggregateNeighbors(spe, + colPairName = "knn_spatialcontext_graph", + aggregate_by = "metadata", + count_by = "cn_celltypes", + name = "aggregatedNeighborhood") + +# Detect spatial contexts +spe <- detectSpatialContext(spe, + entry = "aggregatedNeighborhood", + threshold = 0.90, + name = "spatial_context") + +# Define SC color scheme +n_SCs <- length(unique(spe$spatial_context)) +col_SC <- setNames(colorRampPalette(brewer.pal(9, "Paired"))(n_SCs), + sort(unique(spe$spatial_context))) + +# Visualize spatial contexts on images +plotSpatial(spe, + node_color_by = "spatial_context", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_manual(values = col_SC) +``` + + + +We detect a total of 52 distinct +SCs across this dataset. + +For ease of interpretation, we will directly compare the CN and SC +assignments for `Patient3_001`. + + +```r +library(patchwork) + +# Compare CN and SC for one patient +p1 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "cn_celltypes", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_brewer(palette = "Set3") + +p2 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], + node_color_by = "spatial_context", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_manual(values = col_SC, limits = force) + +p1 + p2 +``` + + + +As expected, we can observe that interfaces between different CNs make +up distinct SCs. For instance, the interface between CN 3 (TLS region +consisting of B and BnT cells) and CN 5 (plasma- and T-cell dominated) +becomes SC 3_5. On the other hand, the core of CN 3 becomes SC 3, since +the most abundant CN of the neighborhood for these cells is just the CN +itself.
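SC names such as `3_5` follow from the assignment rule (sort the CN fractions, accumulate them, stop once the threshold is reached). The base R sketch below illustrates that rule on made-up fractions. It is not the `detectSpatialContext` implementation: the helper `assign_sc`, the example values, the use of `>=` at the threshold and the alphabetical ordering of CN names within an SC label are assumptions made for illustration.

```r
# Hypothetical CN fractions among the neighbors of three cells
# (rows: cells, columns: CN identifiers); each row sums to 1
cn_fractions <- rbind(
  cell_core      = c("1" = 0.00, "3" = 0.97, "5" = 0.03),
  cell_interface = c("1" = 0.05, "3" = 0.60, "5" = 0.35),
  cell_mixed     = c("1" = 0.40, "3" = 0.25, "5" = 0.35)
)

# Assign the SC as the smallest set of dominant CNs whose summed
# fraction reaches the threshold (assumption: ">=" at the boundary)
assign_sc <- function(fractions, threshold = 0.9) {
  sorted <- sort(fractions, decreasing = TRUE)
  n_needed <- which(cumsum(sorted) >= threshold)[1]
  # Assumption: CN names are ordered as character strings in the label
  paste(sort(names(sorted)[seq_len(n_needed)]), collapse = "_")
}

sc <- apply(cn_fractions, 1, assign_sc)
sc
```

The cell deep inside CN 3 is assigned SC `3`, the cell at the CN 3/CN 5 interface is assigned SC `3_5`, and the cell in a mixed region needs all three CNs to reach the threshold.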
+ +Next, we filter the SCs based on user-defined thresholds for number of +group entries (here at least 3 patients) and/or total number of cells +(here minimum of 100 cells) per SC using the `filterSpatialContext` function. + + +```r +## Filter spatial contexts +# By number of group entries +spe <- filterSpatialContext(spe, + entry = "spatial_context", + group_by = "patient_id", + group_threshold = 3, + name = "spatial_context_filtered") + +plotSpatial(spe, + node_color_by = "spatial_context_filtered", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_manual(values = col_SC, limits = force) +``` + + + +```r +# Filter out small and infrequent spatial contexts +spe <- filterSpatialContext(spe, + entry = "spatial_context", + group_by = "patient_id", + group_threshold = 3, + cells_threshold = 100, + name = "spatial_context_filtered") + +plotSpatial(spe, + node_color_by = "spatial_context_filtered", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_manual(values = col_SC, limits = force) +``` + + + +Lastly, we can use the `plotSpatialContext` function to generate *SC +graphs*, analogous to *CN combination maps* in [@Bhate2022]. Returned +objects are `ggplots`, which can be easily modified further. We will +create a SC graph for the filtered SCs here. + + +```r +## Plot spatial context graph + +# Colored by name, size by n_cells +plotSpatialContext(spe, + entry = "spatial_context_filtered", + group_by = "sample_id", + node_color_by = "name", + node_size_by = "n_cells", + node_label_color_by = "name") +``` + + + +```r +# Colored by n_cells, size by n_group +plotSpatialContext(spe, + entry = "spatial_context_filtered", + group_by = "sample_id", + node_color_by = "n_cells", + node_size_by = "n_group", + node_label_color_by = "n_cells") + + scale_color_viridis() +``` + + + +SC 1 (Tumor-dominated), SC 1_6 (Tumor and Tumor-Stroma interface) and SC +4_5 (Plasma/T cell and Myeloid/Neutrophil interface) are the most +frequent SCs in this dataset. 
Moreover, we may compare the degree of the +different nodes in the SC graph. For example, we can observe that SC 1 +has a degree of only one (connected to SC 1_6), while SC 5 (T cells and plasma cells) has +a much higher degree (n = 4) and potentially more CN interactions. + +## Patch detection + +The previous section focused on detecting cellular neighborhoods in a rather +unsupervised fashion. However, the `imcRtools` package also provides methods for +detecting spatial compartments in a supervised fashion. The +[patchDetection](https://bodenmillergroup.github.io/imcRtools/reference/patchDetection.html) +function allows the detection of connected sets of similar cells as proposed by +[@Hoch2022]. In the following example, we will use the `patchDetection` function +to detect tumor patches in three steps: + +1. Find connected sets of tumor cells (using the `steinbock` graph). +2. Exclude components which contain fewer than 10 cells. +3. Expand the components by 1µm to construct a concave hull around the patch and +include cells within the patch. + + +```r +spe <- patchDetection(spe, + patch_cells = spe$celltype == "Tumor", + img_id = "sample_id", + expand_by = 1, + min_patch_size = 10, + colPairName = "neighborhood", + BPPARAM = MulticoreParam()) +``` + +``` +## The returned object is ordered by the 'sample_id' entry. +``` + +```r +plotSpatial(spe, + node_color_by = "patch_id", + img_id = "sample_id", + node_size_fix = 0.5) + + theme(legend.position = "none") + + scale_color_manual(values = rev(colors())) +``` + + + +We can now calculate the fraction of T cells within each tumor patch to roughly +estimate T cell infiltration.
+ + +```r +library(tidyverse) +colData(spe) %>% as_tibble() %>% + group_by(patch_id, sample_id) %>% + summarize(Tcell_count = sum(celltype == "CD8" | celltype == "CD4"), + patch_size = n(), + Tcell_freq = Tcell_count / patch_size) %>% + filter(!is.na(patch_id)) %>% + ggplot() + + geom_point(aes(log10(patch_size), Tcell_freq, color = sample_id)) + + theme_classic() +``` + + + +We can now measure the size of each patch using the +[patchSize](https://bodenmillergroup.github.io/imcRtools/reference/patchSize.html) +function and visualize tumor patch distribution per patient. + + +```r +patch_size <- patchSize(spe, "patch_id") + +patch_size <- merge(patch_size, + colData(spe)[match(patch_size$patch_id, spe$patch_id),], + by = "patch_id") + +ggplot(as.data.frame(patch_size)) + + geom_boxplot(aes(patient_id, log10(size))) + + geom_point(aes(patient_id, log10(size))) +``` + + + +The +[minDistToCells](https://bodenmillergroup.github.io/imcRtools/reference/minDistToCells.html) +function can be used to calculate the minimum distance between each cell and a +cell set of interest. Here, we highlight its use to calculate the minimum +distance of all cells to the detected tumor patches. Negative values indicate +the minimum distance of each tumor patch cell to a non-tumor patch cell. + + +```r +spe <- minDistToCells(spe, + x_cells = !is.na(spe$patch_id), + img_id = "sample_id") +``` + +``` +## The returned object is ordered by the 'sample_id' entry. +``` + +```r +plotSpatial(spe, + node_color_by = "distToCells", + img_id = "sample_id", + node_size_fix = 0.5) + + scale_color_gradient2(low = "dark blue", mid = "white", high = "dark red") +``` + + + +Finally, we can observe the minimum distances to tumor patches in a cell type +specific manner. 
+ + +```r +library(ggridges) + +ggplot(as.data.frame(colData(spe))) + + geom_density_ridges(aes(distToCells, celltype, fill = celltype)) + + geom_vline(xintercept = 0, color = "dark red", linewidth = 2) + + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +``` + + + +## Interaction analysis + +**Bug notice: we discovered and fixed a bug in the `testInteractions` function in versions below 1.5.5 which affected `SingleCellExperiment` or `SpatialExperiment` objects in which cells were not grouped by image. Please make sure you have the newest version (>= 1.6.0) installed.** + +The next section focuses on statistically testing the pairwise interaction +between all cell types of the dataset. For this, the `imcRtools` package +provides the +[testInteractions](https://bodenmillergroup.github.io/imcRtools/reference/testInteractions.html) +function which implements the interaction testing strategy proposed by +[@Shapiro2017]. + +Per grouping level (e.g., image), the `testInteractions` function computes the +averaged cell type/cell type interaction count and compares this count against +an empirical null distribution which is generated by permuting all cell labels (while maintaining the tissue structure). + +In the following example, we use the `steinbock`-generated spatial interaction +graph and estimate the interaction or avoidance between cell types in the +dataset.
+ + +```r +library(scales) +out <- testInteractions(spe, + group_by = "sample_id", + label = "celltype", + colPairName = "neighborhood", + BPPARAM = SerialParam(RNGseed = 221029)) + +head(out) +``` + +``` +## DataFrame with 6 rows and 10 columns +## group_by from_label to_label ct p_gt p_lt +## +## 1 Patient1_001 Bcell Bcell 0 1.000000 1.000000 +## 2 Patient1_001 Bcell BnTcell 0 1.000000 0.998002 +## 3 Patient1_001 Bcell CD4 3 0.001998 1.000000 +## 4 Patient1_001 Bcell CD8 0 1.000000 0.898102 +## 5 Patient1_001 Bcell Myeloid 0 1.000000 0.804196 +## 6 Patient1_001 Bcell Neutrophil NA NA NA +## interaction p sig sigval +## +## 1 FALSE 1.000000 FALSE 0 +## 2 FALSE 0.998002 FALSE 0 +## 3 TRUE 0.001998 TRUE 1 +## 4 FALSE 0.898102 FALSE 0 +## 5 FALSE 0.804196 FALSE 0 +## 6 NA NA NA NA +``` + +The returned `DataFrame` contains the test results per grouping level (in this case +the image ID, `group_by`), "from" cell type (`from_label`) and "to" cell type +(`to_label`). The `sigval` entry indicates if a pair of cell types is +significantly interacting (`sigval = 1`), if a pair of cell types is +significantly avoiding (`sigval = -1`) or if no significant interaction or +avoidance was detected (`sigval = 0`). + +These results can be visualized by computing the sum of the `sigval` entries +across all images: + + +```r +out %>% as_tibble() %>% + group_by(from_label, to_label) %>% + summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>% + ggplot() + + geom_tile(aes(from_label, to_label, fill = sum_sigval)) + + scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) + + theme(axis.text.x = element_text(angle = 45, hjust = 1)) +``` + + + +In the plot above the red tiles indicate cell type pairs that were detected to +significantly interact on a large number of images. On the other hand, blue +tiles show cell type pairs which tend to avoid each other on a large number +of images. 
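To make the `p_gt` and `p_lt` columns more concrete, the label-permutation idea can be sketched in base R on a toy graph. This is an illustration under stated assumptions and not the `testInteractions` code: the toy edge list, the helper `count_interactions`, the number of permutations and the `(count + 1) / (n_perm + 1)` form of the empirical p-values are all chosen here for demonstration.

```r
set.seed(1)

# Toy image: 10 cells forming two fully connected clusters of 5 cells;
# all "A" cells sit in the first cluster, all "B" cells in the second
labels <- c(rep("A", 5), rep("B", 5))
edges <- rbind(expand.grid(from = 1:5, to = 1:5),
               expand.grid(from = 6:10, to = 6:10))
edges <- edges[edges$from != edges$to, ]  # remove self-edges

# Average number of `to_label` neighbors per `from_label` cell
count_interactions <- function(labels, edges, from_label, to_label) {
  hits <- labels[edges$from] == from_label & labels[edges$to] == to_label
  sum(hits) / sum(labels == from_label)
}

# Observed interaction count for A -> A
obs <- count_interactions(labels, edges, "A", "A")

# Null distribution: shuffle the labels, keep the graph (tissue structure)
n_perm <- 200
perm <- replicate(n_perm,
                  count_interactions(sample(labels), edges, "A", "A"))

# Empirical p-values for "more" and "fewer" interactions than expected
p_gt <- (sum(perm >= obs) + 1) / (n_perm + 1)
p_lt <- (sum(perm <= obs) + 1) / (n_perm + 1)

c(observed = obs, p_gt = p_gt, p_lt = p_lt)
```

Because every "A" cell is surrounded only by other "A" cells, the observed count is the largest value the toy graph allows. Consequently `p_gt` is small (interaction) and `p_lt` equals 1 (no avoidance).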
+ +Here we can observe that tumor cells are mostly compartmentalized and are in +avoidance with other cell types. As expected, B cells interact with BnT cells; +regulatory T cells interact with CD4+ T cells and CD8+ T cells. Most cell types +show self interactions indicating spatial clustering. + +The `imcRtools` package further implements an interaction testing strategy +proposed by [@Schulz2018] where the hypothesis is tested if at least n cells of +a certain type are located around a target cell type (`from_cell`). This type of +testing can be performed by selecting `method = "patch"` and specifying the +number of patch cells via the `patch_size` parameter. + + +```r +out <- testInteractions(spe, + group_by = "sample_id", + label = "celltype", + colPairName = "neighborhood", + method = "patch", + patch_size = 3, + BPPARAM = SerialParam(RNGseed = 221029)) + +out %>% as_tibble() %>% + group_by(from_label, to_label) %>% + summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>% + ggplot() + + geom_tile(aes(from_label, to_label, fill = sum_sigval)) + + scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) + + theme(axis.text.x = element_text(angle = 45, hjust = 1)) +``` + + + +These results are comparable to the interaction testing presented above. The +main difference comes from the lack of symmetry. We can now for example see that +3 or more myeloid cells sit around CD4$^+$ T cells while this interaction is not +as strong when considering CD4$^+$ T cells sitting around myeloid cells. + +Finally, we save the updated `SpatialExperiment` object. + + +```r +saveRDS(spe, "data/spe.rds") +``` + + + +## Session Info + +
+ SessionInfo + + +``` +## R version 4.3.1 (2023-06-16) +## Platform: x86_64-pc-linux-gnu (64-bit) +## Running under: Ubuntu 22.04.3 LTS +## +## Matrix products: default +## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 +## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 +## +## locale: +## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C +## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 +## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 +## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C +## [9] LC_ADDRESS=C LC_TELEPHONE=C +## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C +## +## time zone: Etc/UTC +## tzcode source: system (glibc) +## +## attached base packages: +## [1] stats4 stats graphics grDevices utils datasets methods +## [8] base +## +## other attached packages: +## [1] testthat_3.1.10 scales_1.2.1 +## [3] ggridges_0.5.4 lubridate_1.9.3 +## [5] forcats_1.0.0 stringr_1.5.0 +## [7] dplyr_1.1.3 purrr_1.0.2 +## [9] readr_2.1.4 tidyr_1.3.0 +## [11] tibble_3.2.1 tidyverse_2.0.0 +## [13] patchwork_1.1.3 RColorBrewer_1.1-3 +## [15] circlize_0.4.15 lisaClust_1.8.1 +## [17] pheatmap_1.0.12 BiocParallel_1.34.2 +## [19] viridis_0.6.4 viridisLite_0.4.2 +## [21] ggplot2_3.4.3 imcRtools_1.6.5 +## [23] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 +## [25] SummarizedExperiment_1.30.2 Biobase_2.60.0 +## [27] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 +## [29] IRanges_2.34.1 S4Vectors_0.38.2 +## [31] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 +## [33] matrixStats_1.0.0 +## +## loaded via a namespace (and not attached): +## [1] spatstat.sparse_3.0-2 bitops_1.0-7 +## [3] sf_1.0-14 EBImage_4.42.0 +## [5] doParallel_1.0.17 numDeriv_2016.8-1.1 +## [7] tools_4.3.1 backports_1.4.1 +## [9] utf8_1.2.3 R6_2.5.1 +## [11] DT_0.29 HDF5Array_1.28.1 +## [13] mgcv_1.8-42 rhdf5filters_1.12.1 +## [15] GetoptLong_1.0.5 withr_2.5.1 +## [17] sp_2.0-0 gridExtra_2.3 +## [19] ClassifyR_3.4.11 cli_3.6.1 +## [21] spatstat.explore_3.2-3 
sandwich_3.0-2 +## [23] labeling_0.4.3 sass_0.4.7 +## [25] spatstat.data_3.0-1 nnls_1.5 +## [27] mvtnorm_1.2-3 proxy_0.4-27 +## [29] systemfonts_1.0.4 colorRamps_2.3.1 +## [31] svglite_2.1.1 R.utils_2.12.2 +## [33] scater_1.28.0 plotrix_3.8-2 +## [35] limma_3.56.2 flowCore_2.12.2 +## [37] rstudioapi_0.15.0 generics_0.1.3 +## [39] shape_1.4.6 spatstat.random_3.1-6 +## [41] gtools_3.9.4 vroom_1.6.3 +## [43] car_3.1-2 scam_1.2-14 +## [45] Matrix_1.6-1.1 RProtoBufLib_2.12.1 +## [47] ggbeeswarm_0.7.2 fansi_1.0.4 +## [49] abind_1.4-5 R.methodsS3_1.8.2 +## [51] terra_1.7-46 lifecycle_1.0.3 +## [53] multcomp_1.4-25 yaml_2.3.7 +## [55] edgeR_3.42.4 carData_3.0-5 +## [57] rhdf5_2.44.0 Rtsne_0.16 +## [59] grid_4.3.1 promises_1.2.1 +## [61] dqrng_0.3.1 crayon_1.5.2 +## [63] shinydashboard_0.7.2 lattice_0.21-8 +## [65] beachmat_2.16.0 cowplot_1.1.1 +## [67] magick_2.8.0 cytomapper_1.12.0 +## [69] pillar_1.9.0 knitr_1.44 +## [71] ComplexHeatmap_2.16.0 RTriangle_1.6-0.12 +## [73] boot_1.3-28.1 rjson_0.2.21 +## [75] codetools_0.2-19 glue_1.6.2 +## [77] V8_4.3.3 data.table_1.14.8 +## [79] MultiAssayExperiment_1.26.0 vctrs_0.6.3 +## [81] png_0.1-8 gtable_0.3.4 +## [83] cachem_1.0.8 xfun_0.40 +## [85] S4Arrays_1.0.6 mime_0.12 +## [87] DropletUtils_1.20.0 tidygraph_1.2.3 +## [89] ConsensusClusterPlus_1.64.0 survival_3.5-5 +## [91] iterators_1.0.14 cytolib_2.12.1 +## [93] units_0.8-4 ellipsis_0.3.2 +## [95] TH.data_1.1-2 nlme_3.1-162 +## [97] bit64_4.0.5 rprojroot_2.0.3 +## [99] bslib_0.5.1 irlba_2.3.5.1 +## [101] svgPanZoom_0.3.4 vipor_0.4.5 +## [103] KernSmooth_2.23-21 colorspace_2.1-0 +## [105] DBI_1.1.3 raster_3.6-23 +## [107] tidyselect_1.2.0 curl_5.0.2 +## [109] bit_4.0.5 compiler_4.3.1 +## [111] BiocNeighbors_1.18.0 desc_1.4.2 +## [113] DelayedArray_0.26.7 bookdown_0.35 +## [115] classInt_0.4-10 distances_0.1.9 +## [117] goftest_1.2-3 tiff_0.1-11 +## [119] digest_0.6.33 minqa_1.2.6 +## [121] fftwtools_0.9-11 spatstat.utils_3.0-3 +## [123] rmarkdown_2.25 XVector_0.40.0 +## [125] 
CATALYST_1.24.0 htmltools_0.5.6 +## [127] pkgconfig_2.0.3 jpeg_0.1-10 +## [129] lme4_1.1-34 sparseMatrixStats_1.12.2 +## [131] fastmap_1.1.1 rlang_1.1.1 +## [133] GlobalOptions_0.1.2 htmlwidgets_1.6.2 +## [135] shiny_1.7.5 DelayedMatrixStats_1.22.6 +## [137] farver_2.1.1 jquerylib_0.1.4 +## [139] zoo_1.8-12 jsonlite_1.8.7 +## [141] spicyR_1.12.2 R.oo_1.25.0 +## [143] BiocSingular_1.16.0 RCurl_1.98-1.12 +## [145] magrittr_2.0.3 scuttle_1.10.2 +## [147] GenomeInfoDbData_1.2.10 Rhdf5lib_1.22.1 +## [149] munsell_0.5.0 Rcpp_1.0.11 +## [151] ggnewscale_0.4.9 stringi_1.7.12 +## [153] ggraph_2.1.0 brio_1.1.3 +## [155] zlibbioc_1.46.0 MASS_7.3-60 +## [157] plyr_1.8.8 parallel_4.3.1 +## [159] ggrepel_0.9.3 deldir_1.0-9 +## [161] graphlayouts_1.0.1 splines_4.3.1 +## [163] tensor_1.5 hms_1.1.3 +## [165] locfit_1.5-9.8 igraph_1.5.1 +## [167] ggpubr_0.6.0 spatstat.geom_3.2-5 +## [169] ggsignif_0.6.4 pkgload_1.3.3 +## [171] reshape2_1.4.4 ScaledMatrix_1.8.1 +## [173] XML_3.99-0.14 drc_3.0-1 +## [175] evaluate_0.21 nloptr_2.0.3 +## [177] tzdb_0.4.0 foreach_1.5.2 +## [179] tweenr_2.0.2 httpuv_1.6.11 +## [181] polyclip_1.10-6 clue_0.3-65 +## [183] ggforce_0.4.1 rsvd_1.0.5 +## [185] broom_1.0.5 xtable_1.8-4 +## [187] e1071_1.7-13 rstatix_0.7.2 +## [189] later_1.3.1 class_7.3-22 +## [191] lmerTest_3.1-3 FlowSOM_2.8.0 +## [193] beeswarm_0.4.0 cluster_2.1.4 +## [195] timechange_0.2.0 concaveman_1.1.0 +``` +
+ + diff --git a/11-spatial_analysis_files/figure-html/celltype-distance-1.png b/11-spatial_analysis_files/figure-html/celltype-distance-1.png new file mode 100644 index 00000000..6332b76b Binary files /dev/null and b/11-spatial_analysis_files/figure-html/celltype-distance-1.png differ diff --git a/11-spatial_analysis_files/figure-html/cn-analysis-1.png b/11-spatial_analysis_files/figure-html/cn-analysis-1.png new file mode 100644 index 00000000..981c76b5 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/cn-analysis-1.png differ diff --git a/11-spatial_analysis_files/figure-html/compare cn sc-1.png b/11-spatial_analysis_files/figure-html/compare cn sc-1.png new file mode 100644 index 00000000..63a8a780 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/compare cn sc-1.png differ diff --git a/11-spatial_analysis_files/figure-html/detectSpatialContext-1.png b/11-spatial_analysis_files/figure-html/detectSpatialContext-1.png new file mode 100644 index 00000000..28b26bd0 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/detectSpatialContext-1.png differ diff --git a/11-spatial_analysis_files/figure-html/filterSpatialContext-1.png b/11-spatial_analysis_files/figure-html/filterSpatialContext-1.png new file mode 100644 index 00000000..ae9aad93 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/filterSpatialContext-1.png differ diff --git a/11-spatial_analysis_files/figure-html/filterSpatialContext-2.png b/11-spatial_analysis_files/figure-html/filterSpatialContext-2.png new file mode 100644 index 00000000..e1bb2d92 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/filterSpatialContext-2.png differ diff --git a/11-spatial_analysis_files/figure-html/lisaClust-1.png b/11-spatial_analysis_files/figure-html/lisaClust-1.png new file mode 100644 index 00000000..e0310890 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/lisaClust-1.png differ diff --git 
a/11-spatial_analysis_files/figure-html/lisaClust-3-1.png b/11-spatial_analysis_files/figure-html/lisaClust-3-1.png new file mode 100644 index 00000000..08e9aa77 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/lisaClust-3-1.png differ diff --git a/11-spatial_analysis_files/figure-html/minDistCells-1.png b/11-spatial_analysis_files/figure-html/minDistCells-1.png new file mode 100644 index 00000000..78f89a28 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/minDistCells-1.png differ diff --git a/11-spatial_analysis_files/figure-html/patch-size-1.png b/11-spatial_analysis_files/figure-html/patch-size-1.png new file mode 100644 index 00000000..f8161efa Binary files /dev/null and b/11-spatial_analysis_files/figure-html/patch-size-1.png differ diff --git a/11-spatial_analysis_files/figure-html/patchDetection-1-1.png b/11-spatial_analysis_files/figure-html/patchDetection-1-1.png new file mode 100644 index 00000000..95c838cb Binary files /dev/null and b/11-spatial_analysis_files/figure-html/patchDetection-1-1.png differ diff --git a/11-spatial_analysis_files/figure-html/patchDetection-2-1.png b/11-spatial_analysis_files/figure-html/patchDetection-2-1.png new file mode 100644 index 00000000..a40ad2c0 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/patchDetection-2-1.png differ diff --git a/11-spatial_analysis_files/figure-html/plotSpatialContext-1.png b/11-spatial_analysis_files/figure-html/plotSpatialContext-1.png new file mode 100644 index 00000000..ac13ab14 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/plotSpatialContext-1.png differ diff --git a/11-spatial_analysis_files/figure-html/plotSpatialContext-2.png b/11-spatial_analysis_files/figure-html/plotSpatialContext-2.png new file mode 100644 index 00000000..ab085665 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/plotSpatialContext-2.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-community-1-1.png 
b/11-spatial_analysis_files/figure-html/spatial-community-1-1.png new file mode 100644 index 00000000..02b6bc87 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-community-1-1.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-community-heatmap-1.png b/11-spatial_analysis_files/figure-html/spatial-community-heatmap-1.png new file mode 100644 index 00000000..6e750082 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-community-heatmap-1.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-community-viz-1.png b/11-spatial_analysis_files/figure-html/spatial-community-viz-1.png new file mode 100644 index 00000000..46413beb Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-community-viz-1.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-community-viz-2.png b/11-spatial_analysis_files/figure-html/spatial-community-viz-2.png new file mode 100644 index 00000000..a3bd0e68 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-community-viz-2.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-1-1.png b/11-spatial_analysis_files/figure-html/spatial-viz-1-1.png new file mode 100644 index 00000000..a6314b62 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-1-1.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-1-2.png b/11-spatial_analysis_files/figure-html/spatial-viz-1-2.png new file mode 100644 index 00000000..c9cf6a42 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-1-2.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-1-3.png b/11-spatial_analysis_files/figure-html/spatial-viz-1-3.png new file mode 100644 index 00000000..dc24b1f2 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-1-3.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-1-4.png 
b/11-spatial_analysis_files/figure-html/spatial-viz-1-4.png new file mode 100644 index 00000000..3c315919 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-1-4.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-2-1.png b/11-spatial_analysis_files/figure-html/spatial-viz-2-1.png new file mode 100644 index 00000000..0c88e830 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-2-1.png differ diff --git a/11-spatial_analysis_files/figure-html/spatial-viz-3-1.png b/11-spatial_analysis_files/figure-html/spatial-viz-3-1.png new file mode 100644 index 00000000..c762b98a Binary files /dev/null and b/11-spatial_analysis_files/figure-html/spatial-viz-3-1.png differ diff --git a/11-spatial_analysis_files/figure-html/testInteractions-2-1.png b/11-spatial_analysis_files/figure-html/testInteractions-2-1.png new file mode 100644 index 00000000..070ea008 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/testInteractions-2-1.png differ diff --git a/11-spatial_analysis_files/figure-html/testInteractions-3-1.png b/11-spatial_analysis_files/figure-html/testInteractions-3-1.png new file mode 100644 index 00000000..2f63749b Binary files /dev/null and b/11-spatial_analysis_files/figure-html/testInteractions-3-1.png differ diff --git a/11-spatial_analysis_files/figure-html/unnamed-chunk-1-1.png b/11-spatial_analysis_files/figure-html/unnamed-chunk-1-1.png new file mode 100644 index 00000000..4d0a3901 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/unnamed-chunk-1-1.png differ diff --git a/11-spatial_analysis_files/figure-html/unnamed-chunk-2-1.png b/11-spatial_analysis_files/figure-html/unnamed-chunk-2-1.png new file mode 100644 index 00000000..ca627785 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/unnamed-chunk-2-1.png differ diff --git a/11-spatial_analysis_files/figure-html/unnamed-chunk-3-1.png 
b/11-spatial_analysis_files/figure-html/unnamed-chunk-3-1.png new file mode 100644 index 00000000..660ca2e3 Binary files /dev/null and b/11-spatial_analysis_files/figure-html/unnamed-chunk-3-1.png differ diff --git a/11-spatial_analysis_files/figure-html/unnamed-chunk-4-1.png b/11-spatial_analysis_files/figure-html/unnamed-chunk-4-1.png new file mode 100644 index 00000000..2e8e1dab Binary files /dev/null and b/11-spatial_analysis_files/figure-html/unnamed-chunk-4-1.png differ diff --git a/12-references.md b/12-references.md new file mode 100644 index 00000000..48143a70 --- /dev/null +++ b/12-references.md @@ -0,0 +1,3 @@ + +# References {-} + diff --git a/404.html b/404.html new file mode 100644 index 00000000..f34f19ba --- /dev/null +++ b/404.html @@ -0,0 +1,446 @@ + + + + + + + Page not found | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

Page not found

+

The page you requested cannot be found (perhaps it was moved or renamed).

+

You may want to try searching to find the page's new location, or use +the table of contents to find the page you are looking for.

+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/batch-effects.html b/batch-effects.html new file mode 100644 index 00000000..d1dab367 --- /dev/null +++ b/batch-effects.html @@ -0,0 +1,903 @@ + + + + + + + 8 Batch effect correction | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

8 Batch effect correction

+

In Section 7.4 we observed staining/expression differences +between the individual samples. These differences can arise from technical (e.g., +differences in sample processing) as well as biological (e.g., differential +expression between patients/indications) effects. However, the combination of these effects +hinders cell phenotyping via clustering as highlighted in Section 9.2.

+

To integrate cells across samples, we can use computational +strategies developed for correcting batch effects in single-cell RNA sequencing +data. In the following sections, we will use functions of the +batchelor, +harmony and +Seurat +packages to correct for such batch effects.

+

Of note: the correction approaches presented here aim at removing any +differences between samples. This will also remove biological differences +between the patients/indications. Nevertheless, integrating cells across samples +can facilitate the detection of cell phenotypes via clustering.

+

First, we will read in the SpatialExperiment object containing the single-cell +data.

+
spe <- readRDS("data/spe.rds")
+
+

8.1 fastMNN correction

+

The batchelor package provides the mnnCorrect and fastMNN functions to +correct for differences between samples/batches. Both functions build on +finding mutual nearest neighbors (MNN) among the cells of different samples and +correct expression differences between the batches (Haghverdi et al. 2018). The mnnCorrect function +returns corrected expression counts while the fastMNN function performs the +correction in reduced dimension space. As such, fastMNN returns integrated +cells in the form of a low-dimensional embedding.

+

Paper: Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors
+Documentation: batchelor
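To make the idea of mutual nearest neighbors more concrete, the following base R sketch finds MNN pairs between two tiny, made-up batches using a single nearest neighbor (k = 1). It is only an illustration of the concept: the matrices `batch1` and `batch2` and the helper `cross_dist` are hypothetical, and the batchelor functions use more neighbors, faster search algorithms and an additional correction step.

```r
# Two toy "batches" with 2 markers each (rows = cells); values are made up
batch1 <- rbind(c(0.0, 0.0), c(1.0, 1.0), c(5.0, 5.0))
batch2 <- rbind(c(0.2, 0.1), c(5.1, 4.9), c(9.0, 9.0))

# Euclidean distances between cells of batch1 (rows) and batch2 (columns)
cross_dist <- function(a, b) {
    d <- as.matrix(dist(rbind(a, b)))
    d[seq_len(nrow(a)), nrow(a) + seq_len(nrow(b)), drop = FALSE]
}
d <- cross_dist(batch1, batch2)

# Nearest neighbor in the other batch, in both directions
nn_1to2 <- unname(apply(d, 1, which.min))  # per batch1 cell: closest batch2 cell
nn_2to1 <- unname(apply(d, 2, which.min))  # per batch2 cell: closest batch1 cell

# A pair (i, j) is mutual if j is the neighbor of i and i is the neighbor of j
i <- seq_len(nrow(batch1))
is_mutual <- nn_2to1[nn_1to2] == i
mnn_pairs <- data.frame(batch1 = i[is_mutual], batch2 = nn_1to2[is_mutual])
mnn_pairs
```

In this toy example, cell 1 of `batch1` pairs with cell 1 of `batch2`, and cell 3 of `batch1` pairs with cell 2 of `batch2`. Cell 2 of `batch1` has a nearest neighbor in `batch2`, but that neighbor is closer to another cell, so the pair is not mutual.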

+
+

8.1.1 Perform sample correction

+

Here, we apply the fastMNN function to integrate cells between +patients. By setting auto.merge = TRUE the function estimates the best +batch merging order by maximizing the number of MNN pairs at each merging step. +This is more time consuming than merging sequentially based on how batches appear in the +dataset (default). We again select the markers defined in Section 5.2 +for sample correction.

+

The function returns a SingleCellExperiment object which contains corrected +low-dimensional coordinates for each cell in the reducedDim(out, "corrected") +slot. This low-dimensional embedding can be further used for clustering and +non-linear dimensionality reduction. We check that the order of cells is the +same between the input and output object and then transfer the corrected +coordinates to the main SpatialExperiment object.

+
library(batchelor)
+set.seed(220228)
+out <- fastMNN(spe, batch = spe$patient_id,
+               auto.merge = TRUE,
+               subset.row = rowData(spe)$use_channel,
+               assay.type = "exprs")
+
+# Check that order of cells is the same
+stopifnot(all.equal(colnames(spe), colnames(out)))
+
+# Transfer the correction results to the main spe object
+reducedDim(spe, "fastMNN") <- reducedDim(out, "corrected")
+

The computational time of the fastMNN function call is +2.33 minutes.

+

Of note, the warnings that the fastMNN function produces can be avoided as follows:

+
    +
  1. The following warning can be avoided by setting BSPARAM = BiocSingular::ExactParam()
  2. +
+
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE,  :
+  You're computing too large a percentage of total singular values, use a standard svd instead.
+
    +
  1. The following warning can be avoided by requesting fewer singular values by setting d = 30
  2. +
+
In check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) -  :
+  more singular values/vectors requested than available
+
+
+

8.1.2 Quality control of correction results

+

The fastMNN function further returns outputs that can be used to assess the +quality of the batch correction. The metadata(out)$merge.info entry collects +diagnostics for each individual merging step. Here, the batch.size and +lost.var entries are important. The batch.size entry reports the relative +magnitude of the batch effect and the lost.var entry represents the percentage +of lost variance per merging step. A large batch.size and low lost.var +indicate sufficient batch correction.

+
merge_info <- metadata(out)$merge.info 
+
+merge_info[,c("left", "right", "batch.size")]
+
## DataFrame with 3 rows and 3 columns
+##                         left    right batch.size
+##                       <List>   <List>  <numeric>
+## 1                   Patient4 Patient2   0.381635
+## 2          Patient4,Patient2 Patient1   0.581013
+## 3 Patient4,Patient2,Patient1 Patient3   0.767376
+
merge_info$lost.var
+
##         Patient1    Patient2   Patient3    Patient4
+## [1,] 0.000000000 0.031154864 0.00000000 0.046198914
+## [2,] 0.043363546 0.009772150 0.00000000 0.011931892
+## [3,] 0.005394755 0.003023119 0.07219394 0.005366304
+

We observe that Patient4 and Patient2 are most similar with a low batch effect. +Merging cells of Patient3 into the combined batch of Patient1, +Patient2 and Patient4 resulted in the highest percentage of lost variance and +the detection of the largest batch effect. In the next section, we +visualize the correction results.
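The lost variance can also be screened programmatically. The sketch below copies the `lost.var` values printed above into a plain matrix and flags merge steps in which any batch loses more than a chosen fraction of its variance. The 10% cut-off is an assumption made for illustration (a rule of thumb, not a value taken from this workflow), and the names `lost_var`, `max_lost` and `flagged` are hypothetical.

```r
# lost.var values copied from the output above (rows = merge steps)
lost_var <- matrix(
    c(0.000000000, 0.031154864, 0.00000000, 0.046198914,
      0.043363546, 0.009772150, 0.00000000, 0.011931892,
      0.005394755, 0.003023119, 0.07219394, 0.005366304),
    nrow = 3, byrow = TRUE,
    dimnames = list(NULL, paste0("Patient", 1:4)))

# Largest fraction of variance lost by any batch at each merge step
max_lost <- apply(lost_var, 1, max)

# Assumed cut-off: flag steps where a batch loses more than 10% of its variance
threshold <- 0.10
flagged <- which(max_lost > threshold)

round(max_lost, 3)
length(flagged)
```

No merge step exceeds the assumed cut-off here; the largest loss occurs in the third step and is attributable to Patient3, in line with the observation above.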

+
+
+

8.1.3 Visualization

+

The simplest option to check if the sample effects were corrected is to use +non-linear dimensionality reduction techniques and observe the mixing of cells across +samples. We will recompute the UMAP embedding using the corrected +low-dimensional coordinates for each cell.

+
library(scater)
+
+set.seed(220228)
+spe <- runUMAP(spe, dimred= "fastMNN", name = "UMAP_mnnCorrected") 
+

Next, we visualize the corrected UMAP while overlaying patient IDs.

+
library(cowplot)
+library(dittoSeq)
+library(viridis)
+
+# visualize patient id 
+p1 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP before correction")
+p2 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP_mnnCorrected", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP after correction")
+
+plot_grid(p1, p2)
+

+

We observe an imperfect merging of Patient3 into all other samples. This +was already seen when displaying the merging information above. +We now also visualize the expression of selected markers across all cells +before and after batch correction.

+
markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", "SMA", "Ki67")
+
+# Before correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+
# After correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_mnnCorrected", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+

We observe that immune cells across patients are merged after batch correction +using fastMNN. However, the tumor cells of different patients still cluster +separately.

+
+
+
+

8.2 harmony correction

+

The harmony algorithm performs batch correction by iteratively clustering and +correcting the positions of cells in PCA space (Korsunsky et al. 2019). We will first +perform PCA on the asinh-transformed counts and then call the RunHarmony +function to perform data integration.

+

Paper: Fast, sensitive and accurate integration of single-cell data with Harmony
+Documentation: harmony

+

Similar to the fastMNN function, harmony returns the corrected +low-dimensional coordinates for each cell. These can be transferred to the +reducedDim slot of the original SpatialExperiment object.

+
library(harmony)
+library(BiocSingular)
+
+spe <- runPCA(spe, 
+              subset_row = rowData(spe)$use_channel, 
+              exprs_values = "exprs", 
+              ncomponents = 30,
+              BSPARAM = ExactParam())
+
+set.seed(230616)
+out <- RunHarmony(spe, group.by.vars = "patient_id")
+
+# Check that order of cells is the same
+stopifnot(all.equal(colnames(spe), colnames(out)))
+
+reducedDim(spe, "harmony") <- reducedDim(out, "HARMONY")
+

The computational time of the RunHarmony function call is +1.3 minutes.

+
+

8.2.1 Visualization

+

We will now again visualize the cells in low dimensions after UMAP embedding.

+
set.seed(220228)
+spe <- runUMAP(spe, dimred = "harmony", name = "UMAP_harmony") 
+
# visualize patient id 
+p1 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP before correction")
+p2 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP_harmony", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP after correction")
+
+plot_grid(p1, p2)
+

+

And we visualize selected marker expression as defined above.

+
# Before correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+
# After correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_harmony", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+

We observe a more aggressive merging of cells from different patients compared +to the results after fastMNN correction. Importantly, immune cell and epithelial +markers are expressed in distinct regions of the UMAP.

+
+
+
+

8.3 Seurat correction

+

The Seurat package provides a number of functionalities to analyze single-cell +data. As such it also allows the integration of cells across different samples. +Conceptually, Seurat performs batch correction similarly to fastMNN by +finding mutual nearest neighbors (MNN) in low dimensional space before +correcting the expression values of cells (Stuart et al. 2019).

+

Paper: Comprehensive Integration of Single-Cell Data
+Documentation: Seurat

+

To use Seurat, we will first create a Seurat object from the SpatialExperiment +object and add relevant metadata. The object also needs to be split by patient +prior to integration.

+
library(Seurat)
+library(SeuratObject)
+seurat_obj <- as.Seurat(spe, counts = "counts", data = "exprs")
+seurat_obj <- AddMetaData(seurat_obj, as.data.frame(colData(spe)))
+
+seurat.list <- SplitObject(seurat_obj, split.by = "patient_id")
+

To avoid long run times, we will use an approach that relies on reciprocal PCA +instead of canonical correlation analysis for dimensionality reduction and +initial alignment. For an extended tutorial on how to use Seurat for data +integration, please refer to their +vignette.

+

We will first define the features used for integration and perform PCA on cells +of each patient individually. The FindIntegrationAnchors function detects MNNs between +cells of different patients and the IntegrateData function corrects the +expression values of cells. We slightly increase the number of neighbors to be +considered for MNN detection (the k.anchor parameter). This increases the integration +strength.

+
features <- rownames(spe)[rowData(spe)$use_channel]
+seurat.list <- lapply(X = seurat.list, FUN = function(x) {
+    x <- ScaleData(x, features = features, verbose = FALSE)
+    x <- RunPCA(x, features = features, verbose = FALSE, approx = FALSE)
+    return(x)
+})
+
+anchors <- FindIntegrationAnchors(object.list = seurat.list, 
+                                  anchor.features = features,
+                                  reduction = "rpca", 
+                                  k.anchor = 20)
+
+combined <- IntegrateData(anchorset = anchors)
+

We now select the integrated assay and perform PCA dimensionality reduction. +The cell coordinates in PCA reduced space can then be transferred to the +original SpatialExperiment object. Of note: by splitting the object into +individual batch-specific objects, the ordering of cells in the integrated +object might not match the ordering of cells in the input object. In this case, +columns will need to be reordered. Here, we test if the ordering of cells in the +integrated Seurat object matches the ordering of cells in the main +SpatialExperiment object.

+
DefaultAssay(combined) <- "integrated"
+
+combined <- ScaleData(combined, verbose = FALSE)
+combined <- RunPCA(combined, npcs = 30, verbose = FALSE, approx = FALSE)
+
+# Check that order of cells is the same
+stopifnot(all.equal(colnames(spe), colnames(combined)))
+
+reducedDim(spe, "seurat") <- Embeddings(combined, reduction = "pca")
+

The computational time of the Seurat function calls is +4.29 minutes.
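If the order check fails, the rows of the embedding can be reordered by cell name before they are transferred. The following base R sketch shows the idea on made-up data; the cell names and the `embedding` matrix are hypothetical stand-ins for `colnames(spe)` and the PCA coordinates of the integrated object.

```r
# Hypothetical cell names: `original` as in the main object, `integrated`
# as returned after splitting by patient and merging again
original <- c("P1_cell1", "P2_cell1", "P1_cell2", "P2_cell2")
integrated <- c("P1_cell1", "P1_cell2", "P2_cell1", "P2_cell2")

# Toy embedding (2 PCs) whose rows follow the integrated order
embedding <- matrix(seq_len(8), ncol = 2,
                    dimnames = list(integrated, c("PC_1", "PC_2")))

# Both objects must contain exactly the same, uniquely named cells
stopifnot(setequal(original, integrated), !anyDuplicated(original))

# Reorder the rows so that they follow the original cell order
embedding <- embedding[match(original, rownames(embedding)), , drop = FALSE]

# The embedding can now be assigned safely
stopifnot(identical(rownames(embedding), original))
```

Matching by name rather than by position only works if cell names are unique and unchanged by the integration, which is why both conditions are checked before reordering.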

+
+

8.3.1 Visualization

+

As above, we recompute the UMAP embeddings based on Seurat integrated results +and visualize the embedding.

+
set.seed(220228)
+spe <- runUMAP(spe, dimred = "seurat", name = "UMAP_seurat") 
+

Visualize patient IDs.

+
# visualize patient id 
+p1 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP before correction")
+p2 <- dittoDimPlot(spe, var = "patient_id", 
+                   reduction.use = "UMAP_seurat", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP after correction")
+
+plot_grid(p1, p2)
+

+

Visualization of marker expression.

+
# Before correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+
# After correction
+plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_seurat", 
+                   assay = "exprs", size = 0.2, list.out = TRUE) 
+plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
+plot_grid(plotlist = plot_list) 
+

+

Similar to the methods presented above, Seurat integrates immune cells correctly. +When visualizing the patient IDs, slight patient-to-patient differences within tumor +cells can be detected.

+

Choosing the correct integration approach is challenging without having ground truth +cell labels available. It is recommended to compare different techniques and different +parameter settings. Please refer to the documentation of the individual tools +to become familiar with the possible parameter choices. Furthermore, in the following +section, we will discuss clustering and classification approaches in light of +expression differences between samples.

+

In general, it appears that MNN-based approaches are more conservative in terms +of merging compared to harmony. Conversely, harmony could well merge +cells in a way that regresses out biological signals.

+
+
+
+

8.4 Save objects

+

The modified SpatialExperiment object is saved for further downstream analysis.

+
saveRDS(spe, "data/spe.rds")
+
+
+

8.5 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             SeuratObject_4.1.4         
+##  [3] Seurat_4.4.0                BiocSingular_1.16.0        
+##  [5] harmony_1.0.1               Rcpp_1.0.11                
+##  [7] viridis_0.6.4               viridisLite_0.4.2          
+##  [9] dittoSeq_1.12.1             cowplot_1.1.1              
+## [11] scater_1.28.0               ggplot2_3.4.3              
+## [13] scuttle_1.10.2              SpatialExperiment_1.10.0   
+## [15] batchelor_1.16.0            SingleCellExperiment_1.22.0
+## [17] SummarizedExperiment_1.30.2 Biobase_2.60.0             
+## [19] GenomicRanges_1.52.0        GenomeInfoDb_1.36.3        
+## [21] IRanges_2.34.1              S4Vectors_0.38.2           
+## [23] BiocGenerics_0.46.0         MatrixGenerics_1.12.3      
+## [25] matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] RcppAnnoy_0.0.21          splines_4.3.1            
+##   [3] later_1.3.1               bitops_1.0-7             
+##   [5] tibble_3.2.1              R.oo_1.25.0              
+##   [7] polyclip_1.10-6           lifecycle_1.0.3          
+##   [9] rprojroot_2.0.3           edgeR_3.42.4             
+##  [11] globals_0.16.2            lattice_0.21-8           
+##  [13] MASS_7.3-60               magrittr_2.0.3           
+##  [15] limma_3.56.2              plotly_4.10.2            
+##  [17] sass_0.4.7                rmarkdown_2.25           
+##  [19] jquerylib_0.1.4           yaml_2.3.7               
+##  [21] httpuv_1.6.11             sctransform_0.4.0        
+##  [23] spatstat.sparse_3.0-2     sp_2.0-0                 
+##  [25] reticulate_1.32.0         pbapply_1.7-2            
+##  [27] RColorBrewer_1.1-3        ResidualMatrix_1.10.0    
+##  [29] pkgload_1.3.3             abind_1.4-5              
+##  [31] zlibbioc_1.46.0           Rtsne_0.16               
+##  [33] purrr_1.0.2               R.utils_2.12.2           
+##  [35] RCurl_1.98-1.12           GenomeInfoDbData_1.2.10  
+##  [37] ggrepel_0.9.3             irlba_2.3.5.1            
+##  [39] spatstat.utils_3.0-3      listenv_0.9.0            
+##  [41] pheatmap_1.0.12           goftest_1.2-3            
+##  [43] spatstat.random_3.1-6     dqrng_0.3.1              
+##  [45] fitdistrplus_1.1-11       parallelly_1.36.0        
+##  [47] DelayedMatrixStats_1.22.6 leiden_0.4.3             
+##  [49] codetools_0.2-19          DropletUtils_1.20.0      
+##  [51] DelayedArray_0.26.7       tidyselect_1.2.0         
+##  [53] farver_2.1.1              ScaledMatrix_1.8.1       
+##  [55] spatstat.explore_3.2-3    jsonlite_1.8.7           
+##  [57] BiocNeighbors_1.18.0      ellipsis_0.3.2           
+##  [59] progressr_0.14.0          ggridges_0.5.4           
+##  [61] survival_3.5-5            tools_4.3.1              
+##  [63] ica_1.0-3                 glue_1.6.2               
+##  [65] gridExtra_2.3             xfun_0.40                
+##  [67] dplyr_1.1.3               HDF5Array_1.28.1         
+##  [69] withr_2.5.1               fastmap_1.1.1            
+##  [71] rhdf5filters_1.12.1       fansi_1.0.4              
+##  [73] digest_0.6.33             rsvd_1.0.5               
+##  [75] R6_2.5.1                  mime_0.12                
+##  [77] colorspace_2.1-0          scattermore_1.2          
+##  [79] tensor_1.5                spatstat.data_3.0-1      
+##  [81] R.methodsS3_1.8.2         RhpcBLASctl_0.23-42      
+##  [83] utf8_1.2.3                tidyr_1.3.0              
+##  [85] generics_0.1.3            data.table_1.14.8        
+##  [87] httr_1.4.7                htmlwidgets_1.6.2        
+##  [89] S4Arrays_1.0.6            uwot_0.1.16              
+##  [91] pkgconfig_2.0.3           gtable_0.3.4             
+##  [93] lmtest_0.9-40             XVector_0.40.0           
+##  [95] brio_1.1.3                htmltools_0.5.6          
+##  [97] bookdown_0.35             scales_1.2.1             
+##  [99] png_0.1-8                 knitr_1.44               
+## [101] rstudioapi_0.15.0         reshape2_1.4.4           
+## [103] rjson_0.2.21              nlme_3.1-162             
+## [105] cachem_1.0.8              zoo_1.8-12               
+## [107] rhdf5_2.44.0              stringr_1.5.0            
+## [109] KernSmooth_2.23-21        parallel_4.3.1           
+## [111] miniUI_0.1.1.1            vipor_0.4.5              
+## [113] desc_1.4.2                pillar_1.9.0             
+## [115] grid_4.3.1                vctrs_0.6.3              
+## [117] RANN_2.6.1                promises_1.2.1           
+## [119] beachmat_2.16.0           xtable_1.8-4             
+## [121] cluster_2.1.4             waldo_0.5.1              
+## [123] beeswarm_0.4.0            evaluate_0.21            
+## [125] magick_2.8.0              cli_3.6.1                
+## [127] locfit_1.5-9.8            compiler_4.3.1           
+## [129] rlang_1.1.1               crayon_1.5.2             
+## [131] future.apply_1.11.0       labeling_0.4.3           
+## [133] plyr_1.8.8                ggbeeswarm_0.7.2         
+## [135] stringi_1.7.12            deldir_1.0-9             
+## [137] BiocParallel_1.34.2       munsell_0.5.0            
+## [139] lazyeval_0.2.2            spatstat.geom_3.2-5      
+## [141] Matrix_1.6-1.1            patchwork_1.1.3          
+## [143] sparseMatrixStats_1.12.2  future_1.33.0            
+## [145] Rhdf5lib_1.22.1           shiny_1.7.5              
+## [147] ROCR_1.0-11               igraph_1.5.1             
+## [149] bslib_0.5.1
+
+ +
+
+

References

+
+
+Haghverdi, Laleh, Aaron T. L. Lun, Michael D. Morgan, and John C. Marioni. 2018. “Batch Effects in Single-Cell RNA-Sequencing Data Are Corrected by Matching Mutual Nearest Neighbors.” Nature Biotechnology 36: 421–27. +
+
+Korsunsky, Ilya, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael Brenner, Po-ru Loh, and Soumya Raychaudhuri. 2019. “Fast, Sensitive and Accurate Integration of Single-Cell Data with Harmony.” Nature Methods 16: 1289–96. +
+
+Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M. III Mauck, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177: 1888–1902. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/cell-phenotyping.html b/cell-phenotyping.html new file mode 100644 index 00000000..7eace332 --- /dev/null +++ b/cell-phenotyping.html @@ -0,0 +1,1547 @@ + + + + + + + 9 Cell phenotyping | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

9 Cell phenotyping

+

A common step during single-cell data analysis is the annotation of cells based +on their phenotype. Defining cell phenotypes is often subjective and relies +on previous biological knowledge. The Orchestrating Single Cell Analysis with Bioconductor book +presents a number of approaches to phenotype cells detected by single-cell RNA +sequencing based on reference datasets or gene set analysis.

+

In highly multiplexed imaging, target proteins or molecules are manually selected based on the biological question at hand. This narrows down the feature space and facilitates the manual annotation of clusters to derive cell phenotypes. We will therefore discuss and compare a number of clustering approaches to group cells based on their similarity in marker expression in Section 9.2.

+

Unlike single-cell RNA sequencing or CyTOF data, single-cell data derived from highly multiplexed imaging often suffer from “lateral spillover” between neighboring cells. This spillover, caused by imperfect segmentation, often hinders accurate clustering to define specific cell phenotypes in multiplexed imaging data. Tools have been developed to correct lateral spillover between cells (Bai et al. 2021) but the approach requires careful selection of the markers to correct. In Section 9.3 we will train and apply a random forest classifier to classify cell phenotypes in the dataset as an alternative approach to clustering-based cell phenotyping. This approach has been previously used to identify major cell phenotypes in metastatic melanoma and avoids clustering of cells (Hoch et al. 2022).

+
+

9.1 Load data

+

We will first read in the previously generated SpatialExperiment object and +sample 2000 cells to visualize cluster membership.

+
library(SpatialExperiment)
+spe <- readRDS("data/spe.rds")
+
+# Sample cells
+set.seed(220619)
+cur_cells <- sample(seq_len(ncol(spe)), 2000)
+
+
+

9.2 Clustering approaches

+

In the first section, we will present clustering approaches to identify cellular phenotypes in the dataset. These methods group cells based on their similarity in marker expression or by their proximity in low-dimensional space. A number of approaches have been developed to cluster data derived from single-cell RNA sequencing technologies (Yu et al. 2022) or CyTOF (Weber and Robinson 2016). For demonstration purposes, we will highlight common clustering approaches that are available in R and have been used for clustering cells obtained from IMC. Two approaches rely on graph-based clustering and one approach uses self-organizing maps (SOM).

+
+

9.2.1 Rphenograph

+

The PhenoGraph clustering approach was first described to group cells of a CyTOF dataset (Levine et al. 2015). The algorithm first constructs a graph by detecting the k nearest neighbors based on Euclidean distance in expression space. In the next step, edges between nodes (cells) are weighted by their overlap in nearest neighbor sets. To quantify the overlap in shared nearest neighbor sets, the Jaccard index is used. The Louvain modularity optimization approach is then used to detect connected communities and partition the graph into clusters of cells. This clustering strategy was used by Jackson, Fischer et al. and Schulz et al. to cluster IMC data (Jackson et al. 2020; Schulz et al. 2018).
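The edge-weighting step can be illustrated on toy data. The following is a minimal base R sketch, not the Rphenograph implementation: the matrix values, `k = 2` and the helper `jaccard()` are assumptions made for illustration, and a cell is not counted as its own neighbor here.

```r
# Toy data: 6 cells x 2 markers (made-up values, two obvious groups)
mat <- matrix(c(0.0, 0.0,
                0.1, 0.0,
                0.0, 0.1,
                5.0, 5.0,
                5.1, 5.0,
                5.0, 5.1),
              ncol = 2, byrow = TRUE)
k <- 2

# Euclidean distances between all pairs of cells
d <- as.matrix(dist(mat))
diag(d) <- Inf  # exclude each cell from its own neighbor set

# Indices of the k nearest neighbors of each cell (one row per cell)
knn <- t(apply(d, 1, function(x) order(x)[seq_len(k)]))

# Hypothetical helper: Jaccard index between the neighbor sets of cells i and j
jaccard <- function(i, j) {
    length(intersect(knn[i, ], knn[j, ])) / length(union(knn[i, ], knn[j, ]))
}

jaccard(1, 2)  # cells from the same group share part of their neighbors
jaccard(1, 4)  # cells from different groups share none
```

Cells with a large overlap in their neighbor sets receive a high edge weight, which is what community detection then exploits to group them.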

+

There are several different PhenoGraph implementations available in R. Here, we +use the one available at +https://github.com/i-cyto/Rphenograph. +For large datasets, +https://github.com/stuchly/Rphenoannoy +offers a more performant implementation of the algorithm.

+

In the following code chunk, we select the asinh-transformed mean pixel intensities per cell and channel and subset the channels to the ones containing biological variation. This matrix is transposed to store cells in rows. Within the Rphenograph function, we select the 45 nearest neighbors for graph building and Louvain community detection (default). The function returns a list of length 2, the first entry being the graph and the second entry containing the community object. Calling membership on the community object will return cluster IDs for each cell. These cluster IDs are then stored within the colData of the SpatialExperiment object. Cluster IDs are mapped on top of the UMAP embedding and single-cell marker expression within each cluster is visualized in the form of a heatmap.

+

It is recommended to test different inputs to k as shown in the next section. +Selecting larger values for k results in larger clusters.

+
library(Rphenograph)
+library(igraph)
+library(dittoSeq)
+library(viridis)
+
+mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,])
+
+set.seed(230619)
+out <- Rphenograph(mat, k = 45)
+
+clusters <- factor(membership(out[[2]]))
+
+spe$pg_clusters <- clusters
+
+dittoDimPlot(spe, var = "pg_clusters", 
+             reduction.use = "UMAP", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("Phenograph clusters on UMAP")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("pg_clusters", "patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters))],
+                              metadata(spe)$color_vectors$patient_id))
+

+

The Rphenograph function call took +2.31 minutes.

+

We can observe that some of the clusters only contain cells of a single patient. +This can often be observed in the tumor compartment. In the next step, we +use the integrated cells (see Section 8) in low dimensional +embedding for clustering. Here, the low dimensional embedding can +be directly accessed from the reducedDim slot.

+
mat <- reducedDim(spe, "fastMNN")
+
+set.seed(230619)
+out <- Rphenograph(mat, k = 45)
+
+clusters <- factor(membership(out[[2]]))
+
+spe$pg_clusters_corrected <- clusters
+
+dittoDimPlot(spe, var = "pg_clusters_corrected", 
+             reduction.use = "UMAP_mnnCorrected", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("Phenograph clusters on UMAP, integrated cells")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("pg_clusters_corrected","patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters_corrected))],
+                              metadata(spe)$color_vectors$patient_id))
+

+

Clustering using the integrated embedding leads to clusters that contain cells +of different patients. Cluster annotation can now be performed by manually +labeling cells based on their marker expression (see Notes in Section +9.2.5).

+
+
+

9.2.2 Shared nearest neighbour graph

+

The bluster package provides a simple interface to cluster cells using a number of different clustering approaches and different metrics to assess cluster stability.

+

For simplicity, we will focus on graph-based clustering as this is the most popular and a fast method for single-cell clustering. The bluster package provides functionalities to build k-nearest neighbor (KNN) graphs and their weighted version, shared nearest neighbor (SNN) graphs, where nodes represent cells. The user can choose the number of neighbors to consider (parameter k), the edge weighting method (parameter type) and the community detection function to use (parameter cluster.fun). As all parameters affect the clustering results, the bluster package provides the clusterSweep function to test a number of parameter settings in parallel. In the following code chunk, we select the asinh-transformed mean pixel intensities and subset the markers of interest. The resulting matrix is transposed to fit the requirements of the bluster package (cells in rows).

+

We test two different settings for k, two for type and fix the cluster.fun +to louvain as this is one of the most common approaches for community detection. +This function call is parallelized by setting the BPPARAM parameter.
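To see how many clusterings such a sweep computes, the parameter grid can be enumerated in base R. This is only an illustrative sketch: `param_grid` is a hypothetical name and the snippet does not call bluster.

```r
# All combinations of the swept parameters (values taken from the text)
param_grid <- expand.grid(k = c(10L, 20L),
                          type = c("rank", "jaccard"),
                          cluster.fun = "louvain",
                          stringsAsFactors = FALSE)

# One clustering is computed per row of the grid
nrow(param_grid)
param_grid
```

Each additional value for any parameter multiplies the number of clusterings, so runtime grows quickly with the size of the grid.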

+
library(bluster)
+library(BiocParallel)
+library(ggplot2)
+
+mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,])
+
+combinations <- clusterSweep(mat, 
+                             BLUSPARAM=SNNGraphParam(),
+                             k=c(10L, 20L), 
+                             type = c("rank", "jaccard"), 
+                             cluster.fun = "louvain",
+                             BPPARAM = MulticoreParam(RNGseed = 220427))
+

We next calculate two metrics to estimate cluster stability: the average +silhouette width and the neighborhood purity.

+

We use the approxSilhouette function to compute the silhouette width for each cell and compute the average across all cells per parameter setting. Please see ?silhouette for more information on how the silhouette width is computed for each cell. A large average silhouette width indicates a cluster parameter setting for which cells are well clustered.

+

The neighborPurity function computes the fraction of cells around each cell with the same cluster ID. Per parameter setting, we compute the average neighborhood purity across all cells. A large average neighborhood purity indicates a cluster parameter setting for which cells are well clustered.
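Both metrics can be computed by hand on toy data to build intuition. The sketch below uses base R only and makes simplifying assumptions: it computes the exact (not approximate) silhouette width, and the purity is an unweighted fraction over `k = 2` neighbors, which may differ from what bluster computes internally. All values and the helper `sil_width()` are made up for illustration.

```r
# Toy data: two well-separated groups of three cells each (made-up values)
mat <- rbind(c(0, 0), c(0.1, 0), c(0, 0.1),
             c(5, 5), c(5.1, 5), c(5, 5.1))
clusters <- c(1, 1, 1, 2, 2, 2)
d <- as.matrix(dist(mat))

# Exact silhouette width of cell i: (b - a) / max(a, b), where
# a = mean distance to the other cells of its own cluster and
# b = smallest mean distance to the cells of any other cluster.
# Clusters with a single cell are not handled in this sketch.
sil_width <- function(i) {
    own <- clusters == clusters[i]
    a <- mean(d[i, own & seq_along(clusters) != i])
    b <- min(tapply(d[i, !own], clusters[!own], mean))
    (b - a) / max(a, b)
}
sil <- vapply(seq_along(clusters), sil_width, numeric(1))

# Simplified purity: fraction of the k nearest neighbors sharing the cluster ID
k <- 2
purity <- vapply(seq_along(clusters), function(i) {
    nn <- order(d[i, ])[2:(k + 1)]  # position 1 is the cell itself
    mean(clusters[nn] == clusters[i])
}, numeric(1))

mean(sil)     # close to 1 for well-separated clusters
mean(purity)  # 1 when all neighbors share the cluster ID
```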

+
sil <- vapply(as.list(combinations$clusters), 
+              function(x) mean(approxSilhouette(mat, x)$width), 
+              0)
+
+ggplot(data.frame(method = names(sil),
+                  sil = sil)) +
+    geom_point(aes(method, sil)) +
+    theme_classic(base_size = 15) +
+    theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
+    xlab("Cluster parameter combination") +
+    ylab("Average silhouette width")
+

+
pur <- vapply(as.list(combinations$clusters), 
+              function(x) mean(neighborPurity(mat, x)$purity), 
+              0)
+
+ggplot(data.frame(method = names(pur),
+                  pur = pur)) +
+    geom_point(aes(method, pur)) +
+    theme_classic(base_size = 15) +
+    theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
+    xlab("Cluster parameter combination") +
+    ylab("Average neighborhood purity")
+

+

The cluster parameter sweep took +8.81 minutes.

+

Performing a cluster sweep takes some time because one clustering is computed per parameter combination, even though the individual function calls are run in parallel. We do however recommend testing a number of different parameter settings to assess clustering performance.

+

Once parameter settings are known, we can either use the clusterRows function +of the bluster package to cluster cells or its convenient wrapper function +exported by the +scran package. +The scran::clusterCells function accepts a SpatialExperiment (or +SingleCellExperiment) object which stores cells in columns. By default, the +function detects the 10 nearest neighbours for each cell, performs rank-based +weighting of edges (see ?makeSNNGraph for more information) and uses the +cluster_walktrap function to detect communities in the graph.

+

As we can see above, clustering this dataset with k set to 20 and rank-based edge weighting leads to the highest silhouette width and the highest neighborhood purity.

+
library(scran)
+
+set.seed(220620)
+clusters <- clusterCells(spe[rowData(spe)$use_channel,], 
+                         assay.type = "exprs", 
+                         BLUSPARAM = SNNGraphParam(k=20, 
+                                                  cluster.fun = "louvain",
+                                                  type = "rank"))
+
+spe$nn_clusters <- clusters
+
+dittoDimPlot(spe, var = "nn_clusters", 
+             reduction.use = "UMAP", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("SNN clusters on UMAP")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("nn_clusters", "patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters))],
+                              metadata(spe)$color_vectors$patient_id))
+

+

The shared nearest neighbor graph clustering approach took +1.31 minutes.

+

This function was used to cluster cells obtained by IMC (Tietscher et al. 2022). Setting type = "jaccard" performs clustering similar to Rphenograph above and Seurat.

+

Similar to the results obtained by Rphenograph, some of the clusters are +patient-specific. We can now perform clustering of the integrated cells +by directly specifying which low-dimensional embedding to use:

+
set.seed(220621)
+clusters <- clusterCells(spe, 
+                         use.dimred = "fastMNN", 
+                         BLUSPARAM = SNNGraphParam(k = 20, 
+                                        cluster.fun = "louvain",
+                                        type = "rank"))
+
+spe$nn_clusters_corrected <- clusters
+
+dittoDimPlot(spe, var = "nn_clusters_corrected", 
+             reduction.use = "UMAP_mnnCorrected", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("SNN clusters on UMAP, integrated cells")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("nn_clusters_corrected","patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters_corrected))],
+                              metadata(spe)$color_vectors$patient_id))
+

+
+
+

9.2.3 Self organizing maps

+

An alternative to graph-based clustering is offered by the CATALYST package. The cluster function internally uses the FlowSOM package to group cells into 100 (default) clusters based on self-organizing maps (SOM). In the next step, the ConsensusClusterPlus package is used to perform hierarchical consensus clustering of the previously detected 100 SOM nodes into 2 to maxK clusters. Cluster stability for each k can be assessed by calling delta_area(spe). The optimal number of clusters can be found by selecting the k at which a plateau is reached. In the example below, an optimal k lies somewhere around 13.

+
library(CATALYST)
+
+# Run FlowSOM and ConsensusClusterPlus clustering
+set.seed(220410)
+spe <- cluster(spe, 
+               features = rownames(spe)[rowData(spe)$use_channel],
+               maxK = 30)
+
+# Assess cluster stability
+delta_area(spe)
+

+
spe$som_clusters <- cluster_ids(spe, "meta13")
+
+dittoDimPlot(spe, var = "som_clusters", 
+             reduction.use = "UMAP", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("SOM clusters on UMAP")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("som_clusters", "patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters))],
+                              metadata(spe)$color_vectors$patient_id))
+

+

Running FlowSOM clustering took 0.22 minutes.

+

The CATALYST package does not provide functionality to perform FlowSOM and ConsensusClusterPlus clustering directly on the batch-corrected, integrated cells. As an alternative to the CATALYST package, the bluster package provides SOM clustering when specifying the SomParam() parameter. Similar to the CATALYST approach, we will first cluster the dataset into 100 clusters (also called “codes”). These codes are then further clustered into a maximum of 30 clusters using ConsensusClusterPlus (using hierarchical clustering and Euclidean distance). The delta area plot can be accessed using the (not exported) .plot_delta_area function from CATALYST. Here, it seems that the plateau is reached at a k of 16 and we will store the final cluster IDs within the SpatialExperiment object.

+
library(kohonen)
+library(ConsensusClusterPlus)
+
+# Select integrated cells
+mat <- reducedDim(spe, "fastMNN")
+
+# Perform SOM clustering
+set.seed(220410)
+som.out <- clusterRows(mat, SomParam(100), full = TRUE)
+
+# Cluster the 100 SOM codes into larger clusters
+ccp <- ConsensusClusterPlus(t(som.out$objects$som$codes[[1]]),
+                            maxK = 30,
+                            reps = 100, 
+                            distance = "euclidean", 
+                            seed = 220410, 
+                            plot = NULL)
+
# Visualize delta area plot
+CATALYST:::.plot_delta_area(ccp)
+

+
# Link ConsensusClusterPlus clusters with SOM codes and save in object
+som.cluster <- ccp[[16]][["consensusClass"]][som.out$clusters]
+spe$som_clusters_corrected <- as.factor(som.cluster)
+
+dittoDimPlot(spe, var = "som_clusters_corrected", 
+             reduction.use = "UMAP_mnnCorrected", size = 0.2,
+             do.label = TRUE) +
+    ggtitle("SOM clusters on UMAP, integrated cells")
+

+
dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("som_clusters_corrected","patient_id"),
+             annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters_corrected))],
+                              metadata(spe)$color_vectors$patient_id))
+

+

The FlowSOM clustering approach has been used to sub-cluster tumor cells as measured by IMC (Hoch et al. 2022).
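Linking consensus clusters back to individual cells, as done for the integrated cells above, is a plain lookup by index. The following toy sketch uses hypothetical vectors (`code_to_meta`, `cell_to_code`) with 5 codes and 6 cells instead of the 100 SOM codes of the real analysis.

```r
# Hypothetical example: 5 SOM codes merged into 2 consensus clusters
code_to_meta <- c(1, 1, 2, 2, 2)      # consensus cluster of codes 1 to 5
cell_to_code <- c(3, 1, 5, 2, 2, 4)   # SOM code assigned to each of 6 cells

# Indexing the code-level vector by the per-cell code
# returns the consensus cluster of each cell
cell_to_meta <- code_to_meta[cell_to_code]
cell_to_meta
```

If the per-cell assignment is stored as a factor, R indexes by its integer codes, so the lookup only gives the intended result when the factor levels are in the same order as the code-level vector.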

+
+
+

9.2.4 Compare between clustering approaches

+

Finally, we can compare the results of different clustering approaches. For +this, we visualize the number of cells that are shared between different +clustering results in a pairwise fashion. In the following heatmaps a high match +between clustering results can be seen for those clusters that are uniquely +detected in both approaches.

+

First, we will visualize the match between the three different approaches +applied to the asinh-transformed counts.

+
library(patchwork)
+library(pheatmap)
+library(gridExtra)
+
+tab1 <- table(paste("Rphenograph", spe$pg_clusters), 
+              paste("SNN", spe$nn_clusters))
+tab2 <- table(paste("Rphenograph", spe$pg_clusters), 
+              paste("SOM", spe$som_clusters))
+tab3 <- table(paste("SNN", spe$nn_clusters), 
+              paste("SOM", spe$som_clusters))
+
+pheatmap(log10(tab1 + 10), color = viridis(100))
+

+
pheatmap(log10(tab2 + 10), color = viridis(100))
+

+
pheatmap(log10(tab3 + 10), color = viridis(100))
+

+

We observe that Rphenograph and the shared nearest neighbor (SNN) approach by +scran show similar results (first heatmap above). For example, Rphenograph +cluster 20 (a tumor cluster) is perfectly captured by SNN cluster 12. On the +other hand, the Neutrophil cluster (SNN cluster 6) is split into Rphenograph +cluster 2 and Rphenograph cluster 6. A common approach +is to now merge clusters that contain similar cell types and annotate them by +hand (see below).
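The two situations described above, a perfect match and a split cluster, can be reproduced with a small contingency table. This is a toy sketch with made-up assignments; `a`, `b`, `frac` and `best` are hypothetical names.

```r
# Hypothetical cluster assignments of 8 cells from two approaches
a <- c(1, 1, 1, 2, 2, 3, 3, 3)
b <- c("x", "x", "x", "y", "z", "z", "z", "z")

# Number of cells shared between each pair of clusters
tab <- table(paste("A", a), paste("B", b))

# Fraction of each row cluster that falls into each column cluster
frac <- prop.table(tab, margin = 1)

# Best-matching column cluster per row cluster (ties resolve to the first)
best <- colnames(tab)[apply(tab, 1, which.max)]
names(best) <- rownames(tab)

tab
best
```

Here cluster `A 1` is fully captured by `B x`, whereas `A 2` is split evenly between `B y` and `B z`; inspecting the row-wise fractions makes such splits easier to spot than raw counts.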

+

Below, a comparison between the clustering results of the integrated cells +is shown.

+
tab1 <- table(paste("Rphenograph", spe$pg_clusters_corrected), 
+              paste("SNN", spe$nn_clusters_corrected))
+tab2 <- table(paste("Rphenograph", spe$pg_clusters_corrected), 
+              paste("SOM", spe$som_clusters_corrected))
+tab3 <- table(paste("SNN", spe$nn_clusters_corrected), 
+              paste("SOM", spe$som_clusters_corrected))
+
+pheatmap(log10(tab1 + 10), color = viridis(100))
+

+
pheatmap(log10(tab2 + 10), color = viridis(100))
+

+
pheatmap(log10(tab3 + 10), color = viridis(100))
+

+

In comparison to clustering on the non-integrated cells, the clustering results +of the integrated cells show higher overlap. The SNN approach resulted in fewer +clusters and therefore matches better with the SOM clustering approach.

+
+
+

9.2.5 Further clustering notes

+

The bluster package provides a number of metrics to assess cluster stability. For brevity, we only highlighted the use of the silhouette width and the neighborhood purity, but different metrics should be tested to assess cluster stability.

+

To assign cell types to clusters, we manually annotate clusters based on their marker expression. For example, SNN cluster 12 (clustering of the integrated cells) shows high, homogeneous expression of CD20 and we might therefore label this cluster as B cells. Chapter 10 will highlight single-cell visualization methods that can be helpful for manual cluster annotations.

+

An example of how to label clusters can be seen below:

+
library(dplyr)
+cluster_celltype <- recode(spe$nn_clusters_corrected,
+                            "1" = "Tumor_proliferating",
+                            "2" = "Myeloid",
+                            "3" = "Tumor",
+                            "4" = "Tumor",
+                            "5" = "Stroma",
+                            "6" = "Proliferating",
+                            "7" = "Myeloid",
+                            "8" = "Plasma_cell",
+                            "9" = "CD8",
+                            "10" = "CD4",
+                            "11" = "Neutrophil",
+                            "12" = "Bcell",
+                            "13" = "Stroma")
+
+spe$cluster_celltype <- cluster_celltype
+
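As a base R alternative for checking such a mapping, a named lookup vector makes it easy to spot clusters that were not assigned a label. This is a minimal sketch on made-up cluster IDs; `cluster_ids`, `lookup` and the fallback label are assumptions for illustration.

```r
# Hypothetical cluster IDs for 6 cells
cluster_ids <- factor(c(1, 2, 2, 3, 1, 4))

# Named lookup vector: names are cluster IDs, values are cell type labels
lookup <- c("1" = "Tumor", "2" = "Myeloid", "3" = "Bcell")

# Look up each cell's label via its cluster ID (as character)
labels <- unname(lookup[as.character(cluster_ids)])

# Clusters without an entry in the lookup vector yield NA
unmapped <- setdiff(levels(cluster_ids), names(lookup))
labels[is.na(labels)] <- "undefined"

unmapped
labels
```

Reporting `unmapped` before overwriting the missing labels helps to catch clusters that were forgotten after re-running the clustering with different parameters.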
+
+
+

9.3 Classification approach

+

In this section, we will highlight a cell type classification approach based on ground truth labeling and random forest classification. The rationale for this supervised cell phenotyping approach is to use the information contained in the pre-defined markers to detect cells of interest. This approach was used by Hoch et al. to classify cell types in a metastatic melanoma IMC dataset (Hoch et al. 2022).

+

The antibody panel used in the example data set mainly focuses on immune cell +types and little on tumor cell phenotypes. Therefore we will label the following +cell types:

+
    +
  • Tumor (E-cadherin positive)
  • Stroma (SMA, PDGFRb positive)
  • Plasma cells (CD38 positive)
  • Neutrophil (MPO, CD15 positive)
  • Myeloid cells (HLADR positive)
  • B cells (CD20 positive)
  • B next to T cells (CD20, CD3 positive)
  • Regulatory T cells (FOXP3 positive)
  • CD8+ T cells (CD3, CD8 positive)
  • CD4+ T cells (CD3, CD4 positive)
+

The “B next to T cell” phenotype (BnTcell) is commonly observed in immune +infiltrated regions measured by IMC. We include this phenotype to account for B +cell/T cell interactions where precise classification into B cells or T cells is +not possible. The exact gating scheme can be seen at +img/Gating_scheme.pdf.

+

As related approaches, Astir and +Garnett use pre-defined panel +information to classify cell phenotypes based on their marker expression.

+
+

9.3.1 Manual labeling of cells

+

The cytomapper +package provides the cytomapperShiny function that allows gating of cells +based on their marker expression and visualization of selected cells directly +on the images.

+
library(cytomapper)
+if (interactive()) {
+    
+    images <- readRDS("data/images.rds")
+    masks <- readRDS("data/masks.rds")
+    
+    cytomapperShiny(object = spe, mask = masks, image = images, 
+                    cell_id = "ObjectNumber", img_id = "sample_id")
+}
+

The labeled cells for this data set can be accessed at 10.5281/zenodo.6554544 and were downloaded in Section 4. Gating is performed per image and the cytomapperShiny function allows the export of gated cells in the form of a SingleCellExperiment or SpatialExperiment object. The cell label is stored in colData(object)$cytomapper_CellLabel and the gates are stored in metadata(object). In the next section, we will read in and consolidate the labeled data.

+
+
+

9.3.2 Define color vectors

+

For consistent visualization of cell types, we will now pre-define their colors:

+
celltype <- setNames(c("#3F1B03", "#F4AD31", "#894F36", "#1C750C", "#EF8ECC", 
+                       "#6471E2", "#4DB23B", "grey", "#F4800C", "#BF0A3D", "#066970"),
+                     c("Tumor", "Stroma", "Myeloid", "CD8", "Plasma_cell", 
+                       "Treg", "CD4", "undefined", "BnTcell", "Bcell", "Neutrophil"))
+
+metadata(spe)$color_vectors$celltype <- celltype
+
+
+

9.3.3 Read in and consolidate labeled data

+

Here, we will read in the individual SpatialExperiment objects containing the labeled cells and concatenate them. In the process of concatenating the SpatialExperiment objects along their columns, the sample_id entries are suffixed with .1, .2, .3, ... due to replicated entries.

+
library(SingleCellExperiment)
+label_files <- list.files("data/gated_cells", 
+                          full.names = TRUE, pattern = ".rds$")
+
+# Read in SPE objects
+spes <- lapply(label_files, readRDS)
+
+# Merge SPE objects
+concat_spe <- do.call("cbind", spes)
+

In the following code chunk we will identify cells that were labeled multiple +times. This occurs when different cell phenotypes are gated per image and can +affect immune cells that are located inside the tumor compartment.

+

We will first identify those cells that were uniquely labeled. In the next step, +we will identify those cells that were labeled twice AND were labeled as Tumor +cells. These cells will be assigned their immune cell label. Finally, we will +save the unique labels within the original SpatialExperiment object.

+

Of note: this concatenation strategy is specific for cell phenotypes contained in this +example dataset. The gated cell labels might need to be processed in a slightly +different way when working with other samples.

+

For these tasks, we will define a filter function:

+
filter_labels <- function(object, 
+                          label = "cytomapper_CellLabel") {
+    cur_tab <- unclass(table(colnames(object), object[[label]]))
+    
+    cur_labels <- colnames(cur_tab)[apply(cur_tab, 1, which.max)]
+    names(cur_labels) <- rownames(cur_tab)
+    
+    cur_labels <- cur_labels[rowSums(cur_tab) == 1]
+    
+    return(cur_labels)
+}
+

This function is first applied to all cells and then to non-tumor cells only.

+
labels <- filter_labels(concat_spe)
+
+cur_spe <- concat_spe[,concat_spe$cytomapper_CellLabel != "Tumor"]
+
+non_tumor_labels <- filter_labels(cur_spe)
+
+additional_cells <- setdiff(names(non_tumor_labels), names(labels))
+
+final_labels <- c(labels, non_tumor_labels[additional_cells])
+
+# Transfer labels to SPE object
+spe_labels <- rep("unlabeled", ncol(spe))
+names(spe_labels) <- colnames(spe)
+spe_labels[names(final_labels)] <- final_labels
+spe$cell_labels <- spe_labels
+
+# Number of cells labeled per patient
+table(spe$cell_labels, spe$patient_id)
+
##              
+##               Patient1 Patient2 Patient3 Patient4
+##   Bcell            152      131      234      263
+##   BnTcell          396       37      240     1029
+##   CD4               45      342      167      134
+##   CD8               60      497      137      128
+##   Myeloid          183      378      672      517
+##   Neutrophil        97        4       17       16
+##   Plasma_cell       34      536       87       59
+##   Stroma            84       37       85      236
+##   Treg             139      149       49       24
+##   Tumor           2342      906     1618     1133
+##   unlabeled       7214     9780     7826     9580
+
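The two-pass filtering above can be checked on a toy example. The following is a minimal base R sketch under stated assumptions: the cell identifiers and labels are hypothetical (not taken from the dataset), and `filter_unique()` is a stand-in written for this illustration that mirrors the tabulation in `filter_labels` but operates on plain vectors.

```r
# Toy gating result (all IDs and labels are hypothetical):
# "c2" was gated twice, once as Tumor and once as CD8;
# "c4" was gated twice with two conflicting immune labels.
cell_ids <- c("c1", "c2", "c2", "c3", "c4", "c4")
cell_lab <- c("Tumor", "Tumor", "CD8", "Bcell", "CD4", "Treg")

# Keep only cells that received exactly one label
filter_unique <- function(ids, labs) {
    cur_tab <- unclass(table(ids, labs))
    cur_labels <- colnames(cur_tab)[apply(cur_tab, 1, which.max)]
    names(cur_labels) <- rownames(cur_tab)
    cur_labels[rowSums(cur_tab) == 1]
}

# Pass 1: consider all labels
labels <- filter_unique(cell_ids, cell_lab)

# Pass 2: ignore Tumor labels, so that doubly gated cells keep
# their non-tumor label if it is unique
keep <- cell_lab != "Tumor"
non_tumor_labels <- filter_unique(cell_ids[keep], cell_lab[keep])

# Add cells that only became unique in pass 2
additional_cells <- setdiff(names(non_tumor_labels), names(labels))
final_labels <- c(labels, non_tumor_labels[additional_cells])
final_labels
```

In this sketch, "c2" is recovered with its CD8 label, while "c4" remains unlabeled because its two labels conflict even after Tumor labels are ignored.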

Based on these labels, we can now train a random forest classifier to classify +all remaining, unlabeled cells.

+
+
+

9.3.4 Train classifier

+

In this section, we will use the caret framework for machine learning in R. This package provides an interface to train a number of regression and classification models in a coherent fashion. We use a random forest classifier due to its low number of parameters, high speed and observed high performance for cell type classification (Hoch et al. 2022).

+

In the following section, we will first split the SpatialExperiment object +into labeled and unlabeled cells. Based on the labeled cells, we split +the data into a train (75% of the data) and test (25% of the data) dataset. +We currently do not provide an independently labeled validation dataset.
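The idea behind a class-stratified train/test split can be sketched in base R. This is only an illustration of the concept, not the implementation used by caret; the label vector is hypothetical and the rounding rule (`ceiling`) is an assumption made for this example.

```r
set.seed(42)

# Hypothetical labels with unbalanced classes
cell_labels <- factor(rep(c("Tumor", "CD8", "Bcell"), times = c(80, 12, 8)))

# Sample 75% of the cells within each class so that class
# proportions are preserved in the training data
train_idx <- unlist(lapply(split(seq_along(cell_labels), cell_labels),
                           function(idx) {
    idx[sample.int(length(idx), size = ceiling(0.75 * length(idx)))]
}), use.names = FALSE)

# All remaining cells form the test data
test_idx <- setdiff(seq_along(cell_labels), train_idx)

table(cell_labels[train_idx])
table(cell_labels[test_idx])
```

Sampling within each class keeps rare phenotypes represented in both the training and the test data, which a purely random split does not guarantee.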

+

The caret package provides the trainControl function, which specifies model training parameters, and the train function, which performs the actual model training. While training the model, we also want to estimate the best model parameters. In the case of the chosen random forest model (method = "rf"), we only need to estimate a single parameter (mtry), which corresponds to the number of variables randomly sampled as candidates at each split. To estimate the best parameter, we will perform a 5-fold cross validation (set within trainControl) over a tune length of 5 entries to mtry. In the following code chunk, the createDataPartition and the train functions are not deterministic, meaning they return different results across different runs. We therefore set a seed here for both functions.

+
library(caret)
+
+# Split between labeled and unlabeled cells
+lab_spe <- spe[,spe$cell_labels != "unlabeled"]
+unlab_spe <- spe[,spe$cell_labels == "unlabeled"]
+
+# Randomly split into train and test data
+set.seed(221029)
+trainIndex <- createDataPartition(factor(lab_spe$cell_labels), p = 0.75)
+
+train_spe <- lab_spe[,trainIndex$Resample1]
+test_spe <- lab_spe[,-trainIndex$Resample1]
+
+# Define fit parameters for 5-fold cross validation
+fitControl <- trainControl(method = "cv",
+                           number = 5)
+
+# Select the arsinh-transformed counts for training
+cur_mat <- t(assay(train_spe, "exprs")[rowData(train_spe)$use_channel,])
+
+# Train a random forest classifier
+rffit <- train(x = cur_mat, 
+               y = factor(train_spe$cell_labels),
+               method = "rf", ntree = 1000,
+               tuneLength = 5,
+               trControl = fitControl)
+
+rffit
+
## Random Forest 
+## 
+## 10049 samples
+##    37 predictor
+##    10 classes: 'Bcell', 'BnTcell', 'CD4', 'CD8', 'Myeloid', 'Neutrophil', 'Plasma_cell', 'Stroma', 'Treg', 'Tumor' 
+## 
+## No pre-processing
+## Resampling: Cross-Validated (5 fold) 
+## Summary of sample sizes: 8040, 8039, 8038, 8038, 8041 
+## Resampling results across tuning parameters:
+## 
+##   mtry  Accuracy   Kappa    
+##    2    0.9643726  0.9524051
+##   10    0.9780071  0.9707483
+##   19    0.9801973  0.9736577
+##   28    0.9787052  0.9716635
+##   37    0.9779095  0.9705890
+## 
+## Accuracy was used to select the optimal model using the largest value.
+## The final value used for the model was mtry = 19.
+

Training the classifier took +11.77 minutes.
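The five mtry values tested above (2, 10, 19, 28, 37) are consistent with an evenly spaced grid between 2 and the number of predictors. The sketch below reproduces these values in base R; that this mirrors how caret constructs its grid internally is an assumption, and the variable names are chosen for illustration only.

```r
n_predictors <- 37  # number of predictors reported in the model summary
tune_length <- 5    # value passed as tuneLength

# Evenly spaced candidate values between 2 and the number of predictors
mtry_grid <- unique(floor(seq(2, n_predictors, length.out = tune_length)))
mtry_grid
```

A larger tuneLength would test more candidate values at the cost of a proportionally longer training time.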

+
+
+

9.3.5 Classifier performance

+

We next observe the accuracy of the classifier when predicting cell phenotypes across the cross-validation and when applying the classifier to the test dataset.

+

First, we can visualize the classification accuracy during parameter +tuning:

+
ggplot(rffit) + 
+  geom_errorbar(data = rffit$results,
+                aes(ymin = Accuracy - AccuracySD,
+                    ymax = Accuracy + AccuracySD),
+                width = 0.4) +
+    theme_classic(base_size = 15)
+

+

The best value for mtry is 19 and is used when predicting new data.

+

It is often recommended to visualize the variable importance of the +classifier. The following plot specifies which variables (markers) are +most important for classifying the data.

+
plot(varImp(rffit))
+

+

As expected, the markers that were used for gating (Ecad, CD3, CD20, HLADR, +CD8a, CD38, FOXP3) were important for classification.

+

To assess the accuracy, sensitivity, specificity, among other quality measures of +the classifier, we will now predict cell phenotypes in the test data.

+
# Select the arsinh-transformed counts of the test data
+cur_mat <- t(assay(test_spe, "exprs")[rowData(test_spe)$use_channel,])
+
+# Predict the cell phenotype labels of the test data
+set.seed(231019)
+cur_pred <- predict(rffit, newdata = cur_mat)
+

While the overall classification accuracy can appear high, we also want +to check if each cell phenotype class is correctly predicted. +For this, we will calculate the confusion matrix between predicted and actual +cell labels. This measure may highlight individual cell phenotype classes that +were not correctly predicted by the classifier. When setting mode = "everything", +the confusionMatrix function returns all available prediction measures including +sensitivity, specificity, precision, recall and the F1 score per cell +phenotype class.

+
cm <- confusionMatrix(data = cur_pred, 
+                      reference = factor(test_spe$cell_labels), 
+                      mode = "everything")
+
+cm
+
## Confusion Matrix and Statistics
+## 
+##              Reference
+## Prediction    Bcell BnTcell  CD4  CD8 Myeloid Neutrophil Plasma_cell Stroma
+##   Bcell         186       2    0    0       0          0           6      0
+##   BnTcell         4     423    1    0       0          0           0      0
+##   CD4             0       0  163    0       0          2           3      2
+##   CD8             0       0    0  199       0          0           8      0
+##   Myeloid         0       0    2    1     437          0           0      0
+##   Neutrophil      0       0    0    0       0         30           0      0
+##   Plasma_cell     1       0    3    2       0          0         158      0
+##   Stroma          0       0    2    0       0          0           0    108
+##   Treg            0       0    0    0       0          0           3      0
+##   Tumor           4       0    1    3       0          1           1      0
+##              Reference
+## Prediction    Treg Tumor
+##   Bcell          0     1
+##   BnTcell        0     1
+##   CD4            0     5
+##   CD8            0     3
+##   Myeloid        0     0
+##   Neutrophil     0     0
+##   Plasma_cell    1     0
+##   Stroma         0     0
+##   Treg          89     2
+##   Tumor          0  1487
+## 
+## Overall Statistics
+##                                          
+##                Accuracy : 0.9806         
+##                  95% CI : (0.9753, 0.985)
+##     No Information Rate : 0.4481         
+##     P-Value [Acc > NIR] : < 2.2e-16      
+##                                          
+##                   Kappa : 0.9741         
+##                                          
+##  Mcnemar's Test P-Value : NA             
+## 
+## Statistics by Class:
+## 
+##                      Class: Bcell Class: BnTcell Class: CD4 Class: CD8
+## Sensitivity               0.95385         0.9953    0.94767    0.97073
+## Specificity               0.99714         0.9979    0.99622    0.99650
+## Pos Pred Value            0.95385         0.9860    0.93143    0.94762
+## Neg Pred Value            0.99714         0.9993    0.99716    0.99809
+## Precision                 0.95385         0.9860    0.93143    0.94762
+## Recall                    0.95385         0.9953    0.94767    0.97073
+## F1                        0.95385         0.9906    0.93948    0.95904
+## Prevalence                0.05830         0.1271    0.05142    0.06129
+## Detection Rate            0.05561         0.1265    0.04873    0.05949
+## Detection Prevalence      0.05830         0.1283    0.05232    0.06278
+## Balanced Accuracy         0.97549         0.9966    0.97195    0.98361
+##                      Class: Myeloid Class: Neutrophil Class: Plasma_cell
+## Sensitivity                  1.0000          0.909091            0.88268
+## Specificity                  0.9990          1.000000            0.99779
+## Pos Pred Value               0.9932          1.000000            0.95758
+## Neg Pred Value               1.0000          0.999095            0.99340
+## Precision                    0.9932          1.000000            0.95758
+## Recall                       1.0000          0.909091            0.88268
+## F1                           0.9966          0.952381            0.91860
+## Prevalence                   0.1306          0.009865            0.05351
+## Detection Rate               0.1306          0.008969            0.04723
+## Detection Prevalence         0.1315          0.008969            0.04933
+## Balanced Accuracy            0.9995          0.954545            0.94024
+##                      Class: Stroma Class: Treg Class: Tumor
+## Sensitivity                0.98182     0.98889       0.9920
+## Specificity                0.99938     0.99846       0.9946
+## Pos Pred Value             0.98182     0.94681       0.9933
+## Neg Pred Value             0.99938     0.99969       0.9935
+## Precision                  0.98182     0.94681       0.9933
+## Recall                     0.98182     0.98889       0.9920
+## F1                         0.98182     0.96739       0.9927
+## Prevalence                 0.03288     0.02691       0.4481
+## Detection Rate             0.03229     0.02661       0.4445
+## Detection Prevalence       0.03288     0.02810       0.4475
+## Balanced Accuracy          0.99060     0.99368       0.9933
+
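To make the per-class statistics concrete, the values for the Plasma_cell class can be recomputed by hand from the confusion matrix. The following base R sketch uses counts read from the matrix above; the total number of test cells (3345) is derived here as the 13394 labeled cells minus the 10049 training samples, and the variable names are chosen for illustration.

```r
# Counts for the Plasma_cell class, read from the confusion matrix
tp <- 158                  # predicted Plasma_cell, reference Plasma_cell
fn <- 6 + 3 + 8 + 3 + 1    # reference Plasma_cell, predicted as another class
fp <- 1 + 3 + 2 + 1        # predicted Plasma_cell, reference is another class
n_total <- 13394 - 10049   # labeled cells minus training cells = test cells
tn <- n_total - tp - fn - fp

sensitivity <- tp / (tp + fn)  # also called recall or true positive rate
specificity <- tn / (tn + fp)
precision   <- tp / (tp + fp)  # also called positive predictive value
f1 <- 2 * precision * sensitivity / (precision + sensitivity)

round(c(sensitivity = sensitivity, specificity = specificity,
        precision = precision, f1 = f1), 5)
```

These values agree with the Plasma_cell column of the "Statistics by Class" output (0.88268, 0.99779, 0.95758 and 0.91860).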

To easily visualize these results, we can now plot the true positive rate +(sensitivity) versus the false positive rate (1 - specificity). The size of the +point is determined by the number of true positives divided by the total number +of cells.

+
library(tidyverse)
+
+data.frame(cm$byClass) %>%
+  mutate(class = sub("Class: ", "", rownames(cm$byClass))) %>%
+  ggplot() + 
+  geom_point(aes(1 - Specificity, Sensitivity, 
+                 size = Detection.Rate,
+                 fill = class),
+             shape = 21) + 
+  scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +
+  theme_classic(base_size = 15) + 
+  ylab("Sensitivity (TPR)") +
+  xlab("1 - Specificity (FPR)")
+

+

We observe high sensitivity and specificity for most cell types. Plasma cells show the lowest true positive rate at 88%, which is still sufficiently high.

+

Finally, to observe which cell phenotypes were wrongly classified, we can visualize +the distribution of classification probabilities per cell phenotype class:

+
set.seed(231019)
+cur_pred <- predict(rffit, 
+                    newdata = cur_mat, 
+                    type = "prob")
+cur_pred$truth <- factor(test_spe$cell_labels)
+
+cur_pred %>%
+  pivot_longer(cols = Bcell:Tumor) %>%
+  ggplot() +
+  geom_boxplot(aes(x = name, y = value, fill = name), outlier.size = 0.5) +
+  facet_wrap(. ~ truth, ncol = 1) + 
+  scale_fill_manual(values = metadata(spe)$color_vectors$celltype)  +
+  theme(panel.background = element_blank(), 
+        axis.text.x = element_text(angle = 45, hjust = 1))
+

+

The boxplots indicate the classification probabilities per class. The classifier +is well trained if classification probabilities are only high for the one +specific class.

+
+
+

9.3.6 Classification of new data

+

In the final section, we will now use the tuned and tested random forest +classifier to predict the cell phenotypes of the unlabeled data.

+

First, we predict the cell phenotypes and extract their classification +probabilities.

+
# Select the arsinh-transformed counts of the unlabeled data for prediction
+cur_mat <- t(assay(unlab_spe, "exprs")[rowData(unlab_spe)$use_channel,])
+
+# Predict the cell phenotype labels of the unlabeled data
+set.seed(231014)
+cell_class <- as.character(predict(rffit, 
+                                   newdata = cur_mat, 
+                                   type = "raw"))
+names(cell_class) <- rownames(cur_mat)
+
+table(cell_class)
+
## cell_class
+##       Bcell     BnTcell         CD4         CD8     Myeloid  Neutrophil 
+##         817         979        3620        2716        6302         559 
+## Plasma_cell      Stroma        Treg       Tumor 
+##        2692        4904        1170       10641
+
# Extract prediction probabilities for each cell
+set.seed(231014)
+cell_prob <- predict(rffit, 
+                     newdata = cur_mat, 
+                     type = "prob")
+

Each cell is assigned to the class with the highest probability. There are, however, cases where the highest probability is low, meaning the cell cannot be uniquely assigned to a class. We next want to identify these cells and label them as “undefined”. Here, we select a maximum classification probability threshold of 40%, but this threshold needs to be adjusted for other datasets. The adjusted cell labels are then stored in the SpatialExperiment object.
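The thresholding step can be illustrated on a small probability matrix. This is a minimal base R sketch with hypothetical values; the object names mimic those used in this section but the data are made up for illustration.

```r
# Hypothetical class probabilities for three cells
cell_prob <- matrix(c(0.90, 0.05, 0.05,
                      0.35, 0.33, 0.32,
                      0.10, 0.60, 0.30),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("cell1", "cell2", "cell3"),
                                    c("Bcell", "CD8", "Tumor")))

# Class with the highest probability per cell
cell_class <- colnames(cell_prob)[max.col(cell_prob, ties.method = "first")]
names(cell_class) <- rownames(cell_prob)

# Cells whose highest probability falls below the threshold are "undefined"
max_prob <- apply(cell_prob, 1, max)
cell_class[max_prob < 0.4] <- "undefined"
cell_class
```

Here, the second cell has nearly equal probabilities for all three classes, so its maximum (0.35) falls below the threshold and it is labeled "undefined".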

+
library(ggridges)
+
+# Distribution of maximum probabilities
+tibble(max_prob = rowMax(as.matrix(cell_prob)),
+       type = cell_class) %>%
+    ggplot() +
+        geom_density_ridges(aes(x = max_prob, y = type, fill = type)) +
+        scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +
+        theme_classic(base_size = 15) +
+        xlab("Maximum probability") +
+        ylab("Cell type") + 
+        xlim(c(0,1.2))
+
## Picking joint bandwidth of 0.0238
+

+
# Label undefined cells
+cell_class[rowMax(as.matrix(cell_prob)) < 0.4] <- "undefined"
+
+# Store labels in SpatialExperiment object
+cell_labels <- spe$cell_labels
+cell_labels[colnames(unlab_spe)] <- cell_class
+spe$celltype <- cell_labels 
+
+table(spe$celltype, spe$patient_id)
+
##              
+##               Patient1 Patient2 Patient3 Patient4
+##   Bcell            179      527      431      458
+##   BnTcell          416      586      594     1078
+##   CD4              391     1370      699     1385
+##   CD8              518     1365      479     1142
+##   Myeloid         1369     2197     1723     2731
+##   Neutrophil       348        9      148      176
+##   Plasma_cell      650     2122      351      274
+##   Stroma           633      676      736     3261
+##   Treg             553      409      243      310
+##   Tumor           5560     3334     5648     2083
+##   undefined        129      202       80      221
+

We can now compare the cell labels derived by classification to the different +clustering strategies. The first comparison is against the clustering results +using the asinh-transformed counts.

+
tab1 <- table(spe$celltype, 
+              paste("Rphenograph", spe$pg_clusters))
+tab2 <- table(spe$celltype, 
+              paste("SNN", spe$nn_clusters))
+tab3 <- table(spe$celltype, 
+              paste("SOM", spe$som_clusters))
+
+pheatmap(log10(tab1 + 10), color = viridis(100))
+

+
pheatmap(log10(tab2 + 10), color = viridis(100))
+

+
pheatmap(log10(tab3 + 10), color = viridis(100))
+

+

We can see that Tumor and Myeloid cells span multiple clusters while neutrophils are detected as an individual cluster by all clustering approaches.
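Beyond visual inspection of the heatmaps, the same contingency tables can be summarized numerically. The sketch below is one possible summary, assuming hypothetical cell type and cluster vectors: for each cluster it reports the dominant cell type and the fraction of cells carrying that label ("purity", a term introduced here for illustration).

```r
# Hypothetical cell type labels and cluster assignments for eight cells
celltype <- c("Tumor", "Tumor", "Tumor", "Myeloid",
              "Myeloid", "Neutrophil", "Neutrophil", "Tumor")
cluster  <- c("1", "1", "2", "2", "3", "4", "4", "2")

# Contingency table: cell types in rows, clusters in columns
tab <- table(celltype, cluster)

# Dominant cell type per cluster and its fraction within the cluster
dominant <- rownames(tab)[apply(tab, 2, which.max)]
purity <- apply(tab, 2, max) / colSums(tab)

data.frame(cluster = colnames(tab), dominant = dominant, purity = purity)
```

In this toy example, Tumor cells span clusters 1 and 2, while Neutrophils form their own cluster with a purity of 1.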

+

We next compare the cell classification against clustering results using the +integrated cells.

+
tab1 <- table(spe$celltype, 
+              paste("Rphenograph", spe$pg_clusters_corrected))
+tab2 <- table(spe$celltype, 
+              paste("SNN", spe$nn_clusters_corrected))
+tab3 <- table(spe$celltype, 
+              paste("SOM", spe$som_clusters_corrected))
+
+pheatmap(log10(tab1 + 10), color = viridis(100))
+

+
pheatmap(log10(tab2 + 10), color = viridis(100))
+

+
pheatmap(log10(tab3 + 10), color = viridis(100))
+

+

We observe a high agreement between the shared nearest neighbor clustering +approach using the integrated cells and the cell phenotypes derived by +classification.

+

In the next sections, we will highlight visualization strategies to verify the +correctness of the phenotyping approach. Specifically, Section +11.2.3 shows how to outline identified cell phenotypes on +composite images.

+

Finally, we save the updated SpatialExperiment object.

+
saveRDS(spe, "data/spe.rds")
+
+
+
+

9.4 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             ggridges_0.5.4             
+##  [3] lubridate_1.9.3             forcats_1.0.0              
+##  [5] stringr_1.5.0               purrr_1.0.2                
+##  [7] readr_2.1.4                 tidyr_1.3.0                
+##  [9] tibble_3.2.1                tidyverse_2.0.0            
+## [11] caret_6.0-94                lattice_0.21-8             
+## [13] cytomapper_1.12.0           EBImage_4.42.0             
+## [15] dplyr_1.1.3                 gridExtra_2.3              
+## [17] pheatmap_1.0.12             patchwork_1.1.3            
+## [19] ConsensusClusterPlus_1.64.0 kohonen_3.0.12             
+## [21] CATALYST_1.24.0             scran_1.28.2               
+## [23] scuttle_1.10.2              BiocParallel_1.34.2        
+## [25] bluster_1.10.0              viridis_0.6.4              
+## [27] viridisLite_0.4.2           dittoSeq_1.12.1            
+## [29] ggplot2_3.4.3               igraph_1.5.1               
+## [31] Rphenograph_0.99.1.9003     SpatialExperiment_1.10.0   
+## [33] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
+## [35] Biobase_2.60.0              GenomicRanges_1.52.0       
+## [37] GenomeInfoDb_1.36.3         IRanges_2.34.1             
+## [39] S4Vectors_0.38.2            BiocGenerics_0.46.0        
+## [41] MatrixGenerics_1.12.3       matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] bitops_1.0-7              RColorBrewer_1.1-3       
+##   [3] doParallel_1.0.17         tools_4.3.1              
+##   [5] backports_1.4.1           utf8_1.2.3               
+##   [7] R6_2.5.1                  HDF5Array_1.28.1         
+##   [9] rhdf5filters_1.12.1       GetoptLong_1.0.5         
+##  [11] withr_2.5.1               sp_2.0-0                 
+##  [13] cli_3.6.1                 sandwich_3.0-2           
+##  [15] labeling_0.4.3            sass_0.4.7               
+##  [17] nnls_1.5                  mvtnorm_1.2-3            
+##  [19] randomForest_4.7-1.1      proxy_0.4-27             
+##  [21] systemfonts_1.0.4         colorRamps_2.3.1         
+##  [23] svglite_2.1.1             R.utils_2.12.2           
+##  [25] scater_1.28.0             parallelly_1.36.0        
+##  [27] plotrix_3.8-2             limma_3.56.2             
+##  [29] flowCore_2.12.2           rstudioapi_0.15.0        
+##  [31] generics_0.1.3            shape_1.4.6              
+##  [33] gtools_3.9.4              car_3.1-2                
+##  [35] Matrix_1.6-1.1            RProtoBufLib_2.12.1      
+##  [37] waldo_0.5.1               ggbeeswarm_0.7.2         
+##  [39] fansi_1.0.4               abind_1.4-5              
+##  [41] R.methodsS3_1.8.2         terra_1.7-46             
+##  [43] lifecycle_1.0.3           multcomp_1.4-25          
+##  [45] yaml_2.3.7                edgeR_3.42.4             
+##  [47] carData_3.0-5             rhdf5_2.44.0             
+##  [49] recipes_1.0.8             Rtsne_0.16               
+##  [51] grid_4.3.1                promises_1.2.1           
+##  [53] dqrng_0.3.1               crayon_1.5.2             
+##  [55] shinydashboard_0.7.2      beachmat_2.16.0          
+##  [57] cowplot_1.1.1             magick_2.8.0             
+##  [59] pillar_1.9.0              knitr_1.44               
+##  [61] ComplexHeatmap_2.16.0     metapod_1.8.0            
+##  [63] rjson_0.2.21              future.apply_1.11.0      
+##  [65] codetools_0.2-19          glue_1.6.2               
+##  [67] data.table_1.14.8         vctrs_0.6.3              
+##  [69] png_0.1-8                 gtable_0.3.4             
+##  [71] cachem_1.0.8              gower_1.0.1              
+##  [73] xfun_0.40                 S4Arrays_1.0.6           
+##  [75] mime_0.12                 prodlim_2023.08.28       
+##  [77] DropletUtils_1.20.0       survival_3.5-5           
+##  [79] timeDate_4022.108         iterators_1.0.14         
+##  [81] cytolib_2.12.1            hardhat_1.3.0            
+##  [83] lava_1.7.2.1              statmod_1.5.0            
+##  [85] ellipsis_0.3.2            TH.data_1.1-2            
+##  [87] ipred_0.9-14              nlme_3.1-162             
+##  [89] rprojroot_2.0.3           bslib_0.5.1              
+##  [91] irlba_2.3.5.1             svgPanZoom_0.3.4         
+##  [93] vipor_0.4.5               rpart_4.1.19             
+##  [95] colorspace_2.1-0          raster_3.6-23            
+##  [97] nnet_7.3-19               tidyselect_1.2.0         
+##  [99] compiler_4.3.1            BiocNeighbors_1.18.0     
+## [101] desc_1.4.2                DelayedArray_0.26.7      
+## [103] bookdown_0.35             scales_1.2.1             
+## [105] tiff_0.1-11               digest_0.6.33            
+## [107] fftwtools_0.9-11          rmarkdown_2.25           
+## [109] XVector_0.40.0            htmltools_0.5.6          
+## [111] pkgconfig_2.0.3           jpeg_0.1-10              
+## [113] sparseMatrixStats_1.12.2  fastmap_1.1.1            
+## [115] rlang_1.1.1               GlobalOptions_0.1.2      
+## [117] htmlwidgets_1.6.2         shiny_1.7.5              
+## [119] DelayedMatrixStats_1.22.6 farver_2.1.1             
+## [121] jquerylib_0.1.4           zoo_1.8-12               
+## [123] jsonlite_1.8.7            ModelMetrics_1.2.2.2     
+## [125] R.oo_1.25.0               BiocSingular_1.16.0      
+## [127] RCurl_1.98-1.12           magrittr_2.0.3           
+## [129] GenomeInfoDbData_1.2.10   Rhdf5lib_1.22.1          
+## [131] munsell_0.5.0             Rcpp_1.0.11              
+## [133] ggnewscale_0.4.9          pROC_1.18.4              
+## [135] stringi_1.7.12            brio_1.1.3               
+## [137] zlibbioc_1.46.0           MASS_7.3-60              
+## [139] plyr_1.8.8                listenv_0.9.0            
+## [141] parallel_4.3.1            ggrepel_0.9.3            
+## [143] splines_4.3.1             hms_1.1.3                
+## [145] circlize_0.4.15           locfit_1.5-9.8           
+## [147] ggpubr_0.6.0              ggsignif_0.6.4           
+## [149] pkgload_1.3.3             reshape2_1.4.4           
+## [151] ScaledMatrix_1.8.1        XML_3.99-0.14            
+## [153] drc_3.0-1                 evaluate_0.21            
+## [155] tzdb_0.4.0                foreach_1.5.2            
+## [157] tweenr_2.0.2              httpuv_1.6.11            
+## [159] RANN_2.6.1                polyclip_1.10-6          
+## [161] future_1.33.0             clue_0.3-65              
+## [163] ggforce_0.4.1             rsvd_1.0.5               
+## [165] broom_1.0.5               xtable_1.8-4             
+## [167] e1071_1.7-13              rstatix_0.7.2            
+## [169] later_1.3.1               class_7.3-22             
+## [171] FlowSOM_2.8.0             beeswarm_0.4.0           
+## [173] cluster_2.1.4             timechange_0.2.0         
+## [175] globals_0.16.2
+
+ +
+
+

References

+
+
+Bai, Yunhao, Bokai Zhu, Xavier Rovira-Clave, Han Chen, Maxim Markovic, Chi Ngai Chan, Tung-Hung Su, et al. 2021. “Adjacent Cell Marker Lateral Spillover Compensation and Reinforcement for Multiplexed Images.” Frontiers in Immunology 12. +
+
+Hoch, Tobias, Daniel Schulz, Nils Eling, Julia Martínez Gómez, Mitchell P. Levesque, and Bernd Bodenmiller. 2022. “Multiplexed Imaging Mass Cytometry of the Chemokine Milieus in Melanoma Characterizes Features of the Response to Immunotherapy.” Science Immunology 7 (70): eabk1692. +
+
+Jackson, Hartland W., Jana R. Fischer, Vito R. T. Zanotelli, H. Raza Ali, Robert Mechera, Savas D. Soysal, Holger Moch, et al. 2020. “The Single-Cell Pathology Landscape of Breast Cancer.” Nature 578: 615–20. +
+
+Levine, Jacob H., Erin F. Simonds, Sean C. Bendall, Kara L. Davis, El-ad D. Amir, Michelle D. Tadmor, Oren Litvin, et al. 2015. “Data-Driven Phenotypic Dissection of AML Reveals Progenitor-Like Cells That Correlate with Prognosis.” Cell 162: 184–97. +
+
+Schulz, Daniel, Vito RT Zanotelli, Jana R Fischer, Denis Schapiro, Stefanie Engler, Xiao-Kang Lun, Hartland W Jackson, and Bernd Bodenmiller. 2018. “Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry.” Cell Systems 6: 25–36.e5. +
+
+Tietscher, Sandra, Johanna Wagner, Tobias Anzeneder, Claus Langwieder, Martin Rees, Bettina Sobottka, Natalie de Souza, and Bernd Bodenmiller. 2022. “A Comprehensive Single-Cell Map of T Cell Exhaustion-Associated Immune Environments in Human Breast Cancer.” Research Square. +
+
+Weber, Lukas M., and Mark D. Robinson. 2016. “Comparison of Clustering Methods for High-Dimensional Single-Cell Flow and Mass Cytometry Data.” Cytometry Part A 89A: 1084–96. +
+
+Yu, Lijia, Yue Cao, Jean Y. H. Yang, and Pengyi Yang. 2022. “Benchmarking Clustering Algorithms on Estimating the Number of Cell Types from Single-Cell RNA-Sequencing Data.” Genome Biology 23 (1). https://doi.org/10.1186/s13059-022-02622-0. +
+
+
+ +
+
+
+ + +
+
diff --git a/image-and-cell-level-quality-control.html b/image-and-cell-level-quality-control.html
new file mode 100644
index 00000000..d72146af
--- /dev/null
+++ b/image-and-cell-level-quality-control.html
@@ -0,0 +1,1012 @@
+7 Image and cell-level quality control | Analysis workflow for IMC data
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

7 Image and cell-level quality control

+

The following section discusses possible quality indicators for data obtained +by IMC and other highly multiplexed imaging technologies. Here, we will focus +on describing quality metrics on the single-cell as well as image level.

+
+

7.1 Read in the data

+

We will first read in the data processed in previous sections:

+
images <- readRDS("data/images.rds")
+masks <- readRDS("data/masks.rds")
+spe <- readRDS("data/spe.rds")
+
+
+

7.2 Segmentation quality control

+

The first step after image segmentation is to observe its accuracy. Without having ground-truth data readily available, a common approach to segmentation quality control is to overlay segmentation masks on composite images displaying channels that were used for segmentation. The cytomapper package supports exactly this task via the plotPixels function.

+

Here, we select 3 random images and perform image- and channel-wise +normalization (channels are first min-max normalized and scaled to a range of +0-1 before clipping the maximum intensity to 0.2).
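The effect of the two normalization steps can be sketched on a single vector of pixel intensities. This is a conceptual base R illustration with hypothetical values; it assumes that clipping to 0.2 is followed by rescaling to the range 0-1 and is not the cytomapper implementation itself.

```r
# Hypothetical pixel intensities of one channel in one image
pixel <- c(0, 5, 10, 50, 100)

# Step 1: min-max normalization to the range 0-1
pixel_norm <- (pixel - min(pixel)) / (max(pixel) - min(pixel))

# Step 2: clip at 0.2 and rescale so that 0.2 maps to full intensity
clip_max <- 0.2
pixel_clip <- pmin(pixel_norm, clip_max) / clip_max

pixel_clip
```

Clipping saturates the brightest pixels, which makes dim signals visible that would otherwise be dominated by a few very bright pixels.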

+
library(cytomapper)
+set.seed(20220118)
+img_ids <- sample(seq_along(images), 3)
+
+# Normalize and clip images
+cur_images <- images[img_ids]
+cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE)
+cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2))
+
+plotPixels(cur_images,
+           mask = masks[img_ids],
+           img_id = "sample_id",
+           missing_colour = "white",
+           colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"),
+           colour = list(CD163 = c("black", "yellow"),
+                         CD20 = c("black", "red"),
+                         CD3 = c("black", "green"),
+                         Ecad = c("black", "cyan"),
+                         DNA1 = c("black", "blue")),
+           image_title = NULL,
+           legend = list(colour_by.title.cex = 0.7,
+                         colour_by.labels.cex = 0.7))
+

+

We can see that nuclei are centered within the segmentation masks and all cell types are correctly segmented (note: to zoom into the image you can right click and select Open Image in New Tab). A common challenge here is to segment large cells (e.g., epithelial cells, in cyan) as well as small cells (e.g., B cells, in red). However, the segmentation approach here appears to correctly segment cells across different sizes.

+

An easier and interactive way of observing segmentation quality is to use the +interactive image viewer provided by the +cytoviewer R/Bioconductor +package (Meyer, Eling, and Bodenmiller 2023). Under “Image-level” > “Basic controls”, up to six markers +can be selected for visualization. The contrast of each marker can be adjusted. +Under “Image-level” > “Advanced controls”, click the “Show cell outlines” box +to outline segmented cells on the images.

+
library(cytoviewer)
+
+app <- cytoviewer(image = images, 
+                  mask = masks, 
+                  cell_id = "ObjectNumber", 
+                  img_id = "sample_id")
+
+if (interactive()) {
+    shiny::runApp(app, launch.browser = TRUE)
+}
+

An additional approach to assess cell segmentation quality and potentially also +antibody specificity issues is to visualize single-cell expression in the form of a +heatmap. Here, we sub-sample the dataset to 2000 cells for visualization +purposes and overlay the cancer type from which the cells were extracted.

+
library(dittoSeq)
+library(viridis)
+cur_cells <- sample(seq_len(ncol(spe)), 2000)
+
+dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = "indication",
+             annotation_colors = list(indication = metadata(spe)$color_vectors$indication))
+

+

We can differentiate between epithelial cells (Ecad+) and immune cells +(CD45RO+). Some of the markers are detected in specific cells (e.g., Ki67, CD20, +Ecad) while others are more broadly expressed across cells (e.g., HLADR, B2M, +CD4).

+
+
+

7.3 Image-level quality control

+

Image-level quality control is often performed using tools that offer a +graphical user interface such as QuPath, +FIJI and the previously mentioned +cytoviewer package. Viewers +that were specifically developed for IMC data can be seen +here. In this +section, we will specifically focus on quantitative metrics to assess image +quality.

+

It is often of interest to calculate the signal-to-noise ratio (SNR) for +individual channels and markers. Here, we define the SNR as:

+

\[SNR = I_s/I_n\]

+

where \(I_s\) is the intensity of the signal (mean intensity of pixels with true +signal) and \(I_n\) is the intensity of the noise (mean intensity of pixels +containing noise). This definition of the SNR is just one of many and other +measures can be applied. Finding a threshold that separates pixels containing +signal from pixels containing noise is not trivial and different approaches can +be chosen. Here, we use the Otsu thresholding approach to find pixels of the +“foreground” (i.e., signal) and “background” (i.e., noise). The SNR is then +defined as the mean intensity of foreground pixels divided by the mean intensity +of background pixels. We compute this measure as well as the mean signal +intensity per image. The plot below shows the average SNR versus the average +signal intensity across all images.

+
library(tidyverse)
+library(ggrepel)
+library(EBImage)
+
+cur_snr <- lapply(names(images), function(x){
+    img <- images[[x]]
+    mat <- apply(img, 3, function(ch){
+        # Otsu threshold
+        thres <- otsu(ch, range = c(min(ch), max(ch)), levels = 65536)
+        # Signal-to-noise ratio
+        snr <- mean(ch[ch > thres]) / mean(ch[ch <= thres])
+        # Signal intensity
+        ps <- mean(ch[ch > thres])
+        
+        return(c(snr = snr, ps = ps))
+    })
+    t(mat) %>% as.data.frame() %>% 
+        mutate(image = x,
+               marker = colnames(mat)) %>% 
+        pivot_longer(cols = c(snr, ps))
+})
+
+cur_snr <- do.call(rbind, cur_snr)
+
+cur_snr %>% 
+    group_by(marker, name) %>%
+    summarize(log_mean = log2(mean(value))) %>%
+    pivot_wider(names_from = name, values_from = log_mean) %>%
+    ggplot() +
+    geom_point(aes(ps, snr)) +
+    geom_label_repel(aes(ps, snr, label = marker)) +
+    theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") +
+    xlab("Signal intensity [log2]")
+

+

We observe PD1, LAG3 and cleaved PARP to have high SNR but low signal intensity +meaning that in general these markers are not abundantly expressed. The Iridium +intercalator (here marked as DNA1 and DNA2) has the highest signal intensity +but low SNR. This might be due to staining differences between individual nuclei +where some nuclei are considered as background. We do however observe high +SNR and sufficient signal intensity for the majority of markers.
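
The foreground/background definition of the SNR can be traced on toy data. The following base R sketch uses a simplified histogram-based Otsu threshold; the function `otsu_threshold` and the pixel values are hypothetical and only stand in for the EBImage implementation, which may differ in binning details.

```r
# Histogram-based Otsu threshold: pick the split that maximizes the
# between-class variance (simplified stand-in, not EBImage::otsu itself)
otsu_threshold <- function(x, n_bins = 256) {
    breaks <- seq(min(x), max(x), length.out = n_bins + 1)
    h <- hist(x, breaks = breaks, plot = FALSE)
    p <- h$counts / sum(h$counts)     # probability per bin
    idx <- seq_len(n_bins - 1)        # candidate splits after bin idx
    w0 <- cumsum(p)[idx]              # weight of the background class
    w1 <- 1 - w0                      # weight of the foreground class
    m0 <- cumsum(p * h$mids)[idx]     # cumulative first moment
    m_total <- sum(p * h$mids)        # overall mean
    sigma_b <- (m_total * w0 - m0)^2 / (w0 * w1)
    breaks[which.max(sigma_b) + 1]    # upper edge of the best bin
}

# Toy channel: 90 background pixels (mean 1), 30 signal pixels (mean 20)
px <- c(rep(c(0, 1, 2), 30), rep(c(19, 20, 21), 10))

thres <- otsu_threshold(px)
ps <- mean(px[px > thres])            # mean signal intensity
snr <- ps / mean(px[px <= thres])     # signal-to-noise ratio
```

For this well-separated toy channel the threshold falls between the two pixel populations, so the SNR is simply the ratio of their means.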

+

Otsu thresholding and SNR calculation do not perform well for markers of +low abundance. In the next code chunk, we will exclude, per image, markers +whose positive signal does not exceed 2.

+
cur_snr <- cur_snr %>% 
+    pivot_wider(names_from = name, values_from = value) %>%
+    filter(ps > 2) %>%
+    pivot_longer(cols = c(snr, ps))
+
+cur_snr %>% 
+    group_by(marker, name) %>%
+    summarize(log_mean = log2(mean(value))) %>%
+    pivot_wider(names_from = name, values_from = log_mean) %>%
+    ggplot() +
+    geom_point(aes(ps, snr)) +
+    geom_label_repel(aes(ps, snr, label = marker)) +
+    theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") +
+    xlab("Signal intensity [log2]")
+

+

This visualization shows a reduced SNR for PD1, LAG3 and cleaved PARP, which was +previously inflated due to low signal.

+

Another quality indicator is the image area covered by cells (or biological +tissue). This metric identifies ROIs where few cells are present, possibly +hinting at incorrect selection of the ROI. We can compute the fraction of +covered image area using the metadata contained in the SpatialExperiment +object:

+
cell_density <- colData(spe) %>%
+    as.data.frame() %>%
+    group_by(sample_id) %>%
+    # Compute the number of pixels covered by cells and 
+    # the total number of pixels
+    summarize(cell_area = sum(area),
+              no_pixels = mean(width_px) * mean(height_px)) %>%
+    # Divide the number of pixels covered by cells
+    # by the total number of pixels
+    mutate(covered_area = cell_area / no_pixels)
+
+# Visualize the image area covered by cells per image
+ggplot(cell_density) +
+        geom_point(aes(reorder(sample_id,covered_area), covered_area)) + 
+        theme_minimal(base_size = 15) +
+        theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 15)) +
+        ylim(c(0, 1)) +
+        ylab("Fraction of covered area") + xlab("")
+
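
The covered-area computation can be checked by hand on a small table. This base R sketch uses a hypothetical data frame with the same column names as above (`sample_id`, `area`, `width_px`, `height_px`); the values are invented for illustration.

```r
# Hypothetical per-cell metadata for two images
cells <- data.frame(
    sample_id = c("A", "A", "B"),
    area      = c(50, 150, 40),   # cell area in pixels
    width_px  = c(20, 20, 10),    # image width, repeated per cell
    height_px = c(20, 20, 20)     # image height, repeated per cell
)

# Pixels covered by cells per image
cell_area <- tapply(cells$area, cells$sample_id, sum)
# Total number of pixels per image
no_pixels <- tapply(cells$width_px * cells$height_px, cells$sample_id, mean)

# Fraction of the image area covered by cells
covered_area <- cell_area / no_pixels
```

Image A has 200 of 400 pixels covered (0.5), whereas image B has 40 of 200 pixels covered (0.2).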

+

We observe that two of the 14 images show unusually low cell coverage. These +two images can now be visualized using cytomapper.

+
# Normalize and clip images
+cur_images <- images[c("Patient4_005", "Patient4_007")]
+cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE)
+cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2))
+
+plotPixels(cur_images,
+           mask = masks[c("Patient4_005", "Patient4_007")],
+           img_id = "sample_id",
+           missing_colour = "white",
+           colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"),
+           colour = list(CD163 = c("black", "yellow"),
+                         CD20 = c("black", "red"),
+                         CD3 = c("black", "green"),
+                         Ecad = c("black", "cyan"),
+                         DNA1 = c("black", "blue")),
+           legend = list(colour_by.title.cex = 0.7,
+                         colour_by.labels.cex = 0.7))
+

+

These two images display less dense tissue structure but overall the images are +intact and appear to be segmented correctly.

+

Finally, it can be beneficial to visualize the mean marker expression per image +to identify images with outlying marker expression. This check does not +indicate image quality per se but can highlight biological differences. Here, +we will use the aggregateAcrossCells function of the +scuttle package to compute the mean expression per +image. For visualization purposes, we again asinh transform the mean expression +values.
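
What the aggregation computes can be sketched in base R on a toy count matrix. The marker names, cell names and values below are hypothetical, and the sketch only mimics the per-image mean followed by an asinh transformation; it does not reproduce the SingleCellExperiment output of aggregateAcrossCells.

```r
# Toy count matrix: markers in rows, cells in columns
counts <- matrix(c(0,  2,  4,  6,
                   10, 10, 20, 20),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("CD20", "Ecad"), paste0("cell", 1:4)))
sample_id <- c("img1", "img1", "img2", "img2")

# Mean count per marker and image
image_mean <- sapply(split(seq_len(ncol(counts)), sample_id),
                     function(idx) rowMeans(counts[, idx, drop = FALSE]))

# asinh transformation for visualization
image_exprs <- asinh(image_mean)
```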

+
library(scuttle)
+
+image_mean <- aggregateAcrossCells(spe, 
+                                   ids = spe$sample_id, 
+                                   statistics="mean",
+                                   use.assay.type = "counts")
+assay(image_mean, "exprs") <- asinh(counts(image_mean))
+
+dittoHeatmap(image_mean, genes = rownames(spe)[rowData(spe)$use_channel],
+             assay = "exprs", cluster_cols = TRUE, scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("indication", "patient_id", "ROI"),
+             annotation_colors = list(indication = metadata(spe)$color_vectors$indication,
+                                      patient_id = metadata(spe)$color_vectors$patient_id,
+                                      ROI = metadata(spe)$color_vectors$ROI),
+             show_colnames = TRUE)
+

+

We observe extensive biological variation across the 14 images specifically for +some of the cell phenotype markers including the macrophage marker CD206, the B +cell marker CD20, the neutrophil marker CD15, and the proliferation marker Ki67. +These differences will be further studied in the following chapters.

+
+
+

7.4 Cell-level quality control

+

In the following paragraphs we will look at different metrics and visualization +approaches to assess data quality (as well as biological differences) on the +single-cell level.

+

Related to the signal-to-noise ratio (SNR) calculated above on the pixel-level, +a similar measure can be derived on the single-cell level. Here, we will use +a two-component Gaussian mixture model for each marker to find cells +with positive and negative expression. The SNR is defined as:

+

\[SNR = I_s/I_n\]

+

where \(I_s\) is the intensity of the signal (mean intensity of cells with +positive signal) and \(I_n\) is the intensity of the noise (mean intensity of +cells lacking expression). To define cells with positive and negative marker +expression, we fit the mixture model across the transformed counts of all cells +contained in the SpatialExperiment object. Next, for each marker we calculate +the mean of the non-transformed counts for the positive and the negative cells. +The SNR is then the ratio between the mean of the positive signal and the mean +of the negative signal.
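
The logic of this procedure can be followed on toy data without additional packages. The sketch below fits a two-component Gaussian mixture with a hand-written EM loop; `fit_two_gaussians` is a hypothetical helper that only stands in for a full mixture-modelling package, and the counts as well as the plain asinh transformation are assumptions made for illustration.

```r
# Toy data for one marker: raw counts of negative and positive cells
counts <- c(rep(c(0.5, 1, 1.5), 20), rep(c(19, 20, 21), 5))
exprs <- asinh(counts)   # assumed transformation

# Minimal EM for a two-component 1D Gaussian mixture
fit_two_gaussians <- function(x, n_iter = 100) {
    mu <- range(x)              # start at the extremes
    sigma <- rep(sd(x), 2)
    w <- c(0.5, 0.5)
    for (i in seq_len(n_iter)) {
        # E-step: responsibility of component 2 for each cell
        d1 <- w[1] * dnorm(x, mu[1], sigma[1])
        d2 <- w[2] * dnorm(x, mu[2], sigma[2])
        r2 <- d2 / (d1 + d2)
        r1 <- 1 - r2
        # M-step: update weights, means and standard deviations
        w <- c(mean(r1), mean(r2))
        mu <- c(sum(r1 * x) / sum(r1), sum(r2 * x) / sum(r2))
        sigma <- c(sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
                   sqrt(sum(r2 * (x - mu[2])^2) / sum(r2)))
    }
    ifelse(r2 > 0.5, 2L, 1L)    # hard class assignment
}

cls <- fit_two_gaussians(exprs)

# SNR on the non-transformed counts
mean1 <- mean(counts[cls == 1])
mean2 <- mean(counts[cls == 2])
signal <- max(mean1, mean2)
noise <- min(mean1, mean2)
snr <- signal / noise
```

The model is fitted on the transformed values, while the means entering the ratio are computed on the raw counts, as described above.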

+
library(mclust)
+
+set.seed(220224)
+mat <- sapply(seq_len(nrow(spe)), function(x){
+    cur_exprs <- assay(spe, "exprs")[x,]
+    cur_counts <- assay(spe, "counts")[x,]
+    
+    cur_model <- Mclust(cur_exprs, G = 2)
+    mean1 <- mean(cur_counts[cur_model$classification == 1])
+    mean2 <- mean(cur_counts[cur_model$classification == 2])
+    
+    signal <- ifelse(mean1 > mean2, mean1, mean2)
+    noise <- ifelse(mean1 > mean2, mean2, mean1)
+    
+    return(c(snr = signal/noise, ps = signal))
+})
+    
+cur_snr <- t(mat) %>% as.data.frame() %>% 
+        mutate(marker = rownames(spe))
+
+cur_snr %>% ggplot() +
+    geom_point(aes(log2(ps), log2(snr))) +
+    geom_label_repel(aes(log2(ps), log2(snr), label = marker)) +
+    theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") +
+    xlab("Signal intensity [log2]")
+

+

Next, we observe the distributions of cell size across the individual images. +Differences in cell size distributions can indicate segmentation biases due to +differences in cell density or can indicate biological differences due to cell +type compositions (tumor cells tend to be larger than immune cells).

+
dittoPlot(spe, var = "area", 
+          group.by = "sample_id", 
+          plots = "boxplot") +
+        ylab("Cell area") + xlab("")
+

+
summary(spe$area)
+
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+##    3.00   47.00   70.00   76.38   98.00  466.00
+

The median cell area is 70 pixels with a median major axis +length of 11.3 pixels. The largest cell +has an area of 466 pixels, which corresponds to a diameter of about +24.4 pixels assuming a circular shape (or a side length of 21.6 pixels assuming a square shape). +Overall, the distribution of cell sizes is similar across images with images from +Patient4_005 and Patient4_007 showing a reduced average cell size. These +images contain fewer tumor cells which can explain the smaller average cell size.
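
Two simple conversions from a cell area to a linear size are the side length of a square with equal area and the diameter of a circle with equal area. As a small worked example for the largest cell (area of 466 pixels):

```r
max_area <- 466

# Side length of a square with the same area
side_square <- sqrt(max_area)

# Diameter of a circle with the same area: A = pi * (d / 2)^2
diameter_circle <- 2 * sqrt(max_area / pi)

round(c(square = side_square, circle = diameter_circle), 1)
```

The square root of the area gives 21.6 pixels, while the circle-equivalent diameter is about 24.4 pixels.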

+

We detect very small cells in the dataset and will remove them. +The chosen threshold is arbitrary and needs to be adjusted per dataset.

+
sum(spe$area < 5)
+
## [1] 65
+
spe <- spe[,spe$area >= 5]
+

Another quality indicator can be an absolute measure of cell density often +reported in cells per mm\(^2\).
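
The unit conversion relies on the assumption that one pixel covers 1 μm\(^2\), so that 1 mm\(^2\) corresponds to 1,000,000 pixels. A toy calculation with hypothetical numbers:

```r
# Hypothetical image: 1200 cells in a 600 x 500 pixel acquisition
cell_count <- 1200
width_px <- 600
height_px <- 500

# Assumption: 1 pixel = 1 square micrometer, 1 mm^2 = 1e6 square micrometers
area_mm2 <- (width_px * height_px) / 1e6
cells_per_mm2 <- cell_count / area_mm2
```

Here the image covers 0.3 mm\(^2\), giving 4000 cells per mm\(^2\). For data acquired at a different pixel size, the conversion factor would need to be adjusted.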

+
cell_density <- colData(spe) %>%
+    as.data.frame() %>%
+    group_by(sample_id) %>%
+    summarize(cell_count = n(),
+           no_pixels = mean(width_px) * mean(height_px)) %>%
+    mutate(cells_per_mm2 = cell_count/(no_pixels/1000000))
+
+ggplot(cell_density) +
+    geom_point(aes(reorder(sample_id,cells_per_mm2), cells_per_mm2)) + 
+    theme_minimal(base_size = 15) + 
+    theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8)) +
+    ylab("Cells per mm2") + xlab("")
+

+

The number of cells per mm\(^2\) varies across images which also depends on the +number of tumor/non-tumor cells. As we will see in the following sections, some +immune cells appear in cell dense regions while other stromal regions are less +dense.

+

The data presented here originate from samples from different locations with +potential differences in pre-processing and each sample was stained individually. +These (and other) technical aspects can induce staining differences between +samples or batches of samples. Observing potential staining differences can be +crucial to assess data quality. We will use ridgeline visualizations to check +differences in staining patterns:

+
multi_dittoPlot(spe, vars = rownames(spe)[rowData(spe)$use_channel],
+               group.by = "patient_id", plots = "ridgeplot", 
+               assay = "exprs", 
+               color.panel = metadata(spe)$color_vectors$patient_id)
+

+

We observe variations in the distributions of marker expression across patients. +These variations may arise partly from different abundances of cells in +different images (e.g., Patient3 may have higher numbers of CD11c+ and PD1+ +cells) as well as staining differences between samples. While most of the +selected markers are specifically expressed in immune cell subtypes, we can see +that E-Cadherin (a marker for epithelial (tumor) cells) shows a similar +expression range across all patients.

+

Finally, we will use non-linear dimensionality reduction methods to project +cells from a high-dimensional (40) down to a low-dimensional (2) space. For this, +the scater package provides the runUMAP and +runTSNE functions. For dimensionality reduction, we +will use all channels that show biological variation across the dataset. +However, marker selection can be performed with different biological questions +in mind. Both the runUMAP and runTSNE functions are not deterministic, +meaning they produce different results across different runs. We therefore +set a seed in this chunk for reproducibility purposes; +however, different seeds and different parameter settings (e.g., the perplexity +parameter in the runTSNE function) need to be tested to avoid +over-interpretation of visualization artefacts.
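
The effect of setting a seed can be illustrated with any function that draws random numbers. The following base R sketch uses rnorm as a stand-in for a stochastic embedding method:

```r
# Same seed, same random draws
set.seed(220225)
run1 <- rnorm(5)
set.seed(220225)
run2 <- rnorm(5)

# Without resetting the seed, the next draws differ
run3 <- rnorm(5)

identical(run1, run2)  # TRUE
identical(run1, run3)  # FALSE
```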

+
library(scater)
+
+set.seed(220225)
+spe <- runUMAP(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") 
+spe <- runTSNE(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") 
+

After dimensionality reduction, the low-dimensional embeddings are stored in the +reducedDim slot.

+
reducedDims(spe)
+
## List of length 2
+## names(2): UMAP TSNE
+
head(reducedDim(spe, "UMAP"))
+
##                    UMAP1     UMAP2
+## Patient1_001_1 -4.810167 -3.777362
+## Patient1_001_2 -4.397347 -3.456036
+## Patient1_001_3 -4.369883 -3.445561
+## Patient1_001_4 -4.081614 -3.162119
+## Patient1_001_5 -6.234012 -2.433976
+## Patient1_001_6 -5.666597 -3.428058
+

Visualization of the low-dimensional embedding facilitates assessment of +potential “batch effects”. The dittoDimPlot +function allows flexible visualization. It returns ggplot objects which +can be further modified.

+
library(patchwork)
+
+# visualize patient id 
+p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on UMAP")
+p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "TSNE", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+    ggtitle("Patient ID on TSNE")
+
+# visualize region of interest id
+p3 <- dittoDimPlot(spe, var = "ROI", reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$ROI) +
+    ggtitle("ROI ID on UMAP")
+p4 <- dittoDimPlot(spe, var = "ROI", reduction.use = "TSNE", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$ROI) +
+    ggtitle("ROI ID on TSNE")
+
+# visualize indication
+p5 <- dittoDimPlot(spe, var = "indication", reduction.use = "UMAP", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$indication) +
+    ggtitle("Indication on UMAP")
+p6 <- dittoDimPlot(spe, var = "indication", reduction.use = "TSNE", size = 0.2) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$indication) +
+    ggtitle("Indication on TSNE")
+
+(p1 + p2) / (p3 + p4) / (p5 + p6)
+

+
# visualize marker expression
+p1 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "UMAP", 
+                   assay = "exprs", size = 0.2) +
+    scale_color_viridis(name = "Ecad") +
+    ggtitle("E-Cadherin expression on UMAP")
+p2 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "UMAP", 
+                   assay = "exprs", size = 0.2) +
+    scale_color_viridis(name = "CD45RO") +
+    ggtitle("CD45RO expression on UMAP")
+p3 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "TSNE", 
+                   assay = "exprs", size = 0.2) +
+    scale_color_viridis(name = "Ecad") +
+    ggtitle("Ecad expression on TSNE")
+p4 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "TSNE", 
+                   assay = "exprs", size = 0.2) +
+    scale_color_viridis(name = "CD45RO") +
+    ggtitle("CD45RO expression on TSNE")
+
+(p1 + p2) / (p3 + p4)
+

+

We observe a strong separation of tumor cells (Ecad+ cells) between the +patients. Here, each patient was diagnosed with a different tumor type. The +separation of tumor cells could be of biological origin, since tumor cells tend +to display differences in expression between patients and cancer types, and/or of +technical origin: the panel only contains a single tumor marker (E-Cadherin) and +therefore slight technical differences in staining cause visible separation +between cells of different patients. Nevertheless, the immune compartment +(CD45RO+ cells) mixes between patients and we can rule out systematic staining +differences between patients.

+
+
+

7.5 Save objects

+

The modified SpatialExperiment object is saved for further downstream analysis.

+
saveRDS(spe, "data/spe.rds")
+
+
+

7.6 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             patchwork_1.1.3            
+##  [3] scater_1.28.0               mclust_6.0.0               
+##  [5] scuttle_1.10.2              ggrepel_0.9.3              
+##  [7] lubridate_1.9.3             forcats_1.0.0              
+##  [9] stringr_1.5.0               dplyr_1.1.3                
+## [11] purrr_1.0.2                 readr_2.1.4                
+## [13] tidyr_1.3.0                 tibble_3.2.1               
+## [15] tidyverse_2.0.0             viridis_0.6.4              
+## [17] viridisLite_0.4.2           dittoSeq_1.12.1            
+## [19] ggplot2_3.4.3               cytoviewer_1.0.1           
+## [21] cytomapper_1.12.0           SingleCellExperiment_1.22.0
+## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0             
+## [25] GenomicRanges_1.52.0        GenomeInfoDb_1.36.3        
+## [27] IRanges_2.34.1              S4Vectors_0.38.2           
+## [29] BiocGenerics_0.46.0         MatrixGenerics_1.12.3      
+## [31] matrixStats_1.0.0           EBImage_4.42.0             
+## 
+## loaded via a namespace (and not attached):
+##   [1] RColorBrewer_1.1-3        rstudioapi_0.15.0        
+##   [3] jsonlite_1.8.7            magrittr_2.0.3           
+##   [5] ggbeeswarm_0.7.2          magick_2.8.0             
+##   [7] farver_2.1.1              rmarkdown_2.25           
+##   [9] zlibbioc_1.46.0           vctrs_0.6.3              
+##  [11] memoise_2.0.1             DelayedMatrixStats_1.22.6
+##  [13] RCurl_1.98-1.12           terra_1.7-46             
+##  [15] svgPanZoom_0.3.4          htmltools_0.5.6          
+##  [17] S4Arrays_1.0.6            BiocNeighbors_1.18.0     
+##  [19] raster_3.6-23             Rhdf5lib_1.22.1          
+##  [21] rhdf5_2.44.0              sass_0.4.7               
+##  [23] bslib_0.5.1               desc_1.4.2               
+##  [25] htmlwidgets_1.6.2         fontawesome_0.5.2        
+##  [27] cachem_1.0.8              mime_0.12                
+##  [29] lifecycle_1.0.3           pkgconfig_2.0.3          
+##  [31] rsvd_1.0.5                colourpicker_1.3.0       
+##  [33] Matrix_1.6-1.1            R6_2.5.1                 
+##  [35] fastmap_1.1.1             GenomeInfoDbData_1.2.10  
+##  [37] shiny_1.7.5               digest_0.6.33            
+##  [39] colorspace_2.1-0          shinycssloaders_1.0.0    
+##  [41] rprojroot_2.0.3           irlba_2.3.5.1            
+##  [43] dqrng_0.3.1               pkgload_1.3.3            
+##  [45] beachmat_2.16.0           labeling_0.4.3           
+##  [47] timechange_0.2.0          fansi_1.0.4              
+##  [49] nnls_1.5                  abind_1.4-5              
+##  [51] compiler_4.3.1            withr_2.5.1              
+##  [53] tiff_0.1-11               BiocParallel_1.34.2      
+##  [55] HDF5Array_1.28.1          R.utils_2.12.2           
+##  [57] DelayedArray_0.26.7       rjson_0.2.21             
+##  [59] tools_4.3.1               vipor_0.4.5              
+##  [61] beeswarm_0.4.0            httpuv_1.6.11            
+##  [63] R.oo_1.25.0               glue_1.6.2               
+##  [65] rhdf5filters_1.12.1       promises_1.2.1           
+##  [67] grid_4.3.1                Rtsne_0.16               
+##  [69] generics_0.1.3            gtable_0.3.4             
+##  [71] tzdb_0.4.0                R.methodsS3_1.8.2        
+##  [73] hms_1.1.3                 ScaledMatrix_1.8.1       
+##  [75] BiocSingular_1.16.0       sp_2.0-0                 
+##  [77] utf8_1.2.3                XVector_0.40.0           
+##  [79] RcppAnnoy_0.0.21          pillar_1.9.0             
+##  [81] limma_3.56.2              later_1.3.1              
+##  [83] lattice_0.21-8            tidyselect_1.2.0         
+##  [85] locfit_1.5-9.8            miniUI_0.1.1.1           
+##  [87] knitr_1.44                gridExtra_2.3            
+##  [89] bookdown_0.35             edgeR_3.42.4             
+##  [91] svglite_2.1.1             xfun_0.40                
+##  [93] shinydashboard_0.7.2      brio_1.1.3               
+##  [95] DropletUtils_1.20.0       pheatmap_1.0.12          
+##  [97] stringi_1.7.12            fftwtools_0.9-11         
+##  [99] yaml_2.3.7                evaluate_0.21            
+## [101] codetools_0.2-19          archive_1.1.6            
+## [103] BiocManager_1.30.22       cli_3.6.1                
+## [105] uwot_0.1.16               xtable_1.8-4             
+## [107] systemfonts_1.0.4         munsell_0.5.0            
+## [109] jquerylib_0.1.4           Rcpp_1.0.11              
+## [111] png_0.1-8                 parallel_4.3.1           
+## [113] ellipsis_0.3.2            jpeg_0.1-10              
+## [115] sparseMatrixStats_1.12.2  bitops_1.0-7             
+## [117] SpatialExperiment_1.10.0  scales_1.2.1             
+## [119] ggridges_0.5.4            crayon_1.5.2             
+## [121] BiocStyle_2.28.1          rlang_1.1.1              
+## [123] cowplot_1.1.1
+
+ +
+
+

References

+
+
+Meyer, Lasse, Nils Eling, and Bernd Bodenmiller. 2023. “Cytoviewer: An r/Bioconductor Package for Interactive Visualization and Exploration of Highly Multiplexed Imaging Data.” +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/image-visualization.html b/image-visualization.html new file mode 100644 index 00000000..e13135ff --- /dev/null +++ b/image-visualization.html @@ -0,0 +1,992 @@ + + + + + + + 11 Image visualization | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

11 Image visualization

+

The following section describes how to visualize the abundance of biomolecules +(e.g., protein or RNA) as well as cell-specific metadata on images. Section +11.1 focuses on visualizing pixel-level information +including the generation of pseudo-color composite images. Section +11.2 highlights the visualization of cell metadata (e.g., +cell phenotype) as well as summarized pixel intensities on cell segmentation +masks.

+

The +cytomapper +R/Bioconductor package was developed to support the handling and visualization +of multiple multi-channel images and segmentation masks (Eling et al. 2020). The main +data object for image handling is the +CytoImageList +container which we used in Section 5 to store multi-channel +images and segmentation masks.

+

We will first read in the previously processed data and randomly select 3 images +for visualization purposes.

+
library(SpatialExperiment)
+library(cytomapper)
+spe <- readRDS("data/spe.rds")
+images <- readRDS("data/images.rds")
+masks <- readRDS("data/masks.rds")
+
+# Sample images
+set.seed(220517)
+cur_id <- sample(unique(spe$sample_id), 3)
+cur_images <- images[names(images) %in% cur_id]
+cur_masks <- masks[names(masks) %in% cur_id]
+
+

11.1 Pixel visualization

+

The following section gives examples for visualizing individual channels or +multiple channels as pseudo-color composite images. For this the cytomapper +package exports the plotPixels function which expects a CytoImageList object +storing one or multiple multi-channel images. In the simplest use case, a +single channel can be visualized as follows:

+
plotPixels(cur_images, 
+           colour_by = "Ecad",
+           bcg = list(Ecad = c(0, 5, 1)))
+

+

The plot above shows the tissue expression of the epithelial tumor marker +E-cadherin on the 3 selected images. The bcg parameter (default c(0, 1, 1)) +stands for “background”, “contrast”, “gamma” and controls these attributes of +the image. This parameter takes a named list where each entry specifies these +attributes per channel. The first value of the numeric vector will be added to +the pixel intensities (background); pixel intensities will be multiplied by the +second entry of the vector (contrast); pixel intensities will be exponentiated +by the third entry of the vector (gamma). In most cases, it is sufficient to +adjust the second (contrast) entry of the vector.
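
The effect of the three bcg entries can be sketched on a toy vector. The helper `apply_bcg` below is hypothetical and assumes that the operations are applied in the order described above (add, multiply, exponentiate); it is not the internal cytomapper code.

```r
# Apply background, contrast and gamma to pixel intensities
apply_bcg <- function(x, bcg = c(0, 1, 1)) {
    ((x + bcg[1]) * bcg[2])^bcg[3]
}

px <- c(0, 0.1, 0.2)                  # toy pixel intensities

px_default <- apply_bcg(px)              # unchanged with default c(0, 1, 1)
px_contrast <- apply_bcg(px, c(0, 5, 1)) # five-fold contrast: 0, 0.5, 1
```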

+

The following example highlights the visualization of 6 markers (maximum allowed +number of markers) at once per image. The markers indicate the spatial +distribution of tumor cells (E-cadherin), T cells (CD3), B cells (CD20), CD8+ T +cells (CD8a), plasma cells (CD38) and proliferating cells (Ki67).

+
plotPixels(cur_images, 
+           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"),
+           bcg = list(Ecad = c(0, 5, 1),
+                      CD3 = c(0, 5, 1),
+                      CD20 = c(0, 5, 1),
+                      CD8a = c(0, 5, 1),
+                      CD38 = c(0, 8, 1),
+                      Ki67 = c(0, 5, 1)))
+

+
+

11.1.1 Adjusting colors

+

The default colors for visualization are chosen by the additive RGB (red, green, +blue) color model. For six markers the default colors are: red, green, blue, +cyan (green + blue), magenta (red + blue), yellow (green + red). These colors +are the easiest to distinguish by eye. However, you can select other colors for +each channel by setting the colour parameter:

+
plotPixels(cur_images, 
+           colour_by = c("Ecad", "CD3", "CD20"),
+           bcg = list(Ecad = c(0, 5, 1),
+                      CD3 = c(0, 5, 1),
+                      CD20 = c(0, 5, 1)),
+           colour = list(Ecad = c("black", "burlywood1"),
+                         CD3 = c("black", "cyan2"),
+                         CD20 = c("black", "firebrick1")))
+

+

The colour parameter takes a named list in which each entry specifies the +colors from which a color gradient is constructed via colorRampPalette. These +are usually vectors of length 2 in which the first entry is "black" and the +second entry specifies the color of choice. Although not recommended, you can +also specify more than two colors to generate a more complex color gradient.
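
How such a gradient is built can be inspected directly with the base R function colorRampPalette. In this short sketch, the number of gradient steps (5) is an arbitrary choice for illustration:

```r
# Build a gradient from black to the color of choice
ramp <- colorRampPalette(c("black", "firebrick1"))

# Five evenly spaced colors along the gradient (hex codes)
cols <- ramp(5)
cols
```

The first entry corresponds to black (lowest intensity) and the last entry to the selected color (highest intensity).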

+
+
+

11.1.2 Image normalization

+

As an alternative to setting the bcg parameter, images can first be +normalized. Normalization here means to scale the pixel intensities per channel +between 0 and 1 (or a range specified by the ft parameter in the normalize +function). By default, the normalize function scales pixel intensities across +all images contained in the CytoImageList object (separateImages = FALSE). +Each individual channel is scaled independently (separateChannels = TRUE).

+

After 0-1 normalization, maximum pixel intensities can be clipped to enhance the +contrast of the image (setting the inputRange parameter). In the following +example, the clipping to 0 and 0.2 is the same as multiplying the pixel +intensities by a factor of 5.

+
# 0 - 1 channel scaling across all images
+norm_images <- cytomapper::normalize(cur_images)
+
+# Clip channel at 0.2
+norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2))
+
+plotPixels(norm_images, 
+           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"))
+

+

The default setting of scaling pixel intensities across all images ensures +comparable intensity levels across images. Pixel intensities can also be +scaled per image therefore correcting for staining/expression differences +between images:

+
# 0 - 1 channel scaling per image
+norm_images <- cytomapper::normalize(cur_images, separateImages = TRUE)
+
+# Clip channel at 0.2
+norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2))
+
+plotPixels(norm_images, 
+           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"))
+

+

As we can see, the marker Ki67 appears brighter on images 2 and 3 in comparison +to scaling the channel across all images.
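
The difference between scaling across all images and scaling per image can be reproduced with two toy "images" of a single channel. The values and object names below are hypothetical:

```r
# One channel in two toy images; img2 is much dimmer than img1
imgs <- list(img1 = c(0, 50, 100),
             img2 = c(0, 5, 10))

# Scaling across all images: one common range
global_range <- range(unlist(imgs))
scaled_global <- lapply(imgs, function(x) {
    (x - global_range[1]) / diff(global_range)
})

# Scaling per image: each image uses its own range
scaled_per_image <- lapply(imgs, function(x) {
    (x - min(x)) / (max(x) - min(x))
})
```

With the common range, the brightest pixel of img2 only reaches 0.1, whereas per-image scaling stretches it to 1. The dim image therefore appears brighter, at the cost of intensities no longer being comparable between images.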

+

Finally, the normalize function also accepts a named list input for the +inputRange argument. In this list, the clipping range per channel can be set +individually:

+
# 0 - 1 channel scaling per image
+norm_images <- cytomapper::normalize(cur_images, 
+                         separateImages = TRUE,
+                         inputRange = list(Ecad = c(0, 50), 
+                                           CD3 = c(0, 30),
+                                           CD20 = c(0, 40),
+                                           CD8a = c(0, 50),
+                                           CD38 = c(0, 10),
+                                           Ki67 = c(0, 70)))
+
+plotPixels(norm_images, 
+           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"))
+

+
+
+
+

11.2 Cell visualization

+

In the following section, we will show examples of how to visualize single cells either as segmentation masks or outlined on composite images. This type of visualization allows us to observe the spatial distribution of cell phenotypes, visually assess morphological features and perform quality control of cell segmentation and phenotyping.

+
+

11.2.1 Visualizing metadata

+

The cytomapper package provides the plotCells function that accepts a +CytoImageList object containing segmentation masks. These are defined as +single channel images where sets of pixels with the same integer ID identify +individual cells. This integer ID can be found as an entry in the colData(spe) +slot and as pixel information in the segmentation masks. The entry in +colData(spe) needs to be specified via the cell_id argument to the +plotCells function. In that way, data contained in the SpatialExperiment +object can be mapped to segmentation masks. For the current dataset, the cell +IDs are stored in colData(spe)$ObjectNumber.

+

As cell IDs are only unique within a single image, plotCells also requires +the img_id argument. This argument specifies the colData(spe) as well as the +mcols(masks) entry that stores the unique image name from which each cell was +extracted. In the current dataset the unique image names are stored in +colData(spe)$sample_id and mcols(masks)$sample_id.

+

Providing these two entries that allow mapping between the SpatialExperiment +object and segmentation masks, we can now color individual cells based on their +cell type:

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "celltype")
+
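The mapping between cell-level metadata and mask pixels can be sketched in base R. This is only an illustration of the idea under simplifying assumptions (a single image, so no image ID is needed), not how plotCells is implemented; the mask, the metadata table and the colors are made up.

```r
# Made-up segmentation mask: 0 = background, 1-3 = cell IDs
mask <- matrix(c(0, 1, 1,
                 2, 2, 0,
                 3, 3, 3), nrow = 3, byrow = TRUE)

# Made-up cell-level metadata, analogous to colData(spe)
cells <- data.frame(ObjectNumber = c(1, 2, 3),
                    celltype = c("Tumor", "CD8", "Tumor"),
                    stringsAsFactors = FALSE)

# Hypothetical color vector, one color per cell type
cols <- c(Tumor = "sienna", CD8 = "red")

# Look up each pixel's cell ID in the metadata (NA for background)
idx <- match(mask, cells$ObjectNumber)

# Replace each pixel by the color of the cell type it belongs to
coloured <- matrix(cols[cells$celltype[idx]], nrow = nrow(mask))
coloured[is.na(coloured)] <- "black"
print(coloured)
```

With several images, the same lookup would be done per image, which is why a unique image identifier is needed in addition to the cell ID.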

+

For consistent visualization, the plotCells function takes a named list as the colour argument. The entry name must match the colour_by argument.

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "celltype",
+          colour = list(celltype = metadata(spe)$color_vectors$celltype))
+

+

If only individual cell types should be visualized, the SpatialExperiment object can be subsetted (e.g., to only contain CD8+ T cells). In the following example, CD8+ T cells are colored in red and all other cells, which are not contained in the subsetted object, are colored in white (as set by the missing_colour argument).

+
CD8 <- spe[,spe$celltype == "CD8"]
+
+plotCells(cur_masks,
+          object = CD8, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "celltype",
+          colour = list(celltype = c(CD8 = "red")),
+          missing_colour = "white")
+

+

In terms of visualizing metadata, any entry in the colData(spe) slot can be +visualized. The plotCells function automatically detects if the entry +is continuous or discrete. In this fashion, we can now visualize the area of each +cell:

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "area")
+

+
+
+

11.2.2 Visualizing expression

+

Similar to visualizing single-cell metadata on segmentation masks, we can +use the plotCells function to visualize the aggregated pixel intensities +per cell. In the current dataset pixel intensities were aggregated by computing +the mean pixel intensity per cell and per channel. The plotCells function +accepts the exprs_values argument (default counts) that allows selecting +the assay which stores the expression values that should be visualized.

+

In the following example, we visualize the asinh-transformed mean pixel +intensities of the epithelial marker E-cadherin on segmentation masks.

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "Ecad",
+          exprs_values = "exprs")
+
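To make the aggregation step concrete, the following base-R sketch computes the mean pixel intensity per cell for one channel and applies an asinh transformation. The mask and intensities are made up, and the cofactor of 1 in the transformation is an assumption for illustration; the text above does not state which cofactor was used.

```r
# Made-up segmentation mask: 0 = background, 1-3 = cell IDs
mask <- matrix(c(0, 1, 1,
                 2, 2, 0,
                 3, 3, 3), nrow = 3, byrow = TRUE)

# Made-up pixel intensities of one channel (e.g., E-cadherin)
ecad <- matrix(c(0, 4, 6,
                 1, 3, 0,
                 10, 20, 30), nrow = 3, byrow = TRUE)

# Mean pixel intensity per cell, ignoring background pixels
is_cell <- mask > 0
mean_intensity <- tapply(ecad[is_cell], mask[is_cell], mean)

# asinh transformation with an assumed cofactor of 1
cofactor <- 1
exprs_values <- asinh(mean_intensity / cofactor)

print(mean_intensity)
print(exprs_values)
```

Each cell ends up with a single value per channel, which is what gets painted onto all pixels of that cell when expression is visualized on segmentation masks.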

+

We will now visualize the maximum number of allowed markers (six) as composites on the segmentation masks. As above, the markers indicate the spatial distribution of tumor cells (E-cadherin), T cells (CD3), B cells (CD20), CD8+ T cells (CD8a), plasma cells (CD38) and proliferating cells (Ki67).

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"),
+          exprs_values = "exprs")
+

+

While visualizing 6 markers on the pixel-level may still allow the distinction +of different tissue structures, observing single-cell expression levels is +difficult when visualizing many markers simultaneously due to often overlapping +expression.

+

Similar to adjusting marker colors when visualizing pixel intensities, we can change the color gradients per marker by setting the colour argument:

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = c("Ecad", "CD3", "CD20"),
+          exprs_values = "exprs",
+          colour = list(Ecad = c("black", "burlywood1"),
+                        CD3 = c("black", "cyan2"),
+                        CD20 = c("black", "firebrick1")))
+

+
+
+

11.2.3 Outlining cells on images

+

The following section highlights the combined visualization of pixel- and cell-level information. For this, besides the SpatialExperiment object, the plotPixels function accepts two CytoImageList objects: one for the multi-channel images and one for the segmentation masks. By specifying the outline_by parameter, the outlines of cells can be colored based on their metadata.

+

The following example first generates a 3-channel composite image displaying the expression of E-cadherin, CD3 and CD20 before coloring the cells’ outlines by their cell phenotype.

+
plotPixels(image = cur_images,
+           mask = cur_masks,
+           object = spe, 
+           cell_id = "ObjectNumber", 
+           img_id = "sample_id",
+           colour_by = c("Ecad", "CD3", "CD20"),
+           outline_by = "celltype",
+           bcg = list(Ecad = c(0, 5, 1),
+                      CD3 = c(0, 5, 1),
+                      CD20 = c(0, 5, 1)),
+           colour = list(celltype = metadata(spe)$color_vectors$celltype),
+           thick = TRUE)
+

+

Distinguishing individual cell phenotypes is nearly impossible in the images +above.

+

However, the SpatialExperiment object can be subsetted to only contain cells +of a single or few phenotypes. This allows the selective visualization of cell +outlines on composite images.

+

Here, we select all CD8+ T cells from the dataset and outline them on a 2-channel +composite image displaying the expression of CD3 and CD8a.

+
CD8 <- spe[,spe$celltype == "CD8"]
+
+plotPixels(image = cur_images,
+           mask = cur_masks,
+           object = CD8, 
+           cell_id = "ObjectNumber", img_id = "sample_id",
+           colour_by = c("CD3", "CD8a"),
+           outline_by = "celltype",
+           bcg = list(CD3 = c(0, 5, 1),
+                      CD8a = c(0, 5, 1)),
+           colour = list(celltype = c("CD8" = "white")),
+           thick = TRUE)
+

+

This type of visualization supports two quality control checks: (1) the segmentation quality of individual cell types and (2) the accuracy of cell phenotyping, assessed visually against expected marker expression.

+
+
+
+

11.3 Adjusting plot annotations

+

The cytomapper package provides a number of function arguments, shared between the plotPixels and plotCells functions, to adjust the visual appearance of figures.

+

For a full overview of the arguments please refer to ?plotting-param.

+

We use the following example to highlight how to adjust the scale bar, the image +title, the legend appearance and the margin between images.

+
plotPixels(cur_images, 
+           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"),
+           bcg = list(Ecad = c(0, 5, 1),
+                      CD3 = c(0, 5, 1),
+                      CD20 = c(0, 5, 1),
+                      CD8a = c(0, 5, 1),
+                      CD38 = c(0, 8, 1),
+                      Ki67 = c(0, 5, 1)),
+           scale_bar = list(length = 100,
+                            label = expression("100 " ~ mu * "m"),
+                            cex = 0.7, 
+                            lwidth = 10,
+                            colour = "grey",
+                            position = "bottomleft",
+                            margin = c(5,5),
+                            frame = 3),
+           image_title = list(text = mcols(cur_images)$indication,
+                              position = "topright",
+                              colour = "grey",
+                              margin = c(5,5),
+                              font = 2,
+                              cex = 2),
+           legend = list(colour_by.title.cex = 0.7,
+                         margin = 10),
+           margin = 40)
+

+
+
+

11.4 Displaying individual images

+

By default, all images are displayed on the same graphics device. This can be useful when saving all images at once (see next section), as it allows zooming into the individual images instead of opening each image individually. However, when displaying images in a markdown document, they are more accessible when visualized individually. For this, the plotPixels and plotCells functions accept the display parameter, which, when set to "single", displays each resulting image in its own graphics device:

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "celltype",
+          colour = list(celltype = metadata(spe)$color_vectors$celltype),
+          display = "single",
+          legend = NULL)
+

+
+
+

11.5 Saving and returning images

+

The final section addresses how to save composite images and how to return them +for integration with other plots.

+

The plotPixels and plotCells functions accept the save_plot argument which +takes a named list of the following entries: filename indicates the location +and file type of the image saved to disk; scale adjusts the resolution of the +saved image (this only needs to be adjusted for small images).

+
plotCells(cur_masks,
+          object = spe, 
+          cell_id = "ObjectNumber", 
+          img_id = "sample_id",
+          colour_by = "celltype",
+          colour = list(celltype = metadata(spe)$color_vectors$celltype),
+          save_plot = list(filename = "data/celltype_image.png"))
+

The composite images (together with their annotation) can also be returned. In +the following code chunk we save two example plots to variables (out1 and +out2).

+
out1 <- plotCells(cur_masks,
+                  object = spe, 
+                  cell_id = "ObjectNumber", 
+                  img_id = "sample_id",
+                  colour_by = "celltype",
+                  colour = list(celltype = metadata(spe)$color_vectors$celltype),
+                  return_plot = TRUE)
+
out2 <- plotCells(cur_masks,
+                  object = spe, 
+                  cell_id = "ObjectNumber", 
+                  img_id = "sample_id",
+                  colour_by = c("Ecad", "CD3", "CD20"),
+                  exprs_values = "exprs",
+                  return_plot = TRUE)
+

The composite images are stored in out1$plot and out2$plot and can be +converted into a graph object recognized by the +cowplot +package.

+

The final function call of the following chunk plots both objects next to each other.

+
library(cowplot)
+library(gridGraphics)
+p1 <- ggdraw(out1$plot, clip = "on")
+p2 <- ggdraw(out2$plot, clip = "on")
+
+plot_grid(p1, p2)
+

+
+
+

11.6 Interactive image visualization

+

The cytoviewer package allows the interactive visualization of multi-channel images and segmentation masks. It also allows mapping cellular metadata onto segmentation masks and outlining cells on composite images. For a full introduction to the package, please refer to the vignette.

+
library(cytoviewer)
+
+app <- cytoviewer(image = images,
+                  mask = masks,
+                  object = spe,
+                  cell_id = "ObjectNumber",
+                  img_id = "sample_id")
+
+if (interactive()) {
+    shiny::runApp(app, launch.browser = TRUE)
+}
+
+
+

11.7 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
+## [8] methods   base     
+## 
+## other attached packages:
+##  [1] cytoviewer_1.0.1            gridGraphics_0.5-1         
+##  [3] cowplot_1.1.1               cytomapper_1.12.0          
+##  [5] EBImage_4.42.0              SpatialExperiment_1.10.0   
+##  [7] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
+##  [9] Biobase_2.60.0              GenomicRanges_1.52.0       
+## [11] GenomeInfoDb_1.36.3         IRanges_2.34.1             
+## [13] S4Vectors_0.38.2            BiocGenerics_0.46.0        
+## [15] MatrixGenerics_1.12.3       matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] splines_4.3.1               later_1.3.1                
+##   [3] bitops_1.0-7                tibble_3.2.1               
+##   [5] R.oo_1.25.0                 svgPanZoom_0.3.4           
+##   [7] polyclip_1.10-6             XML_3.99-0.14              
+##   [9] lifecycle_1.0.3             rstatix_0.7.2              
+##  [11] edgeR_3.42.4                doParallel_1.0.17          
+##  [13] lattice_0.21-8              MASS_7.3-60                
+##  [15] backports_1.4.1             magrittr_2.0.3             
+##  [17] limma_3.56.2                sass_0.4.7                 
+##  [19] rmarkdown_2.25              plotrix_3.8-2              
+##  [21] jquerylib_0.1.4             yaml_2.3.7                 
+##  [23] httpuv_1.6.11               sp_2.0-0                   
+##  [25] RColorBrewer_1.1-3          ConsensusClusterPlus_1.64.0
+##  [27] multcomp_1.4-25             abind_1.4-5                
+##  [29] zlibbioc_1.46.0             Rtsne_0.16                 
+##  [31] purrr_1.0.2                 R.utils_2.12.2             
+##  [33] RCurl_1.98-1.12             TH.data_1.1-2              
+##  [35] tweenr_2.0.2                sandwich_3.0-2             
+##  [37] circlize_0.4.15             GenomeInfoDbData_1.2.10    
+##  [39] ggrepel_0.9.3               irlba_2.3.5.1              
+##  [41] CATALYST_1.24.0             terra_1.7-46               
+##  [43] dqrng_0.3.1                 svglite_2.1.1              
+##  [45] DelayedMatrixStats_1.22.6   codetools_0.2-19           
+##  [47] DropletUtils_1.20.0         DelayedArray_0.26.7        
+##  [49] scuttle_1.10.2              ggforce_0.4.1              
+##  [51] tidyselect_1.2.0            shape_1.4.6                
+##  [53] raster_3.6-23               farver_2.1.1               
+##  [55] ScaledMatrix_1.8.1          viridis_0.6.4              
+##  [57] jsonlite_1.8.7              BiocNeighbors_1.18.0       
+##  [59] GetoptLong_1.0.5            ellipsis_0.3.2             
+##  [61] scater_1.28.0               ggridges_0.5.4             
+##  [63] survival_3.5-5              iterators_1.0.14           
+##  [65] systemfonts_1.0.4           foreach_1.5.2              
+##  [67] tools_4.3.1                 ggnewscale_0.4.9           
+##  [69] Rcpp_1.0.11                 glue_1.6.2                 
+##  [71] gridExtra_2.3               xfun_0.40                  
+##  [73] dplyr_1.1.3                 HDF5Array_1.28.1           
+##  [75] shinydashboard_0.7.2        withr_2.5.1                
+##  [77] fastmap_1.1.1               rhdf5filters_1.12.1        
+##  [79] fansi_1.0.4                 rsvd_1.0.5                 
+##  [81] digest_0.6.33               R6_2.5.1                   
+##  [83] mime_0.12                   colorspace_2.1-0           
+##  [85] gtools_3.9.4                jpeg_0.1-10                
+##  [87] R.methodsS3_1.8.2           utf8_1.2.3                 
+##  [89] tidyr_1.3.0                 generics_0.1.3             
+##  [91] data.table_1.14.8           htmlwidgets_1.6.2          
+##  [93] S4Arrays_1.0.6              pkgconfig_2.0.3            
+##  [95] gtable_0.3.4                ComplexHeatmap_2.16.0      
+##  [97] RProtoBufLib_2.12.1         XVector_0.40.0             
+##  [99] htmltools_0.5.6             carData_3.0-5              
+## [101] bookdown_0.35               fftwtools_0.9-11           
+## [103] clue_0.3-65                 scales_1.2.1               
+## [105] png_0.1-8                   colorRamps_2.3.1           
+## [107] knitr_1.44                  rstudioapi_0.15.0          
+## [109] reshape2_1.4.4              rjson_0.2.21               
+## [111] cachem_1.0.8                zoo_1.8-12                 
+## [113] rhdf5_2.44.0                GlobalOptions_0.1.2        
+## [115] stringr_1.5.0               shinycssloaders_1.0.0      
+## [117] miniUI_0.1.1.1              parallel_4.3.1             
+## [119] vipor_0.4.5                 pillar_1.9.0               
+## [121] vctrs_0.6.3                 promises_1.2.1             
+## [123] ggpubr_0.6.0                BiocSingular_1.16.0        
+## [125] car_3.1-2                   cytolib_2.12.1             
+## [127] beachmat_2.16.0             xtable_1.8-4               
+## [129] cluster_2.1.4               archive_1.1.6              
+## [131] beeswarm_0.4.0              evaluate_0.21              
+## [133] magick_2.8.0                mvtnorm_1.2-3              
+## [135] cli_3.6.1                   locfit_1.5-9.8             
+## [137] compiler_4.3.1              rlang_1.1.1                
+## [139] crayon_1.5.2                ggsignif_0.6.4             
+## [141] FlowSOM_2.8.0               plyr_1.8.8                 
+## [143] flowCore_2.12.2             ggbeeswarm_0.7.2           
+## [145] stringi_1.7.12              viridisLite_0.4.2          
+## [147] BiocParallel_1.34.2         nnls_1.5                   
+## [149] munsell_0.5.0               tiff_0.1-11                
+## [151] colourpicker_1.3.0          Matrix_1.6-1.1             
+## [153] sparseMatrixStats_1.12.2    ggplot2_3.4.3              
+## [155] Rhdf5lib_1.22.1             shiny_1.7.5                
+## [157] fontawesome_0.5.2           drc_3.0-1                  
+## [159] memoise_2.0.1               igraph_1.5.1               
+## [161] broom_1.0.5                 bslib_0.5.1
+
+ +
+
+

References

+
+
+Eling, Nils, Nicolas Damond, Tobias Hoch, and Bernd Bodenmiller. 2020. “Cytomapper: An R/Bioconductor Package for Visualization of Highly Multiplexed Imaging Data.” Bioinformatics 36 (24): 5706–5708.
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/img/Gating_scheme.pdf b/img/Gating_scheme.pdf new file mode 100644 index 00000000..173c9c64 Binary files /dev/null and b/img/Gating_scheme.pdf differ diff --git a/img/IMC_workflow.png b/img/IMC_workflow.png new file mode 100644 index 00000000..8f558fb9 Binary files /dev/null and b/img/IMC_workflow.png differ diff --git a/index.html b/index.html new file mode 100644 index 00000000..7c44b2b0 --- /dev/null +++ b/index.html @@ -0,0 +1,513 @@ + + + + + + + Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+ +
+

1 IMC Data Analysis Workflow

+

This workflow highlights the use of common R/Bioconductor packages +to analyze single-cell data obtained from segmented multi-channel images. We will not perform multi-channel image processing and segmentation in R +but rather link to available approaches in Section 3. While we +use imaging mass cytometry (IMC) data as an example, the concepts presented here can be applied to images +obtained by other highly-multiplexed imaging technologies (e.g. CODEX, MIBI, +mIF, etc.).

+

We will give an introduction to IMC in Section 2 and highlight +strategies to extract single-cell data from multi-channel images in Section +3.

+

Reproducible code written in R is available from Section 4 +onwards and the workflow can be largely divided into the following parts:

+
    +
  1. Preprocessing (reading in the data, spillover correction)
  2. Image- and cell-level quality control, low-dimensional visualization
  3. Sample/batch effect correction
  4. Cell phenotyping via clustering or classification
  5. Single-cell and image visualization
  6. Spatial analyses
+
+

1.1 Disclaimer

+

Multi-channel image and spatial, single-cell analysis is complex and we +highlight an example workflow here. However, this workflow is not complete and +does not cover all possible aspects of exploratory data analysis. Instead, we +demonstrate this workflow as a solid basis that supports other aspects of data +analysis. It offers interoperability with other packages for single-cell and +spatial analysis and the user will need to become familiar with the general +framework to efficiently analyse data obtained from multiplexed imaging +technologies.

+
+
+

1.2 Feedback and contributing

+

We provide the workflow as an open-source resource. This does not mean that the workflow has been tested on all possible datasets or biological questions.

+

If you notice an issue or missing information, please report an issue here. We also welcome contributions in the form of pull requests or feature requests in the form of issues. Have a look at the source code at:

+

https://github.com/BodenmillerGroup/IMCDataAnalysis

+
+
+

1.3 Citation

+

The workflow has been published in
+https://www.nature.com/articles/s41596-023-00881-0 +which you can cite as follows:

+
Windhager, J., Zanotelli, V.R.T., Schulz, D. et al. An end-to-end workflow for multiplexed image processing and analysis.
+    Nat Protoc (2023). 
+
+
+

1.4 Changelog

+

Version 1.0.0 [2023-06-30]

+
    +
  • First stable release of the workflow
+

Version 1.0.1 [2023-10-19]

+
    +
  • Added seed before predict call after training a classifier
+ +
+

*
+1: Department for Quantitative Biomedicine, University of Zurich
+2: Institute for Molecular Health Sciences, ETH Zurich

+ +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/index.md b/index.md new file mode 100644 index 00000000..09971753 --- /dev/null +++ b/index.md @@ -0,0 +1,90 @@ +--- +title: "Analysis workflow for IMC data" +author: "**Authors:** Nils Eling [1](#DQBM),[2](#IMHS),[*](#email), Vito Zanotelli [1](#DQBM),[2](#IMHS), Michelle Daniel [1](#DQBM),[2](#IMHS), Daniel Schulz [1](#DQBM),[2](#IMHS), Jonas Windhager [1](#DQBM),[2](#IMHS), Lasse Meyer [1](#DQBM),[2](#IMHS)" +date: "**Compiled:** 2023-10-19" +site: bookdown::bookdown_site +github-repo: "BodenmillerGroup/IMCDataAnalysis" +documentclass: book +bibliography: [book.bib, packages.bib] +biblio-style: apalike +link-citations: yes +description: "This bookdown project highlights possible down-stream analyses performed on imaging mass cytometry data." +--- + +# IMC Data Analysis Workflow {#preamble} + +This workflow highlights the use of common R/Bioconductor packages +to analyze single-cell data obtained from segmented multi-channel images. We will not perform multi-channel image processing and segmentation in R +but rather link to available approaches in Section \@ref(processing). While we +use imaging mass cytometry (IMC) data as an example, the concepts presented here can be applied to images +obtained by other highly-multiplexed imaging technologies (e.g. CODEX, MIBI, +mIF, etc.). + +We will give an introduction to IMC in Section \@ref(intro) and highlight +strategies to extract single-cell data from multi-channel images in Section +\@ref(processing). + +Reproducible code written in R is available from Section \@ref(prerequisites) +onwards and the workflow can be largely divided into the following parts: + +1. Preprocessing (reading in the data, spillover correction) +2. Image- and cell-level quality control, low-dimensional visualization +3. Sample/batch effect correction +4. Cell phenotyping via clustering or classification +5. Single-cell and image visualization +6. 
Spatial analyses + +## Disclaimer + +Multi-channel image and spatial, single-cell analysis is complex and we +highlight an example workflow here. However, this workflow is not complete and +does not cover all possible aspects of exploratory data analysis. Instead, we +demonstrate this workflow as a solid basis that supports other aspects of data +analysis. It offers interoperability with other packages for single-cell and +spatial analysis and the user will need to become familiar with the general +framework to efficiently analyse data obtained from multiplexed imaging +technologies. + +## Feedback and contributing + +We provide the workflow as an open-source resource. It does not mean that +this workflow is tested on all possible datasets or biological questions. + +If you notice an issue or missing information, please report an issue +[here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues). We also +welcome contributions in form of pull requests or feature requests in form of +issues. Have a look at the source code at: + +[https://github.com/BodenmillerGroup/IMCDataAnalysis](https://github.com/BodenmillerGroup/IMCDataAnalysis) + +## Citation + +The workflow has been published in +[https://www.nature.com/articles/s41596-023-00881-0](https://www.nature.com/articles/s41596-023-00881-0) +which you can cite as follows: + +``` +Windhager, J., Zanotelli, V.R.T., Schulz, D. et al. An end-to-end workflow for multiplexed image processing and analysis. + Nat Protoc (2023). +``` + +## Changelog + + +```{=html} +

Version 1.0.0 [2023-06-30]

+
    +
  • First stable release of the workflow
+

Version 1.0.1 [2023-10-19]

+
    +
  • Added seed before predict call after training a classifier
+ +``` + +--- + +* nils.eling@uzh.ch +1: Department for Quantitative Biomedicine, University of Zurich +2: Institute for Molecular Health Sciences, ETH Zurich diff --git a/intro.html b/intro.html new file mode 100644 index 00000000..658cdefe --- /dev/null +++ b/intro.html @@ -0,0 +1,659 @@ + + + + + + + 2 Introduction | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

2 Introduction

+

Highly multiplexed imaging (HMI) enables the simultaneous detection of dozens of +biological molecules (e.g., proteins, transcripts; also referred to as +“markers”) in tissues. Recently established multiplexed tissue imaging +technologies rely on cyclic staining with fluorescently-tagged antibodies +(Lin et al. 2018; Gut, Herrmann, and Pelkmans 2018), or the use of oligonucleotide-tagged (Goltsev et al. 2018; Saka et al. 2019) or metal-tagged (Giesen et al. 2014; Angelo et al. 2014) antibodies, among others. +The key strength of these technologies is that they allow in-depth analysis of +single cells within their spatial tissue context. As a result, these methods +have enabled analysis of the spatial architecture of the tumor microenvironment +(Lin et al. 2018; Jackson et al. 2020; Ali et al. 2020; Schürch et al. 2020), determination of nucleic acid +and protein abundances for assessment of spatial co-localization of cell types +and chemokines (Hoch et al. 2022) and spatial niches of virus infected cells (Jiang et al. 2022), +and characterization of pathological features during COVID-19 infection +(Rendeiro et al. 2021; Mitamura et al. 2021), Type 1 diabetes progression (Damond et al. 2019) and +autoimmune disease (Ferrian et al. 2021).

+

Imaging mass cytometry (IMC) utilizes metal-tagged antibodies to detect over 40 +proteins and other metal-tagged molecules in biological samples. IMC can be used +to perform highly multiplexed imaging and is particularly suited to profiling +selected areas of tissues across many samples.

+

IMC_workflow +Overview of imaging mass cytometry data acquisition. Taken from (Giesen et al. 2014)

+

IMC was first published in 2014 (Giesen et al. 2014) and has been commercialized by Standard BioTools™ as the Hyperion Imaging System™ (documentation is available here). Similar to other HMI technologies such as MIBI (Angelo et al. 2014), CyCIF (Lin et al. 2018), 4i (Gut, Herrmann, and Pelkmans 2018), CODEX (Goltsev et al. 2018) and SABER (Saka et al. 2019), IMC captures the spatial expression of multiple proteins in parallel. With a nominal 1 μm resolution, IMC is able to detect cytoplasmic and nuclear localization of proteins. The current ablation frequency of IMC is 200 Hz, meaning that a 1 mm\(^2\) area can be imaged within about 2 hours.
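The acquisition time can be estimated with a short calculation. This R sketch only counts pure ablation time, assuming one laser shot per 1 μm\(^2\) pixel and no overhead; the variable names are hypothetical.

```r
# 1 mm^2 corresponds to 1000 um x 1000 um
area_um2 <- 1000 * 1000

# Assumption: each laser shot covers one pixel of 1 um^2
pixel_um2 <- 1

# Ablation frequency in shots per second
freq_hz <- 200

n_pixels <- area_um2 / pixel_um2  # number of laser shots needed
seconds <- n_pixels / freq_hz     # pure ablation time in seconds
minutes <- seconds / 60

print(n_pixels)
print(seconds)
print(round(minutes, 1))
```

Under these assumptions, the ablation itself takes 5000 seconds, or roughly 83 minutes. This is of the same order as the "about 2 hours" stated above; the text does not break down where the remaining time is spent.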

+
+

2.1 Technical details of IMC

+

Technical aspects of how data acquisition works can be found in the original +publication (Giesen et al. 2014). Briefly, antibodies to detect targets in biological +material are labeled with heavy metals (e.g., lanthanides) that do not occur in +biological systems and thus can be used upon binding to their target as a +readout similar to fluorophores in fluorescence microscopy. Thin sections of the +biological sample on a glass slide are stained with an antibody cocktail. +Stained microscopy slides are mounted on a precise motor-driven stage inside the +ablation chamber of the IMC instrument. A high-energy UV laser is focused on the +tissue, and each individual laser shot ablates tissue from an area of roughly 1 +μm\(^2\). The energy of the laser is absorbed by the tissue resulting +in vaporization followed by condensation of the ablated material. The ablated +material from each laser shot is transported in the gas phase into the plasma of +the mass cytometer, where first atomization of the particles and then ionization +of the atoms occurs. The ion cloud is then transferred into a vacuum, and all +ions below a mass of 80 m/z are filtered using a quadrupole mass filter. The +remaining ions (mostly those used to tag antibodies) are analyzed in a +time-of-flight mass spectrometer to ultimately obtain an accumulated mass +spectrum from all ions that correspond to a single laser shot. One can regard +this spectrum as the information underlying a 1 μm\(^2\) pixel. With +repetitive laser shots (e.g., at 200 Hz) and a simultaneous lateral sample +movement, a tissue can be ablated pixel by pixel. Ultimately an image is +reconstructed from each pixel mass spectrum.
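The pixel-by-pixel reconstruction described above can be sketched in R. This illustration assumes that laser shots are recorded row by row from left to right, which is a simplification not stated in the text; the channel names and ion counts are made up.

```r
# Assumed image dimensions in pixels
width <- 3
height <- 2
channels <- c("Ecad", "CD3")

# One row per laser shot, one column per channel (made-up ion counts)
shots <- matrix(c(1, 0,
                  2, 0,
                  3, 1,
                  4, 1,
                  5, 2,
                  6, 2),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, channels))

# Empty multi-channel image: height x width x channels
img <- array(NA_real_,
             dim = c(height, width, length(channels)),
             dimnames = list(NULL, NULL, channels))

# Fill each channel row by row, following the assumed acquisition order
for (ch in channels) {
    img[, , ch] <- matrix(shots[, ch], nrow = height, ncol = width,
                          byrow = TRUE)
}

print(img[, , "Ecad"])
```

Each laser shot contributes exactly one pixel, and each measured metal isotope contributes one channel of the resulting multi-channel image.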

+

In principle, IMC can be applied to the same type of samples as conventional +fluorescence microscopy. The largest distinction from fluorescence microscopy is +that for IMC, primary-labeled antibodies are commonly used, whereas in +fluorescence microscopy secondary antibodies carrying fluorophores are widely +applied. Additionally, for IMC, samples are dried before acquisition and can be +stored for years. Formalin-fixed and paraffin-embedded (FFPE) samples are widely +used for IMC. The FFPE blocks are cut to 2-5 μm thick sections and are +stained, dried, and analyzed with IMC.

+
+

2.1.1 Metal-conjugated antibodies and staining

+

Metal-labeled antibodies are used to stain molecules in tissues, enabling the +delineation of tissue structures, cells, and subcellular structures. Metal-conjugated +antibodies can either be purchased directly from Standard BioToolsTM (MaxPar IMC Antibodies), +or antibodies can be purchased and labeled individually (MaxPar Antibody +Labeling). +Antibody labeling using the MaxPar kits is performed via TCEP antibody reduction +followed by crosslinking with sulfhydryl-reactive maleimide-bearing metal +polymers. For each antibody it is essential to validate its functionality +and specificity and to optimize its usage to provide an optimal signal-to-noise ratio. To +facilitate antibody handling, a database is highly useful. +Airlab is such a platform; it +allows antibody lot tracking, validation data uploads, and panel generation for +subsequent upload to the IMC acquisition software from Standard BioToolsTM.

+

Depending on the sample type, different staining protocols can be used. +Generally, once antibodies of choice have been conjugated to a metal tag, +titration experiments are performed to identify the optimal staining +concentration. For FFPE samples, different staining protocols have been +described, and different antibodies show variable staining with different +protocols. Protocols such as the one provided by Standard BioToolsTM or the one described by +Ijsselsteijn et al. (2019) are recommended. Briefly, for FFPE tissues, a dewaxing +step is performed to remove the paraffin used to embed the material, followed by +a graded re-hydration of the samples. Thereafter, heat-induced epitope retrieval +(HIER), a step aiming at the reversal of formalin-based fixation, is used to +unmask epitopes within tissues and make them accessible to antibodies. Epitope +unmasking is generally performed in either basic, EDTA-based buffers (pH 9.2) or +acidic, citrate-based buffers (pH 6). Next, a buffer containing bovine serum +albumin (BSA) is used to block non-specific binding. This buffer is also used to +dilute antibody stocks for the actual antibody staining. Staining time and +temperature may vary, and optimization must be performed to ensure that each +individual antibody performs well. However, overnight staining at 4°C or 3-5 +hours at room temperature seem to be suitable in many cases.

+

Following antibody incubation, unbound antibodies are washed away and a +counterstain comparable to DAPI is applied to enable the identification of +nuclei. The Iridium intercalator +from Standard BioToolsTM is a reagent of choice and is applied in a brief 5-minute staining step. +Finally, the samples are washed again and then dried under an airflow. Once +dried, the samples are ready for analysis using IMC and are +usually stable for a long period of time (at least one year).

+
+
+

2.1.2 Data acquisition

+

Data is acquired using the CyTOF software from Standard BioToolsTM (see manuals +here).

+

The regions of interest are selected by providing coordinates for ablation. To +determine the region to be imaged, so-called “panoramas” can be generated. These +are stitched images of single fields of view of about 200 μm in diameter. +Panoramas provide an optical overview of the tissue with a resolution similar to +10x in microscopy and are intended to help with the selection of regions of +interest for ablation. The tissue should be centered on the glass slide, since +the imaging mass cytometer cannot access roughly 5 mm from each of the slide +edges. Currently, the instruments can process one slide at a time and usually one MCD +file per sample slide is generated.
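The edge restriction can be turned into a quick plausibility check when planning regions of interest. The sketch below assumes a slide of 75 mm by 25 mm and ROI coordinates measured in millimeters from one slide corner; both are assumptions made for illustration, and only the 5 mm margin is taken from the text. The function name `roi_is_accessible` is hypothetical.

```python
SLIDE_WIDTH_MM = 75.0   # assumption: long edge of the glass slide
SLIDE_HEIGHT_MM = 25.0  # assumption: short edge of the glass slide
EDGE_MARGIN_MM = 5.0    # from the text: roughly 5 mm at each edge is inaccessible


def roi_is_accessible(x_mm: float, y_mm: float, width_mm: float, height_mm: float) -> bool:
    """Return True if a rectangular ROI lies fully inside the accessible area.

    (x_mm, y_mm) is the ROI corner closest to the slide origin; the coordinate
    convention is an assumption of this sketch, not the instrument's own.
    """
    return (
        x_mm >= EDGE_MARGIN_MM
        and y_mm >= EDGE_MARGIN_MM
        and x_mm + width_mm <= SLIDE_WIDTH_MM - EDGE_MARGIN_MM
        and y_mm + height_mm <= SLIDE_HEIGHT_MM - EDGE_MARGIN_MM
    )


centered_roi = roi_is_accessible(30.0, 10.0, 1.0, 1.0)   # well inside the slide
edge_roi = roi_is_accessible(2.0, 10.0, 1.0, 1.0)        # too close to the left edge
```

Under these assumptions only the central 65 mm by 15 mm of the slide is usable, which is why centering the tissue matters.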

+

Many regions of interest can be defined on a single slide and acquisition +parameters such as channels to acquire, acquisition speed (100 Hz or 200 Hz), +ablation energy, and other parameters are user-defined. It is recommended that +all isotope channels are recorded. This results in larger raw data files, but it preserves valuable information such as +potential contamination of the argon gas (e.g., xenon) or of the samples (e.g., +lead, barium).

+

To process a large number of slides or to select regions on whole-slide samples, +panoramas may not provide sufficient information. If this is the case, +multi-color immunofluorescence of the same slide prior to staining with +metal-labeled antibodies may be performed. To allow for region selection based +on immunofluorescence images and to align those images with a panorama of the +same or consecutive sections of the sample, we developed +napping.

+

Acquisition time is directly proportional to the total ablated area, and run +times for samples of large area or for large sample numbers can roughly be calculated by +dividing the ablation area in square micrometers by the ablation frequency (e.g., +200 Hz). In addition to the proprietary MCD file format, TXT files can also +be generated for each region of interest. This is recommended as a back-up +option in case of errors that may corrupt MCD files but not TXT files.
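As a rough worked example of this calculation, the following sketch divides the area by the laser frequency, assuming one laser shot per 1 μm² pixel as stated earlier in this section. The function name and its parameters are hypothetical, and the result covers ablation time only; stage movement and other overhead are not modeled.

```python
def estimate_acquisition_seconds(
    area_um2: float,
    frequency_hz: float = 200.0,
    pixel_area_um2: float = 1.0,
) -> float:
    """Estimate pure ablation time: one laser shot per pixel at `frequency_hz`."""
    if frequency_hz <= 0 or pixel_area_um2 <= 0:
        raise ValueError("frequency and pixel area must be positive")
    n_pixels = area_um2 / pixel_area_um2
    return n_pixels / frequency_hz


one_mm2_in_um2 = 1000 * 1000  # 1 mm^2 expressed in square micrometers
seconds_200hz = estimate_acquisition_seconds(one_mm2_in_um2, frequency_hz=200.0)
seconds_100hz = estimate_acquisition_seconds(one_mm2_in_um2, frequency_hz=100.0)
hours_200hz = seconds_200hz / 3600
```

For a 1 mm² region this gives 5000 s at 200 Hz, or roughly 1.4 hours of ablation, and twice that at 100 Hz.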

+
+
+
+

2.2 IMC data format

+

Upon completion of the acquisition an MCD file of variable size is generated. A +single MCD file can hold raw acquisition data for multiple regions of interest, +optical images providing a slide level overview of the sample (“panoramas”), and +detailed metadata about the experiment. Additionally, for each acquisition a +TXT file is generated which holds the same pixel information as the matched +acquisition in the MCD file.

+

The Hyperion Imaging SystemTM produces files in the following folder structure:

+
.
++-- {XYZ}_ROI_001_1.txt
++-- {XYZ}_ROI_002_2.txt
++-- {XYZ}_ROI_003_3.txt
++-- {XYZ}.mcd
+

Here, {XYZ} defines the filename, ROI_001, ROI_002, ROI_003 are +user-defined names (descriptions) for the selected regions of interest (ROI), +and 1, 2, 3 indicate the unique acquisition identifiers. The ROI +description entry can be specified in the Standard BioTools software when +selecting ROIs. The MCD file contains the raw imaging data and the full metadata +of all acquired ROIs, while each TXT file contains data of a single ROI without +metadata. To follow a consistent naming scheme and to bundle all metadata, we +recommend zipping the folder. Each ZIP file should only contain data from a +single MCD file, and the name of the ZIP file should match the name of the MCD +file.
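The naming scheme and the zipping recommendation can be sketched in a few lines using only the Python standard library. The helper names `parse_txt_name` and `zip_acquisition_folder` are hypothetical and shown for illustration; the sketch assumes the folder layout listed above, with exactly one MCD file per folder.

```python
import zipfile
from pathlib import Path
from typing import Tuple


def parse_txt_name(txt_name: str, mcd_stem: str) -> Tuple[str, int]:
    """Split '{XYZ}_{description}_{id}.txt' into (ROI description, acquisition id)."""
    stem = Path(txt_name).stem
    prefix = mcd_stem + "_"
    if not stem.startswith(prefix):
        raise ValueError(f"{txt_name} does not belong to {mcd_stem}.mcd")
    # The acquisition id follows the last underscore; the description may contain underscores.
    description, _, acquisition_id = stem[len(prefix):].rpartition("_")
    return description, int(acquisition_id)


def zip_acquisition_folder(folder: Path) -> Path:
    """Bundle one folder into a ZIP file named after its single MCD file."""
    mcd_files = sorted(folder.glob("*.mcd"))
    if len(mcd_files) != 1:
        raise ValueError("expected exactly one MCD file per folder")
    # Write the archive next to the folder so it is not added to itself.
    zip_path = folder.parent / (mcd_files[0].stem + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.iterdir()):
            if path.is_file():
                archive.write(path, arcname=path.name)
    return zip_path


parsed = parse_txt_name("Patient1_ROI_001_1.txt", "Patient1")
```

Calling `zip_acquisition_folder(Path("some/folder"))` on a folder that contains `Patient1.mcd` would write `Patient1.zip` next to that folder; the path and the `Patient1` name here are placeholders.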

+

We refer to this data as raw data and the further +processing of this data is described in Section 3.

+ +
+
+

References

+
+
+Ali, H. Raza, Hartland W. Jackson, Vito R. T. Zanotelli, Esther Danenberg, Jana R. Fischer, Helen Bardwell, Elena Provenzano, et al. 2020. “Imaging Mass Cytometry and Multiplatform Genomics Define the Phenogenomic Landscape of Breast Cancer.” Nature Cancer 1: 163–75. +
+
+Angelo, Michael, Sean C. Bendall, Rachel Finck, Matthew B. Hale, Chuck Hitzman, Alexander D. Borowsky, Richard M. Levenson, et al. 2014. “Multiplexed Ion Beam Imaging of Human Breast Tumors.” Nature Medicine 20 (4): 436–42. +
+
+Damond, Nicolas, Stefanie Engler, Vito R. T. Zanotelli, Denis Schapiro, Clive H. Wasserfall, Irina Kusmartseva, Harry S. Nick, et al. 2019. “A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry.” Cell Metabolism 29: 755–768.e5. +
+
+Ferrian, Selena, Candace C. Liu, Erin F. McCaffrey, Rashmi Kumar, Theodore S. Nowicki, David W. Dawson, Alex Baranski, et al. 2021. “Multiplexed Imaging Reveals an IFN-\(\gamma\)-Driven Inflammatory State in Nivolumab-Associated Gastritis.” Cell Reports Medicine 2: 100419. +
+
+Giesen, Charlotte, Hao A. O. Wang, Denis Schapiro, Nevena Zivanovic, Andrea Jacobs, Bodo Hattendorf, Peter J. Schüffler, et al. 2014. “Highly Multiplexed Imaging of Tumor Tissues with Subcellular Resolution by Mass Cytometry.” Nature Methods 11 (4): 417–22. +
+
+Goltsev, Yury, Nikolay Samusik, Julia Kennedy-Darling, Salil Bhate, Matthew Hale, Gustavo Vazquez, Sarah Black, and Garry P. Nolan. 2018. “Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging.” Cell 174: 968–81. +
+
+Gut, Gabriele, Markus D Herrmann, and Lucas Pelkmans. 2018. “Multiplexed Protein Maps Link Subcellular Organization to Cellular States.” Science 361: 1–13. +
+
+Hoch, Tobias, Daniel Schulz, Nils Eling, Julia Martínez Gómez, Mitchell P. Levesque, and Bernd Bodenmiller. 2022. “Multiplexed Imaging Mass Cytometry of the Chemokine Milieus in Melanoma Characterizes Features of the Response to Immunotherapy.” Science Immunology 7 (70): eabk1692. +
+
+Ijsselsteijn, Marieke E., Ruud van der Breggen, Arantza F. Sarasqueta, Frits Koning, and Noel F. C. C. de Miranda. 2019. “A 40-Marker Panel for High Dimensional Characterization of Cancer Immune Microenvironments by Imaging Mass Cytometry.” Frontiers in Immunology 10. +
+
+Jackson, Hartland W., Jana R. Fischer, Vito R. T. Zanotelli, H. Raza Ali, Robert Mechera, Savas D. Soysal, Holger Moch, et al. 2020. “The Single-Cell Pathology Landscape of Breast Cancer.” Nature 578: 615–20. +
+
+Jiang, Sizun, Chi Ngai Chan, Xavier Rovira-Clavé, Han Chen, Yunhao Bai, Bokai Zhu, Erin McCaffrey, et al. 2022. “Combined Protein and Nucleic Acid Imaging Reveals Virus-Dependent B Cell and Macrophage Immunosuppression of Tissue Microenvironments.” Immunity 55: 1118–1134.e8. +
+
+Lin, Jia-Ren, Benjamin Izar, Shu Wang, Clarence Yapp, Shaolin Mei, Parin M. Shah, Sandro Santagata, and Peter K. Sorger. 2018. “Highly Multiplexed Immunofluorescence Imaging of Human Tissues and Tumors Using t-CyCIF and Conventional Optical Microscopes.” eLife 7: 1–46. +
+
+Mitamura, Yasutaka, Daniel Schulz, Saskia Oro, Nick Li, Isabel Kolm, Claudia Lang, Reihane Ziadlou, et al. 2021. “Cutaneous and Systemic Hyperinflammation Drives Maculopapular Drug Exanthema in Severely Ill COVID-19 Patients.” Allergy 77: 595–608. +
+
+Rendeiro, André F., Hiranmayi Ravichandran, Yaron Bram, Vasuretha Chandar, Junbum Kim, Cem Meydan, Jiwoon Park, et al. 2021. “The Spatial Landscape of Lung Pathology During COVID-19 Progression.” Nature 593: 564–69. +
+
+Saka, Sinem K., Yu Wang, Jocelyn Y. Kishi, Allen Zhu, Yitian Zeng, Wenxin Xie, Koray Kirli, et al. 2019. “Immuno-SABER Enables Highly Multiplexed and Amplified Protein Imaging in Tissues.” Nature Biotechnology 37: 1080–90. +
+
+Schürch, Christian M, Salil S Bhate, Graham L Barlow, Darci J Phillips, Luca Noti, Inti Zlobec, Pauline Chu, et al. 2020. “Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive Front.” Cell 182: 1341–59. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/libs/anchor-sections-1.1.0/anchor-sections-hash.css b/libs/anchor-sections-1.1.0/anchor-sections-hash.css new file mode 100644 index 00000000..b563ec97 --- /dev/null +++ b/libs/anchor-sections-1.1.0/anchor-sections-hash.css @@ -0,0 +1,2 @@ +/* Styles for section anchors */ +a.anchor-section::before {content: '#';font-size: 80%;} diff --git a/libs/anchor-sections-1.1.0/anchor-sections.css b/libs/anchor-sections-1.1.0/anchor-sections.css new file mode 100644 index 00000000..041905f8 --- /dev/null +++ b/libs/anchor-sections-1.1.0/anchor-sections.css @@ -0,0 +1,4 @@ +/* Styles for section anchors */ +a.anchor-section {margin-left: 10px; visibility: hidden; color: inherit;} +.hasAnchor:hover a.anchor-section {visibility: visible;} +ul > li > .anchor-section {display: none;} diff --git a/libs/anchor-sections-1.1.0/anchor-sections.js b/libs/anchor-sections-1.1.0/anchor-sections.js new file mode 100644 index 00000000..fee005d9 --- /dev/null +++ b/libs/anchor-sections-1.1.0/anchor-sections.js @@ -0,0 +1,11 @@ +document.addEventListener('DOMContentLoaded', function () { + // If section divs is used, we need to put the anchor in the child header + const headers = document.querySelectorAll("div.hasAnchor.section[class*='level'] > :first-child") + + headers.forEach(function (x) { + // Add to the header node + if (!x.classList.contains('hasAnchor')) x.classList.add('hasAnchor') + // Remove from the section or div created by Pandoc + x.parentElement.classList.remove('hasAnchor') + }) +}) diff --git a/libs/gitbook-2.6.7/css/fontawesome/fontawesome-webfont.ttf b/libs/gitbook-2.6.7/css/fontawesome/fontawesome-webfont.ttf new file mode 100644 index 00000000..35acda2f Binary files /dev/null and b/libs/gitbook-2.6.7/css/fontawesome/fontawesome-webfont.ttf differ diff --git a/libs/gitbook-2.6.7/css/plugin-bookdown.css b/libs/gitbook-2.6.7/css/plugin-bookdown.css new file mode 100644 index 00000000..ab7c20eb --- /dev/null +++ 
b/libs/gitbook-2.6.7/css/plugin-bookdown.css @@ -0,0 +1,105 @@ +.book .book-header h1 { + padding-left: 20px; + padding-right: 20px; +} +.book .book-header.fixed { + position: fixed; + right: 0; + top: 0; + left: 0; + border-bottom: 1px solid rgba(0,0,0,.07); +} +span.search-highlight { + background-color: #ffff88; +} +@media (min-width: 600px) { + .book.with-summary .book-header.fixed { + left: 300px; + } +} +@media (max-width: 1240px) { + .book .book-body.fixed { + top: 50px; + } + .book .book-body.fixed .body-inner { + top: auto; + } +} +@media (max-width: 600px) { + .book.with-summary .book-header.fixed { + left: calc(100% - 60px); + min-width: 300px; + } + .book.with-summary .book-body { + transform: none; + left: calc(100% - 60px); + min-width: 300px; + } + .book .book-body.fixed { + top: 0; + } +} + +.book .book-body.fixed .body-inner { + top: 50px; +} +.book .book-body .page-wrapper .page-inner section.normal sub, .book .book-body .page-wrapper .page-inner section.normal sup { + font-size: 85%; +} + +@media print { + .book .book-summary, .book .book-body .book-header, .fa { + display: none !important; + } + .book .book-body.fixed { + left: 0px; + } + .book .book-body,.book .book-body .body-inner, .book.with-summary { + overflow: visible !important; + } +} +.kable_wrapper { + border-spacing: 20px 0; + border-collapse: separate; + border: none; + margin: auto; +} +.kable_wrapper > tbody > tr > td { + vertical-align: top; +} +.book .book-body .page-wrapper .page-inner section.normal table tr.header { + border-top-width: 2px; +} +.book .book-body .page-wrapper .page-inner section.normal table tr:last-child td { + border-bottom-width: 2px; +} +.book .book-body .page-wrapper .page-inner section.normal table td, .book .book-body .page-wrapper .page-inner section.normal table th { + border-left: none; + border-right: none; +} +.book .book-body .page-wrapper .page-inner section.normal table.kable_wrapper > tbody > tr, .book .book-body .page-wrapper .page-inner 
section.normal table.kable_wrapper > tbody > tr > td { + border-top: none; +} +.book .book-body .page-wrapper .page-inner section.normal table.kable_wrapper > tbody > tr:last-child > td { + border-bottom: none; +} + +div.theorem, div.lemma, div.corollary, div.proposition, div.conjecture { + font-style: italic; +} +span.theorem, span.lemma, span.corollary, span.proposition, span.conjecture { + font-style: normal; +} +div.proof>*:last-child:after { + content: "\25a2"; + float: right; +} +.header-section-number { + padding-right: .5em; +} +#header .multi-author { + margin: 0.5em 0 -0.5em 0; +} +#header .date { + margin-top: 1.5em; +} diff --git a/libs/gitbook-2.6.7/css/plugin-clipboard.css b/libs/gitbook-2.6.7/css/plugin-clipboard.css new file mode 100644 index 00000000..6844a70a --- /dev/null +++ b/libs/gitbook-2.6.7/css/plugin-clipboard.css @@ -0,0 +1,18 @@ +div.sourceCode { + position: relative; +} + +.copy-to-clipboard-button { + position: absolute; + right: 0; + top: 0; + visibility: hidden; +} + +.copy-to-clipboard-button:focus { + outline: 0; +} + +div.sourceCode:hover > .copy-to-clipboard-button { + visibility: visible; +} diff --git a/libs/gitbook-2.6.7/css/plugin-fontsettings.css b/libs/gitbook-2.6.7/css/plugin-fontsettings.css new file mode 100644 index 00000000..3fa6f35b --- /dev/null +++ b/libs/gitbook-2.6.7/css/plugin-fontsettings.css @@ -0,0 +1,303 @@ +/* + * Theme 1 + */ +.color-theme-1 .dropdown-menu { + background-color: #111111; + border-color: #7e888b; +} +.color-theme-1 .dropdown-menu .dropdown-caret .caret-inner { + border-bottom: 9px solid #111111; +} +.color-theme-1 .dropdown-menu .buttons { + border-color: #7e888b; +} +.color-theme-1 .dropdown-menu .button { + color: #afa790; +} +.color-theme-1 .dropdown-menu .button:hover { + color: #73553c; +} +/* + * Theme 2 + */ +.color-theme-2 .dropdown-menu { + background-color: #2d3143; + border-color: #272a3a; +} +.color-theme-2 .dropdown-menu .dropdown-caret .caret-inner { + border-bottom: 9px solid 
#2d3143; +} +.color-theme-2 .dropdown-menu .buttons { + border-color: #272a3a; +} +.color-theme-2 .dropdown-menu .button { + color: #62677f; +} +.color-theme-2 .dropdown-menu .button:hover { + color: #f4f4f5; +} +.book .book-header .font-settings .font-enlarge { + line-height: 30px; + font-size: 1.4em; +} +.book .book-header .font-settings .font-reduce { + line-height: 30px; + font-size: 1em; +} + +/* sidebar transition background */ +div.book.color-theme-1 { + background: #f3eacb; +} +.book.color-theme-1 .book-body { + color: #704214; + background: #f3eacb; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section { + background: #f3eacb; +} + +/* sidebar transition background */ +div.book.color-theme-2 { + background: #1c1f2b; +} + +.book.color-theme-2 .book-body { + color: #bdcadb; + background: #1c1f2b; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section { + background: #1c1f2b; +} +.book.font-size-0 .book-body .page-inner section { + font-size: 1.2rem; +} +.book.font-size-1 .book-body .page-inner section { + font-size: 1.4rem; +} +.book.font-size-2 .book-body .page-inner section { + font-size: 1.6rem; +} +.book.font-size-3 .book-body .page-inner section { + font-size: 2.2rem; +} +.book.font-size-4 .book-body .page-inner section { + font-size: 4rem; +} +.book.font-family-0 { + font-family: Georgia, serif; +} +.book.font-family-1 { + font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal { + color: #704214; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal a { + color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h1, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h2, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h3, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h4, +.book.color-theme-1 .book-body .page-wrapper 
.page-inner section.normal h5, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h6 { + color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h1, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h2 { + border-color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal h6 { + color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal hr { + background-color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal blockquote { + border-color: #c4b29f; + opacity: 0.9; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code { + background: #fdf6e3; + color: #657b83; + border-color: #f8df9c; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal .highlight { + background-color: inherit; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal table th, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal table td { + border-color: #f5d06c; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal table tr { + color: inherit; + background-color: #fdf6e3; + border-color: #444444; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal table tr:nth-child(2n) { + background-color: #fbeecb; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal { + color: #bdcadb; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal a { + color: #3eb1d0; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h1, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h2, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h3, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h4, 
+.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h5, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h6 { + color: #fffffa; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h1, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h2 { + border-color: #373b4e; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal h6 { + color: #373b4e; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal hr { + background-color: #373b4e; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal blockquote { + border-color: #373b4e; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code { + color: #9dbed8; + background: #2d3143; + border-color: #2d3143; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal .highlight { + background-color: #282a39; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal table th, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal table td { + border-color: #3b3f54; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal table tr { + color: #b6c2d2; + background-color: #2d3143; + border-color: #3b3f54; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal table tr:nth-child(2n) { + background-color: #35394b; +} +.book.color-theme-1 .book-header { + color: #afa790; + background: transparent; +} +.book.color-theme-1 .book-header .btn { + color: #afa790; +} +.book.color-theme-1 .book-header .btn:hover { + color: #73553c; + background: none; +} +.book.color-theme-1 .book-header h1 { + color: #704214; +} +.book.color-theme-2 .book-header { + color: #7e888b; + background: transparent; +} +.book.color-theme-2 .book-header .btn { + color: #3b3f54; +} +.book.color-theme-2 .book-header .btn:hover { + 
color: #fffff5; + background: none; +} +.book.color-theme-2 .book-header h1 { + color: #bdcadb; +} +.book.color-theme-1 .book-body .navigation { + color: #afa790; +} +.book.color-theme-1 .book-body .navigation:hover { + color: #73553c; +} +.book.color-theme-2 .book-body .navigation { + color: #383f52; +} +.book.color-theme-2 .book-body .navigation:hover { + color: #fffff5; +} +/* + * Theme 1 + */ +.book.color-theme-1 .book-summary { + color: #afa790; + background: #111111; + border-right: 1px solid rgba(0, 0, 0, 0.07); +} +.book.color-theme-1 .book-summary .book-search { + background: transparent; +} +.book.color-theme-1 .book-summary .book-search input, +.book.color-theme-1 .book-summary .book-search input:focus { + border: 1px solid transparent; +} +.book.color-theme-1 .book-summary ul.summary li.divider { + background: #7e888b; + box-shadow: none; +} +.book.color-theme-1 .book-summary ul.summary li i.fa-check { + color: #33cc33; +} +.book.color-theme-1 .book-summary ul.summary li.done > a { + color: #877f6a; +} +.book.color-theme-1 .book-summary ul.summary li a, +.book.color-theme-1 .book-summary ul.summary li span { + color: #877f6a; + background: transparent; + font-weight: normal; +} +.book.color-theme-1 .book-summary ul.summary li.active > a, +.book.color-theme-1 .book-summary ul.summary li a:hover { + color: #704214; + background: transparent; + font-weight: normal; +} +/* + * Theme 2 + */ +.book.color-theme-2 .book-summary { + color: #bcc1d2; + background: #2d3143; + border-right: none; +} +.book.color-theme-2 .book-summary .book-search { + background: transparent; +} +.book.color-theme-2 .book-summary .book-search input, +.book.color-theme-2 .book-summary .book-search input:focus { + border: 1px solid transparent; +} +.book.color-theme-2 .book-summary ul.summary li.divider { + background: #272a3a; + box-shadow: none; +} +.book.color-theme-2 .book-summary ul.summary li i.fa-check { + color: #33cc33; +} +.book.color-theme-2 .book-summary ul.summary li.done 
> a { + color: #62687f; +} +.book.color-theme-2 .book-summary ul.summary li a, +.book.color-theme-2 .book-summary ul.summary li span { + color: #c1c6d7; + background: transparent; + font-weight: 600; +} +.book.color-theme-2 .book-summary ul.summary li.active > a, +.book.color-theme-2 .book-summary ul.summary li a:hover { + color: #f4f4f5; + background: #252737; + font-weight: 600; +} diff --git a/libs/gitbook-2.6.7/css/plugin-highlight.css b/libs/gitbook-2.6.7/css/plugin-highlight.css new file mode 100644 index 00000000..2aabd3de --- /dev/null +++ b/libs/gitbook-2.6.7/css/plugin-highlight.css @@ -0,0 +1,426 @@ +.book .book-body .page-wrapper .page-inner section.normal pre, +.book .book-body .page-wrapper .page-inner section.normal code { + /* http://jmblog.github.com/color-themes-for-google-code-highlightjs */ + /* Tomorrow Comment */ + /* Tomorrow Red */ + /* Tomorrow Orange */ + /* Tomorrow Yellow */ + /* Tomorrow Green */ + /* Tomorrow Aqua */ + /* Tomorrow Blue */ + /* Tomorrow Purple */ +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-comment, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-comment, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-title { + color: #8e908c; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-variable, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-variable, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-attribute, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-attribute, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-tag, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-tag, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-regexp, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-regexp, +.book .book-body .page-wrapper 
.page-inner section.normal pre .ruby .hljs-constant, +.book .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-constant, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-tag .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .xml .hljs-tag .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-pi, +.book .book-body .page-wrapper .page-inner section.normal code .xml .hljs-pi, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-doctype, +.book .book-body .page-wrapper .page-inner section.normal code .xml .hljs-doctype, +.book .book-body .page-wrapper .page-inner section.normal pre .html .hljs-doctype, +.book .book-body .page-wrapper .page-inner section.normal code .html .hljs-doctype, +.book .book-body .page-wrapper .page-inner section.normal pre .css .hljs-id, +.book .book-body .page-wrapper .page-inner section.normal code .css .hljs-id, +.book .book-body .page-wrapper .page-inner section.normal pre .css .hljs-class, +.book .book-body .page-wrapper .page-inner section.normal code .css .hljs-class, +.book .book-body .page-wrapper .page-inner section.normal pre .css .hljs-pseudo, +.book .book-body .page-wrapper .page-inner section.normal code .css .hljs-pseudo { + color: #c82829; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-number, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-number, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-preprocessor, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-preprocessor, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-pragma, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-pragma, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-built_in, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-built_in, +.book .book-body .page-wrapper 
.page-inner section.normal pre .hljs-literal, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-literal, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-params, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-params, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-constant, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-constant { + color: #f5871f; +} +.book .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-class .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-class .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal pre .css .hljs-rules .hljs-attribute, +.book .book-body .page-wrapper .page-inner section.normal code .css .hljs-rules .hljs-attribute { + color: #eab700; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-string, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-string, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-value, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-value, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-inheritance, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-inheritance, +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-header, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-header, +.book .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-symbol, +.book .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-symbol, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-cdata, +.book .book-body .page-wrapper .page-inner section.normal code .xml .hljs-cdata { + color: #718c00; +} +.book .book-body .page-wrapper .page-inner section.normal pre .css .hljs-hexcolor, +.book .book-body .page-wrapper .page-inner section.normal code .css 
.hljs-hexcolor { + color: #3e999f; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-function, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-function, +.book .book-body .page-wrapper .page-inner section.normal pre .python .hljs-decorator, +.book .book-body .page-wrapper .page-inner section.normal code .python .hljs-decorator, +.book .book-body .page-wrapper .page-inner section.normal pre .python .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .python .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-function .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-function .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-title .hljs-keyword, +.book .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-title .hljs-keyword, +.book .book-body .page-wrapper .page-inner section.normal pre .perl .hljs-sub, +.book .book-body .page-wrapper .page-inner section.normal code .perl .hljs-sub, +.book .book-body .page-wrapper .page-inner section.normal pre .javascript .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .javascript .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal pre .coffeescript .hljs-title, +.book .book-body .page-wrapper .page-inner section.normal code .coffeescript .hljs-title { + color: #4271ae; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs-keyword, +.book .book-body .page-wrapper .page-inner section.normal code .hljs-keyword, +.book .book-body .page-wrapper .page-inner section.normal pre .javascript .hljs-function, +.book .book-body .page-wrapper .page-inner section.normal code .javascript .hljs-function { + color: #8959a8; +} +.book .book-body .page-wrapper .page-inner section.normal pre .hljs, +.book .book-body .page-wrapper .page-inner section.normal code .hljs { + display: block; + 
background: white; + color: #4d4d4c; + padding: 0.5em; +} +.book .book-body .page-wrapper .page-inner section.normal pre .coffeescript .javascript, +.book .book-body .page-wrapper .page-inner section.normal code .coffeescript .javascript, +.book .book-body .page-wrapper .page-inner section.normal pre .javascript .xml, +.book .book-body .page-wrapper .page-inner section.normal code .javascript .xml, +.book .book-body .page-wrapper .page-inner section.normal pre .tex .hljs-formula, +.book .book-body .page-wrapper .page-inner section.normal code .tex .hljs-formula, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .javascript, +.book .book-body .page-wrapper .page-inner section.normal code .xml .javascript, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .vbscript, +.book .book-body .page-wrapper .page-inner section.normal code .xml .vbscript, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .css, +.book .book-body .page-wrapper .page-inner section.normal code .xml .css, +.book .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-cdata, +.book .book-body .page-wrapper .page-inner section.normal code .xml .hljs-cdata { + opacity: 0.5; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code { + /* + +Orginal Style from ethanschoonover.com/solarized (c) Jeremy Hull + +*/ + /* Solarized Green */ + /* Solarized Cyan */ + /* Solarized Blue */ + /* Solarized Yellow */ + /* Solarized Orange */ + /* Solarized Red */ + /* Solarized Violet */ +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs { + display: block; + padding: 0.5em; + background: #fdf6e3; + color: #657b83; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-comment, +.book.color-theme-1 
.book-body .page-wrapper .page-inner section.normal code .hljs-comment, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-template_comment, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-template_comment, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .diff .hljs-header, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .diff .hljs-header, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-doctype, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-doctype, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-pi, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-pi, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .lisp .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .lisp .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-javadoc, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-javadoc { + color: #93a1a1; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-keyword, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-keyword, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-winutils, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-winutils, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .method, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .method, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-addition, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-addition, +.book.color-theme-1 .book-body 
.page-wrapper .page-inner section.normal pre .css .hljs-tag, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .css .hljs-tag, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-request, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-request, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-status, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-status, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .nginx .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .nginx .hljs-title { + color: #859900; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-number, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-number, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-command, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-command, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-tag .hljs-value, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-tag .hljs-value, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-rules .hljs-value, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-rules .hljs-value, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-phpdoc, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-phpdoc, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .tex .hljs-formula, +.book.color-theme-1 
.book-body .page-wrapper .page-inner section.normal code .tex .hljs-formula, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-regexp, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-regexp, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-hexcolor, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-hexcolor, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-link_url, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-link_url { + color: #2aa198; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-localvars, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-localvars, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-chunk, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-chunk, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-decorator, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-decorator, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-built_in, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-built_in, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-identifier, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-identifier, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .vhdl .hljs-literal, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .vhdl .hljs-literal, +.book.color-theme-1 
.book-body .page-wrapper .page-inner section.normal pre .hljs-id, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-id, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-function, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .css .hljs-function { + color: #268bd2; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-attribute, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-attribute, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-variable, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-variable, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .lisp .hljs-body, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .lisp .hljs-body, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .smalltalk .hljs-number, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .smalltalk .hljs-number, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-constant, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-constant, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-class .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-class .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-parent, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-parent, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .haskell .hljs-type, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .haskell .hljs-type, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre 
.hljs-link_reference, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-link_reference { + color: #b58900; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-preprocessor, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-preprocessor, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-preprocessor .hljs-keyword, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-preprocessor .hljs-keyword, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-pragma, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-pragma, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-shebang, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-shebang, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-symbol, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-symbol, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-symbol .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-symbol .hljs-string, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .diff .hljs-change, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .diff .hljs-change, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-special, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-special, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-attr_selector, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-attr_selector, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-subst, 
+.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-subst, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-cdata, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-cdata, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .clojure .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .clojure .hljs-title, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-pseudo, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .css .hljs-pseudo, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-header, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-header { + color: #cb4b16; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-deletion, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-deletion, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-important, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-important { + color: #dc322f; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .hljs-link_label, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .hljs-link_label { + color: #6c71c4; +} +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal pre .tex .hljs-formula, +.book.color-theme-1 .book-body .page-wrapper .page-inner section.normal code .tex .hljs-formula { + background: #eee8d5; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code { + /* Tomorrow Night Bright Theme */ + /* Original theme - https://github.com/chriskempson/tomorrow-theme */ + /* 
http://jmblog.github.com/color-themes-for-google-code-highlightjs */ + /* Tomorrow Comment */ + /* Tomorrow Red */ + /* Tomorrow Orange */ + /* Tomorrow Yellow */ + /* Tomorrow Green */ + /* Tomorrow Aqua */ + /* Tomorrow Blue */ + /* Tomorrow Purple */ +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-comment, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-comment, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-title { + color: #969896; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-variable, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-variable, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-attribute, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-attribute, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-tag, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-tag, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-regexp, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-regexp, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-constant, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-constant, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-tag .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .hljs-tag .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-pi, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .hljs-pi, +.book.color-theme-2 
.book-body .page-wrapper .page-inner section.normal pre .xml .hljs-doctype, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .hljs-doctype, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .html .hljs-doctype, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .html .hljs-doctype, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-id, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .css .hljs-id, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-class, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .css .hljs-class, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-pseudo, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .css .hljs-pseudo { + color: #d54e53; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-number, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-number, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-preprocessor, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-preprocessor, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-pragma, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-pragma, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-built_in, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-built_in, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-literal, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-literal, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-params, 
+.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-params, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-constant, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-constant { + color: #e78c45; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-class .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-class .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .css .hljs-rules .hljs-attribute, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .css .hljs-rules .hljs-attribute { + color: #e7c547; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-string, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-string, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-value, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-value, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-inheritance, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-inheritance, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-header, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-header, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-symbol, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-symbol, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-cdata, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .hljs-cdata { + color: #b9ca4a; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .css 
.hljs-hexcolor, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .css .hljs-hexcolor { + color: #70c0b1; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-function, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-function, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .python .hljs-decorator, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .python .hljs-decorator, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .python .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .python .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-function .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-function .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .ruby .hljs-title .hljs-keyword, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .ruby .hljs-title .hljs-keyword, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .perl .hljs-sub, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .perl .hljs-sub, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .javascript .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .javascript .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .coffeescript .hljs-title, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .coffeescript .hljs-title { + color: #7aa6da; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs-keyword, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs-keyword, 
+.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .javascript .hljs-function, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .javascript .hljs-function { + color: #c397d8; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .hljs, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .hljs { + display: block; + background: black; + color: #eaeaea; + padding: 0.5em; +} +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .coffeescript .javascript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .coffeescript .javascript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .javascript .xml, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .javascript .xml, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .tex .hljs-formula, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .tex .hljs-formula, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .javascript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .javascript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .vbscript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .vbscript, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .css, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .css, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal pre .xml .hljs-cdata, +.book.color-theme-2 .book-body .page-wrapper .page-inner section.normal code .xml .hljs-cdata { + opacity: 0.5; +} diff --git a/libs/gitbook-2.6.7/css/plugin-search.css b/libs/gitbook-2.6.7/css/plugin-search.css new file mode 100644 index 00000000..c85e557a --- /dev/null 
+++ b/libs/gitbook-2.6.7/css/plugin-search.css @@ -0,0 +1,31 @@ +.book .book-summary .book-search { + padding: 6px; + background: transparent; + position: absolute; + top: -50px; + left: 0px; + right: 0px; + transition: top 0.5s ease; +} +.book .book-summary .book-search input, +.book .book-summary .book-search input:focus, +.book .book-summary .book-search input:hover { + width: 100%; + background: transparent; + border: 1px solid #ccc; + box-shadow: none; + outline: none; + line-height: 22px; + padding: 7px 4px; + color: inherit; + box-sizing: border-box; +} +.book.with-search .book-summary .book-search { + top: 0px; +} +.book.with-search .book-summary ul.summary { + top: 50px; +} +.with-search .summary li[data-level] a[href*=".html#"] { + display: none; +} diff --git a/libs/gitbook-2.6.7/css/plugin-table.css b/libs/gitbook-2.6.7/css/plugin-table.css new file mode 100644 index 00000000..7fba1b9f --- /dev/null +++ b/libs/gitbook-2.6.7/css/plugin-table.css @@ -0,0 +1 @@ +.book .book-body .page-wrapper .page-inner section.normal table{display:table;width:100%;border-collapse:collapse;border-spacing:0;overflow:auto}.book .book-body .page-wrapper .page-inner section.normal table td,.book .book-body .page-wrapper .page-inner section.normal table th{padding:6px 13px;border:1px solid #ddd}.book .book-body .page-wrapper .page-inner section.normal table tr{background-color:#fff;border-top:1px solid #ccc}.book .book-body .page-wrapper .page-inner section.normal table tr:nth-child(2n){background-color:#f8f8f8}.book .book-body .page-wrapper .page-inner section.normal table th{font-weight:700} diff --git a/libs/gitbook-2.6.7/css/style.css b/libs/gitbook-2.6.7/css/style.css new file mode 100644 index 00000000..cba69b23 --- /dev/null +++ b/libs/gitbook-2.6.7/css/style.css @@ -0,0 +1,13 @@ +/*! 
normalize.css v2.1.0 | MIT License | git.io/normalize */img,legend{border:0}*{-webkit-font-smoothing:antialiased}sub,sup{position:relative}.book .book-body .page-wrapper .page-inner section.normal hr:after,.book-langs-index .inner .languages:after,.buttons:after,.dropdown-menu .buttons:after{clear:both}body,html{-ms-text-size-adjust:100%;-webkit-text-size-adjust:100%}article,aside,details,figcaption,figure,footer,header,hgroup,main,nav,section,summary{display:block}audio,canvas,video{display:inline-block}.hidden,[hidden]{display:none}audio:not([controls]){display:none;height:0}html{font-family:sans-serif}body,figure{margin:0}a:focus{outline:dotted thin}a:active,a:hover{outline:0}h1{font-size:2em;margin:.67em 0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:700}dfn{font-style:italic}hr{-moz-box-sizing:content-box;box-sizing:content-box;height:0}mark{background:#ff0;color:#000}code,kbd,pre,samp{font-family:monospace,serif;font-size:1em}pre{white-space:pre-wrap}q{quotes:"\201C" "\201D" "\2018" "\2019"}small{font-size:80%}sub,sup{font-size:75%;line-height:0;vertical-align:baseline}sup{top:-.5em}sub{bottom:-.25em}svg:not(:root){overflow:hidden}fieldset{border:1px solid silver;margin:0 2px;padding:.35em .625em .75em}legend{padding:0}button,input,select,textarea{font-family:inherit;font-size:100%;margin:0}button,input{line-height:normal}button,select{text-transform:none}button,html input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer}button[disabled],html input[disabled]{cursor:default}input[type=checkbox],input[type=radio]{box-sizing:border-box;padding:0}input[type=search]{-webkit-appearance:textfield;-moz-box-sizing:content-box;-webkit-box-sizing:content-box;box-sizing:content-box}input[type=search]::-webkit-search-cancel-button{margin-right:10px;}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}textarea{overflow:auto;vertical-align:top}table{border-collapse:collapse;border-spacing:0}/*! 
+ * Preboot v2 + * + * Open sourced under MIT license by @mdo. + * Some variables and mixins from Bootstrap (Apache 2 license). + */.link-inherit,.link-inherit:focus,.link-inherit:hover{color:inherit}/*! + * Font Awesome 4.7.0 by @davegandy - http://fontawesome.io - @fontawesome + * License - http://fontawesome.io/license (Font: SIL OFL 1.1, CSS: MIT License) + */@font-face{font-family:'FontAwesome';src:url('./fontawesome/fontawesome-webfont.ttf?v=4.7.0') format('truetype');font-weight:normal;font-style:normal}.fa{display:inline-block;font:normal normal normal 14px/1 FontAwesome;font-size:inherit;text-rendering:auto;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}.fa-lg{font-size:1.33333333em;line-height:.75em;vertical-align:-15%}.fa-2x{font-size:2em}.fa-3x{font-size:3em}.fa-4x{font-size:4em}.fa-5x{font-size:5em}.fa-fw{width:1.28571429em;text-align:center}.fa-ul{padding-left:0;margin-left:2.14285714em;list-style-type:none}.fa-ul>li{position:relative}.fa-li{position:absolute;left:-2.14285714em;width:2.14285714em;top:.14285714em;text-align:center}.fa-li.fa-lg{left:-1.85714286em}.fa-border{padding:.2em .25em .15em;border:solid .08em #eee;border-radius:.1em}.fa-pull-left{float:left}.fa-pull-right{float:right}.fa.fa-pull-left{margin-right:.3em}.fa.fa-pull-right{margin-left:.3em}.pull-right{float:right}.pull-left{float:left}.fa.pull-left{margin-right:.3em}.fa.pull-right{margin-left:.3em}.fa-spin{-webkit-animation:fa-spin 2s infinite linear;animation:fa-spin 2s infinite linear}.fa-pulse{-webkit-animation:fa-spin 1s infinite steps(8);animation:fa-spin 1s infinite steps(8)}@-webkit-keyframes fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}100%{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}@keyframes 
fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}100%{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}.fa-rotate-90{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=1)";-webkit-transform:rotate(90deg);-ms-transform:rotate(90deg);transform:rotate(90deg)}.fa-rotate-180{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2)";-webkit-transform:rotate(180deg);-ms-transform:rotate(180deg);transform:rotate(180deg)}.fa-rotate-270{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=3)";-webkit-transform:rotate(270deg);-ms-transform:rotate(270deg);transform:rotate(270deg)}.fa-flip-horizontal{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=0, mirror=1)";-webkit-transform:scale(-1, 1);-ms-transform:scale(-1, 1);transform:scale(-1, 1)}.fa-flip-vertical{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2, mirror=1)";-webkit-transform:scale(1, -1);-ms-transform:scale(1, -1);transform:scale(1, -1)}:root .fa-rotate-90,:root .fa-rotate-180,:root .fa-rotate-270,:root .fa-flip-horizontal,:root 
.fa-flip-vertical{filter:none}.fa-stack{position:relative;display:inline-block;width:2em;height:2em;line-height:2em;vertical-align:middle}.fa-stack-1x,.fa-stack-2x{position:absolute;left:0;width:100%;text-align:center}.fa-stack-1x{line-height:inherit}.fa-stack-2x{font-size:2em}.fa-inverse{color:#fff}.fa-glass:before{content:"\f000"}.fa-music:before{content:"\f001"}.fa-search:before{content:"\f002"}.fa-envelope-o:before{content:"\f003"}.fa-heart:before{content:"\f004"}.fa-star:before{content:"\f005"}.fa-star-o:before{content:"\f006"}.fa-user:before{content:"\f007"}.fa-film:before{content:"\f008"}.fa-th-large:before{content:"\f009"}.fa-th:before{content:"\f00a"}.fa-th-list:before{content:"\f00b"}.fa-check:before{content:"\f00c"}.fa-remove:before,.fa-close:before,.fa-times:before{content:"\f00d"}.fa-search-plus:before{content:"\f00e"}.fa-search-minus:before{content:"\f010"}.fa-power-off:before{content:"\f011"}.fa-signal:before{content:"\f012"}.fa-gear:before,.fa-cog:before{content:"\f013"}.fa-trash-o:before{content:"\f014"}.fa-home:before{content:"\f015"}.fa-file-o:before{content:"\f016"}.fa-clock-o:before{content:"\f017"}.fa-road:before{content:"\f018"}.fa-download:before{content:"\f019"}.fa-arrow-circle-o-down:before{content:"\f01a"}.fa-arrow-circle-o-up:before{content:"\f01b"}.fa-inbox:before{content:"\f01c"}.fa-play-circle-o:before{content:"\f01d"}.fa-rotate-right:before,.fa-repeat:before{content:"\f01e"}.fa-refresh:before{content:"\f021"}.fa-list-alt:before{content:"\f022"}.fa-lock:before{content:"\f023"}.fa-flag:before{content:"\f024"}.fa-headphones:before{content:"\f025"}.fa-volume-off:before{content:"\f026"}.fa-volume-down:before{content:"\f027"}.fa-volume-up:before{content:"\f028"}.fa-qrcode:before{content:"\f029"}.fa-barcode:before{content:"\f02a"}.fa-tag:before{content:"\f02b"}.fa-tags:before{content:"\f02c"}.fa-book:before{content:"\f02d"}.fa-bookmark:before{content:"\f02e"}.fa-print:before{content:"\f02f"}.fa-camera:before{content:"\f030"}.fa-font:before{c
ontent:"\f031"}.fa-bold:before{content:"\f032"}.fa-italic:before{content:"\f033"}.fa-text-height:before{content:"\f034"}.fa-text-width:before{content:"\f035"}.fa-align-left:before{content:"\f036"}.fa-align-center:before{content:"\f037"}.fa-align-right:before{content:"\f038"}.fa-align-justify:before{content:"\f039"}.fa-list:before{content:"\f03a"}.fa-dedent:before,.fa-outdent:before{content:"\f03b"}.fa-indent:before{content:"\f03c"}.fa-video-camera:before{content:"\f03d"}.fa-photo:before,.fa-image:before,.fa-picture-o:before{content:"\f03e"}.fa-pencil:before{content:"\f040"}.fa-map-marker:before{content:"\f041"}.fa-adjust:before{content:"\f042"}.fa-tint:before{content:"\f043"}.fa-edit:before,.fa-pencil-square-o:before{content:"\f044"}.fa-share-square-o:before{content:"\f045"}.fa-check-square-o:before{content:"\f046"}.fa-arrows:before{content:"\f047"}.fa-step-backward:before{content:"\f048"}.fa-fast-backward:before{content:"\f049"}.fa-backward:before{content:"\f04a"}.fa-play:before{content:"\f04b"}.fa-pause:before{content:"\f04c"}.fa-stop:before{content:"\f04d"}.fa-forward:before{content:"\f04e"}.fa-fast-forward:before{content:"\f050"}.fa-step-forward:before{content:"\f051"}.fa-eject:before{content:"\f052"}.fa-chevron-left:before{content:"\f053"}.fa-chevron-right:before{content:"\f054"}.fa-plus-circle:before{content:"\f055"}.fa-minus-circle:before{content:"\f056"}.fa-times-circle:before{content:"\f057"}.fa-check-circle:before{content:"\f058"}.fa-question-circle:before{content:"\f059"}.fa-info-circle:before{content:"\f05a"}.fa-crosshairs:before{content:"\f05b"}.fa-times-circle-o:before{content:"\f05c"}.fa-check-circle-o:before{content:"\f05d"}.fa-ban:before{content:"\f05e"}.fa-arrow-left:before{content:"\f060"}.fa-arrow-right:before{content:"\f061"}.fa-arrow-up:before{content:"\f062"}.fa-arrow-down:before{content:"\f063"}.fa-mail-forward:before,.fa-share:before{content:"\f064"}.fa-expand:before{content:"\f065"}.fa-compress:before{content:"\f066"}.fa-plus:before{content
:"\f067"}.fa-minus:before{content:"\f068"}.fa-asterisk:before{content:"\f069"}.fa-exclamation-circle:before{content:"\f06a"}.fa-gift:before{content:"\f06b"}.fa-leaf:before{content:"\f06c"}.fa-fire:before{content:"\f06d"}.fa-eye:before{content:"\f06e"}.fa-eye-slash:before{content:"\f070"}.fa-warning:before,.fa-exclamation-triangle:before{content:"\f071"}.fa-plane:before{content:"\f072"}.fa-calendar:before{content:"\f073"}.fa-random:before{content:"\f074"}.fa-comment:before{content:"\f075"}.fa-magnet:before{content:"\f076"}.fa-chevron-up:before{content:"\f077"}.fa-chevron-down:before{content:"\f078"}.fa-retweet:before{content:"\f079"}.fa-shopping-cart:before{content:"\f07a"}.fa-folder:before{content:"\f07b"}.fa-folder-open:before{content:"\f07c"}.fa-arrows-v:before{content:"\f07d"}.fa-arrows-h:before{content:"\f07e"}.fa-bar-chart-o:before,.fa-bar-chart:before{content:"\f080"}.fa-twitter-square:before{content:"\f081"}.fa-facebook-square:before{content:"\f082"}.fa-camera-retro:before{content:"\f083"}.fa-key:before{content:"\f084"}.fa-gears:before,.fa-cogs:before{content:"\f085"}.fa-comments:before{content:"\f086"}.fa-thumbs-o-up:before{content:"\f087"}.fa-thumbs-o-down:before{content:"\f088"}.fa-star-half:before{content:"\f089"}.fa-heart-o:before{content:"\f08a"}.fa-sign-out:before{content:"\f08b"}.fa-linkedin-square:before{content:"\f08c"}.fa-thumb-tack:before{content:"\f08d"}.fa-external-link:before{content:"\f08e"}.fa-sign-in:before{content:"\f090"}.fa-trophy:before{content:"\f091"}.fa-github-square:before{content:"\f092"}.fa-upload:before{content:"\f093"}.fa-lemon-o:before{content:"\f094"}.fa-phone:before{content:"\f095"}.fa-square-o:before{content:"\f096"}.fa-bookmark-o:before{content:"\f097"}.fa-phone-square:before{content:"\f098"}.fa-twitter:before{content:"\f099"}.fa-facebook-f:before,.fa-facebook:before{content:"\f09a"}.fa-github:before{content:"\f09b"}.fa-unlock:before{content:"\f09c"}.fa-credit-card:before{content:"\f09d"}.fa-feed:before,.fa-rss:before{conten
t:"\f09e"}.fa-hdd-o:before{content:"\f0a0"}.fa-bullhorn:before{content:"\f0a1"}.fa-bell:before{content:"\f0f3"}.fa-certificate:before{content:"\f0a3"}.fa-hand-o-right:before{content:"\f0a4"}.fa-hand-o-left:before{content:"\f0a5"}.fa-hand-o-up:before{content:"\f0a6"}.fa-hand-o-down:before{content:"\f0a7"}.fa-arrow-circle-left:before{content:"\f0a8"}.fa-arrow-circle-right:before{content:"\f0a9"}.fa-arrow-circle-up:before{content:"\f0aa"}.fa-arrow-circle-down:before{content:"\f0ab"}.fa-globe:before{content:"\f0ac"}.fa-wrench:before{content:"\f0ad"}.fa-tasks:before{content:"\f0ae"}.fa-filter:before{content:"\f0b0"}.fa-briefcase:before{content:"\f0b1"}.fa-arrows-alt:before{content:"\f0b2"}.fa-group:before,.fa-users:before{content:"\f0c0"}.fa-chain:before,.fa-link:before{content:"\f0c1"}.fa-cloud:before{content:"\f0c2"}.fa-flask:before{content:"\f0c3"}.fa-cut:before,.fa-scissors:before{content:"\f0c4"}.fa-copy:before,.fa-files-o:before{content:"\f0c5"}.fa-paperclip:before{content:"\f0c6"}.fa-save:before,.fa-floppy-o:before{content:"\f0c7"}.fa-square:before{content:"\f0c8"}.fa-navicon:before,.fa-reorder:before,.fa-bars:before{content:"\f0c9"}.fa-list-ul:before{content:"\f0ca"}.fa-list-ol:before{content:"\f0cb"}.fa-strikethrough:before{content:"\f0cc"}.fa-underline:before{content:"\f0cd"}.fa-table:before{content:"\f0ce"}.fa-magic:before{content:"\f0d0"}.fa-truck:before{content:"\f0d1"}.fa-pinterest:before{content:"\f0d2"}.fa-pinterest-square:before{content:"\f0d3"}.fa-google-plus-square:before{content:"\f0d4"}.fa-google-plus:before{content:"\f0d5"}.fa-money:before{content:"\f0d6"}.fa-caret-down:before{content:"\f0d7"}.fa-caret-up:before{content:"\f0d8"}.fa-caret-left:before{content:"\f0d9"}.fa-caret-right:before{content:"\f0da"}.fa-columns:before{content:"\f0db"}.fa-unsorted:before,.fa-sort:before{content:"\f0dc"}.fa-sort-down:before,.fa-sort-desc:before{content:"\f0dd"}.fa-sort-up:before,.fa-sort-asc:before{content:"\f0de"}.fa-envelope:before{content:"\f0e0"}.fa-linkedin:b
efore{content:"\f0e1"}.fa-rotate-left:before,.fa-undo:before{content:"\f0e2"}.fa-legal:before,.fa-gavel:before{content:"\f0e3"}.fa-dashboard:before,.fa-tachometer:before{content:"\f0e4"}.fa-comment-o:before{content:"\f0e5"}.fa-comments-o:before{content:"\f0e6"}.fa-flash:before,.fa-bolt:before{content:"\f0e7"}.fa-sitemap:before{content:"\f0e8"}.fa-umbrella:before{content:"\f0e9"}.fa-paste:before,.fa-clipboard:before{content:"\f0ea"}.fa-lightbulb-o:before{content:"\f0eb"}.fa-exchange:before{content:"\f0ec"}.fa-cloud-download:before{content:"\f0ed"}.fa-cloud-upload:before{content:"\f0ee"}.fa-user-md:before{content:"\f0f0"}.fa-stethoscope:before{content:"\f0f1"}.fa-suitcase:before{content:"\f0f2"}.fa-bell-o:before{content:"\f0a2"}.fa-coffee:before{content:"\f0f4"}.fa-cutlery:before{content:"\f0f5"}.fa-file-text-o:before{content:"\f0f6"}.fa-building-o:before{content:"\f0f7"}.fa-hospital-o:before{content:"\f0f8"}.fa-ambulance:before{content:"\f0f9"}.fa-medkit:before{content:"\f0fa"}.fa-fighter-jet:before{content:"\f0fb"}.fa-beer:before{content:"\f0fc"}.fa-h-square:before{content:"\f0fd"}.fa-plus-square:before{content:"\f0fe"}.fa-angle-double-left:before{content:"\f100"}.fa-angle-double-right:before{content:"\f101"}.fa-angle-double-up:before{content:"\f102"}.fa-angle-double-down:before{content:"\f103"}.fa-angle-left:before{content:"\f104"}.fa-angle-right:before{content:"\f105"}.fa-angle-up:before{content:"\f106"}.fa-angle-down:before{content:"\f107"}.fa-desktop:before{content:"\f108"}.fa-laptop:before{content:"\f109"}.fa-tablet:before{content:"\f10a"}.fa-mobile-phone:before,.fa-mobile:before{content:"\f10b"}.fa-circle-o:before{content:"\f10c"}.fa-quote-left:before{content:"\f10d"}.fa-quote-right:before{content:"\f10e"}.fa-spinner:before{content:"\f110"}.fa-circle:before{content:"\f111"}.fa-mail-reply:before,.fa-reply:before{content:"\f112"}.fa-github-alt:before{content:"\f113"}.fa-folder-o:before{content:"\f114"}.fa-folder-open-o:before{content:"\f115"}.fa-smile-o:before{c
ontent:"\f118"}.fa-frown-o:before{content:"\f119"}.fa-meh-o:before{content:"\f11a"}.fa-gamepad:before{content:"\f11b"}.fa-keyboard-o:before{content:"\f11c"}.fa-flag-o:before{content:"\f11d"}.fa-flag-checkered:before{content:"\f11e"}.fa-terminal:before{content:"\f120"}.fa-code:before{content:"\f121"}.fa-mail-reply-all:before,.fa-reply-all:before{content:"\f122"}.fa-star-half-empty:before,.fa-star-half-full:before,.fa-star-half-o:before{content:"\f123"}.fa-location-arrow:before{content:"\f124"}.fa-crop:before{content:"\f125"}.fa-code-fork:before{content:"\f126"}.fa-unlink:before,.fa-chain-broken:before{content:"\f127"}.fa-question:before{content:"\f128"}.fa-info:before{content:"\f129"}.fa-exclamation:before{content:"\f12a"}.fa-superscript:before{content:"\f12b"}.fa-subscript:before{content:"\f12c"}.fa-eraser:before{content:"\f12d"}.fa-puzzle-piece:before{content:"\f12e"}.fa-microphone:before{content:"\f130"}.fa-microphone-slash:before{content:"\f131"}.fa-shield:before{content:"\f132"}.fa-calendar-o:before{content:"\f133"}.fa-fire-extinguisher:before{content:"\f134"}.fa-rocket:before{content:"\f135"}.fa-maxcdn:before{content:"\f136"}.fa-chevron-circle-left:before{content:"\f137"}.fa-chevron-circle-right:before{content:"\f138"}.fa-chevron-circle-up:before{content:"\f139"}.fa-chevron-circle-down:before{content:"\f13a"}.fa-html5:before{content:"\f13b"}.fa-css3:before{content:"\f13c"}.fa-anchor:before{content:"\f13d"}.fa-unlock-alt:before{content:"\f13e"}.fa-bullseye:before{content:"\f140"}.fa-ellipsis-h:before{content:"\f141"}.fa-ellipsis-v:before{content:"\f142"}.fa-rss-square:before{content:"\f143"}.fa-play-circle:before{content:"\f144"}.fa-ticket:before{content:"\f145"}.fa-minus-square:before{content:"\f146"}.fa-minus-square-o:before{content:"\f147"}.fa-level-up:before{content:"\f148"}.fa-level-down:before{content:"\f149"}.fa-check-square:before{content:"\f14a"}.fa-pencil-square:before{content:"\f14b"}.fa-external-link-square:before{content:"\f14c"}.fa-share-square:bef
ore{content:"\f14d"}.fa-compass:before{content:"\f14e"}.fa-toggle-down:before,.fa-caret-square-o-down:before{content:"\f150"}.fa-toggle-up:before,.fa-caret-square-o-up:before{content:"\f151"}.fa-toggle-right:before,.fa-caret-square-o-right:before{content:"\f152"}.fa-euro:before,.fa-eur:before{content:"\f153"}.fa-gbp:before{content:"\f154"}.fa-dollar:before,.fa-usd:before{content:"\f155"}.fa-rupee:before,.fa-inr:before{content:"\f156"}.fa-cny:before,.fa-rmb:before,.fa-yen:before,.fa-jpy:before{content:"\f157"}.fa-ruble:before,.fa-rouble:before,.fa-rub:before{content:"\f158"}.fa-won:before,.fa-krw:before{content:"\f159"}.fa-bitcoin:before,.fa-btc:before{content:"\f15a"}.fa-file:before{content:"\f15b"}.fa-file-text:before{content:"\f15c"}.fa-sort-alpha-asc:before{content:"\f15d"}.fa-sort-alpha-desc:before{content:"\f15e"}.fa-sort-amount-asc:before{content:"\f160"}.fa-sort-amount-desc:before{content:"\f161"}.fa-sort-numeric-asc:before{content:"\f162"}.fa-sort-numeric-desc:before{content:"\f163"}.fa-thumbs-up:before{content:"\f164"}.fa-thumbs-down:before{content:"\f165"}.fa-youtube-square:before{content:"\f166"}.fa-youtube:before{content:"\f167"}.fa-xing:before{content:"\f168"}.fa-xing-square:before{content:"\f169"}.fa-youtube-play:before{content:"\f16a"}.fa-dropbox:before{content:"\f16b"}.fa-stack-overflow:before{content:"\f16c"}.fa-instagram:before{content:"\f16d"}.fa-flickr:before{content:"\f16e"}.fa-adn:before{content:"\f170"}.fa-bitbucket:before{content:"\f171"}.fa-bitbucket-square:before{content:"\f172"}.fa-tumblr:before{content:"\f173"}.fa-tumblr-square:before{content:"\f174"}.fa-long-arrow-down:before{content:"\f175"}.fa-long-arrow-up:before{content:"\f176"}.fa-long-arrow-left:before{content:"\f177"}.fa-long-arrow-right:before{content:"\f178"}.fa-apple:before{content:"\f179"}.fa-windows:before{content:"\f17a"}.fa-android:before{content:"\f17b"}.fa-linux:before{content:"\f17c"}.fa-dribbble:before{content:"\f17d"}.fa-skype:before{content:"\f17e"}.fa-foursquare:befo
re{content:"\f180"}.fa-trello:before{content:"\f181"}.fa-female:before{content:"\f182"}.fa-male:before{content:"\f183"}.fa-gittip:before,.fa-gratipay:before{content:"\f184"}.fa-sun-o:before{content:"\f185"}.fa-moon-o:before{content:"\f186"}.fa-archive:before{content:"\f187"}.fa-bug:before{content:"\f188"}.fa-vk:before{content:"\f189"}.fa-weibo:before{content:"\f18a"}.fa-renren:before{content:"\f18b"}.fa-pagelines:before{content:"\f18c"}.fa-stack-exchange:before{content:"\f18d"}.fa-arrow-circle-o-right:before{content:"\f18e"}.fa-arrow-circle-o-left:before{content:"\f190"}.fa-toggle-left:before,.fa-caret-square-o-left:before{content:"\f191"}.fa-dot-circle-o:before{content:"\f192"}.fa-wheelchair:before{content:"\f193"}.fa-vimeo-square:before{content:"\f194"}.fa-turkish-lira:before,.fa-try:before{content:"\f195"}.fa-plus-square-o:before{content:"\f196"}.fa-space-shuttle:before{content:"\f197"}.fa-slack:before{content:"\f198"}.fa-envelope-square:before{content:"\f199"}.fa-wordpress:before{content:"\f19a"}.fa-openid:before{content:"\f19b"}.fa-institution:before,.fa-bank:before,.fa-university:before{content:"\f19c"}.fa-mortar-board:before,.fa-graduation-cap:before{content:"\f19d"}.fa-yahoo:before{content:"\f19e"}.fa-google:before{content:"\f1a0"}.fa-reddit:before{content:"\f1a1"}.fa-reddit-square:before{content:"\f1a2"}.fa-stumbleupon-circle:before{content:"\f1a3"}.fa-stumbleupon:before{content:"\f1a4"}.fa-delicious:before{content:"\f1a5"}.fa-digg:before{content:"\f1a6"}.fa-pied-piper-pp:before{content:"\f1a7"}.fa-pied-piper-alt:before{content:"\f1a8"}.fa-drupal:before{content:"\f1a9"}.fa-joomla:before{content:"\f1aa"}.fa-language:before{content:"\f1ab"}.fa-fax:before{content:"\f1ac"}.fa-building:before{content:"\f1ad"}.fa-child:before{content:"\f1ae"}.fa-paw:before{content:"\f1b0"}.fa-spoon:before{content:"\f1b1"}.fa-cube:before{content:"\f1b2"}.fa-cubes:before{content:"\f1b3"}.fa-behance:before{content:"\f1b4"}.fa-behance-square:before{content:"\f1b5"}.fa-steam:before{co
ntent:"\f1b6"}.fa-steam-square:before{content:"\f1b7"}.fa-recycle:before{content:"\f1b8"}.fa-automobile:before,.fa-car:before{content:"\f1b9"}.fa-cab:before,.fa-taxi:before{content:"\f1ba"}.fa-tree:before{content:"\f1bb"}.fa-spotify:before{content:"\f1bc"}.fa-deviantart:before{content:"\f1bd"}.fa-soundcloud:before{content:"\f1be"}.fa-database:before{content:"\f1c0"}.fa-file-pdf-o:before{content:"\f1c1"}.fa-file-word-o:before{content:"\f1c2"}.fa-file-excel-o:before{content:"\f1c3"}.fa-file-powerpoint-o:before{content:"\f1c4"}.fa-file-photo-o:before,.fa-file-picture-o:before,.fa-file-image-o:before{content:"\f1c5"}.fa-file-zip-o:before,.fa-file-archive-o:before{content:"\f1c6"}.fa-file-sound-o:before,.fa-file-audio-o:before{content:"\f1c7"}.fa-file-movie-o:before,.fa-file-video-o:before{content:"\f1c8"}.fa-file-code-o:before{content:"\f1c9"}.fa-vine:before{content:"\f1ca"}.fa-codepen:before{content:"\f1cb"}.fa-jsfiddle:before{content:"\f1cc"}.fa-life-bouy:before,.fa-life-buoy:before,.fa-life-saver:before,.fa-support:before,.fa-life-ring:before{content:"\f1cd"}.fa-circle-o-notch:before{content:"\f1ce"}.fa-ra:before,.fa-resistance:before,.fa-rebel:before{content:"\f1d0"}.fa-ge:before,.fa-empire:before{content:"\f1d1"}.fa-git-square:before{content:"\f1d2"}.fa-git:before{content:"\f1d3"}.fa-y-combinator-square:before,.fa-yc-square:before,.fa-hacker-news:before{content:"\f1d4"}.fa-tencent-weibo:before{content:"\f1d5"}.fa-qq:before{content:"\f1d6"}.fa-wechat:before,.fa-weixin:before{content:"\f1d7"}.fa-send:before,.fa-paper-plane:before{content:"\f1d8"}.fa-send-o:before,.fa-paper-plane-o:before{content:"\f1d9"}.fa-history:before{content:"\f1da"}.fa-circle-thin:before{content:"\f1db"}.fa-header:before{content:"\f1dc"}.fa-paragraph:before{content:"\f1dd"}.fa-sliders:before{content:"\f1de"}.fa-share-alt:before{content:"\f1e0"}.fa-share-alt-square:before{content:"\f1e1"}.fa-bomb:before{content:"\f1e2"}.fa-soccer-ball-o:before,.fa-futbol-o:before{content:"\f1e3"}.fa-tty:before{c
ontent:"\f1e4"}.fa-binoculars:before{content:"\f1e5"}.fa-plug:before{content:"\f1e6"}.fa-slideshare:before{content:"\f1e7"}.fa-twitch:before{content:"\f1e8"}.fa-yelp:before{content:"\f1e9"}.fa-newspaper-o:before{content:"\f1ea"}.fa-wifi:before{content:"\f1eb"}.fa-calculator:before{content:"\f1ec"}.fa-paypal:before{content:"\f1ed"}.fa-google-wallet:before{content:"\f1ee"}.fa-cc-visa:before{content:"\f1f0"}.fa-cc-mastercard:before{content:"\f1f1"}.fa-cc-discover:before{content:"\f1f2"}.fa-cc-amex:before{content:"\f1f3"}.fa-cc-paypal:before{content:"\f1f4"}.fa-cc-stripe:before{content:"\f1f5"}.fa-bell-slash:before{content:"\f1f6"}.fa-bell-slash-o:before{content:"\f1f7"}.fa-trash:before{content:"\f1f8"}.fa-copyright:before{content:"\f1f9"}.fa-at:before{content:"\f1fa"}.fa-eyedropper:before{content:"\f1fb"}.fa-paint-brush:before{content:"\f1fc"}.fa-birthday-cake:before{content:"\f1fd"}.fa-area-chart:before{content:"\f1fe"}.fa-pie-chart:before{content:"\f200"}.fa-line-chart:before{content:"\f201"}.fa-lastfm:before{content:"\f202"}.fa-lastfm-square:before{content:"\f203"}.fa-toggle-off:before{content:"\f204"}.fa-toggle-on:before{content:"\f205"}.fa-bicycle:before{content:"\f206"}.fa-bus:before{content:"\f207"}.fa-ioxhost:before{content:"\f208"}.fa-angellist:before{content:"\f209"}.fa-cc:before{content:"\f20a"}.fa-shekel:before,.fa-sheqel:before,.fa-ils:before{content:"\f20b"}.fa-meanpath:before{content:"\f20c"}.fa-buysellads:before{content:"\f20d"}.fa-connectdevelop:before{content:"\f20e"}.fa-dashcube:before{content:"\f210"}.fa-forumbee:before{content:"\f211"}.fa-leanpub:before{content:"\f212"}.fa-sellsy:before{content:"\f213"}.fa-shirtsinbulk:before{content:"\f214"}.fa-simplybuilt:before{content:"\f215"}.fa-skyatlas:before{content:"\f216"}.fa-cart-plus:before{content:"\f217"}.fa-cart-arrow-down:before{content:"\f218"}.fa-diamond:before{content:"\f219"}.fa-ship:before{content:"\f21a"}.fa-user-secret:before{content:"\f21b"}.fa-motorcycle:before{content:"\f21c"}.fa-street-vi
ew:before{content:"\f21d"}.fa-heartbeat:before{content:"\f21e"}.fa-venus:before{content:"\f221"}.fa-mars:before{content:"\f222"}.fa-mercury:before{content:"\f223"}.fa-intersex:before,.fa-transgender:before{content:"\f224"}.fa-transgender-alt:before{content:"\f225"}.fa-venus-double:before{content:"\f226"}.fa-mars-double:before{content:"\f227"}.fa-venus-mars:before{content:"\f228"}.fa-mars-stroke:before{content:"\f229"}.fa-mars-stroke-v:before{content:"\f22a"}.fa-mars-stroke-h:before{content:"\f22b"}.fa-neuter:before{content:"\f22c"}.fa-genderless:before{content:"\f22d"}.fa-facebook-official:before{content:"\f230"}.fa-pinterest-p:before{content:"\f231"}.fa-whatsapp:before{content:"\f232"}.fa-server:before{content:"\f233"}.fa-user-plus:before{content:"\f234"}.fa-user-times:before{content:"\f235"}.fa-hotel:before,.fa-bed:before{content:"\f236"}.fa-viacoin:before{content:"\f237"}.fa-train:before{content:"\f238"}.fa-subway:before{content:"\f239"}.fa-medium:before{content:"\f23a"}.fa-yc:before,.fa-y-combinator:before{content:"\f23b"}.fa-optin-monster:before{content:"\f23c"}.fa-opencart:before{content:"\f23d"}.fa-expeditedssl:before{content:"\f23e"}.fa-battery-4:before,.fa-battery:before,.fa-battery-full:before{content:"\f240"}.fa-battery-3:before,.fa-battery-three-quarters:before{content:"\f241"}.fa-battery-2:before,.fa-battery-half:before{content:"\f242"}.fa-battery-1:before,.fa-battery-quarter:before{content:"\f243"}.fa-battery-0:before,.fa-battery-empty:before{content:"\f244"}.fa-mouse-pointer:before{content:"\f245"}.fa-i-cursor:before{content:"\f246"}.fa-object-group:before{content:"\f247"}.fa-object-ungroup:before{content:"\f248"}.fa-sticky-note:before{content:"\f249"}.fa-sticky-note-o:before{content:"\f24a"}.fa-cc-jcb:before{content:"\f24b"}.fa-cc-diners-club:before{content:"\f24c"}.fa-clone:before{content:"\f24d"}.fa-balance-scale:before{content:"\f24e"}.fa-hourglass-o:before{content:"\f250"}.fa-hourglass-1:before,.fa-hourglass-start:before{content:"\f251"}.fa-hourg
lass-2:before,.fa-hourglass-half:before{content:"\f252"}.fa-hourglass-3:before,.fa-hourglass-end:before{content:"\f253"}.fa-hourglass:before{content:"\f254"}.fa-hand-grab-o:before,.fa-hand-rock-o:before{content:"\f255"}.fa-hand-stop-o:before,.fa-hand-paper-o:before{content:"\f256"}.fa-hand-scissors-o:before{content:"\f257"}.fa-hand-lizard-o:before{content:"\f258"}.fa-hand-spock-o:before{content:"\f259"}.fa-hand-pointer-o:before{content:"\f25a"}.fa-hand-peace-o:before{content:"\f25b"}.fa-trademark:before{content:"\f25c"}.fa-registered:before{content:"\f25d"}.fa-creative-commons:before{content:"\f25e"}.fa-gg:before{content:"\f260"}.fa-gg-circle:before{content:"\f261"}.fa-tripadvisor:before{content:"\f262"}.fa-odnoklassniki:before{content:"\f263"}.fa-odnoklassniki-square:before{content:"\f264"}.fa-get-pocket:before{content:"\f265"}.fa-wikipedia-w:before{content:"\f266"}.fa-safari:before{content:"\f267"}.fa-chrome:before{content:"\f268"}.fa-firefox:before{content:"\f269"}.fa-opera:before{content:"\f26a"}.fa-internet-explorer:before{content:"\f26b"}.fa-tv:before,.fa-television:before{content:"\f26c"}.fa-contao:before{content:"\f26d"}.fa-500px:before{content:"\f26e"}.fa-amazon:before{content:"\f270"}.fa-calendar-plus-o:before{content:"\f271"}.fa-calendar-minus-o:before{content:"\f272"}.fa-calendar-times-o:before{content:"\f273"}.fa-calendar-check-o:before{content:"\f274"}.fa-industry:before{content:"\f275"}.fa-map-pin:before{content:"\f276"}.fa-map-signs:before{content:"\f277"}.fa-map-o:before{content:"\f278"}.fa-map:before{content:"\f279"}.fa-commenting:before{content:"\f27a"}.fa-commenting-o:before{content:"\f27b"}.fa-houzz:before{content:"\f27c"}.fa-vimeo:before{content:"\f27d"}.fa-black-tie:before{content:"\f27e"}.fa-fonticons:before{content:"\f280"}.fa-reddit-alien:before{content:"\f281"}.fa-edge:before{content:"\f282"}.fa-credit-card-alt:before{content:"\f283"}.fa-codiepie:before{content:"\f284"}.fa-modx:before{content:"\f285"}.fa-fort-awesome:before{content:"\f286"
}.fa-usb:before{content:"\f287"}.fa-product-hunt:before{content:"\f288"}.fa-mixcloud:before{content:"\f289"}.fa-scribd:before{content:"\f28a"}.fa-pause-circle:before{content:"\f28b"}.fa-pause-circle-o:before{content:"\f28c"}.fa-stop-circle:before{content:"\f28d"}.fa-stop-circle-o:before{content:"\f28e"}.fa-shopping-bag:before{content:"\f290"}.fa-shopping-basket:before{content:"\f291"}.fa-hashtag:before{content:"\f292"}.fa-bluetooth:before{content:"\f293"}.fa-bluetooth-b:before{content:"\f294"}.fa-percent:before{content:"\f295"}.fa-gitlab:before{content:"\f296"}.fa-wpbeginner:before{content:"\f297"}.fa-wpforms:before{content:"\f298"}.fa-envira:before{content:"\f299"}.fa-universal-access:before{content:"\f29a"}.fa-wheelchair-alt:before{content:"\f29b"}.fa-question-circle-o:before{content:"\f29c"}.fa-blind:before{content:"\f29d"}.fa-audio-description:before{content:"\f29e"}.fa-volume-control-phone:before{content:"\f2a0"}.fa-braille:before{content:"\f2a1"}.fa-assistive-listening-systems:before{content:"\f2a2"}.fa-asl-interpreting:before,.fa-american-sign-language-interpreting:before{content:"\f2a3"}.fa-deafness:before,.fa-hard-of-hearing:before,.fa-deaf:before{content:"\f2a4"}.fa-glide:before{content:"\f2a5"}.fa-glide-g:before{content:"\f2a6"}.fa-signing:before,.fa-sign-language:before{content:"\f2a7"}.fa-low-vision:before{content:"\f2a8"}.fa-viadeo:before{content:"\f2a9"}.fa-viadeo-square:before{content:"\f2aa"}.fa-snapchat:before{content:"\f2ab"}.fa-snapchat-ghost:before{content:"\f2ac"}.fa-snapchat-square:before{content:"\f2ad"}.fa-pied-piper:before{content:"\f2ae"}.fa-first-order:before{content:"\f2b0"}.fa-yoast:before{content:"\f2b1"}.fa-themeisle:before{content:"\f2b2"}.fa-google-plus-circle:before,.fa-google-plus-official:before{content:"\f2b3"}.fa-fa:before,.fa-font-awesome:before{content:"\f2b4"}.fa-handshake-o:before{content:"\f2b5"}.fa-envelope-open:before{content:"\f2b6"}.fa-envelope-open-o:before{content:"\f2b7"}.fa-linode:before{content:"\f2b8"}.fa-address
-book:before{content:"\f2b9"}.fa-address-book-o:before{content:"\f2ba"}.fa-vcard:before,.fa-address-card:before{content:"\f2bb"}.fa-vcard-o:before,.fa-address-card-o:before{content:"\f2bc"}.fa-user-circle:before{content:"\f2bd"}.fa-user-circle-o:before{content:"\f2be"}.fa-user-o:before{content:"\f2c0"}.fa-id-badge:before{content:"\f2c1"}.fa-drivers-license:before,.fa-id-card:before{content:"\f2c2"}.fa-drivers-license-o:before,.fa-id-card-o:before{content:"\f2c3"}.fa-quora:before{content:"\f2c4"}.fa-free-code-camp:before{content:"\f2c5"}.fa-telegram:before{content:"\f2c6"}.fa-thermometer-4:before,.fa-thermometer:before,.fa-thermometer-full:before{content:"\f2c7"}.fa-thermometer-3:before,.fa-thermometer-three-quarters:before{content:"\f2c8"}.fa-thermometer-2:before,.fa-thermometer-half:before{content:"\f2c9"}.fa-thermometer-1:before,.fa-thermometer-quarter:before{content:"\f2ca"}.fa-thermometer-0:before,.fa-thermometer-empty:before{content:"\f2cb"}.fa-shower:before{content:"\f2cc"}.fa-bathtub:before,.fa-s15:before,.fa-bath:before{content:"\f2cd"}.fa-podcast:before{content:"\f2ce"}.fa-window-maximize:before{content:"\f2d0"}.fa-window-minimize:before{content:"\f2d1"}.fa-window-restore:before{content:"\f2d2"}.fa-times-rectangle:before,.fa-window-close:before{content:"\f2d3"}.fa-times-rectangle-o:before,.fa-window-close-o:before{content:"\f2d4"}.fa-bandcamp:before{content:"\f2d5"}.fa-grav:before{content:"\f2d6"}.fa-etsy:before{content:"\f2d7"}.fa-imdb:before{content:"\f2d8"}.fa-ravelry:before{content:"\f2d9"}.fa-eercast:before{content:"\f2da"}.fa-microchip:before{content:"\f2db"}.fa-snowflake-o:before{content:"\f2dc"}.fa-superpowers:before{content:"\f2dd"}.fa-wpexplorer:before{content:"\f2de"}.fa-meetup:before{content:"\f2e0"}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0, 0, 0, 0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto} 
+.book .book-header,.book .book-summary{font-family:"Helvetica Neue",Helvetica,Arial,sans-serif}.book-langs-index{width:100%;height:100%;padding:40px 0;margin:0;overflow:auto}@media (max-width:600px){.book-langs-index{padding:0}}.book-langs-index .inner{max-width:600px;width:100%;margin:0 auto;padding:30px;background:#fff;border-radius:3px}.book-langs-index .inner h3{margin:0}.book-langs-index .inner .languages{list-style:none;padding:20px 30px;margin-top:20px;border-top:1px solid #eee}.book-langs-index .inner .languages:after,.book-langs-index .inner .languages:before{content:" ";display:table;line-height:0}.book-langs-index .inner .languages li{width:50%;float:left;padding:10px 5px;font-size:16px}@media (max-width:600px){.book-langs-index .inner .languages li{width:100%;max-width:100%}}.book .book-header{overflow:visible;height:50px;padding:0 8px;z-index:2;font-size:.85em;color:#7e888b;background:0 0}.book .book-header .btn{display:block;height:50px;padding:0 15px;border-bottom:none;color:#ccc;text-transform:uppercase;line-height:50px;-webkit-box-shadow:none!important;box-shadow:none!important;position:relative;font-size:14px}.book .book-header .btn:hover{position:relative;text-decoration:none;color:#444;background:0 0}.book .book-header h1{margin:0;font-size:20px;font-weight:200;text-align:center;line-height:50px;opacity:0;padding-left:200px;padding-right:200px;-webkit-transition:opacity .2s ease;-moz-transition:opacity .2s ease;-o-transition:opacity .2s ease;transition:opacity .2s ease;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}.book .book-header h1 a,.book .book-header h1 a:hover{color:inherit;text-decoration:none}@media (max-width:1000px){.book .book-header h1{display:none}}.book .book-header h1 i{display:none}.book .book-header:hover h1{opacity:1}.book.is-loading .book-header h1 i{display:inline-block}.book.is-loading .book-header h1 
a{display:none}.dropdown{position:relative}.dropdown-menu{position:absolute;top:100%;left:0;z-index:100;display:none;float:left;min-width:160px;padding:0;margin:2px 0 0;list-style:none;font-size:14px;background-color:#fafafa;border:1px solid rgba(0,0,0,.07);border-radius:1px;-webkit-box-shadow:0 6px 12px rgba(0,0,0,.175);box-shadow:0 6px 12px rgba(0,0,0,.175);background-clip:padding-box}.dropdown-menu.open{display:block}.dropdown-menu.dropdown-left{left:auto;right:4%}.dropdown-menu.dropdown-left .dropdown-caret{right:14px;left:auto}.dropdown-menu .dropdown-caret{position:absolute;top:-8px;left:14px;width:18px;height:10px;float:left;overflow:hidden}.dropdown-menu .dropdown-caret .caret-inner,.dropdown-menu .dropdown-caret .caret-outer{display:inline-block;top:0;border-left:9px solid transparent;border-right:9px solid transparent;position:absolute}.dropdown-menu .dropdown-caret .caret-outer{border-bottom:9px solid rgba(0,0,0,.1);height:auto;left:0;width:auto;margin-left:-1px}.dropdown-menu .dropdown-caret .caret-inner{margin-top:-1px;top:1px;border-bottom:9px solid #fafafa}.dropdown-menu .buttons{border-bottom:1px solid rgba(0,0,0,.07)}.dropdown-menu .buttons:after,.dropdown-menu .buttons:before{content:" ";display:table;line-height:0}.dropdown-menu .buttons:last-child{border-bottom:none}.dropdown-menu .buttons .button{border:0;background-color:transparent;color:#a6a6a6;width:100%;text-align:center;float:left;line-height:1.42857143;padding:8px 4px}.alert,.dropdown-menu .buttons .button:hover{color:#444}.dropdown-menu .buttons .button:focus,.dropdown-menu .buttons .button:hover{outline:0}.dropdown-menu .buttons .button.size-2{width:50%}.dropdown-menu .buttons .button.size-3{width:33%}.alert{padding:15px;margin-bottom:20px;background:#eee;border-bottom:5px solid 
#ddd}.alert-success{background:#dff0d8;border-color:#d6e9c6;color:#3c763d}.alert-info{background:#d9edf7;border-color:#bce8f1;color:#31708f}.alert-danger{background:#f2dede;border-color:#ebccd1;color:#a94442}.alert-warning{background:#fcf8e3;border-color:#faebcc;color:#8a6d3b}.book .book-summary{position:absolute;top:0;left:-300px;bottom:0;z-index:1;width:300px;color:#364149;background:#fafafa;border-right:1px solid rgba(0,0,0,.07);-webkit-transition:left 250ms ease;-moz-transition:left 250ms ease;-o-transition:left 250ms ease;transition:left 250ms ease}.book .book-summary ul.summary{position:absolute;top:0;left:0;right:0;bottom:0;overflow-y:auto;list-style:none;margin:0;padding:0;-webkit-transition:top .5s ease;-moz-transition:top .5s ease;-o-transition:top .5s ease;transition:top .5s ease}.book .book-summary ul.summary li{list-style:none}.book .book-summary ul.summary li.divider{height:1px;margin:7px 0;overflow:hidden;background:rgba(0,0,0,.07)}.book .book-summary ul.summary li i.fa-check{display:none;position:absolute;right:9px;top:16px;font-size:9px;color:#3c3}.book .book-summary ul.summary li.done>a{color:#364149;font-weight:400}.book .book-summary ul.summary li.done>a i{display:inline}.book .book-summary ul.summary li a,.book .book-summary ul.summary li span{display:block;padding:10px 15px;border-bottom:none;color:#364149;background:0 0;text-overflow:ellipsis;overflow:hidden;white-space:nowrap;position:relative}.book .book-summary ul.summary li span{cursor:not-allowed;opacity:.3;filter:alpha(opacity=30)}.book .book-summary ul.summary li a:hover,.book .book-summary ul.summary li.active>a{color:#008cff;background:0 0;text-decoration:none}.book .book-summary ul.summary li ul{padding-left:20px}@media (max-width:600px){.book .book-summary{width:calc(100% - 60px);bottom:0;left:-100%}}.book.with-summary .book-summary{left:0}.book.without-animation 
.book-summary{-webkit-transition:none!important;-moz-transition:none!important;-o-transition:none!important;transition:none!important}.book{position:relative;width:100%;height:100%}.book .book-body,.book .book-body .body-inner{position:absolute;top:0;left:0;overflow-y:auto;bottom:0;right:0}.book .book-body{color:#000;background:#fff;-webkit-transition:left 250ms ease;-moz-transition:left 250ms ease;-o-transition:left 250ms ease;transition:left 250ms ease}.book .book-body .page-wrapper{position:relative;outline:0}.book .book-body .page-wrapper .page-inner{max-width:800px;margin:0 auto;padding:20px 0 40px}.book .book-body .page-wrapper .page-inner section{margin:0;padding:5px 15px;background:#fff;border-radius:2px;line-height:1.7;font-size:1.6rem}.book .book-body .page-wrapper .page-inner .btn-group .btn{border-radius:0;background:#eee;border:0}@media (max-width:1240px){.book .book-body{-webkit-transition:-webkit-transform 250ms ease;-moz-transition:-moz-transform 250ms ease;-o-transition:-o-transform 250ms ease;transition:transform 250ms ease;padding-bottom:20px}.book .book-body .body-inner{position:static;min-height:calc(100% - 50px)}}@media (min-width:600px){.book.with-summary .book-body{left:300px}}@media (max-width:600px){.book.with-summary{overflow:hidden}.book.with-summary .book-body{-webkit-transform:translate(calc(100% - 60px),0);-moz-transform:translate(calc(100% - 60px),0);-ms-transform:translate(calc(100% - 60px),0);-o-transform:translate(calc(100% - 60px),0);transform:translate(calc(100% - 60px),0)}}.book.without-animation .book-body{-webkit-transition:none!important;-moz-transition:none!important;-o-transition:none!important;transition:none!important}.buttons:after,.buttons:before{content:" ";display:table;line-height:0}.button{border:0;background:#eee;color:#666;width:100%;text-align:center;float:left;line-height:1.42857143;padding:8px 
4px}.button:hover{color:#444}.button:focus,.button:hover{outline:0}.button.size-2{width:50%}.button.size-3{width:33%}.book .book-body .page-wrapper .page-inner section{display:none}.book .book-body .page-wrapper .page-inner section.normal{display:block;word-wrap:break-word;overflow:hidden;color:#333;line-height:1.7;text-size-adjust:100%;-ms-text-size-adjust:100%;-webkit-text-size-adjust:100%;-moz-text-size-adjust:100%}.book .book-body .page-wrapper .page-inner section.normal *{box-sizing:border-box;-webkit-box-sizing:border-box;}.book .book-body .page-wrapper .page-inner section.normal>:first-child{margin-top:0!important}.book .book-body .page-wrapper .page-inner section.normal>:last-child{margin-bottom:0!important}.book .book-body .page-wrapper .page-inner section.normal blockquote,.book .book-body .page-wrapper .page-inner section.normal code,.book .book-body .page-wrapper .page-inner section.normal figure,.book .book-body .page-wrapper .page-inner section.normal img,.book .book-body .page-wrapper .page-inner section.normal pre,.book .book-body .page-wrapper .page-inner section.normal table,.book .book-body .page-wrapper .page-inner section.normal tr{page-break-inside:avoid}.book .book-body .page-wrapper .page-inner section.normal h2,.book .book-body .page-wrapper .page-inner section.normal h3,.book .book-body .page-wrapper .page-inner section.normal h4,.book .book-body .page-wrapper .page-inner section.normal h5,.book .book-body .page-wrapper .page-inner section.normal p{orphans:3;widows:3}.book .book-body .page-wrapper .page-inner section.normal h1,.book .book-body .page-wrapper .page-inner section.normal h2,.book .book-body .page-wrapper .page-inner section.normal h3,.book .book-body .page-wrapper .page-inner section.normal h4,.book .book-body .page-wrapper .page-inner section.normal h5{page-break-after:avoid}.book .book-body .page-wrapper .page-inner section.normal b,.book .book-body .page-wrapper .page-inner section.normal strong{font-weight:700}.book 
.book-body .page-wrapper .page-inner section.normal em{font-style:italic}.book .book-body .page-wrapper .page-inner section.normal blockquote,.book .book-body .page-wrapper .page-inner section.normal dl,.book .book-body .page-wrapper .page-inner section.normal ol,.book .book-body .page-wrapper .page-inner section.normal p,.book .book-body .page-wrapper .page-inner section.normal table,.book .book-body .page-wrapper .page-inner section.normal ul{margin-top:0;margin-bottom:.85em}.book .book-body .page-wrapper .page-inner section.normal a{color:#4183c4;text-decoration:none;background:0 0}.book .book-body .page-wrapper .page-inner section.normal a:active,.book .book-body .page-wrapper .page-inner section.normal a:focus,.book .book-body .page-wrapper .page-inner section.normal a:hover{outline:0;text-decoration:underline}.book .book-body .page-wrapper .page-inner section.normal img{border:0;max-width:100%}.book .book-body .page-wrapper .page-inner section.normal hr{height:4px;padding:0;margin:1.7em 0;overflow:hidden;background-color:#e7e7e7;border:none}.book .book-body .page-wrapper .page-inner section.normal hr:after,.book .book-body .page-wrapper .page-inner section.normal hr:before{display:table;content:" "}.book .book-body .page-wrapper .page-inner section.normal h1,.book .book-body .page-wrapper .page-inner section.normal h2,.book .book-body .page-wrapper .page-inner section.normal h3,.book .book-body .page-wrapper .page-inner section.normal h4,.book .book-body .page-wrapper .page-inner section.normal h5,.book .book-body .page-wrapper .page-inner section.normal h6{margin-top:1.275em;margin-bottom:.85em;}.book .book-body .page-wrapper .page-inner section.normal h1{font-size:2em}.book .book-body .page-wrapper .page-inner section.normal h2{font-size:1.75em}.book .book-body .page-wrapper .page-inner section.normal h3{font-size:1.5em}.book .book-body .page-wrapper .page-inner section.normal h4{font-size:1.25em}.book .book-body .page-wrapper .page-inner section.normal 
h5{font-size:1em}.book .book-body .page-wrapper .page-inner section.normal h6{font-size:1em;color:#777}.book .book-body .page-wrapper .page-inner section.normal code,.book .book-body .page-wrapper .page-inner section.normal pre{font-family:Consolas,"Liberation Mono",Menlo,Courier,monospace;direction:ltr;border:none;color:inherit}.book .book-body .page-wrapper .page-inner section.normal pre{overflow:auto;word-wrap:normal;margin:0 0 1.275em;padding:.85em 1em;background:#f7f7f7}.book .book-body .page-wrapper .page-inner section.normal pre>code{display:inline;max-width:initial;padding:0;margin:0;overflow:initial;line-height:inherit;font-size:.85em;white-space:pre;background:0 0}.book .book-body .page-wrapper .page-inner section.normal pre>code:after,.book .book-body .page-wrapper .page-inner section.normal pre>code:before{content:normal}.book .book-body .page-wrapper .page-inner section.normal code{padding:.2em;margin:0;font-size:.85em;background-color:#f7f7f7}.book .book-body .page-wrapper .page-inner section.normal code:after,.book .book-body .page-wrapper .page-inner section.normal code:before{letter-spacing:-.2em;content:"\00a0"}.book .book-body .page-wrapper .page-inner section.normal ol,.book .book-body .page-wrapper .page-inner section.normal ul{padding:0 0 0 2em;margin:0 0 .85em}.book .book-body .page-wrapper .page-inner section.normal ol ol,.book .book-body .page-wrapper .page-inner section.normal ol ul,.book .book-body .page-wrapper .page-inner section.normal ul ol,.book .book-body .page-wrapper .page-inner section.normal ul ul{margin-top:0;margin-bottom:0}.book .book-body .page-wrapper .page-inner section.normal ol ol{list-style-type:lower-roman}.book .book-body .page-wrapper .page-inner section.normal blockquote{margin:0 0 .85em;padding:0 15px;opacity:0.75;border-left:4px solid #dcdcdc}.book .book-body .page-wrapper .page-inner section.normal blockquote:first-child{margin-top:0}.book .book-body .page-wrapper .page-inner section.normal 
blockquote:last-child{margin-bottom:0}.book .book-body .page-wrapper .page-inner section.normal dl{padding:0}.book .book-body .page-wrapper .page-inner section.normal dl dt{padding:0;margin-top:.85em;font-style:italic;font-weight:700}.book .book-body .page-wrapper .page-inner section.normal dl dd{padding:0 .85em;margin-bottom:.85em}.book .book-body .page-wrapper .page-inner section.normal dd{margin-left:0}.book .book-body .page-wrapper .page-inner section.normal .glossary-term{cursor:help;text-decoration:underline}.book .book-body .navigation{position:absolute;top:50px;bottom:0;margin:0;max-width:150px;min-width:90px;display:flex;justify-content:center;align-content:center;flex-direction:column;font-size:40px;color:#ccc;text-align:center;-webkit-transition:all 350ms ease;-moz-transition:all 350ms ease;-o-transition:all 350ms ease;transition:all 350ms ease}.book .book-body .navigation:hover{text-decoration:none;color:#444}.book .book-body .navigation.navigation-next{right:0}.book .book-body .navigation.navigation-prev{left:0}@media (max-width:1240px){.book .book-body .navigation{position:static;top:auto;max-width:50%;width:50%;display:inline-block;float:left}.book .book-body .navigation.navigation-unique{max-width:100%;width:100%}}.book .book-body .page-wrapper .page-inner section.glossary{margin-bottom:40px}.book .book-body .page-wrapper .page-inner section.glossary h2 a,.book .book-body .page-wrapper .page-inner section.glossary h2 a:hover{color:inherit;text-decoration:none}.book .book-body .page-wrapper .page-inner section.glossary .glossary-index{list-style:none;margin:0;padding:0}.book .book-body .page-wrapper .page-inner section.glossary .glossary-index li{display:inline;margin:0 
8px;white-space:nowrap}*{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box;-webkit-overflow-scrolling:auto;-webkit-tap-highlight-color:transparent;-webkit-text-size-adjust:none;-webkit-touch-callout:none}a{text-decoration:none}body,html{height:100%}html{font-size:62.5%}body{text-rendering:optimizeLegibility;font-smoothing:antialiased;font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:14px;letter-spacing:.2px;text-size-adjust:100%} +.book .book-summary ul.summary li a span {display:inline;padding:initial;overflow:visible;cursor:auto;opacity:1;} +/* show arrow before summary tag as in bootstrap */ +details > summary {display:list-item;cursor:pointer;} diff --git a/libs/gitbook-2.6.7/js/app.min.js b/libs/gitbook-2.6.7/js/app.min.js new file mode 100644 index 00000000..643f1f98 --- /dev/null +++ b/libs/gitbook-2.6.7/js/app.min.js @@ -0,0 +1 @@ +(function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o"'`]/g,reHasEscapedHtml=RegExp(reEscapedHtml.source),reHasUnescapedHtml=RegExp(reUnescapedHtml.source);var reEscape=/<%-([\s\S]+?)%>/g,reEvaluate=/<%([\s\S]+?)%>/g,reInterpolate=/<%=([\s\S]+?)%>/g;var reIsDeepProp=/\.|\[(?:[^[\]]*|(["'])(?:(?!\1)[^\n\\]|\\.)*?\1)\]/,reIsPlainProp=/^\w*$/,rePropName=/[^.[\]]+|\[(?:(-?\d+(?:\.\d+)?)|(["'])((?:(?!\2)[^\n\\]|\\.)*?)\2)\]/g;var reRegExpChars=/^[:!,]|[\\^$.*+?()[\]{}|\/]|(^[0-9a-fA-Fnrtuvx])|([\n\r\u2028\u2029])/g,reHasRegExpChars=RegExp(reRegExpChars.source);var reComboMark=/[\u0300-\u036f\ufe20-\ufe23]/g;var reEscapeChar=/\\(\\)?/g;var reEsTemplate=/\$\{([^\\}]*(?:\\.[^\\}]*)*)\}/g;var reFlags=/\w*$/;var reHasHexPrefix=/^0[xX]/;var 
reIsHostCtor=/^\[object .+?Constructor\]$/;var reIsUint=/^\d+$/;var reLatin1=/[\xc0-\xd6\xd8-\xde\xdf-\xf6\xf8-\xff]/g;var reNoMatch=/($^)/;var reUnescapedString=/['\n\r\u2028\u2029\\]/g;var reWords=function(){var upper="[A-Z\\xc0-\\xd6\\xd8-\\xde]",lower="[a-z\\xdf-\\xf6\\xf8-\\xff]+";return RegExp(upper+"+(?="+upper+lower+")|"+upper+"?"+lower+"|"+upper+"+|[0-9]+","g")}();var contextProps=["Array","ArrayBuffer","Date","Error","Float32Array","Float64Array","Function","Int8Array","Int16Array","Int32Array","Math","Number","Object","RegExp","Set","String","_","clearTimeout","isFinite","parseFloat","parseInt","setTimeout","TypeError","Uint8Array","Uint8ClampedArray","Uint16Array","Uint32Array","WeakMap"];var templateCounter=-1;var typedArrayTags={};typedArrayTags[float32Tag]=typedArrayTags[float64Tag]=typedArrayTags[int8Tag]=typedArrayTags[int16Tag]=typedArrayTags[int32Tag]=typedArrayTags[uint8Tag]=typedArrayTags[uint8ClampedTag]=typedArrayTags[uint16Tag]=typedArrayTags[uint32Tag]=true;typedArrayTags[argsTag]=typedArrayTags[arrayTag]=typedArrayTags[arrayBufferTag]=typedArrayTags[boolTag]=typedArrayTags[dateTag]=typedArrayTags[errorTag]=typedArrayTags[funcTag]=typedArrayTags[mapTag]=typedArrayTags[numberTag]=typedArrayTags[objectTag]=typedArrayTags[regexpTag]=typedArrayTags[setTag]=typedArrayTags[stringTag]=typedArrayTags[weakMapTag]=false;var cloneableTags={};cloneableTags[argsTag]=cloneableTags[arrayTag]=cloneableTags[arrayBufferTag]=cloneableTags[boolTag]=cloneableTags[dateTag]=cloneableTags[float32Tag]=cloneableTags[float64Tag]=cloneableTags[int8Tag]=cloneableTags[int16Tag]=cloneableTags[int32Tag]=cloneableTags[numberTag]=cloneableTags[objectTag]=cloneableTags[regexpTag]=cloneableTags[stringTag]=cloneableTags[uint8Tag]=cloneableTags[uint8ClampedTag]=cloneableTags[uint16Tag]=cloneableTags[uint32Tag]=true;cloneableTags[errorTag]=cloneableTags[funcTag]=cloneableTags[mapTag]=cloneableTags[setTag]=cloneableTags[weakMapTag]=false;var 
deburredLetters={"À":"A","Á":"A","Â":"A","Ã":"A","Ä":"A","Å":"A","à":"a","á":"a","â":"a","ã":"a","ä":"a","å":"a","Ç":"C","ç":"c","Ð":"D","ð":"d","È":"E","É":"E","Ê":"E","Ë":"E","è":"e","é":"e","ê":"e","ë":"e","Ì":"I","Í":"I","Î":"I","Ï":"I","ì":"i","í":"i","î":"i","ï":"i","Ñ":"N","ñ":"n","Ò":"O","Ó":"O","Ô":"O","Õ":"O","Ö":"O","Ø":"O","ò":"o","ó":"o","ô":"o","õ":"o","ö":"o","ø":"o","Ù":"U","Ú":"U","Û":"U","Ü":"U","ù":"u","ú":"u","û":"u","ü":"u","Ý":"Y","ý":"y","ÿ":"y","Æ":"Ae","æ":"ae","Þ":"Th","þ":"th","ß":"ss"};var htmlEscapes={"&":"&","<":"<",">":">",'"':""","'":"'","`":"`"};var htmlUnescapes={"&":"&","<":"<",">":">",""":'"',"'":"'","`":"`"};var objectTypes={function:true,object:true};var regexpEscapes={0:"x30",1:"x31",2:"x32",3:"x33",4:"x34",5:"x35",6:"x36",7:"x37",8:"x38",9:"x39",A:"x41",B:"x42",C:"x43",D:"x44",E:"x45",F:"x46",a:"x61",b:"x62",c:"x63",d:"x64",e:"x65",f:"x66",n:"x6e",r:"x72",t:"x74",u:"x75",v:"x76",x:"x78"};var stringEscapes={"\\":"\\","'":"'","\n":"n","\r":"r","\u2028":"u2028","\u2029":"u2029"};var freeExports=objectTypes[typeof exports]&&exports&&!exports.nodeType&&exports;var freeModule=objectTypes[typeof module]&&module&&!module.nodeType&&module;var freeGlobal=freeExports&&freeModule&&typeof global=="object"&&global&&global.Object&&global;var freeSelf=objectTypes[typeof self]&&self&&self.Object&&self;var freeWindow=objectTypes[typeof window]&&window&&window.Object&&window;var moduleExports=freeModule&&freeModule.exports===freeExports&&freeExports;var root=freeGlobal||freeWindow!==(this&&this.window)&&freeWindow||freeSelf||this;function baseCompareAscending(value,other){if(value!==other){var valIsNull=value===null,valIsUndef=value===undefined,valIsReflexive=value===value;var othIsNull=other===null,othIsUndef=other===undefined,othIsReflexive=other===other;if(value>other&&!othIsNull||!valIsReflexive||valIsNull&&!othIsUndef&&othIsReflexive||valIsUndef&&othIsReflexive){return 1}if(value-1){}return index}function charsRightIndex(string,chars){var 
index=string.length;while(index--&&chars.indexOf(string.charAt(index))>-1){}return index}function compareAscending(object,other){return baseCompareAscending(object.criteria,other.criteria)||object.index-other.index}function compareMultiple(object,other,orders){var index=-1,objCriteria=object.criteria,othCriteria=other.criteria,length=objCriteria.length,ordersLength=orders.length;while(++index=ordersLength){return result}var order=orders[index];return result*(order==="asc"||order===true?1:-1)}}return object.index-other.index}function deburrLetter(letter){return deburredLetters[letter]}function escapeHtmlChar(chr){return htmlEscapes[chr]}function escapeRegExpChar(chr,leadingChar,whitespaceChar){if(leadingChar){chr=regexpEscapes[chr]}else if(whitespaceChar){chr=stringEscapes[chr]}return"\\"+chr}function escapeStringChar(chr){return"\\"+stringEscapes[chr]}function indexOfNaN(array,fromIndex,fromRight){var length=array.length,index=fromIndex+(fromRight?0:-1);while(fromRight?index--:++index=9&&charCode<=13)||charCode==32||charCode==160||charCode==5760||charCode==6158||charCode>=8192&&(charCode<=8202||charCode==8232||charCode==8233||charCode==8239||charCode==8287||charCode==12288||charCode==65279)}function replaceHolders(array,placeholder){var index=-1,length=array.length,resIndex=-1,result=[];while(++index>>1;var MAX_SAFE_INTEGER=9007199254740991;var metaMap=WeakMap&&new WeakMap;var realNames={};function lodash(value){if(isObjectLike(value)&&!isArray(value)&&!(value instanceof LazyWrapper)){if(value instanceof LodashWrapper){return value}if(hasOwnProperty.call(value,"__chain__")&&hasOwnProperty.call(value,"__wrapped__")){return wrapperClone(value)}}return new LodashWrapper(value)}function baseLodash(){}function LodashWrapper(value,chainAll,actions){this.__wrapped__=value;this.__actions__=actions||[];this.__chain__=!!chainAll}var 
support=lodash.support={};lodash.templateSettings={escape:reEscape,evaluate:reEvaluate,interpolate:reInterpolate,variable:"",imports:{_:lodash}};function LazyWrapper(value){this.__wrapped__=value;this.__actions__=[];this.__dir__=1;this.__filtered__=false;this.__iteratees__=[];this.__takeCount__=POSITIVE_INFINITY;this.__views__=[]}function lazyClone(){var result=new LazyWrapper(this.__wrapped__);result.__actions__=arrayCopy(this.__actions__);result.__dir__=this.__dir__;result.__filtered__=this.__filtered__;result.__iteratees__=arrayCopy(this.__iteratees__);result.__takeCount__=this.__takeCount__;result.__views__=arrayCopy(this.__views__);return result}function lazyReverse(){if(this.__filtered__){var result=new LazyWrapper(this);result.__dir__=-1;result.__filtered__=true}else{result=this.clone();result.__dir__*=-1}return result}function lazyValue(){var array=this.__wrapped__.value(),dir=this.__dir__,isArr=isArray(array),isRight=dir<0,arrLength=isArr?array.length:0,view=getView(0,arrLength,this.__views__),start=view.start,end=view.end,length=end-start,index=isRight?end:start-1,iteratees=this.__iteratees__,iterLength=iteratees.length,resIndex=0,takeCount=nativeMin(length,this.__takeCount__);if(!isArr||arrLength=LARGE_ARRAY_SIZE?createCache(values):null,valuesLength=values.length;if(cache){indexOf=cacheIndexOf;isCommon=false;values=cache}outer:while(++indexlength?0:length+start}end=end===undefined||end>length?length:+end||0;if(end<0){end+=length}length=start>end?0:end>>>0;start>>>=0;while(startlength?0:length+start}end=end===undefined||end>length?length:+end||0;if(end<0){end+=length}length=start>end?0:end-start>>>0;start>>>=0;var 
result=Array(length);while(++index=LARGE_ARRAY_SIZE,seen=isLarge?createCache():null,result=[];if(seen){indexOf=cacheIndexOf;isCommon=false}else{isLarge=false;seen=iteratee?[]:result}outer:while(++index>>1,computed=array[mid];if((retHighest?computed<=value:computed2?sources[length-2]:undefined,guard=length>2?sources[2]:undefined,thisArg=length>1?sources[length-1]:undefined;if(typeof customizer=="function"){customizer=bindCallback(customizer,thisArg,5);length-=2}else{customizer=typeof thisArg=="function"?thisArg:undefined;length-=customizer?1:0}if(guard&&isIterateeCall(sources[0],sources[1],guard)){customizer=length<3?undefined:customizer;length=1}while(++index-1?collection[index]:undefined}return baseFind(collection,predicate,eachFunc)}}function createFindIndex(fromRight){return function(array,predicate,thisArg){if(!(array&&array.length)){return-1}predicate=getCallback(predicate,thisArg,3);return baseFindIndex(array,predicate,fromRight)}}function createFindKey(objectFunc){return function(object,predicate,thisArg){predicate=getCallback(predicate,thisArg,3);return baseFind(object,predicate,objectFunc,true)}}function createFlow(fromRight){return function(){var wrapper,length=arguments.length,index=fromRight?length:-1,leftIndex=0,funcs=Array(length);while(fromRight?index--:++index=LARGE_ARRAY_SIZE){return wrapper.plant(value).value()}var index=0,result=length?funcs[index].apply(this,args):value;while(++index=length||!nativeIsFinite(length)){return""}var padLength=length-strLength;chars=chars==null?" 
":chars+"";return repeat(chars,nativeCeil(padLength/chars.length)).slice(0,padLength)}function createPartialWrapper(func,bitmask,thisArg,partials){var isBind=bitmask&BIND_FLAG,Ctor=createCtorWrapper(func);function wrapper(){var argsIndex=-1,argsLength=arguments.length,leftIndex=-1,leftLength=partials.length,args=Array(leftLength+argsLength);while(++leftIndexarrLength)){return false}while(++index-1&&value%1==0&&value-1&&value%1==0&&value<=MAX_SAFE_INTEGER}function isStrictComparable(value){return value===value&&!isObject(value)}function mergeData(data,source){var bitmask=data[1],srcBitmask=source[1],newBitmask=bitmask|srcBitmask,isCommon=newBitmask0){if(++count>=HOT_COUNT){return key}}else{count=0}return baseSetData(key,value)}}();function shimKeys(object){var props=keysIn(object),propsLength=props.length,length=propsLength&&object.length;var allowIndexes=!!length&&isLength(length)&&(isArray(object)||isArguments(object));var index=-1,result=[];while(++index=120?createCache(othIndex&&value):null}var array=arrays[0],index=-1,length=array?array.length:0,seen=caches[0];outer:while(++index-1){splice.call(array,fromIndex,1)}}return array}var pullAt=restParam(function(array,indexes){indexes=baseFlatten(indexes);var result=baseAt(array,indexes);basePullAt(array,indexes.sort(baseCompareAscending));return result});function remove(array,predicate,thisArg){var result=[];if(!(array&&array.length)){return result}var index=-1,indexes=[],length=array.length;predicate=getCallback(predicate,thisArg,3);while(++index2?arrays[length-2]:undefined,thisArg=length>1?arrays[length-1]:undefined;if(length>2&&typeof iteratee=="function"){length-=2}else{iteratee=length>1&&typeof thisArg=="function"?(--length,thisArg):undefined;thisArg=undefined}arrays.length=length;return unzipWith(arrays,iteratee,thisArg)});function chain(value){var result=lodash(value);result.__chain__=true;return result}function tap(value,interceptor,thisArg){interceptor.call(thisArg,value);return value}function 
thru(value,interceptor,thisArg){return interceptor.call(thisArg,value)}function wrapperChain(){return chain(this)}function wrapperCommit(){return new LodashWrapper(this.value(),this.__chain__)}var wrapperConcat=restParam(function(values){values=baseFlatten(values);return this.thru(function(array){return arrayConcat(isArray(array)?array:[toObject(array)],values)})});function wrapperPlant(value){var result,parent=this;while(parent instanceof baseLodash){var clone=wrapperClone(parent);if(result){previous.__wrapped__=clone}else{result=clone}var previous=clone;parent=parent.__wrapped__}previous.__wrapped__=value;return result}function wrapperReverse(){var value=this.__wrapped__;var interceptor=function(value){return wrapped&&wrapped.__dir__<0?value:value.reverse()};if(value instanceof LazyWrapper){var wrapped=value;if(this.__actions__.length){wrapped=new LazyWrapper(this)}wrapped=wrapped.reverse();wrapped.__actions__.push({func:thru,args:[interceptor],thisArg:undefined});return new LodashWrapper(wrapped,this.__chain__)}return this.thru(interceptor)}function wrapperToString(){return this.value()+""}function wrapperValue(){return baseWrapperValue(this.__wrapped__,this.__actions__)}var at=restParam(function(collection,props){return baseAt(collection,baseFlatten(props))});var countBy=createAggregator(function(result,value,key){hasOwnProperty.call(result,key)?++result[key]:result[key]=1});function every(collection,predicate,thisArg){var func=isArray(collection)?arrayEvery:baseEvery;if(thisArg&&isIterateeCall(collection,predicate,thisArg)){predicate=undefined}if(typeof predicate!="function"||thisArg!==undefined){predicate=getCallback(predicate,thisArg,3)}return func(collection,predicate)}function filter(collection,predicate,thisArg){var func=isArray(collection)?arrayFilter:baseFilter;predicate=getCallback(predicate,thisArg,3);return func(collection,predicate)}var find=createFind(baseEach);var findLast=createFind(baseEachRight,true);function findWhere(collection,source){return 
find(collection,baseMatches(source))}var forEach=createForEach(arrayEach,baseEach);var forEachRight=createForEach(arrayEachRight,baseEachRight);var groupBy=createAggregator(function(result,value,key){if(hasOwnProperty.call(result,key)){result[key].push(value)}else{result[key]=[value]}});function includes(collection,target,fromIndex,guard){var length=collection?getLength(collection):0;if(!isLength(length)){collection=values(collection);length=collection.length}if(typeof fromIndex!="number"||guard&&isIterateeCall(target,fromIndex,guard)){fromIndex=0}else{fromIndex=fromIndex<0?nativeMax(length+fromIndex,0):fromIndex||0}return typeof collection=="string"||!isArray(collection)&&isString(collection)?fromIndex<=length&&collection.indexOf(target,fromIndex)>-1:!!length&&getIndexOf(collection,target,fromIndex)>-1}var indexBy=createAggregator(function(result,value,key){result[key]=value});var invoke=restParam(function(collection,path,args){var index=-1,isFunc=typeof path=="function",isProp=isKey(path),result=isArrayLike(collection)?Array(collection.length):[];baseEach(collection,function(value){var func=isFunc?path:isProp&&value!=null?value[path]:undefined;result[++index]=func?func.apply(value,args):invokePath(value,path,args)});return result});function map(collection,iteratee,thisArg){var func=isArray(collection)?arrayMap:baseMap;iteratee=getCallback(iteratee,thisArg,3);return func(collection,iteratee)}var partition=createAggregator(function(result,value,key){result[key?0:1].push(value)},function(){return[[],[]]});function pluck(collection,path){return map(collection,property(path))}var reduce=createReduce(arrayReduce,baseEach);var reduceRight=createReduce(arrayReduceRight,baseEachRight);function reject(collection,predicate,thisArg){var func=isArray(collection)?arrayFilter:baseFilter;predicate=getCallback(predicate,thisArg,3);return func(collection,function(value,index,collection){return!predicate(value,index,collection)})}function 
sample(collection,n,guard){if(guard?isIterateeCall(collection,n,guard):n==null){collection=toIterable(collection);var length=collection.length;return length>0?collection[baseRandom(0,length-1)]:undefined}var index=-1,result=toArray(collection),length=result.length,lastIndex=length-1;n=nativeMin(n<0?0:+n||0,length);while(++index0){result=func.apply(this,arguments)}if(n<=1){func=undefined}return result}}var bind=restParam(function(func,thisArg,partials){var bitmask=BIND_FLAG;if(partials.length){var holders=replaceHolders(partials,bind.placeholder);bitmask|=PARTIAL_FLAG}return createWrapper(func,bitmask,thisArg,partials,holders)});var bindAll=restParam(function(object,methodNames){methodNames=methodNames.length?baseFlatten(methodNames):functions(object);var index=-1,length=methodNames.length;while(++indexwait){complete(trailingCall,maxTimeoutId)}else{timeoutId=setTimeout(delayed,remaining)}}function maxDelayed(){complete(trailing,timeoutId)}function debounced(){args=arguments;stamp=now();thisArg=this;trailingCall=trailing&&(timeoutId||!leading);if(maxWait===false){var leadingCall=leading&&!timeoutId}else{if(!maxTimeoutId&&!leading){lastCalled=stamp}var remaining=maxWait-(stamp-lastCalled),isCalled=remaining<=0||remaining>maxWait;if(isCalled){if(maxTimeoutId){maxTimeoutId=clearTimeout(maxTimeoutId)}lastCalled=stamp;result=func.apply(thisArg,args)}else if(!maxTimeoutId){maxTimeoutId=setTimeout(maxDelayed,remaining)}}if(isCalled&&timeoutId){timeoutId=clearTimeout(timeoutId)}else if(!timeoutId&&wait!==maxWait){timeoutId=setTimeout(delayed,wait)}if(leadingCall){isCalled=true;result=func.apply(thisArg,args)}if(isCalled&&!timeoutId&&!maxTimeoutId){args=thisArg=undefined}return result}debounced.cancel=cancel;return debounced}var defer=restParam(function(func,args){return baseDelay(func,1,args)});var delay=restParam(function(func,wait,args){return baseDelay(func,wait,args)});var flow=createFlow();var flowRight=createFlow(true);function memoize(func,resolver){if(typeof 
func!="function"||resolver&&typeof resolver!="function"){throw new TypeError(FUNC_ERROR_TEXT)}var memoized=function(){var args=arguments,key=resolver?resolver.apply(this,args):args[0],cache=memoized.cache;if(cache.has(key)){return cache.get(key)}var result=func.apply(this,args);memoized.cache=cache.set(key,result);return result};memoized.cache=new memoize.Cache;return memoized}var modArgs=restParam(function(func,transforms){transforms=baseFlatten(transforms);if(typeof func!="function"||!arrayEvery(transforms,baseIsFunction)){throw new TypeError(FUNC_ERROR_TEXT)}var length=transforms.length;return restParam(function(args){var index=nativeMin(args.length,length);while(index--){args[index]=transforms[index](args[index])}return func.apply(this,args)})});function negate(predicate){if(typeof predicate!="function"){throw new TypeError(FUNC_ERROR_TEXT)}return function(){return!predicate.apply(this,arguments)}}function once(func){return before(2,func)}var partial=createPartial(PARTIAL_FLAG);var partialRight=createPartial(PARTIAL_RIGHT_FLAG);var rearg=restParam(function(func,indexes){return createWrapper(func,REARG_FLAG,undefined,undefined,undefined,baseFlatten(indexes))});function restParam(func,start){if(typeof func!="function"){throw new TypeError(FUNC_ERROR_TEXT)}start=nativeMax(start===undefined?func.length-1:+start||0,0);return function(){var args=arguments,index=-1,length=nativeMax(args.length-start,0),rest=Array(length);while(++indexother}function gte(value,other){return value>=other}function isArguments(value){return isObjectLike(value)&&isArrayLike(value)&&hasOwnProperty.call(value,"callee")&&!propertyIsEnumerable.call(value,"callee")}var isArray=nativeIsArray||function(value){return isObjectLike(value)&&isLength(value.length)&&objToString.call(value)==arrayTag};function isBoolean(value){return value===true||value===false||isObjectLike(value)&&objToString.call(value)==boolTag}function isDate(value){return 
isObjectLike(value)&&objToString.call(value)==dateTag}function isElement(value){return!!value&&value.nodeType===1&&isObjectLike(value)&&!isPlainObject(value)}function isEmpty(value){if(value==null){return true}if(isArrayLike(value)&&(isArray(value)||isString(value)||isArguments(value)||isObjectLike(value)&&isFunction(value.splice))){return!value.length}return!keys(value).length}function isEqual(value,other,customizer,thisArg){customizer=typeof customizer=="function"?bindCallback(customizer,thisArg,3):undefined;var result=customizer?customizer(value,other):undefined;return result===undefined?baseIsEqual(value,other,customizer):!!result}function isError(value){return isObjectLike(value)&&typeof value.message=="string"&&objToString.call(value)==errorTag}function isFinite(value){return typeof value=="number"&&nativeIsFinite(value)}function isFunction(value){return isObject(value)&&objToString.call(value)==funcTag}function isObject(value){var type=typeof value;return!!value&&(type=="object"||type=="function")}function isMatch(object,source,customizer,thisArg){customizer=typeof customizer=="function"?bindCallback(customizer,thisArg,3):undefined;return baseIsMatch(object,getMatchData(source),customizer)}function isNaN(value){return isNumber(value)&&value!=+value}function isNative(value){if(value==null){return false}if(isFunction(value)){return reIsNative.test(fnToString.call(value))}return isObjectLike(value)&&reIsHostCtor.test(value)}function isNull(value){return value===null}function isNumber(value){return typeof value=="number"||isObjectLike(value)&&objToString.call(value)==numberTag}function isPlainObject(value){var Ctor;if(!(isObjectLike(value)&&objToString.call(value)==objectTag&&!isArguments(value))||!hasOwnProperty.call(value,"constructor")&&(Ctor=value.constructor,typeof Ctor=="function"&&!(Ctor instanceof Ctor))){return false}var result;baseForIn(value,function(subValue,key){result=key});return result===undefined||hasOwnProperty.call(value,result)}function 
isRegExp(value){return isObject(value)&&objToString.call(value)==regexpTag}function isString(value){return typeof value=="string"||isObjectLike(value)&&objToString.call(value)==stringTag}function isTypedArray(value){return isObjectLike(value)&&isLength(value.length)&&!!typedArrayTags[objToString.call(value)]}function isUndefined(value){return value===undefined}function lt(value,other){return value0;while(++index=nativeMin(start,end)&&value=0&&string.indexOf(target,position)==position}function escape(string){string=baseToString(string);return string&&reHasUnescapedHtml.test(string)?string.replace(reUnescapedHtml,escapeHtmlChar):string}function escapeRegExp(string){string=baseToString(string);return string&&reHasRegExpChars.test(string)?string.replace(reRegExpChars,escapeRegExpChar):string||"(?:)"}var kebabCase=createCompounder(function(result,word,index){return result+(index?"-":"")+word.toLowerCase()});function pad(string,length,chars){string=baseToString(string);length=+length;var strLength=string.length;if(strLength>=length||!nativeIsFinite(length)){return string}var mid=(length-strLength)/2,leftLength=nativeFloor(mid),rightLength=nativeCeil(mid);chars=createPadding("",rightLength,chars);return chars.slice(0,leftLength)+string+chars}var padLeft=createPadDir();var padRight=createPadDir(true);function parseInt(string,radix,guard){if(guard?isIterateeCall(string,radix,guard):radix==null){radix=0}else if(radix){radix=+radix}string=trim(string);return nativeParseInt(string,radix||(reHasHexPrefix.test(string)?16:10))}function repeat(string,n){var result="";string=baseToString(string);n=+n;if(n<1||!string||!nativeIsFinite(n)){return result}do{if(n%2){result+=string}n=nativeFloor(n/2);string+=string}while(n);return result}var snakeCase=createCompounder(function(result,word,index){return result+(index?"_":"")+word.toLowerCase()});var startCase=createCompounder(function(result,word,index){return result+(index?" 
":"")+(word.charAt(0).toUpperCase()+word.slice(1))});function startsWith(string,target,position){string=baseToString(string);position=position==null?0:nativeMin(position<0?0:+position||0,string.length);return string.lastIndexOf(target,position)==position}function template(string,options,otherOptions){var settings=lodash.templateSettings;if(otherOptions&&isIterateeCall(string,options,otherOptions)){options=otherOptions=undefined}string=baseToString(string);options=assignWith(baseAssign({},otherOptions||options),settings,assignOwnDefaults);var imports=assignWith(baseAssign({},options.imports),settings.imports,assignOwnDefaults),importsKeys=keys(imports),importsValues=baseValues(imports,importsKeys);var isEscaping,isEvaluating,index=0,interpolate=options.interpolate||reNoMatch,source="__p += '";var reDelimiters=RegExp((options.escape||reNoMatch).source+"|"+interpolate.source+"|"+(interpolate===reInterpolate?reEsTemplate:reNoMatch).source+"|"+(options.evaluate||reNoMatch).source+"|$","g");var sourceURL="//# sourceURL="+("sourceURL"in options?options.sourceURL:"lodash.templateSources["+ ++templateCounter+"]")+"\n";string.replace(reDelimiters,function(match,escapeValue,interpolateValue,esTemplateValue,evaluateValue,offset){interpolateValue||(interpolateValue=esTemplateValue);source+=string.slice(index,offset).replace(reUnescapedString,escapeStringChar);if(escapeValue){isEscaping=true;source+="' +\n__e("+escapeValue+") +\n'"}if(evaluateValue){isEvaluating=true;source+="';\n"+evaluateValue+";\n__p += '"}if(interpolateValue){source+="' +\n((__t = ("+interpolateValue+")) == null ? 
'' : __t) +\n'"}index=offset+match.length;return match});source+="';\n";var variable=options.variable;if(!variable){source="with (obj) {\n"+source+"\n}\n"}source=(isEvaluating?source.replace(reEmptyStringLeading,""):source).replace(reEmptyStringMiddle,"$1").replace(reEmptyStringTrailing,"$1;");source="function("+(variable||"obj")+") {\n"+(variable?"":"obj || (obj = {});\n")+"var __t, __p = ''"+(isEscaping?", __e = _.escape":"")+(isEvaluating?", __j = Array.prototype.join;\n"+"function print() { __p += __j.call(arguments, '') }\n":";\n")+source+"return __p\n}";var result=attempt(function(){return Function(importsKeys,sourceURL+"return "+source).apply(undefined,importsValues)});result.source=source;if(isError(result)){throw result}return result}function trim(string,chars,guard){var value=string;string=baseToString(string);if(!string){return string}if(guard?isIterateeCall(value,chars,guard):chars==null){return string.slice(trimmedLeftIndex(string),trimmedRightIndex(string)+1)}chars=chars+"";return string.slice(charsLeftIndex(string,chars),charsRightIndex(string,chars)+1)}function trimLeft(string,chars,guard){var value=string;string=baseToString(string);if(!string){return string}if(guard?isIterateeCall(value,chars,guard):chars==null){return string.slice(trimmedLeftIndex(string))}return string.slice(charsLeftIndex(string,chars+""))}function trimRight(string,chars,guard){var value=string;string=baseToString(string);if(!string){return string}if(guard?isIterateeCall(value,chars,guard):chars==null){return string.slice(0,trimmedRightIndex(string)+1)}return string.slice(0,charsRightIndex(string,chars+"")+1)}function trunc(string,options,guard){if(guard&&isIterateeCall(string,options,guard)){options=undefined}var length=DEFAULT_TRUNC_LENGTH,omission=DEFAULT_TRUNC_OMISSION;if(options!=null){if(isObject(options)){var separator="separator"in options?options.separator:separator;length="length"in options?+options.length||0:length;omission="omission"in 
options?baseToString(options.omission):omission}else{length=+options||0}}string=baseToString(string);if(length>=string.length){return string}var end=length-omission.length;if(end<1){return omission}var result=string.slice(0,end);if(separator==null){return result+omission}if(isRegExp(separator)){if(string.slice(end).search(separator)){var match,newEnd,substring=string.slice(0,end);if(!separator.global){separator=RegExp(separator.source,(reFlags.exec(separator)||"")+"g")}separator.lastIndex=0;while(match=separator.exec(substring)){newEnd=match.index}result=result.slice(0,newEnd==null?end:newEnd)}}else if(string.indexOf(separator,end)!=end){var index=result.lastIndexOf(separator);if(index>-1){result=result.slice(0,index)}}return result+omission}function unescape(string){string=baseToString(string);return string&&reHasEscapedHtml.test(string)?string.replace(reEscapedHtml,unescapeHtmlChar):string}function words(string,pattern,guard){if(guard&&isIterateeCall(string,pattern,guard)){pattern=undefined}string=baseToString(string);return string.match(pattern||reWords)||[]}var attempt=restParam(function(func,args){try{return func.apply(undefined,args)}catch(e){return isError(e)?e:new Error(e)}});function callback(func,thisArg,guard){if(guard&&isIterateeCall(func,thisArg,guard)){thisArg=undefined}return isObjectLike(func)?matches(func):baseCallback(func,thisArg)}function constant(value){return function(){return value}}function identity(value){return value}function matches(source){return baseMatches(baseClone(source,true))}function matchesProperty(path,srcValue){return baseMatchesProperty(path,baseClone(srcValue,true))}var method=restParam(function(path,args){return function(object){return invokePath(object,path,args)}});var methodOf=restParam(function(object,args){return function(path){return invokePath(object,path,args)}});function mixin(object,source,options){if(options==null){var 
isObj=isObject(source),props=isObj?keys(source):undefined,methodNames=props&&props.length?baseFunctions(source,props):undefined;if(!(methodNames?methodNames.length:isObj)){methodNames=false;options=source;source=object;object=this}}if(!methodNames){methodNames=baseFunctions(source,keys(source))}var chain=true,index=-1,isFunc=isFunction(object),length=methodNames.length;if(options===false){chain=false}else if(isObject(options)&&"chain"in options){chain=options.chain}while(++index0||end<0)){return new LazyWrapper(result)}if(start<0){result=result.takeRight(-start)}else if(start){result=result.drop(start)}if(end!==undefined){end=+end||0;result=end<0?result.dropRight(-end):result.take(end-start)}return result};LazyWrapper.prototype.takeRightWhile=function(predicate,thisArg){return this.reverse().takeWhile(predicate,thisArg).reverse()};LazyWrapper.prototype.toArray=function(){return this.take(POSITIVE_INFINITY)};baseForOwn(LazyWrapper.prototype,function(func,methodName){var checkIteratee=/^(?:filter|map|reject)|While$/.test(methodName),retUnwrapped=/^(?:first|last)$/.test(methodName),lodashFunc=lodash[retUnwrapped?"take"+(methodName=="last"?"Right":""):methodName];if(!lodashFunc){return}lodash.prototype[methodName]=function(){var args=retUnwrapped?[1]:arguments,chainAll=this.__chain__,value=this.__wrapped__,isHybrid=!!this.__actions__.length,isLazy=value instanceof LazyWrapper,iteratee=args[0],useLazy=isLazy||isArray(value);if(useLazy&&checkIteratee&&typeof iteratee=="function"&&iteratee.length!=1){isLazy=useLazy=false}var interceptor=function(value){return retUnwrapped&&chainAll?lodashFunc(value,1)[0]:lodashFunc.apply(undefined,arrayPush([value],args))};var action={func:thru,args:[interceptor],thisArg:undefined},onlyLazy=isLazy&&!isHybrid;if(retUnwrapped&&!chainAll){if(onlyLazy){value=value.clone();value.__actions__.push(action);return func.call(value)}return lodashFunc.call(undefined,this.value())[0]}if(!retUnwrapped&&useLazy){value=onlyLazy?value:new 
LazyWrapper(this);var result=func.apply(value,args);result.__actions__.push(action);return new LodashWrapper(result,chainAll)}return this.thru(interceptor)}});arrayEach(["join","pop","push","replace","shift","sort","splice","split","unshift"],function(methodName){var func=(/^(?:replace|split)$/.test(methodName)?stringProto:arrayProto)[methodName],chainName=/^(?:push|sort|unshift)$/.test(methodName)?"tap":"thru",retUnwrapped=/^(?:join|pop|replace|shift)$/.test(methodName);lodash.prototype[methodName]=function(){var args=arguments;if(retUnwrapped&&!this.__chain__){return func.apply(this.value(),args)}return this[chainName](function(value){return func.apply(value,args)})}});baseForOwn(LazyWrapper.prototype,function(func,methodName){var lodashFunc=lodash[methodName];if(lodashFunc){var key=lodashFunc.name,names=realNames[key]||(realNames[key]=[]);names.push({name:methodName,func:lodashFunc})}});realNames[createHybridWrapper(undefined,BIND_KEY_FLAG).name]=[{name:"wrapper",func:undefined}];LazyWrapper.prototype.clone=lazyClone;LazyWrapper.prototype.reverse=lazyReverse;LazyWrapper.prototype.value=lazyValue;lodash.prototype.chain=wrapperChain;lodash.prototype.commit=wrapperCommit;lodash.prototype.concat=wrapperConcat;lodash.prototype.plant=wrapperPlant;lodash.prototype.reverse=wrapperReverse;lodash.prototype.toString=wrapperToString;lodash.prototype.run=lodash.prototype.toJSON=lodash.prototype.valueOf=lodash.prototype.value=wrapperValue;lodash.prototype.collect=lodash.prototype.map;lodash.prototype.head=lodash.prototype.first;lodash.prototype.select=lodash.prototype.filter;lodash.prototype.tail=lodash.prototype.rest;return lodash}var _=runInContext();if(typeof define=="function"&&typeof define.amd=="object"&&define.amd){root._=_;define(function(){return _})}else if(freeExports&&freeModule){if(moduleExports){(freeModule.exports=_)._=_}else{freeExports._=_}}else{root._=_}}).call(this)}).call(this,typeof global!=="undefined"?global:typeof self!=="undefined"?self:typeof 
window!=="undefined"?window:{})},{}],3:[function(require,module,exports){(function(window,document,undefined){var _MAP={8:"backspace",9:"tab",13:"enter",16:"shift",17:"ctrl",18:"alt",20:"capslock",27:"esc",32:"space",33:"pageup",34:"pagedown",35:"end",36:"home",37:"left",38:"up",39:"right",40:"down",45:"ins",46:"del",91:"meta",93:"meta",224:"meta"};var _KEYCODE_MAP={106:"*",107:"+",109:"-",110:".",111:"/",186:";",187:"=",188:",",189:"-",190:".",191:"/",192:"`",219:"[",220:"\\",221:"]",222:"'"};var _SHIFT_MAP={"~":"`","!":"1","@":"2","#":"3",$:"4","%":"5","^":"6","&":"7","*":"8","(":"9",")":"0",_:"-","+":"=",":":";",'"':"'","<":",",">":".","?":"/","|":"\\"};var _SPECIAL_ALIASES={option:"alt",command:"meta",return:"enter",escape:"esc",plus:"+",mod:/Mac|iPod|iPhone|iPad/.test(navigator.platform)?"meta":"ctrl"};var _REVERSE_MAP;for(var i=1;i<20;++i){_MAP[111+i]="f"+i}for(i=0;i<=9;++i){_MAP[i+96]=i}function _addEvent(object,type,callback){if(object.addEventListener){object.addEventListener(type,callback,false);return}object.attachEvent("on"+type,callback)}function _characterFromEvent(e){if(e.type=="keypress"){var character=String.fromCharCode(e.which);if(!e.shiftKey){character=character.toLowerCase()}return character}if(_MAP[e.which]){return _MAP[e.which]}if(_KEYCODE_MAP[e.which]){return _KEYCODE_MAP[e.which]}return String.fromCharCode(e.which).toLowerCase()}function _modifiersMatch(modifiers1,modifiers2){return modifiers1.sort().join(",")===modifiers2.sort().join(",")}function _eventModifiers(e){var modifiers=[];if(e.shiftKey){modifiers.push("shift")}if(e.altKey){modifiers.push("alt")}if(e.ctrlKey){modifiers.push("ctrl")}if(e.metaKey){modifiers.push("meta")}return modifiers}function _preventDefault(e){if(e.preventDefault){e.preventDefault();return}e.returnValue=false}function _stopPropagation(e){if(e.stopPropagation){e.stopPropagation();return}e.cancelBubble=true}function _isModifier(key){return key=="shift"||key=="ctrl"||key=="alt"||key=="meta"}function 
_getReverseMap(){if(!_REVERSE_MAP){_REVERSE_MAP={};for(var key in _MAP){if(key>95&&key<112){continue}if(_MAP.hasOwnProperty(key)){_REVERSE_MAP[_MAP[key]]=key}}}return _REVERSE_MAP}function _pickBestAction(key,modifiers,action){if(!action){action=_getReverseMap()[key]?"keydown":"keypress"}if(action=="keypress"&&modifiers.length){action="keydown"}return action}function _keysFromString(combination){if(combination==="+"){return["+"]}combination=combination.replace(/\+{2}/g,"+plus");return combination.split("+")}function _getKeyInfo(combination,action){var keys;var key;var i;var modifiers=[];keys=_keysFromString(combination);for(i=0;i1){_bindSequence(combination,sequence,callback,action);return}info=_getKeyInfo(combination,action);self._callbacks[info.key]=self._callbacks[info.key]||[];_getMatches(info.key,info.modifiers,{type:info.action},sequenceName,combination,level);self._callbacks[info.key][sequenceName?"unshift":"push"]({callback:callback,modifiers:info.modifiers,action:info.action,seq:sequenceName,level:level,combo:combination})}self._bindMultiple=function(combinations,callback,action){for(var i=0;i-1){return false}if(_belongsTo(element,self.target)){return false}return element.tagName=="INPUT"||element.tagName=="SELECT"||element.tagName=="TEXTAREA"||element.isContentEditable};Mousetrap.prototype.handleKey=function(){var self=this;return self._handleKey.apply(self,arguments)};Mousetrap.init=function(){var documentMousetrap=Mousetrap(document);for(var method in documentMousetrap){if(method.charAt(0)!=="_"){Mousetrap[method]=function(method){return function(){return documentMousetrap[method].apply(documentMousetrap,arguments)}}(method)}}};Mousetrap.init();window.Mousetrap=Mousetrap;if(typeof module!=="undefined"&&module.exports){module.exports=Mousetrap}if(typeof define==="function"&&define.amd){define(function(){return Mousetrap})}})(window,document)},{}],4:[function(require,module,exports){(function(process){function normalizeArray(parts,allowAboveRoot){var 
up=0;for(var i=parts.length-1;i>=0;i--){var last=parts[i];if(last==="."){parts.splice(i,1)}else if(last===".."){parts.splice(i,1);up++}else if(up){parts.splice(i,1);up--}}if(allowAboveRoot){for(;up--;up){parts.unshift("..")}}return parts}var splitPathRe=/^(\/?|)([\s\S]*?)((?:\.{1,2}|[^\/]+?|)(\.[^.\/]*|))(?:[\/]*)$/;var splitPath=function(filename){return splitPathRe.exec(filename).slice(1)};exports.resolve=function(){var resolvedPath="",resolvedAbsolute=false;for(var i=arguments.length-1;i>=-1&&!resolvedAbsolute;i--){var path=i>=0?arguments[i]:process.cwd();if(typeof path!=="string"){throw new TypeError("Arguments to path.resolve must be strings")}else if(!path){continue}resolvedPath=path+"/"+resolvedPath;resolvedAbsolute=path.charAt(0)==="/"}resolvedPath=normalizeArray(filter(resolvedPath.split("/"),function(p){return!!p}),!resolvedAbsolute).join("/");return(resolvedAbsolute?"/":"")+resolvedPath||"."};exports.normalize=function(path){var isAbsolute=exports.isAbsolute(path),trailingSlash=substr(path,-1)==="/";path=normalizeArray(filter(path.split("/"),function(p){return!!p}),!isAbsolute).join("/");if(!path&&!isAbsolute){path="."}if(path&&trailingSlash){path+="/"}return(isAbsolute?"/":"")+path};exports.isAbsolute=function(path){return path.charAt(0)==="/"};exports.join=function(){var paths=Array.prototype.slice.call(arguments,0);return exports.normalize(filter(paths,function(p,index){if(typeof p!=="string"){throw new TypeError("Arguments to path.join must be strings")}return p}).join("/"))};exports.relative=function(from,to){from=exports.resolve(from).substr(1);to=exports.resolve(to).substr(1);function trim(arr){var start=0;for(;start=0;end--){if(arr[end]!=="")break}if(start>end)return[];return arr.slice(start,end-start+1)}var fromParts=trim(from.split("/"));var toParts=trim(to.split("/"));var length=Math.min(fromParts.length,toParts.length);var samePartsLength=length;for(var i=0;i1){for(var i=1;i= 0x80 (not a basic code point)","invalid-input":"Invalid 
input"},baseMinusTMin=base-tMin,floor=Math.floor,stringFromCharCode=String.fromCharCode,key;function error(type){throw RangeError(errors[type])}function map(array,fn){var length=array.length;var result=[];while(length--){result[length]=fn(array[length])}return result}function mapDomain(string,fn){var parts=string.split("@");var result="";if(parts.length>1){result=parts[0]+"@";string=parts[1]}string=string.replace(regexSeparators,".");var labels=string.split(".");var encoded=map(labels,fn).join(".");return result+encoded}function ucs2decode(string){var output=[],counter=0,length=string.length,value,extra;while(counter=55296&&value<=56319&&counter65535){value-=65536;output+=stringFromCharCode(value>>>10&1023|55296);value=56320|value&1023}output+=stringFromCharCode(value);return output}).join("")}function basicToDigit(codePoint){if(codePoint-48<10){return codePoint-22}if(codePoint-65<26){return codePoint-65}if(codePoint-97<26){return codePoint-97}return base}function digitToBasic(digit,flag){return digit+22+75*(digit<26)-((flag!=0)<<5)}function adapt(delta,numPoints,firstTime){var k=0;delta=firstTime?floor(delta/damp):delta>>1;delta+=floor(delta/numPoints);for(;delta>baseMinusTMin*tMax>>1;k+=base){delta=floor(delta/baseMinusTMin)}return floor(k+(baseMinusTMin+1)*delta/(delta+skew))}function decode(input){var 
output=[],inputLength=input.length,out,i=0,n=initialN,bias=initialBias,basic,j,index,oldi,w,k,digit,t,baseMinusT;basic=input.lastIndexOf(delimiter);if(basic<0){basic=0}for(j=0;j=128){error("not-basic")}output.push(input.charCodeAt(j))}for(index=basic>0?basic+1:0;index=inputLength){error("invalid-input")}digit=basicToDigit(input.charCodeAt(index++));if(digit>=base||digit>floor((maxInt-i)/w)){error("overflow")}i+=digit*w;t=k<=bias?tMin:k>=bias+tMax?tMax:k-bias;if(digitfloor(maxInt/baseMinusT)){error("overflow")}w*=baseMinusT}out=output.length+1;bias=adapt(i-oldi,out,oldi==0);if(floor(i/out)>maxInt-n){error("overflow")}n+=floor(i/out);i%=out;output.splice(i++,0,n)}return ucs2encode(output)}function encode(input){var n,delta,handledCPCount,basicLength,bias,j,m,q,k,t,currentValue,output=[],inputLength,handledCPCountPlusOne,baseMinusT,qMinusT;input=ucs2decode(input);inputLength=input.length;n=initialN;delta=0;bias=initialBias;for(j=0;j=n&¤tValuefloor((maxInt-delta)/handledCPCountPlusOne)){error("overflow")}delta+=(m-n)*handledCPCountPlusOne;n=m;for(j=0;jmaxInt){error("overflow")}if(currentValue==n){for(q=delta,k=base;;k+=base){t=k<=bias?tMin:k>=bias+tMax?tMax:k-bias;if(q0&&len>maxKeys){len=maxKeys}for(var i=0;i=0){kstr=x.substr(0,idx);vstr=x.substr(idx+1)}else{kstr=x;vstr=""}k=decodeURIComponent(kstr);v=decodeURIComponent(vstr);if(!hasOwnProperty(obj,k)){obj[k]=v}else if(isArray(obj[k])){obj[k].push(v)}else{obj[k]=[obj[k],v]}}return obj};var isArray=Array.isArray||function(xs){return Object.prototype.toString.call(xs)==="[object Array]"}},{}],8:[function(require,module,exports){"use strict";var stringifyPrimitive=function(v){switch(typeof v){case"string":return v;case"boolean":return v?"true":"false";case"number":return isFinite(v)?v:"";default:return""}};module.exports=function(obj,sep,eq,name){sep=sep||"&";eq=eq||"=";if(obj===null){obj=undefined}if(typeof obj==="object"){return map(objectKeys(obj),function(k){var 
ks=encodeURIComponent(stringifyPrimitive(k))+eq;if(isArray(obj[k])){return map(obj[k],function(v){return ks+encodeURIComponent(stringifyPrimitive(v))}).join(sep)}else{return ks+encodeURIComponent(stringifyPrimitive(obj[k]))}}).join(sep)}if(!name)return"";return encodeURIComponent(stringifyPrimitive(name))+eq+encodeURIComponent(stringifyPrimitive(obj))};var isArray=Array.isArray||function(xs){return Object.prototype.toString.call(xs)==="[object Array]"};function map(xs,f){if(xs.map)return xs.map(f);var res=[];for(var i=0;i",'"',"`"," ","\r","\n","\t"],unwise=["{","}","|","\\","^","`"].concat(delims),autoEscape=["'"].concat(unwise),nonHostChars=["%","/","?",";","#"].concat(autoEscape),hostEndingChars=["/","?","#"],hostnameMaxLen=255,hostnamePartPattern=/^[a-z0-9A-Z_-]{0,63}$/,hostnamePartStart=/^([a-z0-9A-Z_-]{0,63})(.*)$/,unsafeProtocol={javascript:true,"javascript:":true},hostlessProtocol={javascript:true,"javascript:":true},slashedProtocol={http:true,https:true,ftp:true,gopher:true,file:true,"http:":true,"https:":true,"ftp:":true,"gopher:":true,"file:":true},querystring=require("querystring");function urlParse(url,parseQueryString,slashesDenoteHost){if(url&&isObject(url)&&url instanceof Url)return url;var u=new Url;u.parse(url,parseQueryString,slashesDenoteHost);return u}Url.prototype.parse=function(url,parseQueryString,slashesDenoteHost){if(!isString(url)){throw new TypeError("Parameter 'url' must be a string, not "+typeof url)}var rest=url;rest=rest.trim();var proto=protocolPattern.exec(rest);if(proto){proto=proto[0];var lowerProto=proto.toLowerCase();this.protocol=lowerProto;rest=rest.substr(proto.length)}if(slashesDenoteHost||proto||rest.match(/^\/\/[^@\/]+@[^@\/]+/)){var slashes=rest.substr(0,2)==="//";if(slashes&&!(proto&&hostlessProtocol[proto])){rest=rest.substr(2);this.slashes=true}}if(!hostlessProtocol[proto]&&(slashes||proto&&!slashedProtocol[proto])){var hostEnd=-1;for(var 
i=0;i127){newpart+="x"}else{newpart+=part[j]}}if(!newpart.match(hostnamePartPattern)){var validParts=hostparts.slice(0,i);var notHost=hostparts.slice(i+1);var bit=part.match(hostnamePartStart);if(bit){validParts.push(bit[1]);notHost.unshift(bit[2])}if(notHost.length){rest="/"+notHost.join(".")+rest}this.hostname=validParts.join(".");break}}}}if(this.hostname.length>hostnameMaxLen){this.hostname=""}else{this.hostname=this.hostname.toLowerCase()}if(!ipv6Hostname){var domainArray=this.hostname.split(".");var newOut=[];for(var i=0;i0?result.host.split("@"):false;if(authInHost){result.auth=authInHost.shift();result.host=result.hostname=authInHost.shift()}}result.search=relative.search;result.query=relative.query;if(!isNull(result.pathname)||!isNull(result.search)){result.path=(result.pathname?result.pathname:"")+(result.search?result.search:"")}result.href=result.format();return result}if(!srcPath.length){result.pathname=null;if(result.search){result.path="/"+result.search}else{result.path=null}result.href=result.format();return result}var last=srcPath.slice(-1)[0];var hasTrailingSlash=(result.host||relative.host)&&(last==="."||last==="..")||last==="";var up=0;for(var i=srcPath.length;i>=0;i--){last=srcPath[i];if(last=="."){srcPath.splice(i,1)}else if(last===".."){srcPath.splice(i,1);up++}else if(up){srcPath.splice(i,1);up--}}if(!mustEndAbs&&!removeAllDots){for(;up--;up){srcPath.unshift("..")}}if(mustEndAbs&&srcPath[0]!==""&&(!srcPath[0]||srcPath[0].charAt(0)!=="/")){srcPath.unshift("")}if(hasTrailingSlash&&srcPath.join("/").substr(-1)!=="/"){srcPath.push("")}var isAbsolute=srcPath[0]===""||srcPath[0]&&srcPath[0].charAt(0)==="/";if(psychotic){result.hostname=result.host=isAbsolute?"":srcPath.length?srcPath.shift():"";var 
authInHost=result.host&&result.host.indexOf("@")>0?result.host.split("@"):false;if(authInHost){result.auth=authInHost.shift();result.host=result.hostname=authInHost.shift()}}mustEndAbs=mustEndAbs||result.host&&srcPath.length;if(mustEndAbs&&!isAbsolute){srcPath.unshift("")}if(!srcPath.length){result.pathname=null;result.path=null}else{result.pathname=srcPath.join("/")}if(!isNull(result.pathname)||!isNull(result.search)){result.path=(result.pathname?result.pathname:"")+(result.search?result.search:"")}result.auth=relative.auth||result.auth;result.slashes=result.slashes||relative.slashes;result.href=result.format();return result};Url.prototype.parseHost=function(){var host=this.host;var port=portPattern.exec(host);if(port){port=port[0];if(port!==":"){this.port=port.substr(1)}host=host.substr(0,host.length-port.length)}if(host)this.hostname=host};function isString(arg){return typeof arg==="string"}function isObject(arg){return typeof arg==="object"&&arg!==null}function isNull(arg){return arg===null}function isNullOrUndefined(arg){return arg==null}},{punycode:6,querystring:9}],11:[function(require,module,exports){var $=require("jquery");function toggleDropdown(e){var $dropdown=$(e.currentTarget).parent().find(".dropdown-menu");$dropdown.toggleClass("open");e.stopPropagation();e.preventDefault()}function closeDropdown(e){$(".dropdown-menu").removeClass("open")}function init(){$(document).on("click",".toggle-dropdown",toggleDropdown);$(document).on("click",".dropdown-menu",function(e){e.stopPropagation()});$(document).on("click",closeDropdown)}module.exports={init:init}},{jquery:1}],12:[function(require,module,exports){var $=require("jquery");module.exports=$({})},{jquery:1}],13:[function(require,module,exports){var $=require("jquery");var _=require("lodash");var storage=require("./storage");var dropdown=require("./dropdown");var events=require("./events");var state=require("./state");var keyboard=require("./keyboard");var navigation=require("./navigation");var 
sidebar=require("./sidebar");var toolbar=require("./toolbar");function start(config){sidebar.init();keyboard.init();dropdown.init();navigation.init();toolbar.createButton({index:0,icon:"fa fa-align-justify",label:"Toggle Sidebar",onClick:function(e){e.preventDefault();sidebar.toggle()}});events.trigger("start",config);navigation.notify()}var gitbook={start:start,events:events,state:state,toolbar:toolbar,sidebar:sidebar,storage:storage,keyboard:keyboard};var MODULES={gitbook:gitbook,jquery:$,lodash:_};window.gitbook=gitbook;window.$=$;window.jQuery=$;gitbook.require=function(mods,fn){mods=_.map(mods,function(mod){mod=mod.toLowerCase();if(!MODULES[mod]){throw new Error("GitBook module "+mod+" doesn't exist")}return MODULES[mod]});fn.apply(null,mods)};module.exports={}},{"./dropdown":11,"./events":12,"./keyboard":14,"./navigation":16,"./sidebar":18,"./state":19,"./storage":20,"./toolbar":21,jquery:1,lodash:2}],14:[function(require,module,exports){var Mousetrap=require("mousetrap");var navigation=require("./navigation");var sidebar=require("./sidebar");function bindShortcut(keys,fn){Mousetrap.bind(keys,function(e){fn();return false})}function init(){bindShortcut(["right"],function(e){navigation.goNext()});bindShortcut(["left"],function(e){navigation.goPrev()});bindShortcut(["s"],function(e){sidebar.toggle()})}module.exports={init:init,bind:bindShortcut}},{"./navigation":16,"./sidebar":18,mousetrap:3}],15:[function(require,module,exports){var state=require("./state");function showLoading(p){state.$book.addClass("is-loading");p.always(function(){state.$book.removeClass("is-loading")});return p}module.exports={show:showLoading}},{"./state":19}],16:[function(require,module,exports){var $=require("jquery");var url=require("url");var events=require("./events");var state=require("./state");var loading=require("./loading");var usePushState=typeof history.pushState!=="undefined";function handleNavigation(relativeUrl,push){var 
uri=url.resolve(window.location.pathname,relativeUrl);notifyPageChange();location.href=relativeUrl;return}function updateNavigationPosition(){var bodyInnerWidth,pageWrapperWidth;bodyInnerWidth=parseInt($(".body-inner").css("width"),10);pageWrapperWidth=parseInt($(".page-wrapper").css("width"),10);$(".navigation-next").css("margin-right",bodyInnerWidth-pageWrapperWidth+"px")}function notifyPageChange(){events.trigger("page.change")}function preparePage(notify){var $bookBody=$(".book-body");var $bookInner=$bookBody.find(".body-inner");var $pageWrapper=$bookInner.find(".page-wrapper");updateNavigationPosition();$bookInner.scrollTop(0);$bookBody.scrollTop(0);if(notify!==false)notifyPageChange()}function isLeftClickEvent(e){return e.button===0}function isModifiedEvent(e){return!!(e.metaKey||e.altKey||e.ctrlKey||e.shiftKey)}function handlePagination(e){if(isModifiedEvent(e)||!isLeftClickEvent(e)){return}e.stopPropagation();e.preventDefault();var url=$(this).attr("href");if(url)handleNavigation(url,true)}function goNext(){var url=$(".navigation-next").attr("href");if(url)handleNavigation(url,true)}function goPrev(){var url=$(".navigation-prev").attr("href");if(url)handleNavigation(url,true)}function init(){$.ajaxSetup({});if(location.protocol!=="file:"){history.replaceState({path:window.location.href},"")}window.onpopstate=function(event){if(event.state===null){return}return handleNavigation(event.state.path,false)};$(document).on("click",".navigation-prev",handlePagination);$(document).on("click",".navigation-next",handlePagination);$(document).on("click",".summary [data-path] a",handlePagination);$(window).resize(updateNavigationPosition);preparePage(false)}module.exports={init:init,goNext:goNext,goPrev:goPrev,notify:notifyPageChange}},{"./events":12,"./loading":15,"./state":19,jquery:1,url:10}],17:[function(require,module,exports){module.exports={isMobile:function(){return document.body.clientWidth<=600}}},{}],18:[function(require,module,exports){var 
$=require("jquery");var _=require("lodash");var storage=require("./storage");var platform=require("./platform");var state=require("./state");function toggleSidebar(_state,animation){if(state!=null&&isOpen()==_state)return;if(animation==null)animation=true;state.$book.toggleClass("without-animation",!animation);state.$book.toggleClass("with-summary",_state);storage.set("sidebar",isOpen())}function isOpen(){return state.$book.hasClass("with-summary")}function init(){if(platform.isMobile()){toggleSidebar(false,false)}else{toggleSidebar(storage.get("sidebar",true),false)}$(document).on("click",".book-summary li.chapter a",function(e){if(platform.isMobile())toggleSidebar(false,false)})}function filterSummary(paths){var $summary=$(".book-summary");$summary.find("li").each(function(){var path=$(this).data("path");var st=paths==null||_.contains(paths,path);$(this).toggle(st);if(st)$(this).parents("li").show()})}module.exports={init:init,isOpen:isOpen,toggle:toggleSidebar,filter:filterSummary}},{"./platform":17,"./state":19,"./storage":20,jquery:1,lodash:2}],19:[function(require,module,exports){var $=require("jquery");var url=require("url");var path=require("path");var state={};state.update=function(dom){var $book=$(dom.find(".book"));state.$book=$book;state.level=$book.data("level");state.basePath=$book.data("basepath");state.innerLanguage=$book.data("innerlanguage");state.revision=$book.data("revision");state.filepath=$book.data("filepath");state.chapterTitle=$book.data("chapter-title");state.root=url.resolve(location.protocol+"//"+location.host,path.dirname(path.resolve(location.pathname.replace(/\/$/,"/index.html"),state.basePath))).replace(/\/?$/,"/");state.bookRoot=state.innerLanguage?url.resolve(state.root,".."):state.root};state.update($);module.exports=state},{jquery:1,path:4,url:10}],20:[function(require,module,exports){var 
baseKey="";module.exports={setBaseKey:function(key){baseKey=key},set:function(key,value){key=baseKey+":"+key;try{sessionStorage[key]=JSON.stringify(value)}catch(e){}},get:function(key,def){key=baseKey+":"+key;if(sessionStorage[key]===undefined)return def;try{var v=JSON.parse(sessionStorage[key]);return v==null?def:v}catch(err){return sessionStorage[key]||def}},remove:function(key){key=baseKey+":"+key;sessionStorage.removeItem(key)}}},{}],21:[function(require,module,exports){var $=require("jquery");var _=require("lodash");var events=require("./events");var buttons=[];function insertAt(parent,selector,index,element){var lastIndex=parent.children(selector).length;if(index<0){index=Math.max(0,lastIndex+1+index)}parent.append(element);if(index",{class:"dropdown-menu",html:''});if(_.isString(dropdown)){$menu.append(dropdown)}else{var groups=_.map(dropdown,function(group){if(_.isArray(group))return group;else return[group]});_.each(groups,function(group){var $group=$("
",{class:"buttons"});var sizeClass="size-"+group.length;_.each(group,function(btn){btn=_.defaults(btn||{},{text:"",className:"",onClick:defaultOnClick});var $btn=$("'; + var clipboard; + + gitbook.events.bind("page.change", function() { + + if (!ClipboardJS.isSupported()) return; + + // the page.change event is thrown twice: before and after the page changes + if (clipboard) { + // clipboard is already defined but we are on the same page + if (clipboard._prevPage === window.location.pathname) return; + // clipboard is already defined and url path change + // we can deduct that we are before page changes + clipboard.destroy(); // destroy the previous events listeners + clipboard = undefined; // reset the clipboard object + return; + } + + $(copyButton).prependTo("div.sourceCode"); + + clipboard = new ClipboardJS(".copy-to-clipboard-button", { + text: function(trigger) { + return trigger.parentNode.textContent; + } + }); + + clipboard._prevPage = window.location.pathname + + }); + +}); diff --git a/libs/gitbook-2.6.7/js/plugin-fontsettings.js b/libs/gitbook-2.6.7/js/plugin-fontsettings.js new file mode 100644 index 00000000..a70f0fb3 --- /dev/null +++ b/libs/gitbook-2.6.7/js/plugin-fontsettings.js @@ -0,0 +1,152 @@ +gitbook.require(["gitbook", "lodash", "jQuery"], function(gitbook, _, $) { + var fontState; + + var THEMES = { + "white": 0, + "sepia": 1, + "night": 2 + }; + + var FAMILY = { + "serif": 0, + "sans": 1 + }; + + // Save current font settings + function saveFontSettings() { + gitbook.storage.set("fontState", fontState); + update(); + } + + // Increase font size + function enlargeFontSize(e) { + e.preventDefault(); + if (fontState.size >= 4) return; + + fontState.size++; + saveFontSettings(); + }; + + // Decrease font size + function reduceFontSize(e) { + e.preventDefault(); + if (fontState.size <= 0) return; + + fontState.size--; + saveFontSettings(); + }; + + // Change font family + function changeFontFamily(index, e) { + e.preventDefault(); + + 
fontState.family = index; + saveFontSettings(); + }; + + // Change type of color + function changeColorTheme(index, e) { + e.preventDefault(); + + var $book = $(".book"); + + if (fontState.theme !== 0) + $book.removeClass("color-theme-"+fontState.theme); + + fontState.theme = index; + if (fontState.theme !== 0) + $book.addClass("color-theme-"+fontState.theme); + + saveFontSettings(); + }; + + function update() { + var $book = gitbook.state.$book; + + $(".font-settings .font-family-list li").removeClass("active"); + $(".font-settings .font-family-list li:nth-child("+(fontState.family+1)+")").addClass("active"); + + $book[0].className = $book[0].className.replace(/\bfont-\S+/g, ''); + $book.addClass("font-size-"+fontState.size); + $book.addClass("font-family-"+fontState.family); + + if(fontState.theme !== 0) { + $book[0].className = $book[0].className.replace(/\bcolor-theme-\S+/g, ''); + $book.addClass("color-theme-"+fontState.theme); + } + }; + + function init(config) { + var $bookBody, $book; + + //Find DOM elements. 
+ $book = gitbook.state.$book; + $bookBody = $book.find(".book-body"); + + // Instantiate font state object + fontState = gitbook.storage.get("fontState", { + size: config.size || 2, + family: FAMILY[config.family || "sans"], + theme: THEMES[config.theme || "white"] + }); + + update(); + }; + + + gitbook.events.bind("start", function(e, config) { + var opts = config.fontsettings; + if (!opts) return; + + // Create buttons in toolbar + gitbook.toolbar.createButton({ + icon: 'fa fa-font', + label: 'Font Settings', + className: 'font-settings', + dropdown: [ + [ + { + text: 'A', + className: 'font-reduce', + onClick: reduceFontSize + }, + { + text: 'A', + className: 'font-enlarge', + onClick: enlargeFontSize + } + ], + [ + { + text: 'Serif', + onClick: _.partial(changeFontFamily, 0) + }, + { + text: 'Sans', + onClick: _.partial(changeFontFamily, 1) + } + ], + [ + { + text: 'White', + onClick: _.partial(changeColorTheme, 0) + }, + { + text: 'Sepia', + onClick: _.partial(changeColorTheme, 1) + }, + { + text: 'Night', + onClick: _.partial(changeColorTheme, 2) + } + ] + ] + }); + + + // Init current settings + init(opts); + }); +}); + + diff --git a/libs/gitbook-2.6.7/js/plugin-search.js b/libs/gitbook-2.6.7/js/plugin-search.js new file mode 100644 index 00000000..747fcceb --- /dev/null +++ b/libs/gitbook-2.6.7/js/plugin-search.js @@ -0,0 +1,270 @@ +gitbook.require(["gitbook", "lodash", "jQuery"], function(gitbook, _, $) { + var index = null; + var fuse = null; + var _search = {engine: 'lunr', opts: {}}; + var $searchInput, $searchLabel, $searchForm; + var $highlighted = [], hi, hiOpts = { className: 'search-highlight' }; + var collapse = false, toc_visible = []; + + function init(config) { + // Instantiate search settings + _search = gitbook.storage.get("search", { + engine: config.search.engine || 'lunr', + opts: config.search.options || {}, + }); + }; + + // Save current search settings + function saveSearchSettings() { + gitbook.storage.set("search", _search); + } + + 
// Use a specific index + function loadIndex(data) { + // [Yihui] In bookdown, I use a character matrix to store the chapter + // content, and the index is dynamically built on the client side. + // Gitbook prebuilds the index data instead: https://github.com/GitbookIO/plugin-search + // We can certainly do that via R packages V8 and jsonlite, but let's + // see how slow it really is before improving it. On the other hand, + // lunr cannot handle non-English text very well, e.g. the default + // tokenizer cannot deal with Chinese text, so we may want to replace + // lunr with a dumb simple text matching approach. + if (_search.engine === 'lunr') { + index = lunr(function () { + this.ref('url'); + this.field('title', { boost: 10 }); + this.field('body'); + }); + data.map(function(item) { + index.add({ + url: item[0], + title: item[1], + body: item[2] + }); + }); + return; + } + fuse = new Fuse(data.map((_data => { + return { + url: _data[0], + title: _data[1], + body: _data[2] + }; + })), Object.assign( + { + includeScore: true, + threshold: 0.1, + ignoreLocation: true, + keys: ["title", "body"] + }, + _search.opts + )); + } + + // Fetch the search index + function fetchIndex() { + return $.getJSON(gitbook.state.basePath+"/search_index.json") + .then(loadIndex); // [Yihui] we need to use this object later + } + + // Search for a term and return results + function search(q) { + let results = []; + switch (_search.engine) { + case 'fuse': + if (!fuse) return; + results = fuse.search(q).map(function(result) { + var parts = result.item.url.split('#'); + return { + path: parts[0], + hash: parts[1] + }; + }); + break; + case 'lunr': + default: + if (!index) return; + results = _.chain(index.search(q)).map(function(result) { + var parts = result.ref.split("#"); + return { + path: parts[0], + hash: parts[1] + }; + }) + .value(); + } + + // [Yihui] Highlight the search keyword on current page + $highlighted = $('.page-inner') + .unhighlight(hiOpts).highlight(q, 
hiOpts).find('span.search-highlight'); + scrollToHighlighted(0); + + return results; + } + + // [Yihui] Scroll the chapter body to the i-th highlighted string + function scrollToHighlighted(d) { + var n = $highlighted.length; + hi = hi === undefined ? 0 : hi + d; + // navignate to the previous/next page in the search results if reached the top/bottom + var b = hi < 0; + if (d !== 0 && (b || hi >= n)) { + var path = currentPath(), n2 = toc_visible.length; + if (n2 === 0) return; + for (var i = b ? 0 : n2; (b && i < n2) || (!b && i >= 0); i += b ? 1 : -1) { + if (toc_visible.eq(i).data('path') === path) break; + } + i += b ? -1 : 1; + if (i < 0) i = n2 - 1; + if (i >= n2) i = 0; + var lnk = toc_visible.eq(i).find('a[href$=".html"]'); + if (lnk.length) lnk[0].click(); + return; + } + if (n === 0) return; + var $p = $highlighted.eq(hi); + $p[0].scrollIntoView(); + $highlighted.css('background-color', ''); + // an orange background color on the current item and removed later + $p.css('background-color', 'orange'); + setTimeout(function() { + $p.css('background-color', ''); + }, 2000); + } + + function currentPath() { + var href = window.location.pathname; + href = href.substr(href.lastIndexOf('/') + 1); + return href === '' ? 'index.html' : href; + } + + // Create search form + function createForm(value) { + if ($searchForm) $searchForm.remove(); + if ($searchLabel) $searchLabel.remove(); + if ($searchInput) $searchInput.remove(); + + $searchForm = $('
', { + 'class': 'book-search', + 'role': 'search' + }); + + $searchLabel = $('",e.querySelectorAll("[msallowcapture^='']").length&&v.push("[*^$]="+M+"*(?:''|\"\")"),e.querySelectorAll("[selected]").length||v.push("\\["+M+"*(?:value|"+R+")"),e.querySelectorAll("[id~="+S+"-]").length||v.push("~="),(t=C.createElement("input")).setAttribute("name",""),e.appendChild(t),e.querySelectorAll("[name='']").length||v.push("\\["+M+"*name"+M+"*="+M+"*(?:''|\"\")"),e.querySelectorAll(":checked").length||v.push(":checked"),e.querySelectorAll("a#"+S+"+*").length||v.push(".#.+[+~]"),e.querySelectorAll("\\\f"),v.push("[\\r\\n\\f]")}),ce(function(e){e.innerHTML="";var t=C.createElement("input");t.setAttribute("type","hidden"),e.appendChild(t).setAttribute("name","D"),e.querySelectorAll("[name=d]").length&&v.push("name"+M+"*[*^$|!~]?="),2!==e.querySelectorAll(":enabled").length&&v.push(":enabled",":disabled"),a.appendChild(e).disabled=!0,2!==e.querySelectorAll(":disabled").length&&v.push(":enabled",":disabled"),e.querySelectorAll("*,:x"),v.push(",.*:")})),(d.matchesSelector=K.test(c=a.matches||a.webkitMatchesSelector||a.mozMatchesSelector||a.oMatchesSelector||a.msMatchesSelector))&&ce(function(e){d.disconnectedMatch=c.call(e,"*"),c.call(e,"[s!='']:x"),s.push("!=",F)}),v=v.length&&new RegExp(v.join("|")),s=s.length&&new RegExp(s.join("|")),t=K.test(a.compareDocumentPosition),y=t||K.test(a.contains)?function(e,t){var n=9===e.nodeType?e.documentElement:e,r=t&&t.parentNode;return e===r||!(!r||1!==r.nodeType||!(n.contains?n.contains(r):e.compareDocumentPosition&&16&e.compareDocumentPosition(r)))}:function(e,t){if(t)while(t=t.parentNode)if(t===e)return!0;return!1},j=t?function(e,t){if(e===t)return l=!0,0;var n=!e.compareDocumentPosition-!t.compareDocumentPosition;return 
n||(1&(n=(e.ownerDocument||e)==(t.ownerDocument||t)?e.compareDocumentPosition(t):1)||!d.sortDetached&&t.compareDocumentPosition(e)===n?e==C||e.ownerDocument==p&&y(p,e)?-1:t==C||t.ownerDocument==p&&y(p,t)?1:u?P(u,e)-P(u,t):0:4&n?-1:1)}:function(e,t){if(e===t)return l=!0,0;var n,r=0,i=e.parentNode,o=t.parentNode,a=[e],s=[t];if(!i||!o)return e==C?-1:t==C?1:i?-1:o?1:u?P(u,e)-P(u,t):0;if(i===o)return pe(e,t);n=e;while(n=n.parentNode)a.unshift(n);n=t;while(n=n.parentNode)s.unshift(n);while(a[r]===s[r])r++;return r?pe(a[r],s[r]):a[r]==p?-1:s[r]==p?1:0}),C},se.matches=function(e,t){return se(e,null,null,t)},se.matchesSelector=function(e,t){if(T(e),d.matchesSelector&&E&&!N[t+" "]&&(!s||!s.test(t))&&(!v||!v.test(t)))try{var n=c.call(e,t);if(n||d.disconnectedMatch||e.document&&11!==e.document.nodeType)return n}catch(e){N(t,!0)}return 0":{dir:"parentNode",first:!0}," ":{dir:"parentNode"},"+":{dir:"previousSibling",first:!0},"~":{dir:"previousSibling"}},preFilter:{ATTR:function(e){return e[1]=e[1].replace(te,ne),e[3]=(e[3]||e[4]||e[5]||"").replace(te,ne),"~="===e[2]&&(e[3]=" "+e[3]+" "),e.slice(0,4)},CHILD:function(e){return e[1]=e[1].toLowerCase(),"nth"===e[1].slice(0,3)?(e[3]||se.error(e[0]),e[4]=+(e[4]?e[5]+(e[6]||1):2*("even"===e[3]||"odd"===e[3])),e[5]=+(e[7]+e[8]||"odd"===e[3])):e[3]&&se.error(e[0]),e},PSEUDO:function(e){var t,n=!e[6]&&e[2];return G.CHILD.test(e[0])?null:(e[3]?e[2]=e[4]||e[5]||"":n&&X.test(n)&&(t=h(n,!0))&&(t=n.indexOf(")",n.length-t)-n.length)&&(e[0]=e[0].slice(0,t),e[2]=n.slice(0,t)),e.slice(0,3))}},filter:{TAG:function(e){var t=e.replace(te,ne).toLowerCase();return"*"===e?function(){return!0}:function(e){return e.nodeName&&e.nodeName.toLowerCase()===t}},CLASS:function(e){var t=m[e+" "];return t||(t=new RegExp("(^|"+M+")"+e+"("+M+"|$)"))&&m(e,function(e){return t.test("string"==typeof e.className&&e.className||"undefined"!=typeof e.getAttribute&&e.getAttribute("class")||"")})},ATTR:function(n,r,i){return function(e){var t=se.attr(e,n);return 
null==t?"!="===r:!r||(t+="","="===r?t===i:"!="===r?t!==i:"^="===r?i&&0===t.indexOf(i):"*="===r?i&&-1:\x20\t\r\n\f]*)[\x20\t\r\n\f]*\/?>(?:<\/\1>|)$/i;function j(e,n,r){return m(n)?S.grep(e,function(e,t){return!!n.call(e,t,e)!==r}):n.nodeType?S.grep(e,function(e){return e===n!==r}):"string"!=typeof n?S.grep(e,function(e){return-1)[^>]*|#([\w-]+))$/;(S.fn.init=function(e,t,n){var r,i;if(!e)return this;if(n=n||D,"string"==typeof e){if(!(r="<"===e[0]&&">"===e[e.length-1]&&3<=e.length?[null,e,null]:q.exec(e))||!r[1]&&t)return!t||t.jquery?(t||n).find(e):this.constructor(t).find(e);if(r[1]){if(t=t instanceof S?t[0]:t,S.merge(this,S.parseHTML(r[1],t&&t.nodeType?t.ownerDocument||t:E,!0)),N.test(r[1])&&S.isPlainObject(t))for(r in t)m(this[r])?this[r](t[r]):this.attr(r,t[r]);return this}return(i=E.getElementById(r[2]))&&(this[0]=i,this.length=1),this}return e.nodeType?(this[0]=e,this.length=1,this):m(e)?void 0!==n.ready?n.ready(e):e(S):S.makeArray(e,this)}).prototype=S.fn,D=S(E);var L=/^(?:parents|prev(?:Until|All))/,H={children:!0,contents:!0,next:!0,prev:!0};function O(e,t){while((e=e[t])&&1!==e.nodeType);return e}S.fn.extend({has:function(e){var t=S(e,this),n=t.length;return this.filter(function(){for(var e=0;e\x20\t\r\n\f]*)/i,he=/^$|^module$|\/(?:java|ecma)script/i;ce=E.createDocumentFragment().appendChild(E.createElement("div")),(fe=E.createElement("input")).setAttribute("type","radio"),fe.setAttribute("checked","checked"),fe.setAttribute("name","t"),ce.appendChild(fe),y.checkClone=ce.cloneNode(!0).cloneNode(!0).lastChild.checked,ce.innerHTML="",y.noCloneChecked=!!ce.cloneNode(!0).lastChild.defaultValue,ce.innerHTML="",y.option=!!ce.lastChild;var ge={thead:[1,"","
"],col:[2,"","
"],tr:[2,"","
"],td:[3,"","
"],_default:[0,"",""]};function ve(e,t){var n;return n="undefined"!=typeof e.getElementsByTagName?e.getElementsByTagName(t||"*"):"undefined"!=typeof e.querySelectorAll?e.querySelectorAll(t||"*"):[],void 0===t||t&&A(e,t)?S.merge([e],n):n}function ye(e,t){for(var n=0,r=e.length;n",""]);var me=/<|&#?\w+;/;function xe(e,t,n,r,i){for(var o,a,s,u,l,c,f=t.createDocumentFragment(),p=[],d=0,h=e.length;d\s*$/g;function je(e,t){return A(e,"table")&&A(11!==t.nodeType?t:t.firstChild,"tr")&&S(e).children("tbody")[0]||e}function De(e){return e.type=(null!==e.getAttribute("type"))+"/"+e.type,e}function qe(e){return"true/"===(e.type||"").slice(0,5)?e.type=e.type.slice(5):e.removeAttribute("type"),e}function Le(e,t){var n,r,i,o,a,s;if(1===t.nodeType){if(Y.hasData(e)&&(s=Y.get(e).events))for(i in Y.remove(t,"handle events"),s)for(n=0,r=s[i].length;n").attr(n.scriptAttrs||{}).prop({charset:n.scriptCharset,src:n.url}).on("load error",i=function(e){r.remove(),i=null,e&&t("error"===e.type?404:200,e.type)}),E.head.appendChild(r[0])},abort:function(){i&&i()}}});var _t,zt=[],Ut=/(=)\?(?=&|$)|\?\?/;S.ajaxSetup({jsonp:"callback",jsonpCallback:function(){var e=zt.pop()||S.expando+"_"+wt.guid++;return this[e]=!0,e}}),S.ajaxPrefilter("json jsonp",function(e,t,n){var r,i,o,a=!1!==e.jsonp&&(Ut.test(e.url)?"url":"string"==typeof e.data&&0===(e.contentType||"").indexOf("application/x-www-form-urlencoded")&&Ut.test(e.data)&&"data");if(a||"jsonp"===e.dataTypes[0])return r=e.jsonpCallback=m(e.jsonpCallback)?e.jsonpCallback():e.jsonpCallback,a?e[a]=e[a].replace(Ut,"$1"+r):!1!==e.jsonp&&(e.url+=(Tt.test(e.url)?"&":"?")+e.jsonp+"="+r),e.converters["script json"]=function(){return o||S.error(r+" was not called"),o[0]},e.dataTypes[0]="json",i=C[r],C[r]=function(){o=arguments},n.always(function(){void 0===i?S(C).removeProp(r):C[r]=i,e[r]&&(e.jsonpCallback=t.jsonpCallback,zt.push(r)),o&&m(i)&&i(o[0]),o=i=void 0}),"script"}),y.createHTMLDocument=((_t=E.implementation.createHTMLDocument("").body).innerHTML="
",2===_t.childNodes.length),S.parseHTML=function(e,t,n){return"string"!=typeof e?[]:("boolean"==typeof t&&(n=t,t=!1),t||(y.createHTMLDocument?((r=(t=E.implementation.createHTMLDocument("")).createElement("base")).href=E.location.href,t.head.appendChild(r)):t=E),o=!n&&[],(i=N.exec(e))?[t.createElement(i[1])]:(i=xe([e],t,o),o&&o.length&&S(o).remove(),S.merge([],i.childNodes)));var r,i,o},S.fn.load=function(e,t,n){var r,i,o,a=this,s=e.indexOf(" ");return-1").append(S.parseHTML(e)).find(r):e)}).always(n&&function(e,t){a.each(function(){n.apply(this,o||[e.responseText,t,e])})}),this},S.expr.pseudos.animated=function(t){return S.grep(S.timers,function(e){return t===e.elem}).length},S.offset={setOffset:function(e,t,n){var r,i,o,a,s,u,l=S.css(e,"position"),c=S(e),f={};"static"===l&&(e.style.position="relative"),s=c.offset(),o=S.css(e,"top"),u=S.css(e,"left"),("absolute"===l||"fixed"===l)&&-1<(o+u).indexOf("auto")?(a=(r=c.position()).top,i=r.left):(a=parseFloat(o)||0,i=parseFloat(u)||0),m(t)&&(t=t.call(e,n,S.extend({},s))),null!=t.top&&(f.top=t.top-s.top+a),null!=t.left&&(f.left=t.left-s.left+i),"using"in t?t.using.call(e,f):c.css(f)}},S.fn.extend({offset:function(t){if(arguments.length)return void 0===t?this:this.each(function(e){S.offset.setOffset(this,t,e)});var e,n,r=this[0];return r?r.getClientRects().length?(e=r.getBoundingClientRect(),n=r.ownerDocument.defaultView,{top:e.top+n.pageYOffset,left:e.left+n.pageXOffset}):{top:0,left:0}:void 0},position:function(){if(this[0]){var 
e,t,n,r=this[0],i={top:0,left:0};if("fixed"===S.css(r,"position"))t=r.getBoundingClientRect();else{t=this.offset(),n=r.ownerDocument,e=r.offsetParent||n.documentElement;while(e&&(e===n.body||e===n.documentElement)&&"static"===S.css(e,"position"))e=e.parentNode;e&&e!==r&&1===e.nodeType&&((i=S(e).offset()).top+=S.css(e,"borderTopWidth",!0),i.left+=S.css(e,"borderLeftWidth",!0))}return{top:t.top-i.top-S.css(r,"marginTop",!0),left:t.left-i.left-S.css(r,"marginLeft",!0)}}},offsetParent:function(){return this.map(function(){var e=this.offsetParent;while(e&&"static"===S.css(e,"position"))e=e.offsetParent;return e||re})}}),S.each({scrollLeft:"pageXOffset",scrollTop:"pageYOffset"},function(t,i){var o="pageYOffset"===i;S.fn[t]=function(e){return $(this,function(e,t,n){var r;if(x(e)?r=e:9===e.nodeType&&(r=e.defaultView),void 0===n)return r?r[i]:e[t];r?r.scrollTo(o?r.pageXOffset:n,o?n:r.pageYOffset):e[t]=n},t,e,arguments.length)}}),S.each(["top","left"],function(e,n){S.cssHooks[n]=Fe(y.pixelPosition,function(e,t){if(t)return t=We(e,n),Pe.test(t)?S(e).position()[n]+"px":t})}),S.each({Height:"height",Width:"width"},function(a,s){S.each({padding:"inner"+a,content:s,"":"outer"+a},function(r,o){S.fn[o]=function(e,t){var n=arguments.length&&(r||"boolean"!=typeof e),i=r||(!0===e||!0===t?"margin":"border");return $(this,function(e,t,n){var r;return x(e)?0===o.indexOf("outer")?e["inner"+a]:e.document.documentElement["client"+a]:9===e.nodeType?(r=e.documentElement,Math.max(e.body["scroll"+a],r["scroll"+a],e.body["offset"+a],r["offset"+a],r["client"+a])):void 0===n?S.css(e,t,i):S.style(e,t,n,i)},s,n?e:void 0,n)}})}),S.each(["ajaxStart","ajaxStop","ajaxComplete","ajaxError","ajaxSuccess","ajaxSend"],function(e,t){S.fn[t]=function(e){return this.on(t,e)}}),S.fn.extend({bind:function(e,t,n){return this.on(e,null,t,n)},unbind:function(e,t){return this.off(e,null,t)},delegate:function(e,t,n,r){return this.on(t,e,n,r)},undelegate:function(e,t,n){return 
1===arguments.length?this.off(e,"**"):this.off(t,e||"**",n)},hover:function(e,t){return this.mouseenter(e).mouseleave(t||e)}}),S.each("blur focus focusin focusout resize scroll click dblclick mousedown mouseup mousemove mouseover mouseout mouseenter mouseleave change select submit keydown keypress keyup contextmenu".split(" "),function(e,n){S.fn[n]=function(e,t){return 0 + + + + + + 12 Performing spatial analysis | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

12 Performing spatial analysis

+

Highly multiplexed imaging technologies measure the spatial distributions of +molecule abundances across tissue sections. As such, having the option to +analyze single cells in their spatial tissue context is a key strength of these +technologies.

+

A number of software packages such as +squidpy, +giotto and +Seurat have +been developed to analyse and visualize cells in their spatial context. The +following chapter will highlight the use of +imcRtools +and other Bioconductor packages to visualize and analyse single-cell data +obtained from highly multiplexed imaging technologies.

+

We will first read in the spatially-annotated single-cell data processed in the +previous sections.

+
library(SpatialExperiment)
+spe <- readRDS("data/spe.rds")
+
+

12.1 Spatial interaction graphs

+

Many spatial analysis approaches either compare the observed versus expected +number of cells around a given cell type (point process) or utilize interaction +graphs (spatial object graphs) to estimate clustering or interaction frequencies +between cell types.

+

The steinbock +framework allows the construction of these spatial graphs. During image +processing (see Section 4.3), we have constructed +a spatial graph by expanding the individual cell masks by 4 pixels.

+

The imcRtools package further allows the ad hoc construction of spatial +graphs directly using a SpatialExperiment or SingleCellExperiment object +while considering the spatial location (centroids) of individual cells. The +buildSpatialGraph +function allows constructing spatial graphs by detecting the k-nearest neighbors +in 2D (knn), by detecting all cells within a given distance to the center cell +(expansion) and by Delaunay triangulation (delaunay).

+

When constructing a knn graph, the number of neighbors (k) needs to be set and +(optionally) the maximum distance to consider (max_dist) can be specified. +When constructing a graph via expansion, the distance to expand (threshold) +needs to be provided. For graphs constructed via Delaunay triangulation, +the max_dist parameter can be set to avoid unusually large connections at the +edge of the image.

+
library(imcRtools)
+
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20)
+
## The returned object is ordered by the 'sample_id' entry.
+
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20)
+
## The returned object is ordered by the 'sample_id' entry.
+
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 20)
+
## The returned object is ordered by the 'sample_id' entry.
+

The spatial graphs are stored in colPair(spe, name) slots. These slots store +SelfHits objects representing edge lists in which the first column indicates +the index of the “from” cell and the second column the index of the “to” cell. +Each edge list is newly constructed when subsetting the object.

+
colPairNames(spe)
+
## [1] "neighborhood"                "knn_interaction_graph"      
+## [3] "expansion_interaction_graph" "delaunay_interaction_graph"
+

Here, colPair(spe, "neighborhood") stores the spatial graph constructed by +steinbock, colPair(spe, "knn_interaction_graph") stores the knn spatial +graph, colPair(spe, "expansion_interaction_graph") stores the expansion graph +and colPair(spe, "delaunay_interaction_graph") stores the graph constructed by +Delaunay triangulation.
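To make the edge-list representation more concrete, the following minimal sketch builds a k-nearest neighbor edge list in base R. The centroid coordinates, the value of `k` and the object names are hypothetical and chosen only for illustration; this is not how `buildSpatialGraph` is implemented internally.

```r
# Toy centroids for six cells (hypothetical values, not from the dataset)
coords <- cbind(x = c(0, 1, 2, 10, 11, 12),
                y = c(0, 0, 0, 0, 0, 0))

k <- 2  # number of nearest neighbors per cell (assumption for this sketch)

# Pairwise Euclidean distances between all centroids
d <- as.matrix(dist(coords))
diag(d) <- Inf  # a cell is never its own neighbor

# For each cell ("from"), keep the indices of its k closest cells ("to")
edges <- do.call(rbind, lapply(seq_len(nrow(d)), function(i) {
    data.frame(from = i, to = order(d[i, ])[seq_len(k)])
}))

edges
```

Each row of `edges` corresponds to one directed edge, mirroring the "from"/"to" columns described above. A knn graph built this way always contains exactly `k` outgoing edges per cell.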

+
+
+

12.2 Spatial visualization

+

Section 11 highlights the use of the +cytomapper +package to visualize multichannel images and segmentation masks. Here, we +introduce the +plotSpatial +function of the imcRtools package to visualize the cells’ centroids and +cell-cell interactions as spatial graphs.

+

In the following example, we select one image for visualization purposes. +Here, each dot (node) represents a cell and edges are drawn between cells +in close physical proximity as detected by steinbock or the buildSpatialGraph +function. Nodes are variably colored based on the cell type and edges are +colored in grey.

+
library(ggplot2)
+library(viridis)
+
+# steinbock interaction graph 
+plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "celltype", 
+            img_id = "sample_id", 
+            draw_edges = TRUE, 
+            colPairName = "neighborhood", 
+            nodes_first = FALSE, 
+            edge_color_fix = "grey") + 
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+    ggtitle("steinbock interaction graph")
+

+
# knn interaction graph 
+plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "celltype", 
+            img_id = "sample_id", 
+            draw_edges = TRUE, 
+            colPairName = "knn_interaction_graph", 
+            nodes_first = FALSE,
+            edge_color_fix = "grey") + 
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+    ggtitle("knn interaction graph")
+

+
# expansion interaction graph 
+plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "celltype", 
+            img_id = "sample_id", 
+            draw_edges = TRUE, 
+            colPairName = "expansion_interaction_graph", 
+            nodes_first = FALSE,
+            edge_color_fix = "grey") + 
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+    ggtitle("expansion interaction graph")
+

+
# delaunay interaction graph 
+plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "celltype", 
+            img_id = "sample_id", 
+            draw_edges = TRUE, 
+            colPairName = "delaunay_interaction_graph", 
+            nodes_first = FALSE,
+            edge_color_fix = "grey") + 
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+    ggtitle("delaunay interaction graph")
+

+

Nodes can also be colored based on the cells’ expression levels (e.g., +E-cadherin expression) and their size can be adjusted (e.g., based on measured +cell area).

+
plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "Ecad", 
+            assay_type = "exprs",
+            img_id = "sample_id", 
+            draw_edges = TRUE, 
+            colPairName = "expansion_interaction_graph", 
+            nodes_first = FALSE, 
+            node_size_by = "area", 
+            directed = FALSE,
+            edge_color_fix = "grey") + 
+    scale_size_continuous(range = c(0.1, 2)) +
+    ggtitle("E-cadherin expression")
+

+

Finally, the plotSpatial function allows displaying all images at once. This +visualization can be useful to quickly detect larger structures of interest.

+
plotSpatial(spe, 
+            node_color_by = "celltype", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) + 
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype)
+

+

For full documentation of the plotSpatial function, please refer to +?plotSpatial.

+
+
+

12.3 Spatial community analysis

+

The detection of spatial communities was proposed by (Jackson et al. 2020). Here, cells +are clustered solely based on their interactions as defined by the spatial +object graph. We can perform spatial community detection across all cells as +displayed in the next code chunk. Communities with fewer than 10 cells are +excluded. Of note: we set the seed outside of the function call for +reproducibility purposes because the Louvain modularity optimization +used internally can give different results across runs.

+
set.seed(230621)
+spe <- detectCommunity(spe, 
+                       colPairName = "neighborhood", 
+                       size_threshold = 10)
+
+plotSpatial(spe, 
+            node_color_by = "spatial_community", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    theme(legend.position = "none") +
+    ggtitle("Spatial tumor communities") +
+    scale_color_manual(values = rev(colors()))
+

+

The example shown above might not be of interest if different tissue structures +exist within which spatial communities should be computed. In the following +example, we perform spatial community detection separately for tumor and stromal +cells.

+

The general procedure is as follows:

+
    +
  1. Create a colData(spe) entry that specifies whether a cell is part of the tumor +or stroma compartment.

  2. Use the detectCommunity function of the imcRtools +package to cluster cells within the tumor or stroma compartment solely based on +their spatial interaction graph as constructed by the steinbock framework.

+

Both tumor and stromal spatial communities are stored in the colData of +the SpatialExperiment object under the spatial_community identifier.

+

Of note: Here, and in contrast to the function call above, we set the seed +argument within the SerialParam function for reproducibility +purposes. We need this here due to the way the detectCommunity function +is implemented when setting the group_by parameter.

+
spe$tumor_stroma <- ifelse(spe$celltype == "Tumor", "Tumor", "Stroma")
+
+library(BiocParallel)
+spe <- detectCommunity(spe, 
+                       colPairName = "neighborhood", 
+                       size_threshold = 10,
+                       group_by = "tumor_stroma",
+                       BPPARAM = SerialParam(RNGseed = 220819))
+

We can now separately visualize the tumor and stromal communities.

+
plotSpatial(spe[,spe$celltype == "Tumor"], 
+            node_color_by = "spatial_community", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    theme(legend.position = "none") +
+    ggtitle("Spatial tumor communities") +
+    scale_color_manual(values = rev(colors()))
+

+
plotSpatial(spe[,spe$celltype != "Tumor"], 
+            node_color_by = "spatial_community", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    theme(legend.position = "none") +
+    ggtitle("Spatial non-tumor communities") +
+    scale_color_manual(values = rev(colors()))
+

+

The example data was acquired using a panel that mainly focuses on immune cells. +We are therefore unable to detect many tumor sub-phenotypes and will +focus on the stromal communities.

+

In the next step, the fraction of cell types within each +spatial stromal community is displayed.

+
library(pheatmap)
+library(viridis)
+
+cur_spe <- spe[,spe$celltype != "Tumor"]
+
+for_plot <- prop.table(table(cur_spe$spatial_community, 
+                             cur_spe$celltype), 
+                       margin = 1)
+
+pheatmap(for_plot, 
+         color = colorRampPalette(c("dark blue", "white", "dark red"))(100), 
+         show_rownames = FALSE, 
+         scale = "column")
+
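The two steps used for the heatmap (row-wise fractions, then column-wise scaling) can be sketched on a toy example in base R. The community and cell type labels below are hypothetical. The call to `scale()` is used here as a stand-in for the column-wise centering and scaling requested via `scale = "column"` in `pheatmap`.

```r
# Hypothetical community and cell type labels for eight cells
community <- c("c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2")
celltype  <- c("B", "B", "B", "T", "T", "T", "Myeloid", "Myeloid")

# Cross-tabulate communities (rows) against cell types (columns)
counts <- table(community, celltype)

# margin = 1 divides each row by its row sum,
# giving the fraction of each cell type per community
fractions <- prop.table(counts, margin = 1)
fractions
rowSums(fractions)  # every row sums to 1

# Center and scale each column (cell type) across communities
scaled <- scale(unclass(fractions))
scaled
```

After column scaling, colors in the heatmap indicate in which communities a cell type is over- or under-represented relative to its average across communities, rather than the raw fraction.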

+

We observe that many spatial stromal communities are made up of myeloid cells or +“stromal” (non-immune) cells. Other communities are mainly made up of B cells +and BnT cells indicating tertiary lymphoid structures (TLS). While plasma cells and +CD4\(^+\) or CD8\(^+\) T cells tend to aggregate, only a few spatial stromal +communities consist mainly of neutrophils.

+
+
+

12.4 Cellular neighborhood analysis

+

The following section highlights the use of the imcRtools package to +detect cellular neighborhoods. This approach has been proposed by +(Goltsev et al. 2018) and (Schürch et al. 2020) to group cells based on information +contained in their direct neighborhood.

+

(Goltsev et al. 2018) performed Delaunay triangulation-based graph construction, +neighborhood aggregation and then clustered cells. (Schürch et al. 2020) on the +other hand constructed a 10-nearest neighbor graph before aggregating +information across neighboring cells.

+

In the following code chunk we will use the 20-nearest neighbor graph as +constructed above to define the direct cellular neighborhood. The +aggregateNeighbors +function allows neighborhood aggregation in 2 different ways:

+
    +
  1. For each cell the function computes the fraction of cells of a +certain type (e.g., cell type) among its neighbors.
  2. For each cell it aggregates (e.g., mean) the expression counts +across all neighboring cells.
+

Based on these measures, cells can now be clustered into cellular +neighborhoods. We will first compute the fraction of the different cell +types among the 20-nearest neighbors and use kmeans clustering to group +cells into 6 cellular neighborhoods.
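The first aggregation strategy (fraction of cell types among the neighbors) followed by kmeans clustering can be sketched in base R as follows. The cell type labels, the edge list and the choice of two clusters are hypothetical and serve only as an illustration; the sketch does not reproduce the internals of `aggregateNeighbors`.

```r
# Hypothetical cell type labels for six cells
celltype <- factor(c("Tumor", "Tumor", "Tumor", "Bcell", "Bcell", "Bcell"))

# Hypothetical directed edge list with two neighbors per cell
edges <- data.frame(from = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6),
                    to   = c(2, 3, 1, 3, 2, 4, 5, 6, 4, 6, 4, 5))

# Count the cell types of the neighbors for each center cell ...
neighbor_counts <- table(factor(edges$from, levels = seq_along(celltype)),
                         celltype[edges$to])

# ... and convert the counts to per-cell fractions (rows sum to 1)
neighbor_frac <- unclass(prop.table(neighbor_counts, margin = 1))
neighbor_frac

# Cluster cells based on the composition of their neighborhood
set.seed(1)
cn <- kmeans(neighbor_frac, centers = 2)
cn$cluster
```

In this toy example, cell 3 has one tumor and one B cell neighbor and therefore sits between the two groups, which is the kind of cell that ends up in a border neighborhood when more clusters are requested.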

+

Of note: constructing a 20-nearest neighbor graph and clustering +using kmeans with k=6 is only an example. Similar to the analysis done +in Section 9.2.2, it is recommended to perform a parameter +sweep across different graph construction algorithms and different +values of k for kmeans clustering. The best CN detection +settings also depend on the question at hand. Constructing graphs +with more neighbors usually results in larger CNs.
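One simple way to sweep the number of kmeans clusters is to record the total within-cluster sum of squares for a range of k values. The sketch below uses a randomly generated, hypothetical neighbor-fraction matrix; the range of k, the number of random starts and the use of the within-cluster sum of squares as a criterion are assumptions made for illustration and are not prescribed by the workflow.

```r
set.seed(42)

# Hypothetical neighbor-fraction matrix: 60 cells x 3 cell types, rows sum to 1
raw <- matrix(runif(60 * 3), ncol = 3)
neighbor_frac <- raw / rowSums(raw)

# Candidate numbers of cellular neighborhoods
k_values <- 2:8

# Total within-cluster sum of squares for each k;
# nstart = 10 reduces the dependence on the random initialization
wss <- sapply(k_values, function(k) {
    kmeans(neighbor_frac, centers = k, nstart = 10)$tot.withinss
})

data.frame(k = k_values, tot_withinss = wss)
```

The within-cluster sum of squares generally decreases as k increases, so the value of k is usually chosen where the curve flattens and then checked against the biological interpretability of the resulting neighborhoods.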

+
# By celltypes
+spe <- aggregateNeighbors(spe, 
+                          colPairName = "knn_interaction_graph", 
+                          aggregate_by = "metadata", 
+                          count_by = "celltype")
+
+set.seed(220705)
+
+cn_1 <- kmeans(spe$aggregatedNeighbors, centers = 6)
+spe$cn_celltypes <- as.factor(cn_1$cluster)
+
+plotSpatial(spe, 
+            node_color_by = "cn_celltypes", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_brewer(palette = "Set3")
+

+

The next code chunk visualizes the cell type compositions of the +detected cellular neighborhoods (CN).

+
for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype), 
+                       margin = 1)
+
+pheatmap(for_plot, 
+         color = colorRampPalette(c("dark blue", "white", "dark red"))(100), 
+         scale = "column")
+

+

CN 1 and CN 6 are mainly composed of tumor cells with CN 6 forming the +tumor/stroma border. CN 3 is mainly composed of B and BnT cells +indicating TLS. CN 5 is composed of aggregated plasma cells and most T +cells.

+

We will now detect cellular neighborhoods by computing the mean +expression across the 20 nearest neighbors prior to kmeans clustering +(k=6).

+
# By expression
+spe <- aggregateNeighbors(spe, 
+                          colPairName = "knn_interaction_graph", 
+                          aggregate_by = "expression", 
+                          assay_type = "exprs",
+                          subset_row = rowData(spe)$use_channel)
+
+set.seed(220705)
+
+cn_2 <- kmeans(spe$mean_aggregatedExpression, centers = 6)
+spe$cn_expression <- as.factor(cn_2$cluster)
+
+plotSpatial(spe, 
+            node_color_by = "cn_expression", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_brewer(palette = "Set3")
+

+

Also here, we can visualize the cell type composition of each cellular +neighborhood.

+
for_plot <- prop.table(table(spe$cn_expression, spe$celltype), 
+                  margin = 1)
+
+pheatmap(for_plot, 
+         color = colorRampPalette(c("dark blue", "white", "dark red"))(100), 
+         scale = "column")
+

+

When clustering cells based on the mean expression within the direct +neighborhood, tumor cells are split across CN 6, CN 1 and CN 4 without +forming a clear tumor/stroma interface. This result reflects +patient-to-patient differences in the expression of tumor markers.

+

CN 3 again contains B cells and BnT cells but also CD8 and undefined +cells, making it less representative of TLS than CN 3 in +the previous CN approach. CN detection based on mean marker expression is +therefore sensitive to staining/expression differences between samples +as well as lateral spillover due to imperfect segmentation.

+

An alternative to the aggregateNeighbors function is provided by the +lisaClust +Bioconductor package (Patrick et al. 2023). In contrast to imcRtools, the +lisaClust package computes local indicators of spatial associations +(LISA) functions and clusters cells based on those. More precisely, the +package summarizes L-functions from a Poisson point process model to +derive numeric vectors for each cell which can then again be clustered +using kmeans. All steps are supported by the lisaClust function which +can be applied to a SingleCellExperiment or SpatialExperiment object.
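The idea of describing each cell by its surroundings at several length scales can be sketched in base R. Note that this is a deliberately crude stand-in: it uses raw neighbor counts within assumed radii instead of the L-functions that lisaClust computes, and all data are simulated.

```r
set.seed(2)
n <- 300
coords <- cbind(x = runif(n, 0, 200), y = runif(n, 0, 200))
celltype <- factor(sample(c("Tumor", "Bcell", "CD8"), n, replace = TRUE))
radii <- c(10, 20, 50)               # assumed radii for this toy example

d <- as.matrix(dist(coords))
diag(d) <- Inf                       # exclude the cell itself

# Indicator matrix with one column per cell type
type_mat <- model.matrix(~ celltype - 1)

# For each radius, count the neighbors of each type within that distance
feats <- do.call(cbind, lapply(radii, function(r) {
    counts <- ((d <= r) * 1) %*% type_mat
    colnames(counts) <- paste0(levels(celltype), "_r", r)
    counts
}))

# One numeric vector per cell, clustered with kmeans
region <- kmeans(scale(feats), centers = 4)$cluster
table(region)
```

Larger radii pool more cells into each feature, which is one intuition for why broader radii tend to give smoother regions.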

+

In the following example, we calculate the LISA curves within a 10µm, 20µm and +50µm neighborhood around each cell. Increasing these radii will lead to broader +and smoother spatial clusters. However, a number of parameter settings should be +tested to estimate the robustness of the results.

+
library(lisaClust)
+
+set.seed(220705)
+spe <- lisaClust(spe, 
+                 k = 6,
+                 Rs = c(10, 20, 50),
+                 spatialCoords = c("Pos_X", "Pos_Y"),
+                 cellType = "celltype",
+                 imageID = "sample_id")
+
+plotSpatial(spe, 
+            node_color_by = "region", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_brewer(palette = "Set3")
+

+

Similar to the example above, we can now observe the cell type +composition per spatial cluster.

+
for_plot <- prop.table(table(spe$region, spe$celltype), 
+                  margin = 1)
+
+pheatmap(for_plot, 
+         color = colorRampPalette(c("dark blue", "white", "dark red"))(100), 
+         scale = "column")
+

+

In this case, CN 1 and 4 contain tumor cells but no CN is forming the +tumor/stroma interface. CN 3 represents TLS. CN 2 indicates T cell +subtypes and plasma cells are aggregated to CN 5.

+

As an alternative way of visualizing the enrichment of cell types within the +detected CNs, the lisaClust package provides the regionMap function.

+
regionMap(spe, 
+          cellType = "celltype",
+          region = "region")
+

+
+
+

12.5 Spatial context analysis

+

Downstream of CN assignments, we will analyze the spatial context (SC) +of each cell using three functions from the imcRtools package.

+

While CNs can represent sites of unique local processes, the term SC was +coined by Bhate and colleagues (Bhate et al. 2022) and describes tissue regions +in which distinct CNs may be interacting. Hence, SCs may be interesting +regions of specialized biological events.

+

Here, we will first detect SCs using the detectSpatialContext function. This +function relies on CN fractions for each cell in a spatial interaction +graph (originally a KNN graph), which we will calculate using +buildSpatialGraph and aggregateNeighbors. We will focus on the CNs +derived from cell type fractions but other CN assignments are possible.

+

Of note, the window size (k for KNN) for buildSpatialGraph should +reflect a length scale on which biological signals can be exchanged and +depends, among other factors, on cell density and tissue area. In view of their +divergent functionality, we recommend using a larger window size for SC +(interaction between local processes) than for CN (local processes) +detection. Since we used a 20-nearest neighbor graph for CN assignment, +we will use a 40-nearest neighbor graph for SC detection. As before, +different parameters should be tested.

+

Subsequently, the CN fractions are sorted from high-to-low and the SC of +each cell is assigned as the minimal combination of CNs that additively +surpass a user-defined threshold. The default threshold of 0.9 aims to +represent the dominant CNs, hence the most prevalent signals, in a given +window.
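The assignment rule can be written down in a few lines of base R. The helper `assign_sc` is hypothetical and only mirrors the description above. Whether the threshold comparison is strict or inclusive at the boundary, and how CN names are ordered, are assumptions of this sketch.

```r
# Assign a spatial context from a named vector of CN fractions
assign_sc <- function(cn_fractions, threshold = 0.9) {
    sorted <- sort(cn_fractions, decreasing = TRUE)
    # Smallest number of top CNs whose cumulative fraction reaches the threshold
    n_keep <- which(cumsum(sorted) >= threshold)[1]
    paste(sort(names(sorted)[seq_len(n_keep)]), collapse = "_")
}

# A cell at the border between CN 3 and CN 5
assign_sc(c("1" = 0.05, "3" = 0.55, "5" = 0.40))  # "3_5"

# A cell in the core of CN 3
assign_sc(c("1" = 0.02, "3" = 0.95, "5" = 0.03))  # "3"
```

Lowering the threshold yields fewer, simpler SCs, while raising it lets minor CNs enter the combination.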

+

For more details and biological validation, please refer to +(Bhate et al. 2022).

+
library(circlize)
+library(RColorBrewer)
+
+# Construct a 40-nearest neighbor graph
+spe <- buildSpatialGraph(spe, 
+                         img_id = "sample_id", 
+                         type = "knn", 
+                         name = "knn_spatialcontext_graph", 
+                         k = 40)
+
+# Compute the fraction of cellular neighborhoods around each cell
+spe <- aggregateNeighbors(spe, 
+                          colPairName = "knn_spatialcontext_graph",
+                          aggregate_by = "metadata",
+                          count_by = "cn_celltypes",
+                          name = "aggregatedNeighborhood")
+
+# Detect spatial contexts
+spe <- detectSpatialContext(spe, 
+                            entry = "aggregatedNeighborhood",
+                            threshold = 0.90,
+                            name = "spatial_context")
+
+# Define SC color scheme
+n_SCs <- length(unique(spe$spatial_context))
+col_SC <- setNames(colorRampPalette(brewer.pal(9, "Paired"))(n_SCs), 
+                   sort(unique(spe$spatial_context)))
+
+# Visualize spatial contexts on images
+plotSpatial(spe, 
+            node_color_by = "spatial_context", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_manual(values = col_SC)
+

+

We detect a total of 52 distinct +SCs across this dataset.

+

For ease of interpretation, we will directly compare the CN and SC +assignments for Patient3_001.

+
library(patchwork)
+
+# Compare CN and SC for one patient 
+p1 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "cn_celltypes", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_brewer(palette = "Set3")
+
+p2 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], 
+            node_color_by = "spatial_context", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_manual(values = col_SC, limits = force)
+
+p1 + p2
+

+

As expected, we can observe that interfaces between different CNs make +up distinct SCs. For instance, the interface between CN 3 (TLS region +consisting of B and BnT cells) and CN 5 (Plasma- and T-cell dominated) +becomes SC 3_5. On the other hand, the core of CN 3 becomes SC 3, since +the most abundant CN of the neighborhood for these cells is just the CN +itself.

+

Next, we filter the SCs based on user-defined thresholds for number of +group entries (here at least 3 patients) and/or total number of cells +(here minimum of 100 cells) per SC using the filterSpatialContext function.
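The two filter criteria amount to simple counting per SC. The base R sketch below uses a made-up per-cell table and treats both thresholds as inclusive, which is an assumption based on the wording "at least" and "minimum".

```r
# Hypothetical per-cell table: SC label and patient of origin
cells <- data.frame(
    spatial_context = c(rep("1", 150), rep("1_6", 120), rep("3_5", 40)),
    patient_id = c(rep(c("P1", "P2", "P3"), each = 50),
                   rep(c("P1", "P2", "P3", "P4"), each = 30),
                   rep(c("P1", "P2"), each = 20))
)

# Number of cells and number of distinct patients per SC
n_cells <- tapply(cells$patient_id, cells$spatial_context, length)
n_group <- tapply(cells$patient_id, cells$spatial_context,
                  function(x) length(unique(x)))

# Keep SCs seen in at least 3 patients and containing at least 100 cells
keep <- names(n_cells)[n_cells >= 100 & n_group >= 3]
cells$spatial_context_filtered <- ifelse(cells$spatial_context %in% keep,
                                         cells$spatial_context, NA)
keep  # "1" "1_6"
```

Cells of removed SCs are set to `NA` here so that they drop out of downstream plots.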

+
## Filter spatial contexts
+# By number of group entries
+spe <- filterSpatialContext(spe, 
+                            entry = "spatial_context",
+                            group_by = "patient_id", 
+                            group_threshold = 3,
+                            name = "spatial_context_filtered")
+
+plotSpatial(spe, 
+            node_color_by = "spatial_context_filtered", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_manual(values = col_SC, limits = force)
+

+
# Filter out small and infrequent spatial contexts
+spe <- filterSpatialContext(spe, 
+                            entry = "spatial_context",
+                            group_by = "patient_id", 
+                            group_threshold = 3,
+                            cells_threshold = 100,
+                            name = "spatial_context_filtered")
+
+plotSpatial(spe, 
+            node_color_by = "spatial_context_filtered", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_manual(values = col_SC, limits = force)
+

+

Lastly, we can use the plotSpatialContext function to generate SC +graphs, analogous to CN combination maps in (Bhate et al. 2022). Returned +objects are ggplots, which can be easily modified further. We will +create a SC graph for the filtered SCs here.

+
## Plot spatial context graph 
+
+# Colored by name, size by n_cells
+plotSpatialContext(spe, 
+                   entry = "spatial_context_filtered",
+                   group_by = "sample_id",
+                   node_color_by = "name",
+                   node_size_by = "n_cells",
+                   node_label_color_by = "name")
+

+
# Colored by n_cells, size by n_group                   
+plotSpatialContext(spe, 
+                   entry = "spatial_context_filtered",
+                   group_by = "sample_id",
+                   node_color_by = "n_cells",
+                   node_size_by = "n_group",
+                   node_label_color_by = "n_cells") +
+  scale_color_viridis()
+

+

SC 1 (Tumor-dominated), SC 1_6 (Tumor and Tumor-Stroma interface) and SC +4_5 (Plasma/T cell and Myeloid/Neutrophil interface) are the most +frequent SCs in this dataset. Moreover, we may compare the degree of the +different nodes in the SC graph. For example, we can observe that SC 1 +has a degree of one (directed to SC 1_6), while SC 5 (T cells and plasma cells) has +a much higher degree (n = 4) and potentially more CN interactions.
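Node degrees can also be derived from the SC names alone. The sketch below assumes that an edge connects an SC to every SC that contains all of its CNs plus exactly one more, in the spirit of the CN combination maps of Bhate et al. The SC names are hypothetical and the edge rule is not taken from the plotSpatialContext source code.

```r
# Hypothetical set of SC names after filtering
sc_names <- c("1", "5", "1_6", "2_5", "3_5", "4_5", "5_6", "3_4_5")
members <- strsplit(sc_names, "_")

# Edge from a to b if b contains all CNs of a plus exactly one more
edges <- expand.grid(from = seq_along(sc_names), to = seq_along(sc_names))
is_edge <- mapply(function(a, b) {
    length(members[[b]]) == length(members[[a]]) + 1 &&
        all(members[[a]] %in% members[[b]])
}, edges$from, edges$to)
edges <- edges[is_edge, ]

# Degree = number of edges touching each node
degree <- table(factor(sc_names[c(edges$from, edges$to)], levels = sc_names))
degree
```

In this toy set, SC 1 has degree 1 and SC 5 has degree 4, since SC 5 is part of four two-CN contexts.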

+
+
+

12.6 Patch detection

+

The previous section focused on detecting cellular neighborhoods in a rather +unsupervised fashion. However, the imcRtools package also provides methods for +detecting spatial compartments in a supervised fashion. The +patchDetection +function allows the detection of connected sets of similar cells as proposed by +(Hoch et al. 2022). In the following example, we will use the patchDetection function +to detect tumor patches in three steps:

+
  1. Find connected sets of tumor cells (using the steinbock graph).
  2. Components which contain less than 10 cells are excluded.
  3. Expand the components by 1µm to construct a concave hull around the patch and +include cells within the patch.
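The first two steps can be sketched in base R as a connected-component search on an edge list. The helper `find_patches` is hypothetical and does not cover the third step (hull construction and expansion). It is meant to illustrate the logic, not the implementation used by patchDetection.

```r
# Label connected components among selected cells given an edge list
find_patches <- function(n_cells, from, to, is_patch_cell, min_patch_size = 10) {
    patch_id <- rep(NA_integer_, n_cells)
    # Keep only edges linking two patch cells
    keep <- is_patch_cell[from] & is_patch_cell[to]
    from <- from[keep]
    to <- to[keep]
    current <- 0L
    for (start in which(is_patch_cell)) {
        if (!is.na(patch_id[start])) next
        current <- current + 1L
        patch_id[start] <- current
        queue <- start
        # Breadth-first search over the remaining edges
        while (length(queue) > 0) {
            node <- queue[1]
            queue <- queue[-1]
            nb <- unique(c(to[from == node], from[to == node]))
            nb <- nb[is.na(patch_id[nb])]
            patch_id[nb] <- current
            queue <- c(queue, nb)
        }
    }
    # Drop components with fewer than min_patch_size cells
    sizes <- table(patch_id)
    small <- as.integer(names(sizes)[sizes < min_patch_size])
    patch_id[patch_id %in% small] <- NA_integer_
    patch_id
}

# Toy graph: cells 1-3 form a chain, 4-5 form a pair, 6 is not a tumor cell
from <- c(1, 2, 4, 5)
to <- c(2, 3, 5, 6)
is_tumor <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)
find_patches(6, from, to, is_tumor, min_patch_size = 3)
# 1 1 1 NA NA NA
```

The pair of cells 4 and 5 is connected but too small, so it is not reported as a patch.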
+
spe <- patchDetection(spe, 
+                      patch_cells = spe$celltype == "Tumor",
+                      img_id = "sample_id",
+                      expand_by = 1,
+                      min_patch_size = 10,
+                      colPairName = "neighborhood",
+                      BPPARAM = MulticoreParam())
+
## The returned object is ordered by the 'sample_id' entry.
+
plotSpatial(spe, 
+            node_color_by = "patch_id", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    theme(legend.position = "none") +
+    scale_color_manual(values = rev(colors()))
+

+

We can now calculate the fraction of T cells within each tumor patch to roughly +estimate T cell infiltration.

+
library(tidyverse)
+colData(spe) %>% as_tibble() %>%
+    group_by(patch_id, sample_id) %>%
+    summarize(Tcell_count = sum(celltype == "CD8" | celltype == "CD4"),
+              patch_size = n(),
+              Tcell_freq = Tcell_count / patch_size) %>%
+    filter(!is.na(patch_id)) %>%
+    ggplot() +
+        geom_point(aes(log10(patch_size), Tcell_freq, color = sample_id)) +
+    theme_classic()
+

+

We can now measure the size of each patch using the +patchSize +function and visualize tumor patch distribution per patient.

+
patch_size <- patchSize(spe, "patch_id")
+
+patch_size <- merge(patch_size, 
+                    colData(spe)[match(patch_size$patch_id, spe$patch_id),], 
+                    by = "patch_id")
+
+ggplot(as.data.frame(patch_size)) + 
+    geom_boxplot(aes(patient_id, log10(size))) +
+    geom_point(aes(patient_id, log10(size)))
+

+

The +minDistToCells +function can be used to calculate the minimum distance between each cell and a +cell set of interest. Here, we highlight its use to calculate the minimum +distance of all cells to the detected tumor patches. Negative values indicate +the minimum distance of each tumor patch cell to a non-tumor patch cell.
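The sign convention can be illustrated on simulated coordinates in base R. The patch membership below is an arbitrary assumption, and the sketch handles a single image, whereas minDistToCells works per image.

```r
set.seed(3)
coords <- cbind(x = runif(50, 0, 100), y = runif(50, 0, 100))
in_patch <- coords[, "x"] < 40       # hypothetical patch membership

d <- as.matrix(dist(coords))

# Cells outside the patch: positive distance to the nearest patch cell
# Cells inside the patch: negative distance to the nearest non-patch cell
dist_to_cells <- numeric(nrow(coords))
dist_to_cells[!in_patch] <- apply(d[!in_patch, in_patch, drop = FALSE], 1, min)
dist_to_cells[in_patch] <- -apply(d[in_patch, !in_patch, drop = FALSE], 1, min)

summary(dist_to_cells)
```

Values close to zero on either side mark cells near the patch border.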

+
spe <- minDistToCells(spe, 
+                      x_cells = !is.na(spe$patch_id), 
+                      img_id = "sample_id")
+
## The returned object is ordered by the 'sample_id' entry.
+
plotSpatial(spe, 
+            node_color_by = "distToCells", 
+            img_id = "sample_id", 
+            node_size_fix = 0.5) +
+    scale_color_gradient2(low = "dark blue", mid = "white", high = "dark red")
+

+

Finally, we can observe the minimum distances to tumor patches in a cell type +specific manner.

+
library(ggridges)
+
+ggplot(as.data.frame(colData(spe))) + 
+    geom_density_ridges(aes(distToCells, celltype, fill = celltype)) +
+    geom_vline(xintercept = 0, color = "dark red", linewidth = 2) +
+    scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
+

+
+
+

12.7 Interaction analysis

+

Bug notice: we discovered and fixed a bug in the testInteractions function in versions below 1.5.5 which affected SingleCellExperiment or SpatialExperiment objects in which cells were not grouped by image. Please make sure you have the newest version (>= 1.6.0) installed.

+

The next section focuses on statistically testing the pairwise interaction +between all cell types of the dataset. For this, the imcRtools package +provides the +testInteractions +function which implements the interaction testing strategy proposed by +(Schapiro et al. 2017).

+

Per grouping level (e.g., image), the testInteractions function computes the +averaged cell type/cell type interaction count and compares this count against +an empirical null distribution which is generated by permuting all cell labels (while maintaining the tissue structure).
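A minimal base R sketch of such a permutation test for one pair of labels in one image is shown below. The random edge list, the way the count is averaged over all cells of the "from" type, and the p-value formula are assumptions of this illustration and may differ in detail from testInteractions.

```r
set.seed(221029)
n <- 100
label <- sample(c("A", "B"), n, replace = TRUE)

# Hypothetical directed edge list (from cell -> neighboring cell)
from <- sample(n, 400, replace = TRUE)
to <- sample(n, 400, replace = TRUE)

# Average number of to_label neighbors per from_label cell
interaction_count <- function(label, from, to, from_label, to_label) {
    hits <- tabulate(from[label[from] == from_label & label[to] == to_label],
                     nbins = length(label))
    mean(hits[label == from_label])
}

observed <- interaction_count(label, from, to, "A", "B")

# Null distribution: shuffle the labels while keeping the graph fixed
n_perm <- 1000
null <- replicate(n_perm, interaction_count(sample(label), from, to, "A", "B"))

# One-sided empirical p-values for interaction and avoidance
p_gt <- (sum(null >= observed) + 1) / (n_perm + 1)
p_lt <- (sum(null <= observed) + 1) / (n_perm + 1)
c(observed = observed, p_gt = p_gt, p_lt = p_lt)
```

Shuffling labels on a fixed graph is what preserves the tissue structure: cell positions and neighbor relations stay the same, only the identities change.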

+

In the following example, we use the steinbock generated spatial interaction +graph and estimate the interaction or avoidance between cell types in the +dataset.

+
library(scales)
+out <- testInteractions(spe, 
+                        group_by = "sample_id",
+                        label = "celltype", 
+                        colPairName = "neighborhood",
+                        BPPARAM = SerialParam(RNGseed = 221029))
+
+head(out)
+
## DataFrame with 6 rows and 10 columns
+##       group_by  from_label    to_label        ct      p_gt      p_lt
+##    <character> <character> <character> <numeric> <numeric> <numeric>
+## 1 Patient1_001       Bcell       Bcell         0  1.000000  1.000000
+## 2 Patient1_001       Bcell     BnTcell         0  1.000000  0.998002
+## 3 Patient1_001       Bcell         CD4         3  0.001998  1.000000
+## 4 Patient1_001       Bcell         CD8         0  1.000000  0.898102
+## 5 Patient1_001       Bcell     Myeloid         0  1.000000  0.804196
+## 6 Patient1_001       Bcell  Neutrophil        NA        NA        NA
+##   interaction         p       sig    sigval
+##     <logical> <numeric> <logical> <numeric>
+## 1       FALSE  1.000000     FALSE         0
+## 2       FALSE  0.998002     FALSE         0
+## 3        TRUE  0.001998      TRUE         1
+## 4       FALSE  0.898102     FALSE         0
+## 5       FALSE  0.804196     FALSE         0
+## 6          NA        NA        NA        NA
+

The returned DataFrame contains the test results per grouping level (in this case +the image ID, group_by), “from” cell type (from_label) and “to” cell type +(to_label). The sigval entry indicates if a pair of cell types is +significantly interacting (sigval = 1), if a pair of cell types is +significantly avoiding (sigval = -1) or if no significant interaction or +avoidance was detected (sigval = 0).
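The relation between the columns can be reproduced from the first four rows of the output shown above. The cutoff of 0.01 is an assumption of this sketch, and the rule is inferred from the printed values rather than from the package source.

```r
# p-values copied from the first four rows of the output above
p_gt <- c(1.000000, 1.000000, 0.001998, 1.000000)
p_lt <- c(1.000000, 0.998002, 1.000000, 0.898102)
p_threshold <- 0.01                  # assumed significance cutoff

# The smaller one-sided p-value decides the direction
interaction <- p_gt < p_lt
p <- ifelse(interaction, p_gt, p_lt)
sig <- p < p_threshold

# 1 = significant interaction, -1 = significant avoidance, 0 = neither
sigval <- as.integer(sig) * ifelse(interaction, 1L, -1L)
sigval  # 0 0 1 0
```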

+

These results can be visualized by computing the sum of the sigval entries +across all images:

+
out %>% as_tibble() %>%
+    group_by(from_label, to_label) %>%
+    summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
+    ggplot() +
+        geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
+        scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
+        theme(axis.text.x = element_text(angle = 45, hjust = 1))
+

+

In the plot above the red tiles indicate cell type pairs that were detected to +significantly interact on a large number of images. On the other hand, blue +tiles show cell type pairs which tend to avoid each other on a large number +of images.

+

Here we can observe that tumor cells are mostly compartmentalized and are in +avoidance with other cell types. As expected, B cells interact with BnT cells; +regulatory T cells interact with CD4+ T cells and CD8+ T cells. Most cell types +show self interactions indicating spatial clustering.

+

The imcRtools package further implements an interaction testing strategy +proposed by (Schulz et al. 2018), which tests whether at least n cells of +a certain type are located around a target cell type (from_cell). This type of +testing can be performed by selecting method = "patch" and specifying the +number of patch cells via the patch_size parameter.
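The statistic behind this strategy, and why it is not symmetric, can be sketched in base R. The helper `patch_count`, the toy labels and the normalization by all cells of the "from" type are assumptions of this illustration.

```r
label <- c("CD4", "Myeloid", "Myeloid", "Myeloid", "CD4", "Myeloid")

# Hypothetical directed edges: from cell -> neighboring cell
from <- c(1, 1, 1, 5, 5, 2)
to <- c(2, 3, 4, 6, 1, 1)

# Fraction of from_label cells with at least patch_size to_label neighbors
patch_count <- function(label, from, to, from_label, to_label, patch_size) {
    keep <- label[from] == from_label & label[to] == to_label
    n_nb <- tabulate(from[keep], nbins = length(label))
    mean(n_nb[label == from_label] >= patch_size)
}

patch_count(label, from, to, "CD4", "Myeloid", patch_size = 3)  # 0.5
patch_count(label, from, to, "Myeloid", "CD4", patch_size = 3)  # 0
```

One of the two CD4 cells is surrounded by three myeloid cells, but no myeloid cell has three CD4 neighbors, so swapping the labels changes the result.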

+
out <- testInteractions(spe, 
+                        group_by = "sample_id",
+                        label = "celltype", 
+                        colPairName = "neighborhood",
+                        method = "patch", 
+                        patch_size = 3,
+                        BPPARAM = SerialParam(RNGseed = 221029))
+
+out %>% as_tibble() %>%
+    group_by(from_label, to_label) %>%
+    summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
+    ggplot() +
+        geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
+        scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
+        theme(axis.text.x = element_text(angle = 45, hjust = 1))
+

+

These results are comparable to the interaction testing presented above. The +main difference comes from the lack of symmetry. We can now for example see that +3 or more myeloid cells sit around CD4\(^+\) T cells while this interaction is not +as strong when considering CD4\(^+\) T cells sitting around myeloid cells.

+

Finally, we save the updated SpatialExperiment object.

+
saveRDS(spe, "data/spe.rds")
+
+
+

12.8 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             scales_1.2.1               
+##  [3] ggridges_0.5.4              lubridate_1.9.3            
+##  [5] forcats_1.0.0               stringr_1.5.0              
+##  [7] dplyr_1.1.3                 purrr_1.0.2                
+##  [9] readr_2.1.4                 tidyr_1.3.0                
+## [11] tibble_3.2.1                tidyverse_2.0.0            
+## [13] patchwork_1.1.3             RColorBrewer_1.1-3         
+## [15] circlize_0.4.15             lisaClust_1.8.1            
+## [17] pheatmap_1.0.12             BiocParallel_1.34.2        
+## [19] viridis_0.6.4               viridisLite_0.4.2          
+## [21] ggplot2_3.4.3               imcRtools_1.6.5            
+## [23] SpatialExperiment_1.10.0    SingleCellExperiment_1.22.0
+## [25] SummarizedExperiment_1.30.2 Biobase_2.60.0             
+## [27] GenomicRanges_1.52.0        GenomeInfoDb_1.36.3        
+## [29] IRanges_2.34.1              S4Vectors_0.38.2           
+## [31] BiocGenerics_0.46.0         MatrixGenerics_1.12.3      
+## [33] matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] spatstat.sparse_3.0-2       bitops_1.0-7               
+##   [3] sf_1.0-14                   EBImage_4.42.0             
+##   [5] doParallel_1.0.17           numDeriv_2016.8-1.1        
+##   [7] tools_4.3.1                 backports_1.4.1            
+##   [9] utf8_1.2.3                  R6_2.5.1                   
+##  [11] DT_0.29                     HDF5Array_1.28.1           
+##  [13] mgcv_1.8-42                 rhdf5filters_1.12.1        
+##  [15] GetoptLong_1.0.5            withr_2.5.1                
+##  [17] sp_2.0-0                    gridExtra_2.3              
+##  [19] ClassifyR_3.4.11            cli_3.6.1                  
+##  [21] spatstat.explore_3.2-3      sandwich_3.0-2             
+##  [23] labeling_0.4.3              sass_0.4.7                 
+##  [25] spatstat.data_3.0-1         nnls_1.5                   
+##  [27] mvtnorm_1.2-3               proxy_0.4-27               
+##  [29] systemfonts_1.0.4           colorRamps_2.3.1           
+##  [31] svglite_2.1.1               R.utils_2.12.2             
+##  [33] scater_1.28.0               plotrix_3.8-2              
+##  [35] limma_3.56.2                flowCore_2.12.2            
+##  [37] rstudioapi_0.15.0           generics_0.1.3             
+##  [39] shape_1.4.6                 spatstat.random_3.1-6      
+##  [41] gtools_3.9.4                vroom_1.6.3                
+##  [43] car_3.1-2                   scam_1.2-14                
+##  [45] Matrix_1.6-1.1              RProtoBufLib_2.12.1        
+##  [47] ggbeeswarm_0.7.2            fansi_1.0.4                
+##  [49] abind_1.4-5                 R.methodsS3_1.8.2          
+##  [51] terra_1.7-46                lifecycle_1.0.3            
+##  [53] multcomp_1.4-25             yaml_2.3.7                 
+##  [55] edgeR_3.42.4                carData_3.0-5              
+##  [57] rhdf5_2.44.0                Rtsne_0.16                 
+##  [59] grid_4.3.1                  promises_1.2.1             
+##  [61] dqrng_0.3.1                 crayon_1.5.2               
+##  [63] shinydashboard_0.7.2        lattice_0.21-8             
+##  [65] beachmat_2.16.0             cowplot_1.1.1              
+##  [67] magick_2.8.0                cytomapper_1.12.0          
+##  [69] pillar_1.9.0                knitr_1.44                 
+##  [71] ComplexHeatmap_2.16.0       RTriangle_1.6-0.12         
+##  [73] boot_1.3-28.1               rjson_0.2.21               
+##  [75] codetools_0.2-19            glue_1.6.2                 
+##  [77] V8_4.3.3                    data.table_1.14.8          
+##  [79] MultiAssayExperiment_1.26.0 vctrs_0.6.3                
+##  [81] png_0.1-8                   gtable_0.3.4               
+##  [83] cachem_1.0.8                xfun_0.40                  
+##  [85] S4Arrays_1.0.6              mime_0.12                  
+##  [87] DropletUtils_1.20.0         tidygraph_1.2.3            
+##  [89] ConsensusClusterPlus_1.64.0 survival_3.5-5             
+##  [91] iterators_1.0.14            cytolib_2.12.1             
+##  [93] units_0.8-4                 ellipsis_0.3.2             
+##  [95] TH.data_1.1-2               nlme_3.1-162               
+##  [97] bit64_4.0.5                 rprojroot_2.0.3            
+##  [99] bslib_0.5.1                 irlba_2.3.5.1              
+## [101] svgPanZoom_0.3.4            vipor_0.4.5                
+## [103] KernSmooth_2.23-21          colorspace_2.1-0           
+## [105] DBI_1.1.3                   raster_3.6-23              
+## [107] tidyselect_1.2.0            curl_5.0.2                 
+## [109] bit_4.0.5                   compiler_4.3.1             
+## [111] BiocNeighbors_1.18.0        desc_1.4.2                 
+## [113] DelayedArray_0.26.7         bookdown_0.35              
+## [115] classInt_0.4-10             distances_0.1.9            
+## [117] goftest_1.2-3               tiff_0.1-11                
+## [119] digest_0.6.33               minqa_1.2.6                
+## [121] fftwtools_0.9-11            spatstat.utils_3.0-3       
+## [123] rmarkdown_2.25              XVector_0.40.0             
+## [125] CATALYST_1.24.0             htmltools_0.5.6            
+## [127] pkgconfig_2.0.3             jpeg_0.1-10                
+## [129] lme4_1.1-34                 sparseMatrixStats_1.12.2   
+## [131] fastmap_1.1.1               rlang_1.1.1                
+## [133] GlobalOptions_0.1.2         htmlwidgets_1.6.2          
+## [135] shiny_1.7.5                 DelayedMatrixStats_1.22.6  
+## [137] farver_2.1.1                jquerylib_0.1.4            
+## [139] zoo_1.8-12                  jsonlite_1.8.7             
+## [141] spicyR_1.12.2               R.oo_1.25.0                
+## [143] BiocSingular_1.16.0         RCurl_1.98-1.12            
+## [145] magrittr_2.0.3              scuttle_1.10.2             
+## [147] GenomeInfoDbData_1.2.10     Rhdf5lib_1.22.1            
+## [149] munsell_0.5.0               Rcpp_1.0.11                
+## [151] ggnewscale_0.4.9            stringi_1.7.12             
+## [153] ggraph_2.1.0                brio_1.1.3                 
+## [155] zlibbioc_1.46.0             MASS_7.3-60                
+## [157] plyr_1.8.8                  parallel_4.3.1             
+## [159] ggrepel_0.9.3               deldir_1.0-9               
+## [161] graphlayouts_1.0.1          splines_4.3.1              
+## [163] tensor_1.5                  hms_1.1.3                  
+## [165] locfit_1.5-9.8              igraph_1.5.1               
+## [167] ggpubr_0.6.0                spatstat.geom_3.2-5        
+## [169] ggsignif_0.6.4              pkgload_1.3.3              
+## [171] reshape2_1.4.4              ScaledMatrix_1.8.1         
+## [173] XML_3.99-0.14               drc_3.0-1                  
+## [175] evaluate_0.21               nloptr_2.0.3               
+## [177] tzdb_0.4.0                  foreach_1.5.2              
+## [179] tweenr_2.0.2                httpuv_1.6.11              
+## [181] polyclip_1.10-6             clue_0.3-65                
+## [183] ggforce_0.4.1               rsvd_1.0.5                 
+## [185] broom_1.0.5                 xtable_1.8-4               
+## [187] e1071_1.7-13                rstatix_0.7.2              
+## [189] later_1.3.1                 class_7.3-22               
+## [191] lmerTest_3.1-3              FlowSOM_2.8.0              
+## [193] beeswarm_0.4.0              cluster_2.1.4              
+## [195] timechange_0.2.0            concaveman_1.1.0
+
+ +
+
+

References

+
+
+Bhate, Salil S., Graham L. Barlow, Christian M. Schürch, and Garry P. Nolan. 2022. “Tissue Schematics Map the Specialization of Immune Tissue Motifs and Their Appropriation by Tumors.” Cell Systems 13 (2): 109–30. +
+
+Goltsev, Yury, Nikolay Samusik, Julia Kennedy-Darling, Salil Bhate, Matthew Hale, Gustavo Vazquez, Sarah Black, and Garry P. Nolan. 2018. “Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging.” Cell 174: 968–81. +
+
+Hoch, Tobias, Daniel Schulz, Nils Eling, Julia Martínez Gómez, Mitchell P. Levesque, and Bernd Bodenmiller. 2022. “Multiplexed Imaging Mass Cytometry of the Chemokine Milieus in Melanoma Characterizes Features of the Response to Immunotherapy.” Science Immunology 7 (70): eabk1692. +
+
+Jackson, Hartland W., Jana R. Fischer, Vito R. T. Zanotelli, H. Raza Ali, Robert Mechera, Savas D. Soysal, Holger Moch, et al. 2020. “The Single-Cell Pathology Landscape of Breast Cancer.” Nature 578: 615–20. +
+
+Patrick, Ellis, Nicolas P. Canete, Sourish S. Iyengar, Andrew N. Harman, Greg T. Sutherland, and Pengyi Yang. 2023. “Spatial Analysis for Highly Multiplexed Imaging Data to Identify Tissue Microenvironments.” Cytometry Part A. +
+
+Schapiro, Denis, Hartland W Jackson, Swetha Raghuraman, Jana R Fischer, Vito RT Zanotelli, Daniel Schulz, Charlotte Giesen, Raúl Catena, Zsuzsanna Varga, and Bernd Bodenmiller. 2017. “histoCAT: Analysis of Cell Phenotypes and Interactions in Multiplex Image Cytometry Data.” Nature Methods 14: 873–76. +
+
+Schulz, Daniel, Vito RT Zanotelli, Jana R Fischer, Denis Schapiro, Stefanie Engler, Xiao-Kang Lun, Hartland W Jackson, and Bernd Bodenmiller. 2018. “Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry.” Cell Systems 6: 25–36.e5. +
+
+Schürch, Christian M, Salil S Bhate, Graham L Barlow, Darci J Phillips, Luca Noti, Inti Zlobec, Pauline Chu, et al. 2020. “Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive Front.” Cell 182: 1341–59. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/prerequisites.html b/prerequisites.html new file mode 100644 index 00000000..1d4b6856 --- /dev/null +++ b/prerequisites.html @@ -0,0 +1,916 @@ + + + + + + + 4 Prerequisites | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

4 Prerequisites

+

The analysis presented in this book requires a basic understanding of the +R programming language. An introduction to R can be found here and +in the book R for Data Science.

+

Furthermore, it is beneficial to be familiar with single-cell data analysis +using the Bioconductor framework. The +Orchestrating Single-Cell Analysis with Bioconductor book +gives an excellent overview of the data containers and basic analyses that are +used here.

+

An overview of IMC as a technology and the necessary image processing steps can be +found on the IMC workflow website.

+

Before we get started on IMC data analysis, we will need to make sure that +software dependencies are installed and the example data is downloaded.

+
+

4.1 Obtain the code

+

This book provides R code to perform single-cell and spatial data analysis. +You can copy the individual code chunks into your R scripts or you can obtain +the full code of the book via:

+
git clone https://github.com/BodenmillerGroup/IMCDataAnalysis.git
+
+
+

4.2 Software requirements

+

The R packages needed to execute the presented workflow can either be manually +installed (see section 4.2.2) or are available within a provided +Docker container (see section 4.2.1). The Docker option is useful if you +want to exactly reproduce the presented analysis across operating systems; +however, the manual install gives you more flexibility for exploratory data +analysis.

+
+

4.2.1 Using Docker

+

For reproducibility purposes, we provide a Docker container here.

+
  1. After installing Docker you can first pull the container via:
+
docker pull ghcr.io/bodenmillergroup/imcdataanalysis:latest
+

and then run the container:

+
docker run -v /path/to/IMCDataAnalysis:/home/rstudio/IMCDataAnalysis \
+    -e PASSWORD=bioc -p 8787:8787  \
+    ghcr.io/bodenmillergroup/imcdataanalysis:latest
+

Here, the /path/to/ needs to be adjusted to where you keep the code and data +of the book.

+

Of note: it is recommended to use a date-tagged version of the container to ensure reproducibility. +This can be done via:

+
docker pull ghcr.io/bodenmillergroup/imcdataanalysis:<year-month-date>
+
    +
  2. An RStudio server session can be accessed via a browser at localhost:8787 using Username: rstudio and Password: bioc.
  3. Navigate to IMCDataAnalysis and open the IMCDataAnalysis.Rproj file.
  4. Code in the individual files can now be executed, or the whole workflow can be built by entering bookdown::render_book().
+
+
+

4.2.2 Manual installation

+

The following section describes how to manually install all needed R packages +when not using the provided Docker container. +To install all R packages needed for the analysis, please run:

+
if (!requireNamespace("BiocManager", quietly = TRUE))
+    install.packages("BiocManager")
+
+BiocManager::install(c("rmarkdown", "bookdown", "pheatmap", "viridis", "zoo", 
+                       "devtools", "testthat", "tiff", "distill", "ggrepel", 
+                       "patchwork", "mclust", "RColorBrewer", "uwot", "Rtsne", 
+                       "harmony", "Seurat", "SeuratObject", "cowplot", "kohonen", 
+                       "caret", "randomForest", "ggridges", "cowplot", 
+                       "gridGraphics", "scales", "tiff", "harmony", "Matrix", 
+                       "CATALYST", "scuttle", "scater", "dittoSeq", 
+                       "tidyverse", "BiocStyle", "batchelor", "bluster", "scran", 
+                       "lisaClust", "spicyR", "iSEE", "imcRtools", "cytomapper",
+                       "imcdatasets", "cytoviewer"))
+
+# Github dependencies
+devtools::install_github("i-cyto/Rphenograph")
+
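
Before or after running the installation command above, it can be useful to check which packages are still absent from the local R library. The following is a minimal sketch in base R; the helper name `find_missing_packages` is hypothetical and not part of the book's code, and the package vector is only a short excerpt of the full list.

```r
# Hypothetical helper (not part of the book's code): return the names of
# requested packages that are not found in the current R library paths.
find_missing_packages <- function(pkgs) {
    # setdiff() also drops duplicated package names
    setdiff(pkgs, rownames(installed.packages()))
}

# Short excerpt of the packages used in this workflow
needed <- c("imcRtools", "cytomapper", "scater", "tidyverse")
missing_pkgs <- find_missing_packages(needed)

if (length(missing_pkgs) > 0) {
    message("Packages still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
    message("All listed packages are installed.")
}
```

The returned vector could be passed to `BiocManager::install()` so that only the missing packages are installed.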
+
+

4.2.3 Major package versions

+

Throughout the analysis, we rely on different R software packages. +This section lists the most commonly used packages in this workflow.

+

Data containers:

+ +

Data analysis:

+ +

Data visualization:

+ +

Tidy R:

+ +
+
+
+

4.3 Image processing

+

The analysis presented here fully relies on packages written in the programming +language R and primarily focuses on analysis approaches downstream of image +processing. The example data available at +https://zenodo.org/record/7575859 were +processed (file type conversion, image segmentation, feature extraction as +explained in Section 3) using the +steinbock toolkit. The +exact command line interface calls to process the raw data are shown below:

+
#!/usr/bin/env bash
+BASEDIR=$(cd -- "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)
+cd "${BASEDIR}"
+
+# raw data collection
+mkdir raw
+wget https://zenodo.org/record/6449127/files/IMCWorkflow.ilp
+wget https://zenodo.org/record/6449127/files/analysis.zip
+unzip analysis.zip
+rm analysis.zip
+rm -r analysis/cpinp
+rm -r analysis/cpout
+rm -r analysis/histocat
+rm -r analysis/ilastik
+rm -r analysis/ometiff
+cd raw
+wget https://zenodo.org/record/5949116/files/panel.csv
+wget https://zenodo.org/record/5949116/files/Patient1.zip
+wget https://zenodo.org/record/5949116/files/Patient2.zip
+wget https://zenodo.org/record/5949116/files/Patient3.zip
+wget https://zenodo.org/record/5949116/files/Patient4.zip
+cd ${BASEDIR}
+
+# steinbock alias setup
+shopt -s expand_aliases
+alias steinbock="docker run -v ${BASEDIR}:/data -u $(id -u):$(id -g) ghcr.io/bodenmillergroup/steinbock:0.16.0"
+
+# raw data preprocessing
+steinbock preprocess imc panel --namecol Clean_Target
+steinbock preprocess imc images --hpf 50
+
+# random forest-based segmentation using Ilastik/CellProfiler
+steinbock classify ilastik prepare --cropsize 500 --seed 123
+rm pixel_classifier.ilp && mv IMCWorkflow.ilp pixel_classifier.ilp
+rm -r ilastik_crops && mv analysis/crops ilastik_crops
+steinbock classify ilastik fix --no-backup
+steinbock classify ilastik run
+steinbock segment cellprofiler prepare
+steinbock segment cellprofiler run -o masks_ilastik
+
+# deep learning-based whole-cell segmentation using DeepCell/Mesmer
+steinbock segment deepcell --app mesmer --minmax -o masks_deepcell
+
+# single-cell feature extraction
+steinbock measure intensities --masks masks_deepcell
+steinbock measure regionprops --masks masks_deepcell
+steinbock measure neighbors --masks masks_deepcell --type expansion --dmax 4
+
+# data export
+steinbock export ome
+steinbock export histocat --masks masks_deepcell
+steinbock export csv intensities regionprops -o cells.csv
+steinbock export csv intensities regionprops --no-concat -o cells_csv
+steinbock export fcs intensities regionprops -o cells.fcs
+steinbock export fcs intensities regionprops --no-concat -o cells_fcs
+steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors -o cells.h5ad
+steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors --no-concat -o cells_h5ad
+steinbock export graphs --data intensities
+
+# archiving
+zip -r img.zip img
+zip -r ilastik_img.zip ilastik_img
+zip -r ilastik_crops.zip ilastik_crops
+zip -r ilastik_probabilities.zip ilastik_probabilities
+zip -r masks_ilastik.zip masks_ilastik
+zip -r masks_deepcell.zip masks_deepcell
+zip -r intensities.zip intensities
+zip -r regionprops.zip regionprops
+zip -r neighbors.zip neighbors
+zip -r ome.zip ome
+zip -r histocat.zip histocat
+zip -r cells_csv.zip cells_csv
+zip -r cells_fcs.zip cells_fcs
+zip -r cells_h5ad.zip cells_h5ad
+zip -r graphs.zip graphs
+
+
+

4.4 Download example data

+

Throughout this tutorial, we will access a number of different data types. +To declutter the analysis scripts, we download all required data here in advance.

+

To highlight the basic steps of IMC data analysis, we provide example data that +were acquired as part of the Integrated iMMUnoprofiling of large adaptive +CANcer patient cohorts project (immucan.eu). The +raw data of 4 patients can be accessed online at +zenodo.org/record/7575859. We will only +download the sample/patient metadata information here:

+
download.file("https://zenodo.org/record/7575859/files/sample_metadata.csv", 
+         destfile = "data/sample_metadata.csv")
+
+

4.4.1 Processed multiplexed imaging data

+

The IMC raw data were processed using both the +steinbock toolkit and the +IMC Segmentation Pipeline. +Image processing included file type conversion, cell segmentation and feature +extraction.

+

steinbock output

+

This book uses the output of the steinbock framework when applied to process +the example data. The processed data include the single-cell mean intensity +files, the single-cell morphological features and spatial locations, spatial +object graphs in the form of edge lists indicating cells in close proximity, hot +pixel filtered multi-channel images, segmentation masks, image metadata and +channel metadata. All these files will be downloaded here for later use. The +commands used to generate these data can be found in the shell script +above.

+
# download intensities
+url <- "https://zenodo.org/record/7624451/files/intensities.zip"
+destfile <- "data/steinbock/intensities.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
+unlink(destfile)
+
+# download regionprops
+url <- "https://zenodo.org/record/7624451/files/regionprops.zip"
+destfile <- "data/steinbock/regionprops.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
+unlink(destfile)
+
+# download neighbors
+url <- "https://zenodo.org/record/7624451/files/neighbors.zip"
+destfile <- "data/steinbock/neighbors.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
+unlink(destfile)
+
+# download images
+url <- "https://zenodo.org/record/7624451/files/img.zip"
+destfile <- "data/steinbock/img.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
+unlink(destfile)
+
+# download masks
+url <- "https://zenodo.org/record/7624451/files/masks_deepcell.zip"
+destfile <- "data/steinbock/masks_deepcell.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
+unlink(destfile)
+
+# download individual files
+download.file("https://zenodo.org/record/7624451/files/panel.csv", 
+              "data/steinbock/panel.csv")
+download.file("https://zenodo.org/record/7624451/files/images.csv", 
+              "data/steinbock/images.csv")
+download.file("https://zenodo.org/record/7624451/files/steinbock.sh", 
+              "data/steinbock/steinbock.sh")
+
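
The download, unzip and clean-up steps above follow the same pattern for every archive. As a minimal sketch, this pattern could be wrapped in a small function; the name `fetch_and_unzip` is hypothetical and not part of the book's code. The sketch also creates the target folders first, on the assumption that they may not exist yet.

```r
# Hypothetical helper (not part of the book's code): download a zip archive,
# extract it and remove the archive afterwards.
fetch_and_unzip <- function(url, destfile, exdir = dirname(destfile)) {
    # download.file() does not create missing folders, so create them first
    dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
    dir.create(exdir, recursive = TRUE, showWarnings = FALSE)

    # mode = "wb" keeps binary files intact on all platforms
    status <- download.file(url, destfile, mode = "wb")
    if (status != 0) stop("Download failed for ", url)

    unzip(destfile, exdir = exdir, overwrite = TRUE)
    unlink(destfile)
    invisible(exdir)
}

# Usage (requires network access), mirroring the calls above:
# base_url <- "https://zenodo.org/record/7624451/files/"
# for (f in c("intensities", "regionprops", "neighbors", "img", "masks_deepcell")) {
#     fetch_and_unzip(paste0(base_url, f, ".zip"),
#                     file.path("data/steinbock", paste0(f, ".zip")))
# }
```

The explicit calls shown above do the same work; the function only removes the repetition.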

IMC Segmentation Pipeline output

+

The example data was also processed using the +IMC Segmentation Pipeline (version 3). +To highlight the use of the reader function for this type of output, we will need +to download the cpout folder which is part of the analysis folder. The cpout +folder stores all relevant output files of the pipeline. For a full description +of the pipeline, please refer to the docs.

+
# download analysis folder
+url <- "https://zenodo.org/record/7997296/files/analysis.zip"
+destfile <- "data/ImcSegmentationPipeline/analysis.zip"
+download.file(url, destfile)
+unzip(destfile, exdir="data/ImcSegmentationPipeline", overwrite=TRUE)
+unlink(destfile)
+
+unlink("data/ImcSegmentationPipeline/analysis/cpinp/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/crops/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/histocat/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/ilastik/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/ometiff/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/cpout/images/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/cpout/probabilities/", recursive=TRUE)
+unlink("data/ImcSegmentationPipeline/analysis/cpout/masks/", recursive=TRUE)
+
+
+

4.4.2 Files for spillover matrix estimation

+

To highlight the estimation and correction of channel-spillover as described by +(Chevrier et al. 2017), we can access an example spillover-acquisition from:

+
download.file("https://zenodo.org/record/7575859/files/compensation.zip",
+              "data/compensation.zip")
+unzip("data/compensation.zip", exdir="data", overwrite=TRUE)
+unlink("data/compensation.zip")
+
+
+

4.4.3 Gated cells

+

In Section 9.3, we present a cell type classification approach +that relies on previously gated cells. This ground truth data is available +online at zenodo.org/record/8095133 and +will be downloaded here for later use:

+
download.file("https://zenodo.org/record/8095133/files/gated_cells.zip",
+              "data/gated_cells.zip")
+unzip("data/gated_cells.zip", exdir="data", overwrite=TRUE)
+unlink("data/gated_cells.zip")
+
+
+
+

4.5 Software versions

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] cytoviewer_1.0.1            caret_6.0-94               
+##  [3] lattice_0.21-8              lisaClust_1.8.1            
+##  [5] scran_1.28.2                bluster_1.10.0             
+##  [7] lubridate_1.9.3             forcats_1.0.0              
+##  [9] stringr_1.5.0               dplyr_1.1.3                
+## [11] purrr_1.0.2                 readr_2.1.4                
+## [13] tidyr_1.3.0                 tibble_3.2.1               
+## [15] tidyverse_2.0.0             dittoSeq_1.12.1            
+## [17] cytomapper_1.12.0           EBImage_4.42.0             
+## [19] imcRtools_1.6.5             scater_1.28.0              
+## [21] ggplot2_3.4.3               scuttle_1.10.2             
+## [23] SpatialExperiment_1.10.0    CATALYST_1.24.0            
+## [25] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
+## [27] Biobase_2.60.0              GenomicRanges_1.52.0       
+## [29] GenomeInfoDb_1.36.3         IRanges_2.34.1             
+## [31] S4Vectors_0.38.2            BiocGenerics_0.46.0        
+## [33] MatrixGenerics_1.12.3       matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] R.methodsS3_1.8.2           vroom_1.6.3                
+##   [3] tiff_0.1-11                 nnet_7.3-19                
+##   [5] goftest_1.2-3               DT_0.29                    
+##   [7] HDF5Array_1.28.1            TH.data_1.1-2              
+##   [9] vctrs_0.6.3                 spatstat.random_3.1-6      
+##  [11] digest_0.6.33               png_0.1-8                  
+##  [13] shape_1.4.6                 proxy_0.4-27               
+##  [15] ggrepel_0.9.3               spicyR_1.12.2              
+##  [17] deldir_1.0-9                parallelly_1.36.0          
+##  [19] magick_2.8.0                MASS_7.3-60                
+##  [21] reshape2_1.4.4              httpuv_1.6.11              
+##  [23] foreach_1.5.2               withr_2.5.1                
+##  [25] xfun_0.40                   ggpubr_0.6.0               
+##  [27] ellipsis_0.3.2              survival_3.5-5             
+##  [29] RTriangle_1.6-0.12          ggbeeswarm_0.7.2           
+##  [31] RProtoBufLib_2.12.1         drc_3.0-1                  
+##  [33] systemfonts_1.0.4           zoo_1.8-12                 
+##  [35] GlobalOptions_0.1.2         gtools_3.9.4               
+##  [37] R.oo_1.25.0                 promises_1.2.1             
+##  [39] rstatix_0.7.2               globals_0.16.2             
+##  [41] rhdf5filters_1.12.1         rhdf5_2.44.0               
+##  [43] rstudioapi_0.15.0           miniUI_0.1.1.1             
+##  [45] archive_1.1.6               units_0.8-4                
+##  [47] generics_0.1.3              concaveman_1.1.0           
+##  [49] zlibbioc_1.46.0             ScaledMatrix_1.8.1         
+##  [51] ggraph_2.1.0                polyclip_1.10-6            
+##  [53] GenomeInfoDbData_1.2.10     fftwtools_0.9-11           
+##  [55] xtable_1.8-4                doParallel_1.0.17          
+##  [57] evaluate_0.21               S4Arrays_1.0.6             
+##  [59] hms_1.1.3                   bookdown_0.35              
+##  [61] irlba_2.3.5.1               colorspace_2.1-0           
+##  [63] spatstat.data_3.0-1         magrittr_2.0.3             
+##  [65] later_1.3.1                 viridis_0.6.4              
+##  [67] spatstat.geom_3.2-5         future.apply_1.11.0        
+##  [69] XML_3.99-0.14               cowplot_1.1.1              
+##  [71] class_7.3-22                svgPanZoom_0.3.4           
+##  [73] pillar_1.9.0                nlme_3.1-162               
+##  [75] iterators_1.0.14            compiler_4.3.1             
+##  [77] beachmat_2.16.0             shinycssloaders_1.0.0      
+##  [79] stringi_1.7.12              gower_1.0.1                
+##  [81] sf_1.0-14                   tensor_1.5                 
+##  [83] minqa_1.2.6                 ClassifyR_3.4.11           
+##  [85] plyr_1.8.8                  crayon_1.5.2               
+##  [87] abind_1.4-5                 locfit_1.5-9.8             
+##  [89] sp_2.0-0                    graphlayouts_1.0.1         
+##  [91] bit_4.0.5                   terra_1.7-46               
+##  [93] sandwich_3.0-2              codetools_0.2-19           
+##  [95] multcomp_1.4-25             recipes_1.0.8              
+##  [97] BiocSingular_1.16.0         bslib_0.5.1                
+##  [99] e1071_1.7-13                GetoptLong_1.0.5           
+## [101] mime_0.12                   MultiAssayExperiment_1.26.0
+## [103] splines_4.3.1               circlize_0.4.15            
+## [105] Rcpp_1.0.11                 sparseMatrixStats_1.12.2   
+## [107] knitr_1.44                  utf8_1.2.3                 
+## [109] clue_0.3-65                 lme4_1.1-34                
+## [111] listenv_0.9.0               nnls_1.5                   
+## [113] DelayedMatrixStats_1.22.6   ggsignif_0.6.4             
+## [115] Matrix_1.6-1.1              scam_1.2-14                
+## [117] statmod_1.5.0               tzdb_0.4.0                 
+## [119] svglite_2.1.1               tweenr_2.0.2               
+## [121] pkgconfig_2.0.3             pheatmap_1.0.12            
+## [123] tools_4.3.1                 cachem_1.0.8               
+## [125] viridisLite_0.4.2           DBI_1.1.3                  
+## [127] numDeriv_2016.8-1.1         fastmap_1.1.1              
+## [129] rmarkdown_2.25              scales_1.2.1               
+## [131] grid_4.3.1                  shinydashboard_0.7.2       
+## [133] broom_1.0.5                 sass_0.4.7                 
+## [135] carData_3.0-5               rpart_4.1.19               
+## [137] farver_2.1.1                tidygraph_1.2.3            
+## [139] mgcv_1.8-42                 yaml_2.3.7                 
+## [141] cli_3.6.1                   lifecycle_1.0.3            
+## [143] mvtnorm_1.2-3               lava_1.7.2.1               
+## [145] backports_1.4.1             DropletUtils_1.20.0        
+## [147] BiocParallel_1.34.2         cytolib_2.12.1             
+## [149] timechange_0.2.0            gtable_0.3.4               
+## [151] rjson_0.2.21                ggridges_0.5.4             
+## [153] parallel_4.3.1              pROC_1.18.4                
+## [155] limma_3.56.2                colourpicker_1.3.0         
+## [157] jsonlite_1.8.7              edgeR_3.42.4               
+## [159] bitops_1.0-7                bit64_4.0.5                
+## [161] Rtsne_0.16                  FlowSOM_2.8.0              
+## [163] spatstat.utils_3.0-3        BiocNeighbors_1.18.0       
+## [165] flowCore_2.12.2             jquerylib_0.1.4            
+## [167] metapod_1.8.0               dqrng_0.3.1                
+## [169] R.utils_2.12.2              timeDate_4022.108          
+## [171] shiny_1.7.5                 ConsensusClusterPlus_1.64.0
+## [173] htmltools_0.5.6             distances_0.1.9            
+## [175] glue_1.6.2                  XVector_0.40.0             
+## [177] RCurl_1.98-1.12             classInt_0.4-10            
+## [179] jpeg_0.1-10                 gridExtra_2.3              
+## [181] boot_1.3-28.1               igraph_1.5.1               
+## [183] R6_2.5.1                    cluster_2.1.4              
+## [185] Rhdf5lib_1.22.1             ipred_0.9-14               
+## [187] nloptr_2.0.3                DelayedArray_0.26.7        
+## [189] tidyselect_1.2.0            vipor_0.4.5                
+## [191] plotrix_3.8-2               ggforce_0.4.1              
+## [193] raster_3.6-23               car_3.1-2                  
+## [195] future_1.33.0               ModelMetrics_1.2.2.2       
+## [197] rsvd_1.0.5                  munsell_0.5.0              
+## [199] KernSmooth_2.23-21          data.table_1.14.8          
+## [201] htmlwidgets_1.6.2           ComplexHeatmap_2.16.0      
+## [203] RColorBrewer_1.1-3          rlang_1.1.1                
+## [205] spatstat.sparse_3.0-2       spatstat.explore_3.2-3     
+## [207] lmerTest_3.1-3              colorRamps_2.3.1           
+## [209] ggnewscale_0.4.9            fansi_1.0.4                
+## [211] hardhat_1.3.0               beeswarm_0.4.0             
+## [213] prodlim_2023.08.28
+
+ +
+
+

References

+
+
+Chevrier, Stéphane, Helena L. Crowell, Vito R. T. Zanotelli, Stefanie Engler, Mark D. Robinson, and Bernd Bodenmiller. 2017. “Compensation of Signal Spillover in Suspension and Imaging Mass Cytometry.” Cell Systems 6: 612–20. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/processing.html b/processing.html new file mode 100644 index 00000000..12267ec3 --- /dev/null +++ b/processing.html @@ -0,0 +1,565 @@ + + + + + + + 3 Multi-channel image processing | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

3 Multi-channel image processing

+

This book focuses on common analysis steps of spatially-resolved single-cell data +after image segmentation and feature extraction. The sections of this chapter +describe the processing of multiplexed imaging data, including file type +conversion, image segmentation, feature extraction and data export. To obtain +more detailed information on the individual image processing approaches, please +visit their repositories:

+

steinbock: The steinbock +toolkit offers tools for multi-channel image processing using the command line +or Python code (Windhager, Bodenmiller, and Eling 2021). Supported tasks include IMC data pre-processing, +multi-channel image segmentation, object quantification and data +export to a variety of file formats. It offers functionality similar to that +of the IMC Segmentation Pipeline (see below) and additionally supports deep learning-enabled image +segmentation. The toolkit is available as a platform-independent Docker +container, ensuring reproducibility and user-friendly installation. Read more in +the Docs.

+

IMC Segmentation +Pipeline: The IMC +segmentation pipeline offers a rather manual way of segmenting multi-channel +images using a pixel classification-based approach. We continue to maintain the +pipeline but recommend the use of the steinbock toolkit for multi-channel +image processing. Raw IMC data pre-processing is performed using the +readimc Python package to convert +raw MCD files into OME-TIFF and TIFF files. After image cropping, an +Ilastik pixel classifier is trained for image +classification prior to image segmentation using +CellProfiler. Features (i.e., mean pixel intensity) +of segmented objects (i.e., cells) are quantified and exported. Read more in the +Docs.

+
+

3.1 Image pre-processing (IMC specific)

+

Image pre-processing is technology dependent. While most multiplexed imaging +technologies generate TIFF or OME-TIFF files that can be segmented directly +using the steinbock toolkit, IMC produces data in the proprietary +MCD format.

+

To facilitate IMC data pre-processing, the +readimc open-source Python +package allows extracting the multi-modal (IMC acquisitions, panoramas), +multi-region, multi-channel information contained in raw IMC images. Both the +IMC Segmentation Pipeline and the steinbock toolkit use the readimc +package for IMC data pre-processing. Starting from IMC raw data and a “panel” +file, individual acquisitions are extracted as TIFF files and, when +using the IMC Segmentation Pipeline, also as OME-TIFF files. The panel contains information on the +antibodies used in the experiment, and the user can specify which channels to +keep for downstream analysis. When using the IMC Segmentation Pipeline, random +tiles are cropped from images for convenience of pixel labelling.

+
+
+

3.2 Image segmentation

+

The IMC Segmentation Pipeline supports pixel classification-based image +segmentation while steinbock supports pixel classification-based and deep +learning-based segmentation.

+

Pixel classification-based image segmentation is performed by training a +random forest classifier using Ilastik on the +randomly extracted image crops and selected image channels. Pixels are +classified as nuclear, cytoplasmic, or background. Employing a customizable +CellProfiler pipeline, the probabilities are then +thresholded for segmenting nuclei, and nuclei are expanded into cytoplasmic +regions to obtain cell masks.

+

Deep learning-based image segmentation is performed as presented by +(Greenwald et al. 2021). Briefly, steinbock first aggregates user-defined +image channels to generate two-channel images representing nuclear and +cytoplasmic signals. Next, the +DeepCell Python package is +used to run Mesmer, a deep learning-enabled segmentation algorithm pre-trained +on TissueNet, to automatically obtain cell masks without any further user +input.

+

Segmentation masks are single-channel images that match the input images in +size, with non-zero grayscale values indicating the IDs of segmented objects +(e.g., cells). These masks are written out as TIFF files after segmentation.
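
To make the structure of a segmentation mask concrete, the following minimal sketch builds a toy mask as a plain matrix in base R. The values are made up for illustration; real masks would be read from the TIFF files written by the segmentation step.

```r
# Toy segmentation mask (5 x 6 pixels): 0 is background,
# non-zero values are the IDs of segmented objects.
mask <- matrix(c(0, 1, 1, 0, 0, 0,
                 0, 1, 1, 0, 2, 2,
                 0, 0, 0, 0, 2, 2,
                 3, 3, 0, 0, 2, 0,
                 3, 0, 0, 0, 0, 0),
               nrow = 5, byrow = TRUE)

# IDs of all segmented objects (background excluded)
object_ids <- sort(unique(mask[mask > 0]))
object_ids

# Object area in pixels: number of pixels carrying each ID
object_area <- table(mask[mask > 0])
object_area
```

Here, the mask contains three objects covering 4, 5 and 3 pixels, respectively.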

+
+
+

3.3 Feature extraction

+

Using the segmentation masks together with their corresponding multi-channel +images, the IMC Segmentation Pipeline as well as the steinbock toolkit extract +object-specific features. These include the mean pixel intensity per object and +channel, morphological features (e.g., object area) and the objects’ locations. +Object-specific features are written out as CSV files where rows represent +individual objects and columns represent features.
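
The mean pixel intensity per object and channel can be illustrated with a toy example in base R. This is only a sketch of the underlying idea, not the implementation used by steinbock or the IMC Segmentation Pipeline; the mask, the image values and the channel names are made up.

```r
# Toy mask (3 x 4 pixels) with two objects; 0 is background
mask <- matrix(c(1, 1, 0, 2,
                 1, 0, 0, 2,
                 0, 0, 2, 2),
               nrow = 3, byrow = TRUE)

# Toy two-channel image stored as an array: rows x columns x channels
img <- array(0, dim = c(3, 4, 2),
             dimnames = list(NULL, NULL, c("ChannelA", "ChannelB")))
img[, , "ChannelA"] <- mask * 2          # object 1 pixels = 2, object 2 pixels = 4
img[, , "ChannelB"] <- (mask == 2) * 10  # only object 2 carries signal

# Mean pixel intensity per object (rows) and channel (columns)
is_object <- mask > 0
mean_intensity <- apply(img, 3, function(channel) {
    tapply(channel[is_object], mask[is_object], mean)
})
mean_intensity
```

The result has one row per object and one column per channel, which mirrors the layout of the exported CSV files described above.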

+

Furthermore, the IMC Segmentation Pipeline and the steinbock toolkit compute +spatial object graphs, in which nodes correspond to objects, and nodes in +spatial proximity are connected by an edge. These graphs serve as a proxy for +interactions between neighboring cells. They are stored as edge lists in the form of +one CSV file per image.
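
As a minimal sketch of what such an edge list looks like, the following base R example connects toy objects whose centroids lie closer than a distance threshold. Thresholding centroid distances is used here as a simple stand-in and is not necessarily how the graphs of the example data were constructed; the coordinates and the threshold `dmax` are made up.

```r
# Toy centroid coordinates of five objects (in pixels)
centroids <- data.frame(
    Object = 1:5,
    Pos_X = c(2, 4, 20, 22, 40),
    Pos_Y = c(2, 3, 20, 21, 40)
)

# Pairwise Euclidean distances between centroids
d <- as.matrix(dist(centroids[, c("Pos_X", "Pos_Y")]))

# Connect all pairs of distinct objects closer than a hypothetical threshold
dmax <- 5
idx <- which(d < dmax & row(d) != col(d), arr.ind = TRUE)

# Directed edge list: each neighboring pair appears in both directions
edges <- data.frame(from = centroids$Object[idx[, "row"]],
                    to = centroids$Object[idx[, "col"]])
edges <- edges[order(edges$from, edges$to), ]
edges
```

Objects 1 and 2 as well as objects 3 and 4 are connected, while object 5 has no neighbors and therefore does not appear in the edge list.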

+

Both approaches also write out image-specific metadata (e.g., width and height) +as a CSV file.

+
+
+

3.4 Data export

+

To further facilitate compatibility with downstream analysis, steinbock +exports data to a variety of file formats such as OME-TIFF for images, FCS for +single-cell data, the anndata format (Virshup et al. 2021) for data analysis in Python, +and various graph file formats for network analysis using software such as +Cytoscape (Shannon et al. 2003). For export to OME-TIFF, +steinbock uses xtiff, a Python +package developed for writing multi-channel TIFF stacks.

+
+
+

3.5 Data import into R

+

In Section 5, we will highlight the use of the +imcRtools and +cytomapper R/Bioconductor +packages to read spatially-resolved single-cell data and images as generated by the +IMC Segmentation Pipeline and the steinbock toolkit into the statistical +programming language R. All further downstream analyses are performed in R and +detailed in the following sections.

+ +
+
+

References

+
+
+Greenwald, Noah F., Geneva Miller, Erick Moen, Alex Kong, Adam Kagel, Thomas Dougherty, Christine Camacho Fullaway, et al. 2021. “Whole-Cell Segmentation of Tissue Images with Human-Level Performance Using Large-Scale Data Annotation and Deep Learning.” Nature Biotechnology 40: 555–65. +
+
+Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks.” Genome Research 13: 2498–2504. +
+
+Virshup, Isaac, Sergei Rybakov, Fabian J. Theis, Philipp Angerer, and F. Alexander Wolf. 2021. “Anndata: Annotated Data.” bioRxiv. +
+
+Windhager, Jonas, Bernd Bodenmiller, and Nils Eling. 2021. “An End-to-End Workflow for Multiplexed Image Processing and Analysis.” bioRxiv. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/read-data.html b/read-data.html new file mode 100644 index 00000000..c26f13e0 --- /dev/null +++ b/read-data.html @@ -0,0 +1,1034 @@ + + + + + + + 5 Read in the data | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

5 Read in the data

+

This section describes how to read in single-cell data and images into R +after image processing and segmentation (see Section 3).

+

To highlight examples for IMC data analysis, we provide already processed data at +10.5281/zenodo.6043599. +This data has already been downloaded in Section 4.4 and can +be accessed in the folder data.

+

We use the imcRtools package to +read in single-cell data extracted using the steinbock framework or the IMC +Segmentation Pipeline. Both image processing approaches also generate +multi-channel images and segmentation masks that can be read into R using the +cytomapper package.

+
library(imcRtools)
+library(cytomapper)
+
+

5.1 Read in single-cell information

+

For single-cell data analysis in R the +SingleCellExperiment +(Amezquita et al. 2019) data container is commonly used within the Bioconductor +framework. It allows standardized access to (i) expression data, (ii) cellular +metadata (e.g., cell type), (iii) feature metadata (e.g., marker name) and (iv) +experiment-wide metadata. For an in-depth introduction to the SingleCellExperiment +container, please refer to the SingleCellExperiment class.

+

The SpatialExperiment +class (Righelli et al. 2022) is an extension of the SingleCellExperiment class. It +was developed to store spatial data in addition to single-cell data and an +extended introduction is accessible +here.

+

To read in single-cell data generated by the steinbock framework or the IMC +Segmentation Pipeline, the imcRtools package provides the read_steinbock and +read_cpout functions, respectively. By default, the data is read into a +SpatialExperiment object; however, data can be read in as a +SingleCellExperiment object by setting return_as = "sce". All functions +presented in this book are applicable to both data containers.

+
+

5.1.1 steinbock generated data

+

The downloaded example data (Section 4.4) processed with the steinbock framework can be read in with the read_steinbock function provided by imcRtools. For more information, please refer to +?read_steinbock.

+
spe <- read_steinbock("data/steinbock/")
+spe
+
## class: SpatialExperiment 
+## dim: 40 47859 
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(12): channel name ... Final.Concentration...Dilution
+##   uL.to.add
+## colnames: NULL
+## colData names(8): sample_id ObjectNumber ... width_px height_px
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+## spatialCoords names(2) : Pos_X Pos_Y
+## imgData names(1): sample_id
+

By default, single-cell data is read in as a SpatialExperiment object. +The summarized pixel intensities per channel and cell (here mean intensity) are +stored in the counts slot. Columns represent cells and rows represent channels.

+
counts(spe)[1:5,1:5]
+
##                [,1]       [,2]      [,3]      [,4]      [,5]
+## MPO       0.5751064  0.4166667 0.4975494  0.890154 0.1818182
+## HistoneH3 3.1273082 11.3597883 2.3841440  7.712961 1.4512715
+## SMA       0.2600939  1.6720383 0.1535190  1.193948 0.2986703
+## CD16      2.0347747  2.5880536 2.2943074 15.629083 0.6084220
+## CD38      0.2530137  0.6826669 1.1902979  2.126060 0.2917793
+

Metadata associated with individual cells are stored in the colData slot. After +initial image processing, these metadata include the numeric identifier (ObjectNumber), +the area, and morphological features of each cell. In addition, sample_id stores +the name of the image from which each cell was extracted, and the width and height of the +corresponding images are stored.

+
head(colData(spe))
+
## DataFrame with 6 rows and 8 columns
+##      sample_id ObjectNumber      area axis_major_length axis_minor_length
+##    <character>    <numeric> <numeric>         <numeric>         <numeric>
+## 1 Patient1_001            1        12           7.40623           1.89529
+## 2 Patient1_001            2        24          16.48004           1.96284
+## 3 Patient1_001            3        17           9.85085           1.98582
+## 4 Patient1_001            4        24           8.08290           3.91578
+## 5 Patient1_001            5        22           8.79367           3.11653
+## 6 Patient1_001            6        25           9.17436           3.46929
+##   eccentricity  width_px height_px
+##      <numeric> <numeric> <numeric>
+## 1     0.966702       600       600
+## 2     0.992882       600       600
+## 3     0.979470       600       600
+## 4     0.874818       600       600
+## 5     0.935091       600       600
+## 6     0.925744       600       600
+

The main difference between the SpatialExperiment and the +SingleCellExperiment data container is the way spatial +locations of all cells are stored. For the SingleCellExperiment container, the +locations are stored in the colData slot while the SpatialExperiment +container stores them in the spatialCoords slot:

+
head(spatialCoords(spe))
+
##      Pos_X     Pos_Y
+## 1 468.5833 0.4166667
+## 2 515.8333 0.4166667
+## 3 587.2353 0.4705882
+## 4 192.2500 1.2500000
+## 5 231.7727 0.9090909
+## 6 270.1600 1.0400000
+

The spatial object graphs generated by steinbock (see Section +3.3) are read into a colPair slot with the name +neighborhood of the SpatialExperiment (or SingleCellExperiment) object. +Cell-cell interactions (cells in close spatial proximity) are represented as an +“edge list” (stored as a SelfHits object). Here, the left side represents the +column indices of the SpatialExperiment object of the “from” cells and the +right side represents the column indices of the “to” cells. For visualization of +the spatial object graphs, please refer to Section 12.2.

+
colPair(spe, "neighborhood")
+
## SelfHits object with 257116 hits and 0 metadata columns:
+##                 from        to
+##            <integer> <integer>
+##        [1]         1        27
+##        [2]         1        55
+##        [3]         2        10
+##        [4]         2        44
+##        [5]         2        81
+##        ...       ...       ...
+##   [257112]     47858     47836
+##   [257113]     47859     47792
+##   [257114]     47859     47819
+##   [257115]     47859     47828
+##   [257116]     47859     47854
+##   -------
+##   nnode: 47859
+
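To make the edge-list representation more concrete, the following base R sketch builds a small, made-up edge list and summarizes it per cell. The `from` and `to` vectors and the number of cells are hypothetical and not taken from the example data; for a real object, the corresponding indices can be extracted from the SelfHits object with the `from()` and `to()` accessors of S4Vectors.

```r
# Toy edge list for 5 cells; each interaction is stored in both directions
from <- c(1L, 1L, 2L, 3L, 3L, 4L)
to   <- c(2L, 3L, 1L, 1L, 4L, 3L)
n_cells <- 5L

# Number of neighbors per cell (counted on the "from" side); cell 5 has none
n_neighbors <- tabulate(from, nbins = n_cells)

# Neighbors of each cell, as a list indexed by column position
neighbor_list <- split(to, factor(from, levels = seq_len(n_cells)))
```

Because the entries are column indices, the edge list is only valid as long as the order of the cells in the object does not change independently of it.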

Finally, metadata regarding the channels are stored in the rowData slot. This +information is extracted from the panel.csv file.

+

Channels have the same order as the rows in the panel.csv file for which the +keep column is set to 1, and match the order of channels in the multi-channel +images (see Section 5.3). For the example data, channels are +ordered by isotope mass.

+
head(rowData(spe))
+
## DataFrame with 6 rows and 12 columns
+##               channel        name      keep   ilastik  deepcell  cellpose
+##           <character> <character> <numeric> <numeric> <numeric> <logical>
+## MPO               Y89         MPO         1        NA        NA        NA
+## HistoneH3       In113   HistoneH3         1         1         1        NA
+## SMA             In115         SMA         1        NA        NA        NA
+## CD16            Pr141        CD16         1        NA        NA        NA
+## CD38            Nd142        CD38         1        NA        NA        NA
+## HLADR           Nd143       HLADR         1        NA        NA        NA
+##           Tube.Number              Target Antibody.Clone Stock.Concentration
+##             <numeric>         <character>    <character>           <numeric>
+## MPO              2101 Myeloperoxidase MPO Polyclonal MPO                 500
+## HistoneH3        2113          Histone H3           D1H2                 500
+## SMA              1914                 SMA            1A4                 500
+## CD16             2079                CD16       EPR16784                 500
+## CD38             2095                CD38        EPR4106                 500
+## HLADR            2087              HLA-DR        TAL 1B5                 500
+##           Final.Concentration...Dilution   uL.to.add
+##                              <character> <character>
+## MPO                              4 ug/mL         0.8
+## HistoneH3                        1 ug/mL         0.2
+## SMA                           0.25 ug/mL        0.05
+## CD16                             5 ug/mL           1
+## CD38                           2.5 ug/mL         0.5
+## HLADR                            1 ug/mL         0.2
+
+
+

5.1.2 IMC Segmentation Pipeline generated data

+

The IMC Segmentation Pipeline offers an +alternative approach to multiplexed image processing and segmentation. The +default pipeline is also available via steinbock. The IMC Segmentation +Pipeline is based on Ilastik pixel classification +and image segmentation using CellProfiler. We recommend +becoming familiar with the pipeline as it allows flexible extension to more +complicated image analysis and segmentation tasks. For standard image analysis +and segmentation, steinbock is the preferred choice. Please refer to +the documentation +to get an overview of the pipeline.

+

All relevant output +storing single-cell data is contained in the cpout folder. +For reading in the single-cell measurement, the imcRtools package offers the +read_cpout function:

+
spe2 <- read_cpout("data/ImcSegmentationPipeline/analysis/cpout/")
+rownames(spe2) <- rowData(spe2)$Clean_Target
+spe2
+
## class: SpatialExperiment 
+## dim: 40 43796 
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(11): Tube.Number Metal.Tag ... ilastik deepcell
+## colnames: NULL
+## colData names(12): sample_id ObjectNumber ... Metadata_acid
+##   Metadata_description
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+## spatialCoords names(2) : Pos_X Pos_Y
+## imgData names(1): sample_id
+

Similar to the steinbock output, cell morphological features and image level +metadata are stored in the colData(spe2) slot, the interaction information +is contained in colPair(spe2, type = "neighborhood") and the mean intensity +per channel and cell is stored in counts(spe2).

+
+
+

5.1.3 Reading custom files

+

When not using steinbock or the ImcSegmentationPipeline, the single-cell +information has to be read in from custom files. We now demonstrate how +to generate a SpatialExperiment object from single-cell data contained +in individual files. As an example, we use files generated by CellProfiler +as part of the ImcSegmentationPipeline.

+

First we will read in the single-cell features stored in a CSV file:

+
library(readr)
+
+cur_features <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/cell.csv")
+
+dim(cur_features)
+
## [1] 43796   941
+
head(colnames(cur_features))
+
## [1] "ImageNumber"                    "ObjectNumber"                  
+## [3] "AreaShape_Area"                 "AreaShape_BoundingBoxArea"     
+## [5] "AreaShape_BoundingBoxMaximum_X" "AreaShape_BoundingBoxMaximum_Y"
+

This file contains a large number of single-cell features including the cell +identifier (ObjectNumber), the image identifier (ImageNumber), morphological +features (AreaShape_*), the cells’ locations (Location_Center_*) and the +mean pixel intensity per cell and per channel (Intensity_MeanIntensity_FullStack_*).

+

Now, we split the features into intensity features, cell-specific metadata and +the physical location of the cells:

+
counts <- cur_features[,grepl("Intensity_MeanIntensity_FullStack", 
+                                  colnames(cur_features))]
+
+meta <- cur_features[,c("ImageNumber", "ObjectNumber", "AreaShape_Area",
+                            "AreaShape_Eccentricity", "AreaShape_MeanRadius")]
+
+coords <- cur_features[,c("Location_Center_X", "Location_Center_Y")]
+

CellProfiler writes out the mean pixel intensities after scaling them +by a scaling factor that depends on the bit encoding of the images. The images to which +the IMC Segmentation Pipeline was applied were saved with 16-bit encoding. +This means that, for the example data, the mean pixel intensities need to +be multiplied by a factor of 2 ^ 16 - 1 = 65535.

+
counts <- counts * 65535
+
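As a small illustration of where the factor comes from, the sketch below derives it from the bit depth and applies it to a few made-up values. The `scaled_means` vector is hypothetical and only stands in for intensities exported by CellProfiler.

```r
# For 16-bit images, the largest representable integer value is 2^16 - 1
bit_depth <- 16
scale_factor <- 2^bit_depth - 1

# Hypothetical mean intensities on the rescaled [0, 1] range
scaled_means <- c(0, 1.5259e-05, 4.7723e-05, 1)

# Multiply by the scale factor to return to the original intensity range
raw_means <- scaled_means * scale_factor
```

For images saved with a different encoding, `bit_depth` would need to be adjusted accordingly.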

In addition, CellProfiler does not order the channels numerically but rather +as characters: 1, 10, 2, 3, ... rather than 1, 2, 3, .... Therefore we +will need to reorder the channels.

+
library(stringr)
+cur_ch <- str_split(colnames(counts), "_", simplify = TRUE)[,4]
+cur_ch <- sub("c", "", cur_ch)
+
+counts <- counts[,order(as.numeric(cur_ch))]
+
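The difference between character and numeric ordering can be reproduced with a few lines of base R. The column names below are hypothetical and merely mimic the CellProfiler naming scheme; `sub()` is used here in place of the stringr call above.

```r
# Hypothetical CellProfiler-style column names for 12 channels
col_names <- paste0("Intensity_MeanIntensity_FullStack_c", 1:12)

# Character sorting places "c10" before "c2"
char_sorted <- sort(col_names, method = "radix")

# Extract the channel number and order numerically instead
ch_num <- as.numeric(sub(".*_c", "", char_sorted))
num_sorted <- char_sorted[order(ch_num)]
```

After reordering, `num_sorted` follows the acquisition order `c1, c2, ..., c12` again.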

From these features we can now construct the SpatialExperiment object.

+
spe3 <- SpatialExperiment(assays = list(counts = t(counts)),
+                          colData = meta, 
+                          sample_id = as.character(meta$ImageNumber),
+                          spatialCoords = as.matrix(coords))
+

Next, we can store the spatial cell graph generated by CellProfiler in the +colPairs slot of the object. Spatial cell graphs are usually stored as an edge +list in the form of a CSV file. The colPairs slot requires a SelfHits entry +storing an edge list where numeric entries represent the index of the from and +to cell in the SpatialExperiment object. To generate such an edge list, we +need to match the cell IDs contained in the CSV against the cell IDs in the +SpatialExperiment object.

+
cur_pairs <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/Object relationships.csv")
+
+cur_from <- paste(cur_pairs$`First Image Number`, cur_pairs$`First Object Number`)
+cur_to <- paste(cur_pairs$`Second Image Number`, cur_pairs$`Second Object Number`)
+
+edgelist <- SelfHits(from = match(cur_from, 
+                                  paste(spe3$ImageNumber, spe3$ObjectNumber)),
+                     to = match(cur_to, 
+                                  paste(spe3$ImageNumber, spe3$ObjectNumber)),
+                     nnode = ncol(spe3))
+
+colPair(spe3, "neighborhood") <- edgelist
+
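The matching step can be checked on a toy example. In the following sketch, the cell identifiers and the relationships table are made up, and the column names of `pairs` are hypothetical simplifications of the CellProfiler output.

```r
# Hypothetical cells in object order, identified by "image object"
cell_ids <- paste(c(1, 1, 1, 2, 2), c(1, 2, 3, 1, 2))

# Hypothetical relationships table as it might be read from a CSV file
pairs <- data.frame(first_image   = c(1, 1, 2),
                    first_object  = c(1, 2, 1),
                    second_image  = c(1, 1, 2),
                    second_object = c(2, 3, 2))

# Translate (image, object) identifiers into column indices
from_idx <- match(paste(pairs$first_image, pairs$first_object), cell_ids)
to_idx   <- match(paste(pairs$second_image, pairs$second_object), cell_ids)

# Identifiers without a match would yield NA and should be caught early
stopifnot(!anyNA(from_idx), !anyNA(to_idx))
```

Checking for `NA` values before constructing the edge list helps to detect cells that are listed in the relationships file but absent from the object.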

For further downstream analysis, we will use the steinbock results.

+
+
+
+

5.2 Single-cell processing

+

After reading in the single-cell data, a few further processing steps need to be +taken.

+

Add additional metadata

+

We can set the colnames of the object to generate unique identifiers per cell:

+
colnames(spe) <- paste0(spe$sample_id, "_", spe$ObjectNumber)
+

It is also often the case that sample-specific metadata are available externally. +For the current data, we need to link the cancer type (also referred to as “Indication”) +to each sample. These metadata are available as an external CSV file:

+
library(tidyverse)
+
+# Read patient metadata
+meta <- read_csv("data/sample_metadata.csv")
+
+# Extract patient id and ROI id from sample name
+spe$patient_id <- str_extract(spe$sample_id, "Patient[1-4]")
+spe$ROI <- str_extract(spe$sample_id, "00[1-8]")
+
+# Store cancer type in SPE object
+spe$indication <- meta$Indication[match(spe$patient_id, meta$`Sample ID`)]
+
+unique(spe$patient_id)
+
## [1] "Patient1" "Patient2" "Patient3" "Patient4"
+
unique(spe$ROI)
+
## [1] "001" "002" "003" "004" "005" "006" "007" "008"
+
unique(spe$indication)
+
## [1] "SCCHN" "BCC"   "NSCLC" "CRC"
+
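The lookup pattern used above (extract an identifier, then `match()` it against a metadata table) can be sketched in base R as follows. The sample names and the `meta_tbl` table are hypothetical, and `regmatches()` is used here as a base R stand-in for `str_extract()`.

```r
# Hypothetical sample names and patient-level metadata table
sample_id <- c("Patient1_001", "Patient1_002", "Patient3_001")
meta_tbl <- data.frame(sample = c("Patient1", "Patient2", "Patient3"),
                       indication = c("SCCHN", "BCC", "NSCLC"))

# Extract the patient identifier from each sample name
patient_id <- regmatches(sample_id, regexpr("Patient[0-9]+", sample_id))

# match() returns, for each entry, the row index in the metadata table
indication <- meta_tbl$indication[match(patient_id, meta_tbl$sample)]

# Guard against samples that are missing from the metadata table
stopifnot(length(patient_id) == length(sample_id), !anyNA(indication))
```

Note that `regmatches()` silently drops entries without a match, which is why the length check is included.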

The selected patients were diagnosed with different cancer types:

+
  • SCCHN - head and neck cancer
  • BCC - breast cancer
  • NSCLC - lung cancer
  • CRC - colorectal cancer
+

Transform counts

+

The distribution of expression counts across cells is often observed to be +right-skewed, meaning that many cells display low counts and few +cells have high counts. To avoid analysis biases from these high-expressing +cells, the expression counts are commonly transformed or clipped.

+

Here, we perform counts transformation using an inverse hyperbolic sine +function. This transformation is commonly applied to flow cytometry +data. +The cofactor here defines the expression range on which no scaling is +performed. While the cofactor for CyTOF data is often set to 5, IMC data +usually display much lower counts. We therefore apply a cofactor of 1.

+

However, other transformations such as log(counts(spe) + 0.01) should be +tested when analysing IMC data.

+
library(dittoSeq)
+dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "counts") +
+    ggtitle("CD3 - before transformation")
+

+
assay(spe, "exprs") <- asinh(counts(spe)/1)
+dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "exprs") +
+    ggtitle("CD3 - after transformation")
+
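The effect of the cofactor can be explored without any plotting. The following sketch applies the transformation to a handful of made-up intensities; the helper `asinh_transform()` is hypothetical and simply wraps the expression used above.

```r
# Hypothetical mean intensities spanning several orders of magnitude
x <- c(0, 0.1, 1, 10, 100)

# Inverse hyperbolic sine transformation with a given cofactor
asinh_transform <- function(x, cofactor = 1) asinh(x / cofactor)

exprs_cof1 <- asinh_transform(x, cofactor = 1)
exprs_cof5 <- asinh_transform(x, cofactor = 5)

# asinh(z) is close to z near zero and close to log(2 * z) for large z
near_zero_value <- asinh(0.1)
large_value_gap <- asinh(100) - log(2 * 100)
```

A larger cofactor widens the range of values that are treated approximately linearly, which is why a smaller cofactor suits data with low counts.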

+

Define interesting channels

+

For downstream analysis such as visualization, dimensionality reduction and +clustering, only a subset of markers should be used. As a convenience, we can +store an additional entry in the rowData slot that specifies the markers of +interest. Here, we deselect the nuclear markers, which were primarily used for +cell segmentation, and keep all other biological targets. However, more informed +marker selection should be performed to exclude lowly expressed markers or +markers with low signal-to-noise ratio.

+
rowData(spe)$use_channel <- !grepl("DNA|Histone", rownames(spe))
+
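The selection logic can be verified on a short, hypothetical vector of channel names before applying it to the full object:

```r
# Hypothetical subset of channel names
markers <- c("MPO", "HistoneH3", "CD3", "Ecad", "DNA1", "DNA2")

# TRUE for channels to keep; names matching "DNA" or "Histone" are deselected
use_channel <- !grepl("DNA|Histone", markers)

# Channels retained for downstream analysis
selected <- markers[use_channel]
```

Because `grepl()` performs partial matching, any channel whose name contains one of the patterns would be deselected as well, so the result is worth inspecting.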

Define color schemes

+

We will define color schemes for different metadata entries of the data and +conveniently store them in the metadata slot of the SpatialExperiment which +will be helpful for downstream data visualizations. We will use colors from the +RColorBrewer and dittoSeq packages but any other coloring package will +suffice.

+
library(RColorBrewer)
+color_vectors <- list()
+
+ROI <- setNames(brewer.pal(length(unique(spe$ROI)), name = "BrBG"), 
+                unique(spe$ROI))
+patient_id <- setNames(brewer.pal(length(unique(spe$patient_id)), name = "Set1"), 
+                unique(spe$patient_id))
+sample_id <- setNames(c(brewer.pal(6, "YlOrRd")[3:5],
+                        brewer.pal(6, "PuBu")[3:6],
+                        brewer.pal(6, "YlGn")[3:5],
+                        brewer.pal(6, "BuPu")[3:6]),
+                unique(spe$sample_id))
+indication <- setNames(brewer.pal(length(unique(spe$indication)), name = "Set2"), 
+                unique(spe$indication))
+
+color_vectors$ROI <- ROI
+color_vectors$patient_id <- patient_id
+color_vectors$sample_id <- sample_id
+color_vectors$indication <- indication
+
+metadata(spe)$color_vectors <- color_vectors
+
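The structure of such a color list can be sketched with base R alone. In the example below, the metadata levels are hypothetical and `hcl.colors()` from grDevices stands in for `brewer.pal()`; it is not the palette used in this workflow.

```r
# Hypothetical metadata levels
indication_levels <- c("SCCHN", "BCC", "NSCLC", "CRC")

# One color per level, named by the level it represents
indication_cols <- setNames(hcl.colors(length(indication_levels), "Dark 3"),
                            indication_levels)

# Collect all color vectors in a single named list
color_vectors <- list(indication = indication_cols)

# Look up the color of a given level by name
nsclc_col <- color_vectors$indication[["NSCLC"]]
```

Naming each color by its metadata level keeps the mapping stable even if the order of the levels changes in a later plot.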
+
+

5.3 Read in images

+

The cytomapper package allows multi-channel image handling and visualization +within the Bioconductor framework. The most common data format for multi-channel +images or segmentation masks is the TIFF file format, which is used by steinbock +and the IMC segmentation pipeline to save images.

+

Here, we will read in multi-channel images and segmentation masks into a +CytoImageList +data container. It allows storing multiple multi-channel images and requires +matched channels across all images within the object.

+

The loadImages function is used to read in processed multi-channel images and +their corresponding segmentation masks. Of note: the multi-channel images +generated by steinbock are saved as 32-bit images while the segmentation masks +are saved as 16-bit images. To correctly scale pixel values of the segmentation +masks when reading them in, we will need to set as.is = TRUE.

+
images <- loadImages("data/steinbock/img/")
+
## All files in the provided location will be read in.
+
masks <- loadImages("data/steinbock/masks_deepcell/", as.is = TRUE)
+
## All files in the provided location will be read in.
+

In the case of multi-channel images, it is beneficial to set the channelNames +for easy visualization. Using the steinbock framework, the channel order of +the single-cell data matches the channel order of the multi-channel images. +However, it is recommended to make sure that the channel order is identical +between the single-cell data and the images.

+
channelNames(images) <- rownames(spe)
+images
+
## CytoImageList containing 14 image(s)
+## names(14): Patient1_001 Patient1_002 Patient1_003 Patient2_001 Patient2_002 Patient2_003 Patient2_004 Patient3_001 Patient3_002 Patient3_003 Patient4_005 Patient4_006 Patient4_007 Patient4_008 
+## Each image contains 40 channel(s)
+## channelNames(40): MPO HistoneH3 SMA CD16 CD38 HLADR CD27 CD15 CD45RA CD163 B2M CD20 CD68 Ido1 CD3 LAG3 / LAG33 CD11c PD1 PDGFRb CD7 GrzB PDL1 TCF7 CD45RO FOXP3 ICOS CD8a CarbonicAnhydrase CD33 Ki67 VISTA CD40 CD4 CD14 Ecad CD303 CD206 cleavedPARP DNA1 DNA2
+

For visualization shown in Section 11 we will need to +add additional metadata to the elementMetadata slot of the CytoImageList +objects. This slot is easily accessible using the mcols function.

+

Here, we will store the matched sample_id, patient_id and indication +information within the elementMetadata slot of the multi-channel images and +segmentation masks objects. It is crucial that the order of the images in +both CytoImageList objects is the same.

+
all.equal(names(images), names(masks))
+
## [1] TRUE
+
# Extract patient id from image name
+patient_id <- str_extract(names(images), "Patient[1-4]")
+
+# Retrieve cancer type per patient from metadata file
+indication <- meta$Indication[match(patient_id, meta$`Sample ID`)] 
+
+# Store patient and image level information in elementMetadata
+mcols(images) <- mcols(masks) <- DataFrame(sample_id = names(images),
+                                           patient_id = patient_id,
+                                           indication = indication)
+
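The requirement that images and masks share the same order can be illustrated with plain character vectors. The names below are hypothetical; the sketch only shows how a mismatch in order could be detected and resolved with `match()`.

```r
# Hypothetical image and mask names; the masks are listed in a different order
image_names <- c("Patient1_001", "Patient1_002", "Patient2_001")
mask_names  <- c("Patient1_002", "Patient1_001", "Patient2_001")

# Same set of names, but not the same order
same_set   <- setequal(image_names, mask_names)
same_order <- identical(image_names, mask_names)

# Reorder the masks so that they follow the images
mask_order <- match(image_names, mask_names)
mask_names_sorted <- mask_names[mask_order]
```

A check of this kind is most useful directly before metadata are assigned to both objects at once.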
+
+

5.4 Generate single-cell data from images

+

An alternative way of generating a SingleCellExperiment object directly +from the multi-channel images and segmentation masks is supported by the +measureObjects +function of the cytomapper package. For each cell present in the masks +object, the function computes the mean pixel intensity per channel as well as +morphological features (area, radius, major axis length, eccentricity) and the +location of cells:

+
cytomapper_sce <- measureObjects(masks, image = images, img_id = "sample_id")
+
+cytomapper_sce
+
## class: SingleCellExperiment 
+## dim: 40 47859 
+## metadata(0):
+## assays(1): counts
+## rownames(40): MPO HistoneH3 ... DNA1 DNA2
+## rowData names(0):
+## colnames: NULL
+## colData names(10): sample_id object_id ... patient_id indication
+## reducedDimNames(0):
+## mainExpName: NULL
+## altExpNames(0):
+
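Conceptually, the per-cell mean intensity is an aggregation of pixel values over the pixels carrying the same identifier in the mask. The following base R sketch shows this idea on a made-up 4 x 4 mask and single-channel image; it is a simplified illustration and not the implementation used by cytomapper.

```r
# Toy segmentation mask: 0 is background, 1 and 2 are cell identifiers
mask <- matrix(c(1, 1, 0, 0,
                 1, 1, 0, 2,
                 0, 0, 2, 2,
                 0, 0, 2, 2), nrow = 4, byrow = TRUE)

# Toy single-channel image with the same dimensions
img <- matrix(c(4, 6, 0, 0,
                2, 8, 0, 3,
                0, 0, 5, 1,
                0, 0, 7, 9), nrow = 4, byrow = TRUE)

# Keep only pixels that belong to a cell
is_cell <- mask > 0

# Mean pixel intensity and area (pixel count) per cell
mean_intensity <- tapply(img[is_cell], mask[is_cell], mean)
area <- tapply(img[is_cell], mask[is_cell], length)
```

For multi-channel images, the same aggregation would be repeated per channel, yielding one row per channel and one column per cell.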
+
+

5.5 Accessing publicly available IMC datasets

+

The imcdatasets +R/Bioconductor package provides a number of publicly available IMC datasets. For +a complete introduction to the package, please refer to the +documentation. +Here, we can read in example data of (Damond et al. 2019) taken from patients diagnosed +with type 1 diabetes. The example here consists of a CytoImageList object of +100 images, a CytoImageList object of 100 segmentation masks and a +SingleCellExperiment object containing 252059 cells. Of note: downloading the +images takes quite some time and uses 8 GB of memory.

+
library(imcdatasets)
+
+pancreasImages <- Damond_2019_Pancreas(data_type = "images")
+pancreasMasks <- Damond_2019_Pancreas(data_type = "masks")
+pancreasSCE <- Damond_2019_Pancreas(data_type = "sce")
+
+
+

5.6 Save objects

+

Finally, the generated data objects can be saved for further downstream +processing and analysis.

+
saveRDS(spe, "data/spe.rds")
+saveRDS(images, "data/images.rds")
+saveRDS(masks, "data/masks.rds")
+
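A quick round trip can confirm that an object is restored unchanged. The sketch below uses a small, hypothetical list in place of the real objects and writes to a temporary file so that existing results are not overwritten.

```r
# Hypothetical object standing in for the processed data
obj <- list(counts = matrix(1:6, nrow = 2), sample_id = c("a", "b", "c"))

# Write to a temporary file
out_file <- tempfile(fileext = ".rds")
saveRDS(obj, out_file)

# Read the object back and confirm that it is unchanged
obj_restored <- readRDS(out_file)
is_identical <- identical(obj, obj_restored)
```

The same pattern applies to the objects saved above, with `readRDS()` being used in later sections to load them again.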
+
+

5.7 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             RColorBrewer_1.1-3         
+##  [3] dittoSeq_1.12.1             lubridate_1.9.3            
+##  [5] forcats_1.0.0               dplyr_1.1.3                
+##  [7] purrr_1.0.2                 tidyr_1.3.0                
+##  [9] tibble_3.2.1                ggplot2_3.4.3              
+## [11] tidyverse_2.0.0             stringr_1.5.0              
+## [13] readr_2.1.4                 cytomapper_1.12.0          
+## [15] EBImage_4.42.0              imcRtools_1.6.5            
+## [17] SpatialExperiment_1.10.0    SingleCellExperiment_1.22.0
+## [19] SummarizedExperiment_1.30.2 Biobase_2.60.0             
+## [21] GenomicRanges_1.52.0        GenomeInfoDb_1.36.3        
+## [23] IRanges_2.34.1              S4Vectors_0.38.2           
+## [25] BiocGenerics_0.46.0         MatrixGenerics_1.12.3      
+## [27] matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] later_1.3.1               bitops_1.0-7             
+##   [3] R.oo_1.25.0               svgPanZoom_0.3.4         
+##   [5] polyclip_1.10-6           lifecycle_1.0.3          
+##   [7] sf_1.0-14                 rprojroot_2.0.3          
+##   [9] edgeR_3.42.4              lattice_0.21-8           
+##  [11] vroom_1.6.3               MASS_7.3-60              
+##  [13] magrittr_2.0.3            limma_3.56.2             
+##  [15] sass_0.4.7                rmarkdown_2.25           
+##  [17] jquerylib_0.1.4           yaml_2.3.7               
+##  [19] httpuv_1.6.11             sp_2.0-0                 
+##  [21] cowplot_1.1.1             DBI_1.1.3                
+##  [23] pkgload_1.3.3             abind_1.4-5              
+##  [25] zlibbioc_1.46.0           R.utils_2.12.2           
+##  [27] ggraph_2.1.0              RCurl_1.98-1.12          
+##  [29] tweenr_2.0.2              GenomeInfoDbData_1.2.10  
+##  [31] ggrepel_0.9.3             RTriangle_1.6-0.12       
+##  [33] terra_1.7-46              pheatmap_1.0.12          
+##  [35] units_0.8-4               dqrng_0.3.1              
+##  [37] svglite_2.1.1             DelayedMatrixStats_1.22.6
+##  [39] codetools_0.2-19          DropletUtils_1.20.0      
+##  [41] DelayedArray_0.26.7       DT_0.29                  
+##  [43] scuttle_1.10.2            ggforce_0.4.1            
+##  [45] tidyselect_1.2.0          raster_3.6-23            
+##  [47] farver_2.1.1              viridis_0.6.4            
+##  [49] jsonlite_1.8.7            BiocNeighbors_1.18.0     
+##  [51] e1071_1.7-13              ellipsis_0.3.2           
+##  [53] tidygraph_1.2.3           ggridges_0.5.4           
+##  [55] systemfonts_1.0.4         tools_4.3.1              
+##  [57] Rcpp_1.0.11               glue_1.6.2               
+##  [59] gridExtra_2.3             xfun_0.40                
+##  [61] HDF5Array_1.28.1          shinydashboard_0.7.2     
+##  [63] withr_2.5.1               fastmap_1.1.1            
+##  [65] rhdf5filters_1.12.1       fansi_1.0.4              
+##  [67] digest_0.6.33             timechange_0.2.0         
+##  [69] R6_2.5.1                  mime_0.12                
+##  [71] colorspace_2.1-0          jpeg_0.1-10              
+##  [73] R.methodsS3_1.8.2         utf8_1.2.3               
+##  [75] generics_0.1.3            data.table_1.14.8        
+##  [77] class_7.3-22              graphlayouts_1.0.1       
+##  [79] htmlwidgets_1.6.2         S4Arrays_1.0.6           
+##  [81] pkgconfig_2.0.3           gtable_0.3.4             
+##  [83] XVector_0.40.0            brio_1.1.3               
+##  [85] htmltools_0.5.6           bookdown_0.35            
+##  [87] fftwtools_0.9-11          scales_1.2.1             
+##  [89] png_0.1-8                 knitr_1.44               
+##  [91] rstudioapi_0.15.0         tzdb_0.4.0               
+##  [93] rjson_0.2.21              proxy_0.4-27             
+##  [95] cachem_1.0.8              rhdf5_2.44.0             
+##  [97] KernSmooth_2.23-21        parallel_4.3.1           
+##  [99] vipor_0.4.5               concaveman_1.1.0         
+## [101] desc_1.4.2                pillar_1.9.0             
+## [103] grid_4.3.1                vctrs_0.6.3              
+## [105] promises_1.2.1            distances_0.1.9          
+## [107] beachmat_2.16.0           xtable_1.8-4             
+## [109] archive_1.1.6             beeswarm_0.4.0           
+## [111] evaluate_0.21             magick_2.8.0             
+## [113] cli_3.6.1                 locfit_1.5-9.8           
+## [115] compiler_4.3.1            rlang_1.1.1              
+## [117] crayon_1.5.2              labeling_0.4.3           
+## [119] classInt_0.4-10           ggbeeswarm_0.7.2         
+## [121] stringi_1.7.12            viridisLite_0.4.2        
+## [123] BiocParallel_1.34.2       nnls_1.5                 
+## [125] munsell_0.5.0             tiff_0.1-11              
+## [127] Matrix_1.6-1.1            hms_1.1.3                
+## [129] sparseMatrixStats_1.12.2  bit64_4.0.5              
+## [131] Rhdf5lib_1.22.1           shiny_1.7.5              
+## [133] igraph_1.5.1              bslib_0.5.1              
+## [135] bit_4.0.5
+
+ +
+
+

References

+
+
+Amezquita, Robert A., Aaron T. L. Lun, Etienne Becht, Vince J. Carey, Lindsay N. Carpp, Ludwig Geistlinger, Federico Marini, et al. 2019. “Orchestrating Single-Cell Analysis with Bioconductor.” Nature Methods 17: 137–45. +
+
+Damond, Nicolas, Stefanie Engler, Vito R. T. Zanotelli, Denis Schapiro, Clive H. Wasserfall, Irina Kusmartseva, Harry S. Nick, et al. 2019. “A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry.” Cell Metabolism 29: 755–768.e5. +
+
+Righelli, Dario, Lukas M Weber, Helena L Crowell, Brenda Pardo, Leonardo Collado-Torres, Shila Ghazanfar, Aaron T L Lun, Stephanie C Hicks, and Davide Risso. 2022. “SpatialExperiment: Infrastructure for Spatially-Resolved Transcriptomics Data in R Using Bioconductor.” Bioinformatics 38: 3128–31. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/reference-keys.txt b/reference-keys.txt new file mode 100644 index 00000000..85eabd1b --- /dev/null +++ b/reference-keys.txt @@ -0,0 +1,130 @@ +preamble +disclaimer +feedback-and-contributing +citation +changelog +intro +technical-details-of-imc +metal-conjugated-antobodies-and-staining +data-acquisition +data-format +processing +image-pre-processing-imc-specific +image-segmentation +feature-extraction +data-export +data-import-into-r +prerequisites +obtain-the-code +software-requirements +docker +manual-install +major-package-versions +image-processing +download-data +processed-multiplexed-imaging-data +files-for-spillover-matrix-estimation +gated-cells +sessionInfo +read-data +read-in-single-cell-information +steinbock-generated-data +imc-segmentation-pipeline-generated-data +reading-custom-files +cell-processing +read-images +generate-single-cell-data-from-images +accessing-publicly-available-imc-datasets +save-objects +session-info +spillover-correction +generate-the-spillover-matrix +read-in-the-data +quality-control +pixel_binning +filtering-incorrectly-assigned-pixels +compute-spillover-matrix +single-cell-data-compensation +image-compensation +write-out-compensated-images +save-objects-1 +session-info-1 +image-and-cell-level-quality-control +read-in-the-data-1 +seg-quality +image-quality +cell-quality +save-objects-2 +session-info-2 +batch-effects +fastmnn-correction +perform-sample-correction +quality-control-of-correction-results +visualization +harmony-correction +visualization-1 +seurat-correction +visualization-2 +save-objects-3 +session-info-3 +cell-phenotyping +load-data +clustering +rphenograph +snn-graph +self-organizing-maps +compare-between-clustering-approaches +clustering-notes +classification +manual-labeling-of-cells +define-color-vectors +read-in-and-consolidate-labeled-data +train-classifier +classifier-performance +classification-of-new-data +session-info-4 +single-cell-visualization +load-data-1 
+cell-type-level +dimensionality-reduction-visualization +heatmap-visualization +violin-plot-visualization +scatter-plot-visualization +barplot-visualization +catalyst-based-visualization +pseudobulk-level-mds-plot +reduced-dimension-plot-on-clr-of-proportions +pseudobulk-expression-boxplot +sample-level +dimensionality-reduction-visualization-1 +heatmap-visualization-1 +barplot-visualization-1 +catalyst-based-visualization-1 +pseudobulk-level-mds-plot-1 +reduced-dimension-plot-on-clr-of-proportions-1 +rich-example +publication-ready-complexheatmap +interactive-visualization +session-info-5 +image-visualization +pixel-visualization +adjusting-colors +image-normalization +mask-visualization +visualzing-metadata +visualizating-expression +outline-cells +adjusting-plot-annotations +displaying-individual-images +saving-and-returning-images +interactive-image-visualization +session-info-6 +performing-spatial-analysis +spatial-interaction-graphs +spatial-viz +spatial-community-analysis +cellular-neighborhood-analysis +spatial-context-analysis +patch-detection +interaction-analysis +session-info-7 diff --git a/references.html b/references.html new file mode 100644 index 00000000..fd2fc725 --- /dev/null +++ b/references.html @@ -0,0 +1,563 @@ + + + + + + + References | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

References

+ +
+
+Ali, Raza, Hartland W. Jackson, Vito R. T. Zanotelli, Esther Danenberg, Jana R. Fischer, Helen Bardwell, Elena Provenzano, et al. 2020. “Imaging Mass Cytometry and Multiplatform Genomics Define the Phenogenomic Landscape of Breast Cancer.” Nature Cancer 1: 163–75. +
+
+Amezquita, Robert A., Aaron T. L. Lun, Etienne Becht, Vince J. Carey, Lindsay N. Carpp, Ludwig Geistlinger, Federico Marini, et al. 2019. “Orchestrating Single-Cell Analysis with Bioconductor.” Nature Methods 17: 137–45. +
+
+Angelo, Michael, Sean C. Bendall, Rachel Finck, Matthew B. Hale, Chuck Hitzman, Alexander D. Borowsky, Richard M. Levenson, et al. 2014. “Multiplexed Ion Beam Imaging of Human Breast Tumors.” Nature Medicine 20 (4): 436–42. +
+
+Bai, Yunhao, Bokai Zhu, Xavier Rovira-Clave, Han Chen, Maxim Markovic, Chi Ngai Chan, Tung-Hung Su, et al. 2021. “Adjacent Cell Marker Lateral Spillover Compensation and Reinforcement for Multiplexed Images.” Frontiers in Immunology 12. +
+
+Bendall, Sean C., Erin F. Simonds, Peng Qiu, El Ad D. Amir, Peter O. Krutzik, Rachel Finck, Robert V. Bruggner, et al. 2011. “Single-Cell Mass Cytometry of Differential Immune and Drug Responses Across a Human Hematopoietic Continuum.” Science 332: 687–96. +
+
+Bhate, Salil S., Graham L. Barlow, Christian M. Schürch, and Garry P. Nolan. 2022. “Tissue Schematics Map the Specialization of Immune Tissue Motifs and Their Appropriation by Tumors.” Cell Systems 13 (2): 109–30. +
+
+Chevrier, Stéphane, Helena L. Crowell, Vito R. T. Zanotelli, Stefanie Engler, Mark D. Robinson, and Bernd Bodenmiller. 2017. “Compensation of Signal Spillover in Suspension and Imaging Mass Cytometry.” Cell Systems 6: 612–20. +
+
+Damond, Nicolas, Stefanie Engler, Vito R. T. Zanotelli, Denis Schapiro, Clive H. Wasserfall, Irina Kusmartseva, Harry S. Nick, et al. 2019. “A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry.” Cell Metabolism 29: 755–768.e5. +
+
+Eling, Nils, Nicolas Damond, Tobias Hoch, and Bernd Bodenmiller. 2020. “Cytomapper: An R/Bioconductor Package for Visualization of Highly Multiplexed Imaging Data.” Bioinformatics 36 (24): 5706–8. +
+
+Ferrian, Selena, Candace C. Liu, Erin F. McCaffrey, Rashmi Kumar, Theodore S. Nowicki, David W. Dawson, Alex Baranski, et al. 2021. “Multiplexed Imaging Reveals an IFN-\(\gamma\)-Driven Inflammatory State in Nivolumab-Associated Gastritis.” Cell Reports Medicine 2: 100419. +
+
+Giesen, Charlotte, Hao A. O. Wang, Denis Schapiro, Nevena Zivanovic, Andrea Jacobs, Bodo Hattendorf, Peter J. Schüffler, et al. 2014. “Highly Multiplexed Imaging of Tumor Tissues with Subcellular Resolution by Mass Cytometry.” Nature Methods 11 (4): 417–22. +
+
+Goltsev, Yury, Nikolay Samusik, Julia Kennedy-Darling, Salil Bhate, Matthew Hale, Gustavo Vazquez, Sarah Black, and Garry P. Nolan. 2018. “Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging.” Cell 174: 968–81. +
+
+Greenwald, Noah F., Geneva Miller, Erick Moen, Alex Kong, Adam Kagel, Thomas Dougherty, Christine Camacho Fullaway, et al. 2021. “Whole-Cell Segmentation of Tissue Images with Human-Level Performance Using Large-Scale Data Annotation and Deep Learning.” Nature Biotechnology 40: 555–65. +
+
+Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics 32: 2847–49. +
+
+Gut, Gabriele, Markus D Herrmann, and Lucas Pelkmans. 2018. “Multiplexed Protein Maps Link Subcellular Organization to Cellular States.” Science 361: 1–13. +
+
+Haghverdi, Laleh, Aaron T. L. Lun, Michael D. Morgan, and John C. Marioni. 2018. “Batch Effects in Single-Cell RNA-Sequencing Data Are Corrected by Matching Mutual Nearest Neighbors.” Nature Biotechnology 36: 421–27. +
+
+Hoch, Tobias, Daniel Schulz, Nils Eling, Julia Martínez Gómez, Mitchell P. Levesque, and Bernd Bodenmiller. 2022. “Multiplexed Imaging Mass Cytometry of the Chemokine Milieus in Melanoma Characterizes Features of the Response to Immunotherapy.” Science Immunology 7 (70): eabk1692. +
+
+Ijsselsteijn, Marieke E., Ruud van der Breggen, Arantza F. Sarasqueta, Frits Koning, and Noel F. C. C. de Miranda. 2019. “A 40-Marker Panel for High Dimensional Characterization of Cancer Immune Microenvironments by Imaging Mass Cytometry.” Frontiers in Immunology 10. +
+
+Jackson, Hartland W., Jana R. Fischer, Vito R. T. Zanotelli, H. Raza Ali, Robert Mechera, Savas D. Soysal, Holger Moch, et al. 2020. “The Single-Cell Pathology Landscape of Breast Cancer.” Nature 578: 615–20. +
+
+Jiang, Sizun, Chi Ngai Chan, Xavier Rovira-Clavé, Han Chen, Yunhao Bai, Bokai Zhu, Erin McCaffrey, et al. 2022. “Combined Protein and Nucleic Acid Imaging Reveals Virus-Dependent B Cell and Macrophage Immunosuppression of Tissue Microenvironments.” Immunity 55: 1118–1134.e8. +
+
+Korsunsky, Ilya, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael Brenner, Po-ru Loh, and Soumya Raychaudhuri. 2019. “Fast, Sensitive and Accurate Integration of Single-Cell Data with Harmony.” Nature Methods 16: 1289–96. +
+
+Levine, Jacob H., Erin F. Simonds, Sean C. Bendall, Kara L. Davis, El-ad D. Amir, Michelle D. Tadmor, Oren Litvin, et al. 2015. “Data-Driven Phenotypic Dissection of AML Reveals Progenitor-Like Cells That Correlate with Prognosis.” Cell 162: 184–97. +
+
+Lin, Jia-Ren, Benjamin Izar, Shu Wang, Clarence Yapp, Shaolin Mei, Parin M. Shah, Sandro Santagata, and Peter K. Sorger. 2018. “Highly Multiplexed Immunofluorescence Imaging of Human Tissues and Tumors Using t-CyCIF and Conventional Optical Microscopes.” eLife 7: 1–46. +
+
+Meyer, Lasse, Nils Eling, and Bernd Bodenmiller. 2023. “Cytoviewer: An R/Bioconductor Package for Interactive Visualization and Exploration of Highly Multiplexed Imaging Data.” +
+
+Mitamura, Yasutaka, Daniel Schulz, Saskia Oro, Nick Li, Isabel Kolm, Claudia Lang, Reihane Ziadlou, et al. 2021. “Cutaneous and Systemic Hyperinflammation Drives Maculopapular Drug Exanthema in Severely Ill COVID-19 Patients.” Allergy 77: 595–608. +
+
+Patrick, Ellis, Nicolas P. Canete, Sourish S. Iyengar, Andrew N. Harman, Greg T. Sutherland, and Pengyi Yang. 2023. “Spatial Analysis for Highly Multiplexed Imaging Data to Identify Tissue Microenvironments.” Cytometry Part A. +
+
+Rendeiro, André F., Hiranmayi Ravichandran, Yaron Bram, Vasuretha Chandar, Junbum Kim, Cem Meydan, Jiwoon Park, et al. 2021. “The Spatial Landscape of Lung Pathology During COVID-19 Progression.” Nature 593: 564–69. +
+
+Righelli, Dario, Lukas M Weber, Helena L Crowell, Brenda Pardo, Leonardo Collado-Torres, Shila Ghazanfar, Aaron T L Lun, Stephanie C Hicks, and Davide Risso. 2022. “SpatialExperiment: Infrastructure for Spatially-Resolved Transcriptomics Data in R Using Bioconductor.” Bioinformatics 38: 3128–31. +
+
+Saka, Sinem K., Yu Wang, Jocelyn Y. Kishi, Allen Zhu, Yitian Zeng, Wenxin Xie, Koray Kirli, et al. 2019. “Immuno-SABER Enables Highly Multiplexed and Amplified Protein Imaging in Tissues.” Nature Biotechnology 37: 1080–90. +
+
+Schapiro, Denis, Hartland W Jackson, Swetha Raghuraman, Jana R Fischer, Vito RT Zanotelli, Daniel Schulz, Charlotte Giesen, Raúl Catena, Zsuzsanna Varga, and Bernd Bodenmiller. 2017. “histoCAT: Analysis of Cell Phenotypes and Interactions in Multiplex Image Cytometry Data.” Nature Methods 14: 873–876. +
+
+Schulz, Daniel, Vito RT Zanotelli, Jana R Fischer, Denis Schapiro, Stefanie Engler, Xiao-Kang Lun, Hartland W Jackson, and Bernd Bodenmiller. 2018. “Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry.” Cell Systems 6: 25–36.e5. +
+
+Schürch, Christian M, Salil S Bhate, Graham L Barlow, Darci J Phillips, Luca Noti, Inti Zlobec, Pauline Chu, et al. 2020. “Coordinated Cellular Neighborhoods Orchestrate Antitumoral Immunity at the Colorectal Cancer Invasive Front.” Cell 182: 1341–59. +
+
+Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks.” Genome Research 13: 2498–2504. +
+
+Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M. III Mauck, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177: 1888–1902. +
+
+Tietscher, Sandra, Johanna Wagner, Tobias Anzeneder, Claus Langwieder, Martin Rees, Bettina Sobottka, Natalie de Souza, and Bernd Bodenmiller. 2022. “A Comprehensive Single-Cell Map of T Cell Exhaustion-Associated Immune Environments in Human Breast Cancer.” Research Square. +
+
+Virshup, Isaac, Sergei Rybakov, Fabian J. Theis, Philipp Angerer, and F. Alexander Wolf. 2021. “Anndata: Annotated Data.” bioRxiv. +
+
+Weber, Lukas M., and Mark D. Robinson. 2016. “Comparison of Clustering Methods for High-Dimensional Single-Cell Flow and Mass Cytometry Data.” Cytometry Part A 89A: 1084–96. +
+
+Windhager, Jonas, Bernd Bodenmiller, and Nils Eling. 2021. “An End-to-End Workflow for Multiplexed Image Processing and Analysis.” bioRxiv. +
+
+Yu, Lijia, Yue Cao, Jean Y. H. Yang, and Pengyi Yang. 2022. “Benchmarking Clustering Algorithms on Estimating the Number of Cell Types from Single-Cell RNA-Sequencing Data.” Genome Biology 23 (1). https://doi.org/10.1186/s13059-022-02622-0. +
+
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/search_index.json b/search_index.json new file mode 100644 index 00000000..a66833f6 --- /dev/null +++ b/search_index.json @@ -0,0 +1 @@ +[["index.html", "Analysis workflow for IMC data 1 IMC Data Analysis Workflow 1.1 Disclaimer 1.2 Feedback and contributing 1.3 Citation 1.4 Changelog", " Analysis workflow for IMC data Authors: Nils Eling 1,2,*, Vito Zanotelli 1,2, Michelle Daniel 1,2, Daniel Schulz 1,2, Jonas Windhager 1,2, Lasse Meyer 1,2 Compiled: 2023-10-19 1 IMC Data Analysis Workflow This workflow highlights the use of common R/Bioconductor packages to analyze single-cell data obtained from segmented multi-channel images. We will not perform multi-channel image processing and segmentation in R but rather link to available approaches in Section 3. While we use imaging mass cytometry (IMC) data as an example, the concepts presented here can be applied to images obtained by other highly-multiplexed imaging technologies (e.g. CODEX, MIBI, mIF, etc.). We will give an introduction to IMC in Section 2 and highlight strategies to extract single-cell data from multi-channel images in Section 3. Reproducible code written in R is available from Section 4 onwards and the workflow can be largely divided into the following parts: Preprocessing (reading in the data, spillover correction) Image- and cell-level quality control, low-dimensional visualization Sample/batch effect correction Cell phenotyping via clustering or classification Single-cell and image visualization Spatial analyses 1.1 Disclaimer Multi-channel image and spatial, single-cell analysis is complex and we highlight an example workflow here. However, this workflow is not complete and does not cover all possible aspects of exploratory data analysis. Instead, we demonstrate this workflow as a solid basis that supports other aspects of data analysis. 
It offers interoperability with other packages for single-cell and spatial analysis and the user will need to become familiar with the general framework to efficiently analyse data obtained from multiplexed imaging technologies. 1.2 Feedback and contributing We provide the workflow as an open-source resource. It does not mean that this workflow is tested on all possible datasets or biological questions. If you notice an issue or missing information, please report an issue here. We also welcome contributions in form of pull requests or feature requests in form of issues. Have a look at the source code at: https://github.com/BodenmillerGroup/IMCDataAnalysis 1.3 Citation The workflow has been published in https://www.nature.com/articles/s41596-023-00881-0 which you can cite as follows: Windhager, J., Zanotelli, V.R.T., Schulz, D. et al. An end-to-end workflow for multiplexed image processing and analysis. Nat Protoc (2023). 1.4 Changelog Version 1.0.0 [2023-06-30] First stable release of the workflow Version 1.0.1 [2023-10-19] Added seed before predict call after training a classifier * nils.eling@uzh.ch 1: Department for Quantitative Biomedicine, University of Zurich 2: Institute for Molecular Health Sciences, ETH Zurich "],["intro.html", "2 Introduction 2.1 Technical details of IMC 2.2 IMC data format", " 2 Introduction Highly multiplexed imaging (HMI) enables the simultaneous detection of dozens of biological molecules (e.g., proteins, transcripts; also referred to as “markers”) in tissues. Recently established multiplexed tissue imaging technologies rely on cyclic staining with fluorescently-tagged antibodies (Lin et al. 2018; Gut, Herrmann, and Pelkmans 2018), or the use of oligonucleotide-tagged (Goltsev et al. 2018; Saka et al. 2019) or metal-tagged (Giesen et al. 2014; Angelo et al. 2014) antibodies, among others. The key strength of these technologies is that they allow in-depth analysis of single cells within their spatial tissue context. 
As a result, these methods have enabled analysis of the spatial architecture of the tumor microenvironment (Lin et al. 2018; Jackson et al. 2020; Ali et al. 2020; Schürch et al. 2020), determination of nucleic acid and protein abundances for assessment of spatial co-localization of cell types and chemokines (Hoch et al. 2022) and spatial niches of virus-infected cells (Jiang et al. 2022), and characterization of pathological features during COVID-19 infection (Rendeiro et al. 2021; Mitamura et al. 2021), Type 1 diabetes progression (Damond et al. 2019) and autoimmune disease (Ferrian et al. 2021). Imaging mass cytometry (IMC) utilizes metal-tagged antibodies to detect over 40 proteins and other metal-tagged molecules in biological samples. IMC can be used to perform highly multiplexed imaging and is particularly suited to profiling selected areas of tissues across many samples. Overview of imaging mass cytometry data acquisition. Taken from (Giesen et al. 2014) IMC was first published in 2014 (Giesen et al. 2014) and has been commercialized by Standard BioTools™ to be distributed as the Hyperion Imaging System™ (documentation is available here). Similar to other HMI technologies such as MIBI (Angelo et al. 2014), CyCIF (Lin et al. 2018), 4i (Gut, Herrmann, and Pelkmans 2018), CODEX (Goltsev et al. 2018) and SABER (Saka et al. 2019), IMC captures the spatial expression of multiple proteins in parallel. With a nominal 1 μm resolution, IMC is able to detect cytoplasmic and nuclear localization of proteins. The current ablation frequency of IMC is 200 Hz, meaning that a 1 mm\\(^2\\) area can be imaged within about 2 hours. 2.1 Technical details of IMC Technical aspects of how data acquisition works can be found in the original publication (Giesen et al. 2014). 
Briefly, antibodies to detect targets in biological material are labeled with heavy metals (e.g., lanthanides) that do not occur in biological systems and thus can be used upon binding to their target as a readout similar to fluorophores in fluorescence microscopy. Thin sections of the biological sample on a glass slide are stained with an antibody cocktail. Stained microscopy slides are mounted on a precise motor-driven stage inside the ablation chamber of the IMC instrument. A high-energy UV laser is focused on the tissue, and each individual laser shot ablates tissue from an area of roughly 1 μm\\(^2\\). The energy of the laser is absorbed by the tissue resulting in vaporization followed by condensation of the ablated material. The ablated material from each laser shot is transported in the gas phase into the plasma of the mass cytometer, where first atomization of the particles and then ionization of the atoms occurs. The ion cloud is then transferred into a vacuum, and all ions below a mass of 80 m/z are filtered using a quadrupole mass filter. The remaining ions (mostly those used to tag antibodies) are analyzed in a time-of-flight mass spectrometer to ultimately obtain an accumulated mass spectrum from all ions that correspond to a single laser shot. One can regard this spectrum as the information underlying a 1 μm\\(^2\\) pixel. With repetitive laser shots (e.g., at 200 Hz) and a simultaneous lateral sample movement, a tissue can be ablated pixel by pixel. Ultimately an image is reconstructed from each pixel mass spectrum. In principle, IMC can be applied to the same type of samples as conventional fluorescence microscopy. The largest distinction from fluorescence microscopy is that for IMC, primary-labeled antibodies are commonly used, whereas in fluorescence microscopy secondary antibodies carrying fluorophores are widely applied. Additionally, for IMC, samples are dried before acquisition and can be stored for years. 
Formalin-fixed and paraffin-embedded (FFPE) samples are widely used for IMC. The FFPE blocks are cut into 2-5 μm thick sections and are stained, dried, and analyzed with IMC. 2.1.1 Metal-conjugated antibodies and staining Metal-labeled antibodies are used to stain molecules in tissues, enabling the delineation of tissue structures, cells, and subcellular structures. Metal-conjugated antibodies can either be purchased directly from Standard BioTools™ (MaxPar IMC Antibodies), or antibodies can be purchased and labeled individually (MaxPar Antibody Labeling). Antibody labeling using the MaxPar kits is performed via TCEP antibody reduction followed by crosslinking with sulfhydryl-reactive maleimide-bearing metal polymers. For each antibody it is essential to validate its functionality and specificity and to optimize its usage for an optimal signal-to-noise ratio. To facilitate antibody handling, a database is highly useful. Airlab is such a platform; it allows antibody lot tracking, validation data uploads, and panel generation for subsequent upload to the IMC acquisition software from Standard BioTools™. Depending on the sample type, different staining protocols can be used. Generally, once antibodies of choice have been conjugated to a metal tag, titration experiments are performed to identify the optimal staining concentration. For FFPE samples, different staining protocols have been described, and different antibodies show variable staining with different protocols. Protocols such as the one provided by Standard BioTools™ or the one described by (Ijsselsteijn et al. 2019) are recommended. Briefly, for FFPE tissues, a dewaxing step is performed to remove the paraffin used to embed the material, followed by a graded re-hydration of the samples. Thereafter, heat-induced epitope retrieval (HIER), a step aiming at the reversal of formalin-based fixation, is used to unmask epitopes within tissues and make them accessible to antibodies. 
Epitope unmasking is generally performed in either basic, EDTA-based buffers (pH 9.2) or acidic, citrate-based buffers (pH 6). Next, a buffer containing bovine serum albumin (BSA) is used to block non-specific binding. This buffer is also used to dilute antibody stocks for the actual antibody staining. Staining time and temperature may vary and optimization must be performed to ensure that each single antibody performs well. However, staining overnight at 4°C or for 3-5 hours at room temperature seems to be suitable in many cases. Following antibody incubation, unbound antibodies are washed away and a counterstain comparable to DAPI is applied to enable the identification of nuclei. The Iridium intercalator from Standard BioTools™ is a reagent of choice and is applied in a brief 5-minute staining step. Finally, the samples are washed again and then dried under an airflow. Once dried, the samples are ready for analysis using IMC and are usually stable for a long period of time (at least one year). 2.1.2 Data acquisition Data is acquired using the CyTOF software from Standard BioTools™ (see manuals here). The regions of interest are selected by providing coordinates for ablation. To determine the region to be imaged, so-called “panoramas” can be generated. These are stitched images of single fields of view of about 200 μm in diameter. Panoramas provide an optical overview of the tissue with a resolution similar to 10x in microscopy and are intended to help with the selection of regions of interest for ablation. The tissue should be centered on the glass slide, since the imaging mass cytometer cannot access roughly 5 mm from each of the slide edges. Currently, the instruments can process one slide at a time and usually one MCD file per sample slide is generated. Many regions of interest can be defined on a single slide and acquisition parameters such as channels to acquire, acquisition speed (100 Hz or 200 Hz), ablation energy, and other parameters are user-defined. 
It is recommended that all isotope channels are recorded. This will result in larger raw data files but valuable information such as potential contamination of the argon gas (e.g., Xenon) or of the samples (e.g., lead, barium) is stored. To process a large number of slides or to select regions on whole-slide samples, panoramas may not provide sufficient information. If this is the case, multi-color immunofluorescence of the same slide prior to staining with metal-labeled antibodies may be performed. To allow for region selection based on immunofluorescence images and to align those images with a panorama of the same or consecutive sections of the sample, we developed napping. Acquisition time is directly proportional to the total size of ablation, and run times for samples of large area or for large sample numbers can roughly be calculated by dividing the ablation area in square micrometers by the ablation frequency (e.g., 200 Hz); because each laser shot ablates roughly 1 μm\\(^2\\), this yields the approximate run time in seconds. In addition to the proprietary MCD file format, TXT files can also be generated for each region of interest. This is recommended as a back-up option in case of errors that may corrupt MCD files but not TXT files. 2.2 IMC data format Upon completion of the acquisition an MCD file of variable size is generated. A single MCD file can hold raw acquisition data for multiple regions of interest, optical images providing a slide-level overview of the sample (“panoramas”), and detailed metadata about the experiment. Additionally, for each acquisition a TXT file is generated which holds the same pixel information as the matched acquisition in the MCD file. The Hyperion Imaging System™ produces files in the following folder structure: . +-- {XYZ}_ROI_001_1.txt +-- {XYZ}_ROI_002_2.txt +-- {XYZ}_ROI_003_3.txt +-- {XYZ}.mcd Here, {XYZ} defines the filename, ROI_001, ROI_002, ROI_003 are user-defined names (descriptions) for the selected regions of interest (ROI), and 1, 2, 3 indicate the unique acquisition identifiers. 
The ROI description entry can be specified in the Standard BioTools software when selecting ROIs. The MCD file contains the raw imaging data and the full metadata of all acquired ROIs, while each TXT file contains data of a single ROI without metadata. To follow a consistent naming scheme and to bundle all metadata, we recommend zipping the folder. Each ZIP file should only contain data from a single MCD file, and the name of the ZIP file should match the name of the MCD file. We refer to this data as raw data and the further processing of this data is described in Section 3. References "],["processing.html", "3 Multi-channel image processing 3.1 Image pre-processing (IMC specific) 3.2 Image segmentation 3.3 Feature extraction 3.4 Data export 3.5 Data import into R", " 3 Multi-channel image processing This book focuses on common analysis steps of spatially-resolved single-cell data after image segmentation and feature extraction. In this chapter, the sections describe the processing of multiplexed imaging data, including file type conversion, image segmentation, feature extraction and data export. To obtain more detailed information on the individual image processing approaches, please visit their repositories: steinbock: The steinbock toolkit offers tools for multi-channel image processing using the command line or Python code (Windhager, Bodenmiller, and Eling 2021). Supported tasks include IMC data pre-processing, multi-channel image segmentation, object quantification and data export to a variety of file formats. It supports functionality similar to that of the IMC Segmentation Pipeline (see below) and further allows deep learning-enabled image segmentation. The toolkit is available as a platform-independent Docker container, ensuring reproducibility and user-friendly installation. Read more in the Docs. 
IMC Segmentation Pipeline: The IMC segmentation pipeline offers a rather manual way of segmenting multi-channel images using a pixel classification-based approach. We continue to maintain the pipeline but recommend the use of the steinbock toolkit for multi-channel image processing. Raw IMC data pre-processing is performed using the readimc Python package to convert raw MCD files into OME-TIFF and TIFF files. After image cropping, an Ilastik pixel classifier is trained for image classification prior to image segmentation using CellProfiler. Features (e.g., mean pixel intensity) of segmented objects (e.g., cells) are quantified and exported. Read more in the Docs. 3.1 Image pre-processing (IMC specific) Image pre-processing is technology dependent. While most multiplexed imaging technologies generate TIFF or OME-TIFF files which can be directly segmented using the steinbock toolkit, IMC produces data in the proprietary data format MCD. To facilitate IMC data pre-processing, the readimc open-source Python package allows extracting the multi-modal (IMC acquisitions, panoramas), multi-region, multi-channel information contained in raw IMC images. Both the IMC Segmentation Pipeline and the steinbock toolkit use the readimc package for IMC data pre-processing. Starting from IMC raw data and a “panel” file, individual acquisitions are extracted as TIFF files and OME-TIFF files if using the IMC Segmentation Pipeline. The panel contains information on the antibodies used in the experiment and the user can specify which channels to keep for downstream analysis. When using the IMC Segmentation Pipeline, random tiles are cropped from images for convenience of pixel labelling. 3.2 Image segmentation The IMC Segmentation Pipeline supports pixel classification-based image segmentation while steinbock supports pixel classification-based and deep learning-based segmentation. 
Pixel classification-based image segmentation is performed by training a random forest classifier using Ilastik on the randomly extracted image crops and selected image channels. Pixels are classified as nuclear, cytoplasmic, or background. Employing a customizable CellProfiler pipeline, the probabilities are then thresholded for segmenting nuclei, and nuclei are expanded into cytoplasmic regions to obtain cell masks. Deep learning-based image segmentation is performed as presented by (Greenwald et al. 2021). Briefly, steinbock first aggregates user-defined image channels to generate two-channel images representing nuclear and cytoplasmic signals. Next, the DeepCell Python package is used to run Mesmer, a deep learning-enabled segmentation algorithm pre-trained on TissueNet, to automatically obtain cell masks without any further user input. Segmentation masks are single-channel images that match the input images in size, with non-zero grayscale values indicating the IDs of segmented objects (e.g., cells). These masks are written out as TIFF files after segmentation. 3.3 Feature extraction Using the segmentation masks together with their corresponding multi-channel images, the IMC Segmentation Pipeline as well as the steinbock toolkit extract object-specific features. These include the mean pixel intensity per object and channel, morphological features (e.g., object area) and the objects’ locations. Object-specific features are written out as CSV files where rows represent individual objects and columns represent features. Furthermore, the IMC Segmentation Pipeline and the steinbock toolkit compute spatial object graphs, in which nodes correspond to objects, and nodes in spatial proximity are connected by an edge. These graphs serve as a proxy for interactions between neighboring cells. They are stored as edge list in form of one CSV file per image. Both approaches also write out image-specific metadata (e.g., width and height) as a CSV file. 
3.4 Data export To further facilitate compatibility with downstream analysis, steinbock exports data to a variety of file formats such as OME-TIFF for images, FCS for single-cell data, the anndata format (Virshup et al. 2021) for data analysis in Python, and various graph file formats for network analysis using software such as Cytoscape (Shannon et al. 2003). For export to OME-TIFF, steinbock uses xtiff, a Python package developed for writing multi-channel TIFF stacks. 3.5 Data import into R In Section 5, we will highlight the use of the imcRtools and cytomapper R/Bioconductor packages to read spatially-resolved, single-cell data and images as generated by the IMC Segmentation Pipeline and the steinbock toolkit into the statistical programming language R. All further downstream analyses are performed in R and detailed in the following sections. References "],["prerequisites.html", "4 Prerequisites 4.1 Obtain the code 4.2 Software requirements 4.3 Image processing 4.4 Download example data 4.5 Software versions", " 4 Prerequisites The analysis presented in this book requires a basic understanding of the R programming language. An introduction to R can be found here and in the book R for Data Science. Furthermore, it is beneficial to be familiar with single-cell data analysis using the Bioconductor framework. The Orchestrating Single-Cell Analysis with Bioconductor book gives an excellent overview of the data containers and basic analyses used here. An overview of IMC as a technology and the necessary image processing steps can be found on the IMC workflow website. Before we get started on IMC data analysis, we will need to make sure that software dependencies are installed and the example data is downloaded. 4.1 Obtain the code This book provides R code to perform single-cell and spatial data analysis. 
You can copy the individual code chunks into your R scripts or you can obtain the full code of the book via: git clone https://github.com/BodenmillerGroup/IMCDataAnalysis.git 4.2 Software requirements The R packages needed to execute the presented workflow can either be manually installed (see section 4.2.2) or are available within a provided Docker container (see section 4.2.1). The Docker option is useful if you want to exactly reproduce the presented analysis across operating systems; however, the manual install gives you more flexibility for exploratory data analysis. 4.2.1 Using Docker For reproducibility purposes, we provide a Docker container here. After installing Docker you can first pull the container via: docker pull ghcr.io/bodenmillergroup/imcdataanalysis:latest and then run the container: docker run -v /path/to/IMCDataAnalysis:/home/rstudio/IMCDataAnalysis \\ -e PASSWORD=bioc -p 8787:8787 \\ ghcr.io/bodenmillergroup/imcdataanalysis:latest Here, the /path/to/ needs to be adjusted to where you keep the code and data of the book. Of note: it is recommended to use a date-tagged version of the container to ensure reproducibility. This can be done via: docker pull ghcr.io/bodenmillergroup/imcdataanalysis:<year-month-date> An RStudio server session can be accessed via a browser at localhost:8787 using Username: rstudio and Password: bioc. Navigate to IMCDataAnalysis and open the IMCDataAnalysis.Rproj file. Code in the individual files can now be executed or the whole workflow can be built by entering bookdown::render_book(). 4.2.2 Manual installation The following section describes how to manually install all needed R packages when not using the provided Docker container. 
To install all R packages needed for the analysis, please run: if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install(c("rmarkdown", "bookdown", "pheatmap", "viridis", "zoo", "devtools", "testthat", "tiff", "distill", "ggrepel", "patchwork", "mclust", "RColorBrewer", "uwot", "Rtsne", "harmony", "Seurat", "SeuratObject", "cowplot", "kohonen", "caret", "randomForest", "ggridges", "cowplot", "gridGraphics", "scales", "tiff", "harmony", "Matrix", "CATALYST", "scuttle", "scater", "dittoSeq", "tidyverse", "BiocStyle", "batchelor", "bluster", "scran", "lisaClust", "spicyR", "iSEE", "imcRtools", "cytomapper", "imcdatasets", "cytoviewer")) # Github dependencies devtools::install_github("i-cyto/Rphenograph") 4.2.3 Major package versions Throughout the analysis, we rely on different R software packages. This section lists the most commonly used packages in this workflow. Data containers: SpatialExperiment version 1.10.0 SingleCellExperiment version 1.22.0 Data analysis: CATALYST version 1.24.0 imcRtools version 1.6.5 scuttle version 1.10.2 scater version 1.28.0 batchelor version 1.16.0 bluster version 1.10.0 scran version 1.28.2 harmony version 1.0.1 Seurat version 4.4.0 lisaClust version 1.8.1 caret version 6.0.94 Data visualization: cytomapper version 1.12.0 cytoviewer version 1.0.1 dittoSeq version 1.12.1 Tidy R: tidyverse version 2.0.0 4.3 Image processing The analysis presented here fully relies on packages written in the programming language R and primarily focuses on analysis approaches downstream of image processing. The example data available at https://zenodo.org/record/7575859 were processed (file type conversion, image segmentation, feature extraction as explained in Section 3) using the steinbock toolkit. 
The exact command line interface calls to process the raw data are shown below: #!/usr/bin/env bash BASEDIR=$(cd -- "$(dirname "${BASH_SOURCE[0]}")" && pwd -P) cd "${BASEDIR}" # raw data collection mkdir raw wget https://zenodo.org/record/6449127/files/IMCWorkflow.ilp wget https://zenodo.org/record/6449127/files/analysis.zip unzip analysis.zip rm analysis.zip rm -r analysis/cpinp rm -r analysis/cpout rm -r analysis/histocat rm -r analysis/ilastik rm -r analysis/ometiff cd raw wget https://zenodo.org/record/5949116/files/panel.csv wget https://zenodo.org/record/5949116/files/Patient1.zip wget https://zenodo.org/record/5949116/files/Patient2.zip wget https://zenodo.org/record/5949116/files/Patient3.zip wget https://zenodo.org/record/5949116/files/Patient4.zip cd ${BASEDIR} # steinbock alias setup shopt -s expand_aliases alias steinbock="docker run -v ${BASEDIR}:/data -u $(id -u):$(id -g) ghcr.io/bodenmillergroup/steinbock:0.16.0" # raw data preprocessing steinbock preprocess imc panel --namecol Clean_Target steinbock preprocess imc images --hpf 50 # random forest-based segmentation using Ilastik/CellProfiler steinbock classify ilastik prepare --cropsize 500 --seed 123 rm pixel_classifier.ilp && mv IMCWorkflow.ilp pixel_classifier.ilp rm -r ilastik_crops && mv analysis/crops ilastik_crops steinbock classify ilastik fix --no-backup steinbock classify ilastik run steinbock segment cellprofiler prepare steinbock segment cellprofiler run -o masks_ilastik # deep learning-based whole-cell segmentation using DeepCell/Mesmer steinbock segment deepcell --app mesmer --minmax -o masks_deepcell # single-cell feature extraction steinbock measure intensities --masks masks_deepcell steinbock measure regionprops --masks masks_deepcell steinbock measure neighbors --masks masks_deepcell --type expansion --dmax 4 # data export steinbock export ome steinbock export histocat --masks masks_deepcell steinbock export csv intensities regionprops -o cells.csv steinbock export csv intensities 
regionprops --no-concat -o cells_csv steinbock export fcs intensities regionprops -o cells.fcs steinbock export fcs intensities regionprops --no-concat -o cells_fcs steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors -o cells.h5ad steinbock export anndata --intensities intensities --data regionprops --neighbors neighbors --no-concat -o cells_h5ad steinbock export graphs --data intensities # archiving zip -r img.zip img zip -r ilastik_img.zip ilastik_img zip -r ilastik_crops.zip ilastik_crops zip -r ilastik_probabilities.zip ilastik_probabilities zip -r masks_ilastik.zip masks_ilastik zip -r masks_deepcell.zip masks_deepcell zip -r intensities.zip intensities zip -r regionprops.zip regionprops zip -r neighbors.zip neighbors zip -r ome.zip ome zip -r histocat.zip histocat zip -r cells_csv.zip cells_csv zip -r cells_fcs.zip cells_fcs zip -r cells_h5ad.zip cells_h5ad zip -r graphs.zip graphs 4.4 Download example data Throughout this tutorial, we will access a number of different data types. To declutter the analysis scripts, we will already download all needed data here. To highlight the basic steps of IMC data analysis, we provide example data that were acquired as part of the Integrated iMMUnoprofiling of large adaptive CANcer patient cohorts projects (immucan.eu). The raw data of 4 patients can be accessed online at zenodo.org/record/7575859. We will only download the sample/patient metadata information here: download.file("https://zenodo.org/record/7575859/files/sample_metadata.csv", destfile = "data/sample_metadata.csv") 4.4.1 Processed multiplexed imaging data The IMC raw data was either processed using the steinbock toolkit or the IMC Segmentation Pipeline. Image processing included file type conversion, cell segmentation and feature extraction. steinbock output This book uses the output of the steinbock framework when applied to process the example data. 
The processed data includes the single-cell mean intensity files, the single-cell morphological features and spatial locations, spatial object graphs in form of edge lists indicating cells in close proximity, hot pixel filtered multi-channel images, segmentation masks, image metadata and channel metadata. All these files will be downloaded here for later use. The commands which were used to generate this data can be found in the shell script above. # download intensities url <- "https://zenodo.org/record/7624451/files/intensities.zip" destfile <- "data/steinbock/intensities.zip" download.file(url, destfile) unzip(destfile, exdir="data/steinbock", overwrite=TRUE) unlink(destfile) # download regionprops url <- "https://zenodo.org/record/7624451/files/regionprops.zip" destfile <- "data/steinbock/regionprops.zip" download.file(url, destfile) unzip(destfile, exdir="data/steinbock", overwrite=TRUE) unlink(destfile) # download neighbors url <- "https://zenodo.org/record/7624451/files/neighbors.zip" destfile <- "data/steinbock/neighbors.zip" download.file(url, destfile) unzip(destfile, exdir="data/steinbock", overwrite=TRUE) unlink(destfile) # download images url <- "https://zenodo.org/record/7624451/files/img.zip" destfile <- "data/steinbock/img.zip" download.file(url, destfile) unzip(destfile, exdir="data/steinbock", overwrite=TRUE) unlink(destfile) # download masks url <- "https://zenodo.org/record/7624451/files/masks_deepcell.zip" destfile <- "data/steinbock/masks_deepcell.zip" download.file(url, destfile) unzip(destfile, exdir="data/steinbock", overwrite=TRUE) unlink(destfile) # download individual files download.file("https://zenodo.org/record/7624451/files/panel.csv", "data/steinbock/panel.csv") download.file("https://zenodo.org/record/7624451/files/images.csv", "data/steinbock/images.csv") download.file("https://zenodo.org/record/7624451/files/steinbock.sh", "data/steinbock/steinbock.sh") IMC Segmentation Pipeline output The example data was also processed using 
the IMC Segmentation Pipeline (version 3). To highlight the use of the reader function for this type of output, we will need to download the cpout folder, which is part of the analysis folder. The cpout folder stores all relevant output files of the pipeline. For a full description of the pipeline, please refer to the docs. # download analysis folder url <- "https://zenodo.org/record/7997296/files/analysis.zip" destfile <- "data/ImcSegmentationPipeline/analysis.zip" download.file(url, destfile) unzip(destfile, exdir="data/ImcSegmentationPipeline", overwrite=TRUE) unlink(destfile) unlink("data/ImcSegmentationPipeline/analysis/cpinp/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/crops/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/histocat/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/ilastik/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/ometiff/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/cpout/images/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/cpout/probabilities/", recursive=TRUE) unlink("data/ImcSegmentationPipeline/analysis/cpout/masks/", recursive=TRUE) 4.4.2 Files for spillover matrix estimation To highlight the estimation and correction of channel spillover as described by (Chevrier et al. 2017), we can access an example spillover acquisition from: download.file("https://zenodo.org/record/7575859/files/compensation.zip", "data/compensation.zip") unzip("data/compensation.zip", exdir="data", overwrite=TRUE) unlink("data/compensation.zip") 4.4.3 Gated cells In Section 9.3, we present a cell type classification approach that relies on previously gated cells.
This ground truth data is available online at zenodo.org/record/8095133 and will be downloaded here for later use: download.file("https://zenodo.org/record/8095133/files/gated_cells.zip", "data/gated_cells.zip") unzip("data/gated_cells.zip", exdir="data", overwrite=TRUE) unlink("data/gated_cells.zip") 4.5 Software versions SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] cytoviewer_1.0.1 caret_6.0-94 ## [3] lattice_0.21-8 lisaClust_1.8.1 ## [5] scran_1.28.2 bluster_1.10.0 ## [7] lubridate_1.9.3 forcats_1.0.0 ## [9] stringr_1.5.0 dplyr_1.1.3 ## [11] purrr_1.0.2 readr_2.1.4 ## [13] tidyr_1.3.0 tibble_3.2.1 ## [15] tidyverse_2.0.0 dittoSeq_1.12.1 ## [17] cytomapper_1.12.0 EBImage_4.42.0 ## [19] imcRtools_1.6.5 scater_1.28.0 ## [21] ggplot2_3.4.3 scuttle_1.10.2 ## [23] SpatialExperiment_1.10.0 CATALYST_1.24.0 ## [25] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 ## [27] Biobase_2.60.0 GenomicRanges_1.52.0 ## [29] GenomeInfoDb_1.36.3 IRanges_2.34.1 ## [31] S4Vectors_0.38.2 BiocGenerics_0.46.0 ## [33] MatrixGenerics_1.12.3 matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] R.methodsS3_1.8.2 vroom_1.6.3 ## [3] tiff_0.1-11 nnet_7.3-19 ## [5] goftest_1.2-3 DT_0.29 ## [7] HDF5Array_1.28.1 
TH.data_1.1-2 ## [9] vctrs_0.6.3 spatstat.random_3.1-6 ## [11] digest_0.6.33 png_0.1-8 ## [13] shape_1.4.6 proxy_0.4-27 ## [15] ggrepel_0.9.3 spicyR_1.12.2 ## [17] deldir_1.0-9 parallelly_1.36.0 ## [19] magick_2.8.0 MASS_7.3-60 ## [21] reshape2_1.4.4 httpuv_1.6.11 ## [23] foreach_1.5.2 withr_2.5.1 ## [25] xfun_0.40 ggpubr_0.6.0 ## [27] ellipsis_0.3.2 survival_3.5-5 ## [29] RTriangle_1.6-0.12 ggbeeswarm_0.7.2 ## [31] RProtoBufLib_2.12.1 drc_3.0-1 ## [33] systemfonts_1.0.4 zoo_1.8-12 ## [35] GlobalOptions_0.1.2 gtools_3.9.4 ## [37] R.oo_1.25.0 promises_1.2.1 ## [39] rstatix_0.7.2 globals_0.16.2 ## [41] rhdf5filters_1.12.1 rhdf5_2.44.0 ## [43] rstudioapi_0.15.0 miniUI_0.1.1.1 ## [45] archive_1.1.6 units_0.8-4 ## [47] generics_0.1.3 concaveman_1.1.0 ## [49] zlibbioc_1.46.0 ScaledMatrix_1.8.1 ## [51] ggraph_2.1.0 polyclip_1.10-6 ## [53] GenomeInfoDbData_1.2.10 fftwtools_0.9-11 ## [55] xtable_1.8-4 doParallel_1.0.17 ## [57] evaluate_0.21 S4Arrays_1.0.6 ## [59] hms_1.1.3 bookdown_0.35 ## [61] irlba_2.3.5.1 colorspace_2.1-0 ## [63] spatstat.data_3.0-1 magrittr_2.0.3 ## [65] later_1.3.1 viridis_0.6.4 ## [67] spatstat.geom_3.2-5 future.apply_1.11.0 ## [69] XML_3.99-0.14 cowplot_1.1.1 ## [71] class_7.3-22 svgPanZoom_0.3.4 ## [73] pillar_1.9.0 nlme_3.1-162 ## [75] iterators_1.0.14 compiler_4.3.1 ## [77] beachmat_2.16.0 shinycssloaders_1.0.0 ## [79] stringi_1.7.12 gower_1.0.1 ## [81] sf_1.0-14 tensor_1.5 ## [83] minqa_1.2.6 ClassifyR_3.4.11 ## [85] plyr_1.8.8 crayon_1.5.2 ## [87] abind_1.4-5 locfit_1.5-9.8 ## [89] sp_2.0-0 graphlayouts_1.0.1 ## [91] bit_4.0.5 terra_1.7-46 ## [93] sandwich_3.0-2 codetools_0.2-19 ## [95] multcomp_1.4-25 recipes_1.0.8 ## [97] BiocSingular_1.16.0 bslib_0.5.1 ## [99] e1071_1.7-13 GetoptLong_1.0.5 ## [101] mime_0.12 MultiAssayExperiment_1.26.0 ## [103] splines_4.3.1 circlize_0.4.15 ## [105] Rcpp_1.0.11 sparseMatrixStats_1.12.2 ## [107] knitr_1.44 utf8_1.2.3 ## [109] clue_0.3-65 lme4_1.1-34 ## [111] listenv_0.9.0 nnls_1.5 ## [113] 
DelayedMatrixStats_1.22.6 ggsignif_0.6.4 ## [115] Matrix_1.6-1.1 scam_1.2-14 ## [117] statmod_1.5.0 tzdb_0.4.0 ## [119] svglite_2.1.1 tweenr_2.0.2 ## [121] pkgconfig_2.0.3 pheatmap_1.0.12 ## [123] tools_4.3.1 cachem_1.0.8 ## [125] viridisLite_0.4.2 DBI_1.1.3 ## [127] numDeriv_2016.8-1.1 fastmap_1.1.1 ## [129] rmarkdown_2.25 scales_1.2.1 ## [131] grid_4.3.1 shinydashboard_0.7.2 ## [133] broom_1.0.5 sass_0.4.7 ## [135] carData_3.0-5 rpart_4.1.19 ## [137] farver_2.1.1 tidygraph_1.2.3 ## [139] mgcv_1.8-42 yaml_2.3.7 ## [141] cli_3.6.1 lifecycle_1.0.3 ## [143] mvtnorm_1.2-3 lava_1.7.2.1 ## [145] backports_1.4.1 DropletUtils_1.20.0 ## [147] BiocParallel_1.34.2 cytolib_2.12.1 ## [149] timechange_0.2.0 gtable_0.3.4 ## [151] rjson_0.2.21 ggridges_0.5.4 ## [153] parallel_4.3.1 pROC_1.18.4 ## [155] limma_3.56.2 colourpicker_1.3.0 ## [157] jsonlite_1.8.7 edgeR_3.42.4 ## [159] bitops_1.0-7 bit64_4.0.5 ## [161] Rtsne_0.16 FlowSOM_2.8.0 ## [163] spatstat.utils_3.0-3 BiocNeighbors_1.18.0 ## [165] flowCore_2.12.2 jquerylib_0.1.4 ## [167] metapod_1.8.0 dqrng_0.3.1 ## [169] R.utils_2.12.2 timeDate_4022.108 ## [171] shiny_1.7.5 ConsensusClusterPlus_1.64.0 ## [173] htmltools_0.5.6 distances_0.1.9 ## [175] glue_1.6.2 XVector_0.40.0 ## [177] RCurl_1.98-1.12 classInt_0.4-10 ## [179] jpeg_0.1-10 gridExtra_2.3 ## [181] boot_1.3-28.1 igraph_1.5.1 ## [183] R6_2.5.1 cluster_2.1.4 ## [185] Rhdf5lib_1.22.1 ipred_0.9-14 ## [187] nloptr_2.0.3 DelayedArray_0.26.7 ## [189] tidyselect_1.2.0 vipor_0.4.5 ## [191] plotrix_3.8-2 ggforce_0.4.1 ## [193] raster_3.6-23 car_3.1-2 ## [195] future_1.33.0 ModelMetrics_1.2.2.2 ## [197] rsvd_1.0.5 munsell_0.5.0 ## [199] KernSmooth_2.23-21 data.table_1.14.8 ## [201] htmlwidgets_1.6.2 ComplexHeatmap_2.16.0 ## [203] RColorBrewer_1.1-3 rlang_1.1.1 ## [205] spatstat.sparse_3.0-2 spatstat.explore_3.2-3 ## [207] lmerTest_3.1-3 colorRamps_2.3.1 ## [209] ggnewscale_0.4.9 fansi_1.0.4 ## [211] hardhat_1.3.0 beeswarm_0.4.0 ## [213] prodlim_2023.08.28 References 
"],["read-data.html", "5 Read in the data 5.1 Read in single-cell information 5.2 Single-cell processing 5.3 Read in images 5.4 Generate single-cell data from images 5.5 Accessing publicly available IMC datasets 5.6 Save objects 5.7 Session Info", " 5 Read in the data This section describes how to read in single-cell data and images into R after image processing and segmentation (see Section 3). To highlight examples for IMC data analysis, we provide already processed data at 10.5281/zenodo.6043599. This data has already been downloaded in Section 4.4 and can be accessed in the folder data. We use the imcRtools package to read in single-cell data extracted using the steinbock framework or the IMC Segmentation Pipeline. Both image processing approaches also generate multi-channel images and segmentation masks that can be read into R using the cytomapper package. library(imcRtools) library(cytomapper) 5.1 Read in single-cell information For single-cell data analysis in R the SingleCellExperiment (Amezquita et al. 2019) data container is commonly used within the Bioconductor framework. It allows standardized access to (i) expression data, (ii) cellular metadata (e.g., cell type), (iii) feature metadata (e.g., marker name) and (iv) experiment-wide metadata. For an in-depth introduction to the SingleCellExperiment container, please refer to the SingleCellExperiment class. The SpatialExperiment class (Righelli et al. 2022) is an extension of the SingleCellExperiment class. It was developed to store spatial data in addition to single-cell data and an extended introduction is accessible here. To read in single-cell data generated by the steinbock framework or the IMC Segmentation Pipeline, the imcRtools package provides the read_steinbock and read_cpout functions, respectively. By default, the data is read into a SpatialExperiment object; however, data can be read in as a SingleCellExperiment object by setting return_as = \"sce\". 
All functions presented in this book are applicable to both data containers. 5.1.1 steinbock generated data The downloaded example data (Section 4.4) processed with the steinbock framework can be read in with the read_steinbock function provided by imcRtools. For more information, please refer to ?read_steinbock. spe <- read_steinbock("data/steinbock/") spe ## class: SpatialExperiment ## dim: 40 47859 ## metadata(0): ## assays(1): counts ## rownames(40): MPO HistoneH3 ... DNA1 DNA2 ## rowData names(12): channel name ... Final.Concentration...Dilution ## uL.to.add ## colnames: NULL ## colData names(8): sample_id ObjectNumber ... width_px height_px ## reducedDimNames(0): ## mainExpName: NULL ## altExpNames(0): ## spatialCoords names(2) : Pos_X Pos_Y ## imgData names(1): sample_id By default, single-cell data is read in as SpatialExperiment object. The summarized pixel intensities per channel and cell (here mean intensity) are stored in the counts slot. Columns represent cells and rows represent channels. counts(spe)[1:5,1:5] ## [,1] [,2] [,3] [,4] [,5] ## MPO 0.5751064 0.4166667 0.4975494 0.890154 0.1818182 ## HistoneH3 3.1273082 11.3597883 2.3841440 7.712961 1.4512715 ## SMA 0.2600939 1.6720383 0.1535190 1.193948 0.2986703 ## CD16 2.0347747 2.5880536 2.2943074 15.629083 0.6084220 ## CD38 0.2530137 0.6826669 1.1902979 2.126060 0.2917793 Metadata associated to individual cells are stored in the colData slot. After initial image processing, these metadata include the numeric identifier (ObjectNumber), the area, and morphological features of each cell. In addition, sample_id stores the image name from which each cell was extracted and the width and height of the corresponding images are stored. 
head(colData(spe)) ## DataFrame with 6 rows and 8 columns ## sample_id ObjectNumber area axis_major_length axis_minor_length ## <character> <numeric> <numeric> <numeric> <numeric> ## 1 Patient1_001 1 12 7.40623 1.89529 ## 2 Patient1_001 2 24 16.48004 1.96284 ## 3 Patient1_001 3 17 9.85085 1.98582 ## 4 Patient1_001 4 24 8.08290 3.91578 ## 5 Patient1_001 5 22 8.79367 3.11653 ## 6 Patient1_001 6 25 9.17436 3.46929 ## eccentricity width_px height_px ## <numeric> <numeric> <numeric> ## 1 0.966702 600 600 ## 2 0.992882 600 600 ## 3 0.979470 600 600 ## 4 0.874818 600 600 ## 5 0.935091 600 600 ## 6 0.925744 600 600 The main difference between the SpatialExperiment and the SingleCellExperiment data container is the way spatial locations of all cells are stored. For the SingleCellExperiment container, the locations are stored in the colData slot while the SpatialExperiment container stores them in the spatialCoords slot: head(spatialCoords(spe)) ## Pos_X Pos_Y ## 1 468.5833 0.4166667 ## 2 515.8333 0.4166667 ## 3 587.2353 0.4705882 ## 4 192.2500 1.2500000 ## 5 231.7727 0.9090909 ## 6 270.1600 1.0400000 The spatial object graphs generated by steinbock (see Section 3.3) are read into a colPair slot with the name neighborhood of the SpatialExperiment (or SingleCellExperiment) object. Cell-cell interactions (cells in close spatial proximity) are represented as an “edge list” (stored as a SelfHits object). Here, the left side represents the column indices of the SpatialExperiment object of the “from” cells and the right side represents the column indices of the “to” cells. For visualization of the spatial object graphs, please refer to Section 12.2. colPair(spe, "neighborhood") ## SelfHits object with 257116 hits and 0 metadata columns: ## from to ## <integer> <integer> ## [1] 1 27 ## [2] 1 55 ## [3] 2 10 ## [4] 2 44 ## [5] 2 81 ## ... ... ...
## [257112] 47858 47836 ## [257113] 47859 47792 ## [257114] 47859 47819 ## [257115] 47859 47828 ## [257116] 47859 47854 ## ------- ## nnode: 47859 Finally, metadata regarding the channels are stored in the rowData slot. This information is extracted from the panel.csv file. Channels have the same order as the rows in the panel.csv file for which the keep column is set to 1, and match the order of channels in the multi-channel images (see Section 5.3). For the example data, channels are ordered by isotope mass. head(rowData(spe)) ## DataFrame with 6 rows and 12 columns ## channel name keep ilastik deepcell cellpose ## <character> <character> <numeric> <numeric> <numeric> <logical> ## MPO Y89 MPO 1 NA NA NA ## HistoneH3 In113 HistoneH3 1 1 1 NA ## SMA In115 SMA 1 NA NA NA ## CD16 Pr141 CD16 1 NA NA NA ## CD38 Nd142 CD38 1 NA NA NA ## HLADR Nd143 HLADR 1 NA NA NA ## Tube.Number Target Antibody.Clone Stock.Concentration ## <numeric> <character> <character> <numeric> ## MPO 2101 Myeloperoxidase MPO Polyclonal MPO 500 ## HistoneH3 2113 Histone H3 D1H2 500 ## SMA 1914 SMA 1A4 500 ## CD16 2079 CD16 EPR16784 500 ## CD38 2095 CD38 EPR4106 500 ## HLADR 2087 HLA-DR TAL 1B5 500 ## Final.Concentration...Dilution uL.to.add ## <character> <character> ## MPO 4 ug/mL 0.8 ## HistoneH3 1 ug/mL 0.2 ## SMA 0.25 ug/mL 0.05 ## CD16 5 ug/mL 1 ## CD38 2.5 ug/mL 0.5 ## HLADR 1 ug/mL 0.2 5.1.2 IMC Segmentation Pipeline generated data The IMC Segmentation Pipeline offers an alternative approach to multiplexed image processing and segmentation. The default pipeline is also available via steinbock. The IMC Segmentation Pipeline is based on Ilastik pixel classification and image segmentation using CellProfiler. We recommend to become familiar with the pipeline as it allows flexible extension to more complicated image analysis and segmentation tasks. For standard image analysis and segmentation, steinbock is the preferred choice. 
Please refer to the documentation for an overview of the pipeline. All relevant output storing single-cell data is contained in the cpout folder. To read in the single-cell measurements, the imcRtools package offers the read_cpout function: spe2 <- read_cpout("data/ImcSegmentationPipeline/analysis/cpout/") rownames(spe2) <- rowData(spe2)$Clean_Target spe2 ## class: SpatialExperiment ## dim: 40 43796 ## metadata(0): ## assays(1): counts ## rownames(40): MPO HistoneH3 ... DNA1 DNA2 ## rowData names(11): Tube.Number Metal.Tag ... ilastik deepcell ## colnames: NULL ## colData names(12): sample_id ObjectNumber ... Metadata_acid ## Metadata_description ## reducedDimNames(0): ## mainExpName: NULL ## altExpNames(0): ## spatialCoords names(2) : Pos_X Pos_Y ## imgData names(1): sample_id Similar to the steinbock output, cell morphological features and image-level metadata are stored in the colData(spe2) slot, the interaction information is contained in colPair(spe2, type = "neighborhood") and the mean intensity per channel and cell is stored in counts(spe2). 5.1.3 Reading custom files When not using steinbock or the ImcSegmentationPipeline, the single-cell information has to be read in from custom files. We now demonstrate how to generate a SpatialExperiment object from single-cell data contained in individual files. As an example, we use files generated by CellProfiler as part of the ImcSegmentationPipeline.
First we will read in the single-cell features stored in a CSV file: library(readr) cur_features <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/cell.csv") dim(cur_features) ## [1] 43796 941 head(colnames(cur_features)) ## [1] "ImageNumber" "ObjectNumber" ## [3] "AreaShape_Area" "AreaShape_BoundingBoxArea" ## [5] "AreaShape_BoundingBoxMaximum_X" "AreaShape_BoundingBoxMaximum_Y" This file contains a large number of single-cell features including the cell identifier (ObjectNumber), the image identifier (ImageNumber), morphological features (AreaShape_*), the cells’ locations (Location_Center_*) and the mean pixel intensity per cell and per channel (Intensity_MeanIntensity_FullStack_*). Now, we split the features into intensity features, cell-specific metadata and the physical location of the cells: counts <- cur_features[,grepl("Intensity_MeanIntensity_FullStack", colnames(cur_features))] meta <- cur_features[,c("ImageNumber", "ObjectNumber", "AreaShape_Area", "AreaShape_Eccentricity", "AreaShape_MeanRadius")] coords <- cur_features[,c("Location_Center_X", "Location_Center_Y")] CellProfiler writes out the mean pixel intensities after scaling them by a factor that depends on the bit encoding of the images. The images to which the IMC Segmentation Pipeline was applied were saved with 16-bit encoding. This means for the example data, the mean pixel intensities need to be scaled by a factor of 2 ^ 16 - 1 = 65535. counts <- counts * 65535 In addition, CellProfiler does not order the channels numerically but rather as characters; 1, 10, 2, 3, ... rather than 1, 2, 3, .... Therefore we will need to reorder the channels. library(stringr) cur_ch <- str_split(colnames(counts), "_", simplify = TRUE)[,4] cur_ch <- sub("c", "", cur_ch) counts <- counts[,order(as.numeric(cur_ch))] From these features we can now construct the SpatialExperiment object.
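Before building the object, the two adjustments described above (numeric channel ordering and 16-bit rescaling) can be checked on a toy example. This is a minimal sketch in base R; the column names and values are made up for illustration and are not read from the example data.

```r
# Hypothetical column names mimicking the CellProfiler naming scheme
cols <- paste0("Intensity_MeanIntensity_FullStack_c", c(1, 10, 2, 3))

# Extract the channel index after the final "_c" and convert it to a number
idx <- as.numeric(sub(".*_c", "", cols))

# Ordering by the numeric index places channel 10 last instead of second
reordered <- cols[order(idx)]
reordered

# Values scaled to [0, 1] are mapped back to the 16-bit range
scaled <- c(0, 0.5, 1)
rescaled <- scaled * (2^16 - 1)
rescaled
## [1]     0.0 32767.5 65535.0
```

For images saved with a different bit depth, the factor would presumably change accordingly, so it is worth confirming the encoding before rescaling.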
spe3 <- SpatialExperiment(assays = list(counts = t(counts)), colData = meta, sample_id = as.character(meta$ImageNumber), spatialCoords = as.matrix(coords)) Next, we can store the spatial cell graph generated by CellProfiler in the colPairs slot of the object. Spatial cell graphs are usually stored as an edge list in the form of a CSV file. The colPairs slot requires a SelfHits entry storing an edge list where numeric entries represent the index of the from and to cell in the SpatialExperiment object. To generate such an edge list, we need to match the cell IDs contained in the CSV against the cell IDs in the SpatialExperiment object. cur_pairs <- read_csv("data/ImcSegmentationPipeline/analysis/cpout/Object relationships.csv") cur_from <- paste(cur_pairs$`First Image Number`, cur_pairs$`First Object Number`) cur_to <- paste(cur_pairs$`Second Image Number`, cur_pairs$`Second Object Number`) edgelist <- SelfHits(from = match(cur_from, paste(spe3$ImageNumber, spe3$ObjectNumber)), to = match(cur_to, paste(spe3$ImageNumber, spe3$ObjectNumber)), nnode = ncol(spe3)) colPair(spe3, "neighborhood") <- edgelist For further downstream analysis, we will use the steinbock results. 5.2 Single-cell processing After reading in the single-cell data, a few further processing steps need to be taken. Add additional metadata We can set the colnames of the object to generate unique identifiers per cell: colnames(spe) <- paste0(spe$sample_id, "_", spe$ObjectNumber) It is also often the case that sample-specific metadata are available externally. For the current data, we need to link the cancer type (also referred to as “Indication”) to each sample.
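Linking sample-level information to individual cells can be done with match(). The following is a minimal sketch in base R; the table, identifiers and labels are invented for illustration and do not correspond to the example data.

```r
# Made-up sample-level table (illustrative values only)
sample_meta <- data.frame(sample = c("P1", "P2"),
                          indication = c("TypeA", "TypeB"))

# Made-up patient label for each of four cells
cell_patient <- c("P2", "P1", "P1", "P2")

# For each cell, match() returns the row of its patient in sample_meta
row_idx <- match(cell_patient, sample_meta$sample)
cell_indication <- sample_meta$indication[row_idx]
cell_indication
## [1] "TypeB" "TypeA" "TypeA" "TypeB"

# Identifiers missing from the table would yield NA, so a check is useful
stopifnot(!anyNA(row_idx))
```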
The sample-specific metadata is available as an external CSV file: library(tidyverse) # Read patient metadata meta <- read_csv("data/sample_metadata.csv") # Extract patient id and ROI id from sample name spe$patient_id <- str_extract(spe$sample_id, "Patient[1-4]") spe$ROI <- str_extract(spe$sample_id, "00[1-8]") # Store cancer type in SPE object spe$indication <- meta$Indication[match(spe$patient_id, meta$`Sample ID`)] unique(spe$patient_id) ## [1] "Patient1" "Patient2" "Patient3" "Patient4" unique(spe$ROI) ## [1] "001" "002" "003" "004" "005" "006" "007" "008" unique(spe$indication) ## [1] "SCCHN" "BCC" "NSCLC" "CRC" The selected patients were diagnosed with different cancer types: SCCHN - head and neck cancer BCC - breast cancer NSCLC - lung cancer CRC - colorectal cancer Transform counts The distribution of expression counts across cells is often right-skewed, meaning that many cells display low counts and few cells have high counts. To avoid analysis biases from these high-expressing cells, the expression counts are commonly transformed or clipped. Here, we perform counts transformation using an inverse hyperbolic sine function. This transformation is commonly applied to flow cytometry data. The cofactor sets the range of counts over which the transformation is approximately linear; counts well above the cofactor are compressed roughly logarithmically. While the cofactor for CyTOF data is often set to 5, IMC data usually display much lower counts. We therefore apply a cofactor of 1. However, other transformations such as log(counts(spe) + 0.01) should be tested when analysing IMC data. library(dittoSeq) dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "counts") + ggtitle("CD3 - before transformation") assay(spe, "exprs") <- asinh(counts(spe)/1) dittoRidgePlot(spe, var = "CD3", group.by = "patient_id", assay = "exprs") + ggtitle("CD3 - after transformation") Define interesting channels For downstream analysis such as visualization, dimensionality reduction and clustering, only a subset of markers should be used.
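One simple way to flag such a subset is pattern matching on the marker names. The sketch below uses a handful of marker names and plain base R; the variable names are illustrative, and the pattern would need adapting to other panels.

```r
# A few marker names for illustration
markers <- c("MPO", "HistoneH3", "CD3", "DNA1", "DNA2")

# TRUE for markers to keep, FALSE for names containing "DNA" or "Histone"
use_channel <- !grepl("DNA|Histone", markers)

# Markers retained for downstream analysis
kept <- markers[use_channel]
kept
## [1] "MPO" "CD3"
```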
As a convenience, we can store an additional entry in the rowData slot that specifies the markers of interest. Here, we deselect the nuclear markers, which were primarily used for cell segmentation, and keep all other biological targets. However, more informed marker selection should be performed to exclude lowly expressed markers or markers with a low signal-to-noise ratio. rowData(spe)$use_channel <- !grepl("DNA|Histone", rownames(spe)) Define color schemes We will define color schemes for different metadata entries of the data and conveniently store them in the metadata slot of the SpatialExperiment, which will be helpful for downstream data visualizations. We will use colors from the RColorBrewer and dittoSeq packages but any other coloring package will suffice. library(RColorBrewer) color_vectors <- list() ROI <- setNames(brewer.pal(length(unique(spe$ROI)), name = "BrBG"), unique(spe$ROI)) patient_id <- setNames(brewer.pal(length(unique(spe$patient_id)), name = "Set1"), unique(spe$patient_id)) sample_id <- setNames(c(brewer.pal(6, "YlOrRd")[3:5], brewer.pal(6, "PuBu")[3:6], brewer.pal(6, "YlGn")[3:5], brewer.pal(6, "BuPu")[3:6]), unique(spe$sample_id)) indication <- setNames(brewer.pal(length(unique(spe$indication)), name = "Set2"), unique(spe$indication)) color_vectors$ROI <- ROI color_vectors$patient_id <- patient_id color_vectors$sample_id <- sample_id color_vectors$indication <- indication metadata(spe)$color_vectors <- color_vectors 5.3 Read in images The cytomapper package allows multi-channel image handling and visualization within the Bioconductor framework. The most common data format for multi-channel images or segmentation masks is the TIFF file format, which is used by steinbock and the IMC Segmentation Pipeline to save images. Here, we will read in multi-channel images and segmentation masks into a CytoImageList data container. It allows storing multiple multi-channel images and requires matched channels across all images within the object.
The loadImages function is used to read in processed multi-channel images and their corresponding segmentation masks. Of note: the multi-channel images generated by steinbock are saved as 32-bit images while the segmentation masks are saved as 16-bit images. To correctly scale pixel values of the segmentation masks when reading them in, we will need to set as.is = TRUE. images <- loadImages("data/steinbock/img/") ## All files in the provided location will be read in. masks <- loadImages("data/steinbock/masks_deepcell/", as.is = TRUE) ## All files in the provided location will be read in. In the case of multi-channel images, it is beneficial to set the channelNames for easy visualization. Using the steinbock framework, the channel order of the single-cell data matches the channel order of the multi-channel images. However, it is recommended to make sure that the channel order is identical between the single-cell data and the images. channelNames(images) <- rownames(spe) images ## CytoImageList containing 14 image(s) ## names(14): Patient1_001 Patient1_002 Patient1_003 Patient2_001 Patient2_002 Patient2_003 Patient2_004 Patient3_001 Patient3_002 Patient3_003 Patient4_005 Patient4_006 Patient4_007 Patient4_008 ## Each image contains 40 channel(s) ## channelNames(40): MPO HistoneH3 SMA CD16 CD38 HLADR CD27 CD15 CD45RA CD163 B2M CD20 CD68 Ido1 CD3 LAG3 / LAG33 CD11c PD1 PDGFRb CD7 GrzB PDL1 TCF7 CD45RO FOXP3 ICOS CD8a CarbonicAnhydrase CD33 Ki67 VISTA CD40 CD4 CD14 Ecad CD303 CD206 cleavedPARP DNA1 DNA2 For visualization shown in Section 11 we will need to add additional metadata to the elementMetadata slot of the CytoImageList objects. This slot is easily accessible using the mcols function. Here, we will store the matched sample_id, patient_id and indication information within the elementMetadata slot of the multi-channel images and segmentation masks objects. It is crucial that the order of the images in both CytoImageList objects is the same. 
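The kind of order check meant here can be sketched on plain character vectors. The names below are made up and merely stand in for the names of the two image lists; how best to reorder an actual CytoImageList is not shown and would need to be confirmed in the cytomapper documentation.

```r
# Made-up names standing in for the image and mask names
image_names <- c("Patient1_001", "Patient1_002", "Patient2_001")
mask_names  <- c("Patient1_002", "Patient1_001", "Patient2_001")

# Same set of names, but in a different order
same_order <- identical(image_names, mask_names)
same_order
## [1] FALSE

# Reorder the mask names so that they follow the image names
mask_names <- mask_names[match(image_names, mask_names)]
stopifnot(identical(image_names, mask_names))
```

Failing early with stopifnot() avoids silently pairing an image with the wrong mask.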
all.equal(names(images), names(masks)) ## [1] TRUE # Extract patient id from image name patient_id <- str_extract(names(images), "Patient[1-4]") # Retrieve cancer type per patient from metadata file indication <- meta$Indication[match(patient_id, meta$`Sample ID`)] # Store patient and image level information in elementMetadata mcols(images) <- mcols(masks) <- DataFrame(sample_id = names(images), patient_id = patient_id, indication = indication) 5.4 Generate single-cell data from images An alternative way of generating a SingleCellExperiment object directly from the multi-channel images and segmentation masks is supported by the measureObjects function of the cytomapper package. For each cell present in the masks object, the function computes the mean pixel intensity per channel as well as morphological features (area, radius, major axis length, eccentricity) and the location of cells: cytomapper_sce <- measureObjects(masks, image = images, img_id = "sample_id") cytomapper_sce ## class: SingleCellExperiment ## dim: 40 47859 ## metadata(0): ## assays(1): counts ## rownames(40): MPO HistoneH3 ... DNA1 DNA2 ## rowData names(0): ## colnames: NULL ## colData names(10): sample_id object_id ... patient_id indication ## reducedDimNames(0): ## mainExpName: NULL ## altExpNames(0): 5.5 Accessing publicly available IMC datasets The imcdatasets R/Bioconductor package provides a number of publicly available IMC datasets. For a complete introduction to the package, please refer to the documentation. Here, we can read in example data of (Damond et al. 2019) taken from patients diagnosed with Type I Diabetes. The example here consists of a CytoImageList object of 100 images, a CytoImageList object of 100 segmentation masks and a SingleCellExperiment object containing 252059 cells. Of note: downloading the images takes quite some time and uses 8GB of memory. 
library(imcdatasets) pancreasImages <- Damond_2019_Pancreas(data_type = "images") pancreasMasks <- Damond_2019_Pancreas(data_type = "masks") pancreasSCE <- Damond_2019_Pancreas(data_type = "sce") 5.6 Save objects Finally, the generated data objects can be saved for further downstream processing and analysis. saveRDS(spe, "data/spe.rds") saveRDS(images, "data/images.rds") saveRDS(masks, "data/masks.rds") 5.7 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 RColorBrewer_1.1-3 ## [3] dittoSeq_1.12.1 lubridate_1.9.3 ## [5] forcats_1.0.0 dplyr_1.1.3 ## [7] purrr_1.0.2 tidyr_1.3.0 ## [9] tibble_3.2.1 ggplot2_3.4.3 ## [11] tidyverse_2.0.0 stringr_1.5.0 ## [13] readr_2.1.4 cytomapper_1.12.0 ## [15] EBImage_4.42.0 imcRtools_1.6.5 ## [17] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 ## [19] SummarizedExperiment_1.30.2 Biobase_2.60.0 ## [21] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 ## [23] IRanges_2.34.1 S4Vectors_0.38.2 ## [25] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 ## [27] matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] later_1.3.1 bitops_1.0-7 ## [3] R.oo_1.25.0 svgPanZoom_0.3.4 ## [5] polyclip_1.10-6 lifecycle_1.0.3 ## [7] sf_1.0-14 rprojroot_2.0.3 ## [9] 
edgeR_3.42.4 lattice_0.21-8 ## [11] vroom_1.6.3 MASS_7.3-60 ## [13] magrittr_2.0.3 limma_3.56.2 ## [15] sass_0.4.7 rmarkdown_2.25 ## [17] jquerylib_0.1.4 yaml_2.3.7 ## [19] httpuv_1.6.11 sp_2.0-0 ## [21] cowplot_1.1.1 DBI_1.1.3 ## [23] pkgload_1.3.3 abind_1.4-5 ## [25] zlibbioc_1.46.0 R.utils_2.12.2 ## [27] ggraph_2.1.0 RCurl_1.98-1.12 ## [29] tweenr_2.0.2 GenomeInfoDbData_1.2.10 ## [31] ggrepel_0.9.3 RTriangle_1.6-0.12 ## [33] terra_1.7-46 pheatmap_1.0.12 ## [35] units_0.8-4 dqrng_0.3.1 ## [37] svglite_2.1.1 DelayedMatrixStats_1.22.6 ## [39] codetools_0.2-19 DropletUtils_1.20.0 ## [41] DelayedArray_0.26.7 DT_0.29 ## [43] scuttle_1.10.2 ggforce_0.4.1 ## [45] tidyselect_1.2.0 raster_3.6-23 ## [47] farver_2.1.1 viridis_0.6.4 ## [49] jsonlite_1.8.7 BiocNeighbors_1.18.0 ## [51] e1071_1.7-13 ellipsis_0.3.2 ## [53] tidygraph_1.2.3 ggridges_0.5.4 ## [55] systemfonts_1.0.4 tools_4.3.1 ## [57] Rcpp_1.0.11 glue_1.6.2 ## [59] gridExtra_2.3 xfun_0.40 ## [61] HDF5Array_1.28.1 shinydashboard_0.7.2 ## [63] withr_2.5.1 fastmap_1.1.1 ## [65] rhdf5filters_1.12.1 fansi_1.0.4 ## [67] digest_0.6.33 timechange_0.2.0 ## [69] R6_2.5.1 mime_0.12 ## [71] colorspace_2.1-0 jpeg_0.1-10 ## [73] R.methodsS3_1.8.2 utf8_1.2.3 ## [75] generics_0.1.3 data.table_1.14.8 ## [77] class_7.3-22 graphlayouts_1.0.1 ## [79] htmlwidgets_1.6.2 S4Arrays_1.0.6 ## [81] pkgconfig_2.0.3 gtable_0.3.4 ## [83] XVector_0.40.0 brio_1.1.3 ## [85] htmltools_0.5.6 bookdown_0.35 ## [87] fftwtools_0.9-11 scales_1.2.1 ## [89] png_0.1-8 knitr_1.44 ## [91] rstudioapi_0.15.0 tzdb_0.4.0 ## [93] rjson_0.2.21 proxy_0.4-27 ## [95] cachem_1.0.8 rhdf5_2.44.0 ## [97] KernSmooth_2.23-21 parallel_4.3.1 ## [99] vipor_0.4.5 concaveman_1.1.0 ## [101] desc_1.4.2 pillar_1.9.0 ## [103] grid_4.3.1 vctrs_0.6.3 ## [105] promises_1.2.1 distances_0.1.9 ## [107] beachmat_2.16.0 xtable_1.8-4 ## [109] archive_1.1.6 beeswarm_0.4.0 ## [111] evaluate_0.21 magick_2.8.0 ## [113] cli_3.6.1 locfit_1.5-9.8 ## [115] compiler_4.3.1 rlang_1.1.1 ## [117] 
crayon_1.5.2 labeling_0.4.3 ## [119] classInt_0.4-10 ggbeeswarm_0.7.2 ## [121] stringi_1.7.12 viridisLite_0.4.2 ## [123] BiocParallel_1.34.2 nnls_1.5 ## [125] munsell_0.5.0 tiff_0.1-11 ## [127] Matrix_1.6-1.1 hms_1.1.3 ## [129] sparseMatrixStats_1.12.2 bit64_4.0.5 ## [131] Rhdf5lib_1.22.1 shiny_1.7.5 ## [133] igraph_1.5.1 bslib_0.5.1 ## [135] bit_4.0.5 References "],["spillover-correction.html", "6 Spillover correction 6.1 Generate the spillover matrix 6.2 Single-cell data compensation 6.3 Image compensation 6.4 Write out compensated images 6.5 Save objects 6.6 Session Info", " 6 Spillover correction Original scripts: Vito Zanotelli, adapted/maintained by: Nils Eling This section highlights how to generate a spillover matrix from individually acquired single metal spots on an agarose slide. Each spot needs to be imaged as its own acquisition/ROI and individual TXT files containing the pixel intensities per spot need to be available. For complete details on the spillover correction approach, please refer to the original publication (Chevrier et al. 2017). Spillover slide preparation: Prepare 2% agarose in double distilled H\\(_2\\)O in a beaker and melt it in a microwave until well dissolved. Dip a blank superfrost plus glass microscope slide into the agarose and submerge it until the label. Remove the slide and prop it up against a support to allow the excess agarose to run off onto paper towels. Allow the slide to dry completely (at least 30 minutes). Retrieve all the antibody conjugates used in the panel for which the spillover matrix is to be generated and place them on ice. Arrange them in a known order (e.g., mass of the conjugated metal). Pipette 0.3 µl spots of 0.4% trypan blue dye into an array on the slide. Prepare one spot per antibody, and make sure the spots are well separated. Pipette 0.3 µl of each antibody conjugate (usually at 0.5 mg/ml) onto a unique blue spot, taking care to avoid different antibodies bleeding into each other. 
Note the exact location of each conjugate on the slide. Let the spots dry completely, at least 1 hour. Spillover slide acquisition: Create a JPEG or PNG image of the slide using a mobile phone camera or flat-bed scanner. In the CyTOF software, create a new file and import the slide image into it. Create a panorama across all the spots to visualize their locations. Within each spot, create a region of interest (ROI) with a width of 200 pixels and a height of 10 pixels. Name each ROI with the mass and name of the metal conjugate contained in the spot, e.g “Ir193” or “Ho165”. This will be how each TXT file is named. Set the profiling type of each ROI to “Local”. Apply the antibody panel to all the ROIs. This panel should contain all (or more) of the isotopes in the panel, with the correct metal specified. For example: if the metal used is Barium 138, make sure this, rather than Lanthanum 138, is selected. Save the file, make sure “Generate Text File” is selected, and start the acquisition. This procedure will generate an MCD file similar to the one available on zenodo: 10.5281/zenodo.5949115 The original code of the spillover correction manuscript is available on Github here; however, due to changes in the CATALYST package, users were not able to reproduce the analysis using the newest software versions. The following workflow uses the newest package versions to generate a spillover matrix and perform spillover correction. In brief, the highlighted workflow comprises 9 steps: Reading in the data Quality control (Optional) pixel binning “Debarcoding” for pixel assignment Pixel selection for spillover matrix estimation Spillover matrix generation Saving the results Single-cell compensation Image compensation 6.1 Generate the spillover matrix In the first step, we will generate a spillover matrix based on the single-metal spots and save it for later use. 6.1.1 Read in the data Here, we will read in the individual TXT files into a SingleCellExperiment object. 
This object can be used directly by the CATALYST package to estimate the spillover. For this to work, the TXT file names need to contain the spotted metal isotope name. By default, the first occurrence of the isotope in the format (mt)(mass) (e.g. Sm152 for Samarium isotope with the atomic mass 152) will be used as spot identifier. Alternatively, a named list of already read-in pixel intensities can be provided. For more information, please refer to the man page ?readSCEfromTXT. For further downstream analysis, we will asinh-transform the data using a cofactor of 5; a common transformation for CyTOF data (Bendall et al. 2011). As the pixel intensities are larger than the cell intensities, the cofactor here is larger than the cofactor when transforming the mean cell intensities. library(imcRtools) # Create SingleCellExperiment from TXT files sce <- readSCEfromTXT("data/compensation/") ## Spotted channels: Y89, In113, In115, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176 ## Acquired channels: Ar80, Y89, In113, In115, Xe131, Xe134, Ba136, La138, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176, Ir191, Ir193, Pt196, Pb206 ## Channels spotted but not acquired: ## Channels acquired but not spotted: Ar80, Xe131, Xe134, Ba136, La138, Ir191, Ir193, Pt196, Pb206 assay(sce, "exprs") <- asinh(counts(sce)/5) 6.1.2 Quality control In the next step, we will observe the median pixel intensities per spot and threshold on medians < 200 counts. These types of visualization serve two purposes: Small median pixel intensities (< 200 counts) might hinder the robust estimation of the channel spillover. 
In that case, consecutive pixels can be summed (see Optional pixel binning). Each spotted metal (row) should show the highest median pixel intensity in its corresponding channel (column). If this is not the case, either the naming of the TXT files was incorrect or the incorrect metal was spotted. # Log10 median pixel counts per spot and channel plotSpotHeatmap(sce) # Thresholded on 200 pixel counts plotSpotHeatmap(sce, log = FALSE, threshold = 200) As we can see, nearly all median pixel intensities are > 200 counts for each spot. We also observe acquired channels for which no spot was placed (e.g., Xe134, Ir191, Ir193). 6.1.3 Optional pixel binning In cases where median pixel intensities are low (< 200 counts), consecutive pixels can be summed to increase the robustness of the spillover estimation. The imcRtools package provides the binAcrossPixels function, which performs aggregation for each channel across bin_size consecutive pixels per spotted metal. # Define grouping bin_size = 10 sce2 <- binAcrossPixels(sce, bin_size = bin_size) # Log10 median pixel counts per spot and channel plotSpotHeatmap(sce2) # Thresholded on 200 pixel counts plotSpotHeatmap(sce2, log = FALSE, threshold = 200) Here, we can see an increase in the median pixel intensities and accumulation of off-diagonal signal. Due to already high original pixel intensities, we will refrain from aggregating across consecutive pixels for this demonstration. 6.1.4 Filtering incorrectly assigned pixels The following step uses functions provided by the CATALYST package to “debarcode” the pixels. Based on the intensity distribution of all channels, pixels are assigned to their corresponding barcode; here this is the already known metal spot. This procedure serves the purpose to identify pixels that cannot be robustly assigned to the spotted metal. Pixels of such kind can be regarded as “noisy”, “background” or “artefacts” that should be removed prior to spillover estimation. 
We will also need to specify which channels were spotted (argument bc_key). This information is directly contained in the colData(sce) slot. To facilitate visualization, we will order the bc_key by mass. The general workflow for pixel debarcoding is as follows: (1) assign a preliminary metal mass to each pixel; (2) for each pixel, estimate a cutoff parameter for the distance between positive and negative pixel sets; (3) apply the estimated cutoffs to identify truly positive pixels. library(CATALYST) bc_key <- as.numeric(unique(sce$sample_mass)) bc_key <- bc_key[order(bc_key)] sce <- assignPrelim(sce, bc_key = bc_key) sce <- estCutoffs(sce) sce <- applyCutoffs(sce) The obtained SingleCellExperiment now contains the additional bc_id entry. For each pixel, this vector indicates the assigned mass (e.g., 161) or 0, meaning unassigned. This information can be visualized in the form of a heatmap: library(pheatmap) cur_table <- table(sce$bc_id, sce$sample_mass) # Visualize the correctly and incorrectly assigned pixels pheatmap(log10(cur_table + 1), cluster_rows = FALSE, cluster_cols = FALSE) # Compute the fraction of unassigned pixels per spot cur_table["0",] / colSums(cur_table) ## 113 115 141 142 143 144 145 146 147 148 149 ## 0.1985 0.1060 0.2575 0.3195 0.3190 0.3825 0.3545 0.4280 0.3570 0.4770 0.4200 ## 150 151 152 153 154 155 156 158 159 160 161 ## 0.4120 0.4025 0.4050 0.4630 0.4190 0.4610 0.3525 0.4020 0.4655 0.4250 0.5595 ## 162 163 164 165 166 167 168 169 170 171 172 ## 0.4340 0.4230 0.4390 0.4055 0.5210 0.3900 0.3285 0.3680 0.5015 0.4900 0.5650 ## 173 174 175 176 89 ## 0.3125 0.4605 0.4710 0.2845 0.3015 We can see here that all assigned pixels match the spotted mass and that all pixel sets are made up of > 800 pixels.
However, in cases where incorrect assignment occurred or where few pixels were measured for some spots, the imcRtools package exports a simple helper function to exclude pixels based on these criteria: sce <- filterPixels(sce, minevents = 40, correct_pixels = TRUE) In the filterPixels function, the minevents parameter specifies the threshold under which correctly assigned pixel sets are excluded from spillover estimation. The correct_pixels parameter indicates whether pixels that were assigned to masses other than the spotted mass should be excluded from spillover estimation. The default values often result in sufficient pixel filtering; however, if very few pixels (~100) are measured per spot, the minevents parameter value needs to be lowered. 6.1.5 Compute spillover matrix Based on the single-positive pixels, we use the CATALYST::computeSpillmat() function to compute the spillover matrix and CATALYST::plotSpillmat() to visualize it. The plotSpillmat function checks the spotted and acquired metal isotopes against a pre-defined CATALYST::isotope_list(). In this data, the Ar80 channel was additionally acquired to check for deviations in signal intensity. Ar80 needs to be added to a custom isotope_list object for visualization. sce <- computeSpillmat(sce) isotope_list <- CATALYST::isotope_list isotope_list$Ar <- 80 plotSpillmat(sce, isotope_list = isotope_list) ## Warning: The `guide` argument in `scale_*()` cannot be `FALSE`. This was deprecated in ## ggplot2 3.3.4. ## ℹ Please use "none" instead. ## ℹ The deprecated feature was likely used in the CATALYST package. ## Please report the issue at <https://github.com/HelenaLC/CATALYST/issues>. ## This warning is displayed once every 8 hours. ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. 
# Save spillover matrix in variable sm <- metadata(sce)$spillover_matrix Of note: the visualization of the spillover matrix using CATALYST currently does not display spillover between the higher-mass channels. In this case, the spillover matrix is clipped at Yb171. As we can see, the largest spillover appears in In113 --> In115 and we also observe the +16 oxide impurities, e.g., Nd148 --> Dy164. We can save the spillover matrix for external use. write.csv(sm, "data/sm.csv") 6.2 Single-cell data compensation The CATALYST package can be used to perform spillover compensation on the single-cell mean intensities. Here, the SpatialExperiment object generated in Section 5 is read in. The CATALYST package requires an entry to rowData(spe)$channel_name for the compCytof function to run. This entry should contain the metal isotopes in the form (mt)(mass)Di (e.g., Sm152Di for Samarium isotope with the atomic mass 152). The compCytof function performs channel spillover compensation on the mean pixel intensities per channel and cell. Here, we will not overwrite the assays in the SpatialExperiment object to later highlight the effect of compensation. As in Section 5, the compensated counts are also asinh-transformed using a cofactor of 1. spe <- readRDS("data/spe.rds") rowData(spe)$channel_name <- paste0(rowData(spe)$channel, "Di") spe <- compCytof(spe, sm, transform = TRUE, cofactor = 1, isotope_list = isotope_list, overwrite = FALSE) To check the effect of channel spillover compensation, the expression of markers that are affected by spillover (e.g., E-cadherin in channel Yb173 and CD303 in channel Yb174) can be visualized in the form of scatter plots before and after compensation.
library(dittoSeq) library(patchwork) before <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303", assay.x = "exprs", assay.y = "exprs") + ggtitle("Before compensation") after <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303", assay.x = "compexprs", assay.y = "compexprs") + ggtitle("After compensation") before + after We observe that the spillover Yb173 –> Yb174 was successfully corrected. To facilitate further downstream analysis, the non-compensated assays can now be replaced by their compensated counterparts: assay(spe, "counts") <- assay(spe, "compcounts") assay(spe, "exprs") <- assay(spe, "compexprs") assay(spe, "compcounts") <- assay(spe, "compexprs") <- NULL 6.3 Image compensation The cytomapper package allows channel spillover compensation directly on multi-channel images. The compImage function takes a CytoImageList object and the estimated spillover matrix as input. More info on how to work with CytoImageList objects can be seen in Section 11. At this point, we can read in the CytoImageList object containing multi-channel images as generated in Section 5. The channelNames need to be set according to their metal isotope in the form (mt)(mass)Di and therefore match colnames(sm). library(cytomapper) images <- readRDS("data/images.rds") channelNames(images) <- rowData(spe)$channel_name The CATALYST package provides the adaptSpillmat function that corrects the spillover matrix in a way that rows and columns match a predefined set of metals. Please refer to ?compCytof for more information how metals in the spillover matrix are matched to acquired channels in the SingleCellExperiment object. The spillover matrix can now be adapted to exclude channels that were not kept for downstream analysis. adapted_sm <- adaptSpillmat(sm, channelNames(images), isotope_list = isotope_list) ## Compensation is likely to be inaccurate. 
## Spill values for the following interactions ## have not been estimated: ## Ir191Di -> Ir193Di ## Ir193Di -> Ir191Di The adapted spillover matrix now matches the channelNames of the CytoImageList object and can be used to perform pixel-level spillover compensation. Here, we parallelise the image compensation across all available cores minus 2. When working on Windows, you will need to use the SnowParam function instead of MulticoreParam. library(BiocParallel) images_comp <- compImage(images, adapted_sm, BPPARAM = MulticoreParam()) As a sanity check, we will visualize the image before and after compensation: # Before compensation plotPixels(images[5], colour_by = "Yb173Di", image_title = list(text = "Yb173 (Ecad) - before", position = "topleft"), legend = NULL, bcg = list(Yb173Di = c(0, 4, 1))) plotPixels(images[5], colour_by = "Yb174Di", image_title = list(text = "Yb174 (CD303) - before", position = "topleft"), legend = NULL, bcg = list(Yb174Di = c(0, 4, 1))) # After compensation plotPixels(images_comp[5], colour_by = "Yb173Di", image_title = list(text = "Yb173 (Ecad) - after", position = "topleft"), legend = NULL, bcg = list(Yb173Di = c(0, 4, 1))) plotPixels(images_comp[5], colour_by = "Yb174Di", image_title = list(text = "Yb174 (CD303) - after", position = "topleft"), legend = NULL, bcg = list(Yb174Di = c(0, 4, 1))) For convenience, we will re-set the channelNames to their biological targets: channelNames(images_comp) <- rownames(spe) 6.4 Write out compensated images In the final step, the compensated images are written out as 16-bit TIFF files: library(tiff) dir.create("data/comp_img") lapply(names(images_comp), function(x){ writeImage(as.array(images_comp[[x]])/(2^16 - 1), paste0("data/comp_img/", x, ".tiff"), bits.per.sample = 16) }) 6.5 Save objects For further downstream analysis, the compensated SpatialExperiment and CytoImageList objects are saved, replacing the former objects: saveRDS(spe, "data/spe.rds") saveRDS(images_comp, "data/images.rds") 6.6 Session
Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 tiff_0.1-11 ## [3] BiocParallel_1.34.2 cytomapper_1.12.0 ## [5] EBImage_4.42.0 patchwork_1.1.3 ## [7] dittoSeq_1.12.1 ggplot2_3.4.3 ## [9] pheatmap_1.0.12 CATALYST_1.24.0 ## [11] imcRtools_1.6.5 SpatialExperiment_1.10.0 ## [13] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 ## [15] Biobase_2.60.0 GenomicRanges_1.52.0 ## [17] GenomeInfoDb_1.36.3 IRanges_2.34.1 ## [19] S4Vectors_0.38.2 BiocGenerics_0.46.0 ## [21] MatrixGenerics_1.12.3 matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] bitops_1.0-7 sf_1.0-14 ## [3] RColorBrewer_1.1-3 doParallel_1.0.17 ## [5] tools_4.3.1 backports_1.4.1 ## [7] utf8_1.2.3 R6_2.5.1 ## [9] DT_0.29 HDF5Array_1.28.1 ## [11] rhdf5filters_1.12.1 GetoptLong_1.0.5 ## [13] withr_2.5.1 sp_2.0-0 ## [15] gridExtra_2.3 cli_3.6.1 ## [17] archive_1.1.6 sandwich_3.0-2 ## [19] labeling_0.4.3 sass_0.4.7 ## [21] nnls_1.5 mvtnorm_1.2-3 ## [23] readr_2.1.4 proxy_0.4-27 ## [25] ggridges_0.5.4 systemfonts_1.0.4 ## [27] colorRamps_2.3.1 svglite_2.1.1 ## [29] R.utils_2.12.2 scater_1.28.0 ## [31] plotrix_3.8-2 limma_3.56.2 ## [33] flowCore_2.12.2 rstudioapi_0.15.0 ## [35] generics_0.1.3 shape_1.4.6 ## [37] 
gtools_3.9.4 vroom_1.6.3 ## [39] car_3.1-2 dplyr_1.1.3 ## [41] Matrix_1.6-1.1 RProtoBufLib_2.12.1 ## [43] ggbeeswarm_0.7.2 fansi_1.0.4 ## [45] abind_1.4-5 R.methodsS3_1.8.2 ## [47] terra_1.7-46 lifecycle_1.0.3 ## [49] multcomp_1.4-25 yaml_2.3.7 ## [51] edgeR_3.42.4 carData_3.0-5 ## [53] rhdf5_2.44.0 Rtsne_0.16 ## [55] grid_4.3.1 promises_1.2.1 ## [57] dqrng_0.3.1 crayon_1.5.2 ## [59] shinydashboard_0.7.2 lattice_0.21-8 ## [61] beachmat_2.16.0 cowplot_1.1.1 ## [63] magick_2.8.0 pillar_1.9.0 ## [65] knitr_1.44 ComplexHeatmap_2.16.0 ## [67] RTriangle_1.6-0.12 rjson_0.2.21 ## [69] codetools_0.2-19 glue_1.6.2 ## [71] data.table_1.14.8 vctrs_0.6.3 ## [73] png_0.1-8 gtable_0.3.4 ## [75] cachem_1.0.8 xfun_0.40 ## [77] S4Arrays_1.0.6 mime_0.12 ## [79] DropletUtils_1.20.0 tidygraph_1.2.3 ## [81] ConsensusClusterPlus_1.64.0 survival_3.5-5 ## [83] iterators_1.0.14 cytolib_2.12.1 ## [85] units_0.8-4 ellipsis_0.3.2 ## [87] TH.data_1.1-2 bit64_4.0.5 ## [89] rprojroot_2.0.3 bslib_0.5.1 ## [91] irlba_2.3.5.1 svgPanZoom_0.3.4 ## [93] vipor_0.4.5 KernSmooth_2.23-21 ## [95] colorspace_2.1-0 DBI_1.1.3 ## [97] raster_3.6-23 tidyselect_1.2.0 ## [99] bit_4.0.5 compiler_4.3.1 ## [101] BiocNeighbors_1.18.0 desc_1.4.2 ## [103] DelayedArray_0.26.7 bookdown_0.35 ## [105] scales_1.2.1 classInt_0.4-10 ## [107] distances_0.1.9 stringr_1.5.0 ## [109] digest_0.6.33 fftwtools_0.9-11 ## [111] rmarkdown_2.25 XVector_0.40.0 ## [113] htmltools_0.5.6 pkgconfig_2.0.3 ## [115] jpeg_0.1-10 sparseMatrixStats_1.12.2 ## [117] fastmap_1.1.1 rlang_1.1.1 ## [119] GlobalOptions_0.1.2 htmlwidgets_1.6.2 ## [121] shiny_1.7.5 DelayedMatrixStats_1.22.6 ## [123] farver_2.1.1 jquerylib_0.1.4 ## [125] zoo_1.8-12 jsonlite_1.8.7 ## [127] R.oo_1.25.0 BiocSingular_1.16.0 ## [129] RCurl_1.98-1.12 magrittr_2.0.3 ## [131] scuttle_1.10.2 GenomeInfoDbData_1.2.10 ## [133] Rhdf5lib_1.22.1 munsell_0.5.0 ## [135] Rcpp_1.0.11 ggnewscale_0.4.9 ## [137] viridis_0.6.4 stringi_1.7.12 ## [139] ggraph_2.1.0 brio_1.1.3 ## [141] 
zlibbioc_1.46.0 MASS_7.3-60 ## [143] plyr_1.8.8 parallel_4.3.1 ## [145] ggrepel_0.9.3 graphlayouts_1.0.1 ## [147] splines_4.3.1 hms_1.1.3 ## [149] circlize_0.4.15 locfit_1.5-9.8 ## [151] igraph_1.5.1 ggpubr_0.6.0 ## [153] ggsignif_0.6.4 pkgload_1.3.3 ## [155] ScaledMatrix_1.8.1 reshape2_1.4.4 ## [157] XML_3.99-0.14 drc_3.0-1 ## [159] evaluate_0.21 tzdb_0.4.0 ## [161] foreach_1.5.2 tweenr_2.0.2 ## [163] httpuv_1.6.11 tidyr_1.3.0 ## [165] purrr_1.0.2 polyclip_1.10-6 ## [167] clue_0.3-65 ggforce_0.4.1 ## [169] rsvd_1.0.5 broom_1.0.5 ## [171] xtable_1.8-4 e1071_1.7-13 ## [173] rstatix_0.7.2 later_1.3.1 ## [175] viridisLite_0.4.2 class_7.3-22 ## [177] tibble_3.2.1 FlowSOM_2.8.0 ## [179] beeswarm_0.4.0 cluster_2.1.4 ## [181] concaveman_1.1.0 References "],["image-and-cell-level-quality-control.html", "7 Image and cell-level quality control 7.1 Read in the data 7.2 Segmentation quality control 7.3 Image-level quality control 7.4 Cell-level quality control 7.5 Save objects 7.6 Session Info", " 7 Image and cell-level quality control The following section discusses possible quality indicators for data obtained by IMC and other highly multiplexed imaging technologies. Here, we will focus on describing quality metrics on the single-cell as well as image level. 7.1 Read in the data We will first read in the data processed in previous sections: images <- readRDS("data/images.rds") masks <- readRDS("data/masks.rds") spe <- readRDS("data/spe.rds") 7.2 Segmentation quality control The first step after image segmentation is to observe its accuracy. Without having ground-truth data readily available, a common approach to segmentation quality control is to overlay segmentation masks on composite images displaying channels that were used for segmentation. The cytomapper package supports exactly this task by using the plotPixels function.
Here, we select 3 random images and perform image- and channel-wise normalization (channels are first min-max normalized and scaled to a range of 0-1 before clipping the maximum intensity to 0.2). library(cytomapper) set.seed(20220118) img_ids <- sample(seq_along(images), 3) # Normalize and clip images cur_images <- images[img_ids] cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE) cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2)) plotPixels(cur_images, mask = masks[img_ids], img_id = "sample_id", missing_colour = "white", colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"), colour = list(CD163 = c("black", "yellow"), CD20 = c("black", "red"), CD3 = c("black", "green"), Ecad = c("black", "cyan"), DNA1 = c("black", "blue")), image_title = NULL, legend = list(colour_by.title.cex = 0.7, colour_by.labels.cex = 0.7)) We can see that nuclei are centered within the segmentation masks and all cell types are correctly segmented (note: to zoom into the image you can right-click and select Open Image in New Tab). A common challenge here is to segment large cells (e.g., epithelial cells, in cyan) versus small cells (e.g., B cells, in red). However, the segmentation approach here appears to correctly segment cells across different sizes. An easier and interactive way of observing segmentation quality is to use the interactive image viewer provided by the cytoviewer R/Bioconductor package (Meyer, Eling, and Bodenmiller 2023). Under “Image-level” > “Basic controls”, up to six markers can be selected for visualization. The contrast of each marker can be adjusted. Under “Image-level” > “Advanced controls”, click the “Show cell outlines” box to outline segmented cells on the images.
library(cytoviewer) app <- cytoviewer(image = images, mask = masks, cell_id = "ObjectNumber", img_id = "sample_id") if (interactive()) { shiny::runApp(app, launch.browser = TRUE) } An additional approach to observe cell segmentation quality and potentially also antibody specificity issues is to visualize single-cell expression in form of a heatmap. Here, we sub-sample the dataset to 2000 cells for visualization purposes and overlay the cancer type from which the cells were extracted. library(dittoSeq) library(viridis) cur_cells <- sample(seq_len(ncol(spe)), 2000) dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", cluster_cols = TRUE, scale = "none", heatmap.colors = viridis(100), annot.by = "indication", annotation_colors = list(indication = metadata(spe)$color_vectors$indication)) We can differentiate between epithelial cells (Ecad+) and immune cells (CD45RO+). Some of the markers are detected in specific cells (e.g., Ki67, CD20, Ecad) while others are more broadly expressed across cells (e.g., HLADR, B2M, CD4). 7.3 Image-level quality control Image-level quality control is often performed using tools that offer a graphical user interface such as QuPath, FIJI and the previously mentioned cytoviewer package. Viewers that were specifically developed for IMC data can be seen here. In this section, we will specifically focus on quantitative metrics to assess image quality. It is often of interest to calculate the signal-to-noise ratio (SNR) for individual channels and markers. Here, we define the SNR as: \\[SNR = I_s/I_n\\] where \\(I_s\\) is the intensity of the signal (mean intensity of pixels with true signal) and \\(I_n\\) is the intensity of the noise (mean intensity of pixels containing noise). This definition of the SNR is just one of many and other measures can be applied. Finding a threshold that separates pixels containing signal and pixels containing noise is not trivial and different approaches can be chosen. 
Here, we use the Otsu thresholding approach to find pixels of the “foreground” (i.e., signal) and “background” (i.e., noise). The SNR is then defined as the mean intensity of foreground pixels divided by the mean intensity of background pixels. We compute this measure as well as the mean signal intensity per image. The plot below shows the average SNR versus the average signal intensity across all images. library(tidyverse) library(ggrepel) library(EBImage) cur_snr <- lapply(names(images), function(x){ img <- images[[x]] mat <- apply(img, 3, function(ch){ # Otsu threshold thres <- otsu(ch, range = c(min(ch), max(ch)), levels = 65536) # Signal-to-noise ratio snr <- mean(ch[ch > thres]) / mean(ch[ch <= thres]) # Signal intensity ps <- mean(ch[ch > thres]) return(c(snr = snr, ps = ps)) }) t(mat) %>% as.data.frame() %>% mutate(image = x, marker = colnames(mat)) %>% pivot_longer(cols = c(snr, ps)) }) cur_snr <- do.call(rbind, cur_snr) cur_snr %>% group_by(marker, name) %>% summarize(log_mean = log2(mean(value))) %>% pivot_wider(names_from = name, values_from = log_mean) %>% ggplot() + geom_point(aes(ps, snr)) + geom_label_repel(aes(ps, snr, label = marker)) + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + xlab("Signal intensity [log2]") We observe PD1, LAG3 and cleaved PARP to have high SNR but low signal intensity, meaning that in general these markers are not abundantly expressed. The Iridium intercalator (here marked as DNA1 and DNA2) has the highest signal intensity but low SNR. This might be due to staining differences between individual nuclei where some nuclei are considered as background. We do however observe high SNR and sufficient signal intensity for the majority of markers. Otsu thresholding and SNR calculation do not perform well if the markers are lowly abundant. In the next code chunk, we will remove markers that have a positive signal below 2 per image.
cur_snr <- cur_snr %>% pivot_wider(names_from = name, values_from = value) %>% filter(ps > 2) %>% pivot_longer(cols = c(snr, ps)) cur_snr %>% group_by(marker, name) %>% summarize(log_mean = log2(mean(value))) %>% pivot_wider(names_from = name, values_from = log_mean) %>% ggplot() + geom_point(aes(ps, snr)) + geom_label_repel(aes(ps, snr, label = marker)) + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + xlab("Signal intensity [log2]") This visualization shows a reduced SNR for PD1, LAG3 and cleaved PARP, which was previously inflated due to low signal. Another quality indicator is the image area covered by cells (or biological tissue). This metric identifies ROIs where few cells are present, possibly hinting at incorrect selection of the ROI. We can compute the percentage of covered image area using the metadata contained in the SpatialExperiment object: cell_density <- colData(spe) %>% as.data.frame() %>% group_by(sample_id) %>% # Compute the number of pixels covered by cells and # the total number of pixels summarize(cell_area = sum(area), no_pixels = mean(width_px) * mean(height_px)) %>% # Divide the number of pixels covered by cells # by the total number of pixels mutate(covered_area = cell_area / no_pixels) # Visualize the image area covered by cells per image ggplot(cell_density) + geom_point(aes(reorder(sample_id,covered_area), covered_area)) + theme_minimal(base_size = 15) + theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 15)) + ylim(c(0, 1)) + ylab("% covered area") + xlab("") We observe that two of the 14 images show unusually low cell coverage. These two images can now be visualized using cytomapper.
# Normalize and clip images cur_images <- images[c("Patient4_005", "Patient4_007")] cur_images <- cytomapper::normalize(cur_images, separateImages = TRUE) cur_images <- cytomapper::normalize(cur_images, inputRange = c(0, 0.2)) plotPixels(cur_images, mask = masks[c("Patient4_005", "Patient4_007")], img_id = "sample_id", missing_colour = "white", colour_by = c("CD163", "CD20", "CD3", "Ecad", "DNA1"), colour = list(CD163 = c("black", "yellow"), CD20 = c("black", "red"), CD3 = c("black", "green"), Ecad = c("black", "cyan"), DNA1 = c("black", "blue")), legend = list(colour_by.title.cex = 0.7, colour_by.labels.cex = 0.7)) These two images display less dense tissue structure but overall the images are intact and appear to be segmented correctly. Finally, it can be beneficial to visualize the mean marker expression per image to identify images with outlying marker expression. This check does not indicate image quality per se but can highlight biological differences. Here, we will use the aggregateAcrossCells function of the scuttle package to compute the mean expression per image. For visualization purposes, we again asinh transform the mean expression values. library(scuttle) image_mean <- aggregateAcrossCells(spe, ids = spe$sample_id, statistics="mean", use.assay.type = "counts") assay(image_mean, "exprs") <- asinh(counts(image_mean)) dittoHeatmap(image_mean, genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", cluster_cols = TRUE, scale = "none", heatmap.colors = viridis(100), annot.by = c("indication", "patient_id", "ROI"), annotation_colors = list(indication = metadata(spe)$color_vectors$indication, patient_id = metadata(spe)$color_vectors$patient_id, ROI = metadata(spe)$color_vectors$ROI), show_colnames = TRUE) We observe extensive biological variation across the 14 images specifically for some of the cell phenotype markers including the macrophage marker CD206, the B cell marker CD20, the neutrophil marker CD15, and the proliferation marker Ki67. 
These differences will be further studied in the following chapters. 7.4 Cell-level quality control In the following paragraphs we will look at different metrics and visualization approaches to assess data quality (as well as biological differences) on the single-cell level. Related to the signal-to-noise ratio (SNR) calculated above on the pixel-level, a similar measure can be derived on the single-cell level. Here, we will use a two component Gaussian mixture model for each marker to find cells with positive and negative expression. The SNR is defined as: \\[SNR = I_s/I_n\\] where \\(I_s\\) is the intensity of the signal (mean intensity of cells with positive signal) and \\(I_n\\) is the intensity of the noise (mean intensity of cells lacking expression). To define cells with positive and negative marker expression, we fit the mixture model across the transformed counts of all cells contained in the SpatialExperiment object. Next, for each marker we calculate the mean of the non-transformed counts for the positive and the negative cells. The SNR is then the ratio between the mean of the positive signal and the mean of the negative signal. library(mclust) set.seed(220224) mat <- sapply(seq_len(nrow(spe)), function(x){ cur_exprs <- assay(spe, "exprs")[x,] cur_counts <- assay(spe, "counts")[x,] cur_model <- Mclust(cur_exprs, G = 2) mean1 <- mean(cur_counts[cur_model$classification == 1]) mean2 <- mean(cur_counts[cur_model$classification == 2]) signal <- ifelse(mean1 > mean2, mean1, mean2) noise <- ifelse(mean1 > mean2, mean2, mean1) return(c(snr = signal/noise, ps = signal)) }) cur_snr <- t(mat) %>% as.data.frame() %>% mutate(marker = rownames(spe)) cur_snr %>% ggplot() + geom_point(aes(log2(ps), log2(snr))) + geom_label_repel(aes(log2(ps), log2(snr), label = marker)) + theme_minimal(base_size = 15) + ylab("Signal-to-noise ratio [log2]") + xlab("Signal intensity [log2]") Next, we observe the distributions of cell size across the individual images. 
Differences in cell size distributions can indicate segmentation biases due to differences in cell density or can indicate biological differences due to cell type compositions (tumor cells tend to be larger than immune cells). dittoPlot(spe, var = "area", group.by = "sample_id", plots = "boxplot") + ylab("Cell area") + xlab("") summary(spe$area) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 3.00 47.00 70.00 76.38 98.00 466.00 The median cell size is 70 pixels with a median major axis length of 11.3. The largest cell has an area of 466 pixels, which corresponds to a diameter of roughly 24.4 pixels assuming a circular shape. Overall, the distribution of cell sizes is similar across images, with images from Patient4_005 and Patient4_007 showing a reduced average cell size. These images contain fewer tumor cells, which can explain the smaller average cell size. We detect very small cells in the dataset and will remove them. The chosen threshold is arbitrary and needs to be adjusted per dataset. sum(spe$area < 5) ## [1] 65 spe <- spe[,spe$area >= 5] Another quality indicator can be an absolute measure of cell density often reported in cells per mm\\(^2\\). cell_density <- colData(spe) %>% as.data.frame() %>% group_by(sample_id) %>% summarize(cell_count = n(), no_pixels = mean(width_px) * mean(height_px)) %>% mutate(cells_per_mm2 = cell_count/(no_pixels/1000000)) ggplot(cell_density) + geom_point(aes(reorder(sample_id,cells_per_mm2), cells_per_mm2)) + theme_minimal(base_size = 15) + theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8)) + ylab("Cells per mm2") + xlab("") The number of cells per mm\\(^2\\) varies across images and also depends on the number of tumor/non-tumor cells. As we will see in the following sections, some immune cells appear in cell dense regions while other stromal regions are less dense. The data presented here originate from samples from different locations with potential differences in pre-processing and each sample was stained individually.
These (and other) technical aspects can induce staining differences between samples or batches of samples. Observing potential staining differences can be crucial to assess data quality. We will use ridgeline visualizations to check differences in staining patterns: multi_dittoPlot(spe, vars = rownames(spe)[rowData(spe)$use_channel], group.by = "patient_id", plots = "ridgeplot", assay = "exprs", color.panel = metadata(spe)$color_vectors$patient_id) We observe variations in the distributions of marker expression across patients. These variations may arise partly from different abundances of cells in different images (e.g., Patient3 may have higher numbers of CD11c+ and PD1+ cells) as well as staining differences between samples. While most of the selected markers are specifically expressed in immune cell subtypes, we can see that E-Cadherin (a marker for epithelial (tumor) cells) shows a similar expression range across all patients. Finally, we will use non-linear dimensionality reduction methods to project cells from a high-dimensional (40) down to a low-dimensional (2) space. For this, the scater package provides the runUMAP and runTSNE functions. To ensure reproducibility, we will need to set a seed; however, different seeds and different parameter settings (e.g., the perplexity parameter in the runTSNE function) need to be tested to avoid over-interpretation of visualization artefacts. For dimensionality reduction, we will use all channels that show biological variation across the dataset. However, marker selection can be performed with different biological questions in mind. Here, both the runUMAP and runTSNE functions are not deterministic, meaning they produce different results across different runs. We therefore set a seed in this chunk for reproducibility purposes.
library(scater) set.seed(220225) spe <- runUMAP(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") spe <- runTSNE(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs") After dimensionality reduction, the low-dimensional embeddings are stored in the reducedDim slot. reducedDims(spe) ## List of length 2 ## names(2): UMAP TSNE head(reducedDim(spe, "UMAP")) ## UMAP1 UMAP2 ## Patient1_001_1 -4.810167 -3.777362 ## Patient1_001_2 -4.397347 -3.456036 ## Patient1_001_3 -4.369883 -3.445561 ## Patient1_001_4 -4.081614 -3.162119 ## Patient1_001_5 -6.234012 -2.433976 ## Patient1_001_6 -5.666597 -3.428058 Visualization of the low-dimensional embedding facilitates assessment of potential “batch effects”. The dittoDimPlot function allows flexible visualization. It returns ggplot objects which can be further modified. library(patchwork) # visualize patient id p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP") p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "TSNE", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on TSNE") # visualize region of interest id p3 <- dittoDimPlot(spe, var = "ROI", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$ROI) + ggtitle("ROI ID on UMAP") p4 <- dittoDimPlot(spe, var = "ROI", reduction.use = "TSNE", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$ROI) + ggtitle("ROI ID on TSNE") # visualize indication p5 <- dittoDimPlot(spe, var = "indication", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$indication) + ggtitle("Indication on UMAP") p6 <- dittoDimPlot(spe, var = "indication", reduction.use = "TSNE", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$indication) + ggtitle("Indication on TSNE") 
(p1 + p2) / (p3 + p4) / (p5 + p6) # visualize marker expression p1 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "UMAP", assay = "exprs", size = 0.2) + scale_color_viridis(name = "Ecad") + ggtitle("E-Cadherin expression on UMAP") p2 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "UMAP", assay = "exprs", size = 0.2) + scale_color_viridis(name = "CD45RO") + ggtitle("CD45RO expression on UMAP") p3 <- dittoDimPlot(spe, var = "Ecad", reduction.use = "TSNE", assay = "exprs", size = 0.2) + scale_color_viridis(name = "Ecad") + ggtitle("Ecad expression on TSNE") p4 <- dittoDimPlot(spe, var = "CD45RO", reduction.use = "TSNE", assay = "exprs", size = 0.2) + scale_color_viridis(name = "CD45RO") + ggtitle("CD45RO expression on TSNE") (p1 + p2) / (p3 + p4) We observe a strong separation of tumor cells (Ecad+ cells) between the patients. Here, each patient was diagnosed with a different tumor type. The separation of tumor cells could be of biological origin, since tumor cells tend to display differences in expression between patients and cancer types, and/or of technical origin: the panel only contains a single tumor marker (E-Cadherin) and therefore slight technical differences in staining cause visible separation between cells of different patients. Nevertheless, the immune compartment (CD45RO+ cells) mixes between patients and we can rule out systematic staining differences between patients. 7.5 Save objects The modified SpatialExperiment object is saved for further downstream analysis.
saveRDS(spe, "data/spe.rds") 7.6 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 patchwork_1.1.3 ## [3] scater_1.28.0 mclust_6.0.0 ## [5] scuttle_1.10.2 ggrepel_0.9.3 ## [7] lubridate_1.9.3 forcats_1.0.0 ## [9] stringr_1.5.0 dplyr_1.1.3 ## [11] purrr_1.0.2 readr_2.1.4 ## [13] tidyr_1.3.0 tibble_3.2.1 ## [15] tidyverse_2.0.0 viridis_0.6.4 ## [17] viridisLite_0.4.2 dittoSeq_1.12.1 ## [19] ggplot2_3.4.3 cytoviewer_1.0.1 ## [21] cytomapper_1.12.0 SingleCellExperiment_1.22.0 ## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0 ## [25] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 ## [27] IRanges_2.34.1 S4Vectors_0.38.2 ## [29] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 ## [31] matrixStats_1.0.0 EBImage_4.42.0 ## ## loaded via a namespace (and not attached): ## [1] RColorBrewer_1.1-3 rstudioapi_0.15.0 ## [3] jsonlite_1.8.7 magrittr_2.0.3 ## [5] ggbeeswarm_0.7.2 magick_2.8.0 ## [7] farver_2.1.1 rmarkdown_2.25 ## [9] zlibbioc_1.46.0 vctrs_0.6.3 ## [11] memoise_2.0.1 DelayedMatrixStats_1.22.6 ## [13] RCurl_1.98-1.12 terra_1.7-46 ## [15] svgPanZoom_0.3.4 htmltools_0.5.6 ## [17] S4Arrays_1.0.6 BiocNeighbors_1.18.0 ## [19] raster_3.6-23 Rhdf5lib_1.22.1 ## [21] rhdf5_2.44.0 sass_0.4.7 ## [23] bslib_0.5.1 
desc_1.4.2 ## [25] htmlwidgets_1.6.2 fontawesome_0.5.2 ## [27] cachem_1.0.8 mime_0.12 ## [29] lifecycle_1.0.3 pkgconfig_2.0.3 ## [31] rsvd_1.0.5 colourpicker_1.3.0 ## [33] Matrix_1.6-1.1 R6_2.5.1 ## [35] fastmap_1.1.1 GenomeInfoDbData_1.2.10 ## [37] shiny_1.7.5 digest_0.6.33 ## [39] colorspace_2.1-0 shinycssloaders_1.0.0 ## [41] rprojroot_2.0.3 irlba_2.3.5.1 ## [43] dqrng_0.3.1 pkgload_1.3.3 ## [45] beachmat_2.16.0 labeling_0.4.3 ## [47] timechange_0.2.0 fansi_1.0.4 ## [49] nnls_1.5 abind_1.4-5 ## [51] compiler_4.3.1 withr_2.5.1 ## [53] tiff_0.1-11 BiocParallel_1.34.2 ## [55] HDF5Array_1.28.1 R.utils_2.12.2 ## [57] DelayedArray_0.26.7 rjson_0.2.21 ## [59] tools_4.3.1 vipor_0.4.5 ## [61] beeswarm_0.4.0 httpuv_1.6.11 ## [63] R.oo_1.25.0 glue_1.6.2 ## [65] rhdf5filters_1.12.1 promises_1.2.1 ## [67] grid_4.3.1 Rtsne_0.16 ## [69] generics_0.1.3 gtable_0.3.4 ## [71] tzdb_0.4.0 R.methodsS3_1.8.2 ## [73] hms_1.1.3 ScaledMatrix_1.8.1 ## [75] BiocSingular_1.16.0 sp_2.0-0 ## [77] utf8_1.2.3 XVector_0.40.0 ## [79] RcppAnnoy_0.0.21 pillar_1.9.0 ## [81] limma_3.56.2 later_1.3.1 ## [83] lattice_0.21-8 tidyselect_1.2.0 ## [85] locfit_1.5-9.8 miniUI_0.1.1.1 ## [87] knitr_1.44 gridExtra_2.3 ## [89] bookdown_0.35 edgeR_3.42.4 ## [91] svglite_2.1.1 xfun_0.40 ## [93] shinydashboard_0.7.2 brio_1.1.3 ## [95] DropletUtils_1.20.0 pheatmap_1.0.12 ## [97] stringi_1.7.12 fftwtools_0.9-11 ## [99] yaml_2.3.7 evaluate_0.21 ## [101] codetools_0.2-19 archive_1.1.6 ## [103] BiocManager_1.30.22 cli_3.6.1 ## [105] uwot_0.1.16 xtable_1.8-4 ## [107] systemfonts_1.0.4 munsell_0.5.0 ## [109] jquerylib_0.1.4 Rcpp_1.0.11 ## [111] png_0.1-8 parallel_4.3.1 ## [113] ellipsis_0.3.2 jpeg_0.1-10 ## [115] sparseMatrixStats_1.12.2 bitops_1.0-7 ## [117] SpatialExperiment_1.10.0 scales_1.2.1 ## [119] ggridges_0.5.4 crayon_1.5.2 ## [121] BiocStyle_2.28.1 rlang_1.1.1 ## [123] cowplot_1.1.1 References "],["batch-effects.html", "8 Batch effect correction 8.1 fastMNN correction 8.2 harmony correction 8.3 Seurat 
correction 8.4 Save objects 8.5 Session Info", " 8 Batch effect correction In Section 7.4 we observed staining/expression differences between the individual samples. These can arise due to technical (e.g., differences in sample processing) as well as biological (e.g., differential expression between patients/indications) effects. However, the combination of these effects hinders cell phenotyping via clustering as highlighted in Section 9.2. To integrate cells across samples, we can use computational strategies developed for correcting batch effects in single-cell RNA sequencing data. In the following sections, we will use functions of the batchelor, harmony and Seurat packages to correct for such batch effects. Of note: the correction approaches presented here aim at removing any differences between samples. This will also remove biological differences between the patients/indications. Nevertheless, integrating cells across samples can facilitate the detection of cell phenotypes via clustering. First, we will read in the SpatialExperiment object containing the single-cell data. spe <- readRDS("data/spe.rds") 8.1 fastMNN correction The batchelor package provides the mnnCorrect and fastMNN functions to correct for differences between samples/batches. Both functions build on finding mutual nearest neighbors (MNN) among the cells of different samples and correct expression differences between the batches (Haghverdi et al. 2018). The mnnCorrect function returns corrected expression counts while the fastMNN function performs the correction in reduced dimension space. As such, fastMNN returns integrated cells in the form of a low dimensional embedding. Paper: Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors Documentation: batchelor 8.1.1 Perform sample correction Here, we apply the fastMNN function to integrate cells between patients.
By setting auto.merge = TRUE the function estimates the best batch merging order by maximizing the number of MNN pairs at each merging step. This is more time consuming than merging sequentially based on how batches appear in the dataset (default). We again select the markers defined in Section 5.2 for sample correction. The function returns a SingleCellExperiment object which contains corrected low-dimensional coordinates for each cell in the reducedDim(out, \"corrected\") slot. This low-dimensional embedding can be further used for clustering and non-linear dimensionality reduction. We check that the order of cells is the same between the input and output object and then transfer the corrected coordinates to the main SpatialExperiment object. library(batchelor) set.seed(220228) out <- fastMNN(spe, batch = spe$patient_id, auto.merge = TRUE, subset.row = rowData(spe)$use_channel, assay.type = "exprs") # Check that order of cells is the same stopifnot(all.equal(colnames(spe), colnames(out))) # Transfer the correction results to the main spe object reducedDim(spe, "fastMNN") <- reducedDim(out, "corrected") The computational time of the fastMNN function call is 2.33 minutes. Of note, the warnings that the fastMNN function produces can be avoided as follows: The following warning can be avoided by setting BSPARAM = BiocSingular::ExactParam() Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, : You're computing too large a percentage of total singular values, use a standard svd instead. The following warning can be avoided by requesting fewer singular values by setting d = 30 In check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) - : more singular values/vectors requested than available 8.1.2 Quality control of correction results The fastMNN function further returns outputs that can be used to assess the quality of the batch correction. The metadata(out)$merge.info entry collects diagnostics for each individual merging step. 
Here, the batch.size and lost.var entries are important. The batch.size entry reports the relative magnitude of the batch effect and the lost.var entry represents the percentage of lost variance per merging step. A large batch.size and low lost.var indicate sufficient batch correction. merge_info <- metadata(out)$merge.info merge_info[,c("left", "right", "batch.size")] ## DataFrame with 3 rows and 3 columns ## left right batch.size ## <List> <List> <numeric> ## 1 Patient4 Patient2 0.381635 ## 2 Patient4,Patient2 Patient1 0.581013 ## 3 Patient4,Patient2,Patient1 Patient3 0.767376 merge_info$lost.var ## Patient1 Patient2 Patient3 Patient4 ## [1,] 0.000000000 0.031154864 0.00000000 0.046198914 ## [2,] 0.043363546 0.009772150 0.00000000 0.011931892 ## [3,] 0.005394755 0.003023119 0.07219394 0.005366304 We observe that Patient4 and Patient2 are most similar with a low batch effect. Merging cells of Patient3 into the combined batch of Patient1, Patient2 and Patient4 resulted in the highest percentage of lost variance and the detection of the largest batch effect. In the next paragraph we will visualize the correction results. 8.1.3 Visualization The simplest option to check if the sample effects were corrected is to use non-linear dimensionality reduction techniques and observe the mixing of cells across samples. We will recompute the UMAP embedding using the corrected low-dimensional coordinates for each cell. library(scater) set.seed(220228) spe <- runUMAP(spe, dimred = "fastMNN", name = "UMAP_mnnCorrected") Next, we visualize the corrected UMAP while overlaying patient IDs.
library(cowplot) library(dittoSeq) library(viridis) # visualize patient id p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP before correction") p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP_mnnCorrected", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP after correction") plot_grid(p1, p2) We observe an imperfect merging of Patient3 into all other samples. This was already seen when displaying the merging information above. We now also visualize the expression of selected markers across all cells before and after batch correction. markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", "SMA", "Ki67") # Before correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) # After correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_mnnCorrected", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) We observe that immune cells across patients are merged after batch correction using fastMNN. However, the tumor cells of different patients still cluster separately. 8.2 harmony correction The harmony algorithm performs batch correction by iteratively clustering and correcting the positions of cells in PCA space (Korsunsky et al. 2019). We will first perform PCA on the asinh-transformed counts and then call the RunHarmony function to perform data integration. 
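As a rough illustration of what "PCA space" means in this context, the following sketch uses base R only and simulated data. It is not the harmony algorithm: it only shows the asinh transformation followed by PCA, and how a sample-specific shift shows up as an offset between per-patient centroids. The `counts` matrix, the `patient` vector and the multiplicative shift applied to patient "B" are hypothetical and not part of the dataset used here.

```r
set.seed(1)

# Hypothetical toy data: 200 cells x 5 markers from two "patients"
n_cells <- 200
counts <- matrix(rexp(n_cells * 5, rate = 0.5), nrow = n_cells,
                 dimnames = list(NULL, paste0("marker", 1:5)))
patient <- rep(c("A", "B"), each = n_cells / 2)

# Simulate a staining difference (assumption): patient B is 1.5x brighter
counts[patient == "B", ] <- counts[patient == "B", ] * 1.5

# asinh-transform the counts and compute principal components
exprs <- asinh(counts)
pca <- prcomp(exprs, center = TRUE, scale. = FALSE)
pcs <- pca$x[, 1:3]

# Mean position of each patient on the first three components;
# a clear offset between rows hints at a sample-specific shift
centroids <- apply(pcs, 2, function(pc) tapply(pc, patient, mean))
centroids
```

Integration methods that operate in reduced dimensions start from coordinates like `pcs` and adjust them so that such offsets shrink.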
Paper: Fast, sensitive and accurate integration of single-cell data with Harmony Documentation: harmony Similar to the fastMNN function, harmony returns the corrected low-dimensional coordinates for each cell. These can be transferred to the reducedDim slot of the original SpatialExperiment object. library(harmony) library(BiocSingular) spe <- runPCA(spe, subset_row = rowData(spe)$use_channel, exprs_values = "exprs", ncomponents = 30, BSPARAM = ExactParam()) set.seed(230616) out <- RunHarmony(spe, group.by.vars = "patient_id") # Check that order of cells is the same stopifnot(all.equal(colnames(spe), colnames(out))) reducedDim(spe, "harmony") <- reducedDim(out, "HARMONY") The computational time of the RunHarmony function call is 1.3 minutes. 8.2.1 Visualization We will now again visualize the cells in low dimensions after UMAP embedding. set.seed(220228) spe <- runUMAP(spe, dimred = "harmony", name = "UMAP_harmony") # visualize patient id p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP before correction") p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP_harmony", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP after correction") plot_grid(p1, p2) And we visualize selected marker expression as defined above.
# Before correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) # After correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_harmony", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) We observe a more aggressive merging of cells from different patients compared to the results after fastMNN correction. Importantly, immune cell and epithelial markers are expressed in distinct regions of the UMAP. 8.3 Seurat correction The Seurat package provides a number of functionalities to analyze single-cell data. As such it also allows the integration of cells across different samples. Conceptually, Seurat performs batch correction similarly to fastMNN by finding mutual nearest neighbors (MNN) in low dimensional space before correcting the expression values of cells (Stuart et al. 2019). Paper: Comprehensive Integration of Single-Cell Data Documentation: Seurat To use Seurat, we will first create a Seurat object from the SpatialExperiment object and add relevant metadata. The object also needs to be split by patient prior to integration. library(Seurat) library(SeuratObject) seurat_obj <- as.Seurat(spe, counts = "counts", data = "exprs") seurat_obj <- AddMetaData(seurat_obj, as.data.frame(colData(spe))) seurat.list <- SplitObject(seurat_obj, split.by = "patient_id") To avoid long run times, we will use an approach that relies on reciprocal PCA instead of canonical correlation analysis for dimensionality reduction and initial alignment. For an extended tutorial on how to use Seurat for data integration, please refer to their vignette. We will first define the features used for integration and perform PCA on cells of each patient individually. 
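The mutual nearest neighbor idea mentioned above can be sketched with base R on simulated data. This is a minimal illustration under stated assumptions, not the implementation used by Seurat or batchelor: the two `batch` matrices, the shift between them and the choice of `k = 3` are hypothetical. A pair of cells counts as an MNN pair when each cell is among the `k` nearest neighbors of the other across batches.

```r
set.seed(42)

# Hypothetical low-dimensional coordinates for two batches
batch1 <- matrix(rnorm(20 * 2), ncol = 2)
batch2 <- matrix(rnorm(15 * 2, mean = 0.5), ncol = 2)
k <- 3

n1 <- nrow(batch1)
n2 <- nrow(batch2)

# Euclidean distances between every batch1 cell (rows) and batch2 cell (columns)
d <- as.matrix(dist(rbind(batch1, batch2)))[seq_len(n1), n1 + seq_len(n2)]

# nn12[i, j]: batch2 cell j is among the k nearest batch2 cells of batch1 cell i
nn12 <- t(apply(d, 1, function(x) rank(x, ties.method = "first") <= k))
# nn21[i, j]: batch1 cell i is among the k nearest batch1 cells of batch2 cell j
nn21 <- apply(d, 2, function(x) rank(x, ties.method = "first") <= k)

# Mutual nearest neighbors satisfy both conditions
mnn <- which(nn12 & nn21, arr.ind = TRUE)
colnames(mnn) <- c("batch1_cell", "batch2_cell")
head(mnn)
```

Increasing `k` in this sketch yields more pairs, which mirrors why a larger number of neighbors leads to stronger integration.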
The FindIntegrationAnchors function detects MNNs between cells of different patients and the IntegrateData function corrects the expression values of cells. We slightly increase the number of neighbors to be considered for MNN detection (the k.anchor parameter). This increases the integration strength. features <- rownames(spe)[rowData(spe)$use_channel] seurat.list <- lapply(X = seurat.list, FUN = function(x) { x <- ScaleData(x, features = features, verbose = FALSE) x <- RunPCA(x, features = features, verbose = FALSE, approx = FALSE) return(x) }) anchors <- FindIntegrationAnchors(object.list = seurat.list, anchor.features = features, reduction = "rpca", k.anchor = 20) combined <- IntegrateData(anchorset = anchors) We now select the integrated assay and perform PCA dimensionality reduction. The cell coordinates in PCA reduced space can then be transferred to the original SpatialExperiment object. Of note: by splitting the object into individual batch-specific objects, the ordering of cells in the integrated object might not match the ordering of cells in the input object. In this case, columns will need to be reordered. Here, we test if the ordering of cells in the integrated Seurat object matches the ordering of cells in the main SpatialExperiment object. DefaultAssay(combined) <- "integrated" combined <- ScaleData(combined, verbose = FALSE) combined <- RunPCA(combined, npcs = 30, verbose = FALSE, approx = FALSE) # Check that order of cells is the same stopifnot(all.equal(colnames(spe), colnames(combined))) reducedDim(spe, "seurat") <- Embeddings(combined, reduction = "pca") The computational time of the Seurat function calls is 4.29 minutes. 8.3.1 Visualization As above, we recompute the UMAP embeddings based on Seurat integrated results and visualize the embedding. set.seed(220228) spe <- runUMAP(spe, dimred = "seurat", name = "UMAP_seurat") Visualize patient IDs. 
# visualize patient id p1 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP before correction") p2 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP_seurat", size = 0.2) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + ggtitle("Patient ID on UMAP after correction") plot_grid(p1, p2) Visualization of marker expression. # Before correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) # After correction plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_seurat", assay = "exprs", size = 0.2, list.out = TRUE) plot_list <- lapply(plot_list, function(x) x + scale_color_viridis()) plot_grid(plotlist = plot_list) Similar to the methods presented above, Seurat integrates immune cells correctly. When visualizing the patient IDs, slight patient-to-patient differences within tumor cells can be detected. Choosing the correct integration approach is challenging without having ground truth cell labels available. It is recommended to compare different techniques and different parameter settings. Please refer to the documentation of the individual tools to become familiar with the possible parameter choices. Furthermore, in the following section, we will discuss clustering and classification approaches in light of expression differences between samples. In general, it appears that MNN-based approaches are more conservative in terms of merging compared to harmony. On the other hand, harmony could well merge cells in a way that regresses out biological signals. 8.4 Save objects The modified SpatialExperiment object is saved for further downstream analysis.
saveRDS(spe, "data/spe.rds") 8.5 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 SeuratObject_4.1.4 ## [3] Seurat_4.4.0 BiocSingular_1.16.0 ## [5] harmony_1.0.1 Rcpp_1.0.11 ## [7] viridis_0.6.4 viridisLite_0.4.2 ## [9] dittoSeq_1.12.1 cowplot_1.1.1 ## [11] scater_1.28.0 ggplot2_3.4.3 ## [13] scuttle_1.10.2 SpatialExperiment_1.10.0 ## [15] batchelor_1.16.0 SingleCellExperiment_1.22.0 ## [17] SummarizedExperiment_1.30.2 Biobase_2.60.0 ## [19] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 ## [21] IRanges_2.34.1 S4Vectors_0.38.2 ## [23] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 ## [25] matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] RcppAnnoy_0.0.21 splines_4.3.1 ## [3] later_1.3.1 bitops_1.0-7 ## [5] tibble_3.2.1 R.oo_1.25.0 ## [7] polyclip_1.10-6 lifecycle_1.0.3 ## [9] rprojroot_2.0.3 edgeR_3.42.4 ## [11] globals_0.16.2 lattice_0.21-8 ## [13] MASS_7.3-60 magrittr_2.0.3 ## [15] limma_3.56.2 plotly_4.10.2 ## [17] sass_0.4.7 rmarkdown_2.25 ## [19] jquerylib_0.1.4 yaml_2.3.7 ## [21] httpuv_1.6.11 sctransform_0.4.0 ## [23] spatstat.sparse_3.0-2 sp_2.0-0 ## [25] reticulate_1.32.0 pbapply_1.7-2 ## [27] RColorBrewer_1.1-3 ResidualMatrix_1.10.0 ## [29] pkgload_1.3.3 abind_1.4-5 ## 
[31] zlibbioc_1.46.0 Rtsne_0.16 ## [33] purrr_1.0.2 R.utils_2.12.2 ## [35] RCurl_1.98-1.12 GenomeInfoDbData_1.2.10 ## [37] ggrepel_0.9.3 irlba_2.3.5.1 ## [39] spatstat.utils_3.0-3 listenv_0.9.0 ## [41] pheatmap_1.0.12 goftest_1.2-3 ## [43] spatstat.random_3.1-6 dqrng_0.3.1 ## [45] fitdistrplus_1.1-11 parallelly_1.36.0 ## [47] DelayedMatrixStats_1.22.6 leiden_0.4.3 ## [49] codetools_0.2-19 DropletUtils_1.20.0 ## [51] DelayedArray_0.26.7 tidyselect_1.2.0 ## [53] farver_2.1.1 ScaledMatrix_1.8.1 ## [55] spatstat.explore_3.2-3 jsonlite_1.8.7 ## [57] BiocNeighbors_1.18.0 ellipsis_0.3.2 ## [59] progressr_0.14.0 ggridges_0.5.4 ## [61] survival_3.5-5 tools_4.3.1 ## [63] ica_1.0-3 glue_1.6.2 ## [65] gridExtra_2.3 xfun_0.40 ## [67] dplyr_1.1.3 HDF5Array_1.28.1 ## [69] withr_2.5.1 fastmap_1.1.1 ## [71] rhdf5filters_1.12.1 fansi_1.0.4 ## [73] digest_0.6.33 rsvd_1.0.5 ## [75] R6_2.5.1 mime_0.12 ## [77] colorspace_2.1-0 scattermore_1.2 ## [79] tensor_1.5 spatstat.data_3.0-1 ## [81] R.methodsS3_1.8.2 RhpcBLASctl_0.23-42 ## [83] utf8_1.2.3 tidyr_1.3.0 ## [85] generics_0.1.3 data.table_1.14.8 ## [87] httr_1.4.7 htmlwidgets_1.6.2 ## [89] S4Arrays_1.0.6 uwot_0.1.16 ## [91] pkgconfig_2.0.3 gtable_0.3.4 ## [93] lmtest_0.9-40 XVector_0.40.0 ## [95] brio_1.1.3 htmltools_0.5.6 ## [97] bookdown_0.35 scales_1.2.1 ## [99] png_0.1-8 knitr_1.44 ## [101] rstudioapi_0.15.0 reshape2_1.4.4 ## [103] rjson_0.2.21 nlme_3.1-162 ## [105] cachem_1.0.8 zoo_1.8-12 ## [107] rhdf5_2.44.0 stringr_1.5.0 ## [109] KernSmooth_2.23-21 parallel_4.3.1 ## [111] miniUI_0.1.1.1 vipor_0.4.5 ## [113] desc_1.4.2 pillar_1.9.0 ## [115] grid_4.3.1 vctrs_0.6.3 ## [117] RANN_2.6.1 promises_1.2.1 ## [119] beachmat_2.16.0 xtable_1.8-4 ## [121] cluster_2.1.4 waldo_0.5.1 ## [123] beeswarm_0.4.0 evaluate_0.21 ## [125] magick_2.8.0 cli_3.6.1 ## [127] locfit_1.5-9.8 compiler_4.3.1 ## [129] rlang_1.1.1 crayon_1.5.2 ## [131] future.apply_1.11.0 labeling_0.4.3 ## [133] plyr_1.8.8 ggbeeswarm_0.7.2 ## [135] stringi_1.7.12 deldir_1.0-9 ## 
[137] BiocParallel_1.34.2 munsell_0.5.0 ## [139] lazyeval_0.2.2 spatstat.geom_3.2-5 ## [141] Matrix_1.6-1.1 patchwork_1.1.3 ## [143] sparseMatrixStats_1.12.2 future_1.33.0 ## [145] Rhdf5lib_1.22.1 shiny_1.7.5 ## [147] ROCR_1.0-11 igraph_1.5.1 ## [149] bslib_0.5.1 References "],["cell-phenotyping.html", "9 Cell phenotyping 9.1 Load data 9.2 Clustering approaches 9.3 Classification approach 9.4 Session Info", " 9 Cell phenotyping A common step during single-cell data analysis is the annotation of cells based on their phenotype. Defining cell phenotypes is often subjective and relies on previous biological knowledge. The Orchestrating Single Cell Analysis with Bioconductor book presents a number of approaches to phenotype cells detected by single-cell RNA sequencing based on reference datasets or gene set analysis. In highly-multiplexed imaging, target proteins or molecules are manually selected based on the biological question at hand. This narrows down the feature space and facilitates the manual annotation of clusters to derive cell phenotypes. We will therefore discuss and compare a number of clustering approaches to group cells based on their similarity in marker expression in Section 9.2. Unlike single-cell RNA sequencing or CyTOF data, single-cell data derived from highly-multiplexed imaging data often suffers from “lateral spillover” between neighboring cells. This spillover, caused by imperfect segmentation, often hinders accurate clustering to define specific cell phenotypes in multiplexed imaging data. Tools have been developed to correct lateral spillover between cells (Bai et al. 2021) but the approach requires careful selection of the markers to correct. In Section 9.3 we will train and apply a random forest classifier to classify cell phenotypes in the dataset as an alternative approach to clustering-based cell phenotyping. 
This approach has been previously used to identify major cell phenotypes in metastatic melanoma and avoids clustering of cells (Hoch et al. 2022). 9.1 Load data We will first read in the previously generated SpatialExperiment object and sample 2000 cells to visualize cluster membership. library(SpatialExperiment) spe <- readRDS("data/spe.rds") # Sample cells set.seed(220619) cur_cells <- sample(seq_len(ncol(spe)), 2000) 9.2 Clustering approaches In the first section, we will present clustering approaches to identify cellular phenotypes in the dataset. These methods group cells based on their similarity in marker expression or by their proximity in low dimensional space. A number of approaches have been developed to cluster data derived from single-cell RNA sequencing technologies (Yu et al. 2022) or CyTOF (Weber and Robinson 2016). For demonstration purposes, we will highlight common clustering approaches that are available in R and have been used for clustering cells obtained from IMC. Two approaches rely on graph-based clustering and one approach uses self-organizing maps (SOM). 9.2.1 Rphenograph The PhenoGraph clustering approach was first described to group cells of a CyTOF dataset (Levine et al. 2015). The algorithm first constructs a graph by detecting the k nearest neighbours based on Euclidean distance in expression space. In the next step, edges between nodes (cells) are weighted by their overlap in nearest neighbor sets. To quantify the overlap in shared nearest neighbor sets, the Jaccard index is used. The Louvain modularity optimization approach is used to detect connected communities and partition the graph into clusters of cells. This clustering strategy was used by Jackson, Fischer et al. and Schulz et al. to cluster IMC data (Jackson et al. 2020; Schulz et al. 2018). There are several different PhenoGraph implementations available in R. Here, we use the one available at https://github.com/i-cyto/Rphenograph. 
For large datasets, https://github.com/stuchly/Rphenoannoy offers a more performant implementation of the algorithm. In the following code chunk, we select the asinh-transformed mean pixel intensities per cell and channel and subset the channels to the ones containing biological variation. This matrix is transposed to store cells in rows. Within the Rphenograph function, we select the 45 nearest neighbors for graph building and Louvain community detection (default). The function returns a list of length 2, the first entry being the graph and the second entry containing the community object. Calling membership on the community object will return cluster IDs for each cell. These cluster IDs are then stored within the colData of the SpatialExperiment object. Cluster IDs are mapped on top of the UMAP embedding and single-cell marker expression within each cluster is visualized in the form of a heatmap. It is recommended to test different inputs to k as shown in the next section. Selecting larger values for k results in larger clusters. library(Rphenograph) library(igraph) library(dittoSeq) library(viridis) mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,]) set.seed(230619) out <- Rphenograph(mat, k = 45) clusters <- factor(membership(out[[2]])) spe$pg_clusters <- clusters dittoDimPlot(spe, var = "pg_clusters", reduction.use = "UMAP", size = 0.2, do.label = TRUE) + ggtitle("Phenograph clusters on UMAP") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("pg_clusters", "patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters))], metadata(spe)$color_vectors$patient_id)) The Rphenograph function call took 2.31 minutes. We can observe that some of the clusters only contain cells of a single patient. This can often be observed in the tumor compartment. 
In the next step, we use the integrated cells (see Section 8) in low dimensional embedding for clustering. Here, the low dimensional embedding can be directly accessed from the reducedDim slot. mat <- reducedDim(spe, "fastMNN") set.seed(230619) out <- Rphenograph(mat, k = 45) clusters <- factor(membership(out[[2]])) spe$pg_clusters_corrected <- clusters dittoDimPlot(spe, var = "pg_clusters_corrected", reduction.use = "UMAP_mnnCorrected", size = 0.2, do.label = TRUE) + ggtitle("Phenograph clusters on UMAP, integrated cells") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("pg_clusters_corrected","patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters_corrected))], metadata(spe)$color_vectors$patient_id)) Clustering using the integrated embedding leads to clusters that contain cells of different patients. Cluster annotation can now be performed by manually labeling cells based on their marker expression (see Notes in Section 9.2.5). 9.2.2 Shared nearest neighbour graph The bluster package provides a simple interface to cluster cells using a number of different clustering approaches and different metrics to assess cluster stability. For simplicity, we will focus on graph-based clustering as this is the most popular and a fast method for single-cell clustering. The bluster package provides functionalities to build k-nearest neighbor (KNN) graphs and its weighted version, shared nearest neighbor (SNN) graphs where nodes represent cells. The user can choose the number of neighbors to consider (parameter k), the edge weighting method (parameter type) and the community detection function to use (parameter cluster.fun). As all parameters affect the clustering results, the bluster package provides the clusterSweep function to test a number of parameter settings in parallel. 
In the following code chunk, we select the asinh-transformed mean pixel intensities and subset the markers of interest. The resulting matrix is transposed to fit to the requirements of the bluster package (cells in rows). We test two different settings for k, two for type and fix the cluster.fun to louvain as this is one of the most common approaches for community detection. This function call is parallelized by setting the BPPARAM parameter. library(bluster) library(BiocParallel) library(ggplot2) mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,]) combinations <- clusterSweep(mat, BLUSPARAM=SNNGraphParam(), k=c(10L, 20L), type = c("rank", "jaccard"), cluster.fun = "louvain", BPPARAM = MulticoreParam(RNGseed = 220427)) We next calculate two metrics to estimate cluster stability: the average silhouette width and the neighborhood purity. We use the approxSilhouette function to compute the silhouette width for each cell and compute the average across all cells per parameter setting. Please see ?silhouette for more information on how the silhouette width is computed for each cell. A large average silhouette width indicates a cluster parameter setting for which cells are well clustered. The neighborPurity function computes the fraction of cells around each cell with the same cluster ID. Per parameter setting, we compute the average neighborhood purity across all cells. A large average neighborhood purity indicates a cluster parameter setting for which cells are well clustered. 
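To make the silhouette width more concrete, the following base R sketch computes the exact silhouette width on simulated toy data. Everything in it (toy, toy_clusters, silhouette_width) is hypothetical and only serves illustration; it uses the textbook definition rather than the approximation implemented in approxSilhouette, so values on real data may differ slightly.

```r
# Toy data: two well-separated groups of 20 "cells" in a two-marker space
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2))
toy_clusters <- rep(c(1, 2), each = 20)

# Exact silhouette width per cell: (b - a) / max(a, b), where
#   a = mean distance to the other cells of the same cluster
#   b = smallest mean distance to the cells of any other cluster
silhouette_width <- function(x, clusters) {
  d <- as.matrix(dist(x))
  vapply(seq_len(nrow(x)), function(i) {
    own <- clusters == clusters[i]
    own[i] <- FALSE
    a <- mean(d[i, own])
    other <- setdiff(unique(clusters), clusters[i])
    b <- min(vapply(other, function(cl) mean(d[i, clusters == cl]), 0))
    (b - a) / max(a, b)
  }, 0)
}

# Average silhouette width across all toy cells
mean_sil <- mean(silhouette_width(toy, toy_clusters))
mean_sil
```

Values close to 1 indicate that cells lie much closer to their own cluster than to the nearest other cluster, which is what we expect for this well-separated toy example.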
sil <- vapply(as.list(combinations$clusters), function(x) mean(approxSilhouette(mat, x)$width), 0) ggplot(data.frame(method = names(sil), sil = sil)) + geom_point(aes(method, sil)) + theme_classic(base_size = 15) + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Cluster parameter combination") + ylab("Average silhouette width") pur <- vapply(as.list(combinations$clusters), function(x) mean(neighborPurity(mat, x)$purity), 0) ggplot(data.frame(method = names(pur), pur = pur)) + geom_point(aes(method, pur)) + theme_classic(base_size = 15) + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Cluster parameter combination") + ylab("Average neighborhood purity") The cluster parameter sweep took 8.81 minutes. Performing a cluster sweep takes some time as multiple function calls are run in parallel. We do however recommend testing a number of different parameter settings to assess clustering performance. Once parameter settings are known, we can either use the clusterRows function of the bluster package to cluster cells or its convenient wrapper function exported by the scran package. The scran::clusterCells function accepts a SpatialExperiment (or SingleCellExperiment) object which stores cells in columns. By default, the function detects the 10 nearest neighbours for each cell, performs rank-based weighting of edges (see ?makeSNNGraph for more information) and uses the cluster_walktrap function to detect communities in the graph. As we can see above, the clustering approach in this dataset with k being 20 and rank-based edge weighting leads to the highest silhouette width and highest neighborhood purity. 
library(scran) set.seed(220620) clusters <- clusterCells(spe[rowData(spe)$use_channel,], assay.type = "exprs", BLUSPARAM = SNNGraphParam(k=20, cluster.fun = "louvain", type = "rank")) spe$nn_clusters <- clusters dittoDimPlot(spe, var = "nn_clusters", reduction.use = "UMAP", size = 0.2, do.label = TRUE) + ggtitle("SNN clusters on UMAP") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("nn_clusters", "patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters))], metadata(spe)$color_vectors$patient_id)) The shared nearest neighbor graph clustering approach took 1.31 minutes. This function was used by Tietscher et al. (2022) to cluster cells obtained by IMC. Setting type = \"jaccard\" performs clustering similar to Rphenograph above and Seurat. Similar to the results obtained by Rphenograph, some of the clusters are patient-specific. We can now perform clustering of the integrated cells by directly specifying which low-dimensional embedding to use: set.seed(220621) clusters <- clusterCells(spe, use.dimred = "fastMNN", BLUSPARAM = SNNGraphParam(k = 20, cluster.fun = "louvain", type = "rank")) spe$nn_clusters_corrected <- clusters dittoDimPlot(spe, var = "nn_clusters_corrected", reduction.use = "UMAP_mnnCorrected", size = 0.2, do.label = TRUE) + ggtitle("SNN clusters on UMAP, integrated cells") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("nn_clusters_corrected","patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$nn_clusters_corrected))], metadata(spe)$color_vectors$patient_id)) 9.2.3 Self organizing maps An alternative to graph-based clustering is offered by the CATALYST package. The cluster function internally uses the FlowSOM package to group cells into 100 (default) clusters based on self organizing maps (SOM). 
In the next step, the ConsensusClusterPlus package is used to perform hierarchical consensus clustering of the previously detected 100 SOM nodes into 2 to maxK clusters. Cluster stability for each k can be assessed by plotting the delta_area(spe). The optimal number of clusters can be found by selecting the k at which a plateau is reached. In the example below, an optimal k lies somewhere around 13. library(CATALYST) # Run FlowSOM and ConsensusClusterPlus clustering set.seed(220410) spe <- cluster(spe, features = rownames(spe)[rowData(spe)$use_channel], maxK = 30) # Assess cluster stability delta_area(spe) spe$som_clusters <- cluster_ids(spe, "meta13") dittoDimPlot(spe, var = "som_clusters", reduction.use = "UMAP", size = 0.2, do.label = TRUE) + ggtitle("SOM clusters on UMAP") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("som_clusters", "patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters))], metadata(spe)$color_vectors$patient_id)) Running FlowSOM clustering took 0.22 minutes. The CATALYST package does not provide functionality to perform FlowSOM and ConsensusClusterPlus clustering directly on the batch-corrected, integrated cells. As an alternative to the CATALYST package, the bluster package provides SOM clustering when specifying the SomParam() parameter. Similar to the CATALYST approach, we will first cluster the dataset into 100 clusters (also called “codes”). These codes are then further clustered into a maximum of 30 clusters using ConsensusClusterPlus (using hierarchical clustering and euclidean distance). The delta area plot can be accessed using the (not exported) .plot_delta_area function from CATALYST. Here, it seems that the plateau is reached at a k of 16 and we will store the final cluster IDs within the SpatialExperiment object. 
library(kohonen) library(ConsensusClusterPlus) # Select integrated cells mat <- reducedDim(spe, "fastMNN") # Perform SOM clustering set.seed(220410) som.out <- clusterRows(mat, SomParam(100), full = TRUE) # Cluster the 100 SOM codes into larger clusters ccp <- ConsensusClusterPlus(t(som.out$objects$som$codes[[1]]), maxK = 30, reps = 100, distance = "euclidean", seed = 220410, plot = NULL) # Visualize delta area plot CATALYST:::.plot_delta_area(ccp) # Link ConsensusClusterPlus clusters with SOM codes and save in object som.cluster <- ccp[[16]][["consensusClass"]][som.out$clusters] spe$som_clusters_corrected <- as.factor(som.cluster) dittoDimPlot(spe, var = "som_clusters_corrected", reduction.use = "UMAP_mnnCorrected", size = 0.2, do.label = TRUE) + ggtitle("SOM clusters on UMAP, integrated cells") dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$use_channel], assay = "exprs", scale = "none", heatmap.colors = viridis(100), annot.by = c("som_clusters_corrected","patient_id"), annot.colors = c(dittoColors(1)[1:length(unique(spe$som_clusters_corrected))], metadata(spe)$color_vectors$patient_id)) The FlowSOM clustering approach has been used by Hoch et al. (2022) to sub-cluster tumor cells as measured by IMC. 9.2.4 Compare between clustering approaches Finally, we can compare the results of different clustering approaches. For this, we visualize the number of cells that are shared between different clustering results in a pairwise fashion. In the following heatmaps a high match between clustering results can be seen for those clusters that are uniquely detected in both approaches. First, we will visualize the match between the three different approaches applied to the asinh-transformed counts. 
library(patchwork) library(pheatmap) library(gridExtra) tab1 <- table(paste("Rphenograph", spe$pg_clusters), paste("SNN", spe$nn_clusters)) tab2 <- table(paste("Rphenograph", spe$pg_clusters), paste("SOM", spe$som_clusters)) tab3 <- table(paste("SNN", spe$nn_clusters), paste("SOM", spe$som_clusters)) pheatmap(log10(tab1 + 10), color = viridis(100)) pheatmap(log10(tab2 + 10), color = viridis(100)) pheatmap(log10(tab3 + 10), color = viridis(100)) We observe that Rphenograph and the shared nearest neighbor (SNN) approach by scran show similar results (first heatmap above). For example, Rphenograph cluster 20 (a tumor cluster) is perfectly captured by SNN cluster 12. On the other hand, the Neutrophil cluster (SNN cluster 6) is split into Rphenograph cluster 2 and Rphenograph cluster 6. A common approach is to now merge clusters that contain similar cell types and annotate them by hand (see below). Below, a comparison between the clustering results of the integrated cells is shown. tab1 <- table(paste("Rphenograph", spe$pg_clusters_corrected), paste("SNN", spe$nn_clusters_corrected)) tab2 <- table(paste("Rphenograph", spe$pg_clusters_corrected), paste("SOM", spe$som_clusters_corrected)) tab3 <- table(paste("SNN", spe$nn_clusters_corrected), paste("SOM", spe$som_clusters_corrected)) pheatmap(log10(tab1 + 10), color = viridis(100)) pheatmap(log10(tab2 + 10), color = viridis(100)) pheatmap(log10(tab3 + 10), color = viridis(100)) In comparison to clustering on the non-integrated cells, the clustering results of the integrated cells show higher overlap. The SNN approach resulted in fewer clusters and therefore matches better with the SOM clustering approach. 9.2.5 Further clustering notes The bluster package provides a number of metrics to assess cluster stability here. For brevity we only highlighted the use of the silhouette width and the neighborhood purity but different metrics should be tested to assess cluster stability. 
To assign cell types to clusters, we manually annotate clusters based on their marker expression. For example, SNN cluster 12 (clustering of the integrated cells) shows high, homogeneous expression of CD20 and we might therefore label this cluster as B cells. Chapter 10 will highlight single-cell visualization methods that can be helpful for manual cluster annotations. An example of how to label clusters can be seen below: library(dplyr) cluster_celltype <- recode(spe$nn_clusters_corrected, "1" = "Tumor_proliferating", "2" = "Myeloid", "3" = "Tumor", "4" = "Tumor", "5" = "Stroma", "6" = "Proliferating", "7" = "Myeloid", "8" = "Plasma_cell", "9" = "CD8", "10" = "CD4", "11" = "Neutrophil", "12" = "Bcell", "13" = "Stroma") spe$cluster_celltype <- cluster_celltype 9.3 Classification approach In this section, we will highlight a cell type classification approach based on ground truth labeling and random forest classification. The rationale for this supervised cell phenotyping approach is to use the information contained in the pre-defined markers to detect cells of interest. This approach was used by Hoch et al. to classify cell types in a metastatic melanoma IMC dataset (Hoch et al. 2022). The antibody panel used in the example data set mainly focuses on immune cell types and little on tumor cell phenotypes. Therefore we will label the following cell types: Tumor (E-cadherin positive) Stroma (SMA, PDGFRb positive) Plasma cells (CD38 positive) Neutrophil (MPO, CD15 positive) Myeloid cells (HLADR positive) B cells (CD20 positive) B next to T cells (CD20, CD3 positive) Regulatory T cells (FOXP3 positive) CD8+ T cells (CD3, CD8 positive) CD4+ T cells (CD3, CD4 positive) The “B next to T cell” phenotype (BnTcell) is commonly observed in immune infiltrated regions measured by IMC. We include this phenotype to account for B cell/T cell interactions where precise classification into B cells or T cells is not possible. 
The exact gating scheme can be seen at img/Gating_scheme.pdf. As related approaches, Astir and Garnett use pre-defined panel information to classify cell phenotypes based on their marker expression. 9.3.1 Manual labeling of cells The cytomapper package provides the cytomapperShiny function that allows gating of cells based on their marker expression and visualization of selected cells directly on the images. library(cytomapper) if (interactive()) { images <- readRDS("data/images.rds") masks <- readRDS("data/masks.rds") cytomapperShiny(object = spe, mask = masks, image = images, cell_id = "ObjectNumber", img_id = "sample_id") } The labeled cells for this data set can be accessed at 10.5281/zenodo.6554544 and were downloaded in Section 4. Gating is performed per image and the cytomapperShiny function allows the export of gated cells in form of a SingleCellExperiment or SpatialExperiment object. The cell label is stored in colData(object)$cytomapper_CellLabel and the gates are stored in metadata(object). In the next section, we will read in and consolidate the labeled data. 9.3.2 Define color vectors For consistent visualization of cell types, we will now pre-define their colors: celltype <- setNames(c("#3F1B03", "#F4AD31", "#894F36", "#1C750C", "#EF8ECC", "#6471E2", "#4DB23B", "grey", "#F4800C", "#BF0A3D", "#066970"), c("Tumor", "Stroma", "Myeloid", "CD8", "Plasma_cell", "Treg", "CD4", "undefined", "BnTcell", "Bcell", "Neutrophil")) metadata(spe)$color_vectors$celltype <- celltype 9.3.3 Read in and consolidate labeled data Here, we will read in the individual SpatialExperiment objects containing the labeled cells and concatenate them. In the process of concatenating the SpatialExperiment objects along their columns, the sample_id entry is appended by .1, .2, .3, ... due to replicated entries. 
library(SingleCellExperiment) label_files <- list.files("data/gated_cells", full.names = TRUE, pattern = ".rds$") # Read in SPE objects spes <- lapply(label_files, readRDS) # Merge SPE objects concat_spe <- do.call("cbind", spes) In the following code chunk we will identify cells that were labeled multiple times. This occurs when different cell phenotypes are gated per image and can affect immune cells that are located inside the tumor compartment. We will first identify those cells that were uniquely labeled. In the next step, we will identify those cells that were labeled twice AND were labeled as Tumor cells. These cells will be assigned their immune cell label. Finally, we will save the unique labels within the original SpatialExperiment object. Of note: this concatenation strategy is specific to the cell phenotypes contained in this example dataset. The gated cell labels might need to be processed in a slightly different way when working with other samples. For these tasks, we will define a filter function: filter_labels <- function(object, label = "cytomapper_CellLabel") { cur_tab <- unclass(table(colnames(object), object[[label]])) cur_labels <- colnames(cur_tab)[apply(cur_tab, 1, which.max)] names(cur_labels) <- rownames(cur_tab) cur_labels <- cur_labels[rowSums(cur_tab) == 1] return(cur_labels) } This function is now applied first to all cells and then only to non-tumor cells. 
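As a small illustration of this two-step filtering, the sketch below applies the same table-based logic to plain vectors instead of a SpatialExperiment object. The cell IDs, the labels and the helper filter_toy are hypothetical and not part of the workflow; the helper mirrors filter_labels but takes two vectors as input.

```r
# Hypothetical gating result: cell "c2" was gated twice (Tumor and CD8)
toy_cells  <- c("c1", "c2", "c2", "c3")
toy_labels <- c("Tumor", "Tumor", "CD8", "Bcell")

# Same logic as filter_labels(), written for plain vectors:
# keep only cells that received exactly one label
filter_toy <- function(cells, labels) {
  cur_tab <- unclass(table(cells, labels))
  cur_labels <- colnames(cur_tab)[apply(cur_tab, 1, which.max)]
  names(cur_labels) <- rownames(cur_tab)
  cur_labels[rowSums(cur_tab) == 1]
}

# Step 1: uniquely labeled cells ("c2" is dropped because it has two labels)
toy_unique <- filter_toy(toy_cells, toy_labels)

# Step 2: repeat without Tumor labels to rescue doubly gated immune cells
keep <- toy_labels != "Tumor"
toy_non_tumor <- filter_toy(toy_cells[keep], toy_labels[keep])
toy_additional <- setdiff(names(toy_non_tumor), names(toy_unique))
toy_final <- c(toy_unique, toy_non_tumor[toy_additional])
toy_final
```

In this toy case, c1 keeps its Tumor label, c3 keeps its Bcell label, and the doubly gated cell c2 ends up with its immune label CD8.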
labels <- filter_labels(concat_spe) cur_spe <- concat_spe[,concat_spe$cytomapper_CellLabel != "Tumor"] non_tumor_labels <- filter_labels(cur_spe) additional_cells <- setdiff(names(non_tumor_labels), names(labels)) final_labels <- c(labels, non_tumor_labels[additional_cells]) # Transfer labels to SPE object spe_labels <- rep("unlabeled", ncol(spe)) names(spe_labels) <- colnames(spe) spe_labels[names(final_labels)] <- final_labels spe$cell_labels <- spe_labels # Number of cells labeled per patient table(spe$cell_labels, spe$patient_id) ## ## Patient1 Patient2 Patient3 Patient4 ## Bcell 152 131 234 263 ## BnTcell 396 37 240 1029 ## CD4 45 342 167 134 ## CD8 60 497 137 128 ## Myeloid 183 378 672 517 ## Neutrophil 97 4 17 16 ## Plasma_cell 34 536 87 59 ## Stroma 84 37 85 236 ## Treg 139 149 49 24 ## Tumor 2342 906 1618 1133 ## unlabeled 7214 9780 7826 9580 Based on these labels, we can now train a random forest classifier to classify all remaining, unlabeled cells. 9.3.4 Train classifier In this section, we will use the caret framework for machine learning in R. This package provides an interface to train a number of regression and classification models in a coherent fashion. We use a random forest classifier due to its low number of parameters, high speed and an observed high performance for cell type classification (Hoch et al. 2022). In the following section, we will first split the SpatialExperiment object into labeled and unlabeled cells. Based on the labeled cells, we split the data into a train (75% of the data) and test (25% of the data) dataset. We currently do not provide an independently labeled validation dataset. The caret package provides the trainControl function, which specifies model training parameters and the train function, which performs the actual model training. While training the model, we also want to estimate the best model parameters. 
In the case of the chosen random forest model (method = \"rf\"), we only need to estimate a single parameter (mtry) which corresponds to the number of variables randomly sampled as candidates at each split. To estimate the best parameter, we will perform a 5-fold cross validation (set within trainControl) over a tune length of 5 entries to mtry. In the following code chunk, the createDataPartition and the train function are not deterministic, meaning they return different results across different runs. We therefore set a seed here for both functions. library(caret) # Split between labeled and unlabeled cells lab_spe <- spe[,spe$cell_labels != "unlabeled"] unlab_spe <- spe[,spe$cell_labels == "unlabeled"] # Randomly split into train and test data set.seed(221029) trainIndex <- createDataPartition(factor(lab_spe$cell_labels), p = 0.75) train_spe <- lab_spe[,trainIndex$Resample1] test_spe <- lab_spe[,-trainIndex$Resample1] # Define fit parameters for 5-fold cross validation fitControl <- trainControl(method = "cv", number = 5) # Select the arsinh-transformed counts for training cur_mat <- t(assay(train_spe, "exprs")[rowData(train_spe)$use_channel,]) # Train a random forest classifier rffit <- train(x = cur_mat, y = factor(train_spe$cell_labels), method = "rf", ntree = 1000, tuneLength = 5, trControl = fitControl) rffit ## Random Forest ## ## 10049 samples ## 37 predictor ## 10 classes: 'Bcell', 'BnTcell', 'CD4', 'CD8', 'Myeloid', 'Neutrophil', 'Plasma_cell', 'Stroma', 'Treg', 'Tumor' ## ## No pre-processing ## Resampling: Cross-Validated (5 fold) ## Summary of sample sizes: 8040, 8039, 8038, 8038, 8041 ## Resampling results across tuning parameters: ## ## mtry Accuracy Kappa ## 2 0.9643726 0.9524051 ## 10 0.9780071 0.9707483 ## 19 0.9801973 0.9736577 ## 28 0.9787052 0.9716635 ## 37 0.9779095 0.9705890 ## ## Accuracy was used to select the optimal model using the largest value. ## The final value used for the model was mtry = 19. 
Training the classifier took 11.77 minutes. 9.3.5 Classifier performance We next observe the accuracy of the classifier when predicting cell phenotypes across the cross-validation and when applying the classifier to the test dataset. First, we can visualize the classification accuracy during parameter tuning: ggplot(rffit) + geom_errorbar(data = rffit$results, aes(ymin = Accuracy - AccuracySD, ymax = Accuracy + AccuracySD), width = 0.4) + theme_classic(base_size = 15) The best value for mtry is 19 and is used when predicting new data. It is often recommended to visualize the variable importance of the classifier. The following plot specifies which variables (markers) are most important for classifying the data. plot(varImp(rffit)) As expected, the markers that were used for gating (Ecad, CD3, CD20, HLADR, CD8a, CD38, FOXP3) were important for classification. To assess the accuracy, sensitivity, specificity, among other quality measures of the classifier, we will now predict cell phenotypes in the test data. # Select the arsinh-transformed counts of the test data cur_mat <- t(assay(test_spe, "exprs")[rowData(test_spe)$use_channel,]) # Predict the cell phenotype labels of the test data set.seed(231019) cur_pred <- predict(rffit, newdata = cur_mat) While the overall classification accuracy can appear high, we also want to check if each cell phenotype class is correctly predicted. For this, we will calculate the confusion matrix between predicted and actual cell labels. This measure may highlight individual cell phenotype classes that were not correctly predicted by the classifier. When setting mode = \"everything\", the confusionMatrix function returns all available prediction measures including sensitivity, specificity, precision, recall and the F1 score per cell phenotype class. 
cm <- confusionMatrix(data = cur_pred, reference = factor(test_spe$cell_labels), mode = "everything") cm ## Confusion Matrix and Statistics ## ## Reference ## Prediction Bcell BnTcell CD4 CD8 Myeloid Neutrophil Plasma_cell Stroma ## Bcell 186 2 0 0 0 0 6 0 ## BnTcell 4 423 1 0 0 0 0 0 ## CD4 0 0 163 0 0 2 3 2 ## CD8 0 0 0 199 0 0 8 0 ## Myeloid 0 0 2 1 437 0 0 0 ## Neutrophil 0 0 0 0 0 30 0 0 ## Plasma_cell 1 0 3 2 0 0 158 0 ## Stroma 0 0 2 0 0 0 0 108 ## Treg 0 0 0 0 0 0 3 0 ## Tumor 4 0 1 3 0 1 1 0 ## Reference ## Prediction Treg Tumor ## Bcell 0 1 ## BnTcell 0 1 ## CD4 0 5 ## CD8 0 3 ## Myeloid 0 0 ## Neutrophil 0 0 ## Plasma_cell 1 0 ## Stroma 0 0 ## Treg 89 2 ## Tumor 0 1487 ## ## Overall Statistics ## ## Accuracy : 0.9806 ## 95% CI : (0.9753, 0.985) ## No Information Rate : 0.4481 ## P-Value [Acc > NIR] : < 2.2e-16 ## ## Kappa : 0.9741 ## ## Mcnemar's Test P-Value : NA ## ## Statistics by Class: ## ## Class: Bcell Class: BnTcell Class: CD4 Class: CD8 ## Sensitivity 0.95385 0.9953 0.94767 0.97073 ## Specificity 0.99714 0.9979 0.99622 0.99650 ## Pos Pred Value 0.95385 0.9860 0.93143 0.94762 ## Neg Pred Value 0.99714 0.9993 0.99716 0.99809 ## Precision 0.95385 0.9860 0.93143 0.94762 ## Recall 0.95385 0.9953 0.94767 0.97073 ## F1 0.95385 0.9906 0.93948 0.95904 ## Prevalence 0.05830 0.1271 0.05142 0.06129 ## Detection Rate 0.05561 0.1265 0.04873 0.05949 ## Detection Prevalence 0.05830 0.1283 0.05232 0.06278 ## Balanced Accuracy 0.97549 0.9966 0.97195 0.98361 ## Class: Myeloid Class: Neutrophil Class: Plasma_cell ## Sensitivity 1.0000 0.909091 0.88268 ## Specificity 0.9990 1.000000 0.99779 ## Pos Pred Value 0.9932 1.000000 0.95758 ## Neg Pred Value 1.0000 0.999095 0.99340 ## Precision 0.9932 1.000000 0.95758 ## Recall 1.0000 0.909091 0.88268 ## F1 0.9966 0.952381 0.91860 ## Prevalence 0.1306 0.009865 0.05351 ## Detection Rate 0.1306 0.008969 0.04723 ## Detection Prevalence 0.1315 0.008969 0.04933 ## Balanced Accuracy 0.9995 0.954545 0.94024 ## Class: Stroma Class: 
Treg Class: Tumor ## Sensitivity 0.98182 0.98889 0.9920 ## Specificity 0.99938 0.99846 0.9946 ## Pos Pred Value 0.98182 0.94681 0.9933 ## Neg Pred Value 0.99938 0.99969 0.9935 ## Precision 0.98182 0.94681 0.9933 ## Recall 0.98182 0.98889 0.9920 ## F1 0.98182 0.96739 0.9927 ## Prevalence 0.03288 0.02691 0.4481 ## Detection Rate 0.03229 0.02661 0.4445 ## Detection Prevalence 0.03288 0.02810 0.4475 ## Balanced Accuracy 0.99060 0.99368 0.9933 To easily visualize these results, we can now plot the true positive rate (sensitivity) versus the false positive rate (1 - specificity). The size of the point is determined by the number of true positives divided by the total number of cells. library(tidyverse) data.frame(cm$byClass) %>% mutate(class = sub("Class: ", "", rownames(cm$byClass))) %>% ggplot() + geom_point(aes(1 - Specificity, Sensitivity, size = Detection.Rate, fill = class), shape = 21) + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + theme_classic(base_size = 15) + ylab("Sensitivity (TPR)") + xlab("1 - Specificity (FPR)") We observe high sensitivity and specificity for most cell types. Plasma cells show the lowest true positive rate at 88%, which is still sufficiently high. Finally, to observe which cell phenotypes were wrongly classified, we can visualize the distribution of classification probabilities per cell phenotype class: set.seed(231019) cur_pred <- predict(rffit, newdata = cur_mat, type = "prob") cur_pred$truth <- factor(test_spe$cell_labels) cur_pred %>% pivot_longer(cols = Bcell:Tumor) %>% ggplot() + geom_boxplot(aes(x = name, y = value, fill = name), outlier.size = 0.5) + facet_wrap(. ~ truth, ncol = 1) + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + theme(panel.background = element_blank(), axis.text.x = element_text(angle = 45, hjust = 1)) The boxplots indicate the classification probabilities per class. The classifier is well trained if classification probabilities are only high for the one specific class. 
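To see where the per-class statistics come from, the following base R sketch recomputes sensitivity, specificity, precision and the F1 score for the Plasma_cell class by hand. The counts are read off the confusion matrix printed above; the variable names are hypothetical, and the total of 3345 test cells is obtained by summing all entries of that matrix.

```r
# Plasma_cell row of the confusion matrix: cells predicted as Plasma_cell,
# split by their reference label
pred_row <- c(Bcell = 1, BnTcell = 0, CD4 = 3, CD8 = 2, Myeloid = 0,
              Neutrophil = 0, Plasma_cell = 158, Stroma = 0, Treg = 1,
              Tumor = 0)

# Plasma_cell column: cells labeled as Plasma_cell, split by prediction
ref_col <- c(Bcell = 6, BnTcell = 0, CD4 = 3, CD8 = 8, Myeloid = 0,
             Neutrophil = 0, Plasma_cell = 158, Stroma = 0, Treg = 3,
             Tumor = 1)

n_cells <- 3345  # sum of all entries of the confusion matrix

tp <- pred_row[["Plasma_cell"]]  # true positives
fp <- sum(pred_row) - tp         # predicted Plasma_cell, labeled otherwise
fn <- sum(ref_col) - tp          # labeled Plasma_cell, predicted otherwise
tn <- n_cells - tp - fp - fn     # all remaining cells

sensitivity <- tp / (tp + fn)    # also called recall
specificity <- tn / (tn + fp)
precision   <- tp / (tp + fp)    # also called positive predictive value
f1 <- 2 * precision * sensitivity / (precision + sensitivity)

round(c(sensitivity = sensitivity, specificity = specificity,
        precision = precision, f1 = f1), 5)
```

The results (0.88268, 0.99779, 0.95758 and 0.9186) match the Plasma_cell column of the "Statistics by Class" output: 21 of the 179 labeled plasma cells were assigned to other classes, which explains the comparatively low sensitivity.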
9.3.6 Classification of new data In the final section, we will now use the tuned and tested random forest classifier to predict the cell phenotypes of the unlabeled data. First, we predict the cell phenotypes and extract their classification probabilities. # Select the asinh-transformed counts of the unlabeled data for prediction cur_mat <- t(assay(unlab_spe, "exprs")[rowData(unlab_spe)$use_channel,]) # Predict the cell phenotype labels of the unlabeled data set.seed(231014) cell_class <- as.character(predict(rffit, newdata = cur_mat, type = "raw")) names(cell_class) <- rownames(cur_mat) table(cell_class) ## cell_class ## Bcell BnTcell CD4 CD8 Myeloid Neutrophil ## 817 979 3620 2716 6302 559 ## Plasma_cell Stroma Treg Tumor ## 2692 4904 1170 10641 # Extract prediction probabilities for each cell set.seed(231014) cell_prob <- predict(rffit, newdata = cur_mat, type = "prob") Each cell is assigned to the class with the highest probability. There are, however, cases in which the highest probability is low, meaning that the cell cannot be uniquely assigned to a class. We next want to identify these cells and label them as “undefined”. Here, we require a maximum classification probability of at least 40%, but this threshold needs to be adjusted for other datasets. The adjusted cell labels are then stored in the SpatialExperiment object.
library(ggridges) # Distribution of maximum probabilities tibble(max_prob = rowMax(as.matrix(cell_prob)), type = cell_class) %>% ggplot() + geom_density_ridges(aes(x = max_prob, y = cell_class, fill = cell_class)) + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) + theme_classic(base_size = 15) + xlab("Maximum probability") + ylab("Cell type") + xlim(c(0,1.2)) ## Picking joint bandwidth of 0.0238 # Label undefined cells cell_class[rowMax(as.matrix(cell_prob)) < 0.4] <- "undefined" # Store labels in SpatialExperiment object cell_labels <- spe$cell_labels cell_labels[colnames(unlab_spe)] <- cell_class spe$celltype <- cell_labels table(spe$celltype, spe$patient_id) ## ## Patient1 Patient2 Patient3 Patient4 ## Bcell 179 527 431 458 ## BnTcell 416 586 594 1078 ## CD4 391 1370 699 1385 ## CD8 518 1365 479 1142 ## Myeloid 1369 2197 1723 2731 ## Neutrophil 348 9 148 176 ## Plasma_cell 650 2122 351 274 ## Stroma 633 676 736 3261 ## Treg 553 409 243 310 ## Tumor 5560 3334 5648 2083 ## undefined 129 202 80 221 We can now compare the cell labels derived by classification to the different clustering strategies. The first comparison is against the clustering results using the asinh-transformed counts. tab1 <- table(spe$celltype, paste("Rphenograph", spe$pg_clusters)) tab2 <- table(spe$celltype, paste("SNN", spe$nn_clusters)) tab3 <- table(spe$celltype, paste("SOM", spe$som_clusters)) pheatmap(log10(tab1 + 10), color = viridis(100)) pheatmap(log10(tab2 + 10), color = viridis(100)) pheatmap(log10(tab3 + 10), color = viridis(100)) We can see that tumor and myeloid cells span multiple clusters while neutrophils are detected as an individual cluster by all clustering approaches. We next compare the cell classification against clustering results using the integrated cells.
tab1 <- table(spe$celltype, paste("Rphenograph", spe$pg_clusters_corrected)) tab2 <- table(spe$celltype, paste("SNN", spe$nn_clusters_corrected)) tab3 <- table(spe$celltype, paste("SOM", spe$som_clusters_corrected)) pheatmap(log10(tab1 + 10), color = viridis(100)) pheatmap(log10(tab2 + 10), color = viridis(100)) pheatmap(log10(tab3 + 10), color = viridis(100)) We observe a high agreement between the shared nearest neighbor clustering approach using the integrated cells and the cell phenotypes derived by classification. In the next sections, we will highlight visualization strategies to verify the correctness of the phenotyping approach. Specifically, Section 11.2.3 shows how to outline identified cell phenotypes on composite images. Finally, we save the updated SpatialExperiment object. saveRDS(spe, "data/spe.rds") 9.4 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 ggridges_0.5.4 ## [3] lubridate_1.9.3 forcats_1.0.0 ## [5] stringr_1.5.0 purrr_1.0.2 ## [7] readr_2.1.4 tidyr_1.3.0 ## [9] tibble_3.2.1 tidyverse_2.0.0 ## [11] caret_6.0-94 lattice_0.21-8 ## [13] cytomapper_1.12.0 EBImage_4.42.0 ## [15] dplyr_1.1.3 gridExtra_2.3 ## [17] pheatmap_1.0.12 patchwork_1.1.3 ## [19] 
ConsensusClusterPlus_1.64.0 kohonen_3.0.12 ## [21] CATALYST_1.24.0 scran_1.28.2 ## [23] scuttle_1.10.2 BiocParallel_1.34.2 ## [25] bluster_1.10.0 viridis_0.6.4 ## [27] viridisLite_0.4.2 dittoSeq_1.12.1 ## [29] ggplot2_3.4.3 igraph_1.5.1 ## [31] Rphenograph_0.99.1.9003 SpatialExperiment_1.10.0 ## [33] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 ## [35] Biobase_2.60.0 GenomicRanges_1.52.0 ## [37] GenomeInfoDb_1.36.3 IRanges_2.34.1 ## [39] S4Vectors_0.38.2 BiocGenerics_0.46.0 ## [41] MatrixGenerics_1.12.3 matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] bitops_1.0-7 RColorBrewer_1.1-3 ## [3] doParallel_1.0.17 tools_4.3.1 ## [5] backports_1.4.1 utf8_1.2.3 ## [7] R6_2.5.1 HDF5Array_1.28.1 ## [9] rhdf5filters_1.12.1 GetoptLong_1.0.5 ## [11] withr_2.5.1 sp_2.0-0 ## [13] cli_3.6.1 sandwich_3.0-2 ## [15] labeling_0.4.3 sass_0.4.7 ## [17] nnls_1.5 mvtnorm_1.2-3 ## [19] randomForest_4.7-1.1 proxy_0.4-27 ## [21] systemfonts_1.0.4 colorRamps_2.3.1 ## [23] svglite_2.1.1 R.utils_2.12.2 ## [25] scater_1.28.0 parallelly_1.36.0 ## [27] plotrix_3.8-2 limma_3.56.2 ## [29] flowCore_2.12.2 rstudioapi_0.15.0 ## [31] generics_0.1.3 shape_1.4.6 ## [33] gtools_3.9.4 car_3.1-2 ## [35] Matrix_1.6-1.1 RProtoBufLib_2.12.1 ## [37] waldo_0.5.1 ggbeeswarm_0.7.2 ## [39] fansi_1.0.4 abind_1.4-5 ## [41] R.methodsS3_1.8.2 terra_1.7-46 ## [43] lifecycle_1.0.3 multcomp_1.4-25 ## [45] yaml_2.3.7 edgeR_3.42.4 ## [47] carData_3.0-5 rhdf5_2.44.0 ## [49] recipes_1.0.8 Rtsne_0.16 ## [51] grid_4.3.1 promises_1.2.1 ## [53] dqrng_0.3.1 crayon_1.5.2 ## [55] shinydashboard_0.7.2 beachmat_2.16.0 ## [57] cowplot_1.1.1 magick_2.8.0 ## [59] pillar_1.9.0 knitr_1.44 ## [61] ComplexHeatmap_2.16.0 metapod_1.8.0 ## [63] rjson_0.2.21 future.apply_1.11.0 ## [65] codetools_0.2-19 glue_1.6.2 ## [67] data.table_1.14.8 vctrs_0.6.3 ## [69] png_0.1-8 gtable_0.3.4 ## [71] cachem_1.0.8 gower_1.0.1 ## [73] xfun_0.40 S4Arrays_1.0.6 ## [75] mime_0.12 prodlim_2023.08.28 ## [77] DropletUtils_1.20.0 
survival_3.5-5 ## [79] timeDate_4022.108 iterators_1.0.14 ## [81] cytolib_2.12.1 hardhat_1.3.0 ## [83] lava_1.7.2.1 statmod_1.5.0 ## [85] ellipsis_0.3.2 TH.data_1.1-2 ## [87] ipred_0.9-14 nlme_3.1-162 ## [89] rprojroot_2.0.3 bslib_0.5.1 ## [91] irlba_2.3.5.1 svgPanZoom_0.3.4 ## [93] vipor_0.4.5 rpart_4.1.19 ## [95] colorspace_2.1-0 raster_3.6-23 ## [97] nnet_7.3-19 tidyselect_1.2.0 ## [99] compiler_4.3.1 BiocNeighbors_1.18.0 ## [101] desc_1.4.2 DelayedArray_0.26.7 ## [103] bookdown_0.35 scales_1.2.1 ## [105] tiff_0.1-11 digest_0.6.33 ## [107] fftwtools_0.9-11 rmarkdown_2.25 ## [109] XVector_0.40.0 htmltools_0.5.6 ## [111] pkgconfig_2.0.3 jpeg_0.1-10 ## [113] sparseMatrixStats_1.12.2 fastmap_1.1.1 ## [115] rlang_1.1.1 GlobalOptions_0.1.2 ## [117] htmlwidgets_1.6.2 shiny_1.7.5 ## [119] DelayedMatrixStats_1.22.6 farver_2.1.1 ## [121] jquerylib_0.1.4 zoo_1.8-12 ## [123] jsonlite_1.8.7 ModelMetrics_1.2.2.2 ## [125] R.oo_1.25.0 BiocSingular_1.16.0 ## [127] RCurl_1.98-1.12 magrittr_2.0.3 ## [129] GenomeInfoDbData_1.2.10 Rhdf5lib_1.22.1 ## [131] munsell_0.5.0 Rcpp_1.0.11 ## [133] ggnewscale_0.4.9 pROC_1.18.4 ## [135] stringi_1.7.12 brio_1.1.3 ## [137] zlibbioc_1.46.0 MASS_7.3-60 ## [139] plyr_1.8.8 listenv_0.9.0 ## [141] parallel_4.3.1 ggrepel_0.9.3 ## [143] splines_4.3.1 hms_1.1.3 ## [145] circlize_0.4.15 locfit_1.5-9.8 ## [147] ggpubr_0.6.0 ggsignif_0.6.4 ## [149] pkgload_1.3.3 reshape2_1.4.4 ## [151] ScaledMatrix_1.8.1 XML_3.99-0.14 ## [153] drc_3.0-1 evaluate_0.21 ## [155] tzdb_0.4.0 foreach_1.5.2 ## [157] tweenr_2.0.2 httpuv_1.6.11 ## [159] RANN_2.6.1 polyclip_1.10-6 ## [161] future_1.33.0 clue_0.3-65 ## [163] ggforce_0.4.1 rsvd_1.0.5 ## [165] broom_1.0.5 xtable_1.8-4 ## [167] e1071_1.7-13 rstatix_0.7.2 ## [169] later_1.3.1 class_7.3-22 ## [171] FlowSOM_2.8.0 beeswarm_0.4.0 ## [173] cluster_2.1.4 timechange_0.2.0 ## [175] globals_0.16.2 References "],["single-cell-visualization.html", "10 Single cell visualization 10.1 Load data 10.2 Cell-type level 10.3 Sample-level 
10.4 Further examples 10.5 Session Info", " 10 Single cell visualization The following section describes typical approaches for visualizing single-cell data. This chapter is divided into three parts. Section 10.2 will highlight visualization approaches downstream of cell type classification from Section 9.3. We will then focus on visualization methods that relate single-cell data to the sample level in Section 10.3. Lastly, Section 10.4 will provide a more customized example of how to integrate various single-cell and sample metadata into one heatmap using the ComplexHeatmap package (Gu, Eils, and Schlesner 2016). Visualization functions from popular R packages in single-cell research such as scater, dittoSeq and CATALYST will be utilized. We will reuse methods and functions from previous sections, while also introducing new ones. Please note that this chapter aims to provide an overview of common visualization options and should be seen as a stepping stone; many more options exist and the user should customize the visualization according to the biological question at hand. 10.1 Load data First, we will read in the previously generated SpatialExperiment object. spe <- readRDS("data/spe.rds") For visualization purposes, we will define markers that were used for cell type classification and markers that can indicate a specific cell state (e.g., Ki67 for proliferating cells).
# Define cell phenotype markers type_markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", "SMA", "CD8a", "CD4", "HLADR", "CD15", "CD38", "PDGFRb") # Define cell state markers state_markers <- c("CarbonicAnhydrase", "Ki67", "PD1", "GrzB", "PDL1", "ICOS", "TCF7", "VISTA") # Add to spe rowData(spe)$marker_class <- ifelse(rownames(spe) %in% type_markers, "type", ifelse(rownames(spe) %in% state_markers, "state", "other")) 10.2 Cell-type level In the first section of this chapter, the grouping-level for the visualization approaches will be the cell type classification from Section 9.3. Other grouping levels (e.g., cluster assignments from Section 9.2) are possible and the user should adjust depending on the chosen analysis workflow. 10.2.1 Dimensionality reduction visualization As seen before, we can visualize single cells in low-dimensional space. Often, non-linear methods for dimensionality reduction such as tSNE and UMAP are used. They aim to preserve the distances between each cell and its neighbors in the high-dimensional space. Interpreting these plots is not trivial, but local neighborhoods in the plot can suggest similarity in expression for given cells. See Orchestrating Single-Cell Analysis with Bioconductor for more details. Here, we will use dittoDimPlot from the dittoSeq package and plotReducedDim from the scater package to visualize the fastMNN-corrected UMAP colored by cell type and expression (using the asinh-transformed intensities), respectively. Both functions are highly flexible and return ggplot objects which can be further modified.
library(dittoSeq) library(scater) library(patchwork) library(cowplot) library(viridis) ## UMAP colored by cell type and expression - dittoDimPlot p1 <- dittoDimPlot(spe, var = "celltype", reduction.use = "UMAP_mnnCorrected", size = 0.2, do.label = TRUE) + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + theme(legend.title = element_blank()) + ggtitle("Cell types on UMAP, integrated cells") p2 <- dittoDimPlot(spe, var = "Ecad", assay = "exprs", reduction.use = "UMAP_mnnCorrected", size = 0.2, colors = viridis(100), do.label = TRUE) + scale_color_viridis() p1 + p2 The plotReducedDim function of the scater package provides an alternative way of visualizing cells in low dimensions. Here, we loop over all type markers, generate one plot per marker and plot the individual plots side-by-side. # UMAP colored by expression for all markers - plotReducedDim plot_list <- lapply(rownames(spe)[rowData(spe)$marker_class == "type"], function(x){ p <- plotReducedDim(spe, dimred = "UMAP_mnnCorrected", colour_by = x, by_exprs_values = "exprs", point_size = 0.2) return(p) }) plot_grid(plotlist = plot_list) 10.2.2 Heatmap visualization Next, it is often useful to visualize single-cell expression per cell type in the form of a heatmap. For this, we will use the dittoHeatmap function from the dittoSeq package. We sub-sample the dataset to 4000 cells for ease of visualization and overlay the cancer type and patient ID from which the cells were extracted.
set.seed(220818) cur_cells <- sample(seq_len(ncol(spe)), 4000) # Heatmap visualization - dittoHeatmap dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$marker_class == "type"], assay = "exprs", cluster_cols = FALSE, scale = "none", heatmap.colors = viridis(100), annot.by = c("celltype", "indication", "patient_id"), annotation_colors = list(indication = metadata(spe)$color_vectors$indication, patient_id = metadata(spe)$color_vectors$patient_id, celltype = metadata(spe)$color_vectors$celltype)) Similarly, we can visualize the mean marker expression per cell type for all cells by first calculating the mean marker expression per cell type using the aggregateAcrossCells function from the scuttle package and then using dittoHeatmap. We will annotate the heatmap with the number of cells per cell type and we will use different ways of feature scaling. library(scuttle) ## aggregate by cell type celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), ids = spe$celltype, statistics = "mean", use.assay.type = "exprs", subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) # No scaling dittoHeatmap(celltype_mean, assay = "exprs", cluster_cols = TRUE, scale = "none", heatmap.colors = viridis(100), annot.by = c("celltype", "ncells"), annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, ncells = plasma(100))) # Scaled to max dittoHeatmap(celltype_mean, assay = "exprs", cluster_cols = TRUE, scaled.to.max = TRUE, heatmap.colors.max.scaled = inferno(100), annot.by = c("celltype", "ncells"), annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, ncells = plasma(100))) # Z score scaled dittoHeatmap(celltype_mean, assay = "exprs", cluster_cols = TRUE, annot.by = c("celltype", "ncells"), annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, ncells = plasma(100))) As illustrated above for not-, max-, and Z score-scaled expression values, different ways of scaling can have strong effects on
visualization output and we encourage the user to test multiple options. Overall, we can observe cell-type specific marker expression (e.g., Tumor = Ecad high and B cells = CD20 high) in agreement with the gating scheme of Section 9.3. 10.2.3 Violin plot visualization The plotExpression function from the scater package allows plotting the distribution of expression values across cell types for a chosen set of proteins. The output is a ggplot object which can be modified further. # Violin Plot - plotExpression plotExpression(spe[,cur_cells], features = rownames(spe)[rowData(spe)$marker_class == "type"], x = "celltype", exprs_values = "exprs", colour_by = "celltype") + theme(axis.text.x = element_text(angle = 90))+ scale_color_manual(values = metadata(spe)$color_vectors$celltype) 10.2.4 Scatter plot visualization Moreover, a protein expression based scatter plot can be generated with dittoScatterPlot (returns a ggplot object). We overlay the plot with the cell type information. # Scatter plot dittoScatterPlot(spe, x.var = "CD3", y.var="CD20", assay.x = "exprs", assay.y = "exprs", color.var = "celltype") + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + ggtitle("Scatterplot for CD3/CD20 labelled by celltype") We can nicely observe how the “B next to T cell” phenotype (BnTcell) has high expression values for both CD20 and CD3. Of note, in a setting where the user aims to assign labels to clusters based on marker genes/proteins, all of the above plots can be particularly helpful. 10.2.5 Barplot visualization In order to display frequencies of cell types per sample/patient, the dittoBarPlot function will be used. Data can be represented as percentages or counts and again ggplot objects are returned.
# by sample_id - percentage dittoBarPlot(spe, var = "celltype", group.by = "sample_id") + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) # by patient_id - percentage dittoBarPlot(spe, var = "celltype", group.by = "patient_id") + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) # by patient_id - count dittoBarPlot(spe, scale = "count", var = "celltype", group.by = "patient_id") + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) We can see that cell type frequencies change between samples/patients and that the highest proportion/counts of plasma cells and stromal cells can be observed for Patient 2 and Patient 4, respectively. 10.2.6 CATALYST-based visualization In the following, we highlight some useful visualization functions from the CATALYST package. To this end, we will first convert the SpatialExperiment object into a CATALYST-compatible format. library(CATALYST) # Save SPE in CATALYST-compatible object with renamed colData entries and # new metadata information spe_cat <- spe spe_cat$sample_id <- factor(spe$sample_id) spe_cat$condition <- factor(spe$indication) spe_cat$cluster_id <- factor(spe$celltype) # Add celltype information to metadata metadata(spe_cat)$cluster_codes <- data.frame(celltype = factor(spe_cat$celltype)) All of the CATALYST functions presented below return ggplot objects, which allow flexible downstream adjustment. 10.2.6.1 Pseudobulk-level MDS plot Pseudobulk-level multi-dimensional scaling (MDS) plots can be rendered with the exported pbMDS function. Here, we will use pbMDS to highlight expression similarities between cell types and subsequently for each celltype-sample-combination. 
# MDS pseudobulk by cell type pbMDS(spe_cat, by = "cluster_id", features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], label_by = "cluster_id", k = "celltype") + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) # MDS pseudobulk by cell type and sample_id pbMDS(spe_cat, by = "both", features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], k = "celltype", shape_by = "condition", size_by = TRUE) + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) We can see that the pseudobulk-expression profile of neutrophils seems markedly distinct from the other cell types, while comparable cell types such as the T cell subtypes group together. Furthermore, pseudobulk cell-type profiles from SCCHN appear different from the other indications. 10.2.6.2 Reduced dimension plot on CLR of proportions The clrDR function produces dimensionality reduction plots on centered log-ratios (CLR) of sample/cell type proportions across cell type/samples. As with pbMDS, the output plots aim to illustrate the degree of similarity between cell types based on sample proportions. # CLR on cluster proportions across samples clrDR(spe_cat, dr = "PCA", by = "cluster_id", k = "celltype", label_by = "cluster_id", arrow_col = "sample_id", point_pal = metadata(spe_cat)$color_vectors$celltype) We can again observe that neutrophils have a divergent profile also in terms of their sample proportions. 10.2.6.3 Pseudobulk expression boxplot The plotPbExprs function generates combined box- and jitter-plots of aggregated marker expression per cell type and sample (image). Here, we further split the data by cancer type. plotPbExprs(spe_cat, k = "celltype", facet_by = "cluster_id", ncol = 2, features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) + scale_color_manual(values = metadata(spe_cat)$color_vectors$indication) Notably, CD15 levels are elevated in SCCHN in comparison to all other indications for most cell types.
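The centered log-ratio (CLR) transformation that underlies the clrDR plots above can be illustrated on a toy example. The sketch below is a minimal base R version, assuming a small hypothetical count table and a pseudocount of 1 to avoid taking the logarithm of zero. It is not CATALYST's implementation, and the helper name `clr` as well as all counts are assumptions made for illustration.

```r
# Hypothetical cell type counts: rows = samples, columns = cell types
counts <- matrix(c(50, 30, 20,
                   10, 60, 30,
                   40, 40, 20,
                    5, 15, 80),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("sample", 1:4),
                                 c("Tumor", "Bcell", "Myeloid")))

# Hypothetical helper: CLR transformation per sample (row)
clr <- function(x, pseudocount = 1) {
    p <- prop.table(x + pseudocount, margin = 1)  # proportions per sample
    log_p <- log(p)
    log_p - rowMeans(log_p)   # subtract the log of the geometric mean per row
}

clr_mat <- clr(counts)

# By construction, each row of a CLR-transformed matrix sums to zero
round(rowSums(clr_mat), 10)

# PCA on the CLR values; samples with similar composition lie close together
pca <- prcomp(clr_mat, center = TRUE, scale. = FALSE)
pca$x[, 1:2]
```

Working on log-ratios rather than raw proportions reduces the dependence between cell type fractions that arises because proportions within a sample must sum to one.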
10.3 Sample-level In the next section, we will shift the grouping-level focus from the cell type to the sample-level. Sample-levels will be further divided into the sample-(image) and patient-level. Although we will mostly repeat the functions from the previous Section 10.2, sample- and patient-level centered visualization can provide additional quality control and biological interpretation. 10.3.1 Dimensionality reduction visualization Visualization of low-dimensional embeddings, here comparing non-corrected and fastMNN-corrected UMAPs, colored by sample-level variables is often used for “batch effect” assessment as mentioned in Section 7.4. We will again use dittoDimPlot. ## UMAP colored by sample ID and patient ID - dittoDimPlot p1 <- dittoDimPlot(spe, var = "sample_id", reduction.use = "UMAP", size = 0.2, colors = viridis(100), do.label = FALSE) + scale_color_manual(values = metadata(spe)$color_vectors$sample_id) + theme(legend.title = element_blank()) + ggtitle("Sample ID") p2 <- dittoDimPlot(spe, var = "sample_id", reduction.use = "UMAP_mnnCorrected", size = 0.2, colors = viridis(100), do.label = FALSE) + scale_color_manual(values = metadata(spe)$color_vectors$sample_id) + theme(legend.title = element_blank()) + ggtitle("Sample ID") p3 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP", size = 0.2, do.label = FALSE) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + theme(legend.title = element_blank()) + ggtitle("Patient ID") p4 <- dittoDimPlot(spe, var = "patient_id", reduction.use = "UMAP_mnnCorrected", size = 0.2, do.label = FALSE) + scale_color_manual(values = metadata(spe)$color_vectors$patient_id) + theme(legend.title = element_blank()) + ggtitle("Patient ID") (p1 + p2) / (p3 + p4) As illustrated in Section 8, we see that the fastMNN approach (right side of the plot) leads to mixing of cells across samples/patients and thus batch effect correction.
10.3.2 Heatmap visualization It can be beneficial to use a heatmap to visualize single-cell expression per sample and patient. Such a plot, which we will create using dittoHeatmap, can highlight biological differences across samples/patients. # Heatmap visualization - dittoHeatmap dittoHeatmap(spe[,cur_cells], genes = rownames(spe)[rowData(spe)$marker_class == "type"], assay = "exprs", order.by = c("patient_id","sample_id"), cluster_cols = FALSE, scale = "none", heatmap.colors = viridis(100), annot.by = c("celltype", "indication", "patient_id", "sample_id"), annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype, indication = metadata(spe)$color_vectors$indication, patient_id = metadata(spe)$color_vectors$patient_id, sample_id = metadata(spe)$color_vectors$sample_id)) As in Section 7.3, aggregated mean marker expression per sample/patient allows identification of samples/patients with outlying expression patterns. Here, we will focus on the patient level and use aggregateAcrossCells and dittoHeatmap. The heatmap will be annotated with the number of cells per patient and cancer type and displayed using two scaling options.
# mean expression by patient_id patient_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), ids = spe$patient_id, statistics = "mean", use.assay.type = "exprs", subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) # No scaling dittoHeatmap(patient_mean, assay = "exprs", cluster_cols = TRUE, scale = "none", heatmap.colors = viridis(100), annot.by = c("patient_id","indication","ncells"), annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id, indication = metadata(spe)$color_vectors$indication, ncells = plasma(100))) # Max expression scaling dittoHeatmap(patient_mean, assay = "exprs", cluster_cols = TRUE, scaled.to.max = TRUE, heatmap.colors.max.scaled = inferno(100), annot.by = c("patient_id","indication","ncells"), annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id, indication = metadata(spe)$color_vectors$indication, ncells = plasma(100))) As seen before, CD15 levels are elevated in Patient 1 (SCCHN), while SMA levels are highest for Patient 4 (CRC). 10.3.3 Barplot visualization Complementary to displaying cell type frequencies per sample/patient, we can use dittoBarPlot to display sample/patient frequencies per cell type. dittoBarPlot(spe, var = "patient_id", group.by = "celltype") + scale_fill_manual(values = metadata(spe)$color_vectors$patient_id) dittoBarPlot(spe, var = "sample_id", group.by = "celltype") + scale_fill_manual(values = metadata(spe)$color_vectors$sample_id) Patient2 has the highest proportion of plasma cells and the lowest proportion of neutrophils. 10.3.4 CATALYST-based visualization 10.3.4.1 Pseudobulk-level MDS plot Expression-based pseudobulks for each sample can be compared with the pbMDS function.
# MDS pseudobulk by sample_id pbMDS(spe_cat, by = "sample_id", color_by = "sample_id", features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) + scale_color_manual(values = metadata(spe_cat)$color_vectors$sample_id) There are marked differences in pseudobulk-expression patterns between samples and across patients, which can be driven by biological differences and also technical aspects such as divergent region selection. 10.3.4.2 Reduced dimension plot on CLR of proportions The clrDR function can also be used to analyze similarity of samples based on cell type proportions. # CLR on sample proportions across clusters clrDR(spe_cat, dr = "PCA", by = "sample_id", point_col = "sample_id", k = "celltype", point_pal = metadata(spe_cat)$color_vectors$sample_id) + scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype) ## Scale for colour is already present. ## Adding another scale for colour, which will replace the existing scale. There are notable differences between samples based on their cell type proportions. Interestingly, Patient3_001, Patient1_003, Patient4_007 and Patient4_006 group together and the PC loadings indicate a strong contribution of BnT and B cells, which could suggest the formation of tertiary lymphoid structures (TLS). In Section 12.2, we will be able to confirm this hypothesis visually on the images. 10.4 Further examples In the last section of this chapter, we will use the popular ComplexHeatmap package to create a visualization example that combines various cell-type- and sample-level information. ComplexHeatmap is highly versatile and was originally inspired by the pheatmap package. Therefore, many arguments have the same or similar names. For more details, we recommend reading the reference book. 10.4.1 Publication-ready ComplexHeatmap For this example, we will concatenate heatmaps and annotations horizontally into one rich heatmap list.
The grouping-level for the visualization will again be the cell type information from Section 9.3. Initially, we will create two separate Heatmap objects for cell type and state markers. Then, metadata information, including the cancer type proportion and number of cells/patients per cell type, will be extracted into HeatmapAnnotation objects. Notably, we will add spatial features per cell type, here the number of neighbors extracted from colPair(spe) and cell area, in another HeatmapAnnotation object. Ultimately, all objects are combined in a HeatmapList and visualized. library(ComplexHeatmap) library(circlize) library(tidyverse) set.seed(22) ### 1. Heatmap bodies ### # Heatmap body color col_exprs <- colorRamp2(c(0,1,2,3,4), c("#440154FF","#3B518BFF","#20938CFF", "#6ACD5AFF","#FDE725FF")) # Create Heatmap objects # By cell type markers celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), ids = spe$celltype, statistics = "mean", use.assay.type = "exprs", subset.row = rownames(spe)[rowData(spe)$marker_class == "type"]) h_type <- Heatmap(t(assay(celltype_mean, "exprs")), column_title = "type_markers", col = col_exprs, name= "mean exprs", show_row_names = TRUE, show_column_names = TRUE) # By cell state markers cellstate_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"), ids = spe$celltype, statistics = "mean", use.assay.type = "exprs", subset.row = rownames(spe)[rowData(spe)$marker_class == "state"]) h_state <- Heatmap(t(assay(cellstate_mean, "exprs")), column_title = "state_markers", col = col_exprs, name= "mean exprs", show_row_names = TRUE, show_column_names = TRUE) ### 2.
Heatmap annotation ### ### 2.1 Metadata features anno <- colData(celltype_mean) %>% as.data.frame %>% select(celltype, ncells) # Proportion of indication per celltype indication <- unclass(prop.table(table(spe$celltype, spe$indication), margin = 1)) # Number of contributing patients per celltype cluster_PID <- colData(spe) %>% as.data.frame() %>% select(celltype, patient_id) %>% group_by(celltype) %>% table() %>% as.data.frame() n_PID <- cluster_PID %>% filter(Freq>0) %>% group_by(celltype) %>% count(name = "n_PID") %>% column_to_rownames("celltype") # Create HeatmapAnnotation objects ha_anno <- HeatmapAnnotation(celltype = anno$celltype, border = TRUE, gap = unit(1,"mm"), col = list(celltype = metadata(spe)$color_vectors$celltype), which = "row") ha_meta <- HeatmapAnnotation(n_cells = anno_barplot(anno$ncells, width = unit(10, "mm")), n_PID = anno_barplot(n_PID, width = unit(10, "mm")), indication = anno_barplot(indication,width = unit(10, "mm"), gp = gpar(fill = metadata(spe)$color_vectors$indication)), border = TRUE, annotation_name_rot = 90, gap = unit(1,"mm"), which = "row") ### 2.2 Spatial features # Add number of neighbors to spe object (saved in colPair) spe$n_neighbors <- countLnodeHits(colPair(spe, "neighborhood")) # Select spatial features and average over celltypes spatial <- colData(spe) %>% as.data.frame() %>% select(area, celltype, n_neighbors) spatial <- spatial %>% select(-celltype) %>% aggregate(by = list(celltype = spatial$celltype), FUN = mean) %>% column_to_rownames("celltype") # Create HeatmapAnnotation object ha_spatial <- HeatmapAnnotation( area = spatial$area, n_neighbors = spatial$n_neighbors, border = TRUE, gap = unit(1,"mm"), which = "row") ### 3. 
Plot rich heatmap ### # Create HeatmapList object h_list <- h_type + h_state + ha_anno + ha_spatial + ha_meta # Add customized legend for anno_barplot() lgd <- Legend(title = "indication", at = colnames(indication), legend_gp = gpar(fill = metadata(spe)$color_vectors$indication)) # Plot draw(h_list,annotation_legend_list = list(lgd)) This plot summarizes most of the information we have seen previously in this chapter. In addition, we can observe that tumor cells have the largest mean cell area, a high number of neighbors and elevated Ki67 expression. BnT cells have the highest number of neighbors on average, which is biologically sound given their predominant location in highly immune infiltrated regions (such as TLS). 10.4.2 Interactive visualization For interactive visualization of the single-cell data the iSEE shiny application can be used. For a comprehensive tutorial, please refer to the iSEE vignette. if (interactive()) { library(iSEE) iSEE(spe) } 10.5 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] grid stats4 stats graphics grDevices utils datasets ## [8] methods base ## ## other attached packages: ## [1] lubridate_1.9.3 forcats_1.0.0 ## [3] stringr_1.5.0 dplyr_1.1.3 ## [5] purrr_1.0.2 readr_2.1.4 ## [7] tidyr_1.3.0 tibble_3.2.1 ## [9] tidyverse_2.0.0 circlize_0.4.15 ## [11] ComplexHeatmap_2.16.0
CATALYST_1.24.0 ## [13] viridis_0.6.4 viridisLite_0.4.2 ## [15] cowplot_1.1.1 patchwork_1.1.3 ## [17] scater_1.28.0 scuttle_1.10.2 ## [19] dittoSeq_1.12.1 ggplot2_3.4.3 ## [21] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 ## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0 ## [25] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 ## [27] IRanges_2.34.1 S4Vectors_0.38.2 ## [29] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 ## [31] matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] splines_4.3.1 bitops_1.0-7 ## [3] R.oo_1.25.0 polyclip_1.10-6 ## [5] XML_3.99-0.14 lifecycle_1.0.3 ## [7] rstatix_0.7.2 edgeR_3.42.4 ## [9] doParallel_1.0.17 lattice_0.21-8 ## [11] MASS_7.3-60 backports_1.4.1 ## [13] magrittr_2.0.3 limma_3.56.2 ## [15] sass_0.4.7 rmarkdown_2.25 ## [17] jquerylib_0.1.4 yaml_2.3.7 ## [19] plotrix_3.8-2 RColorBrewer_1.1-3 ## [21] ConsensusClusterPlus_1.64.0 multcomp_1.4-25 ## [23] abind_1.4-5 zlibbioc_1.46.0 ## [25] Rtsne_0.16 R.utils_2.12.2 ## [27] RCurl_1.98-1.12 TH.data_1.1-2 ## [29] tweenr_2.0.2 sandwich_3.0-2 ## [31] GenomeInfoDbData_1.2.10 ggrepel_0.9.3 ## [33] irlba_2.3.5.1 pheatmap_1.0.12 ## [35] dqrng_0.3.1 DelayedMatrixStats_1.22.6 ## [37] codetools_0.2-19 DropletUtils_1.20.0 ## [39] DelayedArray_0.26.7 ggforce_0.4.1 ## [41] tidyselect_1.2.0 shape_1.4.6 ## [43] farver_2.1.1 ScaledMatrix_1.8.1 ## [45] jsonlite_1.8.7 GetoptLong_1.0.5 ## [47] BiocNeighbors_1.18.0 ggridges_0.5.4 ## [49] survival_3.5-5 iterators_1.0.14 ## [51] foreach_1.5.2 tools_4.3.1 ## [53] ggnewscale_0.4.9 Rcpp_1.0.11 ## [55] glue_1.6.2 gridExtra_2.3 ## [57] xfun_0.40 HDF5Array_1.28.1 ## [59] withr_2.5.1 fastmap_1.1.1 ## [61] rhdf5filters_1.12.1 fansi_1.0.4 ## [63] digest_0.6.33 rsvd_1.0.5 ## [65] timechange_0.2.0 R6_2.5.1 ## [67] colorspace_2.1-0 Cairo_1.6-1 ## [69] gtools_3.9.4 R.methodsS3_1.8.2 ## [71] utf8_1.2.3 generics_0.1.3 ## [73] data.table_1.14.8 S4Arrays_1.0.6 ## [75] pkgconfig_2.0.3 gtable_0.3.4 ## [77] RProtoBufLib_2.12.1 XVector_0.40.0 ## [79] 
htmltools_0.5.6 carData_3.0-5 ## [81] bookdown_0.35 clue_0.3-65 ## [83] scales_1.2.1 png_0.1-8 ## [85] colorRamps_2.3.1 knitr_1.44 ## [87] rstudioapi_0.15.0 tzdb_0.4.0 ## [89] reshape2_1.4.4 rjson_0.2.21 ## [91] cachem_1.0.8 zoo_1.8-12 ## [93] rhdf5_2.44.0 GlobalOptions_0.1.2 ## [95] parallel_4.3.1 vipor_0.4.5 ## [97] pillar_1.9.0 vctrs_0.6.3 ## [99] ggpubr_0.6.0 car_3.1-2 ## [101] BiocSingular_1.16.0 cytolib_2.12.1 ## [103] beachmat_2.16.0 cluster_2.1.4 ## [105] beeswarm_0.4.0 evaluate_0.21 ## [107] magick_2.8.0 mvtnorm_1.2-3 ## [109] cli_3.6.1 locfit_1.5-9.8 ## [111] compiler_4.3.1 rlang_1.1.1 ## [113] crayon_1.5.2 ggsignif_0.6.4 ## [115] labeling_0.4.3 FlowSOM_2.8.0 ## [117] flowCore_2.12.2 plyr_1.8.8 ## [119] ggbeeswarm_0.7.2 stringi_1.7.12 ## [121] BiocParallel_1.34.2 nnls_1.5 ## [123] munsell_0.5.0 Matrix_1.6-1.1 ## [125] hms_1.1.3 sparseMatrixStats_1.12.2 ## [127] Rhdf5lib_1.22.1 drc_3.0-1 ## [129] igraph_1.5.1 broom_1.0.5 ## [131] bslib_0.5.1 References "],["image-visualization.html", "11 Image visualization 11.1 Pixel visualization 11.2 Cell visualization 11.3 Adjusting plot annotations 11.4 Displaying individual images 11.5 Saving and returning images 11.6 Interactive image visualization 11.7 Session Info", " 11 Image visualization The following section describes how to visualize the abundance of biomolecules (e.g., protein or RNA) as well as cell-specific metadata on images. Section 11.1 focuses on visualizing pixel-level information including the generation of pseudo-color composite images. Section 11.2 highlights the visualization of cell metadata (e.g., cell phenotype) as well as summarized pixel intensities on cell segmentation masks. The cytomapper R/Bioconductor package was developed to support the handling and visualization of multiple multi-channel images and segmentation masks (Eling et al. 2020). 
The main data object for image handling is the CytoImageList container which we used in Section 5 to store multi-channel images and segmentation masks. We will first read in the previously processed data and randomly select 3 images for visualization purposes. library(SpatialExperiment) library(cytomapper) spe <- readRDS("data/spe.rds") images <- readRDS("data/images.rds") masks <- readRDS("data/masks.rds") # Sample images set.seed(220517) cur_id <- sample(unique(spe$sample_id), 3) cur_images <- images[names(images) %in% cur_id] cur_masks <- masks[names(masks) %in% cur_id] 11.1 Pixel visualization The following section gives examples for visualizing individual channels or multiple channels as pseudo-color composite images. For this the cytomapper package exports the plotPixels function which expects a CytoImageList object storing one or multiple multi-channel images. In the simplest use case, a single channel can be visualized as follows: plotPixels(cur_images, colour_by = "Ecad", bcg = list(Ecad = c(0, 5, 1))) The plot above shows the tissue expression of the epithelial tumor marker E-cadherin on the 3 selected images. The bcg parameter (default c(0, 1, 1)) stands for “background”, “contrast”, “gamma” and controls these attributes of the image. This parameter takes a named list where each entry specifies these attributes per channel. The first value of the numeric vector will be added to the pixel intensities (background); pixel intensities will be multiplied by the second entry of the vector (contrast); pixel intensities will be exponentiated by the third entry of the vector (gamma). In most cases, it is sufficient to adjust the second (contrast) entry of the vector. The following example highlights the visualization of 6 markers (maximum allowed number of markers) at once per image. The markers indicate the spatial distribution of tumor cells (E-cadherin), T cells (CD3), B cells (CD20), CD8+ T cells (CD8a), plasma cells (CD38) and proliferating cells (Ki67). 
plotPixels(cur_images, colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), bcg = list(Ecad = c(0, 5, 1), CD3 = c(0, 5, 1), CD20 = c(0, 5, 1), CD8a = c(0, 5, 1), CD38 = c(0, 8, 1), Ki67 = c(0, 5, 1))) 11.1.1 Adjusting colors The default colors for visualization are chosen by the additive RGB (red, green, blue) color model. For six markers the default colors are: red, green, blue, cyan (green + blue), magenta (red + blue), yellow (green + red). These colors are the easiest to distinguish by eye. However, you can select other colors for each channel by setting the colour parameter: plotPixels(cur_images, colour_by = c("Ecad", "CD3", "CD20"), bcg = list(Ecad = c(0, 5, 1), CD3 = c(0, 5, 1), CD20 = c(0, 5, 1)), colour = list(Ecad = c("black", "burlywood1"), CD3 = c("black", "cyan2"), CD20 = c("black", "firebrick1"))) The colour parameter takes a named list in which each entry specifies the colors from which a color gradient is constructed via colorRampPalette. These are usually vectors of length 2 in which the first entry is \"black\" and the second entry specifies the color of choice. Although not recommended, you can also specify more than two colors to generate a more complex color gradient. 11.1.2 Image normalization As an alternative to setting the bcg parameter, images can first be normalized. Normalization here means to scale the pixel intensities per channel between 0 and 1 (or a range specified by the ft parameter in the normalize function). By default, the normalize function scales pixel intensities across all images contained in the CytoImageList object (separateImages = FALSE). Each individual channel is scaled independently (separateChannels = TRUE). After 0-1 normalization, maximum pixel intensities can be clipped to enhance the contrast of the image (setting the inputRange parameter). In the following example, the clipping to 0 and 0.2 is the same as multiplying the pixel intensities by a factor of 5. 
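The equivalence between clipping at 0.2 and multiplying by 5 can be checked on a toy vector in base R. This is only a minimal sketch: the values are made up, and `clip_rescale` is a hypothetical helper written for illustration, not a cytomapper function.

```r
# Toy intensities that are already scaled to the range [0, 1]
x <- c(0, 0.05, 0.1, 0.2, 0.5, 1)

# Hypothetical helper (not part of cytomapper): clip values to
# [lower, upper] and linearly rescale the clipped range to [0, 1]
clip_rescale <- function(x, lower, upper) {
  clipped <- pmin(pmax(x, lower), upper)
  (clipped - lower) / (upper - lower)
}

clipped <- clip_rescale(x, lower = 0, upper = 0.2)

# Multiplying by 5 and capping at 1 gives the same values
multiplied <- pmin(x * 5, 1)

all.equal(clipped, multiplied)
```

Intensities above 0.2 saturate at the maximum in both cases, which is what increases the visible contrast of dimmer pixels.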
# 0 - 1 channel scaling across all images norm_images <- cytomapper::normalize(cur_images) # Clip channel at 0.2 norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2)) plotPixels(norm_images, colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) The default setting of scaling pixel intensities across all images ensures comparable intensity levels across images. Pixel intensities can also be scaled per image, thereby correcting for staining/expression differences between images: # 0 - 1 channel scaling per image norm_images <- cytomapper::normalize(cur_images, separateImages = TRUE) # Clip channel at 0.2 norm_images <- cytomapper::normalize(norm_images, inputRange = c(0, 0.2)) plotPixels(norm_images, colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) As we can see, the marker Ki67 appears brighter on images 2 and 3 in comparison to scaling the channel across all images. Finally, the normalize function also accepts a named list input for the inputRange argument. In this list, the clipping range per channel can be set individually: # 0 - 1 channel scaling per image norm_images <- cytomapper::normalize(cur_images, separateImages = TRUE, inputRange = list(Ecad = c(0, 50), CD3 = c(0, 30), CD20 = c(0, 40), CD8a = c(0, 50), CD38 = c(0, 10), Ki67 = c(0, 70))) plotPixels(norm_images, colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67")) 11.2 Cell visualization In the following section, we will show examples on how to visualize single cells either as segmentation masks or outlined on composite images. This type of visualization allows observing the spatial distribution of cell phenotypes, the visual assessment of morphological features and quality control in terms of cell segmentation and phenotyping. 11.2.1 Visualizing metadata The cytomapper package provides the plotCells function that accepts a CytoImageList object containing segmentation masks. 
These are defined as single channel images where sets of pixels with the same integer ID identify individual cells. This integer ID can be found as an entry in the colData(spe) slot and as pixel information in the segmentation masks. The entry in colData(spe) needs to be specified via the cell_id argument to the plotCells function. In that way, data contained in the SpatialExperiment object can be mapped to segmentation masks. For the current dataset, the cell IDs are stored in colData(spe)$ObjectNumber. As cell IDs are only unique within a single image, plotCells also requires the img_id argument. This argument specifies the colData(spe) as well as the mcols(masks) entry that stores the unique image name from which each cell was extracted. In the current dataset the unique image names are stored in colData(spe)$sample_id and mcols(masks)$sample_id. Providing these two entries that allow mapping between the SpatialExperiment object and segmentation masks, we can now color individual cells based on their cell type: plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype") For consistent visualization, the plotCells function takes a named list as the colour argument. The entry name must match the colour_by argument. plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype", colour = list(celltype = metadata(spe)$color_vectors$celltype)) If only individual cell types should be visualized, the SpatialExperiment object can be subsetted (e.g., to only contain CD8+ T cells). In the following example CD8+ T cells are colored in red and all other cells that are not contained in the dataset are colored in white (as set by the missing_colour argument). 
CD8 <- spe[,spe$celltype == "CD8"] plotCells(cur_masks, object = CD8, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype", colour = list(celltype = c(CD8 = "red")), missing_colour = "white") In terms of visualizing metadata, any entry in the colData(spe) slot can be visualized. The plotCells function automatically detects if the entry is continuous or discrete. In this fashion, we can now visualize the area of each cell: plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "area") 11.2.2 Visualizing expression Similar to visualizing single-cell metadata on segmentation masks, we can use the plotCells function to visualize the aggregated pixel intensities per cell. In the current dataset pixel intensities were aggregated by computing the mean pixel intensity per cell and per channel. The plotCells function accepts the exprs_values argument (default counts) that allows selecting the assay which stores the expression values that should be visualized. In the following example, we visualize the asinh-transformed mean pixel intensities of the epithelial marker E-cadherin on segmentation masks. plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "Ecad", exprs_values = "exprs") We will now visualize the maximum number of allowed markers as composites on the segmentation masks. As above the markers indicate the spatial distribution of tumor cells (E-cadherin), T cells (CD3), B cells (CD20), CD8+ T cells (CD8a), plasma cells (CD38) and proliferating cells (Ki67). plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), exprs_values = "exprs") While visualizing 6 markers on the pixel-level may still allow the distinction of different tissue structures, observing single-cell expression levels is difficult when visualizing many markers simultaneously due to often overlapping expression. 
Similarly to adjusting marker colors when visualizing pixel intensities, we can change the color gradients per marker by setting the colour argument: plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = c("Ecad", "CD3", "CD20"), exprs_values = "exprs", colour = list(Ecad = c("black", "burlywood1"), CD3 = c("black", "cyan2"), CD20 = c("black", "firebrick1"))) 11.2.3 Outlining cells on images The following section highlights the combined visualization of pixel- and cell-level information at once. For this, besides the SpatialExperiment object, the plotPixels function accepts two CytoImageList objects: one for the multi-channel images and one for the segmentation masks. By specifying the outline_by parameter, the outlines of cells can now be colored based on their metadata. The following example first generates a 3-channel composite image displaying the expression of E-cadherin, CD3 and CD20 before coloring the cells’ outlines by their cell phenotype. plotPixels(image = cur_images, mask = cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = c("Ecad", "CD3", "CD20"), outline_by = "celltype", bcg = list(Ecad = c(0, 5, 1), CD3 = c(0, 5, 1), CD20 = c(0, 5, 1)), colour = list(celltype = metadata(spe)$color_vectors$celltype), thick = TRUE) Distinguishing individual cell phenotypes is nearly impossible in the images above. However, the SpatialExperiment object can be subsetted to only contain cells of a single or few phenotypes. This allows the selective visualization of cell outlines on composite images. Here, we select all CD8+ T cells from the dataset and outline them on a 2-channel composite image displaying the expression of CD3 and CD8a. 
CD8 <- spe[,spe$celltype == "CD8"] plotPixels(image = cur_images, mask = cur_masks, object = CD8, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = c("CD3", "CD8a"), outline_by = "celltype", bcg = list(CD3 = c(0, 5, 1), CD8a = c(0, 5, 1)), colour = list(celltype = c("CD8" = "white")), thick = TRUE) This type of visualization allows the quality control of two things: 1. segmentation quality of individual cell types can be checked and 2. cell phenotyping accuracy can be visually assessed against expected marker expression. 11.3 Adjusting plot annotations The cytomapper package provides a number of function arguments to adjust the visual appearance of figures that are shared between the plotPixels and plotCells function. For a full overview of the arguments please refer to ?plotting-param. We use the following example to highlight how to adjust the scale bar, the image title, the legend appearance and the margin between images. plotPixels(cur_images, colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"), bcg = list(Ecad = c(0, 5, 1), CD3 = c(0, 5, 1), CD20 = c(0, 5, 1), CD8a = c(0, 5, 1), CD38 = c(0, 8, 1), Ki67 = c(0, 5, 1)), scale_bar = list(length = 100, label = expression("100 " ~ mu * "m"), cex = 0.7, lwidth = 10, colour = "grey", position = "bottomleft", margin = c(5,5), frame = 3), image_title = list(text = mcols(cur_images)$indication, position = "topright", colour = "grey", margin = c(5,5), font = 2, cex = 2), legend = list(colour_by.title.cex = 0.7, margin = 10), margin = 40) 11.4 Displaying individual images By default, all images are displayed on the same graphics device. This can be useful when saving all images at once (see next section) to zoom into the individual images instead of opening each image individually. However, when displaying images in a markdown document these are more accessible when visualized individually. 
For this, the plotPixels and plotCells functions accept the display parameter that when set to \"single\" displays each resulting image in its own graphics device: plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype", colour = list(celltype = metadata(spe)$color_vectors$celltype), display = "single", legend = NULL) 11.5 Saving and returning images The final section addresses how to save composite images and how to return them for integration with other plots. The plotPixels and plotCells functions accept the save_plot argument which takes a named list of the following entries: filename indicates the location and file type of the image saved to disk; scale adjusts the resolution of the saved image (this only needs to be adjusted for small images). plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype", colour = list(celltype = metadata(spe)$color_vectors$celltype), save_plot = list(filename = "data/celltype_image.png")) The composite images (together with their annotation) can also be returned. In the following code chunk we save two example plots to variables (out1 and out2). out1 <- plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = "celltype", colour = list(celltype = metadata(spe)$color_vectors$celltype), return_plot = TRUE) out2 <- plotCells(cur_masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id", colour_by = c("Ecad", "CD3", "CD20"), exprs_values = "exprs", return_plot = TRUE) The composite images are stored in out1$plot and out2$plot and can be converted into a graph object recognized by the cowplot package. The final function call of the following chunk plots both objects next to each other. 
library(cowplot) library(gridGraphics) p1 <- ggdraw(out1$plot, clip = "on") p2 <- ggdraw(out2$plot, clip = "on") plot_grid(p1, p2) 11.6 Interactive image visualization The cytoviewer package allows the interactive visualization of multi-channel images and segmentation masks. It also allows to map cellular metadata onto segmentation masks and outlining of cells on composite images. For a full introduction to the package, please refer to the vignette. library(cytoviewer) app <- cytoviewer(image = images, mask = masks, object = spe, cell_id = "ObjectNumber", img_id = "sample_id") if (interactive()) { shiny::runApp(app, launch.browser = TRUE) } 11.7 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] grid stats4 stats graphics grDevices utils datasets ## [8] methods base ## ## other attached packages: ## [1] cytoviewer_1.0.1 gridGraphics_0.5-1 ## [3] cowplot_1.1.1 cytomapper_1.12.0 ## [5] EBImage_4.42.0 SpatialExperiment_1.10.0 ## [7] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 ## [9] Biobase_2.60.0 GenomicRanges_1.52.0 ## [11] GenomeInfoDb_1.36.3 IRanges_2.34.1 ## [13] S4Vectors_0.38.2 BiocGenerics_0.46.0 ## [15] MatrixGenerics_1.12.3 matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] splines_4.3.1 later_1.3.1 ## [3] bitops_1.0-7 tibble_3.2.1 ## [5] R.oo_1.25.0 svgPanZoom_0.3.4 ## 
[7] polyclip_1.10-6 XML_3.99-0.14 ## [9] lifecycle_1.0.3 rstatix_0.7.2 ## [11] edgeR_3.42.4 doParallel_1.0.17 ## [13] lattice_0.21-8 MASS_7.3-60 ## [15] backports_1.4.1 magrittr_2.0.3 ## [17] limma_3.56.2 sass_0.4.7 ## [19] rmarkdown_2.25 plotrix_3.8-2 ## [21] jquerylib_0.1.4 yaml_2.3.7 ## [23] httpuv_1.6.11 sp_2.0-0 ## [25] RColorBrewer_1.1-3 ConsensusClusterPlus_1.64.0 ## [27] multcomp_1.4-25 abind_1.4-5 ## [29] zlibbioc_1.46.0 Rtsne_0.16 ## [31] purrr_1.0.2 R.utils_2.12.2 ## [33] RCurl_1.98-1.12 TH.data_1.1-2 ## [35] tweenr_2.0.2 sandwich_3.0-2 ## [37] circlize_0.4.15 GenomeInfoDbData_1.2.10 ## [39] ggrepel_0.9.3 irlba_2.3.5.1 ## [41] CATALYST_1.24.0 terra_1.7-46 ## [43] dqrng_0.3.1 svglite_2.1.1 ## [45] DelayedMatrixStats_1.22.6 codetools_0.2-19 ## [47] DropletUtils_1.20.0 DelayedArray_0.26.7 ## [49] scuttle_1.10.2 ggforce_0.4.1 ## [51] tidyselect_1.2.0 shape_1.4.6 ## [53] raster_3.6-23 farver_2.1.1 ## [55] ScaledMatrix_1.8.1 viridis_0.6.4 ## [57] jsonlite_1.8.7 BiocNeighbors_1.18.0 ## [59] GetoptLong_1.0.5 ellipsis_0.3.2 ## [61] scater_1.28.0 ggridges_0.5.4 ## [63] survival_3.5-5 iterators_1.0.14 ## [65] systemfonts_1.0.4 foreach_1.5.2 ## [67] tools_4.3.1 ggnewscale_0.4.9 ## [69] Rcpp_1.0.11 glue_1.6.2 ## [71] gridExtra_2.3 xfun_0.40 ## [73] dplyr_1.1.3 HDF5Array_1.28.1 ## [75] shinydashboard_0.7.2 withr_2.5.1 ## [77] fastmap_1.1.1 rhdf5filters_1.12.1 ## [79] fansi_1.0.4 rsvd_1.0.5 ## [81] digest_0.6.33 R6_2.5.1 ## [83] mime_0.12 colorspace_2.1-0 ## [85] gtools_3.9.4 jpeg_0.1-10 ## [87] R.methodsS3_1.8.2 utf8_1.2.3 ## [89] tidyr_1.3.0 generics_0.1.3 ## [91] data.table_1.14.8 htmlwidgets_1.6.2 ## [93] S4Arrays_1.0.6 pkgconfig_2.0.3 ## [95] gtable_0.3.4 ComplexHeatmap_2.16.0 ## [97] RProtoBufLib_2.12.1 XVector_0.40.0 ## [99] htmltools_0.5.6 carData_3.0-5 ## [101] bookdown_0.35 fftwtools_0.9-11 ## [103] clue_0.3-65 scales_1.2.1 ## [105] png_0.1-8 colorRamps_2.3.1 ## [107] knitr_1.44 rstudioapi_0.15.0 ## [109] reshape2_1.4.4 rjson_0.2.21 ## [111] cachem_1.0.8 
zoo_1.8-12 ## [113] rhdf5_2.44.0 GlobalOptions_0.1.2 ## [115] stringr_1.5.0 shinycssloaders_1.0.0 ## [117] miniUI_0.1.1.1 parallel_4.3.1 ## [119] vipor_0.4.5 pillar_1.9.0 ## [121] vctrs_0.6.3 promises_1.2.1 ## [123] ggpubr_0.6.0 BiocSingular_1.16.0 ## [125] car_3.1-2 cytolib_2.12.1 ## [127] beachmat_2.16.0 xtable_1.8-4 ## [129] cluster_2.1.4 archive_1.1.6 ## [131] beeswarm_0.4.0 evaluate_0.21 ## [133] magick_2.8.0 mvtnorm_1.2-3 ## [135] cli_3.6.1 locfit_1.5-9.8 ## [137] compiler_4.3.1 rlang_1.1.1 ## [139] crayon_1.5.2 ggsignif_0.6.4 ## [141] FlowSOM_2.8.0 plyr_1.8.8 ## [143] flowCore_2.12.2 ggbeeswarm_0.7.2 ## [145] stringi_1.7.12 viridisLite_0.4.2 ## [147] BiocParallel_1.34.2 nnls_1.5 ## [149] munsell_0.5.0 tiff_0.1-11 ## [151] colourpicker_1.3.0 Matrix_1.6-1.1 ## [153] sparseMatrixStats_1.12.2 ggplot2_3.4.3 ## [155] Rhdf5lib_1.22.1 shiny_1.7.5 ## [157] fontawesome_0.5.2 drc_3.0-1 ## [159] memoise_2.0.1 igraph_1.5.1 ## [161] broom_1.0.5 bslib_0.5.1 References "],["performing-spatial-analysis.html", "12 Performing spatial analysis 12.1 Spatial interaction graphs 12.2 Spatial visualization 12.3 Spatial community analysis 12.4 Cellular neighborhood analysis 12.5 Spatial context analysis 12.6 Patch detection 12.7 Interaction analysis 12.8 Session Info", " 12 Performing spatial analysis Highly multiplexed imaging technologies measure the spatial distributions of molecule abundances across tissue sections. As such, having the option to analyze single cells in their spatial tissue context is a key strength of these technologies. A number of software packages such as squidpy, giotto and Seurat have been developed to analyse and visualize cells in their spatial context. The following chapter will highlight the use of imcRtools and other Bioconductor packages to visualize and analyse single-cell data obtained from highly multiplexed imaging technologies. We will first read in the spatially-annotated single-cell data processed in the previous sections. 
library(SpatialExperiment) spe <- readRDS("data/spe.rds") 12.1 Spatial interaction graphs Many spatial analysis approaches either compare the observed versus expected number of cells around a given cell type (point process) or utilize interaction graphs (spatial object graphs) to estimate clustering or interaction frequencies between cell types. The steinbock framework allows the construction of these spatial graphs. During image processing (see Section 4.3), we have constructed a spatial graph by expanding the individual cell masks by 4 pixels. The imcRtools package further allows the ad hoc construction of spatial graphs directly using a SpatialExperiment or SingleCellExperiment object while considering the spatial location (centroids) of individual cells. The buildSpatialGraph function allows constructing spatial graphs by detecting the k-nearest neighbors in 2D (knn), by detecting all cells within a given distance to the center cell (expansion) and by Delaunay triangulation (delaunay). When constructing a knn graph, the number of neighbors (k) needs to be set and (optionally) the maximum distance to consider (max_dist) can be specified. When constructing a graph via expansion, the distance to expand (threshold) needs to be provided. For graphs constructed via Delaunay triangulation, the max_dist parameter can be set to avoid unusually large connections at the edge of the image. library(imcRtools) spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20) ## The returned object is ordered by the 'sample_id' entry. spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20) ## The returned object is ordered by the 'sample_id' entry. spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 20) ## The returned object is ordered by the 'sample_id' entry. The spatial graphs are stored in colPair(spe, name) slots. 
These slots store SelfHits objects representing edge lists in which the first column indicates the index of the “from” cell and the second column the index of the “to” cell. Each edge list is newly constructed when subsetting the object. colPairNames(spe) ## [1] "neighborhood" "knn_interaction_graph" ## [3] "expansion_interaction_graph" "delaunay_interaction_graph" Here, colPair(spe, \"neighborhood\") stores the spatial graph constructed by steinbock, colPair(spe, \"knn_interaction_graph\") stores the knn spatial graph, colPair(spe, \"expansion_interaction_graph\") stores the expansion graph and colPair(spe, \"delaunay_interaction_graph\") stores the graph constructed by Delaunay triangulation. 12.2 Spatial visualization Section 11 highlights the use of the cytomapper package to visualize multichannel images and segmentation masks. Here, we introduce the plotSpatial function of the imcRtools package to visualize the cells’ centroids and cell-cell interactions as spatial graphs. In the following example, we select one image for visualization purposes. Here, each dot (node) represents a cell and edges are drawn between cells in close physical proximity as detected by steinbock or the buildSpatialGraph function. Nodes are variably colored based on the cell type and edges are colored in grey. 
library(ggplot2) library(viridis) # steinbock interaction graph plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "celltype", img_id = "sample_id", draw_edges = TRUE, colPairName = "neighborhood", nodes_first = FALSE, edge_color_fix = "grey") + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + ggtitle("steinbock interaction graph") # knn interaction graph plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "celltype", img_id = "sample_id", draw_edges = TRUE, colPairName = "knn_interaction_graph", nodes_first = FALSE, edge_color_fix = "grey") + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + ggtitle("knn interaction graph") # expansion interaction graph plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "celltype", img_id = "sample_id", draw_edges = TRUE, colPairName = "expansion_interaction_graph", nodes_first = FALSE, edge_color_fix = "grey") + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + ggtitle("expansion interaction graph") # delaunay interaction graph plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "celltype", img_id = "sample_id", draw_edges = TRUE, colPairName = "delaunay_interaction_graph", nodes_first = FALSE, edge_color_fix = "grey") + scale_color_manual(values = metadata(spe)$color_vectors$celltype) + ggtitle("delaunay interaction graph") Nodes can also be colored based on the cells’ expression levels (e.g., E-cadherin expression) and their size can be adjusted (e.g., based on measured cell area). plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "Ecad", assay_type = "exprs", img_id = "sample_id", draw_edges = TRUE, colPairName = "expansion_interaction_graph", nodes_first = FALSE, node_size_by = "area", directed = FALSE, edge_color_fix = "grey") + scale_size_continuous(range = c(0.1, 2)) + ggtitle("E-cadherin expression") Finally, the plotSpatial function allows displaying all images at once. 
This visualization can be useful to quickly detect larger structures of interest. plotSpatial(spe, node_color_by = "celltype", img_id = "sample_id", node_size_fix = 0.5) + scale_color_manual(values = metadata(spe)$color_vectors$celltype) For full documentation of the plotSpatial function, please refer to ?plotSpatial. 12.3 Spatial community analysis The detection of spatial communities was proposed by Jackson et al. (2020). Here, cells are clustered solely based on their interactions as defined by the spatial object graph. We can perform spatial community detection across all cells as displayed in the next code chunk. Communities with fewer than 10 cells are excluded. Of note: we set the seed outside of the function call for reproducibility purposes as internally the Louvain modularity optimization function is used which gives different results over different runs. set.seed(230621) spe <- detectCommunity(spe, colPairName = "neighborhood", size_threshold = 10) plotSpatial(spe, node_color_by = "spatial_community", img_id = "sample_id", node_size_fix = 0.5) + theme(legend.position = "none") + ggtitle("Spatial tumor communities") + scale_color_manual(values = rev(colors())) The example shown above might not be of interest if different tissue structures exist within which spatial communities should be computed. In the following example, we perform spatial community detection separately for tumor and stromal cells. The general procedure is as follows: (1) create a colData(spe) entry that specifies if a cell is part of the tumor or stroma compartment; (2) use the detectCommunity function of the imcRtools package to cluster cells within the tumor or stroma compartment solely based on their spatial interaction graph as constructed by the steinbock package. Both tumor and stromal spatial communities are stored in the colData of the SpatialExperiment object under the spatial_community identifier. 
Of note: Here, and in contrast to the function call above, we set the seed argument within the SerialParam function for reproducibility purposes. We need this here due to the way the detectCommunity function is implemented when setting the group_by parameter. spe$tumor_stroma <- ifelse(spe$celltype == "Tumor", "Tumor", "Stroma") library(BiocParallel) spe <- detectCommunity(spe, colPairName = "neighborhood", size_threshold = 10, group_by = "tumor_stroma", BPPARAM = SerialParam(RNGseed = 220819)) We can now separately visualize the tumor and stromal communities. plotSpatial(spe[,spe$celltype == "Tumor"], node_color_by = "spatial_community", img_id = "sample_id", node_size_fix = 0.5) + theme(legend.position = "none") + ggtitle("Spatial tumor communities") + scale_color_manual(values = rev(colors())) plotSpatial(spe[,spe$celltype != "Tumor"], node_color_by = "spatial_community", img_id = "sample_id", node_size_fix = 0.5) + theme(legend.position = "none") + ggtitle("Spatial non-tumor communities") + scale_color_manual(values = rev(colors())) The example data was acquired using a panel that mainly focuses on immune cells. We are therefore unable to detect many tumor sub-phenotypes and will focus on the stromal communities. In the next step, the fraction of cell types within each spatial stromal community is displayed. library(pheatmap) library(viridis) cur_spe <- spe[,spe$celltype != "Tumor"] for_plot <- prop.table(table(cur_spe$spatial_community, cur_spe$celltype), margin = 1) pheatmap(for_plot, color = colorRampPalette(c("dark blue", "white", "dark red"))(100), show_rownames = FALSE, scale = "column") We observe that many spatial stromal communities are made up of myeloid cells or “stromal” (non-immune) cells. Other communities are mainly made up of B cells and BnT cells indicating tertiary lymphoid structures (TLS). While plasma cells and CD4\\(^+\\) or CD8\\(^+\\) T cells tend to aggregate, only a few spatial stromal communities consist mainly of neutrophils. 
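The per-community cell type fractions shown in the heatmap are computed with prop.table and margin = 1. A minimal base R sketch on made-up labels (the community and cell type vectors below are toy data, not taken from the dataset) illustrates what this normalization does:

```r
# Toy data: community membership and cell type for eight cells
# (labels are made up for illustration)
community <- c("c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3")
celltype  <- c("Bcell", "Bcell", "BnT", "Myeloid", "Myeloid",
               "Plasma", "CD8", "Plasma")

# Contingency table: communities in rows, cell types in columns
counts <- table(community, celltype)

# margin = 1 divides each row by its row sum, so every community
# (row) sums to 1 across cell types
fractions <- prop.table(counts, margin = 1)

rowSums(fractions)
```

Note that pheatmap with scale = "column" additionally centers and scales each cell type column, so the heatmap colors indicate relative enrichment across communities rather than the raw fractions.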
12.4 Cellular neighborhood analysis The following section highlights the use of the imcRtools package to detect cellular neighborhoods. This approach has been proposed by (Goltsev et al. 2018) and (Schürch et al. 2020) to group cells based on information contained in their direct neighborhood. (Goltsev et al. 2018) performed Delaunay triangulation-based graph construction, neighborhood aggregation and then clustered cells. (Schürch et al. 2020) on the other hand constructed a 10-nearest neighbor graph before aggregating information across neighboring cells. In the following code chunk we will use the 20-nearest neighbor graph as constructed above to define the direct cellular neighborhood. The aggregateNeighbors function allows neighborhood aggregation in 2 different ways: For each cell the function computes the fraction of cells of a certain type (e.g., cell type) among its neighbors. For each cell it aggregates (e.g., mean) the expression counts across all neighboring cells. Based on these measures, cells can now be clustered into cellular neighborhoods. We will first compute the fraction of the different cell types among the 20-nearest neighbors and use kmeans clustering to group cells into 6 cellular neighborhoods. Of note: constructing a 20-nearest neighbor graph and clustering using kmeans with k=6 is only an example. Similar to the analysis done in Section 9.2.2, it is recommended to perform a parameter sweep across different graph construction algorithms and different parameters k for kmeans clustering. The best CN detection settings also depend on the question at hand. Constructing graphs with more neighbors usually results in larger CNs. 
# By celltypes spe <- aggregateNeighbors(spe, colPairName = "knn_interaction_graph", aggregate_by = "metadata", count_by = "celltype") set.seed(220705) cn_1 <- kmeans(spe$aggregatedNeighbors, centers = 6) spe$cn_celltypes <- as.factor(cn_1$cluster) plotSpatial(spe, node_color_by = "cn_celltypes", img_id = "sample_id", node_size_fix = 0.5) + scale_color_brewer(palette = "Set3") The next code chunk visualizes the cell type compositions of the detected cellular neighborhoods (CN). for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype), margin = 1) pheatmap(for_plot, color = colorRampPalette(c("dark blue", "white", "dark red"))(100), scale = "column") CN 1 and CN 6 are mainly composed of tumor cells with CN 6 forming the tumor/stroma border. CN 3 is mainly composed of B and BnT cells indicating TLS. CN 5 is composed of aggregated plasma cells and most T cells. We will now detect cellular neighborhoods by computing the mean expression across the 20-nearest neighbor prior to kmeans clustering (k=6). # By expression spe <- aggregateNeighbors(spe, colPairName = "knn_interaction_graph", aggregate_by = "expression", assay_type = "exprs", subset_row = rowData(spe)$use_channel) set.seed(220705) cn_2 <- kmeans(spe$mean_aggregatedExpression, centers = 6) spe$cn_expression <- as.factor(cn_2$cluster) plotSpatial(spe, node_color_by = "cn_expression", img_id = "sample_id", node_size_fix = 0.5) + scale_color_brewer(palette = "Set3") Also here, we can visualize the cell type composition of each cellular neighborhood. for_plot <- prop.table(table(spe$cn_expression, spe$celltype), margin = 1) pheatmap(for_plot, color = colorRampPalette(c("dark blue", "white", "dark red"))(100), scale = "column") When clustering cells based on the mean expression within the direct neighborhood, tumor cells are split across CN 6, CN 1 and CN 4 without forming a clear tumor/stroma interface. This result reflects patient-to-patient differences in the expression of tumor markers. 
CN 3 again contains B cells and BnT cells but also CD8 and undefined cells, therefore it is less representative of TLS compared to CN 3 in previous CN approach. CN detection based on mean marker expression is therefore sensitive to staining/expression differences between samples as well as lateral spillover due to imperfect segmentation. An alternative to the aggregateNeighbors function is provided by the lisaClust Bioconductor package (Patrick et al. 2023). In contrast to imcRtools, the lisaClust package computes local indicators of spatial associations (LISA) functions and clusters cells based on those. More precise, the package summarizes L-functions from a Poisson point process model to derive numeric vectors for each cell which can then again be clustered using kmeans. All steps are supported by the lisaClust function which can be applied to a SingleCellExperiment and SpatialExperiment object. In the following example, we calculate the LISA curves within a 10µm, 20µm and 50µm neighborhood around each cell. Increasing these radii will lead to broader and smoother spatial clusters. However, a number of parameter settings should be tested to estimate the robustness of the results. library(lisaClust) set.seed(220705) spe <- lisaClust(spe, k = 6, Rs = c(10, 20, 50), spatialCoords = c("Pos_X", "Pos_Y"), cellType = "celltype", imageID = "sample_id") plotSpatial(spe, node_color_by = "region", img_id = "sample_id", node_size_fix = 0.5) + scale_color_brewer(palette = "Set3") Similar to the example above, we can now observe the cell type composition per spatial cluster. for_plot <- prop.table(table(spe$region, spe$celltype), margin = 1) pheatmap(for_plot, color = colorRampPalette(c("dark blue", "white", "dark red"))(100), scale = "column") In this case, CN 1 and 4 contain tumor cells but no CN is forming the tumor/stroma interface. CN 3 represents TLS. CN 2 indicates T cell subtypes and plasma cells are aggregated to CN 5. 
As an alternative way of visualizing the enrichment of cell types within the detected CNs, the lisaClust package provides the regionMap function. regionMap(spe, cellType = "celltype", region = "region") 12.5 Spatial context analysis Downstream of CN assignments, we will analyze the spatial context (SC) of each cell using three functions from the imcRtools package. While CNs can represent sites of unique local processes, the term SC was coined by Bhate and colleagues (Bhate et al. 2022) and describes tissue regions in which distinct CNs may be interacting. Hence, SCs may be interesting regions of specialized biological events. Here, we will first detect SCs using the detectSpatialContext function. This function relies on CN fractions for each cell in a spatial interaction graph (originally a KNN graph), which we will calculate using buildSpatialGraph and aggregateNeighbors. We will focus on the CNs derived from cell type fractions but other CN assignments are possible. Of note, the window size (k for KNN) for buildSpatialGraph should reflect a length scale on which biological signals can be exchanged and depends, among others, on cell density and tissue area. In view of their divergent functionality, we recommend to use a larger window size for SC (interaction between local processes) than for CN (local processes) detection. Since we used a 20-nearest neighbor graph for CN assignment, we will use a 40-nearest neighbor graph for SC detection. As before, different parameters should be tested. Subsequently, the CN fractions are sorted from high-to-low and the SC of each cell is assigned as the minimal combination of SCs that additively surpass a user-defined threshold. The default threshold of 0.9 aims to represent the dominant CNs, hence the most prevalent signals, in a given window. For more details and biological validation, please refer to (Bhate et al. 2022). 
library(circlize) library(RColorBrewer) # Construct a 40-nearest neighbor graph spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", name = "knn_spatialcontext_graph", k = 40) # Compute the fraction of cellular neighborhoods around each cell spe <- aggregateNeighbors(spe, colPairName = "knn_spatialcontext_graph", aggregate_by = "metadata", count_by = "cn_celltypes", name = "aggregatedNeighborhood") # Detect spatial contexts spe <- detectSpatialContext(spe, entry = "aggregatedNeighborhood", threshold = 0.90, name = "spatial_context") # Define SC color scheme n_SCs <- length(unique(spe$spatial_context)) col_SC <- setNames(colorRampPalette(brewer.pal(9, "Paired"))(n_SCs), sort(unique(spe$spatial_context))) # Visualize spatial contexts on images plotSpatial(spe, node_color_by = "spatial_context", img_id = "sample_id", node_size_fix = 0.5) + scale_color_manual(values = col_SC) We detect a total of 52 distinct SCs across this dataset. For ease of interpretation, we will directly compare the CN and SC assignments for Patient3_001. library(patchwork) # Compare CN and SC for one patient p1 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "cn_celltypes", img_id = "sample_id", node_size_fix = 0.5) + scale_color_brewer(palette = "Set3") p2 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"], node_color_by = "spatial_context", img_id = "sample_id", node_size_fix = 0.5) + scale_color_manual(values = col_SC, limits = force) p1 + p2 As expected, we can observe that interfaces between different CNs make up distinct SCs. For instance, interface between CN 3 (TLS region consisting of B and BnT cells) and CN 5 (Plasma- and T-cell dominated) turns to SC 3_5. On the other hand, the core of CN 3 becomes SC 3, since the most abundant CN of the neighborhood for these cells is just the CN itself. 
Next, we filter the SCs based on user-defined thresholds for number of group entries (here at least 3 patients) and/or total number of cells (here minimum of 100 cells) per SC using the filterSpatialContext function. ## Filter spatial contexts # By number of group entries spe <- filterSpatialContext(spe, entry = "spatial_context", group_by = "patient_id", group_threshold = 3, name = "spatial_context_filtered") plotSpatial(spe, node_color_by = "spatial_context_filtered", img_id = "sample_id", node_size_fix = 0.5) + scale_color_manual(values = col_SC, limits = force) # Filter out small and infrequent spatial contexts spe <- filterSpatialContext(spe, entry = "spatial_context", group_by = "patient_id", group_threshold = 3, cells_threshold = 100, name = "spatial_context_filtered") plotSpatial(spe, node_color_by = "spatial_context_filtered", img_id = "sample_id", node_size_fix = 0.5) + scale_color_manual(values = col_SC, limits = force) Lastly, we can use the plotSpatialContext function to generate SC graphs, analogous to CN combination maps in (Bhate et al. 2022). Returned objects are ggplots, which can be easily modified further. We will create a SC graph for the filtered SCs here. ## Plot spatial context graph # Colored by name, size by n_cells plotSpatialContext(spe, entry = "spatial_context_filtered", group_by = "sample_id", node_color_by = "name", node_size_by = "n_cells", node_label_color_by = "name") # Colored by n_cells, size by n_group plotSpatialContext(spe, entry = "spatial_context_filtered", group_by = "sample_id", node_color_by = "n_cells", node_size_by = "n_group", node_label_color_by = "n_cells") + scale_color_viridis() SC 1 (Tumor-dominated), SC 1_6 (Tumor and Tumor-Stroma interface) and SC 4_5 (Plasma/T cell and Myeloid/Neutrophil interface) are the most frequent SCs in this dataset. Moreover, we may compare the degree of the different nodes in the SC graph. 
For example, we can observe that SC 1 has only one degree (directed to SC 1_6), while SC 5 (T cells and plasma cells) has a much higher degree (n = 4) and potentially more CN interactions. 12.6 Patch detection The previous section focused on detecting cellular neighborhoods in a rather unsupervised fashion. However, the imcRtools package also provides methods for detecting spatial compartments in a supervised fashion. The patchDetection function allows the detection of connected sets of similar cells as proposed by (Hoch et al. 2022). In the following example, we will use the patchDetection function to detect tumor patches in three steps: Find connected sets of tumor cells (using the steinbock graph). Components which contain less than 10 cells are excluded. Expand the components by 1µm to construct a concave hull around the patch and include cells within the patch. spe <- patchDetection(spe, patch_cells = spe$celltype == "Tumor", img_id = "sample_id", expand_by = 1, min_patch_size = 10, colPairName = "neighborhood", BPPARAM = MulticoreParam()) ## The returned object is ordered by the 'sample_id' entry. plotSpatial(spe, node_color_by = "patch_id", img_id = "sample_id", node_size_fix = 0.5) + theme(legend.position = "none") + scale_color_manual(values = rev(colors())) We can now calculate the fraction of T cells within each tumor patch to roughly estimate T cell infiltration. library(tidyverse) colData(spe) %>% as_tibble() %>% group_by(patch_id, sample_id) %>% summarize(Tcell_count = sum(celltype == "CD8" | celltype == "CD4"), patch_size = n(), Tcell_freq = Tcell_count / patch_size) %>% filter(!is.na(patch_id)) %>% ggplot() + geom_point(aes(log10(patch_size), Tcell_freq, color = sample_id)) + theme_classic() We can now measure the size of each patch using the patchSize function and visualize tumor patch distribution per patient. 
patch_size <- patchSize(spe, "patch_id") patch_size <- merge(patch_size, colData(spe)[match(patch_size$patch_id, spe$patch_id),], by = "patch_id") ggplot(as.data.frame(patch_size)) + geom_boxplot(aes(patient_id, log10(size))) + geom_point(aes(patient_id, log10(size))) The minDistToCells function can be used to calculate the minimum distance between each cell and a cell set of interest. Here, we highlight its use to calculate the minimum distance of all cells to the detected tumor patches. Negative values indicate the minimum distance of each tumor patch cell to a non-tumor patch cell. spe <- minDistToCells(spe, x_cells = !is.na(spe$patch_id), img_id = "sample_id") ## The returned object is ordered by the 'sample_id' entry. plotSpatial(spe, node_color_by = "distToCells", img_id = "sample_id", node_size_fix = 0.5) + scale_color_gradient2(low = "dark blue", mid = "white", high = "dark red") Finally, we can observe the minimum distances to tumor patches in a cell type specific manner. library(ggridges) ggplot(as.data.frame(colData(spe))) + geom_density_ridges(aes(distToCells, celltype, fill = celltype)) + geom_vline(xintercept = 0, color = "dark red", linewidth = 2) + scale_fill_manual(values = metadata(spe)$color_vectors$celltype) 12.7 Interaction analysis Bug notice: we discovered and fixed a bug in the testInteractions function in version below 1.5.5 which affected SingleCellExperiment or SpatialExperiment objects in which cells were not grouped by image. Please make sure you have the newest version (>= 1.6.0) installed. The next section focuses on statistically testing the pairwise interaction between all cell types of the dataset. For this, the imcRtools package provides the testInteractions function which implements the interaction testing strategy proposed by (Schapiro et al. 2017). 
Per grouping level (e.g., image), the testInteractions function computes the averaged cell type/cell type interaction count and compares this count against an empirical null distribution which is generated by permuting all cell labels (while maintaining the tissue structure). In the following example, we use the steinbock generated spatial interaction graph and estimate the interaction or avoidance between cell types in the dataset. library(scales) out <- testInteractions(spe, group_by = "sample_id", label = "celltype", colPairName = "neighborhood", BPPARAM = SerialParam(RNGseed = 221029)) head(out) ## DataFrame with 6 rows and 10 columns ## group_by from_label to_label ct p_gt p_lt ## <character> <character> <character> <numeric> <numeric> <numeric> ## 1 Patient1_001 Bcell Bcell 0 1.000000 1.000000 ## 2 Patient1_001 Bcell BnTcell 0 1.000000 0.998002 ## 3 Patient1_001 Bcell CD4 3 0.001998 1.000000 ## 4 Patient1_001 Bcell CD8 0 1.000000 0.898102 ## 5 Patient1_001 Bcell Myeloid 0 1.000000 0.804196 ## 6 Patient1_001 Bcell Neutrophil NA NA NA ## interaction p sig sigval ## <logical> <numeric> <logical> <numeric> ## 1 FALSE 1.000000 FALSE 0 ## 2 FALSE 0.998002 FALSE 0 ## 3 TRUE 0.001998 TRUE 1 ## 4 FALSE 0.898102 FALSE 0 ## 5 FALSE 0.804196 FALSE 0 ## 6 NA NA NA NA The returned DataFrame contains the test results per grouping level (in this case the image ID, group_by), “from” cell type (from_label) and “to” cell type (to_label). The sigval entry indicates if a pair of cell types is significantly interacting (sigval = 1), if a pair of cell types is significantly avoiding (sigval = -1) or if no significant interaction or avoidance was detected (sigval = 0). 
These results can be visualized by computing the sum of the sigval entries across all images: out %>% as_tibble() %>% group_by(from_label, to_label) %>% summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>% ggplot() + geom_tile(aes(from_label, to_label, fill = sum_sigval)) + scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) + theme(axis.text.x = element_text(angle = 45, hjust = 1)) In the plot above the red tiles indicate cell type pairs that were detected to significantly interact on a large number of images. On the other hand, blue tiles show cell type pairs which tend to avoid each other on a large number of images. Here we can observe that tumor cells are mostly compartmentalized and are in avoidance with other cell types. As expected, B cells interact with BnT cells; regulatory T cells interact with CD4+ T cells and CD8+ T cells. Most cell types show self interactions indicating spatial clustering. The imcRtools package further implements an interaction testing strategy proposed by (Schulz et al. 2018) where the hypothesis is tested if at least n cells of a certain type are located around a target cell type (from_cell). This type of testing can be performed by selecting method = \"patch\" and specifying the number of patch cells via the patch_size parameter. out <- testInteractions(spe, group_by = "sample_id", label = "celltype", colPairName = "neighborhood", method = "patch", patch_size = 3, BPPARAM = SerialParam(RNGseed = 221029)) out %>% as_tibble() %>% group_by(from_label, to_label) %>% summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>% ggplot() + geom_tile(aes(from_label, to_label, fill = sum_sigval)) + scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) + theme(axis.text.x = element_text(angle = 45, hjust = 1)) These results are comparable to the interaction testing presented above. The main difference comes from the lack of symmetry. 
We can now for example see that 3 or more myeloid cells sit around CD4\\(^+\\) T cells while this interaction is not as strong when considering CD4\\(^+\\) T cells sitting around myeloid cells. Finally, we save the updated SpatialExperiment object. saveRDS(spe, "data/spe.rds") 12.8 Session Info SessionInfo ## R version 4.3.1 (2023-06-16) ## Platform: x86_64-pc-linux-gnu (64-bit) ## Running under: Ubuntu 22.04.3 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## time zone: Etc/UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats4 stats graphics grDevices utils datasets methods ## [8] base ## ## other attached packages: ## [1] testthat_3.1.10 scales_1.2.1 ## [3] ggridges_0.5.4 lubridate_1.9.3 ## [5] forcats_1.0.0 stringr_1.5.0 ## [7] dplyr_1.1.3 purrr_1.0.2 ## [9] readr_2.1.4 tidyr_1.3.0 ## [11] tibble_3.2.1 tidyverse_2.0.0 ## [13] patchwork_1.1.3 RColorBrewer_1.1-3 ## [15] circlize_0.4.15 lisaClust_1.8.1 ## [17] pheatmap_1.0.12 BiocParallel_1.34.2 ## [19] viridis_0.6.4 viridisLite_0.4.2 ## [21] ggplot2_3.4.3 imcRtools_1.6.5 ## [23] SpatialExperiment_1.10.0 SingleCellExperiment_1.22.0 ## [25] SummarizedExperiment_1.30.2 Biobase_2.60.0 ## [27] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 ## [29] IRanges_2.34.1 S4Vectors_0.38.2 ## [31] BiocGenerics_0.46.0 MatrixGenerics_1.12.3 ## [33] matrixStats_1.0.0 ## ## loaded via a namespace (and not attached): ## [1] spatstat.sparse_3.0-2 bitops_1.0-7 ## [3] sf_1.0-14 EBImage_4.42.0 ## [5] doParallel_1.0.17 numDeriv_2016.8-1.1 ## [7] tools_4.3.1 backports_1.4.1 ## [9] 
utf8_1.2.3 R6_2.5.1 ## [11] DT_0.29 HDF5Array_1.28.1 ## [13] mgcv_1.8-42 rhdf5filters_1.12.1 ## [15] GetoptLong_1.0.5 withr_2.5.1 ## [17] sp_2.0-0 gridExtra_2.3 ## [19] ClassifyR_3.4.11 cli_3.6.1 ## [21] spatstat.explore_3.2-3 sandwich_3.0-2 ## [23] labeling_0.4.3 sass_0.4.7 ## [25] spatstat.data_3.0-1 nnls_1.5 ## [27] mvtnorm_1.2-3 proxy_0.4-27 ## [29] systemfonts_1.0.4 colorRamps_2.3.1 ## [31] svglite_2.1.1 R.utils_2.12.2 ## [33] scater_1.28.0 plotrix_3.8-2 ## [35] limma_3.56.2 flowCore_2.12.2 ## [37] rstudioapi_0.15.0 generics_0.1.3 ## [39] shape_1.4.6 spatstat.random_3.1-6 ## [41] gtools_3.9.4 vroom_1.6.3 ## [43] car_3.1-2 scam_1.2-14 ## [45] Matrix_1.6-1.1 RProtoBufLib_2.12.1 ## [47] ggbeeswarm_0.7.2 fansi_1.0.4 ## [49] abind_1.4-5 R.methodsS3_1.8.2 ## [51] terra_1.7-46 lifecycle_1.0.3 ## [53] multcomp_1.4-25 yaml_2.3.7 ## [55] edgeR_3.42.4 carData_3.0-5 ## [57] rhdf5_2.44.0 Rtsne_0.16 ## [59] grid_4.3.1 promises_1.2.1 ## [61] dqrng_0.3.1 crayon_1.5.2 ## [63] shinydashboard_0.7.2 lattice_0.21-8 ## [65] beachmat_2.16.0 cowplot_1.1.1 ## [67] magick_2.8.0 cytomapper_1.12.0 ## [69] pillar_1.9.0 knitr_1.44 ## [71] ComplexHeatmap_2.16.0 RTriangle_1.6-0.12 ## [73] boot_1.3-28.1 rjson_0.2.21 ## [75] codetools_0.2-19 glue_1.6.2 ## [77] V8_4.3.3 data.table_1.14.8 ## [79] MultiAssayExperiment_1.26.0 vctrs_0.6.3 ## [81] png_0.1-8 gtable_0.3.4 ## [83] cachem_1.0.8 xfun_0.40 ## [85] S4Arrays_1.0.6 mime_0.12 ## [87] DropletUtils_1.20.0 tidygraph_1.2.3 ## [89] ConsensusClusterPlus_1.64.0 survival_3.5-5 ## [91] iterators_1.0.14 cytolib_2.12.1 ## [93] units_0.8-4 ellipsis_0.3.2 ## [95] TH.data_1.1-2 nlme_3.1-162 ## [97] bit64_4.0.5 rprojroot_2.0.3 ## [99] bslib_0.5.1 irlba_2.3.5.1 ## [101] svgPanZoom_0.3.4 vipor_0.4.5 ## [103] KernSmooth_2.23-21 colorspace_2.1-0 ## [105] DBI_1.1.3 raster_3.6-23 ## [107] tidyselect_1.2.0 curl_5.0.2 ## [109] bit_4.0.5 compiler_4.3.1 ## [111] BiocNeighbors_1.18.0 desc_1.4.2 ## [113] DelayedArray_0.26.7 bookdown_0.35 ## [115] classInt_0.4-10 
distances_0.1.9 ## [117] goftest_1.2-3 tiff_0.1-11 ## [119] digest_0.6.33 minqa_1.2.6 ## [121] fftwtools_0.9-11 spatstat.utils_3.0-3 ## [123] rmarkdown_2.25 XVector_0.40.0 ## [125] CATALYST_1.24.0 htmltools_0.5.6 ## [127] pkgconfig_2.0.3 jpeg_0.1-10 ## [129] lme4_1.1-34 sparseMatrixStats_1.12.2 ## [131] fastmap_1.1.1 rlang_1.1.1 ## [133] GlobalOptions_0.1.2 htmlwidgets_1.6.2 ## [135] shiny_1.7.5 DelayedMatrixStats_1.22.6 ## [137] farver_2.1.1 jquerylib_0.1.4 ## [139] zoo_1.8-12 jsonlite_1.8.7 ## [141] spicyR_1.12.2 R.oo_1.25.0 ## [143] BiocSingular_1.16.0 RCurl_1.98-1.12 ## [145] magrittr_2.0.3 scuttle_1.10.2 ## [147] GenomeInfoDbData_1.2.10 Rhdf5lib_1.22.1 ## [149] munsell_0.5.0 Rcpp_1.0.11 ## [151] ggnewscale_0.4.9 stringi_1.7.12 ## [153] ggraph_2.1.0 brio_1.1.3 ## [155] zlibbioc_1.46.0 MASS_7.3-60 ## [157] plyr_1.8.8 parallel_4.3.1 ## [159] ggrepel_0.9.3 deldir_1.0-9 ## [161] graphlayouts_1.0.1 splines_4.3.1 ## [163] tensor_1.5 hms_1.1.3 ## [165] locfit_1.5-9.8 igraph_1.5.1 ## [167] ggpubr_0.6.0 spatstat.geom_3.2-5 ## [169] ggsignif_0.6.4 pkgload_1.3.3 ## [171] reshape2_1.4.4 ScaledMatrix_1.8.1 ## [173] XML_3.99-0.14 drc_3.0-1 ## [175] evaluate_0.21 nloptr_2.0.3 ## [177] tzdb_0.4.0 foreach_1.5.2 ## [179] tweenr_2.0.2 httpuv_1.6.11 ## [181] polyclip_1.10-6 clue_0.3-65 ## [183] ggforce_0.4.1 rsvd_1.0.5 ## [185] broom_1.0.5 xtable_1.8-4 ## [187] e1071_1.7-13 rstatix_0.7.2 ## [189] later_1.3.1 class_7.3-22 ## [191] lmerTest_3.1-3 FlowSOM_2.8.0 ## [193] beeswarm_0.4.0 cluster_2.1.4 ## [195] timechange_0.2.0 concaveman_1.1.0 References "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] diff --git a/single-cell-visualization.html b/single-cell-visualization.html new file mode 100644 index 00000000..e19526e4 --- /dev/null +++ b/single-cell-visualization.html @@ -0,0 +1,1255 @@ + + + + + + + 10 Single cell visualization | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

10 Single cell visualization

+

The following section describes typical approaches for visualizing +single-cell data.

+

This chapter is divided into three parts. Section 10.2 +will highlight visualization approaches downstream of cell type +classification from Section 9.3. We will then focus on +visualization methods that relate single-cell data to the sample level +in Section 10.3. Lastly, Section 10.4 will +provide a more customized example on how to integrate various +single-cell and sample metadata into one heatmap using the +ComplexHeatmap +package (Gu, Eils, and Schlesner 2016).

+

Visualization functions from popular R packages in single-cell research +such as +scater, +dittoSeq +and +CATALYST +will be utilized. We will recycle methods and functions that we have +used in previous sections, while also introducing new ones.

+

Please note that this chapter aims to provide an overview of common +visualization options and should be seen as a stepping stone. +Many more options exist, and the user should customize the visualization +according to the biological question at hand.

+
+

10.1 Load data

+

First, we will read in the previously generated SpatialExperiment +object.

+
spe <- readRDS("data/spe.rds")
+

For visualization purposes, we will define markers that were used for +cell type classification and markers that can indicate a specific cell +state (e.g., Ki67 for proliferating cells).

+
# Define cell phenotype markers 
+type_markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", 
+                  "SMA", "CD8a", "CD4", "HLADR", "CD15", "CD38", "PDGFRb")
+
+# Define cell state markers 
+state_markers <- c("CarbonicAnhydrase", "Ki67", "PD1", "GrzB", "PDL1", 
+                   "ICOS", "TCF7", "VISTA")
+
+# Add to spe
+rowData(spe)$marker_class <- ifelse(rownames(spe) %in% type_markers, "type",
+                                    ifelse(rownames(spe) %in% state_markers, "state", 
+                                    "other"))
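The nested ifelse call above can be tried on a toy marker vector without loading the SpatialExperiment object. This is a minimal sketch under stated assumptions: `toy_markers` is a hypothetical stand-in for `rownames(spe)`, and the marker sets are shortened for illustration.

```r
# Hypothetical stand-in for rownames(spe); not the real panel
toy_markers <- c("Ecad", "CD20", "Ki67", "PD1", "DNA1")

# Shortened marker sets (assumption: the full sets are defined as above)
type_markers <- c("Ecad", "CD20")
state_markers <- c("Ki67", "PD1")

# Same nested ifelse logic: test "type" first, then "state", else "other"
marker_class <- ifelse(toy_markers %in% type_markers, "type",
                       ifelse(toy_markers %in% state_markers, "state",
                              "other"))

# Show the class assigned to each marker
setNames(marker_class, toy_markers)
```

Markers that appear in neither set, such as the hypothetical "DNA1" channel here, fall through to "other" and are therefore excluded from any later subsetting on `marker_class == "type"`.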
+
+
+

10.2 Cell-type level

+

In the first section of this chapter, the grouping-level for the +visualization approaches will be the cell type classification from +Section 9.3. Other grouping levels (e.g., cluster +assignments from Section 9.2) are possible and the user +should adjust depending on the chosen analysis workflow.

+
+

10.2.1 Dimensionality reduction visualization

+

As seen before, we can visualize single cells in low-dimensional space. +Often, non-linear methods for dimensionality reduction such as tSNE and +UMAP are used. They aim to preserve the distances between each cell and its +neighbors in the high-dimensional space.

+

Interpreting these plots is not trivial, but local neighborhoods in the +plot can suggest similarity in expression for given cells. See +Orchestrating Single-Cell Analysis with +Bioconductor for more +details.

+

Here, we will use dittoDimPlot from the +dittoSeq +package and plotReducedDim from the +scater package +to visualize the fastMNN-corrected UMAP colored by cell type and +expression (using the asinh-transformed intensities), respectively.

+

Both functions are highly flexible and return ggplot objects which can +be further modified.

+
library(dittoSeq)
+library(scater)
+library(patchwork)
+library(cowplot)
+library(viridis)
+
+## UMAP colored by cell type and expression - dittoDimPlot
+p1 <- dittoDimPlot(spe, 
+                   var = "celltype", 
+                   reduction.use = "UMAP_mnnCorrected", 
+                   size = 0.2,
+                   do.label = TRUE) +
+  scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+  theme(legend.title = element_blank()) +
+  ggtitle("Cell types on UMAP, integrated cells")
+
+p2 <- dittoDimPlot(spe, 
+                   var = "Ecad", 
+                   assay = "exprs",
+                   reduction.use = "UMAP_mnnCorrected", 
+                   size = 0.2, 
+                   colors = viridis(100), 
+                   do.label = TRUE) +
+    scale_color_viridis()
+  
+p1 + p2
+

+

The plotReducedDim function of the scater package provides an alternative +way of visualizing cells in low dimensions. Here, we loop over all type +markers, generate one plot per marker and display the individual plots side by side.

+
# UMAP colored by expression for all markers - plotReducedDim
+plot_list  <- lapply(rownames(spe)[rowData(spe)$marker_class == "type"], function(x){
+                      p <- plotReducedDim(spe, 
+                                          dimred = "UMAP_mnnCorrected",
+                                          colour_by = x,
+                                          by_exprs_values = "exprs",
+                                          point_size = 0.2)
+                      return(p)
+                    })
+
+plot_grid(plotlist = plot_list)
+

+
+
+

10.2.2 Heatmap visualization

+

Next, it is often useful to visualize single-cell expression per cell +type in the form of a heatmap. For this, we will use the dittoHeatmap +function from the +dittoSeq +package.

+

We sub-sample the dataset to 4000 cells for ease of visualization and +overlay the cancer type and patient ID from which the cells were +extracted.

+
set.seed(220818)
+cur_cells <- sample(seq_len(ncol(spe)), 4000)
+
+# Heatmap visualization - dittoHeatmap
+dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$marker_class == "type"],
+             assay = "exprs", 
+             cluster_cols = FALSE, 
+             scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("celltype", "indication", "patient_id"),
+             annotation_colors = list(indication = metadata(spe)$color_vectors$indication,
+                                      patient_id = metadata(spe)$color_vectors$patient_id,
+                                      celltype = metadata(spe)$color_vectors$celltype))
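The sub-sampling step above relies on a fixed seed so that the same cells are drawn on every run. The following sketch illustrates this with base R only; `n_cells` is a hypothetical value standing in for `ncol(spe)`, and the `min()` guard is an added assumption for datasets with fewer than 4000 cells, not part of the original workflow.

```r
# Hypothetical number of cells, standing in for ncol(spe)
n_cells <- 10000L

# Guard (assumption): never request more cells than are available
n_keep <- min(4000L, n_cells)

# Draw column indices without replacement using a fixed seed
set.seed(220818)
cur_cells <- sample(seq_len(n_cells), n_keep)

# Resetting the same seed reproduces exactly the same selection
set.seed(220818)
cur_cells_again <- sample(seq_len(n_cells), n_keep)

identical(cur_cells, cur_cells_again)
```

Without the guard, `sample()` raises an error when asked for more indices than exist, which can happen when the workflow is reused on a small dataset.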
+

+

Similarly, we can visualize the mean marker expression per cell type for all +cells by first calculating the mean marker expression per cell type using the +aggregateAcrossCells function from the +scuttle +package and then using dittoHeatmap. We will annotate the heatmap with the +number of cells per cell type and we will use different ways of feature +scaling.

+
library(scuttle)
+
+## aggregate by cell type
+celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"),  
+                     ids = spe$celltype, 
+                     statistics = "mean",
+                     use.assay.type = "exprs", 
+                     subset.row = rownames(spe)[rowData(spe)$marker_class == "type"])
+
+# No scaling
+dittoHeatmap(celltype_mean,
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             scale = "none",
+             heatmap.colors = viridis(100),
+             annot.by = c("celltype", "ncells"),
+             annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype,
+                                      ncells = plasma(100)))
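Conceptually, the aggregation step computes one mean per marker and cell type and records how many cells contributed. The base R sketch below illustrates this on a made-up matrix; it is not how aggregateAcrossCells is implemented, and `exprs`, `celltype` and the values are hypothetical.

```r
# Toy expression matrix: 2 markers (rows) x 5 cells (columns); values are made up.
# matrix() fills column by column, so each pair below is one cell.
exprs <- matrix(c(4, 0,
                  2, 0,
                  0, 3,
                  0, 1,
                  0, 5),
                nrow = 2,
                dimnames = list(c("Ecad", "CD20"), paste0("cell", 1:5)))

# Hypothetical cell type label per cell (column)
celltype <- c("Tumor", "Tumor", "Bcell", "Bcell", "Bcell")

# Column indices grouped by cell type
idx <- split(seq_along(celltype), celltype)

# Mean expression per marker within each cell type (markers x cell types)
celltype_mean <- sapply(idx, function(i) rowMeans(exprs[, i, drop = FALSE]))

# Number of cells per cell type, analogous to the "ncells" annotation
ncells <- lengths(idx)

celltype_mean
ncells
```

Keeping the cell counts next to the means is useful because a mean computed from very few cells is less reliable than one computed from thousands.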
+

+
# Scaled to max
+dittoHeatmap(celltype_mean,
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             scaled.to.max = TRUE,
+             heatmap.colors.max.scaled = inferno(100),
+             annot.by = c("celltype", "ncells"),
+             annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype,
+                                      ncells = plasma(100)))
+

+
# Z score scaled
+dittoHeatmap(celltype_mean,
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             annot.by = c("celltype", "ncells"),
+             annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype,
+                                      ncells = plasma(100)))
+

+

As illustrated above for unscaled, max-scaled, and Z score-scaled expression values, +different ways of scaling can have strong effects on visualization +output and we encourage the user to test multiple options.
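The three scaling options can be made concrete with a small base-R sketch. The matrix `m` and its values are hypothetical, and the formulas are the generic per-marker definitions of max scaling and Z score scaling; we assume, but do not verify here, that they correspond to what the heatmap function computes internally.

```r
# Hypothetical mean expression matrix: markers in rows, cell types in columns
m <- matrix(c(3.1, 0.2, 0.4,
              0.3, 2.5, 0.1,
              0.9, 1.0, 1.2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("Ecad", "CD20", "CD3"),
                            c("Tumor", "Bcell", "Tcell")))

# No scaling: values are displayed as they are
unscaled <- m

# Max scaling: divide each marker (row) by its maximum, so each row peaks at 1
max_scaled <- m / apply(m, 1, max)

# Z score scaling: center each marker (row) and divide by its standard deviation
z_scaled <- t(scale(t(m)))
```

Max scaling keeps zero at zero and preserves relative differences within a marker, whereas Z score scaling centers every marker, which makes weakly varying markers appear as prominent as strongly varying ones.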

+

Overall, we can observe cell-type specific marker expression (e.g., Tumor += Ecad high and B cells = CD20 high) in agreement with the gating scheme +of Section 9.3.

+
+
+

10.2.3 Violin plot visualization

+

The plotExpression function from the +scater package +allows plotting the distribution of expression values across cell types +for a chosen set of proteins. The output is a ggplot object which can be +modified further.

+
# Violin Plot - plotExpression
+plotExpression(spe[,cur_cells], 
+               features = rownames(spe)[rowData(spe)$marker_class == "type"],
+               x = "celltype", 
+               exprs_values = "exprs", 
+               colour_by = "celltype") +
+    theme(axis.text.x =  element_text(angle = 90))+
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype)
+

+
+
+

10.2.4 Scatter plot visualization

+

Moreover, a protein expression based scatter plot can be generated with +dittoScatterPlot (returns a ggplot object). We overlay the plot with +the cell type information.

+
# Scatter plot
+dittoScatterPlot(spe, 
+                 x.var = "CD3", 
+                 y.var="CD20", 
+                 assay.x = "exprs", 
+                 assay.y = "exprs", 
+                 color.var = "celltype") +
+    scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
+    ggtitle("Scatterplot for CD3/CD20 labelled by celltype")
+

+

We can nicely observe how the “B next to T cell” phenotype (BnTcell) +has high expression values for both CD20 and CD3.

+

Of note, in a setting where the user aims to assign labels to +clusters based on marker genes/proteins, all of the above plots can be +particularly helpful.

+
+
+

10.2.5 Barplot visualization

+

In order to display frequencies of cell types per sample/patient, the +dittoBarPlot function will be used. Data can be represented as +percentages or counts and again ggplot objects are returned.

+
# by sample_id - percentage
+dittoBarPlot(spe, 
+             var = "celltype", 
+             group.by = "sample_id") +
+    scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
+

+
# by patient_id - percentage
+dittoBarPlot(spe, 
+             var = "celltype", 
+             group.by = "patient_id") +
+    scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
+

+
# by patient_id - count
+dittoBarPlot(spe, 
+             scale = "count",
+             var = "celltype", 
+             group.by = "patient_id") +
+    scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
+

+

We can see that cell type frequencies change between samples/patients +and that the highest proportion/counts of plasma cells and stromal +cells can be observed for Patient 2 and Patient 4, respectively.
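The difference between the count and percentage representations can be sketched in base R. The data frame below is made up for illustration (patient labels `P1` and `P2` are hypothetical) and the code is independent of dittoBarPlot.

```r
# Hypothetical per-cell annotations (not taken from the dataset)
cells <- data.frame(
    patient_id = c("P1", "P1", "P1", "P2", "P2", "P2", "P2", "P2"),
    celltype   = c("Tumor", "Tumor", "Bcell",
                   "Tumor", "Bcell", "Bcell", "Tcell", "Tcell")
)

# Counts of each cell type (rows) per patient (columns)
counts <- table(cells$celltype, cells$patient_id)

# Fractions within each patient: every column sums to 1
fractions <- prop.table(counts, margin = 2)
```

Counts retain information about how many cells each patient contributes, while fractions make patients with different total cell numbers comparable.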

+
+
+

10.2.6 CATALYST-based visualization

+

In the following, we highlight some useful visualization +functions from the +CATALYST +package.

+

To this end, we will first convert the SpatialExperiment object into a +CATALYST-compatible format.

+
library(CATALYST)
+
+# Save SPE in CATALYST-compatible object with renamed colData entries and 
+# new metadata information
+spe_cat <- spe 
+
+spe_cat$sample_id <- factor(spe$sample_id)
+spe_cat$condition <- factor(spe$indication)
+spe_cat$cluster_id <- factor(spe$celltype)
+
+# Add celltype information to metadata
+metadata(spe_cat)$cluster_codes <- data.frame(celltype = factor(spe_cat$celltype))
+

All of the CATALYST functions presented below return ggplot objects, +which allow flexible downstream adjustment.

+
+

10.2.6.1 Pseudobulk-level MDS plot

+

Pseudobulk-level multi-dimensional scaling (MDS) plots can be rendered +with the exported pbMDS function.

+

Here, we will use pbMDS to highlight expression similarities between +cell types and subsequently for each celltype-sample-combination.

+
# MDS pseudobulk by cell type
+pbMDS(spe_cat, 
+      by = "cluster_id", 
+      features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], 
+      label_by = "cluster_id", 
+      k = "celltype") +
+  scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype)
+

+
# MDS pseudobulk by cell type and sample_id
+pbMDS(spe_cat, 
+      by = "both", 
+      features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"], 
+      k = "celltype", 
+      shape_by = "condition", 
+      size_by = TRUE) +
+  scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype)
+

+

We can see that the pseudobulk-expression profile of neutrophils seems +markedly distinct from the other cell types, while comparable cell types +such as the T cell subtypes group together. Furthermore, pseudobulk +cell-type profiles from SCCHN appear different from the other +indications.

+
+
+

10.2.6.2 Reduced dimension plot on CLR of proportions

+

The clrDR function produces dimensionality reduction plots on centered +log-ratios (CLR) of sample/cell type proportions across cell +type/samples.

+

As with pbMDS, the output plots aim to illustrate the degree of +similarity between cell types based on sample proportions.
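As a rough illustration of the centered log-ratio idea, the following base-R sketch computes CLR values from a made-up table of cell counts and runs a PCA on them. The counts are hypothetical, and details such as the handling of zero proportions are assumptions that may differ from what clrDR does internally.

```r
# Hypothetical cell counts: cell types in rows, samples in columns
counts <- matrix(c(50, 10, 40,
                   20, 30, 50,
                    5, 60, 35),
                 nrow = 3,
                 dimnames = list(c("Tumor", "Bcell", "Tcell"),
                                 c("S1", "S2", "S3")))

# Cell type proportions within each sample (columns sum to 1)
props <- prop.table(counts, margin = 2)

# Centered log-ratio per sample: log proportion minus the mean log proportion
# (a zero proportion would need a pseudocount before taking the log)
clr <- apply(props, 2, function(p) log(p) - mean(log(p)))

# PCA on the CLR values, treating samples as observations
pca <- prcomp(t(clr))
```

Because the CLR values of each sample sum to zero, the transformation removes the constraint that proportions add up to one before distances between samples or cell types are computed.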

+
# CLR on cluster proportions across samples
+clrDR(spe_cat, 
+      dr = "PCA", 
+      by = "cluster_id", 
+      k = "celltype", 
+      label_by = "cluster_id", 
+      arrow_col = "sample_id", 
+      point_pal = metadata(spe_cat)$color_vectors$celltype) 
+

+

We can again observe that neutrophils have a divergent profile also in +terms of their sample proportions.

+
+
+

10.2.6.3 Pseudobulk expression boxplot

+

The plotPbExprs function generates combined box- and jitter-plots of aggregated marker +expression per cell type and sample (image). Here, we further split the data by +cancer type.

+
plotPbExprs(spe_cat, 
+            k = "celltype", 
+            facet_by = "cluster_id", 
+            ncol = 2, 
+            features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) +
+    scale_color_manual(values = metadata(spe_cat)$color_vectors$indication)
+

+

Notably, CD15 levels are elevated in SCCHN in comparison to all other +indications for most cell types.

+
+
+
+
+

10.3 Sample-level

+

In the next section, we will shift the grouping-level focus from the +cell type to the sample-level. Sample-levels will be further divided +into the sample- (image) and patient-level.

+

Although we will mostly repeat the functions from the previous section +10.2, sample- and patient-level centered visualization +can provide additional quality control and biological interpretation.

+
+

10.3.1 Dimensionality reduction visualization

+

Visualization of low-dimensional embeddings, here comparing non-corrected and +fastMNN-corrected UMAPs, and coloring them by sample-levels is often used +for “batch effect” assessment as mentioned in Section +7.4.

+

We will again use dittoDimPlot.

+
## UMAP colored by sample ID and patient ID - dittoDimPlot
+p1 <- dittoDimPlot(spe, 
+                   var = "sample_id",
+                   reduction.use = "UMAP", 
+                   size = 0.2, 
+                   colors = viridis(100), 
+                   do.label = FALSE) +
+    scale_color_manual(values = metadata(spe)$color_vectors$sample_id) +
+  theme(legend.title = element_blank()) +
+  ggtitle("Sample ID")
+
+p2 <- dittoDimPlot(spe, 
+                   var = "sample_id",
+                   reduction.use = "UMAP_mnnCorrected", 
+                   size = 0.2, 
+                   colors = viridis(100), 
+                   do.label = FALSE) +
+    scale_color_manual(values = metadata(spe)$color_vectors$sample_id) +
+  theme(legend.title = element_blank()) +
+  ggtitle("Sample ID")
+
+p3 <- dittoDimPlot(spe, 
+                   var = "patient_id",
+                   reduction.use = "UMAP", 
+                   size = 0.2,
+                   do.label = FALSE) +
+  scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+  theme(legend.title = element_blank()) +
+  ggtitle("Patient ID")
+
+p4 <- dittoDimPlot(spe, 
+                   var = "patient_id", 
+                   reduction.use = "UMAP_mnnCorrected", 
+                   size = 0.2,
+                   do.label = FALSE) +
+  scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
+  theme(legend.title = element_blank()) +
+  ggtitle("Patient ID")
+
+(p1 + p2) / (p3 + p4)
+

+

As illustrated in Section 8, we see that the fastMNN +approach (right side of the plot) leads to mixing of cells across +samples/patients and thus batch effect correction.

+
+
+

10.3.2 Heatmap visualization

+

It can be beneficial to use a heatmap to visualize single-cell +expression per sample and patient. Such a plot, which we will create +using dittoHeatmap, can highlight biological differences across +samples/patients.

+
# Heatmap visualization - DittoHeatmap
+dittoHeatmap(spe[,cur_cells], 
+             genes = rownames(spe)[rowData(spe)$marker_class == "type"],
+             assay = "exprs", 
+             order.by = c("patient_id","sample_id"),
+             cluster_cols = FALSE, 
+             scale = "none",
+             heatmap.colors = viridis(100), 
+             annot.by = c("celltype", "indication", "patient_id", "sample_id"),
+             annotation_colors = list(celltype = metadata(spe)$color_vectors$celltype,
+                                      indication = metadata(spe)$color_vectors$indication,
+                                      patient_id = metadata(spe)$color_vectors$patient_id,
+                                      sample_id = metadata(spe)$color_vectors$sample_id))
+

+

As in Section 7.3, aggregated mean marker expression +per sample/patient allows identification of samples/patients with +outlying expression patterns.

+

Here, we will focus on the patient level and use aggregateAcrossCells +and dittoHeatmap. The heatmap will be annotated with the number of +cells per patient and cancer type and displayed using two scaling +options.

+
# mean expression by patient_id
+patient_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"),  
+                     ids = spe$patient_id, 
+                     statistics = "mean",
+                     use.assay.type = "exprs", 
+                     subset.row = rownames(spe)[rowData(spe)$marker_class == "type"])
+
+# No scaling
+dittoHeatmap(patient_mean,
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             scale = "none",
+             heatmap.colors = viridis(100),
+             annot.by = c("patient_id","indication","ncells"),
+             annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id,
+                                      indication = metadata(spe)$color_vectors$indication,
+                                      ncells = plasma(100)))
+

+
# Max expression scaling
+dittoHeatmap(patient_mean,
+             assay = "exprs", 
+             cluster_cols = TRUE, 
+             scaled.to.max =  TRUE,
+             heatmap.colors.max.scaled = inferno(100),
+             annot.by = c("patient_id","indication","ncells"),
+             annotation_colors = list(patient_id = metadata(spe)$color_vectors$patient_id,
+                                      indication = metadata(spe)$color_vectors$indication,
+                                      ncells = plasma(100)))
+

+

As seen before, CD15 levels are elevated in the SCCHN patient, while SMA +levels are highest for Patient 4 (CRC).

+
+
+

10.3.3 Barplot visualization

+

Complementary to displaying cell type frequencies per sample/patient, we +can use dittoBarPlot to display sample/patient frequencies per cell +type.

+
dittoBarPlot(spe, 
+             var = "patient_id", 
+             group.by = "celltype") +
+    scale_fill_manual(values = metadata(spe)$color_vectors$patient_id)
+

+
dittoBarPlot(spe, 
+             var = "sample_id", 
+             group.by = "celltype") +
+    scale_fill_manual(values = metadata(spe)$color_vectors$sample_id)
+

+

Patient 2 has the highest proportion of plasma cells and the lowest +proportion of neutrophils.

+
+
+

10.3.4 CATALYST-based visualization

+
+

10.3.4.1 Pseudobulk-level MDS plot

+

Expression-based pseudobulks for each sample can be compared with the +pbMDS function.

+
# MDS pseudobulk by sample_id 
+pbMDS(spe_cat, 
+      by = "sample_id", 
+      color_by = "sample_id", 
+      features = rownames(spe_cat)[rowData(spe_cat)$marker_class == "type"]) +
+  scale_color_manual(values = metadata(spe_cat)$color_vectors$sample_id)
+

+

There are marked differences in pseudobulk-expression patterns between +samples and across patients, which can be driven by biological +differences and also technical aspects such as divergent region +selection.

+
+
+

10.3.4.2 Reduced dimension plot on CLR of proportions

+

The clrDR function can also be used to analyze similarity of samples +based on cell type proportions.

+
# CLR on sample proportions across clusters
+clrDR(spe_cat, 
+      dr = "PCA", 
+      by = "sample_id", 
+      point_col = "sample_id",
+      k = "celltype", 
+      point_pal = metadata(spe_cat)$color_vectors$sample_id) +
+  scale_color_manual(values = metadata(spe_cat)$color_vectors$celltype)
+
## Scale for colour is already present.
+## Adding another scale for colour, which will replace the existing scale.
+

+

There are notable differences between samples based on their cell type +proportions.

+

Interestingly, Patient3_001, Patient1_003, Patient4_007 and +Patient4_006 group together and the PC loadings indicate a strong +contribution of BnT and B cells, which could suggest formation of +tertiary lymphoid structures (TLS). In section 12.2, we +will be able to confirm this hypothesis visually on the images.

+
+
+
+
+

10.4 Further examples

+

In the last section of this chapter, we will use the popular +ComplexHeatmap +package to create a visualization example that combines various +cell-type- and sample-level information.

+

ComplexHeatmap +is highly versatile and was originally inspired by the +pheatmap +package. Therefore, many arguments have the same/similar names.

+

For more details, we recommend reading the reference +book.

+
+

10.4.1 Publication-ready ComplexHeatmap

+

For this example, we will concatenate heatmaps and annotations +horizontally into one rich heatmap list. The grouping-level for the +visualization will again be the cell type information from Section +9.3.

+

Initially, we will create two separate Heatmap objects for cell type +and state markers.

+

Then, metadata information, including the cancer type proportion and +number of cells/patients per cell type, will be extracted into +HeatmapAnnotation objects.

+

Notably, we will add spatial features per cell type, here the number of +neighbors extracted from colPair(spe) and cell area, in another +HeatmapAnnotation object.

+

Ultimately, all objects are combined in a HeatmapList and visualized.

+
library(ComplexHeatmap)
+library(circlize)
+library(tidyverse)
+set.seed(22)
+
+### 1. Heatmap bodies ###
+
+# Heatmap body color 
+col_exprs <- colorRamp2(c(0,1,2,3,4), 
+                        c("#440154FF","#3B518BFF","#20938CFF",
+                          "#6ACD5AFF","#FDE725FF"))
+
+# Create Heatmap objects
+# By cell type markers
+celltype_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"),  
+                     ids = spe$celltype, 
+                     statistics = "mean",
+                     use.assay.type = "exprs", 
+                     subset.row = rownames(spe)[rowData(spe)$marker_class == "type"])
+
+h_type <- Heatmap(t(assay(celltype_mean, "exprs")),
+        column_title = "type_markers",
+        col = col_exprs,
+        name= "mean exprs",
+        show_row_names = TRUE, 
+        show_column_names = TRUE)
+    
+# By cell state markers
+cellstate_mean <- aggregateAcrossCells(as(spe, "SingleCellExperiment"),  
+                     ids = spe$celltype, 
+                     statistics = "mean",
+                     use.assay.type = "exprs", 
+                     subset.row = rownames(spe)[rowData(spe)$marker_class == "state"])
+
+h_state <- Heatmap(t(assay(cellstate_mean, "exprs")),
+        column_title = "state_markers",
+        col = col_exprs,
+        name= "mean exprs",
+        show_row_names = TRUE,
+        show_column_names = TRUE)
+
+
+### 2. Heatmap annotation ###
+
+### 2.1  Metadata features
+
+anno <- colData(celltype_mean) %>% as.data.frame %>% select(celltype, ncells)
+
+# Proportion of indication per celltype
+indication <- unclass(prop.table(table(spe$celltype, spe$indication), margin = 1))
+
+# Number of contributing patients per celltype
+cluster_PID <- colData(spe) %>% 
+    as.data.frame() %>% 
+    select(celltype, patient_id) %>% 
+    group_by(celltype) %>% table() %>% 
+    as.data.frame()
+
+n_PID <- cluster_PID %>% 
+    filter(Freq>0) %>% 
+    group_by(celltype) %>% 
+    count(name = "n_PID") %>% 
+    column_to_rownames("celltype")
+
+# Create HeatmapAnnotation objects
+ha_anno <- HeatmapAnnotation(celltype = anno$celltype,
+                            border = TRUE, 
+                            gap = unit(1,"mm"),
+                            col = list(celltype = metadata(spe)$color_vectors$celltype),
+                            which = "row")
+    
+ha_meta <- HeatmapAnnotation(n_cells = anno_barplot(anno$ncells, width = unit(10, "mm")),
+                            n_PID = anno_barplot(n_PID, width = unit(10, "mm")),
+                            indication = anno_barplot(indication,width = unit(10, "mm"),
+                                                      gp = gpar(fill = metadata(spe)$color_vectors$indication)),
+                            border = TRUE, 
+                            annotation_name_rot = 90,
+                            gap = unit(1,"mm"),
+                            which = "row")
+
+### 2.2 Spatial features
+
+# Add number of neighbors to spe object (saved in colPair)
+spe$n_neighbors <- countLnodeHits(colPair(spe, "neighborhood"))
+
+# Select spatial features and average over celltypes
+spatial <- colData(spe) %>% 
+    as.data.frame() %>% 
+    select(area, celltype, n_neighbors)
+
+spatial <- spatial %>% 
+    select(-celltype) %>% 
+    aggregate(by = list(celltype = spatial$celltype), FUN = mean) %>% 
+    column_to_rownames("celltype")
+
+# Create HeatmapAnnotation object
+ha_spatial <- HeatmapAnnotation(
+    area = spatial$area,
+    n_neighbors = spatial$n_neighbors,
+    border = TRUE,
+    gap = unit(1,"mm"),
+    which = "row")
+
+### 3. Plot rich heatmap ###
+
+# Create HeatmapList object
+h_list <- h_type +
+    h_state +
+    ha_anno +
+    ha_spatial +
+    ha_meta
+
+# Add customized legend for anno_barplot()
+lgd <- Legend(title = "indication", 
+              at = colnames(indication), 
+              legend_gp = gpar(fill = metadata(spe)$color_vectors$indication))
+             
+# Plot
+draw(h_list,annotation_legend_list = list(lgd))
+

+

This plot summarizes most of the information we have seen in this +chapter previously. In addition, we can observe that tumor cells have +the largest mean cell area, a high number of neighbors and elevated Ki67 +expression. BnT cells have the highest number of neighbors on average, +which is biologically sound given their predominant location in highly +immune infiltrated regions (such as TLS).

+
+
+

10.4.2 Interactive visualization

+

For interactive visualization of the single-cell data, the +iSEE shiny +application can be used. For a comprehensive tutorial, please refer to the +iSEE vignette.

+
if (interactive()) {
+    library(iSEE)
+
+    iSEE(spe)   
+}
+
+
+
+

10.5 Session Info

+
+ +SessionInfo + +
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
+## [8] methods   base     
+## 
+## other attached packages:
+##  [1] lubridate_1.9.3             forcats_1.0.0              
+##  [3] stringr_1.5.0               dplyr_1.1.3                
+##  [5] purrr_1.0.2                 readr_2.1.4                
+##  [7] tidyr_1.3.0                 tibble_3.2.1               
+##  [9] tidyverse_2.0.0             circlize_0.4.15            
+## [11] ComplexHeatmap_2.16.0       CATALYST_1.24.0            
+## [13] viridis_0.6.4               viridisLite_0.4.2          
+## [15] cowplot_1.1.1               patchwork_1.1.3            
+## [17] scater_1.28.0               scuttle_1.10.2             
+## [19] dittoSeq_1.12.1             ggplot2_3.4.3              
+## [21] SpatialExperiment_1.10.0    SingleCellExperiment_1.22.0
+## [23] SummarizedExperiment_1.30.2 Biobase_2.60.0             
+## [25] GenomicRanges_1.52.0        GenomeInfoDb_1.36.3        
+## [27] IRanges_2.34.1              S4Vectors_0.38.2           
+## [29] BiocGenerics_0.46.0         MatrixGenerics_1.12.3      
+## [31] matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] splines_4.3.1               bitops_1.0-7               
+##   [3] R.oo_1.25.0                 polyclip_1.10-6            
+##   [5] XML_3.99-0.14               lifecycle_1.0.3            
+##   [7] rstatix_0.7.2               edgeR_3.42.4               
+##   [9] doParallel_1.0.17           lattice_0.21-8             
+##  [11] MASS_7.3-60                 backports_1.4.1            
+##  [13] magrittr_2.0.3              limma_3.56.2               
+##  [15] sass_0.4.7                  rmarkdown_2.25             
+##  [17] jquerylib_0.1.4             yaml_2.3.7                 
+##  [19] plotrix_3.8-2               RColorBrewer_1.1-3         
+##  [21] ConsensusClusterPlus_1.64.0 multcomp_1.4-25            
+##  [23] abind_1.4-5                 zlibbioc_1.46.0            
+##  [25] Rtsne_0.16                  R.utils_2.12.2             
+##  [27] RCurl_1.98-1.12             TH.data_1.1-2              
+##  [29] tweenr_2.0.2                sandwich_3.0-2             
+##  [31] GenomeInfoDbData_1.2.10     ggrepel_0.9.3              
+##  [33] irlba_2.3.5.1               pheatmap_1.0.12            
+##  [35] dqrng_0.3.1                 DelayedMatrixStats_1.22.6  
+##  [37] codetools_0.2-19            DropletUtils_1.20.0        
+##  [39] DelayedArray_0.26.7         ggforce_0.4.1              
+##  [41] tidyselect_1.2.0            shape_1.4.6                
+##  [43] farver_2.1.1                ScaledMatrix_1.8.1         
+##  [45] jsonlite_1.8.7              GetoptLong_1.0.5           
+##  [47] BiocNeighbors_1.18.0        ggridges_0.5.4             
+##  [49] survival_3.5-5              iterators_1.0.14           
+##  [51] foreach_1.5.2               tools_4.3.1                
+##  [53] ggnewscale_0.4.9            Rcpp_1.0.11                
+##  [55] glue_1.6.2                  gridExtra_2.3              
+##  [57] xfun_0.40                   HDF5Array_1.28.1           
+##  [59] withr_2.5.1                 fastmap_1.1.1              
+##  [61] rhdf5filters_1.12.1         fansi_1.0.4                
+##  [63] digest_0.6.33               rsvd_1.0.5                 
+##  [65] timechange_0.2.0            R6_2.5.1                   
+##  [67] colorspace_2.1-0            Cairo_1.6-1                
+##  [69] gtools_3.9.4                R.methodsS3_1.8.2          
+##  [71] utf8_1.2.3                  generics_0.1.3             
+##  [73] data.table_1.14.8           S4Arrays_1.0.6             
+##  [75] pkgconfig_2.0.3             gtable_0.3.4               
+##  [77] RProtoBufLib_2.12.1         XVector_0.40.0             
+##  [79] htmltools_0.5.6             carData_3.0-5              
+##  [81] bookdown_0.35               clue_0.3-65                
+##  [83] scales_1.2.1                png_0.1-8                  
+##  [85] colorRamps_2.3.1            knitr_1.44                 
+##  [87] rstudioapi_0.15.0           tzdb_0.4.0                 
+##  [89] reshape2_1.4.4              rjson_0.2.21               
+##  [91] cachem_1.0.8                zoo_1.8-12                 
+##  [93] rhdf5_2.44.0                GlobalOptions_0.1.2        
+##  [95] parallel_4.3.1              vipor_0.4.5                
+##  [97] pillar_1.9.0                vctrs_0.6.3                
+##  [99] ggpubr_0.6.0                car_3.1-2                  
+## [101] BiocSingular_1.16.0         cytolib_2.12.1             
+## [103] beachmat_2.16.0             cluster_2.1.4              
+## [105] beeswarm_0.4.0              evaluate_0.21              
+## [107] magick_2.8.0                mvtnorm_1.2-3              
+## [109] cli_3.6.1                   locfit_1.5-9.8             
+## [111] compiler_4.3.1              rlang_1.1.1                
+## [113] crayon_1.5.2                ggsignif_0.6.4             
+## [115] labeling_0.4.3              FlowSOM_2.8.0              
+## [117] flowCore_2.12.2             plyr_1.8.8                 
+## [119] ggbeeswarm_0.7.2            stringi_1.7.12             
+## [121] BiocParallel_1.34.2         nnls_1.5                   
+## [123] munsell_0.5.0               Matrix_1.6-1.1             
+## [125] hms_1.1.3                   sparseMatrixStats_1.12.2   
+## [127] Rhdf5lib_1.22.1             drc_3.0-1                  
+## [129] igraph_1.5.1                broom_1.0.5                
+## [131] bslib_0.5.1
+
+ +
+
+

References

+
+
+Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics 32: 2847–49. +
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/spillover-correction.html b/spillover-correction.html new file mode 100644 index 00000000..19a3bf98 --- /dev/null +++ b/spillover-correction.html @@ -0,0 +1,938 @@ + + + + + + + 6 Spillover correction | Analysis workflow for IMC data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+ +
+
+ + +
+
+ +
+
+

6 Spillover correction

+

Original scripts: Vito Zanotelli, adapted/maintained by: Nils Eling

+

This section highlights how to generate a spillover matrix from individually +acquired single metal spots on an agarose slide. Each spot needs to be imaged as +its own acquisition/ROI and individual TXT files containing the pixel +intensities per spot need to be available. For complete details on the spillover +correction approach, please refer to the original +publication (Chevrier et al. 2017).

+

Spillover slide preparation:

+
    +
  • Prepare 2% agarose in double distilled H\(_2\)O in a beaker and melt it in a microwave until well dissolved.
  • Dip a blank superfrost plus glass microscope slide into the agarose and submerge it until the label.
  • Remove the slide and prop it up against a support to allow the excess agarose to run off onto paper towels.
  • Allow the slide to dry completely (at least 30 minutes).
  • Retrieve all the antibody conjugates used in the panel for which the spillover matrix is to be generated and place them on ice.
  • Arrange them in a known order (e.g., mass of the conjugated metal).
  • Pipette 0.3 µl spots of 0.4% trypan blue dye into an array on the slide. Prepare one spot per antibody, and make sure the spots are well separated.
  • Pipette 0.3 µl of each antibody conjugate (usually at 0.5 mg/ml) onto a unique blue spot, taking care to avoid different antibodies bleeding into each other. Note the exact location of each conjugate on the slide.
  • Let the spots dry completely, at least 1 hour.
+

Spillover slide acquisition:

+
    +
  • Create a JPEG or PNG image of the slide using a mobile phone camera or flat-bed scanner.
  • In the CyTOF software, create a new file and import the slide image into it.
  • Create a panorama across all the spots to visualize their locations.
  • Within each spot, create a region of interest (ROI) with a width of 200 pixels and a height of 10 pixels.
  • Name each ROI with the mass and name of the metal conjugate contained in the spot, e.g., “Ir193” or “Ho165”. This will be how each TXT file is named.
  • Set the profiling type of each ROI to “Local”.
  • Apply the antibody panel to all the ROIs. This panel should contain all (or more) of the isotopes in the panel, with the correct metal specified. For example: if the metal used is Barium 138, make sure this, rather than Lanthanum 138, is selected.
  • Save the file, make sure “Generate Text File” is selected, and start the acquisition.
+

This procedure will generate an MCD file similar to the one available on Zenodo: +10.5281/zenodo.5949115

+

The original code of the spillover correction manuscript is available on GitHub +here; however, due to +changes in the +CATALYST +package, users were not able to reproduce the analysis using the newest software +versions. The following workflow uses the newest package versions to generate a +spillover matrix and perform spillover correction.

+

In brief, the highlighted workflow comprises 9 steps:

+
    +
  1. Reading in the data
  2. Quality control
  3. (Optional) pixel binning
  4. “Debarcoding” for pixel assignment
  5. Pixel selection for spillover matrix estimation
  6. Spillover matrix generation
  7. Saving the results
  8. Single-cell compensation
  9. Image compensation
+
+

6.1 Generate the spillover matrix

+

In the first step, we will generate a spillover matrix based on the single-metal +spots and save it for later use.

+
+

6.1.1 Read in the data

+

Here, we will read in the individual TXT files into a SingleCellExperiment +object. This object can be used directly by the CATALYST package to estimate +the spillover.

+

For this to work, the TXT file names need to contain the spotted metal isotope +name. By default, the first occurrence of the isotope in the format (mt)(mass) +(e.g., Sm152 for the samarium isotope with atomic mass 152) will be used as +spot identifier. Alternatively, a named list of already read-in pixel intensities +can be provided. For more information, please refer to the man page ?readSCEfromTXT.

+

For further downstream analysis, we will asinh-transform the data using a +cofactor of 5; a common transformation for CyTOF data (Bendall et al. 2011). +As the pixel intensities are larger than the cell intensities, the cofactor +here is larger than the cofactor when transforming the mean cell intensities.
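The effect of the asinh transformation with a cofactor can be seen on a handful of made-up count values. The vector below is hypothetical and only serves to show the shape of the transformation; the cofactor of 5 is the one stated in the text.

```r
# Hypothetical pixel counts spanning several orders of magnitude
counts <- c(0, 1, 5, 50, 500, 5000)
cofactor <- 5

# asinh transformation: approximately linear near zero,
# approximately logarithmic for large values
exprs <- asinh(counts / cofactor)

# The inverse transformation recovers the original counts
recovered <- sinh(exprs) * cofactor
```

Unlike a plain logarithm, asinh is defined at zero, so pixels without signal do not need a pseudocount. A larger cofactor widens the range of counts that is treated approximately linearly.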

+
library(imcRtools)
+
+# Create SingleCellExperiment from TXT files
+sce <- readSCEfromTXT("data/compensation/") 
+
## Spotted channels:  Y89, In113, In115, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176
+## Acquired channels:  Ar80, Y89, In113, In115, Xe131, Xe134, Ba136, La138, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176, Ir191, Ir193, Pt196, Pb206
+## Channels spotted but not acquired:  
+## Channels acquired but not spotted:  Ar80, Xe131, Xe134, Ba136, La138, Ir191, Ir193, Pt196, Pb206
+
assay(sce, "exprs") <- asinh(counts(sce)/5)
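To make the effect of the cofactor more concrete, this small base R sketch applies the same transformation to a handful of made-up count values, once with a cofactor of 5 and once with a cofactor of 1. The counts and the helper name asinh_transform are illustrative assumptions only.

```r
# Made-up raw counts spanning several orders of magnitude
raw_counts <- c(0, 1, 5, 50, 500, 5000)

# Hypothetical helper: asinh transformation with a cofactor
asinh_transform <- function(x, cofactor) asinh(x / cofactor)

exprs_cof5 <- asinh_transform(raw_counts, cofactor = 5)
exprs_cof1 <- asinh_transform(raw_counts, cofactor = 1)

# Side-by-side comparison of both cofactors
round(cbind(raw_counts, exprs_cof5, exprs_cof1), 2)

# For values far above the cofactor, asinh(x) behaves like log(2 * x)
large_value_check <- all.equal(asinh(5000 / 5), log(2 * 5000 / 5),
                               tolerance = 1e-6)
```

Values well below the cofactor are compressed almost linearly, whereas values far above it are compressed logarithmically, so a larger cofactor keeps more of the low-intensity range near zero.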
+
+
+

6.1.2 Quality control

+

In the next step, we will inspect the median pixel intensities per spot and threshold on medians < 200 counts. These types of visualization serve two purposes:

+
  1. Small median pixel intensities (< 200 counts) might hinder the robust estimation of the channel spillover. In that case, consecutive pixels can be summed (see Optional pixel binning).

  2. Each spotted metal (row) should show the highest median pixel intensity in its corresponding channel (column). If this is not the case, either the naming of the TXT files was incorrect or the incorrect metal was spotted.
+
# Log10 median pixel counts per spot and channel
+plotSpotHeatmap(sce)
+

+
# Thresholded on 200 pixel counts
+plotSpotHeatmap(sce, log = FALSE, threshold = 200)
+

+

As we can see, nearly all median pixel intensities are > 200 counts for each spot. We also observe acquired channels for which no spot was placed (e.g., Xe134, Ir191, Ir193).
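The two checks behind these heatmaps can also be expressed numerically. The sketch below uses simulated toy data (three spots, three channels, Poisson noise) and plain base R; it is an assumption-laden illustration of the idea and does not reproduce the internals of plotSpotHeatmap.

```r
set.seed(1)
channels <- c("Nd148", "Sm152", "Dy164")
spot <- rep(channels, each = 50)   # spotted metal of each pixel

# Background counts in all channels (toy Poisson noise)
counts <- matrix(rpois(length(spot) * length(channels), lambda = 5),
                 ncol = length(channels),
                 dimnames = list(NULL, channels))

# Strong signal in the channel that matches the spotted metal
for (ch in channels) {
  idx <- spot == ch
  counts[idx, ch] <- counts[idx, ch] + rpois(sum(idx), lambda = 500)
}

# Median count per spot (rows) and channel (columns)
med <- t(sapply(split(as.data.frame(counts), spot),
                function(d) vapply(d, median, numeric(1))))
med <- med[channels, channels]

# Check 1: spots whose own channel has a median below 200 counts
low_spots <- rownames(med)[diag(med) < 200]

# Check 2: the brightest channel of each spot should be its own channel
brightest <- colnames(med)[apply(med, 1, which.max)]
mismatched <- rownames(med)[brightest != rownames(med)]
```

Non-empty low_spots would suggest pixel binning, while non-empty mismatched would point to a file naming or spotting error.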

+
+
+

6.1.3 Optional pixel binning

+

In cases where median pixel intensities are low (< 200 counts), consecutive pixels can be summed to increase the robustness of the spillover estimation. The imcRtools package provides the binAcrossPixels function, which performs aggregation for each channel across bin_size consecutive pixels per spotted metal.

+
# Define grouping
+bin_size = 10
+
+sce2 <- binAcrossPixels(sce, bin_size = bin_size)
+
+# Log10 median pixel counts per spot and channel
+plotSpotHeatmap(sce2)
+

+
# Thresholded on 200 pixel counts
+plotSpotHeatmap(sce2, log = FALSE, threshold = 200)
+

+

Here, we can see an increase in the median pixel intensities and an accumulation of off-diagonal signal. Because the original pixel intensities are already high, we will refrain from aggregating across consecutive pixels for this demonstration.
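The binning idea itself can be sketched in a few lines of base R: pixels are grouped into runs of bin_size consecutive pixels within each spot, and the counts of every channel are summed per run. The toy matrix below is an assumption for illustration, and the code is not the binAcrossPixels implementation.

```r
bin_size <- 10

# Toy data: two spots with 20 pixels each, one count per pixel and channel
spot <- rep(c("Nd148", "Sm152"), each = 20)
counts <- matrix(1, nrow = length(spot), ncol = 2,
                 dimnames = list(NULL, c("Nd148", "Sm152")))

# Running bin index within each spot: pixels 1-10 -> bin 1, 11-20 -> bin 2
bin <- ave(seq_along(spot), spot,
           FUN = function(i) ceiling(seq_along(i) / bin_size))

# Sum all channels within each spot/bin combination
binned <- rowsum(counts, group = paste(spot, bin, sep = "_"))
binned
```

Summing raises the signal in the spotted channel and in every receiving channel alike, which is consistent with the stronger off-diagonal signal seen after binning.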

+
+
+

6.1.4 Filtering incorrectly assigned pixels

+

The following step uses functions provided by the CATALYST package to “debarcode” the pixels. Based on the intensity distribution of all channels, pixels are assigned to their corresponding barcode; here this is the already known metal spot. The purpose of this procedure is to identify pixels that cannot be robustly assigned to the spotted metal. Such pixels can be regarded as “noisy”, “background” or “artefacts” and should be removed prior to spillover estimation.

+

We will also need to specify which channels were spotted (argument bc_key). This information is directly contained in the colData(sce) slot. To facilitate visualization, we will order the bc_key by mass.

+

The general workflow for pixel debarcoding is as follows:

+
  1. assign a preliminary metal mass to each pixel
  2. for each spotted metal (barcode), estimate a cutoff parameter for the distance between positive and negative pixel sets
  3. apply the estimated cutoffs to identify truly positive pixels
+
library(CATALYST)
+
+bc_key <- as.numeric(unique(sce$sample_mass))
+bc_key <- bc_key[order(bc_key)]
+
+sce <- assignPrelim(sce, bc_key = bc_key)
+sce <- estCutoffs(sce)
+sce <- applyCutoffs(sce)
+

The obtained SingleCellExperiment now contains the additional bc_id entry. For each pixel, this vector indicates the assigned mass (e.g., 161) or 0, meaning unassigned.
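The logic of the three debarcoding steps can be illustrated with a heavily simplified base R sketch. It assigns each pixel to its brightest channel, measures how far that channel is separated from the second brightest one, and applies a fixed cutoff. The toy intensities, the separation measure and the cutoff of 0.5 are assumptions for illustration; CATALYST works on normalized intensities and estimates the cutoffs from the data.

```r
masses <- c(148, 152, 164)

# Toy pixel intensities: rows are pixels, columns are channels
x <- rbind(c(900, 20, 30),   # clearly positive for 148
           c(15, 700, 10),   # clearly positive for 152
           c(40, 35, 45))    # ambiguous background pixel
colnames(x) <- masses

# Step 1: preliminary assignment to the brightest channel
prelim <- masses[max.col(x, ties.method = "first")]

# Separation between the brightest and second brightest channel,
# relative to the brightest channel (simplified measure)
sorted <- t(apply(x, 1, sort, decreasing = TRUE))
delta <- (sorted[, 1] - sorted[, 2]) / sorted[, 1]

# Steps 2 and 3: a fixed, hypothetical cutoff replaces the estimated one
cutoff <- 0.5
bc_id <- ifelse(delta >= cutoff, prelim, 0)
bc_id
```

The ambiguous third pixel receives 0 because its two brightest channels are nearly equal, which mirrors the meaning of an unassigned bc_id.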

+

This information can be visualized in the form of a heatmap:

+
library(pheatmap)
+cur_table <- table(sce$bc_id, sce$sample_mass)
+
+# Visualize the correctly and incorrectly assigned pixels
+pheatmap(log10(cur_table + 1),
+         cluster_rows = FALSE,
+         cluster_cols = FALSE)
+

+
# Compute the fraction of unassigned pixels per spot
+cur_table["0",] / colSums(cur_table)
+
##    113    115    141    142    143    144    145    146    147    148    149 
+## 0.1985 0.1060 0.2575 0.3195 0.3190 0.3825 0.3545 0.4280 0.3570 0.4770 0.4200 
+##    150    151    152    153    154    155    156    158    159    160    161 
+## 0.4120 0.4025 0.4050 0.4630 0.4190 0.4610 0.3525 0.4020 0.4655 0.4250 0.5595 
+##    162    163    164    165    166    167    168    169    170    171    172 
+## 0.4340 0.4230 0.4390 0.4055 0.5210 0.3900 0.3285 0.3680 0.5015 0.4900 0.5650 
+##    173    174    175    176     89 
+## 0.3125 0.4605 0.4710 0.2845 0.3015
+

We can see here that all pixels that received an assignment were assigned to the right mass (the remaining pixels are unassigned) and that all pixel sets are made up of > 800 pixels.

+

However, in cases where incorrect assignment occurred or where few pixels were measured for some spots, the imcRtools package exports a simple helper function to exclude pixels based on these criteria:

+
sce <- filterPixels(sce, minevents = 40, correct_pixels = TRUE)
+

In the filterPixels function, the minevents parameter specifies the threshold under which correctly assigned pixel sets are excluded from spillover estimation. The correct_pixels parameter indicates whether pixels that were assigned to masses other than the spotted mass should be excluded from spillover estimation. The default values often result in sufficient pixel filtering; however, if very few pixels (~100) are measured per spot, the minevents parameter value needs to be lowered.
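The two filtering criteria can be sketched in base R as follows. The function filter_pixels_sketch is a hypothetical helper written for this illustration that follows the behaviour described above; it is not the imcRtools implementation and operates on plain vectors rather than on a SingleCellExperiment.

```r
# Hypothetical helper mirroring the two described criteria
filter_pixels_sketch <- function(bc_id, spot_mass,
                                 minevents = 40, correct_pixels = TRUE) {
  bc_id <- as.character(bc_id)
  spot_mass <- as.character(spot_mass)

  # Criterion 1: pixels assigned to a mass other than the spotted one
  if (correct_pixels) {
    bc_id[bc_id != spot_mass] <- "0"
  }

  # Criterion 2: pixel sets with fewer than minevents pixels
  n_per_set <- table(bc_id)
  rare <- names(n_per_set)[n_per_set < minevents]
  bc_id[bc_id %in% rare] <- "0"

  bc_id
}

# Toy input: 50 pixels from spot 148 and 10 pixels from spot 152
spot_mass <- rep(c("148", "152"), c(50, 10))
bc_id <- spot_mass
bc_id[1:5] <- "152"   # five pixels assigned to the wrong mass
bc_id[51] <- "0"      # one pixel left unassigned by debarcoding

filtered <- filter_pixels_sketch(bc_id, spot_mass)
table(filtered)
```

In this toy example the 152 set is dropped entirely because only nine pixels remain, which illustrates why minevents may need to be lowered when few pixels were acquired per spot.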

+
+
+

6.1.5 Compute spillover matrix

+

Based on the single-positive pixels, we use the CATALYST::computeSpillmat() function to compute the spillover matrix and CATALYST::plotSpillmat() to visualize it. The plotSpillmat function checks the spotted and acquired metal isotopes against the pre-defined CATALYST::isotope_list object. In this data, the Ar80 channel was additionally acquired to check for deviations in signal intensity. Ar80 needs to be added to a custom isotope_list object for visualization.

+
sce <- computeSpillmat(sce)
+
+isotope_list <- CATALYST::isotope_list
+isotope_list$Ar <- 80
+
+plotSpillmat(sce, isotope_list = isotope_list)
+
## Warning: The `guide` argument in `scale_*()` cannot be `FALSE`. This was deprecated in
+## ggplot2 3.3.4.
+## ℹ Please use "none" instead.
+## ℹ The deprecated feature was likely used in the CATALYST package.
+##   Please report the issue at <https://github.com/HelenaLC/CATALYST/issues>.
+## This warning is displayed once every 8 hours.
+## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
+## generated.
+

+
# Save spillover matrix in variable
+sm <- metadata(sce)$spillover_matrix
+

Of note: the visualization of the spillover matrix using CATALYST currently does not show spillover between the larger channels. In this case, the spillover matrix is clipped at Yb171.

+

As we can see, the largest spillover appears in In113 --> In115, and we also observe the +16 oxide impurities, e.g., Nd148 --> Dy164.
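To give an intuition for what the entries of such a matrix mean, the sketch below estimates spillover from toy single-positive pixels as the ratio of the median signal in a receiving channel to the median signal in the spotted channel. This median-ratio rule and all values are assumptions for illustration; computeSpillmat may use a different estimator and additional corrections.

```r
channels <- c("Yb173", "Yb174")
spot <- rep(channels, each = 5)   # spotted metal of each pixel

# Toy single-positive pixel counts (rows are pixels)
counts <- cbind(
  Yb173 = c(990, 1000, 1010, 1000, 1000,   0,   0,   1,   0,   0),
  Yb174 = c( 19,   20,   21,   20,   20, 800, 805, 795, 800, 800)
)

# Rows: emitting (spotted) channel, columns: receiving channel
sm_toy <- diag(length(channels))
dimnames(sm_toy) <- list(channels, channels)

for (i in channels) {
  own <- median(counts[spot == i, i])
  for (j in setdiff(channels, i)) {
    sm_toy[i, j] <- median(counts[spot == i, j]) / own
  }
}
sm_toy
```

Each row describes how much of the signal of one spotted metal is detected in every other channel, with the diagonal fixed at 1.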

+

We can save the spillover matrix for external use.

+
write.csv(sm, "data/sm.csv")
+
+
+
+

6.2 Single-cell data compensation

+

The CATALYST package can be used to perform spillover compensation on the single-cell mean intensities. Here, the SpatialExperiment object generated in Section 5 is read in. The CATALYST package requires an entry in rowData(spe)$channel_name for the compCytof function to run. This entry should contain the metal isotopes in the form (mt)(mass)Di (e.g., Sm152Di for the samarium isotope with atomic mass 152).

+

The compCytof function performs channel spillover compensation on the mean pixel intensities per channel and cell. Here, we will not overwrite the assays in the SpatialExperiment object so that we can later highlight the effect of compensation. As in Section 5, the compensated counts are also asinh-transformed using a cofactor of 1.

+
spe <- readRDS("data/spe.rds")
+rowData(spe)$channel_name <- paste0(rowData(spe)$channel, "Di")
+
+spe <- compCytof(spe, sm, 
+                 transform = TRUE, cofactor = 1,
+                 isotope_list = isotope_list, 
+                 overwrite = FALSE)
+
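Conceptually, compensation inverts the mixing described by the spillover matrix. The base R sketch below assumes the simple linear model observed = true %*% SM and recovers the true signal by plain matrix inversion. The toy values are made up, cells are stored in rows for readability, and the inversion is an assumption for illustration that is not necessarily what compCytof does internally.

```r
channels <- c("Yb173Di", "Yb174Di")

# Toy spillover matrix: 2% of the Yb173 signal is detected in Yb174
sm_toy <- matrix(c(1, 0.02,
                   0, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(channels, channels))

# Toy "true" mean intensities of two cells (rows) in two channels
true_signal <- matrix(c(1000,  50,
                           0, 300),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("cell1", "cell2"), channels))

# Spillover mixes the channels ...
observed <- true_signal %*% sm_toy

# ... and compensation undoes the mixing
compensated <- observed %*% solve(sm_toy)

observed
compensated
```

Plain inversion can yield negative values on noisy data, which is one reason why constrained approaches such as non-negative least squares are commonly used for compensation.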

To check the effect of channel spillover compensation, the expression of markers that are affected by spillover (e.g., E-cadherin in channel Yb173 and CD303 in channel Yb174) can be visualized in the form of scatter plots before and after compensation.

+
library(dittoSeq)
+library(patchwork)
+before <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303",
+                           assay.x = "exprs", assay.y = "exprs") +
+    ggtitle("Before compensation")
+
+after <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303",
+                          assay.x = "compexprs", assay.y = "compexprs") +
+    ggtitle("After compensation")
+before + after
+

+

We observe that the spillover Yb173 --> Yb174 was successfully corrected. To facilitate further downstream analysis, the non-compensated assays can now be replaced by their compensated counterparts:

+
assay(spe, "counts") <- assay(spe, "compcounts") 
+assay(spe, "exprs") <- assay(spe, "compexprs") 
+assay(spe, "compcounts") <- assay(spe, "compexprs") <- NULL
+
+
+

6.3 Image compensation

+

The cytomapper package allows channel spillover compensation directly on multi-channel images. The compImage function takes a CytoImageList object and the estimated spillover matrix as input. More information on how to work with CytoImageList objects can be found in Section 11.

+

At this point, we can read in the CytoImageList object containing multi-channel images as generated in Section 5. The channelNames need to be set according to their metal isotope in the form (mt)(mass)Di and therefore match colnames(sm).

+
library(cytomapper)
+
+images <- readRDS("data/images.rds")
+channelNames(images) <- rowData(spe)$channel_name
+

The CATALYST package provides the adaptSpillmat function that corrects the spillover matrix in a way that rows and columns match a predefined set of metals. Please refer to ?compCytof for more information on how metals in the spillover matrix are matched to acquired channels in the SingleCellExperiment object.

+

The spillover matrix can now be adapted to exclude channels that were not kept for downstream analysis.

+
adapted_sm <- adaptSpillmat(sm, channelNames(images), 
+                            isotope_list = isotope_list)
+
## Compensation is likely to be inaccurate.
+## Spill values for the following interactions
+## have not been estimated:
+
## Ir191Di -> Ir193Di
+
## Ir193Di -> Ir191Di
+
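The idea of adapting a spillover matrix to a set of output channels can be sketched in base R: start from an identity matrix over the requested channels and copy over the estimated values where both the emitting and the receiving channel are available. The helper adapt_sketch and the toy matrix are hypothetical; the sketch ignores the checks and warnings that adaptSpillmat performs.

```r
# Hypothetical helper: square matrix over out_chs, identity where unknown
adapt_sketch <- function(sm, out_chs) {
  out <- diag(length(out_chs))
  dimnames(out) <- list(out_chs, out_chs)
  rows <- intersect(rownames(sm), out_chs)
  cols <- intersect(colnames(sm), out_chs)
  out[rows, cols] <- sm[rows, cols, drop = FALSE]
  out
}

# Toy matrix: two spotted channels (rows), three acquired channels (columns)
sm_toy <- matrix(c(0, 1,     0.02,
                   0, 0.005, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Yb173Di", "Yb174Di"),
                                 c("Ar80Di", "Yb173Di", "Yb174Di")))

# Channels kept for downstream analysis; Ir191Di was never spotted
kept <- c("Yb173Di", "Yb174Di", "Ir191Di")
adapted_toy <- adapt_sketch(sm_toy, kept)
adapted_toy
```

Channels without an estimate, such as Ir191Di here, keep an identity row and column and are therefore left unchanged by compensation, which is what the message above warns about.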

The adapted spillover matrix now matches the channelNames of the CytoImageList object and can be used to perform pixel-level spillover compensation. Here, we parallelise the image compensation across all available cores minus 2. When working on Windows, you will need to use the SnowParam function instead of MulticoreParam.

+
library(BiocParallel)
+
+images_comp <- compImage(images, adapted_sm, 
+                         BPPARAM = MulticoreParam())
+

As a sanity check, we will visualize the image before and after compensation:

+
# Before compensation
+plotPixels(images[5], colour_by = "Yb173Di", 
+           image_title = list(text = "Yb173 (Ecad) - before", position = "topleft"), 
+           legend = NULL, bcg = list(Yb173Di = c(0, 4, 1)))
+

+
plotPixels(images[5], colour_by = "Yb174Di", 
+           image_title = list(text = "Yb174 (CD303) - before", position = "topleft"), 
+           legend = NULL, bcg = list(Yb174Di = c(0, 4, 1)))
+

+
# After compensation
+plotPixels(images_comp[5], colour_by = "Yb173Di",
+           image_title = list(text = "Yb173 (Ecad) - after", position = "topleft"), 
+           legend = NULL, bcg = list(Yb173Di = c(0, 4, 1)))
+

+
plotPixels(images_comp[5], colour_by = "Yb174Di", 
+           image_title = list(text = "Yb174 (CD303) - after", position = "topleft"),
+           legend = NULL, bcg = list(Yb174Di = c(0, 4, 1)))
+

+

For convenience, we will reset the channelNames to their biological targets:

+
channelNames(images_comp) <- rownames(spe)
+
+
+

6.4 Write out compensated images

+

In the final step, the compensated images are written out as 16-bit TIFF files:

+
library(tiff)
+dir.create("data/comp_img")
+lapply(names(images_comp), function(x){
+  writeImage(as.array(images_comp[[x]])/(2^16 - 1), 
+             paste0("data/comp_img/", x, ".tiff"),
+             bits.per.sample = 16)
+})
+
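The division by 2^16 - 1 rescales the pixel values so that a count of 65535 maps to 1. The base R sketch below illustrates this arithmetic on made-up values, assuming that the image writer expects intensities in the range [0, 1], clips values outside this range and rounds to the nearest 16-bit integer. These assumptions about the writer are for illustration only.

```r
max16 <- 2^16 - 1   # largest value of an unsigned 16-bit integer

# Made-up pixel values, including one above the 16-bit range
pixel_counts <- c(0, 1, 250.4, 70000)

# Scaling as applied before writing the image
scaled <- pixel_counts / max16

# Assumed behaviour of a 16-bit writer: clip to [0, 1], rescale, round
stored <- round(pmin(pmax(scaled, 0), 1) * max16)

data.frame(pixel_counts, scaled, stored)
```

Under these assumptions, fractional counts are rounded to whole numbers and values above 65535 are saturated.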
+
+

6.5 Save objects

+

For further downstream analysis, the compensated SpatialExperiment and CytoImageList objects are saved, replacing the former objects:

+
saveRDS(spe, "data/spe.rds")
+saveRDS(images_comp, "data/images.rds")
+
+
+

6.6 Session Info

+
SessionInfo
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.3 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## locale:
+##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+## 
+## time zone: Etc/UTC
+## tzcode source: system (glibc)
+## 
+## attached base packages:
+## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+##  [1] testthat_3.1.10             tiff_0.1-11                
+##  [3] BiocParallel_1.34.2         cytomapper_1.12.0          
+##  [5] EBImage_4.42.0              patchwork_1.1.3            
+##  [7] dittoSeq_1.12.1             ggplot2_3.4.3              
+##  [9] pheatmap_1.0.12             CATALYST_1.24.0            
+## [11] imcRtools_1.6.5             SpatialExperiment_1.10.0   
+## [13] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
+## [15] Biobase_2.60.0              GenomicRanges_1.52.0       
+## [17] GenomeInfoDb_1.36.3         IRanges_2.34.1             
+## [19] S4Vectors_0.38.2            BiocGenerics_0.46.0        
+## [21] MatrixGenerics_1.12.3       matrixStats_1.0.0          
+## 
+## loaded via a namespace (and not attached):
+##   [1] bitops_1.0-7                sf_1.0-14                  
+##   [3] RColorBrewer_1.1-3          doParallel_1.0.17          
+##   [5] tools_4.3.1                 backports_1.4.1            
+##   [7] utf8_1.2.3                  R6_2.5.1                   
+##   [9] DT_0.29                     HDF5Array_1.28.1           
+##  [11] rhdf5filters_1.12.1         GetoptLong_1.0.5           
+##  [13] withr_2.5.1                 sp_2.0-0                   
+##  [15] gridExtra_2.3               cli_3.6.1                  
+##  [17] archive_1.1.6               sandwich_3.0-2             
+##  [19] labeling_0.4.3              sass_0.4.7                 
+##  [21] nnls_1.5                    mvtnorm_1.2-3              
+##  [23] readr_2.1.4                 proxy_0.4-27               
+##  [25] ggridges_0.5.4              systemfonts_1.0.4          
+##  [27] colorRamps_2.3.1            svglite_2.1.1              
+##  [29] R.utils_2.12.2              scater_1.28.0              
+##  [31] plotrix_3.8-2               limma_3.56.2               
+##  [33] flowCore_2.12.2             rstudioapi_0.15.0          
+##  [35] generics_0.1.3              shape_1.4.6                
+##  [37] gtools_3.9.4                vroom_1.6.3                
+##  [39] car_3.1-2                   dplyr_1.1.3                
+##  [41] Matrix_1.6-1.1              RProtoBufLib_2.12.1        
+##  [43] ggbeeswarm_0.7.2            fansi_1.0.4                
+##  [45] abind_1.4-5                 R.methodsS3_1.8.2          
+##  [47] terra_1.7-46                lifecycle_1.0.3            
+##  [49] multcomp_1.4-25             yaml_2.3.7                 
+##  [51] edgeR_3.42.4                carData_3.0-5              
+##  [53] rhdf5_2.44.0                Rtsne_0.16                 
+##  [55] grid_4.3.1                  promises_1.2.1             
+##  [57] dqrng_0.3.1                 crayon_1.5.2               
+##  [59] shinydashboard_0.7.2        lattice_0.21-8             
+##  [61] beachmat_2.16.0             cowplot_1.1.1              
+##  [63] magick_2.8.0                pillar_1.9.0               
+##  [65] knitr_1.44                  ComplexHeatmap_2.16.0      
+##  [67] RTriangle_1.6-0.12          rjson_0.2.21               
+##  [69] codetools_0.2-19            glue_1.6.2                 
+##  [71] data.table_1.14.8           vctrs_0.6.3                
+##  [73] png_0.1-8                   gtable_0.3.4               
+##  [75] cachem_1.0.8                xfun_0.40                  
+##  [77] S4Arrays_1.0.6              mime_0.12                  
+##  [79] DropletUtils_1.20.0         tidygraph_1.2.3            
+##  [81] ConsensusClusterPlus_1.64.0 survival_3.5-5             
+##  [83] iterators_1.0.14            cytolib_2.12.1             
+##  [85] units_0.8-4                 ellipsis_0.3.2             
+##  [87] TH.data_1.1-2               bit64_4.0.5                
+##  [89] rprojroot_2.0.3             bslib_0.5.1                
+##  [91] irlba_2.3.5.1               svgPanZoom_0.3.4           
+##  [93] vipor_0.4.5                 KernSmooth_2.23-21         
+##  [95] colorspace_2.1-0            DBI_1.1.3                  
+##  [97] raster_3.6-23               tidyselect_1.2.0           
+##  [99] bit_4.0.5                   compiler_4.3.1             
+## [101] BiocNeighbors_1.18.0        desc_1.4.2                 
+## [103] DelayedArray_0.26.7         bookdown_0.35              
+## [105] scales_1.2.1                classInt_0.4-10            
+## [107] distances_0.1.9             stringr_1.5.0              
+## [109] digest_0.6.33               fftwtools_0.9-11           
+## [111] rmarkdown_2.25              XVector_0.40.0             
+## [113] htmltools_0.5.6             pkgconfig_2.0.3            
+## [115] jpeg_0.1-10                 sparseMatrixStats_1.12.2   
+## [117] fastmap_1.1.1               rlang_1.1.1                
+## [119] GlobalOptions_0.1.2         htmlwidgets_1.6.2          
+## [121] shiny_1.7.5                 DelayedMatrixStats_1.22.6  
+## [123] farver_2.1.1                jquerylib_0.1.4            
+## [125] zoo_1.8-12                  jsonlite_1.8.7             
+## [127] R.oo_1.25.0                 BiocSingular_1.16.0        
+## [129] RCurl_1.98-1.12             magrittr_2.0.3             
+## [131] scuttle_1.10.2              GenomeInfoDbData_1.2.10    
+## [133] Rhdf5lib_1.22.1             munsell_0.5.0              
+## [135] Rcpp_1.0.11                 ggnewscale_0.4.9           
+## [137] viridis_0.6.4               stringi_1.7.12             
+## [139] ggraph_2.1.0                brio_1.1.3                 
+## [141] zlibbioc_1.46.0             MASS_7.3-60                
+## [143] plyr_1.8.8                  parallel_4.3.1             
+## [145] ggrepel_0.9.3               graphlayouts_1.0.1         
+## [147] splines_4.3.1               hms_1.1.3                  
+## [149] circlize_0.4.15             locfit_1.5-9.8             
+## [151] igraph_1.5.1                ggpubr_0.6.0               
+## [153] ggsignif_0.6.4              pkgload_1.3.3              
+## [155] ScaledMatrix_1.8.1          reshape2_1.4.4             
+## [157] XML_3.99-0.14               drc_3.0-1                  
+## [159] evaluate_0.21               tzdb_0.4.0                 
+## [161] foreach_1.5.2               tweenr_2.0.2               
+## [163] httpuv_1.6.11               tidyr_1.3.0                
+## [165] purrr_1.0.2                 polyclip_1.10-6            
+## [167] clue_0.3-65                 ggforce_0.4.1              
+## [169] rsvd_1.0.5                  broom_1.0.5                
+## [171] xtable_1.8-4                e1071_1.7-13               
+## [173] rstatix_0.7.2               later_1.3.1                
+## [175] viridisLite_0.4.2           class_7.3-22               
+## [177] tibble_3.2.1                FlowSOM_2.8.0              
+## [179] beeswarm_0.4.0              cluster_2.1.4              
+## [181] concaveman_1.1.0
+
+ +
+
+

References

+
+
Bendall, Sean C., Erin F. Simonds, Peng Qiu, El Ad D. Amir, Peter O. Krutzik, Rachel Finck, Robert V. Bruggner, et al. 2011. “Single-Cell Mass Cytometry of Differential Immune and Drug Responses Across a Human Hematopoietic Continuum.” Science 332: 687–96.

Chevrier, Stéphane, Helena L. Crowell, Vito R. T. Zanotelli, Stefanie Engler, Mark D. Robinson, and Bernd Bodenmiller. 2017. “Compensation of Signal Spillover in Suspension and Imaging Mass Cytometry.” Cell Systems 6: 612–20.
+
+
+ +
+
+
+ + +
+
+ + + + + + + + + + + + + + + diff --git a/style.css b/style.css new file mode 100644 index 00000000..b1b1080d --- /dev/null +++ b/style.css @@ -0,0 +1,16 @@ +p.caption { + color: #777; + margin-top: 10px; +} +p code { + white-space: inherit; +} +pre { + word-break: normal; + word-wrap: normal; +} +pre code { + white-space: inherit; +} + +pre, code {white-space:pre !important; overflow-x:scroll !important}