André F. Rendeiro*, Hiranmayi Ravichandran*, Yaron Bram, Vasuretha Chandar, Junbum Kim, Cem Meydan, Jiwoon Park, Jonathan Foox, Tyler Hether, Sarah Warren, Youngmi Kim, Jason Reeves, Steven Salvatore, Christopher E. Mason, Eric C. Swanson, Alain C. Borczuk, Olivier Elemento & Robert E. Schwartz.
* Authors contributed equally.
⬅️ Raw IMC data
⬅️ Processed IMC data
⬅️ 2nd IMC panel data
⬅️ Immunohistochemistry data
⬅️ Targeted spatial transcriptomics data
⬅️ read the published article here
- The metadata directory contains metadata relevant to annotate the samples
- This CSV file is the master record of all analyzed samples
- The src directory contains source code used to analyze the data
- Raw data (i.e. MCD files) will be under the data directory.
- Processing of the data will create TIFF files under the processed directory.
- Outputs from the analysis will be present in a results directory, with subfolders pertaining to each part of the analysis as described below.
To download files from Zenodo programmatically, create an access token (https://zenodo.org/account/settings/applications/tokens/new/) and add it to a file ~/.zenodo.auth.json as a simple key: value pair, e.g.: {'access_token': '123asd123asd123asd123asd123asd'}.
Be sure to make the file read-only (e.g. chmod 400 ~/.zenodo.auth.json).
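The token file setup described above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names write_token_file and read_token are hypothetical, the file is written as standard JSON (double-quoted keys), which assumes the download step parses it with a regular JSON reader, and the demonstration uses a temporary directory rather than the real ~/.zenodo.auth.json.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_token_file(path: Path, token: str) -> None:
    """Store the token as a single key: value pair and make the file read-only."""
    path.write_text(json.dumps({"access_token": token}))
    os.chmod(path, stat.S_IRUSR)  # same effect as `chmod 400`


def read_token(path: Path) -> str:
    """Return the access token stored in the JSON file."""
    return json.loads(path.read_text())["access_token"]


# Demonstrated on a temporary file (assumption for illustration only);
# the location named in this README is ~/.zenodo.auth.json.
demo_path = Path(tempfile.mkdtemp()) / ".zenodo.auth.json"
write_token_file(demo_path, "123asd123asd123asd123asd123asd")
token = read_token(demo_path)
mode = stat.S_IMODE(demo_path.stat().st_mode)
```

Restricting the file to owner read-only keeps the token from being overwritten by accident and hidden from other users on a shared machine.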
To see all available steps type:
$ make
Makefile for the covid-imc project.
Available commands:
help Display help and quit
requirements Install Python requirements
download_data Download all data from Zenodo
analysis Run the actual analysis
To reproduce the analysis using the pre-processed data, run:
$ make requirements # install python requirements using pip
$ make download_data # download data from Zenodo
$ make analysis # run the analysis scripts
- Python 3.7+ (was run on 3.8.2)
- Python packages as specified in the requirements file - install with make requirements or pip install -r requirements.txt.
It is recommended to isolate the analysis software from the system's Python installation, for example with virtual environments or conda.
Here's how to clone the repository and create a virtual environment with the requirements installed:
git clone git@github.com:ElementoLab/covid-imc.git
cd covid-imc
virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt
This is the main dataset of the manuscript, consisting of 27 samples from 27 individuals, from which 240 images were produced. 3 images were excluded from analysis. The list of markers used is available here.
These data are available in the following Zenodo deposits:
This is a complementary dataset, focusing on proteins related to immune activation/cell state. It consists of 7 samples from 7 individuals, from which 46 images were produced.
These data are available in the following Zenodo deposits:
This is a complementary dataset, validating the IMC data. It consists of 383 H-DAB images for two markers (MPO and CD163) across all disease groups.
Raw images and segmentation masks are available here: https://doi.org/10.5281/zenodo.4633905.
The workflow is the following: Single nuclei are segmented with StarDist using the 2D_versatile_he model.
Images are decomposed into Hematoxylin and DAB components and each cell is quantified for the abundance of either marker. Positive cells are declared using Gaussian mixture models. Intensity and percentage of positive cells are compared between patients, compartments within the tissue and disease groups.
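The step of declaring positive cells with a mixture of Gaussians can be illustrated as follows. This is a dependency-free sketch under stated assumptions, not the code used in the repository: the per-cell intensities are synthetic, the function names fit_two_gaussians and call_positive are hypothetical, and a hand-written two-component EM fit stands in for whichever mixture-model implementation the analysis actually uses. It covers only the thresholding step, not segmentation or stain decomposition.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1D Gaussian mixture by expectation-maximization.

    Returns (weights, means, variances) with components ordered by mean.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    means = [lo + 0.25 * span, lo + 0.75 * span]
    variances = [max(span ** 2 / 16.0, 1e-6)] * 2
    weights = [0.5, 0.5]
    n = len(values)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for every value.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsing component
    order = sorted(range(2), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [variances[k] for k in order])


def call_positive(values, weights, means, variances):
    """Label a value positive when the higher-mean component is more probable."""
    labels = []
    for x in values:
        p_neg = weights[0] * normal_pdf(x, means[0], variances[0])
        p_pos = weights[1] * normal_pdf(x, means[1], variances[1])
        labels.append(p_pos > p_neg)
    return labels


# Synthetic per-cell DAB intensities (illustrative only): 300 negative, 100 positive.
random.seed(0)
intensities = ([random.gauss(0.1, 0.03) for _ in range(300)]
               + [random.gauss(0.6, 0.08) for _ in range(100)])
weights, means, variances = fit_two_gaussians(intensities)
labels = call_positive(intensities, weights, means, variances)
fraction_positive = sum(labels) / len(labels)
print(f"positive fraction: {fraction_positive:.2f}")
```

Fitting the mixture per marker lets the positivity cutoff adapt to the staining intensity distribution instead of relying on a fixed threshold.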
This is a complementary dataset, validating the IMC data and providing an expanded molecular view of the lung. Newly generated data is available here: https://doi.org/10.5281/zenodo.4635285. A script used to load and analyze the dataset is available here: src/geomx.py.
Reanalysis of targeted spatial transcriptomics data from Desai et al
A script used to get the dataset and analyze it is available here: src/geomx_desai.py.