Gregory Way, Jo Lynne Harenza, John Maris, 2018
Here, we apply Ras activation, NF1 inactivation, and TP53 inactivation classifiers to Target Patient-Derived Xenograft (PDX) RNA-seq data. The classifiers were previously trained on data from The Cancer Genome Atlas (TCGA) PanCanAtlas Project (Way et al. 2018; Knijnenburg et al. 2018).
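
As a rough illustration of what "applying a classifier" to expression data can look like, the sketch below scores samples with a linear, logistic-regression-style model. It is a minimal sketch under stated assumptions, not the code in the notebooks: the function names, the per-gene z-scoring step, the toy expression values, and the weights are all hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_columns(matrix):
    """Standardize each gene (column) across samples (rows).

    Assumption: expression is scaled per gene before scoring.
    """
    n_genes = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_genes)]
    # Fall back to 1.0 when a gene has zero variance to avoid dividing by zero.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [
        [(row[j] - stats[j][0]) / stats[j][1] for j in range(n_genes)]
        for row in matrix
    ]


def apply_classifier(matrix, weights, intercept=0.0):
    """Return one score in (0, 1) per sample: sigmoid(intercept + w . x)."""
    scores = []
    for row in zscore_columns(matrix):
        logit = intercept + sum(w * x for w, x in zip(weights, row))
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Hypothetical toy data: 3 samples (rows) x 2 genes (columns).
expression = [[1.0, 5.0], [3.0, 1.0], [2.0, 3.0]]
# Hypothetical gene weights; real weights would come from the trained model.
gene_weights = [1.5, -0.5]

scores = apply_classifier(expression, gene_weights)
print(scores)
```

Scores above 0.5 would be read as a predicted alteration (for example, Ras activation) under this kind of model; the threshold itself is also an assumption here.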
We use conda as an environment manager. To reproduce the computational environment used in this pipeline, run:

    # Using conda version >4.5
    conda env create --force --file environment.yml
    conda activate expression-classification

The following notebooks describe the analysis pipeline:
Notebook | Description |
---|---|
1.apply-classifier.ipynb | Apply the previously trained classifiers to the input data |
2.evaluate-classifier.ipynb | Investigate and evaluate the prediction performance and score distribution for the input data |
3.explore-variants.ipynb | Explore the classifier predictions across genes, variants, and outliers |
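
To make the evaluation step more concrete, the sketch below computes the area under the ROC curve (AUROC) from classifier scores and known alteration status. AUROC is used here only as an illustrative metric; whether the evaluation notebook uses it, and the function name and toy values, are assumptions.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (altered, wild-type) pairs ranked correctly.

    labels: 1 for samples with the alteration, 0 for wild-type samples.
    scores: classifier scores, higher meaning "more likely altered".
    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy example: two altered and two wild-type samples.
status = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.6, 0.1]
print(auroc(status, predicted))
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a library routine would be the practical choice for larger cohorts.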
To rerun all scripts, perform the following:

    # First, download the gene expression and alterations data
    ./download_data.sh

    # Make sure to activate the conda environment
    conda activate expression-classification

    # Run the pipeline to extract results, figures, and convert notebooks for easy viewing
    ./run_analysis.sh
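
For readers curious what "run the pipeline and convert notebooks" might involve, the sketch below builds `jupyter nbconvert` commands that execute each notebook in order and export it to HTML. This is a guess at the general approach, not the contents of `run_analysis.sh`: the helper names and the `html` output directory are hypothetical, and the commands are only executed when `dry_run=False`.

```python
import subprocess

# Notebook names taken from the table in this README.
NOTEBOOKS = [
    "1.apply-classifier.ipynb",
    "2.evaluate-classifier.ipynb",
    "3.explore-variants.ipynb",
]


def nbconvert_command(notebook, output_dir="html"):
    """Build a command that executes one notebook and exports it to HTML.

    Assumption: HTML is the "easy viewing" format; output_dir is hypothetical.
    """
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--execute",
        "--output-dir", output_dir,
        notebook,
    ]


def run_all(notebooks=NOTEBOOKS, dry_run=True):
    """Return the commands in pipeline order; run them unless dry_run is set."""
    commands = [nbconvert_command(nb) for nb in notebooks]
    if not dry_run:
        for command in commands:
            # check=True stops the pipeline at the first failing notebook.
            subprocess.run(command, check=True)
    return commands


# Dry run: print the commands without executing anything.
for command in run_all(dry_run=True):
    print(" ".join(command))
```

Running the notebooks in numeric order matters in this sketch because later notebooks are assumed to read results written by earlier ones.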