Applying Machine Learning Classifiers to Pediatric Patient Derived Xenograft Expression Data

Gregory Way, Jo Lynne Harenza, John Maris, 2018

Here, we apply three classifiers (Ras activation, NF1 inactivation, and TP53 inactivation) to Target Patient Derived Xenograft (PDX) RNAseq data. The classifiers were previously trained on data from The Cancer Genome Atlas (TCGA) PanCanAtlas Project (Way et al. 2018; Knijnenburg et al. 2018).
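
As a rough illustration of what "applying a classifier" to expression data can look like, the sketch below assumes the trained classifier is a logistic regression model stored as per-gene weights, and that expression values are z-scored per gene before scoring. The gene names, weights, sample IDs, and the helpers `zscore_by_gene` and `apply_classifier` are hypothetical; they are not taken from this repository.

```python
import math
from statistics import mean, pstdev


def zscore_by_gene(expression):
    """Standardize each gene across samples (mean 0, standard deviation 1).

    Assumes every sample reports the same set of genes.
    """
    genes = next(iter(expression.values())).keys()
    stats = {}
    for gene in genes:
        values = [sample[gene] for sample in expression.values()]
        sd = pstdev(values)
        # Guard against constant genes to avoid dividing by zero
        stats[gene] = (mean(values), sd if sd > 0 else 1.0)
    return {
        sample_id: {g: (v - stats[g][0]) / stats[g][1] for g, v in sample.items()}
        for sample_id, sample in expression.items()
    }


def apply_classifier(expression, weights, intercept=0.0):
    """Return a score in (0, 1) per sample: sigmoid(intercept + sum of weight * z-score)."""
    scaled = zscore_by_gene(expression)
    scores = {}
    for sample_id, sample in scaled.items():
        # Genes missing from the expression data are skipped
        linear = intercept + sum(w * sample[g] for g, w in weights.items() if g in sample)
        scores[sample_id] = 1.0 / (1.0 + math.exp(-linear))
    return scores


# Hypothetical toy data: three PDX samples, two genes
expression = {
    "pdx_a": {"GENE1": 5.0, "GENE2": 1.0},
    "pdx_b": {"GENE1": 1.0, "GENE2": 5.0},
    "pdx_c": {"GENE1": 3.0, "GENE2": 3.0},
}
weights = {"GENE1": 1.5, "GENE2": -0.5}  # made-up coefficients

scores = apply_classifier(expression, weights)
```

In this toy example, `pdx_a` receives the highest score because it has high expression of the positively weighted gene and low expression of the negatively weighted one. The real inputs and model details are defined in `1.apply-classifier.ipynb`.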

Computational Environment

We use conda as an environment manager. To reproduce the computational environment used in this pipeline, run:

# Using conda version >4.5
conda env create --force --file environment.yml

conda activate expression-classification

Pipeline

The following notebooks make up the analysis pipeline:

| Notebook | Description |
| :------- | :---------- |
| `1.apply-classifier.ipynb` | Apply the previously trained classifiers to the input data |
| `2.evaluate-classifier.ipynb` | Investigate and evaluate prediction performance and score distributions for the input data |
| `3.explore-variants.ipynb` | Explore the classifier predictions across genes, variants, and outliers |
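
One common way to evaluate prediction performance is the area under the receiver operating characteristic curve (AUROC). The sketch below is only an illustration of that metric, assuming binary alteration labels and classifier scores per sample; the source does not state which metrics `2.evaluate-classifier.ipynb` reports, and the labels, scores, and the `auroc` helper here are made up.

```python
def auroc(labels, scores):
    """Probability that a random positive sample outscores a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = gene altered, 0 = wild-type
labels = [1, 1, 0, 0, 1, 0]
predicted = [0.9, 0.7, 0.4, 0.2, 0.3, 0.6]

example_auroc = auroc(labels, predicted)
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of altered from wild-type samples. The pairwise loop is quadratic in the number of samples, which is fine for a sketch but slow for large cohorts.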

To rerun all scripts, perform the following:

# First, download the gene expression and alterations data
./download_data.sh

# Make sure to activate the conda environment
conda activate expression-classification

# Run the pipeline to extract results, figures, and convert notebooks for easy viewing
./run_analysis.sh
