-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
289 lines (214 loc) · 11.8 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# PGxVision
<!-- badges: start -->
<!-- badges: end -->
## Description
PGxVision (PharmacoGenomic Vision & Interpretation) helps identify and
visualize RNA-based cancer biomarkers for drug response. This package is
intended to be used in conjunction with the Roche-PharmacoGx pipeline.
PGxVision is intended to guide cancer treatment decisions in molecular
tumour boards.
``` r
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
```
## Installation
To download the package:
``` r
require("devtools")
devtools::install_github("EvgeniyaGorobets/PGxVision", build_vignettes=TRUE)
library("PGxVision")
```
To run the shinyApp:
``` r
runPGxVision()
```
## Overview
``` r
ls("package:PGxVision")
data(package = "PGxVision")
browseVignettes("PGxVision")
```
`PGxVision` contains nine functions and four sample data sets. Of the nine
functions, one is to run the Shiny app for this package, three are plotting
functions for visualizing drug sensitivity data, and five are gene set analysis
functions that help provide biological context for genes of interest.
An overview of the package is illustrated below.
![](./inst/extdata/figures/PGxVisionSummary.jpeg)
### Biomarker Plotting
The _buildManhattanPlot_ function visualizes gene response to a certain drugs
across the entire human genome. The x-axis represents the full human genome and
is labeled by chromosome, and the y-axis can either plot the drug sensitivity
or the p-value/fdr/significance of the drug sensitivity statistic. As is the
standard for Manhattan plots, data points are colored according to what
chromosome they belong to.
The _buildVolcanoPlot_ function lets you simultaneously visualize the drug
sensitivity and the p-value/significance of the sensitivity statistic. As is
standard for volcano plots, points that are statistically insignificant are
grayed out.
The _buildWaterfallPlot_ function plots drug sensitivity (or other similar
metrics) across a range of tumours or a range of drugs. The waterfall plot will
order results according to the y-axis, so users can quickly identify the most
and least sensitive tumour-drug combinations.
You can learn more about how to use these plotting functions in the Plotting
Biomarkers section of the Visualizing and Interpreting Biomarkers vignette.
### Gene Set Analysis
The gene set analysis in this package is broken down into four steps:
1. _getGeneSets_: Given an ENSEMBL gene ID, find all gene sets that contain
that gene and return information about those gene sets. Users can examine
different types of gene sets, such as those based on biological pathways,
cellular components, molecular function, etc.
2. _expandGeneSets_: Using the gene set IDs retrieved in _getGeneSets_, get all
other genes in those gene sets.
3. _computeGeneSetSimilarity_: Using the fully expanded gene sets from
_expandGeneSets_, compute the overlap between the gene sets. All gene sets
will at least contain the initial query gene, so pairwise overlap will be
non-zero. Currently, the package only supports similarity metrics based on
the proportion of intersecting genes to total genes.
4. _buildNetworkPlot_: Plot the gene sets that were retrieved in _getGeneSets_,
using their similarity scores from _computeGeneSetSimilarity_ as edge
weights. Gene sets that have high overlap will be in closer proximity and
will have thicker and darker edges between them.
In addition to these four functions, the _geneSetAnalysis_ function is provided
to run the first three steps of the above pipeline. You can learn more about
how to use these gene set analysis functions in the Gene Set Analysis section
of the Visualizing and Interpreting Biomarkers vignette.
### Shiny Dashboard
The _runPGxVision_ lets you perform the functions described above by interacting
with UI elements. The dashboard layout of the app lets you see multiple plots
side by side, to facilitate analysis. Additionally, the Shiny app lets you
interact with the plots (hovering over and selecting points), making it easier
to extract useful information.
The screenshots below illustrate what the main features and layout of the app.
It is divided into two tabs: "Biomarkers", which lets you look at gene-level
cell-line drug sensitivity data, and "Treatment Response", which lets you
examine and compare treatment responses of different tumours and compounds.
<!-- The code for placing images inline was taken from jdlong's comment on
a post in RStudio Community:
jdlong. (2018). How to stack two images horizontally in R Markdown. RStudio
Community.
https://community.rstudio.com/t/how-to-stack-two-images-horizontally-in-r-markdown/18941 -->
```{r image_grobs, fig.show = "hold", out.width = "33%", echo=FALSE}
knitr::include_graphics("./inst/extdata/figures/app-1.png")
knitr::include_graphics("./inst/extdata/figures/app-2.png")
knitr::include_graphics("./inst/extdata/figures/app-3.png")
```
### Sample Data
The _BRCA.PDXE.paxlitaxel.response_ data set is a `data.frame` with treatment
vs. control angle data from BRCA PDXs (patient-derived xenographs). It is used
in the _buildWaterfallPlot_ example and was retrieved from the `Xeva` package.
The _Biomarkers_ data set is a `data.frame` with drug sensitivity/response data
in various experiments involving cancer cell lines (an experiment is a unique
combination of a drug, tissue, and molecular data type). This data set
also contains the p-values for the drug sensitivity statistics. It is used in
the _buildManhattanPlot_ and _buildVolcanoPlot_ examples and was retrieved from
PharmacoDb.
The _GRCh38.p13.Assembly_ data set is a `data.frame` containing basic
information about the GRCh38.p13 genome assembly. Most notably, it includes the
names and lengths of all the chromosomes. It is used in the
_buildManhattanPlot_ function and was retrieved from Gencode.
The _TestGeneSets_ data set is a `data.frame` containing 10 different gene sets
and all their constituent genes. This data set represents all the gene sets
in the "GO::CC" (GO cellular compartment) category that gene ENSG00000012124 is
a part of. It is used for testing and was retrieved from MSigDb.
## Contributions
The author of the package is Evgeniya Gorobets.
`data.table` is used to transform `data.frame`s into `data.table`s in some
plotting and gene set analysis functions (_buildVolcanoPlot, buildManhattanPlot,
queryGene, expandGeneSets, computeGeneSetSimilarity_). The `data.table`
vignettes were used to create cleaner syntax and optimize table manipulation.
`ggplot2` is used to create non-network plots (_buildVolcanoPlot,
buildManhattanPlot, buildWaterfallPlot_), and `viwNetwork` is used to create
network plots (_buildNetworkPlot_). `ggprism` is used to enhance the axes on the
Manhattan plot (_buildManhattanPlot_). `viridis` is used to enhance the colors
on the network plot (_buildNetworkPlot_).
`checkmate` is used to succinctly check user input in all functions.
`msigdbr` is used to query the MSigDb in gene set analysis functions
(_queryGene, expandGeneSets_).
`shiny`, `shinydashboard`, and `shinybusy` are used to create the Shiny app and
generate UI elements in the app. `plotly` is used to add interactivity to
ggplots that are rendered in the Shiny app. `visNetwork` is used to add
interactivity to visNetwork graphs that are generated in the Shiny app.
`magrittr` is used to pipe between functions.
## References
### Package References
These references describe any and all packages used in PGxVision.
Almende B.V., Thieurmel, B., & Robert, T. (2021). visNetwork: Network
Visualization using 'vis.js' Library. R package version 2.1.0.
https://CRAN.R-project.org/package=visNetwork
Bache, S. M., and Wickham, H. (2020). magrittr: A Forward-Pipe Operator for R.
R package version 2.0.1. https://CRAN.R-project.org/package=magrittr
Chang, W. and Borges Ribeiro, B. (2021). shinydashboard: Create
Dashboards with 'Shiny'. R package version 0.7.2.
https://CRAN.R-project.org/package=shinydashboard
Chang, W., Cheng, J., Allaire, J., Sievert, C., Schloerke, B., Xie, Y., Allen,
J., McPherson, J., Dipert, A., and Borges, B. (2021). shiny: Web Application
Framework for R. R package version 1.7.1.
https://CRAN.R-project.org/package=shiny
Dawson, C. (2021). ggprism: A 'ggplot2' Extension Inspired by 'GraphPad Prism'.
R package version 1.0.3. https://CRAN.R-project.org/package=ggprism
Dolgalev, I. (2021). msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy
Data Format. R package version 7.4.1.
https://CRAN.R-project.org/package=msigdbr
Dowle, M., and Srinivasan, A. (2021). data.table: Extension of `data.frame`.
R package version 1.14.2. https://CRAN.R-project.org/package=data.table.
Garnier, S., Ross, N., Rudis, R., Camargo, A. P., Sciaini, M., and Scherer, C.
(2021). Rvision - Colorblind-Friendly Color Maps for R. R package version
0.6.2.
Lang, M. (2017). “checkmate: Fast Argument Checks for Defensive R Programming.”
_The R Journal 9_(1), 437-445.
https://journal.r-project.org/archive/2017/RJ-2017-028/index.html.
Meyer, F. and Perrier, V. (2020). shinybusy: Busy Indicator for 'Shiny'
Applications. R package version 0.2.2.
https://CRAN.R-project.org/package=shinybusy
R Core Team. (2021). R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria.
https://www.R-project.org/.
Sievert, C. (2020). _Interactive Web-Based Data Visualization with R, plotly,
and shiny_. Chapman and Hall/CRC.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis.
Springer-Verlag New York. https://ggplot2.tidyverse.org.
### Data References
These references indicate the source for all datasets available in PGxVision.
Feizi, N., Nair, S. K., Smirnov, P., Beri, G., Eeles, C., Esfahani, P. N., …
Haibe-Kains, B. (2021). PharmacoDB 2.0: Improving scalability and
transparency of in vitro pharmacogenomics analysis. _bioRxiv._
doi:10.1101/2021.09.21.461211
Frankish, A., Diekhans, M., Ferreira, A. M., Johnson, R., Jungreis, I.
Loveland, J., Mudge, J. M., Sisu, C., Wright, J., Armstrong, J., Barnes, I.,
Berry, A., Bignell, A., Carbonell Sala, S., Chrast, J., Cunningham, F., Di
Domenico, T., Donaldson, S., Fiddes, I. T., García Girón, C., … Flicek, P.
(2019). GENCODE reference annotation for the human and mouse genomes.
_Nucleic acids research, 47_(D1), D766–D773.
https://doi.org/10.1093/nar/gky955
Mer A, Haibe-Kains B (2021). Xeva: Analysis of patient-derived xenograft (PDX)
data. R package version 1.10.0.
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L.,
Gillette, M. A., … Mesirov, J. P. (2005). Gene set enrichment analysis:
A knowledge-based approach for interpreting genome-wide expression profiles.
_Proceedings of the National Academy of Sciences, 102_(43), 15545–15550.
doi:10.1073/pnas.0506580102
### Code References
These references indicate the source for any code in PGxVision that was taken
directly from a website (not including documentation & vignettes for packages).
jdlong. (2018). How to stack two images horizontally in R Markdown. _RStudio
Community._
https://community.rstudio.com/t/how-to-stack-two-images-horizontally-in-r-markdown/18941
Perry & YakovL. (2017). Stop vis.js physics after nodes load but allow
drag-able nodes. _StackOverflow._
https://stackoverflow.com/questions/32403578/stop-vis-js-physics-after-nodes-load-but-allow-drag-able-nodes
## Acknowledgements
This package was developed as part of an assessment for 2021 BCB410H:
Applied Bioinformatics, University of Toronto, Toronto, CANADA.