-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
152 lines (102 loc) · 5.39 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# splice2neo
<!-- badges: start -->
[![R-CMD-check](https://github.com/TRON-Bioinformatics/splice2neo/workflows/R-CMD-check/badge.svg)](https://github.com/TRON-Bioinformatics/splice2neo/actions)
[![Codecov test coverage](https://codecov.io/gh/TRON-Bioinformatics/splice2neo/branch/master/graph/badge.svg)](https://codecov.io/gh/TRON-Bioinformatics/splice2neo?branch=master)
`r badger::badge_devel("TRON-Bioinformatics/splice2neo", "blue")`
`r badger::badge_lifecycle("experimental", "blue")`
`r badger::badge_last_commit("TRON-Bioinformatics/splice2neo")`
<!-- badges: end -->
- Documentation: https://tron-bioinformatics.github.io/splice2neo/
- Publication: Lang et al. (2024) [Prediction of tumor-specific splicing from somatic mutations as a source of neoantigen candidates](https://doi.org/10.1093/bioadv/vbae080)
## Overview
This package provides functions for the analysis of splice
junctions and their association with somatic mutations. It integrates the output of several tools that predict splicing effects from mutations or detect expressed splice junctions from RNA-seq data into a standardized splice junction format based on genomic coordinates. Users can filter detected splice junctions against canonical splice junctions and annotate them with affected transcript sequences, CDS, and resulting peptide sequences. Splice2neo supports splice events from alternative 3'/5' splice sites, exons skipping, intron retentions, exitrons, and mutually exclusive exons. Integrating splice2neo functions and detection rules based on splice effect scores and RNA-seq support facilitates the identification of mutation-associated splice junctions that are tumor-specific and can encode neoantigen candidates.
Currently, splice2neo supports the output data from the following tools:
- Mutation effect prediction tools:
- [SpliceAI](https://github.com/Illumina/SpliceAI)
- [CI-SpliceAI](https://github.com/YStrauch/CI-SpliceAI__Annotation)
- [MMsplice](https://github.com/gagneurlab/MMSplice_MTSplice)
- [Pangolin](https://github.com/tkzeng/Pangolin)
- RNA-seq-based splice junction detection tools:
- [LeafCutter](https://github.com/davidaknowles/leafcutter)
- [RegTools](https://github.com/griffithlab/regtools)
- [SplAdder](https://github.com/ratschlab/spladder)
- [IRfinder](https://github.com/RitchieLabIGH/IRFinder)
- [SUPPA2](https://github.com/comprna/SUPPA)
- [STAR](https://github.com/alexdobin/STAR)
- [StringTie](https://github.com/gpertea/stringtie)
For a more detailed description and a full example workflow see the [vignette](https://tron-bioinformatics.github.io/splice2neo/articles/splice2neo_workflow.html)
## Installation
The R package splice2neo is not yet on [CRAN](https://CRAN.R-project.org) or
[Bioconductor](https://www.bioconductor.org/). Therefore, you need to install splice2neo
from [github](https://github.com/TRON-Bioinformatics/splice2neo).
```{r, eval=FALSE}
if (!requireNamespace("remotes", quietly = TRUE)) {
install.packages("remotes")
}
remotes::install_github("TRON-Bioinformatics/splice2neo")
```
## Example usage
This is a basic example of how to use some functions.
```{r}
library(splice2neo)
# load human genome reference sequence
requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)
bsg <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
```
### 1 Example data
We start with some example splice junctions provided with the package.
```{r}
junc_df <- dplyr::tibble(
junc_id = toy_junc_id[c(1, 6, 10)]
)
junc_df
```
### 2 Add transcripts
Next, we find the transcripts which are in the same genomic region as the splice junction and that may be affected by the junction.
```{r}
junc_df %>%
add_tx(toy_transcripts) %>%
head()
```
### 3 Modify transcripts with junctions
We modify the canonical transcripts by introducing the splice junctions. Then we
add the transcript sequence in a fixed-sized window around the junction
positions, the context sequence.
```{r}
toy_junc_df %>%
head()
toy_junc_df %>%
add_context_seq(transcripts = toy_transcripts, size = 400, bsg = bsg) %>%
head()
```
### 4 Annotate peptide sequence
Here, we use the splice junctions to modify the coding sequences (CDS) of the
reference transcripts. The resulting CDS sequences are translated into protein
sequence and further annotated with the peptide around the junction, the
relative position of the splice junction in the peptide, and the location of the
junction in an open reading frame (ORF).
```{r}
toy_junc_df %>%
# add peptide sequence
add_peptide(cds=toy_cds, flanking_size = 13, bsg = bsg) %>%
# select subset of columns
dplyr::select(junc_id, peptide_context, junc_in_orf, cds_description) %>%
head()
```
## Issues
Please report issues here: https://github.com/TRON-Bioinformatics/splice2neo/issues
## Reference
Lang F, Sorn P, Suchan M, Henrich A, Albrecht C, Köhl N, Beicht A, Riesgo-Ferreiro P, Holtsträter C, Schrörs B, Weber D, Löwer M, Sahin U, Ibn-Salem J. Prediction of tumor-specific splicing from somatic mutations as a source of neoantigen candidates. Bioinform Adv. 2024 May 29;4(1):vbae080. doi: [10.1093/bioadv/vbae080](https://doi.org/10.1093/bioadv/vbae080). PMID: 38863673; PMCID: PMC11165244.