title | author | date | output |
---|---|---|---|
Process Sanger data for CC Founders | Brian S Yandell | 8 December 2016 | html_document |
These files extract features for use with DOQTL data. They are derived from code developed by Karl Broman.
- R/get_mgi_features.R: get MGI features (genes, exons, ...) for Sanger file
- inst/CreatSQL/vcf_snp_2db.R: (re)create cc_foundersnps.sqlite (update of Karl Broman's R/0_vcf2db.R)
- inst/CreatSQL/vcf_indel_2db.R: create cc_founderindels.sqlite (new)
- inst/CreatSQL/svs.Rmd: create svs8_*.rds files (new)
The vcf and svs files have a hard-wired dirpath that must be edited locally before they are run.
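One way to keep that local edit in a single place is to build every file path from one dirpath value. The following is only a sketch: the environment variable CCSANGER_DIR is a hypothetical name chosen for illustration and is not read by the scripts themselves, and the scripts may define dirpath differently.

```r
# Hypothetical: take the data directory from an environment variable,
# falling back to the current directory when the variable is unset.
dirpath <- Sys.getenv("CCSANGER_DIR", unset = ".")

# Build the database paths named in this README from the single dirpath.
snp_db   <- file.path(dirpath, "cc_foundersnps.sqlite")
indel_db <- file.path(dirpath, "cc_founderindels.sqlite")
```

With this arrangement only the environment variable (or the one dirpath line) changes from machine to machine.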
For MGI features:
library(dplyr)
library(readr)
source("check_interval.R")
For VCF:
library(VariantAnnotation)
library(RSQLite)
For SVS:
library(dplyr)
The MGI features are stored in their own SQLite database, which is much smaller. It is not clear that a separate database is needed.
I modified Karl's SNP SQLite code to include consequence and Ensembl IDs.
The InDel SQLite code is a minor change to the SNP code to get indels. Note that there is a column labelled allele, which is the name used in creation of the VCF; it closely resembles the column alleles, which we use with DOQTL.
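If downstream code expects the name alleles, the column can be renamed after the indel table is read. The sketch below uses a small made-up data frame (the chr and pos columns and all values are hypothetical, not taken from the database) and assumes the two columns carry the same kind of information.

```r
# Hypothetical stand-in for a few rows read from the indel database.
indels <- data.frame(
  chr    = c("1", "1"),
  pos    = c(3000100L, 3000250L),
  allele = c("A|AT", "GC|G"),
  stringsAsFactors = FALSE
)

# Rename the column 'allele' to 'alleles'; all other columns are unchanged.
names(indels)[names(indels) == "allele"] <- "alleles"
```

With dplyr loaded, dplyr::rename(indels, alleles = allele) performs the same rename on the original data frame.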
The structural variants were small, but it is unclear what users might want. I mainly use the svs8_len.rds file.
These are in a different format from VCF, so look carefully at them. I am not totally happy with the result, but it works.
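Since the structural variant output is saved as .rds, it can be loaded with base R's readRDS. This is a minimal sketch: the dirpath value is an assumption, and the structure of the loaded object is not described here, so inspect it (for example with str) before use.

```r
# Assumption: dirpath points at the directory holding the svs8_*.rds files.
dirpath <- "."
svs_file <- file.path(dirpath, "svs8_len.rds")

# Load the object only if the file is present; otherwise leave it NULL.
svs_len <- if (file.exists(svs_file)) readRDS(svs_file) else NULL
```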
You can install R/CCSanger from GitHub.
You will need the following packages for CCSanger:
install.packages(c("assertthat", "dplyr", "feather", "dbplyr"))
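Once you have installed these, install CCSanger from GitHub. The install_github function comes from the devtools package, which must also be installed:
devtools::install_github("byandell/CCSanger")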
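Before running the install steps, it can help to see which of the required packages are already present. The following sketch only reports what is missing and installs nothing; the variable names are illustrative.

```r
# Packages listed above as prerequisites for CCSanger.
pkgs <- c("assertthat", "dplyr", "feather", "dbplyr")

# TRUE for each package whose namespace can be loaded on this machine.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need install.packages().
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
}
```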