Formats
| | VCF & BCF | GFF | GFF2 & GTF* | GFF3 | BED |
---|---|---|---|---|---|
reference_sequence(s) | Only diff (optional) | ||||
Region of reference_sequence(s) | x | x | x | x | x |
Annotation data | variants | feature | feature | feature | Regions (less than gff) |
Is File Seekable | Yes | No Use Case? | No Use Case? | No Use Case? | No Use Case? |
| | VCF & BCF | GFF | GFF2 & GTF* | GFF3 | BED |
---|---|---|---|---|---|
Specific Header / Metainformation | x | no header | no header | No header | |
Header line | x | no | |||
CHROM | x | seqname | Reference sequence | x | |
feature*** | Name (optional) | ||||
source | x | x | |||
method | x | type | |||
POS | x | start & end | start & end | start & end | start & end |
thickStart | | | | | x (optional) |
thickEnd | | | | | x (optional) |
itemRgb | | | | | x (optional) |
blockCount | | | | | x (optional) |
blockSizes | | | | | x (optional) |
blockStarts | | | | | x (optional) |
ID | x | seqid | |||
ALT | x | ||||
QUAL | x | score | score | score | score (optional) |
strand | | x | x | x | x (optional) |
frame | | x | phase | phase | |
group | x | ||||
FILTER | x | ||||
INFO**** | x | attribute | attributes**** * | ||
FORMAT**** ** (optional) | x | ||||
SAMPLES (optional) | x |
*The GTF is identical to GFF version 2.
***feature: e.g. Gene, Variation, Similarity
****INFO: Arbitrary keys are permitted, although some sub-fields are reserved (albeit optional).
**** *attributes: ID, Name, Alias, Parent, Target, Gap, Derives_from, Note, Dbxref, Ontology_term
**** **FORMAT Tags: AD, ADF, ADR, DP, EC, FT, GL, GP, GQ, GT, HQ, MQ, PL, PQ, PS
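
The VCF column of the table above can be made concrete with a small parser. What follows is a minimal Python sketch, not a substitute for a real VCF library: it assumes one uncompressed, tab-separated data line, and the names `parse_info` and `parse_vcf_line` as well as the example record are made up for illustration. It splits the eight fixed columns, turns INFO into a dictionary (keys without `=` are treated as flags), and pairs the FORMAT keys with each sample column.

```python
def parse_info(info):
    """Turn 'DP=14;AF=0.5;DB' into {'DP': '14', 'AF': '0.5', 'DB': True}."""
    if info == ".":
        return {}
    result = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # A key without '=' is a flag; values are kept as strings here.
        result[key] = value if sep else True
    return result


def parse_vcf_line(line):
    """Parse one VCF data line (not a '#' header line) into a dict."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    record = {
        "CHROM": chrom,
        "POS": int(pos),
        "ID": vid,
        "REF": ref,
        "ALT": alt.split(","),          # several alternate alleles are comma-separated
        "QUAL": None if qual == "." else float(qual),
        "FILTER": flt,
        "INFO": parse_info(info),
        "SAMPLES": [],
    }
    # FORMAT and the sample columns are optional.
    if len(fields) > 9:
        keys = fields[8].split(":")
        for sample in fields[9:]:
            record["SAMPLES"].append(dict(zip(keys, sample.split(":"))))
    return record


# Illustrative record with two samples (hypothetical data).
example = ("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB"
           "\tGT:GQ:DP\t0|0:48:1\t1|0:48:8")
rec = parse_vcf_line(example)
print(rec["INFO"]["DP"], rec["SAMPLES"][1]["GT"])  # prints: 14 1|0
```

INFO values are left as strings because their types are declared in the file's meta-information lines, which this sketch does not read.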
VCF (Variant Call Format)*, BCF, GFF (General Feature Format), GFF2 (deprecated), GTF (General Transfer Format), GFF3, BED (Browser Extensible Data), GVF
HGVS vs BED vs GVF vs VCF Format example: https://www.ncbi.nlm.nih.gov/variation/tools/reporter/docs/examples#section-1.2.3
Format overviews from the Broad Institute of MIT and Harvard and from UCSC (University of California, Santa Cruz)
*The VCF specification is no longer maintained by the 1000 Genomes Project. The group leading the management and expansion of the format is the Global Alliance for Genomics and Health (GA4GH) Large Scale Genomics Work Stream file format team, http://ga4gh.org/#/fileformats-team (source: Wikipedia).
VCF & BCF: https://samtools.github.io/hts-specs/VCFv4.3.pdf & https://en.wikipedia.org/wiki/Variant_Call_Format & http://vcftools.sourceforge.net/VCF-poster.pdf
Review for all GFF and GTF formats: https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/gxf.md -> https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/gxf.md#main-points-and-differences-between-gff-formats & https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/gxf.md#main-points-and-differences-between-gtf-formats
GFF: https://www.ensembl.org/info/website/upload/gff.html
GFF2: http://gmod.org/wiki/GFF2
GFF3: http://gmod.org/wiki/GFF3 & https://en.wikipedia.org/wiki/General_feature_format
BED: https://m.ensembl.org/info/website/upload/bed.html
VCF to GFF: http://seqanswers.com/forums/showthread.php?t=9796&highlight=gff+vcf
GFF to BED: https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/gff2bed.html
GTF to GFF: http://seqanswers.com/forums/showthread.php?t=8321
BAM, GFF, GTF, GVF, PSL, RepeatMasker annotation output (OUT), SAM, VCF and WIG to BED: https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/convert2bed.html
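
The main pitfall in the GFF-to-BED conversions linked above is the coordinate system: GFF start and end are 1-based and inclusive, while BED uses a 0-based start and an exclusive end. The following minimal Python sketch shows that shift for a single GFF3 line. It is not how BEDOPS implements `gff2bed`; the function names, the choice of `Name`, then `ID`, then the feature type as the BED name, and the replacement of a missing score with `0` are assumptions made for illustration.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column):
    """Turn 'ID=gene0001;Name=EDEN' into {'ID': 'gene0001', 'Name': 'EDEN'}."""
    attrs = {}
    if column in ("", "."):
        return attrs
    for pair in column.strip().split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        # GFF3 percent-encodes reserved characters such as ';' and '='.
        attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_to_bed6(line):
    """Convert one nine-column GFF3 feature line to a six-column BED line."""
    (seqid, source, ftype, start, end,
     score, strand, phase, attributes) = line.rstrip("\n").split("\t")
    attrs = parse_gff3_attributes(attributes)
    # Assumption: prefer Name, then ID, then the feature type as the BED name.
    name = attrs.get("Name") or attrs.get("ID") or ftype
    bed_start = int(start) - 1   # 1-based inclusive -> 0-based start
    bed_end = int(end)           # inclusive end equals the exclusive BED end
    # Assumption: a missing GFF score ('.') becomes 0 in BED.
    bed_score = "0" if score == "." else score
    bed_strand = strand if strand in ("+", "-") else "."
    return "\t".join([seqid, str(bed_start), str(bed_end),
                      name, bed_score, bed_strand])


# Hypothetical feature line for illustration.
gff_line = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene0001;Name=EDEN"
print(gff3_to_bed6(gff_line))  # chr1, 999, 2000, EDEN, 0, + (tab-separated)
```

The feature length is unchanged by the conversion: 2000 - 1000 + 1 in GFF equals 2000 - 999 in BED.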