This repository contains data indexes from NIST's Genome in a Bottle (GIAB) project. The indexes for sequences and alignments are also available under: https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data_indexes .
AshkenazimTrio
Son:HG002 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/
Father:HG003 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG003_NA24149_father/
Mother:HG004 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG004_NA24143_mother/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x150bp 300X per individual | All HG002 HG003 HG004 | novoalign: All HG002 HG003 HG004 |
Illumina 6KB Matepair | All HG002 HG003 HG004 | bwamem:hg19 All HG002 HG003 HG004 |
Illumina WGS 2X250bp | All HG002 HG003 HG004 | isaac:hg19 All HG002 HG003 HG004 novoalign: All HG002 HG003 HG004 |
Moleculo | All HG002 HG003 HG004 | |
Illumina Whole Exome | - | bwamem:hg19 All HG002 HG003 HG004 |
SOLiD 60x for son | All HG002 | LifeScope:hg19 All HG002 |
CompleteGenomics | - | CGAtools:hg19 All HG002 HG003 HG004 |
Ion Proton 1000x Exome | - | TMAP:hg19 All HG002 HG003 HG004 |
10X Genomics | - | bwamem:hg19 All HG002 HG003 HG004 |
10X Genomics ChromiumGenome | All HG002 | LongRanger2.0:hg19 All HG002 HG003 HG004 |
BioNano | All:bnx HG002:bnx HG003:bnx HG004:bnx | All:cmap HG002 HG003 HG004 |
PacBio 70x/30x/30x | All HG002 HG003 HG004 All:hdf5 HG002 HG003 HG004 | NGMLR:hg19 All HG002 HG003 HG004 minimap2: All HG002 HG003 HG004 |
PacBio CCS 10kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 11kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 15kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 15kb_20kb chemistry2 | All HG002 | pbmm2: All HG002 HG003 HG004 |
Oxford Nanopore 2D | All HG002 | - |
Oxford Nanopore ultralong (guppy-V3.2.4_2020-01-22) | All HG002 | minimap2:whatshap:hg19 All HG002 |
Oxford Nanopore ultralong Promethion | All HG002 HG003 HG004 | - |
BGI BGISEQ500 | All HG002 | - |
BGI MGISEQ PCR-free | All HG002 | - |
BGI stLFR | All HG002 HG003 HG004 | All:bwamem:hg19 HG002 HG003 HG004 |
Strand-Seq HG002 by BCCRC | All HG002 | - |
* CompleteGenomics LFR raw and alignment data are not available, but analysis results are available under: https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/CompleteGenomics_newLFR_CGAtools_06122015/
ChineseTrio
Son:HG005 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG005_NA24631_son/
Father:HG006 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG006_NA24694-huCA017E_father/
Mother:HG007 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG007_NA24695-hu38168_mother/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x250bp 300X for son; 2x150bp 100x for parents | All HG005 HG006 HG007 | novoalign: All:hg19-hg38 HG005:hg19-hg38 HG006:hg19-hg38 HG007:hg19-hg38 |
Illumina 6KB Matepair | All HG005 HG006 HG007 | |
Moleculo | All HG005 HG006 HG007 | |
SOLiD 60x for son | All:xsq HG005:xsq | LifeScope: All:hg19 HG005:hg19 |
CompleteGenomics | CGAtools: All:hg19 (RMDNA) HG005:hg19 HG006:hg19 HG007:hg19 CGAtools: All:hg19 (cellsDNA) HG005:hg19 | |
Illumina Whole Exome | bwamem: All:hg19 HG005:hg19 | |
Ion Proton 1000x Exome | TMAP: All:hg19 HG005:hg19 | |
BioNano for son | All:bnx HG005:bnx | All:hg19 (cmap) HG005:hg19 (cmap) |
PacBio Sequel for the trio | All HG005 HG006 HG007 | |
PacBio SequelII CCS 11kb | | |
BGI BGISEQ500, MGISEQ, stLFR | | |
NA12878
NA12878:HG001 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x150bp 300X | HG001 | bwamem: HG001:hg19 (downsampled30x) novoalign: HG001 |
Illumina HiSeq Exome | HG001 HG001:trimmed_fastq | bwamem: HG001:hg19 |
Illumina TruSeq Exome | bwamem: HG001:hg19 | |
10X Genomics | bwamem: HG001:hg19 bwamem: HG001:hg19 (size_selected) | |
10X Genomics ChromiumGenome | LongRanger2.0: HG001:hg19-hg38 LongRanger2.1: HG001:hg19-hg38 | |
CompleteGenomics | CGAtools: HG001:hg19 | |
Ion Proton 1000x Exome | TMAP: HG001:hg19 | |
NA12878 SOLiD5500W | LifeScope: HG001:hg19 | |
BGI BGISEQ500, MGISEQ, stLFR | ||
PacBio 40x | HG001:hdf5 | |
PacBio SequelII CCS 11kb | ||
Ultralong_OxfordNanopore | - | minimap2: HG001 |
- CompleteGenomics LFR raw and alignment data are not available, but analysis results are available under: https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/analysis/CompleteGenomics_newLFR_CGAtools_06122015/
Please note:
1. To use raw sequencing data (fastq, fasta, hdf5, xsq, bnx, etc.) in your analysis, download it using the sequence.index.* files.
2. To use aligned data (bam, xmap/cmap, etc.) in your analysis, download it using the alignment.index.* files.
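
As a rough illustration of working with an index file, the sketch below collects every download link from one. It assumes the index is tab-delimited text in which some fields hold full `ftp://` or `https://` URLs; the column layout is not described here, so the code deliberately does not depend on column names. The function name `extract_urls` and the sample rows are hypothetical, made up for this example.

```python
import csv
import io
from typing import Iterable, List

URL_PREFIXES = ("ftp://", "http://", "https://")


def extract_urls(index_lines: Iterable[str]) -> List[str]:
    """Return every field that looks like a download URL, in file order.

    Assumption: the index is tab-delimited. Header rows and checksum or
    sample-name fields are skipped because they do not start with a URL scheme.
    """
    urls = []
    reader = csv.reader(index_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for field in row:
            value = field.strip()
            if value.startswith(URL_PREFIXES):
                urls.append(value)
    return urls


# Made-up index content for illustration only; real index files
# have their own columns and point at real data files.
SAMPLE_INDEX = (
    "FILE\tMD5\tSAMPLE\n"
    "ftp://example.org/reads_R1.fastq.gz\tabc123\tHG002\n"
    "ftp://example.org/reads_R2.fastq.gz\tdef456\tHG002\n"
)

if __name__ == "__main__":
    # With a real index, replace the StringIO with open("<index file>").
    for url in extract_urls(io.StringIO(SAMPLE_INDEX)):
        print(url)
```

The printed list can be saved to a file and handed to a download tool of your choice. Inspect the header row of the specific index you are using before relying on this, since the assumption about URL-bearing fields may not hold for every index.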