Jianlin Cheng, PhD
Department of Computer Science
University of Missouri, Columbia
Email: [email protected]
GSDB is a database of chromosome and genome 3D structures reconstructed from Hi-C data. Our goal is for this database to enable exploration of the dynamic architecture of Hi-C 3D structures across a variety of cells and tissues.
GSDB contains over 50,000 structures generated by 12 state-of-the-art Hi-C structure prediction algorithms for 32 Hi-C datasets, each available at varying resolutions.
- src: Contains the source code for the website.
- GSDB_Scripts: Contains the algorithms and the scripts used for data extraction, data normalization, and 3D structure generation.
- Database Data Info.xlsx: Contains more information about the Hi-C data and the 3D structure prediction tools
- ID_Generator.jar: Java executable that creates the GSDB ID. To run it, type in a terminal: java -jar ID_Generator.jar
For each algorithm, the table below lists the contact matrix input format it accepts and the input file name suffix (extension) used for 3D structure construction.
Algorithm | Input Format | GSDB Input filename suffix |
---|---|---|
LorDG | 3-column Matrix | _list.txt |
3DMax | 3-column Matrix | _list.txt |
MOGEN | 3-column Matrix | _list.txt |
Pastis | 3-column Matrix (bin1, bin2, IF) and mapping coordinates (chr, start_pos, end_pos, bin) | .n_contact, .cbins |
Chromosome3D | n x n Square Matrix | _matrix.txt |
HSA | 2-column bin position with n x n Square Matrix | _HSA.txt |
miniMDS | Chromosome, position, IF (chr, start_pos1, end_pos1, chr, start_pos2, end_pos2, IF) | .bed |
ShRec3D | n x n Square Matrix | _matrix.txt |
GEM | 2-column bin position with n x n Square Matrix | _HSA.txt |
ChromSDE | 3-column Matrix, and mapping coordinate | .n_contact, .cbins |
SIMBA3D | n x n Square Matrix | .npy |
InfMod3DGen | n x n Square Matrix | _matrix.txt |
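
The two most common formats in the table, the n x n square matrix and the 3-column (bin1, bin2, IF) list, carry the same information, so one can be derived from the other. The following is a minimal sketch of that conversion, not the code in GSDB_Scripts. The function names `square_to_triples` and `triples_to_square` are hypothetical, and the choices of 1-based bin indices, upper-triangle entries only, skipped zero contacts, and tab-separated output are assumptions that may differ from what a given algorithm expects.

```python
def square_to_triples(matrix, one_based=True):
    """Convert an n x n contact matrix into (bin1, bin2, IF) triples.

    Assumptions (not confirmed by GSDB): the matrix is symmetric, so only
    the upper triangle is emitted, and zero interaction frequencies are skipped.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("expected an n x n square matrix")
    offset = 1 if one_based else 0
    triples = []
    for i in range(n):
        for j in range(i, n):
            if matrix[i][j] != 0:
                triples.append((i + offset, j + offset, float(matrix[i][j])))
    return triples


def triples_to_square(triples, n, one_based=True):
    """Rebuild a symmetric n x n matrix from (bin1, bin2, IF) triples."""
    offset = 1 if one_based else 0
    matrix = [[0.0] * n for _ in range(n)]
    for bin1, bin2, freq in triples:
        i, j = bin1 - offset, bin2 - offset
        matrix[i][j] = freq
        matrix[j][i] = freq  # mirror into the lower triangle
    return matrix


# Toy 3-bin contact matrix (made-up values, for illustration only).
contacts = [
    [0.0, 5.0, 2.0],
    [5.0, 0.0, 3.0],
    [2.0, 3.0, 0.0],
]

triples = square_to_triples(contacts)
restored = triples_to_square(triples, n=3)

# One tab-separated line per contact, as a 3-column list file might look.
lines = ["{}\t{}\t{}".format(b1, b2, freq) for b1, b2, freq in triples]
```

Check the indexing convention and delimiter against the documentation of the target algorithm before using converted files as input.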
The executable software and the source code are distributed free of charge, as is, to any non-commercial user. The authors accept no liability for the performance of the program.