cmason-lab/unfog

UNFOG
Uncovering Nanopore's Fingerprints of Genomes

A fingerprint-based method for species identification using nanopore event space
data.
Inspired by the song-identification methods implemented by Shazam and the
open-source Dejavu project.

UNFOG first builds a database of fingerprints from the genomes in the input
FASTA files, then converts nanopore fast5 output into fingerprints in the same
way and compares them against that database.
After mapping, the program prints lists of genomes ordered by the number and
the percentage of reads mapping to each.
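
The workflow above can be illustrated with a minimal sketch in the style of Dejavu's peak-pair hashing. It is not UNFOG's actual code: the function names (`hash_peaks`, `build_database`, `best_genome`, `summarize`), the SHA-1 key layout, and the "most shared hashes wins" rule are all assumptions made for illustration. It also assumes spectrogram peaks have already been extracted as `(time_index, frequency_index)` pairs, and the defaults for `peak_fan` and `max_hash_time` simply reuse the values this README recommends for the flags of the same names.

```python
from __future__ import division

import hashlib
from collections import Counter, defaultdict


def hash_peaks(peaks, peak_fan=15, max_hash_time=300):
    """Pair each peak with up to peak_fan later peaks and hash each pair.

    peaks: iterable of (time_index, frequency_index) tuples (assumed input).
    Returns a set of SHA-1 hex digests, one per retained peak pair.
    """
    peaks = sorted(peaks)  # order by time so pairs always look forward
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + peak_fan]:
            dt = t2 - t1
            if dt <= max_hash_time:
                # The key depends only on the two frequencies and their
                # separation, so it is independent of absolute position.
                key = "{}|{}|{}".format(f1, f2, dt)
                hashes.add(hashlib.sha1(key.encode("utf-8")).hexdigest())
    return hashes


def build_database(genome_peaks):
    """Map each hash to the set of genomes whose fingerprints contain it."""
    database = defaultdict(set)
    for genome, peaks in genome_peaks.items():
        for h in hash_peaks(peaks):
            database[h].add(genome)
    return database


def best_genome(read_peaks, database):
    """Return the genome sharing the most hashes with a read, or None."""
    votes = Counter()
    for h in hash_peaks(read_peaks):
        votes.update(database.get(h, ()))
    return votes.most_common(1)[0][0] if votes else None


def summarize(reads, database):
    """Return (genome, read_count, percent_of_all_reads), most reads first."""
    counts = Counter()
    for peaks in reads:
        genome = best_genome(peaks, database)
        if genome is not None:
            counts[genome] += 1
    total = len(reads)
    return [(g, n, 100.0 * n / total if total else 0.0)
            for g, n in counts.most_common()]
```

Because each hash encodes only relative timing, a read taken from the middle of a genome can still share hashes with that genome's fingerprint. In this sketch the percentages are computed over all reads, including unmapped ones; whether UNFOG uses the same denominator is not stated here.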

*UNDER DEVELOPMENT*
Written in Python 2.7.

Dependencies that must be installed:
- h5py
- numpy
- scipy
- matplotlib
(Also uses the following from the Python standard library: __future__, re,
operator, os, math, glob, collections, cPickle, gzip, hashlib, multiprocessing,
sys, getopt)

Settings and input can be modified through flags:
--input_dir #a directory containing fast5 files of interest
--output_dir #a directory in which to store output
--genomes #a list of genomes for reference
--overlap #True/False to allow overlap between windows
--outlier_thresh #an outlier threshold for events (currently deprecated)
--nfft #Fast Fourier Transform block size (current recommendation 128)
--noverlap #size of overlaps between FFT blocks (current recommendation 64)
--fs #sample rate for the FFT (current recommendation 50)
--create_db #True/False to create the database of fingerprints
--num_processes #number of processes to use when creating a database
--min_peak_amplitude #for extracting peaks from FFT spectrograms (current recommendation 20)
--peak_fan #number of connected peaks to attempt to match (current recommendation 15)
--max_hash_time #max separation between two joined peaks (current recommendation 300)
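
Since `getopt` is among the standard-library modules listed above, the flags could be parsed roughly as follows. This is a sketch under assumptions, not UNFOG's actual parser: `parse_args` is a hypothetical helper, the integer and boolean conversions are guesses about each flag's type, and the defaults are just the recommendations given in the list.

```python
import getopt

# Long option names taken from the flag list; a trailing "=" tells getopt
# that the option expects a value.
LONG_OPTIONS = [
    "input_dir=", "output_dir=", "genomes=", "overlap=", "outlier_thresh=",
    "nfft=", "noverlap=", "fs=", "create_db=", "num_processes=",
    "min_peak_amplitude=", "peak_fan=", "max_hash_time=",
]

# Defaults mirror the recommendations in the flag list (assumed, not
# confirmed to be the program's built-in defaults).
DEFAULTS = {
    "nfft": 128, "noverlap": 64, "fs": 50,
    "min_peak_amplitude": 20, "peak_fan": 15, "max_hash_time": 300,
}

# Assumed value types for illustration only.
INT_OPTIONS = {"nfft", "noverlap", "fs", "num_processes",
               "min_peak_amplitude", "peak_fan", "max_hash_time"}
BOOL_OPTIONS = {"overlap", "create_db"}


def parse_args(argv):
    """Parse UNFOG-style long flags into a settings dictionary."""
    settings = dict(DEFAULTS)
    opts, _ = getopt.getopt(argv, "", LONG_OPTIONS)
    for flag, value in opts:
        name = flag[2:]  # strip the leading "--"
        if name in INT_OPTIONS:
            settings[name] = int(value)
        elif name in BOOL_OPTIONS:
            settings[name] = (value == "True")
        else:
            settings[name] = value
    return settings
```

Calling `parse_args(sys.argv[1:])` would accept both `--peak_fan 10` and `--peak_fan=10`, and an unrecognized flag raises `getopt.GetoptError`.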
