"BatAlign: An incremental method for accurate gapped alignment".
This is a README file for the usage of BatAlign.

Please go to http://compbio.ddns.comp.nus.edu.sg/~limjingq/BATALIGN/ to download the following files:

  1. Batalign.tar.gz (the tarball of the program)
  2. INPUT_READ_all_one_million_reads_datasets.tbz (a tarball of some 1-million-read datasets that were used for the manuscript)
  3. hg19.BatAlign.index.tbz (a tarball of the hg19 reference index, which can be used to align the reads in 2) with the program in 1))


INSTALL BatAlign
=-=-=-=-=-=-=-=-=
a) Download 1)
b) tar -zxvf 1)
c) Change directory into the top directory created by b), "batindel/"
d) Type "./configure", then "make", and finally "make copy"
e) batalign (the BatAlign binary) will be created in bin/
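
The install steps above can be sketched as one shell function. This is only a sketch: the function name install_batalign is hypothetical, and it assumes the tarball has already been downloaded and unpacks to "batindel/" as described in step c).

```shell
# Hypothetical helper: unpack the BatAlign tarball and build it.
# Assumes the tarball unpacks to "batindel/" (step c above).
install_batalign() {
  tarball="${1:-Batalign.tar.gz}"
  if [ ! -f "$tarball" ]; then
    echo "tarball not found: $tarball" >&2
    return 1
  fi
  tar -zxvf "$tarball" || return 1
  # Build in a subshell so the caller's working directory is unchanged.
  (
    cd batindel || exit 1
    ./configure && make && make copy
  )
}
```

Calling "install_batalign Batalign.tar.gz" from the download directory should leave the binary in batindel/bin/ if every build step succeeds.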

GETTING pre-built INDEX
=-=-=-=-=-=-=-=-=-=-=-=
a) Download 3)
b) tar -jxvf 3) (a .tbz file is normally a bzip2-compressed tarball, so -j is used here instead of -z)
c) "hg19.fa" is the string passed to the program (with -g) to identify this index
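
Because all index files have to sit in one directory, it can help to unpack the index tarball into a directory of its own. A minimal sketch follows; the function name unpack_index and the destination directory are hypothetical, and it relies on tar detecting the compression type by itself when reading (GNU tar and bsdtar both do this).

```shell
# Hypothetical helper: unpack the index tarball into a single directory.
# Usage: unpack_index TARBALL [DEST_DIR]
unpack_index() {
  tarball="$1"
  dest="${2:-.}"
  if [ ! -f "$tarball" ]; then
    echo "index tarball not found: $tarball" >&2
    return 1
  fi
  # -x without -z/-j lets tar detect the compression when reading.
  mkdir -p "$dest" && tar -xf "$tarball" -C "$dest"
}
```

For example, "unpack_index hg19.BatAlign.index.tbz /data/index/hg19" would put the index under /data/index/hg19, provided the tarball stores its files at the top level (not verified here).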

BUILDING INDEX
=-=-=-=-=-=-=
a) Have a fasta-formatted reference genome file ready (e.g. GENOME.fa)
b) Locate the script "build_indexX" in "batindel/bin"
c) Type "./build_indexX GENOME.fa" to build the necessary pairing data-structure, which is based on the FM-index.

USAGE
=-=-=-=
Single-end-reads
./bin/batalign -g INDEX -q INPUT -o OUTPUT

Paired-end-reads
./bin/batalign -g INDEX -q INPUT_left -q INPUT_right -o OUTPUT
Example: ./bin/batalign -g /data/index/hg19/hg19.fa -q CML_R1_left.fq -q CML_R2_right.fq -o CML.out.sam --threads 20

INDEX is the "hg19.fa" mentioned above. Make sure all index files reside in the same directory.
INPUT can be any fastq/fasta file from 2).
OUTPUT is an alignment file in SAM format.
BatAlign needs only the -g, -q and -o parameters described above to run in DEFAULT mode.
To run with multiple threads, use "--threads INT".
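
The single-end and paired-end forms differ only in the number of -q options. The sketch below prints the matching command line without running it, so it can be checked first. The function name batalign_cmd and its argument order are hypothetical; only -g, -q, -o and --threads come from this README.

```shell
# Hypothetical helper: print the batalign command for single- or paired-end input.
# Usage: batalign_cmd INDEX OUTPUT THREADS READS1 [READS2]
# Note: paths containing spaces are not quoted in the printed command.
batalign_cmd() {
  if [ "$#" -lt 4 ] || [ "$#" -gt 5 ]; then
    echo "usage: batalign_cmd INDEX OUTPUT THREADS READS1 [READS2]" >&2
    return 1
  fi
  index="$1"; output="$2"; threads="$3"
  shift 3
  cmd="./bin/batalign -g $index"
  # One -q per read file: one file for single-end, two for paired-end.
  for reads in "$@"; do
    cmd="$cmd -q $reads"
  done
  echo "$cmd -o $output --threads $threads"
}
```

With the arguments "/data/index/hg19/hg19.fa CML.out.sam 20 CML_R1_left.fq CML_R2_right.fq" it prints the same command as the paired-end example above.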

Built with...
=-=-=-=-=-=-=-
GNU automake v1.11.1, GNU autoconf v2.63, gcc v4.4.7.
Tested on Centos Release 6.6 (Final), Debian GNU/Linux 7
The CPU must support SSE2 instructions.
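
On Linux, SSE2 support can be checked before building by looking for the "sse2" flag in /proc/cpuinfo. This is a minimal sketch: the function name has_sse2 is hypothetical, and the file argument exists only so the check can be tried against a saved copy of cpuinfo.

```shell
# Hypothetical helper: succeed if a cpuinfo-style file lists the sse2 flag.
# Defaults to /proc/cpuinfo, which is available on Linux only.
has_sse2() {
  cpuinfo="${1:-/proc/cpuinfo}"
  # -w matches "sse2" as a whole word, so "ssse3" or "sse4_2" do not count.
  grep -qw 'sse2' "$cpuinfo" 2>/dev/null
}
```

Usage: has_sse2 && echo "SSE2 available" || echo "SSE2 not found"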

Thank you for your patience.