Skip to content

Mapping RNA Seq Reads

Benilton Carvalho edited this page Apr 7, 2014 · 20 revisions

Setup files

All your configuration files are stored in the directory mentioned below:

config_dir=~/Mapping/config

The genome reference FASTA file:

genome=${config_dir}/dmel56.fa

The reference is indexed:

geome_index=${config_dir}/dmel56

If needed, we can create the index for the genome reference as shown below:

bowtie-build ${genome} ${genome_index}

The transcript annotation:

transcript_annotation=${config_dir}/Drosophila_melanogaster.BDGP5.75.gtf

Two sample files are shown below:

reads1=data/S2-DRSC-14-PE/081121_S2-DRSC-14-1_1_sequence.txt

reads2=data/S2-DRSC-14-PE/081121_S2-DRSC-14-1_2_sequence.txt

We can operate on these files:

du -h ${reads1}

We will save the outputs at:

output=sample1

Mapping RNA-Seq Reads to the Genome

Now, we call TopHat. The call below saves the transcriptome index to ${config_dir}.

tophat -p 2 -G ${transcript_annotation} --transcriptome-index ${config_dir} -o ${output} --no-novel-juncs ${genome_index} ${reads1},${reads2}

Now, we sort the output BAM file.

samtools sort ${output}/accepted_hits.bam ${output}/accepted_hits.sorted

And create an index to speed up access.

samtools index ${output}/accepted_hits.sorted.bam

Mapping RNA-Seq Reads to the Transcriptome

The reference transcriptome:

transcriptome=${config_dir}/dmeltran56.fa

Again, if needed, create and index to the FASTA file associated to the transcriptome.

bwa index ${transcriptome}

Because we already set the output directory and input files, we can simply call BWA. We align each of the strands independently, as shown below. Then, combine the the SAI files into a SAM file.

bwa aln -t 2 ${reference} ${reads1} > ${output}/samplet1_1.sai bwa aln -t 2 ${reference} ${reads1} > ${output}/samplet1_1.sai bwa aln -t 2 ${reference} ${reads1} > ${output}/samplet1_1.sai

bla

bwa aln -t 2 ${reference} ${reads1} > ${output}/samplet1_1.sai

bwa aln -t 2 ${reference} ${reads2} > ${output}/samplet1_2.sai

bwa sample ${reference} ${output}/samplet1_1.sai ${output}/samplet1_2.sai ${reads1} ${reads2} > ${output}/sample1t.sam

The BWA output is in the SAM format. We will now convert it to its binary version (BAM).

samtools import ${reference}.fai ${output}/sample1t.sam ${output}/sample1t.unsorted.bam

The BAM file is now sorted and indexed.

samtools sort ${output}/sample1t.unsorted.bam ${output}/sample1t

samtools index ${output}/sample1t.bam

Clone this wiki locally