-
Notifications
You must be signed in to change notification settings - Fork 0
Mapping RNA Seq Reads
To map the RNA-Seq Reads as described in this document, you'll need access to a terminal where you can use samtools
, bwa
and tophat
. For example, an SSH terminal or Shell in a Box.
All your configuration files are stored in the directory mentioned below:
config_dir=~/Mapping/config
The genome reference FASTA file:
genome=${config_dir}/dmel56.fa
The reference is indexed:
genome_index=${config_dir}/dmel56
If needed, we can create the index for the genome reference as shown below:
bowtie-build ${genome} ${genome_index}
The transcript annotation:
transcript_annotation=${config_dir}/Drosophila_melanogaster.BDGP5.75.gtf
Two sample files are shown below:
reads1=data/S2-DRSC-14-PE/081121_S2-DRSC-14-1_1_sequence.txt
reads2=data/S2-DRSC-14-PE/081121_S2-DRSC-14-1_2_sequence.txt
We can operate on these files:
du -h ${reads1}
We will save the outputs at:
output=sample1
Now, we call TopHat. The call below saves the transcriptome index to ${config_dir}
.
tophat -p 2 -G ${transcript_annotation} --transcriptome-index ${config_dir} -o ${output} --no-novel-juncs ${genome_index} ${reads1},${reads2}
Now, we sort the output BAM file.
samtools sort ${output}/accepted_hits.bam ${output}/accepted_hits.sorted
And create an index to speed up access.
samtools index ${output}/accepted_hits.sorted.bam
The reference transcriptome:
transcriptome=${config_dir}/dmeltran56.fa
Again, if needed, create and index to the FASTA file associated to the transcriptome.
bwa index ${transcriptome}
Because we already set the output directory and input files, we can simply call BWA. We align each of the strands independently, as shown below. Then, combine the the SAI files into a SAM file. Here, we use bwa aln
because our reads are short (37bp).
bwa aln -t 2 ${transcriptome} ${reads1} > ${output}/samplet1_1.sai
bwa aln -t 2 ${transcriptome} ${reads2} > ${output}/samplet1_2.sai
bwa sampe ${transcriptome} ${output}/samplet1_1.sai ${output}/samplet1_2.sai ${reads1} ${reads2} > ${output}/sample1t.sam
The BWA output is in the SAM format. We will now convert it to its binary version (BAM).
samtools view -bS ${output}/sample1t.sam > ${output}/sample1t.unsorted.bam
The BAM file is now sorted and indexed.
samtools sort ${output}/sample1t.unsorted.bam ${output}/sample1t.sorted
samtools index ${output}/sample1t.sorted.bam