Right now, we are using SortMeRNA to align contigs against the complete reference database, and we are outputting all alignments. This can lead to very big SAM files, because highly conserved contigs will have alignments against almost every reference sequence. In the next sub-step, when reading that SAM file with Python, we load all alignments of the same contig into memory at once, which can lead to huge memory usage.
We can imagine several complementary solutions to reduce this RAM and disk space usage:
optimise batched SAM reading in Python by keeping only the relevant fields of each alignment in memory (see the first sketch after this list)
store alignments in a BAM file instead of a SAM file. Then we'll probably need a combination of samtools and Python libraries such as pysam to read it properly (see the second sketch after this list).
rethink our scaffolding strategy so that we don't have to output all possible alignments. This will probably improve memory usage the most, but it will change the algorithm and needs to be thought through well in advance.
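For the first option, here is a minimal sketch of what a leaner batch reader could look like. It assumes the SortMeRNA SAM output groups all alignments of a contig on consecutive lines, and that the downstream step only needs the reference name, position and alignment score per alignment; those field choices are assumptions to adapt to what the scaffolding step really consumes.

```python
from itertools import groupby

def iter_contig_batches(sam_path):
    """Yield (contig, [(ref_name, pos, score), ...]) one contig at a time.

    Assumes alignments of the same contig sit on consecutive lines, so we
    never hold more than one contig's alignments in memory, and we keep
    only three small fields per alignment instead of the full SAM record.
    """
    def parse(line):
        fields = line.rstrip("\n").split("\t")
        qname, rname, pos = fields[0], fields[2], int(fields[3])
        # Alignment score lives in the optional AS:i: tag when present.
        score = next((int(f[5:]) for f in fields[11:] if f.startswith("AS:i:")), None)
        return qname, (rname, pos, score)

    with open(sam_path) as handle:
        records = (parse(line) for line in handle if not line.startswith("@"))
        for qname, group in groupby(records, key=lambda rec: rec[0]):
            yield qname, [aln for _, aln in group]

# Example usage with a placeholder file name:
for contig, alignments in iter_contig_batches("contigs_vs_refdb.sam"):
    print(contig, len(alignments))
```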
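For the BAM option, a possible samtools + pysam combination could look like the sketch below. It assumes a samtools name-sort upstream so alignments stay grouped by contig, and pysam on the Python side; the file names and the exact command are placeholders to check against our pipeline.

```python
import pysam  # third-party: pip install pysam

# Upstream, the SAM file would be converted and name-sorted with samtools,
# e.g. (assumed command, to be validated):
#   samtools sort -n -O bam -o contigs_vs_refdb.bam contigs_vs_refdb.sam

with pysam.AlignmentFile("contigs_vs_refdb.bam", "rb") as bam:
    for aln in bam:
        if aln.is_unmapped:
            continue
        # Standard pysam accessors on AlignedSegment; reference_start is 0-based.
        print(aln.query_name, aln.reference_name, aln.reference_start)
```

Note that BAM mainly helps the disk-space side (it's compressed); peak RAM per contig batch stays the same unless we also prune fields as in the first sketch.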