safe guard for memory error #6
…OUNT to catch potentially longer overlaps if available
…orting reads, fix for the last read
I used this new code for the NIST Trio Child assembly. It performs on par with the previous version, with much better I/O performance and less wasted computation during the consensus stage. I am merging the code.
No objection here. I don't fully understand the change, so I didn't comment.
The change reduces the amount of sequence data that LA4Falcon has to fetch and send to the consensus code. The basic problem is that when one sequences large or very repetitive genomes, there are sometimes many false-positive hits. For example, you can find reads that have more than 50,000 reads aligned to them. They can't all be from the same genomic location. The idea is to use a simple predictor (overlap length minus overhang length is used here) to select the hits that are more likely to come from the same genomic location, and to limit the total amount of sequence to fetch and output. This may solve some I/O bandwidth issues, and if the predictor is good, the results could in theory be even better. With this change, I was able to have 120 concurrent I/O processes reading the …

Also, ideally, I would use a priority queue to keep the top n hits. That could be done later. The flat array + qsort solves the current problem.
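For readers following along, here is a minimal sketch of the flat array + qsort selection described in the comment above. It is not the code from this PR: the `Hit` struct, its field names, and `select_top_hits` are hypothetical names chosen for illustration, and the score is simply overlap length minus overhang length as stated above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one alignment hit. The field names are
   illustrative assumptions, not DALIGNER's actual structures. */
typedef struct {
    int read_id;      /* id of the aligned read */
    int overlap_len;  /* length of the aligned (overlapping) region */
    int overhang_len; /* length of the unaligned overhang */
} Hit;

/* Simple predictor from the discussion: overlap length - overhang length. */
static int hit_score(const Hit *h) {
    return h->overlap_len - h->overhang_len;
}

/* qsort comparator: higher score sorts first (descending order). */
static int cmp_hit_desc(const void *a, const void *b) {
    int sa = hit_score((const Hit *)a);
    int sb = hit_score((const Hit *)b);
    return (sa < sb) - (sa > sb);
}

/* Sort the flat array in place, best hits first, and return how many
   leading entries the caller should fetch and output (at most max_keep). */
size_t select_top_hits(Hit *hits, size_t n_hits, size_t max_keep) {
    assert(hits != NULL || n_hits == 0);
    if (n_hits > 1) {
        qsort(hits, n_hits, sizeof(Hit), cmp_hit_desc);
    }
    return n_hits < max_keep ? n_hits : max_keep;
}
```

A caller would collect all hits for one read into the array, call `select_top_hits(hits, n_hits, max_keep)`, and then fetch sequences only for the first returned-count entries. Sorting the whole array costs O(n log n) per read; a bounded min-heap (the priority queue mentioned above) would bring that down to O(n log k) for k kept hits, at the price of more code.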
Oh, the MIN macro was not used in the end. We can remove it.
Nice.
Maybe I'll try that myself someday.
…are done summary: The graph-to-layout code adds a new rule to reduce mis-assembly; see PacificBiosciences/FALCON#179. Initial raw read alignment hits are sorted by overlap length to get error correction reads more efficiently. See https://github.com/PacificBiosciences/DALIGNER/pull/, PacificBiosciences/DALIGNER#6
…vement Used this new modification for NIST Trio Child assembly. It works on-par with the previous version but with much better I/O performance and less useless computation during the consensus stage. I am merging the code.
…evelop * commit '18f2fb2e33f37dc9a61d5584845041a7e6925069': stop including the dazzdb repo
No description provided.