Optimization of wfdb.io.annotation.field2bytes function #406
Hi,
I noticed that writing an annotation file was slow for a file with many annotations.
Line-profiling the writing functions showed that the `field2bytes` function was taking up most of the execution time. The problem turned out to be this line:
typecode = ann_label_table.loc[ann_label_table["symbol"] == value[1], "label_store"].values[0]
The whole `ann_label_table` DataFrame was being filtered for every input value of `field2bytes`, which was pretty slow. Instead, I added a dictionary that maps every symbol to its corresponding label, which is much faster (see the time profiler output below).

Time profilers
Current version
New version
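
For readers without the profiler output, here is a minimal sketch of the change described above. It is not the actual patch: the small `ann_label_table` is a stand-in containing only the `symbol` and `label_store` columns with a few illustrative rows, and the names `symbol_to_label_store`, `typecode_by_filter` and `typecode_by_dict` are made up for this example.

```python
import pandas as pd

# Hypothetical stand-in for wfdb's ann_label_table: only the two columns
# used by the lookup, with a handful of illustrative rows.
ann_label_table = pd.DataFrame(
    {
        "label_store": [1, 2, 3, 5],
        "symbol": ["N", "L", "R", "V"],
    }
)


def typecode_by_filter(symbol):
    """Original approach: build a boolean mask over the whole table per call."""
    return ann_label_table.loc[
        ann_label_table["symbol"] == symbol, "label_store"
    ].values[0]


# Proposed approach: build the symbol -> label_store mapping once,
# then every lookup is a constant-time dictionary access.
# (The dictionary name is an assumption made for this sketch.)
symbol_to_label_store = dict(
    zip(ann_label_table["symbol"], ann_label_table["label_store"])
)


def typecode_by_dict(symbol):
    """Dictionary-based lookup; returns the same value as the filter."""
    return symbol_to_label_store[symbol]


# Both approaches agree on every symbol in the stand-in table.
for sym in ann_label_table["symbol"]:
    assert typecode_by_filter(sym) == typecode_by_dict(sym)
```

One behavioural difference to keep in mind: for a symbol that is not in the table, the filter version raises `IndexError` (indexing an empty array), while the dictionary version raises `KeyError`.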