Fix reading formats 8, 310, and 311 #327
Merged
In each branch, the calls to _rd_dat_signals with and without 'no_file=True, sig_data=sig_data' were otherwise identical. sig_data is ignored if no_file is false, so this is equivalent to simply passing 'no_file=no_file, sig_data=sig_data'. Furthermore, since _rd_dat_signals takes a huge number of arguments, convert all of them to keyword style to avoid confusion.
All four of the calls to _rd_segment were identical apart from the 'no_file' and 'sig_data' arguments. Rearrange the code so there is only one call to _rd_segment, to avoid redundancy. Furthermore, since _rd_segment takes a huge number of arguments, convert all of them to keyword style to avoid confusion.
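A minimal sketch of the pattern these two commits describe: when an argument is ignored unless a flag is set, the per-branch calls can be collapsed into one keyword-style call. `read_signals` is a hypothetical stand-in, not the real `_rd_dat_signals` or `_rd_segment`, whose actual parameter lists are much longer.

```python
def read_signals(file_name, sampfrom, sampto, no_file=False, sig_data=None):
    """Hypothetical reader: sig_data is only consulted when no_file is true."""
    if no_file:
        return list(sig_data[sampfrom:sampto])
    # Placeholder for real file I/O.
    return "read %s[%d:%d]" % (file_name, sampfrom, sampto)


def branching_call(no_file, sig_data):
    # Before: two nearly identical positional calls, one per branch.
    if no_file:
        return read_signals("rec.dat", 0, 3, True, sig_data)
    return read_signals("rec.dat", 0, 3)


def single_call(no_file, sig_data):
    # After: one call, with every argument passed by keyword.
    return read_signals(file_name="rec.dat", sampfrom=0, sampto=3,
                        no_file=no_file, sig_data=sig_data)
```

Both helpers return the same result for either value of `no_file`, which is what makes the simplification safe.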
This should be a list containing the initial sample value for each signal; this is required in order to correctly read format-8 dat files.
In signal format 8, each sample is stored as an 8-bit signed difference from the previous sample. This means that after reading the raw byte values, they must be translated to absolute sample values by calling cumsum() and adding the initial value.
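The translation step can be sketched roughly as follows for a single signal. This uses `itertools.accumulate` from the standard library as a stand-in for `cumsum()`; the function name `decode_format8` and the single-signal layout are assumptions made for illustration, not the code in this pull request.

```python
from itertools import accumulate


def decode_format8(raw_bytes, init_value):
    """Turn format-8 difference bytes into absolute sample values."""
    # Reinterpret each unsigned byte (0..255) as a signed 8-bit difference.
    diffs = [b - 256 if b > 127 else b for b in raw_bytes]
    # Running total of the differences, offset by the initial value.
    return [init_value + total for total in accumulate(diffs)]


raw = bytes([1, 2, 0xFF, 0xFE])  # differences +1, +2, -1, -2
samples = decode_format8(raw, init_value=100)
print(samples)
```

Because every value depends on the running total of all earlier differences, decoding from the middle of a file without those earlier bytes yields values shifted by an unknown offset.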
In formats 310 and 311, each block of three samples is written as four bytes. _rd_dat_signals will retrieve the minimum range of bytes (as determined by _dat_read_params and _required_byte_num) that are needed in order to decode the desired samples; thus, the data passed to _blocks_to_samples may include an incomplete block at the end.

The previous implementation of _blocks_to_samples was meant to pad the input data to a multiple of four bytes. However, this logic was wrong: added_samps was always set to zero, so the intended extra bytes were not appended, and (if the lack of extra bytes didn't cause an error) the wrong number of samples was returned to the caller.

In fact, the subsequent statements for decoding blocks into samples already worked correctly for an unpadded input array (since each input slice is correctly truncated to the length of the output slice). So remove the padding logic entirely.
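To illustrate why padding is unnecessary, here is a minimal sketch that decodes packed 10-bit samples when the final block is cut short. It assumes the format-311 layout as described in the WFDB format documentation (three 10-bit two's-complement samples packed into a little-endian 32-bit word, lowest bits first). `decode_fmt311` is a hypothetical helper that works sample by sample; it is not the array-slicing approach used in _blocks_to_samples.

```python
def decode_fmt311(data, n_samples):
    """Decode n_samples 10-bit samples from format-311-style blocks.

    Assumed layout: each 4-byte block is a little-endian 32-bit word
    holding three samples in bits 0-9, 10-19, and 20-29; the top two
    bits are unused.
    """
    samples = []
    for k in range(n_samples):
        block, pos = divmod(k, 3)
        # A short final slice is fine: missing high bytes count as zero.
        word = int.from_bytes(data[4 * block:4 * block + 4], "little")
        value = (word >> (10 * pos)) & 0x3FF
        # Sign-extend from 10 bits.
        if value >= 0x200:
            value -= 0x400
        samples.append(value)
    return samples


# One complete block holding 1, -1, 2, then a truncated block holding 3.
block0 = (1 & 0x3FF) | ((-1 & 0x3FF) << 10) | ((2 & 0x3FF) << 20)
data = block0.to_bytes(4, "little") + (3).to_bytes(4, "little")[:2]
decoded = decode_fmt311(data, 4)
print(decoded)
```

The last sample only needs the first two bytes of its block, so a six-byte input is enough to recover all four samples.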
The record "binformats" contains one signal in each of the ten WFDB binary formats (8, 16, 61, 80, 160, 212, 310, 311, 24, and 32). In this record, sample j of signal i is equal to:

    (i + 16843019 * j) % ((1 << adcres) - 1) + 1 - (1 << (adcres - 1))

Note that the length of the record is 499 samples, so each of the bit-packed data files ends with an incomplete data block.

Use this record to test that it is possible to read all of the formats correctly, including when we skip one or two samples from the start and/or end of the record. (Skipping samples is expected to give incorrect results for format 8, so that signal is not required to match.) We do not test writing, since not all formats are currently supported by wr_dat_file.
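The formula for the expected sample values can be written out as a small helper, for example to regenerate the reference data in a test. This is a sketch: `expected_sample` is a hypothetical name, and `adcres` is assumed to be the signal's ADC resolution in bits.

```python
def expected_sample(i, j, adcres):
    """Expected value of sample j of signal i, per the formula above."""
    return (i + 16843019 * j) % ((1 << adcres) - 1) + 1 - (1 << (adcres - 1))


# Expected values for a signal with index 0 and 8-bit resolution,
# over the 499 samples in the record.
reference = [expected_sample(0, j, 8) for j in range(499)]
print(reference[:2], min(reference), max(reference))
```

For an n-bit resolution the formula produces values from 1 - 2^(n-1) to 2^(n-1) - 1, so the most negative n-bit value never appears.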
bemoody force-pushed the more-signal-fmts branch from d16dd96 to 14048dd on September 24, 2021 19:22
thanks @bemoody, looks good to me.
Format 8 (not to be confused with format 80) is a format in which each sample is stored as an 8-bit difference from the previous sample. This was handled incorrectly; for example:
This format is rarely used (and should not be used) nowadays, but it remains supported by WFDB, so handling it incorrectly will give misleading results. Note that `wr_dat_file` doesn't support writing this format and I'm inclined to keep it that way.

Formats 310 and 311 are two different formats that store 10-bit samples as four-byte blocks of three samples each. These formats were handled correctly for the most part, but they would crash if `sampto` was not divisible by 3; for example:

`wr_dat_file` doesn't support writing either of these formats, but it would be worth adding them in the future.