-
Notifications
You must be signed in to change notification settings - Fork 302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read multi-frequency multi-segment records #331
Conversation
e746de3
to
c4a1257
Compare
This was a straightforward rebase and the changes themselves I think are fairly straightforward, but it's been a while since I looked at this code, so careful reviews would be welcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good, thanks for the new test cases. Please apply formatting before merge.
We really should rename the tests so that we can more easily see the various scenarios being covered.
@@ -1152,7 +1162,16 @@ def multi_to_single(self, physical, return_res=64): | |||
reference_fields[field][ch] = item_ch | |||
# mismatch case | |||
elif reference_fields[field][ch] != item_ch: | |||
if physical: | |||
if field == 'samps_per_frame': | |||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use fstrings please.
tests/test_record.py
Outdated
physical=False, smooth_frames=False) | ||
|
||
np.testing.assert_array_equal(sig, sig_target) | ||
assert record.__eq__(record_named) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does assert record == record_named
work? Also, please add a comment above each rdrecord call in this test to explain the intention/difference.
Does `assert record == record_named` work?
It certainly should and is more readable. I'd go further than that and
use self.assertEqual instead.
The Record.__eq__ method is a bit messy though for two reasons: the
verbose flag and the use of np.testing.assert_array_equal. Both are
useful for testing but break the standard Python API. Calling
__eq__ really shouldn't have any side effects, and should just return a
boolean to say if the two objects are equal. It's a mess and partly I
blame numpy for making it a mess.
So what I actually think would be better is to define a method
Record.assert_equal that does the verbose thing. Possibly this can be
hooked into unittest via addTypeEqualityFunc.
Also, please add a comment above each rdrecord call in this test to
explain the intention/difference.
Will do. Thanks.
|
Since commit 21764cf, the optional parameter to wfdb.io.record.BaseRecord.check_field is called "required_channels", not "channels". The semantics appear equivalent. Several callers were still referring to the old name "channels", while most callers simply use positional arguments. Change all callers to use positional arguments for consistency.
When reading a multi-segment record, the samps_per_frame attribute must be copied from the layout segment into the resulting flattened record. (Unlike most other signal attributes, samples per frame must be uniform for a particular signal. The layout header must be used to determine the correct number of samples per frame for signals that are not present in the selected segments.)
When reading a multi-segment record, the multi_to_single function is used to stitch the segments together into a virtual single-segment record. To enable this to work in expanded (non-smooth-frames) mode, we want to combine the 'e_p_signal' or 'e_d_signal' arrays from the individual segments, rather than 'p_signal' or 'd_signal'.
When reading a multi-segment, multi-frequency record, we want to have the option of reading each signal at its original sampling frequency, which requires using smooth_frames=False. Previously this simply wasn't allowed, either with or without multi-to-single conversion. To do this, we need to ensure each segment is loaded in the appropriate (smooth or non-smooth) mode (which formerly would have failed if certain segments *didn't* contain multiple samples per frame.) After loading the segments, we must invoke multi_to_single, if desired, in the appropriate mode.
This test case uses an excerpt of record mimicdb/041/, which is a fixed-layout record with three signals at 500 Hz (four samples per frame) and four signals at 125 Hz (one sample per frame).
Modify the existing test case test_multi_variable_c to check that we can read a single-frequency variable-layout record using smooth_frames=False (which should retrieve the same data as smooth_frames=True, but represented as a list of arrays rather than a single array.)
c4a1257
to
f1f805b
Compare
Rebased to apply black formatting. Here's how:
Commit d25923b became a no-op and was therefore automatically dropped. |
A multi-segment record can contain signals with multiple sampling frequencies, as in the MIMIC DB or the upcoming MIMIC-IV Waveform DB.
By default,
rdrecord
will convert the input to a uniform 2D array of samples; in the case of multi-frequency records it downsamples all signals to the frame frequency ("smooth frames"), and in the case of multi-segment records it stitches the input arrays together into one ("multi-to-single"). However, both these options can be turned off.In particular, smooth frames mode is simply not a good idea; when actually analyzing a multi-frequency record we need to be able to read each signal at its original sampling frequency or better.
Currently,
rdrecord
disallows this:It also fails in "non-multi-to-single" mode:
These changes fix both cases.
This is based on and depends on pull #313 (it might possibly work for mimicdb without those changes, FSVO "work", but both sets of changes are definitely needed for MIMIC-IV.)