Read multi-frequency multi-segment records #331

bemoody · 2021-10-13T16:25:15Z

A multi-segment record can contain signals with multiple sampling frequencies, as in the MIMIC DB or the upcoming MIMIC-IV Waveform DB.

By default, rdrecord will convert the input to a uniform 2D array of samples; in the case of multi-frequency records it downsamples all signals to the frame frequency ("smooth frames"), and in the case of multi-segment records it stitches the input arrays together into one ("multi-to-single"). However, both these options can be turned off.

In particular, smooth frames mode is simply not a good idea; when actually analyzing a multi-frequency record we need to be able to read each signal at its original sampling frequency or better.

Currently, rdrecord disallows this:

>>> r = wfdb.rdrecord('041', pn_dir='mimicdb/041', sampto=10, smooth_frames=False)
...
ValueError: This package version cannot expand all samples when reading multi-segment records. Must enable frame smoothing.

It also fails in "non-multi-to-single" mode:

>>> r = wfdb.rdrecord('041', pn_dir='mimicdb/041', sampto=10, smooth_frames=False, m2s=False)
...
ValueError: This package version cannot expand all samples when reading multi-segment records. Must enable frame smoothing.

These changes fix both cases.

This is based on and depends on pull #313 (it might possibly work for mimicdb without those changes, FSVO "work", but both sets of changes are definitely needed for MIMIC-IV.)

bemoody · 2022-04-25T15:59:23Z

This was a straightforward rebase and the changes themselves I think are fairly straightforward, but it's been a while since I looked at this code, so careful reviews would be welcome.

cx1111

Look good, thanks for the new test cases. Please apply formatting before merge.

We really should rename the tests so that we can more easily see the various scenarios being covered.

cx1111 · 2022-04-29T21:01:08Z

wfdb/io/record.py

@@ -1152,7 +1162,16 @@ def multi_to_single(self, physical, return_res=64):
                            reference_fields[field][ch] = item_ch
                        # mismatch case
                        elif reference_fields[field][ch] != item_ch:
-                            if physical:
+                            if field == 'samps_per_frame':
+                                raise ValueError(


Use fstrings please.

cx1111 · 2022-04-29T21:06:08Z

tests/test_record.py

+                                     physical=False, smooth_frames=False)
+
+        np.testing.assert_array_equal(sig, sig_target)
+        assert record.__eq__(record_named)


Does assert record == record_named work? Also, please add a comment above each rdrecord call in this test to explain the intention/difference.

bemoody · 2022-04-29T21:46:38Z

Does `assert record == record_named` work?

It certainly should and is more readable. I'd go further than that and use self.assertEqual instead. The Record.__eq__ method is a bit messy though for two reasons: the verbose flag and the use of np.testing.assert_array_equal. Both are useful for testing but break the standard Python API. Calling __eq__ really shouldn't have any side effects, and should just return a boolean to say if the two objects are equal. It's a mess and partly I blame numpy for making it a mess. So what I actually think would be better is to define a method Record.assert_equal that does the verbose thing. Possibly this can be hooked into unittest via addTypeEqualityFunc.

Also, please add a comment above each rdrecord call in this test to explain the intention/difference.

Will do. Thanks.

Since commit 21764cf, the optional parameter to wfdb.io.record.BaseRecord.check_field is called "required_channels", not "channels". The semantics appear equivalent. Several callers were still referring to the old name "channels", while most callers simply use positional arguments. Change all callers to use positional arguments for consistency.

When reading a multi-segment record, the samps_per_frame attribute must be copied from the layout segment into the resulting flattened record. (Unlike most other signal attributes, samples per frame must be uniform for a particular signal. The layout header must be used to determine the correct number of samples per frame for signals that are not present in the selected segments.)

When reading a multi-segment record, the multi_to_single function is used to stitch the segments together into a virtual single-segment record. To enable this to work in expanded (non-smooth-frames) mode, we want to combine the 'e_p_signal' or 'e_d_signal' arrays from the individual segments, rather than 'p_signal' or 'd_signal'.

When reading a multi-segment, multi-frequency record, we want to have the option of reading each signal at its original sampling frequency, which requires using smooth_frames=False. Previously this simply wasn't allowed, either with or without multi-to-single conversion. To do this, we need to ensure each segment is loaded in the appropriate (smooth or non-smooth) mode (which formerly would have failed if certain segments *didn't* contain multiple samples per frame.) After loading the segments, we must invoke multi_to_single, if desired, in the appropriate mode.

This test case uses an excerpt of record mimicdb/041/, which is a fixed-layout record with three signals at 500 Hz (four samples per frame) and four signals at 125 Hz (one sample per frame).

Modify the existing test case test_multi_variable_c to check that we can read a single-frequency variable-layout record using smooth_frames=False (which should retrieve the same data as smooth_frames=True, but represented as a list of arrays rather than a single array.)

bemoody · 2022-05-03T20:41:49Z

Rebased to apply black formatting. Here's how:

rm -f .git/info/attributes
git checkout multisegment-multifrequency
git rebase 1e695017fa4722c8008e02ad514086ca8eb77d76
git config --local filter.black.clean 'black -tpy37 -l80 -q -'
git config --local filter.black.smudge 'black -tpy37 -l80 -q -'
echo '*.py filter=black' > .git/info/attributes
git rebase -Xrenormalize 3b408f23f50889c7d1c6fee9329c76b739b9f667
rm -f .git/info/attributes

Commit d25923b became a no-op and was therefore automatically dropped.

bemoody changed the title ~~Read multi-frequency multi-segment records~~ [WIP] Read multi-frequency multi-segment records Oct 27, 2021

bemoody force-pushed the multisegment-multifrequency branch from e746de3 to c4a1257 Compare April 25, 2022 15:52

bemoody changed the title ~~[WIP] Read multi-frequency multi-segment records~~ Read multi-frequency multi-segment records Apr 25, 2022

cx1111 approved these changes Apr 29, 2022

View reviewed changes

Benjamin Moody added 7 commits May 3, 2022 16:01

multi_to_single: rearrange for readability.

d330a4a

Test reading fixed-layout multi-frequency signals.

90b8e3b

This test case uses an excerpt of record mimicdb/041/, which is a fixed-layout record with three signals at 500 Hz (four samples per frame) and four signals at 125 Hz (one sample per frame).

bemoody force-pushed the multisegment-multifrequency branch from c4a1257 to f1f805b Compare May 3, 2022 20:40

Benjamin Moody added 2 commits May 3, 2022 16:45

multi_to_single: reformat error message as f-string.

f2044a9

test_multi_fixed_d: add comments and use assertEqual.

4f06a06

tompollard merged commit efca603 into master May 4, 2022

tompollard deleted the multisegment-multifrequency branch May 4, 2022 14:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Read multi-frequency multi-segment records #331

Read multi-frequency multi-segment records #331

bemoody commented Oct 13, 2021

bemoody commented Apr 25, 2022

cx1111 left a comment

cx1111 Apr 29, 2022

cx1111 Apr 29, 2022

bemoody commented Apr 29, 2022 via email

bemoody commented May 3, 2022

Read multi-frequency multi-segment records #331

Read multi-frequency multi-segment records #331

Conversation

bemoody commented Oct 13, 2021

bemoody commented Apr 25, 2022

cx1111 left a comment

Choose a reason for hiding this comment

cx1111 Apr 29, 2022

Choose a reason for hiding this comment

cx1111 Apr 29, 2022

Choose a reason for hiding this comment

bemoody commented Apr 29, 2022 via email

bemoody commented May 3, 2022