
Additional support for MFF #8038

Open · ephathaway opened this issue Jul 21, 2020 · 39 comments

@ephathaway (Contributor) commented Jul 21, 2020

On behalf of the Brain Electrophysiology Laboratory (BEL), we would like to add support for all of the different flavors of MFF. The mne.io.egi module can be used to read in raw MFF files, but we would also like to be able to read MFF segmented files (analogous to the Epochs class) and MFF averaged files (analogous to the Evoked class). In addition, we would like to be able to convert BaseRaw, Epochs, or Evoked objects to continuous, segmented, and averaged MFF files, respectively, and write these as output files. Finally, we would like to be able to read and store a wider range of event track information from MFF files. MFF event track XML files contain several attributes for each event, such as label, duration, and response, that are not currently captured when reading in raw MFF files.

To address these requests, we propose using mffpy, a Python MFF reader/writer created by BEL for handling MFF inputs and outputs. This package can read and write each of the different flavors of MFF as well as parse individual XML files within an MFF directory. Our general proposal is to make mffpy a dependency of MNE and restructure the mne.io.egi module to read MFF files with mffpy and then convert them to the appropriate object (Raw, Epochs, or Evoked). We also want to add a module that takes Raw, Epochs, and Evoked objects and converts them to an mffpy Writer object so that they can be written out as MFF files. As for event tracks, we are able to fully parse event track XML files with mffpy. We may need to create a new type of MFF events class to store this information, as event info currently appears to be stored in a simple array with relative start time and event type.
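For illustration, reading an MFF with mffpy might look roughly like the minimal sketch below (based on the mffpy README; the exact attribute and method names such as sampling_rates and get_physical_samples_from_epoch, as well as the units of the returned samples, should be checked against the installed mffpy version):

import mffpy
import mne

# Open the MFF directory with mffpy (attribute/method names assumed from the README).
reader = mffpy.Reader('recording.mff')
sfreq = reader.sampling_rates['EEG']  # sampling rate in Hz
# Pull the physical samples of the first recording block; returns {'EEG': (array, t0), ...}.
data, t0 = reader.get_physical_samples_from_epoch(reader.epochs[0])['EEG']

# Wrap the samples in an MNE Raw object (channel names and types simplified here).
info = mne.create_info([f'E{i + 1}' for i in range(data.shape[0])], sfreq, 'eeg')
raw = mne.io.RawArray(data * 1e-6, info)  # assuming the samples come back in microvolts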

We have developers at BEL that would be happy to implement these features. We are also happy to answer any questions about mffpy and how it could increase the functionality of MNE.

@ephathaway (Contributor Author)

@jusjusjus

@drammock (Member)

> We may need to create a new type of MFF events class to store this information, as it appears that event info is currently stored in a simple array with relative start time and event type.

Have you looked at the MNE-Python Annotations class? https://mne.tools/dev/auto_tutorials/raw/plot_30_annotate_raw.html

@agramfort (Member) commented Jul 22, 2020 via email

@ephathaway (Contributor Author)

> > We may need to create a new type of MFF events class to store this information, as it appears that event info is currently stored in a simple array with relative start time and event type.
>
> Have you looked at the MNE-Python Annotations class? https://mne.tools/dev/auto_tutorials/raw/plot_30_annotate_raw.html

I actually hadn't looked at the Annotations class until now. It looks a little more descriptive than what we get from the mne.find_events function. It seems like it would be easy enough to parse event track XMLs with mffpy and then convert them to an Annotations object. How feasible would it be to add additional attributes to the Annotations class for other aspects of MFF events, such as sourceDevice, response time, trial #, etc.? Could we maybe make a subclass of the Annotations class specifically for RawEGI objects?

@ephathaway (Contributor Author)

> no objection on my side if we match the same performance for reading files and we support the preload option. Maybe start with evoked and epochs as there is no code for this at the moment

Good idea to pilot on the Epochs and Evoked since that will be new code.

@drammock (Member)

> How feasible would it be to add additional attributes to the Annotations class for other aspects of MFF events, such as sourceDevice, response time, trial #, etc.? Could we maybe make a subclass of the Annotations class specifically for RawEGI objects?

That is certainly possible in principle... can you say a bit more about what kind of metadata is stored in MFF events? Is it the same for continuous, epoched, and averaged data? The Epochs.metadata property is also worth considering here (it's a restricted version of a pandas.DataFrame) though it's not currently implemented for Raw or Evoked objects... here's a primer on Epochs metadata

@ephathaway (Contributor Author)

> That is certainly possible in principle... can you say a bit more about what kind of metadata is stored in MFF events? Is it the same for continuous, epoched, and averaged data? The Epochs.metadata property is also worth considering here (it's a restricted version of a pandas.DataFrame) though it's not currently implemented for Raw or Evoked objects... here's a primer on Epochs metadata

Below is an example of a single event from an event track XML. This is a good example of an event generated by a stimulus presentation device at recording time. I think beginTime, duration, code, label, description, and sourceDevice are common to all MFF events, while the number of keys can vary.

There are event tracks in continuous, epoched, and averaged MFF files, although I am not sure how events are preserved in averaged files, as the time scale is not preserved.

The Epochs.metadata looks pretty versatile. Seems like it could be adapted to store event info for a Raw object.

<event>
    <beginTime>2003-04-17T13:35:33.162000-08:00</beginTime>
    <duration>1000</duration>
    <code>TRSP</code>
    <label></label>
    <description></description>
    <sourceDevice>Experimental Control Interface</sourceDevice>
    <keys>
        <key>
            <keyCode>cel#</keyCode>
            <data dataType="short">4</data>
        </key>
        <key>
            <keyCode>obs#</keyCode>
            <data dataType="short">1</data>
        </key>
        <key>
            <keyCode>rsp#</keyCode>
            <data dataType="short">4</data>
        </key>
        <key>
            <keyCode>eval</keyCode>
            <data dataType="short">1</data>
        </key>
        <key>
            <keyCode>rtim</keyCode>
            <data dataType="short">645</data>
        </key>
        <key>
            <keyCode>trl#</keyCode>
            <data dataType="short">1</data>
        </key>
        <key>
            <keyCode>Task</keyCode>
            <data dataType="string">trait</data>
        </key>
        <key>
            <keyCode>Cont</keyCode>
            <data dataType="string">lively</data>
        </key>
        <key>
            <keyCode>Valn</keyCode>
            <data dataType="string">Good</data>
        </key>
    </keys>
</event>
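As a side note for the event-handling discussion, an event element like the one above can be flattened into a plain dict with the standard library. This is only an illustrative sketch against the snippet shown here (the real files use XML namespaces, and mffpy already provides its own event-track parser):

import xml.etree.ElementTree as ET

def event_to_dict(event_xml):
    """Flatten one <event> element into a dict of its fields plus its keys."""
    event = ET.fromstring(event_xml)
    record = {child.tag: child.text for child in event if child.tag != 'keys'}
    keys = event.find('keys')
    if keys is not None:
        for key in keys.findall('key'):
            data = key.find('data')
            value = data.text
            if data.get('dataType') in ('short', 'long'):  # cast numeric key data
                value = int(value)
            record[key.find('keyCode').text] = value
    return record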

@agramfort (Member) commented Jul 23, 2020 via email

@drammock (Member)

So @agramfort suggests 2 possibilities:

  1. convert Event Track XML into an Annotations object, where the description field of each annotation is a somewhat unwieldy string where all metadata fields are concatenated, separated by / characters. This has the advantage that /-separated strings are already parsed by downstream epoch-related code, so when the annotations are converted to an events array and corresponding event_id dict used in epoching, it will be straightforward to sub-select resulting epochs based on metadata fields.

  2. convert Event Track XML into a pandas DataFrame, and accept that it will be a separate object from the Raw data object. If this approach is taken, there should probably at least be a function to create MNE-Python-style Annotations and/or an events array from that DataFrame. I'm assuming one row per event, with the columns being the union of all keys present in the file, with some suitable NA value (np.nan, empty string, etc) wherever a key is not defined for the event in that row?

Personally I am open to a third option which is attaching a dataframe-like metadata object to Raw, similar to Epochs.metadata. But then how to make it useful... a lot of what makes epochs metadata useful is that there is exactly one metadata row per epoch, but the hypothetical Raw.metadata would be one row per event. I think there is considerable work necessary to figure out the API and supported use case(s) here.
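A rough sketch of what options 1 and 2 could look like, assuming the event track has already been parsed into a list of dicts like the example earlier in this thread (onset times in seconds and durations in milliseconds are assumptions here, as are the key names used in the descriptions):

import pandas as pd
import mne

def events_to_annotations(events):
    """Option 1: pack selected metadata into '/'-separated description strings."""
    onsets = [ev['onset'] for ev in events]
    durations = [ev['duration'] / 1000.0 for ev in events]  # assuming ms -> seconds
    descriptions = ['/'.join(str(ev[k]) for k in ('code', 'cel#', 'rsp#') if k in ev)
                    for ev in events]
    return mne.Annotations(onset=onsets, duration=durations, description=descriptions)

def events_to_dataframe(events):
    """Option 2: one row per event; columns are the union of all keys, NaN where missing."""
    return pd.DataFrame(events)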

@ephathaway (Contributor Author)

Converting the event track XML to a DataFrame seems like the most attractive option as far as easy access to all the event attributes. However, we do want to be able to create Epochs objects based on the events and certain subsets of events. I think ideally, the events used to create Epochs would then become the contents of Epochs.metadata as a description of each epoch.

What if we store the events in a DataFrame separate from the Raw file and then add functionality to the Epochs class to be able to create epochs around events from a DataFrame?

I think it will be easiest to first focus on implementing reading of segmented and averaged MFFs with mffpy to see how well this works. From there we can look at reading of continuous MFF files and decide how we want to handle events.

@drammock (Member)

> Converting the event track XML to a DataFrame seems like the most attractive option as far as easy access to all the event attributes.

Agreed.

> However, we do want to be able to create Epochs objects based on the events and certain subsets of events. I think ideally, the events used to create Epochs would then become the contents of Epochs.metadata as a description of each epoch.

That is indeed an elegant-sounding approach.

> What if we store the events in a DataFrame separate from the Raw file and then add functionality to the Epochs class to be able to create epochs around events from a DataFrame?

hmm... perhaps an API like Epochs.from_dataframe(raw, dataframe, ...)? I think this would be cleaner than trying to make the existing Epochs constructor flexible enough to take in either a DataFrame or an events array + event_id dict. You'll probably end up with (private?) helper functions similar to the ones I mentioned above (functions to create an events array and event_id dict from the MFF-derived dataframe). But we should hear input from others before proceeding down that line of approach.

> I think it will be easiest to first focus on implementing reading of segmented and averaged MFFs with mffpy to see how well this works. From there we can look at reading of continuous MFF files and decide how we want to handle events.

Sounds good.

@agramfort (Member) commented Jul 24, 2020 via email

@ephathaway (Contributor Author)

> hmm... perhaps an API like Epochs.from_dataframe(raw, dataframe, ...)? I think this would be cleaner than trying to make the existing Epochs constructor flexible enough to take in either a DataFrame or an events array + event_id dict. You'll probably end up with (private?) helper functions similar to the ones I mentioned above (functions to create an events array and event_id dict from the MFF-derived dataframe). But we should hear input from others before proceeding down that line of approach.

I like this idea from a usability standpoint. This way someone processing an MFF could keep their events in the DataFrame format throughout the workflow.

@drammock (Member)

> one idea is to have a function events_from_frame or events_from_dataframe that converts the dataframe to events/event_id like we have for annotations. So we keep one way to build epochs.

This was my earlier suggestion @agramfort, but it would not allow the (really nice) feature of the events dataframe automatically being used to populate the Epochs.metadata. However, as I implied above, the events_from_dataframe function will probably need to get written under the hood anyway to make Epochs.from_dataframe() work, so it's actually a nice sequential approach to implement events_from_dataframe() first, and then add on Epochs.from_dataframe() as a wrapper of the main constructor in a subsequent step.

But before all that, it sounds like the first steps are to simply read the already-epoched and already-averaged data formats, and come back to the event parsing later.
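For concreteness, a hypothetical events_from_dataframe helper along the lines discussed above might look like this (the function name, the 'onset' and 'code' columns, and the epoching parameters are all assumptions, not an existing MNE API):

import numpy as np
import mne

def events_from_dataframe(df, raw, code_column='code'):
    """Build an MNE events array and event_id dict from a one-row-per-event DataFrame."""
    event_id = {code: i + 1 for i, code in enumerate(sorted(df[code_column].unique()))}
    samples = (df['onset'].to_numpy() * raw.info['sfreq']).astype(int) + raw.first_samp
    events = np.column_stack([samples,
                              np.zeros(len(df), dtype=int),
                              df[code_column].map(event_id).to_numpy()])
    return events, event_id

# The same DataFrame could then double as per-epoch metadata:
# events, event_id = events_from_dataframe(df, raw)
# epochs = mne.Epochs(raw, events, event_id, tmin=-0.2, tmax=0.8, metadata=df)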

@agramfort (Member) commented Jul 24, 2020 via email

@ephathaway (Contributor Author)

@drammock @agramfort would it be wise to wait for some of the other MNE maintainers to get on board before beginning work on the reading of epoched and averaged MFFs? Or do you guys think there will be enough support from the others to merge something like this?

@agramfort (Member) commented Jul 27, 2020 via email

@drammock (Member)

I see no reason not to start work on the Epoched/Averaged reader. We seem to mostly agree on the big picture, and it's easier to hammer out the little details once there's an open PR that we can look at.

@ephathaway (Contributor Author)

Sounds good. I will start coordinating with my team on this.

@larsoner (Member)

I would suggest starting with a PR that adds only read_evokeds_mff, with a dependency (just in that function) on mffpy. It will force us to sort out the mffpy optional dependency, work out general contribution stuff, etc. The API there seems straightforward. After that PR is merged we can work out how to deal with epochs/metadata and whether or not we want to kill our custom MFF raw reading code in favor of mffpy.
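For the function-local optional dependency, a conventional pattern would be something along these lines (a sketch only; MNE may prefer its own helper for optional imports, and the reader logic is elided):

def read_evokeds_mff(fname, condition=None):
    """Proposed reader for averaged MFF files (sketch)."""
    try:
        import mffpy
    except ImportError:
        raise ImportError('read_evokeds_mff requires mffpy; install it with '
                          '"pip install mffpy"')
    reader = mffpy.Reader(fname)
    ...  # build and return Evoked object(s) from the reader's categories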

@ephathaway (Contributor Author)

> I would suggest starting with a PR that adds only read_evokeds_mff, with a dependency (just in that function) on mffpy. It will force us to sort out the mffpy optional dependency, work out general contribution stuff, etc. The API there seems straightforward. After that PR is merged we can work out how to deal with epochs/metadata and whether or not we want to kill our custom MFF raw reading code in favor of mffpy.

Sounds like a good approach. Probably best not to bite off more than we can chew for the first PR.

@christianbrodbeck (Member)

I was trying to do coregistration with some (raw) MFF files, noticed some issues (below), and am thinking about how best to address them. Is the plan still to eventually replace the current raw reader with mffpy? Is somebody already working on this?

  • Channel locations are added to channels but not added as digitizer points, which would be required for coregistration
  • Fiducials are not added at all

@jusjusjus (Contributor)

@christianbrodbeck, great point. How can we evaluate whether mffpy would support your use case? @ephathaway

@christianbrodbeck (Member)

If nobody else is working on it already I could start translating the raw reader to use mffpy?

@larsoner (Member)

> If nobody else is working on it already I could start translating the raw reader to use mffpy?

Sure, would be good!

But I think the issue you're encountering with info['dig'] not being populated is actually separable -- when we add the EEG electrode locations in info['chs'][ii]['loc'] we should also add them to dig. So I think there should be two PRs: the first to add dig information and test that it's there and reasonable, then a second separate one to use mffpy. It will require some effort that will end up being discarded (to get the EEG + maybe fids in our native code), but hopefully this is a small effort. And it gives us time to move to mffpy, and do so with more complete (dig) tests.

@christianbrodbeck let me know which of the two PRs you're interested in working on. If it's just the mffpy bit, I can add the native-reader bits + tests first.

@christianbrodbeck (Member)

They're potentially more interrelated in that the fiducials are currently never read (as far as I could tell), and I suspect implementing that would be more straightforward with mffpy than with the current approach. This is why I would have suggested translating to mffpy first. But I might be wrong.

@ephathaway (Contributor Author)

@christianbrodbeck I am not exactly sure what these digitizer points and fiducials are, but I suspect this info is contained in the coordinates.xml in an .mff file. mffpy has a class to parse these.

@christianbrodbeck (Member)

Thanks @ephathaway, I saw this in coordinates.xml. Do you know whether we can rely on the identifier entries corresponding to nasion/LPA/RPA?

            <sensor>
                <name>Nasion</name>
                <number>258</number>
                <type>2</type>
                <x>0.000</x>
                <y>10.564</y>
                <z>-2.051</z>
                <identifier>2002</identifier>
            </sensor>
            <sensor>
                <name>Left periauricular point</name>
                <number>259</number>
                <type>2</type>
                <x>-8.592</x>
                <y>0.498</y>
                <z>-4.128</z>
                <identifier>2011</identifier>
            </sensor>
            <sensor>
                <name>Right periauricular point</name>
                <number>260</number>
                <type>2</type>
                <x>8.592</x>
                <y>0.498</y>
                <z>-4.128</z>
                <identifier>2010</identifier>
            </sensor>
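If that identifier-to-fiducial mapping can be relied on, attaching the positions could look roughly like this (a sketch only: it assumes identifiers 2002/2011/2010 map to nasion/LPA/RPA and that the coordinates.xml values are in centimeters, since MNE stores digitizer positions in meters):

import mne

cm = 1e-2  # assumed centimeters -> meters conversion

# ch_pos would be built from the type-1 <sensor> entries: {'E1': (x, y, z), ...}
ch_pos = {'E1': (5.0 * cm, 3.0 * cm, 4.0 * cm)}  # placeholder values

montage = mne.channels.make_dig_montage(
    ch_pos=ch_pos,
    nasion=(0.000 * cm, 10.564 * cm, -2.051 * cm),
    lpa=(-8.592 * cm, 0.498 * cm, -4.128 * cm),
    rpa=(8.592 * cm, 0.498 * cm, -4.128 * cm),
)
# raw.set_montage(montage) would then populate info['dig'] for coregistration,
# once ch_pos covers all EEG channels in the recording.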

@ephathaway (Contributor Author) commented Dec 1, 2020

> do you know whether we can rely on the identifier entries corresponding to nasion/LPA/RPA?

I'm not sure. I believe the coordinates.xml file is determined by the recording device. So any number of MFFs recorded with a "HydroCel GSN 256 1.0" will have identical coordinates.xml files.

@christianbrodbeck (Member)

test_egi.mff in the MNE testing dataset has a different net and also has the same identifiers, so maybe we can assume they're the same?

            <sensor>
                <name>Nasion</name>
                <number>130</number>
                <type>2</type>
                <x>0.000</x>
                <y>10.356</y>
                <z>-2.694</z>
                <identifier>2002</identifier>
            </sensor>
            <sensor>
                <name>Left periauricular point</name>
                <number>131</number>
                <type>2</type>
                <x>-6.712</x>
                <y>0.046</y>
                <z>-3.712</z>
                <identifier>2011</identifier>
            </sensor>
            <sensor>
                <name>Right periauricular point</name>
                <number>132</number>
                <type>2</type>
                <x>6.712</x>
                <y>0.046</y>
                <z>-3.712</z>
                <identifier>2010</identifier>
            </sensor>

@ephathaway (Contributor Author)

> so maybe we can assume they're the same?

@christianbrodbeck you can view all of the possible coordinates.xml files here. It looks like the identifier fields for all of these match except in the Geodesic Sensor Net 64 2.0.xml coordinates file, where the right periauricular point has identifier 1001 (the same as VREF) instead of 2010. It does seem odd that this would differ only for this one recording device type, especially given that there are other 64-channel devices.

            <sensor>
                <name>VREF</name>
                <number>65</number>
                <type>1</type>
                <x>0.000</x>
                <y>0.000</y>
                <z>8.943</z>
                <identifier>1001</identifier>
            </sensor>
            <sensor>
                <name>Nasion</name>
                <number>66</number>
                <type>2</type>
                <x>0.000</x>
                <y>10.385</y>
                <z>-1.815</z>
                <identifier>2002</identifier>                
            </sensor>
            <sensor>
                <name>Left periauricular point</name>
                <number>67</number>
                <type>2</type>
                <x>-7.004</x>
                <y>0.368</y>
                <z>-3.694</z>
                <identifier>2011</identifier>                
            </sensor>
            <sensor>
                <name>Right periauricular point</name>
                <number>68</number>
                <type>2</type>
                <x>7.004</x>
                <y>0.368</y>
                <z>-3.694</z>
                <identifier>1001</identifier>                
            </sensor>

@ephathaway (Contributor Author)

I would like to add a function to write one or more Evoked objects to an MFF file. MFFs can be of the "averaged" flavor and may contain multiple averages of different categories. I think the Evoked class will translate nicely to this sort of MFF. To accomplish this, we could make use of the mffpy.Writer class, which I think will make this function pretty slick.

The question is where to put it. I noticed the Epochs and BaseRaw classes both have an export method, which exports these data structures to EEGLAB (and maybe other formats in the future?). A method like this may not work in this case, though, since we want to be able to write multiple Evoked objects to a single output file. Could we stick a write_evokeds_mff function in mne/io/egi/egimff.py?

@larsoner (Member)

To stick to the export theme, how about mne.export_evokeds? Currently the only option would be MFF but eventually things like EEGLAB could be added for example.

@ephathaway (Contributor Author)

> To stick to the export theme, how about mne.export_evokeds? Currently the only option would be MFF but eventually things like EEGLAB could be added for example.

Sounds good! I will work on a PR in the next couple of weeks.

@ephathaway (Contributor Author)

@larsoner to avoid having a lot of MFF-specific code in evoked.py, I am thinking of something like:

from .io.egi import write_evokeds_to_mff

def export_evokeds(fname, evoked, fmt):
    """Export evoked datasets to external formats."""
    if fmt == 'mff':
        write_evokeds_to_mff(fname, evoked)
    else:
        raise ValueError(f"Unsupported export format: {fmt}")
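As a usage illustration of the proposed signature (file name and variable names hypothetical): export_evokeds('subject_averaged.mff', [evoked_a, evoked_b], fmt='mff').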

@larsoner (Member)

Sounds good. If you need to avoid a circular import problem you can nest the import.

@ephathaway (Contributor Author)

@larsoner mind taking a look at mne-tools/mne-testing-data#87? I'll need this to properly test the export function.
