Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG, ENH: Export evokeds MFF #9406

Merged
merged 22 commits into from
Jun 3, 2021
Merged

Conversation

ephathaway
Copy link
Contributor

In this PR we add support for exporting evoked(s) objects to MFF, as described in #8038

For the API we add function mne.export_evokeds, which supports exporting to MFF format, but leaves room for adding other formats in the future.

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, mostly minor things and one future compat idea!

mne/evoked.py Outdated
If exporting to MFF format, specify the device on which EEG was
recorded (e.g. 'HydroCel GSN 256 1.0'). This is necessary for
determining the sensor layout and coordinates specs.
history : None (default) | list of history entries
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
history : None (default) | list of history entries
history : None (default) | list of str

mne/evoked.py Outdated
elif fmt == 'brainvision':
raise NotImplementedError('Export to BrainVision not implemented.')

print(f'Exporting evoked dataset to {fname}...')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be logger.info and precede the actual exporting attempt/operation

@@ -9,6 +9,7 @@
from xml.dom.minidom import parse

import numpy as np
import pytz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a builtin so this would be a new MNE-Python dependency. It's required by mffpy but mffpy doesn't always have to be installed by people. Can you nest this after the mffpy = _import_mffpy line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yeah sounds like a plan.

@@ -369,7 +369,6 @@ def test_io_egi_evokeds_mff(idx, cond, tmax, signals, bads):
assert evoked_cond.info['nchan'] == 259
assert evoked_cond.info['sfreq'] == 250.0
assert not evoked_cond.info['custom_ref_applied']
assert evoked_cond.info['dig'] is None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this changed, can you update this instead to be assert len(evoked_cond.info['dig']) == 100 or whatever the correct value is? Was this fixed by some previous PR's changes to the egi code and this got updated when you updated the testing data with a newer evoked version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so before with no categories.xml in the example MFF, no digitization data could be determined, so it info['dig'] defaulted to None. Now, with the categories.xml, the digitization data can be read in.

Comment on lines 315 to 317
with pytest.raises(NotImplementedError) as exc_info:
export_evokeds(f'output.{ext}', evoked)
assert str(exc_info.value) == f'Export to {fmt} not implemented.'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with pytest.raises(NotImplementedError) as exc_info:
export_evokeds(f'output.{ext}', evoked)
assert str(exc_info.value) == f'Export to {fmt} not implemented.'
with pytest.raises(NotImplementedError, match=f'Export to {fmt} not supp'):
export_evokeds(f'output.{ext}', evoked)

Comment on lines 272 to 275
evoked_exported = read_evokeds_mff(export_fname)
for i in range(len(evoked_exported)):
ave_exported = evoked_exported[i]
ave = evoked[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
evoked_exported = read_evokeds_mff(export_fname)
for i in range(len(evoked_exported)):
ave_exported = evoked_exported[i]
ave = evoked[i]
evoked_exported = read_evokeds_mff(export_fname)
assert len(evoked) == len(evoked_exported)
for ave, ave_exported in zip(evoked, evoked_exported):

assert_equal(ave_exported.nave, ave.nave)
assert_equal(ave_exported.kind, ave.kind)
assert_equal(ave_exported.comment, ave.comment)
assert_equal(ave_exported.times, ave.times)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_equal(ave_exported.times, ave.times)
assert_allclose(ave_exported.times, ave.times)

mne/evoked.py Outdated
Comment on lines 1445 to 1449
.. warning::
Since we are exporting to external formats, there's no guarantee that
all the info will be preserved in the external format. To save in
native MNE format (``.fif``) without information loss, use
:func:`mne.write_evokeds` instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deduplicate with raw and epochs export using %(export_warning)s (i.e., add new entry to mne/utils/docs.py and modify this function and mne.io.Raw.export and mne.io.Epochs.export)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I originally wanted to reuse that export warning message from raw and epochs, but it says to use the save function, which isn't quite equivalent in this case. I think we want to point the user to write_evokeds.

Are you suggesting change the shared warning message to something like:

To save in native MNE format (``.fif``) without information loss,
use :func:`save` instead or :func:`mne.write_evokeds` if exporting evokeds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do:

docs.py:

docdict['export_warning'] = """\
.. warning: ... without information loss, use"""

evoked.py:

...
    %(export_warning)s :func:`mne.write_evokeds` instead.

epochs.py:

...
    %(export_warning)s :meth:`save` instead.

Make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! Didn't think of that.

mne/evoked.py Outdated
Format of the export. Defaults to ``'auto'``, which will infer the
format from the filename extension. See supported formats above for
more information.
device : None (default) | str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to be careful about future compat here. Each export could have zero or more such extra arguments, and I foresee a nasty parameter list if, say, one of these took 10 such arguments.

I don't see a great way around it (**kwargs seems worse for readability, and mff_kwargs=None, eeglab_kwargs=None, ... just punts on the issue. So how about we prefix the relevant exporter name:

Suggested change
device : None (default) | str
mff_device : None (default) | str

That way it will clearly group visually in the parameter list:

mff_device : ...
    ...
mff_history : ...
    ...
eeglab_foo : ...
    ...
brainvision_bar : ...
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

mne/evoked.py Outdated
@@ -1436,6 +1436,77 @@ def _write_evokeds(fname, evoked, check=True):
end_file(fid)


@fill_doc
def export_evokeds(fname, evoked, fmt='auto', device=None, history=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we force the mff_* entries to be provided as kwargs, it makes for nice future compat:

Suggested change
def export_evokeds(fname, evoked, fmt='auto', device=None, history=None):
def export_evokeds(fname, evoked, fmt='auto', *, mff_device=None, mff_history=None):

because later we can add to the signature before the MFF-specific entries if there happens to be something common to all formats for example.

mne/evoked.py Outdated
Comment on lines 1458 to 1466
mff_device : None (default) | str
If exporting to MFF format, it is required to specify the device on
which EEG was recorded (e.g. 'HydroCel GSN 256 1.0'). This is necessary
for determining the sensor layout and coordinates specs.
mff_history : None (default) | list of dict
If exporting to MFF format, it is optional to provide a list of history
entries (dictionaries) to be written to history.xml. This must adhere
to the format described in mffpy.xml_files.History.content. If None, no
history.xml will be written.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit uneasy about this. Ideally the export_evokeds function should have a format agnostic signature. Otherwise we have one export function per format. Also I see the word "required" while it defaults to None. I assume it would be highly recommended to save this information. Can we infer the device from channel names and/or their locations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise we have one export function per format.

@agramfort I could live with this, as to me it's mostly the same. However, it goes against how things work for Raw and Epochs currently, so we'd want to switch those to have separate functions, too, otherwise the inconsistent interface isn't worth it to me. And this would require a deprecation cycle because we already released with the EEGLAB export.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes more sense to just have mne.io.egi.export_evokeds_to_mff as the API for exporting to MFF.

The device argument determines which sensorLayout.xml and coordinates.xml to write to the MFF directory. These are generic files that are determined only by the net type used to record EEG and they give the 2D and 3D positions of the sensors (see https://github.com/BEL-Public/mffpy/tree/develop/mffpy/resources/coordinates). The tricky part is that sometimes we have two different net types (e.g. 'HydroCel GSN 32 1.0' and 'MicroCel GSN 100 32 1.0') with the exact same sensor positions.

I think it's possible to determine the device type (or at least a device with the same coordinates) from info['dig'], but it would definitely be a chore.

As far as history, this is not something that is required for writing a valid MFF, we would appreciate the option to include it so that we can log processing steps in the output file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also this info['device_info'] key that could be utilized to store the device type upon reading MFFs and then used to determine the device upon write. I think in any case, at least having the option to provide the device as an argument upon write is a good idea.

Copy link
Member

@agramfort agramfort May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes good idea. We could set properly the "device_info" on read from mff file and you remind that there is an info['proc_history'] that we don't use enough. It's just used for maxfilter information AFAIK.
the less stuff we do that is custom the MFF the better. It's more work but I think it's better in the long term.

my 2c

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a separate PR to set the "device_info" on read or is it ok to include in this PR?

@@ -907,6 +909,142 @@ def _read_evoked_mff(fname, condition, channel_naming='E%d', verbose=None):
nave=nave, verbose=verbose)


@fill_doc
def export_evokeds_to_mff(fname, evoked, device, history):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function should be private.

export_evokeds will be AFAIK the public entry point to the API.

mne/io/egi/egimff.py Outdated Show resolved Hide resolved
@agramfort
Copy link
Member

agramfort commented May 20, 2021 via email

@agramfort
Copy link
Member

agramfort commented May 21, 2021 via email

@ephathaway
Copy link
Contributor Author

The info['proc_history'] does seem like a nice analog to the way we store processing history in MFF. I could see reading the entries from history.xml in the MFF read functions, storing this in info['proc_history'], then having a .add_proc_history_entry method so that a user can log an MNE processing step they took. Exporting to MFF could then write the contents of info['proc_history'] to a history.xml file. This probably goes beyond the scope of this PR though.

mne/evoked.py Outdated
passed to the respective export function.

Supported formats
MFF (mff, uses `mne.io.egi.egimff.export_evokeds_to_mff`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to correctly reference this function. Would I add it to the file i/o python reference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to expose it as mne.io.egi.export_evokeds_to_mff

I would insert section in the API page just before Creating data objects from arrays which I would call Exporting to other file formats

ok for you? thx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have decided in the past that mne.io is for raw I/O only, I'm inclined to make it mne.evokeds.export_evokeds_mff

@agramfort
Copy link
Member

and yes I would keep the handling of proc_history for a future PR. It made we realize that we heavily under use this field in info. We could use it for provenance tracking....

@agramfort
Copy link
Member

agramfort commented May 23, 2021 via email

@larsoner
Copy link
Member

The only code outside mine/io/egi/ would be a single import line in mne/evoked.py, and there should be no circular import because mne/evoked.py already imports from IO, so I don't think those concerns are valid

@agramfort
Copy link
Member

agramfort commented May 24, 2021 via email

@agramfort
Copy link
Member

agramfort commented May 25, 2021 via email

@drammock
Copy link
Member

I would vote for 1 as I don't expect users to use this function what that want to do. Evoked.export should already do the trick I think.

I thought Evoked.export was not an option because we want to support writing multiple evokeds to one file?

@agramfort
Copy link
Member

sorry I meant export_evokeds (not Evoked.export)

PS @ephathaway can you have a look at https://mne.discourse.group/t/unable-to-load-mff-files-parseerror-not-well-formed-invalid-token/3121/7 ?

@larsoner
Copy link
Member

@agramfort what is the advantage of having the API entry point be mne.io.egi.export_evokeds_mff (what you prefer) instead of mne.evoked.export_evokeds_mff?

@agramfort
Copy link
Member

agramfort commented May 26, 2021 via email

@larsoner
Copy link
Member

Given that we have raw.export(...) and epochs.export(...), I don't think we should try that hard to hide the equivalent Evoked functionality (i.e., three levels deep, in the module for raw file I/O). Having it in mne.evoked rather than the mne already seems hidden / well-scoped / expert-y enough.

@larsoner
Copy link
Member

I had a quick chat with @agramfort and this is what we converged on:

  1. All export-related code lives in a new mne.export submodule, which is for exporting data to any format other than our two standard ones (FIF and H5)
  2. This PR would add mne.export.export_evokeds (with no writer-specific kwargs) implemented in mne/export/_export.py and mne.export.export_evokeds_mff, with actual implementation living in mne/export/mff/mff.py or so
  3. Existing raw and epochs exporting code can be moved to mne.export.export_raw and mne.export.export_epochs
  4. A new Evoked method Evoked.export(...) that calls mne.export.export_evokeds under the hood

@ephathaway @drammock okay for you?

If so, I propose:

  1. I make a PR to initialize the mne.export submodule and move Epochs and Raw code there, and we merge
  2. I merge upstream/main into this PR
  3. I push commits to this PR to reflect the structure described above (it will be faster than trying to explain all details I left out above I think)

@drammock
Copy link
Member

Ok for me.

@larsoner
Copy link
Member

@ephathaway let me know if you're okay with me pushing a couple of commits to do a merge with upstream/master and reorganize the code a bit according to the plan above, I should be able to do it Tuesday

@ephathaway
Copy link
Contributor Author

@ephathaway let me know if you're okay with me pushing a couple of commits to do a merge with upstream/master and reorganize the code a bit according to the plan above, I should be able to do it Tuesday

Your plan sounds good to me. Go for it!

My only (minor) concern is that it may confuse users a bit to have an io module and an export module that both deal with foreign file formats.

@larsoner
Copy link
Member

larsoner commented Jun 1, 2021

My only (minor) concern is that it may confuse users a bit to have an io module and an export module that both deal with foreign file formats.

The io name has always been a bit of a misnomer. I think here if we make it as clear as possible (and stick to the notion) that mne.io.* publict funtions are only for reading raw data I think we'll be okay.

@larsoner
Copy link
Member

larsoner commented Jun 2, 2021

@ephathaway I can't push, can you tick the box "Allow edits by maintainers"?

Screenshot from 2021-06-02 09-48-58

@ephathaway
Copy link
Contributor Author

@ephathaway I can't push, can you tick the box "Allow edits by maintainers"?

Since the forked repo is under our organization instead of my own GH account, this option doesn't show up. I gave you write access to our fork, which should do the trick.

Comment on lines 894 to 896
try:
fp = mff.directory.filepointer('history')
except ValueError: # should probably be FileNotFoundError upstream...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @ephathaway I had to add this, with the code before writing with history=None raised an error when the resulting file was read with read_evokeds_mff. This catches the error that the history file is not written. Upstream in mffpy this should probably be a FileNotFoundError rather than ValueError. I can open a PR for this in mffpy if you agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be great. Thanks! And yes, I think we should treat history.xml as optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BEL-Public/mffpy#89

Pushed a commit to make this PR compatible with ValueError (backward compat) and FileNotFoundError (future compat)

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1 for merge. @agramfort want to see if this reflects the design we had in mind?

@larsoner larsoner changed the title ENH: Export evokeds MFF MRG, ENH: Export evokeds MFF Jun 3, 2021
@agramfort
Copy link
Member

merge when green

@larsoner larsoner merged commit 8702d0e into mne-tools:main Jun 3, 2021
@larsoner
Copy link
Member

larsoner commented Jun 3, 2021

Thanks @ephathaway !

@larsoner
Copy link
Member

larsoner commented Jun 3, 2021

I just realized we forgot latest.inc, will push a fix to main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants