Automatically assigning metadata #1162

alexrockhill · 2023-08-04T16:01:34Z

Describe the problem

Metadata must be added separately when creating epochs, it's not part of read_raw_bids

Describe your solution

How do you handle extra event columns in an events.tsv? You can read them in separately and assign them but it might be nice to either 1) make an mne-bids helper function to read metadata or 2) put them as metadata on a raw object (not currently an attribute) automatically during read_raw_bids and then add them to the epochs when that raw object is epoched. Option 2 might be super clean for abstracting bids stuff

Describe possible alternatives

Above

Additional context

No response

The text was updated successfully, but these errors were encountered:

agramfort · 2023-08-08T08:19:06Z

can you draft an API for this?

…

Message ID: ***@***.***>

alexrockhill · 2023-08-08T16:01:24Z

I think it would just be added automatically sounds nice to me, within the existing read_raw_bids API

alexrockhill · 2023-08-08T18:17:26Z

Well actually that could look like passing only raw to mne.Epochs with events=None. That's a pretty big API change but it kind of seems like it's very sensible since annotations were introduced and you don't have to preload epochs. I think you'd need a lot of opinions on that though as it's super central to pretty much any workflow. Then the metadata would passed with the annotations to the epochs. When you load in with mne_bids.read_raw_bids it could check that your annotations are the same length as your metadata and raise an error if not.

hoechenberger · 2023-08-09T05:09:48Z

I'm probably missing something here but ... shouldn't we add full support for Raw metadata the MNE first? I mean like we have it for Epochs. Even if that means storing that data in a separate file.

hoechenberger · 2023-08-09T05:18:31Z

Oh never mind, we can store that stuff as annotations, as you suggested. Sorry about the confusion.

agramfort · 2023-08-09T07:24:12Z

Well actually that could look like passing only raw to mne.Epochs with events=None. That's a pretty big API change but it kind of seems like it's very sensible since annotations were introduced and you don't have to preload epochs. I think you'd need a lot of opinions on that though as it's super central to pretty much any workflow.Message ID: ***@***.***>

can you finish the thought process? then what? What behavior? what do you do with event_id? I don't want to close the door here but the plan should be well designed so we don't end up in a rabbit hole

alexrockhill · 2023-08-09T12:55:03Z

Sorry if that was unclear. The motivation is from working with this dataset https://openneuro.org/datasets/ds003708/versions/1.0.3 with stimulation sites per trial. As far as I know, that can't be stored in annotations and is metadata that is nice associate with epochs. If metadata were allowed to be assigned to raw and checked that it was the length of annotations (first PR) then when events=None in mne.Epochs (with event_id=None) the events and event_id would be taken from the annotations and the metadata that was already checked that it was compatible would be assigned to the epochs. That way you wouldn't have to worry about pulling events and event_id from annotations , since they are already an attribute of the raw object, that makes sense to me. Maybe this is too big of an API change but I'm not sure I see an easier way to get the metadata during read_raw_bids.

larsoner · 2023-08-09T17:45:00Z

the plan should be well designed so we don't end up in a rabbit hole

If metadata were allowed to be assigned to raw and checked that it was the length of annotations (first PR)

Unfortunately this already seems like we're going to head down a rabbit hole. We have all sorts of ways of setting/appending/deleting annotations. So now you have to either prevent these operations or insert empty metadata for new ones, etc. An alternative would be to make the Annotations object itself have the option to add arbitrary metadata, I guess like we do for Epochs (some of the logic could be shared). But this in itself could become complicated and a bit of a rabbit hole.

So I think the best/simplest/cleanest solution is the one mentioned above:

make an mne-bids helper function to read metadata

In mne-bids-pipeline we have something a bit along these lines:

https://github.com/mne-tools/mne-bids-pipeline/blob/a931cae9ed41c9acdfef8b8e725eb1b817f1e697/mne_bids_pipeline/_import_data.py#L68-L106

I think a MNE-BIDS helper that returns an ndarray events and a DataFrame metadata of the same length makes the most sense. Then the applications/uses become pretty straightforward and transparent -- no need for magic, adding events=None, attaching to Raw, or modifying Annotations required.

alexrockhill · 2023-08-09T17:59:01Z

That sounds reasonable, the only thing I'm thinking is that we ran into an issue where the events were mislabeled in the annotations in a raw brainvision file and the dataset curator labeled them correctly in the events.tsv. This was confusing when setting epochs.metadata and then trying to select using fancy indexing by taking epochs['electrical_stimulation'] for instance which went off of the incorrect annotations instead of the corrected metadata. So, maybe something like mne_bids.read_raw_epochs that ignores everything in the raw file and goes off the events.tsv might be the solution to that.

larsoner · 2023-08-09T18:02:07Z

we ran into an issue where the events were mislabeled in the annotations in a raw brainvision file and the dataset curator labeled them correctly in the events.tsv... So, maybe something like mne_bids.read_raw_epochs that ignores everything in the raw file and goes off the events.tsv might be the solution to that.

I don't think we should write a new function for this. If we try to write a helper function to fix every instance of a broken dataset, we're going to end up with infinite functions. A better idea I think is 1) tell them to fix their dataset (which they might ignore), and then 2) write code to overcome this particular use case in the meantime. Now if this starts happening for dozens of datasets, then I think there's an argument to be made for a helper. But I really hope this is a rare exception...

alexrockhill · 2023-08-09T18:12:38Z

I have stuff like this in BIDS datasets that I have too. It often happens when you have a version of task or recording with event codings and then you change them in a later version. I think the idea with BIDS is that you leave the raw file be so that everything you put in BIDS is not modified and just correct it in the sidecar.

That doesn't mean you necessarily need a helper function or new helper functions for everything but I do think that the idea that the sidecars overwrite anything in the files should be facilitated by mne-bids and right now the events.tsv file is ignored as far as official API. That's why I was thinking of reading metadata in read_raw_bids and checking that it matches with annotations in the raw file. I think it's an edge case but I think it's a super common edge case.

agramfort · 2023-08-12T14:34:43Z

- see also this discussion mne-tools/mne-python#6018 I would suggest also to have a helper function that attaches metadata to epochs a bit manually so far this approach has always worked for me (see indeed mne bids pipeline approach)

…

Message ID: ***@***.***>

marcpabst · 2023-10-02T16:37:40Z

we ran into an issue where the events were mislabeled in the annotations in a raw brainvision file and the dataset curator labeled them correctly in the events.tsv... So, maybe something like mne_bids.read_raw_epochs that ignores everything in the raw file and goes off the events.tsv might be the solution to that.

I don't think we should write a new function for this. If we try to write a helper function to fix every instance of a broken dataset, we're going to end up with infinite functions. A better idea I think is 1) tell them to fix their dataset (which they might ignore), and then 2) write code to overcome this particular use case in the meantime. Now if this starts happening for dozens of datasets, then I think there's an argument to be made for a helper. But I really hope this is a rare exception...

I just wanted to chime in and share my strong disagreement with this viewpoint. I don't believe we should consider a dataset that relies on events.tsv to encode events as flawed.

In reality, the problem here lies more in how we handle this type of data, which isn't always ideal. Historically, dealing with events in EEG recordings has been quite limited due to technical constraints. Most systems lacked the ability to attach complex metadata to events. But what if we're working with a complex experimental design or need to add post-hoc metadata? Sure, we could manually tweak the triggers in the EEG recording file, but that's error-prone and goes against the spirit of the BIDS format. As @alexrockhill pointed out, this is a common challenge, and I believe that, for a BIDS dataset, we should consider events.tsv as the primary source of truth.

Perhaps a practical solution could be allowing for arbitrary metadata to be added to the Annotations object, similar to what's already possible with the metadata dataframe in Epochs. Is there a an Issue/PR for that somehwere already?

Anyway, I do really like MNE, but I don't think event handling is one of its strong points at the moment.

larsoner · 2023-10-02T18:08:22Z

I don't believe we should consider a dataset that relies on events.tsv to encode events as flawed... I believe that, for a BIDS dataset, we should consider events.tsv as the primary source of truth

Ahh I see -- rereading I think you're right and I was wrong about this (sassuming the other MNE-BIDS people agree). as it could be a way to fix a flawed recording. To not force this behavior, though, it seems like a mne_bids.annotations_from_events_tsv or so seems like it should get you everything you need to fix your dataset. You'll have the native raw with bad events and/or bad annotations, then read the presumably correct ones from tsv, and replace as necessary before epoching.

when events=None in mne.Epochs (with event_id=None) the events and event_id would be taken from the annotations

FWIW recently the Epochs(..., events=None) meaning "call annotations to events under the hood" approach (re)surfaced at the MNE sprint and could make sense as well, feel free to check for and then open an MNE issue about that. But I think we can do this without needing to attach any metadata to raw.

sappelhoff · 2023-10-24T19:37:47Z

regarding:

I think the idea with BIDS is that you leave the raw file be so that everything you put in BIDS is not modified and just correct it in the sidecar.

and

for a BIDS dataset, we should consider events.tsv as the primary source of truth.

I personally agree, and this is how we handle things in mne-bids.

However, within BIDS, this is an unresolved issue:

[ENH] Add a statement addressing metadata conflicts bids-standard/bids-specification#761

To not force this behavior, though, it seems like a mne_bids.annotations_from_events_tsv or so seems like it should get you everything you need to fix your dataset. You'll have the native raw with bad events and/or bad annotations, then read the presumably correct ones from tsv, and replace as necessary before epoching.

this sounds good to me!

alexrockhill added the enhancement label Aug 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automatically assigning metadata #1162

Automatically assigning metadata #1162

alexrockhill commented Aug 4, 2023

agramfort commented Aug 8, 2023 via email

alexrockhill commented Aug 8, 2023

alexrockhill commented Aug 8, 2023 •

edited

Loading

hoechenberger commented Aug 9, 2023 •

edited

Loading

hoechenberger commented Aug 9, 2023

agramfort commented Aug 9, 2023 via email

alexrockhill commented Aug 9, 2023

larsoner commented Aug 9, 2023

alexrockhill commented Aug 9, 2023 •

edited

Loading

larsoner commented Aug 9, 2023 •

edited

Loading

alexrockhill commented Aug 9, 2023

agramfort commented Aug 12, 2023 via email •

edited by sappelhoff

Loading

marcpabst commented Oct 2, 2023 •

edited

Loading

larsoner commented Oct 2, 2023

sappelhoff commented Oct 24, 2023

Automatically assigning metadata #1162

Automatically assigning metadata #1162

Comments

alexrockhill commented Aug 4, 2023

Describe the problem

Describe your solution

Describe possible alternatives

Additional context

agramfort commented Aug 8, 2023 via email

alexrockhill commented Aug 8, 2023

alexrockhill commented Aug 8, 2023 • edited Loading

hoechenberger commented Aug 9, 2023 • edited Loading

hoechenberger commented Aug 9, 2023

agramfort commented Aug 9, 2023 via email

alexrockhill commented Aug 9, 2023

larsoner commented Aug 9, 2023

alexrockhill commented Aug 9, 2023 • edited Loading

larsoner commented Aug 9, 2023 • edited Loading

alexrockhill commented Aug 9, 2023

agramfort commented Aug 12, 2023 via email • edited by sappelhoff Loading

marcpabst commented Oct 2, 2023 • edited Loading

larsoner commented Oct 2, 2023

sappelhoff commented Oct 24, 2023

alexrockhill commented Aug 8, 2023 •

edited

Loading

hoechenberger commented Aug 9, 2023 •

edited

Loading

alexrockhill commented Aug 9, 2023 •

edited

Loading

larsoner commented Aug 9, 2023 •

edited

Loading

agramfort commented Aug 12, 2023 via email •

edited by sappelhoff

Loading

marcpabst commented Oct 2, 2023 •

edited

Loading