FIX: get unique, with conflicting meta-data #748

adelavega · 2021-06-29T22:43:22Z

Fixes #747

This is a cheap fix for a bug that occurs when a .tsv meta-data entry (i.e. column description) matches a common BIDS entity.

For example, if you call layout.get_tasks(), it first gets all unique values for the task entity. If index_metadata=True, that also includes every instance of task in .json sidecars for TSVs. But .json sidecars for .tsv files are a different type of meta-data: they describe the columns in TSV files, not the entities of the corresponding file.

Thus, get_tasks fails because it tries to build a set from a list that includes a dict value, and dicts are unhashable. Here, I simply excluded dict values from that operation.
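To make the failure concrete, here is a minimal sketch in plain Python. It does not use pybids; the `values` list and the `unique_values` helper are hypothetical stand-ins for whatever the layout collects internally, and only illustrate the "exclude dicts" idea described above.

```python
# Simulated values collected for the "task" entity: real task labels plus
# one dict that came from a .tsv sidecar describing a column named "task".
values = ["rest", "nback", "rest", {"Description": "Task performed"}]


def unique_values(values):
    """Return sorted unique values, skipping dicts (hypothetical helper)."""
    return sorted({v for v in values if not isinstance(v, dict)})


try:
    set(values)
    naive_failed = False
except TypeError:
    # dicts are unhashable, so building a set from the raw list fails
    naive_failed = True

result = unique_values(values)
print(naive_failed, result)
```

The filter runs before any value is hashed, so the dict never reaches the set.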

adelavega · 2021-06-29T22:56:26Z

Looking at this a bit more, the issue is that all meta-data is stored as an Entity, with the attribute is_metadata=True.

This is fine, except that .json sidecars for .tsv files are a different kind of meta-data. It's meta-data associated with columns, not files.

One option is to not index meta-data for .tsv files at all. However, then f.get_metadata() would not work for these files.

Alternatively, we can add another column to Entity which describes if this is column or file meta-data (is_column?). This would be excluded from get_entities but not from get_metadata.
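A rough sketch of what that second option could look like, in plain Python. Everything here is an assumption for illustration: `MetaEntry`, `entity_values`, and `metadata_for` are hypothetical names and not pybids' actual classes or schema; only the is_metadata and is_column flags come from the discussion above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class MetaEntry:
    """Hypothetical stand-in for an indexed key/value pair (not pybids' class)."""
    name: str
    value: Any
    is_metadata: bool = False
    is_column: bool = False  # proposed flag: True for .tsv column descriptions


def entity_values(entries, name):
    """Unique values for `name`, ignoring column-level meta-data."""
    return sorted({e.value for e in entries if e.name == name and not e.is_column})


def metadata_for(entries):
    """All meta-data, including column descriptions."""
    return {e.name: e.value for e in entries if e.is_metadata}


entries = [
    MetaEntry("task", "rest"),
    MetaEntry("task", "nback"),
    MetaEntry("task", {"Description": "Task label"}, is_metadata=True, is_column=True),
    MetaEntry("RepetitionTime", 2.0, is_metadata=True),
]

print(entity_values(entries, "task"))
print(metadata_for(entries))
```

With a flag like this, entity queries skip column descriptions while meta-data lookups still return them.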

adelavega · 2021-06-29T22:58:13Z

@effigies does the inheritance principle apply to tsv file meta-data? If not, then that's an argument that pybids should just exclude this type of meta-data from indexing.

effigies · 2021-06-29T23:15:44Z

As far as I know, it should apply.

adelavega · 2021-06-30T02:33:27Z

I suppose I could imagine how the inheritance principle might apply here, if all event files have the same columns across all subjects.

It's still weird to me that pybids would index this meta-data in a way that lets you filter on it.

For example, it makes no sense to filter tsv files based on the Description of a specific column within it.

oesteban · 2021-06-30T15:31:17Z

This is related to #684. A quick&dirty, ad-hoc workaround would be to filter the retrieved metadata to only provide str, along the same lines of #682 for runs.

EDIT: But yes, I agree with @effigies that there must be another source of metadata for the entity "session" that is inserting a dict in the db.

adelavega · 2021-06-30T19:20:40Z

@oesteban that's basically what I did but only excluded dicts.

The dicts are coming from tsv sidecars, since they allow dicts. If you describe a column called "session" in participants.json then there will be a match for the "session" entity that is a dict in the db.

oesteban · 2021-06-30T19:53:55Z

> If you describe a column called "task" in participants.tsv then there will be a match for the task entity that is a dict.

Yes, literally any metadata that defines a field whose name is a BIDS entity. This would be very problematic if, instead of a dict, such a "task" field were a valid str task identifier (e.g. "faketask"). Then, instead of an error, faketask would be returned as one more task by get_tasks().
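The two cases can be compared with a small plain-Python simulation. This is only a sketch under assumptions: the sidecar contents and the `get_unique` helper are hypothetical and do not reproduce pybids' indexing code; they just mimic "collect every value stored under the entity name, then drop dicts".

```python
import json

# Hypothetical sidecar contents for participants.tsv.
sidecar_dict = json.loads('{"task": {"Description": "Task the participant performed"}}')
sidecar_str = json.loads('{"task": "faketask"}')


def get_unique(entity_values, sidecar):
    """Collect unique "task" values, skipping dicts (hypothetical helper)."""
    values = list(entity_values)
    if "task" in sidecar:
        values.append(sidecar["task"])
    return sorted({v for v in values if not isinstance(v, dict)})


real_tasks = ["rest", "nback"]
with_dict = get_unique(real_tasks, sidecar_dict)  # dict is dropped
with_str = get_unique(real_tasks, sidecar_str)    # str slips through
print(with_dict, with_str)
```

Dropping dicts removes the error, but a string value under the same key is indistinguishable from a real task label in this scheme.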


adelavega · 2021-06-30T22:40:21Z

Yep.

Well, if nobody has any objections to this fix, I'll merge it, but I think in the long run the solution is to amend the meta-data model so that entries coming from tsv sidecars are distinguished from file-level meta-data.

effigies · 2021-07-01T03:24:57Z

Yeah, I'm okay with this short term, but can we open an issue to make sure it doesn't get forgotten?

oesteban · 2021-07-01T04:09:47Z

Before that, it would also be interesting to make a decision regarding #694.

adelavega · 2021-07-02T02:29:19Z

I think #694 sufficiently covers what I pointed out. I'm going to merge this as is, as it's relatively innocuous and in practice fixes the most glaring issues with this.

Exclude dicts from set

7a7baed

adelavega requested a review from effigies June 29, 2021 22:43

adelavega merged commit 8a9a5e5 into master Jul 2, 2021

adelavega deleted the fix/get branch July 2, 2021 02:29

adelavega added a commit to adelavega/pybids that referenced this pull request Jul 9, 2021

Apply patch bids-standard#748

75a8f93

adelavega mentioned this pull request Aug 3, 2021

Patch 0.13.x maint branch #763

Merged
