Incomplete metadata consensus #7

Lestropie · 2024-05-21T03:11:29Z

The Inheritance Principle currently permits the same metadata field to appear, with different values, in more than one of the metadata files applicable to a given data file. It is then up to the Inheritance Principle to define the order in which those metadata files are read, and therefore which value takes precedence for that particular data file.
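To make the precedence behaviour concrete, here is a minimal sketch of how overriding could play out once an order has been fixed. The function name `resolve_metadata` and the example field values are hypothetical, and the sketch assumes the applicable metadata files have already been loaded and sorted from lowest to highest precedence; it does not attempt to implement the ordering rules themselves.

```python
from typing import Any


def resolve_metadata(applicable: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge metadata dicts ordered from lowest to highest precedence.

    Assumption: the caller has already determined the order; later
    entries overwrite earlier ones field by field.
    """
    resolved: dict[str, Any] = {}
    for metadata in applicable:
        resolved.update(metadata)
    return resolved


# Hypothetical contents of two metadata files applicable to one data file.
dataset_level = {"RepetitionTime": 2.0, "EchoTime": 0.03}
subject_level = {"RepetitionTime": 2.5}

resolved = resolve_metadata([dataset_level, subject_level])
print(resolved)  # {'RepetitionTime': 2.5, 'EchoTime': 0.03}
```

The value of `RepetitionTime` that the data file ends up with depends entirely on the order of the list, which is why the wording of the precedence rule matters.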

The purpose of bids-standard/bids-specification#946 was to systematize this order of precedence, partly because if that order were to become more complex, e.g. as in bids-standard/bids-specification#1003, then the precise phrasing of how precedence is determined becomes crucial.

This capability of "overwriting" metadata from one file with that of another was potentially a poor decision in the construction of BIDS 1.0. Indeed, explicitly forbidding such overloading may be one way to help reach consensus on the future state of the Inheritance Principle. See: bids-standard/bids-2-devel#65

In the context of this software tool, specifically in determining the maximal utilisation of the IP, never overloading metadata fields is in fact the simplest approach from an algorithmic perspective. If the value of a metadata field differs for just one data file, then that field cannot be promoted up the hierarchy, even if to a human the result would be "less complex" overall were the field promoted and then overridden just for that one exceptional data file.
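A minimal sketch of that "no overloading" rule follows. It is not the tool's actual implementation: the function name `promotable_fields`, the file names, and the field values are all hypothetical, and the sketch considers a single directory level only. A field is promoted only when it is present in every data file with an identical value.

```python
from typing import Any


def promotable_fields(per_file: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Return the fields that every data file shares with an identical value."""
    files = list(per_file.values())
    if not files:
        return {}
    first, rest = files[0], files[1:]
    return {
        key: value
        for key, value in first.items()
        # A single absent or differing value blocks promotion of the field.
        if all(key in other and other[key] == value for other in rest)
    }


# Hypothetical metadata for three data files in the same directory.
per_file = {
    "sub-01_task-rest_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.03},
    "sub-02_task-rest_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.03},
    "sub-03_task-rest_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.035},
}

promoted = promotable_fields(per_file)
print(promoted)  # {'RepetitionTime': 2.0}
```

Here `EchoTime` stays in all three sidecar files because one file disagrees, which is exactly the situation where a human might have preferred promotion plus a single override.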

Having this software simply not utilise the IP for such a field, while giving the user no feedback about that fact, would not be ideal. It is likely warranted to draw the user's attention to metadata fields that could almost be promoted by the IP, but were prevented from being so by specific data files in which the field was either absent or had a different value to the consensus, given that this could indicate e.g. acquisitions where some sequence parameter has erroneously changed.

If the software had the capability to identify such cases, then it could:

- Provide a sorted list of those fields and data file exceptions that warrant manual inspection, as there may be an "unexpected" inconsistency in metadata.
- Optionally utilise the overriding capability of the IP where deemed appropriate.

This would need careful thought about how to devise an algorithm that identifies such cases, and about a quantitative metric that captures the probability of a candidate metadata field being erroneously inconsistent.
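As a starting point for discussion only, the sketch below flags fields whose most common value is shared by most, but not all, data files. The agreement fraction used here is a crude stand-in and not the probability metric described above; the function name `near_consensus`, the `threshold` parameter, and the example data are all hypothetical.

```python
import json
from collections import Counter
from typing import Any, Optional


def near_consensus(
    per_file: dict[str, dict[str, Any]], threshold: float = 0.75
) -> list[tuple[str, float, list[str]]]:
    """List (field, agreement, exception files), highest agreement first.

    Agreement is the fraction of all data files that hold the modal value.
    Fields in full agreement are skipped, since they can simply be promoted.
    """
    n_files = len(per_file)
    fields = sorted({key for metadata in per_file.values() for key in metadata})
    report = []
    for field in fields:
        # JSON-encode values so that lists and dicts can be counted;
        # None marks a data file in which the field is absent.
        encoded: dict[str, Optional[str]] = {
            name: json.dumps(metadata[field], sort_keys=True)
            if field in metadata
            else None
            for name, metadata in per_file.items()
        }
        counts = Counter(value for value in encoded.values() if value is not None)
        modal, count = counts.most_common(1)[0]
        agreement = count / n_files
        if threshold <= agreement < 1.0:
            exceptions = sorted(
                name for name, value in encoded.items() if value != modal
            )
            report.append((field, agreement, exceptions))
    report.sort(key=lambda item: (-item[1], item[0]))
    return report


# Hypothetical metadata: one file differs in RepetitionTime and lacks FlipAngle.
per_file = {
    "run-1_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.03, "FlipAngle": 90},
    "run-2_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.03, "FlipAngle": 90},
    "run-3_bold.nii.gz": {"RepetitionTime": 2.0, "EchoTime": 0.03, "FlipAngle": 90},
    "run-4_bold.nii.gz": {"RepetitionTime": 2.5, "EchoTime": 0.03},
}

report = near_consensus(per_file)
for field, agreement, exceptions in report:
    print(f"{field}: {agreement:.0%} agreement, exceptions: {exceptions}")
```

A simple fraction treats an absent field and a differing value as equally suspicious, and ignores how many data files there are in total; a more careful metric would probably need to weigh both.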

Lestropie added the "enhancement" label (New feature or request) on May 21, 2024
