[ENH] Add a Consistency principle to Common principles #1530

oesteban · 2023-06-27T10:53:42Z

Explicitly adds a new Consistency principle which I think is implicit in the motivation and other principles. Still, I believe we need to make it explicit to establish a shared standpoint on the interpretation of the specs.

The reason I think we want to discuss about this proposed "Consistency principle" is that interpretation of modifications, etc. may change dramatically. I informally voiced this here #105 (comment) and at least for that particular PR to go forward, I think we want to make several of these basic agreements/conventions explicit.

Explicitly adds a new Consistency principle which I think it is implicit in the motivation and other principles, but I believe we need to make explicit to establish a shared standpoint to the interpretation of the specs.

codecov · 2023-06-27T10:57:28Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f2cea0f) 88.00% compared to head (b9432e8) 87.83%.
Report is 135 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1530      +/-   ##
==========================================
- Coverage   88.00%   87.83%   -0.18%     
==========================================
  Files          16       16              
  Lines        1326     1356      +30     
==========================================
+ Hits         1167     1191      +24     
- Misses        159      165       +6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

effigies · 2023-06-27T13:20:47Z

Do we have examples where this isn't followed and has caused problems? Or is this a guideline that rule crafters should ensure remains the easiest behavior for users?

How does the development of two mechanisms, for example, B0FieldIdentifier and IntendedFor comport with this principle? Does the increased expressiveness justify an exception? Is it an exception? As another example, BIDS-URIs are intended to unify (and expand) existing file links; would that be acceptable under this rule?

yarikoptic · 2023-06-27T13:37:42Z

I think BIDS is not there (yet) in terms of rules for "conversion" (what subject label to choose if there is no subject ID in dicom? what session label to choose?) to establish such somewhat stringent rule. Having said that, it might be something to inspire and strive for.
But then in current formulation "virtually" is not defined, and thus possibly adds just confusion and not clarification really, hence obliterating the value of the principle. If it could be described as a set of rules/algorithmic descriptions ("chosen labels need to have 1-to-1 mapping between conversions", "set of entities present be identical between conversions", etc) then it would be formalized nicely, but (again) I am not yet 100% on board with it. E.g. I might have liked something like sub-X_rec-res1_desc-filt1_T1w.nii.gz, sub-X_rec-res1_desc-filt2_T2w.nii.gz to signal that those two were acquired at the same resolution but filtered differently, where neither rec nor desc would be strictly needed since suffixes differ. So I still think that there is some value in researcher ability to be "expressive" in pointing to some important distinctions/metadata.

oesteban · 2023-06-27T13:39:54Z

Do we have examples where this isn't followed and has caused problems? Or is this a guideline that rule crafters should ensure remains the easiest behavior for users?

If this is a principle we all share, it should be stated as such, and then act upon examples and problems. Nonetheless, I mentioned #105 where I felt there are several interpretations of the specs' spirit that signify we may not agree on this simple (or derivation thereof) articulation of "consistency".

In fact, I have several minor edits here and there prepared in case this is accepted as a foundational standpoint. I was planning to propose those on a subsequent PR, so that there's no conflict between discussing this and secondary effects. I believe we should agree on the basics first before we go into the details.

How does the development of two mechanisms, for example, B0FieldIdentifier and IntendedFor comport with this principle? Does the increased expressiveness justify an exception? Is it an exception? As another example, BIDS-URIs are intended to unify (and expand) existing file links; would that be acceptable under this rule?

While I agree accepting this modification opens the door to new decisions and potential conflicts such as those you bring up, my point here is that this addition is necessary to better support currently unclear and, more worryingly, hard to interpret aspects of the specs.

That said, as it reads right now, I don't think this has any implication in the B0FieldIdentifier vs. IntendedFor as they are optional (though the spirit is openly toward trying to make a more directive stance in this regard). Unfortunately, I am not familiar enough with BIDS-URIs, but I don't see why it would not be acceptable under this principle (i.e., not a rule).

oesteban · 2023-06-27T13:54:39Z

I think BIDS is not there (yet) in terms of rules for "conversion" (what subject label to choose if there is no subject ID in dicom? what session label to choose?) to establish such somewhat stringent rule. Having said that, it might be something to inspire and strive for.

I apologize in advance if the current articulation is unclear and I'm totally open to suggestions to make this more accurate and unambiguous. Indeed, the goal here is to say: BIDS allows some degrees of freedom to choose whatever participant name, session identifier, task name (except for rest), acquisition, etc. etc. but, saving those, two experts on BIDS should give you the same dataset structurally. This could be more specific as to use the same entities (but allowing differing actual values for those choices).

But then in current formulation "virtually" is not defined, and thus possibly adds just confusion and not clarification really, hence obliterating the value of the principle.

It is defined, but I may have a poor job at articulating it - please help here to make it better. The previous point is going in this direction

If it could be described as a set of rules/algorithmic descriptions ("chosen labels need to have 1-to-1 mapping between conversions", "set of entities present be identical between conversions", etc) then it would be formalized nicely, but (again) I am not yet 100% on board with it. E.g. I might have liked something like sub-X_rec-res1_desc-filt1_T1w.nii.gz, sub-X_rec-res1_desc-filt2_T2w.nii.gz to signal that those two were acquired at the same resolution but filtered differently, where neither rec nor desc would be strictly needed since suffixes differ. So I still think that there is some value in researcher ability to be "expressive" in pointing to some important distinctions/metadata.

This is a principle, so it should not be confused with a rule. I'm proposing it as such because we need to establish a starting point to be able to be specific on the specific ruling.

As it is written now, again, I don't think this is against your example. But I agree it enforces that such a rule is written to explicitly allow having filt1 and filt2 of your example. And we will meet there - you know my standpoint on this one. But, either if you convince the rest of the community or I do, the fact that two researchers encoding this dataset would choose to use rec- and desc- (or none of them because there's no ambiguity, which would be my point here) in this case is necessary. I don't think this limits whether desc-filt1 or desc-filterA is specifically chosen, it attempts to set a framework for future discussions.

oesteban · 2023-06-27T13:57:03Z

Isn't a fundamental principle of replicability that given the same dataset, the same method should give you the same output no matter the platform you run it in or the operator?

Remi-Gau · 2023-06-27T14:23:30Z

Kind of agreeing that the lack of definition / operationalization of "virtually" is a bit of a problem.

I think that people may organize their dataset differently depending on what they want to do with it. I am especially thinking that people may group things in sessions quite differently given that sessions are just a logical grouping.

One can imagine that the same following data could easily be organized in different ways:

participants came on 2 different days: one for EEG and one for MRI
participants did 2 tasks: one is common for eeg and fmri, the other only exists for fmri

The 3 following are technically possible and allowed by BIDS: and I am sure we can think of others.

├── ds1
│   └── sub-01
│       ├── anat
│       ├── eeg
│       └── func
├── ds2
│   └── sub-01
│       ├── ses-eeg
│       │   └── eeg
│       └── ses-mri
│           ├── anat
│           └── func
└── ds3
    └── sub-01
        ├── ses-anat
        ├── ses-task1
        │   └── func
        └── ses-task2
            ├── eeg
            └── func

Would you call those virtually identical?

BTW I agree that some may be more "logical" or "desirable" but then I think what we need are more like "guidelines" or recommendation to augment the starter kit rather than adding rules in the spec that are going to be very hard to enforce.

oesteban · 2023-06-27T14:33:16Z

Would you call those virtually identical?

No - but there are two different concepts conflated. If participants came on 2 different days, then this principle would open the door to ~~stipulate the (currently not existing) rule that two different experiments must be organized with different sessions (even if they are done on the same day but one after another)~~. Precisely to limit that degree of freedom of the encoder.

Example two - participants did 2 tasks. is this in the same session (i.e., simultaneous fMRI and EEG) or different sessions. Although I agree it is possible, I find that your ds3 should explicitly be disallowed in the specs founded on this very principle, hence justifying this PR.

What I find "virtually" equivalent is this:

├── ds1
│   └── sub-01
│       ├── anat
│       ├── eeg
│       └── func

├── ds2
│   └── sub-AA
│       ├── anat
│       ├── eeg
│       └── func

or

├── ds1
│   └── sub-01
│       ├── ses-eeg
│       │   └── eeg
│       └── ses-mri
│           ├── anat
│           └── func
├── ds2
│   └── sub-01
│       ├── ses-1
│       │   └── eeg
│       └── ses-2
│           ├── anat
│           └── func

EDIT: I have checked, and BIDS is currently a bit more specific than I thought:

Session: A logical grouping of neuroimaging and behavioral data consistent across subjects. Session can (but doesn't have to) be synonymous to a visit in a longitudinal study. In general, subjects will stay in the scanner during one session. However, for example, if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects. Similarly, in situations where different data types are obtained over several visits (for example fMRI on one day followed by DWI the day after) those can be grouped in one session. Defining multiple sessions is appropriate when several identical or similar data acquisitions are planned and performed on all -or most- subjects, often in the case of some intervention between sessions (for example, training). In the PET context, a session may also indicate a group of related scans, taken in one or more visits.

I still see ds3 borderline within the definition of "logical grouping [...] consistent across subjects". The example "if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects" as there's not even a re-positioning between tasks across sessions. BIDS defines task as required entity, so I would think ds3 should be considered an invalid "hack" (akin to organizing different sessions as different subjects in the dataset) with current specs.

BTW I agree that some may be more "logical" or "desirable" but then I think what we need are more like "guidelines" or recommendation to augment the starter kit rather than adding rules in the spec that are going to be very hard to enforce.

Disagree. Rules may be hard to enforce. Principles just guide progress.

enh: add a Consistency principle

3daf358

Explicitly adds a new Consistency principle which I think it is implicit in the motivation and other principles, but I believe we need to make explicit to establish a shared standpoint to the interpretation of the specs.

oesteban requested a review from DimitriPapadopoulos as a code owner June 27, 2023 10:53

oesteban requested a review from effigies June 27, 2023 10:54

oesteban assigned yarikoptic Jun 27, 2023

fix: add newline to close quote block

7e6bf10

oesteban requested a review from yarikoptic June 27, 2023 12:15

oesteban unassigned yarikoptic Jun 27, 2023

enh: add breaklines

b9432e8

Remi-Gau added the consistency Spec is (potentially) inconsistent label Dec 22, 2023

oesteban mentioned this pull request Feb 14, 2024

BEP Proposal: Atlas specification #1281

Open

oesteban mentioned this pull request Jul 8, 2024

Alternative proposal to BEP038 #1856

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Add a Consistency principle to Common principles #1530

[ENH] Add a Consistency principle to Common principles #1530

oesteban commented Jun 27, 2023

codecov bot commented Jun 27, 2023 •

edited

Loading

effigies commented Jun 27, 2023

yarikoptic commented Jun 27, 2023

oesteban commented Jun 27, 2023 •

edited

Loading

oesteban commented Jun 27, 2023

oesteban commented Jun 27, 2023

Remi-Gau commented Jun 27, 2023

oesteban commented Jun 27, 2023 •

edited

Loading

[ENH] Add a Consistency principle to Common principles #1530

Are you sure you want to change the base?

[ENH] Add a Consistency principle to Common principles #1530

Conversation

oesteban commented Jun 27, 2023

codecov bot commented Jun 27, 2023 • edited Loading

Codecov Report

effigies commented Jun 27, 2023

yarikoptic commented Jun 27, 2023

oesteban commented Jun 27, 2023 • edited Loading

oesteban commented Jun 27, 2023

oesteban commented Jun 27, 2023

Remi-Gau commented Jun 27, 2023

oesteban commented Jun 27, 2023 • edited Loading

codecov bot commented Jun 27, 2023 •

edited

Loading

oesteban commented Jun 27, 2023 •

edited

Loading

oesteban commented Jun 27, 2023 •

edited

Loading