Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add a Consistency principle to Common principles #1530

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

oesteban
Copy link
Collaborator

Explicitly adds a new Consistency principle which I think is implicit in the motivation and other principles. Still, I believe we need to make it explicit to establish a shared standpoint on the interpretation of the specs.

The reason I think we want to discuss about this proposed "Consistency principle" is that interpretation of modifications, etc. may change dramatically. I informally voiced this here #105 (comment) and at least for that particular PR to go forward, I think we want to make several of these basic agreements/conventions explicit.

Explicitly adds a new Consistency principle which I think it is implicit
in the motivation and other principles, but I believe we need to make
explicit to establish a shared standpoint to the interpretation of the
specs.
@codecov
Copy link

codecov bot commented Jun 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f2cea0f) 88.00% compared to head (b9432e8) 87.83%.
Report is 135 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1530      +/-   ##
==========================================
- Coverage   88.00%   87.83%   -0.18%     
==========================================
  Files          16       16              
  Lines        1326     1356      +30     
==========================================
+ Hits         1167     1191      +24     
- Misses        159      165       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@effigies
Copy link
Collaborator

Do we have examples where this isn't followed and has caused problems? Or is this a guideline that rule crafters should ensure remains the easiest behavior for users?

How does the development of two mechanisms, for example, B0FieldIdentifier and IntendedFor comport with this principle? Does the increased expressiveness justify an exception? Is it an exception? As another example, BIDS-URIs are intended to unify (and expand) existing file links; would that be acceptable under this rule?

@yarikoptic
Copy link
Collaborator

I think BIDS is not there (yet) in terms of rules for "conversion" (what subject label to choose if there is no subject ID in dicom? what session label to choose?) to establish such somewhat stringent rule. Having said that, it might be something to inspire and strive for.
But then in current formulation "virtually" is not defined, and thus possibly adds just confusion and not clarification really, hence obliterating the value of the principle. If it could be described as a set of rules/algorithmic descriptions ("chosen labels need to have 1-to-1 mapping between conversions", "set of entities present be identical between conversions", etc) then it would be formalized nicely, but (again) I am not yet 100% on board with it. E.g. I might have liked something like sub-X_rec-res1_desc-filt1_T1w.nii.gz, sub-X_rec-res1_desc-filt2_T2w.nii.gz to signal that those two were acquired at the same resolution but filtered differently, where neither rec nor desc would be strictly needed since suffixes differ. So I still think that there is some value in researcher ability to be "expressive" in pointing to some important distinctions/metadata.

@oesteban
Copy link
Collaborator Author

oesteban commented Jun 27, 2023

Do we have examples where this isn't followed and has caused problems? Or is this a guideline that rule crafters should ensure remains the easiest behavior for users?

If this is a principle we all share, it should be stated as such, and then act upon examples and problems. Nonetheless, I mentioned #105 where I felt there are several interpretations of the specs' spirit that signify we may not agree on this simple (or derivation thereof) articulation of "consistency".

In fact, I have several minor edits here and there prepared in case this is accepted as a foundational standpoint. I was planning to propose those on a subsequent PR, so that there's no conflict between discussing this and secondary effects. I believe we should agree on the basics first before we go into the details.

How does the development of two mechanisms, for example, B0FieldIdentifier and IntendedFor comport with this principle? Does the increased expressiveness justify an exception? Is it an exception? As another example, BIDS-URIs are intended to unify (and expand) existing file links; would that be acceptable under this rule?

While I agree accepting this modification opens the door to new decisions and potential conflicts such as those you bring up, my point here is that this addition is necessary to better support currently unclear and, more worryingly, hard to interpret aspects of the specs.

That said, as it reads right now, I don't think this has any implication in the B0FieldIdentifier vs. IntendedFor as they are optional (though the spirit is openly toward trying to make a more directive stance in this regard). Unfortunately, I am not familiar enough with BIDS-URIs, but I don't see why it would not be acceptable under this principle (i.e., not a rule).

@oesteban
Copy link
Collaborator Author

I think BIDS is not there (yet) in terms of rules for "conversion" (what subject label to choose if there is no subject ID in dicom? what session label to choose?) to establish such somewhat stringent rule. Having said that, it might be something to inspire and strive for.

I apologize in advance if the current articulation is unclear and I'm totally open to suggestions to make this more accurate and unambiguous. Indeed, the goal here is to say: BIDS allows some degrees of freedom to choose whatever participant name, session identifier, task name (except for rest), acquisition, etc. etc. but, saving those, two experts on BIDS should give you the same dataset structurally. This could be more specific as to use the same entities (but allowing differing actual values for those choices).

But then in current formulation "virtually" is not defined, and thus possibly adds just confusion and not clarification really, hence obliterating the value of the principle.

It is defined, but I may have a poor job at articulating it - please help here to make it better. The previous point is going in this direction

If it could be described as a set of rules/algorithmic descriptions ("chosen labels need to have 1-to-1 mapping between conversions", "set of entities present be identical between conversions", etc) then it would be formalized nicely, but (again) I am not yet 100% on board with it. E.g. I might have liked something like sub-X_rec-res1_desc-filt1_T1w.nii.gz, sub-X_rec-res1_desc-filt2_T2w.nii.gz to signal that those two were acquired at the same resolution but filtered differently, where neither rec nor desc would be strictly needed since suffixes differ. So I still think that there is some value in researcher ability to be "expressive" in pointing to some important distinctions/metadata.

This is a principle, so it should not be confused with a rule. I'm proposing it as such because we need to establish a starting point to be able to be specific on the specific ruling.

As it is written now, again, I don't think this is against your example. But I agree it enforces that such a rule is written to explicitly allow having filt1 and filt2 of your example. And we will meet there - you know my standpoint on this one. But, either if you convince the rest of the community or I do, the fact that two researchers encoding this dataset would choose to use rec- and desc- (or none of them because there's no ambiguity, which would be my point here) in this case is necessary. I don't think this limits whether desc-filt1 or desc-filterA is specifically chosen, it attempts to set a framework for future discussions.

@oesteban
Copy link
Collaborator Author

Isn't a fundamental principle of replicability that given the same dataset, the same method should give you the same output no matter the platform you run it in or the operator?

@Remi-Gau
Copy link
Collaborator

Kind of agreeing that the lack of definition / operationalization of "virtually" is a bit of a problem.

I think that people may organize their dataset differently depending on what they want to do with it. I am especially thinking that people may group things in sessions quite differently given that sessions are just a logical grouping.

One can imagine that the same following data could easily be organized in different ways:

  • participants came on 2 different days: one for EEG and one for MRI
  • participants did 2 tasks: one is common for eeg and fmri, the other only exists for fmri

The 3 following are technically possible and allowed by BIDS: and I am sure we can think of others.

├── ds1
│   └── sub-01
│       ├── anat
│       ├── eeg
│       └── func
├── ds2
│   └── sub-01
│       ├── ses-eeg
│       │   └── eeg
│       └── ses-mri
│           ├── anat
│           └── func
└── ds3
    └── sub-01
        ├── ses-anat
        ├── ses-task1
        │   └── func
        └── ses-task2
            ├── eeg
            └── func

Would you call those virtually identical?

BTW I agree that some may be more "logical" or "desirable" but then I think what we need are more like "guidelines" or recommendation to augment the starter kit rather than adding rules in the spec that are going to be very hard to enforce.

@oesteban
Copy link
Collaborator Author

oesteban commented Jun 27, 2023

Would you call those virtually identical?

No - but there are two different concepts conflated. If participants came on 2 different days, then this principle would open the door to stipulate the (currently not existing) rule that two different experiments must be organized with different sessions (even if they are done on the same day but one after another). Precisely to limit that degree of freedom of the encoder.

Example two - participants did 2 tasks. is this in the same session (i.e., simultaneous fMRI and EEG) or different sessions. Although I agree it is possible, I find that your ds3 should explicitly be disallowed in the specs founded on this very principle, hence justifying this PR.

What I find "virtually" equivalent is this:

├── ds1
│   └── sub-01
│       ├── anat
│       ├── eeg
│       └── func

├── ds2
│   └── sub-AA
│       ├── anat
│       ├── eeg
│       └── func

or

├── ds1
│   └── sub-01
│       ├── ses-eeg
│       │   └── eeg
│       └── ses-mri
│           ├── anat
│           └── func
├── ds2
│   └── sub-01
│       ├── ses-1
│       │   └── eeg
│       └── ses-2
│           ├── anat
│           └── func

EDIT: I have checked, and BIDS is currently a bit more specific than I thought:

Session: A logical grouping of neuroimaging and behavioral data consistent across subjects. Session can (but doesn't have to) be synonymous to a visit in a longitudinal study. In general, subjects will stay in the scanner during one session. However, for example, if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects. Similarly, in situations where different data types are obtained over several visits (for example fMRI on one day followed by DWI the day after) those can be grouped in one session. Defining multiple sessions is appropriate when several identical or similar data acquisitions are planned and performed on all -or most- subjects, often in the case of some intervention between sessions (for example, training). In the PET context, a session may also indicate a group of related scans, taken in one or more visits.

I still see ds3 borderline within the definition of "logical grouping [...] consistent across subjects". The example "if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects" as there's not even a re-positioning between tasks across sessions. BIDS defines task as required entity, so I would think ds3 should be considered an invalid "hack" (akin to organizing different sessions as different subjects in the dataset) with current specs.

BTW I agree that some may be more "logical" or "desirable" but then I think what we need are more like "guidelines" or recommendation to augment the starter kit rather than adding rules in the spec that are going to be very hard to enforce.

Disagree. Rules may be hard to enforce. Principles just guide progress.

@Remi-Gau Remi-Gau added the consistency Spec is (potentially) inconsistent label Dec 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
consistency Spec is (potentially) inconsistent
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants