Multi-site/center studies #11

Open
tsalo opened this issue Aug 3, 2020 · 12 comments
Labels
folder-structure Proposals to reorganize files in the specification. impact: high Estimated high impact change

Comments

@tsalo
Member

tsalo commented Aug 3, 2020

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bids-discussion/SwH-1KRnBU0/oCx0ynEpBAAJ

Our group would also be interested in finding a way to incorporate a way of distinguishing between sites for the datasets we work with. I personally would prefer to use the key 'site-' rather than 'centre-', as 'centre' is not a shared spelling between British and American English.

Maybe overall, all the participants from a single site could also live within a directory for that site. So the directory structure might be amended to look like:

/site-<site_label>/sub-<participant_label>/[ses-<session_label>/]
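A minimal sketch of how the proposed site-first layout could be assembled. The helper name `site_first_path` is hypothetical and not part of BIDS or any BIDS tool; it only illustrates the ordering of the directories.

```python
from pathlib import PurePosixPath
from typing import Optional


def site_first_path(site: str, sub: str, ses: Optional[str] = None) -> PurePosixPath:
    """Build site-<label>/sub-<label>/[ses-<label>/] (hypothetical helper)."""
    parts = [f"site-{site}", f"sub-{sub}"]
    # The session directory is optional, mirroring the square brackets above.
    if ses is not None:
        parts.append(f"ses-{ses}")
    return PurePosixPath(*parts)
```

For example, `site_first_path("A", "01", "02")` yields the relative path `site-A/sub-01/ses-02`.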

Original authors: @jpellman

@tsalo
Member Author

tsalo commented Aug 3, 2020

@thomasbeaudry wrote:

why not add the site to the JSON file?
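A rough sketch of what this could look like, assuming the site is recorded in the `InstitutionName` field that BIDS already defines for sidecar JSON files. Whether that field alone is enough to identify a site is an assumption, and the helper name `add_site_metadata` is hypothetical.

```python
import json


def add_site_metadata(sidecar: dict, institution: str) -> dict:
    """Return a copy of a sidecar dict with the site stored as InstitutionName."""
    updated = dict(sidecar)  # leave the caller's dict untouched
    updated["InstitutionName"] = institution
    return updated


# Illustrative sidecar content; the values are made up for this example.
example = add_site_metadata({"RepetitionTime": 2.0}, "Example Institute")
example_json = json.dumps(example, indent=2)
```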

@tsalo
Member Author

tsalo commented Aug 3, 2020

@Athanasiamo wrote:

why not add the site to the JSON file?

I have multi-site data and found Chris' advice to very much fit our data and the way our researchers use it. Though, we are not a "true" multi-site study, I believe this is a very nice solution (though produces long file names):
https://groups.google.com/forum/#!topic/bids-discussion/WV-weqTusNQ

@tsalo
Member Author

tsalo commented Aug 3, 2020

@alexandreroutier wrote:

why not add the site to the JSON file?

I currently have to work with a multi-site study and in my experience, the simplest and quite efficient solution was to embed the site into the <participant_label> e.g. sub-01CLNC001 where "01" is the site id and "CLNC001" the participant label and I am quite satisfied with that.

However, the only drawback I see in this approach is that it assumes that the participant does not change site during the study. If participants do change site, we could consider embedding the site in the session label instead.
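A small sketch of splitting such a combined label back into its parts, assuming the site id is always a fixed-width prefix (two characters in the example above). The function and type names are hypothetical.

```python
from typing import NamedTuple


class SiteParticipant(NamedTuple):
    site: str
    participant: str


def split_participant_label(label: str, site_width: int = 2) -> SiteParticipant:
    """Split e.g. 'sub-01CLNC001' into site '01' and participant 'CLNC001'."""
    # Accept the label with or without the 'sub-' prefix.
    if label.startswith("sub-"):
        label = label[len("sub-"):]
    if len(label) <= site_width:
        raise ValueError(f"label too short to contain a site id: {label!r}")
    return SiteParticipant(label[:site_width], label[site_width:])
```

This only works while the site-id width is fixed across the whole dataset, which is one reason the scheme is fragile when participants move between sites.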

@tsalo
Member Author

tsalo commented Aug 3, 2020

@chrisgorgo wrote:

This is also one of the recommendations from the main spec. See https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit#heading=h.29tn5cduh4ci

@tsalo
Member Author

tsalo commented Aug 3, 2020

@TKoscik wrote:

from my point of view, site is a critical piece of information that generally denotes batch effects within a particular project. For example, in the ABCD protocol there will be batch effects between sites and scanners and scanner software revisions, etc.

Also, a site variable varies independently from subject and session, so lumping this information with other variables seems incorrect.

Having a separate folder for each site also seems to place this variable at the wrong level of analysis, as it is a within-project, and potentially within-subjects, variable. Not to mention that changing the folder structure will impact scripts more than a filename change.

Currently we are using the site tag to denote a combination of site, vendor, and relevant scanner/software changes. The benefit is that this makes the need to explore/correct for batch effects immediately visible. We are using a 5-digit code:

  • first 2 digits, identify institute, e.g., UIHC = 00
  • 3rd digit identifies the scanner vendor, e.g., 1=Siemens, 2=GE, 3=Philips
  • last two digits, identify scanner and software version (and other major changes on each scanner that might cause batch effects in images), e.g., initial scanner setup for a given scanner is 00, every relevant change increments this number.
  • e.g., filename = sub-123_ses-12345six_site-00201_*
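The 5-digit scheme above could be decoded roughly as follows. This is only a sketch: the vendor table copies the three examples given in the list, and the function name `decode_site_code` is hypothetical.

```python
import re

# Vendor digits as listed in the comment above; other digits are unknown here.
VENDORS = {"1": "Siemens", "2": "GE", "3": "Philips"}


def decode_site_code(code: str) -> dict:
    """Split a 5-digit site code into institute, vendor, and revision."""
    if not re.fullmatch(r"[0-9]{5}", code):
        raise ValueError(f"expected exactly five digits, got {code!r}")
    return {
        "institute": code[:2],                       # first 2 digits
        "vendor": VENDORS.get(code[2], "unknown"),   # 3rd digit
        "revision": int(code[3:]),                   # last 2 digits
    }
```

Applied to the example filename, `decode_site_code("00201")` gives institute `"00"`, vendor `"GE"`, and revision `1`.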

@tsalo
Member Author

tsalo commented Aug 3, 2020

@HenkMutsaerts wrote:

I personally would prefer to use the key 'site-' rather than 'centre-', as 'centre' is not a shared spelling between British and American English.

I agree that this is a nice contribution to BIDS. However, in a multi-site study, 1 site can still have multiple scanners, and in most image processing you want to correct per scanner. So instead of 'site', you could consider 'scanner'

@tsalo
Member Author

tsalo commented Aug 3, 2020

@pvdemael wrote:

I agree that this is a nice contribution to BIDS. However, in a multi-site study, 1 site can still have multiple scanners, and in most image processing you want to correct per scanner. So instead of 'site', you could consider 'scanner'

This can easily be added to the participants file by adding columns for sites and scanners. IMHO the scanner is very important information, but it does not belong in the filename, which is more oriented towards features of the image.
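A sketch of the participants-file approach, assuming extra columns named `site` and `scanner` (these column names and the example rows are illustrative, not defined by BIDS; only `participant_id` is the standard first column).

```python
import csv
import io

FIELDS = ["participant_id", "site", "scanner"]

# Made-up rows for illustration only.
EXAMPLE_ROWS = [
    {"participant_id": "sub-01", "site": "siteA", "scanner": "scanner1"},
    {"participant_id": "sub-02", "site": "siteA", "scanner": "scanner2"},
    {"participant_id": "sub-03", "site": "siteB", "scanner": "scanner1"},
]


def write_participants_tsv(rows, handle):
    """Write rows as tab-separated values with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def scanners_by_site(handle):
    """Map each site to the set of scanners used there."""
    result = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        result.setdefault(row["site"], set()).add(row["scanner"])
    return result
```

One limitation of this layout is that it has one row per participant, so it cannot express a participant who was scanned at more than one site or scanner.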

@tsalo tsalo added folder-structure Proposals to reorganize files in the specification. impact: high Estimated high impact change labels Aug 3, 2020
@jbteves

jbteves commented Aug 5, 2020

My view is that the subject ID should generally encapsulate the site anyway, as discussed above. Additionally, the scanner could be embedded as part of the metadata.

@drmowinckels

drmowinckels commented Aug 10, 2020

Hi.

We have ended up using something similar to what @tsalo describes. We ended up incorporating site into the session tags, because we have subjects that have participated in multiple sites over time.

For us now, the session tag numerals count the sequential scans, while the alphabeticals indicate the site
ex. sub-1200131_ses-01siteScanner
And the site-information is a combination of site acronym and scanner

We found this solution to be the best also because of our longitudinal data. For our specific data, we have decided to change the meaning of "session" from standard MRI terms, so that the numerals in the session mean "longitudinal timepoint" rather than scan session. This was necessary to make a system that made it easy for our staff to recognise what cognitive data fits with which scan data. This was extra important also because some of our participants have, within the same longitudinal timepoint, visited several scanners in a single day, so we could try to estimate the error in measurements when participants switch from one scanner to another due to upgrades.

So we could have

  • sub-1200131_ses-01scannerOne
  • sub-1200131_ses-02scannerOne
  • sub-1200131_ses-02scannerTwo
  • sub-1200131_ses-03scannerTwo
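A sketch of how session labels in this style could be split into timepoint and scanner, assuming the label is always digits followed by letters as in the examples above. The function names are hypothetical.

```python
import re


def parse_session_label(label: str) -> tuple:
    """Split e.g. 'ses-02scannerTwo' into (2, 'scannerTwo')."""
    match = re.fullmatch(r"(?:ses-)?([0-9]+)([A-Za-z]+)", label)
    if match is None:
        raise ValueError(f"unexpected session label: {label!r}")
    return int(match.group(1)), match.group(2)


def scanners_per_timepoint(labels) -> dict:
    """Group scanner names by longitudinal timepoint."""
    grouped = {}
    for label in labels:
        timepoint, scanner = parse_session_label(label)
        grouped.setdefault(timepoint, []).append(scanner)
    return grouped
```

With the four sessions listed above, timepoint 2 maps to both `scannerOne` and `scannerTwo`, which is the same-timepoint scanner switch described in the comment.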

@yarikoptic
Contributor

#54 could provide a generic way to support overall desired layout for this. The issue could be split into two:

@yarikoptic
Contributor

Also somewhat relates to https://bids.neuroimaging.io/bep035, where the idea is to aggregate across studies, so it adds a study entity. But the semantics are probably different enough to warrant having both study and site.

@yarikoptic
Contributor

yarikoptic commented Apr 24, 2024

other pieces of feedback:

  • site is IMHO better than scanner since not MR/CT/...-specific
  • site is better than center since "site" is more generic than "center".
  • indeed site can be incorporated either into sub or into ses depending on the use case, and its utility remains not highly demanded ATM IMHO. At large there is already reliance on some metadata field to tell apart different acquisition equipment "samples", versions of their software, etc . But they are largely spread out and not formalized. I see formalization of site as "data acquisition site" as a combination which would give us sites.tsv and sites.json to aggregate/summarize (see Replace "inheritance" with "summarization" principle #65) metadata specific to the data acquisition sites.
  • there was a promise stated in BIDS 1.0 about "multi-site" support since "the initial markdown" bids-standard/bids-specification@a364add#diff-82b82851a7dbfdcaec1452e9b42e7d12522fe6caec85998bc85e2c7cf3b341afR2477 -- so we better "deliver"
  • IMHO there is no point to recommend addition of _site- entity outside of the BIDS 2.0 since there is no "the level" site should be used - could be over subjects (site-/sub-) or under subjects (sub-/site- in particular e.g. for "traveling human phantom" etc), so at large depends on having
  • edit: I would consider site entity as OPTIONAL and likely after subject entity in the ordering (we can have multiple sessions within site but not multiple sites within session for the same subject. There could be multiple subjects within a session across sites though in some "hyperscanning" etc studies). But as "OPTIONAL" it would likely (we are yet to formalize) require "manual" specification of the placement within Make it possible to specify folders layout to be other than sub-{label}/[ses-{label}/] #54 solution.

yarikoptic added a commit to yarikoptic/bids-specification that referenced this issue Apr 24, 2024
effigies pushed a commit to bids-standard/bids-specification that referenced this issue May 22, 2024
…dies in a single dataset (#1803)

* Fix syntax (add closing ')') + point to specific issue for multi-site

* Add additional way (session-level) for multi-site studies

Reflecting up on discussion within bids-standard/bids-2-devel#11

---------

Co-authored-by: Taylor Salo <[email protected]>
Projects
Status: Todo