[ENH] Add warning about deidentification when sharing sourcedata #1769

Merged
merged 5 commits into from
Apr 22, 2024
22 changes: 22 additions & 0 deletions src/common-principles.md
@@ -309,6 +309,28 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
}
```

!!! danger "Caution"

Sharing source data may help correct errors and recover missing data discovered
only when the raw dataset is reused in practice.
Therefore, from an Open Science perspective, it is RECOMMENDED to share
the source data whenever possible.

However, more stringent sharing limitations may apply to the source data
than those applicable to the raw data.
For example, human data almost always require deidentification
before they can be redistributed,
or the subjects' consent form may not have explicitly stated that the source files
would be shared after deidentification.
Further examples in which sharing source data may not be possible
include original data formats that are not redistributable
as per the acquisition device's license.

As with raw data, all regulatory, ethical, and legal aspects SHOULD
be carefully considered before sharing data
through the `sourcedata/` directory mechanism.
In the case of source data, these requirements are likely to be more stringent.

### Storage of derived datasets

Derivatives can be stored/distributed in two ways: