[FIX] Update HED appendix to comply with current HED version (#970)
* First pass at revision of Appendix III to update to HED-3G

* Updated the Appendix III description of HED versioning

* Updated the markdown format

* Fixed lint errors in the HED.md

* Minor syntax changes

* Added more detail about column inheritance.

* Got rid of e.g.

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* A few minor rewordings for recommendations

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Corrected confusion about lower directories

* Added a link to the hed-examples repository in hed-standard on GitHub

* apply "uncontroversial" suggestions

* remove trailing whitespace

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Update src/99-appendices/03-hed.md

Co-authored-by: Stefan Appelhoff <[email protected]>

Co-authored-by: Stefan Appelhoff <[email protected]>
VisLab and sappelhoff authored Jan 23, 2022
1 parent d611689 commit f888291
Showing 2 changed files with 152 additions and 174 deletions.
6 changes: 3 additions & 3 deletions src/03-modality-agnostic-files.md
@@ -36,7 +36,7 @@ Example:
```JSON
{
"Name": "The mother of all experiments",
"BIDSVersion": "1.4.0",
"BIDSVersion": "1.6.0",
"DatasetType": "raw",
"License": "CC0",
"Authors": [
@@ -57,7 +57,7 @@ Example:
"Alzheimer A., & Kraepelin, E. (2015). Neural correlates of presenile dementia in humans. Journal of Neuroscientific Data, 2, 234001. doi:1920.8/jndata.2015.7"
],
"DatasetDOI": "doi:10.0.2.3/dfjj.10",
"HEDVersion": "7.1.1"
"HEDVersion": "8.0.0"
}
```

@@ -94,7 +94,7 @@ Example:
```JSON
{
"Name": "FMRIPREP Outputs",
"BIDSVersion": "1.4.0",
"BIDSVersion": "1.6.0",
"DatasetType": "derivative",
"GeneratedBy": [
{
320 changes: 149 additions & 171 deletions src/99-appendices/03-hed.md
@@ -3,207 +3,185 @@
Hierarchical Event Descriptors (HED) are a controlled vocabulary of terms describing
events in a machine-actionable form so that algorithms can use the information without
manual recoding.
HED was originally developed with EEG in mind, but is applicable to
all behavioral experiments.

Each level of a hierarchical tag is delimited with a forward slash (`/`).
A HED string contains one or more HED tags separated by commas (`,`).
Parentheses (brackets, `()`) group tags and enable specification of multiple items
and their attributes in a single **HED string** (see section 2.4 in
[HED Tagging Strategy Guide](https://www.hedtags.org/hed-docs/HEDTaggingStrategyGuide.pdf)).
For more information about HED and tools available to validate and match HED
strings, please visit [www.hedtags.org](https://www.hedtags.org).
Since dedicated fields already exist for the overall task classification in the
sidecar JSON files (`CogAtlasID` and `CogPOID`), HED tags from the `Paradigm`
HED subcategory should not be used to annotate events.

## Annotating each event

There are several ways to associate HED annotations with events within the BIDS
framework.
The most direct way is to use the `HED` column of the `*_events.tsv`
file to annotate events.

Example: An `*_events.tsv` annotated using HED tags for individual events.

```Text
onset duration HED
1.1 n/a Event/Category/Experimental stimulus, Event/Label/CrossFix, Sensory presentation/Visual, Item/Object/2D Shape/Cross
1.3 n/a Event/Category/Participant response, Event/Label/ButtonPress, Action/Button press
...
```

The direct approach requires that each line in the events file be annotated.
Since there are typically thousands of events in each experiment,
this method of annotation is not convenient unless the annotations are
automatically generated.
Usually annotations that appear in the `HED` column are specific to each individual event.
Information that is common to groups of events can be annotated by category.
Numerical values associated with each event can be annotated by value type.
Annotating by category and by value greatly reduces the effort required to HED tag
data and improves the clarity for data users.

## Annotating events by categories

In many experiments, the event instances fall into a much smaller number of
categories, and often these categories are labeled with numerical codes or short names.
This categorical information usually corresponds to one or more columns in `*_events.tsv`
representing categorical values.
Instead of tagging this information for each individual event,
you can assign HED tags for each distinct categorical value
in an accompanying `*_events.json` sidecar and allow the analysis tools to make
the association with individual event instances during analysis.
The column name in the `*_events.tsv` identifies the type of categorical variable.
The following `*_events.tsv` file has one categorical variable called `mycodes` that
takes on three possible values: `Fixation`, `Button`, and `Target`.

Example: An `*_events.tsv` containing the `mycodes` categorical column.

```Text
onset duration mycodes
1.1 n/a Fixation
1.3 n/a Button
1.8 n/a Target
...
```

Example: An accompanying `*_events.json` sidecar describing the `mycodes` categorical variable.

```JSON
{
"mycodes": {
"LongName": "Local event type names",
"Description": "Main types of events that comprise a trial",
"Levels": {
"Fixation": "Fixation cross is displayed",
"Target": "Target image appears",
"Button": "Subject presses a button"
},
"HED": {
"Fixation": "Event/Category/Experimental stimulus, Event/Label/CrossFix,
Event/Description/A cross appears at screen center to serve as a fixation point,
Sensory presentation/Visual, Item/Object/2D Shape/Cross,
Attribute/Visual/Fixation point, Attribute/Visual/Rendering type/Screen,
Attribute/Location/Screen/Center",
"Target": "Event/Label/TargetImage, Event/Category/Experimental stimulus,
Event/Description/A white airplane as the RSVP target superimposed on a satellite image is displayed.,
Item/Object/Vehicle/Aircraft/Airplane, Participant/Effect/Cognitive/Target,
Sensory presentation/Visual/Rendering type/Screen/2D),
(Item/Natural scene/Aerial/Satellite,
Sensory presentation/Visual/Rendering type/Screen/2D)",
"Button": "Event/Category/Participant response, Event/Label/PressButton,
Event/Description/The participant presses the button as soon as the target is visible,
Action/Button press"
}
}
}
```

## Annotating events by value type

Each column of `*_events.tsv` containing non-categorical values usually represents a
particular type of data, for example the `speed` of a stimulus object across the
screen or the filename of the stimulus image.
These variables could be annotated in the HED column of `*_events.tsv`.
However, that approach requires repeating the values appearing in the individual
columns in the HED column.
A better approach is to annotate the type of value contained in each of these
columns in the `*_events.json` sidecar.
Value variables are annotated in a manner similar to categorical values,
except that the HED string must contain exactly one `#` specifying a placeholder
for the actual column values.
Tools are responsible for substituting the actual column values for the `#` during analysis.

Example: An `*_events.tsv` containing a categorical column (`trial_type`) and two value
columns (`response_time` and `stim_file`).
HED annotation can be used to describe any experimental events by combining
information from the dataset's `_events.tsv` files and `_events.json` sidecars.

## HED annotations and vocabulary

A HED annotation consists of terms selected from a controlled
hierarchical vocabulary (the HED schema).
Individual terms are comma-separated and may be grouped using parentheses to indicate
association.
See [https://www.hedtags.org/display_hed.html](https://www.hedtags.org/display_hed.html)
to view the HED schema and the
[HED documentation](https://hed-specification.readthedocs.io/en/latest/index.html)
for additional resources.

Starting with HED version 8.0.0, HED allows users to annotate using individual
terms or partial paths in the HED vocabulary (for example `Red` or `Visual-presentation`)
rather than the full paths in the HED hierarchy (
`Property/Sensory-property/Sensory-attribute/Visual-attribute/Color/CSS-color/Red-color/Red`
or
`Property/Sensory-property/Sensory-presentation/Visual-presentation`).

HED-specific tools MUST treat the short and long HED tag forms interchangeably,
converting between the forms when necessary, based on the HED schema.
Examples of test datasets using the various forms can be found in
[hed-examples/datasets](https://github.com/hed-standard/hed-examples/tree/main/datasets)
on GitHub.
**Using the short form for tags is strongly RECOMMENDED whenever possible**.
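
The relationship between the short and long forms can be illustrated with a minimal sketch. It assumes a hypothetical lookup table built only from the two long-form paths quoted above; a real tool would derive the mapping from the HED schema, and the helper names `to_long` and `to_short` are illustrative, not part of any HED library.

```python
# Long-form paths quoted in the text above; a real tool would read these
# from the HED schema rather than hard-coding them.
LONG_FORMS = [
    "Property/Sensory-property/Sensory-attribute/Visual-attribute/Color/CSS-color/Red-color/Red",
    "Property/Sensory-property/Sensory-presentation/Visual-presentation",
]

# Hypothetical lookup table: short form (final path component) -> long form.
SHORT_TO_LONG = {path.rsplit("/", 1)[-1]: path for path in LONG_FORMS}


def to_long(tag: str) -> str:
    """Expand a short-form tag; unknown or already-long tags are returned unchanged."""
    tag = tag.strip()
    return SHORT_TO_LONG.get(tag, tag)


def to_short(tag: str) -> str:
    """Reduce a long-form tag to its final path component.

    Assumption: the tag takes no value (tags such as `Delay/# ms` are not handled).
    """
    return tag.strip().rsplit("/", 1)[-1]
```

Because both helpers are driven by the same table, a round trip such as `to_short(to_long("Red"))` returns the original short form.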

## Annotating events

Event-related data in BIDS appears in tab-separated value (`events.tsv`)
files in various places in the dataset hierarchy
(see [Events](../04-modality-specific-files/05-task-events.md)).

`events.tsv` files MUST have `onset` and `duration` columns.
Dataset curators MAY also include additional columns and define their
meanings in associated JSON sidecar files (`events.json`).

Example: An excerpt from an `events.tsv` file containing three columns
(`trial_type`, `response_time`, and `stim_file`) in addition to
the required `onset` and `duration` columns.

```Text
onset duration trial_type response_time stim_file
1.2 0.6 go 1.435 images/red_square.jpg
5.6 0.6 stop 1.739 images/blue_square.jpg
```

Example: An accompanying `*_events.json` sidecar describing both categorical and value columns.
The `trial_type` column in the above example contains a limited number of distinct
values (`go` and `stop`).
This type of column is referred to as a *categorical* column,
and the column's meaning can be annotated by assigning HED tags to describe
each of these distinct values.
The JSON sidecar provides a [JSON object](https://www.json.org/json-en.html) of annotations for these categorical values.
That is, the object is a dictionary mapping the categorical values to corresponding HED annotations.

In contrast, the `response_time` and `stim_file` columns could potentially contain
distinct values in every row.
These columns are referred to as *value* columns and are annotated by creating
a HED tag string to describe a general pattern for these values.
The HED annotation for a value column must include a `#` placeholder,
which dedicated HED tools MUST replace by the actual column value when the annotations
are assembled for analysis.

Example: An accompanying `events.json` sidecar describing both categorical and
value columns of the previous example.
The `duration` column is also annotated as a value column.

```JSON
{
"trial_type": {
"LongName": "Event category",
"Description": "Indicator of type of action that is expected",
"Levels": {
"go": "A red square is displayed to indicate starting",
"stop": "A blue square is displayed to indicate stopping",
},
"HED": {
"go": "Event/Category/Experimental stimulus, Event/Label/RedSquare,
Event/Description/A red square is displayed to indicate starting,
Sensory presentation/Visual, Item/Object/2D Shape/Square,
Attribute/Visual/Color/Red, Attribute/Visual/Rendering type/Screen,
Attribute/Location/Screen/Center",
"stop": "Event/Category/Experimental stimulus, Event/Label/BlueSquare,
Event/Description/A blue square is displayed to indicate stopping,
Sensory presentation/Visual, Item/Object/2D Shape/Square,
Attribute/Visual/Color/Blue, Attribute/Visual/Rendering type/Screen,
Attribute/Location/Screen/Center",
"Duration": {
"LongName": "Image duration",
"Description": "Duration of the image presentations",
"Units": "s",
"HED": "Duration/# s"
},
"trial_type": {
"LongName": "Event category",
"Description": "Indicator of type of action that is expected",
"Levels": {
"go": "A red square is displayed to indicate starting",
"stop": "A blue square is displayed to indicate stopping"
},
"HED": {
"go": "Sensory-event, Visual-presentation, ((Square, Blue),(Computer-screen, Center-of))",
"stop": "Sensory-event, Visual-presentation, ((Square, Blue), (Computer-screen, Center-of))"
}
},
"response_time": {
"LongName": "Response time after stimulus",
"Description": "Time from stimulus presentation until subject presses button",
"Units": "ms",
"HED": "Attribute/Response start delay/# ms, Action/Button press"
"HED": "(Delay/# ms, Agent-action, (Experiment-participant, (Press, Mouse-button)))"
},
"stim_file": {
"LongName": "Stimulus filename",
"Description": "Relative path of the stimulus image file",
"HED": "Attribute/File/#"
"HED": "Pathname/#"
}
}
```

## Best practices

Most studies will have event categorical variables and value variables that
are common across many of the datasets in the study.
You should try to annotate these columns in a `*_events.json` sidecar
as high in the study hierarchy as possible to avoid duplicate annotations.
Annotations that can be placed in sidecars are preferred to those placed
directly in the HED column, because they are simpler, more compact, and
less prone to inconsistent annotation.
Downstream tools should not distinguish between tags specified using
the explicit HED column and the categorical specifications, but should
form the union before analysis.
Further, the [inheritance principle](../02-common-principles.md#the-inheritance-principle)
applies, so the data dictionaries can appear higher in the BIDS hierarchy.

You should try to annotate in as much detail as possible.
The HED path structure makes it easy for analysis tools to extract tags
at different levels of detail: For example a user can consider extracting
events associated with 2D shapes for stimuli, ignoring the particular
color or shape details for the stimuli.

## HED schema and HED versions
Dedicated HED tools MUST assemble an annotation for each event by concatenating the
annotations for each column.

Example: The fully assembled annotation for the first event in the above
`events.tsv` file with onset `1.2` (the first row) is:

```Text
Duration/0.6 s, Sensory-event, Visual-presentation,
((Square, Blue), (Computer-screen, Center-of)),
(Delay/1.435 ms, Agent-action,
(Experiment-participant, (Press, Mouse-button))),
Pathname/images/red_square.jpg
```
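
The assembly step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behavior of any particular HED tool: the sidecar is reduced to its `HED` entries, the duration annotation is keyed by the lowercase column name `duration`, and the function name `assemble` is hypothetical.

```python
# HED entries copied from the example sidecar (other metadata fields omitted).
# Assumption: the duration annotation is keyed by the column name `duration`.
SIDECAR = {
    "duration": {"HED": "Duration/# s"},
    "trial_type": {
        "HED": {
            "go": "Sensory-event, Visual-presentation, ((Square, Blue), (Computer-screen, Center-of))",
            "stop": "Sensory-event, Visual-presentation, ((Square, Blue), (Computer-screen, Center-of))",
        }
    },
    "response_time": {
        "HED": "(Delay/# ms, Agent-action, (Experiment-participant, (Press, Mouse-button)))"
    },
    "stim_file": {"HED": "Pathname/#"},
}


def assemble(row: dict, sidecar: dict) -> str:
    """Concatenate the per-column HED annotations for one events.tsv row."""
    parts = []
    for column, value in row.items():
        hed = sidecar.get(column, {}).get("HED")
        if hed is None or value == "n/a":
            continue  # column has no annotation, or the value is missing
        if isinstance(hed, dict):
            # Categorical column: look up the annotation for this value.
            if value in hed:
                parts.append(hed[value])
        else:
            # Value column: substitute the cell value for the `#` placeholder.
            parts.append(hed.replace("#", value))
    return ", ".join(parts)


first_row = {
    "onset": "1.2",
    "duration": "0.6",
    "trial_type": "go",
    "response_time": "1.435",
    "stim_file": "images/red_square.jpg",
}
print(assemble(first_row, SIDECAR))
```

The `onset` column is skipped because the sidecar carries no `HED` entry for it; the remaining columns are joined in the order they appear in the row.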

## Annotation using the `HED` column

Another tagging strategy is to annotate individual events directly by
including a `HED` column in the `events.tsv` file.
This approach is necessary when each event has annotations that are unique
and do not fit into a standard set of patterns.

Some acquisition or presentation software systems directly
write annotations during the experiment, and these MAY also be placed in the
`HED` column of the `events.tsv` file.

Dedicated HED tools that assemble the full annotation for events MUST NOT distinguish
between HED annotations extracted from `_events.json` sidecars and those
appearing in the `HED` column of `_events.tsv` files.
The HED strings from all sources are concatenated to form the final
event annotations.

Annotations placed in sidecars are the RECOMMENDED way
to annotate data using HED.
These annotations are preferred to those placed
directly in the `HED` column, because they are simpler, more compact,
more easily edited, and less prone to inconsistencies.

## HED and the BIDS inheritance principle

Most studies have event files whose columns contain categorical and
numerical values that are similar across the recordings in the study.
If possible, users should annotate these columns in a single
`events.json` sidecar placed at the top level in the dataset.

If some recordings in the dataset have a column whose values deviate from a
standard pattern, then the annotations for that column MUST be placed in
sidecars located deeper in the dataset directory hierarchy.
According to the BIDS [Inheritance Principle](../02-common-principles.md#the-inheritance-principle),
once a column key in a sidecar (that is, the column name found in the `events.tsv` files) is set,
information about that column cannot be overridden by a sidecar appearing in a directory
closer to the dataset root.
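
One way to picture this rule is as a merge of sidecars ordered from the dataset root down to the recording, where an entry for a column in a deeper sidecar replaces the entry inherited from above. The sketch below assumes the sidecars have already been loaded as dictionaries; the function name `merge_sidecars` and the example annotations are hypothetical.

```python
def merge_sidecars(sidecars_root_to_leaf: list) -> dict:
    """Merge sidecars ordered from the dataset root to the deepest directory.

    A column key defined in a deeper sidecar replaces the whole entry
    inherited from a sidecar closer to the root.
    """
    merged = {}
    for sidecar in sidecars_root_to_leaf:
        merged.update(sidecar)
    return merged


# Hypothetical top-level sidecar shared by all recordings.
root_sidecar = {
    "trial_type": {"HED": {"go": "Sensory-event", "stop": "Sensory-event"}},
    "stim_file": {"HED": "Pathname/#"},
}
# Hypothetical sidecar for one subject whose trial_type values deviate.
subject_sidecar = {
    "trial_type": {"HED": {"left": "Agent-action", "right": "Agent-action"}},
}

effective = merge_sidecars([root_sidecar, subject_sidecar])
```

In this example `effective["trial_type"]` comes entirely from the subject-level sidecar, while `effective["stim_file"]` is still inherited from the top level.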

## HED schema versions

The HED vocabulary is specified by a HED schema,
which delineates the allowed HED path strings.
By default, BIDS uses the latest HED schema available in the
[hed-specification](https://github.com/hed-standard/hed-specification/tree/master/hedxml) repository
maintained by the hed-standard group.
The version of HED used in tagging a dataset should be provided in the `HEDVersion`
field of the `dataset_description.json` file located in the dataset root directory.
This allows for a proper validation of the HED annotations
(for example using the `bids-validator`).

You can override the default by providing a specific HED version number in the
`dataset_description.json` file using the `HEDVersion` field.
The preferred approach is to validate with the latest version (the default),
but to use the `HEDVersion` field to specify which version was used for later reference.

Example: The following `dataset_description.json` file specifies that
`HED7.1.1.xml` from the [hed-specification](https://github.com/hed-standard/hed-specification/tree/master/hedxml) repository
should be used to validate the study event annotations.
Example: The following `dataset_description.json` file specifies that the
[`HED8.0.0.xml`](https://github.com/hed-standard/hed-specification/tree/master/hedxml/HED8.0.0.xml)
file from the `hedxml` directory of the
[`hed-specification`](https://github.com/hed-standard/hed-specification)
repository on GitHub should be used to validate the study event annotations.

```JSON
{
"Name": "The mother of all experiments",
"BIDSVersion": "1.4.0",
"HEDVersion": "7.1.1"
"Name": "A great experiment",
"BIDSVersion": "1.6.0",
"HEDVersion": "8.0.0"
}
```

If you omit the `HEDVersion` field from the dataset description file,
any present HED information will be validated using the latest version of the HED schema,
which is bound to result in problems.
Hence, it is strongly RECOMMENDED that the `HEDVersion` field be included when using HED
in a BIDS dataset.
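
As a rough sketch of how a tool might act on this field, the snippet below reads `HEDVersion` from the text of a `dataset_description.json` file and derives the schema file name following the `HED8.0.0.xml` pattern mentioned above. The function name `schema_filename` is hypothetical, and how a missing field is handled is left to the caller.

```python
import json


def schema_filename(dataset_description_text: str):
    """Return the schema file name for the declared HEDVersion, or None if absent."""
    description = json.loads(dataset_description_text)
    version = description.get("HEDVersion")
    if version is None:
        return None  # no version declared; the caller decides what to do
    return f"HED{version}.xml"


example = '{"Name": "A great experiment", "BIDSVersion": "1.6.0", "HEDVersion": "8.0.0"}'
print(schema_filename(example))
```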
