<pre class='metadata'>
Title: Next-generation file formats (NGFF)
Shortname: ome-ngff
Level: 1
Status: LS-COMMIT
Status: w3c/ED
Group: ome
URL: https://ngff.openmicroscopy.org/latest/
Repository: https://github.com/ome/ngff
Issue Tracking: Forums https://forum.image.sc/tag/ome-ngff
Logo: http://www.openmicroscopy.org/img/logos/ome-logomark.svg
Local Boilerplate: header yes
Local Boilerplate: copyright yes
Boilerplate: style-darkmode off
Markup Shorthands: markdown yes
Editor: Josh Moore, University of Dundee (UoD) https://www.dundee.ac.uk, https://orcid.org/0000-0003-4028-811X
Text Macro: NGFFVERSION 0.5-dev
Abstract: This document contains next-generation file format (NGFF)
Abstract: specifications for storing bioimaging data in the cloud.
Abstract: All specifications are submitted to the https://image.sc community for review.
Status Text: The current released version of this specification is
Status Text: <a href="../0.4/index.html">0.4</a>. Migration scripts
Status Text: will be provided between numbered versions. Data written with these latest changes
Status Text: (an "editor's draft") will not necessarily be supported.
</pre>
OME-NGFF {#ome-ngff}
--------------------
The conventions and specifications defined in this document are designed to
enable next-generation file formats to represent the same bioimaging data
that can be represented in \[OME-TIFF](http://www.openmicroscopy.org/ome-files/)
and beyond. However, the conventions will also be usable by HDF5 and other sufficiently advanced
binary containers. Eventually, we hope, the moniker "next-generation" will no longer be
applicable, and this will simply be the most efficient, common, and useful representation
of bioimaging data, whether during acquisition or sharing in the cloud.
Note: The following text makes use of OME-Zarr [[ome-zarr-py]], the current prototype implementation,
for all examples.
Document conventions
--------------------
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”,
“RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in
[RFC 2119](https://tools.ietf.org/html/rfc2119).
<p>
<dfn>Transitional</dfn> metadata is added to the specification with the
intention of removing it in the future. Implementations may be expected (MUST) or
encouraged (SHOULD) to support the reading of the data, but writing will usually
be optional (MAY). Examples of transitional metadata include custom additions by
implementations that are later submitted as a formal specification. (See [[#bf2raw]])
</p>
Some of the JSON examples in this document include comments. However, these are only for
clarity purposes and comments MUST NOT be included in JSON objects.
On-disk (or in-cloud) layout {#on-disk}
=======================================
An overview of the layout of an OME-Zarr fileset should make
understanding the following metadata sections easier. The hierarchy
is represented here as it would appear locally but could equally
be stored on a web server to be accessed via HTTP or in object storage
like S3 or GCS.
OME-Zarr is an implementation of the OME-NGFF specification using the Zarr
format. Arrays MUST be defined and stored in a hierarchical organization as
defined by the
[version 2 of the Zarr specification](https://zarr.readthedocs.io/en/stable/spec/v2.html).
OME-NGFF metadata MUST be stored as attributes in the corresponding Zarr
groups.
Images {#image-layout}
----------------------
The following layout describes the expected Zarr hierarchy for images with
multiple levels of resolutions and optionally associated labels.
Note that the number of dimensions can vary between 2 and 5 and that axis names are arbitrary; see [[#multiscale-md]] for details.
For this example we assume an image with 5 dimensions and axes called `t,c,z,y,x`.
<pre>
.                             # Root folder, potentially in S3,
│                             # with a flat list of images by image ID.
│
├── 123.zarr                  # One image (id=123) converted to Zarr.
│
└── 456.zarr                  # Another image (id=456) converted to Zarr.
    │
    ├── .zgroup               # Each image is a Zarr group, or a folder, of other groups and arrays.
    ├── .zattrs               # Group level attributes are stored in the .zattrs file and include
    │                         # "multiscales" and "omero" (see below). In addition, the group level attributes
    │                         # may also contain "_ARRAY_DIMENSIONS" for compatibility with xarray if this group directly contains multi-scale arrays.
    │
    ├── 0                     # Each multiscale level is stored as a separate Zarr array,
    │   ...                   # which is a folder containing chunk files which compose the array.
    ├── n                     # The name of the array is arbitrary with the ordering defined by
    │   │                     # the "multiscales" metadata, but is often a sequence starting at 0.
    │   │
    │   ├── .zarray           # All image arrays must be up to 5-dimensional
    │   │                     # with the axis of type time before type channel, before spatial axes.
    │   │
    │   └─ t                  # Chunks are stored with the nested directory layout.
    │      └─ c               # All but the last chunk element are stored as directories.
    │         └─ z            # The terminal chunk is a file. Together the directory and file names
    │            └─ y         # provide the "chunk coordinate" (t, c, z, y, x), where the maximum coordinate
    │               └─ x      # will be `dimension_size / chunk_size`.
    │
    └── labels
        │
        ├── .zgroup           # The labels group is a container which holds a list of labels to make the objects easily discoverable
        │
        ├── .zattrs           # All labels will be listed in `.zattrs` e.g. `{ "labels": [ "original/0" ] }`
        │                     # Each dimension of the label `(t, c, z, y, x)` should be either the same as the
        │                     # corresponding dimension of the image, or `1` if that dimension of the label
        │                     # is irrelevant.
        │
        └── original          # Intermediate folders are permitted but not necessary and currently contain no extra metadata.
            │
            └── 0             # Multiscale, labeled image. The name is unimportant but is registered in the "labels" group above.
                ├── .zgroup   # Zarr Group which is both a multiscaled image as well as a labeled image.
                ├── .zattrs   # Metadata of the related image as well as display information under the "image-label" key.
                │
                ├── 0         # Each multiscale level is stored as a separate Zarr array, as above, but only integer values
                │   ...       # are supported.
                └── n
</pre>
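As a rough illustration of the nested chunk layout shown above, the following sketch computes the relative path of the chunk file that holds a given pixel. The function name `chunk_path`, the array name `"0"`, and the example chunk shape are assumptions made for illustration and are not defined by this specification.

```python
def chunk_path(index, chunks, array_name="0"):
    """Return the relative path of the chunk file containing `index`.

    `index` and `chunks` are tuples in array axis order, e.g. (t, c, z, y, x).
    """
    if len(index) != len(chunks):
        raise ValueError("index and chunks must have the same length")
    # Integer division gives the chunk coordinate along each axis.
    coords = [i // c for i, c in zip(index, chunks)]
    # All but the last coordinate are directories; the last one is the file.
    return "/".join([array_name] + [str(c) for c in coords])


# Pixel (t=0, c=1, z=10, y=300, x=700) in an array chunked as (1, 1, 1, 256, 256).
print(chunk_path((0, 1, 10, 300, 700), (1, 1, 1, 256, 256)))  # 0/0/1/10/1/2
```

The first path element is the array (resolution level) and the remaining elements are the chunk coordinate, one directory per axis with the final element being the chunk file.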
High-content screening {#hcs-layout}
------------------------------------
The following specification defines the hierarchy for a high-content screening
dataset. Three groups MUST be defined above the images:
- the group above the images defines the well and MUST implement the
[well specification](#well-md). All images contained in a well are fields
of view of the same well
- the group above the well defines a row of wells
- the group above the well row defines an entire plate i.e. a two-dimensional
collection of wells organized in rows and columns. It MUST implement the
[plate specification](#plate-md)
A well row group SHOULD NOT be present if there are no images in the well row.
A well group SHOULD NOT be present if there are no images in the well.
<pre>
.                             # Root folder, potentially in S3,
│
└── 5966.zarr                 # One plate (id=5966) converted to Zarr
    ├── .zgroup
    ├── .zattrs               # Implements "plate" specification
    ├── A                     # First row of the plate
    │   ├── .zgroup
    │   │
    │   ├── 1                 # First column of row A
    │   │   ├── .zgroup
    │   │   ├── .zattrs       # Implements "well" specification
    │   │   │
    │   │   ├── 0             # First field of view of well A1
    │   │   │   │
    │   │   │   ├── .zgroup
    │   │   │   ├── .zattrs   # Implements "multiscales", "omero"
    │   │   │   ├── 0
    │   │   │   │   ...       # Resolution levels
    │   │   │   ├── n
    │   │   │   └── labels    # Labels (optional)
    │   │   ├── ...           # Fields of view
    │   │   └── m
    │   ├── ...               # Columns
    │   └── 12
    ├── ...                   # Rows
    └── H
</pre>
Metadata {#metadata}
====================
The various `.zattrs` files throughout the above array hierarchy may contain metadata
keys as specified below for discovering certain types of data, especially images.
"axes" metadata {#axes-md}
--------------------------
"axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes a dimension (axis) and:
- MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields.
- SHOULD contain the field "type". It SHOULD be one of "space", "time" or "channel", but MAY take other values for custom axis types that are not part of this specification yet.
- SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be one of the following strings, which are valid units according to UDUNITS-2.
- Units for "space" axes: 'angstrom', 'attometer', 'centimeter', 'decimeter', 'exameter', 'femtometer', 'foot', 'gigameter', 'hectometer', 'inch', 'kilometer', 'megameter', 'meter', 'micrometer', 'mile', 'millimeter', 'nanometer', 'parsec', 'petameter', 'picometer', 'terameter', 'yard', 'yoctometer', 'yottameter', 'zeptometer', 'zettameter'
- Units for "time" axes: 'attosecond', 'centisecond', 'day', 'decisecond', 'exasecond', 'femtosecond', 'gigasecond', 'hectosecond', 'hour', 'kilosecond', 'megasecond', 'microsecond', 'millisecond', 'minute', 'nanosecond', 'petasecond', 'picosecond', 'second', 'terasecond', 'yoctosecond', 'yottasecond', 'zeptosecond', 'zettasecond'
If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data.
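The rules above can be checked mechanically. The following is a minimal sketch, not a reference validator: the function name `check_axes` and the split into errors (MUST violations) and warnings (SHOULD recommendations) are assumptions made for illustration. It also assumes that a missing "unit" is only worth flagging for "space" and "time" axes, since those are the only types with listed units.

```python
KNOWN_TYPES = {"space", "time", "channel"}


def check_axes(axes):
    """Return (errors, warnings) for an "axes" list.

    errors collect MUST violations, warnings collect SHOULD recommendations.
    """
    errors, warnings = [], []
    names = [axis.get("name") for axis in axes]
    for i, name in enumerate(names):
        if name is None:
            errors.append(f"axis {i}: missing required 'name'")
    present = [n for n in names if n is not None]
    if len(present) != len(set(present)):
        errors.append("axis names are not unique")
    for axis in axes:
        label = axis.get("name", "?")
        if "type" not in axis:
            warnings.append(f"axis {label}: 'type' is recommended")
        elif axis["type"] not in KNOWN_TYPES:
            warnings.append(f"axis {label}: custom type {axis['type']!r}")
        if axis.get("type") in ("space", "time") and "unit" not in axis:
            warnings.append(f"axis {label}: 'unit' is recommended")
    return errors, warnings


axes = [
    {"name": "c", "type": "channel"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
print(check_axes(axes))  # ([], [])
```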
"bioformats2raw.layout" (transitional) {#bf2raw}
------------------------------------------------
[=Transitional=] "bioformats2raw.layout" metadata identifies a group which implicitly describes a series of images.
The need for the collection stems from the common "multi-image file" scenario in microscopy. Parsers like Bio-Formats
define a strict, stable ordering of the images in a single container that can be used to refer to them by other tools.
In order to capture that information within an OME-NGFF dataset, `bioformats2raw` internally introduced a wrapping layer.
The bioformats2raw layout has been added to v0.4 as a transitional specification to specify filesets that already exist
in the wild. An upcoming NGFF specification will replace this layout with explicit metadata.
<h4 id="bf2raw-layout" class="no-toc">Layout</h4>
Typical Zarr layout produced by running `bioformats2raw` on a fileset that contains more than one image (series > 1):
<pre>
series.ome.zarr               # One converted fileset from bioformats2raw
├── .zgroup
├── .zattrs                   # Contains "bioformats2raw.layout" metadata
├── OME                       # Special group for containing OME metadata
│   ├── .zgroup
│   ├── .zattrs               # Contains "series" metadata
│   └── METADATA.ome.xml      # OME-XML file stored within the Zarr fileset
├── 0                         # First image in the collection
├── 1                         # Second image in the collection
└── ...
</pre>
<h4 id="bf2raw-attributes" class="no-toc">Attributes</h4>
The top-level `.zattrs` file must contain the `bioformats2raw.layout` key:
<pre class=include-code>
path: examples/bf2raw/image.json
highlight: json
</pre>
If the top-level group represents a plate, the `bioformats2raw.layout` metadata will be present, but
the "plate" key MUST also be present and takes precedence; parsing of such datasets should follow [[#plate-md]]. It is not
possible to mix collections of images with plates at present.
<pre class=include-code>
path: examples/bf2raw/plate.json
highlight: json
</pre>
The `.zattrs` file within the OME group may contain the "series" key:
<pre class=include-code>
path: examples/ome/series-2.json
highlight: json
</pre>
<h4 id="bf2raw-details" class="no-toc">Details</h4>
Conforming groups:
- MUST have the value "3" for the "bioformats2raw.layout" key in their `.zattrs` metadata at the top of the hierarchy;
- SHOULD have OME metadata representing the entire collection of images in a file named "OME/METADATA.ome.xml" which:
- MUST adhere to the OME-XML specification but
- MUST use `<MetadataOnly/>` elements as opposed to `<BinData/>`, `<BinaryOnly/>` or `<TiffData/>`;
- MAY make use of the [minimum specification](https://docs.openmicroscopy.org/ome-model/6.2.2/specifications/minimum.html).
Additionally, the Zarr group for each image is located as follows:
- If "plate" metadata is present, images MUST be located at the defined location.
- Matching "series" metadata (as described next) SHOULD be provided for tools that are unaware of the "plate" specification.
- If the "OME" Zarr group exists, it:
- MAY contain a "series" attribute. If so:
- "series" MUST be a list of string objects, each of which is a path to an image group.
- The order of the paths MUST match the order of the "Image" elements in "OME/METADATA.ome.xml" if provided.
- If the "series" attribute does not exist and no "plate" is present:
- separate "multiscales" images MUST be stored in consecutively numbered groups starting from 0 (i.e. "0/", "1/", "2/", "3/", ...).
- Every "multiscales" group MUST represent exactly one OME-XML "Image" in the same order as either the series index or the group numbers.
Conforming readers:
- SHOULD make users aware of the presence of more than one image (i.e. SHOULD NOT default to only opening the first image);
- MAY use the "series" attribute in the "OME" group to determine a list of valid groups to display;
- MAY choose to show all images within the collection or offer the user a choice of images, as with <dfn export="true"><abbr title="High-content screening">HCS</abbr></dfn> plates;
- MAY ignore other groups or arrays under the root of the hierarchy.
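The lookup rules above could be sketched as follows. This is an illustrative outline only: the function name `image_paths` and its arguments (already-parsed attribute dictionaries and a list of child group names) are hypothetical, and the sketch accepts the layout version written either as the number 3 or as the string "3", which is an assumption.

```python
def image_paths(root_attrs, ome_attrs=None, group_names=()):
    """List image group paths for a "bioformats2raw.layout" hierarchy.

    root_attrs:  parsed top-level .zattrs
    ome_attrs:   parsed OME/.zattrs, or None if the "OME" group is absent
    group_names: names of the child groups of the root
    """
    if str(root_attrs.get("bioformats2raw.layout")) != "3":
        raise ValueError("not a bioformats2raw.layout version 3 hierarchy")
    if "plate" in root_attrs:
        # "plate" takes precedence; images are located via the plate metadata.
        raise NotImplementedError("resolve images through the plate metadata")
    if ome_attrs and "series" in ome_attrs:
        # Order matches the "Image" elements in OME/METADATA.ome.xml.
        return list(ome_attrs["series"])
    # Fallback: consecutively numbered groups starting from 0.
    names = set(group_names)
    paths = []
    while str(len(paths)) in names:
        paths.append(str(len(paths)))
    return paths


print(image_paths({"bioformats2raw.layout": 3}, None, ["OME", "0", "1"]))  # ['0', '1']
```

A reader built along these lines would then offer all returned paths to the user rather than silently opening only the first one.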
"coordinateTransformations" metadata {#trafo-md}
------------------------------------------------
"coordinateTransformations" describe a series of transformations that map between two coordinate spaces (defined by "axes").
For example, to map a discrete data space of an array to the corresponding physical space.
It is a list of dictionaries. Each entry describes a single transformation and MUST contain the field "type".
The value of "type" MUST be one of the elements of the `type` column in the table below.
Additional fields for the entry depend on "type" and are defined by the column `fields`.
<table>
<thead>
<tr><th>type<th>fields<th>description
</thead>
<tr><th>`identity` <td> <td>identity transformation; it is the default transformation and is typically not explicitly defined
<tr><th>`translation` <td> one of: `"translation":List[float]`, `"path":str` <td>translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions.
<tr><th>`scale` <td> one of: `"scale":List[float]`, `"path":str` <td>scale vector, stored either as a list of floats (`"scale"`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions.
</table>
The transformations in the list are applied sequentially and in order.
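To make the sequential application concrete, here is a small sketch that maps one data-space coordinate through a list of transformations. The function name `apply_transformations` is hypothetical, and vectors stored as binary data under `path` are deliberately not handled.

```python
def apply_transformations(point, transformations):
    """Map a coordinate through the transformations, one at a time and in order."""
    result = [float(v) for v in point]
    for t in transformations:
        kind = t["type"]
        if kind == "identity":
            continue
        if kind not in ("scale", "translation"):
            raise ValueError(f"unsupported transformation type: {kind}")
        if kind not in t:
            # A "path" entry points at binary data in the container.
            raise NotImplementedError("vectors stored under 'path' are not read here")
        vector = t[kind]
        if len(vector) != len(result):
            raise ValueError("vector length must match the number of dimensions")
        if kind == "scale":
            result = [r * s for r, s in zip(result, vector)]
        else:
            result = [r + o for r, o in zip(result, vector)]
    return result


transformations = [
    {"type": "scale", "scale": [1.0, 0.5, 0.5]},
    {"type": "translation", "translation": [0.0, 10.0, 10.0]},
]
print(apply_transformations([2, 4, 8], transformations))  # [2.0, 12.0, 14.0]
```

Because the order matters, swapping the two entries in this example would give a different result: the translation would be scaled along with the coordinate.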
"multiscales" metadata {#multiscale-md}
---------------------------------------
Metadata about an image can be found under the "multiscales" key in the group-level metadata. Here, image refers to 2 to 5 dimensional data representing image or volumetric data with optional time or channel axes. It is stored in a multiple resolution representation.
"multiscales" contains a list of dictionaries where each entry describes a multiscale image.
Each "multiscales" dictionary MUST contain the field "axes", see [[#axes-md]].
The length of "axes" must be between 2 and 5 and MUST be equal to the dimensionality of the zarr arrays storing the image data (see "datasets:path").
The "axes" MUST contain 2 or 3 entries of "type:space" and MAY contain one additional entry of "type:time" and MAY contain one additional entry of "type:channel" or a null / custom type.
The order of the entries MUST correspond to the order of dimensions of the zarr arrays. In addition, the entries MUST be ordered by "type" where the "time" axis must come first (if present), followed by the "channel" or custom axis (if present) and the axes of type "space".
If there are three spatial axes where two correspond to the image plane ("yx") and images are stacked along the other (anisotropic) axis ("z"), the spatial axes SHOULD be ordered as "zyx".
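The counting and ordering constraints on "axes" can be expressed compactly. The sketch below is illustrative only; the function name `check_multiscale_axes` is hypothetical, and any type other than "space" or "time" (including a missing type) is treated as the single permitted channel or custom axis.

```python
def check_multiscale_axes(axes):
    """Return a list of violations of the "multiscales" axes rules (empty if valid)."""
    problems = []
    types = [axis.get("type") for axis in axes]
    if not 2 <= len(axes) <= 5:
        problems.append("number of axes must be between 2 and 5")
    if types.count("space") not in (2, 3):
        problems.append("there must be 2 or 3 axes of type 'space'")
    if types.count("time") > 1:
        problems.append("at most one axis of type 'time' is allowed")
    others = [t for t in types if t not in ("space", "time")]
    if len(others) > 1:
        problems.append("at most one channel or custom axis is allowed")
    # Rank each axis: time (0) < channel/custom (1) < space (2).
    rank = [0 if t == "time" else 2 if t == "space" else 1 for t in types]
    if rank != sorted(rank):
        problems.append("axes must be ordered time, channel/custom, space")
    return problems


tczyx = [
    {"name": "t", "type": "time"},
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space"},
    {"name": "y", "type": "space"},
    {"name": "x", "type": "space"},
]
print(check_multiscale_axes(tczyx))  # []
```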
Each "multiscales" dictionary MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels.
Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative
to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest.
Each "datasets" dictionary MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes".
Each dictionary in "datasets" MUST contain the field "coordinateTransformations", which contains a list of transformations that map the data coordinates to the physical coordinates (as specified by "axes") for this resolution level.
The transformations are defined according to [[#trafo-md]]. The transformation MUST only be of type `translation` or `scale`.
The list MUST contain exactly one `scale` transformation that specifies the pixel size in physical units or time duration. If scaling information is not available or applicable for one of the axes, the value MUST express the scaling factor between the current resolution and the first resolution for the given axis, defaulting to 1.0 if there is no downsampling along the axis.
The list MAY contain exactly one `translation` that specifies the offset from the origin in physical units. If `translation` is given it MUST be listed after `scale` to ensure that it is given in physical coordinates.
The length of the `scale` and `translation` array MUST be the same as the length of "axes".
The requirements (only `scale` and `translation`, restrictions on order) are in place to provide a simple mapping from data coordinates to physical coordinates while being compatible with the general transformation spec.
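A possible check for the per-dataset restrictions is sketched below. The function name `check_dataset_transformations` and the wording of the messages are hypothetical; `ndim` is assumed to be the length of "axes".

```python
def check_dataset_transformations(transformations, ndim):
    """Return violations of the "datasets" coordinateTransformations rules."""
    problems = []
    kinds = [t.get("type") for t in transformations]
    if any(k not in ("scale", "translation") for k in kinds):
        problems.append("only 'scale' and 'translation' are allowed")
    if kinds.count("scale") != 1:
        problems.append("exactly one 'scale' is required")
    if kinds.count("translation") > 1:
        problems.append("at most one 'translation' is allowed")
    if ("scale" in kinds and "translation" in kinds
            and kinds.index("translation") < kinds.index("scale")):
        problems.append("'translation' must be listed after 'scale'")
    for t in transformations:
        kind = t.get("type")
        # Only inline vectors are checked; "path" references are skipped.
        vector = t.get(kind) if kind in ("scale", "translation") else None
        if vector is not None and len(vector) != ndim:
            problems.append(f"'{kind}' must have {ndim} entries")
    return problems


level0 = [
    {"type": "scale", "scale": [1.0, 0.5, 0.5]},
    {"type": "translation", "translation": [0.0, 0.25, 0.25]},
]
print(check_dataset_transformations(level0, 3))  # []
```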
Each "multiscales" dictionary MAY contain the field "coordinateTransformations", describing transformations that are applied to all resolution levels in the same manner.
The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:coordinateTransformations" and are applied after them.
They can for example be used to specify the `scale` for a dimension that is the same for all resolutions.
Each "multiscales" dictionary SHOULD contain the field "name". It MUST contain the field "version", which indicates the version of the multiscale metadata of this image (current version is [NGFFVERSION]).
Each "multiscales" dictionary SHOULD contain the field "type", which gives the type of downscaling method used to generate the multiscale image pyramid.
It SHOULD contain the field "metadata", which contains a dictionary with additional information about the downscaling method.
<pre class=include-code>
path: examples/multiscales_strict/multiscales_example.json
highlight: json
</pre>
If only one multiscale is provided, use it. Otherwise, the user can choose by
name, using the first multiscale as a fallback:
```python
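# `multiscales` is the list stored under the "multiscales" key of the group attributes.
datasets = []
for named in multiscales:
    # "name" is only recommended (SHOULD), so use .get() to tolerate its absence.
    if named.get("name") == "3D":
        datasets = [x["path"] for x in named["datasets"]]
        break
if not datasets:
    # Use the first by default. Or perhaps choose based on chunk size.
    datasets = [x["path"] for x in multiscales[0]["datasets"]]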
```
"omero" metadata (transitional) {#omero-md}
-------------------------------------------
[=Transitional=] information specific to the channels of an image and how to render it
can be found under the "omero" key in the group-level metadata:
```json
"id": 1,                              # ID in OMERO
"name": "example.tif",                # Name as shown in the UI
"version": "0.5-dev",                 # Current version
"channels": [                         # Array matching the c dimension size
    {
        "active": true,
        "coefficient": 1,
        "color": "0000FF",
        "family": "linear",
        "inverted": false,
        "label": "LaminB1",
        "window": {
            "end": 1500,
            "max": 65535,
            "min": 0,
            "start": 0
        }
    }
],
"rdefs": {
    "defaultT": 0,                    # First timepoint to show the user
    "defaultZ": 118,                  # First Z section to show the user
    "model": "color"                  # "color" or "greyscale"
}
```
See the [OMERO WebGateway documentation](https://omero.readthedocs.io/en/stable/developers/Web/WebGateway.html#imgdata)
for more information.
"labels" metadata {#labels-md}
------------------------------
In OME-Zarr, Zarr arrays representing pixel-annotation data are stored in a group called "labels". Some applications--notably image segmentation--produce
a new image that is in the same coordinate system as a corresponding multiscale image (usually having the same dimensions and coordinate transformations).
This new image is composed of integer values corresponding to certain labels with custom meanings. For example, pixels take the value 1 or 0
if the corresponding pixel in the original image represents cellular space or intercellular space, respectively.
Such an image is referred to in this specification as a 'label image'.
The "labels" group is nested within an image group, at the same level of the Zarr hierarchy as the resolution levels for the original image.
The "labels" group is not itself an image; it contains images. The pixels of the label images MUST be integer data types, i.e. one of
[`uint8`, `int8`, `uint16`, `int16`, `uint32`, `int32`, `uint64`, `int64`]. Intermediate groups between "labels" and the images within it are allowed,
but these MUST NOT contain metadata. Names of the images in the "labels" group are arbitrary.
The `.zattrs` file associated with the "labels" group MUST contain a JSON object with the key `labels`, whose value is a JSON array of paths to the
labeled multiscale image(s). All label images SHOULD be listed within this metadata file. For example:
```json
{
"labels": [
"cell_space_segmentation"
]
}
```
The `.zattrs` file for the label image MUST implement the multiscales specification. Within the `multiscales` object, the JSON array
associated with the `datasets` key MUST have the same number of entries (scale levels) as the original unlabeled image.
In addition to the `multiscales` key, the JSON object in this image-level `.zattrs` file SHOULD contain another key, `image-label`,
whose value is also a JSON object. The `image-label` object stores information about the display colors, source image, and optionally,
further arbitrary properties of the label image. That `image-label` object SHOULD contain the following keys: first, a `colors` key,
whose value MUST be a JSON array describing color information for the unique label values. Second, a `version` key, whose value MUST be a
string specifying the version of the OME-NGFF `image-label` schema.
Conforming readers SHOULD display labels using the colors specified by the `colors` JSON array, as follows. This array contains one
JSON object for each unique custom label. Each of these objects MUST contain the `label-value` key, whose value MUST be the integer
corresponding to a particular label. In addition to the `label-value` key, the objects in this array MAY contain an `rgba` key whose
value MUST be an array of four integers between 0 and 255, inclusive. The first three integers represent the `uint8` values of red, green, and
blue that comprise the final color to be displayed at the pixels with this label. The fourth integer in the `rgba` array represents alpha,
or the opacity of the color. Additional keys under `colors` are allowed.
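A reader might turn the `colors` array into a lookup table before rendering. The following sketch assumes the `image-label` object has already been parsed into a dictionary; the function name `color_lookup` is hypothetical. Labels without an `rgba` entry are simply left out so that the caller can apply its own default.

```python
def color_lookup(image_label):
    """Map each label value to an (r, g, b, a) tuple taken from the "colors" array."""
    lookup = {}
    for entry in image_label.get("colors", []):
        value = entry["label-value"]  # MUST be present
        rgba = entry.get("rgba")      # MAY be present
        if rgba is None:
            continue
        if len(rgba) != 4 or not all(isinstance(v, int) and 0 <= v <= 255 for v in rgba):
            raise ValueError(f"invalid rgba for label {value}: {rgba}")
        lookup[value] = tuple(rgba)
    return lookup


image_label = {
    "colors": [
        {"label-value": 1, "rgba": [255, 255, 255, 255]},
        {"label-value": 2},
    ]
}
print(color_lookup(image_label))  # {1: (255, 255, 255, 255)}
```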
Next, the `image-label` object MAY contain the following keys: a `properties` key, and a `source` key.
Like the `colors` key, the value of the `properties` key MUST be an array of JSON objects describing the set of unique possible pixel values.
Each object in the `properties` array MUST contain the `label-value` key, whose value again MUST be an integer specifying the pixel value for that label.
Additionally, an arbitrary number of key-value pairs MAY be present for each label value, denoting arbitrary metadata associated with that label.
Label-value objects within the `properties` array do not need to have the same keys.
The value of the `source` key MUST be a JSON object containing information about the original image from which the label image derives.
This object MAY include a key `image`, whose value MUST be a string specifying the relative path to a Zarr image group.
The default value is `../../` since most labeled images are stored in a "labels" group that is nested within the original image group.
Here is an example of a simple `image-label` object for a label image in which 0s and 1s represent intercellular and cellular space, respectively:
<pre class=include-code>
path: examples/label_strict/colors_properties.json
highlight: json
</pre>
In this case, the pixels consisting of a 0 in the Zarr array will be displayed as 50% blue and 50% opacity. Pixels with a 1 in the Zarr array,
which correspond to cellular space, will be displayed as 50% green and 50% opacity.
"plate" metadata {#plate-md}
----------------------------
For high-content screening datasets, the plate layout can be found under the
custom attributes of the plate group under the `plate` key in the group-level metadata.
The `plate` dictionary MAY contain an `acquisitions` key whose value MUST be a list of
JSON objects defining the acquisitions for a given plate to which wells can refer. Each
acquisition object MUST contain an `id` key whose value MUST be a unique integer identifier
greater than or equal to 0 within the context of the plate to which fields of view can refer
(see [[#well-md]]).
Each acquisition object SHOULD contain a `name` key whose value MUST be a string identifying
the name of the acquisition. Each acquisition object SHOULD contain a `maximumfieldcount`
key whose value MUST be a positive integer indicating the maximum number of fields of view for the
acquisition. Each acquisition object MAY contain a `description` key whose value MUST be a
string specifying a description for the acquisition. Each acquisition object MAY contain
a `starttime` and/or `endtime` key whose values MUST be integer epoch timestamps specifying
the start and/or end timestamp of the acquisition.
The `plate` dictionary MUST contain a `columns` key whose value MUST be a list of JSON objects
defining the columns of the plate. Each column object defines the properties of
the column at the index of the object in the list. Each column in the physical plate
MUST be defined, even if no wells in the column are defined. Each column object MUST
contain a `name` key whose value is a string specifying the column name. The `name` MUST
contain only alphanumeric characters, MUST be case-sensitive, and MUST NOT be a duplicate of any
other `name` in the `columns` list. Care SHOULD be taken to avoid collisions on
case-insensitive filesystems (e.g. avoid using both `Aa` and `aA`).
The `plate` dictionary SHOULD contain a `field_count` key whose value MUST be a positive integer
defining the maximum number of fields of view across all wells.
The `plate` dictionary SHOULD contain a `name` key whose value MUST be a string defining the
name of the plate.
The `plate` dictionary MUST contain a `rows` key whose value MUST be a list of JSON objects
defining the rows of the plate. Each row object defines the properties of
the row at the index of the object in the list. Each row in the physical plate
MUST be defined, even if no wells in the row are defined. Each defined row MUST
contain a `name` key whose value MUST be a string defining the row name. The `name` MUST
contain only alphanumeric characters, MUST be case-sensitive, and MUST NOT be a duplicate of any
other `name` in the `rows` list. Care SHOULD be taken to avoid collisions on
case-insensitive filesystems (e.g. avoid using both `Aa` and `aA`).
The `plate` dictionary MUST contain a `version` key whose value MUST be a string specifying the
version of the plate specification.
The `plate` dictionary MUST contain a `wells` key whose value MUST be a list of JSON objects
defining the wells of the plate. Each well object MUST contain a `path` key whose value MUST
be a string specifying the path to the well subgroup. The `path` MUST consist of a `name` in
the `rows` list, a file separator (`/`), and a `name` from the `columns` list, in that order.
The `path` MUST NOT contain additional leading or trailing directories.
Each well object MUST contain both a `rowIndex` key whose value MUST be an integer identifying
the index into the `rows` list and a `columnIndex` key whose value MUST be an integer identifying
the index into the `columns` list. `rowIndex` and `columnIndex` MUST be 0-based. The
`rowIndex`, `columnIndex`, and `path` MUST all refer to the same row/column pair.
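The consistency requirement between `rowIndex`, `columnIndex`, and `path` can be verified with a few lines. This is a sketch under the assumption that the `plate` dictionary has already been parsed; the function name `check_wells` is hypothetical.

```python
def check_wells(plate):
    """Return a list of wells whose path disagrees with rowIndex/columnIndex."""
    problems = []
    rows = [r["name"] for r in plate["rows"]]
    columns = [c["name"] for c in plate["columns"]]
    for well in plate["wells"]:
        r, c = well["rowIndex"], well["columnIndex"]
        # Indices are 0-based positions in the "rows" and "columns" lists.
        if not (0 <= r < len(rows) and 0 <= c < len(columns)):
            problems.append(f"{well['path']}: index out of range")
            continue
        expected = f"{rows[r]}/{columns[c]}"
        if well["path"] != expected:
            problems.append(f"{well['path']}: expected path {expected}")
    return problems


plate = {
    "rows": [{"name": "A"}, {"name": "B"}],
    "columns": [{"name": "1"}, {"name": "2"}, {"name": "3"}],
    "wells": [
        {"path": "A/1", "rowIndex": 0, "columnIndex": 0},
        {"path": "B/3", "rowIndex": 1, "columnIndex": 1},
    ],
}
print(check_wells(plate))  # ['B/3: expected path B/2']
```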
For example the following JSON object defines a plate with two acquisitions and
6 wells (2 rows and 3 columns), containing up to 2 fields of view per acquisition.
<pre class=include-code>
path: examples/plate_strict/plate_6wells.json
highlight: json
</pre>
The following JSON object defines a sparse plate with one acquisition and
2 wells in a 96 well plate, containing one field of view per acquisition.
<pre class=include-code>
path: examples/plate_strict/plate_2wells.json
highlight: json
</pre>
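The requirement that `rowIndex`, `columnIndex`, and `path` all refer to the same row/column pair can be verified by rebuilding the expected path from the two indices. The sketch below is illustrative only: `check_wells` is a hypothetical helper, and it assumes the `plate` dictionary has already been parsed and that every entry in `rows` and `columns` carries a valid `name`.

```python
def _is_index(value, length):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value < length


def check_wells(plate):
    """Return a list of error strings for the `wells` list of a `plate` dictionary."""
    row_names = [row["name"] for row in plate["rows"]]
    column_names = [column["name"] for column in plate["columns"]]
    errors = []
    for index, well in enumerate(plate["wells"]):
        path = well.get("path")
        row_index = well.get("rowIndex")
        column_index = well.get("columnIndex")
        if not isinstance(path, str):
            errors.append(f"wells[{index}]: missing or non-string `path`")
            continue
        if not _is_index(row_index, len(row_names)):
            errors.append(f"wells[{index}]: `rowIndex` is not a valid 0-based index")
            continue
        if not _is_index(column_index, len(column_names)):
            errors.append(f"wells[{index}]: `columnIndex` is not a valid 0-based index")
            continue
        # The path is exactly "<row name>/<column name>", with no extra directories.
        expected = f"{row_names[row_index]}/{column_names[column_index]}"
        if path != expected:
            errors.append(f"wells[{index}]: `path` {path!r} does not match {expected!r}")
    return errors
```

Comparing against the full expected string also rejects paths with additional leading or trailing directories, such as `plate/A/1`.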
"well" metadata {#well-md}
--------------------------
For high-content screening datasets, the metadata about all fields of view
under a given well can be found under the "well" key in the attributes of the
well group.
The `well` dictionary MUST contain an `images` key whose value MUST be a list of JSON objects
specifying all fields of view for a given well. Each image object MUST contain a
`path` key whose value MUST be a string specifying the path to the field of view. The `path`
MUST contain only alphanumeric characters, MUST be case-sensitive, and MUST NOT be a duplicate
of any other `path` in the `images` list. If multiple acquisitions were performed in the plate,
each image object MUST contain an `acquisition` key whose value MUST be an integer identifying the acquisition
which MUST match one of the acquisition JSON objects defined in the plate metadata (see #plate-md).
The `well` dictionary SHOULD contain a `version` key whose value MUST be a string specifying the
version of the well specification.
For example, the following JSON object defines a well with four fields of
view. The first two fields of view were part of the first acquisition while
the last two fields of view were part of the second acquisition.
<pre class=include-code>
path: examples/well_strict/well_4fields.json
highlight: json
</pre>
The following JSON object defines a well with two fields of view in a plate with
four acquisitions. The first field is part of the first acquisition, and the second
field is part of the last acquisition.
<pre class=include-code>
path: examples/well_strict/well_2fields.json
highlight: json
</pre>
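The constraints on the `images` list can be sketched as a small validator. This is an illustration under stated assumptions, not a normative algorithm: `check_well` and its parameter `plate_acquisition_ids` are hypothetical names, the acquisition identifiers are assumed to have been collected from the plate metadata beforehand, "matching" an acquisition is assumed to mean equality of integer identifiers, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
ALNUM = re.compile(r"[A-Za-z0-9]+")


def check_well(well, plate_acquisition_ids):
    """Return a list of error strings for a `well` dictionary.

    `plate_acquisition_ids` is a hypothetical parameter: the set of integer
    acquisition identifiers gathered from the plate metadata.
    """
    images = well.get("images")
    if not isinstance(images, list):
        return ["`images` is missing or is not a list"]
    errors = []
    seen_paths = set()
    multiple_acquisitions = len(plate_acquisition_ids) > 1
    for index, image in enumerate(images):
        path = image.get("path")
        if not isinstance(path, str) or not ALNUM.fullmatch(path):
            errors.append(f"images[{index}]: `path` must be an alphanumeric string")
        elif path in seen_paths:
            errors.append(f"images[{index}]: duplicate `path` {path!r}")
        else:
            seen_paths.add(path)
        acquisition = image.get("acquisition")
        if acquisition is None:
            # The key is only required when the plate has several acquisitions.
            if multiple_acquisitions:
                errors.append(f"images[{index}]: `acquisition` is required")
        elif (
            not isinstance(acquisition, int)
            or isinstance(acquisition, bool)
            or acquisition not in plate_acquisition_ids
        ):
            errors.append(f"images[{index}]: `acquisition` does not match the plate metadata")
    return errors
```

Passing the identifiers in as a set keeps the check independent of how the plate metadata is loaded.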
Specification naming style {#naming-style}
==========================================
Multi-word keys in this specification should use the `camelCase` style.
NB: some parts of the specification don't obey this convention as they
were added before this was adopted, but they should be updated in due course.
Implementations {#implementations}
==================================
Projects which support reading and/or writing OME-NGFF data include:
<dl>
<dt><strong>[bigdataviewer-ome-zarr](https://github.com/mobie/bigdataviewer-ome-zarr)</strong></dt>
<dd>Fiji-plugin for reading OME-Zarr.</dd>
<dt><strong>[bioformats2raw](https://github.com/glencoesoftware/bioformats2raw)</strong></dt>
<dd>A performant, Bio-Formats image file format converter.</dd>
<dt><strong>[omero-ms-zarr](https://github.com/ome/omero-ms-zarr)</strong></dt>
<dd>A microservice for OMERO.server that converts images stored in OMERO to OME-Zarr files on the fly, served via a web API.</dd>
<dt><strong>[idr-zarr-tools](https://github.com/IDR/idr-zarr-tools)</strong></dt>
<dd>A full workflow demonstrating the conversion of IDR images to OME-Zarr images on S3.</dd>
<dt><strong>[OMERO CLI Zarr plugin](https://github.com/ome/omero-cli-zarr)</strong></dt>
<dd>An OMERO CLI plugin that converts images stored in OMERO.server into a local Zarr file.</dd>
<dt><strong>[ome-zarr-py](https://github.com/ome/ome-zarr-py)</strong></dt>
<dd>A napari plugin for reading ome-zarr files.</dd>
<dt><strong>[vizarr](https://github.com/hms-dbmi/vizarr/)</strong></dt>
<dd>A minimal, purely client-side program for viewing Zarr-based images with Viv & ImJoy.</dd>
<dt><strong>[ITKIOOMEZarrNGFF](https://github.com/InsightSoftwareConsortium/ITKIOOMEZarrNGFF/)</strong></dt>
<dd>ITK IO for images stored in OME-NGFF format.</dd>
</dl>
<img src="https://downloads.openmicroscopy.org/presentations/2020/Dundee/Workshops/NGFF/zarr_diagram/images/zarr-ome-diagram.png" alt="Diagram of related projects">
All implementations present an equivalent representation of a dataset which can be downloaded or uploaded freely. An interactive
version of this diagram is available from the [OME2020 Workshop](https://downloads.openmicroscopy.org/presentations/2020/Dundee/Workshops/NGFF/zarr_diagram/).
Mouse over the black boxes representing the implementations above to get a quick tip on how to use them.
Note: If you would like to see your project listed, please open an issue or PR on the [ome/ngff](https://github.com/ome/ngff) repository.
Citing {#citing}
================
[Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.](https://ngff.openmicroscopy.org/0.4)
J. Moore, *et al*. Open Microscopy Environment Consortium, 8 February 2022.
This edition of the specification is [https://ngff.openmicroscopy.org/0.4/](https://ngff.openmicroscopy.org/0.4/).
The latest edition is available at [https://ngff.openmicroscopy.org/latest/](https://ngff.openmicroscopy.org/latest/).
[(doi:10.5281/zenodo.4282107)](https://doi.org/10.5281/zenodo.4282107)
Version History {#history}
==========================
<table>
<thead>
<tr>
<td>Revision</td>
<td>Date</td>
<td>Description</td>
</tr>
</thead>
<tr>
<td>0.4.1</td>
<td>2023-02-09</td>
<td>expand on "labels" description</td>
</tr>
<tr>
<td>0.4.1</td>
<td>2022-09-26</td>
<td>transitional metadata for image collections ("bioformats2raw.layout")</td>
</tr>
<tr>
<td>0.4.0</td>
<td>2022-02-08</td>
<td>multiscales: add axes type, units and coordinateTransformations</td>
</tr>
<tr>
<td>0.4.0</td>
<td>2022-02-08</td>
<td>plate: add rowIndex/columnIndex </td>
</tr>
<tr>
<td>0.3.0</td>
<td>2021-08-24</td>
<td>Add axes field to multiscale metadata </td>
</tr>
<tr>
<td>0.2.0</td>
<td>2021-03-29</td>
<td>Change chunk dimension separator to "/" </td>
</tr>
<tr>
<td>0.1.4</td>
<td>2020-11-26</td>
<td>Add HCS specification </td>
</tr>
<tr>
<td>0.1.3</td>
<td>2020-09-14</td>
<td>Add labels specification </td>
</tr>
<tr>
<td>0.1.2 </td>
<td>2020-05-07</td>
<td>Add description of "omero" metadata </td>
</tr>
<tr>
<td>0.1.1 </td>
<td>2020-05-06</td>
<td>Add info on the ordering of resolutions </td>
</tr>
<tr>
<td>0.1.0 </td>
<td>2020-04-20</td>
<td>First version for internal demo </td>
</tr>
</table>
<pre class="biblio">
{
"blogNov2020": {
"href": "https://blog.openmicroscopy.org/file-formats/community/2020/11/04/zarr-data/",
"title": "Public OME-Zarr data (Nov. 2020)",
"authors": [
"OME Team"
],
"status": "Informational",
"publisher": "OME",
"id": "blogNov2020",
"date": "04 November 2020"
},
"imagesc26952": {
"href": "https://forum.image.sc/t/ome-s-position-regarding-file-formats/26952",
"title": "OME’s position regarding file formats",
"authors": [
"OME Team"
],
"status": "Informational",
"publisher": "OME",
"id": "imagesc26952",
"date": "19 June 2020"
},
"n5": {
"id": "n5",
"href": "https://github.com/saalfeldlab/n5/issues/62",
"title": "N5---a scalable Java API for hierarchies of chunked n-dimensional tensors and structured meta-data",
"status": "Informational",
"authors": [
"John A. Bogovic",
"Igor Pisarev",
"Philipp Hanslovsky",
"Neil Thistlethwaite",
"Stephan Saalfeld"
],
"date": "2020"
},
"ome-zarr-py": {
"id": "ome-zarr-py",
"href": "https://doi.org/10.5281/zenodo.4113931",
"title": "ome-zarr-py: Experimental implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.",
"status": "Informational",
"publisher": "Zenodo",
"authors": [
"OME",
"et al"
],
"date": "06 October 2020"
},
"zarr": {
"id": "zarr",
"href": "https://doi.org/10.5281/zenodo.4069231",
"title": "Zarr: An implementation of chunked, compressed, N-dimensional arrays for Python.",
"status": "Informational",
"publisher": "Zenodo",
"authors": [
"Alistair Miles",
"et al"
],
"date": "06 October 2020"
}
}
</pre>