Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XMLMockObjects: make artificial SHA1 schema-compliant #158

Merged
merged 2 commits into from
Nov 10, 2022

Conversation

melissalinkert
Copy link
Member

@sbesson
Copy link
Member

sbesson commented Apr 18, 2022

To my original surprise, this change failed the unit tests in https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-image/1151 with the following validation error

2022-04-18 00:17:05,448 [main] ERROR loci.common.xml.XMLTools - cvc-length-valid: Value '1234567890ABCDEF1234' with length = '10' is not facet-valid with respect to length '20' for type 'Hex40'.
2022-04-18 00:17:05,449 [main] ERROR loci.common.xml.XMLTools - cvc-type.3.1.3: The value '1234567890ABCDEF1234' of element 'HashSHA1' is not valid.

After some investigation, I suspect the explanation is that the HashSHA1 element is of Hex40 type defined as

<xsd:simpleType name="Hex40">
<xsd:annotation>
<xsd:documentation>
Binary contents coded in hexadecimal (20 characters long)
</xsd:documentation>
</xsd:annotation>
<xsd:restriction base="xsd:hexBinary">
<xsd:length value="20"/>
</xsd:restriction>
</xsd:simpleType>

So the 20 length restriction might not apply to the hexadecimal string but rather to its equivalent binary representation. This matches what Python binascii.unhexlify returns about the former value and the newer one

>>> from binascii import unhexlify
>>> len(unhexlify("1234567890ABCDEF1234567890ABCDEF12345678"))
20
>>> len(unhexlify("1234567890ABCDEF1234"))
10
>>> unhexlify("1234567890ABCDEF1234567890ABCDEF12345678")
b'\x124Vx\x90\xab\xcd\xef\x124Vx\x90\xab\xcd\xef\x124Vx'

Using a 40 character string passes Java tests and fails Python tests.
Using a 20 character string passes Python tests but fails Java tests.
@melissalinkert
Copy link
Member Author

Following today's formats team discussion, removing HashSHA1 entirely from XMLMockObjects and re-including.

The longer-term solution is to update the schema to deprecate HashSHA1, and ensure that nothing else in our stack uses this feature. git grep HashSHA1 on https://github.com/ome/bioformats/tree/3d73a85c5af636cf5af42e952adae4e22611585b shows nothing.

@sbesson sbesson removed the exclude label Apr 18, 2022
joshmoore added a commit to joshmoore/ome-types that referenced this pull request Sep 15, 2022
While trying to get the `bioformats2raw.layout` specification
(ome/ngff#112) this raised its head
again. Following the discussion in ome/ome-model#158
changing the size checks to `40` passes.

fix: ome/bioformats#3810
fix: ome/napari-ome-zarr#47 (comment)
tlambert03 pushed a commit to tlambert03/ome-types that referenced this pull request Sep 16, 2022
* Update Hex40 definition

While trying to get the `bioformats2raw.layout` specification
(ome/ngff#112) this raised its head
again. Following the discussion in ome/ome-model#158
changing the size checks to `40` passes.

fix: ome/bioformats#3810
fix: ome/napari-ome-zarr#47 (comment)

* Bump to min_length=40
@sbesson sbesson self-requested a review November 7, 2022 14:49
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original issue associated with the reading the Plane.HashSHA1 has been fixed downstream - see e.g. tlambert03/ome-types#146.

As discussed at the Formats meeting, a secondary problem is that the Plane.HashSHA1 carries the expectation to store a hash of the plane data. At the moment, this mock infrastructure will unconditionally store an arbitrary value which has the potential to create confusion for downstream consumers.

Given all the builds have been staying green with this PR included, I would agree to get it merged and included in the next ome-model release.

@dgault dgault merged commit 5cf9404 into ome:master Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants