Add codec for tag 0xC4A5 PrintImageMatching #81

Open
georgethebeatle

This PR attempts to fix #80 by adding a codec for the PrintImageMatching tag. I tried to find more information about the structure of this tag, but documentation is scarce. Here is what I was able to find:

  • This link explains the basic structure of the tag. It looks like the tag uses an IFD-like format to encode sub-tags, but it is not clear what the tag IDs are. The tag is meant to be used by printers, so keeping the raw bytes around may be a good enough implementation: anyone who knows how to parse it can still do so.
  • This issue discusses the same tag. It also mentions the nested IFD structure and the difficulty of finding documentation for it.

I hope this PR makes sense. It definitely unblocks my use case: I am trying to read and write basic Exif tags like DateTimeOriginal and DateTimeDigitized and was getting the ErrUnparseableValue mentioned in the issue. With this codec the error is gone, and exif tool confirms that the PrintImageMatching tag is neither tampered with nor removed.

As information about this tag is hard to find on the internet, the
codec is fairly basic: it just checks the header, parses the version,
and keeps the raw bytes.
@@ -9,10 +9,7 @@ go 1.12
require (
Owner

Do not include module changes.

@@ -1,11 +1,17 @@
github.com/dsoprea/go-exif/v2 v2.0.0-20200321225314-640175a69fe4/go.mod h1:Lm2lMM2zx8p4a34ZemkaUV95AnMl4ZvLbCUbwOvLC2E=
Owner

Do not include module changes.

log.Panicf("invalid header for tag 0xC4A5 PrintImageMatching")
}

versionLen := bytes.IndexByte(rawBytes[8:], 0)
Owner

Validate that versionLen is not -1.
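
A minimal sketch of the requested check, assuming the layout used in this PR: an 8-byte header ("PrintIM" plus a NUL byte) followed by a NUL-terminated version string. The helper name `parsePrintIMVersion` and the error messages are hypothetical and not part of go-exif; the sketch also returns errors where the PR calls log.Panicf.

```go
package main

import (
	"bytes"
	"errors"
)

// printIMHeader is the assumed 8-byte signature: "PrintIM" plus a NUL byte.
var printIMHeader = []byte("PrintIM\x00")

// parsePrintIMVersion is a hypothetical helper. It checks the header and
// returns the NUL-terminated version string that follows it.
func parsePrintIMVersion(rawBytes []byte) (string, error) {
	if !bytes.HasPrefix(rawBytes, printIMHeader) {
		return "", errors.New("invalid header for tag 0xC4A5 PrintImageMatching")
	}

	rest := rawBytes[len(printIMHeader):]

	// bytes.IndexByte returns -1 when no NUL terminator is present;
	// slicing with that value would panic, so reject it explicitly.
	versionLen := bytes.IndexByte(rest, 0)
	if versionLen == -1 {
		return "", errors.New("version in tag 0xC4A5 is not NUL-terminated")
	}

	return string(rest[:versionLen]), nil
}
```

Without the -1 check, a value that ends right after the version digits would cause an out-of-range slice instead of a descriptive error.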


valueContext.SetUndefinedValueType(exifcommon.TypeByte)
rawBytes, err := valueContext.ReadBytes()
ev.Value = rawBytes
Owner

The documentation describes how to find an entry count and the list of entries. We can parse things better than just yielding an opaque byte slice, yes?

Author

I could not find any decent documentation on PrintIM. It is not specified in the Exif 2.2 spec. The only thing I found was the link I posted above, and it does not agree with my images. My images have a 106-byte PrintIM. According to the linked description, this would mean:

Header: 8
Version: 5
ExtraNull: 1
EntryCount: 2
Entries: 6*EntryCount

This works out if there are 15 entries: 8 + 5 + 1 + 2 = 16 bytes come before the entries, leaving 90 = 6*15 bytes. However, here are the actual PrintIM bytes of one of my photos:

00000000: 5072 696e 7449 4d00 3033 3030 0000 0300  PrintIM.0300....
00000010: 0200 0100 0000 0300 2200 0000 0101 0000  ........".......
00000020: 0000 0911 0000 1027 0000 0b0f 0000 1027  .......'.......'
00000030: 0000 9705 0000 1027 0000 b008 0000 1027  .......'.......'
00000040: 0000 011c 0000 1027 0000 5e02 0000 1027  .......'..^....'
00000050: 0000 8b00 0000 1027 0000 cb03 0000 1027  .......'.......'
00000060: 0000 e51b 0000 1027 0000 0a              .......'...

Bytes 15-16 (counting from 1) are [03 00], which yields 3 entries (the byte order is little-endian). So it does not work out.
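
The mismatch can be reproduced with a short sketch that assumes the layout listed above. The constant and function names are hypothetical, chosen only for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Sizes taken from the layout described in this comment (assumed, not
// confirmed by any official specification).
const (
	headerLen     = 8 // "PrintIM" plus NUL
	versionLen    = 5 // four characters plus NUL
	extraNullLen  = 1
	entryCountLen = 2
	entryLen      = 6
	prefixLen     = headerLen + versionLen + extraNullLen + entryCountLen
)

// entriesImpliedByLength returns how many 6-byte entries fit after the
// 16-byte prefix in a value of the given total length.
func entriesImpliedByLength(totalLen int) (int, error) {
	payload := totalLen - prefixLen
	if payload < 0 || payload%entryLen != 0 {
		return 0, errors.New("length does not match the assumed layout")
	}
	return payload / entryLen, nil
}

// declaredEntryCount reads the little-endian uint16 stored in the last
// two bytes of the prefix (0-based offsets 14 and 15).
func declaredEntryCount(raw []byte) (uint16, error) {
	if len(raw) < prefixLen {
		return 0, errors.New("value is shorter than the assumed prefix")
	}
	return binary.LittleEndian.Uint16(raw[prefixLen-entryCountLen : prefixLen]), nil
}
```

For a 106-byte value, `entriesImpliedByLength` gives 15, while `declaredEntryCount` on the first 16 bytes of the dump gives 3.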

Let's say we ignore the count and treat the remaining bytes as 6-byte entries. Here is what we get:

0200 0100 0000
0300 2200 0000
0101 0000 0000
0911 0000 1027
0000 0b0f 0000
1027 0000 9705
0000 1027 0000

and so on.

If the first 2 bytes of each entry designate its tag ID, we get two entries with tag ID [00 00]. That does not make much sense: the ID is duplicated, and 0 is a meaningless tag ID.
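
The same experiment can be sketched in code. It assumes each 6-byte entry is a 2-byte little-endian tag ID followed by a 4-byte little-endian value; the value width is a guess, and the type and function names are hypothetical.

```go
package main

import "encoding/binary"

// entry is a hypothetical view of one 6-byte record:
// a 2-byte tag ID followed by a 4-byte value, both little-endian.
type entry struct {
	ID    uint16
	Value uint32
}

// splitEntries cuts payload into 6-byte records, ignoring any trailing
// bytes that do not fill a whole record.
func splitEntries(payload []byte) []entry {
	const entryLen = 6
	entries := make([]entry, 0, len(payload)/entryLen)
	for off := 0; off+entryLen <= len(payload); off += entryLen {
		entries = append(entries, entry{
			ID:    binary.LittleEndian.Uint16(payload[off : off+2]),
			Value: binary.LittleEndian.Uint32(payload[off+2 : off+entryLen]),
		})
	}
	return entries
}

// duplicateIDs returns each tag ID that occurs more than once,
// in the order the second occurrence is seen.
func duplicateIDs(entries []entry) []uint16 {
	seen := make(map[uint16]int)
	var dups []uint16
	for _, e := range entries {
		seen[e.ID]++
		if seen[e.ID] == 2 {
			dups = append(dups, e.ID)
		}
	}
	return dups
}
```

Feeding the seven entries listed above into `splitEntries` and then `duplicateIDs` reports tag ID 0 as the only duplicate, which is the inconsistency described here.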

That's why I gave up and decided to keep the tag opaque rather than parse it incorrectly.

Successfully merging this pull request may close these issues.

ErrUnparseableValue error when reading in a JPEG file taken with a Sony digital camera