Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified Metadata Container #1505

Open
clanmills opened this issue Mar 24, 2021 · 4 comments
Open

Unified Metadata Container #1505

clanmills opened this issue Mar 24, 2021 · 4 comments
Milestone

Comments

@clanmills
Copy link
Collaborator

clanmills commented Mar 24, 2021

svn://dev.exiv2.org/svn/branches/unstable

I'm going to study this and report back. It would be great to get this into '0.28' as I believe this could be a major stepping stone to reach the Exiv2 1.0 goal of a stable API that we expect to endure.

With every release of the 0.27 "dots", I have successfully tested that shared objects/DLLs can be updated without breaking existing code. Once we have the Unified Metadata Container, we may have achieved Exiv2 v1.0.

@clanmills clanmills added this to the v0.27.4 milestone Mar 24, 2021
@clanmills clanmills self-assigned this Mar 24, 2021
@clanmills clanmills changed the title study Unified Metadata Container implementation on unstable Study Unified Metadata Container implementation Mar 24, 2021
@clanmills
Copy link
Collaborator Author

clanmills commented Mar 24, 2021

I have this built and working. There's a new file key.cpp of about 800 lines of code. Mostly template stuff. There are a couple of new classes Key1 and and Tag1. These are used in actions.cpp to enumerate and delete stuff. However actions.cpp is still using ExifData, XmpData and IptcData. Maybe that's simply because Andreas has not converted the application code to take advantage of this library feature. For sure, it's good to retain the old APIs.

I've thought of several metadata families that could be usefully added to Exiv2 to really explore the opportunity that arrives with the Unified Metadata Container:

  1. PNG Textual Information
    PNG.zTXt.Author Value
    PNG. tEXt. Description Value
    PNG. iTXt.MyPrivateTag Value

  2. ICC Data
    ICC.desc.Tag Value
    ICC.cprt.Tag Value

  3. File Properties
    File.MimeType
    File.Size

We'll really need a new key parser with the following syntax:

Family.[Group].Key[[page]] Example: Exif.Photo.PixelXDimension[3]

The optional [Group]. is essential for Exif, PNG, ICC and not needed for File. The option [page] should be engineered into the tag parser, although it has not purpose until we support a multipage format such as tiff or PDF. The page field could be useful for working with multi-resolution thumbnails in JPEG which are clearly visible to tvisitor and ignored by Exiv2:

531 rmills@rmillsm1:~/gnu/exiv2/team/book $ tvisitor -pR ~/Stonehenge.jpg  | grep -i Dimen
         500 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 6000
         512 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 4000
          30 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 640
          42 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 424
          30 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 1620
          42 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 1080
532 rmills@rmillsm1:~/gnu/exiv2/team/book $

@clanmills
Copy link
Collaborator Author

clanmills commented Mar 25, 2021

I'm rather puzzled by the implementation of the Unified Metadata Container. Andreas introduced a new class Key1 which is veneer for ExifKey, IptcKey or XmpKey. class Tag1 is a veneer for ExifTag, IptcTag or XmpTag.

However there's no class Metadata. I would expect things to work as follows:

image = Exiv2::ImageFactory::open(path);
assert(image.get() != 0);
image->readMetadata();
for ( Exiv2::MetaData::const_iterator it = image->metaData().begin() ; it != image->metadata().end() ; it++) {
   std::cout << it->key()->toString() << "   " << it->value()->toString() << std::endl;
}

Applications are not required to use classes ExifData[], IptcData[] and XmpData[]. For backwards compatibility, it's good to keep the existing xxxData[] structures. Internally, it's very desirable to retain ExifData[], IptcData[] and XmpData[] to avoid refactoring readMetadata() and writeMetadata() for every image handler.

There is very interesting code here. Adding a new family of Metadata (for example: ICC, File, PNG) only involves extending this table. It's very desirable to add families without touching existing code. I suspect we'll have to widen this table with the pointer to xxxData[].

    const Key1Impl::MetadataInfo Key1Impl::metadataInfo_[] = {
        { mdExif, "Exif",  ExifKey1::create },
        { mdIptc, "Iptc",  IptcKey1::create },
        { mdXmp,  "Xmp",   XmpKey1::create  }
    };

The moment code edits any xxxData[], the image class must synchronise every xxxData[]. I'm going to investigate implementing class Exiv2::Metadata() on the 0.27-maintenance branch. I think class metaData might be:

typedef struct {
    uint32_t   index ; // current element
    uin16_t    nData ; // length of data[] array (currently 3)
    void**     data[];
} MetaData;

This design avoids synchronisation as it holds pointers to xxxData[]. There could be edge case issues with index when code is inserting, deleting in an xxxData[] array. It's seldom a good idea to insert/delete while using an iterator.

@clanmills
Copy link
Collaborator Author

clanmills commented Mar 25, 2021

Good Progress. We don't need to synchronize the data. We simply copy it on demand. I've written the following function in image.hpp. Andreas seems to have omitted this. I'll search some more.

        virtual Metadata& metaData() {
            metadata_.clear();
            for (Exiv2::ExifData::const_iterator i = exifData().begin(); i != exifData().end() ; ++i) {
                metadata_.add(Exiv2::Tag1(*i));
            }
            for (Exiv2::IptcData::const_iterator i = iptcData().begin(); i != iptcData().end() ; ++i) {
                metadata_.add(Exiv2::Tag1(*i));
            }
            for (Exiv2::XmpData::const_iterator i = xmpData().begin(); i != xmpData().end(); ++i) {
                metadata_.add(Exiv2::Tag1(*i));
            }
            return metadata_ ;
        }

    protected:
        // DATA
        Metadata          metadata_ ;
        BasicIo::AutoPtr  io_;                //!< Image data IO pointer
...

I've added the following code to exifprint.cpp to test this:

    Exiv2::Metadata&  metadata = image->metaData();
    std::cout << "metadata count = " << metadata.count() << std::endl;

    for ( int i = 0 ; i < 3 ; i++ ) {
        const char* name = "";
        Exiv2::MetadataId family   = Exiv2::mdNone;
        switch ( i ) {
            case 0 : family = Exiv2::mdExif ; name = "Exif"; break;
            case 1 : family = Exiv2::mdIptc ; name = "Iptc"; break;
            case 2 : family = Exiv2::mdXmp  ; name = "Xmp" ; break;
        }
        size_t count = 0;
        for (Exiv2::Metadata::const_iterator it = metadata.begin(family) ;
             it != metadata.end(family) ;  it++
        ) {
            std::cout << it->key() << std::endl ; // " = " << it->value() << std::endl;
            count++;
        }
        std::cout << name <<  " =  " << count << std::endl;
    }

The result is splendid:

664 rmills@rmillsm1:~/gnu/github/exiv2/unstable/xcode_build $ samples/Debug/exifprint ~/Stonehenge.jpg 
metadata count = 219
Exif.Image.Make
Exif.Image.Model
...
Exif.NikonCb2b.0x0095
Exif =  209
Iptc.Envelope.ModelVersion
...
Iptc =  4
Xmp.cm2e.Family
...
Xmp =  6
665 rmills@rmillsm1:~/gnu/github/exiv2/unstable/xcode_build $ 

As always with Andreas' code, I am surprised and very pleased by his solution.

I'm sure a lot of code in the exiv2 command-line utility can be simplified (removed) with this feature.

Tomorrow I will investigate adding a new metadata Family to Exiv2. It's unlikely that Exiv2 v0.28 will ship with an additional family, however I'd like to explore how this can be achieved. I think I will investigate ICC profiles.

@clanmills
Copy link
Collaborator Author

I don't intend to do any more work on this for v0.27.4. I've done sufficient work to be confident that this code is really good and should be ported to branch 'main' to ship in v1.00.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants