Add Support for Extra MetaData fields #3084

Closed
wainaina opened this issue Jan 11, 2021 · 2 comments

@wainaina
Contributor

Short description

At the BBC, images managed by legacy systems are being moved to the BBC images service (running on the Grid).

Current situation

  • The Grid supports a standard set of metadata fields, described in com.gu.mediaservice.model.ImageMetadata.scala

Current Challenge

The metadata currently available from the legacy systems has more fields than the Grid's standard set, so there is a need for enough flexibility to alter the metadata structure without adversely affecting existing functionality.

A solution to this will involve:

  • Changing the structure of the com.gu.mediaservice.model.ImageMetadata class to include the extra metadata fields.
  • Updating the com.gu.mediaservice.lib.metadata.ImageMetadataConverter class to extract the extra metadata from images, after which it will be cleaned by the custom metadata cleaners.
  • Ensuring that the metadata from all the legacy systems is retained and indexed, so that the extra metadata fields are searchable.
  • Ensuring that the data structure holding the extra metadata is optional and allows new metadata field types to be added if further extra metadata fields are introduced.
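
The extract-then-clean step in the list above could look roughly like the following. This is a minimal sketch in plain Scala, not the Grid's ImageMetadataConverter or its cleaner classes: the object name, the `extraKeys` allow-list, the `legacy:` field names and the flat `Map[String, String]` input are all assumptions made for illustration.

```scala
object ExtraMetadataExtractionSketch {
  // Hypothetical allow-list of legacy fields to carry over (assumed names).
  val extraKeys: Set[String] = Set("legacy:programme", "legacy:department")

  // Extract: keep only the allow-listed keys from the raw embedded metadata.
  def extract(raw: Map[String, String]): Map[String, String] =
    raw.filter { case (key, _) => extraKeys.contains(key) }

  // Clean: trim surrounding whitespace and drop values that end up blank.
  def clean(extra: Map[String, String]): Map[String, String] =
    extra
      .map { case (key, value) => key -> value.trim }
      .filter { case (_, value) => value.nonEmpty }

  // Returns None when nothing survives, so the resulting field stays optional.
  def extractExtraMetadata(raw: Map[String, String]): Option[Map[String, String]] =
    Some(clean(extract(raw))).filter(_.nonEmpty)
}
```

Returning an `Option` here means images that carry none of the extra fields would look the same as they do today.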

Proposed ImageMetadata field structure, including the proposed extraMetadata field:

[image]
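
As a rough illustration of what an optional, extensible extraMetadata field could look like, here is a minimal sketch in plain Scala. It is not the actual ImageMetadata class: `ImageMetadataSketch` is a trimmed, hypothetical stand-in, and `ExtraValue`, `TextValue`, `ListValue` and `searchableTerms` are names invented for this example.

```scala
object ExtraMetadataSketch {
  // Hypothetical value types; further kinds of field can be added as new cases.
  sealed trait ExtraValue
  final case class TextValue(value: String) extends ExtraValue
  final case class ListValue(values: List[String]) extends ExtraValue

  // Trimmed, illustrative stand-in for ImageMetadata (assumed subset of fields).
  final case class ImageMetadataSketch(
    title: Option[String] = None,
    description: Option[String] = None,
    credit: Option[String] = None,
    keywords: List[String] = Nil,
    // Optional, so images without extra metadata are unaffected.
    extraMetadata: Option[Map[String, ExtraValue]] = None
  )

  // Flatten the extra fields to field name -> strings, e.g. as input to indexing.
  def searchableTerms(m: ImageMetadataSketch): Map[String, List[String]] =
    m.extraMetadata.getOrElse(Map.empty[String, ExtraValue]).map {
      case (key, TextValue(v))  => key -> List(v)
      case (key, ListValue(vs)) => key -> vs
    }
}
```

Using a sealed trait for the values is one way to leave room for new field types; a plain `Map[String, String]` would be simpler if every extra field is free text.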

@akash1810
Member

akash1810 commented Jan 11, 2021

I think this is a sensible proposal; however, I wonder if we can break the implementation into multiple stages?

IIUC there are three tiers to this data:

  1. Searchable in the UI via structured search
  2. Read-only on the image preview page
  3. Editable on the image preview page

I'm not sure if we've categorised which fields are in which tier and it might be good to deliver 1 (search) whilst we understand the use case for 2 and 3?

Currently, all metadata embedded in the image file gets extracted into the fileMetadata block. That is, any arbitrary XMP metadata is stored in Elasticsearch.

Today, it is possible to search for the presence of an arbitrary field using the has search. I wonder if we should add the ability to search on the value of arbitrary fields too as a way of solving 1?
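
To make the distinction concrete, the difference between searching on the presence of a field and searching on its value can be sketched as below. This is an in-memory illustration in plain Scala, not the Grid's query code or an Elasticsearch query: the `FileMetadata` type alias, the function names and the `example:programme` field are assumptions.

```scala
object ArbitraryFieldSearchSketch {
  // Simplified stand-in for the fileMetadata block: namespace -> (field -> value).
  type FileMetadata = Map[String, Map[String, String]]

  // Presence check, analogous in spirit to the existing "has" search.
  def hasField(fm: FileMetadata, namespace: String, field: String): Boolean =
    fm.get(namespace).exists(_.contains(field))

  // Value check: the additional capability being suggested here.
  // Case-insensitive matching is an assumption of this sketch.
  def fieldEquals(fm: FileMetadata, namespace: String, field: String, expected: String): Boolean =
    fm.get(namespace).flatMap(_.get(field)).exists(_.equalsIgnoreCase(expected))
}
```

With data such as `Map("xmp" -> Map("example:programme" -> "News"))`, `hasField` answers whether the field exists at all, while `fieldEquals` only matches when the stored value is the one being searched for.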

As demonstrated by the type of metadata in Edits, anything in ImageMetadata is editable. That is, I think using extraMetadata to hold read-only fields would break this model.

"At BBC, images managed by legacy systems are being moved to the BBC images service (running on the grid)."

I'd be keen to understand this a bit more. We performed a similar process at the Guardian and had a use-case to be able to answer the question: "Does this old image id exist in Grid?". To answer this, there is an identifiers blob in the image which is searchable.
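
The "does this old image id exist?" lookup could be pictured like this. It is a minimal in-memory sketch in plain Scala, assuming the identifiers blob behaves like a string-to-string map; `ImageSketch`, `findByLegacyId` and the `legacy-system` key are hypothetical names, and a real deployment would run this as a search query rather than a scan.

```scala
object IdentifierLookupSketch {
  // Illustrative stand-in for an image: only the id and the identifiers blob.
  final case class ImageSketch(id: String, identifiers: Map[String, String])

  // Find the image, if any, that records `legacyId` under the given legacy system key.
  def findByLegacyId(
    images: Seq[ImageSketch],
    system: String,
    legacyId: String
  ): Option[ImageSketch] =
    images.find(_.identifiers.get(system).contains(legacyId))
}
```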

@paperboyo
Contributor

I think this is fixed by #3132, #3183 and #3205. Let us know if not!
