Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct colour space before cropping #1281

Merged
merged 10 commits into from
Sep 25, 2015
Merged

Correct colour space before cropping #1281

merged 10 commits into from
Sep 25, 2015

Conversation

theefer
Copy link
Contributor

@theefer theefer commented Sep 21, 2015

Currently, the cropper fails in at least two cases:

  • If cropping an image where the colour space in the ICC profile doesn't match the actual colour model of the image data. e.g. an image with RGB image data with a CMYK profile (note that both such an image is technically incorrect). This is because when trying to convert from CMYK to sRGB profile, GraphicsMagick detects that the image data isn't actually CMYK and it fails with an error about mismatching colour models.
  • If cropping an image where the image data is not in RGB (e.g. CMYK or grayscale) and there is no embedded ICC profile. In that case, adding the profile doesn't trigger any conversion; the sRGB profile is simply embedded, but it doesn't match the image data, resulting in an image that doesn't look correct in browsers.

The solution is to always read out both the embedded ICC profile colour space and the image data colour model and compare them before any conversion.

If they match, great, we can just apply our normal conversion to sRGB profile (from whatever profile is correctly embedded).

If they don't match, either because there is no embedded ICC profile, or because the embedded ICC profile is not matching the actual colour model, then we strip any embedded profile and replace it with a generic ICC profile that matches the actual colour model of the image data. Note that there is no guarantee that the profile will exactly match the image data, but it's a best effort estimate with standard ICC profiles. We can then resume the normal flow of converting the image to the desired output profile (sRGB), etc.

Initial tests with various images with mismatching or missing ICC profiles gave correct results.

This will also be used to correct the colour space of thumbnails we generate, in a future PR.

Thanks to @paperboyo for his big help with this!

TODO

  • More manual testing
  • Think of whether we can write automated tests for this somehow

def runConvertCmd(op: IMOperation): Future[Unit] = Future((new ConvertCmd).run(op))
def runIdentifyCmd(op: IMOperation): Future[List[String]] = Future {
val cmd = new IdentifyCmd()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you presuming that the identify command infers the colorspace from (a) some heuristic on the image data itself, or (b) uses image metadata?

When looking at this previously I was defeated by the ImageMagick source code being impenetrable to me. I think testing indicated that it is in fact (b). If this is the case then images with incorrect or missing metadata will still fail this test.

To confirm which situation is occurring you could intentionally strip all metadata from an image and then run the identify command to see whether it can correctly infer the colorspace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm which situation is occurring you could intentionally strip all metadata from an image and then run the identify command to see whether it can correctly infer the colorspace.

Yes, it (GM) can. After running -strip, the jpeg-colorspace-name is still present.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theefer i'd try this with the same setup as in production (can't actually remember if we are using IM or GM, and the version may have an impact).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work fine on the cropper instance from some quick tests.

@paperboyo
Copy link
Contributor

💎! Thanks, @theefer, great explanation!

Just some thoughts as I can’t read code:

Note: ICC identified as "icm" here

Are we stripping both icm AND icc (the latter being much more common)?

I’m worried about robustness of our colour model inference via jpeg-colorspace-name. This is not available in IM whose recent, better colour management changes may have not been ported to GM yet. It also reads jpeg ;-)

In general with that, I would like to see if among all our imagery there is even one image that is not correctly identified by the above but opens in Photoshop. If yes, I would like to find more robust way of identification. Only completely broken images have no colour model...

The above detection could inform our db (it may be useful to list colour model in Show all metadata and be able to search for it): #765

@kenoir
Copy link
Contributor

kenoir commented Sep 21, 2015

jpeg-colorspace-name

This. I think I gave up last time I tried this because it's not possible to work out the colorspace without the metadata, (which kind of defeats the point). I seem to remember there are Java libraries that do run heuristics on the actual image data (possibly ImageJ) or indeed as @paperboyo says it's possibly more accurate in the most recent versions of ImageMagick.

@theefer
Copy link
Contributor Author

theefer commented Sep 21, 2015

As far as we could see, jpeg-colorspace-name is using the image data, because using it fixes the known bugs we have, and it likely wouldn't be the case if it was just reading metadata.

@paperboyo what are your worries specifically? Could we run some tests to verify them?

Are we stripping both icm AND icc?

No, I ran into an issue when stripping just ICC and I thought ICM was the canonical name, but if they're genuinely different we could try and strip both?

@paperboyo
Copy link
Contributor

@kenoir jpeg-colorspace-name is not present in most recent IM. It is present in most recent GM, though. In most recent IM this property may have been extracted earlier in the output as Colorspace: (although, watch out - it reads sRGB which just means RGB, really).

In both GM and IM, these properties are still present after -strip, though. I wonder what would make them disappear, but still make image readable...

@kenoir
Copy link
Contributor

kenoir commented Sep 21, 2015

If this PR fixes the problem in TEST then it's all fine but it's worth trying. I seem to remember being frustrated with the difference between my sandbox and a "real" instance.

@paperboyo
Copy link
Contributor

This is a comparison of latest IM & GM properties we are interested in, that survive -strip commands, run, in order, on an RGB, a CMYK and a grayscale JPG file (after they’ve been -stripped):

IM GM
(very beggining) Colorspace: sRGB, CMYK, Gray JPEG-Colorspace-Name: RGB, CMYK, GRAYSCALE
(under properties) jpeg:colospace: 2, 4, 1 JPEG-Colorspace: 2, 4, 1
(very beggining) Type: TrueColor, ColorSeparation, Palette (very beggining) Type: true color, color separated, grayscale

@paperboyo
Copy link
Contributor

@theefer

No, I ran into an issue when stripping just ICC and I thought ICM was the canonical name, but if they're genuinely different we could try and strip both?

As fare as I know, when they differ, it pertains only to filename extensions. Part of the image spec is ICC, so I’m not sure why IM or GM even lets you differentiate between them (running any should strip ALL colour profile data).

Okay, I’ve just checked: added an ICM profile to one image and ICC profile to another (the same profile, really, just different filenames :). Then run +profile "icc,icm" on both of these in both IM and GM. Both apps correctly removed profiles from both images. This command seems safe to use.

Annoyingly, if you do just +profile "ICC" but run it on an image where you’ve added a profile with an extension ICM, then the profile does not get removed. I blame GM & IM, they make no sense here.

That said, on an unrelated note, I can’t make latest GM to accept +profile "!ICC,*" command to strip everything but the colour profile. It works correctly in IM. Most probably, just a bug in GM. We should be careful, would we like to use it.

@jamesgorrie
Copy link
Contributor

... GM to accept +profile "!ICC,*"

I have played around quite a lot with that command and it differs quite a lot dependant on OS too.

@paperboyo
Copy link
Contributor

it differs quite a lot dependant on OS too

Oh, no! That means that ICC profiles are treated by both apps as files instead of image characteristics. It’s convenient when adding and extracting, but definitely not in this instance... At least IM is listing icc: properties (like copyright, description, manufacturer, model - very useful to identify specific profiles) irrespective of if the image had ICC or ICM profile added. GM lists only Profile-color: [size in bytes], though 😢 So GM is useless if you would like to know what actual profile an image has without actually extracting it...

@theefer re:identifying specific profiles

@theefer
Copy link
Contributor Author

theefer commented Sep 24, 2015

Now stripping both icm and icc cc @paperboyo

Added some tests for the colour model detection using a few sample images with and without embedded profiles.

We should consider adding more tests around the cropImage function in the future, to check output profile and colour model.

// If matching, all is well, just pass through
case (icc, model) if icc == model => base
// If no colour model detected, we can't do anything anyway so just hope all is well
case (_, None) => base
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth logging out these cases?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth logging out these cases?

I do think so. It's generally interesting to me what is the composition of the whole db in terms of colour models, colour profiles. Not that (all) the metadata isn't interesting ;)

@akash1810
Copy link
Member

👍

theefer added a commit that referenced this pull request Sep 25, 2015
@theefer theefer merged commit 97c1b7f into master Sep 25, 2015
@theefer theefer deleted the sc-correct-colour-space branch September 25, 2015 10:40
@paperboyo paperboyo mentioned this pull request Oct 14, 2015
@paperboyo paperboyo mentioned this pull request Apr 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants