Avoid ICC_ColorSpace instances in Apache Tika to support GraalVM 20.3 #13718

Closed
wants to merge 1 commit into from

Conversation

zakkak
Contributor

@zakkak zakkak commented Dec 5, 2020

Fixes the Apache Tika integration test.
Note, however, that users still need to avoid bringing instances of ICC_ColorSpace into the native image. This is a limitation of GraalVM 20.3 that I am afraid we can't work around in Quarkus. The good news is that it won't be present in 21.0, since it's already fixed in graal master.
Another thing to keep in mind is that at least two more classes in pdfbox hold instances of ICC_ColorSpace:

  • PDJPXColorSpace
  • PDDeviceCMYK

Although not reachable from the existing test, these classes might also cause issues for users if they become reachable through Apache Tika.
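
For readers unfamiliar with the problem, here is a minimal sketch of the general idea: keep ICC_ColorSpace instances out of static state that is computed when a class is initialized, and create them on first use instead. This is not the substitution code from this PR. The class `LazyColorSpaceDemo` and its `srgb()` method are hypothetical names used only for illustration, and the sketch assumes the nested holder class is initialized at run time rather than at image build time.

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;

// Hypothetical illustration class; not part of Quarkus, Tika, or PDFBox.
final class LazyColorSpaceDemo {

    // Problematic shape (left commented out): the field is computed as soon as
    // the enclosing class is initialized. If that happens at image build time,
    // the ICC_ColorSpace instance would be captured in the native image heap.
    //
    // private static final ColorSpace EAGER = ColorSpace.getInstance(ColorSpace.CS_sRGB);

    // Initialization-on-demand holder: the JVM initializes Holder only when
    // srgb() is first called. ASSUMPTION: Holder itself is not initialized at
    // image build time; otherwise the instance would still end up in the image.
    private static final class Holder {
        static final ColorSpace SRGB = ColorSpace.getInstance(ColorSpace.CS_sRGB);
    }

    // Returns the shared sRGB color space, creating it on first use.
    static ColorSpace srgb() {
        return Holder.SRGB;
    }

    public static void main(String[] args) {
        ColorSpace cs = srgb();
        // On OpenJDK the built-in sRGB color space is backed by ICC_ColorSpace.
        System.out.println("ICC-backed: " + (cs instanceof ICC_ColorSpace));
        System.out.println("components: " + cs.getNumComponents());
    }
}
```

Callers would use `LazyColorSpaceDemo.srgb()` wherever they previously read the static field. Whether this is enough in a given application depends on how class initialization is configured for the native image build.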

@ghost ghost added the area/tika label Dec 5, 2020
@zakkak zakkak mentioned this pull request Dec 5, 2020
Member

@gsmet gsmet left a comment

Thanks, great work! I requested some small adjustments.

@@ -0,0 +1,36 @@
package io.quarkus.tika.graalvm;

That looks great, thanks!

Minor nitpick: this class should be in the io.quarkus.tika.runtime.graal package to be consistent with other extensions.

}

// Substitutions to prevent ICC_ColorSpace instances from appearing in the native image when using Apache Tika
// See https://github.com/quarkusio/quarkus/pull/13644

Maybe we should add a comment noting that this class should be removed once we switch from GraalVM 20.3 to GraalVM 21?

(Having 20.3 in the comment is important: I will look for every occurrence of 20.3 when upgrading, so I will catch it that way.)

@gsmet
Member

gsmet commented Dec 7, 2020

I made the adjustments and pushed the commits to #13644.

Let's see how CI goes; we can make further adjustments in my GraalVM 20.3 branch.

@gsmet gsmet closed this Dec 7, 2020
@ghost ghost added the triage/invalid This doesn't seem right label Dec 7, 2020
@zakkak zakkak deleted the quarkus-tika-mandrel-20.3-v2 branch December 8, 2020 13:18