Enable all charsets in native image for quarkus-tika
sberyozkin committed Oct 18, 2021
1 parent 93f3547 commit 5d43d77
Showing 3 changed files with 6 additions and 17 deletions.
13 changes: 0 additions & 13 deletions docs/src/main/asciidoc/tika.adoc
@@ -11,19 +11,6 @@ This guide explains how your Quarkus application can use https://tika.apache.org

https://tika.apache.org/[Apache Tika] is a content analysis toolkit which is used to parse the documents in PDF, Open Document, Excel and many other well known binary and text formats using a simple uniform API. Both the document text and properties (metadata) are available once the document has been parsed.

[NOTE]
====
If you are planning to run the application as a native executable and parse documents that may have been created with charsets different than the standard ones supported in Java such as `UTF-8` then you should configure Quarkus to get the native image generator include all the charsets available to the JVM:
[source,xml]
----
<properties>
<quarkus.package.type>native</quarkus.package.type>
<quarkus.native.add-all-charsets>true</quarkus.native.add-all-charsets>
<properties>
----
====


== Prerequisites

To complete this guide, you need:
@@ -26,6 +26,7 @@
import io.quarkus.deployment.annotations.ExecutionTime;
import io.quarkus.deployment.annotations.Record;
import io.quarkus.deployment.builditem.FeatureBuildItem;
import io.quarkus.deployment.builditem.NativeImageEnableAllCharsetsBuildItem;
import io.quarkus.deployment.builditem.nativeimage.NativeImageResourceBuildItem;
import io.quarkus.deployment.builditem.nativeimage.NativeImageResourceDirectoryBuildItem;
import io.quarkus.deployment.builditem.nativeimage.RuntimeInitializedClassBuildItem;
@@ -87,6 +88,11 @@ public void registerPdfBoxResources(BuildProducer<NativeImageResourceDirectoryBu
resource.produce(new NativeImageResourceDirectoryBuildItem("org/apache/fontbox/unicode"));
}

@BuildStep
public NativeImageEnableAllCharsetsBuildItem registerAllCharsets() {
return new NativeImageEnableAllCharsetsBuildItem();
}

@BuildStep
@Record(ExecutionTime.STATIC_INIT)
void initializeTikaParser(BeanContainerBuildItem beanContainer, TikaRecorder recorder,
4 changes: 0 additions & 4 deletions integration-tests/tika/pom.xml
@@ -99,10 +99,6 @@
<name>native</name>
</property>
</activation>
<!-- add some custom config, the rest comes from parent -->
<properties>
<quarkus.native.add-all-charsets>true</quarkus.native.add-all-charsets>
</properties>
</profile>
</profiles>
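
With this change the extension itself requests all charsets for the native image, so the `quarkus.native.add-all-charsets` property no longer has to be set by hand. One way to sanity-check the result might be a small probe like the sketch below, which reports which charsets the running runtime cannot resolve. The class name `CharsetProbe` and the sample charset names are illustrative assumptions and are not part of this commit; only `java.nio.charset.Charset` from the JDK is used.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of quarkus-tika: lists charsets the runtime cannot resolve.
public class CharsetProbe {

    // Returns the names from `wanted` that are not supported by the current runtime.
    static List<String> missing(List<String> wanted) {
        List<String> result = new ArrayList<>();
        for (String name : wanted) {
            if (!Charset.isSupported(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample names chosen for illustration; replace with the charsets your documents use.
        List<String> wanted = List.of("UTF-8", "ISO-8859-1", "windows-1251", "Shift_JIS");
        System.out.println("Available charsets: " + Charset.availableCharsets().size());
        System.out.println("Missing: " + missing(wanted));
    }
}
```

Running the same probe on the JVM and inside a native executable should make any difference in charset availability visible; the exact lists depend on the JDK and native image build in use.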
