[7.x] Upgrade Tika to 1.27 for ingest (#75191) #75904

Merged
merged 3 commits into elastic:7.x from backport_7x_75191_upgrade_tika
Aug 2, 2021

danhermann
Contributor

Includes upgrades for some other dependencies as well.

Backport of #75191

@danhermann added the ">upgrade", ":Data Management/Ingest Node", "backport", and "v7.15.0" labels on Jul 30, 2021
@elasticmachine added the "Team:Data Management" label on Jul 30, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

  'poi' : '4.1.2',
- 'mime4j': '0.8.3'
+ 'mime4j': '0.8.5'
Member

I think this should be version 0.8.4?

Version 0.8.5 triggers:

java.lang.NoSuchMethodError: java.nio.ByteBuffer.flip()Ljava/nio/ByteBuffer;
    at __randomizedtesting.SeedInfo.seed([92EBE05E8999BE7A:FBF685D8DB444DF8]:0)
    at org.apache.james.mime4j.io.TextInputStream.<init>(TextInputStream.java:49)
    at org.apache.james.mime4j.io.InputStreams.createAscii(InputStreams.java:70)
    at org.apache.james.mime4j.codec.DecoderUtil.decodeQuotedPrintable(DecoderUtil.java:51)
    at org.apache.james.mime4j.codec.DecoderUtil.decodeQ(DecoderUtil.java:126)
    at org.apache.james.mime4j.codec.DecoderUtil.tryDecodeEncodedWord(DecoderUtil.java:258)
    at org.apache.james.mime4j.codec.DecoderUtil.decodeEncodedWords(DecoderUtil.java:212)
    at org.apache.james.mime4j.codec.DecoderUtil.decodeEncodedWords(DecoderUtil.java:145)
    at org.apache.tika.parser.microsoft.OutlookExtractor.decodeHeader(OutlookExtractor.java:563)
    at org.apache.tika.parser.microsoft.OutlookExtractor.normalizeHeaders(OutlookExtractor.java:529)
    at org.apache.tika.parser.microsoft.OutlookExtractor.parse(OutlookExtractor.java:161)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:199)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.tika.Tika.parseToString(Tika.java:568)
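
For context, a `NoSuchMethodError` on `ByteBuffer.flip()Ljava/nio/ByteBuffer;` is the usual symptom of bytecode compiled against the JDK 9+ class library and then run on Java 8. Java 9 added a covariant `ByteBuffer.flip()` override that returns `ByteBuffer`; on Java 8 only `Buffer.flip()` returning `Buffer` exists. That mime4j 0.8.5 was built this way is an inference from the trace above, not something stated in this thread. A minimal sketch of the common source-level workaround follows; the class and method names (`FlipCompat`, `flipCompat`, `roundTrip`) are hypothetical and are not part of mime4j or Tika.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not code from mime4j, Tika, or Elasticsearch.
public class FlipCompat {

    // Casting to java.nio.Buffer makes the compiler emit a call to
    // Buffer.flip()Ljava/nio/Buffer;, which exists on Java 8 and later,
    // instead of the Java 9+ covariant ByteBuffer.flip()Ljava/nio/ByteBuffer;.
    static ByteBuffer flipCompat(ByteBuffer buf) {
        ((Buffer) buf).flip();
        return buf;
    }

    // Writes ASCII text into a buffer, flips it for reading, and reads it back.
    static String roundTrip(String text) {
        byte[] in = text.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.allocate(in.length);
        buf.put(in);
        flipCompat(buf);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Subject: test"));
    }
}
```

Building with `javac --release 8` instead of only `-source 8 -target 8` avoids the problem without the cast, because it compiles against the Java 8 API signatures. For a consumer that cannot rebuild the dependency, pinning a version that still links on Java 8, as suggested here, is the practical fix.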

Contributor Author

Thanks, @martijnvg, that fixed it.

@danhermann merged commit f890bc4 into elastic:7.x on Aug 2, 2021
@danhermann deleted the backport_7x_75191_upgrade_tika branch on August 2, 2021 16:24
Labels
backport, :Data Management/Ingest Node, Team:Data Management, >upgrade, v7.15.0
3 participants