DigitalPebble/tika-detector-stormcrawler

tika-detector-stormcrawler

Wraps the charset detection logic from StormCrawler as a Tika module

Two configuration parameters are available (defaults in parentheses):

  • fastMethod (false)
  • maxLength (0, meaning unlimited)

The detector must be registered in tika-config.xml:

<encodingDetectors> 
  <encodingDetector class="com.digitalpebble.tika.detect.SCCharsetDetector"/> 
</encodingDetectors>
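
For context, a fuller tika-config.xml might look like the sketch below. It assumes the detector accepts its two parameters through Tika's standard `<params>` mechanism, which this README does not confirm; the values shown (`true`, `65536`) are purely illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes SCCharsetDetector reads its settings via Tika's <params> element -->
<properties>
  <encodingDetectors>
    <encodingDetector class="com.digitalpebble.tika.detect.SCCharsetDetector">
      <params>
        <!-- Illustrative value; the default is false -->
        <param name="fastMethod" type="bool">true</param>
        <!-- Illustrative value; the default of 0 means unlimited -->
        <param name="maxLength" type="int">65536</param>
      </params>
    </encodingDetector>
  </encodingDetectors>
</properties>
```

The file can then be handed to Tika in the usual way, for example through the `TikaConfig` constructor or the `--config` option of tika-app.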
