This repository has been archived by the owner on Jul 10, 2019. It is now read-only.

Behemothcompilation

jnioche edited this page May 30, 2012 · 5 revisions

How to compile

Prerequisites

  • Java 1.6.0
  • Apache Maven 2.2.1
  • Internet connection (required for fetching the dependencies with Maven)
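Before building, it can help to confirm that both tools are on the PATH. The helper below is a minimal sketch: `check_tools` is a hypothetical name introduced here and is not part of Behemoth. It only checks that the commands exist, so the versions still need to be compared by hand.

```shell
# Report whether each required tool is on the PATH.
# check_tools is a hypothetical helper, not something shipped with Behemoth.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found
    return "$missing"
}

# Usage: check_tools java mvn && java -version && mvn -version
```

The `java -version` and `mvn -version` calls in the usage line print the installed versions, which can then be compared with the ones listed above.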

Compiling

Running 'mvn install' from the root directory of Behemoth fetches the dependencies, compiles each module, runs the tests and generates a jar file in the target directory of each module. Modules can depend on one another (at least on gate-core) as well as on external libraries.
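Because modules can depend on each other, building a single module usually means building its upstream modules as well. The sketch below wraps Maven's `-pl` (select a module) and `-am` (also make the modules it depends on) options. `build_module` is a hypothetical helper, and the module name in the usage line is an assumption based on the modules mentioned on this page.

```shell
# Build one module plus the modules it depends on.
# build_module is a hypothetical helper; run it from the root directory of Behemoth.
build_module() {
    # -pl selects the module, -am also builds the modules it depends on
    mvn -pl "$1" -am install
}

# Usage (module name is an assumption): build_module tika
```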

Testing

Running 'mvn test' from the root directory of Behemoth runs the JUnit tests for each module. The test output is written to the target/surefire-reports directory of each module.
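With several modules, the reports end up spread over several directories. The following sketch collects the summary line from every Surefire text report under a given directory. `show_test_summaries` is a hypothetical helper, and it assumes the reports contain the usual `Tests run:` line that Surefire writes to its .txt files.

```shell
# Print the summary line of every Surefire text report under a directory.
# show_test_summaries is a hypothetical helper; it defaults to the current directory.
show_test_summaries() {
    find "${1:-.}" -path '*/target/surefire-reports/*.txt' -type f \
        -exec grep -H 'Tests run:' {} +
}

# Usage, from the root directory of Behemoth after 'mvn test':
#   show_test_summaries
```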

Generating a job file

A job file is necessary in order to run Behemoth on a Hadoop cluster. Job files are generated per module: a user can generate several job files and use them separately (e.g. one for Tika and one for GATE), or build a new module with some custom code, declare a dependency on both the tika and gate modules, and generate a job file for that new module.

Running 'mvn package' from the root directory of Behemoth will generate a *-job.jar file in the target directory for each module. These job files can then be used with Hadoop as shown in the tutorial.
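To see which job files a build produced, the target directories can be searched for the *-job.jar pattern. This is a small sketch: `list_job_files` is a hypothetical helper, and it relies only on the file naming described above.

```shell
# List every *-job.jar found in a target directory below the given root.
# list_job_files is a hypothetical helper; it defaults to the current directory.
list_job_files() {
    find "${1:-.}" -path '*/target/*-job.jar' -type f | sort
}

# Usage, from the root directory of Behemoth after 'mvn package':
#   list_job_files
```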
