Commit

doc
rsoika committed Jul 30, 2017
1 parent 3774ffa commit 84f55f1
Showing 1 changed file with 21 additions and 0 deletions.
21 changes: 21 additions & 0 deletions imixs-archive-hadoop/README.md
@@ -4,6 +4,27 @@ The Imixs-Archive-Hadoop project provides an API to store workitems into a Hadoop

Imixs-Archive-Hadoop communicates with a Hadoop cluster via the [WebHDFS REST API](https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html).
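
To illustrate what a WebHDFS call looks like, the following minimal sketch builds the URL of a `CREATE` request. The `/webhdfs/v1` prefix and the `op`, `overwrite` and `user.name` parameters come from the WebHDFS documentation; the host name, the port (50070 is the usual NameNode HTTP port in Hadoop 2.x), the file path, the user and the class name `WebHdfsUrlSketch` are assumptions made for this example and are not taken from the project.

```java
import java.net.URI;
import java.net.URISyntaxException;

class WebHdfsUrlSketch {

    // Builds the URL of a WebHDFS CREATE request for the given HDFS path.
    // All argument values used below are illustrative assumptions.
    static URI createUrl(String host, int port, String hdfsPath, String user) {
        String query = "op=CREATE&overwrite=true&user.name=" + user;
        try {
            return new URI("http", null, host, port, "/webhdfs/v1" + hdfsPath, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("invalid WebHDFS URL parts", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical cluster address and archive path
        System.out.println(createUrl("namenode.example.org", 50070,
                "/imixs/archive/workitem-1.xml", "imixs"));
    }
}
```

According to the WebHDFS documentation, a `PUT` to such a URL is answered with a redirect to a DataNode, and the file content is then sent to the redirect location in a second `PUT`.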

## Synchronous Push Mode

This implementation follows the architecture of a synchronous push mode. With this strategy, the archive process is directly coupled to the workflow process, so it can be controlled by the workflow model. It is realized by an Imixs plug-in which is directly controlled by the engine. The plug-in accesses the Hadoop cluster via the Hadoop REST API. In this scenario the plug-in can store archive data, such as the checksum, immediately into the workitem. This is a tightly coupled archive strategy.
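
The idea of writing the checksum back into the workitem can be sketched as follows. This is only an illustration under stated assumptions: a plain `Map` stands in for the workitem, the item name `archive.checksum` and the class name `ChecksumSketch` are hypothetical, SHA-256 is chosen arbitrarily as the checksum algorithm, and the actual upload to Hadoop is left out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

class ChecksumSketch {

    // Returns the SHA-256 digest of the data as a lowercase hex string.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Stores the checksum of the archived content in the workitem.
    // The item name "archive.checksum" is a hypothetical choice.
    static void archive(Map<String, Object> workitem, byte[] content) {
        // The upload to the Hadoop cluster would happen here (omitted).
        workitem.put("archive.checksum", sha256Hex(content));
    }

    public static void main(String[] args) {
        Map<String, Object> workitem = new HashMap<>();
        archive(workitem, "example content".getBytes());
        System.out.println(workitem.get("archive.checksum"));
    }
}
```

Because the plug-in runs inside the workflow process, the checksum is available in the workitem as soon as the processing step has finished.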

### Pros

* The archive process can be directly controlled by the workflow engine (via a plug-in)
* The data in Hadoop and Imixs-Workflow is in sync at all times
* A workitem can store archive information synchronously (e.g. the checksum)

### Cons

* The process is time-consuming and slows down the overall performance of the workflow engine
* The process is memory-consuming
* The process has to be embedded into the running transaction, which increases the complexity
* Hadoop must be accessible via the internet, and additional security must be implemented on both sides.


# Implementation

The service is implemented as a stateful session EJB with a plug-in. The stateful session EJB synchronizes the transaction and decides in the afterCompletion(boolean) method whether to commit or roll back the changes in Hadoop. This approach is somewhat complex and consumes time and memory, but it has the advantage that the workitem is always in sync with the data in the Hadoop cluster.
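
The commit-or-rollback decision described above can be sketched in plain Java. This is not the project's code: the class `ArchiveSyncSketch` is hypothetical, an in-memory `Set` stands in for the Hadoop cluster, and the EJB container is replaced by direct method calls. The method name mirrors `afterCompletion(boolean committed)` from `javax.ejb.SessionSynchronization`, which a container calls on a stateful session bean once the transaction has ended. Undoing the writes on rollback is only one possible strategy.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ArchiveSyncSketch {

    // Hypothetical stand-in for the Hadoop cluster: the set of stored paths.
    final Set<String> cluster = new HashSet<>();

    // Paths written during the current transaction.
    private final List<String> pending = new ArrayList<>();

    // Writes a file and remembers it until the transaction outcome is known.
    void write(String path) {
        cluster.add(path);
        pending.add(path);
    }

    // Called when the transaction has ended.
    // committed == true keeps the written files, false removes them again.
    void afterCompletion(boolean committed) {
        if (!committed) {
            cluster.removeAll(pending);
        }
        pending.clear();
    }

    public static void main(String[] args) {
        ArchiveSyncSketch sync = new ArchiveSyncSketch();
        sync.write("/imixs/archive/workitem-1.xml");
        sync.afterCompletion(false); // simulate a rolled-back transaction
        System.out.println(sync.cluster.isEmpty());
    }
}
```

Keeping the list of pending paths in the bean is what requires a stateful bean in this sketch: the state has to survive from the write until the transaction outcome is reported.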

## CDI Support
