Commit

documentation
issue #9
rsoika committed Nov 1, 2017
1 parent bda19bb commit 7e85b9e
Showing 9 changed files with 41 additions and 1,152 deletions.
53 changes: 19 additions & 34 deletions imixs-archive-api/README.md
@@ -32,45 +32,37 @@ A _snapshot workitem_ is an immutable copy of a workitem (origin-workitem) inclu

The snapshot process includes the following stages:

1. A process instance is processed by the Imixs-Workflow engine based on a BPMN 2.0 model.
2. After processing, the process instance is persisted into the local workflow storage by the DocumentService.
3. The DocumentService sends a notification of the new or updated process instance to the SnapshotService.
4. The SnapshotService creates an immutable copy of the process instance - called snapshot-workitem.
5. The snapshot workitem is stored into the local workflow storage.
6. The origin process instance is returned to the application.
7. An external archive system receives the new snapshot-workitem and stores it into an archive storage.
1. A workitem is processed by the Imixs-Workflow engine based on a BPMN 2.0 model.
2. The WorkflowService sends a notification event.
3. The DMS Service collects the DMS meta data.
4. The DMS meta data is stored into the process instance.
5. After processing is completed, the process instance is persisted into the local workflow storage by the DocumentService.
6. The DocumentService sends a notification event.
7. The SnapshotService creates an immutable copy of the process instance - called snapshot-workitem.
8. The SnapshotService detaches the file content from the workitem.
9. The snapshot workitem is stored into the local workflow storage.
10. The origin process instance is returned to the application.
11. An external archive system polls new snapshot-workitems.
12. An external archive system stores the snapshot-workitems into an archive storage.



A snapshot-workitem holds a reference to the origin-workitem by its own $UniqueID which is
always the $UniqueID from the origin-workitem suffixed with a timestamp.
During the snapshot creation the snapshot $UniqueID is stored into the origin-workitem.
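
The ID scheme described above can be sketched in plain Java. This is an illustration under stated assumptions, not the SnapshotService implementation: the workitem is modelled as a simple map rather than an Imixs class, the `-` separator between the origin ID and the timestamp is assumed, and the item names `$uniqueid` and `$snapshotid` as well as the class and method names are chosen for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: a workitem is modelled as a plain map of item names to values. */
final class SnapshotSketch {

    // Assumption: the timestamp is appended with "-" as separator.
    static String snapshotId(String originUniqueId, long timestampMillis) {
        return originUniqueId + "-" + timestampMillis;
    }

    // Recover the origin id by cutting off the trailing timestamp segment.
    static String originId(String snapshotUniqueId) {
        return snapshotUniqueId.substring(0, snapshotUniqueId.lastIndexOf('-'));
    }

    // Creates a read-only (shallow) copy of the origin and stores the
    // snapshot id in the origin as a back reference.
    static Map<String, Object> createSnapshot(Map<String, Object> origin, long timestampMillis) {
        Map<String, Object> snapshot = new HashMap<>(origin);
        String id = snapshotId((String) origin.get("$uniqueid"), timestampMillis);
        snapshot.put("$uniqueid", id);
        origin.put("$snapshotid", id); // hypothetical item name for the reference
        return Collections.unmodifiableMap(snapshot);
    }

    public static void main(String[] args) {
        Map<String, Object> origin = new HashMap<>();
        origin.put("$uniqueid", "abc-123");
        Map<String, Object> snapshot = createSnapshot(origin, System.currentTimeMillis());
        System.out.println(snapshot.get("$uniqueid") + " -> " + originId((String) snapshot.get("$uniqueid")));
    }
}
```

Because the timestamp is the last segment, the origin ID can be recovered from a snapshot ID even when the origin ID itself contains hyphens.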

### The SnapshotPlugin
### The DMS Service

The _DMSService_ collects meta data from attached documents during the processing phase. This meta data also contains extracted text, which is added to the Lucene full-text index. The DMS meta data is stored into the item '_dms_'.

The snapshot process includes the following stages:
The _DMSService_ parses the content of attachments of the types .pdf, .doc, .xls and .ppt. The service uses the libraries of [Apache POI](http://poi.apache.org/) and [Apache PDFBox](https://pdfbox.apache.org/) to extract the content of those documents.

1. create a copy of the origin workitem instance
2. compute a snapshot $uniqueId based on the origin workitem suffixed with a timestamp
3. change the type of the snapshot-workitem with the prefix 'archive-'
4. if an old snapshot already exists, compare its files to the current $files and, if necessary, store them in the new snapshot
5. remove the file content from the origin-workitem
6. store the snapshot uniqueId into the origin-workitem as a reference ($snapshotID)
7. remove deprecated snapshots

A snapshot-workitem holds a reference to the origin-workitem by its own $UniqueID which is
always the $UniqueID from the origin-workitem suffixed with a timestamp.
During the snapshot creation the snapshot $UniqueID is stored into the origin-workitem.


Why did we use a Plugin to implement the Snapshot-Architecture? You could say that it is easier to have the snapshot-workitem generated directly by the engine. This would prevent anyone from forgetting to include the plugin in their model.
But it would also remove any control over when and whether data is archived. With the plugin, this control lies with the model. And this is important when you are considering legal provisions such as the EU data protection law.


### How the SnapshotPlugin Works
The SnapshotPlugin implements the ObserverPlugin interface and is tied to the transaction context of the Imixs-Workflow engine. The process of creating a new snapshot workitem is transparently aware of the current transaction and will automatically roll back any snapshot workitems in case of an EJB exception. The SnapshotPlugin can be included into any model that manages business-critical data.

### CDI Events

The communication between the service layers is implemented by the CDI Observer pattern. The CDI Events are tied to the transaction context of the imixs-workflow engine.
See the [DocumentService](http://www.imixs.org/doc/engine/documentservice.html#CDI_Events) and [WorkflowService](http://www.imixs.org/doc/engine/workflowservice.html#CDI_Events) for further information.
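
The notification flow can be illustrated with a hand-rolled observer in plain Java. This is a stand-in for the pattern only and does not use the CDI API or any Imixs class; `DocumentSaved`, `Notifier`, `register` and `fire` are hypothetical names. The sketch delivers events synchronously, so a failure in an observer reaches the code that fired the event, which is the property that lets a surrounding transaction be rolled back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch only: a minimal observer mechanism, not the CDI event API. */
final class ObserverSketch {

    /** Hypothetical event payload: carries only the id of the saved document. */
    static final class DocumentSaved {
        final String uniqueId;

        DocumentSaved(String uniqueId) {
            this.uniqueId = uniqueId;
        }
    }

    /** Hand-rolled stand-in for the container's event dispatch. */
    static final class Notifier {
        private final List<Consumer<DocumentSaved>> observers = new ArrayList<>();

        void register(Consumer<DocumentSaved> observer) {
            observers.add(observer);
        }

        // Synchronous delivery: an exception thrown by an observer
        // propagates to the caller of fire().
        void fire(DocumentSaved event) {
            for (Consumer<DocumentSaved> observer : observers) {
                observer.accept(event);
            }
        }
    }

    public static void main(String[] args) {
        Notifier notifier = new Notifier();
        List<String> snapshotQueue = new ArrayList<>();
        // The observer plays the role of the service reacting to a save event.
        notifier.register(event -> snapshotQueue.add(event.uniqueId));
        notifier.fire(new DocumentSaved("abc-123"));
        System.out.println(snapshotQueue);
    }
}
```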

### The Access Control (ACL)
Access to archive data written into the Imixs-Archive is controlled entirely by the [Imixs-Workflow engine ACL](http://www.imixs.org/doc/engine/acl.html). Imixs-Workflow supports a multi-level security model that offers great flexibility while controlling access to all parts of a workitem.
@@ -98,13 +90,6 @@ Thus, in this exmple a system processing 1 million process instances per year ca



# Document Fulltext Search

The EJB _LuceneDocumentService_ provides a method to index the content of attachments of the types .pdf, .doc, .xls and .ppt in a Lucene full-text search index. The service uses the libraries of [Apache POI](http://poi.apache.org/) and [Apache PDFBox](https://pdfbox.apache.org/) to extract the content of those documents.

The indexing process is controlled by a timer service class called 'LuceneDocumentScheduler'.



# Deployment


This file was deleted.

This file was deleted.

This file was deleted.
