Refactoring the persistence layer to be able to persist any Java Object #407

yojs · 2020-09-01T02:57:48Z

Fixes: #399

Description of changes:

The persistence layer durably stores the RCA results (and, in the future, decisions from deciders) and serves them on request. It also takes care of periodic file rotation and of cleaning up old DB files.

Although it wasn't intended, the persistence layer is tightly coupled with the RCA graph today. The write() method in the Persistable interface has this signature:
<T extends ResourceFlowUnit> void write(Node<?> node, T flowUnit) throws SQLException, IOException;

So the Persistable object knows about the graph node. As the remediation system evolves, we will have components that are not nodes in the RCA graph, and we may still want to persist their outputs. This interface can't do that.

The goal, therefore, is to make the methods of Persistable generic, so that it can persist any object given to it and read back any object the caller asks for.
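
To make the goal concrete, here is a minimal sketch of what a node-agnostic interface could look like. The names `GenericPersistable` and `InMemoryPersistor` and both method signatures are assumptions made for illustration; they are not the signatures introduced by this PR, and an in-memory map stands in for the SQLite file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: no Node<?> or ResourceFlowUnit in the signatures.
interface GenericPersistable {
    <T> void write(T object);

    <T> List<T> read(Class<T> clz);
}

// Toy in-memory stand-in for a database-backed persistor (assumption, for illustration only).
class InMemoryPersistor implements GenericPersistable {
    // One "table" per class, each holding the rows written for that class.
    private final Map<Class<?>, List<Object>> tables = new HashMap<>();

    @Override
    public <T> void write(T object) {
        tables.computeIfAbsent(object.getClass(), k -> new ArrayList<>()).add(object);
    }

    @Override
    public <T> List<T> read(Class<T> clz) {
        List<T> rows = new ArrayList<>();
        for (Object row : tables.getOrDefault(clz, new ArrayList<>())) {
            rows.add(clz.cast(row));
        }
        return rows;
    }
}

class GenericPersistableDemo {
    public static void main(String[] args) {
        InMemoryPersistor persistor = new InMemoryPersistor();
        // Neither value is a node in the RCA graph, yet both can be persisted.
        persistor.write("not a graph node");
        persistor.write(42);
        System.out.println(persistor.read(String.class));
        System.out.println(persistor.read(Integer.class));
    }
}
```

The point of the sketch is only that the caller identifies what to read by its class rather than by a graph node.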

Say we have to persist this class:

Outer.java
________________

class Outer {
  int x;
  int y;
  boolean z;
  String name;
  List<T> myList;
  B b_obj;
}
-------------------------------------------------------

B.java
_______

class B {
  int x;
  String y;
}

when the above code is annotated as:

Outer.java
________________

// This will create a table named "Outer"
class Outer {
  @ValueColumn
  int x;

  @ValueColumn
  int y;
  // It's not annotated as a column, so it will not be persisted.
  boolean z;

  @ValueColumn
  String name;

  @RefColumn
  B b_obj;

  @RefColumn
  List<T> myList;

  // Add a column named x to Outer table. The value stored in the row for that column will be the value of x.
  int getX() { .. }
  void setX(int x) { .. }

  int getY() { .. }
  String getName() { .. }
  void setName(String name) { .. }

  // Fields that are not [primitive types](https://docs.oracle.com/javase/6/docs/api/java/lang/Class.html#isPrimitive())
  // are persisted in their own tables as a new row. The auto-increment ID for this row is persisted
  // as the value of the column named "__table__B", so that we have a link to the nested element/table
  // from this table.
  B getBObj() {..}
  void setBObj(B b) {..}

  // If the column annotation exists for a collection type, its elements are written to a table of their own.
  // The name of the table is obtained by calling T.class.getSimpleName(). If all elements evaluate to the same
  // table name, they are written to the same table as different rows, and the corresponding column in the
  // Outer table holds the string [<rowId1>, <rowId2>].
  // If `T.class.getSimpleName()` evaluates to different names, they are linked back in the Outer table as
  // different columns.
  List<T> getMyList() { .. }
  void setMyList(List<T> list) { .. }
}
-------------------------------------------------------

B.java
_______

class B {
  @ValueColumn
  int x;
  String y;

  int getX() { .. }
  void setX(int x) { .. }
}

this creates the following set of tables, assuming that getMyList() returns a list of three elements where:

for (T element : getMyList()) {
  // element.getClass().getSimpleName() returns `T` for two of the objects and `TT` for the third.
}

Table Outer

ID	X	Y	Name	__table__BObj	__table__T	__table__TT
53	1	2	name	34	[{"tableName":"ITestImpl2","rowIds":[1]},{"tableName":"ITestImpl1","rowIds":[1,2]}]	32

Table B

ID	X
34	23

Table T

ID	col2	..
23	24	..
24	24	..

Table TT

ID	col2	..
32	24	..
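
How list elements end up grouped by table can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ListLinkSketch` and `linkRows` are hypothetical names, the element classes borrow the `ITestImpl1`/`ITestImpl2` names from the Outer row above as placeholders, and row IDs are simulated as per-table counters starting at 1 instead of coming from SQLite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ListLinkSketch {
    // Placeholder element types standing in for the runtime types found in the list.
    interface ITest {}

    static class ITestImpl1 implements ITest {}

    static class ITestImpl2 implements ITest {}

    // Groups elements by their runtime simple class name (the table name) and
    // assigns each one a simulated auto-increment row ID within its table.
    static Map<String, List<Integer>> linkRows(List<?> elements) {
        Map<String, List<Integer>> rowIdsByTable = new LinkedHashMap<>();
        for (Object element : elements) {
            String tableName = element.getClass().getSimpleName();
            List<Integer> rowIds =
                rowIdsByTable.computeIfAbsent(tableName, k -> new ArrayList<>());
            rowIds.add(rowIds.size() + 1);
        }
        return rowIdsByTable;
    }

    public static void main(String[] args) {
        List<ITest> myList =
            Arrays.asList(new ITestImpl1(), new ITestImpl2(), new ITestImpl1());
        // Each map entry corresponds to one tableName/rowIds pair in the parent row.
        System.out.println(linkRows(myList));
    }
}
```

Keying on the runtime class rather than the declared element type is what lets one list fan out into several tables.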

What this buys us

The Java code resembles how the tables and nested tables will be laid out.
Because the nested table can be found just by reading the column name, someone can write a tool to read the RCAs from the SQLite files. They won't need the Java package to figure out the nesting.
If new columns or nestings are added, they can also be added in the DB without a schema mismatch. This is because we create a new SQLite file on each restart of the RCA agent process (on top of periodic file rotations).
The table name and column names are derived from the class name and getter name (method name with get stripped out). This provides a 1-1 mapping from the persistor class and fields to the table name and column names.
On top of that, we are able to persist any Java object.
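
The name mapping described above can be sketched with plain reflection. Everything here is an assumption for illustration: the two annotations are redeclared locally as stand-ins (the real `@ValueColumn` and `@RefColumn` definitions may differ), `SchemaSketch` and its helpers are hypothetical, the getter name is inferred from the field name, and the nested field is called `bObj` so that its getter is `getBObj`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SchemaSketch {
    // Local stand-ins for the annotations used in the description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ValueColumn {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface RefColumn {}

    static class B {
        @ValueColumn int x;
        String y; // not annotated, so not persisted
    }

    static class Outer {
        @ValueColumn int x;
        @ValueColumn int y;
        boolean z; // not annotated, so not persisted
        @ValueColumn String name;
        @RefColumn B bObj;
    }

    // Table name = simple class name.
    static String tableName(Class<?> clz) {
        return clz.getSimpleName();
    }

    // Column name = getter name with the leading "get" stripped, e.g. getName -> Name.
    static String columnName(String getterName) {
        return getterName.startsWith("get") ? getterName.substring(3) : getterName;
    }

    // Assumes the conventional getter name for a field, e.g. bObj -> getBObj.
    static String getterName(Field field) {
        String n = field.getName();
        return "get" + Character.toUpperCase(n.charAt(0)) + n.substring(1);
    }

    // Collects column names for annotated fields; reference columns get the
    // "__table__" prefix. Sorted because reflection does not guarantee field order.
    static List<String> columns(Class<?> clz) {
        List<String> cols = new ArrayList<>();
        for (Field field : clz.getDeclaredFields()) {
            String column = columnName(getterName(field));
            if (field.isAnnotationPresent(ValueColumn.class)) {
                cols.add(column);
            } else if (field.isAnnotationPresent(RefColumn.class)) {
                cols.add("__table__" + column);
            }
        }
        Collections.sort(cols);
        return cols;
    }

    public static void main(String[] args) {
        System.out.println(tableName(Outer.class) + " " + columns(Outer.class));
        System.out.println(tableName(B.class) + " " + columns(B.class));
    }
}
```

A reader tool could apply the same rule in reverse: any column whose name starts with `__table__` points at another table.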

Tests:
Added unit tests

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov · 2020-09-01T04:26:43Z

Codecov Report

Merging #407 into master will increase coverage by 0.70%.
The diff coverage is 82.97%.

@@             Coverage Diff              @@
##             master     #407      +/-   ##
============================================
+ Coverage     70.09%   70.80%   +0.70%     
- Complexity     2287     2354      +67     
============================================
  Files           303      307       +4     
  Lines         13605    13962     +357     
  Branches       1133     1178      +45     
============================================
+ Hits           9537     9886     +349     
- Misses         3695     3702       +7     
- Partials        373      374       +1

Impacted Files	Coverage Δ	Complexity Δ
...ormanceanalyzer/rca/persistence/PersistorBase.java	`78.50% <46.66%> (-5.20%)`	`23.00 <1.00> (+1.00)`	⬇️
...manceanalyzer/rca/persistence/SQLitePersistor.java	`77.21% <85.05%> (+8.70%)`	`72.00 <44.00> (+44.00)`
...ormanceanalyzer/rca/store/rca/cache/CacheUtil.java	`58.62% <0.00%> (ø)`	`9.00% <0.00%> (ø%)`
...performanceanalyzer/rca/framework/core/Config.java	`100.00% <0.00%> (ø)`	`7.00% <0.00%> (?%)`
...cisionmaker/actions/configs/CacheActionConfig.java	`100.00% <0.00%> (ø)`	`5.00% <0.00%> (?%)`
...cisionmaker/actions/configs/QueueActionConfig.java	`97.22% <0.00%> (ø)`	`5.00% <0.00%> (?%)`
...manceanalyzer/rca/framework/core/NestedConfig.java	`100.00% <0.00%> (ø)`	`5.00% <0.00%> (?%)`
...ceanalyzer/rca/framework/core/ConfJsonWrapper.java	`94.59% <0.00%> (+0.30%)`	`20.00% <0.00%> (+4.00%)`
...performanceanalyzer/rca/configs/DeciderConfig.java	`47.05% <0.00%> (+0.39%)`	`4.00% <0.00%> (-4.00%)`	⬆️
...erformanceanalyzer/rca/framework/core/RcaConf.java	`52.21% <0.00%> (+1.30%)`	`33.00% <0.00%> (+2.00%)`
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8a806d2...e43aa16.


...com/amazon/opendistro/elasticsearch/performanceanalyzer/rca/persistence/SQLitePersistor.java

...azon/opendistro/elasticsearch/performanceanalyzer/rca/persistence/SqliteObjectPersistor.java


yojs · 2020-09-02T21:45:25Z

The failed test, testFieldDataCacheRca, is a known issue.

yojs force-pushed the refactor-persistence-layer branch from 3221cc0 to 36755dd Compare September 1, 2020 04:05

yojs force-pushed the refactor-persistence-layer branch from 36755dd to edc7205 Compare September 1, 2020 18:06

yojs requested review from rguo-aws and sruti1312 September 1, 2020 18:10

yojs self-assigned this Sep 1, 2020

yojs added the code-refactoring label (The change reduces the cognitive load of the reader of the code and makes adding new changes easier) Sep 1, 2020

yojs changed the title ~~First take - persist any Object into database not just FlowUnits~~ Refactoring the persistence layer to be able to persist any Java Object Sep 1, 2020

yojs added 2 commits September 1, 2020 12:10

Better comments and error message handling

9fbc936

Adding some more tests

7b79151

sruti1312 previously approved these changes Sep 1, 2020

Addressing PR comments

2e01e5d

yojs dismissed sruti1312’s stale review via 2e01e5d September 2, 2020 00:36

Added test to cover list of ints

e43aa16

yojs requested a review from sruti1312 September 2, 2020 01:23

sruti1312 previously approved these changes Sep 2, 2020

Added more comments for readability and addressing PR review comments

5607a22

yojs dismissed sruti1312’s stale review via 5607a22 September 2, 2020 16:56

rguo-aws reviewed Sep 2, 2020

...com/amazon/opendistro/elasticsearch/performanceanalyzer/rca/persistence/SQLitePersistor.java (2 resolved review threads)

rguo-aws approved these changes Sep 2, 2020

sruti1312 approved these changes Sep 2, 2020

yojs merged commit 7e76ed2 into master Sep 2, 2020

yojs deleted the refactor-persistence-layer branch September 2, 2020 21:45
