This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
Integration test framework to test RCAs and decision Makers #301
Merged
759cd64
Integration test framework to test RCAs and decision Makers
yojs 209e36f
Fixed a bug where the database was being corrupted because of multipl…
yojs ccf1ebd
Fix the POC graph
yojs 0d43965
Removing unused methods
yojs 9caaaac
Name the validation checks that failed
yojs 086cdb4
Addressing PR comments
yojs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
# Rca Integration test framework | ||
|
||
## Scope | ||
The framework makes it possible to test scenarios where multiple RCA Schedulers are running on | ||
different hosts of a cluster. You can inject a custom RCA graph and specify the metrics that | ||
should flow through the rca-dataflow-graph. You can then test the results of the RCAs, and the | ||
decisions based on them, either by querying the rca.sqlite files on a particular host or by | ||
hitting the REST endpoint on a particular host. | ||
|
||
## Out of Scope | ||
This framework will not facilitate testing the PerformanceAnalyzer Reader component, the Writer | ||
component, or how PerformanceAnalyzer interacts with Elasticsearch. | ||
|
||
## How to write your own tests using this framework? | ||
The RCA-IT is composed of various annotations that you can use to configure the | ||
test environment you want your tests to run on. | ||
|
||
`__@RunWith(RcaItNotEncryptedRunner.class)__` | ||
|
||
The above specifies the runner for the JUnit test class. In this case, it tells JUnit to | ||
delegate execution to one of the RCA-IT runners, _RcaItNotEncryptedRunner_. All RCA-IT tests must | ||
use this annotation to be run by this integration test framework. | ||
|
||
`__@AClusterType(ClusterType.MULTI_NODE_CO_LOCATED_MASTER)__` | ||
|
||
This annotation tells the RCA-IT to use a multi-node cluster with no dedicated master nodes. | ||
The kinds of clusters supported today are: `SINGLE_NODE`, `MULTI_NODE_CO_LOCATED_MASTER` | ||
and `MULTI_NODE_DEDICATED_MASTER`. This is a required annotation and must be specified at | ||
the class level. | ||
|
||
`__@ARcaGraph(MyRcaGraph.class)__` | ||
|
||
This specifies the RCA graph that will be used for this test. It can be an existing graph | ||
or one contrived specifically for this test. | ||
|
||
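As a rough illustration of how the class-level annotations above might be declared and read, here is a minimal, self-contained sketch. The annotation and enum declarations are hypothetical stand-ins that only reuse the names from this README; the real types ship with the RCA-IT framework and may differ. `@RunWith` is left out so the sketch compiles without JUnit, and `EnvironmentReader` is an invented helper, not a framework class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-ins mirroring the names used in this README.
enum ClusterType { SINGLE_NODE, MULTI_NODE_CO_LOCATED_MASTER, MULTI_NODE_DEDICATED_MASTER }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AClusterType { ClusterType value(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ARcaGraph { Class<?> value(); }

// Placeholder for a graph contrived for this test.
class MyRcaGraph { }

// A test class declares the environment it wants through class-level annotations.
@AClusterType(ClusterType.MULTI_NODE_CO_LOCATED_MASTER)
@ARcaGraph(MyRcaGraph.class)
class MyRcaIT { }

final class EnvironmentReader {
    /** Returns a one-line description of the environment a test class asks for. */
    static String describe(Class<?> testClass) {
        AClusterType cluster = testClass.getAnnotation(AClusterType.class);
        if (cluster == null) {
            // The README states that the cluster type annotation is required.
            throw new IllegalArgumentException(
                testClass.getSimpleName() + " must declare @AClusterType");
        }
        ARcaGraph graph = testClass.getAnnotation(ARcaGraph.class);
        String graphName = (graph == null) ? "<none>" : graph.value().getSimpleName();
        return cluster.value() + " running " + graphName;
    }
}
```

Calling `EnvironmentReader.describe(MyRcaIT.class)` reads both annotations reflectively, which is roughly what a runner would need to do before it builds the cluster.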
`__@AMetric__` | ||
|
||
This specifies the metrics that will be poured over the RCA graph. It has multiple | ||
sub-fields. | ||
- _name_ : The metric we are filling in. The expected parameter is one of the metrics classes | ||
in `com.amazon.opendistro.elasticsearch.performanceanalyzer.rca.framework.api.metrics`. The | ||
metrics class that you specify should have a `static final` field called `NAME` (for example, `CPU_Utilization`), | ||
which is used to determine the metric table. | ||
- _dimensionNames_ : For the dimension names for a metric, please refer to the docs | ||
[here](https://opendistro.github.io/for-elasticsearch-docs/docs/pa/reference/). | ||
- _tables_ : This specifies a table for the metric. The table should be a 5 second snapshot | ||
of the metrics, similar to what exists in metricsdb files. The table is an array type, | ||
therefore it gives you the flexibility of specifying a different metrics table for | ||
different nodes in the cluster. This can be used to push different metrics to the node that | ||
we want to be marked unhealthy vs all other nodes in the cluster. | ||
- _hostTag_ : The node of the cluster on which this metric table should be emitted. | ||
- _tuple_ : This is an array type that can be used to specify the rows in the table. A | ||
row should be an n-tuple where n is the number of dimensions this metric has plus | ||
the 4 aggregate columns that all metricsdb files have - `min`, `max`, `sum` and `avg`. | ||
|
||
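The nesting of the `@AMetric` sub-fields can be sketched as follows. This is an assumption-heavy illustration: the annotation types, the `HostTag` values, the `dimensionValues` member, the dimension names and the numbers are all hypothetical, and the sketch assumes that `hostTag` and `tuple` belong to each entry of `tables`. It shows a row being flattened into the dimension values followed by `min`, `max`, `sum` and `avg`, and the table name being read from the metric class's `NAME` field.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real annotations live in the RCA-IT framework.
enum HostTag { DATA_0, DATA_1 }

@Retention(RetentionPolicy.RUNTIME)
@interface ATuple {
    String[] dimensionValues();
    double min();
    double max();
    double sum();
    double avg();
}

@Retention(RetentionPolicy.RUNTIME)
@interface ATable {
    HostTag hostTag();
    ATuple[] tuple();
}

@Retention(RetentionPolicy.RUNTIME)
@interface AMetric {
    Class<?> name();
    String[] dimensionNames();
    ATable[] tables();
}

// Stand-in metric class exposing the static final NAME field the README mentions.
class CPU_Utilization { static final String NAME = "CPU_Utilization"; }

// One table per host: DATA_0 gets a hot row, DATA_1 an idle one (values are made up).
@AMetric(
    name = CPU_Utilization.class,
    dimensionNames = {"ShardID", "IndexName", "Operation", "ShardRole"},
    tables = {
        @ATable(hostTag = HostTag.DATA_0, tuple = {
            @ATuple(dimensionValues = {"0", "logs", "bulk", "p"},
                    min = 0.0, max = 80.0, sum = 160.0, avg = 40.0)
        }),
        @ATable(hostTag = HostTag.DATA_1, tuple = {
            @ATuple(dimensionValues = {"1", "logs", "bulk", "r"},
                    min = 0.0, max = 2.0, sum = 4.0, avg = 1.0)
        })
    })
class HotNodeIT { }

final class MetricTables {
    /** Reads the table name from the metric class's static NAME field. */
    static String tableName(Class<?> testClass) {
        AMetric metric = testClass.getAnnotation(AMetric.class);
        try {
            return (String) metric.name().getDeclaredField("NAME").get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("metric class needs a static NAME field", e);
        }
    }

    /** Flattens the rows meant for one host: dimension values, then min, max, sum, avg. */
    static List<String> rowsFor(Class<?> testClass, HostTag host) {
        AMetric metric = testClass.getAnnotation(AMetric.class);
        List<String> rows = new ArrayList<>();
        for (ATable table : metric.tables()) {
            if (table.hostTag() != host) {
                continue;
            }
            for (ATuple t : table.tuple()) {
                if (t.dimensionValues().length != metric.dimensionNames().length) {
                    throw new IllegalStateException("row width does not match dimensionNames");
                }
                rows.add(String.join(",", t.dimensionValues())
                    + "," + t.min() + "," + t.max() + "," + t.sum() + "," + t.avg());
            }
        }
        return rows;
    }
}
```

Keeping one table per host tag is what lets a test feed high values to the node it wants marked unhealthy while the other nodes receive quiet data.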
`__@Expect__` | ||
|
||
This is an optional annotation that can be used only at the method level. It provides an easy | ||
way to validate the result of the test. The annotation has 4 sub-fields: | ||
- what : What we are testing for - data in the rca.sqlite file or the response returned by the | ||
REST API. | ||
- on : The node on which the framework should look for the expected data. | ||
- forRca : The particular RCA whose value we are testing. | ||
- validator : This is the class that should be defined by the test writer and should | ||
implement the `IValidator` interface. Once the framework gathers the data for the mentioned RCA | ||
from the given node, the data will be passed to the validator, which returns whether the check | ||
passed or not. | ||
|
||
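To make the four sub-fields concrete, the sketch below declares a stand-in `@Expect` annotation and evaluates its validators reflectively. None of this is the framework's real API: `SketchValidator` stands in for `IValidator` (whose actual method signature is not shown in this README), and `Source`, `HostTag`, `Expectations`, `ExpectationChecker` and the row keys are invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical stand-ins, not the framework's real types.
interface SketchValidator { boolean check(Map<String, String> rcaRow); }
enum Source { RCA_SQLITE, REST_API }
enum HostTag { DATA_0, ELECTED_MASTER }
class HotNodeRca { }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@Repeatable(Expectations.class)
@interface Expect {
    Source what();
    HostTag on();
    Class<?> forRca();
    Class<? extends SketchValidator> validator();
    int timeoutSeconds() default 60;
}

// Container annotation required by @Repeatable.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Expectations { Expect[] value(); }

class UnhealthyValidator implements SketchValidator {
    @Override
    public boolean check(Map<String, String> rcaRow) {
        return "unhealthy".equals(rcaRow.get("state"));
    }
}

class RcaNameValidator implements SketchValidator {
    @Override
    public boolean check(Map<String, String> rcaRow) {
        return "HotNodeRca".equals(rcaRow.get("rca_name"));
    }
}

class HotNodeIT {
    // Two expectations on the same test method: one per node.
    @Expect(what = Source.RCA_SQLITE, on = HostTag.DATA_0,
            forRca = HotNodeRca.class, validator = UnhealthyValidator.class)
    @Expect(what = Source.REST_API, on = HostTag.ELECTED_MASTER,
            forRca = HotNodeRca.class, validator = RcaNameValidator.class,
            timeoutSeconds = 120)
    public void hotNodeIsReported() { }
}

final class ExpectationChecker {
    /** Instantiates each declared validator and counts how many accept the given row. */
    static int countPassing(Class<?> testClass, String methodName, Map<String, String> rcaRow) {
        try {
            Method method = testClass.getDeclaredMethod(methodName);
            int passing = 0;
            for (Expect expect : method.getAnnotationsByType(Expect.class)) {
                SketchValidator validator =
                    expect.validator().getDeclaredConstructor().newInstance();
                if (validator.check(rcaRow)) {
                    passing++;
                }
            }
            return passing;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not evaluate expectations", e);
        }
    }
}
```

In this sketch the same row is handed to every validator for brevity; a real runner would first fetch the data named by `what`, `on` and `forRca` for each expectation.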
The Expect annotation is a repeatable type. Therefore, you can expect multiple things from | ||
the test at steady state. For example, you can have two expectations, one for the RCA on a data node and | ||
the other on the master. If the expectations are not met in the ith iteration, the | ||
framework will re-run them in the (i+1)th iteration until a timeout. The timeout is | ||
configurable to any number of seconds using the field `timeoutSeconds` and defaults | ||
to 60 seconds. | ||
|
||
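The re-run-until-timeout behaviour described above can be sketched as a small polling loop. This is a minimal sketch, not the framework's implementation: `SteadyStateCheck`, `eventually` and the pause between iterations are assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

final class SteadyStateCheck {
    /**
     * Re-evaluates the expectation until it holds or the timeout elapses.
     * Returns true as soon as one iteration passes, false on timeout or interruption.
     */
    static boolean eventually(BooleanSupplier expectation, long timeoutMillis, long pauseMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (expectation.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the documented default, a caller would pass a timeout of 60 seconds; the expectation is always evaluated at least once, even with a zero timeout.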
A test class can get access to the programmatic API to get information about the hosts in the cluster | ||
or a particular host. To do so, the test class declares a method named `setTestApi(final TestApi api)`, | ||
and the test runner calls this setter to give the test class a reference to the TestApi. | ||
|
||
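One plausible way for a runner to hand over the API is to look up the setter by name through reflection, as sketched below. The `TestApi` class here is an empty stand-in and `ApiInjector` is invented; only the method name `setTestApi(final TestApi api)` comes from this README.

```java
import java.lang.reflect.Method;

// Empty stand-in for the framework's TestApi.
class TestApi { }

class MyRcaIT {
    private TestApi api;

    // The runner discovers this setter by its name and parameter type.
    public void setTestApi(final TestApi api) {
        this.api = api;
    }

    boolean hasApi() {
        return api != null;
    }
}

// A test class that does not ask for the API.
class NoApiIT { }

final class ApiInjector {
    /** Calls setTestApi on the test instance if it declares one; returns whether it did. */
    static boolean inject(Object testInstance, TestApi api) {
        try {
            Method setter = testInstance.getClass().getMethod("setTestApi", TestApi.class);
            setter.setAccessible(true);
            setter.invoke(testInstance, api);
            return true;
        } catch (NoSuchMethodException e) {
            // The setter is optional, so its absence is not an error in this sketch.
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("setTestApi could not be invoked", e);
        }
    }
}
```

Treating a missing setter as "nothing to inject" keeps the API opt-in, so tests that only rely on annotations need no extra code.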
## Framework deep dive | ||
This section might be of interest to you if you are trying to enhance the test framework itself. | ||
If you just want to add more integration tests, then you may choose to skip this section. | ||
|
||
The framework consists of four main classes: | ||
1. `RcaItRunnerBase` : This is the JUnit Runner that will be used to run all rca-it tests. It | ||
orchestrates the environment creation, initializing the test class and then executing the methods | ||
annotated with `@Test`. | ||
|
||
2. `TestEnvironment` : The RCA-IT environment is defined by the RCA graph it will be running, the | ||
metrics that will flow through the dataflow pipelines and the rca.conf that will be used by the | ||
hosts. | ||
|
||
3. `Cluster` and `Host` classes: These classes initialize multiple RCAController threads, | ||
each of which represents the RCAFramework running on one node. In the constructors, we create all the | ||
objects and create a directory per host where they will dump the _rca.sqlite_ and _rca_enabled_ | ||
files. In the second phase, a call to `createServersAndThreads` is made, which creates all the HTTP | ||
and gRPC servers (one per host). Then we start the RCAController thread. |
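The phased start-up described for `Cluster` and `Host` can be sketched as below. `SketchHost` is a hypothetical, heavily simplified stand-in: it only creates a temporary per-host directory and a single thread, and it omits the HTTP and gRPC servers entirely. Only the method name `createServersAndThreads` is taken from this README.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SketchHost {
    private final Path dataDir;
    private Thread controller;
    private volatile boolean controllerRan;

    // Phase 1: build objects and the per-host directory; nothing is running yet.
    SketchHost(String name) {
        try {
            this.dataDir = Files.createTempDirectory("rca-it-" + name + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path dataDir() {
        return dataDir;
    }

    // Phase 2: create the controller thread (a real host would also create its servers here).
    void createServersAndThreads() {
        controller = new Thread(() -> controllerRan = true, "rca-controller");
    }

    // Phase 3: start the controller thread.
    void start() {
        if (controller == null) {
            throw new IllegalStateException("call createServersAndThreads() before start()");
        }
        controller.start();
    }

    /** Waits for the controller thread to finish and reports whether it ran. */
    boolean awaitController(long timeoutMillis) {
        try {
            controller.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return controllerRan;
    }
}
```

Separating construction from thread creation means every host's directory exists before any controller thread starts running.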
Is there a reason we don't throw after this?