[Security Solution][Resolver] Adding resolver backend docs (elastic#7…
…3726) (elastic#73854)

* Adding resolver backend docs

* Adding more clarity around ancestry array limit

Co-authored-by: Elastic Machine <[email protected]>

jonathan-buttner and elasticmachine authored Jul 30, 2020
1 parent 428a021 commit e1ca8af
Showing 9 changed files with 242 additions and 33 deletions.
2 changes: 2 additions & 0 deletions x-pack/plugins/security_solution/common/endpoint/types.ts
@@ -104,6 +104,8 @@ export interface ResolverChildNode extends ResolverLifecycleNode {
  *
  * string: Indicates this is a leaf node and it can be used to continue querying for additional descendants
  * using this node's entity_id
+ *
+ * For more information see the resolver docs on pagination [here](../../server/endpoint/routes/resolver/docs/README.md#L129)
  */
 nextChild?: string | null;
 }

Large diffs are not rendered by default.

@@ -74,8 +74,28 @@ export class AncestryQueryHandler implements QueryHandler<ResolverAncestry> {
 // bucket the start and end events together for a single node
 const ancestryNodes = this.toMapOfNodes(results);

-// the order of this array is going to be weird, it will look like this
-// [furthest grandparent...closer grandparent, next recursive call furthest grandparent...closer grandparent]
+/**
+ * This array (this.ancestry.ancestors) holds the accumulated ancestors of the node of interest. It is different
+ * from the ancestry array of a specific document. The order of this array is going to be weird; it will look like this:
+ * [most distant ancestor...closer ancestor, next recursive call most distant ancestor...closer ancestor]
+ *
+ * Here is an example of why this happens.
+ * Consider the following tree:
+ * A -> B -> C -> D -> E -> Origin
+ * where A was spawned before B, which was spawned before C, etc.
+ *
+ * Let's assume the ancestry array limit is 2, so Origin's array would be: [E, D]
+ * E's ancestry array would be: [D, C], etc.
+ *
+ * If a request comes in to retrieve all the ancestors in this tree, the accumulated results will be:
+ * [D, E, B, C, A]
+ *
+ * The first iteration would retrieve D and E, in that order, because they are sorted in ascending order by timestamp.
+ * The next iteration would get the ancestors of D (since that's the most distant ancestor from Origin), which are
+ * [B, C]
+ * The next iteration would get the ancestor of B, which is A
+ * Hence: [D, E, B, C, A]
+ */
 this.ancestry.ancestors.push(...ancestryNodes.values());
 this.ancestry.nextAncestor = parentEntityId(results[0]) || null;
 this.levels = this.levels - ancestryNodes.size;
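
The ordering described in the ancestry comment can be reproduced with a small in-memory simulation. This is only a sketch under stated assumptions: `ProcessNode`, `ancestryArray`, and `accumulateAncestors` are hypothetical names invented for illustration, the tree lives in a `Map` rather than in Elasticsearch, and each iteration continues from the most distant ancestor in the previous batch, as the comment describes. It is not the handler's actual query logic.

```typescript
// Hypothetical in-memory model: each process records its parent and a spawn timestamp.
interface ProcessNode {
  entityId: string;
  parent: string | null;
  timestamp: number;
}

// The tree from the comment: A -> B -> C -> D -> E -> Origin, with A spawned first.
const chain = ['A', 'B', 'C', 'D', 'E', 'Origin'];
const nodes = new Map<string, ProcessNode>(
  chain.map((entityId, i): [string, ProcessNode] => [
    entityId,
    { entityId, parent: i === 0 ? null : chain[i - 1], timestamp: i },
  ])
);

// Assumed shape of a document's ancestry array: nearest ancestor first, capped at `limit`.
function ancestryArray(entityId: string, limit: number): string[] {
  const result: string[] = [];
  let current = nodes.get(entityId)?.parent ?? null;
  while (current !== null && result.length < limit) {
    result.push(current);
    current = nodes.get(current)?.parent ?? null;
  }
  return result;
}

// Accumulate ancestors one batch per iteration. Each batch is sorted by ascending
// timestamp before it is appended, which produces the "weird" overall order.
function accumulateAncestors(origin: string, limit: number): string[] {
  const accumulated: string[] = [];
  let next: string | null = origin;
  while (next !== null) {
    const batch = ancestryArray(next, limit)
      .map((id) => nodes.get(id)!)
      .sort((a, b) => a.timestamp - b.timestamp);
    if (batch.length === 0) {
      break; // no ancestors left to retrieve
    }
    accumulated.push(...batch.map((node) => node.entityId));
    next = batch[0].entityId; // continue from the most distant ancestor in this batch
  }
  return accumulated;
}

console.log(accumulateAncestors('Origin', 2).join(', ')); // D, E, B, C, A
```

With a limit large enough to cover the whole chain, a single batch is returned and the result is simply sorted by timestamp; the interleaving only appears when the limit forces several iterations.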
@@ -30,38 +30,9 @@ export interface Options {
 }

 /**
- * This class aids in constructing a tree of process events. It works in the following way:
+ * This class aids in constructing a tree of process events.
  *
- * 1. We construct a tree structure starting with the root node for the event we're requesting.
- * 2. We leverage the ability to pass hashes and arrays by reference to construct a fast cache of
- * process identifiers that updates the tree structure as we push values into the cache.
- *
- * When we query a single level of results for child process events we have a flattened, sorted result
- * list that we need to add into a constructed tree. We also need to signal in an API response whether
- * or not there are more child process events that we have not yet retrieved, and, if so, for what parent
- * process. So, at the end of our tree construction we have a relational layout of the events with no
- * pagination information for the given parent nodes. In order to actually construct both the tree and
- * insert the pagination information we basically do the following:
- *
- * 1. Using a terms aggregation query, we return an approximate roll-up of the number of child process
- * "creation" events, this gives us an estimation of the number of associated children per parent
- * 2. We feed these child process creation event "unique identifiers" (basically a process.entity_id)
- * into a second query to get the current state of the process via its "lifecycle" events.
- * 3. We construct the tree above with the "lifecycle" events.
- * 4. Using the terms query results, we mark each non-leaf node with the number of expected children, if our
- * tree has fewer children than expected, we create a pagination cursor to indicate "we have a truncated set
- * of values".
- * 5. We mark each leaf node (the last level of the tree we're constructing) with a "null" for the expected
- * number of children to indicate "we have not yet attempted to get any children".
- *
- * Following this scheme, we use exactly 2 queries per level of children that we return--one for the pagination
- * and one for the lifecycle events of the processes. The downside to this is that we need to dynamically expand
- * the number of documents we can retrieve per level due to the exponential fanout of child processes,
- * what this means is that noisy neighbors for a given level may hide other child process events that occur later
- * temporally in the same level--so, while a heavily forking process might get shown, maybe the actually malicious
- * event doesn't show up in the tree at the beginning.
- *
- * This Tree's root/origin could be in the middle of the tree. The origin corresponds to the id passed in when this
+ * This Tree's root/origin will likely be in the middle of the tree. The origin corresponds to the id passed in when this
  * Tree object is constructed. The tree can have ancestors and children coming from the origin.
  */
 export class Tree {
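
Steps 4 and 5 of the numbered pagination scheme in the Tree class comment can be sketched roughly as follows. Everything here is an assumption made for illustration: `SketchNode`, `ChildState`, and `classifyChildren` are hypothetical names, the expected child counts are passed in as a plain `Map` standing in for the terms aggregation results, and the sketch labels each node instead of building a real pagination cursor, since the cursor format is not shown in this diff.

```typescript
// Hypothetical node shape; not the type used by the plugin.
interface SketchNode {
  entityId: string;
  children: SketchNode[];
}

// 'truncated' stands in for "a pagination cursor was created" (step 4).
// 'not-requested' stands in for the "null" marker on leaf nodes (step 5).
type ChildState = 'complete' | 'truncated' | 'not-requested';

function classifyChildren(
  root: SketchNode,
  expectedChildren: Map<string, number>, // stand-in for the terms aggregation counts
  levels: number // number of child levels that were retrieved
): Map<string, ChildState> {
  const states = new Map<string, ChildState>();

  const visit = (node: SketchNode, depth: number): void => {
    if (depth === levels) {
      // Last retrieved level: children of this node were never queried.
      states.set(node.entityId, 'not-requested');
      return;
    }
    // Non-leaf node: compare what the tree holds against the expected count.
    const expected = expectedChildren.get(node.entityId) ?? 0;
    states.set(node.entityId, node.children.length < expected ? 'truncated' : 'complete');
    node.children.forEach((child) => visit(child, depth + 1));
  };

  visit(root, 0);
  return states;
}
```

For example, a parent that the aggregation says has three children but that holds only two in the tree would be labelled `'truncated'`, while its two retrieved children at the last level would be labelled `'not-requested'`.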
