Security Telemetry Refactor #109875
@@ -0,0 +1,274 @@
import { ElasticsearchClient, Logger, SavedObjectsClient } from 'src/core/server';
The idea, IMO, is that this module is going to capture ES queries inside the telemetry library. Let me know if this is a bad idea -- generally seemed good to abstract this portion from the actual transmission.
I think this is a good idea in theory, but the execution is problematic and doesn't set us up to adequately test this.
A more effective way of writing this code would be to invert control and pass the TelemetryQuerier as a dependency to the TelemetrySender class.
const telemetryQuerier = new TelemetryQuerier(...deps)
const telemetrySender = new TelemetrySender(...deps, telemetryQuerier)
By doing this the sender no longer needs to reference an SO Client, ES Client, Fleet Interface, etc and will get us closer to a Single Responsibility telemetry sender.
const telemetryQuerier = new TelemetryQuerier(...deps)
const telemetrySender = new TelemetrySender(...deps)
const telemetryOrchestrator = new TelemetryOrchestrator(telemetryQuerier, telemetrySender, ...deps)
I also think the term Querier is overloaded here. Receiver would be a better noun to describe the interface interactions.
Let's pair on this bit.
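For illustration, here is a minimal sketch of the inversion of control suggested above and why it helps testing. It is not the PR's code: the interface shapes and the names fetchClusterUuid, SentRecord, and getSent are hypothetical, and the real classes take more dependencies. The sender only sees a narrow receiver interface, so a test can pass a stub instead of an ES or SO client.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface TelemetryEvent {
  id: string;
}

// The sender depends on this narrow interface, not on ES / SO / Fleet clients.
interface TelemetryReceiver {
  fetchClusterUuid(): string;
}

interface SentRecord {
  clusterUuid: string;
  event: TelemetryEvent;
}

class TelemetrySender {
  private readonly receiver: TelemetryReceiver;
  private readonly sent: SentRecord[] = [];

  // The receiver is injected rather than constructed inside the sender.
  constructor(receiver: TelemetryReceiver) {
    this.receiver = receiver;
  }

  send(events: TelemetryEvent[]): number {
    const clusterUuid = this.receiver.fetchClusterUuid();
    for (const event of events) {
      // A real sender would transmit the event; this sketch only records it.
      this.sent.push({ clusterUuid, event });
    }
    return events.length;
  }

  getSent(): readonly SentRecord[] {
    return this.sent;
  }
}

// In a test, a stub receiver replaces every Elasticsearch interaction.
const stubReceiver: TelemetryReceiver = { fetchClusterUuid: () => 'test-cluster' };
const sender = new TelemetrySender(stubReceiver);
const sentCount = sender.send([{ id: 'a' }, { id: 'b' }]);
```

Because the stub satisfies the interface structurally, no mocking library is needed to exercise the sender in isolation.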
Yeah ok! That'll be a good starting point. I'm good with the inversion of control, and it's better as it removes coupling between the sender and the receiver. Where do you think the Queue should reside?
It seems to me that the primary modules to abstract from the sender are into:
I started to split these pieces out accordingly.
import { exceptionListItemToEndpointEntry } from './helpers';
import { TelemetryEvent, ESLicense, ESClusterInfo, GetEndpointListResponse } from './types';

export class TelemetryReceiver {
@donaherc I cleaned this class up a little and got the tests / type checks working again. Make sure to g pull.
While working on this code, I've come to doubt that "Receiver" is the correct name. "Retriever" might be better, because of the way the data is being requested.
entity -> push -> receiver -> sender -> ESTC
entity <- pull <- retriever -> sender -> ESTC
What do you reckon?
On reflection, I think this might be a "why not both" situation.
We have a receiver in the form of:
*Detection Engine* -> push -> receiver -> sender -> ESTC
and a retriever in:
* ES / SO / Fleet * <- pull <- retriever -> sender -> ESTC
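The two data flows above could be sketched as two small types. This is only an illustration under assumed names (PushReceiver, PullRetriever, PullSource, drain, and retrieve are all hypothetical and do not appear in the PR): in the push flow the producer calls into the receiver, while in the pull flow the retriever asks a source for data on demand.

```typescript
interface TelemetryEvent {
  id: string;
}

// Push: a producer (for example the Detection Engine) hands events over.
class PushReceiver {
  private readonly queue: TelemetryEvent[] = [];

  receive(events: TelemetryEvent[]): void {
    this.queue.push(...events);
  }

  // Remove and return everything received so far.
  drain(): TelemetryEvent[] {
    return this.queue.splice(0, this.queue.length);
  }
}

// Pull: the retriever queries a source (standing in for ES / SO / Fleet).
type PullSource = () => TelemetryEvent[];

class PullRetriever {
  private readonly source: PullSource;

  constructor(source: PullSource) {
    this.source = source;
  }

  retrieve(): TelemetryEvent[] {
    return this.source();
  }
}

// Both flows end up feeding the same downstream batch for the sender.
const receiver = new PushReceiver();
receiver.receive([{ id: 'alert-1' }]);
const retriever = new PullRetriever(() => [{ id: 'policy-1' }]);
const batch = [...receiver.drain(), ...retriever.retrieve()];
```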
Yeah it's a little confusing. I think I'm comfortable renaming in the PR we discussed as a follow-on (Telemetry Orchestrator). In my first set of commits I was definitely perceiving the behavior in the second example as primary, and by lines/functions I think the majority of the complexity in that module is the "pull from ES" functionality. Since we've decided that the Orchestrator will own the Queue, it seems that it'd serve as the receiver for now, but if we determine that there's enough functionality there, we can break out orchestrator functionality under the paradigm that you've described above, and both Tasks and the Detection Engine (streaming) approach can use the portions of the Orchestrator they need.
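As a rough sketch of what "the Orchestrator owns the Queue" could look like, assuming a bounded queue that drops overflow (the drop policy, the SendFn type, and the enqueue/flush method names are assumptions for illustration, not something decided in this thread):

```typescript
interface TelemetryEvent {
  id: string;
}

type SendFn = (events: TelemetryEvent[]) => void;

class TelemetryOrchestrator {
  private queue: TelemetryEvent[] = [];
  private readonly maxQueueSize: number;
  private readonly send: SendFn;

  constructor(send: SendFn, maxQueueSize = 100) {
    this.send = send;
    this.maxQueueSize = maxQueueSize;
  }

  // Push path: accept events until the queue is full; overflow is dropped
  // (an assumed policy). Returns how many events were accepted.
  enqueue(events: TelemetryEvent[]): number {
    const free = Math.max(this.maxQueueSize - this.queue.length, 0);
    const accepted = events.slice(0, free);
    this.queue.push(...accepted);
    return accepted.length;
  }

  // Hand the queued events to the sender and reset the queue.
  flush(): number {
    const toSend = this.queue;
    this.queue = [];
    if (toSend.length > 0) {
      this.send(toSend);
    }
    return toSend.length;
  }
}

// Usage with a tiny queue so the overflow behavior is visible.
const delivered: TelemetryEvent[][] = [];
const orchestrator = new TelemetryOrchestrator((events) => {
  delivered.push(events);
}, 2);
const acceptedCount = orchestrator.enqueue([{ id: '1' }, { id: '2' }, { id: '3' }]);
const flushedCount = orchestrator.flush();
```

With this shape, both a task (pull) and the Detection Engine (push) could call enqueue without knowing how or when events are transmitted.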
private exceptionListClient?: ExceptionListClient;
private soClient?: SavedObjectsClientContract;
private readonly max_records = 10_000;
private maxQueueSize = 100;
We've duplicated this value in this class and the sender. We should move this into a constant reference.
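One possible shape for that change, shown in a single file for brevity. The constant name TELEMETRY_MAX_QUEUE_SIZE is hypothetical, and in practice it would live in a shared constants module imported by both classes:

```typescript
// Hypothetical shared constant; in the real code this would be exported
// from a common module rather than declared next to the classes.
const TELEMETRY_MAX_QUEUE_SIZE = 100;

class TelemetryReceiver {
  readonly maxQueueSize = TELEMETRY_MAX_QUEUE_SIZE;
}

class TelemetrySender {
  readonly maxQueueSize = TELEMETRY_MAX_QUEUE_SIZE;
}
```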
throw Error('elasticsearch client is unavailable: cannot retrieve cluster infomation');
}

return this.getClusterInfo(this.esClient);
I think we can / should collapse the private func getClusterInfo into this function as it is only called from here and nowhere else.
Sounds good.
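A sketch of the collapsed version, assuming a minimal stand-in for the client. The InfoClient interface, the start method, and the public method name fetchClusterInfo are hypothetical, since the diff context above does not show them, and the real ElasticsearchClient response shape may differ. The same pattern would apply to getLicense.

```typescript
interface ESClusterInfo {
  cluster_uuid: string;
  cluster_name: string;
}

// Hypothetical minimal client shape standing in for the Elasticsearch client.
interface InfoClient {
  info(): Promise<{ body: ESClusterInfo }>;
}

class TelemetryReceiver {
  private esClient?: InfoClient;

  start(esClient: InfoClient): void {
    this.esClient = esClient;
  }

  async fetchClusterInfo(): Promise<ESClusterInfo> {
    if (this.esClient === undefined) {
      throw Error('elasticsearch client is unavailable: cannot retrieve cluster information');
    }

    // The query that used to sit in a private getClusterInfo(esClient)
    // helper is inlined here, since this was its only caller.
    const { body } = await this.esClient.info();
    return body;
  }
}
```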
}

try {
const ret = await this.getLicense(this.esClient, true);
Same here. I think we can get rid of getLicense and move the effect into this function. No point having these dispatches.
Agreed.
Killing it 💪 @donaherc
@elasticmachine merge upstream
💚 Build Succeeded
This LGTM, as @pjhampton and I paired on it. Some added context for other reviewers... We're taking an incremental approach to break up some chunky modules into more intuitive smaller chunks.
To that end, this PR pulls purpose-specific behavior from the sender module and spreads it into other modules (filter, receiver), with some bonus information hiding. We'll then, in a subsequent PR, wrap these modules in a coordinating orchestrator module and then hide the vast majority of the implementation inside the library itself so that both the Detection Engine and Task portions need to know less about the underlying implementation. This will greatly improve our ability to test portions of this in the future, as well as allow other portions of the code to use the telemetry in the future.
* [@pjhampton/@donaherc] Move sec telem tasks into own package.
* Split filter out into its own module, started abstracting ES interaction into a queries module
* Implemented querier and fixed some types
* Updated tests, moved receiver to plugin from sender to decouple them.
* fixed integration in detection engine, misc fixes
* [@pjhampton] Fix type ref problems. Update test defs.
* Make url transformer a member func of the sender class.
* [@pjhampton] clean up receiver commentary.
* [@pjhampton] add null check consistency.
* Fix bad formatting.
Co-authored-by: cdonaher <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
💚 Backport successful
This backport PR will be merged automatically after passing CI.
* [@pjhampton/@donaherc] Move sec telem tasks into own package.
* Split filter out into its own module, started abstracting ES interaction into a queries module
* Implemented querier and fixed some types
* Updated tests, moved receiver to plugin from sender to decouple them.
* fixed integration in detection engine, misc fixes
* [@pjhampton] Fix type ref problems. Update test defs.
* Make url transformer a member func of the sender class.
* [@pjhampton] clean up receiver commentary.
* [@pjhampton] add null check consistency.
* Fix bad formatting.
Co-authored-by: cdonaher <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Pete Hampton <[email protected]>
Co-authored-by: cdonaher <[email protected]>
Summary
Maturing the security solution's use of event-based and batch-based telemetry to improve user protections on the endpoint agent.