initial telemetry setup #69330

michaelolo24 · 2020-06-16T20:13:34Z

Summary

For reference, the original issue: https://github.com/elastic/protections-team/issues/161

This PR looks to retain the following:

Number of active endpoints
Whether endpoints have protections enabled (malware only for now) (TBD).
List by customer (provided by the telemetry service, license.issued_to)
OS type
OS version

With the exception of policy details (for now), the above information is tracked in savedObjects owned by the ingest_manager team. Given our inability to access our indices directly due to security concerns, this provides the best current alternative.

The proposed telemetry schema below is a work in progress; it depends on the discussion here as well as on the ability to obtain policy details. This description and PR will be updated as those details are finalized.

Schema conversation (including detections work): https://github.com/elastic/telemetry/issues/392

  {
    total_installed: 17,
    active_within_last_24_hours: 12,
    os: [
      {
        full_name: 'Windows 10',
        platform: 'windows',
        version: '10.0',
        count: 10
      },
      {
        full_name: 'Windows Server 2012',
        platform: 'windows',
        version: '6.2',
        count: 7
      }
    ],
    policies: { // TBD
      malware: {
        success: 8,
        warning: 4,
        failure: 5
      }
    }
  }
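
As a rough illustration, the proposed payload could be typed as below. The interface and function names are hypothetical (they are not taken from the PR); only the field names come from the schema above, and `policies` is optional because it is still marked TBD.

```typescript
// Hypothetical type names; the field names mirror the proposed schema above.
interface EndpointOSTelemetry {
  full_name: string;
  platform: string;
  version: string;
  count: number;
}

interface PolicyStatusTelemetry {
  success: number;
  warning: number;
  failure: number;
}

interface EndpointTelemetry {
  total_installed: number;
  active_within_last_24_hours: number;
  os: EndpointOSTelemetry[];
  // Marked TBD in the description, so optional in this sketch.
  policies?: { malware: PolicyStatusTelemetry };
}

// The sample payload from the description, typed.
const exampleTelemetry: EndpointTelemetry = {
  total_installed: 17,
  active_within_last_24_hours: 12,
  os: [
    { full_name: 'Windows 10', platform: 'windows', version: '10.0', count: 10 },
    { full_name: 'Windows Server 2012', platform: 'windows', version: '6.2', count: 7 },
  ],
  policies: { malware: { success: 8, warning: 4, failure: 5 } },
};

// Sums the per-OS counts so they can be compared against total_installed.
function sumOsCounts(telemetry: EndpointTelemetry): number {
  return telemetry.os.reduce((sum, entry) => sum + entry.count, 0);
}
```

In the sample payload the per-OS counts happen to add up to `total_installed`; whether that always holds in practice is not stated in this PR.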

Checklist

Delete any items that are not applicable to this PR.

Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

elasticmachine · 2020-07-07T19:27:59Z

Pinging @elastic/endpoint-data-visibility-team (Team:Endpoint Data Visibility)

jonathan-buttner

For the TODOs could you move those to github issues?

If we're not really getting endpoint data right now it might be worth switching the names to agent until we are actually getting endpoint specific data.

jonathan-buttner · 2020-07-07T19:34:32Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+
+export const getDefaultEndpointMetadataTelemetry = (): EndpointMetadataTelemetry => ({
+  endpoints: { installed: 0, last_24_hours: 0 },
+  os: {} as Record<string, EndpointMetadataOSTelemtry>,


Does eslint complain if you don't do the as ... part?

It did before when I initially set this up, but have since fixed. Will remove, thanks

jonathan-buttner · 2020-07-07T19:42:26Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+  },
+});
+
+/**


A github issue might be a better spot to track the functionality implementation

Yea, I'm gonna pull this out, this was more for my own eyes as I finished up the initial implementation

jonathan-buttner · 2020-07-07T19:47:49Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+  });
+  const uniqueHostIds: Set<string> = new Set();
+  const endpointTelemetry = getDefaultEndpointMetadataTelemetry();
+  endpointTelemetry.endpoints.installed = endpointAgents.length; // All agents who have the endpoint registered


Is this really tracking how many agents are installed? And we're assuming the endpoint has been installed if the agent has been installed?

The agent references installed packages in the packages field here. So when we filter for it, this should give us all of the agents that have installed the endpoint. Right now I'm using system since I haven't been able to get the endpoint successfully installed via the agent 😞. @nchaulet , can you confirm my previous statement is accurate?

Actually, I just saw an edge case where I enrolled a second agent on an endpoint and got 2 installed instead of 1. I'm now using the unique host IDs to track the count to avoid this issue.
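
A minimal sketch of the de-duplication idea described here, assuming each agent record exposes a host ID. The `AgentLike` shape and `countUniqueHosts` name are hypothetical and only illustrate counting hosts rather than agents.

```typescript
// Hypothetical, trimmed-down agent shape used only for this sketch.
interface AgentLike {
  agentId: string;
  hostId?: string;
}

// Counts distinct hosts, so two agents enrolled on the same endpoint count once.
function countUniqueHosts(agents: AgentLike[]): number {
  const uniqueHostIds = new Set<string>();
  for (const agent of agents) {
    // Agents without a host ID are skipped in this sketch.
    if (agent.hostId !== undefined) {
      uniqueHostIds.add(agent.hostId);
    }
  }
  return uniqueHostIds.size;
}
```

With two agents enrolled on the same host plus one agent on another host, this returns 2 where `agents.length` would report 3.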

jonathan-buttner · 2020-07-07T20:07:16Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+  };
+}
+
+export interface IngestManagerLocalMetadata extends AgentMetadata {


So fleet stores this information based on the endpoints that are connected?

Yep, successfully installed I believe

jonathan-buttner · 2020-07-07T20:15:29Z

x-pack/plugins/security_solution/server/lib/telemetry/index.ts

+) {
+  const collector = usageCollector.makeUsageCollector({
+    type: SIEM_USAGE_COLLECTOR_TYPE,
+    isReady: () => true,


Does this need to wait on the ingest manager being initialized? I suppose if no saved objects existed then we just wouldn't find them, right?

Yea and we wouldn't send empty telemetry until the next time this runs. @TinaHeiligers is there a better way to implement this?

jonathan-buttner · 2020-07-07T20:19:14Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+export const getDefaultEndpointMetadataTelemetry = (): EndpointMetadataTelemetry => ({
+  endpoints: { installed: 0, last_24_hours: 0 },
+  os: {} as Record<string, EndpointMetadataOSTelemtry>,
+  policies: {


Should we remove these if we're not planning on implementing them for 7.9?

Yea, I'll take them out unless something magical happens in the next day or two 😂

jonathan-buttner · 2020-07-07T20:32:31Z

x-pack/plugins/security_solution/server/lib/telemetry/index.ts

+        if (err.output?.statusCode === 404) {
+          return {};
+        }
+        return getDefaultEndpointMetadataTelemetry(); // TODO: no data returned if an error occured or should fail silently here?


Hmm I wonder if we should return {} if an error occurs because we don't know the results? It's probably not very helpful to send 0s right?

Yea, you're right. I was back and forth on this and saw different implementations, but I agree that an empty object is definitely more informative.
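
The agreed behavior could be sketched roughly as follows. `fetchOrEmpty` is a hypothetical helper, not the function in this PR; it only shows returning an empty object when the fetch fails instead of a zero-filled default.

```typescript
// Hypothetical helper: runs a telemetry fetcher and returns {} if it throws.
async function fetchOrEmpty<T extends object>(
  fetcher: () => Promise<T>
): Promise<T | Record<string, never>> {
  try {
    return await fetcher();
  } catch (err) {
    // The real counts are unknown here; zeros would read as "nothing installed".
    return {};
  }
}
```

The design choice is that a consumer can tell "no data this run" (empty object) apart from "zero endpoints installed" (explicit zeros).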

rylnd · 2020-07-08T15:11:19Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts

+export const getEndpointMetadataTelemetryFromFleet = async (
+  savedObjectsClient: ISavedObjectsRepository
+) => {
+  const { saved_objects: endpointAgents } = await savedObjectsClient.find<AgentSOAttributes>({


@michaelolo24 are these SOs global or space-aware? If you need to search across spaces you'll need to query the kibana index directly. You can see an example of that in rules/jobs telemetry.

They are global? I was told that I could access the SOs directly if they were listed as namespaceType: 'agnostic'. I'll take a look.

michaelolo24 · 2020-07-09T14:47:03Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts

+      const { host, os, elastic } = localMetadata as AgentLocalMetadata;
+
+      if (lastCheckin && new Date(lastCheckin) > aDayAgo) {
+        // Get agents that have checked in within the last 24 hours to later see if their endpoints are running


Once we are able to get policy response details setup, we should be able to get this information in this metadata

michaelolo24 · 2020-07-09T14:48:11Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts

+
+  endpointMetadataTelemetry.os = Object.values(osTracker);
+
+  // Handle Last 24 Hours


Currently, the best way to know for certain that an endpoint is actively running is to check the latest status the agent has for the endpoint from its events list

michaelolo24 · 2020-07-09T14:48:47Z

x-pack/plugins/security_solution/server/lib/telemetry/fleet_saved_objects.ts

+  savedObjectsClient.find<Agent>({
+    type: AGENT_SAVED_OBJECT_TYPE,
+    fields: ['packages', 'last_checkin', 'local_metadata'],
+    filter: `${AGENT_SAVED_OBJECT_TYPE}.attributes.packages: ${ENDPOINT_PACKAGE_CONSTANT}`,


Installed integrations are listed in packages

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.mocks.ts

bkimmel · 2020-07-09T15:15:51Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.test.ts

+
+describe('test security solution endpoint telemetry', () => {
+  let mockSavedObjectsRepository: jest.Mocked<ISavedObjectsRepository>;
+  let getFleetSavedObjectsMetadataSpy: jest.SpyInstance<Promise<SavedObjectsFindResponse<Agent>>>;


❔ These jest.Mocked and jest.SpyInstance deals look pretty cool. Good docs for them anywhere I can read?

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts

bkimmel · 2020-07-09T15:41:15Z

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts

+      const { last_checkin: lastCheckin, local_metadata: localMetadata } = metadataAttributes;
+      const { host, os, elastic } = localMetadata as AgentLocalMetadata;
+
+      if (lastCheckin && new Date(lastCheckin) > aDayAgo) {


❔ This is a big reduce and it seems like maybe it could be easier to read if we did some of this stuff in front of it as a filter. This is a ❔ more on the "ignore for now" side, just a comment for later.

Will break apart in next commit
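
One way the split might look, as a sketch only: a filter step for the 24-hour check-in window and a separate counting step per OS. The `AgentAttributes` shape is a trimmed-down assumption, and the function names and the `full#version` tracker key are hypothetical rather than taken from the follow-up commit.

```typescript
// Hypothetical, trimmed-down agent shape; the real saved-object attributes have more fields.
interface AgentAttributes {
  last_checkin?: string;
  local_metadata: { os: { full: string; platform: string; version: string } };
}

interface OSCount {
  full_name: string;
  platform: string;
  version: string;
  count: number;
}

const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Step 1: keep only agents that checked in within the 24 hours before `now`.
function checkedInWithinLastDay(agents: AgentAttributes[], now: Date): AgentAttributes[] {
  const cutoff = now.getTime() - DAY_IN_MS;
  return agents.filter(
    (agent) => agent.last_checkin !== undefined && new Date(agent.last_checkin).getTime() > cutoff
  );
}

// Step 2: count agents per OS name and version.
function countByOS(agents: AgentAttributes[]): OSCount[] {
  const osTracker: Record<string, OSCount> = {};
  for (const { local_metadata: { os } } of agents) {
    const key = `${os.full}#${os.version}`;
    const entry =
      osTracker[key] ?? { full_name: os.full, platform: os.platform, version: os.version, count: 0 };
    entry.count += 1;
    osTracker[key] = entry;
  }
  return Object.values(osTracker);
}
```

Passing `now` in as a parameter keeps both steps pure, which makes them straightforward to unit test with fixed dates.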

michaelolo24 · 2020-07-09T22:05:18Z

cc @nchaulet if you can take a look through when you get a chance. Just want to make sure I didn't miss anything, including potential optimizations, regarding the savedObjects

michaelolo24 · 2020-07-13T02:38:07Z

x-pack/plugins/security_solution/server/lib/telemetry/fleet_saved_objects.ts

+) =>
+  savedObjectsClient.find<AgentEventSOAttributes>({
+    type: AGENT_EVENT_SAVED_OBJECT_TYPE,
+    filter: `${AGENT_EVENT_SAVED_OBJECT_TYPE}.attributes.agent_id: ${agentId}`,


@jonathan-buttner , is there a way for me to also filter for values that have the value endpoint somewhere in the attributes.message field? Just trying to think of ways to optimize this here

👍 on filtering this, also maybe fetch only data for the last n time, no? this list could be really long at some point

Hmm I'm not sure about that. I'll have to take a look at how the filtering works with saved objects.

@nchaulet I would do that, but the only issue is that if a policy was enabled a week ago and nothing has changed, then there won't be any endpoint-related updates in the events list in the past, let's say, 24 hours, i.e., there isn't a STILL_RUNNING subtype 😂. That's why it would be great to filter by all references to endpoint, but I don't think that's possible via the filter?

If I were to just check if anything has changed regarding the endpoint in the last 24 hours, then I would have to keep some knowledge from day to day of the previous state, and that feels like a recipe for bugs if something gets missed in the cracks somehow

Looks like it's pure KQL and this should work to get all endpoint references in the message! Hoot Hoot 😄

`${AGENT_EVENT_SAVED_OBJECT_TYPE}.attributes.agent_id: ${agentId} and ${AGENT_EVENT_SAVED_OBJECT_TYPE}.attributes.message: "${FLEET_ENDPOINT_PACKAGE_CONSTANT}"`

re here: https://www.elastic.co/guide/en/kibana/master/kuery-query.html#_language_syntax

Quotes around a search term will initiate a phrase search. For example, message:"Quick brown fox" will search for the phrase "quick brown fox" in the message field. Without the quotes, your query will get broken down into tokens via the message field’s configured analyzer and will match documents that contain those tokens, regardless of the order in which they appear. This means documents with "quick brown fox" will match, but so will "quick fox brown". Remember to use quotes if you want to search for a phrase.
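
Putting the filter above into a small helper might look like the sketch below. The two constant values are placeholders chosen for illustration (the real values are defined in the plugins and are not shown in this thread), and `buildEndpointEventFilter` is a hypothetical name.

```typescript
// Placeholder values for illustration only; the real constants are defined in the plugins.
const AGENT_EVENT_SAVED_OBJECT_TYPE = 'agent-event-type';
const FLEET_ENDPOINT_PACKAGE_CONSTANT = 'endpoint-package';

// Builds a KQL filter matching one agent's events whose message mentions the endpoint package.
// The package name is wrapped in double quotes so it is matched as a phrase.
function buildEndpointEventFilter(agentId: string): string {
  const attributes = `${AGENT_EVENT_SAVED_OBJECT_TYPE}.attributes`;
  return `${attributes}.agent_id: ${agentId} and ${attributes}.message: "${FLEET_ENDPOINT_PACKAGE_CONSTANT}"`;
}
```

This sketch does no escaping of `agentId`, so it assumes the ID contains no characters that are special in KQL.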

afharo · 2020-07-13T12:48:43Z

@michaelolo24 can you run node scripts/telemetry_check --fix and commit the changes? It should add the detailed schema to the .json file containing all the plugins schemas in one place 🙏

michaelolo24 · 2020-07-13T13:07:32Z

@afharo, done. Sorry about that!

afharo

Telemetry changes (schema), LGTM!

Bear in mind, though, that we are using the same type as in #71102

You'll get some conflicts if the other PR is merged first (and vice-versa)

afharo · 2020-07-13T13:24:25Z

x-pack/plugins/security_solution/server/lib/telemetry/index.ts

+    },
+    schema: {
+      endpoint: {
+        total_installed: { type: 'number' },


NIT: To help us out, it might be beneficial if devs specify 'long' instead.

Looks good as-is anyway. No need to change it 👍

Sweet, I can change it though. No biggie. Once @rylnd has his PR in I'll rebase on his and update these to long 😄

afharo · 2020-07-13T13:29:00Z

x-pack/plugins/security_solution/server/lib/telemetry/index.ts

+  savedObjectsClient: ISavedObjectsRepository
+) {
+  const collector = usageCollector.makeUsageCollector<UsageData>({
+    type: 'security_solution',


Q: Bear in mind this collector is registering with the same type as the one specified in #71102. Whoever merges first will take ownership of it and it will break the other's implementation

cc/ @rylnd

Yep! Plan is for him to merge his first and then I'll make the updates for mine on top of his changes once they're in

nchaulet

👍 the query to ingest manager saved objects looks good to me

rylnd

Checked it out locally, and I saw endpoints telemetry! I didn't look too closely at implementation details here, but tests look good and data looks 💯

kibanamachine · 2020-07-14T00:11:51Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 8f47a71

Build metrics

✅ unchanged

History

💔 Build #61108 failed 65ae3f9
💚 Build #60983 succeeded 384733f76933277b8b08b05a0c7187be32b67c18
💚 Build #60850 succeeded 706f5dbed60ed4aa3061e45a39524fa1100f67a8
💔 Build #60726 failed 6aff1123a5ec93cea6447e0deb31e0ed551d3b03
💔 Build #60463 failed f78511e5b5fad23fcd020e96c99cb0957c385e2f

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream


michaelolo24 force-pushed the security-telemetry branch from 8c94f41 to 33dadc1 June 16, 2020 20:16

michaelolo24 added Team:Endpoint Data Visibility Team managing the endpoint resolver v7.9.0 v8.0.0 labels Jun 26, 2020

bkimmel reviewed Jun 26, 2020

x-pack/plugins/security_solution/server/lib/telemetry/endpoint_metadata_telemetry.ts (outdated)

michaelolo24 force-pushed the security-telemetry branch 3 times, most recently from 8f7925f to 790ff51 July 7, 2020 15:53

michaelolo24 marked this pull request as ready for review July 7, 2020 19:27

michaelolo24 requested review from a team as code owners July 7, 2020 19:27

michaelolo24 added release_note:skip Skip the PR/issue when compiling release notes Feature:Telemetry labels Jul 7, 2020

michaelolo24 force-pushed the security-telemetry branch from f7ee278 to ae07d5b July 7, 2020 20:27

jonathan-buttner approved these changes Jul 7, 2020

rylnd reviewed Jul 8, 2020

michaelolo24 force-pushed the security-telemetry branch from ae07d5b to 9247b6a July 9, 2020 03:28

michaelolo24 commented Jul 9, 2020

bkimmel reviewed Jul 9, 2020

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.mocks.ts (outdated)

bkimmel reviewed Jul 9, 2020

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts (outdated)

bkimmel reviewed Jul 9, 2020

x-pack/plugins/security_solution/server/lib/telemetry/endpoint.ts (outdated)

bkimmel reviewed Jul 9, 2020

michaelolo24 requested a review from a team July 9, 2020 20:54

michaelolo24 force-pushed the security-telemetry branch from f5449f4 to f78511e July 9, 2020 21:17

bkimmel approved these changes Jul 9, 2020

michaelolo24 requested a review from nchaulet July 9, 2020 22:04

michaelolo24 force-pushed the security-telemetry branch from f78511e to 6aff112 July 13, 2020 02:31

michaelolo24 commented Jul 13, 2020

afharo approved these changes Jul 13, 2020

nchaulet approved these changes Jul 13, 2020

michaelolo24 force-pushed the security-telemetry branch from 706f5db to 384733f July 13, 2020 17:03

endpoint adoption telemetry

7b8bf0a

michaelolo24 force-pushed the security-telemetry branch from 384733f to f74433f July 13, 2020 21:10

unify endpoints and detections telemetry

65ae3f9

michaelolo24 force-pushed the security-telemetry branch from f74433f to 65ae3f9 July 13, 2020 21:13

fix type error

8f47a71

rylnd approved these changes Jul 13, 2020

michaelolo24 merged commit 8325222 into elastic:master Jul 14, 2020

michaelolo24 deleted the security-telemetry branch July 14, 2020 00:52

michaelolo24 mentioned this pull request Jul 14, 2020

[7.x] initial telemetry setup (#69330) #71578

Merged

michaelolo24 added a commit to michaelolo24/kibana that referenced this pull request Jul 14, 2020

initial telemetry setup (elastic#69330)

5c24ef4

michaelolo24 added a commit that referenced this pull request Jul 14, 2020

[7.x] initial telemetry setup (#69330) (#71578)

2d3b157
