[Heartbeat] Suite Monitor discovery mode #29917

vigneshshanmugam · 2022-01-19T21:09:32Z

Issue

Currently, we don't have a good way to deal with these below situations when one of the error occurs on running monitors that are zip url endpoints.

Network Error : ZIP url endpoint is not reachable, downloadable.
Syntax Error: Cannot run journeys due to syntax errors in one of the journey files
Run Error: Duplicate journey Id's.

Proposal

Introduce a discovery mode in Heartbeat that would create essential documents for the UI to create monitors based on the journeys. See also #30571 , for caching the discovered zips on a cloud storage endpoint for the service

Add a discovery flag to the existing monitor runs

heartbeat.monitors:
 - id: elastic-monitor
  name: Elastic Monitor
  discovery:
    enabled: true
      # optional, place to re-upload the downloaded zip for subsequent executions.
    source_cache:
      gcp_cloud_storage:
        signed_url: ""https://presignedgcpstorageurl"
  schedule: '@every 1m'
  source:
    zip_url: 
      url: "https://github.com/elastic/synthetics-demo/archive/refs/heads/main.zip"

Create the documents for the discovery mode and send them to the ES. The new documents are created differently than the existing format to avoid changing the queriers on the Uptime UI.

Error cases

Syntax/Network errors would contain only the Top level error event

{
  "suite": {
     "id": "elastic-monitor",
     "name": "Elastic Monitor"
   },
   "source_cache": {
     "google_cloud_storage": {"signed_url": "https://presignedgcpstorageurl"}
   },
   "journeys": [],
   // top level error 
   "error": {
     "message": "invalid syntax - failed with exit status 1",
     "stack": "/test.journey.ts"
   } 
}

If we can execute the synthetics runner, other related errors include duplication would be reported this way.

{
  "suite": {
     "id": "elastic-monitor",
     "name": "Elastic Monitor"
   },
   "source_cache": {
     "google_cloud_storage": {"signed_url": "https://presignedgcpstorageurl"}
   },
   "journeys": [
     {
       "id": "j1",
      "name": "journey-1"
     },
     {
       "id": "j",
        "name": "journey-2",
       "error": {
         "message": "duplicate monitor ID",
         "stack": ""
       }
     }
   ],
   // top level error 
   "error": {
     "message": "one or more journeys had errors"
   } 
}

The text was updated successfully, but these errors were encountered:

elasticmachine · 2022-01-19T21:09:41Z

Pinging @elastic/uptime (Team:Uptime)

andrewvc · 2022-02-09T02:32:04Z

I'd propose we change the syntax from discovery: true to discovery: enabled to allow adding additional discovery related options as required by #30295.

@vigneshshanmugam any objection to me merging the syntax from that issue into your example here?

vigneshshanmugam · 2022-02-09T16:29:02Z

Not at all, I will merge the example with the updated syntax.

lucasfcosta · 2022-02-16T17:47:37Z

Earlier today, while talking about elastic/synthetics#451, we have discussed whether we will be implementing suites in 8.2 or delay until the synthetics app implementation. Therefore, we may want to push this for later.

With regards to terminology, however, it doesn't seem like further refinement is needed given our agreement on what copy to use in the meeting earlier today.

dominiqueclarke · 2022-02-24T15:43:25Z

@vigneshshanmugam @andrewvc We've discussed this a bit, but could you help me crystalize this in my mind in writing.

What is the purpose of the id in journeys[i].id listed in the spec? How is it generated? Will it always be defined? What will it be used for?

Currently the monitor.id for all managed monitors is the uuid for the saved object.

The saved objects client does allow us to specify an id rather than using a random uuid id. My assumption though is that the id provided by journeys[i].id will be unique only to the suite and not with respect to other suites. Therefore, we will use this journeys[i].id, along with suite.id, only as an internal identifier to merge current saved objects with existing saved objects. Is that correct?

I am also wondering how the proposal to include a user-provided "monitor id" will impact this spec. Relating to elastic/synthetics#451, both @lucasfcosta @vigneshshanmugam proposed a mechanism for allowing the user to overwrite the monitor id in the journey. For example:

journey("I can still name the journey", ({ page }) => {
  // applies only to current journey context
  journey.use({ monitorId: "hash" });

  step("my step", async () => { });
});

Will we use this id as the actual monitor id or as an internal id for merging saved objects?

Assuming we continue using this user-provided value as an internal id rather than a monitor id, I'm worried that there may be confusion if we aren't careful about this feature. What will the expectations from our users be? Will they anticipate that the monitorId they set in the journey becomes the actual monitor id. Will it be confusing for our users if, when they navigate to the monitor details page for this journey, they see a random uuid instead of the monitor id set by them in the journey. Will they expect to be able to perform their own ES queries using this value as the monitor id?

If we aren't intending to use this as an actual monitor id, but only as an internal id, I would prefer to remove the word monitor from the term id, but that feature already exists today with the journey({ name, id }, () => {}) syntax.

Other than my confusion about the journeys[i].id, this spec looks great to me and I'm excited to get started on this project!

andrewvc · 2022-02-24T23:47:39Z

Your understanding is correct, that ID should be used as an internal ID for merging saved objects on the kibana side. We could, by the way, rather than a UUID, do what heartbeat does now, which is concatenate the suite ID with the journey ID (joining them with a -). It will be just as unique, and also predictable. TBH, I think it's not a huge deal either way, but I'll just throw it out there as option.

Going forward, I'd like us to get to the point where we can make the id less prominent in the UI, it's really a detail only important for advanced use cases, and as we improve the CRUD / GUI capabilities is less critical. It's still relevant for suites, but really only in exceptional cases, where a user has duplicated the ID. CC @drewpost @liciavale

vigneshshanmugam · 2022-02-25T00:09:01Z

Will we use this id as the actual monitor id or as an internal id for merging saved objects?

I would urge here to use the same pattern that we follow in HB to generate the monitor id for suites which will be <suiteid>:<journey.id> and use them internally as well as for the stored saved objects. Reduces the friction and also makes it explicit for the users to understand whats going on for suites.

If we aren't intending to use this as an actual monitor id, but only as an internal id, I would prefer to remove the word monitor from the term id, but that feature already exists today with the journey({ name, id }, () => {}) syntax.

Yeah I guess thats fine, but by the moment we start allowing the configuration of monitor schedule and other parameters, the journey.use API would be the natural choice as it makes it explicit. But I would wait for this and do this later once we are onboard with the new suites page.

lucasfcosta · 2022-02-28T09:55:24Z

One thing that just came to mind while reading the spec: what happens to the current ZIP URL monitors? Do they need the discovery flag to be enabled in order to work? Wouldn't it be confusing to be able to specify ZIP URLs with and without that flag and then have a different behaviour because of that?

I wonder whether we could just enable "discovery" by default and always have discovery behaviour if the specified source is a ZIP URL or if discovery settings exist (like the caching settings for example)?

Or am I missing something here?

andrewvc · 2022-02-28T17:33:09Z

It's a good question @lucasfcosta . I don't see a way around us supporting both methods for a while, if for no other reason than that this will initially only work with monitor management on the service.

At a minimum we'll need the synthetics node integration active to fully deprecate the current method of operation.

Additionally, this assumes fleet / the service, what about regular heartbeat users? Will we remove synthetics support for them? Some use heartbeat with Logstash for instance. Since we are technically beta we could consider deprecating / killing support for that, but I wouldn't make that call lightly.

lucasfcosta · 2022-03-02T10:10:20Z

Indeed, those considerations about backwards compatibility seem really important.

Just to be sure I understood your point of view, when you say:

I don't see a way around us supporting both methods for a while, if for no other reason than that this will initially only work with monitor management on the service.

You mean that we should be able to still support individual monitor creation and management from Heartbeat and thus make the flag explicit even if users use ZIP URLs?

I ask this because I was thinking that we could still have full support for all users (including fleet/service/HB). However, we'd just change how the data is sent and displayed by Kibana by making the discovery process implicit (i.e. it would be "on" when users add ZIP URLs and they don't need to know about what happens behind the scenes, we just display it nicely in Kibana).

Please let me know if I misunderstood what you meant or didn't see a possible challenge in doing that.

andrewvc · 2022-03-14T22:09:24Z

Closing in favor of elastic/synthetics#470

vigneshshanmugam added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Jan 19, 2022

This was referenced Jan 19, 2022

[META] Suite Discovery / Monitor Management Integration elastic/uptime#432

Closed

[Kibana] Suites tab on overview page elastic/uptime#435

Closed

dominiqueclarke mentioned this issue Jan 19, 2022

[Uptime] [Monitor Management] Suites discovery and creation elastic/kibana#123426

Closed

andrewvc mentioned this issue Jan 25, 2022

Inline monitor syntax errors should send errors to ES #29897

Closed

This was referenced Jan 27, 2022

[Uptime] UI Monitor Management - Suites Monitor Reconciliation in the event of Journey Errors elastic/kibana#123928

Closed

Uptime - Enable Suites Alerts elastic/kibana#123969

Closed

andrewvc mentioned this issue Feb 9, 2022

[Heartbeat] Support Upload / Download of suite zips to GCP cloud storage #30295

Closed

andrewvc mentioned this issue Feb 24, 2022

[Heartbeat] Support google cloud storage for suites #30571

Closed

andrewvc closed this as completed Mar 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Heartbeat] Suite Monitor discovery mode #29917

[Heartbeat] Suite Monitor discovery mode #29917

vigneshshanmugam commented Jan 19, 2022 •

edited by andrewvc

Loading

elasticmachine commented Jan 19, 2022

andrewvc commented Feb 9, 2022

vigneshshanmugam commented Feb 9, 2022

lucasfcosta commented Feb 16, 2022

dominiqueclarke commented Feb 24, 2022

andrewvc commented Feb 24, 2022

vigneshshanmugam commented Feb 25, 2022 •

edited

Loading

lucasfcosta commented Feb 28, 2022

andrewvc commented Feb 28, 2022

lucasfcosta commented Mar 2, 2022

andrewvc commented Mar 14, 2022

[Heartbeat] Suite Monitor discovery mode #29917

[Heartbeat] Suite Monitor discovery mode #29917

Comments

vigneshshanmugam commented Jan 19, 2022 • edited by andrewvc Loading

Issue

Proposal

Error cases

elasticmachine commented Jan 19, 2022

andrewvc commented Feb 9, 2022

vigneshshanmugam commented Feb 9, 2022

lucasfcosta commented Feb 16, 2022

dominiqueclarke commented Feb 24, 2022

andrewvc commented Feb 24, 2022

vigneshshanmugam commented Feb 25, 2022 • edited Loading

lucasfcosta commented Feb 28, 2022

andrewvc commented Feb 28, 2022

lucasfcosta commented Mar 2, 2022

andrewvc commented Mar 14, 2022

vigneshshanmugam commented Jan 19, 2022 •

edited by andrewvc

Loading

vigneshshanmugam commented Feb 25, 2022 •

edited

Loading