Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] Suite Monitor discovery mode #29917

Closed
Tracked by #432
vigneshshanmugam opened this issue Jan 19, 2022 · 11 comments
Closed
Tracked by #432

[Heartbeat] Suite Monitor discovery mode #29917

vigneshshanmugam opened this issue Jan 19, 2022 · 11 comments
Labels
Team:obs-ds-hosted-services Label for the Observability Hosted Services team

Comments

@vigneshshanmugam
Copy link
Member

vigneshshanmugam commented Jan 19, 2022

Issue

Currently, we don't have a good way to deal with these below situations when one of the error occurs on running monitors that are zip url endpoints.

Network Error : ZIP url endpoint is not reachable, downloadable.
Syntax Error: Cannot run journeys due to syntax errors in one of the journey files
Run Error: Duplicate journey Id's.

Proposal

Introduce a discovery mode in Heartbeat that would create essential documents for the UI to create monitors based on the journeys. See also #30571 , for caching the discovered zips on a cloud storage endpoint for the service

  • Add a discovery flag to the existing monitor runs
heartbeat.monitors:
 - id: elastic-monitor
  name: Elastic Monitor
  discovery:
    enabled: true
      # optional, place to re-upload the downloaded zip for subsequent executions.
    source_cache:
      gcp_cloud_storage:
        signed_url: ""https://presignedgcpstorageurl"
  schedule: '@every 1m'
  source:
    zip_url: 
      url: "https://github.com/elastic/synthetics-demo/archive/refs/heads/main.zip"
  • Create the documents for the discovery mode and send them to the ES. The new documents are created differently than the existing format to avoid changing the queriers on the Uptime UI.

Error cases

  • Syntax/Network errors would contain only the Top level error event
{
  "suite": {
     "id": "elastic-monitor",
     "name": "Elastic Monitor"
   },
   "source_cache": {
     "google_cloud_storage": {"signed_url": "https://presignedgcpstorageurl"}
   },
   "journeys": [],
   // top level error 
   "error": {
     "message": "invalid syntax - failed with exit status 1",
     "stack": "/test.journey.ts"
   } 
}
  • If we can execute the synthetics runner, other related errors include duplication would be reported this way.
{
  "suite": {
     "id": "elastic-monitor",
     "name": "Elastic Monitor"
   },
   "source_cache": {
     "google_cloud_storage": {"signed_url": "https://presignedgcpstorageurl"}
   },
   "journeys": [
     {
       "id": "j1",
      "name": "journey-1"
     },
     {
       "id": "j",
        "name": "journey-2",
       "error": {
         "message": "duplicate monitor ID",
         "stack": ""
       }
     }
   ],
   // top level error 
   "error": {
     "message": "one or more journeys had errors"
   } 
}
@vigneshshanmugam vigneshshanmugam added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Jan 19, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@andrewvc
Copy link
Contributor

andrewvc commented Feb 9, 2022

I'd propose we change the syntax from discovery: true to discovery: enabled to allow adding additional discovery related options as required by #30295.

@vigneshshanmugam any objection to me merging the syntax from that issue into your example here?

@vigneshshanmugam
Copy link
Member Author

Not at all, I will merge the example with the updated syntax.

@lucasfcosta
Copy link
Contributor

Earlier today, while talking about elastic/synthetics#451, we have discussed whether we will be implementing suites in 8.2 or delay until the synthetics app implementation. Therefore, we may want to push this for later.

With regards to terminology, however, it doesn't seem like further refinement is needed given our agreement on what copy to use in the meeting earlier today.

@dominiqueclarke
Copy link

@vigneshshanmugam @andrewvc We've discussed this a bit, but could you help me crystalize this in my mind in writing.

What is the purpose of the id in journeys[i].id listed in the spec? How is it generated? Will it always be defined? What will it be used for?

Currently the monitor.id for all managed monitors is the uuid for the saved object.

The saved objects client does allow us to specify an id rather than using a random uuid id. My assumption though is that the id provided by journeys[i].id will be unique only to the suite and not with respect to other suites. Therefore, we will use this journeys[i].id, along with suite.id, only as an internal identifier to merge current saved objects with existing saved objects. Is that correct?

I am also wondering how the proposal to include a user-provided "monitor id" will impact this spec. Relating to elastic/synthetics#451, both @lucasfcosta @vigneshshanmugam proposed a mechanism for allowing the user to overwrite the monitor id in the journey. For example:

journey("I can still name the journey", ({ page }) => {
  // applies only to current journey context
  journey.use({ monitorId: "hash" });

  step("my step", async () => { });
});

Will we use this id as the actual monitor id or as an internal id for merging saved objects?

Assuming we continue using this user-provided value as an internal id rather than a monitor id, I'm worried that there may be confusion if we aren't careful about this feature. What will the expectations from our users be? Will they anticipate that the monitorId they set in the journey becomes the actual monitor id. Will it be confusing for our users if, when they navigate to the monitor details page for this journey, they see a random uuid instead of the monitor id set by them in the journey. Will they expect to be able to perform their own ES queries using this value as the monitor id?

If we aren't intending to use this as an actual monitor id, but only as an internal id, I would prefer to remove the word monitor from the term id, but that feature already exists today with the journey({ name, id }, () => {}) syntax.

Other than my confusion about the journeys[i].id, this spec looks great to me and I'm excited to get started on this project!

@andrewvc
Copy link
Contributor

Your understanding is correct, that ID should be used as an internal ID for merging saved objects on the kibana side. We could, by the way, rather than a UUID, do what heartbeat does now, which is concatenate the suite ID with the journey ID (joining them with a -). It will be just as unique, and also predictable. TBH, I think it's not a huge deal either way, but I'll just throw it out there as option.

Going forward, I'd like us to get to the point where we can make the id less prominent in the UI, it's really a detail only important for advanced use cases, and as we improve the CRUD / GUI capabilities is less critical. It's still relevant for suites, but really only in exceptional cases, where a user has duplicated the ID. CC @drewpost @liciavale

@vigneshshanmugam
Copy link
Member Author

vigneshshanmugam commented Feb 25, 2022

Will we use this id as the actual monitor id or as an internal id for merging saved objects?

I would urge here to use the same pattern that we follow in HB to generate the monitor id for suites which will be <suiteid>:<journey.id> and use them internally as well as for the stored saved objects. Reduces the friction and also makes it explicit for the users to understand whats going on for suites.

If we aren't intending to use this as an actual monitor id, but only as an internal id, I would prefer to remove the word monitor from the term id, but that feature already exists today with the journey({ name, id }, () => {}) syntax.

Yeah I guess thats fine, but by the moment we start allowing the configuration of monitor schedule and other parameters, the journey.use API would be the natural choice as it makes it explicit. But I would wait for this and do this later once we are onboard with the new suites page.

@lucasfcosta
Copy link
Contributor

One thing that just came to mind while reading the spec: what happens to the current ZIP URL monitors? Do they need the discovery flag to be enabled in order to work? Wouldn't it be confusing to be able to specify ZIP URLs with and without that flag and then have a different behaviour because of that?

I wonder whether we could just enable "discovery" by default and always have discovery behaviour if the specified source is a ZIP URL or if discovery settings exist (like the caching settings for example)?

Or am I missing something here?

@andrewvc
Copy link
Contributor

It's a good question @lucasfcosta . I don't see a way around us supporting both methods for a while, if for no other reason than that this will initially only work with monitor management on the service.

At a minimum we'll need the synthetics node integration active to fully deprecate the current method of operation.

Additionally, this assumes fleet / the service, what about regular heartbeat users? Will we remove synthetics support for them? Some use heartbeat with Logstash for instance. Since we are technically beta we could consider deprecating / killing support for that, but I wouldn't make that call lightly.

@lucasfcosta
Copy link
Contributor

Indeed, those considerations about backwards compatibility seem really important.

Just to be sure I understood your point of view, when you say:

I don't see a way around us supporting both methods for a while, if for no other reason than that this will initially only work with monitor management on the service.

You mean that we should be able to still support individual monitor creation and management from Heartbeat and thus make the flag explicit even if users use ZIP URLs?

I ask this because I was thinking that we could still have full support for all users (including fleet/service/HB). However, we'd just change how the data is sent and displayed by Kibana by making the discovery process implicit (i.e. it would be "on" when users add ZIP URLs and they don't need to know about what happens behind the scenes, we just display it nicely in Kibana).

Please let me know if I misunderstood what you meant or didn't see a possible challenge in doing that.

@andrewvc
Copy link
Contributor

Closing in favor of elastic/synthetics#470

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

No branches or pull requests

5 participants