Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] NPM-less default behavior for browser #28749

Closed
andrewvc opened this issue Nov 1, 2021 · 14 comments
Closed

[Heartbeat] NPM-less default behavior for browser #28749

andrewvc opened this issue Nov 1, 2021 · 14 comments
Labels
enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team

Comments

@andrewvc
Copy link
Contributor

andrewvc commented Nov 1, 2021

Currently, by default, heartbeat will run npm install on zip_url and folder directories before executing their contents. This is fragile and difficult to support for operators because:

  1. It assumes npm is available
  2. It assumes the cost of the install is cheap, it could be quite slow in relation to a simple test suite.

For a future service this is also not great because it's just slow.

We should add a new mode of operation where the user's package.json is ignored, and only the system libraries are used.

I'm not certain what the best way to do this is, but one way would be to use this npm-pack-all tool: http://www.leanpro.cn/docs/leanrunner/en/shared/npm_offline , then run npm install with that.

Another way would be to just create a suitable node_modules folder as part of the docker image build, and just exec an ln -s to that from the user's project. to 'fill in' any deps. I assume that approach would be faster / simpler, so I sort of prefer it.

  • We also need to create an Error document in case of version mismatch.
@andrewvc andrewvc added enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Nov 1, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@shahzad31
Copy link
Contributor

Can we have some sort of allowlist for the packages we allow, using npm we can install only those packages and also on top of those required by the agent. This will make sure the cost can be predicted for the npm install.

Removing npm install altogether does limit the selling point of ability to use raw js/ts in the test, in a sense that we are limiting the eco system on which js depends.

@paulb-elastic
Copy link
Contributor

More discussion needed to handle supporting just Synthetics and offline, as well as the flexibility of allowing external deps (without hammering NPM, or dealing with NPM being unavailable). Also, the option of it being disabled in the service, but available on prem.

@paulb-elastic
Copy link
Contributor

An alternative is to expose specific modules through Synthetics like we do with jest/expect

@lucasfcosta
Copy link
Contributor

As we discussed during refinement yesterday, I agree that simply not running npm install by default is the quickest and easiest win here. If users complain we can always adjust later. I also think that we should bundle as little as possible initially.

In case we want to elaborate on this, however, I thought it would be useful to list the dilemmas involved here.

I believe that in this case, there are three main characteristics we'd like our product to have. It should be:

  • Flexible/Easy to use: being able to run NPM install makes it quicker for users to write journeys as they can install and reuse third-party modules like lodash. Especially considering how heavy the usage of third-party modules is in the JS-ecosystem, this situation is relatively common.
  • Fast: running npm install for every running could be costly in terms of time depending on how many and which modules the users depend upon.
  • Portable: we want to be able to run these journeys in airgapped environments, in which users obviously don't have access to NPM (although they could configure their own registries). It's also worth notice we'll soon have to run these installs in the service, which could easily make it significantly pricier although I can't gauge by how much.

Considering these three characteristics, my descending rank of the most important ones would be:

  1. Flexible/Easy to use
  2. Portable (considering the profile/environment of many users)
  3. Fast

I also believe that item (1) is, by far, the most important of these.

Using that ranked list to scope out desired behaviour, IMO:

  1. NPM install should be doable, but it should not be the default and be hidden under a particular flag. If users do use modules other than synthetics, they should get an informative error message telling them what to do to enable npm install and fix the problem. We can probably detect this just by reading package.json.
    • As a separate note, I quite like the docker approach, so we could have a sidecart container just to manage dependency lifecycle and delegate further installations to it. It also delegates the caching in point (3) to the NPM toolchain.
  2. Considering that it should be possible and easy to use third-party modules everywhere, it should be possible to provide dependency files in air-gapped environments (and it should be easy to do so).
  3. Module installs should be cached so that we're not doing unnecessary installs, both on user's deployments and the synthetics service.
  4. We should not bundle anything other than the most essential utilities (playwright, expect, and lodash), otherwise we incur the risk of (a) being too prescriptive, (b) bloating our own package.

On another note, I particularly don't like the allowlist option because I think it defeats the purpose of making the product flexible and easy-to-use, adding extra unnecessary complexity and not offering the benefit of making the product necessarily faster or more portable given an install is still necessary.

@lucasfcosta
Copy link
Contributor

A related issue reported by one of our users: elastic/synthetics#445

Worth keeping track of it here too as there's a min-repro there and a good amount of info.

Also relates to: https://discuss.elastic.co/t/synthetics-failing-silently/295443/9

@mybyte
Copy link

mybyte commented Jan 28, 2022

Baking a container with a pre-filled npm cache is a work-around for now, but the cost of npm install is still too high. Preferably, this is something that should be executed exactly once, probably during suite development, certainly not in a running container.
Making containers with test suites baked in (as in https://github.com/elastic/synthetics/tree/main/examples/todos) is not a viable solution too. In most deployments, module/input configurations are something that you'd mount and maintain externally...

@andrewvc
Copy link
Contributor Author

I've thought the fastest thing we could do would be to replace the user's package.json and node_modules, it may be possible to do so with a symlink or hard-link to write protected directory, which would be a sort of instantaneous install.

@paulb-elastic
Copy link
Contributor

@vigneshshanmugam does this go away as a requirement to implement based on the new project (push) based monitors?

i.e. with push only (not ZIP URL), elastic agents running self managed nodes will naturally not need any internet connection, and will work fully airgapped?

@vigneshshanmugam
Copy link
Member

In theory it should work as we are not pulling the source of the synthetics package from the local path instead of fetching from the NPM registry. The exception being, users will not be able to use any external libraries other than our package itself.

We could do some validation in our end to make sure it behaves this way.

@paulb-elastic
Copy link
Contributor

If a user has a need for a non internet connected agent to run tests from, it's logical that they can't have any external dependencies (i.e. as expected).

Out of interest, would this also be true for anything they may host themselves? E.g. on an air gapped agent, could they include something from a private NPM registry (accessible on that agent), which would be included in a push? Or is it that we don’t support than anyway?

(cc @andrewvc)

@vigneshshanmugam
Copy link
Member

We dont support external libraries with push anyway for now, We could indeed add a set of libraries that we could bake inside the HB image like we have expect library as part of the Synthetics itself. However for the time being any external dependency will not be bundled when users run push command from their local system.

On the private NPM registry part, Users can override it by having a .npmrc file in repository. This would however only work for zip url source as we install it when we unpack the repository.

@andrewvc
Copy link
Contributor Author

Hmmmm, it would be good to be able to bundle some simple external deps, like totp-generator. Maybe it'd be best to just bundle a curated list for common use cases however.

@paulb-elastic
Copy link
Contributor

Closing as not relevant as we'll be deprecating ZIP URLs

@paulb-elastic paulb-elastic closed this as not planned Won't fix, can't repro, duplicate, stale Aug 15, 2022
@zube zube bot closed this as completed Nov 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

No branches or pull requests

7 participants