Problem Statement
In using various plugins, we have identified a few key limitations of the current plugin architecture:
1. Lack of Observability/Prometheus Metrics
Plugins are forked at startup from the main process. This makes them harder to observe:
There is no interface for plugins to emit metrics via the standard Prometheus endpoint.
Plugins will not appear in pprof profiles generated using the argo-rollouts pprof endpoint.
Moreover, it takes significant reconfiguration to enable metrics and pprof endpoints for plugins. The user needs to:
Expose the metrics/pprof endpoints within the plugin server (as in any other application), likely configurable via command-line flags.
Modify the argo-rollouts Deployment object to expose these ports.
(In the case of monitoring) further modify the argo-rollouts scrape configuration to cover the ports of all configured plugins (as sketched below).
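To make the reconfiguration burden concrete, here is a rough sketch of what that wiring looks like today, assuming a hypothetical plugin that serves its own metrics/pprof on port 9090 and a Prometheus Operator-based monitoring setup; the port, labels, and image tag are illustrative assumptions only, not part of the current plugin contract.

```yaml
# Sketch only: a plugin assumed to serve metrics/pprof on :9090, scraped via a
# Prometheus Operator PodMonitor. None of this is provided by the plugin framework today.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argo-rollouts
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argo-rollouts
  template:
    metadata:
      labels:
        app.kubernetes.io/name: argo-rollouts
    spec:
      containers:
        - name: argo-rollouts
          image: quay.io/argoproj/argo-rollouts:v1.7.2   # example version
          ports:
            - name: metrics
              containerPort: 8090                        # controller metrics (already exposed)
            - name: plugin-metrics
              containerPort: 9090                        # assumed port for the plugin's metrics/pprof server
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: argo-rollouts-plugins
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argo-rollouts
  podMetricsEndpoints:
    - port: plugin-metrics   # one entry per plugin port that should be scraped
```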
2. Runtime download of HTTP Artifacts or forking the Docker image
Users have a choice between http and file locations for plugin code. Neither of these is ideal:
HTTP-based Plugins
http plugin locations create a reliability risk for the user, since they depend on arbitrary HTTP endpoints that may be sporadically unavailable.
This method of importing binaries isn't idiomatic and doesn't play nicely with other ecosystem components:
Vulnerabilities in the plugin binary may slip past vulnerability scanning, since the binary is imported at runtime.
The plugin version is effectively specified in an arbitrary string. This makes it harder for automation (and engineers operating according to standard Helm/Docker practices) to identify that the version needs to be upgraded.
File-based Plugins
file plugin locations are difficult for users to implement, since they are responsible for placing these plugin binaries on the argo-rollouts filesystem.
Idiomatically, the user would maintain a Docker build pipeline that uses the open-source argo-rollouts image as a base and adds the requisite plugins to the filesystem.
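For reference, the two location styles are configured roughly like this in the argo-rollouts ConfigMap; the plugin names, URL, checksum, and path below are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-rollouts-config        # read by the controller in its own namespace
data:
  trafficRouterPlugins: |-
    # http location: the controller downloads the binary at startup
    - name: "argoproj-labs/sample-traffic-router"
      location: "https://example.com/releases/v0.0.1/traffic-plugin-linux-amd64"
      sha256: "<checksum-of-the-binary>"   # optional integrity check
  metricProviderPlugins: |-
    # file location: the user must place the binary on the controller's filesystem,
    # e.g. via a derived Docker image
    - name: "argoproj-labs/sample-metric-provider"
      location: "file:///plugins/sample-metric-provider"
```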
Proposal
argo-rollouts should gradually migrate from launching go-plugins as processes inside the argo-rollouts container to separate sidecar processes managed by Kubernetes. This has a number of benefits:
Plugins can be distributed as standalone Docker images. This allows for idiomatic versioning and caching.
Separate plugin sidecars make the system easier to understand and debug for users:
Users can clearly identify which plugins are failing using standard Kubernetes tooling (e.g. viewing container exits, logs, etc.).
Users can identify what is/isn't being effectively monitored.
For example, some plugins may have a separate monitoring endpoint, which users can see a PodMonitor for.
Alternatively, some plugins may lack quality monitoring, which would be clear to users.
Similarly, because ports can be exposed directly via the K8s APIs, users can easily identify debug/pprof endpoints.
This design could be implemented using Helm templates. Effectively, a user would import a plugin chart whose core export is a library template; this template could then be invoked to add the sidecar configuration.
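A minimal sketch of that flow, assuming a hypothetical my-plugin library chart and values keys (nothing here is an existing chart's API):

```yaml
# templates/_sidecar.tpl in the hypothetical "my-plugin" library chart
{{- define "my-plugin.sidecar" -}}
- name: my-plugin
  image: "{{ .Values.myPlugin.image.repository }}:{{ .Values.myPlugin.image.tag }}"
  ports:
    - name: plugin-metrics
      containerPort: 9090   # assumed metrics/pprof port served by the plugin
{{- end }}
```

The user's argo-rollouts Deployment template would then splice the sidecar into its containers list with something like `{{ include "my-plugin.sidecar" . | nindent 8 }}`, so the plugin container is versioned, pulled, and restarted by Kubernetes like any other image.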
Alternatives Considered
The alternative is likely a combination of point fixes, one for each problem above:
1. Lack of Observability/Prometheus Metrics
Prometheus
Alternatively, go-plugin or argo-rollouts could be extended to provide a metrics interface between the main process and plugin components.
pprof
Each plugin could be expected to implement its own pprof endpoint.
The central argo-rollouts pprof endpoint could then proxy into the plugin endpoints.
2. Runtime download of HTTP Artifacts
[Option 1]: Mega Docker Image
The argo-rollouts project could have a central pipeline that exports a "mega" Docker image containing all "well-known" plugins for users to utilize. This would mitigate the risks of using HTTP directly.
[Option 2]: Image Volumes
As mentioned previously, once Image Volumes are GA, Argo Rollouts could move to using them as the default model for plugin distribution.
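For illustration, once image volumes are available in the cluster (they currently sit behind the Kubernetes ImageVolume feature gate), mounting a plugin image onto the controller could look roughly like the sketch below; the plugin image reference and mount path are placeholders, and the plugin's file location would then point at the mounted path.

```yaml
# Sketch: mounting a plugin's OCI image into the argo-rollouts pod via an image volume.
apiVersion: v1
kind: Pod                 # in practice, the pod template inside the argo-rollouts Deployment
metadata:
  name: argo-rollouts
spec:
  containers:
    - name: argo-rollouts
      image: quay.io/argoproj/argo-rollouts:v1.7.2          # example version
      volumeMounts:
        - name: sample-plugin
          mountPath: /plugins/sample-plugin                 # plugin location: file:///plugins/sample-plugin/...
          readOnly: true
  volumes:
    - name: sample-plugin
      image:                                                # ImageVolumeSource (requires the ImageVolume feature gate)
        reference: ghcr.io/example/sample-plugin:v0.0.1     # placeholder plugin image
        pullPolicy: IfNotPresent
```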
Alternatively, I think it makes sense to discuss abandoning go-plugin in favor of internalizing the plugins within the codebase. There are only a handful of maintained plugins, and it seems like there are significant benefits to having a singular codebase.