-
Notifications
You must be signed in to change notification settings - Fork 833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(logging): Update otel and tracing config #5291
Conversation
@@ -55,4 +55,5 @@ spec: | |||
tracingConfig: | |||
disable: {{ .Values.config.tracingConfig.disable }} | |||
otelExporterEndpoint: {{ .Values.config.tracingConfig.otelExporterEndpoint }} | |||
otelExporterProtocol: {{ .Values.config.tracingConfig.otelExporterProtocol }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RafalSkolasinski following this change, should we update any of the helm chart versions? I don't know how helm chart version updates are handled, or if there is a policy on updating them. Atm, for example, I've seen that the appVersion has nothing to do with the Seldon Core version. Happy to follow what you suggest here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The descriptions I've added here were in the current helm-charts but not in the sources that generate them via kustomize (i.e. this file). I've added them because otherwise those descriptions would have disappeared on re-generation via the Makefile.
@@ -9,14 +9,14 @@ exporters: | |||
const_labels: | |||
provider: seldon | |||
|
|||
logging: | |||
debug: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logger exporter is deprecated and replaced by debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. left minor comments / questions.
k8s/Makefile
Outdated
@@ -25,7 +25,7 @@ create-helm-charts: | |||
|
|||
.PHONY: create-yaml | |||
create-yaml: | |||
helm template seldon-core-v2-certs ./helm-charts/seldon-core-v2-certs | grep -v "namespace:" > yaml/certs.yaml | |||
helm template -n seldon-mesh seldon-core-v2-certs ./helm-charts/seldon-core-v2-certs | grep -v "namespace:" > yaml/certs.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to fix it to seldon-mesh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. However, the certs.yaml
file currently generated from the helm chart has seldon-mesh
, so not adding the namespace here would have changed lots of dnsNames from seldon-mesh
to default
on re-generation. I've simply added this here to avoid any changes to the generated certs.yaml
Perhaps @RafalSkolasinski could suggest a better way of dealing with this? (i.e. even it involves changing the current certs.yaml)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the grep -v
cmd to exclude the namespace from the crds as an attempt to make it generic. However I have not tested a deployment with these yaml manifests in any other namespace. I have not realised before that we have things such as dns settings with seldon-mesh
hardcoded inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I saw that; It's just that in the case of certs there's never any "namespace:" reference, but the template contains lots of {{ .Release.Namespace }}
references within the dnsNames
lists, and that gets replaced with either default
or whatever gets passed to helm template -n
when generating yamls.
Technically, anybody could generate a valid certs.yaml
for their own namespace name by running helm template
with the right -n [namespace]
. Not sure if there's a way of making the yaml generic here. Also, this is hardcoded only at the resulting yaml level, not in terms of the helm chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I wonder then if we should fix the namespace in all yaml manifests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I've checked. In all the other ones {{ .Release.Namespace }}
only appears as a value of the namespace
key, which is removed by the grep -v
, so the resulting yamls do not end up containing any hardcoded values. In that sense they are generic.
k8s/yaml/crds.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file generated from helm charts via Makefile (make create-yaml
)
- fixes dataflow engine logging errors by specifying the otel exporter protocol this is required because the `opentelemetry-java-instrumentation` library requires a http(s) URI for the otel collector endpoint, regardless of the actual protocol. As the default for us is grpc, explicitly set the OTEL_EXPORTER_OTLP_PROTOCOL environment variable on dataflow pods - add the OTEL_EXPORTED_OTLP_PROTOCOL key to the seldon-tracing configMap and update the operator, crds and helm charts to support getting the value for this key from tracing config, similarly to how OTEL_EXPORTER_OTLP_PROTOCOL is fetched - update configs for when running outside k8s to match those in k8s
The order in which dependencies are specified for Gradle matters, influencing the ordering in the classpath of the application. Here, the slf4j klogging plugin was introduced after the kafka streams dependency, leading to errors at runtime, with the kafka-streams logging facility not able to find a slf4j provider.
kustomize bases and crds now obtained via running `make manifests` in the operator dir
What this PR does / why we need it:
the updates are needed because the
opentelemetry-java-instrumentation
library requires a http(s) URI for the otel collector endpoint, regardless of the actual protocol. As the default for us is grpc, explicitly set the OTEL_EXPORTER_OTLP_PROTOCOL environment variable on dataflow podsfixes otel-collector config to remove deprecated jagger exporter (jagger now supports otel directly).
add the OTEL_EXPORTED_OTLP_PROTOCOL key to the seldon-tracing configMap and update the operator, crds and helm charts to support getting the value for this key from tracing config, similarly to how OTEL_EXPORTER_OTLP_PROTOCOL is fetched
update versions used by ansible for jagger and opentelemetry-operator
port 14250 no longer needs to be exposed under any config
fix dependency ordering for dataflow/gradle.
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Internal issue references:
Public issues:
Special notes for your reviewer: