Initial Commit of Basic Operator for Logstash #6404
Conversation
Force-pushed from ceaf097 to c9e8bab
run/e2e-tests
Force-pushed from c9e8bab to 5713bb4
@thbkrkr, @kaisecheng, @naemono, @barkbay I think this is ready for review now. There are a few things that are still in progress - the logstash controller still needs unit tests, but the basics are there, and I've added unit tests for the
Nice work! I left a few comments. But in general I think we could merge this already as a minimal viable version of Logstash support. The only thing that we should maybe decide is the API for the services/HTTP attribute in the CRD, because that is really messy to change once we have released it.
// a lot like `HTTPConfig`, but is applicable for more than just an HTTP endpoint, as logstash may need to
// be opened up for other services: beats, TCP, UDP, etc, inputs
// +kubebuilder:validation:Optional
HTTP commonv1.HTTPConfig `json:"http,omitempty"`
Should we have it then right from the start as `services`? It is harder to change once we have released a version.
Yes, will change to `[]LogstashService`.
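For context, a rough sketch of what the revised spec field could look like, reusing the LogstashService type discussed further down in this PR (the comments and field placement here are illustrative, not final):

```go
type LogstashSpec struct {
	// ... other spec fields elided ...

	// Services contains details of any additional Kubernetes Services to create
	// for Logstash: the metrics API, beats, TCP, UDP inputs, etc.
	// +kubebuilder:validation:Optional
	Services []LogstashService `json:"services,omitempty"`
}
```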
}

if len(errors) > 0 {
	return apierrors.NewInvalid(groupKind, a.Name, errors)
I wonder why we return early here instead of letting the defaultChecks run.
You are right. It should go through all checks.
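A minimal sketch of the fix being discussed, assuming each check returns a field.ErrorList that can be accumulated before deciding whether to return an Invalid error (the `checks` slice and receiver name are illustrative; `defaultChecks` is the one mentioned above):

```go
func (l *Logstash) validate() error {
	var errors field.ErrorList
	// run every check and accumulate the errors, instead of returning
	// early after the first group of checks
	for _, check := range append(defaultChecks, checks...) {
		errors = append(errors, check(l)...)
	}
	if len(errors) > 0 {
		return apierrors.NewInvalid(groupKind, l.Name, errors)
	}
	return nil
}
```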
@@ -33,6 +33,7 @@ var (
	// Due to bugfixes present in 7.14 that ECK depends on, this is the lowest version we support in Fleet mode.
	SupportedFleetModeAgentVersions = MinMaxVersion{Min: MustParse("7.14.0-SNAPSHOT"), Max: From(8, 99, 99)}
	SupportedMapsVersions           = MinMaxVersion{Min: From(7, 11, 0), Max: From(8, 99, 99)}
	SupportedLogstashVersions       = MinMaxVersion{Min: From(8, 6, 0), Max: From(8, 99, 99)}
Is there a technical reason to support Logstash only as of version 8.6.0?
Logstash needs to add some features to work well with the operator, and we have no plan to backport those features to 7.17.
}

result, err := results.Aggregate()
k8s.EmitErrorEvent(r.recorder, err, logstash, events.EventReconciliationError, "Reconciliation error: %v", err)
We should rename this function to MaybeEmitErrorEvent as it ignores nil errors.
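A sketch of the rename, keeping the call shape quoted above; the only behavioural point is that the nil-error case becomes an explicit, documented no-op (the exact ECK signature is assumed):

```go
// MaybeEmitErrorEvent emits a warning event for err on the given object.
// It is a no-op when err is nil.
func MaybeEmitErrorEvent(recorder record.EventRecorder, err error, obj runtime.Object, reason, messageFmt string, args ...interface{}) {
	if err == nil {
		return
	}
	recorder.Eventf(obj, corev1.EventTypeWarning, reason, messageFmt, args...)
}
```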
pkg/controller/logstash/reconcile.go
defer tracing.Span(&params.Context)()
results := reconciler.NewResult(params.Context)

s, _ := sset.New(sset.Params{
`error` ignored. I think the `error` is actually always `nil`; let's just remove the return value?
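In other words, something along these lines (a sketch, not the actual diff), dropping the error from the signature since nothing in the builder can fail:

```go
// New builds the StatefulSet for Logstash from the given parameters.
func New(params Params) appsv1.StatefulSet {
	statefulSet := appsv1.StatefulSet{
		// ... ObjectMeta and Spec built from params, elided ...
	}
	// store a hash of the sset resource in its labels for comparison purposes
	statefulSet.Labels = hash.SetTemplateHashLabel(statefulSet.Labels, statefulSet.Spec)
	return statefulSet
}
```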
pkg/controller/logstash/sset/sset.go
// store a hash of the sset resource in its labels for comparison purposes
sset.Labels = hash.SetTemplateHashLabel(sset.Labels, sset.Spec)

return sset, nil
`error` is always `nil`.

	"github.com/elastic/cloud-on-k8s/v2/test/e2e/test"
)

type APIError struct {
Do we need the custom error type? I know we introduced something similar for Maps but it seems unnecessary there as well.
we don't need this
PodManagementPolicy:  appsv1.ParallelPodManagement,
RevisionHistoryLimit: params.RevisionHistoryLimit,
// build a headless service per StatefulSet, matching the StatefulSet labels
ServiceName: params.ServiceName,
The service referenced here is not a headless service but the HTTP service for Logstash, which has a ClusterIP. This means the individual Logstash Pods are not resolvable via DNS (because that service always resolves to the ClusterIP of the service, and the routing then happens with the iptables magic of kube-proxy).
The K8s docs ask for a headless service for each StatefulSet. The thing I am not sure about is whether we need it if we don't need the ability to resolve the individual Logstash Pods via DNS.
I changed it to a headless service following the K8s docs. The default service in Logstash is only for metrics collection; load balancing with iptables sounds like more than it needs.
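For reference, a headless Service is an ordinary Service with ClusterIP set to None; a sketch of what the per-StatefulSet service could look like (the name suffix and label keys are assumptions):

```go
func headlessService(logstashName, namespace string, labels map[string]string) corev1.Service {
	return corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      logstashName + "-ls", // assumed naming convention
			Namespace: namespace,
			Labels:    labels,
		},
		Spec: corev1.ServiceSpec{
			// headless: no virtual IP, DNS resolves to the individual Pod IPs
			ClusterIP: corev1.ClusterIPNone,
			Selector:  labels,
			Ports: []corev1.ServicePort{
				{Name: "stats", Port: 9600, Protocol: corev1.ProtocolTCP},
			},
		},
	}
}
```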
pkg/apis/logstash/v1alpha1/name.go
)

const (
	defaultServiceSuffix = "default"
What do you think about using the name "api" rather than "default", to reflect that this service is for the Logstash API? So this would show up as logstash-sample-ls-api, rather than logstash-sample-ls-default?
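A sketch of what the rename could look like in name.go, assuming it follows the namer pattern implied by DefaultServiceName (the helper names are illustrative):

```go
const apiServiceSuffix = "api"

// APIServiceName returns the name of the Service exposing the Logstash API,
// e.g. "logstash-sample-ls-api" for a Logstash named "logstash-sample".
func APIServiceName(logstashName string) string {
	return Namer.Suffix(logstashName, apiServiceSuffix)
}
```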
I realise additional commits were pushed since you requested my review. Some of my comments might thus be outdated already.
// +kubebuilder:validation:Optional
ConfigRef *commonv1.ConfigSource `json:"configRef,omitempty"`

// HTTP holds the HTTP layer configuration for the Logstash Metrics API
comment out of sync with code
Fixed
services:
  - name: metrics
    service:
      spec:
        type: ClusterIP
        ports:
          - port: 9600
            name: "stats"
            protocol: TCP
            targetPort: 9600
  - name: beats
    service:
      spec:
        type: ClusterIP
        ports:
          - port: 5044
            name: "beats"
            protocol: TCP
            targetPort: 5044
Could this not be a single service with two port declarations?
services:
  - name: ???
    service:
      spec:
        type: ClusterIP
        ports:
          - port: 9600
            name: "stats"
            protocol: TCP
            targetPort: 9600
          - port: 5044
            name: "beats"
            protocol: TCP
            targetPort: 5044
Is there still a use case for multiple services in Logstash, e.g. one ClusterIP service and one with type LoadBalancer? I am not sure, but maybe the services plural does not hurt. But I wonder if we need the naming?
services:
  - service:
      spec:
        type: ClusterIP
        ports:
          - port: 9600
            name: "stats"
            protocol: TCP
            targetPort: 9600
          - port: 5044
            name: "beats"
            protocol: TCP
            targetPort: 5044
It does feel a little redundant, but if we removed the name from the spec, we'd need to define our own naming convention - would you suggest a simple numbered scheme for each service, e.g. logstash-sample-ls-0?
I do think we may need the flexibility to define different service types, given the variety of plugins we support, and the use case of supporting multiple of those in a single logstash config.
That's a good point. I guess an ordinal system would be possible, but there is also a metadata section in the service template that potentially allows users to specify a name (even though the current docs say we will ignore it), so we have another redundancy. If it is easier to stick with the name attribute, I think I would be OK with that too.
services:
  - service:
      metadata:
        name: "stats"
      spec:
        type: ClusterIP
        ports:
          - port: 9600
            # N.B. https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/
            name: "http-stats"
            protocol: TCP
            targetPort: 9600
On reflection, I think the name scheme works better for us than an ordinal scheme would - it provides context when viewing service details, and also allows the rules for the default API service to be overridden if necessary. I'd be happy to use the metadata method if that is more idiomatic, but I'd also like to be consistent with the rest of ECK wherever possible, and having a special case where metadata is meaningful seems like it might be a cause for confusion.
}

type LogstashService struct {
	Name string `json:"name,omitempty"`
Missing comment. But I left a comment regarding the API up top at the sample to remove this altogether.
pkg/controller/logstash/reconcile.go
s := sset.New(sset.Params{
	Name:        logstashv1alpha1.Name(params.Logstash.Name),
	Namespace:   params.Logstash.Namespace,
	ServiceName: logstashv1alpha1.DefaultServiceName(params.Logstash.Name),
You are setting the default service name here but you are only creating the default service if no custom service is defined.
// Service defines the template for the associated Kubernetes Service object.
Service commonv1.ServiceTemplate `json:"service,omitempty"`
// TLS defines options for configuring TLS for HTTP.
TLS commonv1.TLSOptions `json:"tls,omitempty"`
This seems unused? Is there a plan to implement certificate management for Logstash?
Yes, we would like to add support for TLS for the metrics API in a follow-up commit.
defer tracing.Span(&params.Context)()
results := reconciler.NewResult(params.Context)

_, err := reconcileServices(params)
You are ignoring the return value here. Can it simply be removed from the function?
It is for a follow-up commit for TLS support.
@kaisecheng or it could be removed and added in the upcoming PR that adds TLS support? Or at least comment that section to let reviewers know what it is for?
We will add TLS support in the near future, so I prefer to keep it. I can add a TODO to explain.
pkg/controller/logstash/service.go
// If api.http.port is customized, the user is expected to configure Services.
// When Services exist, port 9600 is not attached to any Service.
func reconcileServices(params Params) ([]corev1.Service, error) {
	if len(params.Logstash.Spec.Services) == 0 {
Rather than checking if the number of services is 0, we should probably check whether a service has been spec'd with the same name as the default service - I think it may make sense to rename this to api, or internal, rather than default, WDYT?
We can make a default headless service regardless of user input, similar to ES:

NAME                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
elasticsearch-sample-es-default         ClusterIP   None            <none>        9200/TCP   17h
elasticsearch-sample-es-http            ClusterIP   10.98.229.240   <none>        9200/TCP   17h
elasticsearch-sample-es-internal-http   ClusterIP   10.96.255.68    <none>        9200/TCP   17h
elasticsearch-sample-es-transport       ClusterIP   None            <none>        9300/TCP   17h

For a user-defined service, name it with the suffix api. If the user doesn't provide input, create a ClusterIP service with the suffix http for metrics.
I was wondering if we could call the service for metrics api, internal or http, regardless of whether they have set values in the spec. And if they define a service with the same name, it overrides the default settings, allowing the user to change the cluster type from the default if they need to - say if they have an external monitoring tool that reads from the logstash API.
Yes. Using the name as an identifier to override default settings (e.g. port) sounds good.
My concern is this comment:
"K8s docs ask for a headless service for each stateful set"
Regardless of the user's settings in the spec, I think we can always create a headless service for Logstash.
Yes, I agree - I do think we need to always create a headless service
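Putting the two points together, a sketch of the reconcile flow agreed on above (helper names are placeholders, not the final code): always create the headless service, and only add the built-in API service when the user has not defined one with the same name:

```go
func reconcileServices(params Params) ([]corev1.Service, error) {
	// the headless service backing the StatefulSet is always created
	services := []corev1.Service{headlessService(params)}

	userDefinesAPIService := false
	for _, userSvc := range params.Logstash.Spec.Services {
		if userSvc.Name == apiServiceSuffix {
			// the user overrides the default API service settings
			userDefinesAPIService = true
		}
		services = append(services, buildUserService(params, userSvc))
	}
	if !userDefinesAPIService {
		// default ClusterIP service exposing the metrics API on port 9600
		services = append(services, defaultAPIService(params))
	}
	// ... create or update each Service against the API server, elided ...
	return services, nil
}
```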
Initial commit of a simple operator. The first operator will create:
* A Service exposing the logstash metrics API, so it can be used for monitoring purposes
* Secrets holding logstash.yml
* A StatefulSet deploying the logstash pods
* Pods with default liveness and readiness probes

The sample logstash yml file as located in config/samples/logstash/logstash.yaml will create:

```
✗ kubectl tree logstash logstash-sample
NAMESPACE  NAME                                                  READY  REASON  AGE
default    Logstash/logstash-sample                              -              4m5s
default    ├─Secret/logstash-sample-ls-config                    -              4m4s
default    ├─Service/logstash-sample-ls-http                     -              4m5s
default    │ └─EndpointSlice/logstash-sample-ls-http-kwfsg       -              4m5s
default    └─StatefulSet/logstash-sample-ls                      -              4m4s
default      ├─ControllerRevision/logstash-sample-ls-79bd59f869  -              4m4s
default      ├─Pod/logstash-sample-ls-0                          True           3m59s
default      ├─Pod/logstash-sample-ls-1                          True           3m59s
default      └─Pod/logstash-sample-ls-2                          True           3m59s
```

And shows status:

```
✗ kubectl get logstash logstash-sample
NAME              AVAILABLE   EXPECTED   AGE    VERSION
logstash-sample   3           3          9m1s   8.6.1
```

Still TODO:
* Unit Tests
* End to end Tests
* Certificate handling on the HTTP Metrics API

Tidy up

Co-authored-by: Kaise Cheng <[email protected]>
Rudimentary tests for validation and naming
Update service name verification to `default` from `http`
improve comment Co-authored-by: Peter Brachwitz <[email protected]>
This commit adds support to enable users to specify a `default` service to override the provided defaults. This commit also ensures that the `default` service is added if other services are defined
The default service represents the logstash API, so this commit updates the service name and associated methods to reflect that. This should also be more intuitive for users who wish to change the settings for the service.
Force-pushed from 1a06bb6 to 6167cb7
@pebrc This should be ready for another round. Thanks!
Do we also want to update:

- The memory aggregator:

  cloud-on-k8s/pkg/license/aggregator.go
  Lines 38 to 56 in 48d6eed

  // AggregateMemory aggregates the total memory of all Elastic managed components
  func (a Aggregator) AggregateMemory(ctx context.Context) (resource.Quantity, error) {
  	var totalMemory resource.Quantity
  	for _, f := range []aggregate{
  		a.aggregateElasticsearchMemory,
  		a.aggregateKibanaMemory,
  		a.aggregateApmServerMemory,
  		a.aggregateEnterpriseSearchMemory,
  	} {
  		memory, err := f(ctx)
  		if err != nil {
  			return resource.Quantity{}, err
  		}
  		totalMemory.Add(memory)
  	}
  	return totalMemory, nil
  }

- The telemetry reporter:

  cloud-on-k8s/pkg/telemetry/telemetry.go
  Lines 112 to 135 in 48d6eed

  func (r *Reporter) getResourceStats(ctx context.Context) (map[string]interface{}, error) {
  	span, _ := apm.StartSpan(ctx, "get_resource_stats", tracing.SpanTypeApp)
  	defer span.End()
  	stats := map[string]interface{}{}
  	for _, f := range []getStatsFn{
  		esStats, kbStats, apmStats, beatStats, entStats, agentStats, mapsStats, scpStats,
  	} {
  		key, statsPart, err := f(r.client, r.managedNamespaces)
  		if err != nil {
  			return nil, err
  		}
  		stats[key] = statsPart
  	}
  	return stats, nil
  }

?
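If we do, the wiring itself would be small, since both call sites are plain slices; a sketch with hypothetical aggregateLogstashMemory and lsStats functions that would follow the same per-application pattern as the existing entries:

```go
// pkg/license/aggregator.go (sketch)
for _, f := range []aggregate{
	a.aggregateElasticsearchMemory,
	a.aggregateKibanaMemory,
	a.aggregateApmServerMemory,
	a.aggregateEnterpriseSearchMemory,
	a.aggregateLogstashMemory, // hypothetical: memory requested by Logstash StatefulSets
} {
	// ... unchanged ...
}

// pkg/telemetry/telemetry.go (sketch)
for _, f := range []getStatsFn{
	esStats, kbStats, apmStats, beatStats,
	entStats, agentStats, mapsStats, scpStats,
	lsStats, // hypothetical: Logstash resource counts per namespace
} {
	// ... unchanged ...
}
```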
UpdateStrategy: appsv1.StatefulSetUpdateStrategy{
	Type: appsv1.RollingUpdateStatefulSetStrategyType,
},
// we don't care much about pods creation ordering, and manage deletion ordering ourselves,
[...] and manage deletion ordering ourselves
IIUC deletion is managed by the sts controller (which raises the question of whether we want to expose RollingUpdateStatefulSetStrategy, can be discussed in a separate issue/PR)
I agree - I think we will want to expose this at a later date
)

// LogstashSpec defines the desired state of Logstash
type LogstashSpec struct {
Is there a plan to add a reference to an Elasticsearch resource?
Yes, there will be a follow-up PR to add support for Elasticsearch references.
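For completeness, a sketch of what that reference might look like, mirroring how other ECK resources point at an Elasticsearch cluster (the field placement is an assumption for the follow-up PR):

```go
type LogstashSpec struct {
	// ElasticsearchRef is a reference to the Elasticsearch cluster Logstash should connect to.
	// +kubebuilder:validation:Optional
	ElasticsearchRef commonv1.ObjectSelector `json:"elasticsearchRef,omitempty"`

	// ... remaining spec fields elided ...
}
```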
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
The purpose of this Elasticsearch resource is not clear.
Will remove
Done
@@ -30,6 +30,9 @@ crds:
  - name: stackconfigpolicies.stackconfigpolicy.k8s.elastic.co
    displayName: Elastic Stack Config Policy
    description: Elastic Stack Config Policy
  - name: logstashes.logstash.k8s.elastic.co
We may also want to update hack/operatorhub/templates/csv.tpl
Done
Co-authored-by: Michael Morello <[email protected]>
Co-authored-by: Michael Morello <[email protected]>
@barkbay Good question - does this value factor into licensing? I think this might be a good candidate for a follow-on PR, once we have a definitive answer on how Logstash fits into licensing in general. cc @flexitrev
Yes - we have plans to add Telemetry for the logstash operator in a follow-up PR.
@barkbay Thanks for the comments - I think I addressed the points from your review.
Overall LGTM, a few other nits:

- Do we plan to add some unit tests? There are none at the moment in the controller package. I feel like it would make sense for functions like reconcileServices or buildPodTemplate.
- Other ECK resources have a Health field in their status. I was wondering if it would make sense to add one for Logstash. The status already holds the number of availableNodes; I don't know if we could offer more to help the user understand the Logstash resource health? (See the sketch after this list.)
- Should we already open an issue for each feature/task that is still to be done or discussed? A few that I have in mind:
  - Adding the Elasticsearch reference
  - Exposing RollingUpdateStatefulSetStrategy
  - Implementing TLS
  - Updating the documentation
  - Discussing telemetry and usage report
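Regarding the Health suggestion above, a sketch of how a status Health field might look if Logstash follows the other ECK resources (the LogstashHealth type and its values are assumptions; AvailableNodes and ExpectedNodes match the columns already shown by kubectl get logstash):

```go
// LogstashHealth expresses the overall health of the Logstash resource.
type LogstashHealth string

const (
	LogstashRedHealth   LogstashHealth = "red"   // no Pods available
	LogstashGreenHealth LogstashHealth = "green" // all expected Pods available
)

// LogstashStatus defines the observed state of Logstash.
type LogstashStatus struct {
	// AvailableNodes is the number of available Logstash instances.
	AvailableNodes int32 `json:"availableNodes,omitempty"`
	// ExpectedNodes is the number of expected Logstash instances.
	ExpectedNodes int32 `json:"expectedNodes,omitempty"`
	// Health reflects the observed health of the Logstash resource.
	Health LogstashHealth `json:"health,omitempty"`
}
```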
@barkbay Unit tests is a good point - I'll add them. And I'll take a look at the health and see what makes sense for Logstash. As for follow-up tasks - we have a plan to use this PR as a building block for additional functionality. The additional functionality includes stack monitoring, better support for handling logstash pipelines, Elasticsearch references, TLS support for monitoring, etc. See https://github.com/elastic/ingest-dev/issues/1441 for the meta issue.
@barkbay - given the LGTM, would you have any objections to having this committed, and adding unit tests later today/early tomorrow (especially as this PR is targeted to a
LGTM!
Initial commit of a simple operator.

The first operator will create:
- A Service exposing the logstash metrics API, so it can be used for monitoring purposes
- Secrets holding logstash.yml
- A StatefulSet deploying the logstash pods
- Pods with default liveness and readiness probes

The sample logstash yml file as located in config/samples/logstash/logstash.yaml will create:

And shows status:

This PR also includes basic tests:
- Tests in pkg/apis
- Addition of Logstash to the 8.x TestSamples and TestVersionUpgradeOrdering in the test suites.

Still todo:

Co-authored-by: Kaise Cheng [email protected]