Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack monitoring for Logstash #6436

Merged

Conversation

kaisecheng
Copy link
Contributor

@kaisecheng kaisecheng commented Feb 21, 2023

This commit needs to rebase onto #6404

Following the prior art, the PR integrates stack monitoring for Logstash by adding metricsbeat and filebeat as sidecars to deliver metrics and log to monitoring es cluster

There is a sample resource in config/samples/logstash/logstash_stackmonitor.yaml to deploy and test. It will create

> kubectl tree logstash logstash-sample
NAMESPACE  NAME                                                          READY  REASON  AGE
default    Logstash/logstash-sample                                      -              18m
default    ├─Secret/logstash-sample-default-monitoring-beat-ls-mon-user  -              11m
default    ├─Secret/logstash-sample-ls-config                            -              11m
default    ├─Secret/logstash-sample-ls-monitoring-default-monitoring-ca  -              11m
default    ├─Secret/logstash-sample-ls-monitoring-filebeat-config        -              11m
default    ├─Secret/logstash-sample-ls-monitoring-metricbeat-config      -              11m
default    ├─Service/logstash-sample-ls-api.                             -              11m
default    │ └─EndpointSlice/logstash-sample-ls-http-wfxfs               -              11m
default    └─StatefulSet/logstash-sample-ls                              -              11m
default      ├─ControllerRevision/logstash-sample-ls-5c5dc87d74          -              11m
default      ├─ControllerRevision/logstash-sample-ls-7844b4db48          -              11m
default      ├─ControllerRevision/logstash-sample-ls-84fc6d5f87          -              11m
default      └─Pod/logstash-sample-ls-0                                  True           10m

And show status:

> kubectl get logstash logstash-sample
NAME              AVAILABLE   EXPECTED   AGE   VERSION
logstash-sample   1           1          35m   8.6.1

This PR includes tests

  • e2e TestLogstashStackMonitoring
  • unit test sidecar_test.go
  • webhook test

Still todo:

  • integrate metrics API with https and username password
  • handle monitor that is from external ES cluster

robbavey and others added 26 commits February 15, 2023 13:26
Initial commit of a simple operator.

The first operator will create:
* A Service exposing the logstash metrics API, so it can be used for monitoring purposes
* Secrets holding logstash.yml
* A StatefulSet deploying the logstash pods
* Pods with default liveness and readiness probes

The sample logstash yml file as located in config/samples/logstash/logstash.yaml will
create:

```
✗ kubectl tree logstash logstash-sample
NAMESPACE  NAME                                                  READY  REASON  AGE
default    Logstash/logstash-sample                              -              4m5s
default    ├─Secret/logstash-sample-ls-config                    -              4m4s
default    ├─Service/logstash-sample-ls-http                     -              4m5s
default    │ └─EndpointSlice/logstash-sample-ls-http-kwfsg       -              4m5s
default    └─StatefulSet/logstash-sample-ls                      -              4m4s
default      ├─ControllerRevision/logstash-sample-ls-79bd59f869  -              4m4s
default      ├─Pod/logstash-sample-ls-0                          True           3m59s
default      ├─Pod/logstash-sample-ls-1                          True           3m59s
default      └─Pod/logstash-sample-ls-2                          True           3m59s
```

And shows status:

```
✗ kubectl get logstash logstash-sample
NAME              AVAILABLE   EXPECTED   AGE    VERSION
logstash-sample   3           3          9m1s   8.6.1
```

Still TODO:

* Unit Tests
* End to end Tests
* Certificate handling on the HTTP Metrics API

Tidy up

Co-authored-by: Kaise Cheng <[email protected]>
Rudimentary tests for validation and naming
This reverts commit 8016b7b.

Remving e2e tests to get e2e tests prior to logstash working
This reverts commit b5c775e.

And re-adds the e2e tests
Add a test to make sure that Logstash is running, by testing the metrics API endpoint
and checking that the value of status is "green"
This commit fixes the readiness probe, by setting the default visibility of
the logstash API to 0.0.0.0, allowing access to the metrics API to the probe
@botelastic botelastic bot added the triage label Feb 21, 2023
@thbkrkr thbkrkr added the >enhancement Enhancement of existing functionality label Feb 21, 2023
@botelastic botelastic bot removed the triage label Feb 21, 2023
- update min version to 8.7.0
- update sample
- update test
- remove log4j2 issue doc
@kaisecheng
Copy link
Contributor Author

@barkbay
The issue of logging in upstream Logstash is fixed and backported to 8.7.0.
I have refactored the version validation, updated tests to fulfill min version requirement, updated the sample, and removed comment/todo/doc. It is ready for review.

@kaisecheng kaisecheng requested review from barkbay and robbavey March 9, 2023 12:34
name: logstash-sample
spec:
count: 1
version: 8.7.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this causes the TestSample e2e test to fail, due to the lack of 8.7.0 docker image for logstash

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I only reviewed the e2e test in this review. I'm still running a few other tests and I think I will ✅ soon.

@@ -55,13 +55,13 @@ func Validate(resource monitoring.HasMonitoring, version string) field.ErrorList
}

// IsSupportedVersion returns true if the resource version is supported for Stack Monitoring, else returns false
func IsSupportedVersion(v string) error {
func IsSupportedVersion(v string, minVersion version.Version) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not from your PR: the godoc is wrong, this function returns an error not a boolean.

test/e2e/logstash/stack_monitoring_test.go Outdated Show resolved Hide resolved

// checks that the sidecar beats have sent data in the monitoring clusters
steps := func(k *test.K8sClient) test.StepList {
return checks.MonitoredSteps(&monitored, k)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that CheckBeatSidecars will be a no-op as it implicitly creates a list of Elasticsearch Pods only. It might be a good opportunity to either make it more generic or rename it CheckBeatSidecarsInElasticsearch:

diff --git a/test/e2e/test/checks/monitoring.go b/test/e2e/test/checks/monitoring.go
index 604e753cf..313935676 100644
--- a/test/e2e/test/checks/monitoring.go
+++ b/test/e2e/test/checks/monitoring.go
@@ -58,7 +58,7 @@ func (c stackMonitoringChecks) Steps() test.StepList {
        }
 }
 
-func (c stackMonitoringChecks) CheckBeatSidecars() test.Step {
+func (c stackMonitoringChecks) CheckBeatSidecarsInElasticsearch() test.Step {
        return test.Step{
                Name: "Check that beat sidecars are running",
                Test: test.Eventually(func() error {

test/e2e/test/logstash/builder.go Outdated Show resolved Hide resolved
test/e2e/test/logstash/builder.go Outdated Show resolved Hide resolved

if (logstash.Status.ExpectedNodes != expected.ExpectedNodes) ||
(logstash.Status.AvailableNodes != expected.AvailableNodes) ||
(logstash.Status.Version != expected.Version) {
return fmt.Errorf("expected status %+v but got %+v", expected, logstash.Status)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check monitoringAssociationStatus.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really, this check doesn't involve monitoring

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, it is involved in all cases, whether monitoring is activated or not? I was thinking about something along these lines:

func CheckStatus(b Builder, k *test.K8sClient) test.Step {
	return test.Step{
		Name: "Logstash status should have the correct status",
		Test: test.Eventually(func() error {
....
			expectedAssociationInStatus := len(logstash.Spec.Monitoring.Metrics.ElasticsearchRefs) + len(logstash.Spec.Monitoring.Metrics.ElasticsearchRefs)
			actualAssociationInStatus := len(logstash.Status.MonitoringAssociationStatus)
			if expectedAssociationInStatus != actualAssociationInStatus {
				return fmt.Errorf("expected %d associations in status but got %d", expectedAssociationInStatus, actualAssociationInStatus)
			}
			for a, s := range logstash.Status.MonitoringAssociationStatus {
				if s != v1.AssociationEstablished {
					return fmt.Errorf("association %s has status %s ", a, s)
				}
			}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. You are right. Added it to status check. Thank you!

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are more than 1 monitoring cluster referenced in metrics then the status states that all the associations are Established, while this configuration is not supported (and therefore, nothing is actually deployed):

apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
  annotations:
    association.k8s.elastic.co/es-conf-272414756: '{....}'
    association.k8s.elastic.co/es-conf-3519008756: '{....}'
    association.k8s.elastic.co/es-conf-4222863958: '{....}'
  creationTimestamp: "2023-03-21T08:07:05Z"
  generation: 1
  labels:
    test-name: test-ls-mon-a
  name: test-ls-mon-a-5ptv
  namespace: e2e-mercury
  resourceVersion: "757992"
  uid: 74383387-d449-459f-a3fb-07279bef0368
spec:
  count: 1
  monitoring:
    logs:
      elasticsearchRefs:
      - name: test-ls-mon-logs-lp4x
        namespace: e2e-mercury
    metrics:
      elasticsearchRefs:
      - name: test-ls-mon-metrics1-rr2c
        namespace: e2e-mercury
      - name: test-ls-mon-metrics2-zmvs
        namespace: e2e-mercury
  podTemplate: {}
  version: 8.7.0-SNAPSHOT
status:
  monitoringAssociationStatus:
    e2e-mercury/test-ls-mon-logs-lp4x: Established
    e2e-mercury/test-ls-mon-metrics1-rr2c: Established
    e2e-mercury/test-ls-mon-metrics2-zmvs: Established
  observedGeneration: 1

I'm not sure this comes from your PR. Would you mind opening an issue if it's not the case as it seems wrong and misleading? Note that I was not running the webhook, which I think would have prevented this from happening, still looks wrong anyway.

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM, left another comment on the tests.

pkg/apis/logstash/v1alpha1/validations.go Outdated Show resolved Hide resolved
pkg/controller/logstash/driver.go Outdated Show resolved Hide resolved
@@ -10,6 +10,9 @@ import (
"k8s.io/apimachinery/pkg/util/rand"
"sigs.k8s.io/controller-runtime/pkg/client"

lsctl "github.com/elastic/cloud-on-k8s/v2/pkg/controller/logstash"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Imports formatting seems off

commonv1.TypeLabelName: label.Type,
label.ClusterNameLabelName: esName,
commonv1.TypeLabelName: typeLabel,
label.ClusterNameLabelName: name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will prevent the Logstash Pods from being listed (and any other Pod other than ES ones I guess) because they do not have a ClusterNameLabelName label.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the changes of CheckBeatSidecars and renamed it as suggested

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kaisecheng
Copy link
Contributor Author

I'm not sure this comes from your PR. Would you mind opening an issue if it's not the case as it seems wrong and misleading? Note that I was not running the webhook, which I think would have prevented this from happening, still looks wrong anyway.

@barkbay Create an issue for tracking #6558. I could not replicate the problem locally. Please update the issue if you have steps to reproduce it.

@kaisecheng kaisecheng merged commit fa2b9e5 into elastic:feature/logstash Mar 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement Enhancement of existing functionality :logstash
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants