Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[metricbeat] [helper] Fix http server helper SSL config #39405

Merged
merged 14 commits into from
Oct 7, 2024

Conversation

gpop63
Copy link
Contributor

@gpop63 gpop63 commented May 4, 2024

Overview

The TLS config used in the metricbeat http server module uses a TLS config suitable for clients (BuildModuleClientConfig), not for servers. This makes it impossible to successfully connect to Prometheus remote_write using TLS since bad certificate is always returned.

Made some small unrelated changes to make golangci-lint happy. Added a reasonable ReadHeaderTimeout of 10 seconds. Replaced ioutil with io.

This was tested with openssl s_client and also directly with the remote_write data stream.

---
Post-Handshake New Session Ticket arrived:
SSL-Session:
Protocol : TLSv1.3
Cipher : TLS_AES_128_GCM_SHA256
Session-ID: 39C678F0ED9BBBFD657458330B59C50D3A9461C06B7727F5AAF667882F41BE5D
Session-ID-ctx:
Resumption PSK: 5C4509D103373B8036FE089AD2F79BFDC411A2B5C1439CE5852AF8F1184A9EA5
PSK identity: None
PSK identity hint: None
SRP username: None
TLS session ticket lifetime hint: 604800 (seconds)
TLS session ticket:
0000 - d6 b1 2b 3f f0 69 ea 9a-68 e1 0c e4 24 87 e3 ec ..+?.i..h...$...
0010 - 64 ff 69 a4 4d 2e 88 52-03 64 52 63 91 bc 30 04 d.i.M..R.dRc..0.
0020 - 49 55 1d 1a 79 6c 2b 86-54 46 ae c0 58 95 07 d7 IU..yl+.TF..X...
0030 - c3 b4 f4 83 8c 76 c7 8a-19 5b f6 da 8a b7 55 7f .....v...[....U.
0040 - 42 52 e2 71 17 f1 9a 76-3a f3 1e 29 b5 6a e4 1e BR.q...v:..).j..
0050 - b7 ff 89 4d ef 34 ba 6d-9f 9f df 7a 86 96 c3 59 ...M.4.m...z...Y
0060 - 53 81 5a 17 c2 f6 7f a7-55 4d b1 94 d8 0f a7 69 S.Z.....UM.....i
0070 - c3 .

Start Time: 1712569088
Timeout : 7200 (sec)
Verify return code: 0 (ok)
Extended master secret: no
Max Early Data: 0
---
read R BLOCK
800B0789E27F0000:error:0A000412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:ssl/record/rec_layer_s3.c:1600:SSL alert number 42

The NewHttpServer is also used by the http module.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label May 4, 2024
@gpop63 gpop63 self-assigned this May 4, 2024
Copy link
Contributor

mergify bot commented May 4, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @gpop63? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@gpop63 gpop63 changed the title [metricbeat] [prometheus] Fix http server helper SSL config [metricbeat] [helper] Fix http server helper SSL config May 4, 2024
@elasticmachine
Copy link
Collaborator

elasticmachine commented May 4, 2024

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2024-05-04T21:14:57.950+0000

  • Duration: 160 min 31 sec

Test stats 🧪

Test Results
Failed 0
Passed 8614
Skipped 1239
Total 9853

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@gpop63
Copy link
Contributor Author

gpop63 commented May 14, 2024

How I tested this:

Prerequisites:

  • Prometheus (locally or docker)
  • Certificates
  • Some prometheus metrics
    • I am using a simple go app to create fake data

Generate certificates:

openssl.cnf

[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req

[req_distinguished_name]
commonName = Common Name
commonName_default = localhost

[v3_req]
subjectAltName = @alt_names

[alt_names]
IP.1 = 0.0.0.0
DNS.1 = localhost

gencerts.sh

mkdir -p certs

OPENSSL_SUBJ="/C=US/ST=California/L=Santa Clara"
OPENSSL_CA="${OPENSSL_SUBJ}/CN=fake-CA"
OPENSSL_SERVER="${OPENSSL_SUBJ}/CN=fake-server"

sh ./genroot.sh "${OPENSSL_CA}"
sh ./genserver.sh "${OPENSSL_SERVER}"

genroot.sh

OPENSSL_ROOT_CA=$1

docker run --rm -v $PWD/certs:/certs -it nginx \
    openssl genrsa 2048 > certs/root-ca-key.pem

docker run --rm -v $PWD/certs:/certs -it nginx \
    openssl req -new -x509 -nodes -days 3600 \
        -subj "${OPENSSL_ROOT_CA}" \
        -key /certs/root-ca-key.pem -out /certs/root-ca.pem

genserver.sh

OPENSSL_SERVER=$1

docker run --rm -v $PWD/certs:/certs -v $PWD/openssl.cnf:/openssl.cnf -it nginx \
    openssl req -newkey rsa:2048 -days 3600 -nodes \
        -subj "${OPENSSL_SERVER}" \
        -keyout /certs/server-key.pem -out /certs/server-req.pem \
        -config /openssl.cnf

docker run --rm -v $PWD/certs:/certs -it nginx \
    openssl rsa -in /certs/server-key.pem -out /certs/server-key.pem

docker run --rm -v $PWD/certs:/certs -v $PWD/openssl.cnf:/openssl.cnf -it nginx \
    openssl x509 -req -in /certs/server-req.pem -days 3600 \
        -CA /certs/root-ca.pem -CAkey /certs/root-ca-key.pem \
        -set_serial 01 -out /certs/server-cert.pem \
        -extensions v3_req -extfile /openssl.cnf

docker run --rm -v $PWD/certs:/certs -it nginx \
    openssl verify -CAfile /certs/root-ca.pem /certs/server-cert.pem

Execute sh ./gencerts.sh.


Prometheus:

Run it with the config below.

prometheus.yml

global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'fake'
    static_configs:
      - targets: ['localhost:8080']
remote_write:
  - url: "https://0.0.0.0:9201/write"
    tls_config:
        ca_file: /path/to/certs/root-ca.pem
        #cert_file: /path/to/certs/server-cert.pem
        #key_file: /path/to/certs/server-key.pem

Golang app to create some fake prometheus metrics:

main.go

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	fakeMetric = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "fake_metric_total",
			Help: "This is a fake metric.",
		},
	)
)

func init() {
	prometheus.MustRegister(fakeMetric)
}

func main() {
	http.HandleFunc("/increment", func(w http.ResponseWriter, r *http.Request) {
		fakeMetric.Inc()
	})

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}

Run the go app and make a few curl requests curl http://localhost:8080/increment (prometheus should be running).


Metricbeat

Start metricbeat with this remote_write metricset config in x-pack/beats.

- module: prometheus
  metricsets: ["remote_write"]
  host: "localhost"
  port: "9201"

  ssl.enabled: true
  # ssl.certificate_authorities: ["/path/to/certs/root-ca.pem"]
  ssl.certificate: "/path/to/certs/server-cert.pem"
  ssl.key: "/path/to/certs/server-key.pem"

In prometheus logs you should see an error like this:

ts=2024-05-14T19:47:45.544Z caller=dedupe.go:112 component=remote level=warn remote_name=2bd00a url=https://0.0.0.0:9201/write msg="Failed to send batch, retrying" err="Post \"https://0.0.0.0:9201/write\": write tcp 127.0.0.1:45334->127.0.0.1:9201: write: broken pipe"

SSL connection to the server started by the remote_write metricset can also be tested with openssl like this echo | openssl s_client -showcerts -connect localhost:9201 -CAfile "/path/to/certs/root-ca.pem".

Copy link
Contributor

mergify bot commented May 14, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix_http-server_helper_ssl upstream/fix_http-server_helper_ssl
git merge upstream/main
git push upstream fix_http-server_helper_ssl

@gpop63 gpop63 added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label May 14, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label May 14, 2024
@gpop63 gpop63 marked this pull request as ready for review May 15, 2024 09:32
@gpop63 gpop63 requested a review from a team as a code owner May 15, 2024 09:32
@gpop63 gpop63 requested review from AndersonQ and belimawr May 15, 2024 09:32
@elasticmachine
Copy link
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@gpop63 gpop63 force-pushed the fix_http-server_helper_ssl branch from f241a0b to 8e58038 Compare May 15, 2024 09:33
@gpop63 gpop63 removed request for AndersonQ and belimawr May 15, 2024 09:58
Copy link
Contributor

@belimawr belimawr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpop63, the changes look good. Could you add a test?

Copy link
Contributor

mergify bot commented May 16, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix_http-server_helper_ssl upstream/fix_http-server_helper_ssl
git merge upstream/main
git push upstream fix_http-server_helper_ssl

@andresrc andresrc requested a review from belimawr August 21, 2024 15:39
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Aug 21, 2024
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@andresrc andresrc removed the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Aug 21, 2024
metricbeat/helper/server/http/http_test.go Outdated Show resolved Hide resolved
CHANGELOG.next.asciidoc Outdated Show resolved Hide resolved
Copy link
Contributor

mergify bot commented Sep 30, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Sep 30, 2024
@gpop63 gpop63 requested a review from belimawr September 30, 2024 12:15
@gpop63 gpop63 merged commit 6d4fbfc into elastic:main Oct 7, 2024
32 checks passed
mergify bot pushed a commit that referenced this pull request Oct 7, 2024
* add changelog entry

* fix TLS config

* fix changelog pr id

* golangci-lint fixes

* mage check

* fix http server ssl test

* Update metricbeat/helper/server/http/http_test.go

Co-authored-by: Tiago Queiroz <[email protected]>

* fix changelog

---------

Co-authored-by: Tiago Queiroz <[email protected]>
(cherry picked from commit 6d4fbfc)
pierrehilbert pushed a commit that referenced this pull request Oct 8, 2024
)

* add changelog entry

* fix TLS config

* fix changelog pr id

* golangci-lint fixes

* mage check

* fix http server ssl test

* Update metricbeat/helper/server/http/http_test.go

Co-authored-by: Tiago Queiroz <[email protected]>

* fix changelog

---------

Co-authored-by: Tiago Queiroz <[email protected]>
(cherry picked from commit 6d4fbfc)

Co-authored-by: Gabriel Pop <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.x Automated backport to the 8.x branch with mergify Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants