
Ideas to upper bound prometheus-server's memory consumption #2222

Closed
consideRatio opened this issue Feb 17, 2023 · 3 comments

@consideRatio commented Feb 17, 2023

I just slowly incremented prometheus-server's memory request to 20 GB for the pangeo-hubs cluster. It appears 18 GB wasn't sufficient: memory usage peaked close to 19 GB before falling to ~3-4 GB when "Head GC completed" was logged, roughly 5 minutes after startup.

This prometheus-server had a /data folder mounted from the attached PVC, and that folder held 5.8 GB:

kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data 
5.8G	/data

As I understand it, the problem is that the write-ahead log (WAL) is read from disk during startup to rebuild the in-memory state for all collected metrics, and that replay takes a lot of memory. Worse, we can't know this memory requirement in advance, because it grows over time as more metrics are collected.
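
To see how much of that 5.8 GB is the WAL specifically, something like the following should work (assuming the default TSDB layout, where the WAL lives in a wal/ subdirectory of the data path):

kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data/wal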

Ideas

  1. We upper-bound the WAL size on disk instead of the age of collected metrics
  2. We work towards node sharing (New default machine types and profile list options - sharing nodes is great! #2121) so we get fewer metrics from nodes
  3. We try to limit the metrics collected by prometheus to what we actually consume (see the sketch after this list)
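
For ideas 1 and 3, a rough sketch of what this could look like. These are illustrative values, not a tested configuration: idea 1 roughly maps to Prometheus' --storage.tsdb.retention.size flag (which caps total TSDB disk usage, with the WAL counted towards the total but only persisted blocks deleted to enforce it), and idea 3 maps to metric_relabel_configs in the scrape configs; the job name and regex below are made up for the example.

# Idea 1: cap total TSDB size on disk (illustrative size)
--storage.tsdb.retention.size=10GB

# Idea 3: drop series we never look at, per scrape job (illustrative job/regex)
scrape_configs:
  - job_name: node-exporter
    # ... existing kubernetes_sd_configs / relabel_configs ...
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: "node_(softnet|schedstat|scrape_collector)_.*"
        action: drop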

Example of logs from a successful startup

ts=2023-02-17T08:08:58.378Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25167 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25168 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25169 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:720 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=1.080338716s wal_replay_duration=1m28.600482965s wbl_replay_duration=184ns total_replay_duration=1m30.245179793s
ts=2023-02-17T08:09:06.733Z caller=main.go:1014 level=info fs_type=EXT4_SUPER_MAGIC
ts=2023-02-17T08:09:06.733Z caller=main.go:1017 level=info msg="TSDB started"
ts=2023-02-17T08:09:06.733Z caller=main.go:1197 level=info msg="Loading configuration file" filename=/etc/config/prometheus.yml
ts=2023-02-17T08:09:06.766Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.768Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.768Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.769Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.770Z caller=main.go:1234 level=info msg="Completed loading of configuration file" filename=/etc/config/prometheus.yml totalDuration=37.243136ms db_storage=3µs remote_storage=2.821µs web_handler=958ns query_engine=1.917µs scrape=30.766642ms scrape_sd=3.27419ms notify=2.525µs notify_sd=4.572µs rules=574.073µs tracing=8.964µs
ts=2023-02-17T08:09:06.770Z caller=main.go:978 level=info msg="Server is ready to receive web requests."
ts=2023-02-17T08:09:06.770Z caller=manager.go:953 level=info component="rule manager" msg="Starting rule manager..."
ts=2023-02-17T08:11:56.487Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1674864000000 maxt=1674871200000 ulid=01GSF6Q33ZV1SA1YCH087EP9S2 duration=2m44.423969816s
ts=2023-02-17T08:12:15.236Z caller=head.go:1213 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=18.734578757s

Related

@yuvipanda

I am curious why this only seems to affect the pangeo hubs cluster, and not, say, the 2i2c cluster.

@consideRatio

@yuvipanda I suspect a basic relation between WAL size and memory usage during startup, where the WAL size in turn depends on the amount of metrics collected. The amount of metrics is coupled to what's being scraped, and the amount of data scraped grows with the number of endpoints scraped, such as one node-exporter per node, including one per dask worker node.
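
If someone wants to dig into this, a couple of standard PromQL cardinality queries (not something we currently have dashboards for) can show where the series come from; prometheus_tsdb_head_series is Prometheus' own self-monitoring metric:

# Number of active series in the TSDB head (roughly what WAL replay has to rebuild)
prometheus_tsdb_head_series

# Active series per scrape job, to see which targets dominate
count({__name__=~".+"}) by (job)

# The ten metric names with the most series
topk(10, count by (__name__) ({__name__=~".+"}))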

@consideRatio

I think the approach of limiting the collected metrics to what we actually consume is still relevant, but I'll close this issue now; the other ideas have been explored a bit.
