
Ideas to upper bound prometheus-server's memory consumption #2222

Closed
consideRatio opened this issue Feb 17, 2023 · 3 comments

@consideRatio commented Feb 17, 2023

I just slowly incremented prometheus-server's memory request to 20 GB for the pangeo-hubs cluster. It appears 18 GB wasn't sufficient: memory usage peaked close to 19 GB before falling to ~3-4 GB when "Head GC completed" was logged, roughly 5 minutes after startup.

This prometheus-server had a /data folder mounted from the attached PVC, and that folder held 5.8 GB:

kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data 
5.8G	/data

As I understand it, the problem is that the write-ahead log (WAL) is read from disk during startup to rebuild the in-memory state for all collected metrics, and that replay takes a lot of memory. Worse, we can't know this memory requirement in advance, because it grows over time as more metrics are collected.
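
To see how much of that 5.8 GB is the WAL specifically, something like the following should work (assuming the default TSDB layout, where the WAL lives in a wal/ subdirectory of the data path):

kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data/wal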

Ideas

  1. We upper-bound the WAL size on disk instead of the age of collected metrics
  2. We work towards node sharing (New default machine types and profile list options - sharing nodes is great! #2121) so we get fewer metrics from nodes
  3. We try to limit the metrics collected by prometheus to what we actually consume (see the sketch after this list)
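
For ideas 1 and 3, a rough sketch of what this could look like. These are illustrative values, not a tested configuration: idea 1 roughly maps to Prometheus' --storage.tsdb.retention.size flag (which caps total TSDB disk usage, with the WAL counted towards the total but only persisted blocks deleted to enforce it), and idea 3 maps to metric_relabel_configs in the scrape configs; the job name and regex below are made up for the example.

# Idea 1: cap total TSDB size on disk (illustrative size)
--storage.tsdb.retention.size=10GB

# Idea 3: drop series we never look at, per scrape job (illustrative job/regex)
scrape_configs:
  - job_name: node-exporter
    # ... existing kubernetes_sd_configs / relabel_configs ...
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: "node_(softnet|schedstat|scrape_collector)_.*"
        action: drop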

Example of logs from a successful startup

ts=2023-02-17T08:08:58.378Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25167 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25168 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:683 level=info component=tsdb msg="WAL segment loaded" segment=25169 maxSegment=25169
ts=2023-02-17T08:08:58.379Z caller=head.go:720 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=1.080338716s wal_replay_duration=1m28.600482965s wbl_replay_duration=184ns total_replay_duration=1m30.245179793s
ts=2023-02-17T08:09:06.733Z caller=main.go:1014 level=info fs_type=EXT4_SUPER_MAGIC
ts=2023-02-17T08:09:06.733Z caller=main.go:1017 level=info msg="TSDB started"
ts=2023-02-17T08:09:06.733Z caller=main.go:1197 level=info msg="Loading configuration file" filename=/etc/config/prometheus.yml
ts=2023-02-17T08:09:06.766Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.768Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.768Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.769Z caller=kubernetes.go:326 level=info component="discovery manager scrape" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2023-02-17T08:09:06.770Z caller=main.go:1234 level=info msg="Completed loading of configuration file" filename=/etc/config/prometheus.yml totalDuration=37.243136ms db_storage=3µs remote_storage=2.821µs web_handler=958ns query_engine=1.917µs scrape=30.766642ms scrape_sd=3.27419ms notify=2.525µs notify_sd=4.572µs rules=574.073µs tracing=8.964µs
ts=2023-02-17T08:09:06.770Z caller=main.go:978 level=info msg="Server is ready to receive web requests."
ts=2023-02-17T08:09:06.770Z caller=manager.go:953 level=info component="rule manager" msg="Starting rule manager..."
ts=2023-02-17T08:11:56.487Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1674864000000 maxt=1674871200000 ulid=01GSF6Q33ZV1SA1YCH087EP9S2 duration=2m44.423969816s
ts=2023-02-17T08:12:15.236Z caller=head.go:1213 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=18.734578757s

Related

@yuvipanda

I am curious why this only seems to affect the pangeo hubs cluster, and not, say, the 2i2c cluster.

@consideRatio

@yuvipanda I suspect a basic relation between WAL size and memory usage during startup, where the WAL size in turn depends on the amount of metrics collected. The amount of metrics is coupled to what's being scraped, and the amount of data scraped grows with the number of endpoints scraped, such as one node-exporter per node, including one per dask worker node.
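
If someone wants to dig into this, a couple of standard PromQL cardinality queries (not something we currently have dashboards for) can show where the series come from; prometheus_tsdb_head_series is Prometheus' own self-monitoring metric:

# Number of active series in the TSDB head (roughly what WAL replay has to rebuild)
prometheus_tsdb_head_series

# Active series per scrape job, to see which targets dominate
count({__name__=~".+"}) by (job)

# The ten metric names with the most series
topk(10, count by (__name__) ({__name__=~".+"}))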

@consideRatio

I think the approach of limiting the collected metrics to what we actually consume is still relevant, but I'll close this issue now; the other ideas have been explored a bit.
