Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix use multi process for load metrics #429

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

saeedAdelpour
Copy link

Hi,

I opened a pull request (#407) that was closed. I think I need to explain the problem in more detail.

The first problem is with using the PROMETHEUS_METRICS_EXPORT_PORT and PROMETHEUS_METRICS_EXPORT_ADDRESS environment variables. Here is the command that starts my app:

gunicorn APP.wsgi:application --name APP --workers 4 --bind=0.0.0.0:8000 --log-level=INFO --log-file=- 

The first error we encounter is Address already in use because in exports.py, line 45, the function prometheus_client.start_http_server(port, addr=addr) is called without checking if the port is already in use. This should be handled with a try-except block.

The second problem is when we use the PROMETHEUS_MULTIPROC_DIR environment variable. The function SetupPrometheusEndpointOnPort does not check PROMETHEUS_MULTIPROC_DIR and only runs this:

prometheus_client.start_http_server(port, addr=addr)

The implementation of this part must be similar to ExportToDjangoView. I mistakenly forced the app to use PROMETHEUS_MULTIPROC_DIR, but I have fixed it.

Thank you for your attention. Please tell me where I am wrong about it.

@asherf
Copy link
Collaborator

asherf commented Jul 12, 2024

TBH, I don't think this PR is a good idea. the function you modified is (IMHO, since I didn't originally write this code) is more of a reference code.
If needs be, you can have this kind of code and customization on the app side. I think adding complexity to this library is the wrong choice.

swallowing exception in this way, is really bad idea.

@saeedAdelpour
Copy link
Author

Hi,

Thank you for your response.

I agree with handling the exception this way, so I removed the try-except block to ensure only one process runs this part of the code.

However, I'm sure this code:

prometheus_client.start_http_server(port, addr=addr)

won't work with PROMETHEUS_MULTIPROC_DIR because when we try to run an HTTP server, we need to pass:

registry = prometheus_client.CollectorRegistry()

to start_http_server, and the code should be like this:

prometheus_client.start_http_server(port, addr=addr, registry=registry)

When we set PROMETHEUS_MULTIPROC_DIR in our app, all our metrics are stored in that directory, and the current code does not use that directory.

Here are the steps to reproduce the bug:

  1. Set the PROMETHEUS_MULTIPROC_DIR environment variable.
  2. Run a Django app with settings that do not include PROMETHEUS_METRICS_EXPORT_PORT and PROMETHEUS_METRICS_EXPORT_ADDRESS. The command is:
    gunicorn --bind=0.0.0.0:8001 proj.wsgi:application -w 4 --reload
    Note that we use 4 processes to run our main app.
  3. Create another settings file (/path/to/settings_prometheus.py) and add PROMETHEUS_METRICS_EXPORT_PORT and PROMETHEUS_METRICS_EXPORT_ADDRESS. The new settings file code is:
    from .settings import *
    
    PROMETHEUS_METRICS_EXPORT_PORT = 8002
    PROMETHEUS_METRICS_EXPORT_ADDRESS = "localhost"
    The command is:
    DJANGO_SETTINGS_MODULE=/path/to/settings_prometheus.py gunicorn --bind=0.0.0.0:8003 proj.wsgi:application -w 1 --reload
    Note that we use 1 process because the code:
    prometheus_client.start_http_server(port, addr=addr, registry=registry)
    must be run by only one process.
  4. We saw different results at http://localhost:8000/prometheus/metrics/ and http://localhost:8001.

Thank you for your attention. Please tell me if I am wrong about anything.

@saeedAdelpour
Copy link
Author

saeedAdelpour commented Aug 5, 2024

Hi,

I hope you’re doing well. I added a pull request 3 weeks ago. Could you spare some moment to review it? I’d really appreciate it.
@asherf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants