Fix liveness probe speedup for scheduler and triggerer (#21108)
PR apache/airflow#20833 tried to speed up the liveness probe by setting the variable CONNECTION_CHECK_MAX_COUNT=0, which disables a connectivity check in `/entrypoint` (that check turns out to be slow).

Unfortunately, the approach taken doesn't work: Kubernetes exec probes run the command array directly, without a shell, so the environment assignment is interpreted as an executable name rather than applied to the environment. We have to run the probe through `sh -c` (with `exec`) instead.
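The failure mode can be reproduced outside Kubernetes. Below is a minimal sketch that uses `subprocess` to mimic the kubelet's shell-less invocation of the probe's command array; `printenv` stands in for `/entrypoint` here purely for illustration:

```python
import subprocess

# Kubernetes exec probes call execve() on the command array with no shell,
# so "CONNECTION_CHECK_MAX_COUNT=0" is looked up as an executable name.
try:
    subprocess.run(["CONNECTION_CHECK_MAX_COUNT=0", "echo", "ok"])
except FileNotFoundError:
    print("assignment treated as a program name -> probe errors out")

# Wrapping the command in `sh -c` restores shell semantics: the assignment
# becomes an environment variable, and `exec` replaces the shell with the
# real command so its exit code reaches the kubelet unchanged.
result = subprocess.run(
    ["sh", "-c", "CONNECTION_CHECK_MAX_COUNT=0 exec printenv CONNECTION_CHECK_MAX_COUNT"],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # -> 0
```

The same distinction explains why the chart change below swaps the command array for a `sh -c` wrapper.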

GitOrigin-RevId: bca5caf3611fc8659b8bd5fdcddc04dc5b104344
dstandish authored and Cloud Composer Team committed Sep 12, 2024
1 parent 2dd5258 commit 51478d3
Showing 2 changed files with 34 additions and 34 deletions.
34 changes: 17 additions & 17 deletions chart/templates/scheduler/scheduler-deployment.yaml
@@ -162,26 +162,26 @@ spec:
 periodSeconds: {{ .Values.scheduler.livenessProbe.periodSeconds }}
 exec:
   command:
-    - CONNECTION_CHECK_MAX_COUNT=0
-    - /entrypoint
-    - python
-    - -Wignore
-    - -c
-    - |
-      import os
-      os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
-      os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
-      from airflow.jobs.scheduler_job import SchedulerJob
-      from airflow.utils.db import create_session
-      from airflow.utils.net import get_hostname
-      import sys
-      with create_session() as session:
-          job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
-              SchedulerJob.latest_heartbeat.desc()).limit(1).first()
-      sys.exit(0 if job.is_alive() else 1)
+    - sh
+    - -c
+    - exec
+    - |
+      CONNECTION_CHECK_MAX_COUNT=0 /entrypoint python -Wignore -c "
+      import os
+      os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
+      os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
+      from airflow.jobs.scheduler_job import SchedulerJob
+      from airflow.utils.db import create_session
+      from airflow.utils.net import get_hostname
+      import sys
+      with create_session() as session:
+          job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
+              SchedulerJob.latest_heartbeat.desc()).limit(1).first()
+      sys.exit(0 if job.is_alive() else 1)
+      "
 {{- if and $local (not $elasticsearch) }}
 # Serve logs if we're in local mode and we don't have elasticsearch enabled.
 ports:
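The embedded probe script boils down to a heartbeat freshness test on the most recently heartbeating SchedulerJob for this host. A standalone sketch of that logic, assuming a simplified `is_alive` in the spirit of Airflow's `BaseJob.is_alive()` (the `running` state check and the `2.1 * heartrate` grace factor are assumptions for illustration, not lifted verbatim from the chart):

```python
from datetime import datetime, timedelta, timezone

def is_alive(state: str, latest_heartbeat: datetime, heartrate: float = 5.0) -> bool:
    # A job is considered alive when it is still running and its last
    # heartbeat is recent enough; the 2.1x grace factor mirrors the spirit
    # of Airflow's BaseJob.is_alive(), but treat it as an assumption here.
    age = (datetime.now(timezone.utc) - latest_heartbeat).total_seconds()
    return state == "running" and age < heartrate * 2.1

fresh = datetime.now(timezone.utc) - timedelta(seconds=1)
stale = datetime.now(timezone.utc) - timedelta(seconds=60)
print(is_alive("running", fresh))  # -> True
print(is_alive("running", stale))  # -> False
```

The real script does the same comparison via `job.is_alive()` after fetching the job row with SQLAlchemy.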
34 changes: 17 additions & 17 deletions chart/templates/triggerer/triggerer-deployment.yaml
@@ -165,26 +165,26 @@ spec:
 periodSeconds: {{ .Values.triggerer.livenessProbe.periodSeconds }}
 exec:
   command:
-    - CONNECTION_CHECK_MAX_COUNT=0
-    - /entrypoint
-    - python
-    - -Wignore
-    - -c
-    - |
-      import os
-      os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
-      os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
-      from airflow.jobs.triggerer_job import TriggererJob
-      from airflow.utils.db import create_session
-      from airflow.utils.net import get_hostname
-      import sys
-      with create_session() as session:
-          job = session.query(TriggererJob).filter_by(hostname=get_hostname()).order_by(
-              TriggererJob.latest_heartbeat.desc()).limit(1).first()
-      sys.exit(0 if job.is_alive() else 1)
+    - sh
+    - -c
+    - exec
+    - |
+      CONNECTION_CHECK_MAX_COUNT=0 /entrypoint python -Wignore -c "
+      import os
+      os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
+      os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
+      from airflow.jobs.triggerer_job import TriggererJob
+      from airflow.utils.db import create_session
+      from airflow.utils.net import get_hostname
+      import sys
+      with create_session() as session:
+          job = session.query(TriggererJob).filter_by(hostname=get_hostname()).order_by(
+              TriggererJob.latest_heartbeat.desc()).limit(1).first()
+      sys.exit(0 if job.is_alive() else 1)
+      "
 {{- if and (.Values.dags.gitSync.enabled) (not .Values.dags.persistence.enabled) }}
 {{- include "git_sync_container" . | indent 8 }}
 {{- end }}
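Both probe scripts end with `sys.exit(0 if job.is_alive() else 1)`, and that exit status is the only thing the kubelet sees: 0 means healthy, non-zero triggers a container restart. A small simulation of that contract, with plain booleans standing in for `job.is_alive()`:

```python
import subprocess
import sys

def run_probe(alive: bool) -> int:
    # Re-create the probe's final line in a child interpreter and return
    # the exit status the kubelet would observe.
    code = f"import sys; sys.exit(0 if {alive} else 1)"
    return subprocess.run([sys.executable, "-c", code]).returncode

print(run_probe(True))   # -> 0, container considered healthy
print(run_probe(False))  # -> 1, kubelet would restart the container
```

This is also why the `exec` in the `sh -c` wrapper matters: without it, the shell sits between the kubelet and the Python process, and the exit status must be relayed rather than delivered directly.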