Smart Sensor Kubernetes Airflow 2.0.1 #15543
Comments
I figured out what is going on. I am checking if a file was modified in the last 60 seconds, but the SmartSensor does not care about my poke_interval and uses 180 seconds:
class SmartSensorOperator(BaseOperator, SkipMixin):
    ui_color = '#e6f1f2'

    @apply_defaults
    def __init__(
        self,
        poke_interval=180,
        smart_sensor_timeout=60 * 60 * 24 * 7,
        soft_fail=False,
        shard_min=0,
        shard_max=100000,
        poke_timeout=6.0,
        *args,
        **kwargs,
    ):
Also, this is not changed at Smart Sensor DAG init:
num_smart_sensor_shard = conf.getint("smart_sensor", "shards")
shard_code_upper_limit = conf.getint('smart_sensor', 'shard_code_upper_limit')
for i in range(num_smart_sensor_shard):
    shard_min = (i * shard_code_upper_limit) / num_smart_sensor_shard
    shard_max = ((i + 1) * shard_code_upper_limit) / num_smart_sensor_shard
    dag = DAG(
        dag_id=dag_id,
        default_args=args,
        schedule_interval=timedelta(minutes=5),
        concurrency=1,
        max_active_runs=1,
        catchup=False,
        dagrun_timeout=timedelta(hours=24),
        start_date=days_ago(2),
    )
    SmartSensorOperator(
        task_id='smart_sensor_task',
        dag=dag,
        retries=100,
        retry_delay=timedelta(seconds=10),
        priority_weight=999,
        shard_min=shard_min,
        shard_max=shard_max,
        poke_timeout=10,
        smart_sensor_timeout=timedelta(hours=24).total_seconds(),
    )
Can we add one more config option under smart_sensor in airflow.cfg? Then do this:
num_smart_sensor_shard = conf.getint("smart_sensor", "shards")
shard_code_upper_limit = conf.getint('smart_sensor', 'shard_code_upper_limit')
poke_interval = conf.getint('smart_sensor', 'poke_interval')
for i in range(num_smart_sensor_shard):
    shard_min = (i * shard_code_upper_limit) / num_smart_sensor_shard
    shard_max = ((i + 1) * shard_code_upper_limit) / num_smart_sensor_shard
    dag = DAG(
        dag_id=dag_id,
        default_args=args,
        schedule_interval=timedelta(minutes=5),
        concurrency=1,
        max_active_runs=1,
        catchup=False,
        dagrun_timeout=timedelta(hours=24),
        start_date=days_ago(2),
    )
    SmartSensorOperator(
        task_id='smart_sensor_task',
        dag=dag,
        retries=100,
        retry_delay=timedelta(seconds=10),
        priority_weight=999,
        shard_min=shard_min,
        shard_max=shard_max,
        poke_timeout=10,
        smart_sensor_timeout=timedelta(hours=24).total_seconds(),
        poke_interval=poke_interval,
    )
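The proposal above reads a new option from the config file. A minimal sketch of that lookup with a fallback to the operator's current default of 180 seconds follows. It uses the standard library's configparser as a stand-in for Airflow's conf object. The poke_interval option under [smart_sensor] is the proposed option, not an existing Airflow setting, and read_poke_interval is a hypothetical helper name.

```python
from configparser import ConfigParser

# Matches the poke_interval default in SmartSensorOperator.__init__ quoted above.
DEFAULT_POKE_INTERVAL = 180


def read_poke_interval(cfg_text: str) -> int:
    """Return [smart_sensor] poke_interval, or the default if it is not set."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    # fallback covers both a missing option and a missing section.
    return parser.getint("smart_sensor", "poke_interval", fallback=DEFAULT_POKE_INTERVAL)


# Hypothetical config contents, for illustration only.
with_option = "[smart_sensor]\nshards = 5\npoke_interval = 30\n"
without_option = "[smart_sensor]\nshards = 5\n"

print(read_poke_interval(with_option))
print(read_poke_interval(without_option))
```

Falling back to the existing default would keep current deployments unchanged unless they opt in to a different interval.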
I will take a look.
Since Smart Sensors are now deprecated, I'm not sure it's worth investigating this issue.
Closing, as Smart Sensor is removed in 2.4.0.
Apache Airflow version: 2.0.1
Kubernetes version: 1.18.14
Environment: Azure - AKS
What happened:
I created a sensor:
and a DAG:
My airflow.cfg:
I trigger my DAG and then I upload the file such that the poke will return True. The sensor task gets submitted to the sensor dags/service but it never returns. The DAG gets stuck in sensing phase.
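One way a poke could miss the uploaded file can be sketched as a small timing simulation. It assumes the sensor counts the file as ready only if it was modified within the last 60 seconds, and that pokes arrive at a fixed interval starting at time 0 (the 60-second and 180-second figures come from the comment thread). The function and parameter names are hypothetical, and this is not how Airflow schedules pokes internally.

```python
def first_successful_poke(modified_at, poke_interval, freshness_window, horizon):
    """Return the first poke time that sees the file as fresh, or None.

    All arguments are in seconds. Pokes happen at 0, poke_interval,
    2 * poke_interval, ... up to and including horizon.
    """
    t = 0
    while t <= horizon:
        age = t - modified_at
        # The file counts as fresh only if it already exists (age >= 0)
        # and was modified no longer ago than the freshness window.
        if 0 <= age <= freshness_window:
            return t
        t += poke_interval
    return None


# File uploaded 10 seconds after the first poke.
print(first_successful_poke(10, poke_interval=60, freshness_window=60, horizon=600))
print(first_successful_poke(10, poke_interval=180, freshness_window=60, horizon=600))
```

With a 60-second interval the second poke catches the file. With a 180-second interval the next poke arrives 170 seconds after the upload, outside the 60-second window, so no poke in this run succeeds.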
I am running this on Kubernetes.
task_sensor logs:
All my sensor DAGs have similar logs:
Scheduler logs
Is the smart sensor supposed to work with Kubernetes Executor?