-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid log order with ElasticsearchTaskHandler #17512
Labels
kind:bug
This is a clearly a bug
Comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Apache Airflow version: 2.*
Apache Airflow Provider versions (please include all providers that are relevant to your bug): providers-elasticsearch==1./2.
Kubernetes version (if you are using kubernetes) (use
kubectl version
): 1.17What happened:
When using the ElasticsearchTaskHandler I see the broken order of the logs in the interface.
raw log entries:
{"asctime": "2021-07-16 09:37:22,278", "filename": "pod_launcher_deprecated.py", "lineno": 131, "levelname": "WARNING", "message": "Pod not yet started: a3ca15e631a7f67697e10b23bae82644.6314229ca0c84d7ba29c870edc39a268", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}
{"asctime": "2021-07-16 09:37:26,485", "filename": "taskinstance.py", "lineno": 1191, "levelname": "INFO", "message": "Marking task as SUCCESS. dag_id=pod_gprs_raw_from_nfs_AS10_A_A_A_A, task_id=k8spod_make_upload_dir_1_2, execution_date=20210716T091000, start_date=20210716T092905, end_date=20210716T093726", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}
What you expected to happen:
The problem lies in the method set_context that is set on the instance of the class and then all entries in the log go with the same offset, which is used to select for display in the interface. When we redefined method emit and put the offset to the record, the problem disappeared
How to reproduce it:
Run a long-lived task that generates logs, in our case it is a spark task launched from a docker container
The text was updated successfully, but these errors were encountered: