Invalid log order with ElasticsearchTaskHandler #17512

Closed
suhanovv opened this issue Aug 9, 2021 · 5 comments
Labels
kind:bug This is a clearly a bug

Comments

@suhanovv
Contributor

suhanovv commented Aug 9, 2021

Apache Airflow version: 2.*

Apache Airflow Provider versions (please include all providers that are relevant to your bug): providers-elasticsearch==1./2.

Kubernetes version (if you are using kubernetes) (use kubectl version): 1.17

What happened:
When using the ElasticsearchTaskHandler, the logs are displayed in the wrong order in the interface.
[Screenshot: IMAGE 2021-08-09 18:05:13]

raw log entries:
{"asctime": "2021-07-16 09:37:22,278", "filename": "pod_launcher_deprecated.py", "lineno": 131, "levelname": "WARNING", "message": "Pod not yet started: a3ca15e631a7f67697e10b23bae82644.6314229ca0c84d7ba29c870edc39a268", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}

{"asctime": "2021-07-16 09:37:26,485", "filename": "taskinstance.py", "lineno": 1191, "levelname": "INFO", "message": "Marking task as SUCCESS. dag_id=pod_gprs_raw_from_nfs_AS10_A_A_A_A, task_id=k8spod_make_upload_dir_1_2, execution_date=20210716T091000, start_date=20210716T092905, end_date=20210716T093726", "exc_text": null, "dag_id": "dag_name_AS10_A_A_A_A", "task_id": "k8spod_make_upload_dir_1_2", "execution_date": "2021_07_16T09_10_00_000000", "try_number": "1", "log_id": "dag_name_AS10_A_A_A_A-k8spod_make_upload_dir_1_2-2021_07_16T09_10_00_000000-1", "offset": 1626427744958776832}
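
To make the symptom in these two entries easier to see, here is a small sketch that parses trimmed copies of them (only the asctime, levelname and offset fields are kept; the variable names are illustrative) and checks whether the offsets differ. It only inspects the data quoted above and is not part of Airflow.

```python
import json

# Trimmed copies of the two raw entries above (only the fields relevant here).
raw_entries = [
    '{"asctime": "2021-07-16 09:37:22,278", "levelname": "WARNING", "offset": 1626427744958776832}',
    '{"asctime": "2021-07-16 09:37:26,485", "levelname": "INFO", "offset": 1626427744958776832}',
]

entries = [json.loads(line) for line in raw_entries]
offsets = {entry["offset"] for entry in entries}
timestamps = [entry["asctime"] for entry in entries]

# The records were logged about four seconds apart but carry one shared offset,
# so the offset alone cannot tell which entry came first.
shares_offset = len(offsets) == 1
print("distinct offsets:", len(offsets))
print("timestamps:", timestamps)
```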

What you expected to happen:

The problem lies in the set_context method: the offset is set there once on the handler instance, so all log entries are then written with the same offset, which is the field used to select entries for display in the interface. When we overrode the emit method to put the offset on each record, the problem disappeared.
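
A minimal sketch of the difference described here, using only the standard logging module. The class names SharedOffsetHandler, PerRecordOffsetHandler and JsonOffsetFormatter, the collect_offsets helper, and the use of time.time_ns() are assumptions made for illustration; this is not the actual ElasticsearchTaskHandler code or the code from the PR.

```python
import io
import json
import logging
import time


class JsonOffsetFormatter(logging.Formatter):
    """Hypothetical formatter: writes each record as JSON with its offset."""

    def format(self, record):
        return json.dumps(
            {"message": record.getMessage(), "offset": getattr(record, "offset", None)}
        )


class SharedOffsetHandler(logging.StreamHandler):
    """Mimics the reported behaviour: the offset is computed once in set_context."""

    def set_context(self):
        self.offset = time.time_ns()

    def emit(self, record):
        record.offset = self.offset  # same value for every record
        super().emit(record)


class PerRecordOffsetHandler(logging.StreamHandler):
    """Mimics the described workaround: emit puts a fresh offset on each record."""

    def emit(self, record):
        record.offset = time.time_ns()
        super().emit(record)


def collect_offsets(handler):
    """Log two messages through the handler and return the offsets it wrote."""
    handler.setFormatter(JsonOffsetFormatter())
    logger = logging.getLogger(f"demo.{type(handler).__name__}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    for text in ("first", "second"):
        logger.info(text)
        time.sleep(0.05)  # make sure the clock advances between records
    logger.removeHandler(handler)
    return [json.loads(line)["offset"] for line in handler.stream.getvalue().splitlines()]


shared = SharedOffsetHandler(io.StringIO())
shared.set_context()
shared_offsets = collect_offsets(shared)

per_record_offsets = collect_offsets(PerRecordOffsetHandler(io.StringIO()))

print("shared offsets equal:", shared_offsets[0] == shared_offsets[1])
print("per-record offsets increasing:", per_record_offsets[0] < per_record_offsets[1])
```

With one offset per handler instance, a reader that sorts by offset has nothing to order the records by; with one offset per record, the write order is preserved.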

How to reproduce it:
Run a long-lived task that generates logs. In our case it is a Spark task launched from a Docker container.

@suhanovv suhanovv added the kind:bug This is a clearly a bug label Aug 9, 2021
@boring-cyborg

boring-cyborg bot commented Aug 9, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk
Member

potiuk commented Aug 9, 2021

Would you be willing to provide a PR fixing it (seems you already have the fix)?

@suhanovv
Contributor Author

suhanovv commented Aug 9, 2021

Yes, I will make a pull request.

@suhanovv
Contributor Author

@potiuk I made a PR. Who can review it?

@potiuk potiuk closed this as completed Aug 14, 2021
@potiuk
Member

potiuk commented Aug 14, 2021

Closed by #17551


2 participants