Add backward compatibility for elasticsearch<8 #33281

Merged
merged 4 commits into from
Aug 10, 2023

sunank200
Collaborator

@sunank200 sunank200 commented Aug 10, 2023

For elasticsearch>=8, some arguments to Elasticsearch() have been renamed compared to previous versions; for example, retry_timeout is now retry_on_timeout. Read more at: documentation. This change was done as part of the following PR.

This needs backward compatibility support for elasticsearch<8. Otherwise it fails with the following error:

wait-for-airflow-migrations Unable to load the config, contains a configuration error.
wait-for-airflow-migrations Traceback (most recent call last):
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/logging/config.py", line 565, in configure
wait-for-airflow-migrations     handler = self.configure_handler(handlers[name])
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/logging/config.py", line 746, in configure_handler
wait-for-airflow-migrations     result = factory(**kwargs)
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 104, in __init__
wait-for-airflow-migrations     self.client = elasticsearch.Elasticsearch(host, **es_kwargs)  # type: ignore[attr-defined]
wait-for-airflow-migrations TypeError: Elasticsearch.__init__() got an unexpected keyword argument 'retry_timeout'
wait-for-airflow-migrations 
wait-for-airflow-migrations The above exception was the direct cause of the following exception:
wait-for-airflow-migrations 
wait-for-airflow-migrations Traceback (most recent call last):
wait-for-airflow-migrations   File "/usr/local/bin/airflow", line 5, in <module>
wait-for-airflow-migrations     from airflow.__main__ import main
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/__init__.py", line 68, in <module>
wait-for-airflow-migrations     settings.initialize()
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/settings.py", line 524, in initialize
wait-for-airflow-migrations     LOGGING_CLASS_PATH = configure_logging()
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/logging_config.py", line 74, in configure_logging
wait-for-airflow-migrations     raise e
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/logging_config.py", line 69, in configure_logging
wait-for-airflow-migrations     dictConfig(logging_config)
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/logging/config.py", line 811, in dictConfig
wait-for-airflow-migrations     dictConfigClass(config).configure()
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/logging/config.py", line 572, in configure
wait-for-airflow-migrations     raise ValueError('Unable to configure handler '
wait-for-airflow-migrations ValueError: Unable to configure handler 'task'
wait-for-airflow-migrations Exception ignored in atexit callback: <function shutdown at 0x7fd39b485870>
wait-for-airflow-migrations Traceback (most recent call last):
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/logging/__init__.py", line 2183, in shutdown
wait-for-airflow-migrations     h.close()
wait-for-airflow-migrations   File "/usr/local/lib/python3.10/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 366, in close
wait-for-airflow-migrations     if not self.mark_end_on_close or getattr(self, "ctx_task_deferred", None):
wait-for-airflow-migrations AttributeError: 'ElasticsearchTaskHandler' object has no attribute 'mark_end_on_close'


# in Elasticsearch() compared to previous versions.
# Read more at: https://elasticsearch-py.readthedocs.io/en/v8.8.2/api.html#module-elasticsearch
if es_kwargs:
    retry_timeout = es_kwargs.get("retry_timeout")
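
The diff excerpt above is cut off right after the lookup. As a rough illustration only (not the code merged in this PR), a shim for the rename described in the PR description might look like the following. The helper name `normalize_es_kwargs` and the rule that an explicitly set `retry_on_timeout` wins over the legacy key are assumptions made for this sketch.

```python
from __future__ import annotations

from typing import Any


def normalize_es_kwargs(es_kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of es_kwargs with the legacy key renamed.

    Hypothetical helper for illustration; not the provider's actual code.
    """
    normalized = dict(es_kwargs)  # copy so the caller's dict is not mutated
    if "retry_timeout" in normalized:
        legacy_value = normalized.pop("retry_timeout")
        # Assumption: an explicitly set new-style key takes precedence.
        normalized.setdefault("retry_on_timeout", legacy_value)
    return normalized
```

The result would then be passed on as `**kwargs` to the client constructor, so an old-style configuration no longer triggers the `TypeError` shown in the traceback.
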
Member

@pankajkoti pankajkoti Aug 10, 2023

IMO, #33135 should be marked as a breaking change and released with a major bump. This is one argument that we have found, but there could be more like it. Adding back-compat for them will be challenging, and they will be discovered only when users hit issues with the params they are using.

Member

@pankajkoti pankajkoti Aug 10, 2023

Member

@potiuk potiuk Aug 10, 2023

I think it indeed HAS the potential to break things. If we can make it "reasonably compatible", i.e. fix the back-compat issues that we know of and smooth the migration, that would be great, but I would be for marking the next ES release as MAJOR regardless. cc: @eladkal

Also, there is a little twist to it. Unless I am mistaken, I think the elasticsearch handler integration was broken and not working before anyhow :) So, well, it could also be seen as a bugfix :P

Contributor

I too think we should do a major release, since the underlying dependency has also changed from 7.* to 8.*

Member

@pankajkoti pankajkoti Aug 10, 2023

Okay, yes, +1 to making it backward-compatible as much as possible.

I am not aware of how far it was broken. But we have a few tests and users for whom the elasticsearch handler integration was known to be working well. However, only yesterday our tests caught that the PR broke an existing working setup :)

Perhaps our QA expert @vatsrahul1001 can provide more confirmation here.

Contributor

#33135 did not break any airflow code.

We agreed that bumping a dependency is not a breaking change; as expected, every time you bump x.y.z to x+1.y.z, something can break for users.

I am not convinced the next release should be a major one.

Contributor

@eladkal, I think if our Helm chart can't work with the new ES provider, then it should be considered a breaking change.
Also, maybe we should consider bumping to major versions whenever we bump dependencies to major versions?

Contributor

@utkarsharma2 utkarsharma2 Aug 10, 2023

@eladkal I'm not entirely sure, but the code below was part of Airflow's codebase, and it led to the issue after the upgrade:

retry_timeout: 'True'

Member

@pankajkoti pankajkoti Aug 10, 2023

Yes, I agree that bumping a dependency from x.y.z to x+1.y.z may not need to be a breaking change in general. 👍🏽

However, I would like to bring up the following point for discussion.
For this provider, elasticsearch is the main underlying dependency, and we're updating it to the next major version from 7.x to 8.x. It does not affect core Airflow code/functionality, but since providers are versioned and released independently, upgrading to this provider release might affect existing deployments for users.

Contributor

@eladkal eladkal Aug 10, 2023

@ephraimbuddy I didn't see that the Helm tests were broken?
And to my understanding this PR brings back support for Elasticsearch 7, so what is the motivation for the major release? Once this PR is merged, the code is backward compatible.

Also, maybe we should consider bumping to major versions whenever we bump dependencies to major versions?

This was discussed and reached lazy consensus, if I remember correctly. I need to look up the thread.

@potiuk potiuk merged commit 3c61ca4 into apache:main Aug 10, 2023
42 checks passed
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Aug 17, 2023
* Add backward compatibility for elasticsearch<8
@eladkal
Contributor

eladkal commented Nov 17, 2023

I may be missing something here but...
This PR added backward compatibility for elasticsearch<8,
yet the min version is 8.10:

"elasticsearch>=8.10,<9"

so either we need to relax the min version or we should remove the backward compatibility code cc @dstandish
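
To illustrate the mismatch: with the pin quoted above, no 7.x client can be installed alongside the provider, so code paths that only serve elasticsearch<8 would presumably never run. A simplified sketch, assuming plain numeric `major.minor[.patch]` version strings; the function name `satisfies_pin` is hypothetical, and real dependency resolution follows the full PEP 440 rules rather than this tuple comparison.

```python
def satisfies_pin(version: str) -> bool:
    """Return True if `version` falls within >=8.10,<9 (the pin quoted above).

    Assumes at least a numeric major and minor component, e.g. "8.10.2".
    """
    # Only (major, minor) matter for this pin; the patch level cannot change the result.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (8, 10) <= (major, minor) < (9, 0)
```

Under these assumptions, a 7.x version such as `"7.17.9"` fails the check, while `"8.10.0"` passes.
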

@potiuk
Member

potiuk commented Nov 17, 2023

I'd be for removing it. People can use previous provider versions for ES < 8 if they need to.

@eladkal
Contributor

eladkal commented Nov 25, 2023

#35707 was raised to remove the functionality added by this PR.
