-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logging from all containers in KubernetesOperatorPod #31663
Conversation
@uranusjr @potiuk @pulquero @dstandish looping you in for reviews on this PR as you were reviewers to the one that went stale |
@@ -270,6 +274,7 @@ def __init__( | |||
reattach_on_restart: bool = True, | |||
startup_timeout_seconds: int = 120, | |||
get_logs: bool = True, | |||
container_logs: list[str] | str | bool = BASE_CONTAINER_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this accept False
? If not, this should be Literal[True]
instead of bool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Yeah it doesn't accept False. Only True
container_logs, | ||
pod.metadata.name, | ||
) | ||
elif isinstance(container_logs, bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to keep this check as bool so that we can reject/filter out invalid/unsupported types too here https://github.com/apache/airflow/pull/31663/files#diff-6900da9281d8404b14da0815f2b37350f3148b1b63449928b952744e6711e7e7R475-R478
@uranusjr
docs need fixing still |
From my perspective, it looks like it will do the job, thanks. |
else: | ||
self.log.error( | ||
"False is not a valid value for container_logs", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this branch needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we need it since we support only Literal[True]
when it comes to boolean. Refer to my unit tests. It adds False too as an invalid test case
I was looking for this feature and stumbled upon this pull request. Will this work also with init containers? |
Hi @JeremieDoctrine, yes it should also capture the initContainers logs in my opinion |
@uranusjr @jedcunningham may i request for a review on this pull request when you have some time? |
@hussein-awala @potiuk @jedcunningham @uranusjr following up for a review on this PR. Can you have a look when you have some time? |
time.sleep(1) | ||
return terminated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any case where terminated will be false? I fail to see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not supposed to go into the false state. The method is awaiting completion of the container. ie. It will loop until the container is completed. Let me know if it is not clear enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean - why do we need to return the bool ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will always be True right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point. Makes sense to revert it back to returning None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few nits/questions.
@potiuk just pushed a patch to fix the comments. Have a look when you have some time :) |
containers = [] | ||
pod_info = self.read_pod(pod) | ||
for container_spec in pod_info.spec.containers: | ||
if container_spec.name != ContainerNames.XCOM_CONTAINER: | ||
containers.append(container_spec.name) | ||
|
||
return containers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a simpler equivalent
containers = [] | |
pod_info = self.read_pod(pod) | |
for container_spec in pod_info.spec.containers: | |
if container_spec.name != ContainerNames.XCOM_CONTAINER: | |
containers.append(container_spec.name) | |
return containers | |
pod_info = self.read_pod(pod) | |
return [ | |
container_spec.name | |
for container_spec in pod_info.spec.containers | |
if container_spec.name != ContainerNames.XCOM_CONTAINER | |
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Made the changes here
@@ -398,14 +417,78 @@ def consume_logs( | |||
) | |||
time.sleep(1) | |||
|
|||
def fetch_requested_container_logs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this entire function can be transformed into a generator function instead and use yield
to get rid of all the append
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo, it is simpler to control the flow when we have these append calls. I would like to keep it this way if that is ok by you
A couple of comments to improve the code, but they don’t change the overall logic and should be non-blocking. |
f6614e1
to
901f5af
Compare
@@ -412,14 +431,78 @@ def consume_logs( | |||
) | |||
time.sleep(1) | |||
|
|||
def fetch_requested_container_logs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we need to use the provided post_termination_timeout
in this method as we did with fetch_container_logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I quite understand this comment. We are calling the fetch_container_logs
from fetch_requested_container_logs
which will take care of post_termination_timeout
right?
class ContainerNames: | ||
"""Possible container names for airflow.""" | ||
|
||
XCOM_CONTAINER = "airflow-xcom-sidecar" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't understand the need of this class, this value is already available in PodDefaults.SIDECAR_CONTAINER_NAME
which we use in the other methods, it's better to use it in case we change its name in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I had missed seeing this. I will undo this change and reuse PodDefaults
@hussein-awala can you take a look at this PR again when you have some time? |
@hussein-awala @uranusjr following up for a final round of review on this PR 🙂 Want to land it before the branch cut today if possible.. |
@uranusjr do we need more approvals on this one or can we proceed and merge it? |
At least one more (or if @vincbeck takes another look to make sure changes made after their approval are good) |
closes: #27282
This PR adds support to get logs from all the containers in a pod and publish them in the airflow logging instead of always getting the base container logs.
A new option
container_logging
has been added which works in conjunction withget_logs
option. ie. ifget_logs
isn't set,container_logging
will not be checked.The flag takes the following values:
This was a community suggestion here: https://github.com/apache/airflow/pull/28981/files#r1071928325
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.