Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Docker xcom functionality #21175

Merged
merged 4 commits into from
Feb 1, 2022
Merged

Fixes Docker xcom functionality #21175

merged 4 commits into from
Feb 1, 2022

Conversation

ferruzzi
Copy link
Contributor

@ferruzzi ferruzzi commented Jan 28, 2022

closes: #18874

completes: #19027 (someone else has a PR in for this but it appears to be abandoned)

Co-Author: [email protected]


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@ferruzzi
Copy link
Contributor Author

@uranusjr - You were looking at someone else's attempt at this, which has been sitting for a while. (link in the description)

@ferruzzi
Copy link
Contributor Author

Rebased and corrected the co-author in the commit message.

@uranusjr
Copy link
Member

See #19027 (comment). I still think having a separate API call is safer.

line = line.strip()
res_lines.append(line)
self.log.info(line)
log_chunk = log_chunk.decode('utf-8')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should have error="surrogateescape" here (something like this, I don’t remember the exact name) because there’s no guaratee the log is outputting valid strings, and those characters shouldn’t crash the entire job.

Copy link
Contributor Author

@ferruzzi ferruzzi Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't run into that before. I'll read up on it and get the change in on Monday.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 12df34c

Just making sure I understand, adding the surrogateescape here allows our decode to pick up and fix anything that was escaped in the initial encode, assuming they used the same surrogateescape there? Is that right?

airflow/providers/docker/operators/docker.py Outdated Show resolved Hide resolved
Co-authored-by: Tzu-ping Chung <[email protected]>
@ferruzzi
Copy link
Contributor Author

See #19027 (comment). I still think having a separate API call is safer.

Isn't this what you were asking for with that discussion? https://github.com/apache/airflow/pull/21175/files#diff-5ec82e1fcaaa38e04471f56b64ec319f67b146d3a18d8bdb69c2f973ea3dd585R306-R311

@ferruzzi ferruzzi closed this Jan 30, 2022
@ferruzzi ferruzzi reopened this Jan 30, 2022
@uranusjr
Copy link
Member

Ah right, I misread, sorry.

Comment on lines 293 to 294
log_lines = "\n".join(log_lines)
raise AirflowException(f'Docker container failed: {repr(result)} lines {log_lines}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log_lines = "\n".join(log_lines)
raise AirflowException(f'Docker container failed: {repr(result)} lines {log_lines}')
joined_log_lines = "\n".join(log_lines)
raise AirflowException(f'Docker container failed: {repr(result)} lines {joined_log_lines}')

to satisfy Mypy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. For the future, what didn't it like there? Reusing the same variable name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah from previous lines Mypy sets the type of log_lines to List[str], so assigning str to it raises an error.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Feb 1, 2022
@github-actions
Copy link

github-actions bot commented Feb 1, 2022

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@eladkal eladkal merged commit 2f4a3d4 into apache:main Feb 1, 2022
@ferruzzi ferruzzi deleted the docker-xcom branch February 1, 2022 17:31
potiuk added a commit to potiuk/airflow that referenced this pull request Feb 8, 2022
The fix from apache#21175 did not actually fix the logging behaviour
with non-ascii characters returned by docker logs when xcom
push was enabled. The problem is that DockerOperator uses
different methods to stream the logs as they come (using attach
stream) and different method to retrieve the logs to actually
return the Xcom value. The latter uses "logs" method of docker
client. The tests have not caught it, because the two methods
were mocked in two different places.

This PR uses the same "stringify()" function to convert both
"logged" logs and those that are pushed as xcom. Also added
test for "no lines returned" case.

Fixes: apache#19027
potiuk added a commit that referenced this pull request Feb 8, 2022
* Fix docker behaviour with byte lines returned

The fix from #21175 did not actually fix the logging behaviour
with non-ascii characters returned by docker logs when xcom
push was enabled. The problem is that DockerOperator uses
different methods to stream the logs as they come (using attach
stream) and different method to retrieve the logs to actually
return the Xcom value. The latter uses "logs" method of docker
client. The tests have not caught it, because the two methods
were mocked in two different places.

This PR uses the same "stringify()" function to convert both
"logged" logs and those that are pushed as xcom. Also added
test for "no lines returned" case.

Fixes: #19027

* Update tests/providers/docker/operators/test_docker.py

Co-authored-by: Jed Cunningham <[email protected]>

Co-authored-by: Jed Cunningham <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers okay to merge It's ok to merge this PR as it does not require more tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DockerOperator xcom is broken
3 participants