To support task display_name #1278

Open: t0momi219 wants to merge 14 commits into main

Conversation

t0momi219
Contributor

@t0momi219 t0momi219 commented Oct 23, 2024

Description

Running models whose names contain multibyte characters causes runtime errors in Airflow environments where statsd is enabled (for example, MWAA relies on statsd to collect metrics for CloudWatch).

Related Issue: apache/airflow#18010
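
To illustrate why such names fail, here is a minimal sketch of the kind of stat-name check a statsd-backed metrics layer might apply. The allow-list, the length limit, and the helper name is_valid_stat_name are assumptions made for illustration, not a copy of Airflow's implementation, and the metric names are hypothetical.

```python
import string

# Assumed allow-list: ASCII letters, digits, underscore, dot and dash.
ALLOWED_CHARACTERS = frozenset(string.ascii_letters + string.digits + "_.-")


def is_valid_stat_name(stat_name: str, max_length: int = 250) -> bool:
    """Return True if the name is short enough and uses only allowed characters."""
    return len(stat_name) <= max_length and all(
        ch in ALLOWED_CHARACTERS for ch in stat_name
    )


# Hypothetical metric names built from a DAG ID and a task ID.
print(is_valid_stat_name("ti.finish.my_dag.customers_run"))  # True
print(is_valid_stat_name("ti.finish.my_dag.顧客_run"))  # False
```

A task ID derived directly from a multibyte model name would be rejected by a check like this, which is why the task ID needs to be decoupled from the name shown in the UI.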

To address this, Airflow 2.9 introduced the ability to render tasks using display_name, which allows task names to be rendered separately from their task_id.

Reference: https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/example_dags/example_display_name.html

This PR adds support for display_name, so that users whose native language uses non-ASCII characters can display task names in their own language, even in environments like MWAA.

Details

The normalize_task_id parameter is added to RenderConfig.
This option accepts a function to generate a task ID from a node. This allows users to generate arbitrary task IDs from models. If a function is passed to this option, Cosmos will use the model name as the display_name for tasks while rendering them.

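from cosmos import RenderConfig

def normalize_task_id(node):
    """
    This function takes a node and returns a new, ASCII-only task_id.
    """
    if node.name == "MULTIBYTE_MODEL_NAME":
        # Placeholder: return any ASCII-only ID for this model.
        return "ASCII_TASK_ID"
    # Fall back to the model name so the function never returns None.
    return node.name

render_config = RenderConfig(
    normalize_task_id=normalize_task_id
)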
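
The example above handles a single hard-coded model name. A more general variant might keep ASCII names unchanged and derive a stable ASCII ID from everything else. The following is only a sketch under assumptions: the hash-based "model_" prefix scheme is hypothetical and not part of this PR, and SimpleNamespace stands in for a Cosmos node so the snippet runs without Cosmos installed.

```python
import hashlib
from types import SimpleNamespace


def normalize_task_id(node) -> str:
    """Return node.name if it is ASCII, otherwise a short hash-based ASCII ID."""
    name = node.name
    if name.isascii():
        return name
    # Hypothetical scheme: the first 8 hex digits of the SHA-1 of the UTF-8 name.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    return f"model_{digest}"


# Stand-ins for dbt nodes; only the `name` attribute is used here.
ascii_node = SimpleNamespace(name="customers")
multibyte_node = SimpleNamespace(name="顧客")

print(normalize_task_id(ascii_node))
print(normalize_task_id(multibyte_node))
```

Because the hash is deterministic, the same model always maps to the same task ID across DAG parses, while the original model name remains available as the display_name.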

Related Issue(s)

closes #1277

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Oct 23, 2024

netlify bot commented Oct 23, 2024

Deploy Preview for sunny-pastelito-5ecb04 failed.

Name Link
🔨 Latest commit 03076bc
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/67233876fddb0000086c8612

@dosubot dosubot bot added the area:rendering Related to rendering, like Jinja, Airflow tasks, etc label Oct 23, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Oct 26, 2024
@t0momi219
Contributor Author

Hi team, ( @tatiana @pankajkoti )
This PR is ready. Could you please review this?

Collaborator

@tatiana tatiana left a comment

Hi @t0momi219, thank you very much for the detailed explanation of the problem and for proposing a fix.

I was surprised by the number of lines changed to fix the issue. It feels like you tried to do two things at once: fix the problem and refactor the code. Would it be possible to solve the bug with fewer changes to the code?

As an example, what if we:

  1. Introduced a function (e.g. normalize_node_name) that takes a node_name and normalizes it, handling non-ASCII characters
  2. Replaced the occurrences of node.name with normalize_node_name(node.name)
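
One possible shape for the suggested helper is sketched below. It is an illustration only: the escaping scheme (replacing each character outside [A-Za-z0-9_.-] with "_u" plus its hex code point) is an assumption, not something specified in this review.

```python
import re

# Characters assumed to be safe in a task ID and in statsd metric names.
_SAFE_CHAR = re.compile(r"[A-Za-z0-9_.-]")


def normalize_node_name(node_name: str) -> str:
    """Replace every unsafe character with '_u' plus its hex code point."""
    return "".join(
        ch if _SAFE_CHAR.fullmatch(ch) else f"_u{ord(ch):04x}"
        for ch in node_name
    )


print(normalize_node_name("customers"))  # customers
print(normalize_node_name("あ_model"))  # _u3042_model
```

Note that an escaping scheme like this is not strictly collision-free (a model literally named "_u3042_model" would map to the same ID), so a real implementation would need to decide whether that matters.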

Review threads: cosmos/airflow/graph.py (outdated, resolved), cosmos/config.py (outdated, resolved)
@t0momi219
Contributor Author

Hi @tatiana @pankajkoti ,
Thank you for the review, and I appreciate your acceptance of this proposal. I have revised the implementation to avoid impacting the core logic as much as possible.

PR Changes

  • Renamed the added parameter to normalize_task_id.
  • Modified the code with minimal changes.
  • Updated the documentation.


codecov bot commented Nov 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.03%. Comparing base (1840c32) to head (79d358e).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1278      +/-   ##
==========================================
+ Coverage   95.85%   96.03%   +0.18%     
==========================================
  Files          67       67              
  Lines        3976     3985       +9     
==========================================
+ Hits         3811     3827      +16     
+ Misses        165      158       -7     


@t0momi219
Contributor Author

Hi @tatiana, @pankajkoti

I am also working on the following PR, and I plan to modify the same generate_task_or_group function in both: #1317

Therefore, I’d like to complete this PR first before moving on to the next PR’s changes. If possible, could you please review this PR again? I believe I’ve completed the code adjustments. If there are still any issues, please let me know. Thank you, and I apologize if further modifications are needed.
(Also, I believe that some of the previously failing tests in this PR have now passed thanks to everyone’s improvements. I’m grateful because I wasn’t able to fix it on my own. Thank you.)

Labels
area:rendering Related to rendering, like Jinja, Airflow tasks, etc size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] To support task display_name
3 participants