Cloudwatch parsing fails for log groups containing slashes #14667

Closed
maxcountryman opened this issue Mar 8, 2021 · 5 comments
Labels: affected_version:2.0, area:logging, good first issue, kind:bug, provider:amazon-aws

@maxcountryman
Contributor

Apache Airflow version: 2.0.1

Environment:

  • Cloud provider or hardware configuration: AWS + ECS
  • OS (e.g. from /etc/os-release): Ubuntu
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:

When enabling remote logging via Cloudwatch, we observed that log groups with slashes in their names cannot be used, e.g. "/foo/bar/baz" or, more practically, "/ecs/your-app". In our experience this is a common convention in AWS, and it doesn't seem to be a documented limitation, so we wonder if this is a bug.

What you expected to happen:

Log groups with slashes in their names should work. Unless this restriction is intentional, the current behavior is likely a bug.

How to reproduce it:

Simply create a log group with forward slashes and you should observe the same effect.

It's even easier to observe this in the REPL. The culprit is the urlparse call in our code, which we can reproduce like so:

>>> from urllib.parse import urlparse
>>> urlparse('cloudwatch://arn:aws:logs:us-west-2:123:log-group:/ecs/my-app:*').netloc
'arn:aws:logs:us-west-2:123:log-group:'
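
To see where the rest of the ARN ends up, it may help to print every component that urlparse returns for the same URL. This is only an illustrative sketch using the example ARN from this report; the variable names are arbitrary.

```python
from urllib.parse import urlparse

# Example URL from this report: the log group name contains slashes.
url = "cloudwatch://arn:aws:logs:us-west-2:123:log-group:/ecs/my-app:*"
parts = urlparse(url)

# netloc stops at the first "/" after the "scheme://" prefix,
# so everything from that slash onward is treated as the path.
print("scheme:", parts.scheme)
print("netloc:", parts.netloc)
print("path:  ", parts.path)
```

The group name is not lost; it is simply reported in the path field rather than in netloc.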
@maxcountryman maxcountryman added the kind:bug This is a clearly a bug label Mar 8, 2021
@maxcountryman
Contributor Author

@mini1989cloud I had to create a set of AWS credentials which are injected as the standard boto3 environment variables in order for Cloudwatch to work. I don't think it knows about or will use an Airflow connection.

@eladkal eladkal added the provider:amazon-aws AWS/Amazon - related issues label Nov 6, 2021
@HybridNeos

I am also having this problem on Apache Airflow 2.1.4 with Amazon Linux 2 as the OS. Thank you, maxcountryman, for outlining the problem; however, a fix would be appreciated.

vsimon added a commit to vsimon/airflow that referenced this issue Dec 3, 2021
)

When the log group arn contains slashes, the urlparse function parses
the group name as part of the 'path' instead of including it as part of
the 'netloc'. This fix concatenates both the 'netloc' and 'path' fields
together to use as the group arn. For group names without slashes, the
functionality remains the same as the group name is still parsed as part
of the 'netloc' field and the 'path' field will be empty.
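
The approach described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the commit itself: the helper name log_group_arn_from_url is hypothetical, and the second URL (a group name without slashes) is an assumed example added for comparison.

```python
from urllib.parse import urlparse


def log_group_arn_from_url(remote_base_log_folder: str) -> str:
    """Hypothetical helper: rebuild the log group ARN from a cloudwatch:// URL."""
    parts = urlparse(remote_base_log_folder)
    # A slash in the group name pushes the tail of the ARN into `path`,
    # so join both fields. `path` is empty when the name has no slash.
    return parts.netloc + parts.path


# Group name with slashes (the failing case from this issue).
with_slashes = log_group_arn_from_url(
    "cloudwatch://arn:aws:logs:us-west-2:123:log-group:/ecs/my-app:*"
)
# Group name without slashes (assumed example; behavior is unchanged).
without_slashes = log_group_arn_from_url(
    "cloudwatch://arn:aws:logs:us-west-2:123:log-group:my-app:*"
)

print(with_slashes)
print(without_slashes)
```

In both cases the result is the full ARN that followed the cloudwatch:// prefix.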
potiuk pushed a commit that referenced this issue Dec 19, 2021
…19700)

@potiuk
Member

potiuk commented Mar 6, 2022

> I am also having this problem on Apache Airflow 2.1.4 with Amazon Linux 2 as the OS. Thank you, maxcountryman, for outlining the problem; however, a fix would be appreciated.

Indeed. Anyone providing a fix would be great.

@Taragolis
Contributor

@potiuk @eladkal I think this issue is fixed by #19700 (Airflow 2.3.0).

@potiuk
Member

potiuk commented Sep 20, 2022

@maxcountryman - can you please verify it? I am closing it provisionally (thanks @Taragolis).

@potiuk potiuk closed this as completed Sep 20, 2022
@potiuk potiuk added this to the Airflow 2.3.0 milestone Sep 20, 2022
7 participants
@maxcountryman @potiuk @Taragolis @HybridNeos @vikramkoka @eladkal and others