Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed replacement of - to . in sdist package name #37410

Merged
merged 1 commit into from
Feb 14, 2024

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Feb 14, 2024

Follow up after #37406 - we could not test this change as there was no way to test canary builds but as the capability has been added in #37408 we can fix the failure and actually test it in a "canary" PR.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

Follow up after #37406 - we could not test this change as there
was no way to test canary builds but as the capability has been
added in #37408 we can fix the failure and actually test it in
a "canary" PR.
@potiuk potiuk added the canary When set on PR running from apache repo - behave as canary run label Feb 14, 2024
Copy link
Contributor

@Taragolis Taragolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it is another part which could be simplified by packaging.utils ... probably one day.

@potiuk
Copy link
Member Author

potiuk commented Feb 14, 2024

Looks like it is another part which could be simplified by packaging.utils ... probably one day.

A bit more complex - because we are using provider.id as identifier in generated provider_dependencies.json, so packaging_utils would not help :).

@potiuk potiuk merged commit 6667711 into main Feb 14, 2024
80 checks passed
@Taragolis
Copy link
Contributor

Taragolis commented Feb 14, 2024

A bit more complex - because we are using provider.id as identifier in generated provider_dependencies.json

I can't see a problem for generate valid canonical name from provider_id and build mapping between them (in globals?)
We pretty often use provider_id and after that convert it into the package name and vice versa.

import json
from pathlib import Path
from packaging.utils import canonicalize_name, parse_wheel_filename, parse_sdist_filename


PROJECT_DIR = Path("").resolve(). # Some path here
provider_dependencies = PROJECT_DIR / "generated" / "provider_dependencies.json"
assert provider_dependencies.exists()


provider_dependencies_json = json.loads(provider_dependencies.read_text())
mapping = {
    pi: canonicalize_name(f"apache-airflow-providers-{pi}") for pi in provider_dependencies_json
}
inverted_mapping = {v: k for k, v in mapping.items()}
print(json.dumps(mapping, indent=2))

# Samples
sdist_files = [
    "apache-airflow-providers-airbyte-3.2.1.tar.gz",
    "apache_airflow_providers_airbyte-3.6.0.tar.gz"
]

wheel_files = [
    "apache_airflow_providers_airbyte-3.6.0-py3-none-any.whl",
    "apache-airflow-providers-airbyte-42.0.0-py3-none-any.whl",
    # Invalid name
    "apache_airflow_providers_airbyte-42.0.0.whl",
]

for sdist in sdist_files:
    package, _ = parse_sdist_filename(sdist)
    provider_id = inverted_mapping[package]
    print(f"Provider ID: {provider_id}, sdist: {sdist}")

for wheel in wheel_files:
    package, *_ = parse_wheel_filename(wheel)
    provider_id = inverted_mapping[package]
    print(f"Provider ID: {provider_id}, sdist: {wheel}")

@potiuk
Copy link
Member Author

potiuk commented Feb 14, 2024

A bit more complex - because we are using provider.id as identifier in generated provider_dependencies.json

I can't see a problem for generate valid canonical name from provider_id and build mapping between them (in globals?) We pretty often use provider_id and after that convert it into the package name and vice versa.

import json
from pathlib import Path
from packaging.utils import canonicalize_name, parse_wheel_filename, parse_sdist_filename


PROJECT_DIR = Path("").resolve(). # Some path here
provider_dependencies = PROJECT_DIR / "generated" / "provider_dependencies.json"
assert provider_dependencies.exists()


provider_dependencies_json = json.loads(provider_dependencies.read_text())
mapping = {
    pi: canonicalize_name(f"apache-airflow-providers-{pi}") for pi in provider_dependencies_json
}
inverted_mapping = {v: k for k, v in mapping.items()}
print(json.dumps(mapping, indent=2))

# Samples
sdist_files = [
    "apache-airflow-providers-airbyte-3.2.1.tar.gz",
    "apache_airflow_providers_airbyte-3.6.0.tar.gz"
]

wheel_files = [
    "apache_airflow_providers_airbyte-3.6.0-py3-none-any.whl",
    "apache-airflow-providers-airbyte-42.0.0-py3-none-any.whl",
    # Invalid name
    "apache_airflow_providers_airbyte-42.0.0.whl",
]

for sdist in sdist_files:
    package, _ = parse_sdist_filename(sdist)
    provider_id = inverted_mapping[package]
    print(f"Provider ID: {provider_id}, sdist: {sdist}")

for wheel in wheel_files:
    package, *_ = parse_wheel_filename(wheel)
    provider_id = inverted_mapping[package]
    print(f"Provider ID: {provider_id}, sdist: {wheel}")

Ah ok yes, could be :) . Feel free :)

@Taragolis Taragolis deleted the fix_replacement_of_provider_package branch April 3, 2024 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools canary When set on PR running from apache repo - behave as canary run
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants