
Provide more ways to publish Binary Artifacts #40760

Open
laozhoubuluo opened this issue Mar 23, 2024 · 15 comments

Comments

@laozhoubuluo

Describe the enhancement requested

Currently, apache.jfrog.io is the only release channel for the various Binary Artifacts of Apache Arrow. If apache.jfrog.io stops serving, there is no other channel from which to obtain them. However, apache.jfrog.io does not provide stable service. The following issues record service outages in recent years:

#12686
#34675
#40744
#40759

This problem has occurred many times and has seriously affected downstream software that relies on Apache Arrow. Could we consider adding at least one other release channel for the Binary Artifacts of Apache Arrow, such as GitHub Releases or downloads.apache.org, so that apache.jfrog.io is no longer the only release channel? The installation instructions could also describe alternatives for obtaining Binary Artifacts, even if they require more troublesome methods such as manual installation with dpkg.

Implementing this measure would reduce risk in the downstream software supply chain. It avoids the situation where a single component that cannot be installed prevents the entire software from being installed, or even from continuing to work normally.
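To illustrate what a second release channel would buy downstream users, here is a minimal sketch of a client that tries several channels in order. It is not part of Arrow's tooling: `fetch_with_fallback`, `fetch_url`, and the example URLs are hypothetical, and no alternative channel exists yet at the time of this request.

```python
import urllib.request
from typing import Callable, Iterable


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download a single URL and return the response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_fallback(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_url,
) -> bytes:
    """Try each release channel in order; return the first successful download."""
    errors = []
    for url in urls:
        try:
            return fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all release channels failed:\n" + "\n".join(errors))
```

A caller would pass the primary channel first and any mirrors after it, for example `fetch_with_fallback([primary_url, mirror_url])`, where both URLs are placeholders for wherever the same artifact ends up being published.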

Component(s)

Release

@assignUser
Member

assignUser commented Mar 23, 2024

+1
I talked about this with @jbonofre yesterday and he mentioned that while repository.apache.org currently only hosts java binaries the underlying nexus software can host a number of package repos (rpm, deb, python,...).

GitHub Releases could certainly also be a fallback for some things, but they don't offer the repo functionality we need for, at a minimum, the Linux packages.

cc @raulcd @kou

@jbonofre
Member

I think we have different options: nexus, dist, gh.

Let me investigate a bit what could be the easiest one to integrate in our build.

@kou
Member

kou commented Mar 24, 2024

> while repository.apache.org currently only hosts java binaries the underlying nexus software can host a number of package repos (rpm, deb, python,...).

Could you share a URL for documentation on how to use repository.apache.org for RPM/deb/wheel?

> I think we have different options: nexus, dist, gh.

I think that we can't use dist.apache.org because our binaries are too large for it.

I think that we can use GitHub Releases for some binaries (which don't require metadata for package repository) but we can't use GitHub Releases for others (which require metadata for package repository, e.g. RPM/deb/wheel).

@assignUser
Member

I assume we can use the API to upload binaries once a matching repository is created, but I haven't looked into it in detail or spoken with infra.

@kou
Member

kou commented Mar 25, 2024

Thanks.

It seems that we can use deb with Nexus Repository Manager 3 or later: https://help.sonatype.com/en/repository-manager-feature-matrix.html

It seems that https://repository.apache.org/ uses Nexus Repository Manager 2:

Nexus Repository Manager 2.14.20-02

Could you ask INFRA whether there is a plan to upgrade repository.apache.org or not?

@kou
Member

kou commented Mar 25, 2024

> I think that we can use GitHub Releases for some binaries (which don't require metadata for package repository) but we can't use GitHub Releases for others (which require metadata for package repository, e.g. RPM/deb/wheel).

I was wrong. We can use GitHub Releases for wheel because we publish the voted wheels to https://pypi.org/.

@jbonofre
Member

No need to upgrade to Nexus 3: even with Nexus 2 we can upload any kind of file, via HTTPS for instance. No need to use the Nexus API. The Maven release plugin is "just" an HTTPS client (via Aether).

For instance, in Apache Karaf, I publish features XML, tar.gz, zip, etc.

Manually, it's possible to use mvn deploy:deploy-file providing the artifact type, etc.
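As a rough sketch of what such a manual upload looks like, the helper below assembles a `mvn deploy:deploy-file` command line for a non-JAR file. The `-Dfile`, `-DgroupId`, `-DartifactId`, `-Dversion`, `-Dpackaging`, `-DrepositoryId` and `-Durl` parameters belong to the Maven Deploy Plugin; the helper name `deploy_file_command` and every value passed to it (file name, artifact ID, version, repository ID, URL) are placeholders, not values used by Arrow or Karaf.

```python
import shlex
from typing import List


def deploy_file_command(
    file: str,
    group_id: str,
    artifact_id: str,
    version: str,
    packaging: str,
    repository_id: str,
    url: str,
) -> List[str]:
    """Build the argument list for `mvn deploy:deploy-file`."""
    return [
        "mvn",
        "deploy:deploy-file",
        f"-Dfile={file}",
        f"-DgroupId={group_id}",
        f"-DartifactId={artifact_id}",
        f"-Dversion={version}",
        f"-Dpackaging={packaging}",        # artifact type, e.g. tar.gz, zip, xml
        f"-DrepositoryId={repository_id}",  # must match a <server> id in settings.xml
        f"-Durl={url}",
    ]


# All values below are hypothetical placeholders.
cmd = deploy_file_command(
    file="example-binary-0.0.0.tar.gz",
    group_id="org.apache.arrow",
    artifact_id="example-binary",
    version="0.0.0",
    packaging="tar.gz",
    repository_id="example-repo-id",
    url="https://repository.example.invalid/deploy",
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`; printing it with `shlex.join` is only a way to inspect the command before running it.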

@kou
Member

kou commented Mar 25, 2024

Could you tell us which files are uploaded to repository.apache.org?
It seems that files listed in https://karaf.apache.org/download.html use dist.apache.org not repository.apache.org.

@jbonofre
Member

At Karaf (like most other Apache projects) we are using both:

We do almost the same in Arrow: the source distributions are on dist (https://dist.apache.org/repos/dist/release/arrow/).

By the way, as dist.apache.org artifacts are automatically copied to archives.apache.org, dist.apache.org should contain only the latest releases (for instance, 15.0.0, 15.0.1, etc. should be deleted from dist.apache.org).

@raulcd
Member

raulcd commented Mar 25, 2024

Yes, sorry, I have to run the remove artifacts task from the post release tasks. I'll do it today.

@raulcd
Member

raulcd commented Mar 25, 2024

I've removed the old releases from dist.apache.org.

@jbonofre
Member

@raulcd Thanks! Much appreciated! And no worries at all 😄

@kou
Member

kou commented Mar 28, 2024

Thanks.

If we use repository.apache.org for .deb/.rpm, we use https://repo1.maven.org/maven2/org/apache/arrow/debian/ and so on for APT/Yum repositories, right? Hmm. Can we use apache.org domain instead of maven.org domain?

FYI: Our upload script for Java: https://github.com/apache/arrow/blob/main/dev/release/06-java-upload.sh
It uses mvn deploy:deploy-file.

@jbonofre
Member

maven.org is an alias to https://repository.apache.org/content/groups/public/ so yeah, we can use repository.apache.org name.
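For reference, the URLs discussed here follow the standard Maven 2 repository layout, where the group ID becomes a directory path. The sketch below shows where a single file would land under a given base URL. The function name `maven_artifact_url` and the artifact ID `example-deb` are hypothetical, and the sketch says nothing about whether APT or Yum metadata can be served from such a layout.

```python
from typing import Optional


def maven_artifact_url(
    base_url: str,
    group_id: str,
    artifact_id: str,
    version: str,
    extension: str,
    classifier: Optional[str] = None,
) -> str:
    """Return the URL of one artifact in a Maven 2 layout repository."""
    group_path = group_id.replace(".", "/")  # org.apache.arrow -> org/apache/arrow
    file_name = f"{artifact_id}-{version}"
    if classifier:
        file_name += f"-{classifier}"
    file_name += f".{extension}"
    return f"{base_url.rstrip('/')}/{group_path}/{artifact_id}/{version}/{file_name}"


# The artifact ID and version are placeholders for illustration only.
url = maven_artifact_url(
    "https://repository.apache.org/content/groups/public/",
    group_id="org.apache.arrow",
    artifact_id="example-deb",
    version="1.2.3",
    extension="deb",
)
print(url)
```

Swapping the base URL is the only change needed to point the same coordinates at a different host name.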

@kou
Member

kou commented Nov 3, 2024

Thanks.

OK. Let's use https://repository.apache.org/content/groups/public/org/apache/ instead of https://apache.jfrog.io/ui/native/arrow/ for deb, RPM, C++ binaries for R and mirror of dependencies such as https://apache.jfrog.io/ui/native/arrow/boost/ . Let's use GitHub Releases for sdist, wheel and NuGet package.
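The split proposed above can be written down as a small lookup table, which may be handy as a checklist when updating the release scripts. This is only a restatement of the comment, not an existing Arrow configuration; the kind labels such as `r-cpp-binary` and `dependency-mirror` and the helper `channel_for` are made up for illustration.

```python
NEXUS = "https://repository.apache.org/content/groups/public/org/apache/"
GITHUB_RELEASES = "GitHub Releases"

# Artifact kind -> proposed release channel (labels are hypothetical).
CHANNEL_BY_KIND = {
    "deb": NEXUS,
    "rpm": NEXUS,
    "r-cpp-binary": NEXUS,       # C++ binaries for R
    "dependency-mirror": NEXUS,  # e.g. the boost mirror
    "sdist": GITHUB_RELEASES,
    "wheel": GITHUB_RELEASES,
    "nuget": GITHUB_RELEASES,
}


def channel_for(kind: str) -> str:
    """Look up the proposed channel for an artifact kind (case-insensitive)."""
    try:
        return CHANNEL_BY_KIND[kind.lower()]
    except KeyError:
        raise ValueError(f"no channel decided for artifact kind: {kind!r}") from None
```

Unknown kinds raise `ValueError` instead of falling back to a default, so an artifact type that the discussion has not covered cannot be routed silently.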
