Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flat repository support for debian images #64

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ericlchen1
Copy link
Contributor

@ericlchen1 ericlchen1 commented Jul 11, 2024

Fixes #56 and followup to #55

Problem:

When working with "flat repository format" external repositories, the assumption made by rules_distroless is to use the known repository hierarchy within dists directory. See: https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format

In a flat repository, everything is "flat" within the top-level of the repository and as such - the {}/dists/{}/{}/binary-{}/Packages.{}".format(url, dist, comp, arch, file_type) logic will always fail to find the Package indices.

Example: https://cloud.r-project.org/bin/linux/debian/bullseye-cran40/ or https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ which are both flat repositories with arch specific packages (amd64/i386)

Solution:

Manifest:
sources: add flat_repository and arch keys

  • flat_repository: default false, boolean that represents if source is a flat_repository
  • arch: lock repository to a specific architecture. Other architectures will not be considered for lockfile if this is set.

Add arch_specific_packages section with format

arch_specific_packages:
  architecture:
    - "package1"
    - "package2"

@ericlchen1 ericlchen1 marked this pull request as ready for review July 11, 2024 07:03
@ericlchen1 ericlchen1 force-pushed the ericchen/flat-repository-support branch from 3c61e32 to d5a5461 Compare July 17, 2024 14:19
@jjmaestro
Copy link
Contributor

@ericlchen1 I think this PR conflates / mixes two issues: support for flat repositories (which #55 addresses) plus then having single-architecture sources mixed with multi-architecture sources.

The "single-architecture sources" forces the hacks in the YAML manifest to mark one of the sources as single-architecture and the arch_specific_packages section.

IMHO this is unnecessary because you can have a separate manifest for those repos. For example, in this specific case, just split the bullseye bullseye-cran40 channel into its own manifest, with arch: amd64 as the only architecture in it and just one package: r-doc-html.

I think that's cleaner, easier to understand and should be much better overall since you won't be trying to resolve packages that (I think) won't exist in the majority of the sources given.

Comment on lines +68 to +69
if "Filename" in pkg:
pkg["Filename"] = pkg["Filename"].strip("./")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in #67 (comment), this is not what the spec says, check https://wiki.debian.org/DebianRepository/Format#Filename:

Filename

The mandatory Filename field shall list the path of the package archive relative to the base directory of the repository. The path should be in canonical form, that is, without any components denoting the current or parent directory ("." or ".."). It also should not make use of any protocol-specific components, such as URL-encoded parameters.

I've checked the R-Project flat repos and this doesn't happen / is not needed. However, this is something that happens in the NVIDIA CUDA flat repos . Thus, I think that this it's a quirk of some specific flat repos that don't conform with the Debian repo spec.

jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (pr#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
@jjmaestro
Copy link
Contributor

OK, so I finally did try to write a PR that adds support for multi-arch and single-arch flat repos and addresses the issue with the NVIDIA CUDA flat repos. Hopefully that PR superseeds the others and can be merged and 1 + 3 PRs closed 😄

@ericlchen1 please have a look and let me know what you think and if #86 works for you. Happy to address any issues or concerns if there's something that breaks in your use-case!

jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 20, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Sep 22, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Oct 28, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Oct 28, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Oct 29, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Oct 29, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Nov 14, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this pull request Nov 14, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting non standard package sources
2 participants