Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting non standard package sources #56

Open
ericlchen1 opened this issue Jun 15, 2024 · 6 comments · May be fixed by #64
Open

Supporting non standard package sources #56

ericlchen1 opened this issue Jun 15, 2024 · 6 comments · May be fixed by #64
Labels
feature request New feature or request help wanted Extra attention is needed

Comments

@ericlchen1
Copy link
Contributor

ericlchen1 commented Jun 15, 2024

Hey, I ran into some issues when trying to use the nvidia repo as a source url https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64. This is due to the way urls are parsed in rules_distroless and are used to find the Packages file. Also there doesn't seem to be much support when the packages are not multiarch. In this case, the nvidia packages only support amd64 or i386.

Currently I've put together a pretty nonstandard solution with a patch to work around this which supports my use case. However, I was wondering if this would be supported in the future. Thanks!

rules_distroless.patch

diff --git a/apt/private/package_index.bzl b/apt/private/package_index.bzl
index 620a021..c441d4d 100644
--- a/apt/private/package_index.bzl
+++ b/apt/private/package_index.bzl
@@ -2,19 +2,24 @@
 
 load(":util.bzl", "util")
 
-def _fetch_package_index(rctx, url, dist, comp, arch, integrity):
+def _fetch_package_index(rctx, url, dist, comp, arch, integrity, override_url = None):
     target_triple = "{dist}/{comp}/{arch}".format(dist = dist, comp = comp, arch = arch)
     output = "{}/Packages.xz".format(target_triple)
+    url = "{}/dists/{}/{}/binary-{}/Packages.xz".format(url, dist, comp, arch)
+    if override_url:
+        url = override_url
+        output = "{}/Packages".format(target_triple)
     r = rctx.download(
-        url = "{}/dists/{}/{}/binary-{}/Packages.xz".format(url, dist, comp, arch),
+        url = url,
         output = output,
         integrity = integrity,
     )
-    rctx.execute([
-        "xz",
-        "--decompress",
-        output,
-    ])
+    if not override_url:
+        rctx.execute([
+            "xz",
+            "--decompress",
+            output,
+        ])
     return ("{}/Packages".format(target_triple), r.integrity)
 
 def _parse_package_index(state, contents, arch, root):
@@ -59,9 +64,11 @@ def _create(rctx, sources, archs):
     )
 
     for arch in archs:
-        for (url, dist, comp) in sources:
+        for (url, dist, comp, override_url, override_arch) in sources:
+            if override_arch and override_arch != arch:
+                continue
             rctx.report_progress("Fetching package index: {}/{}".format(dist, arch))
-            (output, _) = _fetch_package_index(rctx, url, dist, comp, arch, "")
+            (output, _) = _fetch_package_index(rctx, url, dist, comp, arch, "", override_url)
 
             # TODO: this is expensive to perform.
             rctx.report_progress("Parsing package index: {}/{}".format(dist, arch))
diff --git a/apt/private/resolve.bzl b/apt/private/resolve.bzl
index ea443d2..b14b4ff 100644
--- a/apt/private/resolve.bzl
+++ b/apt/private/resolve.bzl
@@ -66,9 +66,11 @@ def _deb_resolve_impl(rctx):
         if not dist:
             dist = distr
         sources.append((
-            src["url"],
+            src.get("url"),
             distr,
             comp,
+            src.get("override_url"),
+            src.get("override_arch"),
         ))
 
     pkgindex = package_index.new(rctx, sources = sources, archs = manifest["archs"])
@@ -76,7 +78,11 @@ def _deb_resolve_impl(rctx):
     lockf = lockfile.empty(rctx)
 
     for arch in manifest["archs"]:
-        for dep_constraint in manifest["packages"]:
+        packages = manifest["packages"]
+        arch_packages = manifest.get("arch_packages", {})
+        if arch in arch_packages:
+            packages = packages + arch_packages[arch]
+        for dep_constraint in packages:
             constraint = package_resolution.parse_depends(dep_constraint).pop()
 
             rctx.report_progress("Resolving %s" % dep_constraint)

diff --git a/apt/private/package_index.bzl b/apt/private/package_index.bzl
index c441d4d..c344442 100644
--- a/apt/private/package_index.bzl
+++ b/apt/private/package_index.bzl
@@ -42,6 +42,11 @@ def _parse_package_index(state, contents, arch, root):
 
         if len(pkg.keys()) != 0:
             pkg["Root"] = root
+            if "Filename" in pkg:
+                pkg["Filename"] = pkg["Filename"].strip("./")
+            if "Architecture" in pkg and pkg["Architecture"] != "all" and pkg["Architecture"] != arch:
+                pkg = {}
+                continue
             util.set_dict(state.packages, value = pkg, keys = (arch, pkg["Package"], pkg["Version"]))
             last_key = ""
             pkg = {}
@@ -72,7 +74,7 @@ def _create(rctx, sources, archs):
 
             # TODO: this is expensive to perform.
             rctx.report_progress("Parsing package index: {}/{}".format(dist, arch))
-            _parse_package_index(state, rctx.read(output), arch, url)
+            _parse_package_index(state, rctx.read(output), arch, override_url.split("/Packages")[0] if override_url else url)
 
     return struct(
         package_versions = lambda **kwargs: _package_versions(state, **kwargs),

debian12.yaml

version: 1
sources:
- channel: bookworm main
  url: https://snapshot-cloudflare.debian.org/archive/debian/20240612T080215Z
- channel: bookworm-security main
  url: https://snapshot-cloudflare.debian.org/archive/debian-security/20240612T080215Z
- channel: bookworm-updates main
  url: https://snapshot-cloudflare.debian.org/archive/debian/20240612T080215Z
- channel: nvidia main
  override_url: https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/Packages
  override_arch: amd64
archs:
- amd64
- arm64
...
bunch of packages here
...
arch_packages:
  amd64:
  - libcuda1
@ericlchen1
Copy link
Contributor Author

Seems to be related to #55 but the question of how architecture is handled is still there.

@ericlchen1 ericlchen1 linked a pull request Jul 11, 2024 that will close this issue
@thesayyn thesayyn added feature request New feature or request help wanted Extra attention is needed labels Jul 18, 2024
@thesayyn
Copy link
Collaborator

the question of how architecture is handled is still there.

My instinct is to do what apt does when you mix non-flat and flat repos. However, i don't have enough time to dig in myself.

@ericlchen1
Copy link
Contributor Author

ericlchen1 commented Jul 19, 2024

I'm a bit hesitant to remove arch_packages in the PR as you'd need to restrict your archs to a single arch if there's only one architecture available. It would throw a Error in fail: Unable to locate package error when running the lock instead of allowing it and having an error when trying to use the package in an invalid architecture. I'm not sure what the best way to handle this case is. The arch portion in sources is completely optional and probably could be refactored.

jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes [issue GoogleContainerTools#56]

Follow-up and credit to @alexconrey (PR#55), @ericlchen1 (PR#64) and
@benmccown (PR#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.

* [issue GoogleContainerTools#56]: GoogleContainerTools#56
* [PR#55]: https://github.com/GoogleContainerTools/rules_distroless/pr/55
* [PR#64]: https://github.com/GoogleContainerTools/rules_distroless/pr/64
* [PR#67]: https://github.com/GoogleContainerTools/rules_distroless/pr/67
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes [issue GoogleContainerTools#56]

Follow-up and credit to @alexconrey (pr/55), @ericlchen1 (pr/64) and
@benmccown (pr/67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (pr#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
@jjmaestro
Copy link
Contributor

the question of how architecture is handled is still there.

My instinct is to do what apt does when you mix non-flat and flat repos. However, i don't have enough time to dig in myself.

I think it's totally OK to mix them, IMHO the issue is more with the architectures supported by the repos, regardless of flat VS canonical. For example, you could have a canonical repo with amd64 and arm64 that has (pkg1, pkg2) in amd64 but e.g. only pkg1 in arm64 and you would have the same problem as mixing a canonical repo and a flat repo that have packages missing for certain architectures.

@jjmaestro
Copy link
Contributor

I'm a bit hesitant to remove arch_packages in the PR as you'd need to restrict your archs to a single arch if there's only one architecture available. It would throw a Error in fail: Unable to locate package error when running the lock instead of allowing it and having an error when trying to use the package in an invalid architecture. I'm not sure what the best way to handle this case is. The arch portion in sources is completely optional and probably could be refactored.

As I mentioned in #64 (comment) I think this issue and @thesayyn concerns about architectures are "better solved" (at least for now) by separating the problematic packages and repos into their own manifest.

Also, this is (at least to me) the best way to do it because a manifest is "just the way to get the packages you want from the sources that have them" so if there are problematic packages-arch pairs, they should have their own manifest.

What do you think? Does #86 look like it solves all these concerns? :-?

@ericlchen1
Copy link
Contributor Author

As I mentioned in #64 (comment) I think this issue and @thesayyn concerns about architectures are "better solved" (at least for now) by separating the problematic packages and repos into their own manifest.

Also, this is (at least to me) the best way to do it because a manifest is "just the way to get the packages you want from the sources that have them" so if there are problematic packages-arch pairs, they should have their own manifest.

What do you think? Does #86 look like it solves all these concerns? :-?

Looks good and I agree with the architecture design with separate manifests :). I haven't had the time recently to work on this. It's great that someone can follow up and hopefully push this through. Happy to close this when #86 goes through.

jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 13, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 19, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 20, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Sep 22, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Oct 28, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Oct 28, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Oct 29, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Oct 29, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Nov 14, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro added a commit to jjmaestro/rules_distroless that referenced this issue Nov 14, 2024
Fixes issue GoogleContainerTools#56

Follow-up and credit to @alexconrey (PR GoogleContainerTools#55), @ericlchen1 (PR GoogleContainerTools#64) and
@benmccown (PR GoogleContainerTools#67) for their work on similar PRs that I've reviewed and
drawn some inspiration to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories, and
instead places meta index and indices directly into the archive root (or
some part below it)

Thus, the URL logic in _fetch_package_index() is incorrect for these
repos and it always fails to fetch the Package index.

Solution:

Just use the Debian sources.list convention in the 'sources' section of
the manifest to add canonical and flat repos. Depending on whether the
channel has one directory that ends in '/' or a (dist, component, ...)
structure the _fetch_package_index and other internal logic will
know whether the source is a canonical or a flat repo.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Disregarding the "mixing" of Ubuntu and Debian repos for the purpose of
the example, this manifest shows that you can mix canonical and flat
repos and you can mix multiarch and single-arch flat repos and canonical
repos.

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants