Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apps sc: add sub path and name suffix to rclone jobs, harbor restore support azure #2368

Merged
merged 1 commit into from
Jan 7, 2025

Conversation

viktor-f
Copy link
Contributor

Warning

This is a public repository, ensure not to disclose:

  • personal data beyond what is necessary for interacting with this pull request, nor
  • business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

  • kind/feature
  • kind/improvement
  • kind/deprecation
  • kind/documentation
  • kind/clean-up
  • kind/bug
  • kind/other

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

  • kind/admin-change
  • kind/dev-change
  • kind/security
  • kind/adr

What does this PR do / why do we need this PR?

Harbor in azure puts image data at <bucket-name>//docker/ instead of <bucket-name>/docker/. This causes issues both when restoring data to harbor in azure and when rclone is syncing from azure.

Restoring from s3 to azure:
When rcloning to azure from s3 rclone will by default put the data at the same folder as in s3 (<bucket-name>/docker/). So then we need to move it to the correct folder for azure. This is done with the new rclone move job in restore/harbor/.

Rclone syncing from azure:
When running rclone sync on a harbor bucket in azure, then rclone will by default ignore the image folder because of the // in the path. This can be fixed by running a rclone sync job that specifically targets <bucket-name>//docker in the source. Then send it to <bucket-name>//docker/ if the destination is azure and <bucket-name>/docker/ if the destination is s3 (note that sending to //docker in s3 will translate into /docker, but in the next sync rclone will not find anything in //docker so it will send all files again). However, the default rclone sync job for harbor will then delete everything in the docker folder at the destination (because it is skipping that folder at the source). So we need to disable the default sync jobs and instead create a second custom job for harbor that is syncing /backups as well (the only other folder in the bucket).

Information to reviewers

At time of writing this PR I still need to do some updates to the config schema, do some documentation updates, and add maybe add migration for running azure clusters. But the code should be complete.

I have tested the following things that are expected to work:

  • Syncing from s3 to azure and then using the restore script to move the folders and see that harbor works as expected.
  • Syncing from azure to s3 with the custom sync jobs
  • Syncing from azure to s3 with encryption
  • Running restore job from encrypted backup using addTargetsFromSync
  • Maybe something more as well 🙂

Example of sync config from azure to s3:

objectStorage:
  sync:
    enabled: true
    destinationType: s3
    syncDefaultBuckets: false
    buckets:
      - source: viktor-harbor
        sourcePath: //docker
        destinationPath: /docker
        nameSuffix: docker
      - source: viktor-harbor
        sourcePath: /backups
        destinationPath: /backups
        nameSuffix: backups
    # config for the other default buckets need to be added here as well

Checklist

  • Proper commit message prefix on all commits
  • Change checks:
    • The change is transparent
    • The change is disruptive
    • The change requires no migration steps
    • The change requires migration steps
    • The change upgrades CRDs
    • The change updates the config and the schema
  • Documentation checks:
  • Metrics checks:
    • The metrics are still exposed and present in Grafana after the change
    • The metrics names didn't change (Grafana dashboards and Prometheus alerts are not affected)
    • The metrics names did change (Grafana dashboards and Prometheus alerts were fixed)
  • Logs checks:
    • The logs do not show any errors after the change
  • Pod Security Policy checks:
    • Any changed pod is covered by Pod Security Admission
    • Any changed pod is covered by Gatekeeper Pod Security Policies
    • The change does not cause any pods to be blocked by Pod Security Admission or Policies
  • Network Policy checks:
    • Any changed pod is covered by Network Policies
    • The change does not cause any dropped packets in the NetworkPolicy Dashboard
  • Audit checks:
    • The change does not cause any unnecessary Kubernetes audit events
    • The change requires changes to Kubernetes audit policy
  • Falco checks:
    • The change does not cause any alerts to be generated by Falco
  • Bug checks:
    • The bug fix is covered by regression tests

@viktor-f viktor-f self-assigned this Dec 12, 2024
@viktor-f viktor-f requested review from a team as code owners December 12, 2024 08:35
@viktor-f viktor-f added app/harbor Harbor - Container and Artefact Registry kind/improvement Improvement of existing features, e.g. code cleanup or optimizations. labels Dec 12, 2024
@OlleLarsson
Copy link
Contributor

OlleLarsson commented Dec 12, 2024

//docker/

Is this as intended or is it a bug in Harbor azure?

Copy link
Contributor

@Eliastisys Eliastisys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small nit but otherwise lgtm so far! (did not test anything)

config/sc-config.yaml Outdated Show resolved Hide resolved
@viktor-f
Copy link
Contributor Author

//docker/

Is this as intended or is it a bug in Harbor azure?

This is a known issue in upstream registry: distribution/distribution#1247
It looks like it is fixed there and will be part of registry 3.0. But I do not know when that will be release and after it is released it has to be incorporated into Harbor as well.

Copy link
Contributor

@aarnq aarnq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the config additions look good, please do add the examples and the additions to rclone restores targets schema.

Nice use of defaults for the name suffix, makes it easy to see where they are rendered from 👍

Also great job testing with all those variations! 😅

Copy link
Contributor

@Xartos Xartos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good in general, just some minor nits

config/schemas/config.yaml Outdated Show resolved Hide resolved
restore/harbor/README.md Outdated Show resolved Hide resolved
@viktor-f viktor-f requested a review from a team as a code owner December 18, 2024 15:02
@viktor-f
Copy link
Contributor Author

Added some more things in the schema now.
I thought about adding some migration script to add buckets instead of default buckets if you are using sync and harbor on azure, but I don't think it is worth the time so I ignored it. I hope you are ok with that.
So I think this is done now, let me know if there is something else that you think I missed.

Copy link
Contributor

@Elias-elastisys Elias-elastisys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, could add a migration comment about manually adding the buckets but schema will complain anyways so would be hard to miss.

@viktor-f viktor-f force-pushed the vf/harbor-azure-rclone branch from 146e0ca to fff94cc Compare December 18, 2024 16:54
@viktor-f viktor-f requested a review from a team as a code owner December 18, 2024 16:54
@viktor-f
Copy link
Contributor Author

Lgtm, could add a migration comment about manually adding the buckets but schema will complain anyways so would be hard to miss.

Thanks for the suggestion. Added a text in the migration document now as well.

Copy link
Contributor

@aarnq aarnq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

helmfile.d/values/rclone/sync.yaml.gotmpl Outdated Show resolved Hide resolved
config/sc-config.yaml Outdated Show resolved Hide resolved
@OlleLarsson OlleLarsson self-requested a review December 19, 2024 08:40
@OlleLarsson OlleLarsson dismissed their stale review December 19, 2024 08:41

Requested changes have been implemented

@viktor-f viktor-f force-pushed the vf/harbor-azure-rclone branch from 3d1f0e9 to 587a2af Compare January 7, 2025 07:34
Added new config option for custom sync and restore to set path
Added new validation to prevent default bucket sync on azure
Added new restore part for harbor on azure
@viktor-f viktor-f force-pushed the vf/harbor-azure-rclone branch from 587a2af to 49d22c6 Compare January 7, 2025 07:35
@viktor-f viktor-f merged commit c8783f3 into main Jan 7, 2025
12 checks passed
@viktor-f viktor-f deleted the vf/harbor-azure-rclone branch January 7, 2025 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app/harbor Harbor - Container and Artefact Registry kind/improvement Improvement of existing features, e.g. code cleanup or optimizations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[2] Restoring Harbor from S3 to Azure fails
6 participants