
Garbage Collector on the ci-jenkins-io-artifacts S3 storage account #3663

Closed
Tracked by #3662
lemeurherve opened this issue Jul 12, 2023 · 7 comments

@lemeurherve
Member

Service(s)

AWS, ci.jenkins.io

Summary

Follow-up of #3643: we need to put a garbage collector in place to delete old artifacts and stashes from the S3 storage account used by ci.jenkins.io agents.

Note: the associated costs are low (less than $1/month), so this issue isn't urgent.
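
A low-maintenance way to implement such a garbage collector would be an S3 lifecycle rule that expires objects after a retention period, so no external cron job is needed. A minimal sketch with boto3, where the bucket name and the 30-day retention are illustrative assumptions, not the real settings:

```python
# Sketch: garbage-collect old artifacts/stashes with an S3 lifecycle rule.
# The bucket name and 30-day retention are illustrative assumptions.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="ci-jenkins-io-artifacts",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-artifacts",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # match every object in the bucket
                "Expiration": {"Days": 30},  # delete objects 30 days after creation
                # Also reclaim storage held by failed multipart uploads.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    },
)
```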

Reproduction steps

No response

@dduportal
Contributor

As per #3954, this issue is no longer needed: the S3 bucket was deleted.

@dduportal closed this as not planned on May 24, 2024
@dduportal removed this from the infra-team-sync-next milestone on May 24, 2024
@jglick

jglick commented May 24, 2024

So where are artifacts & stashes being sent now; and is there a cleanup policy on that bucket?

@dduportal
Contributor

> So where are artifacts & stashes being sent now; and is there a cleanup policy on that bucket?

Back to the plain old data disk (i.e. in the Jenkins home of ci.jenkins.io). Since we moved all agents to Azure and the controller is also in Azure, performance is way better!

As such, the cleanup policy is tied to the build retention policy.
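
In other words, when the build discarder deletes an old build, the artifacts and stashes stored with it under JENKINS_HOME go away as well. Purely as an illustration, the same retention idea can be driven externally through Jenkins' JSON API; a hedged sketch, with the controller URL, job name, credentials, and 30-day cutoff all assumed:

```python
# Hedged sketch: enforce the retention idea externally via Jenkins' JSON API.
# In practice Jenkins does this internally (build discarder); the URL, job
# name, credentials, and 30-day cutoff here are illustrative assumptions.
import time

import requests

JENKINS = "https://ci.jenkins.io"  # illustrative controller URL
JOB = "some-job"                   # hypothetical job name
AUTH = ("user", "api-token")       # hypothetical user + API token
CUTOFF_MS = (time.time() - 30 * 86400) * 1000  # Jenkins timestamps are epoch millis

builds = requests.get(
    f"{JENKINS}/job/{JOB}/api/json",
    params={"tree": "builds[number,timestamp]"},
    auth=AUTH,
).json()["builds"]

for build in builds:
    if build["timestamp"] < CUTOFF_MS:
        # Deleting a build also deletes its archived artifacts in JENKINS_HOME.
        requests.post(f"{JENKINS}/job/{JOB}/{build['number']}/doDelete", auth=AUTH)
```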

@jglick

jglick commented May 24, 2024

> In the Jenkins home

😱

@dduportal
Contributor

> In the Jenkins home
>
> 😱

It represents around 160 GB (thanks to build cleanups), which is not that much.
Besides, the Jenkins home is on a local NVMe disk at 25,000 IOPS, so there is no risk even for ci.jenkins.io (spoiler: we don't pay for it until the end of August).

@jglick

jglick commented May 28, 2024

That is not the only consideration; if you do not use an artifact manager, stashes and artifacts must be streamed over the Remoting channel, putting CPU and I/O load on the controller.
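
For context, the artifact-manager-s3 plugin avoids that load by handing agents presigned URLs, so the artifact bytes go straight to the bucket instead of through the controller. A minimal sketch of the idea (not the plugin's actual code), with illustrative bucket and key names:

```python
# Sketch of why an artifact manager offloads the controller: the controller
# only issues a short-lived presigned URL, and the artifact bytes then flow
# agent -> S3 instead of over the Remoting channel. Bucket and key names are
# illustrative, not the artifact-manager-s3 plugin's actual layout.
import boto3
import requests

s3 = boto3.client("s3")

# Controller side: authorize exactly one upload without proxying any data.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "ci-jenkins-io-artifacts", "Key": "job/123/artifact.zip"},
    ExpiresIn=600,  # URL validity in seconds
)

# Agent side: stream the file straight to S3 using the presigned URL.
with open("artifact.zip", "rb") as f:
    requests.put(url, data=f).raise_for_status()
```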

@dduportal
Contributor

> That is not the only consideration; if you do not use an artifact manager, stashes and artifacts must be streamed over the Remoting channel, putting CPU and I/O load on the controller.

Absolutely. The VM hosting ci.jenkins.io was sized for this case (the default setup, with artifacts in the JENKINS_HOME) and we never shrank it after switching from the default to the S3 artifact manager.
To be transparent, the VM is huge and oversized for the usage: 8 vCPUs, 32 GB memory.

We carefully followed the metrics after the switch, particularly given the huge number of BOM builds in the days following the change (18-19 May).

The result is that we don't see any additional load, except possibly at peaks (but even checking a 15-minute window during a BOM build with a lot of stash/unstash shows no additional load). I believe the oversized VM comfortably covers the usage from these builds.

Of course, we can afford this for ci.jenkins.io because the VM is running on credits until the end of the year (Azure credits until August, then AWS credits until December). We'll have to revisit this once it is migrated to a paid account, but it is a lower-priority optimization for now: we'll add the S3 artifact manager back when we move it back to AWS.

[Three screenshots of ci.jenkins.io controller metrics, captured 2024-05-29]
