-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-44619][INFRA] Free up disk space for container jobs #42253
Conversation
.github/workflows/build_and_test.yml
Outdated
rm -rf /__t/CodeQL || echo "fail to delete /__t/CodeQL" | ||
rm -rf /__t/go || echo "fail to delete /__t/go" | ||
rm -rf /__t/node || echo "fail to delete /__t/node" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleting those directories in Dockerfile
takes no effect ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
88M /__e/node12
55M /__e/node12_alpine
101M /__e/node16
89M /__e/node16_alpine
8.2G /__t/CodeQL
16K /__t/Java_Temurin-Hotspot_jdk
487M /__t/PyPy
1.2G /__t/Python
62M /__t/Ruby
1.2G /__t/go
379M /__t/node
16K /__w/_PipelineMapping
26M /__w/_actions
68K /__w/_temp
681M /__w/spark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can not find an official way to uninstall CodeQL (like pip/apt-get/etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @Yikun do you happen to know can we remove /__t/CodeQL
in this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I don't find a good way to uninstall CodeQL/Go/Node
also cc @LuciferYang @HyukjinKwon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According the link:
https://github.com/actions/runner-images/blob/main/images/linux/Ubuntu2204-Readme.md#installed-apt-packages
seems only above package can be unintall by apt, for other one, we can only cleanup by this way, so this PR seems a okay way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is CodeQL
installed by the Ubuntu 22.04
runner?
I am surprised that I don't find it in a non-container job (in #42241)
0874cf4
to
9b6a16f
Compare
dev/infra/Dockerfile
Outdated
RUN apt-get autoremove --purge -y | ||
RUN apt-get clean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run
Official Debian and Ubuntu images [automatically run apt-get clean](https://github.com/moby/moby/blob/03e2923e42446dbb830c654d0eec323a0b4ef02a/contrib/mkimage/debootstrap#L82-L105), so explicit invocation is not required.
It already be removed. But add rm -rf /var/lib/apt/lists/*
to the end of apt install line might help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this PR aims to uninstall pkgs, I would like to send another PR to following this rm -rf /var/lib/apt/lists/*
guide
let me take another look to check whether it is safe to uninstall CodeQL and what is the proper way to do |
Since the CI passes fine for the time being, we can just drop this too for now if this takes too much time to investigate. |
nice! |
6b3a9d1
to
a175f6b
Compare
let me focus on removing the tool directories first and skip the changes in dockerfile |
a175f6b
to
e66d1f3
Compare
60e5fa7
to
bbc96b4
Compare
00276d4
to
01763c2
Compare
Different from the non-container jobs, there are not many unneeded apt libraries in container jobs. |
cc @HyukjinKwon @Yikun @LuciferYang would you mind taking another look? |
thanks, merged to master |
### What changes were proposed in this pull request? follow the [Best practices for writing Dockerfiles](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#apt-get) : > Always combine RUN apt-get update with apt-get install in the same RUN statement. ### Why are the changes needed? 1, to address #42253 (comment) 2, when I attempted to change the apt-get install in #41918, the behavior was confusing. By following the best practices, further changes should work immediately. ### Does this PR introduce _any_ user-facing change? NO, dev-only ### How was this patch tested? CI ### Was this patch authored or co-authored using generative AI tooling? NO Closes #42842 from zhengruifeng/infra_docker_file_opt. Authored-by: Ruifeng Zheng <[email protected]> Signed-off-by: Ruifeng Zheng <[email protected]>
### What changes were proposed in this pull request? Free up disk space for container jobs ### Why are the changes needed? increase the available disk space before this PR ![image](https://github.com/apache/spark/assets/7322292/64230324-607b-4c1d-ac2d-84b9bcaab12a) after this PR ![image](https://github.com/apache/spark/assets/7322292/aafed2d6-5d26-4f7f-b020-1efe4f551a8f) ### Does this PR introduce _any_ user-facing change? No, infra-only ### How was this patch tested? updated CI Closes apache#42253 from zhengruifeng/infra_clean_container. Authored-by: Ruifeng Zheng <[email protected]> Signed-off-by: Ruifeng Zheng <[email protected]>
### What changes were proposed in this pull request? Free up disk space for container jobs ### Why are the changes needed? increase the available disk space before this PR ![image](https://github.com/apache/spark/assets/7322292/64230324-607b-4c1d-ac2d-84b9bcaab12a) after this PR ![image](https://github.com/apache/spark/assets/7322292/aafed2d6-5d26-4f7f-b020-1efe4f551a8f) ### Does this PR introduce _any_ user-facing change? No, infra-only ### How was this patch tested? updated CI Closes apache#42253 from zhengruifeng/infra_clean_container. Authored-by: Ruifeng Zheng <[email protected]> Signed-off-by: Ruifeng Zheng <[email protected]>
What changes were proposed in this pull request?
Free up disk space for container jobs
Why are the changes needed?
increase the available disk space
before this PR
after this PR
Does this PR introduce any user-facing change?
No, infra-only
How was this patch tested?
updated CI