Add dependencies on build image #8872

Closed
potiuk opened this issue May 14, 2020 · 7 comments · Fixed by #9231
Labels
area:production-image Production image improvements and fixes kind:feature Feature Requests

Comments

Member

potiuk commented May 14, 2020

Description

Add dependencies using the "on-build" (ONBUILD) feature of Docker build.

Use case / motivation

Some dependencies can be added easily using the on-build feature of Docker build.
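For readers unfamiliar with the feature, here is a minimal sketch of how ONBUILD triggers work in general. The base image, the `requirements.txt` file name, and the paths are assumptions chosen for illustration; this is not the Airflow Dockerfile.

```dockerfile
# Hypothetical base image Dockerfile (illustration only, not the Airflow one).
FROM python:3.11-slim

# ONBUILD instructions are not executed when this image is built.
# They are stored as triggers in the image metadata and run later,
# right after the FROM line of any Dockerfile that uses this image as its base.
ONBUILD COPY requirements.txt /tmp/requirements.txt
ONBUILD RUN pip install --no-cache-dir -r /tmp/requirements.txt
```

With such a base, a downstream Dockerfile containing only a `FROM` line pointing at it would still copy and install its own `requirements.txt`, provided that file exists in the downstream build context.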

@potiuk potiuk added kind:feature Feature Requests area:production-image Production image improvements and fixes labels May 14, 2020
Contributor

22quinn commented May 27, 2020

This could be a nice feature, but it looks like there was an effort to remove ONBUILD from official images - docker-library/official-images#2076

As the above issue mentioned, different users might want different customizations. Initially I wanted to wait for this feature, but I realize ONBUILD does not solve my problem, e.g. installing packages like docker-ce-cli from third-party repositories. Therefore, I would rather build my own image FROM apache/airflow.
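A sketch of what such an extended image might look like when installing docker-ce-cli from Docker's own apt repository. It assumes the base image is Debian-based and that `apache/airflow:image` stands in for a real tag; the repository setup follows Docker's published Debian instructions, and the exact prerequisite packages are an assumption.

```dockerfile
# "image" is a placeholder tag; substitute the tag you actually use.
FROM apache/airflow:image

# Package installation needs root.
USER root

# Assumption: the base image is Debian-based, so apt and /etc/os-release exist.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && install -m 0755 -d /etc/apt/keyrings \
 && curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc \
 && chmod a+r /etc/apt/keyrings/docker.asc \
 && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian $(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
      > /etc/apt/sources.list.d/docker.list \
 && apt-get update \
 && apt-get install -y --no-install-recommends docker-ce-cli \
 && rm -rf /var/lib/apt/lists/*

# Drop back to the unprivileged user the image normally runs as.
USER airflow
```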

Member Author

potiuk commented May 27, 2020

Yeah, I see the point. I also think that maybe we should not do ONBUILD if we can easily add build args to supplement that.


landier commented Oct 21, 2020

Hello,

Chiming in on this issue since I'm facing the very same situation. I want to extend the Airflow Dockerfile (or, more precisely, reduce it by getting rid of some contribs) and I'm looking for a way to do it:

  • without git cloning the repository, since that complicates my build a lot (and increases the build time a lot) for only one file...
  • without duplicating the Dockerfile by hand, because that is not future-proof and is error-prone if the copy is not updated correctly

Since I'm already using my own Dockerfile that extends the Airflow image, I'm super interested in having the official Airflow Dockerfile use ONBUILD actions for everything that is customizable via build args.

Am I missing something here?

Thank you.

Member Author

potiuk commented Oct 21, 2020

ONBUILD does not solve your problems. I wonder how you imagine it would solve them? We have no GCC/build-essential in the image anyway, so at the very best you can get exactly the same as with extending the image (unless you have a better idea). So none of the DEV actions are possible. And everything that you can do with ONBUILD can be done by extending the current image.

ONBUILD is a very controversial idea, and it does not serve any purpose that extending the Docker image would not serve just as well.

I.e. @landier, how is any ONBUILD different from:

FROM apache/airflow:image

USER root

# do whatever you'd like to do in ONBUILD

USER airflow 

?

I am actually thinking about something entirely different. You do not want to use just one file. If you look closer, there are some other files used by the image (scripts mainly). What I thought might be a better way is to set up automated copying (using copybara or a dedicated GitHub Actions workflow) of only what is needed to build the image into a separate repository, which would contain those scripts + the Dockerfile and nothing more.

This can happen automatically at every commit and then you'd have to only clone the other repository to build your custom image.

WDYT?


landier commented Oct 22, 2020

Hello,

First, thanks for the response.

We're currently extending the image exactly as in your example. However, we're thinking of getting rid of some packages that are already installed (contribs mainly). We could try to identify each and every package and make sure we uninstall it, but that sounds painful in the long run.

That is why I thought of ONBUILD as a way to set AIRFLOW_EXTRAS to only what we need. But you're right: with the multi-stage build, this should happen in the build stage and not in the final image stage.
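To illustrate the point about stages, here is a heavily simplified two-stage sketch. It is not the real Airflow Dockerfile: the base image, stage names, install paths, and the default extras value are assumptions. Only the `AIRFLOW_EXTRAS` build-arg name comes from the discussion.

```dockerfile
# Hypothetical two-stage layout (illustration only).
FROM python:3.11-slim AS build

# The set of extras is chosen here, at build time of the build stage.
ARG AIRFLOW_EXTRAS="postgres"

# Compilers live only in the build stage.
RUN apt-get update \
 && apt-get install -y --no-install-recommends build-essential \
 && rm -rf /var/lib/apt/lists/*

# Install Airflow with the selected extras into root's user site (/root/.local).
RUN pip install --no-cache-dir --user "apache-airflow[${AIRFLOW_EXTRAS}]"

FROM python:3.11-slim AS main

# The final stage only receives the already-installed packages.
COPY --from=build /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH
```

In a layout like this, anything that runs on top of the final image, whether an ONBUILD trigger or a plain `FROM` extension, starts after the extras have already been installed and has no compiler available, so it cannot change what the build stage selected.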

I get your point; it's actually quite similar to my thinking of having airflow in a submodule in order to use the official Dockerfile with my build-args. I'll think it over.

Member Author

potiuk commented Oct 22, 2020

Just to justify my line of thought a bit more:

I have successfully used subrepo (https://github.com/ingydotnet/git-subrepo) rather than submodule to sync the whole of Airflow into a customer project (and to easily contribute back upstream any changes we made there). I can definitely recommend subrepo over submodule for this. It is much nicer to work with: every time you sync, you end up with a direct commit in your target repo, and you have local changes that you might decide to keep for yourself or contribute back.

But I agree if you need only the Dockerfile + scripts, there is no need to clone the whole Airflow.

However, having a separate repo with only what is needed for the Dockerfile (but one-way published from the main Airflow repo) is actually much better than working directly in that repo. Currently, we use the Dockerfile to run tests in Airflow, but we also run tests in Airflow to test the Dockerfile, so the two are very closely coupled and you often have to make commits that change both Airflow and the Dockerfile at the same time. This would make an independently developed Dockerfile repo quite a nightmare to maintain.

That's why I think a separate repo containing only the Dockerfile + the scripts to build it is a better idea. We can easily establish a one-way push to that repo after changes to 'airflow' and make it read-only otherwise.

Member Author

potiuk commented Oct 22, 2020

I created an issue for that, #11740, so we can continue the discussion there. This one is already closed.

3 participants