Skip to content

Commit

Permalink
Add docker-context-files detection and cleanup flag. (apache#15593)
Browse files Browse the repository at this point in the history
When building images for production we are using docker-context-files
where we build packages to install. However if those context files
are not cleaned up, they unnecessary increase size and time needed
to build image and they invalidate the COPY . layer of the image.

This PR checks if docker-context-files folder contains just readme
when Breeze build-image command is run (for cases where
images are not built from docker-context-files). Inversely it
also checks that there are some files in case the image is
built with --install-from-docker-context-files switch.

This PR also ads a --cleanup-docker-context-files switch to
clean-up the folder automatically. The error mesages also help
the user instructing the user what to do.

(cherry picked from commit bf81d2e)
  • Loading branch information
potiuk committed May 6, 2021
1 parent fbde015 commit acce13f
Show file tree
Hide file tree
Showing 7 changed files with 75 additions and 28 deletions.
18 changes: 18 additions & 0 deletions BREEZE.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1368,6 +1368,12 @@ This is the current syntax for `./breeze <./breeze>`_:
is used in the scheduled run in CI when we rebuild all the images from the scratch
and run the tests to see if the latest python images do not fail our tests.
--cleanup-docker-context-files
Removes whl and tar.gz files created in docker-context-files before running the command.
In case there are some files there it unnecessarily increases the context size and
makes the COPY . always invalidated - if you happen to have those files when you build your
image.
Customization options:
-E, --extras EXTRAS
Expand Down Expand Up @@ -2035,6 +2041,12 @@ This is the current syntax for `./breeze <./breeze>`_:
is used in the scheduled run in CI when we rebuild all the images from the scratch
and run the tests to see if the latest python images do not fail our tests.
--cleanup-docker-context-files
Removes whl and tar.gz files created in docker-context-files before running the command.
In case there are some files there it unnecessarily increases the context size and
makes the COPY . always invalidated - if you happen to have those files when you build your
image.
Customization options:
-E, --extras EXTRAS
Expand Down Expand Up @@ -2609,6 +2621,12 @@ This is the current syntax for `./breeze <./breeze>`_:
is used in the scheduled run in CI when we rebuild all the images from the scratch
and run the tests to see if the latest python images do not fail our tests.
--cleanup-docker-context-files
Removes whl and tar.gz files created in docker-context-files before running the command.
In case there are some files there it unnecessarily increases the context size and
makes the COPY . always invalidated - if you happen to have those files when you build your
image.
Customization options:
-E, --extras EXTRAS
Expand Down
17 changes: 17 additions & 0 deletions breeze
Original file line number Diff line number Diff line change
Expand Up @@ -1210,6 +1210,12 @@ function breeze::parse_arguments() {
echo
shift
;;
--cleanup-docker-context-files)
export CLEANUP_DOCKER_CONTEXT_FILES="true"
echo "Preserves data volumes when stopping airflow"
echo
shift
;;
--no-rbac-ui)
export DISABLE_RBAC="true"
echo "When installing Airflow 1.10, RBAC UI will be disabled."
Expand Down Expand Up @@ -2618,6 +2624,12 @@ function breeze::flag_build_docker_images() {
is used in the scheduled run in CI when we rebuild all the images from the scratch
and run the tests to see if the latest python images do not fail our tests.
--cleanup-docker-context-files
Removes whl and tar.gz files created in docker-context-files before running the command.
In case there are some files there it unnecessarily increases the context size and
makes the COPY . always invalidated - if you happen to have those files when you build your
image.
Customization options:
-E, --extras EXTRAS
Expand Down Expand Up @@ -3351,6 +3363,11 @@ function breeze::run_build_command() {
fi
;;
build_image)
if [[ ${CLEANUP_DOCKER_CONTEXT_FILES} == "true" ]]; then
build_images::cleanup_docker_context_files
fi
build_images::check_for_docker_context_files

if [[ ${PRODUCTION_IMAGE} == "true" ]]; then
build_images::prepare_prod_build
build_images::build_prod_images
Expand Down
1 change: 1 addition & 0 deletions breeze-complete
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,7 @@ runtime-apt-deps: additional-runtime-apt-deps: runtime-apt-command: additional-r
load-default-connections load-example-dags
use-packages-from-dist no-rbac-ui package-format: upgrade-to-newer-dependencies installation-method: continue-on-pip-check-failure
use-airflow-version:
cleanup-docker-context-files
test-type: preserve-volumes dry-run-docker
"

Expand Down
2 changes: 1 addition & 1 deletion scripts/ci/images/ci_wait_for_and_verify_ci_image.sh
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ function pull_ci_image() {
push_pull_remove_images::check_if_github_registry_wait_for_image_enabled

start_end::group_start "Configure Docker Registry"
build_image::configure_docker_registry
build_images::configure_docker_registry
start_end::group_end

export AIRFLOW_CI_IMAGE_NAME="${BRANCH_NAME}-python${PYTHON_MAJOR_MINOR_VERSION}-ci"
Expand Down
2 changes: 1 addition & 1 deletion scripts/ci/images/ci_wait_for_and_verify_prod_image.sh
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ shift
push_pull_remove_images::check_if_github_registry_wait_for_image_enabled

start_end::group_start "Configure Docker Registry"
build_image::configure_docker_registry
build_images::configure_docker_registry
start_end::group_end

export AIRFLOW_PROD_IMAGE_NAME="${BRANCH_NAME}-python${PYTHON_MAJOR_MINOR_VERSION}"
Expand Down
60 changes: 34 additions & 26 deletions scripts/ci/libraries/_build_images.sh
Original file line number Diff line number Diff line change
Expand Up @@ -222,26 +222,26 @@ function build_images::confirm_image_rebuild() {
fi
}

function build_images::confirm_non-empty-docker-context-files() {
function build_images::check_for_docker_context_files() {
local num_docker_context_files
num_docker_context_files=$(find "${AIRFLOW_SOURCES}/docker-context-files/" -type f |\
grep -c v "README.md" )
local docker_context_files_dir="${AIRFLOW_SOURCES}/docker-context-files/"
num_docker_context_files=$(find "${docker_context_files_dir}" -type f | grep -c -v "README.md" || true)
if [[ ${num_docker_context_files} == "0" ]]; then
if [[ ${INSTALL_FROM_DOCKER_CONTEXT_FILES} == "false" ]]; then
>&2 echo
>&2 echo "ERROR! You want to install packages from docker-context-files"
>&2 echo " but there are no packages to install in this folder."
>&2 echo
if [[ ${INSTALL_FROM_DOCKER_CONTEXT_FILES} != "false" ]]; then
echo
echo "${COLOR_YELLOW}ERROR! You want to install packages from docker-context-files${COLOR_RESET}"
echo "${COLOR_YELLOW} but there are no packages to install in this folder.${COLOR_RESET}"
echo
exit 1
fi
else
if [[ ${INSTALL_FROM_DOCKER_CONTEXT_FILES} == "false" ]]; then
>&2 echo
>&2 echo "ERROR! There are some extra files in docker-context-files except README.md"
>&2 echo " And you did not choose --install-from-docker-context-files flag"
>&2 echo " This might result in unnecessary cache invalidation and long build times"
>&2 echo " Exiting now - please remove those files (except README.md) and retry"
>&2 echo
echo
echo "${COLOR_YELLOW}ERROR! There are some extra files in docker-context-files except README.md${COLOR_RESET}"
echo "${COLOR_YELLOW} And you did not choose --install-from-docker-context-files flag${COLOR_RESET}"
echo "${COLOR_YELLOW} This might result in unnecessary cache invalidation and long build times${COLOR_RESET}"
echo "${COLOR_YELLOW} Exiting now - please restart the command with --cleanup-docker-context-files switch${COLOR_RESET}"
echo
exit 2
fi
fi
Expand Down Expand Up @@ -459,7 +459,7 @@ function build_images::get_docker_image_names() {
# either GITHUB_TOKEN or CONTAINER_REGISTRY_TOKEN depending on the registry.
# In case Personal Access token is not set, skip logging in
# Also enable experimental features of docker (we need `docker manifest` command)
function build_image::configure_docker_registry() {
function build_images::configure_docker_registry() {
if [[ ${USE_GITHUB_REGISTRY} == "true" ]]; then
local token=""
if [[ "${GITHUB_REGISTRY}" == "ghcr.io" ]]; then
Expand Down Expand Up @@ -515,7 +515,7 @@ function build_images::prepare_ci_build() {
export AIRFLOW_IMAGE="${AIRFLOW_CI_IMAGE}"
readonly AIRFLOW_IMAGE

build_image::configure_docker_registry
build_images::configure_docker_registry
sanity_checks::go_to_airflow_sources
permissions::fix_group_permissions
}
Expand Down Expand Up @@ -835,7 +835,7 @@ function build_images::prepare_prod_build() {
export AIRFLOW_IMAGE="${AIRFLOW_PROD_IMAGE}"
readonly AIRFLOW_IMAGE

build_image::configure_docker_registry
build_images::configure_docker_registry
AIRFLOW_BRANCH_FOR_PYPI_PRELOADING="${BRANCH_NAME}"
sanity_checks::go_to_airflow_sources
}
Expand Down Expand Up @@ -1033,7 +1033,7 @@ function build_images::determine_docker_cache_strategy() {
}


function build_image::assert_variable() {
function build_images::assert_variable() {
local variable_name="${1}"
local expected_value="${2}"
local variable_value=${!variable_name}
Expand All @@ -1045,19 +1045,27 @@ function build_image::assert_variable() {
fi
}

function build_images::cleanup_dist() {
mkdir -pv "${AIRFLOW_SOURCES}/dist"
rm -f "${AIRFLOW_SOURCES}/dist/"*.{whl,tar.gz}
}


function build_images::cleanup_docker_context_files() {
mkdir -pv "${AIRFLOW_SOURCES}/docker-context-files"
rm -f "${AIRFLOW_SOURCES}/docker-context-files/"*.{whl,tar.gz}
}

function build_images::build_prod_images_from_locally_built_airflow_packages() {
# We do not install from PyPI
build_image::assert_variable INSTALL_FROM_PYPI "false"
build_images::assert_variable INSTALL_FROM_PYPI "false"
# But then we reinstall airflow and providers from prepared packages in the docker context files
build_image::assert_variable INSTALL_FROM_DOCKER_CONTEXT_FILES "true"
build_images::assert_variable INSTALL_FROM_DOCKER_CONTEXT_FILES "true"
# But we install everything from scratch to make a "clean" installation in case any dependencies got removed
build_image::assert_variable AIRFLOW_PRE_CACHED_PIP_PACKAGES "false"
build_images::assert_variable AIRFLOW_PRE_CACHED_PIP_PACKAGES "false"

# Cleanup dist and docker-context-files folders
mkdir -pv "${AIRFLOW_SOURCES}/dist"
mkdir -pv "${AIRFLOW_SOURCES}/docker-context-files"
rm -f "${AIRFLOW_SOURCES}/dist/"*.{whl,tar.gz}
rm -f "${AIRFLOW_SOURCES}/docker-context-files/"*.{whl,tar.gz}
build_images::cleanup_dist
build_images::cleanup_docker_context_files

# Build necessary provider packages
runs::run_prepare_provider_packages "${INSTALLED_PROVIDERS[@]}"
Expand Down
3 changes: 3 additions & 0 deletions scripts/ci/libraries/_initialization.sh
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,9 @@ function initialization::initialize_base_variables() {
# Which means that you do not have to start from scratch
export PRESERVE_VOLUMES="false"

# Cleans up docker context files if specified
export CLEANUP_DOCKER_CONTEXT_FILES="false"

# If set to true, RBAC UI will not be used for 1.10 version
export DISABLE_RBAC=${DISABLE_RBAC:="false"}

Expand Down

0 comments on commit acce13f

Please sign in to comment.