Enable omnibus build cache #20117

Merged
merged 51 commits into from
Apr 17, 2024
Changes from 24 commits
Commits
51 commits
c5d3e9e
omnibus: conditionally enable git cache
chouquette Sep 12, 2023
dfadb8a
omnibus: always build datadog projects/software
chouquette Sep 21, 2023
a3d09bd
tasks: omnibus_build: add support for omnibus git caching
chouquette Sep 19, 2023
8653336
always update omnibus cache
chouquette Oct 18, 2023
ac458d2
tasks: generate a cache key to fetch omnibus cache
chouquette Nov 7, 2023
37abbc5
curate env variables filter
chouquette Jan 9, 2024
bae0667
attempt to stop hardcoding install dir
chouquette Mar 4, 2024
0e50cb4
Merge branch 'main' into chouquette/omnibus_cache
Pythyu Mar 15, 2024
45c6717
feat(omnibus-cache): configure cache for remote updater OCI packages
Pythyu Mar 15, 2024
6caa677
feat(omnibus-cache): configure cache for remote updater OCI packages
Pythyu Mar 15, 2024
7d35a23
feat(test): use omnibus basedir in cache
Pythyu Mar 18, 2024
74d010a
feat(test): use two different omnibus cache variable
Pythyu Mar 18, 2024
134d81d
feat(test): updater omnibus cache suffix
Pythyu Mar 18, 2024
cea95fa
feat(updater): added package version into cache path
Pythyu Mar 19, 2024
53185cf
feat(updater): replace OMNIBUS_GIT_CACHE_SUFFIX with an agent.omnibus…
Pythyu Mar 19, 2024
5a1e18a
feat(agent.py): black python formatting
Pythyu Mar 19, 2024
913de28
feat(agent.py): lint-python use generator instead of map
Pythyu Mar 20, 2024
5fd51a2
feat(PR): apply suggestions
Pythyu Mar 20, 2024
ee0a2e6
feat(PR): apply suggestions
Pythyu Mar 20, 2024
604b6f2
Merge branch 'main' into chouquette/omnibus_cache
Pythyu Mar 20, 2024
1e44837
feat(linter): fix flake8 unused python import
Pythyu Apr 2, 2024
e8d83e2
feat(CODEOWNERS): update codeowners for agent-build-and-release to ow…
Pythyu Apr 4, 2024
16d0588
feat(PR): apply suggestion, enable cache on windows
Pythyu Apr 5, 2024
2c8b4c7
fix(path): os path join fix
Pythyu Apr 5, 2024
93aab43
feat(windows): correct windows path
Pythyu Apr 5, 2024
b5e6c3e
Merge branch 'main' into chouquette/omnibus_cache
Pythyu Apr 5, 2024
d90f609
fix(omnibus_cache): use aws.cmd on windows
Pythyu Apr 5, 2024
e9b9351
Merge branch 'chouquette/omnibus_cache' of github.com:DataDog/datadog…
Pythyu Apr 5, 2024
3fe2d17
fix(omnibus_cache): use correct git cache bundle path for windows
Pythyu Apr 5, 2024
838fd95
fix(omnibus_cache): use correct git cache bundle path for windows
Pythyu Apr 5, 2024
edf92fa
Merge branch 'main' into chouquette/omnibus_cache
chouquette Apr 12, 2024
978a516
fix conflict resolution mistake
chouquette Apr 12, 2024
031ca3e
omnibus cache: ignore ssh related env variables
chouquette Apr 12, 2024
5a8eef1
correctly dispatch the install dir
chouquette Apr 12, 2024
32d6afe
remove unneeded path sanitization
chouquette Apr 12, 2024
d505a22
debug
chouquette Apr 12, 2024
48da5a2
fixes & debug
chouquette Apr 12, 2024
4cfe191
don't provide an empty install path to the non-OCI installer builds
chouquette Apr 12, 2024
2245627
provide a mapping for the non-OCI installer
chouquette Apr 12, 2024
b6a0c5c
fix typo
chouquette Apr 12, 2024
1313603
reintroduce needed sanitization
chouquette Apr 15, 2024
424e169
add more patterns to environment exclusion list
chouquette Apr 15, 2024
86a84d0
add a comment explaining the cache dir & install path
chouquette Apr 15, 2024
3092fc1
filter more env variables
chouquette Apr 15, 2024
af4473b
simplify install directory sanitization
alopezz Apr 16, 2024
bdf7369
remove old fixme
chouquette Apr 16, 2024
e9d2dad
use omnibus commits in cache key
chouquette Apr 16, 2024
3d78dd7
handle both RELEASE_VERSION and RELEASE_VERSION_7
chouquette Apr 16, 2024
590bd4c
remove unneeded windows task parameter
chouquette Apr 16, 2024
98b6a70
display omnibus commits sha1
chouquette Apr 17, 2024
1e414d2
Merge branch 'main' into chouquette/omnibus_cache
chouquette Apr 17, 2024
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -499,6 +499,7 @@
/tasks/components.py @DataDog/agent-shared-components
/tasks/components_templates @DataDog/agent-shared-components
/tasks/updater.py @DataDog/fleet
/tasks/libs/omnibus_cache.py @DataDog/agent-build-and-releases
/test/ @DataDog/agent-developer-tools
/test/benchmarks/ @DataDog/agent-metrics-logs
/test/benchmarks/kubernetes_state/ @DataDog/container-integrations
1 change: 1 addition & 0 deletions .gitlab-ci.yml
@@ -154,6 +154,7 @@ variables:
## build to succeed with S3 caching disabled.
S3_OMNIBUS_CACHE_BUCKET: dd-ci-datadog-agent-omnibus-cache-build-stable
USE_S3_CACHING: --omnibus-s3-cache
OMNIBUS_GIT_CACHE_DIR: /tmp/omnibus-git-cache
## comment out the line below to disable integration wheels cache
INTEGRATION_WHEELS_CACHE_BUCKET: dd-agent-omnibus
S3_DD_AGENT_OMNIBUS_LLVM_URI: s3://dd-agent-omnibus/llvm
2 changes: 1 addition & 1 deletion .gitlab/package_build/remote_updater.yml
@@ -18,7 +18,7 @@
- chmod 0744 /tmp/system-probe/clang-bpf /tmp/system-probe/llc-bpf
# NOTE: for now, we consider "ociru" to be a "redhat_target" in omnibus/lib/ostools.rb
# if we ever start building on a different platform, that might need to change
- - inv -e agent.omnibus-build --release-version "$RELEASE_VERSION" --major-version "$AGENT_MAJOR_VERSION" --python-runtimes "$PYTHON_RUNTIMES" --base-dir $OMNIBUS_BASE_DIR ${USE_S3_CACHING} --skip-deps --go-mod-cache="$GOPATH/pkg/mod" --system-probe-bin=/tmp/system-probe --host-distribution=ociru
+ - inv -e agent.omnibus-build --release-version "$RELEASE_VERSION" --major-version "$AGENT_MAJOR_VERSION" --python-runtimes "$PYTHON_RUNTIMES" --base-dir $OMNIBUS_BASE_DIR ${USE_S3_CACHING} --skip-deps --go-mod-cache="$GOPATH/pkg/mod" --system-probe-bin=/tmp/system-probe --host-distribution=ociru --install-directory="$INSTALL_DIR"
- ls -la $OMNIBUS_PACKAGE_DIR
- !reference [.upload_sbom_artifacts]
variables:
1 change: 1 addition & 0 deletions .gitlab/package_build/windows.yml
@@ -21,6 +21,7 @@
-e CI_JOB_NAME_SLUG=${CI_JOB_NAME_SLUG}
-e CI_COMMIT_REF_NAME=${CI_COMMIT_REF_NAME}
-e OMNIBUS_TARGET=${OMNIBUS_TARGET}
-e OMNIBUS_GIT_CACHE_DIR="/tmp/omnibus-git-cache"
-e WINDOWS_BUILDER=true
-e RELEASE_VERSION="$RELEASE_VERSION"
-e MAJOR_VERSION="$AGENT_MAJOR_VERSION"
2 changes: 2 additions & 0 deletions omnibus/config/software/datadog-agent-finalize.rb
@@ -14,6 +14,8 @@

skip_transitive_dependency_licensing true

always_build true

build do
license :project_license

2 changes: 2 additions & 0 deletions omnibus/config/software/datadog-agent-integrations-py2.rb
@@ -22,6 +22,8 @@

source git: 'https://github.com/DataDog/integrations-core.git'

always_build true

integrations_core_version = ENV['INTEGRATIONS_CORE_VERSION']
if integrations_core_version.nil? || integrations_core_version.empty?
integrations_core_version = 'master'
2 changes: 2 additions & 0 deletions omnibus/config/software/datadog-agent-integrations-py3.rb
@@ -22,6 +22,8 @@

source git: 'https://github.com/DataDog/integrations-core.git'

always_build true

integrations_core_version = ENV['INTEGRATIONS_CORE_VERSION']
if integrations_core_version.nil? || integrations_core_version.empty?
integrations_core_version = 'master'
2 changes: 2 additions & 0 deletions omnibus/config/software/datadog-agent.rb
@@ -18,6 +18,8 @@
source path: '..'
relative_path 'src/github.com/DataDog/datadog-agent'

always_build true

build do
license :project_license

2 changes: 2 additions & 0 deletions omnibus/config/software/datadog-security-agent-policies.rb
@@ -20,6 +20,8 @@
end
default_version policies_version

always_build true

build do
license "Apache-2.0"
license_file "./LICENSE"
2 changes: 2 additions & 0 deletions omnibus/config/software/system-probe.rb
@@ -8,6 +8,8 @@
source path: '..'
relative_path 'src/github.com/DataDog/datadog-agent'

always_build true

build do
license :project_license

8 changes: 7 additions & 1 deletion omnibus/omnibus.rb
@@ -37,4 +37,10 @@
s3_instance_profile true
end
end
- use_git_caching false
+
+ if not ENV.has_key?("OMNIBUS_GIT_CACHE_DIR")
+   use_git_caching false
+ else
+   use_git_caching true
+   git_cache_dir ENV["OMNIBUS_GIT_CACHE_DIR"]
+ end
49 changes: 47 additions & 2 deletions tasks/agent.py
@@ -32,6 +32,7 @@
load_release_versions,
timed,
)
from tasks.libs.omnibus_cache import omnibus_compute_cache_key
from tasks.rtloader import clean as rtloader_clean
from tasks.rtloader import install as rtloader_install
from tasks.rtloader import make as rtloader_make
@@ -322,8 +323,8 @@ def refresh_assets(_, build_tags, development=True, flavor=AgentFlavor.base.name
# Ensure the config folders are not world writable
os.chmod(check_dir, mode=0o755)

- ## add additional windows-only corechecks, only on windows. Otherwise the check loader
- ## on linux will throw an error because the module is not found, but the config is.
+ # add additional windows-only corechecks, only on windows. Otherwise the check loader
+ # on linux will throw an error because the module is not found, but the config is.
if sys.platform == 'win32':
    for check in WINDOWS_CORECHECKS:
        check_dir = os.path.join(dist_folder, f"conf.d/{check}.d/")
@@ -883,6 +884,7 @@ def omnibus_build(
python_mirror=None,
pip_config_file="pip.conf",
host_distribution=None,
install_directory="/opt/datadog-agent",
):
"""
Build the Agent packages with Omnibus Installer.
@@ -933,6 +935,32 @@ def omnibus_build(
with timed(quiet=True) as bundle_elapsed:
    bundle_install_omnibus(ctx, gem_path, env)

omnibus_cache_dir = os.environ.get('OMNIBUS_GIT_CACHE_DIR')
use_omnibus_git_cache = omnibus_cache_dir is not None
Review comment (Contributor):

IIUC, OMNIBUS_GIT_CACHE_DIR is not set on the Windows Agent builds, meaning that the build cache is not enabled for Windows.

The PR description doesn't mention that this change is Linux-only, so could you confirm that this is expected? If yes, can this be added somewhere in code and in the PR description?

if use_omnibus_git_cache:
    # Strip the leading "/" so os.path.join() below does not discard omnibus_cache_dir
    if install_directory.startswith("/"):
        install_directory = install_directory[1:]
    omnibus_cache_dir = os.path.join(omnibus_cache_dir, install_directory)
    remote_cache_name = os.environ.get('CI_JOB_NAME_SLUG')
    # We don't want to update the cache when not running on a CI
    # Individual developers are still able to leverage the cache by providing
    # the OMNIBUS_GIT_CACHE_DIR env variable, but they won't pull from the CI
    # generated one.
    use_remote_cache = remote_cache_name is not None
    if use_remote_cache:
        cache_state = None
        cache_key = omnibus_compute_cache_key(ctx)
        git_cache_url = f"s3://{os.environ['S3_OMNIBUS_CACHE_BUCKET']}/builds/{cache_key}/{remote_cache_name}"
        bundle_path = "/tmp/omnibus-git-cache-bundle"
        with timed(quiet=True) as restore_cache:
            # Allow failure in case the cache was evicted
            if ctx.run(f"aws s3 cp --only-show-errors {git_cache_url} {bundle_path}", warn=True):
                print(f'Successfully restored cache {cache_key}')
                ctx.run(f"git clone --mirror {bundle_path} {omnibus_cache_dir}")
                cache_state = ctx.run(f"git -C {omnibus_cache_dir} tag -l").stdout
            else:
                print(f'Failed to restore cache from key {cache_key}')
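
Note on the path handling in this hunk: the sketch below shows why the leading slash has to go before joining, and how the remote object URL is laid out. It is only an illustration under assumptions. `local_cache_dir` and `remote_cache_url` are hypothetical helper names that do not exist in the PR, the bucket, key and job values are made up, and `posixpath` is used so the output is the same on every platform. Unlike the diff, the sketch strips every leading slash rather than just the first one.

```python
import posixpath


def local_cache_dir(cache_root: str, install_directory: str) -> str:
    # Hypothetical helper: drop leading slashes so join() keeps cache_root.
    return posixpath.join(cache_root, install_directory.lstrip("/"))


def remote_cache_url(bucket: str, cache_key: str, job_slug: str) -> str:
    # Same s3://<bucket>/builds/<cache key>/<job slug> layout as in the diff.
    return f"s3://{bucket}/builds/{cache_key}/{job_slug}"


if __name__ == "__main__":
    # An absolute second argument makes join() throw away the cache root.
    print(posixpath.join("/tmp/omnibus-git-cache", "/opt/datadog-agent"))
    print(local_cache_dir("/tmp/omnibus-git-cache", "/opt/datadog-agent"))
    print(remote_cache_url("example-bucket", "abc123", "example-job"))
```

Keeping the install directory inside the cache path gives each install prefix its own git cache, which appears to be the intent of the `--install-directory` parameter added above.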

with timed(quiet=True) as omnibus_elapsed:
    omnibus_run_task(
        ctx=ctx,
@@ -945,6 +973,19 @@ def omnibus_build(
        host_distribution=host_distribution,
    )

if use_omnibus_git_cache:
    stale_tags = ctx.run(f'git -C {omnibus_cache_dir} tag --no-merged', warn=True).stdout
    # Purge the cache manually as omnibus will stick to not restoring a tag when
    # a mismatch is detected, but will keep the old cached tags.
    # Do this before checking for tag differences, in order to remove stale tags
    # in case they were included in the bundle in a previous build
    # split() with no argument skips blank entries, so an empty list deletes nothing
    for tag in stale_tags.split():
        ctx.run(f'git -C {omnibus_cache_dir} tag -d {tag}')
    # Stays None when the tag list is unchanged and nothing is uploaded
    update_cache = None
    if use_remote_cache and ctx.run(f"git -C {omnibus_cache_dir} tag -l").stdout != cache_state:
        with timed(quiet=True) as update_cache:
            ctx.run(f"git -C {omnibus_cache_dir} bundle create {bundle_path} --tags")
            ctx.run(f"aws s3 cp --only-show-errors {bundle_path} {git_cache_url}")

# Delete the temporary pip.conf file once the build is done
os.remove(pip_config_file)

@@ -953,6 +994,10 @@ def omnibus_build(
print(f"Deps: {deps_elapsed.duration}")
print(f"Bundle: {bundle_elapsed.duration}")
print(f"Omnibus: {omnibus_elapsed.duration}")
if use_omnibus_git_cache and use_remote_cache:
    print(f"Restoring omnibus cache: {restore_cache.duration}")
    if update_cache is not None:
        print(f"Updating omnibus cache: {update_cache.duration}")

_send_build_metrics(ctx, omnibus_elapsed.duration)
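
Note on the purge-and-upload step in `omnibus_build`: the decision logic can be looked at separately from the `git` and `aws` calls. The sketch below is a minimal illustration, not code from the PR. `parse_tags` and `should_upload` are hypothetical names, and the inputs stand in for the `stdout` of `git tag -l` and `git tag --no-merged`.

```python
from typing import List, Optional


def parse_tags(stdout: str) -> List[str]:
    # str.split() with no argument drops empty entries and treats
    # "\n" and "\r\n" the same way, so empty output yields an empty list.
    return stdout.split()


def should_upload(use_remote_cache: bool, tags_before: Optional[str], tags_after: str) -> bool:
    # tags_before is None when no bundle could be restored, so any
    # resulting tag list counts as a change and triggers an upload.
    return use_remote_cache and tags_after != tags_before


if __name__ == "__main__":
    print(parse_tags("zlib-1.3\nopenssl-3.0\n"))
    print(should_upload(True, None, "zlib-1.3\n"))
    print(should_upload(True, "zlib-1.3\n", "zlib-1.3\n"))
```

Comparing the full tag listing before and after the build means an unchanged cache is never uploaded again, which matches the `cache_state` comparison in the hunk.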


127 changes: 127 additions & 0 deletions tasks/libs/omnibus_cache.py
@@ -0,0 +1,127 @@
import hashlib
import os


def _get_build_images(ctx):
    # We intentionally include both build images & their test suffixes in the pattern
    # as a test image and the merged version shouldn't share their cache
    tags = ctx.run("grep -E 'DATADOG_AGENT_.*BUILDIMAGES' .gitlab-ci.yml | cut -d ':' -f 2", hide='stdout').stdout
Review comment (Contributor):

💬 suggestion: I'd probably lean towards writing this in Python directly, rather than shelling out. But, this isn't a blocking comment.

    return (t.strip() for t in tags.splitlines())
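
Following up on the suggestion above, a pure-Python equivalent of the `grep | cut` pipeline could look roughly like the sketch below. It is an assumption-laden illustration, not what was merged. `build_image_tags` is a hypothetical name, it takes the file content as a string instead of reading `.gitlab-ci.yml`, and the sample variables and values are made up.

```python
import re
from typing import Iterator

# Same pattern as the grep -E call in _get_build_images.
_BUILDIMAGES = re.compile(r"DATADOG_AGENT_.*BUILDIMAGES")


def build_image_tags(ci_config_text: str) -> Iterator[str]:
    for line in ci_config_text.splitlines():
        if _BUILDIMAGES.search(line):
            # Like `cut -d ':' -f 2`: take the second colon-separated field,
            # or the whole line when the line contains no colon.
            fields = line.split(":")
            yield (fields[1] if len(fields) > 1 else line).strip()


if __name__ == "__main__":
    sample = (
        "variables:\n"
        "  DATADOG_AGENT_BUILDIMAGES_SUFFIX: \"\"\n"
        "  DATADOG_AGENT_BUILDIMAGES: v0000-example\n"
        "  OTHER_VARIABLE: value\n"
    )
    print(list(build_image_tags(sample)))
```

Taking the text as a parameter keeps the function testable without a checkout, at the cost of the caller having to read the file.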


def _get_environment_for_cache() -> dict:
    """
    Return the environment variables used to compute the cache key.

    Irrelevant or insecure variables are excluded explicitly, so that any
    variable not listed here is included by default and cannot be omitted
    by mistake.
    """

    def env_filter(item):
        key = item[0]
        excluded_prefixes = [
            'AGENT_',
            'ARTIFACTORY_',
            'AWS_',
            'BUILDENV_',
            'CI_',
            'CLUSTER_AGENT_',
            'DATADOG_AGENT_',
            'DD_',
            'DEB_',
            'DESTINATION_',
            'DOCKER_',
            'FF_',
            'GITLAB_',
            'GIT_',
            'K8S_',
            'KERNEL_MATRIX_TESTING_',
            'KUBERNETES_',
            'OMNIBUS_',
            'POD_',
            'RELEASE_VERSION',
            'RPM_',
            'S3_',
            'TEST_INFRA_',
            'USE_',
            'VAULT_',
            'WINDOWS_',
        ]
        excluded_suffixes = [
            '_SHA256',
            '_VERSION',
        ]
        excluded_values = [
            "AVAILABILITY_ZONE",
            "BENCHMARKS_CI_IMAGE",
            "BUCKET_BRANCH",
            "BUNDLER_VERSION",
            "CHANNEL",
            "CI",
            "CONSUL_HTTP_ADDR",
            "DOGSTATSD_BINARIES_DIR",
            "EXPERIMENTS_EVALUATION_ADDRESS",
            "GCE_METADATA_HOST",
            "GENERAL_ARTIFACTS_CACHE_BUCKET_URL",
            "GET_SOURCES_ATTEMPTS",
            "HOME",
            "HOSTNAME",
            "HOST_IP",
            "INTEGRATION_WHEELS_CACHE_BUCKET",
            "IRBRC",
            "KITCHEN_INFRASTRUCTURE_FLAKES_RETRY",
            "LESSCLOSE",
            "LESSOPEN",
            "LC_CTYPE",
            "LS_COLORS",
            "MACOS_S3_BUCKET",
            "MESSAGE",
            "OLDPWD",
            "PROCESS_S3_BUCKET",
            "PWD",
            "PYTHON_RUNTIMES",
            "RUN_ALL_BUILDS",
            "RUN_KITCHEN_TESTS",
            "RUNNER_TEMP_PROJECT_DIR",
            "RUSTC_SHA256",
            "RUST_VERSION",
            "SHLVL",
            "STATIC_BINARIES_DIR",
            "STATSD_URL",
            "SYSTEM_PROBE_BINARIES_DIR",
            "TRACE_AGENT_URL",
            "USE_CACHING_PROXY_PYTHON",
            "USE_CACHING_PROXY_RUBY",
            "USE_S3_CACHING",
            "WIN_S3_BUCKET",
            "_",
            "build_before",
        ]
        for p in excluded_prefixes:
            if key.startswith(p):
                return False
        for s in excluded_suffixes:
            if key.endswith(s):
                return False
        if key in excluded_values:
            return False
        return True

    return dict(filter(env_filter, sorted(os.environ.items())))
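
Note on `_get_environment_for_cache`: the deny-list approach is easier to exercise when the environment is passed in rather than read from `os.environ`. The sketch below is a simplified illustration with much shorter lists than the PR. `filter_environment` and its parameter names are hypothetical, and the default lists are only a small made-up subset.

```python
from typing import Dict, Iterable, Mapping


def filter_environment(
    env: Mapping[str, str],
    excluded_prefixes: Iterable[str] = ("CI_", "AWS_"),
    excluded_suffixes: Iterable[str] = ("_VERSION",),
    excluded_names: Iterable[str] = ("HOME", "PWD"),
) -> Dict[str, str]:
    # str.startswith / str.endswith accept a tuple of candidates.
    prefixes = tuple(excluded_prefixes)
    suffixes = tuple(excluded_suffixes)
    names = set(excluded_names)
    # Anything not explicitly excluded is kept, in sorted key order.
    return {
        key: value
        for key, value in sorted(env.items())
        if not key.startswith(prefixes) and not key.endswith(suffixes) and key not in names
    }


if __name__ == "__main__":
    sample_env = {"CI_JOB_ID": "1", "CC": "gcc", "RUST_VERSION": "1.0", "HOME": "/root"}
    print(filter_environment(sample_env))
```

One consequence of a deny-list is visible here: a variable named exactly `CI` is not matched by the `CI_` prefix, which is presumably why the PR lists `"CI"` separately among the excluded names.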


def omnibus_compute_cache_key(ctx):
    print('Computing cache key')
    h = hashlib.sha1()
    omnibus_last_commit = ctx.run('git log -n 1 --pretty=format:%H omnibus/', hide='stdout').stdout
    h.update(str.encode(omnibus_last_commit))
    print(f'\tLast omnibus commit is {omnibus_last_commit}')
    buildimages_hash = _get_build_images(ctx)
    for img_hash in buildimages_hash:
        h.update(str.encode(img_hash))
    environment = _get_environment_for_cache()
    for k, v in environment.items():
        print(f'\tUsing environment variable {k} to compute cache key')
        h.update(str.encode(f'{k}={v}'))
    # FIXME: include omnibus-ruby and omnibus-software version once they are pinned
    cache_key = h.hexdigest()
    print(f'Cache key: {cache_key}')
    return cache_key
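
Note on `omnibus_compute_cache_key`: the key is a SHA-1 digest over the last omnibus commit, the build image tags and the filtered environment, fed in a fixed order. The sketch below reproduces that scheme on plain values to show the two properties the cache relies on. It is an illustration only. `compute_cache_key` is a hypothetical name and all inputs are made up.

```python
import hashlib
from typing import Iterable, Mapping


def compute_cache_key(last_commit: str, image_tags: Iterable[str], env: Mapping[str, str]) -> str:
    h = hashlib.sha1()
    h.update(last_commit.encode())
    for tag in image_tags:
        h.update(tag.encode())
    # Sorting makes the digest independent of the mapping's insertion order.
    for name, value in sorted(env.items()):
        h.update(f"{name}={value}".encode())
    return h.hexdigest()


if __name__ == "__main__":
    key_a = compute_cache_key("abc123", ["v1"], {"CC": "gcc", "CXX": "g++"})
    key_b = compute_cache_key("abc123", ["v1"], {"CXX": "g++", "CC": "gcc"})
    key_c = compute_cache_key("abc123", ["v1"], {"CC": "clang", "CXX": "g++"})
    print(key_a == key_b)  # same inputs in a different order give the same key
    print(key_a == key_c)  # a changed value gives a different key
```

Any change to one of the inputs produces a new key, so a stale bundle is simply never looked up again rather than being invalidated explicitly.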