Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor tools to allow building container and setting up dev machine #276

Merged
merged 16 commits into from
Feb 16, 2023

Conversation

derekwbrown
Copy link
Contributor

@derekwbrown derekwbrown commented Sep 5, 2022

Objective

This PR attempts to solve the following problems

  1. Maintain the build container with little change to adding/modifying software
  2. Create a local build environment using the same scripts as the build container, so that the local build environment matches the container build environment
  3. Allow the local build environment to be run multiple times, to update/add new tools as necessary

Description

This PR modifies the helper scripts for the container building into a single top-level build script. This allows that same build script to be run on a local machine, ensuring the local build tools match the container build tools.

The script also saves the current state of the tools (what version has been installed), so that it can be re run. When it is run for the N+1th time, it checks the version installed and skips software that's already correct (so no multiple installs of vstudio). For versions that have changed, it will upgrade to the new version. For tools that have been added since the last run, it will add the new tools.

Using the local build environment

When running in local mode (not when building the container), the script refrains from changing the system environment wherever possible. It does not add environment variables and change the path.

The local build mode includes a powershell module (DDDeveloper), which includes powershell commands to enable the environment. Once installed, the developer can execute Use-BuildEnv and the module will update the environment (path and necessary variables) for that shell only, which avoids polluting the general setup.

A Note on GO

In the local environment, this tool allows multiple concurrent versions of go. This is to support wanting to maintain > 1 older versions of go in the event that switching is necessary between versions. When running Use-BuildEnv, by default, it will select the most recent version of Go installed on the machine. However, it accepts an argument -GoVer in which the user can select the version. There is a companion command Get-GoVersions which will display the version numbers installed on the system.

Side Effects

Container size

The resulting container is actually smaller. I don't fully understand why, as all the software as the same, but it seems to be an artifact of using a single layer rather than one layer per install. This seems to be a good effect

Iterating on the container.

Prior to this change, when trying to develop the container itself, then the last successful install was stored as the last layer written. The developer could then start the shell from the last layer, and proceed to help iterate on the container faster.
Since it's all one layer, then when there's a failure, then you're back to the beginning. This is definitely a step back. If it's considered a serious step back, I have some ideas how to mitigate.

Copy link
Contributor

@iglendd iglendd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the script and refactoring you have put in to clean it up. My concern only individual ps1 files which moved and cannot be compared (some moves are comparable for some reasons).

Then we had internal debates about pros and cons of this approach and its usage in dev cycle.

@@ -1,606 +0,0 @@
from ast import Not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My beautiful script :)

@julien-lebot
Copy link
Contributor

julien-lebot commented Sep 6, 2022

My concern only individual ps1 files which moved and cannot be compared (some moves are comparable for some reasons).

In git the files need to be moved & committed before being modified so that it's possible to track them as being identical.

@iglendd
Copy link
Contributor

iglendd commented Sep 6, 2022

""My concern only individual ps1 files which moved and cannot be compared (some moves are comparable for some reasons).""
"In git the files need to be moved & committed before being modified so that it's possible to track them as being identical."
@derekwbrown if you reapply your change by explicitly moving file, may they will lineup better for the code review

@julien-lebot
Copy link
Contributor

My main concern with this PR is that now updating the buildimage will take much longer as any change will require to fully rebuild the container. This will also impact our CI which won't be able to reuse the cached unchanged layers.
I think it's a significant downside just so that the local development environment match the build environment by running the same script.
IMO there rarely is a need to have a local environment match exactly the build environment for local development and when there is we have the buildimage for that.
What we can do is leave the buildimage like it is and have a script that reproduces the Dockerfile locally on a development machine, which is what @iglendd did in the first place.

@derekwbrown
Copy link
Contributor Author

My main concern with this PR is that now updating the buildimage will take much longer as any change will require to fully rebuild the container. This will also impact our CI which won't be able to reuse the cached unchanged layers. I think it's a significant downside just so that the local development environment match the build environment by running the same script. IMO there rarely is a need to have a local environment match exactly the build environment for local development and when there is we have the buildimage for that. What we can do is leave the buildimage like it is and have a script that reproduces the Dockerfile locally on a development machine, which is what @iglendd did in the first place.

So, it's not clear to me when/if we get the caching benefits, although it's true this new method reconstructs the entire container from scratch. For comparison, this build took 1 hr; comparing to the build where we added codeql, which is the most charitable comparison (because it added a layer at the end so presumably got the most benefit of caching), that took 44 minutes. So it's longer, but not considerably so (and again, that's the best case).

As for what @iglendd did, the reason that this is "better" is that what he did can only be done once.

Again, as an example... you run what Len wrote... and then Bryce adds ninja for the system-probe builds. You can't just rerun the script and get the updates. With this you can. Same thing when we update go versions.

I disagree with this
IMO there rarely is a need to have a local environment match exactly the build environment for local development
As soon as someone adds a tool, you need to match the build environment.

Ultimately, I'm (no longer) the owner of this... I put this forth as a suggestion/possibility. If the team doesn't like it then we can just close the PR w/o merging.

@iglendd
Copy link
Contributor

iglendd commented Sep 8, 2022

That is why I suggested a meeting (let's plan after you come back). Perhaps we can add other experts on CI/Docker like @KSerrania. I personally think the benefits outweigh its downside but I am biased.

@KSerrania
Copy link
Contributor

KSerrania commented Sep 27, 2022

This will also impact our CI which won't be able to reuse the cached unchanged layers.

The CI doesn't use the docker cache (docker build is run with the --no-cache option), so this doesn't actually change things much there (though it does impact people that are building the build image locally).

IMO, using the docker cache in the CI would have limited benefits (the buildimages jobs are run very infrequently, over a large number of runners, and the runners themselves are recycled every two days, so there's very little chance to get a cache hit), and will become a noop if/once we implement Windows runner autoscaling (since a given runner will never get reused), except if we decide to build some kind of solution to share & manage a docker layer cache between all runners.

@derekwbrown derekwbrown marked this pull request as ready for review October 12, 2022 19:38
@derekwbrown derekwbrown requested review from a team as code owners October 12, 2022 19:38
@iglendd iglendd self-requested a review October 13, 2022 17:36
@@ -0,0 +1,129 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in https://github.com/DataDog/datadog-agent/tree/main/devenv to avoid having multiple places to setup a development environment.

windows/Dockerfile Outdated Show resolved Hide resolved
@derekwbrown derekwbrown force-pushed the db/std-toolchain branch 4 times, most recently from 3aa6b3e to 9d44013 Compare October 19, 2022 03:08
windows/Dockerfile Outdated Show resolved Hide resolved
derekwbrown and others added 8 commits February 6, 2023 20:57
fix error in go path finding

Attempt to catch msys install failure

:Add some debugging for 2022ltsc

Move mingit after vstudio

fix path to patch
Change env var helpers to write to USER env instead of SYSTEM env
@clarkb7
Copy link
Contributor

clarkb7 commented Feb 7, 2023

rebasing to resolve conflicts

windows/helpers.ps1 Outdated Show resolved Hide resolved
windows/versions.ps1 Outdated Show resolved Hide resolved
Ensure Get-RemoteFile can handle empty env vars
@derekwbrown derekwbrown merged commit 8c1c831 into main Feb 16, 2023
@derekwbrown derekwbrown deleted the db/std-toolchain branch February 16, 2023 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants