Add RPM builds to the CI pipeline #2982

Merged: 7 commits merged into distributed-system-analysis:main from ci-pipeline-rpm on Sep 13, 2022

Conversation

@webbnh (Member) commented on Aug 24, 2022

This PR enhances build.sh, extending the CI pipeline to build binary RPMs for the Agent and the Server.

There are a few related changes:

  • build.sh (see the sketch after this list):
    • We create the ${HOME}/.config directory if it doesn't exist. Previously, in the container scenarios, we mapped this to a temporary file system; now it resides on the same file system as ${HOME}, but we cannot guarantee that it exists unless we create it.
    • We list the RPMs at the end of the build. This is a "victory lap" which will be removed in the next PR, when the RPMs are actually consumed.
  • jenkins/Makefile: a code-comment change requested in another PR
  • jenkins/ci.fedora.Dockerfile:
    • Updated to Fedora 35!
    • Added the podman-remote package, which allows us to invoke containers on the host from within the container running the build
    • A couple of dnf tweaks that I ran across in my investigations which seemed like good ideas
    • A couple of npm tweaks (one which npm itself suggested; the other avoids a timeout problem)
  • jenkins/run: changes to support running a container from inside the container (sketched after this list)
    • Documentation of the environment/shell variables which govern the script's operation
    • If the CONTAINER_HOST environment variable is defined, then we set up the new-container invocation to be made with podman-remote, which will run the container at the location specified by CONTAINER_HOST.
    • If the CONTAINER_HOST environment variable is not defined, then we set up the new-container invocation to be made with podman, and we start a systemd service which will listen for requests from podman-remote.
    • We use the environment variable WORKSPACE to determine where the Git checkout is. When running in the CI, this is defined by the Jenkins environment; if it is undefined, we set it to the current working directory, which jenkins/run requires to be the Git checkout. We also define this variable inside the container, so that nested invocations of podman-remote know where on the host the Git checkout is and can set up the mapping properly, even though they are made from inside a container.
    • We use the environment variable WORKSPACE_TMP to determine where the home directory inside the container should be mapped. When running in the CI, this is defined by the Jenkins environment (although we have to create the directory tree, which is done in build.sh). If it is undefined, we set up a temporary Podman volume for the mapping, using a UUID for the volume name; the volume is deleted at the end of the script if it is no longer in use. WORKSPACE_TMP can also be defined to point to a real file system on the host, which a developer can use to capture the build products locally. We also define this variable inside the container, so that nested invocations of podman-remote know what on the host to map to the home directory and can set up the mapping properly, even though they are made from inside a container.
    • Some other tweaks to the container invocation:
      • We define the CONTAINER_HOST environment variable in the container, to trigger the use of podman-remote inside it, and, if it was not already set, we point it at the service which we started.
      • We map the Unix socket where the Podman listener is configured into the container, so that the podman-remote command can use it to communicate with the service on the host.
      • We disable SELinux labeling inside the container (because it just seems to get in the way without producing any value in our use cases), and we remove the re-labeling directives from the volume mappings.
  • utils/rpm.mk (see the invocation sketch after this list):
    • Set the build area (${BLD_ROOT}) to ${HOME} by default, but allow the distro-specific build invocations to override it. Likewise, set the build subdirectory to rpmbuild but allow it to be overridden, as well.
    • Add a new TMP directory to the RPMDIRS list and use it instead of /tmp, so that we don't have collisions between concurrent distros' builds. Define a variable, ${RPMTMP}, for use instead of ${TMPDIR}.
    • Correct the dependency of the <distro>-rpm target so that it requires only the build output directories and not the spec file and source RPM. (The rpm target of the sub-make already has those dependencies.)
    • The output location for the RPMs is, by default, still ~/rpmbuild, but, if the build is run inside a container, this location will be remapped appropriately by jenkins/run.
    • Removed a spurious slash from the sub-make current directory and changed it to be a relative path.
    • Modified the nested container invocation so that it passes in the appropriate definitions of ${BLD_ROOT} and ${BLD_SUBDIR}.
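
To make the build.sh changes concrete, here is a minimal sketch of the two additions; this is illustrative, not the script's actual text, and the RPM glob is an assumption based on the default ~/rpmbuild output location described above:

```sh
# Ensure ${HOME}/.config exists: it is no longer a tmpfs mapping in the
# container scenarios, so nothing else guarantees its presence.
mkdir -p "${HOME}/.config"

# ... the build proper runs here ...

# "Victory lap": list the freshly built RPMs (to be removed in the next PR,
# when the RPMs are actually consumed).  The path below is illustrative.
ls -l "${HOME}"/rpmbuild/RPMS/*/*.rpm
```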
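
The podman vs. podman-remote selection in jenkins/run can be pictured roughly as follows. This is a hedged sketch, not the script's actual text: the socket path and the podman.socket unit name reflect Podman's usual defaults and are assumptions here, and the image name is hypothetical; CONTAINER_HOST, WORKSPACE, and WORKSPACE_TMP are the variables documented above.

```sh
podman_sock="${XDG_RUNTIME_DIR}/podman/podman.sock"  # Podman's usual default
image="quay.io/pbench/ci-fedora:35"                  # hypothetical image name

if [ -n "${CONTAINER_HOST}" ]; then
    # Already inside a container: delegate container creation to the host's
    # Podman service via podman-remote.
    podman=podman-remote
else
    # Top-level invocation: use local podman, and start the listener so that
    # nested invocations can reach back out to the host.
    podman=podman
    systemctl --user start podman.socket             # assumed unit name
    CONTAINER_HOST="unix://${podman_sock}"
fi

WORKSPACE=${WORKSPACE:-$(pwd)}   # Jenkins defines WORKSPACE in the CI

${podman} run --rm \
    -e WORKSPACE="${WORKSPACE}" \
    -e WORKSPACE_TMP="${WORKSPACE_TMP}" \
    -e CONTAINER_HOST="${CONTAINER_HOST}" \
    -v "${podman_sock}:${podman_sock}" \
    --security-opt label=disable \
    "${image}" ./build.sh
```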
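
And, for utils/rpm.mk, the overridable build-area variables might be exercised like this; BLD_ROOT and BLD_SUBDIR are the variables this PR adds, while the fedora-rpm target name is an illustrative instance of the <distro>-rpm pattern and the paths are hypothetical:

```sh
# Default: the build area is ${HOME}, with an "rpmbuild" subdirectory.
make -f utils/rpm.mk fedora-rpm

# A distro-specific (or nested-container) invocation can redirect the build
# area and subdirectory via the new variables:
make -f utils/rpm.mk BLD_ROOT=/build BLD_SUBDIR=rpmbuild-f35 fedora-rpm
```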

PBENCH-720

@webbnh added the Agent, Server, Code Infrastructure, and Containerization labels on Aug 24, 2022
@webbnh added this to the v0.72 milestone on Aug 24, 2022
@webbnh self-assigned this on Aug 24, 2022
@dbutenhof (Member) left a comment

I think you could telegraph the plot a bit more clearly in a few places, but I infer this is likely sound. I'm approving, but I won't pre-resolve the conversations because I'd like to see the answers.

Review threads (resolved): utils/rpm.mk, jenkins/run (outdated), jenkins/ci.fedora.Dockerfile
@dbutenhof previously approved these changes on Aug 25, 2022
@dbutenhof (Member) left a comment

Conversations resolved, though I think your responses could be used to form some additional commentary in the files that might help future spelunkers.

Review threads (resolved): jenkins/ci.fedora.Dockerfile, utils/rpm.mk
@webbnh (Member, Author) commented on Aug 26, 2022

Since I had to rebase anyway, I improved (I think...) the Makefile comment per Dave's suggestion.

@webbnh requested a review from dbutenhof on August 26, 2022 15:32
@dbutenhof previously approved these changes on Aug 26, 2022
@webbnh marked this pull request as draft on August 26, 2022 17:16
@webbnh (Member, Author) commented on Aug 26, 2022

Switching to draft mode, while I explore the differences between my environment and Mr. Jenkins's. 😝

@webbnh force-pushed the ci-pipeline-rpm branch 9 times, most recently from 4121d10 to 2e5e5d8, on August 26, 2022 23:14
@webbnh (Member, Author) commented on Sep 7, 2022

> I would like to know why it is required, and why we can't do it another way.

I'm sure that there are many ways to address this problem, each with a different set of requirements and maintenance loads. Employing podman-remote is a solution which addresses the problem in one way. It is required only if we opt to proceed with that approach. If we opt instead to take on a different set of requirements, etc. then we can use a different solution.

Using the approach which you suggest, of invoking each stage inside its own container instead of invoking all the stages inside a single container, trades the problem of "how do you invoke a container from inside a container" (which is now solved) for the problem of "how do you get the build products out of the container and share them with the next stage". This, of course, is easily done by making a bunch of interesting changes to jenkins/run... but I'm not convinced that, once the dust settles, the result will be any simpler than the changes presented in this PR.

Under the approach in this PR, there is a single container execution. This means that all of the dependency management is done in a single container (which has its good points and its liabilities), which means it is all pre-packaged for the user. And, it means that the file system(s) used by the execution can easily be decoupled from the host. That is, a developer can issue a single command and run the build in isolation from the host, without installing any dependencies or having to worry about file system conflicts. By default, the build outputs are placed in temporary volumes which don't interact with the user's local environment, so the builds and tests will not be adversely affected by artifacts which the user has lying around, and, when the build is done, there won't be artifacts left around which the user will trip over. However, users (and the CI) can easily override this default behavior if they want the build artifacts to be captured on a local file system instead of having them placed on a temporary volume.
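
A sketch of the temporary-volume behavior described above, with hypothetical names (the in-container home path, the image name, and the shell variables are illustrative, not the script's actual text):

```sh
image="quay.io/pbench/ci-fedora:35"   # hypothetical image name

if [ -z "${WORKSPACE_TMP}" ]; then
    # No host location requested: use a throwaway Podman volume with a
    # unique name, and remove it on exit if nothing else is still using it.
    volume=$(uuidgen)
    podman volume create "${volume}" >/dev/null
    trap 'podman volume rm "${volume}" >/dev/null 2>&1 || true' EXIT
    home_map="${volume}:/home/builder"        # hypothetical container home
else
    # Capture the build products on a real host file system instead.
    home_map="${WORKSPACE_TMP}:/home/builder"
fi

podman run --rm -v "${home_map}" "${image}" ./build.sh
```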

I can certainly pursue the approach that you suggest. It looks workable on its face, but it means more points of involvement on the host side. jenkins/run will have to be enhanced to allow the user to specify the volume for the $HOME directory used inside the container (analogous to what the current PR approach does), so that this can be shared among the later stages. And, any uses of containers in the build/test system will have to be initiated from the top level (because the lower levels will already be inside containers), which will constrain what we can do. (E.g., the Agent RPM builds use containers, so they'll have to be invoked directly from build.sh rather than inside a jenkins/run invocation, which means that any dependencies they have prior to their container invocations will have to be installed on the host unless we arrange for those pieces to be invoked in containers, as well.)

So, I think it's pretty close to six of one vs. a half-dozen of the other in terms of the ultimate complexity. I would be happy to give you a guided tour of these changes if that would allay your concerns. There are a lot of constraints here, and many of them are not obvious. I think this PR presents the most versatile solution which meets all of the requirements.

I would like to know why you think that we should not use podman-remote and why we should instead do it some other way.

@dbutenhof previously approved these changes on Sep 7, 2022
@webbnh (Member, Author) commented on Sep 8, 2022

Rebased onto the end of main and resolved the conflicts introduced by the merge of #2937.

@dbutenhof previously approved these changes on Sep 8, 2022
@portante dismissed their stale review on September 9, 2022 19:08:

Withdrawing my "change request" after talking with Webb.

@portante (Member) left a comment

Withdrawing my "change request" after talking with Webb. I still have reservations about the additional level of complexity (however minimal it is) and reliance on the new mechanism of podman-remote when there seems to be a good alternative.

If the rest of the team agrees with the change, great. I'll submit an alternative for the team to consider later.

@webbnh (Member, Author) commented on Sep 12, 2022

Back to draft mode while I work out support for Git "worktrees".

 - Use WORKSPACE from outside the container for mappings requested inside it
 - Change destination for RPM build from ~/rpmbuild* to ~/pbench/rpmbuild*
 - Remove spurious trailing slash from relative "Pbench Top" location
 - Use symbol for Podman socket location
 - Tweak the rpmbuild directory mapping
@webbnh (Member, Author) commented on Sep 12, 2022

The problem with Git worktrees turned out to lie not in my changes so much as in the fact that RHEL-7 ships with a very old version of Git which doesn't include worktree support! So, the previous support for Git worktrees in jenkins/run seems to work well enough on all the other platforms which we aspire to build on; it's just the intersection of Git worktrees and building the RHEL-7 binary RPM which yields a problem. This can be avoided either by not using Git worktrees (which Mr. Jenkins doesn't) or by not trying to build the RHEL-7 RPM (e.g., by overriding the platforms definition in agent/rpm/Makefile).

The exercise did bring to light the fact that, if you are using an alternate worktree, the path to your main worktree must not be ~/pbench, because this produces a conflict when setting up the volume mappings for the container. But this, too, is easily worked around. To aid with that, I added a check for this situation to jenkins/run, so that the victim will get a clear explanation when it arises; a sketch of what such a check might look like is below.
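
(A hypothetical rendering of that check; the actual parsing and error message in jenkins/run may differ:)

```sh
# The first "worktree" entry in porcelain output is the main worktree.
main_worktree=$(git worktree list --porcelain | awk '/^worktree /{print $2; exit}')
if [ "${main_worktree}" = "${HOME}/pbench" ]; then
    echo "jenkins/run: the main worktree must not be ~/pbench;" \
         "it conflicts with the container's volume mappings" >&2
    exit 1
fi
```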

So, this PR is once again ready for review.

@webbnh marked this pull request as ready for review on September 12, 2022 21:19
@webbnh merged commit 0f4f25b into distributed-system-analysis:main on Sep 13, 2022
@webbnh deleted the ci-pipeline-rpm branch on September 13, 2022 15:09