Update Github CI workflows for CPU builds and tests #380

Merged

Conversation

ohearnk
Collaborator

@ohearnk ohearnk commented Oct 2, 2024

This MR updates the QUICK tests in the GitHub workflows to use the v4 artifact actions (v1 and v2 were deprecated, then removed -- see here).

Additionally, this MR updates the checkout action to v4, renames the build artifacts to avoid naming conflicts between the two build systems (legacy configure/make and CMake), parallelizes the serial builds, switches to invoking the CMake build/install commands directly (instead of depending on the generated Makefiles), and refactors the general job and step naming to better reflect the tasks.
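
For readers unfamiliar with invoking CMake directly instead of through the generated Makefiles, the following is a minimal sketch of what such steps can look like. The source directory, build directory, and job count are hypothetical defaults chosen for illustration; they are not taken from the QUICK workflow files, and any project-specific configure options are left out.

```shell
# Sketch only: print the three CMake invocations for a configure/build/install
# cycle. Directory names and the job count are illustrative assumptions.
cmake_steps() (
    src="${1:-.}"        # source tree (hypothetical default)
    build="${2:-build}"  # out-of-source build directory (hypothetical default)
    jobs="${3:-2}"       # number of parallel build jobs

    # Configure: generate the build system in the build directory.
    echo "cmake -S $src -B $build"
    # Build through CMake itself rather than calling make on the generated files.
    echo "cmake --build $build --parallel $jobs"
    # Install from the build directory.
    echo "cmake --install $build"
)

cmake_steps . build 2
```

Each printed line can be used as-is in a workflow step. Going through `cmake --build` keeps the steps independent of which generator (Makefiles, Ninja, etc.) was selected at configure time.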

@ohearnk ohearnk self-assigned this Oct 2, 2024
@ohearnk ohearnk force-pushed the github-tests-workflow-action-artifact-updates branch from 41b02df to fa85ae9 Compare October 2, 2024 16:40
@ohearnk ohearnk requested a review from agoetz October 2, 2024 16:44
agoetz
agoetz previously approved these changes Oct 3, 2024
Collaborator

@agoetz agoetz left a comment

The CMake builds fail. The logs show:

Run actions/upload-artifact@v4
With the provided path, there will be 169 files uploaded
Artifact name is valid!
Root directory input is valid!
Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run

From https://github.com/actions/upload-artifact?tab=readme-ov-file#breaking-changes

Breaking Changes

  • Uploading to the same named Artifact multiple times.

Due to how Artifacts are created in this new version, it is no longer possible to upload to the same named Artifact multiple times. You must either split the uploads into multiple Artifacts with different names, or only upload once. Otherwise you will encounter an error.
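
One way to guard against the 409 conflict quoted above is to derive each artifact name from the build system and build variant, then check the planned names for duplicates before uploading. The sketch below assumes a hypothetical naming scheme (`quick-<build system>-<variant>`); the names actually used in the workflow may differ.

```shell
# Sketch only: the naming scheme below is a hypothetical illustration.

# Compose an artifact name from the build system and the build variant.
artifact_name() {
    printf 'quick-%s-%s\n' "$1" "$2"
}

# Read names on stdin (one per line) and print any that occur more than once.
duplicate_names() {
    sort | uniq -d
}

# Planned uploads for one workflow run: two build systems, two variants each.
planned="$(
    for system in make cmake; do
        for variant in serial mpi; do
            artifact_name "$system" "$variant"
        done
    done
)"

dups="$(printf '%s\n' "$planned" | duplicate_names)"
if [ -n "$dups" ]; then
    echo "duplicate artifact names: $dups"
else
    echo "all artifact names are unique"
fi
```

If both build systems uploaded under the same name, `duplicate_names` would report it, which corresponds to the case the v4 action rejects.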

@agoetz agoetz self-requested a review October 3, 2024 05:57
@agoetz agoetz dismissed their stale review October 3, 2024 06:02

Fix is not working, tests fail.

@agoetz
Collaborator

agoetz commented Oct 3, 2024

While we are at it, we should update the build command. GitHub-hosted runners now have 4 CPU cores. See

https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

We currently use make and make -j2.

We should update to make -j4 or just use make -j, which should automatically build on all available CPU cores.
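
A note on the two options: with GNU make, `-j` given without a number places no limit on the number of simultaneous jobs, so it is not the same as "one job per core". A bounded alternative is to query the CPU count and cap it. The sketch below is an illustration only; the helper name `job_count` and the cap value are assumptions, not part of the workflow.

```shell
# Sketch only: pick a job count for make that never exceeds a chosen cap.
job_count() (
    cap="${1:-2}"   # upper bound on parallel jobs (hypothetical default)

    # nproc reports the CPUs available to this process; fall back to getconf,
    # then to 1 if neither tool is present.
    n="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

    if [ "$n" -gt "$cap" ]; then
        n="$cap"
    fi
    echo "$n"
)

# Example use in a build step:
#   make -j"$(job_count 4)"
echo "would run: make -j$(job_count 4)"
```

Note that `nproc` counts logical CPUs (hardware threads), so on a machine with simultaneous multithreading the value can be larger than the number of physical cores.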

@ohearnk
Collaborator Author

ohearnk commented Oct 9, 2024

@agoetz Regarding bumping up the number of cores used in Github runners from 2 to 4 -- doing this causes the MPI tests to fail consistently on the first test (both configure/make and CMake builds). Moreover, the error code returned from most of these failed tests from quick.MPI appears to be a value of 2 (i.e., the MPI rank first detected to be aborting).

Can you check if there is a setting in the merzlab account regarding the GitHub runner configuration? My best guess is that a change might be required to reflect the hardware upgrades.

Other than that issue, this should be ready to merge.

@ohearnk ohearnk changed the title Update Github action artifacts (v4) Update Github CI workflows for CPU builds and tests Oct 9, 2024
@merzlab
Owner

merzlab commented Oct 11, 2024

@ohearnk There are no settings regarding the Github runner configuration. There is an option for Larger Github-hosted runners, but this is restricted to Team & Enterprise accounts. Another option is Self-hosted runners. These are virtual machines for GitHub Actions workflows that you manage and maintain outside of GitHub. We should look into this.

I checked for the CPU in the Github runner:

> Run lscpu
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      48 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          AuthenticAMD
Model name:                         AMD EPYC 7763 64-Core Processor
CPU family:                         25
Model:                              1
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1

The information on the GitHub website is misleading. These are actually 4 threads but only 2 physical CPU cores.

Let's leave build and MPI runs at 2 cores.
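
The arithmetic behind that reading of the `lscpu` output (physical cores = cores per socket x sockets, while `CPU(s)` counts hardware threads) can be sketched as a small parser. The helper names are hypothetical, and the sample text is just the relevant lines copied from the output above.

```shell
# Sketch only: derive logical and physical CPU counts from lscpu-style text
# read on stdin. Leading whitespace is tolerated because some lscpu versions
# indent sub-fields.

# Logical CPUs: the value of the top-level "CPU(s):" field.
logical_cpus() {
    awk -F: '/^[[:space:]]*CPU\(s\):/ { print $2 + 0; exit }'
}

# Physical cores: "Core(s) per socket" multiplied by "Socket(s)".
physical_cores() {
    awk -F: '
        /^[[:space:]]*Core\(s\) per socket:/ { cores = $2 + 0 }
        /^[[:space:]]*Socket\(s\):/          { sockets = $2 + 0 }
        END { print cores * sockets }
    '
}

# Relevant lines from the runner output quoted in this thread.
sample='CPU(s):                             4
On-line CPU(s) list:                0-3
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1'

echo "logical:  $(printf '%s\n' "$sample" | logical_cpus)"    # logical:  4
echo "physical: $(printf '%s\n' "$sample" | physical_cores)"  # physical: 2
```

On a live runner the same functions can be fed directly, e.g. `lscpu | physical_cores`.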

Owner

@merzlab merzlab left a comment

Looks good. I will merge this PR.

@merzlab merzlab merged commit 47eb00b into merzlab:master Oct 11, 2024
4 checks passed
@ohearnk ohearnk deleted the github-tests-workflow-action-artifact-updates branch October 14, 2024 14:11

3 participants