
[README]: Contributing new modules #1136

Open · HofmeisterAn opened this issue Mar 4, 2024 · 11 comments
Labels: chore (A change that doesn't impact the existing functionality, e.g. internal refactorings or cleanups), module (An official Testcontainers module)

@HofmeisterAn (Collaborator)

Problem

Hi 👋, in the past few weeks I've encountered several issues with the GitHub-hosted runners. Due to the increasing number of modules, pulling all the images occupies too much disk space. I've implemented a temporary fix to free up some disk space, but this won't be enough as we add more modules.

Until we find a suitable fix and implement a mechanism to handle the growing number of modules and image sizes, I prefer not to merge pull requests that involve very simple module configurations, i.e. configurations that can be accomplished with just a few lines using the generic container builder API. If you believe a module is valuable for the community, please consider creating an issue first and discussing it there to avoid unnecessary work on a pull request. Of course, I will review and try to merge the remaining, outstanding module PRs.

Solution

-

Benefit

-

Alternatives

-

Would you like to help contribute this enhancement?

Yes

HofmeisterAn added the module and chore labels on Mar 4, 2024
HofmeisterAn pinned this issue on Mar 4, 2024
@mrudat commented Apr 26, 2024

Would it be worthwhile to refactor modules so that each image or class of images (e.g. Web/Database/... Servers that might want to share a common interface) has its own repository?

That would mean exactly one Docker image per image-specific repository, presuming that the same configuration, used by implementations in different languages, results in an identical container being tested.

We could also describe the image-specific configuration in a common configuration format and use that to generate the language-specific code (see the sketch below).

For example, username/password/... is often configured via environment variables, and the specific environment variable names are specific to individual images. However, the code for capturing the username/password is duplicated across multiple images and multiple languages, which suggests we maintain too much nearly identical custom code and are overdue for a refactoring pass to extract the common functionality.
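
For illustration, such a shared descriptor might look like the following sketch. Everything here is invented for the example; no such format exists in the project, and all keys and values are hypothetical:

# Hypothetical, language-neutral image descriptor (all keys are invented for
# illustration). A per-language generator could turn this into the
# module-specific builder code.
image: postgres:16
ports:
  - 5432
environment:
  username:
    variable: POSTGRES_USER       # image-specific environment variable name
    default: postgres
  password:
    variable: POSTGRES_PASSWORD
    required: true
wait-strategy:
  log-message: "database system is ready to accept connections"

A generator per language would then map these fields onto the respective builder APIs, keeping the image-specific knowledge in a single place.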

@eddumelendez (Member)
Hi, I think this can help free up some more space for Linux workers: https://github.com/jlumbroso/free-disk-space

@HofmeisterAn (Collaborator, Author)

Hi Eddú, we are doing this already:

# Our modules occupy too much disk space. The GitHub-hosted runners ran into the
# error: "no space left on device." The pulled images are not cleaned up between
# the test runs. One obvious approach is splitting the tests and running them on
# multiple runners. However, we need to keep in mind that running too many
# simultaneous builds has an impact on others as well. We observed that scheduled
# Dependabot builds blocked others in the Testcontainers organization.
- name: Free Disk Space
  uses: jlumbroso/[email protected]
  if: runner.os == 'Linux'
  with:
    tool-cache: true
    android: true
    dotnet: true
    haskell: true
    large-packages: true
    docker-images: true
    swap-storage: false

@kiview (Member) commented May 23, 2024

> Would it be worthwhile to refactor modules so that each image or class of images (e.g. Web/Database/... Servers that might want to share a common interface) has its own repository?

From past experiences with tc-java, this leads to a lot of operational overhead (e.g. releases), unless we have good automation in place.

Splitting and parallelizing the CI workflow at a per-module or module-set level can help, and it is essentially what we do in tc-java as well.

@kevin0x90

As recommended by @kiview, I took a look at the way Testcontainers Java does the parallel execution and would like to support this effort whenever I get some time to work on it.

@HofmeisterAn what do you think about doing the parallel execution analogously to the Java project?

@HofmeisterAn (Collaborator, Author)

> As recommended by @kiview, I took a look at the way Testcontainers Java does the parallel execution and would like to support this effort whenever I get some time to work on it.
>
> @HofmeisterAn what do you think about doing the parallel execution analogously to the Java project?

Definitely something we should try out 👍. We already have the implementation to merge coverage results from different runners (for Linux and Windows), so we probably just need to group the test projects (by core-linux, core-windows, modules), split them, and run them accordingly (see the sketch below). I'm not sure if we need to take a look at GitHub's usage limits. IIRC, we already had an issue with Dependabot once, where we created too many runners in a short period of time, which caused issues for other Testcontainers repos because they got degraded and couldn't create new runners for a while.

Maybe we can also reuse the GitHub composite actions (.github/workflows/actions/tests/action.yml) from this branch.
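
A minimal sketch of what such a grouping could look like as a matrix strategy. The group names, runner labels, and test filter are assumptions for illustration, not the repository's actual configuration:

# Hypothetical matrix that splits the test projects into groups, each group
# running on its own runner (all names and the filter are illustrative).
jobs:
  test:
    strategy:
      matrix:
        include:
          - group: core-linux
            os: ubuntu-22.04
          - group: core-windows
            os: windows-2022
          - group: modules
            os: ubuntu-22.04
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: dotnet test --filter "Category=${{ matrix.group }}"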

@HofmeisterAn (Collaborator, Author) commented Nov 9, 2024

I began looking into this issue and am prioritizing it so I can address the remaining module PRs afterward. So far, my initial tests look quite promising (diff). The max-parallel configuration is helpful for adjusting the maximum number of runners to prevent degradation for the Testcontainers organization (although the configuration may not help here; see the note below). There are a few more improvements we can make, such as building the entire project just once, but for now, I think this is already a good step forward and doesn't require too many changes.
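
For reference, a minimal sketch of where max-parallel sits in a workflow; the values are illustrative:

strategy:
  # Cap the number of matrix jobs that run at the same time so a single
  # workflow does not exhaust the organization's runner capacity.
  max-parallel: 4
  matrix:
    project: [Testcontainers.ActiveMq.Tests, Testcontainers.ArangoDb.Tests]

Note that max-parallel only limits concurrency within a single workflow run's matrix; it does not cap runners across the organization, which may be why it does not help against org-wide degradation.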

@HofmeisterAn (Collaborator, Author)

The disadvantage is that we need to maintain a list of test projects in the GitHub workflow. Ideally, we could generate this list at runtime. At the very least, we need to ensure that no one forgets to add new test projects here.

@HofmeisterAn (Collaborator, Author)

I was thinking about using the following PowerShell command to determine which runner belongs to which project. Instead of reading the file's content, we could simply use its name. If the file does not exist, we can either throw an error or fall back to a default runner.

Get-ChildItem -Path 'tests' -Directory | ForEach-Object {
    @{
        'name'    = $_.Name
        'runs-on' = [string](Get-Content -LiteralPath (Join-Path -Path $_.FullName -ChildPath '.runs-on') -ErrorAction SilentlyContinue)
    }
}

Piped to ConvertTo-Json, the command creates JSON similar to:

[
    {
        "name":  "Testcontainers.ActiveMq.Tests",
        "runs-on":  "ubuntu-22.04"
    },
    {
        "name":  "Testcontainers.ArangoDb.Tests",
        "runs-on":  "ubuntu-22.04"
    },
    {
        "name":  "Testcontainers.Azurite.Tests",
        "runs-on":  null
    }
]

@0xced (Contributor) commented Nov 23, 2024

> The disadvantage is that we need to maintain a list of test projects in the GitHub workflow. Ideally, we could generate this list at runtime. At the very least, we need to ensure that no one forgets to add new test projects here.

I'm going to submit a pull request for this really soon. 😉

Edit: done in #1305.

@HofmeisterAn (Collaborator, Author)

I put together a PowerShell script to generate the JSON (strategy) at build time. But it looks like using the output of one job as the strategy input of another isn't as straightforward as I thought; it seems to require some hacks to make it work (see the sketch after the script).

# Retrieves the GH workflow 'runs-on' configuration for each test project.
# During runtime, the strategy is passed to the CI job, allowing test projects
# to run in parallel on individual runners. Note: the pipeline streams one
# directory at a time, so $testProject set in the first block is still current
# when the last block builds the object for that directory.
$testProjects = Get-ChildItem -Path 'tests' -Directory `
    | ForEach-Object { $testProject = $_.Name; Join-Path -Path $_.FullName -ChildPath '.runs-on' } `
    | ForEach-Object { If (Test-Path -LiteralPath $_) { Get-Content -LiteralPath $_ } Else { $Null } } `
    | ForEach-Object { [PSCustomObject]@{ 'project' = $testProject; 'runs-on' = [string]$_ } }

# Checks if any test project does not contain a valid '.runs-on' configuration.
# If a project is missing this configuration, an error is thrown to prevent
# developers from forgetting to add the configuration.
$runsOnNotFound = $testProjects `
    | Where-Object 'runs-on' -Eq '' `
    | Select-Object -ExpandProperty 'project'

If ($runsOnNotFound)
{
    Write-Error "Please add a '.runs-on' configuration file to the test project:`n  $($runsOnNotFound -Join "`n  ")"
    Exit 1
}
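
For completeness, the usual wiring for feeding such generated JSON into a matrix is a two-job setup with fromJSON: a first job emits the JSON as a job output, and a second job consumes it as its matrix strategy. This is a sketch under the assumption that the script above runs in the first job; the job and step names are invented for illustration:

# Sketch: a "prepare" job publishes the generated JSON as a job output, and
# the "test" job consumes it via fromJSON (job/step names are hypothetical).
jobs:
  prepare:
    runs-on: ubuntu-22.04
    outputs:
      test-projects: ${{ steps.generate.outputs.test-projects }}
    steps:
      - uses: actions/checkout@v4
      - id: generate
        shell: pwsh
        run: |
          # <script from above goes here, defining $testProjects>
          $json = $testProjects | ConvertTo-Json -Compress -AsArray
          "test-projects=$json" >> $env:GITHUB_OUTPUT
  test:
    needs: prepare
    strategy:
      matrix:
        include: ${{ fromJSON(needs.prepare.outputs.test-projects) }}
    runs-on: ${{ matrix.runs-on }}
    steps:
      - uses: actions/checkout@v4
      - run: dotnet test tests/${{ matrix.project }}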
