[Bug]: Windows CI builds failing to find docker (Update to Run Windows inside docker containers) #281

Closed
mch2 opened this issue May 9, 2023 · 39 comments

Comments

@mch2
Member

mch2 commented May 9, 2023

Describe the bug

Windows CI builds are failing, example: https://build.ci.opensearch.org/job/gradle-check/14914/console

+ docker logout
C:/Users/Administrator/jenkins/workspace/gradle-check@tmp/durable-392d4e2e/script.sh: line 1: docker: command not found
[Pipeline] }
[Pipeline] // script
Error when executing always post condition:
hudson.AbortException: script returned exit code 127
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.handleExit(DurableTaskStep.java:664)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:610)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:554)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
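
Exit code 127 in the log above is the shell's status for a command it cannot find, so the `docker logout` in the post step fails before Docker is ever contacted. A minimal sketch of a guard for that step, assuming a POSIX-style shell such as the Git Bash the agent appears to use. The function names `have_cmd` and `safe_docker_logout` are hypothetical and are not part of the actual pipeline:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical post step: log out only when the docker CLI exists,
# so a missing CLI no longer surfaces as exit code 127.
safe_docker_logout() {
    if have_cmd docker; then
        docker logout
    else
        echo "docker not found on PATH; skipping logout" >&2
        return 0
    fi
}
```

A guard like this only keeps the post step from masking the real build result. It does not make the Docker tests run, which still requires Docker to be installed on the Windows agent.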

To reproduce

N/A

Expected behavior

Builds should pass and docker tests should run.

Host / Environment

No response

Additional context

No response

Relevant log output

No response

@mch2 added the bug and untriaged labels on May 9, 2023
@jordarlu removed the untriaged label on May 12, 2023
@jordarlu
Contributor

Hi @mch2, could you let me know how you triggered the gradle_check in this case? If a PR triggered it, can you send me the PR link?
Thanks,

CC @peterzhuamazon

@peterzhuamazon
Member

I will take care of this as I have talked to @mch2 offline.
Thanks.

@peterzhuamazon
Member

peterzhuamazon commented May 12, 2023

I was able to get Docker running on Windows with Hyper-V.

Administrator@<> MINGW64 ~
$ docker version
Client:
 Version:           23.0.6
 API version:       1.42
 Go version:        go1.19.9
 Git commit:        ef23cbc
 Built:             Fri May  5 21:18:35 2023
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.6
  API version:      1.42 (minimum version 1.24)
  Go version:       go1.19.9
  Git commit:       9dbdbd4
  Built:            Fri May  5 21:17:32 2023
  OS/Arch:          windows/amd64
  Experimental:     false



Administrator@<> MINGW64 ~
$  docker pull mcr.microsoft.com/windows/nanoserver:ltsc2019
ltsc2019: Pulling from windows/nanoserver
aaaa081173ae: Pulling fs layer
aaaa081173ae: Verifying Checksum
aaaa081173ae: Download complete
aaaa081173ae: Pull complete
Digest: sha256:fb78bd84ac937f6b1453e19015ccce41636bbeca5fe5bc6dc5c7d55adb4a2bc5
Status: Downloaded newer image for mcr.microsoft.com/windows/nanoserver:ltsc2019
mcr.microsoft.com/windows/nanoserver:ltsc2019

@mch2, please confirm exactly which images the Docker tests on Windows run.

On Windows with Hyper-V, the Windows host can only run Windows containers.
If we need the Windows host to run Linux containers, we would have to enable WSL2 later on, and that might cause issues.

Please let me know about this.
Thanks.
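
A minimal sketch of how a script could check which kind of container the daemon can run before starting a test, assuming the Docker CLI is on PATH and the daemon is running. `docker info --format '{{.OSType}}'` reports the daemon's container OS. The helper names `daemon_os` and `can_run` are hypothetical:

```shell
#!/bin/sh
# Ask the daemon which container OS it runs ("windows" or "linux").
daemon_os() {
    docker info --format '{{.OSType}}'
}

# Hypothetical check: an image is runnable only if its OS matches the daemon's.
# $1 = daemon OS, $2 = image OS
can_run() {
    [ "$1" = "$2" ]
}

# Usage (requires a running daemon):
#   if can_run "$(daemon_os)" linux; then
#       echo "Linux containers available"
#   fi
```

On the host described above, `daemon_os` would be expected to print `windows`, so a Linux image would be rejected by this check.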

@peterzhuamazon
Member

peterzhuamazon commented May 12, 2023

Also, this can be a good start on these two issues, to bring Windows integTest onto a Docker host and containers and even to build the artifacts in Windows Docker containers.

Here is a chart comparing some of the key differences between two of the container offerings on Windows, Windows Server with the Server Core installation and Windows Nano Server:

| Feature | Windows Server with Server Core | Windows Nano Server |
| --- | --- | --- |
| Installation size | Larger (several GBs) | Smaller (a few hundred MBs) |
| Attack surface | Larger | Smaller |
| Support for GUI | Yes (minimal) | No |
| Support for 32-bit applications | Yes | No |
| Support for Windows Services | Yes | Limited |
| Support for .NET Framework | Yes | Limited |
| Support for Containers | Yes | Yes |
| Licensing | Standard, Datacenter | Standard, Datacenter |
| Available editions | All Windows Server editions | Standard and Datacenter only |

Will try to see if we can bring nanoserver in place to make Windows lightweight in build, test, and check.

Thanks.

@peterzhuamazon
Member

peterzhuamazon commented Jul 26, 2023

I eventually got a Docker container running nanoserver on Windows:


PS C:\Users\Administrator> docker images
REPOSITORY                             TAG        IMAGE ID       CREATED       SIZE
mcr.microsoft.com/windows/nanoserver   ltsc2019   82ef3885248c   2 weeks ago   252MB

PS C:\Users\Administrator> docker run 82ef3885248c
Microsoft Windows [Version 10.0.17763.4645]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\>

PS C:\Users\Administrator> docker ps -a
CONTAINER ID   IMAGE          COMMAND                    CREATED              STATUS                          PORTS     NAMES
4aced3bb72dd   82ef3885248c   "c:\\windows\\system32…"   About a minute ago   Exited (0) About a minute ago             blissful_liskov

PS C:\Users\Administrator> docker rm 4aced3bb72dd
4aced3bb72dd

PRs:

  • Updating

@peterzhuamazon
Member

We will be better off with the servercore option rather than nanoserver, as the latter lacks several core components, while servercore is just a headless version of the normal Windows Server base.

https://techcommunity.microsoft.com/t5/containers/nano-server-x-server-core-x-server-which-base-image-is-the-right/ba-p/2835785

@peterzhuamazon
Member

peterzhuamazon commented Jul 26, 2023

Issue in the Windows Docker container that I am currently unable to solve, which prevents it from matching the AMI:
Move-Item : Access to the path is denied.

moby/moby#38256
microsoft/Windows-Containers#147

@peterzhuamazon
Member

Just confirmed that I am using --isolation=process, not --isolation=hyperv.

@peterzhuamazon
Member

Able to resolve the move issue by using mingw and forcing the mv to happen through bash.exe.

bash.exe -c "mv -v 'C:\\Windows\\System32\\find.exe' 'C:\\Windows\\System32\\find_windows.exe'"

renamed 'C:\Windows\System32\find.exe' -> 'C:\Windows\System32\find_windows.exe'

@peterzhuamazon
Member

Seems like an issue with volta on 1.1.1: volta-cli/volta#1435

Will revert to the older 1.0.8 or 1.1.0 for now.

Thanks.

@peterzhuamazon
Member

Able to invoke bash.exe directly in the Windows container and run the test workflow:

ContainerAdministrator@44082dfc4844 MINGW64 /c
$ whoami
ContainerAdministrator


@peterzhuamazon
Member

peterzhuamazon commented Aug 29, 2023

git clone of the build repo is now instant on the Windows host.

@peterzhuamazon
Member

There is a bug right now: every time we pull the image fresh, the job always fails once on the sh stage.
I suspect we need to pre-load the image on the runner beforehand.
It succeeds on the second run soon after. The first run fails with:

ERROR: script returned exit code 127
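
A minimal sketch of the pre-load idea, assuming the runner can execute a setup script before the first sh stage. The `retry` helper is hypothetical, and the image name is only the one pulled earlier in this thread, so the image the CI job actually uses may differ. Whether pre-pulling alone avoids the first failure is the suspicion stated above, not something verified here:

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, stopping at first success.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        n=$((n + 1))
    done
    return 1
}

# Assumed image name, taken from the pull earlier in this thread.
IMAGE="mcr.microsoft.com/windows/nanoserver:ltsc2019"

# Usage on the runner, before the first sh stage:
#   retry 2 docker pull "$IMAGE"
```

The retry count of 2 mirrors the observation that the second run succeeds. It is an illustrative choice rather than a tuned value.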

@peterzhuamazon
Member

Added a Docker image initialization step on the Windows Docker host to resolve the above issue.

@peterzhuamazon
Member

Added new integTest support with Windows containers.

@peterzhuamazon
Member

Per opensearch-project/opensearch-build#3816, we have fixed the docker command issues on Windows, but it only supports Hyper-V running Windows containers on a Windows host through Docker.

Per discussion with @mch2, the core team needs to disable the Linux container related tests on Windows.

Thanks.
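
A minimal sketch of one way a wrapper script could gate Linux container tests on Windows, assuming the tests are launched from a POSIX-style shell such as Git Bash. This is not the mechanism the core team uses. The helper names `is_windows_shell` and `linux_container_tests` are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: true when a `uname -s` value looks like a Windows
# POSIX layer (Git Bash and MSYS2 report MINGW* or MSYS*, Cygwin reports CYGWIN*).
is_windows_shell() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical gate: prints "skip" or "run" for the Linux container tests.
linux_container_tests() {
    if is_windows_shell "$1"; then
        echo "skip"
    else
        echo "run"
    fi
}

# Usage: linux_container_tests "$(uname -s)"
```

Passing the `uname -s` value as an argument keeps the decision easy to exercise on any platform.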
