gpu functional testing #1744
31a6af1 to 19bf0d1
}],
"command": ["sh", "-c", "nvidia-smi -L | wc -l | grep \"2\" && exit 42 || exit 1"]
}]
}
Add a newline at the end of the file.
@@ -1245,3 +1246,42 @@ func TestSSMSecretsEncryptedASMSecrets(t *testing.T) {
exitCode, _ := task.ContainerExitcode("ssmsecrets-environment-variables")
assert.Equal(t, 42, exitCode, fmt.Sprintf("Expected exit code of 42; got %d", exitCode))
}

// Note: This functional test requires ECS GPU instance which has atleast 4 GPUs
// Please use instance like p3.8xlarge for running this test
We need to confirm with Justin/Jake that this will work well with the release tests.
"type":"GPU",
"value": "2"
}],
"command": ["sh", "-c", "nvidia-smi -L | wc -l | grep \"2\" && exit 42 || exit 1"]
grep \"2\" seems to be too broad: 20/22/12 will also pass the test. Is there a way to check that it's exactly 2?
I think we can use grep -w for an exact match.
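To make the concern in this thread concrete, here is a minimal Go sketch (not code from this PR) that compares three ways of checking the GPU count string. The function names are hypothetical, and the `\b2\b` regular expression only approximates what grep -w does with the pattern "2".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wholeWordTwo approximates `grep -w "2"`: a "2" that is not adjacent to
// another word character (letter, digit, or underscore).
var wholeWordTwo = regexp.MustCompile(`\b2\b`)

// substringMatch mimics plain `grep "2"`: true if "2" appears anywhere.
func substringMatch(count string) bool {
	return strings.Contains(count, "2")
}

// wordMatch approximates `grep -w "2"`.
func wordMatch(count string) bool {
	return wholeWordTwo.MatchString(count)
}

// exactMatch is the strictest check: the trimmed text must equal "2".
func exactMatch(count string) bool {
	return strings.TrimSpace(count) == "2"
}

func main() {
	// Candidate outputs of `wc -l`, including the values named in the review.
	for _, count := range []string{"2", "12", "20", "22"} {
		fmt.Printf("count=%-2s substring=%-5t word=%-5t exact=%t\n",
			count, substringMatch(count), wordMatch(count), exactMatch(count))
	}
}
```

The substring check accepts every value in the loop, while the word and exact checks accept only "2", which is the behavior the reviewer asks for.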
iid, _ := ec2.NewEC2MetadataClient(nil).InstanceIdentityDocument()
for _, gpuInstance := range gpuInstances {
if strings.HasPrefix(iid.InstanceType, gpuInstance) {
// GPU test should only run on p2/p3 ECS instances
p2/p3/g3
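For reference, the prefix check in the hunk above can be sketched in isolation as follows. This is an illustration, not the PR's code: the `gpuInstancePrefixes` list and the `isGPUInstanceType` helper are assumed names, and the instance type is passed in as a plain string instead of being read from the instance identity document.

```go
package main

import (
	"fmt"
	"strings"
)

// gpuInstancePrefixes is an assumed list built from the families named in
// the review comment (p2/p3/g3).
var gpuInstancePrefixes = []string{"p2", "p3", "g3"}

// isGPUInstanceType reports whether instanceType starts with one of the
// GPU family prefixes.
func isGPUInstanceType(instanceType string) bool {
	for _, prefix := range gpuInstancePrefixes {
		if strings.HasPrefix(instanceType, prefix) {
			return true
		}
	}
	return false
}

func main() {
	for _, instanceType := range []string{"p3.8xlarge", "g3.4xlarge", "m5.large"} {
		fmt.Println(instanceType, isGPUInstanceType(instanceType))
	}
}
```

Because this is a prefix match, variants such as p3dn or g3s types are accepted as well, so the code comment should list every family the test is meant to cover.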
Summary
A simple GPU functional test that verifies that the right number of GPU devices is assigned to the task's container.
NOTE: This cannot be merged until the backend changes are in prod (the test passes in gamma).
Implementation details
Check if the instance is of type p2/p3/g3, set the config variable ECS_ENABLE_GPU_SUPPORT, and bind mount the GPU info file (created on the instance by init) into the functional test's agent container. Verify that two GPUs are assigned to an NVIDIA CUDA container. **
** For the test, use a GPU instance that has at least 2 NVIDIA GPUs.
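The verification step can be pictured with the following Go sketch, which mirrors the container command (count the lines printed by nvidia-smi -L, then exit 42 on a match and 1 otherwise). It is an illustration under assumptions, not the test's implementation: `countGPULines` and `exitCodeFor` are hypothetical helpers, and the sample listing uses placeholder device names and UUIDs instead of real output.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countGPULines counts lines that start with "GPU ", the per-device prefix
// of `nvidia-smi -L` output. This is roughly what `nvidia-smi -L | wc -l`
// measures when every line describes one device.
func countGPULines(listing string) int {
	count := 0
	scanner := bufio.NewScanner(strings.NewReader(listing))
	for scanner.Scan() {
		if strings.HasPrefix(strings.TrimSpace(scanner.Text()), "GPU ") {
			count++
		}
	}
	return count
}

// exitCodeFor follows the test's convention: 42 when the container sees
// exactly the expected number of GPUs, 1 otherwise.
func exitCodeFor(listing string, expected int) int {
	if countGPULines(listing) == expected {
		return 42
	}
	return 1
}

func main() {
	// Hypothetical sample listing; names and UUIDs are placeholders.
	sample := "GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-placeholder-0)\n" +
		"GPU 1: Tesla V100-SXM2-16GB (UUID: GPU-placeholder-1)\n"
	fmt.Println("GPUs visible:", countGPULines(sample))
	fmt.Println("exit code:", exitCodeFor(sample, 2))
}
```

Comparing the count as an integer avoids the substring problem raised in the review, since a count of 12 or 20 can never equal 2.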
Testing
(make release)
(go build -out amazon-ecs-agent.exe ./agent)
(make test) pass
(go test -timeout=25s ./agent/...) pass
(make run-integ-tests) pass
(.\scripts\run-integ-tests.ps1) pass
(make run-functional-tests) pass
(.\scripts\run-functional-tests.ps1) pass
Ran the test manually on p2, p3, and g3 instances and it passes
Test output:
New tests cover the changes: yes
Description for the changelog
N/A
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.