Added integration tests for error scenarios and added error observability to plugin #30

Merged 21 commits on Feb 10, 2023

nathandotleeathpe
Contributor

I added integration tests simulating driver errors at each DWS stage. The feature file contains a comment detailing how the tests need to change if Flags=TeardownFailure is configured.

Driver errors are now returned to Slurm, so they show up in the SystemComment field of scontrol output. The plugin uses a JSONPath query to return all DriverStatus entries for the current watch state, then filters those entries to find errors. The query was designed to handle multiline driver errors. Also note that Lua iterators return nil once they are exhausted.
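
The filtering described above can be sketched in Python (the plugin itself is Lua, so this is only an illustration of the idea). The field names `status.drivers`, `watchState`, and `error`, the helper names, and the sample data are all assumptions made for this sketch and are not taken from this PR.

```python
from typing import List, Optional

# Hypothetical Workflow resource; field names and values are assumptions.
SAMPLE_WORKFLOW = {
    "status": {
        "state": "Setup",
        "drivers": [
            {"driverID": "a", "watchState": "Setup", "error": "mount failed\ndevice busy"},
            {"driverID": "b", "watchState": "Setup", "error": ""},
            {"driverID": "c", "watchState": "Teardown", "error": "not reached yet"},
        ],
    }
}


def driver_errors(workflow: dict, state: str) -> List[str]:
    """Return the error text of every driver entry that watches `state`."""
    drivers = workflow.get("status", {}).get("drivers", [])
    errors = []
    for entry in drivers:
        if entry.get("watchState") != state:
            continue
        message = (entry.get("error") or "").strip()
        if message:
            # A multiline error is kept whole rather than split per line.
            errors.append(message)
    return errors


def first_error(workflow: dict, state: str) -> Optional[str]:
    """Return the first error, or None when there is nothing to iterate."""
    return next(iter(driver_errors(workflow, state)), None)
```

`first_error` falls back to `None` when the iterator is exhausted, which is the closest Python counterpart to a Lua iterator returning nil at the end of its sequence.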

The unit tests were updated to cover the driver error handling. Previously, the unit tests could also run against a live Kubernetes and Slurm environment. The integration tests make that mode unnecessary, so I removed it. I also removed the old CRD validator tool for the same reason.

nathandotleeathpe and others added 13 commits December 15, 2022 10:22
Signed-off-by: Nathan Lee <[email protected]>
Signed-off-by: Nathan Lee <[email protected]>
…taWorkflowServices#28)

* Using dws-test-driver for DWS state progression integration tests

Signed-off-by: Nathan Lee <[email protected]>

* Fixed integration test errors

Signed-off-by: Nathan Lee <[email protected]>

* code review changes

Signed-off-by: Nathan Lee <[email protected]>

* Updated dws-test-driver to main branch HEAD

Signed-off-by: Nathan Lee <[email protected]>

* Code review

Signed-off-by: Nathan Lee <[email protected]>

---------

Signed-off-by: Nathan Lee <[email protected]>

github-actions bot commented Feb 6, 2023

Code Coverage

Package       Line Rate        Health
burst_buffer  90%
Summary       90% (323 / 358)

Signed-off-by: Nathan Lee <[email protected]>
@roehrich-hpe
Contributor

Would you please update the dws-test-driver and slurm-docker-cluster submodules with this PR? When they don't point at commits on 'main', the GitHub CLI has trouble checking out this PR:

$ gh pr checkout 30 
$ git submodule update --init --recursive
[...]
fatal: Fetched in submodule path 'testsuite/submodules/slurm-docker-cluster', but it did not contain 2f85cf28023416c0382a5a0e31a885b24f67a2bd. Direct fetching of that commit failed.

Maybe that's a commit in one of your personal forks?
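
One way to catch this before pushing is to check that each pinned submodule commit is reachable from the upstream branch. The sketch below is a rough illustration, not part of this PR: it assumes the submodules are already initialized in the local clone and that the upstream branch is named `origin/main`, and the helper names are hypothetical.

```python
import subprocess
from typing import List, Tuple


def parse_submodule_status(output: str) -> List[Tuple[str, str]]:
    """Return (commit, path) pairs from `git submodule status` output."""
    pairs = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # The first character is a status flag (' ', '-', '+', or 'U');
        # the commit SHA and the submodule path follow it.
        fields = line[1:].split()
        pairs.append((fields[0], fields[1]))
    return pairs


def is_on_branch(path: str, commit: str, branch: str = "origin/main") -> bool:
    """True if `commit` is an ancestor of `branch` in the repo at `path`."""
    result = subprocess.run(
        ["git", "-C", path, "merge-base", "--is-ancestor", commit, branch],
        capture_output=True,
    )
    # Exit status 0 means "is an ancestor"; anything else is treated as "no".
    return result.returncode == 0


def unreachable_submodules(branch: str = "origin/main") -> List[Tuple[str, str]]:
    """List pinned submodule commits that are not reachable from `branch`."""
    status = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [
        (commit, path)
        for commit, path in parse_submodule_status(status)
        if not is_on_branch(path, commit, branch)
    ]
```

Running `unreachable_submodules()` from the top of the superproject returns an empty list when every pinned commit is on the upstream branch.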

Signed-off-by: Nathan Lee <[email protected]>
@roehrich-hpe
Contributor

When you update the submodules, you should leave 'dws' at v0.0.6, until we snap another release of it.

@nathandotleeathpe nathandotleeathpe merged commit 0dae837 into DataWorkflowServices:main Feb 10, 2023

2 participants