Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PREVIEW] teuthology: add support for graph walks of QA suites #2012

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

batrick
Copy link
Member

@batrick batrick commented Nov 27, 2024

This is a preview PR for the branch used for my talk at Cephalocon 2024:

https://ceph2024.sched.com/event/1ktVR/the-art-of-teuthology-patrick-donnelly-ibm-inc

TODO:

  • To maintain consistency with the --subset operator on matrices, we
    should ensure that each fragment is included at least once in a subset
    (if the subset is large enough to make that possible). The probable
    way to do that is to generate the randomized set of children for each
    node once and always pull the first child if a path should be
    generated (then put that child at the end of the queue of children).
  • matrix implementation of rados creates 4455758 jobs whereas graph implementation creates 2792029 jobs. This may be a divergence caused by nested subsets but needs investigation.
  • print fragments not included in any path walk (job) as a warning.
  • docs
  • tests

Patrick Donnelly added 3 commits December 4, 2024 03:16
This is unrelated to graph walking but makes testing much easier to
iterate on. It's not necessary to clone the QA suite whenever the commit
changes; instead clone once and fetch the difference.

Reviewer's note: this commit needs cleaned up, obviously. This is part
of a preview.

Signed-off-by: Patrick Donnelly <[email protected]>
Reviewer's note: this is part of a preview PR. This lacks tests/docs.
There is more documentation to do.

Also, the code needs one small improvement:
- To maintain consistency with the --subset operator on matrices, we
  should ensure that each fragment is included at least once in a subset
  (if the subset is large enough to make that possible). The probable
  way to do that is to generate the randomized set of children for each
  node once and always pull the first child if a path should be
  generated (then put that child at the end of the queue of children).

Signed-off-by: Patrick Donnelly <[email protected]>
Reviewer's note: we will probably want to make the matrix representation
a valid fallback using a command line switch. This is also important
when doing reruns with subsets generated via the matrix representation.

Signed-off-by: Patrick Donnelly <[email protected]>
@batrick batrick changed the title [PLACEHOLDER] teuthology: add support for graph walks of QA suites [PREVIEW] teuthology: add support for graph walks of QA suites Dec 4, 2024
@batrick batrick added the feature label Dec 4, 2024
@@ -95,7 +95,7 @@ def current_branch(path: str) -> str:
return result


def enforce_repo_state(repo_url, dest_path, branch, commit=None, remove_on_error=True):
def enforce_repo_state(dest_clone, dest_path, repo_url, branch, commit=None, remove_on_error=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to change initial signature and leave repo_url at first position, and add dest_cone as an optional argument with defaults value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, there will probably be lots of reworking to this code but it was a dirty hack to start because the clones after every QA suite tweak was driving me nuts. That's why it's in this PR.

def bare_repo(git_dir):
log.info("bare_repo %s", git_dir)
args = ['git', 'init', '--bare', git_dir]
proc = subprocess.Popen(args)
Copy link
Contributor

@phlogistonjohn phlogistonjohn Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't need a lot of the complexity that Popen provides, I heartily recommend using the simpler subprocess.run function (avialable since 3.5). You can use the check=True parameter to have it raise exceptions on non-zero exit automatically too.

@phlogistonjohn
Copy link
Contributor

I think this is really cool improvement. I have a possibly silly suggestion, based on the nice graphs in your cephalocon presentation, would it be reasonable to have a mode that generates graphvis output that can be used to generate graphical representations of the steps? This kind of visualization could be handy to debug/understand the steps as they're assembled.

@batrick
Copy link
Member Author

batrick commented Dec 17, 2024

I think this is really cool improvement. I have a possibly silly suggestion, based on the nice graphs in your cephalocon presentation, would it be reasonable to have a mode that generates graphvis output that can be used to generate graphical representations of the steps? This kind of visualization could be handy to debug/understand the steps as they're assembled.

I already played around with a few options (including a terminal representation, which is ... fun to grok). The code is there. I would ultimately want to:

  • make a .svg for the path walk that produced a job, in the job dir
  • make a .svg for the entire suite graph in the run dir
  • make a .svg for a dry-run for inspection prior to scheduling

If you have tooling/library suggestions, please do share.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants