E2E Testing refactored Scheduler and Watcher #288

Open
15 tasks done
mbthornton-lbl opened this issue Nov 19, 2024 · 5 comments

@mbthornton-lbl
Contributor

mbthornton-lbl commented Nov 19, 2024

Perform end-to-end testing of the refactored Scheduler and Watcher in the dev environment.

Deployment:

nmdcda@perlmutter:login33:~> squeue -u nmdcda
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          30091487      cron cromwell   nmdcda  R 5-10:56:57      1 login35
          30091486      cron   condor   nmdcda  R 5-10:56:55      1 login24
          33101818 regular_m nmdc_con   nmdcda PD       0:00      1 (Priority)
          33101851 regular_m nmdc_con   nmdcda PD       0:00      5 (Priority)
          33101927 regular_m nmdc_con   nmdcda PD       0:00      1 (Priority)

Integration Test Cases:

1. Scheduler creates a new unclaimed JobRequest in the jobs collection in the dev DB

Test Conditions: Create conditions for the Scheduler to pick up incomplete workflow(s) and create new JobRequests via:

  • Added nmdc:omprc-13-01jx8727 to allow.lst

Pass Conditions:

  • log entries showing new JobRequest creation
  • new unclaimed JobRequests in the jobs collection
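
One way to check the second pass condition is to query the dev API with the same filter the Watcher sends to `/jobs` (the filter appears URL-encoded in the Watcher log under Case 2). The sketch below only builds the query string; the function name `unclaimed_jobs_query` is hypothetical, while the `workflow.id` / `claims` filter and `max_page_size=100` are taken from that log line. Authentication and the actual HTTP call are left out.

```python
import json
from urllib.parse import parse_qs, quote, urlencode


def unclaimed_jobs_query(workflow_ids, max_page_size=100):
    """Build the query string for GET /jobs that selects unclaimed jobs.

    Hypothetical helper: it mirrors the filter seen in the Watcher log,
    it is not taken from the nmdc_automation code base.
    """
    job_filter = {
        # Only jobs for workflows this site knows how to run.
        "workflow.id": {"$in": list(workflow_ids)},
        # An empty claims array means no site has claimed the job yet.
        "claims": {"$size": 0},
    }
    params = {"max_page_size": max_page_size, "filter": json.dumps(job_filter)}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlencode(params, quote_via=quote)


query = unclaimed_jobs_query(["MAGs: v1.3.12"])
print("/jobs?" + query)

# Decoding the query string recovers the original filter document.
decoded = json.loads(parse_qs(query)["filter"][0])
print(decoded["claims"])
```

A non-empty result for this query after the Scheduler run would indicate that a new unclaimed JobRequest exists.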

2. Watcher picks up an unclaimed JobRequest and submits Cromwell job

Test Conditions: New unclaimed Job Request(s) in the dev DB (result from success in Case 1)

Pass Conditions:

  • log entry showing Cromwell job submission / status check
  • correctly configured running Cromwell job

Watcher Log:

(nersc-python) nmdcda@perlmutter:login15:~/nmdc_automation/dev> head -50 nohup.out 
2024-11-25 19:58:59,760 INFO: Initializing Watcher: config file: /global/homes/n/nmdcda/nmdc_automation/dev/site_configuration_nersc.toml
2024-11-25 19:58:59,761 INFO: Using state file from config: /global/cfs/cdirs/m3408/var/dev/agent.state
2024-11-25 19:58:59,761 INFO: New Job from State: nmdc:wfmag-11-g7msr323.1, nmdc:66cf64b6-7462-11ef-8b84-deaa01ab0f49
2024-11-25 19:58:59,761 INFO: Last Status: Succeeded
2024-11-25 19:58:59,761 INFO: New Job from State: nmdc:wfmag-12-h52r0792.1, nmdc:c2b7c884-ab78-11ef-8298-3e652b5abb3d
2024-11-25 19:58:59,761 INFO: Last Status: Running
2024-11-25 19:58:59,761 INFO: Adding 2 new jobs from state file.
2024-11-25 19:59:01,019 INFO: Entering polling loop
2024-11-25 19:59:01,047 DEBUG: Starting new HTTPS connection (1): api-dev.microbiomedata.org:443
2024-11-25 19:59:01,690 DEBUG: https://api-dev.microbiomedata.org:443 "POST /token HTTP/11" 200 None
2024-11-25 19:59:01,692 DEBUG: Starting new HTTPS connection (1): api-dev.microbiomedata.org:443
2024-11-25 19:59:02,576 DEBUG: https://api-dev.microbiomedata.org:443 "GET /jobs?max_page_size=100&filter=%7B%22workflow.id%22:%20%7B%22$in%22:%20%5B%22Sequencing%20Noninterleaved:%20%22,%20%22Sequencing%20Interleaved:%20%22,%20%22Reads%20QC:%20v1.0.13%22,%20%22Reads%20QC%20Interleave:%20v1.0.12%22,%20%22Metagenome%20Assembly:%20v1.0.7%22,%20%22Metagenome%20Annotation:%20v1.1.0%22,%20%22MAGs:%20v1.3.12%22,%20%22Readbased%20Analysis:%20v1.0.8%22%5D%7D,%20%22claims%22:%20%7B%22$size%22:%200%7D%7D HTTP/11" 200 16
2024-11-25 19:59:02,577 INFO: Found 0 unclaimed jobs.
2024-11-25 19:59:02,579 INFO: Checking for finished jobs.
2024-11-25 19:59:02,579 DEBUG: Getting job status from https://nmdc-cromwell.freeddns.org:8443/api/workflows/v1/a9068f41-37a8-43cc-b576-4ab4d6aea3c2/status
2024-11-25 19:59:02,581 DEBUG: Starting new HTTPS connection (1): nmdc-cromwell.freeddns.org:8443
2024-11-25 19:59:02,598 DEBUG: https://nmdc-cromwell.freeddns.org:8443 "GET /api/workflows/v1/a9068f41-37a8-43cc-b576-4ab4d6aea3c2/status HTTP/11" 200 64
2024-11-25 19:59:22,601 DEBUG: Starting new HTTPS connection (1): api-dev.microbiomedata.org:443

Running Cromwell Job

 curl --netrc https://nmdc-cromwell.freeddns.org:8443/api/workflows/v1/a9068f41-37a8-43cc-b576-4ab4d6aea3c2/status
{"status":"Running","id":"a9068f41-37a8-43cc-b576-4ab4d6aea3c2"}
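
The status response above is easy to check programmatically. The following is a minimal sketch, not the Watcher's own code: `parse_status` and `is_finished` are hypothetical names, and the set of terminal states is an assumption based on the workflow statuses Cromwell reports (`Succeeded`, `Failed`, `Aborted`). It works on the response body only, so no network access or credentials are needed to try it.

```python
import json

# Assumed terminal states: once a workflow reaches one of these,
# its status is not expected to change again.
TERMINAL_STATES = {"Succeeded", "Failed", "Aborted"}


def parse_status(body):
    """Return (workflow_id, status) from a Cromwell status response body."""
    doc = json.loads(body)
    return doc["id"], doc["status"]


def is_finished(status):
    """True when the workflow no longer needs to be polled."""
    return status in TERMINAL_STATES


# Response body copied from the curl call above.
body = '{"status":"Running","id":"a9068f41-37a8-43cc-b576-4ab4d6aea3c2"}'
workflow_id, status = parse_status(body)
print(workflow_id, status, is_finished(status))
```

For Case 2 the expected state is `Running`; Case 3.1 starts once the same check reports `Succeeded`.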

3.1 Watcher picks up successful Cromwell job and creates data directories and files

Test Conditions: Successful job on Cromwell (result from success in Case 2)

Pass Conditions:

  • Watcher picks up successful Cromwell job
  • Watcher creates expected data directories and data files

3.2 Watcher updates NMDC dev database with successful job DataObjects and WorkflowExecution

Test Conditions: Successful job on Cromwell (result from success in Case 2)

Pass Conditions:

  • log entry showing correct nmdc.Database creation from Cromwell results
  • Watcher creates the expected DataObject and WorkflowExecution documents in the dev DB

@aclum
Contributor

aclum commented Nov 21, 2024

The data_generation_set records attached to microbiomedata/issues#935 are in the dev Mongo database and can be used for end-to-end testing of a workflow starting with Reads QC.

@mbthornton-lbl
Contributor Author

Testing Update:
Scheduler Run on Dev for nmdc:omprc-13-01jx8727

Log:

root@scheduler-645f4d56b6-j9b5f:/conf# ./run.sh
./run.sh: line 8: kill: (243) - No such process
root@scheduler-645f4d56b6-j9b5f:/conf# INFO:root:Initializing Scheduler
INFO:root:Found 1 new jobs for nmdc:wfmgan-11-hg8af485.1
INFO:root:JOB RECORD: nmdc:c2b7c884-ab78-11ef-8298-3e652b5abb3d
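
The first pass condition of Case 1 (log entries showing new JobRequest creation) can be checked by scanning the Scheduler output. This is a rough sketch under the assumption that the two message formats shown in the log above are stable; the function name `summarize_scheduler_log` and both regular expressions are hypothetical and not part of the Scheduler.

```python
import re

# Patterns derived from the two log messages shown above (assumed stable).
NEW_JOBS = re.compile(r"Found (\d+) new jobs for (\S+)")
JOB_RECORD = re.compile(r"JOB RECORD: (\S+)")


def summarize_scheduler_log(lines):
    """Return ({trigger_activity_id: job_count}, [job_record_ids])."""
    found = {}
    records = []
    for line in lines:
        match = NEW_JOBS.search(line)
        if match:
            found[match.group(2)] = int(match.group(1))
            continue
        match = JOB_RECORD.search(line)
        if match:
            records.append(match.group(1))
    return found, records


# Lines copied from the Scheduler log above.
log_lines = [
    "INFO:root:Initializing Scheduler",
    "INFO:root:Found 1 new jobs for nmdc:wfmgan-11-hg8af485.1",
    "INFO:root:JOB RECORD: nmdc:c2b7c884-ab78-11ef-8298-3e652b5abb3d",
]
found, records = summarize_scheduler_log(log_lines)
print(found)
print(records)
```

The extracted job record id can then be compared with the `id` field of the document in the jobs collection.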

@mbthornton-lbl
Contributor Author

Head of the created job record in the dev DB:

{
    "_id" : ObjectId("6744f39ef8d82d27b2716c70"),
    "workflow" : {
        "id" : "MAGs: v1.3.12"
    },
    "id" : "nmdc:c2b7c884-ab78-11ef-8298-3e652b5abb3d",
    "created_at" : ISODate("2024-11-25T22:01:02.000+0000"),
    "config" : {
        "git_repo" : "https://github.com/microbiomedata/metaMAGs",
        "release" : "v1.3.12",
        "wdl" : "mbin_nmdc.wdl",
        "activity_id" : "nmdc:wfmag-12-h52r0792.1",
        "activity_set" : "workflow_execution_set",
        "was_informed_by" : "nmdc:omprc-13-01jx8727",
        "trigger_activity" : "nmdc:wfmgan-11-hg8af485.1",
        "iteration" : NumberInt(1),
        "input_prefix" : "nmdc_mags",
        "inputs" : {
            "proj" : "nmdc:wfmag-12-h52r0792.1",

@mbthornton-lbl
Contributor Author

Full record
nmdc-omprc-13-01jx8727_mags_job.json

@ssarrafan

@mbthornton-lbl can this issue be closed? I see all the pretty checkboxes above.
