
Monitor classes for SLURM, PBS, and LSF #32

Open
4 tasks done
wlandau opened this issue Jan 8, 2024 · 17 comments

Comments

@wlandau
Owner

wlandau commented Jan 8, 2024

Prework

  • Read and agree to the Contributor Code of Conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a GitHub discussion.
  • Format your code according to the tidyverse style guide.

Proposal

crew.cluster 0.2.0 supports a new "monitor" class to help list and terminate SGE jobs from R instead of the command line. https://wlandau.github.io/crew.cluster/index.html#monitoring shows an example using crew_monitor_sge():

monitor <- crew_monitor_sge()
job_list <- monitor$jobs()
job_list
#> # A tibble: 2 × 9
#>   job_number prio    name    owner state start_time queue_name jclass_name slots
#>   <chr>      <chr>   <chr>   <chr> <chr> <chr>      <chr>      <lgl>       <chr>
#> 1 131853812  0.05000 crew-m… USER… r     2024-01-0… all.norma… NA          1    
#> 2 131853813  0.05000 crew-m… USER… r     2024-01-0… all.norma… NA          1
monitor$terminate(jobs = job_list$job_number)
#> USER has registered the job 131853812 for deletion
#> USER has registered the job 131853813 for deletion
monitor$jobs()
#> data frame with 0 columns and 0 rows

Currently only SGE is supported. I would like to add other monitor classes for other clusters, but I do not have access to SLURM, PBS, or LSF. cc'ing @nviets, @brendanf, and/or @mglev1n, in case you are interested.

@nviets

nviets commented Jan 9, 2024

Hi @wlandau - just to confirm my understanding: you're proposing we add, for instance, crew_monitor_slurm() and all related bits, following crew_monitor_sge.R?

@wlandau
Owner Author

wlandau commented Jan 9, 2024

Yes, exactly! On SGE, the hardest part for me was parsing job status information. I had to dig into the XML because the non-XML output from qstat is not machine-readable. Other than that, we would just use SLURM's commands instead of qstat/qdel. The R6 boilerplate should be a simple copy/paste.
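To illustrate the XML-parsing approach mentioned above, here is a minimal sketch using the xml2 package on a hand-written, abbreviated payload (the element names follow SGE's `qstat -xml` conventions, but the sample itself is hypothetical and real output has many more fields):

```r
library(xml2)

# Hypothetical abbreviated `qstat -xml` output for illustration only.
xml_text <- '<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>131853812</JB_job_number>
      <state>r</state>
    </job_list>
    <job_list state="running">
      <JB_job_number>131853813</JB_job_number>
      <state>r</state>
    </job_list>
  </queue_info>
</job_info>'

doc <- read_xml(xml_text)
jobs <- xml_find_all(doc, "//job_list")
out <- data.frame(
  job_number = xml_text(xml_find_first(jobs, "JB_job_number")),
  state = xml_text(xml_find_first(jobs, "state"))
)
out
```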

@wlandau
Owner Author

wlandau commented Jan 9, 2024

The R6 boilerplate should be a simple copy/paste.

Actually, first I would like to simplify this part by creating a common abstract parent class for all the monitors to inherit from...
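Roughly, the abstract-parent idea could look like the sketch below. The class and method names here are hypothetical placeholders; the real interface in crew.cluster may differ.

```r
library(R6)

# Abstract parent: declares the interface every scheduler monitor must implement.
monitor_cluster <- R6Class(
  classname = "monitor_cluster",
  public = list(
    jobs = function() stop("jobs() must be implemented by a subclass."),
    terminate = function(jobs) stop("terminate() must be implemented by a subclass.")
  )
)

# A scheduler-specific monitor then only supplies the scheduler details.
monitor_slurm <- R6Class(
  classname = "monitor_slurm",
  inherit = monitor_cluster,
  public = list(
    jobs = function() {
      # In a real implementation, call squeue here and parse the output.
      data.frame(job_id = character(0L), state = character(0L))
    }
  )
)

monitor <- monitor_slurm$new()
monitor$jobs()
```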

@nviets

nviets commented Jan 9, 2024

I'll give some thought to slurm. There are the usual slurm commands (squeue, scancel, etc...) whose output we could parse, but there's also a DB (optional and typically used in larger installations) that could be queried. Maybe the former is better at least in the short term, since not everyone will have the DB.

@wlandau
Owner Author

wlandau commented Jan 9, 2024

Thanks for looking into this! In the end I would prefer something that all/most SLURM users would be able to use.

By the way, as of 8cf036b I created a parent monitor class that all cluster-specific monitors inherit from: https://github.com/wlandau/crew.cluster/blob/main/R/crew_monitor_cluster.R. This helps reduce duplicated code/docs. The SGE monitor is much shorter now and easy to copy: https://github.com/wlandau/crew.cluster/blob/main/R/crew_monitor_sge.R. Tests are at https://github.com/wlandau/crew.cluster/blob/main/tests/testthat/test-crew_monitor_sge.R and https://github.com/wlandau/crew.cluster/blob/main/tests/sge/monitor.R.

@brendanf
Contributor

To make sure I understand, the monitor is only for interactive use? So the data.frame which is output by jobs() does not need to have any particular column names?

@brendanf
Contributor

brendanf commented Feb 20, 2024

There are two options for squeue that I am aware of. The first is to parse the standard output, which is a fixed-width table (optionally, the columns and widths can be specified with the -o or -O options if we don't trust the defaults to be the same for all users):

# this is the default format given in `man squeue`, but specify it
# in case some user's configuration is different
default_format <- "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"
text <- system2(
  "squeue",
  args = shQuote(c("-u", user, "-o", default_format)),
  stdout = TRUE,
  stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
con <- textConnection(text)
out <- read.fwf(
  con,
  widths = c(18, -1, 9, -1, 8, -1, 8, -1,  2, -1, 10, -1, 6, -1, 100),
  skip = 1,
  col.names = c("JOBID", "PARTITION", "NAME", "USER", "ST", "TIME", "NODES", "NODELIST_REASON"),
  strip.white = TRUE
)
tibble::as_tibble(out)
# A tibble: 7 × 8
#     JOBID PARTITION NAME     USER     ST    TIME  NODES NODELIST_REASON
#     <int> <chr>     <chr>    <chr>    <chr> <chr> <int> <chr>          
#1 20504876 small     crew-Opt brfurnea R     52:46     1 r18c36         
#2 20504877 small     crew-Opt brfurnea R     52:46     1 r18c23         
#3 20504863 small     crew-Opt brfurnea R     52:50     1 r18c41         
#4 20504851 small     crew-Opt brfurnea R     53:06     1 r18c33         
#5 20504854 small     crew-Opt brfurnea R     53:06     1 r18c35         
#6 20504857 small     crew-Opt brfurnea R     53:06     1 r18c40         
#7 20504848 small     OptimOTU brfurnea R     53:35     1 r18c43

The second option is squeue --yaml, which gives a full dump of the entire queue. Arguments like -u do nothing to filter the output, so the monitor would have to do this itself. Especially on a big cluster, this is a lot of data:

text <- system2("squeue", args = shQuote("--yaml"), stdout = TRUE, stderr = FALSE, wait = TRUE)
length(text)
# [1] 269314

This is partly because there are a lot of jobs, but also because the dump includes all possible fields, more than 100 per job.

My feeling is that option 1 is the way to go, despite the fact that fixed-width output may truncate some values (for instance, NAME above).
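One possible mitigation for the truncation issue, sketched below under the assumption that format fields given without an explicit width are neither padded nor truncated: request a delimiter-separated format (e.g. `%i|%P|%j|...`) and split on the delimiter instead of reading fixed widths. The simulated output lines are illustrative, not from a real cluster, and a job name containing `|` would still break this.

```r
fmt <- "%i|%P|%j|%u|%t|%M|%D|%R"
# On a real cluster (untested here), something like:
# text <- system2("squeue", args = c("-u", user, "-h", "-o", shQuote(fmt)), stdout = TRUE)
# Simulated pipe-delimited output for illustration:
text <- c(
  "20504876|small|crew-OptimOTU-long-name|brfurnea|R|52:46|1|r18c36",
  "20504877|small|crew-OptimOTU-long-name|brfurnea|R|52:46|1|r18c23"
)
fields <- c("JOBID", "PARTITION", "NAME", "USER",
            "ST", "TIME", "NODES", "NODELIST_REASON")
out <- read.table(
  text = paste(text, collapse = "\n"),
  sep = "|",
  col.names = fields
)
out
```

With no fixed widths involved, long names survive intact in the NAME column.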

@wlandau
Owner Author

wlandau commented Feb 21, 2024

That's a tough choice, and it's a shame that the more structured YAML-based output is so large. How large exactly, in terms of the size of the output and the execution time? I am concerned that subtle variations from cluster to cluster and odd things like spaces in job names could interfere with parsing the standard output.

@brendanf
Contributor

On my cluster, squeue --yaml returned 11 MB in 0.7s. Parsing the result with yaml::read_yaml() took about 1.4s. At the time of my test there were 2166 jobs in the queue. If it's only going to be used interactively, it's probably acceptable, but I certainly would not want to call it often in a script.

@wlandau
Owner Author

wlandau commented Feb 21, 2024

Yeah, monitor objects are just for interactive use. I think those performance metrics are not terrible as long as the documentation gives the user a heads up.

@brendanf
Contributor

The yaml queue dump includes 111 fields for each job, some of which are themselves structured; e.g. one field is "job resources" which looks like this:

job_resources
job_resources$nodes
[1] "r15c35"

job_resources$allocated_cores
[1] 6

job_resources$allocated_hosts
[1] 1

job_resources$allocated_nodes
job_resources$allocated_nodes[[1]]
job_resources$allocated_nodes[[1]]$sockets
job_resources$allocated_nodes[[1]]$sockets$`0`
job_resources$allocated_nodes[[1]]$sockets$`0`$cores
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`6`
[1] "allocated"

job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`7`
[1] "allocated"

job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`8`
[1] "allocated"

job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`9`
[1] "allocated"

job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`10`
[1] "allocated"

job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`11`
[1] "allocated"

job_resources$allocated_nodes[[1]]$nodename
[1] "r15c35"

job_resources$allocated_nodes[[1]]$cpus_used
[1] 6

job_resources$allocated_nodes[[1]]$memory_used
[1] 12288

job_resources$allocated_nodes[[1]]$memory_allocated
[1] 12288

@brendanf
Contributor

This code approximately recreates the default squeue output. I substituted start time for elapsed time, because the yaml does not actually include elapsed time, and computing it myself risks time-zone mismatches, e.g. where I am using UTC while SLURM is configured to use local time or vice versa.

`%||%` <- rlang::`%||%`
user <- ps::ps_username()
monitor_cols <- c("job_id", "partition", "name", "user_name", "job_state",
       "start_time", "node_count", "state_reason")
text <- system2(
  "squeue",
  args = "--yaml",
  stdout = TRUE,
  # stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
yaml <- yaml::read_yaml(text = text)
out <- purrr::map(
  yaml$jobs,
  ~ tibble::new_tibble(
    c(
      purrr::map(.x[monitor_cols], ~ unlist(.x) %||% NA),
      list(nodes = paste(unlist(.x$job_resources$nodes), collapse = ",") %||% NA)
    )
  )
)
out <- do.call(vctrs::vec_rbind, out)
out <- out[out$user_name == user, ]
out$start_time <- as.POSIXct(out$start_time, origin = "1970-01-01")
out

# A tibble: 14 × 9
     job_id partition name    user_name job_state start_time          node_count
      <int> <chr>     <chr>   <chr>     <chr>     <dttm>                   <int>
 1 20386512 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 2 20386513 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 3 20386514 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 4 20386515 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 5 20386516 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 6 20386517 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 7 20386509 longrun   R_Mothguilbaul  RUNNING   2024-02-09 09:05:33          1
 8 20446032 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
 9 20446033 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
10 20446034 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
11 20446035 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
12 20446036 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
13 20446037 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
14 20446004 longrun   R_Mothguilbaul  RUNNING   2024-02-14 09:27:25          1
# ℹ 2 more variables: state_reason <chr>, nodes <chr>

@wlandau
Owner Author

wlandau commented Feb 22, 2024

Nice! Got time for a PR?

@brendanf brendanf mentioned this issue Feb 22, 2024
@nviets

nviets commented Feb 23, 2024

Sorry, I was pulled away from this thread by work. The yaml option looks like a much better approach than parsing the standard squeue output, but I think it requires an extra serializer plugin and a minimum Slurm version. It would be worth adding a warning or something. See: Why am I getting the following error: "Unable to find plugin: serializer/json"?.
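One way to guard against that could be a version check before attempting `--yaml`, sketched below. The minimum version here is an assumption for illustration; adjust it to whichever release first shipped the serializer plugins on your site.

```r
slurm_version <- function(text = system2("squeue", "--version", stdout = TRUE)) {
  # `squeue --version` prints e.g. "slurm 21.08.8-2"; keep just "21.08.8".
  gsub("^slurm\\s+|[-].*$", "", text)
}

# "21.08" is an assumed cutoff, not a documented requirement.
supports_yaml <- function(version, minimum = "21.08") {
  utils::compareVersion(version, minimum) >= 0
}

supports_yaml(slurm_version("slurm 20.11.9"))    # FALSE
supports_yaml(slurm_version("slurm 23.02.1-1"))  # TRUE
```

A monitor could fall back to parsing the plain squeue table (or emit a clear warning) when the check fails.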

@mglev1n
Contributor

mglev1n commented Feb 29, 2024

It looks like the LSF job output can similarly be parsed either from the fixed-width table or from JSON (see the example below); the JSON route would add a jsonlite dependency:

user <- ps::ps_username()
text <- system2(
  "bjobs",
  args = c("-o 'user jobid job_name stat queue slots mem start_time run_time'", "-json"),
  stdout = TRUE,
  # stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
json <- jsonlite::fromJSON(text)
out <- json$RECORDS
out

     USER    JOBID JOB_NAME STAT               QUEUE SLOTS         MEM   START_TIME         RUN_TIME
1 mglevin 25900189     bash  RUN voltron_interactive     1    8 Mbytes Feb 29 09:12    313 second(s)
2 mglevin 25900201     bash  RUN voltron_interactive     1    2 Mbytes Feb 29 09:17     22 second(s)
3 mglevin 25665912  rstudio  RUN     voltron_rstudio     2 87.9 Gbytes Feb 26 15:36 236482 second(s)
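For anyone without LSF access, the parsing step can be exercised on a hand-written sample payload. The field names below follow the example above, but the sample itself is hypothetical and the real `bjobs -json` schema may vary by LSF version:

```r
# Hypothetical abbreviated `bjobs -json` payload for illustration only.
sample_json <- '{
  "COMMAND": "bjobs",
  "JOBS": 2,
  "RECORDS": [
    {"USER": "mglevin", "JOBID": "25900189", "STAT": "RUN"},
    {"USER": "mglevin", "JOBID": "25900201", "STAT": "RUN"}
  ]
}'
json <- jsonlite::fromJSON(sample_json)
out <- json$RECORDS  # fromJSON simplifies the record array to a data frame
out
```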

@wlandau
Owner Author

wlandau commented Feb 29, 2024

Awesome! jsonlite is super lightweight and reliable, I don't mind it as a dependency.

Would you be willing to open a PR?

@mdsumner

mdsumner commented Aug 1, 2024

Just here to say hi. Still early days for me with {crew}, but I'm excited to learn. I have access to SLURM and PBS, and I'm reading along.
