
profiling: run cylc 7 profile tests against cylc 8 #141

Open
oliver-sanders opened this issue Jan 12, 2022 · 29 comments

@oliver-sanders
Member

Cylc 7 had an on-board automated profile battery which gave us insight into Cylc's scaling characteristics.

This profile battery will need a total re-write for Cylc 8 (there is an issue for that here: #38). However, that is a relatively large job and one for which we don't have time at the moment, so I suggest manually running some of the old scaling tests to give us an idea of how scaling has changed and how Cylc 8 compares against Cylc 7.

Quick overview of the main scaling dimensions of Cylc:

  • Tasks
  • Edges
  • Recurrences (e.g. T00, T01, T02, etc.; more recurrences raise CPU load)
  • Churn / throughput (number of task events per second)

These dimensions are targeted by suites which were located in etc/dev-suites in the Cylc 7 source code.

For MO people, some (rather dated) results from the old automated profile battery can be found here: https://www-nwp/~osanders/profiling-results/2016/

To test scaling you will need to run the same experiment with different template variables.
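
For example, assuming the suite exposes Jinja2 template variables for its scaling factors (the names TASKS and CYCLES below are hypothetical; use whatever the suite actually defines), the same experiment can be re-run at different scales from the command line:

$ cylc play <workflow> -s 'TASKS=10' -s 'CYCLES=5' --no-detach
$ cylc play <workflow> -s 'TASKS=100' -s 'CYCLES=5' --no-detach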

Notes:

  • I suggest ramping up the scaling factors slowly; you can easily take down your machine with profiling experiments.
  • The chicken switch is killall python!
  • If you keep adding parallel tasks (without an internal Cylc queue), then you will eventually hit your machine's fork limit, which will cause task failures (and a very unhappy box).

To profile the runs I suggest using /usr/bin/time, as we understand this command well and the results will be comparable to prior tests.

Notes:

  • time (the shell built-in) and /usr/bin/time may differ in your shell.
  • Always use verbose mode.
  • On Darwin, install GNU time.
$ /usr/bin/time -o <output-file> -v cylc play <workflow> -s <option> ... --no-detach

In the output we are interested in:

  • Wallclock time.
  • User + System time (i.e. CPU usage)
  • RSS memory usage.
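
These correspond to the standard field names in GNU time's verbose output, so the relevant lines can be pulled out of each result file with something like:

$ grep -E 'Elapsed \(wall clock\)|User time|System time|Maximum resident set size' <output-file>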

For fair tests you need to run all experiments on the same hardware and ensure the machine is otherwise idle. Running on a cluster may be a good idea, but you may need to claim a whole node to meet these conditions (with the obvious resource-wastage implications).

We don't have any up-to-date results for Cylc 7, so you will need to profile that too; however, you can use the automated profile battery there if you like. I think you run it like so:

$ cylc profile-battery -e complex -v 7.x.x
@hjoliver
Member

hjoliver commented Jan 12, 2022

Yeah this should be a priority once rc1 is out. (Feature freeze, more or less, then bug-fix, profile, optimize...)

Cylc 8 ought to be a lot more efficient, but we haven't really proved that yet.

@MetRonnie
Member

MetRonnie commented Jan 13, 2022

Here is my first go at profiling:

Small workflow

Datetime cycling example from tutorials; tasks with no scripts/sleeps.

flow.cylc
[scheduler]
    allow implicit tasks = True
    UTC mode = True
[scheduling]
    initial cycle point = 2000-01-01T00Z
    final cycle point = 2000-01-04T00Z
    [[graph]]
        PT3H = """
            get_observations_belmullet & get_observations_camborne &
            get_observations_heathrow & get_observations_shetland => consolidate_observations
        """
        +PT6H/PT6H = """
            consolidate_observations[-PT6H] => forecast
            consolidate_observations[-PT3H] => forecast
            consolidate_observations & get_rainfall => forecast
            forecast => post_process_exeter
        """
        +PT12H/PT6H = """
            forecast[-PT6H] => forecast
        """
|                            | 7.8.10  | 8.0rc1.dev0 [1] |
| -------------------------- | ------- | --------------- |
| Wall clock time            | 0:01:46 | 0:04:03         |
| User + Sys time (s)        | 21.0    | 31.4            |
| % of CPU this job got      | 19      | 12              |
| Max resident set size (kB) | 27,464  | 48,704          |

Complex workflow

A while ago I had taken the "Complex" suite from Oliver's ~/cylc-run and updated it to Cylc 8 syntax: https://gist.github.com/MetRonnie/98d42d0b090b1fee7f24a000b90be0af

Tasks with 1s sleeps (note for some reason I used env-script instead of script in [runtime][root])

|                            | 7.8.10  | 8.0rc1.dev0 [1] |
| -------------------------- | ------- | --------------- |
| Wall clock time            | 0:25:18 | 1:11:06         |
| User + Sys time (s)        | 469.3   | 771.9           |
| % of CPU this job got      | 30      | 18              |
| Max resident set size (kB) | 71,664  | 101,748         |

P.S. An indication of how verbose the logging has become (at log level INFO): I noticed there were 9 log files for Cylc 8 vs 3 for Cylc 7, and the first 4 of the Cylc 8 log files had already been deleted, as the default is to keep only 5 at any time.

Footnotes

  1. Commit: 72781934, Python 3.9.7

@oliver-sanders
Member Author

oliver-sanders commented Jan 13, 2022

The Cylc 7 results are as expected, the Cylc 8 results less so; we will need to investigate. To help reproduce these results:

  • What platform did you profile on?
  • What was the final cycle point of the complex suite? (hardcoded to 20150220T00)
  • What were the Python versions used?

@MetRonnie
Member

MetRonnie commented Jan 13, 2022

Ran on my VLD box while not using it (with only a VSCode window with a few tabs, a file manager window and 3 terminal sessions open). Cylc 7 was (/usr/bin/python) 2.7.5, Cylc 8 was (~/.conda/envs/cylc8/bin/python) 3.9.7.

Also, I did the Cylc 8 complex test with the VDI connection open during the middle of the day (doing stuff in Windows in the meantime), whereas I started the Cylc 7 complex test last thing in the evening and closed the VDI connection to let it run overnight. I don't know whether differences in load on the VLD servers at different times of day, or whether the virtual desktop connection was open or not, would affect the results.

@MetRonnie
Member

MetRonnie commented Jan 13, 2022

Profiled the small datetime-cycling workflow using memory-profiler:
[Figure_1]

[Figure_2]
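
(For reference: memory-profiler's mprof commands can produce plots like these. A sketch, with the workflow name and play options left as placeholders:)

$ pip install memory-profiler
$ mprof run --include-children cylc play <workflow> --no-detach
$ mprof plot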

@MetRonnie
Member

Validation of the complex workflow:
[Figure_3]
[Figure_4]

Cf. ~100 MB max RSS while the workflow was running.

@oliver-sanders
Member Author

oliver-sanders commented Jan 13, 2022

I've profiled cylc validate complex two ways to give an inner and an outer measurement.

  • 14322992 b (inner measurement of the config object)
  • 51855360 b (outer measurement of the whole program)

The program as a whole weighs ~3.5 times more than the config alone, leaving ~38MB not accounted for by the config object.

Measuring the overhead of a "Hello World" Python script I get 6909952 b; subtracting that from the ~38MB leaves us with ~30MB of unaccounted memory.

51855360 - (14322992 + 6909952) = 30622416 b

This is approximately 60% of the total usage.

From introspection it looks like the bulk of that is str objects.

Python does fill up memory with a lot of _code objects representing the system code itself; I have seen these get quite large (Python 2.6). It will take a bit more investigation to determine what this 30MB is being used for.

Methodology

Obtaining the inner measurement:

diff --git a/cylc/flow/scripts/validate.py b/cylc/flow/scripts/validate.py
index 53a9b824f..d1c134323 100755
--- a/cylc/flow/scripts/validate.py
+++ b/cylc/flow/scripts/validate.py
@@ -168,3 +168,5 @@ def main(parser: COP, options: 'Values', reg: str) -> None:
 
     print(cparse('<green>Valid for cylc-%s</green>' % CYLC_VERSION))
     profiler.stop()
+    from pympler.asizeof import asized
+    print(asized(cfg, detail=2).format(depth=2))
$ cylc validate complex
...
<cylc.flow.config.WorkflowConfig object at 0x7f80e9030ad0> size=14322992 flat=64
    __dict__ size=14322928 flat=416
        [V] taskdefs: {'ensemble_install_startdata_cold': <c....def.TaskDef object at 0x7f80eb1dce60>} size=9974040 flat=36976
        [V] edges: {<cylc.flow.cycling.iso8601.ISO8601Seq....mble_forecast_012_tN', False, False)}} size=2080024 flat=248
        [V] pcfg: <cylc.flow.cfgspec.workflow.RawWorkflowConfig object at 0x7f80e7c8f6d0> size=1615208 flat=64
        [V] runtime: {'parents': {'root': [], 'HPC': ['root....fication_to_highway_happyland_yay2']}} size=625096 flat=248
        [V] ns_defn_order: ['root', 'HPC', 'HPC_BACKUP', 'CORETYP....tification_to_highway_happyland_yay2'] size=10040 flat=10040
        [V] leaves: ['archive_logs', 'brief_archive', 'bri....', 'recover1', 'recover2', 'recover3'] size=9200 flat=9200
        [V] clock_offsets: {'long_start_00': <cylc.flow.cycling.i....601Interval object at 0x7f80ea82f5d0>} size=1736 flat=656
        [V] options: <Values at 0x7f80e565c690: {'check_cir....emplatevars_file': None, 'icp': None}> size=1688 flat=64

Obtaining the outer measurement:

$ # (after reversing the above diff)
$ /usr/bin/time -lp cylc validate complex  # note: BSD time args, use -v for GNU time
...
real         1.96
user         1.85
sys          0.10
  51855360  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     24744  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         8  voluntary context switches
       170  involuntary context switches

Obtaining the Python baseline:

$ /usr/bin/time -lp python -c 'print("hello world")'
hello world
real         0.03
user         0.02
sys          0.00
   6909952  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      1706  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
         2  involuntary context switches

@oliver-sanders
Member Author

oliver-sanders commented Jan 13, 2022

A simple test comparing Python with and without importing cylc.flow.scheduler.Scheduler shows that the Cylc code (including dependencies) occupies 33.7MB in memory at runtime, a similar number to the one in the post above.

40,615,936 b - 6,909,952 b = 33,705,984 b
$ cat test.py
from cylc.flow.scheduler import Scheduler
$ /usr/bin/time -lp python test.py
real         0.47
user         0.34
sys          0.10
  40615936  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     10017  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
        49  voluntary context switches
        73  involuntary context switches
$ /usr/bin/time -lp python -c 'print("hello world")'
hello world
real         0.03
user         0.02
sys          0.00
   6909952  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      1706  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
         1  involuntary context switches

@MetRonnie
Member

MetRonnie commented Jan 13, 2022

Using /usr/bin/time on cylc validate datetime-cycling:

If I strip cylc/flow/scripts/validate.py down to just

def main(*a, **k) -> None:
    """cylc validate CLI."""
    return

Max RSS: 19,248 kB

Adding the imports back in

Code snippet
from ansimarkup import parse as cparse
from optparse import Values
import sys
import textwrap

from cylc.flow import LOG, __version__ as CYLC_VERSION
from cylc.flow.config import WorkflowConfig
from cylc.flow.exceptions import (
    WorkflowConfigError,
    TaskProxySequenceBoundsError,
    TriggerExpressionError
)
import cylc.flow.flags
from cylc.flow.loggingutil import disable_timestamps
from cylc.flow.option_parsers import (
    CylcOptionParser as COP,
    Options
)
from cylc.flow.profiler import Profiler
from cylc.flow.task_proxy import TaskProxy
from cylc.flow.templatevars import get_template_vars
from cylc.flow.terminal import cli_function
from cylc.flow.workflow_files import parse_reg


def main(*a, **k) -> None:
    """cylc validate CLI."""
    return

Max RSS: 31,520 kB

And with the unmodified file I get max RSS: 35,632 kB

In graphical form:

[image]

For comparison in Cylc 7 the full cylc validate has max RSS: 16,428 kB
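
A complementary way to see where that import cost comes from (import time rather than RSS, but it tends to point at the same heavy modules) is Python's built-in import-time tracing, sketched here against the validate script's module:

$ python -X importtime -c 'import cylc.flow.scripts.validate' 2> importtime.log
$ sort -t'|' -k2 -rn importtime.log | head -20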

@MetRonnie
Member

MetRonnie commented Jan 13, 2022

SnakeViz for cylc play datetime-cycling --profile -n

[image]

If <method 'poll' of 'select.epoll' objects> corresponds to sleeps, are we sleeping too much, considering that Cylc 7 was able to run it in half the wall clock time?

Zooming in:
[image]

(Note: everything under update_profiler_logs on the left can be ignored as that doesn't run when not using --profile)


SnakeViz for cylc play complex --profile -n:

[image]

Zooming in to the left (the red squiggly bit is the profiling itself, so can be ignored):

[image]

Zooming in to the centre:

[image]

@MetRonnie
Member

MetRonnie commented Jan 14, 2022

Possible memory leak? (This is not the full workflow run - I stopped it before completion)
[Figure_5]

@oliver-sanders
Member Author

oliver-sanders commented Jan 14, 2022

If <method 'poll' of 'select.epoll' objects> corresponds to sleeps, are we sleeping too much, considering that Cylc 7 was able to run it in half the wall clock time?

That's mostly await asyncio.sleep, which is to be expected. If the workflow takes longer to complete (which it did) there will be more iterations of the main loop, so more sleep between iterations, as you said.

We can reduce the main loop sleep for more responsiveness; however, that raises CPU usage.

@oliver-sanders
Member Author

Here's the complex workflow running with --main-loop 'log memory':

[image: cylc flow main_loop log_memory]

The largest object, at ~12MB, is the config, followed by the task pool at ~4MB; nothing explains the >100MB reported, which suggests the remaining memory is perhaps not associated directly with the Scheduler object?

The orange line sloping upwards, which looks like a memory leak, is the data that generated this graph accumulating.
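
A sketch of the command for a run like this, using the plugin named above and the same play options as the earlier /usr/bin/time examples:

$ cylc play complex --no-detach --main-loop 'log memory'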

@MetRonnie
Member

MetRonnie commented Jan 14, 2022

cylc play complex

(Scheduler start is everything up to the start of the main loop)

Looking into these reveals:

scheduler_cli imports

Scheduler start

@hjoliver
Member

P.S. An indication of how verbose the logging has become (at log level INFO)

That was deliberate (on my part anyway) to help with Cylc 8 development. I expected to pare it back a lot once things stabilized a bit.

@hjoliver
Member

SnakeViz for cylc play datetime-cycling --profile -n

@MetRonnie - did you do this for Cylc 7 too?

@MetRonnie
Member

No; Python 2 vs 3 might make it difficult, but I'll try.

@MetRonnie
Member

The small datetime-cycling workflow run over dozens of cycles; memory usage is creeping up monotonically:
[Figure_6]

@MetRonnie
Member

MetRonnie commented Jan 18, 2022

The Cylc 7 cylc run --profile didn't seem to produce a .prof file to use with SnakeViz. However, the top functions by cumulative time (for the complex suite) are:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.230    0.230 1513.650 1513.650 ~/github/cylc-7/lib/cylc/scheduler.py:1603(run)
     1136    4.405    0.004  721.388    0.635 ~/github/cylc-7/lib/cylc/suite_db_mgr.py:187(process_queued_ops)
     1135    0.009    0.000  721.269    0.635 ~/github/cylc-7/lib/cylc/scheduler.py:1275(process_suite_db_queue)
     2272    2.048    0.001  674.076    0.297 ~/github/cylc-7/lib/cylc/rundb.py:401(execute_queued_items)
    18876    0.160    0.000  453.895    0.024 ~/github/cylc-7/lib/cylc/rundb.py:454(_execute_stmt)
    18876  433.598    0.023  433.598    0.023 {method 'executemany' of 'sqlite3.Connection' objects}
      864    0.387    0.000  229.521    0.266 ~/github/cylc-7/lib/cylc/scheduler.py:1245(process_task_pool)
     1942  212.488    0.109  212.488    0.109 {method 'commit' of 'sqlite3.Connection' objects}
     1135  199.721    0.176  199.721    0.176 {time.sleep}
      864    0.171    0.000  168.005    0.194 ~/github/cylc-7/lib/cylc/task_job_mgr.py:180(submit_task_jobs)
      864    0.269    0.000  159.935    0.185 ~/github/cylc-7/lib/cylc/task_job_mgr.py:159(prep_submit_task_jobs)
     3480    0.523    0.000  159.659    0.046 ~/github/cylc-7/lib/cylc/task_job_mgr.py:733(_prep_submit_task_job)
    10841    0.091    0.000  144.661    0.013 /usr/lib64/python2.7/subprocess.py:475(_eintr_retry_call)
     3480    8.404    0.002  127.785    0.037 ~/github/cylc-7/lib/cylc/job_file.py:44(write)
     1136    0.738    0.001  120.515    0.106 ~/github/cylc-7/lib/cylc/scheduler.py:1669(update_state_summary)
     5039    0.334    0.000   86.496    0.017 /usr/lib64/python2.7/subprocess.py:650(__init__)
     5039    1.094    0.000   85.533    0.017 /usr/lib64/python2.7/subprocess.py:1188(_execute_child)
     1136    0.061    0.000   83.884    0.074 ~/github/cylc-7/lib/cylc/subprocpool.py:172(process)
     1134    0.048    0.000   76.739    0.068 ~/github/cylc-7/lib/cylc/scheduler.py:1590(update_profiler_logs)
      398    0.009    0.000   75.389    0.189 ~/github/cylc-7/lib/cylc/subprocpool.py:158(_proc_exit)
      398    0.012    0.000   75.338    0.189 ~/github/cylc-7/lib/cylc/subprocpool.py:358(_run_command_exit)
      398    0.010    0.000   75.304    0.189 ~/github/cylc-7/lib/cylc/task_job_mgr.py:690(_submit_task_jobs_callback)
      398    0.196    0.000   75.294    0.189 ~/github/cylc-7/lib/cylc/task_job_mgr.py:476(_manip_task_jobs_callback)
     1134    0.127    0.000   74.937    0.066 ~/github/cylc-7/lib/cylc/scheduler.py:2031(_update_cpu_usage)
    20930    0.114    0.000   73.991    0.004 /usr/lib64/python2.7/logging/__init__.py:1249(_log)
      956   13.481    0.014   73.941    0.077 ~/github/cylc-7/lib/cylc/state_summary_mgr.py:50(update)
    20930    0.069    0.000   72.642    0.003 /usr/lib64/python2.7/logging/__init__.py:1270(handle)
    20930    0.171    0.000   72.549    0.003 /usr/lib64/python2.7/logging/__init__.py:1302(callHandlers)
    41846    0.142    0.000   72.377    0.002 /usr/lib64/python2.7/logging/__init__.py:736(handle)
    10440    0.271    0.000   72.236    0.007 ~/github/cylc-7/lib/cylc/task_events_mgr.py:286(process_message)
     5534   71.470    0.013   71.470    0.013 {posix.read}

So a lot of CPU time is taken up by DB ops in Cylc 7 too.
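
For the Cylc 8 --profile runs, the same kind of table can be produced without SnakeViz using the standard-library pstats module (a sketch; the path to the .prof file depends on where --profile writes it in the run directory):

$ python -c "import pstats; pstats.Stats('<path-to>/profile.prof').sort_stats('cumulative').print_stats(30)"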

@dwsutherland
Member

Not sure I understand the time difference:
0:25:18 vs 1:11:06

Seems kinda impossible... It's almost as if the job communication isn't working and it's switched to polling mode...

Is the memory overhead the loaded libraries?

@MetRonnie
Member

MetRonnie commented Feb 1, 2022

How can you tell if the job communication is happening via polling?

I'm getting

    # CYLC TASK ENVIRONMENT:
    export CYLC_TASK_COMMS_METHOD=zmq

in job scripts.

Example job.status:

CYLC_JOB_RUNNER_NAME=background
CYLC_JOB_ID=17391
CYLC_JOB_RUNNER_SUBMIT_TIME=2022-02-01T18:54:42Z
CYLC_JOB_PID=17391
CYLC_JOB_INIT_TIME=2022-02-01T18:54:49Z
CYLC_JOB_EXIT=SUCCEEDED
CYLC_JOB_EXIT_TIME=2022-02-01T18:54:54Z

which is submit --[7s]--> init --[5s]--> exit for a task with script = sleep 1

@dwsutherland
Member

dwsutherland commented Feb 2, 2022

Let's say there are 5000 sequential tasks (maybe more in parallel), and each job/task message takes 200ms longer, with 3 messages per job. That's 3000s or 50min extra, which would account for the difference (I don't know the exact numbers).

I'm, of course, assuming that what we found previously (the time spent loading network libraries, the auth, etc.) is the issue.

This is still a concern, as someone could use Cylc for running many small jobs in the same way... We need to find out exactly where the time is going.

How can you tell if the job communication is happening via polling?

Debug mode will tell you how the messages are being received (zmq or otherwise) in the workflow log.

which is submit --[7s]--> init --[5s]--> exit for a task with script = sleep 1

What does Cylc 7 look like? It looks like that's where the time is going... If so, then at least it's not the Scheduler internals (I suppose).

@oliver-sanders
Member Author

How can you tell if the job communication is happening via polling?

I think we now omit the contact file installation for polling platforms.

You can also run the workflow in --debug mode and inspect the job.err log, which would record the ZMQ send operations for TCP comms.
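
For example (a sketch; the job log path below is the usual cylc-run layout, abbreviated):

$ cylc play <workflow> --debug --no-detach
$ grep -i zmq ~/cylc-run/<workflow>/log/job/<cycle>/<task>/NN/job.err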

@oliver-sanders
Member Author

oliver-sanders commented Feb 2, 2022

then if each job/task message takes 200ms longer with 3

We run cylc message in a non-blocking way to avoid this issue, e.g.:

https://github.com/cylc/cylc-flow/blob/9d481bae9c58b994fbb11351812e1f6861e9e5c4/cylc/flow/etc/job.sh#L130

It does wait for the started message:

https://github.com/cylc/cylc-flow/blob/ed19b4b5e574f1c7170e77722a939f6d2dfe8d5c/cylc/flow/etc/job.sh#L163

But only at the end of the job.
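
The pattern in job.sh is roughly the following (paraphrased from the lines linked above; variable names abbreviated), i.e. the 'started' message is sent in the background and the job only waits on it at the very end:

# send the started message without blocking the job script
cylc message -- "${CYLC_WORKFLOW_ID}" "${CYLC_TASK_JOB}" 'started' &
CYLC_MESSAGE_STARTED_PID=$!
# ... user task script runs here ...
# only at the end of the job do we wait for the message to have been delivered
wait "${CYLC_MESSAGE_STARTED_PID}" || true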

@MetRonnie
Member

Comparing the same tasks (all with 1s sleep scripts, but probably with differing numbers of other tasks running at the same time):

brief_observations_background_atmos

|                | Cylc 7 | Cylc 8 |
| -------------- | ------ | ------ |
| submit -> init | 1s     | 4s     |
| init -> exit   | 1s     | 3s     |

ensemble_res_create

|                | Cylc 7 | Cylc 8 |
| -------------- | ------ | ------ |
| submit -> init | 2s     | 7s     |
| init -> exit   | 1s     | 4s     |

long_assimilation_create_zap_larger

|                | Cylc 7 | Cylc 8 |
| -------------- | ------ | ------ |
| submit -> init | 9s     | 29s    |
| init -> exit   | 4s     | 23s    |

@dwsutherland
Member

We should create some sort of job profiler, because that's obviously where the issue is. We can find out exactly what's taking so long by fiddling with cylc/flow/network/client.py and maybe the job script.

The auth will be doing file reads, and we could print out timings for library loading, client/connection instantiation, and message send/receive.
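
A cheap first datapoint (same /usr/bin/time approach as earlier in the thread; -v for GNU time, -lp on Darwin) would be the cost of just loading that client module in a fresh interpreter, which every cylc message invocation pays:

$ /usr/bin/time -v python -c 'import cylc.flow.network.client'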

@dwsutherland
Member

I think the only reason going via the UIS could be faster is that it already has the libraries loaded and a client/connection instantiated.

If we find this is the cause, one option may be a long-lived client for the life of the job (which may mean having the job wrapper in Python)... But I don't know, it could introduce other delays I suppose...

Surely there's a way to slim down the client end with zmq (again, assuming this is the issue).

@oliver-sanders
Member Author

(coming back to this issue after we have a release out)

@oliver-sanders
Member Author

Here's a basic auto-profiling script I wrote for the increment_graph_window investigation:

Basic profiling script ./profile
#!/usr/bin/env bash

set -eu

WORKFLOWS="$(cylc scan --states=all --format=name)"

checkout () {
    local branch="$1"
    (
        cd "$HOME/cylc"
        git checkout "$branch"
    )
}

profile () {
    local branch="$1"
    local branch_path="$(sed 's|/|_|g' <<< "$branch")"
    local tasks="$2"
    local cycles="$3"
    local run_name="${branch_path}__tasks-${tasks}__cycles-${cycles}"
    local workflow_id="diamond/$run_name"

    if [[ "$WORKFLOWS" == *$workflow_id* ]]; then
        echo -e "\n\nUsing cached: branch=$branch tasks=$tasks cycles=$cycles\n\n"
        return
    else
        echo -e "\n\nRunning: branch=$branch tasks=$tasks cycles=$cycles\n\n"
    fi

    cylc clean "${workflow_id}" -y 2>/dev/null || true
    cylc install diamond --run-name="$run_name"

    # profile with cProfile (inner measurement)
    # cylc play "diamond/$run_name" \
    #     --profile \
    #     --no-detach \
    #     -s "TASKS=$tasks" \
    #     -s "CYCLES=$cycles"

    # profile with /usr/bin/time (outer measurement)
    /usr/bin/time \
        -v \
        -o results \
        cylc play \
        "diamond/$run_name" \
        --no-detach \
        -s "TASKS=$tasks" \
        -s "CYCLES=$cycles"
    mv results "$HOME/cylc-run/${workflow_id}/results"
}

run () {
    local branches=(
        'upstream/master'

        'ronnie-local/SPEED-POWER'
        'ronnie/tokens'

        'david/store-tproxy-efficiency'
        'store-tproxy-efficiency-2023-01-24'
        # 'store-tproxy-efficiency-w-platform-call'
        # 'avoid_creating_task_proxies'

        # 'tim/dont_get_platform_too_soon'
        # 'tim/cache_platform_copying'
    )

    for branch in "${branches[@]}"; do
        checkout "$branch"
        profile "$branch" 1 1
        profile "$branch" 5 1
        profile "$branch" 10 1
        profile "$branch" 15 1
        profile "$branch" 20 1
        profile "$branch" 25 1
        profile "$branch" 30 1
        profile "$branch" 35 1
        profile "$branch" 40 1
        profile "$branch" 45 1
        profile "$branch" 50 1
    done
}

run

./report.py
from contextlib import suppress
from datetime import timedelta
from pathlib import Path
import sys

import matplotlib.pyplot as plt


def load(filename):
    results = {}
    with open(filename, 'r') as f:
        for line in f:
            key, value = (i.strip() for i in line.rsplit(': ', 1))
            results[key] = value
    return derive(results)


def time_delta(value):
    """Convert 'h:mm:ss' or 'm:ss' strings (GNU time elapsed format) to seconds."""
    try:
        parts = [float(i) for i in value.split(':')]
    except ValueError:
        raise ValueError(value)
    if len(parts) == 2:
        minutes, seconds = parts
        hours = 0.0
    elif len(parts) == 3:
        hours, minutes, seconds = parts
    else:
        raise ValueError(value)
    return timedelta(hours=hours, minutes=minutes, seconds=seconds).total_seconds()


CAST_METHODS = (
    float,
    int,
    time_delta,
)


def cast(value):
    for method in CAST_METHODS:
        with suppress(ValueError):
            return method(value)
    return value


def derive(results):
    _results = {
        key: cast(value)
        for key, value in results.items()
    }
    _results['CPU time (seconds)'] = _results['User time (seconds)'] + _results['System time (seconds)']
    return _results


def report(results):
    print(results)


def plot(results, key, metric):
    data = []
    for branch, _results in results.items():
        x = []
        y = []
        for args, __result in _results:
            # data.append(args[key], result[metric]
            x.append(args[key])
            y.append(__result[metric])
        plt.plot(x, y, label=branch)
    plt.legend()
    plt.xlabel(key)
    plt.ylabel(metric)
    plt.show()


def run():
    workflow = 'diamond'
    branches = [
        'upstream/master',
        'ronnie/tokens',
        'david/store-tproxy-efficiency',
        'ronnie-local/SPEED-POWER',

        # 'david/store-tproxy-efficiency',
        # 'store-tproxy-efficiency-w-platform-call',
        # 'avoid_creating_task_proxies',
        # 'tim/dont_get_platform_too_soon',
        # 'tim/cache_platform_copying',
    ]
    args = [
        {'TASKS': 1, 'CYCLES': 1},
        {'TASKS': 5, 'CYCLES': 1},
        {'TASKS': 10, 'CYCLES': 1},
        {'TASKS': 15, 'CYCLES': 1},
        {'TASKS': 20, 'CYCLES': 1},
        {'TASKS': 25, 'CYCLES': 1},
        {'TASKS': 30, 'CYCLES': 1},
        {'TASKS': 35, 'CYCLES': 1},
        {'TASKS': 40, 'CYCLES': 1},
        {'TASKS': 45, 'CYCLES': 1},
        {'TASKS': 50, 'CYCLES': 1},
    ]
    results = {}
    for branch in branches:
        _results = []
        for argset in args:
            workflow_id = get_id(workflow, branch, **argset)
            try:
                __results = load(Path('~osanders/cylc-run/', workflow_id, 'results').expanduser())
            except Exception as exc:
                print(f'WARNING: skipping {workflow_id}: {exc}')
                continue
            _results.append((argset, __results))
        results[branch] = _results
    plot(results, 'TASKS', 'Elapsed (wall clock) time (h:mm:ss or m:ss)')
    plot(results, 'TASKS', 'CPU time (seconds)')
    plot(results, 'TASKS', 'Maximum resident set size (kbytes)')


def get_id(workflow, branch, **args):
    ret = f'{workflow}/{branch.replace("/", "_")}'
    for key, value in args.items():
        ret += f'__{key.lower()}-{value}'
    return ret


if __name__ == '__main__':
    # run(sys.argv[1])
    run()

