Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose aggregate metadata namespace for UI #3345

Merged
merged 6 commits into from
Mar 17, 2023

Conversation

dbutenhof
Copy link
Member

@dbutenhof dbutenhof commented Mar 14, 2023

PBENCH-1091

Managing the display of metadata in the dashboard is complicated by the fact that much of the namespace is highly dynamic: the global and user spaces are completely defined by clients, and can be completely different for each dataset, while the dataset.metalog namespace depends on the agent benchmark parameters and postprocessing.

In order to provide a reference point, this PR proposes a way to acquire a JSON document representing the aggregated namespace across a set of selected datasets. It leverages the existing GET /datasets collection query for this purpose, including all the filter keywords: however instead of returning a list of matching datasets, it builds a JSON document showing the nested key namespace.

While the example below is a bit large, the following is real output from a query made during a functional test run. Note that "leaf" nodes in the tree are designated by the value None.

So, for example, the output shows valid metadata keys dataset.access, dataset.id, dataset.metalog, dataset.metalog.iterations/0__linpack-binary=:root:linpack:xlinpack_xeon64.clients, and so forth...

GET /api/v1/datasets?keysummary=true
-----
{
    "dataset": {
        "access": null,
        "id": null,
        "metalog": {
            "iterations/0__linpack-binary=:root:linpack:xlinpack_xeon64": {
                "clients": null,
                "iteration_name": null,
                "iteration_number": null,
                "linpack-binary": null
            },
            "iterations/1": {
                "iteration_name": null,
                "user_script": null
            },
            "iterations/1-forwarding_test.json-5pct_drop": {
                "iteration_name": null,
                "iteration_number": null,
                "max_loss_pct": null,
                "profile": null
            },
            "iterations/1-rw-4KiB": {
                "block_size_kib": null,
                "iteration_name": null,
                "iteration_number": null,
                "test_type": null
            },
            "iterations/1-tcp_stream-64B-1i": {
                "instances": null,
                "iteration_name": null,
                "iteration_number": null,
                "message_size_bytes": null,
                "protocol": null,
                "test_type": null
            },
            "iterations/1__linpack-binary=:root:linpack:bob:xlinpack_xeon64": {
                "clients": null,
                "iteration_name": null,
                "iteration_number": null,
                "linpack-binary": null
            },
            "iterations/2__linpack-binary=:root:linpack:l_mklb_p_2019.6.005:benchmarks_2019:linux:mkl:benchmarks:linpack:xlinpack_xeon64": {
                "clients": null,
                "iteration_name": null,
                "iteration_number": null,
                "linpack-binary": null
            },
            "pbench": {
                "config": null,
                "date": null,
                "iterations": null,
                "name": null,
                "rpm-version": null,
                "script": null,
                "tar-ball-creation-timestamp": null
            },
            "run": {
                "controller": null,
                "end_run": null,
                "prefix": null,
                "raw_size": null,
                "start_run": null,
                "user": null
            },
            "tools": {
                "group": null,
                "hosts": null
            },
            "tools/10.10.10.153": {
                "hostname-s": null,
                "turbostat": null
            },
            "tools/10.10.10.19": {
                "hostname-s": null,
                "turbostat": null
            },
            "tools/ansible-host": {
                "hostname-s": null,
                "iostat": null,
                "pidstat": null,
                "[email protected]": null,
                "[email protected]": null,
                "[email protected]": null,
                "[email protected]": null,
                "[email protected]": null
            },
            "tools/app-node-0.scale-ci.example.com": {
                "hostname-s": null,
                "vmstat": null
            },
            "tools/app-node-1.scale-ci.example.com": {
                "hostname-s": null,
                "vmstat": null
            },
            "tools/ctlrA": {
                "hostname-s": null,
                "turbostat": null
            },
            "tools/dhcp31-44": {
                "hostname-s": null,
                "iostat": null,
                "mpstat": null,
                "perf": null,
                "pidstat": null,
                "proc-interrupts": null,
                "proc-vmstat": null,
                "sar": null,
                "turbostat": null
            },
            "tools/infra-node-0.scale-ci.example.com": {
                "hostname-s": null,
                "vmstat": null
            },
            "tools/infra-node-1.scale-ci.example.com": {
                "hostname-s": null,
                "vmstat": null
            },
            "tools/infra-node-2.scale-ci.example.com": {
                "hostname-s": null,
                "vmstat": null
            },
            "tools/perf124": {
                "hostname-s": null,
                "turbostat": null
            },
            "tools/rhel8-1": {
                "hostname-s": null,
                "[email protected]": null,
                "[email protected]": null,
                "turbostat": null
            }
        },
        "name": null,
        "owner_id": null,
        "resource_id": null,
        "uploaded": null
    },
    "server": {
        "archiveonly": null,
        "deletion": null,
        "origin": null,
        "tarball-path": null
    },
    "user": {
        "pbench": {
            "access": null
        }
    }
}

@dbutenhof dbutenhof added Server Dashboard Of and relating to the Dashboard GUI API Of and relating to application programming interfaces to services and functions labels Mar 14, 2023
@dbutenhof dbutenhof self-assigned this Mar 14, 2023
@portante portante added this to the v0.73 milestone Mar 14, 2023
@webbnh
Copy link
Member

webbnh commented Mar 14, 2023

"iterations/0__linpack-binary=:root:linpack:xlinpack_xeon64": {
                "clients": {},
                "iteration_name": {},
                "iteration_number": {},
                "linpack-binary": {}
            },

My immediate reaction is to ask, how or why is this response better than

"iterations/0__linpack-binary=:root:linpack:xlinpack_xeon64": [
                 "clients",
                 "iteration_name",
                 "iteration_number",
                 "linpack-binary"
             ],

(I.e., why return a mapping of keys to (known-to-be) empty objects when you could just return the list of keys? Presumably, the client side is either going to iterate over the mapping's keys or extract them into a list, anyway.)

@dbutenhof
Copy link
Member Author

The point was that this is the same format all the way down instead of having to look at each level to see whether it's a list of leaves or a dict with another level to descend. (Or even worse, a list at each level where some elements of the list are nested dicts...)

I toyed with several other representations, but this was straightforward to create and I liked the consistent format.

webbnh
webbnh previously approved these changes Mar 14, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a pointed question for you, a couple of nits, and some whinging, but other than those these changes look really good.

this is the same format all the way down instead of having to look at each level to see whether it's a list of leaves or a dict with another level to descend.

Yes, having reviewed the code, I can see why you're proposing this output format.

However, my instinct is that it's not quite what we want. While it seems like the accumulated key-space tree which we're returning is a very useful intermediate result, it should be flattened into a list for the client's use. (I think that this can be done nicely with a small amount of recursive code, very like the proposed accumulate().)

lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
siddardh-ra
siddardh-ra previously approved these changes Mar 15, 2023
Copy link
Member

@siddardh-ra siddardh-ra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Member Author

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it seems like the accumulated key-space tree which we're returning is a very useful intermediate result, it should be flattened into a list for the client's use.

I'm thinking in terms of display as a hierarchical collapsing list like the demo we saw, which suggests that if I flattened it she'd just have to parse the key paths to construct the hierarchy.

However, the opinion I'm really looking for here is Varshini's. 😄

lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it seems like the accumulated key-space tree which we're returning is a very useful intermediate result, it should be flattened into a list for the client's use.

I'm thinking in terms of display as a hierarchical collapsing list like the demo we saw, which suggests that if I flattened it she'd just have to parse the key paths to construct the hierarchy.

Right. I think that that approach is probably good for navigating a metadata instance and looking at the values, but I'm concerned that it will be cumbersome for users trying to select keys for search/display -- it'll be a lot of clicks to get to deeply-nested names. As an alternative, providing a pull-down with a flat namespace, perhaps with an auto-completed search field (somewhat like the "add reviewer" widget here in GitHub PRs), would be easier and quicker to use (it could also offer shortcuts to commonly-referenced or favorite keys).

the opinion I'm really looking for here is Varshini's. 😄

Quite right. @MVarshini, what are your thoughts?

lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
PBENCH-1091

Managing the display of metadata in the dashboard is complicated by the fact
that much of the namespace is highly dynamic: the `global` and `user` spaces
are completely defined by clients, and can be completely different for each
dataset, while the `dataset.metalog` namespace depends on the agent benchmark
parameters and postprocessing.

In order to provide a reference point, this PR proposes a way to acquire a
JSON document representing the aggregated namespace across a set of selected
datasets. It leverages the existing `GET /datasets` collection query for this
purpose, including all the filter keywords: however instead of returning a
list of matching datasets, it builds a JSON document showing the nested key
namespace.

While the example below is a bit large, the following is real output from a
query made during a functional test run. Note that "leaf" values in the tree
are designated by empty JSON object values. I thought of using `None`, but
this was a bit easier and seemed more consistent. (I'd particularly like to
see feedback on how useful this format is to the UI Javascript code.)

```
GET /api/v1/datasets?keysummary=true
-----
{
    "dataset": {
        "access": {},
        "id": {},
        "metalog": {
            "iterations/0__linpack-binary=:root:linpack:xlinpack_xeon64": {
                "clients": {},
                "iteration_name": {},
                "iteration_number": {},
                "linpack-binary": {}
            },
            "iterations/1": {
                "iteration_name": {},
                "user_script": {}
            },
            "iterations/1-forwarding_test.json-5pct_drop": {
                "iteration_name": {},
                "iteration_number": {},
                "max_loss_pct": {},
                "profile": {}
            },
            "iterations/1-rw-4KiB": {
                "block_size_kib": {},
                "iteration_name": {},
                "iteration_number": {},
                "test_type": {}
            },
            "iterations/1-tcp_stream-64B-1i": {
                "instances": {},
                "iteration_name": {},
                "iteration_number": {},
                "message_size_bytes": {},
                "protocol": {},
                "test_type": {}
            },
            "iterations/1__linpack-binary=:root:linpack:bob:xlinpack_xeon64": {
                "clients": {},
                "iteration_name": {},
                "iteration_number": {},
                "linpack-binary": {}
            },
            "iterations/2__linpack-binary=:root:linpack:l_mklb_p_2019.6.005:benchmarks_2019:linux:mkl:benchmarks:linpack:xlinpack_xeon64": {
                "clients": {},
                "iteration_name": {},
                "iteration_number": {},
                "linpack-binary": {}
            },
            "pbench": {
                "config": {},
                "date": {},
                "iterations": {},
                "name": {},
                "rpm-version": {},
                "script": {},
                "tar-ball-creation-timestamp": {}
            },
            "run": {
                "controller": {},
                "end_run": {},
                "prefix": {},
                "raw_size": {},
                "start_run": {},
                "user": {}
            },
            "tools": {
                "group": {},
                "hosts": {}
            },
            "tools/10.10.10.153": {
                "hostname-s": {},
                "turbostat": {}
            },
            "tools/10.10.10.19": {
                "hostname-s": {},
                "turbostat": {}
            },
            "tools/ansible-host": {
                "hostname-s": {},
                "iostat": {},
                "pidstat": {},
                "[email protected]": {},
                "[email protected]": {},
                "[email protected]": {},
                "[email protected]": {},
                "[email protected]": {}
            },
            "tools/app-node-0.scale-ci.example.com": {
                "hostname-s": {},
                "vmstat": {}
            },
            "tools/app-node-1.scale-ci.example.com": {
                "hostname-s": {},
                "vmstat": {}
            },
            "tools/ctlrA": {
                "hostname-s": {},
                "turbostat": {}
            },
            "tools/dhcp31-44": {
                "hostname-s": {},
                "iostat": {},
                "mpstat": {},
                "perf": {},
                "pidstat": {},
                "proc-interrupts": {},
                "proc-vmstat": {},
                "sar": {},
                "turbostat": {}
            },
            "tools/infra-node-0.scale-ci.example.com": {
                "hostname-s": {},
                "vmstat": {}
            },
            "tools/infra-node-1.scale-ci.example.com": {
                "hostname-s": {},
                "vmstat": {}
            },
            "tools/infra-node-2.scale-ci.example.com": {
                "hostname-s": {},
                "vmstat": {}
            },
            "tools/perf124": {
                "hostname-s": {},
                "turbostat": {}
            },
            "tools/rhel8-1": {
                "hostname-s": {},
                "[email protected]": {},
                "[email protected]": {},
                "turbostat": {}
            }
        },
        "name": {},
        "owner_id": {},
        "resource_id": {},
        "uploaded": {}
    },
    "server": {
        "archiveonly": {},
        "deletion": {},
        "index-map": {
            "container-pbench.v5.result-data-sample.2020-02-28": {},
            "container-pbench.v6.run-data.2018-10": {},
            "container-pbench.v6.run-data.2020-02": {},
            "container-pbench.v6.run-toc.2018-10": {},
            "container-pbench.v6.run-toc.2020-02": {}
        },
        "origin": {},
        "tarball-path": {}
    },
    "user": {
        "pbench": {
            "access": {}
        }
    }
}
```
Primarily change to report leaf key nodes with None rather than an empty JSON
object at Varshini's request.

Revise the "left outer join" comment to clarify.
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you just posted a response, so I'll post these and then look at those.

lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
npalaska
npalaska previously approved these changes Mar 15, 2023
Copy link
Member

@npalaska npalaska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally good. But I do wonder whether we should always send the metalog information in the key summary though.

Comment on lines +349 to +350
# "metalog" is a top-level key in the Metadata schema, but we
# report it as a sub-key of "dataset".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, a bit confused, why did you choose to add it as a sub-key of the dataset?

Also, I realized I think we should send the metalog only if asked explicitly, the output can get big quickly for complex and long benchmark runs. Thought?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, a bit confused, why did you choose to add it as a sub-key of the dataset?

Actually I originally wanted to make it a column of Dataset because it really is part of the "identity" of the dataset rather than external tracking context like the server namespace. At the time I added it, I felt that I didn't want to change the DB schema. And at this point I'd feel obligated to write, and somehow test, migration code if I moved it, so I probably won't. 😆

Also, I realized I think we should send the metalog only if asked explicitly, the output can get big quickly for complex and long benchmark runs. Thought?

The metadata.log is an important part of the test run, including the script name, config data, and such. The problems here are (1) the weird key names that the agent postprocessing generates, and (2) the extremely "irregular" key space depending on the benchmark and parameters; but I think it's important and valuable to expose all this for display and filtering.

What I did wrestle with was whether to arbitrarily suppress the server.index-map. While it's not so bad here since we only represent the keys, the value (if someone decided to enable display) can be enormous.

I forgot to drop the `warning` message; instead just set the key to `None`.

Although I'd liked the idea of aggregating metadata over the three datasets
visible to the `drb` test user, leaving the third (`fio_2`) *without* metadata
should increase test coverage by 1/2 line to 100%...
webbnh
webbnh previously approved these changes Mar 16, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of pointed questions....

lib/pbench/test/unit/server/test_datasets_list.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/datasets_list.py Outdated Show resolved Hide resolved
So, forget I even thought about dealing with the `MetadataError` here. I can't
collect "errors" here because `_get_dataset_metadata` actually internally
traps any "real" metadata error and returns a `None` value already. To collect
a list of keyspace errors I'd need to change `_get_dataset_metadata`, and I'm
not going to do that here.

The only `MetadataError` that can get through here is actually a bad keyname,
and we've already done that validation so we really don't need to worry about
it.

I'd like to revise this separately because I've realized that while I allow
`filter` to reference the weird non-alphanumeric key paths in `metalog`, I
didn't do that for the `metadata` parameter... and that'll affect the
`_get_dataset_metadata` as well because it independently validates the key
paths and wouldn't know about that exception. (Ugh.) Not here. I did however
add a test case showing that a keyspace conflict error results in a `None`
value.
webbnh
webbnh previously approved these changes Mar 16, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dbutenhof dbutenhof merged commit 1f5cd39 into distributed-system-analysis:main Mar 17, 2023
@dbutenhof dbutenhof deleted the keys branch March 17, 2023 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Of and relating to application programming interfaces to services and functions Dashboard Of and relating to the Dashboard GUI Server
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants