
PBENCH-1127 Implementation of Quisby API #3463

Merged 15 commits into distributed-system-analysis:main on Jun 21, 2023

Conversation

siddardh-ra (Member):

Implementation of Quisby API

Integrated the pquisby package into the Pbench server. The first pass of this API implementation will be used for retrieving Quisby data for a single dataset's results.

Currently, pquisby supports only the uperf benchmark; we will eventually add support for other benchmarks as well.

Right now we fetch the benchmark_type from dataset.metalog.pbench.script. But this method won't work if we are running pbench-user-benchmark; we need to find a way to capture the desired benchmark_type in that case. I will address this in subsequent PRs.

GET /api/v1/quisby/{dataset}
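As a rough illustration of the flow the description outlines, the sketch below derives the benchmark type from metadata.log-style nested data and maps unsupported benchmarks to an HTTP status. All names here (SUPPORTED_BENCHMARKS, get_benchmark_type, check_supported) are hypothetical stand-ins, not the actual Pbench server code.

```python
from http import HTTPStatus

# Hypothetical sketch: the server reads dataset.metalog.pbench.script and
# rejects any benchmark pquisby does not yet support (only uperf for now).
SUPPORTED_BENCHMARKS = {"uperf"}

def get_benchmark_type(metadata: dict) -> str:
    """Return the benchmark name recorded in the dataset's metadata.log."""
    script = metadata.get("pbench", {}).get("script")
    if script is None:
        raise ValueError("dataset has no pbench.script metadata")
    return script

def check_supported(metadata: dict) -> HTTPStatus:
    """Map the benchmark type to an HTTP status for the Quisby endpoint."""
    try:
        benchmark = get_benchmark_type(metadata)
    except ValueError:
        return HTTPStatus.BAD_REQUEST
    if benchmark not in SUPPORTED_BENCHMARKS:
        return HTTPStatus.UNSUPPORTED_MEDIA_TYPE
    return HTTPStatus.OK
```

As the description notes, a pbench-user-benchmark run would not record a usable script value, which is exactly the gap deferred to subsequent PRs.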

@siddardh-ra siddardh-ra requested a review from dbutenhof June 14, 2023 08:22
@siddardh-ra siddardh-ra self-assigned this Jun 14, 2023
@siddardh-ra siddardh-ra added API Of and relating to application programming interfaces to services and functions Server labels Jun 14, 2023
@siddardh-ra siddardh-ra requested review from webbnh and riya-17 June 14, 2023 08:29
@dbutenhof (Member) left a comment:

This is great, Siddardh!

I think we don't want to expose the name "quisby" in our API or messaging. For one thing, we may not always (or exclusively) use the quisby package. What we're doing here is postprocessing results data for client-side visualization, and "quisby-ing" data won't mean anything to anyone outside Perf & Scale (and possibly not to everyone within it). So I suggest we call this something more appropriately generic and descriptive, like "visualize".

Review threads (outdated, resolved) on:
  • lib/pbench/server/api/__init__.py (2)
  • lib/pbench/server/api/resources/quisby.py (4)
  • lib/pbench/test/unit/server/test_quisby_results.py (3)
@siddardh-ra siddardh-ra requested review from dbutenhof and riya-17 June 16, 2023 11:20
@dbutenhof (Member) left a comment:

You need to fix the client API enum (and adding a functional test would be a great "bonus"). I also don't like the trailing slash API mapping: I don't think it's appropriate here. I have one other minor style comment. Aside from that, this is looking really good.

Review threads (outdated, resolved) on:
  • lib/pbench/client/__init__.py
  • lib/pbench/server/api/__init__.py
  • lib/pbench/server/api/resources/visualize.py
@siddardh-ra siddardh-ra requested a review from dbutenhof June 16, 2023 12:08
dbutenhof previously approved these changes Jun 16, 2023
riya-17 previously approved these changes Jun 16, 2023
@webbnh (Member) left a comment:

This looks generally excellent. However, I have a few concerns (in descending order):

  • Are we sure we like that API method path?
  • There is a useless assignment in one of the tests which looks like it should instead be an assertion.
  • I have a coding suggestion for selecting the Quisby benchmark type.
  • A missing result.csv file shouldn't be an internal Server error.
  • There's a raise which should have a from.
  • There's an error message for a missing dataset which should be trimmed or reworked.

And, while you're at it, there are a few other smaller things and nits.

Review threads (outdated, resolved) on:
  • lib/pbench/server/api/__init__.py
  • lib/pbench/server/cache_manager.py
  • lib/pbench/server/api/resources/visualize.py (2)
Comment on lines 74 to 87
    try:
        file = tarball.extract(tarball.tarball_path, f"{name}/result.csv")
    except TarballUnpackError as e:
        raise APIInternalError(str(e)) from e
Reply:

I'm not convinced that this should be an Internal Server Error: as things currently stand, the result.csv file is created by the Agent, right? So, if it's not there, it's a problem with the dataset, not with the Pbench Server.

I would be inclined to go with a NOT_FOUND result (with a suitable error message). (Or, possibly, UNSUPPORTED_MEDIA_TYPE.)
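The suggestion above could be sketched roughly as follows. This is a minimal illustration of mapping the missing-file case to a client-visible status instead of a 500; TarballUnpackError and APIAbort are stand-in classes here (the real Pbench server has its own versions with possibly different signatures), and extract_result_csv is a hypothetical helper.

```python
from http import HTTPStatus

class TarballUnpackError(Exception):
    """Stand-in for the cache manager's unpack error (hypothetical here)."""

class APIAbort(Exception):
    """Stand-in for the server's client-error abort helper (hypothetical here)."""
    def __init__(self, status: HTTPStatus, message: str):
        super().__init__(message)
        self.status = status

def extract_result_csv(tarball, name: str):
    """Treat a missing result.csv as a client-visible error, not a 500."""
    try:
        return tarball.extract(tarball.tarball_path, f"{name}/result.csv")
    except TarballUnpackError as e:
        raise APIAbort(
            HTTPStatus.NOT_FOUND,
            f"Dataset {name!r} contains no result.csv; nothing to visualize",
        ) from e
```

Whether NOT_FOUND is the right status for a secondary resource is exactly what the rest of this thread debates.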

Reply:

Since NOT_FOUND technically refers to the primary resource (the dataset), that would be misleading and unhelpful here. I'm not crazy about UNSUPPORTED_MEDIA_TYPE either, though it's probably no worse than the other contexts we've already used it, and I can't think of anything in the limited HTTP error space that's less inappropriate. 😦

Reply:

The "medium" (i.e., the result) has to have a result.csv file produced by the post-processing of a supported benchmark...otherwise, it's...um...unsupported. 😁

But, @dbutenhof, you concur that it should not be an internal error, right?

Reply:

Given that we've already failed if dataset.metalog.pbench.script isn't exactly "uperf", I'd be inclined to say by the time we get here we expect that this is a pbench-uperf benchmark wrapper execution which can reasonably be expected to have proper post-processing, and lack of the proper summary file is definitely unexpected. Maybe that's potentially more of an "agent-side internal error" but I can't see any reasonable way to express that in HTTPStatus aside from "internal server error". And, at best, I think we can reasonably say it means the server API was insufficiently paranoid about the dataset format...

Reply:

Perhaps I'm confused, but I think we're checking for the result.csv file before we're trying to decode the benchmark type. (Do you think we should do this in a different order?)

Reply:

Re METHOD_NOT_ALLOWED, Mozilla explains it as

indicates that the server knows the request method, but the target resource doesn't support this method.

which sounds pretty close to me (although, technically, we're supposed to respond with a list of methods which are allowed, which sounds like a pain).

UNSUPPORTED_MEDIA_TYPE, but that's really intended to report incompatible request Content-Type

Ah, yes, I'd missed the fact that it refers to the request payload. Bummer.

I'm not entirely happy with UNPROCESSABLE_ENTITY

Yeah, the WebDAV ones always look so attractive until you get to the details. 😛

Re BAD_REQUEST, Mozilla describes it as

indicates that the server cannot or will not process the request due to something that is perceived to be a client error

I would suggest that the error is trying to visualize a dataset which has no result.csv file in it...the client just cannot do that. 😒

I like 501/NOT_IMPLEMENTED in principle, but, based on what Mozilla says, it doesn't fit:

501 is the appropriate response when the server does not recognize the request method and is incapable of supporting it for any resource. The only methods that servers are required to support (and therefore that must not return 501) are GET and HEAD.

That is, we do recognize the method (and, worse, it is a GET).

Interestingly, Mozilla goes on to suggest:

If the server does recognize the method, but intentionally does not support it, the appropriate response is 405 Method Not Allowed.

So, 405 sounds like a reasonable fit (better than 403, which I agree would cover it, but too generically).
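On the point above that a 405 response is "supposed to respond with a list of methods which are allowed": the spec-mandated Allow header is actually trivial to construct, as this small stand-alone sketch shows (the function name and response shape are illustrative, not Pbench code).

```python
from http import HTTPStatus

def method_not_allowed(allowed: list[str]) -> tuple[int, dict]:
    """Build a 405 response; RFC 9110 requires an Allow header listing
    the methods the target resource does support."""
    return (
        HTTPStatus.METHOD_NOT_ALLOWED,
        {"Allow": ", ".join(sorted(allowed))},
    )
```

The harder problem, raised in the replies below this point, is semantic rather than mechanical: whether 405 is honest when the rejected method is the only one the endpoint supports.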

Reply:

But we do support the HTTP POST method for this endpoint. That's telling them they shouldn't be using a POST and is otherwise completely unhelpful. I think that's an incredibly bad fit. 😦 In particular, clients are encouraged to cache this to know that a subsequent POST would be inappropriate. It's not and can't be resource-specific.

Reply:

And, by the way, your Mozilla descriptions often conflict with the "easy reference" Wikipedia page I often use, so from now on I'm sticking with RFC 9110 which differs from both in subtle ways that make them both liars.

Reply:

from now on I'm sticking with RFC 9110

Fair enough. But, that reference says that 405 is resource-specific (and therefore, hopefully, the caching is resource-sensitive, as well...).

@dbutenhof (Member) replied, Jun 21, 2023:

It would still mean rejecting a POST on /api/v1/datasets/<id>/visualize with a quite specifically defined complaint that POST isn't acceptable, when POST is the only method we do allow. And that's the problem: the HTTP errors are a very narrow and specifically defined set, which is absolutely ridiculous.

The most "wiggle room" I see in any of these is FORBIDDEN and UNPROCESSABLE_ENTITY, and I'm not thrilled about either of those.

Review threads (outdated, resolved) on:
  • lib/pbench/server/api/resources/visualize.py
  • lib/pbench/test/unit/server/test_visualize.py (4)
@webbnh (Member) left a comment:

This looks good, but there are two small items which should probably be fixed, and there's some follow-up for the method path change.

  • I'm happy with the change to the method path; however, it has a bunch of knock-on effects which should be attended to...if Dave is onboard with the change.
  • There's a use of get() in one of the test assertions which is problematic...but we should probably just drop the assertion altogether.
  • There's another "missing assertion".

The rest of my comments are nits and small stuff.

Review threads (outdated, resolved) on:
  • lib/pbench/server/api/__init__.py
  • lib/pbench/server/api/resources/visualize.py (3)
  • lib/pbench/test/unit/server/test_endpoint_configure.py
  • lib/pbench/client/__init__.py
  • lib/pbench/test/unit/server/test_visualize.py (4)
dbutenhof previously approved these changes Jun 20, 2023

@dbutenhof (Member) left a comment:

Error codes in general are a problem, and we can potentially decide to come up with and implement a consistent, different scheme later; but I don't think we need to hold up this critical change while we think (and argue) about it.

Comment on lines 74 to 87 (the same result.csv extraction quoted above):

Reply:

METHOD_NOT_ALLOWED normally refers to POST vs GET/etc; I think that would be confusing when what we mean might best be represented as "endpoint not allowed for this resource".

Yes, we've squeezed UNSUPPORTED_MEDIA_TYPE, but that's really intended to report incompatible request Content-Type (I presented XML but the API only supports JSON). I've never been entirely comfortable with this mapping, although at the time I hadn't noticed anything that looked measurably better or less misleading.

I'm not entirely happy with UNPROCESSABLE_ENTITY, either, which is oriented towards WebDAV protocol payloads ... and we're not talking about request payloads here... but as it's a niche error a lot less common than UNSUPPORTED_MEDIA_TYPE, it'd likely be a safer victim to steal and corrupt to our evil ends ...

Similarly for BAD_REQUEST ... this isn't a bad request (there's nothing wrong with the URL, headers, or payload); rather, the resource isn't appropriate for the endpoint. You'd think this would be a common problem, but I really don't see any HTTP status with a description in that ballpark. It's even worse than trying to describe thread programming errors with those ridiculous UNIX error numbers... 😬

I could actually see 501/NOT_IMPLEMENTED for "benchmark other than uperf", come to think of it; along with the implicit indication that this might change in the future, which is wholly appropriate.

Interestingly, although common usage (including ours) makes this potentially misleading, 403/FORBIDDEN is technically described as "The request contained valid data and was understood by the server, but the server is refusing action." (Failed permission check is just a "for example".) Aside from that common inference, though, this isn't a bad fit at all. And the message explains why we're refusing action... 🤔
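The FORBIDDEN option weighed in the comment above amounts to something like the sketch below: a 403 whose body carries the actual explanation, since the status code alone cannot. The function name and response shape are hypothetical, not actual Pbench server code.

```python
import json
from http import HTTPStatus

def refuse_visualization(reason: str) -> tuple[int, str]:
    """Sketch of the 403 option: RFC 9110 describes FORBIDDEN as the server
    understanding the request but refusing to act, so the response body
    must carry the human-readable reason for the refusal."""
    return HTTPStatus.FORBIDDEN, json.dumps({"message": reason})
```

For example, refuse_visualization("dataset has no result.csv to visualize") would yield a 403 with that message in the JSON body.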

@webbnh (Member) left a comment:

There is just one lingering issue (other than the nit that Dave pointed out, which I don't think either of us would necessarily hold the merge for): how to report a missing result.csv file. Given that we don't seem to be converging on a solution for that problem, I'm approving the code as is.

A comment from dbutenhof was marked as resolved.

@dbutenhof dbutenhof merged commit 68d543d into distributed-system-analysis:main Jun 21, 2023
Labels: API (Of and relating to application programming interfaces to services and functions), Server
Project status: Done
4 participants