Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement POST /api/v1/relay #3425

Merged
merged 5 commits into from
May 31, 2023

Conversation

dbutenhof
Copy link
Member

PBENCH-1143

This API integrates with the file-relay server by assuming that the client (e.g., the pbench-results-move command) creates two separate files on the relay. The main artifact is the tarball, identified only by the tarball's computed SHA256 hash, as http://relay.example.com/<name>/<sha256>. However the Pbench Server needs the MD5 for the formal resource_id and a name for the dataset before we can begin "intake". We also want a mechanism to override the default "private" access level, and to set metadata, so by convention we add a second JSON file to which I'm referring as the "Relay configuration file". This provides the dataset "metadata" we require, including the tarball URI:

{
  "name": "tarball.tar.xz",
  "md5": "myMD5",
  "access": "private",
  "metadata": ["server.origin:relay", "global.server.relay:webb"],
  "uri": "http://relay.example.com/<name>/<tarball-sha256>"
}

The Pbench Server will first GET the URI parameter, which it expects to be the Relay Configuration File. It will then GET the "uri" parameter in the configuration file.

To accomplish this, the old Upload code has been refactored into a base class defining most of the logic, and two subclasses which provide "helper" methods called _prepare and _access to encapsulate the operational differences between the "push" and "pull" models. Essentially the Upload class _prepare processes the URI and query parameters while the Relay class _prepare does the first GET and processes the JSON configuration parameters. The _access helpers return the content length and an IO[byte] stream which is read and processed by the common code.

This has been tested manually by standing up an instance of the relay server and I added unit tests to cover the distinct functions of the relay helper methods while not duplicating the upload unit test cases that cover the common code.

@dbutenhof dbutenhof added Server API Of and relating to application programming interfaces to services and functions labels May 17, 2023
@dbutenhof dbutenhof self-assigned this May 17, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So...I found a few things to comment on.

The biggest one (which is not really big...) is that any new/modified code which is raising a new exception from inside an except clause should include a from statement, either referencing the current exception or None, as appropriate.

I believe that the new member functions _prepare() and _access() could/should be @staticmember functions. (And, it would be great if we could find a better name than "access", which is now overloaded....)

I think we can remove a lot of the exception handling from _intake() and let the outer try block handle it (particularly the APIInternalError's).

Unfortunately, the new unit test code continues our tradition of being very "functional-like". For the new CUT, we really don't need the full-on Flask client -- we should be able to invoke _prepare() and _access() directly, which requires much less mocking. (Of course, doing so would necessitate testing the base class separately, but I think that's actually a Good Thing™.)

lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/test/unit/server/test_relay.py Show resolved Hide resolved
lib/pbench/test/unit/server/test_relay.py Outdated Show resolved Hide resolved
lib/pbench/test/unit/server/test_upload.py Show resolved Hide resolved
lib/pbench/test/unit/server/test_upload.py Outdated Show resolved Hide resolved
lib/pbench/test/unit/server/test_upload.py Show resolved Hide resolved
webbnh
webbnh previously approved these changes May 26, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it's just down to nits and similar, so I'm approving, but I've called out a number of things which you might consider tweaking.

In particular, it would be good to have appropriate from clauses on all the raise's inside except blocks (at least for code which you're already changing). (There are one or two which, arguably, should be omitted, because they are reporting errors during error handling, but most of them are reporting errors whose causes are the caught errors.

lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/relay.py Show resolved Hide resolved
lib/pbench/server/api/resources/upload.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
dbutenhof added 4 commits May 26, 2023 15:06
PBENCH-1143

This API integrates with the file-relay server by assuming that the client
(e.g., the `pbench-results-move` command) creates *two* separate files on the
relay. The main artifact is the tarball, identified only by the tarball's
computed SHA256 hash, as `http://relay.example.com/<name>/<sha256>`. However
the Pbench Server needs the MD5 for the formal `resource_id` and a *name* for
the dataset before we can begin "intake". We also want a mechanism to override
the default "private" access level, and to set metadata, so by convention we
add a second JSON file to which I'm referring as the "Relay configuration
file". This provides the dataset "metadata" we require, including the tarball
URI:

```json
{
  "name": "tarball.tar.xz",
  "md5": "myMD5",
  "access": "private",
  "metadata": ["server.origin:relay", "global.server.relay:webb"],
}
```

The Pbench Server will first `GET` the URI parameter, which it expects to be
the Relay Configuration File. It will then `GET` the `"uri"` parameter in the
configuration file.

To accomplish this, the old `Upload` code has been refactored into a base
class defining most of the logic, and two subclasses which provide "helper"
methods called `_prepare` and `_access` to encapsulate the operational
differences between the "push" and "pull" models. Essentially the `Upload`
class `_prepare` processes the URI and query parameters while the `Relay`
class `_prepare` does the first `GET` and processes the JSON configuration
parameters. The `_access` helpers return the content length and an `IO[byte]`
stream which is read and processed by the common code.

This has been tested manually by standing up an instance of the `relay` server
and I added unit tests to cover the distinct functions of the relay helper
methods while not duplicating the `upload` unit test cases that cover the
common code.
webbnh
webbnh previously approved these changes May 27, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good (but...there's a new a nit).

lib/pbench/server/api/resources/intake_base.py Outdated Show resolved Hide resolved
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dbutenhof dbutenhof merged commit f417cc1 into distributed-system-analysis:main May 31, 2023
@dbutenhof dbutenhof deleted the pullapi branch May 31, 2023 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Of and relating to application programming interfaces to services and functions Server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants