From 1dbad11993f192d3c930f27a9cbc6b5b2302b3ef Mon Sep 17 00:00:00 2001 From: David Butenhof Date: Thu, 1 Jun 2023 14:31:41 -0400 Subject: [PATCH] Some API documentation fixes (#3441) * Some API documentation fixes PBENCH-819 This is primarily a minor "cleanup pass" to make some specific example paths more generic and retire a long-standing story. Along the way I also found a few places I'd missed during the REST API changes so I fixed them as well. Finally, I added a `relay.md` API file for the new Relay API. Any other bugs, omissions, or other comments in API documentation could also be considered "in scope" here. --- docs/Server/API/README.md | 17 ++-- docs/Server/API/V1/README.md | 17 ---- docs/Server/API/V1/contents.md | 25 +++--- docs/Server/API/V1/inventory.md | 25 +++++- docs/Server/API/V1/relay.md | 145 ++++++++++++++++++++++++++++++++ docs/Server/API/V1/upload.md | 15 +++- 6 files changed, 204 insertions(+), 40 deletions(-) create mode 100644 docs/Server/API/V1/relay.md diff --git a/docs/Server/API/README.md b/docs/Server/API/README.md index e0ae327aa1..50714c11a1 100644 --- a/docs/Server/API/README.md +++ b/docs/Server/API/README.md @@ -7,11 +7,14 @@ The Pbench Server provides a set of HTTP endpoints to manage user authentication and curated performance information, called "dataset resources" or just "datasets". -The [V1 API](V1/README.md) provides a functional interface that's not quite -standard REST. The intent is to migrate to a cleaner resource-oriented REST -style for a future V2 API. +The [V1 API](V1/README.md) provides a REST-like functional interface. -The Pbench Server primarily uses serialized JSON parameters (mimetype -`application/json`) both for request bodies and response bodies. A few -exceptions use raw byte streams (`application/octet-stream`) to allow uploading -new datasets and to access individual files from a dataset. +The Pbench Server APIs accept parameters from a variety of sources. 
See the +individual API documentation for details. +1. Some parameters, especially "resource ids", are embedded in the URI, such as +`/api/v1/datasets/`; +2. Some parameters are passed as query parameters, such as +`/api/v1/datasets?name=fio`; +3. For `PUT` and `POST` APIs, parameters may also be passed as a JSON +(`application/json` content type) request payload, such as +`{"metadata": {"dataset.name": "new name"}}` diff --git a/docs/Server/API/V1/README.md b/docs/Server/API/V1/README.md index 25fe93f24d..9d677bf04d 100644 --- a/docs/Server/API/V1/README.md +++ b/docs/Server/API/V1/README.md @@ -132,20 +132,3 @@ through the `directories` list of each The [inventory](inventory.md) API returns the raw byte stream of any regular file within the directory hierarchy, including log files, postprocessed JSON files, and benchmark result text files. - -### Example - -``` - def directory(request, url: str, name: str = "/", level: int = 0): - ls = request.get(url).get_json() - print(f"{' '*level}{name}") - for d in ls.directories: - directory(request, level + 1, d.name, d.url) - for f in ls.files: - print(f"{' '*(level+1)}{f.name}) - bytes = request.get(f.url) - # display byte stream: - # inline on terminal doesn't really make sense - - directory(request, "http://host.example.com/api/v1/contents//") -``` diff --git a/docs/Server/API/V1/contents.md b/docs/Server/API/V1/contents.md index e215065fcf..b8d7794ecd 100644 --- a/docs/Server/API/V1/contents.md +++ b/docs/Server/API/V1/contents.md @@ -75,33 +75,33 @@ Pbench returns a JSON object with two list fields: { "directories": [ { - "name": "1-iter1", + "name": "dir1", "type": "dir", - "uri": "http://hostname/api/v1/datasets/contents//1-iter1" + "uri": "http://hostname/api/v1/datasets//contents/dir1" }, { - "sysinfo", + "name": "dir2", "type": "dir", - "uri": "http://hostname/api/v1/datasets/contents//sysinfo" + "uri": "http://hostname/api/v1/datasets//contents/dir2" }, ... 
], "files": [ { - "name": ".iterations", + "name": "file.txt", "mtime": "2022-05-18T16:02:30", "size": 24, "mode": "0o644", "type": "reg", - "uri": "http://hostname/api/v1/datasets/inventory//.iterations" + "uri": "http://hostname/api/v1/datasets//inventory/file.txt" }, { - "name": "iteration.lis", + "name": "data.lis", "mtime": "2022-05-18T16:02:06", "size": 18, "mode": "0o644", "type": "reg", - "uri": "http://hostname/api/v1/datasets/inventory//iteration.lis" + "uri": "http://hostname/api/v1/datasets//inventory/data.lis" }, ... ] @@ -126,7 +126,12 @@ The `type` codes are: { "name": "reference-result", "type": "sym", - "uri": "http://hostname/api/v1/datasets/contents//sample1" + "uri": "http://hostname/api/v1/datasets//contents/linkresult" +}, +{ + "name": "directory", + "type": "dir", + "uri": "http://hostname/api/v1/datasets//contents/directory" } ``` @@ -154,6 +159,6 @@ URI returning the linked file's byte stream. "size": 18, "mode": "0o644", "type": "reg", - "uri": "http://hostname/api/v1/datasets/inventory//" + "uri": "http://hostname/api/v1/datasets//inventory/" } ``` diff --git a/docs/Server/API/V1/inventory.md b/docs/Server/API/V1/inventory.md index 2d9d66934c..6f1b8fdabe 100644 --- a/docs/Server/API/V1/inventory.md +++ b/docs/Server/API/V1/inventory.md @@ -11,8 +11,10 @@ The resource ID of a Pbench dataset on the server. `` string \ The resource path of an item in the dataset inventory, as captured by the -Pbench Agent packaging; for example, `/metadata.log` for the dataset metadata, -or `/1-default/sample1/result.txt` for the default first iteration results. +Pbench Agent packaging; for example, `/metadata.log` for a file named +`metadata.log` at the top level of the dataset tarball, or `/dir1/dir2/file.txt` +for a `file.txt` file in a directory named `dir2` within a directory called +`dir1` at the top level of the dataset tarball. 
## Request headers @@ -25,6 +27,23 @@ E.g., `authorization: bearer ` `content-type: application/octet-stream` \ The return is a raw byte stream representing the contents of the named file. +`content-disposition: ; filename=` \ +This header defines the recommended client action on receiving the byte stream. +The `` types are either `inline`, which suggests that the data can be +displayed "inline" by a web browser, or `attachment`, which suggests that the data +should be saved into a new file. The `` is the original filename on the +Pbench Server. For example, + +``` +content-disposition: attachment; filename=pbench-fio-config-2023-06-29-00:14:50.tar.xz +``` + +or + +``` +content-disposition: inline; filename=data.txt +``` + ## Resource access * Requires `READ` access to the `` resource @@ -48,7 +67,7 @@ exist. `415` **UNSUPPORTED MEDIA TYPE** \ The `` refers to a directory. Use -`/api/v1/dataset/contents/` to request a JSON response document +`/api/v1/datasets//contents/` to request a JSON response document describing the directory contents. `503` **SERVICE UNAVAILABLE** \ diff --git a/docs/Server/API/V1/relay.md b/docs/Server/API/V1/relay.md new file mode 100644 index 0000000000..4fbb0b9071 --- /dev/null +++ b/docs/Server/API/V1/relay.md @@ -0,0 +1,145 @@ +# `POST /api/v1/relay/` + +This API creates a dataset resource by reading data from a Relay server. There +are two distinct steps involved: + +1. A `GET` on the provided URI must return a "Relay manifest file". This is a +JSON file (`application/json` MIME format) providing the original tarball +filename, the tarball's MD5 hash value, a URI to read the tarball file, and +optionally metadata key/value pairs to be applied to the new dataset. (See +[Manifest file keys](#manifest-file-keys).) +2. A `GET` on the Relay manifest file's `uri` field value must return the +tarball file as an `application/octet-stream` payload, which will be stored by +the Pbench Server as a dataset. 
+ +## URI parameters + +`` string \ +The Relay server URI of the tarball's manifest `application/json` file. This +JSON object must provide a set of parameter keys as defined below in +[Manifest file keys](#manifest-file-keys). + +## Manifest file keys + +For example, + +```json +{ + "uri": "https://relay.example.com/52adfdd3dbf2a87ed6c1c41a1ce278290064b0455f585149b3dadbe5a0b62f44", + "md5": "22a4bc5748b920c6ce271eb68f08d91c", + "name": "fio_rw_2018.02.01T22.40.57.tar.xz", + "access": "private", + "metadata": ["server.origin:myrelay", "global.agent:cloud1"] +} +``` + +`access`: [ `private` | `public` ] \ +The desired initial access scope of the dataset. Select `public` to make the +dataset accessible to all clients, or `private` to make the dataset accessible +only to the owner. The default access scope if the key is omitted from the +manifest is `private`. + +For example, `"access": "public"` + +`md5`: tarball MD5 hash \ +The MD5 hash of the compressed tarball file. This must match the actual tarball +octet stream specified by the manifest `uri` key. + +`metadata`: [metadata key/value strings] \ +A set of desired Pbench Server metadata key values to be assigned to the new +dataset. You can set the initial resource name (`dataset.name`), for example, as +well as assigning any keys in the `global` and `user` namespaces. See +[metadata](../metadata.md) for more information. + +In particular the client can set any of: +* `dataset.name`: [default dataset name](../metadata.md#datasetname) +* `server.origin`: [dataset origin](../metadata.md#serverorigin) +* `server.archiveonly`: [suppress indexing](../metadata.md#serverarchiveonly) +* `server.deletion`: [default dataset expiration time](../metadata.md#serverdeletion). + +`name`: The original tarball file name \ +The string value must represent a legal filename with the compound type of +`.tar.xz` representing a `tar` archive compressed with the `xz` program. 
+ +`uri`: Relay URI resolving to the tarball file \ +An HTTP `GET` on this URI, exactly as recorded, must return the original tarball +file as an `application/octet-stream`. + +## Request headers + +`authorization: bearer` token \ +*Bearer* scheme authorization assigns the ownership of the new dataset to the +authenticated user. E.g., `authorization: bearer ` + +`content-length` tarball size \ +The size of the request octet stream in bytes. Generally supplied automatically by +an upload agent such as Python `requests` or `curl`. + +## Response headers + +`content-type: application/json` \ +The return is a serialized JSON object with status information. + +## Response status + +`200` **OK** \ +Successful request. The dataset MD5 hash is identical to that of a dataset +previously uploaded to the Pbench Server. This is assumed to be an identical +tarball, and the secondary URI (the `uri` field in the Relay manifest file) +has not been accessed. + +`201` **CREATED** \ +The tarball was successfully uploaded and the dataset has been created. + +`400` **BAD REQUEST** \ +One of the required headers is missing or incorrect, invalid query parameters +were specified, or a bad value was specified for a query parameter. The return +payload will be a JSON document with a `message` field containing details. + +`401` **UNAUTHORIZED** \ +The client is not authenticated. + +`502` **BAD GATEWAY** \ +This means that a problem occurred reading either the manifest file or the +tarball from the Relay server. The return payload will be a JSON document with +a `message` field containing more information. + +`503` **SERVICE UNAVAILABLE** \ +The server has been disabled using the `server-state` server configuration +setting in the [server configuration](./server_config.md) API. The response +body is an `application/json` document describing the current server state, +a message, and optional JSON data provided by the system administrator. 
+ +## Response body + +The `application/json` response body consists of a JSON object containing a +`message` field. On failure this will describe the nature of the problem and +in some cases an `errors` array will provide details for cases where multiple +problems can occur. + +```json +{ + "message": "File successfully uploaded" +} +``` + +or + +```json +{ + "message": "Dataset already exists" +} +``` + +or + +```json +{ + "message": "at least one specified metadata key is invalid", + "errors": [ + "Metadata key 'server.archiveonly' value 'abc' for dataset must be a boolean", + "improper metadata syntax dataset.name=test must be 'k:v'", + "Key test.foo is invalid or isn't settable" + ] +} +``` diff --git a/docs/Server/API/V1/upload.md b/docs/Server/API/V1/upload.md index 031c7382c4..f047f90cae 100644 --- a/docs/Server/API/V1/upload.md +++ b/docs/Server/API/V1/upload.md @@ -82,13 +82,22 @@ a message, and optional JSON data provided by the system administrator. ## Response body -The `application/json` response body consists of a JSON object giving a detailed -message on success or failure: +The `application/json` response body consists of a JSON object containing a +`message` field. On failure this will describe the nature of the problem and +in some cases an `errors` array will provide details for cases where multiple +problems can occur. + +```json +{ + "message": "File successfully uploaded" +} +``` + +or ```json { "message": "Dataset already exists", - "errors": [ ] } ```