Some API documentation fixes (#3441)
* Some API documentation fixes

PBENCH-819

This is primarily a minor "cleanup pass" to make some specific example paths
more generic and retire a long-standing story.

Along the way I also found a few places I'd missed during the REST API changes
so I fixed them as well. Finally, I added a `relay.md` API file for the new
Relay API.

Any other bugs, omissions, or other comments in API documentation could also
be considered "in scope" here.
dbutenhof authored Jun 1, 2023
1 parent f5a6b12 commit 1dbad11
Showing 6 changed files with 204 additions and 40 deletions.
17 changes: 10 additions & 7 deletions docs/Server/API/README.md
@@ -7,11 +7,14 @@ The Pbench Server provides a set of HTTP endpoints to manage user
authentication and curated performance information, called "dataset resources"
or just "datasets".

The [V1 API](V1/README.md) provides a functional interface that's not quite
standard REST. The intent is to migrate to a cleaner resource-oriented REST
style for a future V2 API.
The [V1 API](V1/README.md) provides a REST-like functional interface.

The Pbench Server primarily uses serialized JSON parameters (mimetype
`application/json`) both for request bodies and response bodies. A few
exceptions use raw byte streams (`application/octet-stream`) to allow uploading
new datasets and to access individual files from a dataset.
The Pbench Server APIs accept parameters from a variety of sources (see the
individual API documentation for details, and the sketch after this list for
a combined example):
1. Some parameters, especially "resource ids", are embedded in the URI, such as
`/api/v1/datasets/<resource_id>`;
2. Some parameters are passed as query parameters, such as
`/api/v1/datasets?name:fio`;
3. For `PUT` and `POST` APIs, parameters may also be passed as a JSON
(`application/json` content type) request payload, such as
`{"metadata": {"dataset.name": "new name"}}`
17 changes: 0 additions & 17 deletions docs/Server/API/V1/README.md
@@ -132,20 +132,3 @@ through the `directories` list of each
The [inventory](inventory.md) API returns the raw byte stream of any regular
file within the directory hierarchy, including log files, postprocessed JSON
files, and benchmark result text files.

### Example

```
def directory(request, url: str, name: str = "/", level: int = 0):
    # List one directory level; each entry carries a "name" and a "uri"
    ls = request.get(url).get_json()
    print(f"{' ' * level}{name}")
    for d in ls["directories"]:
        # Recurse into subdirectories via their contents URIs
        directory(request, d["uri"], d["name"], level + 1)
    for f in ls["files"]:
        print(f"{' ' * (level + 1)}{f['name']}")
        data = request.get(f["uri"])
        # display byte stream:
        # inline on terminal doesn't really make sense

directory(request, "http://host.example.com/api/v1/contents/<dataset>/")
```
25 changes: 15 additions & 10 deletions docs/Server/API/V1/contents.md
@@ -75,33 +75,33 @@ Pbench returns a JSON object with two list fields:
{
    "directories": [
        {
            "name": "1-iter1",
            "name": "dir1",
            "type": "dir",
            "uri": "http://hostname/api/v1/datasets/contents/<id>/1-iter1"
            "uri": "http://hostname/api/v1/datasets/<id>/contents/dir1"
        },
        {
            "sysinfo",
            "name": "dir2",
            "type": "dir",
            "uri": "http://hostname/api/v1/datasets/contents/<id>/sysinfo"
            "uri": "http://hostname/api/v1/datasets/<id>/contents/dir2"
        },
        ...
    ],
    "files": [
        {
            "name": ".iterations",
            "name": "file.txt",
            "mtime": "2022-05-18T16:02:30",
            "size": 24,
            "mode": "0o644",
            "type": "reg",
            "uri": "http://hostname/api/v1/datasets/inventory/<id>/.iterations"
            "uri": "http://hostname/api/v1/datasets/<id>/inventory/file.txt"
        },
        {
            "name": "iteration.lis",
            "name": "data.lis",
            "mtime": "2022-05-18T16:02:06",
            "size": 18,
            "mode": "0o644",
            "type": "reg",
            "uri": "http://hostname/api/v1/datasets/inventory/<id>/iteration.lis"
            "uri": "http://hostname/api/v1/datasets/<id>/inventory/data.lis"
        },
        ...
    ]
@@ -126,7 +126,12 @@ The `type` codes are:
{
    "name": "reference-result",
    "type": "sym",
    "uri": "http://hostname/api/v1/datasets/contents/<id>/sample1"
    "uri": "http://hostname/api/v1/datasets/<id>/contents/linkresult"
},
{
    "name": "directory",
    "type": "dir",
    "uri": "http://hostname/api/v1/datasets/<id>/contents/directory"
}
```
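
The following is a rough sketch (not part of the Pbench documentation itself;
the `handle_entry` function, the `headers` dict, and the dispatch logic are
illustrative assumptions, while the entry fields come from the examples above)
of how a client might act on each entry's `type` code:

```python
import requests

def handle_entry(entry: dict, headers: dict):
    """Illustrative dispatch on a contents-listing entry's 'type' code."""
    if entry["type"] == "dir":
        # Directory entries carry a contents API uri: fetch another listing.
        return requests.get(entry["uri"], headers=headers).json()
    if entry["type"] == "reg":
        # Regular file entries carry an inventory API uri: fetch the byte stream.
        return requests.get(entry["uri"], headers=headers).content
    if entry["type"] == "sym":
        # Symlink entries: the reported uri resolves to the linked content.
        return requests.get(entry["uri"], headers=headers)
    return None
```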

@@ -154,6 +159,6 @@ URI returning the linked file's byte stream.
"size": 18,
"mode": "0o644",
"type": "reg",
"uri": "http://hostname/api/v1/datasets/inventory/<id>/<path>"
"uri": "http://hostname/api/v1/datasets/<id>/inventory/<path>"
}
```
25 changes: 22 additions & 3 deletions docs/Server/API/V1/inventory.md
@@ -11,8 +11,10 @@ The resource ID of a Pbench dataset on the server.

`<path>` string \
The resource path of an item in the dataset inventory, as captured by the
Pbench Agent packaging; for example, `/metadata.log` for the dataset metadata,
or `/1-default/sample1/result.txt` for the default first iteration results.
Pbench Agent packaging; for example, `/metadata.log` for a file named
`metadata.log` at the top level of the dataset tarball, or `/dir1/dir2/file.txt`
for a `file.txt` file in a directory named `dir2` within a directory called
`dir1` at the top level of the dataset tarball.

## Request headers

@@ -25,6 +27,23 @@ E.g., `authorization: bearer <token>`
`content-type: application/octet-stream` \
The return is a raw byte stream representing the contents of the named file.

`content-disposition: <action>; filename=<name>` \
This header defines the recommended client action on receiving the byte stream.
The `<action>` types are either `inline`, which suggests that the data can be
displayed "inline" by a web browser, or `attachment`, which suggests that the
data should be saved into a new file. The `<name>` is the original filename on the
Pbench Server. For example,

```
content-disposition: attachment; filename=pbench-fio-config-2023-06-29-00:14:50.tar.xz
```

or

```
content-disposition: inline; filename=data.txt
```
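
As a rough sketch (the hostname, dataset id, path, and token below are
assumptions, as is the filename parsing), a client might honor the
`content-disposition` header like this:

```python
import re
import requests

uri = "https://pbench.example.com/api/v1/datasets/<id>/inventory/data.txt"
r = requests.get(uri, headers={"authorization": "bearer <token>"})

disposition = r.headers.get("content-disposition", "")
match = re.search(r"filename=(?P<name>[^;]+)", disposition)
if disposition.startswith("attachment") and match:
    # Save the byte stream under the server-suggested filename
    with open(match.group("name").strip('"'), "wb") as f:
        f.write(r.content)
else:
    # "inline" suggests the content can be displayed directly
    print(r.text)
```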

## Resource access

* Requires `READ` access to the `<dataset>` resource
@@ -48,7 +67,7 @@

`415` **UNSUPPORTED MEDIA TYPE** \
The `<path>` refers to a directory. Use
`/api/v1/dataset/contents/<dataset><path>` to request a JSON response document
`/api/v1/dataset/<dataset>/contents/<path>` to request a JSON response document
describing the directory contents.

`503` **SERVICE UNAVAILABLE** \
145 changes: 145 additions & 0 deletions docs/Server/API/V1/relay.md
@@ -0,0 +1,145 @@
# `POST /api/v1/relay/<uri>`

This API creates a dataset resource by reading data from a Relay server. There
are two distinct steps involved:

1. A `GET` on the provided URI must return a "Relay manifest file". This is a
JSON file (`application/json` MIME format) providing the original tarball
filename, the tarball's MD5 hash value, a URI to read the tarball file, and
optionally metadata key/value pairs to be applied to the new dataset. (See
[Manifest file keys](#manifest-file-keys).)
2. A `GET` on the Relay manifest file's `uri` field value must return the
tarball file as an `application/octet-stream` payload, which will be stored by
the Pbench Server as a dataset.
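
Conceptually, the flow looks like the sketch below. This is illustrative only,
not the Pbench Server's actual implementation; the `fetch_via_relay` name and
the placement of the MD5 check are assumptions based on the description above.

```python
import hashlib
import requests

def fetch_via_relay(manifest_uri: str) -> bytes:
    """Conceptual sketch of the two-step Relay flow."""
    # Step 1: the manifest is a JSON document describing the tarball.
    manifest = requests.get(manifest_uri).json()

    # Step 2: the manifest's "uri" field resolves to the tarball byte stream.
    tarball = requests.get(manifest["uri"]).content

    # The tarball must match the MD5 hash recorded in the manifest.
    if hashlib.md5(tarball).hexdigest() != manifest["md5"]:
        raise ValueError("tarball MD5 does not match the Relay manifest")
    return tarball
```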

## URI parameters

`<uri>` string \
The Relay server URI of the tarball's manifest `application/json` file. This
JSON object must provide a set of parameter keys as defined below in
[Manifest file keys](#manifest-file-keys).

## Manifest file keys

For example,

```json
{
    "uri": "https://relay.example.com/52adfdd3dbf2a87ed6c1c41a1ce278290064b0455f585149b3dadbe5a0b62f44",
    "md5": "22a4bc5748b920c6ce271eb68f08d91c",
    "name": "fio_rw_2018.02.01T22.40.57.tar.xz",
    "access": "private",
    "metadata": ["server.origin:myrelay", "global.agent:cloud1"]
}
```

`access`: [ `private` | `public` ] \
The desired initial access scope of the dataset. Select `public` to make the
dataset accessible to all clients, or `private` to make the dataset accessible
only to the owner. The default access scope if the key is omitted from the
manifest is `private`.

For example, `"access": "public"`

`md5`: tarball MD5 hash \
The MD5 hash of the compressed tarball file. This must match the actual tarball
octet stream specified by the manifest `uri` key.

`metadata`: [metadata key/value strings] \
A set of desired Pbench Server metadata key values to be assigned to the new
dataset. You can set the initial resource name (`dataset.name`), for example, as
well as assigning any keys in the `global` and `user` namespaces. See
[metadata](../metadata.md) for more information.

In particular the client can set any of:
* `dataset.name`: [default dataset name](../metadata.md#datasetname)
* `server.origin`: [dataset origin](../metadata.md#serverorigin)
* `server.archiveonly`: [suppress indexing](../metadata.md#serverarchiveonly)
* `server.deletion`: [default dataset expiration time](../metadata.md#serverdeletion).

`name`: The original tarball file name \
The string value must represent a legal filename ending with the compound
suffix `.tar.xz`, representing a `tar` archive compressed with the `xz` program.

`uri`: Relay URI resolving to the tarball file \
An HTTP `GET` on this URI, exactly as recorded, must return the original tarball
file as an `application/octet-stream`.
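
As a rough sketch of how a client might assemble such a manifest (the tarball
filename and metadata values are taken from the example above; the relay object
URI and the output filename are assumptions, and placing the manifest and
tarball on the Relay server is outside the scope of this API):

```python
import hashlib
import json
from pathlib import Path

tarball = Path("fio_rw_2018.02.01T22.40.57.tar.xz")
manifest = {
    "uri": "https://relay.example.com/<tarball-object-id>",  # assumed relay URI
    "md5": hashlib.md5(tarball.read_bytes()).hexdigest(),
    "name": tarball.name,
    "access": "private",
    "metadata": ["server.origin:myrelay", "global.agent:cloud1"],
}
Path("manifest.json").write_text(json.dumps(manifest))
```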

## Request headers

`authorization: bearer` token \
*Bearer* schema authorization assigns the ownership of the new dataset to the
authenticated user. E.g., `authorization: bearer <token>`

`content-length` tarball size \
The size of the request octet stream in bytes. Generally supplied automatically by
an upload agent such as Python `requests` or `curl`.
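
A minimal invocation sketch (the server hostname, the token, and the assumption
that the manifest URI is appended directly to the path are illustrative, not
specified by this documentation):

```python
import requests

server = "https://pbench.example.com"  # assumed Pbench Server hostname
manifest_uri = "https://relay.example.com/<manifest-object-id>"
r = requests.post(
    f"{server}/api/v1/relay/{manifest_uri}",
    headers={"authorization": "bearer <token>"},
)
print(r.status_code, r.json().get("message"))
```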

## Response headers

`content-type: application/json` \
The return is a serialized JSON object with status information.

## Response status

`200` **OK** \
Successful request. The dataset MD5 hash is identical to that of a dataset
previously uploaded to the Pbench Server. This is assumed to be an identical
tarball, and the secondary URI (the `uri` field in the Relay manifest file)
has not been accessed.

`201` **CREATED** \
The tarball was successfully uploaded and the dataset has been created.

`400` **BAD_REQUEST** \
One of the required headers is missing or incorrect, invalid query parameters
were specified, or a bad value was specified for a query parameter. The return
payload will be a JSON document with a `message` field containing details.

`401` **UNAUTHORIZED** \
The client is not authenticated.

`502` **BAD GATEWAY** \
This means that a problem occurred reading either the manifest file or the
tarball from the Relay server. The return payload will be a JSON document with
a `message` field containing more information.

`503` **SERVICE UNAVAILABLE** \
The server has been disabled using the `server-state` server configuration
setting in the [server configuration](./server_config.md) API. The response
body is an `application/json` document describing the current server state,
a message, and optional JSON data provided by the system administrator.

## Response body

The `application/json` response body consists of a JSON object containing a
`message` field. On failure this will describe the nature of the problem and
in some cases an `errors` array will provide details for cases where multiple
problems can occur.

```json
{
    "message": "File successfully uploaded"
}
```

or

```json
{
    "message": "Dataset already exists"
}
```

or

```json
{
    "message": "at least one specified metadata key is invalid",
    "errors": [
        "Metadata key 'server.archiveonly' value 'abc' for dataset must be a boolean",
        "improper metadata syntax dataset.name=test must be 'k:v'",
        "Key test.foo is invalid or isn't settable"
    ]
}
```
15 changes: 12 additions & 3 deletions docs/Server/API/V1/upload.md
@@ -82,13 +82,22 @@ a message, and optional JSON data provided by the system administrator.

## Response body

The `application/json` response body consists of a JSON object giving a detailed
message on success or failure:
The `application/json` response body consists of a JSON object containing a
`message` field. On failure this will describe the nature of the problem and
in some cases an `errors` array will provide details for cases where multiple
problems can occur.

```json
{
    "message": "File successfully uploaded"
}
```

or

```json
{
    "message": "Dataset already exists",
    "errors": [ ]
}
```

