Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Web API for identifiers and identifier sets #230

Merged
merged 8 commits into from
Jul 19, 2021
Merged

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented Jun 21, 2021

Refresh of a long-standing WIP from a couple years ago to add an identifier API. It had slightly different but overlapping motivations at the time. I refreshed it on the latest code and added an identifier lookup endpoint which could be used for validation.

Marked as draft because I expect we'll want some further consideration first. We don't have to use this as-is, but I figured it could be a starting point for discussion about what we do want there with much of the plumbing done.

Commit messages contain details.

@tsibley tsibley requested a review from a team as a code owner June 21, 2021 20:12
@tsibley tsibley marked this pull request as draft June 21, 2021 20:13
Copy link
Contributor

@davereinhart davereinhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested each of these endpoint successfully. Looks like a useful set! Just a few questions.
(I tried commenting on multiple lines of code below but it's not showing the ones I selected... I need to practice with that feature a little more)

return jsonify(set._asdict())


@api_v1.route("/warehouse/identifier-sets/<name>", methods = ['PUT'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you chose PUT instead of POST method here? Also, would it be beneficial to also be able to set the description value for the new record?

Copy link
Member Author

@tsibley tsibley Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. In my experience:

PUT is usually used for idempotent operations acting directly on a specific resource (e.g. one with a unique id, in this case <name>).

POST is usually used for non-idempotent operations acting on a collection resource, to take arbitrary action (e.g. launch a job) or create new records where you don't know the unique id in advance (e.g. auto-numbered records). Responses to POSTs are often 302 Redirects to the newly-created specific resource, which tell the client what the new id is.

There are lots of discussions about this, of course. One reasonable example is https://restfulapi.net/rest-put-vs-post/.

It would indeed probably be beneficial to allow a description to be set. I didn't think it necessary as a first pass (and it wasn't in my old wip), but it wouldn't be hard to add. I'll add it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that explanation makes sense. I came across that same page, and the last entry in the comparison table was throwing me off:

Generally, in practice, always use PUT for UPDATE operations. Always use POST for CREATE operations.

But your explanation matches up well with this answer on SO, so thanks for clearing that up for me!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added support for "description" with additional commits.

return "", 201 if new_set else 204


@api_v1.route("/warehouse/identifier-sets/<name>/identifiers", methods = ['POST'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth considering setting an upper limit to n value here?

Copy link
Member Author

@tsibley tsibley Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good call. It's always good to consider what limits are appropriate. I'm not sure what reasonable bound would be though… Here's the current summary stats of batch size for all we've minted so far:

min	     1
mode	 1,040
max	11,900

Stepping back from n, I think what a limit would really seek to prevent is minting reqs that run "forever". We can't guarantee this just through n alone (although very small n might approximate it well enough), so maybe the more appropriate thing would be an explicit time limit. There is some existing limit on request/response duration already at the web server level (both at uWSGI and Apache, I believe), so that may provide enough of a time limit if the corresponding database session is terminated when the request is. But I'm not sure that's the case.

This also means that we may not be able to use this API as-is to mint large batches of identifiers without increasing existing request/response duration limits, possibly to unreasonable levels. An async API would obviate that issue, but then require a minting job queue and polling of job status by the client. (We've talked about this in the past in related convos.)

All of that said, I don't think we currently have a concrete planned use for this minting API endpoint (unlike the other endpoints in this PR). So maybe we just drop it for now and can revive it later after we have a more concrete use to build for?

@seattleflu/core-devs thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to drop it or if possible just disable that route until there's a use case for it. I see that the id3c identifier mint command is used when generating barcode labels, but I'm not familiar enough with all of the workflows to know which ones might need this functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree, reasonable to drop it from this PR. I'd move it to another branch and push that up as a wip, to potentially be picked up by someone else in the future. Would love input from others here too before I do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repushed dropping the minting endpoint.

@tsibley tsibley marked this pull request as ready for review July 2, 2021 18:27
The sole callsite of JsonEncoder, our as_json(), already disallows
these, but I'm about to need to pass JsonEncoder directly to Flask and
want to avoid them there as well.
We want to serialize datetimes as ISO 8601 strings instead of RFC 822
strings, and we don't want to allow floating point NaNs or ±Infinity.

This affects jsonify(), a handy route helper that I'm about to use.
UUIDs show up in our warehouse.identifier table, and I'll soon be
returning API responses for those rows.
Provides a validation endpoint for integrations, such as the BBI LIMS,
to check specimen ids.
I wrote these endpoints over 2 years ago while integrating identifier
minting and barcode label generation, only to end up refactoring the
bulk of it out and generating labels via a CLI command instead.  With a
web interface for minting new identifiers in mind, I kept the code
around as a wip branch.  I thought of it again as we're thinking about
identifier API integration with the BBI LIMS, so I updated the code to
account for changes that happened in the interim.
I empirically checked that Flask does the right thing by only reading
the request body if there's a Content-Length (and only up to
Content-Length bytes).
…hout a body

Better allows endpoints to accept requests with an optional body, like
PUTs to create/update a resource with optional fields.
Optional, to allow for the pattern of ensuring a set exists without
overwriting its description.
Copy link
Contributor

@davereinhart davereinhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. After this is merged, I'll make some minor changes to this code as part of PR #234 to include the new required use column in the identifier_set table.

@tsibley
Copy link
Member Author

tsibley commented Jul 19, 2021

Sounds like a plan, thanks!

@tsibley tsibley merged commit 2ef4a5e into master Jul 19, 2021
@tsibley tsibley deleted the api/identifiers branch July 19, 2021 18:47
@tsibley
Copy link
Member Author

tsibley commented Jul 19, 2021

Deployed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants