-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate household
endpoints to new API structure
#2038
base: master
Are you sure you want to change the base?
Conversation
|
||
@household_bp.route("/<household_id>", methods=["GET"]) | ||
@validate_country | ||
def get_household(country_id: str, household_id: str) -> Response: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion, blocking I don't like that country_id is an implicit requirement that has to be fulfilled when you attach the blueprint by adding it to the prefix. Honestly, I'm surprised that works.
I would expect you to need to explicitly declare the URL parameters you need in the blueprint some way.
- question: I'm not sure what pyflask intends you to do here... the documentation is pretty thin. This feels like a side effect, not a feature though. Do you have a better reference?
- suggested immediate solution: With that caveat, I suggest just using the full path here: "/<country_id>/<household_id>" and then attaching the blueprint to the "/" app.
- This is explicit (you can see where country_id is coming from and can't leave it out) - yay
- This means you now register overlapping blueprints for anything that needs "country_id". If you accidentally define two with the same structure, pyflask says nothing and just uses whatever gets registered second - boo.
- discussion for next iteration I wonder if this is telling us we should flatten our REST API structure. Nesting tends to be an anti-pattern.
- In this case you could have /countries/<country_id> and /households/<household_id>. where household_id is globally unique. After all.. we know which country the household is in, why are we making the customer remember and restate it?
- This addresses the current problem, but generally this makes for cleaner and unambigous URLs: is the path "/one/two" referring to a country "one" and a household with ID "two", a top level resource "one" and and a sub resource called "two", or a country called "one" for which I want the sub resource called "two
- Which in turn solves the problem above of overlapping paths (not coincidentally)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from, and you're right, the documentation on this usage is pretty thin. I would not be against migrating to the format you've described. I think that no matter what format we choose, we're operating in a less than ideal environment, since literally every route requires a country ID value.
I also agree that there's probably a benefit to flattening the API structure overall in the long-run. However, this introduces a couple of important questions. For example, we have one or two triple-nested endpoints (e.g., <country_id>/household/<household_id>/over/policy/<policy_id>
or something similar), and we'd have to consider a proper implementation for them.
Additionally, this form of migration is obviously breaking, and because the API wasn't designed with a version marker in its route, our only way of migrating is inherently breaking - we must make changes to the API, break it temporarily, then update the front-end app.
Finally, there are cases of routes where we definitely need the country ID. Metadata is obviously one, but that's not difficult to implement within a flattened API structure. The harder-to-implement routes are those enqueuing society-wide calculation.
My proposed solution here would be to shift to the format you've recommended (and also update the other blueprints added recently for AI and economy endpoints), then engage in a longer discussion regarding flattening the structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR
agree.
Less than ideal paths
Yeah. As structured now, it's awkward.
flattening that example
Yeah I mean you could go a bunch of ways there.
The Computation is the Entity
IMO we're modeling a request to compute a value based on inputs and modeling this as a POST to a computations entity actually ends up being a lot more flexible/useful than treating this as a child of a household and a policy.
Create a computation
POST /computations/
{
household_id=...
policy_id=...
}
returns
a link to /computations/[computation_id]
GET the result
GET
/computations/[computation_id]
{
status="..."
inputs= {
household
policy
...
}
data={...}
}
Caching
Post simply redirects to an existing computation instead of creating a new one.
In fact the "computation_id" could just be an amalgam of the input values (I.e. [household_id]&[policy_id]) in which case it's easy to map to a cached value.
Searching
GET /computations?country=...
GET /computations?policy_id=...
GET /computations?household_id=...
Pros:
- Puts the one true place for finding computations of all kinds under /computations
- on the implementation side I don't need to implement partial queries under household, population, country, etc.
- on the client side I can ask global questions with one query in one place.
- Simplifies different ways of asking for the same thing: I can easily support both household ID and direct data as input for household without needing another path, just one POST with flexible input.
- Can all go into a completely encapsulated blueprint that lives under an unambiguous URL.
Cons:
- Client slightly more complicated in that we have POST then GET poll instead of just poll.
The household (or the policy, or both) is the entity
you could do /households/[id]/computations
and then follow the same model as above
This is less flexible though because
- I have to have a saved household with a household_id to run the computation (what if I just want to pass the household data directly?)
- If I want to search for computations across households I have to support that with another endpoint under a different resource.
- I probably still want to be able to search for computations under the household so now I have to duplicate some of that global search under the household.
Also.. why is the household the entity instead of the policy? I mean it seems reasonable to want to search for all simulations under a policy right? So now do I implement both or pick one, etc?
|
||
# Ensure that household ID is a number | ||
try: | ||
household_id = int(household_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion/nonblocking I can't find in the flask docs where this is documented (!) but apparently you can use "int:household_id" in the path to do basic validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that would make life easier. Let me see if I can find in logs, curious to see if there are more advanced features so that we could further cut down on validation code.
) | ||
else: | ||
return Response( | ||
json.dumps( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question why does the metadata endpoint return just a dict while others return {status,message,result}? are we trying to align on one standard, which one and why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we've ever had a standard here - definitely time for one. Let me work up what I think makes sense, gather your feedback, then implement.
) | ||
|
||
|
||
class TestGetHousehold: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit/nonblocking: I have a mild preference for testing the blueprint (I.e. specifically wrap the blueprint in an app and test it) instead of the full app for all tests. Why tightly couple all that stuff if you don't need to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Previously, we had no blueprints, and some of our earliest tests were merely reactions to particular calculations failing, and thus were essentially targeted e2e's or integration tests. Moving forward, I think that would make a lot of sense.
mock_row = MagicMock(spec=LegacyRow) | ||
mock_row.__getitem__.side_effect = lambda x: {"id": 1}[x] | ||
mock_database.query().fetchone.return_value = mock_row |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitt/nonblocking I tend to prefer a test support method (I would usually call this a fixture, but that may have specific technical implications here) like given_any_query_will_return(...)
instead of mock-object-ese in the test. It makes for a cleaner read and if you change the way we query the DB you just need to change the function once, not every build test we wrote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to chat with you about that. I'm new to the world of mock objects and fixtures, frankly, so curious to hear more during our meeting Wednesday.
assert b"Label must be a string or None" in response.data | ||
|
||
|
||
class TestUpdateHousehold: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussion/non-blocking I like the way you have organized these tests with a class per operation and clearly named tests.
Have you looked at pytest-describe or similar ways of removing some of the boilerplate/repetitive aspects of the test syntax (I haven't used that specifically, but I've used similar frameworks in javascript and ruby)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not, but will
assert response.status_code == 200 | ||
assert data["status"] == "ok" | ||
assert data["result"]["household_id"] == "1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question/maybe blocking Does this actually verify an update happened? I'm expecting a verify of some kind confirming the SQL update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only in the sense that the API returns a 200 when an update occurs, while it returns a 201 when record creation occurs, but you're right, this test lacks checking beyond that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion validate the query unless it's a ton of work to do.
|
||
|
||
# Service level tests | ||
class TestHouseholdService: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question/nitt/nonblocking
The nitt - For the most part in most code bases there is one file per class under test. Usually it's under a directory matching the class file.
I would for that reason expect this to live in something like tests/.../services/test_household_service.py
I personally find this convention a bit confusing because I need more context to know where to look to find the appropriate test.
The question - do you currently have a convention this fits neatly into and/or do Python projects do that differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a PR open to begin to migrate to a convention, because I agree, this method is basically a free-for-all. That said, looking forward to discussing it with you Wednesday.
household
endpoints to new API structure
Hey Mike, Thanks again for your review on this. Here's what I'd like to propose as a roadmap forward for both this PR and the PR migrating
|
I approve of your plan. I have added some more discussion above, but I don't want to block this work on it. |
…pdate tests to mirror that behavior
eb430a1
to
a3a78fc
Compare
Fixes #1988.
Fixes #2034.
This PR migrates the household endpoints to the new API structure.