Add warning to search response when source parameter has mixed validity #4031

sarayourfriend · 2024-04-04T01:00:51Z

Fixes

Fixes #4030 by @sarayourfriend
Closes #3895 by @obulat

Description

This PR:

Removes the logging we added in "Log when source query parameter contains invalid values" (#3945) for investigating the potential solutions in the linked discussion
Adds a "warnings" key to the search response when the request has mixed validity in the source parameter
Raises a validation error when none of the sources are valid. See the linked discussion for @obulat's reasoning for why this is a necessary and acceptable breakage. I've left a comment in the code explaining it as well.
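
For illustration, the three cases above can be sketched in plain Python. This is a minimal sketch and not the serializer code from this PR: `check_sources`, `SourceCheck`, and `InvalidSourceError` are hypothetical names, and the set of available sources is passed in directly rather than looked up.

```python
from dataclasses import dataclass, field
from typing import Optional


class InvalidSourceError(ValueError):
    """Hypothetical stand-in for the validation error that becomes a 400 response."""


@dataclass
class SourceCheck:
    valid_sources: list
    warnings: list = field(default_factory=list)


def check_sources(raw: Optional[str], available: set) -> SourceCheck:
    # No source parameter at all: nothing to validate, nothing to warn about.
    if raw is None:
        return SourceCheck(valid_sources=[])

    requested = {name.strip() for name in raw.split(",") if name.strip()}
    valid = sorted(requested & available)
    invalid = sorted(requested - available)

    # None of the requested sources exist: reject the request outright.
    if not valid:
        raise InvalidSourceError(f"Invalid source parameter: {raw!r}")

    warnings = []
    if invalid:
        # Mixed validity: keep the valid sources and attach a warning.
        warnings.append(
            {
                "code": "partially invalid source parameter",
                "message": "The source parameter was partially invalid.",
                "invalid_sources": invalid,
                "valid_sources": valid,
            }
        )
    return SourceCheck(valid_sources=valid, warnings=warnings)


result = check_sources("flickr,foo", {"flickr", "wikimedia"})
print(result.warnings[0]["invalid_sources"])  # ['foo']
```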

I am happy with the PR in its current state, but I went back and forth about whether the new warnings key should default to an empty list or be absent entirely when there are no warnings. I went with the empty list because it's the simplest and most consistent. But I could easily argue that having it not be present at all is also totally reasonable. I could also see wanting to prefix the key (_warnings) or nest it in some meta object on the response. I'm open to any suggestions on this.

I originally started with a more complex idea for this PR, to have a middleware that would add warnings to the responses based on a list of warnings set onto the request context. That would make it so any endpoint could (theoretically) easily set a warning on any request, so long as it had some way of accessing the request. This was inspired by Django's messages utility used to flash warnings in rendered HTML pages. DRF does not have an equivalent, and existing libraries for it do something totally different than what I wanted. All of that is way more complex than we need for this specific issue, though, so I chose to go for a more direct approach. If we find other use cases for the warnings, we can evaluate whether a more generic solution is appropriate.
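
To make the shape of that discarded idea concrete, here is a framework-free sketch. It is not Django middleware and not code from this PR: `with_warnings`, the `request.warnings` attribute, and the toy `search` handler are all hypothetical, and the response is modelled as a plain dict.

```python
from types import SimpleNamespace


def with_warnings(handler):
    """Wrap a handler so that warnings collected on the request end up in the response."""

    def wrapped(request):
        # Give every request its own list that any code with access to the
        # request can append to.
        request.warnings = []
        body = handler(request)
        if request.warnings:
            # Only add the key when something was actually reported.
            body = {"warnings": list(request.warnings), **body}
        return body

    return wrapped


@with_warnings
def search(request):
    # Toy handler: reports a warning for one hard-coded unknown source.
    if "foo" in request.params.get("source", "").split(","):
        request.warnings.append({"code": "partially invalid source parameter"})
    return {"results": []}


response = search(SimpleNamespace(params={"source": "flickr,foo"}))
```

The generic version needs a hook that runs for every endpoint and a convention for where warnings live on the request, which is the extra machinery the direct approach avoids.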

Testing Instructions

Evaluate the changes and confirm the tests sufficiently cover the new cases. I've gone with additional integration tests rather than testing at the serializer level because there are three places that need to work together (i.e., to be "integrated") for this to work, so unit testing just the serializer would be insufficient, and would duplicate any meaningful testing at the integration level.

Run the application locally using just api/up and visit the search endpoint. Evaluate the following scenarios:

No parameters: empty warnings list
Only valid parameters: empty warnings list
Mixed validity of the source parameter: the new warning
Only bad parameters: 400 response

Checklist

My pull request has a descriptive title (not a vague title like Update index.md).
My pull request targets the default branch of the repository (main) or a parent feature branch.
My commit messages follow best practices.
My code follows the established code style of the repository.
I added or updated tests for the changes I made (if applicable).
I added or updated documentation (if applicable).
I tried running the project locally and verified that there are no visible errors.
[N/A] I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

dhruvkb

LGTM! I have strong feelings about this one comment (the 2nd one below) and would definitely like to see that fixed but it felt wrong to block a PR that works over a change in the tests.

dhruvkb · 2024-04-04T04:20:54Z

api/api/serializers/media_serializers.py

+                        "code": "partially invalid source parameter",
+                        "message": "The source parameter was partially invalid.",
+                        "invalid_sources": invalid_sources,
+                        "referenced_sources": valid_sources,


The name "valid" feels clearer to me than "referenced".

Suggested change

"referenced_sources": valid_sources,

"valid_sources": valid_sources,

I found juggling the list of valid and available sources confusing. Usually when I see an error or warning about invalid values, the valid values listed are all the possible valid values, if that makes sense? I'm torn, so I'll wait and see what the other reviewer says, and change it if they want it changed as well, if that's okay.

"valid", "invalid" and "available"/"all" potentially. But yeah, let's allow one more review to see what they think.

referenced did not feel clear to me either, but I see what Sara's saying as well.

Catching up on the linked discussions, Dhruv mentioned it seems like that is the most common way of handling cases where the input is partially acceptable and can generate a valid response. Are you referencing something in particular you can link to? I'm not familiar with this type of response.

Along those lines, is the shape of this response following some established convention, or could it be changed? For example, do we need to explicitly list which of the provided sources were valid at all, or can we just have invalid_sources and available_sources? Or could this information all be spelled out in the "message" instead of in separate named fields?

Or could this information all be spelled out in the "message" instead of in separate named fields?

I actually originally had the warnings just be a list of strings, but I found it hard to do a meaningful test without just duplicating the string almost word-for-word in the test case 😰 On top of that, because of using sets instead of lists, the order of sources in the strings was non-deterministic, making it even harder to test against a simple string.
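
The testing problem described here can be sketched as follows. The helper names are hypothetical and this is not the test code from the PR; it only illustrates why a structured warning is easier to assert against than a string built from sets.

```python
def warning_as_string(invalid: set, valid: set) -> str:
    # Joining a set directly: the iteration order of a set of strings is not
    # guaranteed and can differ between runs, so the exact text can vary.
    return (
        "Invalid sources: " + ", ".join(invalid)
        + ". Valid sources: " + ", ".join(valid) + "."
    )


def warning_as_dict(invalid: set, valid: set) -> dict:
    # Structured form: each piece of information is a separate field.
    return {
        "code": "partially invalid source parameter",
        "message": "The source parameter was partially invalid.",
        "invalid_sources": list(invalid),
        "valid_sources": list(valid),
    }


warning = warning_as_dict({"foo", "bar"}, {"flickr"})

# Order-insensitive assertions are straightforward on the structured form.
assert set(warning["invalid_sources"]) == {"foo", "bar"}
assert warning["valid_sources"] == ["flickr"]
```

With the string form, a test would have to either copy the message text or parse it back apart before comparing.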

We should use whatever format here we want. I've included in the documentation for the response field that it is meant to be human readable rather than read by a machine, and that the contents of each dict are not stable.

Maybe discarded_sources, kept_sources, and available_sources? 🤷 whatever folks want here, happy to change it, I am not attached to any specific language, even if I found something or other personally confusing. I think it will get the idea across that something isn't right about the parameter on the request and that the developer needs to take a closer look at it.

Which also makes me wonder whether the warnings should go first in the JSON, rather than at the end? On a page of 20 results, I don't know whether it's easier to miss at the front or end of the document.

Are you referencing something in particular you can link to?

I didn't keep a record of my search when looking for a good pattern but I went through my browser history and found these references.

https://www.mediawiki.org/wiki/API:Errors_and_warnings

https://discuss.jsonapi.org/t/multi-status-responses-partial-success-in-particular/30/3

To be clear, this is not an established convention. It's the simplest, backwards compatible way I could think of to stick with a 200 OK status code but also convey problems in their input to the user.

Gotcha! And thanks for the links -- I wanted to make sure I wasn't suggesting deviating from some widely accepted pattern :)

I actually originally had the warnings just be a list of strings, but I found it hard to do a meaningful test without just duplicating the string almost word-for-word in the test case 😰 On top of that, because of using sets instead of lists, the order of sources in the strings was non-deterministic, making it even harder to test against a simple string.

Dang, that makes sense. One final suggestion -- what if we moved just the link to the available sources into the message? So the warning could be something like:

{
  "code": "partially invalid source parameter",
  "message": "The source parameter was partially invalid. For a list of available sources, see http://localhost:50280/v1/images/stats",
  "invalid_sources": ["foo"],
  "valid_sources": ["flickr"]
}

I think that would fix the problem with testing but make it a little clearer.

Which also makes me wonder whether the warnings should go first in the JSON, rather than at the end? On a page of 20 results, I don't know whether it's easier to miss at the front or end of the document.

+1 for putting it first in the JSON, now you mention it.

api/test/integration/test_media_integration.py

stacimc

This all tests well for me, approved! I did +1 to your suggestion to move the warnings up in the JSON and had one more suggestion for the names, but not a blocker -- up to you :)

sarayourfriend · 2024-04-08T01:09:13Z

I've moved the warning to the top, made it only appear when there is actually a warning (otherwise it's kind of ominous looking on requests that don't have issues, plus it's a few bytes over the wire that we can save on those responses). I also updated the warning dict to match Staci's suggestion.
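
A rough sketch of that final response shape, assuming the body is assembled as an ordinary dict. `build_search_response` and the `result_count` and `results` field names are assumptions for illustration, not the code merged in this PR.

```python
import json


def build_search_response(results: list, warnings: list) -> dict:
    body = {}
    # Python dicts preserve insertion order, so adding "warnings" first keeps
    # it at the top of the serialized JSON. It is omitted when empty.
    if warnings:
        body["warnings"] = warnings
    body["result_count"] = len(results)
    body["results"] = results
    return body


with_warning = build_search_response(
    results=[{"id": "abc"}],
    warnings=[{"code": "partially invalid source parameter"}],
)
without_warning = build_search_response(results=[{"id": "abc"}], warnings=[])

print(json.dumps(with_warning, indent=2))
```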

Add warning to search response when source parameter has mixed validity

35d00e3

sarayourfriend requested a review from a team as a code owner April 4, 2024 01:00

sarayourfriend requested review from krysal and stacimc April 4, 2024 01:00

github-actions bot added the 🧱 stack: api Related to the Django API label Apr 4, 2024

dhruvkb approved these changes Apr 4, 2024

Use fixture source list

caddd9e

obulat removed the 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work label Apr 5, 2024

stacimc approved these changes Apr 5, 2024

sarayourfriend force-pushed the remove/invalid-source-logger-warning branch from d6f14eb to 861e946 Compare April 8, 2024 01:45

Move warnings to top and use clearer message

4eea42f

sarayourfriend force-pushed the remove/invalid-source-logger-warning branch from 861e946 to 4eea42f Compare April 8, 2024 01:58

sarayourfriend merged commit e0e0e27 into main Apr 8, 2024
41 checks passed

sarayourfriend deleted the remove/invalid-source-logger-warning branch April 8, 2024 02:30
