404 vs 403 for unauthorized requests to prevent data leaks #11
-
Should the suggestion to return 404 instead of 403 apply only when the path contains identifiers?
-
I definitely appreciate this perspective; it helps ensure we are not leaking information. This exact question was discussed in the working group last Nov 8, and I really wish we had detailed notes on the reasoning from that time. Unfortunately, the meeting recording is about a month too old and has been removed. This is why I have been encouraging us to use GitHub Discussions going forward, so we can better track and refer back to the results. If I recall correctly, @jwineinger, @eggilbert and @alexander-ivakhnenko were all present and key contributors to that discussion. Hoping you can take a minute to review and share your thoughts here.
From what I recall, one key difference with our API is that access is always authenticated and comes from subscribed or paying organizations. This is not a free API that anyone can sign up for and use, which does limit the scope of bad actors. Nevertheless, that alone is likely a weak argument for or against.
You could also reasonably reverse this, which I believe is how AWS S3 style resource lookups work. Instead of returning a 404 for things you don't have permission to see (like your GitHub repo example), you could return a 403 for everything you don't have permission to see, including resources that do not exist. In some business scenarios this might make more sense to engineers consuming the API: we are telling them they are forbidden from knowing whether something exists, rather than saying it doesn't exist for them, which could be a little easier to interpret.
I think this may also, to a very small degree, be use-case specific, depending on how sensitive the content is. I wonder whether this poses a real risk in our domain? Perhaps there are qualifiers on when to use this pattern, rather than applying it globally?
In your example, can you expand on how knowing that an opportunity exists would be used nefariously? (Not implying that you're wrong at all; I just want to understand the specifics a bit more in the discussion.) This particular line is really helpful in directing towards your suggestion, Koko, as general industry intent:
-
We've talked a lot about 404s, more in the context of "object missing vs. invalid URL", but this part has come up too. For me this fits into the realm of "maybe better for security but worse for functionality, and does it actually make us any safer?" For support, debugging and client functionality, accurate response codes are valuable, so we take a hit by making such a change.
On the other side of the coin: sure, an attacker can use response codes to farm the 'shape' of an API and to enumerate certain identifiers. For the former, we're publishing the spec, so it's not really protected data anymore. For the latter, the Identity example could be "qualified" as "Don't leak PII": if your endpoint would return a 403 or not based on some PII in the query params, then return a consistent response code so the endpoint can't be used to farm for PII.
I do prefer the suggestion to always return a 403 instead of a 404, as we already have a usage collision on 404s from raw HTTP, so I would rather overload 403. To Travis's comment on the nature of our APIs: since all of our APIs are secured, we are at least limited to "existing customers trying to suss out data on other customers" vs. "external bad actors trying to penetrate our systems".
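To make the "consistent response code" idea concrete, here is a minimal sketch. The names `OPPORTUNITY_OWNERS`, `status_for` and `hide_as` are made up for illustration and the store is just an in-memory dict, so treat it as an assumption-laden example rather than anything from our services. The point it tries to show is that the masking code (403 or 404) is a single parameter, while the part that matters is that "missing" and "not yours" share it.

```python
# Hypothetical in-memory store: opportunity ID -> owning account ID.
OPPORTUNITY_OWNERS = {"opp-1": "acct-a", "opp-2": "acct-b"}


def status_for(caller_account: str, opp_id: str, hide_as: int = 403) -> int:
    """Return the HTTP status for an authenticated caller's lookup.

    `hide_as` is the single status used both when the opportunity does not
    exist and when it belongs to another account.
    """
    owner = OPPORTUNITY_OWNERS.get(opp_id)
    if owner == caller_account:
        return 200
    # Unknown ID and foreign ID fall through to the same status code,
    # so the response cannot be used to enumerate identifiers.
    return hide_as
```

With the default, `status_for("acct-a", "opp-2")` and `status_for("acct-a", "does-not-exist")` both give 403; passing `hide_as=404` switches to the other convention without changing what is leaked.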
-
We had a discussion in the ART 3 arch sync today about the return code for a request to an endpoint to retrieve quotes from SF:
GET /v1/opportunity/:opp-id/quotes
The endpoint performs authorization on these requests to make sure the user has access to the quotes.
Our discussion centered on what to return if the user's account doesn't match the account on the opportunity. A 403 makes sense, because the user is not authorized to view that account. However, even returning a 403 leaks some information: it tells the user that an opportunity with that ID exists. A user could use this information to do other nefarious things.
You may have tried to open something on GitHub in a new browser and noticed that it returns a 404.
They are following this same principle: don't leak the information that an entity exists to a user without access.
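As a rough sketch of what this could look like for the quotes endpoint (the `OPPORTUNITIES` dict, the `Response` type and the `get_quotes` function are hypothetical stand-ins, not our real service code):

```python
from dataclasses import dataclass

# Hypothetical in-memory store: opportunity ID -> owning account ID.
OPPORTUNITIES = {"opp-1": "acct-a", "opp-2": "acct-b"}


@dataclass
class Response:
    status: int
    body: str


def get_quotes(caller_account: str, opp_id: str) -> Response:
    """Sketch of GET /v1/opportunity/:opp-id/quotes for an authenticated caller."""
    owner = OPPORTUNITIES.get(opp_id)
    # A missing opportunity and one owned by a different account produce
    # the identical response, so the caller cannot tell the cases apart.
    if owner is None or owner != caller_account:
        return Response(404, "Not Found")
    return Response(200, f"quotes for {opp_id}")
```

Here a caller on `acct-a` asking for `opp-2` gets exactly the same response as when asking for an ID that was never created.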
Identity Service does this too with password resets, failed logins, etc. We make sure the "password is incorrect" errors use the same response as the "email not found" errors, because we don't want to leak whether or not an email is in our system.
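A minimal sketch of that login behaviour, assuming a made-up `USERS` table and `login` function (this is not Identity Service's actual code, and the plain SHA-256 digest is only there to keep the example short; a real service would use a slow, salted password hash):

```python
import hashlib
import hmac
from typing import Tuple

# Hypothetical user table: email -> SHA-256 hex digest of the password.
USERS = {"ann@example.com": hashlib.sha256(b"s3cret").hexdigest()}

# One shared failure response for every kind of login failure.
GENERIC_FAILURE = (401, "Invalid email or password")


def login(email: str, password: str) -> Tuple[int, str]:
    stored = USERS.get(email)
    supplied = hashlib.sha256(password.encode()).hexdigest()
    # When the email is unknown, compare against a dummy digest so both
    # failure paths perform the same comparison.
    expected = stored if stored is not None else "0" * 64
    matches = hmac.compare_digest(supplied, expected)
    if stored is not None and matches:
        return (200, "OK")
    # Wrong password and unknown email are indistinguishable to the caller.
    return GENERIC_FAILURE
```

A wrong password for `ann@example.com` and any password for an unregistered email both return `GENERIC_FAILURE`, so the response does not reveal whether the email is in the system.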
I think we should update https://github.com/SPSCommerce/sps-api-standards/blob/main/standards/authentication.md to reflect this, and make it clear that 404 is probably preferred to 403 in most cases.