Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Status code 400 (bad request) if more than 1000 IDs are posted to the chemical annotation service #88

Open
erikyao opened this issue Nov 5, 2020 · 6 comments
Assignees

Comments

@erikyao
Copy link
Contributor

erikyao commented Nov 5, 2020

API involved:

Per the document:

ids: Required. Accept multiple chemical ids separated by comma, e.g., “ids=SDUQYLNIPVEERB-QPPQHZFASA-N,SESFRYSPDFLNCH-UHFFFAOYSA-N,SHGAZHPCJJPHSC-ZVCIMWCZSA-N”. Note that currently we only take the input ids up to 1000 maximum, the rest will be omitted.

However if I have more than 1000 IDs, I'll receive a 400 bad request from server.

MWE:

import requests

drugbank_ids = [str(i) for i in range(1001)]  # 1001 fake IDs

url = 'http://mychem.info/v1/chem'
data = {"ids": ",".join(drugbank_ids)}
response = requests.post(url, data=data)

# raise an HTTPError if the HTTP request returned an unsuccessful status code 
response.raise_for_status()

# HTTPError: 400 Client Error: Bad Request for url: http://mychem.info/v1/chem
@namespacestd0
Copy link

namespacestd0 commented Nov 5, 2020

Advise updating the documentation to indicate in fact we reject requests of more than 1000 queries per request.

@kevinxin90
Copy link
Contributor

@namespacestd0 @newgene

I didn't recall we have any discussion on this before making this change, which is a code-breaking one.

From my perspective, I don't think either implementation is wrong (return first 1000 vs return 400 error) as long as they are clearly documented. And given that the old implementation has been there for a long time, I don't think it's a good idea to make this change. Instead, I think it's better just to keep the old implementation, and embed a message in the response to remind the user.

I also noticed that the behavior of the size parameter was also changed. The max limit of size is 1000. Previously, if you set size > 1000, e.g. 2000, it will still return 1000 docs. But now it becomes a 400 error.

@zcqian
Copy link
Contributor

zcqian commented Nov 6, 2020

Personally I’d lean towards the 400 error. I know this will break some people’s code, but I guess it’s better when someone didn’t read the documentation and start scratching their head when they see some results missing.

It would be better if the response body contains a more detailed explanation of what part of the client request was “bad”.

@kevinxin90
Copy link
Contributor

As I mentioned above, you still can embed a message in the response (with only 1000 hits), which shows why only 1000 returns and how to retrieve the rest. This is how most APIs implement their pagination mechanisms.

@zcqian
Copy link
Contributor

zcqian commented Nov 6, 2020

My worry is that because people may be only programmatically reading specific fields from the response instead of manually inspecting the response, it’s likely that they won’t see the message.

@namespacestd0
Copy link

Working on a solution to differentiate strict parameter processing vs forgiving processing, with the possibilities to add warning messages in the response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants