too many elements cause 414 error #62
Comments
Well... it's really just a looooong list of ways to download that triggers the error, like wayid1, wayid2, ..., wayid999999
@austinhartzheim feel free to look into that. I think we need a way to stop at some point to avoid an endless loop. Maybe we can use this issue to discuss possible solutions. Do you already have an idea? In general I think it's good to provide this kind of abstraction, so that a consumer of osmapi doesn't have to care about URL length limits. Something like a generator might come in handy here. I've seen something similar in the OAI-PMH client implementation of pyoai. Let me know if you want to discuss this in more detail.
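To make the generator idea concrete, here is a minimal sketch of what such an abstraction could look like. It is not osmapi code: `iter_elements`, `fetch_batch`, and `batch_size` are hypothetical names, and `fetch_batch` stands for any callable that takes a list of IDs and returns the matching elements (for example a thin wrapper around a multi-fetch call). Batching by count is only a placeholder here; it shows the lazy shape, not the URL length handling.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def iter_elements(
    ids: Iterable[int],
    fetch_batch: Callable[[List[int]], Iterable],  # hypothetical fetch callable
    batch_size: int = 500,  # assumed default, not taken from osmapi
) -> Iterator:
    """Lazily yield elements, requesting one batch of IDs at a time."""
    iterator = iter(ids)
    while True:
        # Pull the next batch of IDs; an empty batch means we are done.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # Only now is the request for this batch actually made.
        yield from fetch_batch(batch)
```

Because the batches are requested on demand, a consumer that stops iterating early never triggers the remaining requests, which also gives a natural stopping point.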
Excellent. I'm busy with final projects/exams at my university right now but I should have time in late December. If someone else is interested in working on this issue before then, feel free to take it.
Root Cause

I've been looking into this issue and it seems that the URI length limit is not defined in the API server software. Rather, I believe that the limit is imposed by the Apache web server itself. It seems that the length of the HTTP request line is the limiting factor, and Apache limits it to 8190 bytes by default. This is the default value, which has not been set specifically on the servers. (If we wanted to pursue having the value set explicitly on the servers rather than relying on the default, I believe this Chef file would be the location to do it.)

Experimental Verification

The following code shows that a request line of 8190 bytes gives the expected result, whereas a request line of 8195 bytes causes the 414 error we are addressing in this issue:

    len('GET /api/0.6/waysways=') + len(','.join([str(x) for x in range(1, 1854)])) + len(' HTTP/1.1\r\n') # 8190
    len('GET /api/0.6/waysways=') + len(','.join([str(x) for x in range(1, 1855)])) + len(' HTTP/1.1\r\n') # 8195
    api.WaysGet(range(1,1854)) # 404 error - expected
    api.WaysGet(range(1,1855)) # 414 error - not expected

Possible Solutions

Here are some of the most likely solutions.

Discussion

I'm personally leaning towards hardcoding a URI length limit constant, with or without trying to standardize the limit. I believe that the efficiency gains of this approach may be significant. Furthermore, I do not think it is likely that the length limit will be decreased in the future. I'm interested in hearing your thoughts or alternate solutions.
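As a rough illustration of the hardcoded-limit approach, the sketch below splits a list of IDs so that each request line stays within a byte budget. The names `chunk_ids_by_length`, `MAX_REQUEST_LINE`, and `REQUEST_OVERHEAD` are hypothetical. The overhead of 33 bytes is an assumption that simply mirrors the fixed parts counted in the length calculation above (22 for the method and path, 11 for the protocol suffix); a real implementation would need to measure the request it actually sends.

```python
from typing import Iterable, Iterator, List

MAX_REQUEST_LINE = 8190  # Apache default request-line limit discussed above
REQUEST_OVERHEAD = 33    # assumed fixed bytes around the ID list (see lead-in)


def chunk_ids_by_length(
    ids: Iterable[int],
    limit: int = MAX_REQUEST_LINE,
    overhead: int = REQUEST_OVERHEAD,
) -> Iterator[List[int]]:
    """Yield lists of IDs whose comma-joined form fits in `limit - overhead` bytes."""
    budget = limit - overhead
    chunk: List[int] = []
    used = 0
    for element_id in ids:
        text = str(element_id)
        # Every ID except the first in a chunk is preceded by a comma.
        cost = len(text) + (1 if chunk else 0)
        if chunk and used + cost > budget:
            yield chunk
            chunk, used, cost = [], 0, len(text)
        chunk.append(element_id)
        used += cost
    if chunk:
        yield chunk
```

With these defaults, `range(1, 1855)` is split into one chunk holding IDs 1 to 1853 and a second chunk holding only 1854, which lines up with the boundary found in the experiment.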
@austinhartzheim thank you very much for this very thorough analysis of the problem at hand. I have a few things to add:
All these points lead me to the conclusion that I'd prefer a limit with a good default value, that a user of
I like the idea of retrying the request if we see a 414 error. I think a good strategy would be to start at ~8000 bytes. Upon encountering a 414 error, we divide that number in half (~4000 bytes) and retry. If we encounter another 414 error, we divide it in half again to ~2000 bytes. After that, we raise an exception if the request is still not successful. The reason for starting at 8000 is that RFC 7230 recommends that servers support request lines of at least 8000 bytes. The reason for ending at 2000 is that this is what browsers support, so almost every server (unless configured otherwise) is likely to support it. We can also add a configuration option to override the default settings. I'm considering setting the number of retries to zero if the default is overridden (or perhaps we can make that configurable as well). Also, you mentioned using a generator. Do you want the methods to return a generator instead, or should we collect all the results and return them as a list?
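The halving strategy described above could be sketched roughly as follows. Everything here is hypothetical: `UriTooLongError` stands in for however a 414 response would be surfaced, `fetch` is any callable that takes a list of IDs and returns the elements, and for simplicity the limit is applied to the comma-joined ID list rather than to the full request line. This version collects the results into a list.

```python
from typing import Callable, List, Sequence


class UriTooLongError(Exception):
    """Hypothetical exception standing in for an HTTP 414 response."""


def take_chunk(ids: Sequence[int], start: int, limit: int) -> List[int]:
    """Return the longest run of IDs from `start` whose joined length fits `limit`."""
    chunk: List[int] = []
    used = 0
    for element_id in ids[start:start + limit]:  # a chunk never holds more than `limit` IDs
        cost = len(str(element_id)) + (1 if chunk else 0)
        if chunk and used + cost > limit:
            break
        chunk.append(element_id)
        used += cost
    return chunk


def fetch_with_halving(
    ids: Sequence[int],
    fetch: Callable[[List[int]], list],
    start_limit: int = 8000,
    min_limit: int = 2000,
) -> list:
    """Fetch all IDs in chunks, halving the length limit after each 414."""
    limit = start_limit
    position = 0
    results: list = []
    while position < len(ids):
        chunk = take_chunk(ids, position, limit)
        try:
            results.extend(fetch(chunk))
        except UriTooLongError:
            # Give up once halving would drop below the minimum limit.
            if limit // 2 < min_limit:
                raise
            limit //= 2
            continue
        position += len(chunk)
    return results
```

With the defaults, the limit goes 8000, 4000, 2000, and a 414 at 2000 is re-raised to the caller. Chunks that already succeeded are not requested again after the limit is lowered.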
I stumbled upon this library for retrying; it might be handy for this use case.
Attempting to download a large number of elements (say, using the WaysGet method) ends in 'Request-URI Too Long'. It would be nice if osmapi could handle this by completing the request in chunks.