-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
url-encode parameters URL paths and the fallback_id header #55
Conversation
- strings passed in to request(data=some_str) get encoded as latin-1 by default. Data passed in as dictionaries to `json` or `params` are encoded to utf-8 by default. So encode the XML. - forms.py: tidy up docstring wrap length to 90 chars - submissions.py: - add encoding parameter, default utf-8, for encoding the XML, for submission.create and submission.edit - move example submission edit XML from private ._put method to public .edit method so users can see it. - test_client.py: add integration tests for non-ascii form_ids and instance_ids, incl. check for retrieval after create/update.
@matthew-white submission XML also needed explicit encoding to utf-8 since it was defaulting to latin-1. I've fixed that in the PR but came across another issue. I'm wondering if this is a bug in pyODK (in the PR), or a bug in Central, or just undefined behaviour. I created a submission and edited it a couple of times (the below XML is I think the 3rd edit). I can see the submission in the Central, but when I click the "edit" link for that submission I get an error 404.1 shown in the UI as a red toast box "Could not find the resource you were looking for.". client = Client()
xml = """
<data id="'=+/*-451%/%" version="1">
<meta>
<deprecatedID>~!@#$%^&*()_+=-✘✘</deprecatedID>
<instanceID>~!@#$%^&*()_+=-✘✘✘</instanceID>
</meta>
<fruit>Papaya</fruit>
<note_fruit/>
</data>
"""
client.submissions.edit(
xml=xml, form_id="'=+/*-451%/%", instance_id="~!@#$%^&*()_+=-✅✅"
) Form URL https://staging.getodk.cloud/#/projects/1/forms/'%3D%2B%2F*-451%25%2F%25 Submission URL Expected - the edit was submitted with this: Submission URL Actual - partially encoded? |
Is that when you click the "More" link or the edit button (the pencil icon)? I see error messages in both cases, but I only see "Could not find the resource…" after clicking "More". Clicking the "More" link is definitely running into a bug with the Central API. In most cases, Central doesn't assume that submission instance IDs take any particular form. However, for requesting an individual submission over OData, Central assumes that the instance ID is a UUID (either with or without the Clicking the edit button, I see the following error in Enketo: "Record not present. It may have expired." I see one request that resulted in an error, to the URL https://staging.getodk.cloud/-/submission/HYP8U3AqTpcUsPP8oE7eGEHeJG8k4N7?enketoId=HYP8U3AqTpcUsPP8oE7eGEHeJG8k4N7&instanceId=~!%40. %40 is @, so it looks like the first 3 characters of the instance ID are sent, then the ID is truncated at #. I'm guessing that # isn't encoded, so it's treated as an actual URL hash, and everything after it is ignored. @lognaturel, let me know if you'd like me to file an Enketo issue about this. Also, @lognaturel, could you confirm that I'm right in thinking that in practice, almost all instances IDs are UUIDs? Is there any chance it'd make sense to make that part of the XForms spec? Or are we intentionally leaving open the possibility that we could do something different in the future?
I think this one is OK actually! Those characters are allowed to be encoded, but they don't have to be. From the MDN article about
|
I was looking at this a little more. It seems like the JavaScript function The Central API and frontend both use
This paragraph also seems relevant to me:
I think I'll leave it here unless @lognaturel wants me to look into this further. Mostly I just wanted to clarify my previous comment. It looks like pyODK might be encoding according to RFC 3986, which should work well with Central. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look good. I made sure the edits make sense. I did not spend time looking for potential missed opportunities to encode.
I don’t think it had previously been a hard requirement anywhere so it would be a restriction for Central only. I’m not sure it belongs in the spec? At the same time, I think it is a de facto standard. |
Closes #53
Closes #54
What has been done to verify that this works as intended?
New unit tests for the encoding functions on Session, and mocked integration test on client.forms.update.
Why is this the best possible solution? Were any other approaches considered?
Similar to Central frontend behaviour. In terms of implementation, 1) could have used a custom string class but that seemed over the top, or 2) could have put the functionality in the URL classes, but it seemed like it might be useful to put on Session so that users that are using client.get (or other verbs) can use it easily as well.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
Supports use cases where form_id or instance_id contains non-ASCII characters.
Do we need any specific form for testing your changes? If so, please attach one.
Required forms are in the repository under tests/resources/forms.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
Probably not needed, unless there is a gotcha when interacting with Central
Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passpython bin/pre_commit.py
to format / lint code