Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quote event_name when raising event #435

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

waltervos
Copy link

This should fix Azure/azure-sdk-for-python#30161 by using urllib's quote function on the event name.

quote event name when raising event
@waltervos waltervos marked this pull request as ready for review May 1, 2023 17:57
Copy link
Collaborator

@cgillum cgillum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks good to me, though we might want to apply it to instance_id as well. I'll defer to @davidmrdavid for official sign-off and merge.

@waltervos
Copy link
Author

waltervos commented May 2, 2023

I've, sort of, tested this by having an orchestrator wait on an event Event/1 and raising that event quoted (Event%2F1). No 404's then, but the event is delivered to the orchestration without unquoting it first. What did work was waiting on Event%2F1 and raising Event%2F1. In other words: this approach can work, but only if the event name is delivered to the orchestration un-quoted.

I would also like to add that as far as I can tell, the Durable Task framework itself sets no restrictions on what an event name can be. And since (again, as far as I can tell) it doesn't go through HTTP like the Python implementation does this would not be an issue there.

@waltervos waltervos requested a review from cgillum May 2, 2023 18:44
@davidmrdavid
Copy link
Collaborator

Thanks @waltervos for your contribution,

I would like us to test this in a few ways before merging this change and, from your latest comment, it seems that using an event named Event/1 would be a good first input to use for testing.

Just to confirm - you're saying that, in order to make this change work end to end, we'd need to unquote the eventName when received by the orchestrator, is that correct?

I think it might be simpler just to simply quote the name argument provided to wait_for_external_event here:

def wait_for_external_event(self, name: str) -> TaskBase:
"""Wait asynchronously for an event to be raised with the name `name`.
Parameters
----------
name : str
The event name of the event that the task is waiting for.
Returns
-------
Task
Task to wait for the event
"""
action = WaitForExternalEventAction(name)
task = self._generate_task(action, id_=name)
return task

Any chance you could try adding that to your PR and let us know if it works as expected on your end? If it does work end to end, and removes any inconsistencies in how the event name is written in the client and the orchestrator, then I'll do some tests on my end and finally look to merge. Thanks!

@waltervos
Copy link
Author

@microsoft-github-policy-service agree company="NS"

@waltervos
Copy link
Author

waltervos commented May 3, 2023

Thanks @waltervos for your contribution,

I would like us to test this in a few ways before merging this change and, from your latest comment, it seems that using an event named Event/1 would be a good first input to use for testing.

Just to confirm - you're saying that, in order to make this change work end to end, we'd need to unquote the eventName when received by the orchestrator, is that correct?

I think it might be simpler just to simply quote the name argument provided to wait_for_external_event here:

def wait_for_external_event(self, name: str) -> TaskBase:
"""Wait asynchronously for an event to be raised with the name `name`.
Parameters
----------
name : str
The event name of the event that the task is waiting for.
Returns
-------
Task
Task to wait for the event
"""
action = WaitForExternalEventAction(name)
task = self._generate_task(action, id_=name)
return task

Any chance you could try adding that to your PR and let us know if it works as expected on your end? If it does work end to end, and removes any inconsistencies in how the event name is written in the client and the orchestrator, then I'll do some tests on my end and finally look to merge. Thanks!

Hi David,

I've quoted the event name in wait_for_external_event now as well. That makes it work. However, let's say I wanted to log the event name(s) that completed after the wait is finished then it would say something like "Event%2F1" is finished.

I don't think this PR should be merged, really, it was meant as a conversation starter mostly. I'm curious to hear your thoughts.

Edit, to clarify: I don't think this PR should be merged as is, because it looks a bit hacky to me. There are probably many more places where quoting a user supplied string might be appropriate (orchestration instance IDs probably suffer from the same behaviour). When I made the initial PR, my hope was that the "server" in Azure Durable Functions would unquote the quoted string as built-in behaviour, but it seems that's not really how it works.

@davidmrdavid
Copy link
Collaborator

Hi @waltervos,

Thanks for trying out my suggestion, at least it's good to know that it works, albeit with some inconsistencies around logging.

Here's my thoughts: I agree that, for simplicity's sake, the Durable Functions "server" (which we call the "extension") would manage the character-escaping logic.

However, there can be "extra" serialization points between a Durable Functions SDK (in this case, the Python SDK), and the extension. The language worker process (the component that invokes your code, which in turn invokes the DF SDK) may do some serialization of its own, for example, and there's differences in serialization settings depending on the programming language in play. So, by the time we get to the extension, it can be hard to safely modify user inputs, such as the external event name, in a way that's language agnostic. That's why my first instinct is to handle this in the language SDKs themselves, and to just let the extension be oblivious the character escaping.

I hope that makes sense. Happy to discuss this further.

@waltervos: you mentioned logging the event that was raised and wanting to get the unescaped text. I think we could make that work. However, I'd need to know how you were thinking of getting the event name that was raised. It seems you wanted to get it from the resulting task. Can you please clarify in what way? If I'm not mistaken, the wait_for_external_event task does not have a public property with its event name.

@waltervos
Copy link
Author

@waltervos: you mentioned logging the event that was raised and wanting to get the unescaped text. I think we could make that work. However, I'd need to know how you were thinking of getting the event name that was raised. It seems you wanted to get it from the resulting task. Can you please clarify in what way? If I'm not mistaken, the wait_for_external_event task does not have a public property with its event name.

Hi David, that would be the AtomicTask's id property. It's set as the event name.

There's a whole other way to look at this issue: The exception message coming from DurableOrchestrationClient.raise_event() is inaccurate. It always says f"No instance with ID {instance_id} found." in case of a 404 but as I've demonstrated, there are other reasons for a 404 as well.

Looking at the code, I'd says it's missing a layer of abstraction. raise_event deals with 404's directly (and naively ;-) ) and it would benefit from another more generic function that deals with HTTP responses. That function could raise the exception with the error message that came from the API response. Alternative mappings of HTTP code to Exception message ("In case of a 410 say f"Instance with ID {instance_id} is gone: either completed or failed"") could be accepted as well if required. I'm guessing these exception messages are now hardcoded because the ones coming from the API aren't always satisfactory.

With a better error message, the Azure Durable Functions user that I am might have more easily identified the problem. And then I have a choice: I can just quote() my event IDs, or make sure I supply something that's always URL safe.

@waltervos
Copy link
Author

Any thoughts, @davidmrdavid ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A 404 from raise event is inaccurately described as instance not found
3 participants