Skip to content
This repository has been archived by the owner on Jul 19, 2024. It is now read-only.

Checkpoint progressed on internal azure function exceptions #66

Closed
xzuttz opened this issue Jun 4, 2020 · 3 comments
Closed

Checkpoint progressed on internal azure function exceptions #66

xzuttz opened this issue Jun 4, 2020 · 3 comments
Assignees
Milestone

Comments

@xzuttz
Copy link

xzuttz commented Jun 4, 2020

The current EventHubListener.cs implementation progresses the checkpoint if there is an unhandled exception in user code.

See EventHubListener.cs:

public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
var triggerInput = new EventHubTriggerInput
{
Events = messages.ToArray(),
PartitionContext = context
};
TriggeredFunctionData input = null;
if (_singleDispatch)
{
// Single dispatch
int eventCount = triggerInput.Events.Length;
List<Task> invocationTasks = new List<Task>();
for (int i = 0; i < eventCount; i++)
{
if (_cts.IsCancellationRequested)
{
break;
}
EventHubTriggerInput eventHubTriggerInput = triggerInput.GetSingleEventTriggerInput(i);
input = new TriggeredFunctionData
{
TriggerValue = eventHubTriggerInput,
TriggerDetails = eventHubTriggerInput.GetTriggerDetails(context)
};
Task task = TryExecuteWithLoggingAsync(input, triggerInput.Events[i]);
invocationTasks.Add(task);
}
// Drain the whole batch before taking more work
if (invocationTasks.Count > 0)
{
await Task.WhenAll(invocationTasks);
}
}
else
{
// Batch dispatch
input = new TriggeredFunctionData
{
TriggerValue = triggerInput,
TriggerDetails = triggerInput.GetTriggerDetails(context)
};
using (_logger.BeginScope(GetLinksScope(triggerInput.Events)))
{
await _executor.TryExecuteAsync(input, _cts.Token);
}
}
// Dispose all messages to help with memory pressure. If this is missed, the finalizer thread will still get them.
bool hasEvents = false;
foreach (var message in messages)
{
hasEvents = true;
message.Dispose();
}
// Checkpoint if we processed any events.
// Don't checkpoint if no events. This can reset the sequence counter to 0.
// Note: we intentionally checkpoint the batch regardless of function
// success/failure. EventHub doesn't support any sort "poison event" model,
// so that is the responsibility of the user's function currently. E.g.
// the function should have try/catch handling around all event processing
// code, and capture/log/persist failed events, since they won't be retried.
if (hasEvents)
{
await CheckpointAsync(context);
}
}

Unfortunately, internal exceptions in DryIoC are treated as user exceptions, and the user has no way of catching them.

Azure/azure-webjobs-sdk#2432
Azure/azure-functions-host#5240

Fixing the DryIoC exceptions is one thing. There is however no guarantee that such situations will not occur again.

I propose a slighty modified flow, where the user is able to decide whether the checkpoint should be progressed or not, if there is an unhandled user exception. This can be done through a setting in host.json.

I forked the repo and made a pull request to demonstrate. There is also a MyGet package if you want to test. Currently, in the PR, it is hardcoded to not progress the checkpoint.

PR: xzuttz#1

If you are interested in pursuing the proposal I am happy to complete the pull request with any further instructions.

Albeit I am not able to really reproduce the DryIoC exception in a test environment, we can reproduce that DryIoC exceptions, in the function scope, will be treated as user exceptions, by doing the following:

  1. Create a HTTP trigger
  2. Set a breakpoint in FunctionExecutionFeature.cs.
  3. Call the HTTP trigger through its URL
    Visual Studio will stop at the breakpoint in step 2
  4. Create a breakpoint in Container.cs
  5. Push F5 or the "Continue" button i VS to continue the flow
    Visual Studio will stop at the breakpoint in step 4
  6. In Visual Studio, move the pointer to the next line, forcing an exception, and push F5 or" Continue" to continue the flow.

The HTTP trigger will not return.

The logs will say that the DryIoC exception came from the HTTP trigger context, thus the checkpoint will be progressed by the EventHubListener.

@drdamour
Copy link

i was surprised by this behaviour, but stream hub really is suppose to just keep chugging away. retries and an optional poison hub might make sense for certain scenarios. I've found it advances it even if it's more extension type exceptions like json deserialization error in the binding. i'd love to throw those to a poison queue and keep on chugging.

@alrod
Copy link
Member

alrod commented Feb 4, 2022

this can be closed once we merge the PR:
#105

@alrod
Copy link
Member

alrod commented Mar 7, 2022

Closing as fix for DryLoc exception is merged.

@alrod alrod closed this as completed Mar 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants