API Proposal: Add Async validation #329

almostchristian · 2024-06-28T13:44:44Z

The new validator should include an async API to avoid calls to TaskHelper.Await. During my performance profiling sessions, I found using TaskHelper.Await, or even Task.GetAwater().GetResult() makes the traces chaotic. Also, I feel it has a significant scalability impact when used with network bound resource resolvers.

Proposed new API

The three interfaces use the default interface method implementation for the new API. If the validator does not need to be async (i.e., it doesn't call any async api such as other other validators), it doesn't need to implement the new async method.

public class Validator
{
   Task<OperationOutcome> ValidateAsync(ElementNode instance, string? profile, CancellationToken cancellationToken);

   Task<OperationOutcome> ValidateAsync(Resource instance, string? profile, CancellationToken cancellationToken);
}

public interface IElementSchemaResolver
{
   ValueTask<ElementSchema?> GetSchemaAsync(Canonical schemaUri)
      => new(GetSchema(schemaUri));
}


public interface IValidatable
{
   ValueTask<ResultReport?> ValidateAsync(IScopedNode input, ValidationSettings vc, ValidationState state, CancellationToken cancellationToken)
      => new(Validate(input, vc, state));
}


public interface IGroupValidatable
{
   ValueTask<ResultReport?> ValidateAsync(IEnumerable<IScopedNode> input, ValidationSettings vc, ValidationState state, CancellationToken cancellationToken)
      => new(Validate(input, vc, state));
}

Performance

The async validate API also appears to be faster according to benchmarks:

Method	Resource	Mean	Error	StdDev	Ratio
LegacyValidator	Hippocrates.practitioner	3,438.0 μs	67.57 μs	109.11 μs	1.00
NewValidator	Hippocrates.practitioner	464.2 μs	8.45 μs	16.48 μs	0.14
NewValidatorAsync	Hippocrates.practitioner	170.4 μs	3.34 μs	5.58 μs	0.05

LegacyValidator	Levin.patient	47,261.2 μs	931.72 μs	1,035.60 μs	1.00
NewValidator	Levin.patient	1,009.2 μs	20.03 μs	47.60 μs	0.02
NewValidatorAsync	Levin.patient	733.7 μs	13.67 μs	24.99 μs	0.02

LegacyValidator	MainBundle.bundle	71,836.7 μs	1,367.22 μs	1,342.79 μs	1.00
NewValidator	MainBundle.bundle	5,487.6 μs	116.04 μs	340.33 μs	0.08
NewValidatorAsync	MainBundle.bundle	5,212.4 μs	122.50 μs	361.20 μs	0.07

LegacyValidator	patient-clinicalTrial.xml	37,239.2 μs	673.64 μs	597.17 μs	1.00
NewValidator	patient-clinicalTrial.xml	3,630.5 μs	71.53 μs	147.71 μs	0.10
NewValidatorAsync	patient-clinicalTrial.xml	2,726.8 μs	54.42 μs	113.60 μs	0.07

LegacyValidator	Weight.observation	55,374.9 μs	1,086.31 μs	1,958.84 μs	1.00
NewValidator	Weight.observation	2,121.3 μs	71.24 μs	210.06 μs	0.04
NewValidatorAsync	Weight.observation	1,707.1 μs	33.73 μs	85.87 μs	0.03

The text was updated successfully, but these errors were encountered:

ewoutkramer · 2024-07-01T08:32:53Z

This is interesting, as we initially had an async interface, but due to the fact that many validations (cardinality, string length, patterns, etc) were merely very short computes, the overhead of Task/await on all those nested validates became noticable, so we removed it. I guess it depends on the ratio of I/O bound validations and simple, compute validations. And as we know, async tends to spread everywhere, so we cannot have non-async validates for simple computes, and async validates for i/o bound ones. At least I think so, I'd love to be proven wrong ;-) As it stands, I don't feel like re-introducing async anytime soon after our previous experiences.

almostchristian · 2024-07-02T14:32:05Z

In your previous testing, did you use ValueTask or Task for the validators? I agree that superfluous use of async can hurt performance and it may take testing many use cases to definitively say which one is better for performance (and in the end the answer may be 'it depends'). I think the validator should support both synchronous and asynchronous validation, and it will be up to the end users to validate which is the best one to use for their use case.

For now, the main advantage to using async is that it preserves the traces in flame graphs which makes it easier to diagnose performance issues. In the synchronous version, the occasional calls to TaskHeper.Await changes the stack traces

Flame graph (synchronous)

Flame graph (async)

brianpos · 2024-07-02T23:51:49Z

My QR validator has a hybrid of both approaches.
https://github.com/brianpos/fhir-net-web-api/blob/66e90f72c6adcdf36f436a313e8e886fa61abf9e/src/Hl7.Fhir.StructuredDataCapture/QuestionnaireResponseValidator.cs#L742
It has an async top level, with a List to track any async requestst that are made, which it then waits on before returning.
So anything that has an async result is added to the list and everything else just continues synchronously.
All issues are added into the OutcomeIssues list too, which is what is returned at the end.

almostchristian · 2024-07-17T08:37:21Z

One benefit to an async validtion API is that potentially, validation could take a long time and it is possible that the client has cancelled the validation request. Respecting the cancellation token can help with scalability.

mmsmits · 2024-08-29T08:57:14Z

We think async will lead to too much overhead current, since most validation are so fast that starting up a task will take more time than actually validating the validation blocks.

almostchristian mentioned this issue Jun 28, 2024

Add async validation #330

Open

mmsmits closed this as completed Aug 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API Proposal: Add Async validation #329

API Proposal: Add Async validation #329

almostchristian commented Jun 28, 2024

ewoutkramer commented Jul 1, 2024

almostchristian commented Jul 2, 2024

brianpos commented Jul 2, 2024

almostchristian commented Jul 17, 2024

mmsmits commented Aug 29, 2024

API Proposal: Add Async validation #329

API Proposal: Add Async validation #329

Comments

almostchristian commented Jun 28, 2024

Proposed new API

Performance

ewoutkramer commented Jul 1, 2024

almostchristian commented Jul 2, 2024

brianpos commented Jul 2, 2024

almostchristian commented Jul 17, 2024

mmsmits commented Aug 29, 2024