Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat datetime values as Strings during json deserialization in Json… #562

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion data/SampleData/Json/ExamplePatient.json
Original file line number Diff line number Diff line change
Expand Up @@ -8,5 +8,6 @@
"1234-5679"
],
"Gender": "M",
"DOB": "20010110"
"DOB": "20010110",
"TimeOfDeath": "2023-07-28T01:59:23.388-05:00"
}
3 changes: 2 additions & 1 deletion data/Templates/Json/ExamplePatient.liquid
Original file line number Diff line number Diff line change
Expand Up @@ -50,5 +50,6 @@
"managingOrganization": {
"reference": "Organization/2.16.840.1.113883.19.5",
"display": "Good Health Clinic"
}
},
"deceasedDateTime": "{{ msg.TimeOfDeath }}"
}
Original file line number Diff line number Diff line change
Expand Up @@ -40,5 +40,6 @@
"managingOrganization": {
"reference": "Organization/2.16.840.1.113883.19.5",
"display": "Good Health Clinic"
}
},
"deceasedDateTime": "2023-07-28T01:59:23.388-05:00"
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,14 @@

using System;
using System.Collections.Generic;
using System.Data.Common;
using System.IO;
using System.Text.RegularExpressions;
using System.Threading;
using DotLiquid;
using Microsoft.Health.Fhir.Liquid.Converter.Exceptions;
using Microsoft.Health.Fhir.Liquid.Converter.Models;
using Microsoft.Health.Fhir.Liquid.Converter.Processors;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using Xunit;

Expand Down Expand Up @@ -362,7 +362,7 @@ public void GivenJObjectInput_WhenConvertWithJsonProcessor_CorrectResultShouldBe
{
var processor = _jsonProcessor;
var templateProvider = new TemplateProvider(TestConstants.JsonTemplateDirectory, DataType.Json);
var testData = JObject.Parse(_jsonTestData);
var testData = JsonConvert.DeserializeObject<JObject>(_jsonTestData, new JsonSerializerSettings() { DateParseHandling = DateParseHandling.None });
var result = processor.Convert(testData, "ExamplePatient", templateProvider);
Assert.True(JToken.DeepEquals(JObject.Parse(_jsonExpectData), JToken.Parse(result)));
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,5 +40,6 @@
"managingOrganization": {
"reference": "Organization/2.16.840.1.113883.19.5",
"display": "Good Health Clinic"
}
},
"deceasedDateTime": "2023-07-28T01:59:23.388-05:00"
}
Original file line number Diff line number Diff line change
Expand Up @@ -4,15 +4,34 @@
// -------------------------------------------------------------------------------------------------

using System;
using EnsureThat;
using Microsoft.Health.Fhir.Liquid.Converter.Exceptions;
using Microsoft.Health.Fhir.Liquid.Converter.Extensions;
using Microsoft.Health.Fhir.Liquid.Converter.Models;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace Microsoft.Health.Fhir.Liquid.Converter.Parsers
{
public class JsonDataParser : IDataParser
{
private static readonly JsonSerializerSettings DefaultSerializerSettings = new JsonSerializerSettings()
{
DateParseHandling = DateParseHandling.None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dustinburson @pallar-ms This is technically a breaking change for anyone using the OSS code. But I'd like to address it here as its a bug. A couple of ideas:

  • Bump the major version of the oss code to account for the breaking change
  • Keep the existing behavior in OSS, and create different implementations in PaaS products which set this value to DateParseHandling.None.

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at how this is defined/used in the FHIR service. I am concerned this won't allow us to change the behavior per request at the FHIR service level.

I am not concerned about a breaking change in the OSS. We can do the necessary version update as you said and add documentation if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative is to just create a StrictJsonDataParser & StrictJsonProcessor. This would preserve backwards compat but allow us to define which parser we use in the API calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Fhir Service pulls out the correct IFhirConveter at request time from a map.

After this change the map will hold the 'fixed' datatime logic. The plan is to create a new JsonParser with a DateTimeFormattingJsonDataParser supplied to it and store that in the ConvertDataEngine/FhirServer. This parser will perform the current behavior. Based on the incoming request either use this parser or pull from the map.

Copy link
Collaborator

@pallar-ms pallar-ms May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to use an API version of the request or another request body input to decide which parser to use?
Btw the convert G2 also loads the processors into the map during app init and on request, depending on the input data type selects the processor. https://microsofthealth.visualstudio.com/Health/_git/convert?path=/convert/core/src/Microsoft.Health.Convert.LiquidConverter/Handlers/LiquidConverterHandler.cs
We can update the key tuple with another value(whatever the new input we are basing it off of) and update the map that way too, i.e., we can add JsonProcessor() and JsonProcessor(new DateTimeFormattingJsonDataParser) to the map with different keys.

Copy link
Contributor Author

@rogordon01 rogordon01 May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current idea is to create a new request body input called JsonFhirConversionDatesAsStrings with a boolean or enum type of 'enabled/disabled'.

The map idea is interesting but I can see it getting a bit messy as we would need to add in another dimension to the tuple key, as you've said. I think it would be easier with a conditional:

//ConvertDataEngine Class
private readonly JsonProcessor strictJsonProcessor = new JsonProcessor(new DateTimeFormattingJsonDataParser);

private string GetConvertDataResult(ConvertDataRequest convertRequest, ITemplateProvider templateProvider, CancellationToken cancellationToken)
{
    IFhirConvert converter;
    if ( convertRequest.JsonFhirConversionDatesAsStrings)
    {
       parserToUse = strictJsonProcessor;
    }
    else
    {
       parserToUse = convertMap.Get(...)
    }

   ...
}

We could also capture this logic inside of a factory to keep it a bit cleaner.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that would work and I agree keeping it in the factory would be better. Might need to adjust this accordingly - https://github.com/microsoft/FHIR-Converter/blob/main/src/Microsoft.Health.Fhir.Liquid.Converter/Processors/ConvertProcessorFactory.cs

The only reason I proposed the map update was to keep the pattern of choosing the processor consistent from the map. Otherwise on first glance it looks odd why for one request param 'JsonFhirConversionDatesAsStrings' we pick the processor differently and then for another request param 'InputDataType' we use the map. But not strongly advocating it either since yeah adding to the key tuple is not too extendable and neat if we have another field later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious if we can supply other configurable settings in JsonTextReader()?
For now this seems fine since we are looking at datetime specifically, but wondering if we should just use JsonSerializerSettings to configure the deserialization behaviour for more options in the future. e.g.,
JsonConvert.DeserializeObject<JObject>(json, new JsonSerializerSettings { DateParseHandling = DateParseHandling.None });
Unless there are perf hits with using this compared to JsonTextReader.

Copy link
Contributor Author

@rogordon01 rogordon01 May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking into JsonConvert.DeserializeObject originally but it used JsonTextReader under the hood. And since it didn't appear to set the DateParseHandling flag on the reader I thought it wouldn't work. But I actually tried your suggestion and it appears to be honoring the DateParseHandling flag on the JsonSerializerSettings object. So your suggestion makes sense, I'll update accordingly.

Unsure about any perf impacts. Is there a test/perf harness available where I can try and get numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set up some unit tests to quickly evaluate perf. For each deserialization method, I performed 5 runs and captured the overall time. Each run performed 1 million deserializations.

The results are below. There doesn't appear to be much difference between the two approaches, so I'll go with JsonConvert.DeserializeObject.
 

JTokenParseTestAsync
00:00:07.7852149
00:00:06.5830398
00:00:06.9395215
00:00:05.9901044
00:00:06.0748714

JsonConvertTestAsync
00:00:06.4465205
00:00:06.6376552
00:00:06.4552853
00:00:06.4866381
00:00:06.3361394

Copy link
Collaborator

@pallar-ms pallar-ms May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for evaluating the perf impact.
I initially wondered if JsonTextReader might be more efficient from a memory standpoint since it streams data but if DeserializeObject is internally calling the JsonTextReader, then should be the same.

};

public JsonDataParser()
: this(DefaultSerializerSettings)
{
}

public JsonDataParser(JsonSerializerSettings jsonSerializerSettings)
{
JsonSerializerSettings = EnsureArg.IsNotNull(jsonSerializerSettings, nameof(jsonSerializerSettings));
}

protected JsonSerializerSettings JsonSerializerSettings { get; private set; }

public object Parse(string json)
{
if (string.IsNullOrWhiteSpace(json))
Expand All @@ -22,7 +41,7 @@ public object Parse(string json)

try
{
return JToken.Parse(json).ToObject();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does JObject.Parse(json) also have the same behaviour? If so, I do see other places where JObject.Parse is used and would be good to check if a similar setting needs to be applied there too depending on the input being parsed in those cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suspect that all JXXX.Parse methods have the same behavior. There seem to be some default settings in Newtonsoft that are undesirable for our needs.

Agree that we should go through and address this if needed across the project. I suggest doing that work in a separate PR to reduce the scope of this one, which is to address a known customer issue.

We may also want to use this opportunity to move away from Newtonsoft and onto .Net's implementation, which would be a bigger effort.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. But just to point out, in the thread you forwarded, Dustin mentioned that this problem exists even in the post processing logic which also impacts the customer's issue being addressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a potential issue with 'pre/post' processing the result in ADF, and not within Convert/FHIR Server. Looking at the internal PostProcessor of Json it looks like DateParsing is already disabled. I've tested out using Convert/Fhir Server directly and verified that results can come back with the original date format preserved.

return JsonConvert.DeserializeObject<JToken>(json, JsonSerializerSettings).ToObject();
}
catch (Exception ex)
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
using System.Globalization;
using System.Threading;
using DotLiquid;
using EnsureThat;
using Microsoft.Extensions.Logging;
using Microsoft.Health.Fhir.Liquid.Converter.Extensions;
using Microsoft.Health.Fhir.Liquid.Converter.Models;
Expand All @@ -20,11 +21,17 @@ namespace Microsoft.Health.Fhir.Liquid.Converter.Processors
{
public class JsonProcessor : BaseProcessor
{
private readonly IDataParser _parser = new JsonDataParser();
private readonly IDataParser _parser;

public JsonProcessor(ProcessorSettings processorSettings, ILogger<JsonProcessor> logger)
: this(processorSettings, new JsonDataParser(), logger)
{
}

public JsonProcessor(ProcessorSettings processorSettings, IDataParser parser, ILogger<JsonProcessor> logger)
: base(processorSettings, logger)
{
_parser = EnsureArg.IsNotNull(parser, nameof(parser));
}

protected override DataType DataType { get; set; } = DataType.Json;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
using System.Text;
using System.Threading;
using DotLiquid;
using EnsureThat;
using Microsoft.Extensions.Logging;
using Microsoft.Health.Fhir.Liquid.Converter.Exceptions;
using Microsoft.Health.Fhir.Liquid.Converter.Extensions;
Expand All @@ -18,20 +19,32 @@
using Microsoft.Health.Fhir.Liquid.Converter.Models.Json;
using Microsoft.Health.Fhir.Liquid.Converter.Parsers;
using Microsoft.Health.MeasurementUtility;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using NJsonSchema;

namespace Microsoft.Health.Fhir.Liquid.Converter.Processors
{
public class JsonToHl7v2Processor : BaseProcessor
{
private readonly IDataParser _parser = new JsonDataParser();
private static readonly JsonSerializerSettings DefaultSerializerSettings = new JsonSerializerSettings()
{
DateParseHandling = DateParseHandling.None,
};

private readonly IDataParser _parser;

private string[] _segmentsWithFieldSeparator = new string[] { "MSH", "BHS", "FHS" };

public JsonToHl7v2Processor(ProcessorSettings processorSettings, ILogger<JsonToHl7v2Processor> logger)
: this(processorSettings, new JsonDataParser(), logger)
{
}

public JsonToHl7v2Processor(ProcessorSettings processorSettings, IDataParser parser, ILogger<JsonToHl7v2Processor> logger)
: base(processorSettings, logger)
{
_parser = EnsureArg.IsNotNull(parser, nameof(parser));
}

protected override string InternalConvert(string data, string rootTemplate, ITemplateProvider templateProvider, TraceInfo traceInfo = null)
Expand All @@ -45,7 +58,7 @@ protected override string InternalConvert(string data, string rootTemplate, ITem

var result = InternalConvertFromObject(jsonData, rootTemplate, templateProvider, traceInfo);

var hl7Message = GenerateHL7Message(JObject.Parse(result));
var hl7Message = GenerateHL7Message(ConvertToJObject(result));

var hl7String = ConvertHl7MessageToString(hl7Message);

Expand All @@ -56,7 +69,7 @@ public string Convert(JObject data, string rootTemplate, ITemplateProvider templ
{
var jsonData = data.ToObject();
var result = InternalConvertFromObject(jsonData, rootTemplate, templateProvider, traceInfo);
var hl7Message = GenerateHL7Message(JObject.Parse(result));
var hl7Message = GenerateHL7Message(ConvertToJObject(result));
Copy link
Contributor Author

@rogordon01 rogordon01 Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pallar-ms @dustinburson This class also uses the updated JsonProcessor which defaults to treating dates as strings. Because of this I updated this post processing step to also treat dates as strings.

We can also add this in a backwards compatible way in the Fhir-Server

As @pallar-ms mentioned there are a few other places where JObject.Parse is used. We can see if they need to be updated in as separate PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, the JsonToHL7v2Processor is only used in the new convert preview APIs and is not supported in the FHIR server $convert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! That makes things easier


var hl7String = ConvertHl7MessageToString(hl7Message);
return hl7String;
Expand Down Expand Up @@ -239,5 +252,10 @@ protected override void CreateTraceInfo(object data, Context context, TraceInfo
jsonTraceInfo.ValidateSchemas = jsonContext.ValidateSchemas;
}
}

private JObject ConvertToJObject(string input)
{
return JsonConvert.DeserializeObject<JObject>(input, DefaultSerializerSettings);
}
}
}