Treat datetime values as Strings during json deserialization in Json… #562
using Newtonsoft.Json.Linq;

namespace Microsoft.Health.Fhir.Liquid.Converter.Parsers
{
    public class JsonDataParser : IDataParser
    {
        private static Func<string, JsonReader> _defaultJsonReaderGenerator = (json) => new JsonTextReader(new StringReader(json))
        {
            DateParseHandling = DateParseHandling.None,
@dustinburson @pallar-ms This is technically a breaking change for anyone using the OSS code, but I'd like to address it here because it's a bug. A couple of ideas:
- Bump the major version of the OSS code to account for the breaking change.
- Keep the existing behavior in OSS, and create different implementations in the PaaS products that set this value to DateParseHandling.None.
Thoughts?
Have you looked at how this is defined/used in the FHIR service? I am concerned this won't allow us to change the behavior per request at the FHIR service level.
I am not concerned about a breaking change in the OSS. We can do the necessary version update as you said and add documentation if necessary.
An alternative is to just create a StrictJsonDataParser & StrictJsonProcessor. This would preserve backwards compat but allow us to define which parser we use in the API calls.
The FHIR service pulls the correct IFhirConverter out of a map at request time.
After this change, the map will hold the 'fixed' datetime logic. The plan is to create a new JsonParser with a DateTimeFormattingJsonDataParser supplied to it and store that in the ConvertDataEngine/FhirServer. This parser will perform the current behavior. Based on the incoming request, we either use this parser or pull from the map.
Is the plan to use an API version of the request or another request body input to decide which parser to use?
Btw, Convert G2 also loads the processors into a map during app init and, on each request, selects the processor based on the input data type. https://microsofthealth.visualstudio.com/Health/_git/convert?path=/convert/core/src/Microsoft.Health.Convert.LiquidConverter/Handlers/LiquidConverterHandler.cs
We can extend the key tuple with another value (whatever new input we base this on) and update the map that way too, i.e., we can add JsonProcessor() and JsonProcessor(new DateTimeFormattingJsonDataParser()) to the map under different keys.
My current idea is to create a new request body input called JsonFhirConversionDatesAsStrings with a boolean or enum type of 'enabled/disabled'.
The map idea is interesting but I can see it getting a bit messy as we would need to add in another dimension to the tuple key, as you've said. I think it would be easier with a conditional:
    // ConvertDataEngine class
    private readonly JsonProcessor strictJsonProcessor = new JsonProcessor(new DateTimeFormattingJsonDataParser());

    private string GetConvertDataResult(ConvertDataRequest convertRequest, ITemplateProvider templateProvider, CancellationToken cancellationToken)
    {
        IFhirConverter converter;
        if (convertRequest.JsonFhirConversionDatesAsStrings)
        {
            // The request set the new flag: bypass the map and use the dedicated processor.
            converter = strictJsonProcessor;
        }
        else
        {
            // Otherwise pick the converter from the existing map.
            converter = convertMap.Get(...);
        }
        ...
    }
We could also capture this logic inside of a factory to keep it a bit cleaner.
Sure, that would work and I agree keeping it in the factory would be better. Might need to adjust this accordingly - https://github.com/microsoft/FHIR-Converter/blob/main/src/Microsoft.Health.Fhir.Liquid.Converter/Processors/ConvertProcessorFactory.cs
The only reason I proposed the map update was to keep the way we choose the processor consistent, i.e. always from the map. Otherwise, at first glance it looks odd that for one request param ('JsonFhirConversionDatesAsStrings') we pick the processor with a conditional, while for another ('InputDataType') we use the map. I'm not strongly advocating it either, though, since adding to the key tuple is neither very extensible nor neat if we get another field later.
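To make the factory option discussed above concrete, here is a minimal, self-contained sketch. Every type and name in it (InputDataType, IConverter, StubConverter, ConverterSelector) is a hypothetical stand-in rather than the real FHIR-Converter or FHIR server API, and which parser sits behind which branch is deliberately left open. It only shows how the flag check and the map lookup could live behind a single method so that callers never see two selection patterns.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are NOT the real project types.
public enum InputDataType { Hl7v2, Ccda, Json }

public interface IConverter
{
    string Name { get; }
}

public sealed class StubConverter : IConverter
{
    public StubConverter(string name) => Name = name;

    public string Name { get; }
}

public sealed class ConverterSelector
{
    private readonly IReadOnlyDictionary<InputDataType, IConverter> _converterMap;
    private readonly IConverter _dedicatedJsonConverter;

    public ConverterSelector(
        IReadOnlyDictionary<InputDataType, IConverter> converterMap,
        IConverter dedicatedJsonConverter)
    {
        _converterMap = converterMap ?? throw new ArgumentNullException(nameof(converterMap));
        _dedicatedJsonConverter = dedicatedJsonConverter ?? throw new ArgumentNullException(nameof(dedicatedJsonConverter));
    }

    // One place decides between the flag-driven converter and the map lookup.
    public IConverter GetConverter(InputDataType inputDataType, bool jsonFhirConversionDatesAsStrings)
    {
        if (inputDataType == InputDataType.Json && jsonFhirConversionDatesAsStrings)
        {
            return _dedicatedJsonConverter;
        }

        if (_converterMap.TryGetValue(inputDataType, out IConverter converter))
        {
            return converter;
        }

        throw new NotSupportedException($"No converter registered for {inputDataType}.");
    }
}

public static class Program
{
    public static void Main()
    {
        var map = new Dictionary<InputDataType, IConverter>
        {
            [InputDataType.Hl7v2] = new StubConverter("hl7v2-from-map"),
            [InputDataType.Json] = new StubConverter("json-from-map"),
        };
        var selector = new ConverterSelector(map, new StubConverter("json-dedicated"));

        Console.WriteLine(selector.GetConverter(InputDataType.Json, true).Name);  // json-dedicated
        Console.WriteLine(selector.GetConverter(InputDataType.Json, false).Name); // json-from-map
    }
}
```

If another request field is added later, only GetConverter would need to change, which is the main argument for the factory over widening the map's key tuple.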
using Newtonsoft.Json.Linq;

namespace Microsoft.Health.Fhir.Liquid.Converter.Parsers
{
    public class JsonDataParser : IDataParser
    {
        private static Func<string, JsonReader> _defaultJsonReaderGenerator = (json) => new JsonTextReader(new StringReader(json))
        {
            DateParseHandling = DateParseHandling.None,
Curious if we can supply other configurable settings in JsonTextReader()?
For now this seems fine since we are looking at datetime specifically, but wondering if we should just use JsonSerializerSettings to configure the deserialization behaviour for more options in the future. e.g.,
JsonConvert.DeserializeObject<JObject>(json, new JsonSerializerSettings { DateParseHandling = DateParseHandling.None });
Unless there are perf hits with using this compared to JsonTextReader.
I was looking into JsonConvert.DeserializeObject originally, but it uses JsonTextReader under the hood. Since it didn't appear to set the DateParseHandling flag on the reader, I thought it wouldn't work. But I tried your suggestion, and it does appear to honor the DateParseHandling flag on the JsonSerializerSettings object. So your suggestion makes sense; I'll update accordingly.
Unsure about any perf impacts. Is there a test/perf harness available where I can try and get numbers?
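For reference, a small sketch of the check described above: it deserializes the same payload once through a JsonTextReader with DateParseHandling.None and once through JsonConvert.DeserializeObject with JsonSerializerSettings, then compares the two results. The payload and its property name are made up for illustration, and this is not the code from the PR.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class SettingsVersusReaderCheck
{
    public static void Main()
    {
        // Hypothetical payload; the property name and value are illustrative only.
        const string json = "{\"effectiveDateTime\":\"2001-01-10T11:04:05.000+05:00\"}";

        // Approach 1: configure the reader directly.
        JToken viaReader;
        using (var reader = new JsonTextReader(new StringReader(json))
        {
            DateParseHandling = DateParseHandling.None,
        })
        {
            viaReader = JToken.ReadFrom(reader);
        }

        // Approach 2: pass the same option through JsonSerializerSettings.
        var settings = new JsonSerializerSettings { DateParseHandling = DateParseHandling.None };
        JObject viaSettings = JsonConvert.DeserializeObject<JObject>(json, settings);

        // Print the token type each approach produced, and whether the trees match.
        Console.WriteLine(viaReader["effectiveDateTime"].Type);
        Console.WriteLine(viaSettings["effectiveDateTime"].Type);
        Console.WriteLine(JToken.DeepEquals(viaReader, viaSettings));
    }
}
```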
I set up some unit tests to quickly evaluate perf. For each deserialization method, I performed 5 runs and captured the overall time. Each run performed 1 million deserializations.
The results are below. There doesn't appear to be much difference between the two approaches, so I'll go with JsonConvert.DeserializeObject.
Run   JTokenParseTestAsync   JsonConvertTestAsync
1     00:00:07.7852149       00:00:06.4465205
2     00:00:06.5830398       00:00:06.6376552
3     00:00:06.9395215       00:00:06.4552853
4     00:00:05.9901044       00:00:06.4866381
5     00:00:06.0748714       00:00:06.3361394
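A rough sketch of how a timing comparison like this could be set up is shown below. It is not the test code used for the numbers above: the payload, the class and helper names, and the assumption that the first method reads through a JsonTextReader are all illustrative guesses. It simply times 5 runs of 1 million deserializations per approach with Stopwatch.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class DeserializationTiming
{
    // Hypothetical payload; not the data used for the numbers in this thread.
    private const string Json = "{\"effectiveDateTime\":\"2001-01-10T11:04:05.000+05:00\"}";

    private static readonly JsonSerializerSettings Settings =
        new JsonSerializerSettings { DateParseHandling = DateParseHandling.None };

    // Runs the action the given number of times and returns the total elapsed time.
    private static TimeSpan Time(Action action, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }

        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    private static void ReadWithJsonTextReader()
    {
        using (var reader = new JsonTextReader(new StringReader(Json))
        {
            DateParseHandling = DateParseHandling.None,
        })
        {
            JToken.ReadFrom(reader);
        }
    }

    private static void ReadWithJsonConvert()
    {
        JsonConvert.DeserializeObject<JObject>(Json, Settings);
    }

    public static void Main()
    {
        const int runs = 5;
        const int iterations = 1_000_000;

        for (int run = 1; run <= runs; run++)
        {
            TimeSpan viaReader = Time(ReadWithJsonTextReader, iterations);
            TimeSpan viaJsonConvert = Time(ReadWithJsonConvert, iterations);
            Console.WriteLine($"Run {run}: reader={viaReader} jsonConvert={viaJsonConvert}");
        }
    }
}
```

A loop like this gives only a coarse wall-clock comparison; a dedicated benchmarking tool would be a better fit if the difference ever mattered.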
Great, thanks for evaluating the perf impact.
I initially wondered if JsonTextReader might be more memory-efficient since it streams data, but if DeserializeObject calls JsonTextReader internally, then it should be the same.
@@ -22,7 +42,8 @@ public object Parse(string json)

    try
    {
        return JToken.Parse(json).ToObject();
Does JObject.Parse(json) also have the same behaviour? If so, I do see other places where JObject.Parse is used and would be good to check if a similar setting needs to be applied there too depending on the input being parsed in those cases.
I'd suspect that all JXXX.Parse methods have the same behavior. There seem to be some default settings in Newtonsoft that are undesirable for our needs.
Agree that we should go through and address this if needed across the project. I suggest doing that work in a separate PR to reduce the scope of this one, which is to address a known customer issue.
We may also want to use this opportunity to move away from Newtonsoft and onto .NET's own JSON implementation, which would be a bigger effort.
Makes sense. But just to point out, in the thread you forwarded, Dustin mentioned that this problem exists even in the post processing logic which also impacts the customer's issue being addressed.
I think there is a potential issue with 'pre/post' processing the result in ADF, and not within Convert/FHIR Server. Looking at the internal PostProcessor of Json it looks like DateParsing is already disabled. I've tested out using Convert/Fhir Server directly and verified that results can come back with the original date format preserved.
@@ -56,7 +69,7 @@ public string Convert(JObject data, string rootTemplate, ITemplateProvider templ
 {
     var jsonData = data.ToObject();
     var result = InternalConvertFromObject(jsonData, rootTemplate, templateProvider, traceInfo);
-    var hl7Message = GenerateHL7Message(JObject.Parse(result));
+    var hl7Message = GenerateHL7Message(ConvertToJObject(result));
@pallar-ms @dustinburson This class also uses the updated JsonProcessor, which defaults to treating dates as strings. Because of this, I updated this post-processing step to also treat dates as strings.
We can also add this in a backwards-compatible way in the Fhir-Server.
As @pallar-ms mentioned, there are a few other places where JObject.Parse is used. We can see if they need to be updated in a separate PR.
FYI, the JsonToHL7v2Processor is only used in the new convert preview APIs and is not supported in the FHIR server $convert.
Great! That makes things easier
@@ -362,8 +362,8 @@ public void GivenJObjectInput_WhenConvertWithJsonProcessor_CorrectResultShouldBe
 {
     var processor = _jsonProcessor;
     var templateProvider = new TemplateProvider(TestConstants.JsonTemplateDirectory, DataType.Json);
-    var testData = JObject.Parse(_jsonTestData);
-    var result = processor.Convert(testData, "ExamplePatient", templateProvider);
+    var testData = JObject.ReadFrom(new JsonTextReader(new StringReader(_jsonTestData)) { DateParseHandling = DateParseHandling.None });
nit suggestion: It probably doesn't matter for this test, but for consistency, and in case someone later searches for all references, it might help to change this to also use DeserializeObject() with the serializer settings.
Sure, will do
Done
JsonDataParser currently uses the default Newtonsoft datetime deserialization behavior, which turns datetime strings into .NET DateTime objects. This reformats the supplied value and loses timezone details, and the altered value is what gets supplied to Liquid. End users may not want the converted value, and it could cause further issues downstream.
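As an illustration of the behavior described above (a minimal sketch, not code from this PR), the snippet below parses the same JSON twice with Newtonsoft: once with the defaults and once with DateParseHandling.None. The property name and sample value are made up for the example.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class DateParseHandlingDemo
{
    public static void Main()
    {
        // Hypothetical input; the property name and value are illustrative only.
        const string json = "{\"birthDate\":\"2001-01-10T11:04:05.000+05:00\"}";

        // Default behavior: strings that look like ISO 8601 dates are parsed into date tokens.
        JToken parsedByDefault = JToken.Parse(json);
        Console.WriteLine(parsedByDefault["birthDate"].Type); // Date

        // With date parsing disabled, the value stays a plain string token.
        JToken parsedAsString;
        using (var reader = new JsonTextReader(new StringReader(json))
        {
            DateParseHandling = DateParseHandling.None,
        })
        {
            parsedAsString = JToken.ReadFrom(reader);
        }

        Console.WriteLine(parsedAsString["birthDate"].Type);    // String
        Console.WriteLine((string)parsedAsString["birthDate"]); // 2001-01-10T11:04:05.000+05:00
    }
}
```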
This PR makes several changes: