-
Notifications
You must be signed in to change notification settings - Fork 15.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support indented JSON formatting in C# #9391
Support indented JSON formatting in C# #9391
Conversation
Tests failed on Windows build server, due to string constants containing New Lines, which are checked out as CRLF when using Pushed a fix. The workflow may be run again, if it's ok @elharo. |
4ee24ae
to
3dfd683
Compare
I've run a benchmark to measure performance differences compared to the previous implementation:
Benchmark codeI copied the previous implementation into the project and renamed the class JsonFormatterBefore#region Copyright notice and license
// Protocol Buffers - Google's data interchange format
// Copyright 2015 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#endregion
using System;
using System.Collections;
using System.Globalization;
using System.Text;
using Google.Protobuf.Reflection;
using Google.Protobuf.WellKnownTypes;
using System.IO;
using System.Linq;
using System.Collections.Generic;
using System.Reflection;
using System.Diagnostics.CodeAnalysis;
namespace Google.Protobuf
{
/// <summary>
/// Reflection-based converter from messages to JSON.
/// </summary>
/// <remarks>
/// <para>
/// Instances of this class are thread-safe, with no mutable state.
/// </para>
/// <para>
/// This is a simple start to get JSON formatting working. As it's reflection-based,
/// it's not as quick as baking calls into generated messages - but is a simpler implementation.
/// (This code is generally not heavily optimized.)
/// </para>
/// </remarks>
public sealed class JsonFormatterBefore
{
internal const string AnyTypeUrlField = "@type";
internal const string AnyDiagnosticValueField = "@value";
internal const string AnyWellKnownTypeValueField = "value";
private const string TypeUrlPrefix = "type.googleapis.com";
private const string NameValueSeparator = ": ";
private const string PropertySeparator = ", ";
/// <summary>
/// Returns a formatter using the default settings.
/// </summary>
public static JsonFormatterBefore Default { get; } = new JsonFormatterBefore(Settings.Default);
// A JSON formatter which *only* exists
private static readonly JsonFormatterBefore diagnosticFormatter = new JsonFormatterBefore(Settings.Default);
/// <summary>
/// The JSON representation of the first 160 characters of Unicode.
/// Empty strings are replaced by the static constructor.
/// </summary>
private static readonly string[] CommonRepresentations = {
// C0 (ASCII and derivatives) control characters
"\\u0000", "\\u0001", "\\u0002", "\\u0003", // 0x00
"\\u0004", "\\u0005", "\\u0006", "\\u0007",
"\\b", "\\t", "\\n", "\\u000b",
"\\f", "\\r", "\\u000e", "\\u000f",
"\\u0010", "\\u0011", "\\u0012", "\\u0013", // 0x10
"\\u0014", "\\u0015", "\\u0016", "\\u0017",
"\\u0018", "\\u0019", "\\u001a", "\\u001b",
"\\u001c", "\\u001d", "\\u001e", "\\u001f",
// Escaping of " and \ are required by www.json.org string definition.
// Escaping of < and > are required for HTML security.
"", "", "\\\"", "", "", "", "", "", // 0x20
"", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", // 0x30
"", "", "", "", "\\u003c", "", "\\u003e", "",
"", "", "", "", "", "", "", "", // 0x40
"", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", // 0x50
"", "", "", "", "\\\\", "", "", "",
"", "", "", "", "", "", "", "", // 0x60
"", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", // 0x70
"", "", "", "", "", "", "", "\\u007f",
// C1 (ISO 8859 and Unicode) extended control characters
"\\u0080", "\\u0081", "\\u0082", "\\u0083", // 0x80
"\\u0084", "\\u0085", "\\u0086", "\\u0087",
"\\u0088", "\\u0089", "\\u008a", "\\u008b",
"\\u008c", "\\u008d", "\\u008e", "\\u008f",
"\\u0090", "\\u0091", "\\u0092", "\\u0093", // 0x90
"\\u0094", "\\u0095", "\\u0096", "\\u0097",
"\\u0098", "\\u0099", "\\u009a", "\\u009b",
"\\u009c", "\\u009d", "\\u009e", "\\u009f"
};
static JsonFormatterBefore()
{
for (int i = 0; i < CommonRepresentations.Length; i++)
{
if (CommonRepresentations[i] == "")
{
CommonRepresentations[i] = ((char) i).ToString();
}
}
}
private readonly Settings settings;
private bool DiagnosticOnly => ReferenceEquals(this, diagnosticFormatter);
/// <summary>
/// Creates a new formatted with the given settings.
/// </summary>
/// <param name="settings">The settings.</param>
public JsonFormatterBefore(Settings settings)
{
this.settings = ProtoPreconditions.CheckNotNull(settings, nameof(settings));
}
/// <summary>
/// Formats the specified message as JSON.
/// </summary>
/// <param name="message">The message to format.</param>
/// <returns>The formatted message.</returns>
public string Format(IMessage message)
{
var writer = new StringWriter();
Format(message, writer);
return writer.ToString();
}
/// <summary>
/// Formats the specified message as JSON.
/// </summary>
/// <param name="message">The message to format.</param>
/// <param name="writer">The TextWriter to write the formatted message to.</param>
/// <returns>The formatted message.</returns>
public void Format(IMessage message, TextWriter writer)
{
ProtoPreconditions.CheckNotNull(message, nameof(message));
ProtoPreconditions.CheckNotNull(writer, nameof(writer));
if (message.Descriptor.IsWellKnownType)
{
WriteWellKnownTypeValue(writer, message.Descriptor, message);
}
else
{
WriteMessage(writer, message);
}
}
/// <summary>
/// Converts a message to JSON for diagnostic purposes with no extra context.
/// </summary>
/// <remarks>
/// <para>
/// This differs from calling <see cref="Format(IMessage)"/> on the default JSON
/// formatter in its handling of <see cref="Any"/>. As no type registry is available
/// in <see cref="object.ToString"/> calls, the normal way of resolving the type of
/// an <c>Any</c> message cannot be applied. Instead, a JSON property named <c>@value</c>
/// is included with the base64 data from the <see cref="Any.Value"/> property of the message.
/// </para>
/// <para>The value returned by this method is only designed to be used for diagnostic
/// purposes. It may not be parsable by <see cref="JsonParser"/>, and may not be parsable
/// by other Protocol Buffer implementations.</para>
/// </remarks>
/// <param name="message">The message to format for diagnostic purposes.</param>
/// <returns>The diagnostic-only JSON representation of the message</returns>
public static string ToDiagnosticString(IMessage message)
{
ProtoPreconditions.CheckNotNull(message, nameof(message));
return diagnosticFormatter.Format(message);
}
private void WriteMessage(TextWriter writer, IMessage message)
{
if (message == null)
{
WriteNull(writer);
return;
}
if (DiagnosticOnly)
{
ICustomDiagnosticMessage customDiagnosticMessage = message as ICustomDiagnosticMessage;
if (customDiagnosticMessage != null)
{
writer.Write(customDiagnosticMessage.ToDiagnosticString());
return;
}
}
writer.Write("{ ");
bool writtenFields = WriteMessageFields(writer, message, false);
writer.Write(writtenFields ? " }" : "}");
}
private bool WriteMessageFields(TextWriter writer, IMessage message, bool assumeFirstFieldWritten)
{
var fields = message.Descriptor.Fields;
bool first = !assumeFirstFieldWritten;
// First non-oneof fields
foreach (var field in fields.InFieldNumberOrder())
{
var accessor = field.Accessor;
var value = accessor.GetValue(message);
if (!ShouldFormatFieldValue(message, field, value))
{
continue;
}
if (!first)
{
writer.Write(PropertySeparator);
}
WriteString(writer, accessor.Descriptor.JsonName);
writer.Write(NameValueSeparator);
WriteValue(writer, value);
first = false;
}
return !first;
}
/// <summary>
/// Determines whether or not a field value should be serialized according to the field,
/// its value in the message, and the settings of this formatter.
/// </summary>
private bool ShouldFormatFieldValue(IMessage message, FieldDescriptor field, object value) =>
field.HasPresence
// Fields that support presence *just* use that
? field.Accessor.HasValue(message)
// Otherwise, format if either we've been asked to format default values, or if it's
// not a default value anyway.
: settings.FormatDefaultValues || !IsDefaultValue(field, value);
// Converted from java/core/src/main/java/com/google/protobuf/Descriptors.java
internal static string ToJsonName(string name)
{
StringBuilder result = new StringBuilder(name.Length);
bool isNextUpperCase = false;
foreach (char ch in name)
{
if (ch == '_')
{
isNextUpperCase = true;
}
else if (isNextUpperCase)
{
result.Append(char.ToUpperInvariant(ch));
isNextUpperCase = false;
}
else
{
result.Append(ch);
}
}
return result.ToString();
}
internal static string FromJsonName(string name)
{
StringBuilder result = new StringBuilder(name.Length);
foreach (char ch in name)
{
if (char.IsUpper(ch))
{
result.Append('_');
result.Append(char.ToLowerInvariant(ch));
}
else
{
result.Append(ch);
}
}
return result.ToString();
}
private static void WriteNull(TextWriter writer)
{
writer.Write("null");
}
private static bool IsDefaultValue(FieldDescriptor descriptor, object value)
{
if (descriptor.IsMap)
{
IDictionary dictionary = (IDictionary) value;
return dictionary.Count == 0;
}
if (descriptor.IsRepeated)
{
IList list = (IList) value;
return list.Count == 0;
}
switch (descriptor.FieldType)
{
case FieldType.Bool:
return (bool) value == false;
case FieldType.Bytes:
return (ByteString) value == ByteString.Empty;
case FieldType.String:
return (string) value == "";
case FieldType.Double:
return (double) value == 0.0;
case FieldType.SInt32:
case FieldType.Int32:
case FieldType.SFixed32:
case FieldType.Enum:
return (int) value == 0;
case FieldType.Fixed32:
case FieldType.UInt32:
return (uint) value == 0;
case FieldType.Fixed64:
case FieldType.UInt64:
return (ulong) value == 0;
case FieldType.SFixed64:
case FieldType.Int64:
case FieldType.SInt64:
return (long) value == 0;
case FieldType.Float:
return (float) value == 0f;
case FieldType.Message:
case FieldType.Group: // Never expect to get this, but...
return value == null;
default:
throw new ArgumentException("Invalid field type");
}
}
/// <summary>
/// Writes a single value to the given writer as JSON. Only types understood by
/// Protocol Buffers can be written in this way. This method is only exposed for
/// advanced use cases; most users should be using <see cref="Format(IMessage)"/>
/// or <see cref="Format(IMessage, TextWriter)"/>.
/// </summary>
/// <param name="writer">The writer to write the value to. Must not be null.</param>
/// <param name="value">The value to write. May be null.</param>
public void WriteValue(TextWriter writer, object value)
{
if (value == null || value is NullValue)
{
WriteNull(writer);
}
else if (value is bool)
{
writer.Write((bool)value ? "true" : "false");
}
else if (value is ByteString)
{
// Nothing in Base64 needs escaping
writer.Write('"');
writer.Write(((ByteString)value).ToBase64());
writer.Write('"');
}
else if (value is string)
{
WriteString(writer, (string)value);
}
else if (value is IDictionary)
{
WriteDictionary(writer, (IDictionary)value);
}
else if (value is IList)
{
WriteList(writer, (IList)value);
}
else if (value is int || value is uint)
{
IFormattable formattable = (IFormattable) value;
writer.Write(formattable.ToString("d", CultureInfo.InvariantCulture));
}
else if (value is long || value is ulong)
{
writer.Write('"');
IFormattable formattable = (IFormattable) value;
writer.Write(formattable.ToString("d", CultureInfo.InvariantCulture));
writer.Write('"');
}
else if (value is System.Enum)
{
if (settings.FormatEnumsAsIntegers)
{
WriteValue(writer, (int)value);
}
else
{
string name = OriginalEnumValueHelper.GetOriginalName(value);
if (name != null)
{
WriteString(writer, name);
}
else
{
WriteValue(writer, (int)value);
}
}
}
else if (value is float || value is double)
{
string text = ((IFormattable) value).ToString("r", CultureInfo.InvariantCulture);
if (text == "NaN" || text == "Infinity" || text == "-Infinity")
{
writer.Write('"');
writer.Write(text);
writer.Write('"');
}
else
{
writer.Write(text);
}
}
else if (value is IMessage)
{
Format((IMessage)value, writer);
}
else
{
throw new ArgumentException("Unable to format value of type " + value.GetType());
}
}
/// <summary>
/// Central interception point for well-known type formatting. Any well-known types which
/// don't need special handling can fall back to WriteMessage. We avoid assuming that the
/// values are using the embedded well-known types, in order to allow for dynamic messages
/// in the future.
/// </summary>
private void WriteWellKnownTypeValue(TextWriter writer, MessageDescriptor descriptor, object value)
{
// Currently, we can never actually get here, because null values are always handled by the caller. But if we *could*,
// this would do the right thing.
if (value == null)
{
WriteNull(writer);
return;
}
// For wrapper types, the value will either be the (possibly boxed) "native" value,
// or the message itself if we're formatting it at the top level (e.g. just calling ToString on the object itself).
// If it's the message form, we can extract the value first, which *will* be the (possibly boxed) native value,
// and then proceed, writing it as if we were definitely in a field. (We never need to wrap it in an extra string...
// WriteValue will do the right thing.)
if (descriptor.IsWrapperType)
{
if (value is IMessage)
{
var message = (IMessage) value;
value = message.Descriptor.Fields[WrappersReflection.WrapperValueFieldNumber].Accessor.GetValue(message);
}
WriteValue(writer, value);
return;
}
if (descriptor.FullName == Timestamp.Descriptor.FullName)
{
WriteTimestamp(writer, (IMessage)value);
return;
}
if (descriptor.FullName == Duration.Descriptor.FullName)
{
WriteDuration(writer, (IMessage)value);
return;
}
if (descriptor.FullName == FieldMask.Descriptor.FullName)
{
WriteFieldMask(writer, (IMessage)value);
return;
}
if (descriptor.FullName == Struct.Descriptor.FullName)
{
WriteStruct(writer, (IMessage)value);
return;
}
if (descriptor.FullName == ListValue.Descriptor.FullName)
{
var fieldAccessor = descriptor.Fields[ListValue.ValuesFieldNumber].Accessor;
WriteList(writer, (IList)fieldAccessor.GetValue((IMessage)value));
return;
}
if (descriptor.FullName == Value.Descriptor.FullName)
{
WriteStructFieldValue(writer, (IMessage)value);
return;
}
if (descriptor.FullName == Any.Descriptor.FullName)
{
WriteAny(writer, (IMessage)value);
return;
}
WriteMessage(writer, (IMessage)value);
}
private void WriteTimestamp(TextWriter writer, IMessage value)
{
// TODO: In the common case where this *is* using the built-in Timestamp type, we could
// avoid all the reflection at this point, by casting to Timestamp. In the interests of
// avoiding subtle bugs, don't do that until we've implemented DynamicMessage so that we can prove
// it still works in that case.
int nanos = (int) value.Descriptor.Fields[Timestamp.NanosFieldNumber].Accessor.GetValue(value);
long seconds = (long) value.Descriptor.Fields[Timestamp.SecondsFieldNumber].Accessor.GetValue(value);
writer.Write(Timestamp.ToJson(seconds, nanos, DiagnosticOnly));
}
private void WriteDuration(TextWriter writer, IMessage value)
{
// TODO: Same as for WriteTimestamp
int nanos = (int) value.Descriptor.Fields[Duration.NanosFieldNumber].Accessor.GetValue(value);
long seconds = (long) value.Descriptor.Fields[Duration.SecondsFieldNumber].Accessor.GetValue(value);
writer.Write(Duration.ToJson(seconds, nanos, DiagnosticOnly));
}
private void WriteFieldMask(TextWriter writer, IMessage value)
{
var paths = (IList<string>) value.Descriptor.Fields[FieldMask.PathsFieldNumber].Accessor.GetValue(value);
writer.Write(FieldMask.ToJson(paths, DiagnosticOnly));
}
private void WriteAny(TextWriter writer, IMessage value)
{
if (DiagnosticOnly)
{
WriteDiagnosticOnlyAny(writer, value);
return;
}
string typeUrl = (string) value.Descriptor.Fields[Any.TypeUrlFieldNumber].Accessor.GetValue(value);
ByteString data = (ByteString) value.Descriptor.Fields[Any.ValueFieldNumber].Accessor.GetValue(value);
string typeName = Any.GetTypeName(typeUrl);
MessageDescriptor descriptor = settings.TypeRegistry.Find(typeName);
if (descriptor == null)
{
throw new InvalidOperationException($"Type registry has no descriptor for type name '{typeName}'");
}
IMessage message = descriptor.Parser.ParseFrom(data);
writer.Write("{ ");
WriteString(writer, AnyTypeUrlField);
writer.Write(NameValueSeparator);
WriteString(writer, typeUrl);
if (descriptor.IsWellKnownType)
{
writer.Write(PropertySeparator);
WriteString(writer, AnyWellKnownTypeValueField);
writer.Write(NameValueSeparator);
WriteWellKnownTypeValue(writer, descriptor, message);
}
else
{
WriteMessageFields(writer, message, true);
}
writer.Write(" }");
}
private void WriteDiagnosticOnlyAny(TextWriter writer, IMessage value)
{
string typeUrl = (string) value.Descriptor.Fields[Any.TypeUrlFieldNumber].Accessor.GetValue(value);
ByteString data = (ByteString) value.Descriptor.Fields[Any.ValueFieldNumber].Accessor.GetValue(value);
writer.Write("{ ");
WriteString(writer, AnyTypeUrlField);
writer.Write(NameValueSeparator);
WriteString(writer, typeUrl);
writer.Write(PropertySeparator);
WriteString(writer, AnyDiagnosticValueField);
writer.Write(NameValueSeparator);
writer.Write('"');
writer.Write(data.ToBase64());
writer.Write('"');
writer.Write(" }");
}
private void WriteStruct(TextWriter writer, IMessage message)
{
writer.Write("{ ");
IDictionary fields = (IDictionary) message.Descriptor.Fields[Struct.FieldsFieldNumber].Accessor.GetValue(message);
bool first = true;
foreach (DictionaryEntry entry in fields)
{
string key = (string) entry.Key;
IMessage value = (IMessage) entry.Value;
if (string.IsNullOrEmpty(key) || value == null)
{
throw new InvalidOperationException("Struct fields cannot have an empty key or a null value.");
}
if (!first)
{
writer.Write(PropertySeparator);
}
WriteString(writer, key);
writer.Write(NameValueSeparator);
WriteStructFieldValue(writer, value);
first = false;
}
writer.Write(first ? "}" : " }");
}
private void WriteStructFieldValue(TextWriter writer, IMessage message)
{
var specifiedField = message.Descriptor.Oneofs[0].Accessor.GetCaseFieldDescriptor(message);
if (specifiedField == null)
{
throw new InvalidOperationException("Value message must contain a value for the oneof.");
}
object value = specifiedField.Accessor.GetValue(message);
switch (specifiedField.FieldNumber)
{
case Value.BoolValueFieldNumber:
case Value.StringValueFieldNumber:
case Value.NumberValueFieldNumber:
WriteValue(writer, value);
return;
case Value.StructValueFieldNumber:
case Value.ListValueFieldNumber:
// Structs and ListValues are nested messages, and already well-known types.
var nestedMessage = (IMessage) specifiedField.Accessor.GetValue(message);
WriteWellKnownTypeValue(writer, nestedMessage.Descriptor, nestedMessage);
return;
case Value.NullValueFieldNumber:
WriteNull(writer);
return;
default:
throw new InvalidOperationException("Unexpected case in struct field: " + specifiedField.FieldNumber);
}
}
internal void WriteList(TextWriter writer, IList list)
{
writer.Write("[ ");
bool first = true;
foreach (var value in list)
{
if (!first)
{
writer.Write(PropertySeparator);
}
WriteValue(writer, value);
first = false;
}
writer.Write(first ? "]" : " ]");
}
internal void WriteDictionary(TextWriter writer, IDictionary dictionary)
{
writer.Write("{ ");
bool first = true;
// This will box each pair. Could use IDictionaryEnumerator, but that's ugly in terms of disposal.
foreach (DictionaryEntry pair in dictionary)
{
if (!first)
{
writer.Write(PropertySeparator);
}
string keyText;
if (pair.Key is string)
{
keyText = (string) pair.Key;
}
else if (pair.Key is bool)
{
keyText = (bool) pair.Key ? "true" : "false";
}
else if (pair.Key is int || pair.Key is uint | pair.Key is long || pair.Key is ulong)
{
keyText = ((IFormattable) pair.Key).ToString("d", CultureInfo.InvariantCulture);
}
else
{
if (pair.Key == null)
{
throw new ArgumentException("Dictionary has entry with null key");
}
throw new ArgumentException("Unhandled dictionary key type: " + pair.Key.GetType());
}
WriteString(writer, keyText);
writer.Write(NameValueSeparator);
WriteValue(writer, pair.Value);
first = false;
}
writer.Write(first ? "}" : " }");
}
/// <summary>
/// Writes a string (including leading and trailing double quotes) to a builder, escaping as required.
/// </summary>
/// <remarks>
/// Other than surrogate pair handling, this code is mostly taken from src/google/protobuf/util/internal/json_escaping.cc.
/// </remarks>
internal static void WriteString(TextWriter writer, string text)
{
writer.Write('"');
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
if (c < 0xa0)
{
writer.Write(CommonRepresentations[c]);
continue;
}
if (char.IsHighSurrogate(c))
{
// Encountered first part of a surrogate pair.
// Check that we have the whole pair, and encode both parts as hex.
i++;
if (i == text.Length || !char.IsLowSurrogate(text[i]))
{
throw new ArgumentException("String contains low surrogate not followed by high surrogate");
}
HexEncodeUtf16CodeUnit(writer, c);
HexEncodeUtf16CodeUnit(writer, text[i]);
continue;
}
else if (char.IsLowSurrogate(c))
{
throw new ArgumentException("String contains high surrogate not preceded by low surrogate");
}
switch ((uint) c)
{
// These are not required by json spec
// but used to prevent security bugs in javascript.
case 0xfeff: // Zero width no-break space
case 0xfff9: // Interlinear annotation anchor
case 0xfffa: // Interlinear annotation separator
case 0xfffb: // Interlinear annotation terminator
case 0x00ad: // Soft-hyphen
case 0x06dd: // Arabic end of ayah
case 0x070f: // Syriac abbreviation mark
case 0x17b4: // Khmer vowel inherent Aq
case 0x17b5: // Khmer vowel inherent Aa
HexEncodeUtf16CodeUnit(writer, c);
break;
default:
if ((c >= 0x0600 && c <= 0x0603) || // Arabic signs
(c >= 0x200b && c <= 0x200f) || // Zero width etc.
(c >= 0x2028 && c <= 0x202e) || // Separators etc.
(c >= 0x2060 && c <= 0x2064) || // Invisible etc.
(c >= 0x206a && c <= 0x206f))
{
HexEncodeUtf16CodeUnit(writer, c);
}
else
{
// No handling of surrogates here - that's done earlier
writer.Write(c);
}
break;
}
}
writer.Write('"');
}
private const string Hex = "0123456789abcdef";
private static void HexEncodeUtf16CodeUnit(TextWriter writer, char c)
{
writer.Write("\\u");
writer.Write(Hex[(c >> 12) & 0xf]);
writer.Write(Hex[(c >> 8) & 0xf]);
writer.Write(Hex[(c >> 4) & 0xf]);
writer.Write(Hex[(c >> 0) & 0xf]);
}
/// <summary>
/// Settings controlling JSON formatting.
/// </summary>
public sealed class Settings
{
/// <summary>
/// Default settings, as used by <see cref="JsonFormatterBefore.Default"/>
/// </summary>
public static Settings Default { get; }
// Workaround for the Mono compiler complaining about XML comments not being on
// valid language elements.
static Settings()
{
Default = new Settings(false);
}
/// <summary>
/// Whether fields which would otherwise not be included in the formatted data
/// should be formatted even when the value is not present, or has the default value.
/// This option only affects fields which don't support "presence" (e.g.
/// singular non-optional proto3 primitive fields).
/// </summary>
public bool FormatDefaultValues { get; }
/// <summary>
/// The type registry used to format <see cref="Any"/> messages.
/// </summary>
public TypeRegistry TypeRegistry { get; }
/// <summary>
/// Whether to format enums as ints. Defaults to false.
/// </summary>
public bool FormatEnumsAsIntegers { get; }
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified formatting of default values
/// and an empty type registry.
/// </summary>
/// <param name="formatDefaultValues"><c>true</c> if default values (0, empty strings etc) should be formatted; <c>false</c> otherwise.</param>
public Settings(bool formatDefaultValues) : this(formatDefaultValues, TypeRegistry.Empty)
{
}
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified formatting of default values
/// and type registry.
/// </summary>
/// <param name="formatDefaultValues"><c>true</c> if default values (0, empty strings etc) should be formatted; <c>false</c> otherwise.</param>
/// <param name="typeRegistry">The <see cref="TypeRegistry"/> to use when formatting <see cref="Any"/> messages.</param>
public Settings(bool formatDefaultValues, TypeRegistry typeRegistry) : this(formatDefaultValues, typeRegistry, false)
{
}
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified parameters.
/// </summary>
/// <param name="formatDefaultValues"><c>true</c> if default values (0, empty strings etc) should be formatted; <c>false</c> otherwise.</param>
/// <param name="typeRegistry">The <see cref="TypeRegistry"/> to use when formatting <see cref="Any"/> messages. TypeRegistry.Empty will be used if it is null.</param>
/// <param name="formatEnumsAsIntegers"><c>true</c> to format the enums as integers; <c>false</c> to format enums as enum names.</param>
private Settings(bool formatDefaultValues,
TypeRegistry typeRegistry,
bool formatEnumsAsIntegers)
{
FormatDefaultValues = formatDefaultValues;
TypeRegistry = typeRegistry ?? TypeRegistry.Empty;
FormatEnumsAsIntegers = formatEnumsAsIntegers;
}
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified formatting of default values and the current settings.
/// </summary>
/// <param name="formatDefaultValues"><c>true</c> if default values (0, empty strings etc) should be formatted; <c>false</c> otherwise.</param>
public Settings WithFormatDefaultValues(bool formatDefaultValues) => new Settings(formatDefaultValues, TypeRegistry, FormatEnumsAsIntegers);
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified type registry and the current settings.
/// </summary>
/// <param name="typeRegistry">The <see cref="TypeRegistry"/> to use when formatting <see cref="Any"/> messages.</param>
public Settings WithTypeRegistry(TypeRegistry typeRegistry) => new Settings(FormatDefaultValues, typeRegistry, FormatEnumsAsIntegers);
/// <summary>
/// Creates a new <see cref="Settings"/> object with the specified enums formatting option and the current settings.
/// </summary>
/// <param name="formatEnumsAsIntegers"><c>true</c> to format the enums as integers; <c>false</c> to format enums as enum names.</param>
public Settings WithFormatEnumsAsIntegers(bool formatEnumsAsIntegers) => new Settings(FormatDefaultValues, TypeRegistry, formatEnumsAsIntegers);
}
// Effectively a cache of mapping from enum values to the original name as specified in the proto file,
// fetched by reflection.
// The need for this is unfortunate, as is its unbounded size, but realistically it shouldn't cause issues.
private static class OriginalEnumValueHelper
{
// TODO: In the future we might want to use ConcurrentDictionary, at the point where all
// the platforms we target have it.
private static readonly Dictionary<System.Type, Dictionary<object, string>> dictionaries
= new Dictionary<System.Type, Dictionary<object, string>>();
[UnconditionalSuppressMessage("Trimming", "IL2072",
Justification = "The field for the value must still be present. It will be returned by reflection, will be in this collection, and its name can be resolved.")]
internal static string GetOriginalName(object value)
{
var enumType = value.GetType();
Dictionary<object, string> nameMapping;
lock (dictionaries)
{
if (!dictionaries.TryGetValue(enumType, out nameMapping))
{
nameMapping = GetNameMapping(enumType);
dictionaries[enumType] = nameMapping;
}
}
string originalName;
// If this returns false, originalName will be null, which is what we want.
nameMapping.TryGetValue(value, out originalName);
return originalName;
}
private static Dictionary<object, string> GetNameMapping(
[DynamicallyAccessedMembers(
DynamicallyAccessedMemberTypes.PublicFields |
DynamicallyAccessedMemberTypes.NonPublicFields)]
System.Type enumType)
{
return enumType.GetTypeInfo().DeclaredFields
.Where(f => f.IsStatic)
.Where(f => f.GetCustomAttributes<OriginalNameAttribute>()
.FirstOrDefault()?.PreferredAlias ?? true)
.ToDictionary(f => f.GetValue(null),
f => f.GetCustomAttributes<OriginalNameAttribute>()
.FirstOrDefault()
// If the attribute hasn't been applied, fall back to the name of the field.
?.Name ?? f.Name);
}
}
}
} using BenchmarkDotNet.Attributes;
using Google.Protobuf.TestProtos;
namespace Google.Protobuf.Benchmarks
{
[MemoryDiagnoser]
public class JsonFormatterBenchmarks
{
private static readonly JsonFormatter jsonFormatterPretty = new JsonFormatter(JsonFormatter.Settings.Default.WithIndentation());
[Benchmark]
public string JsonFormatterTest()
{
var value = GetMessage();
return JsonFormatter.Default.Format(value);
}
[Benchmark]
public string JsonFormatterPrettyTest()
{
var value = GetMessage();
return jsonFormatterPretty.Format(value);
}
[Benchmark]
public string JsonFormatterBeforeTest()
{
var value = GetMessage();
return JsonFormatterBefore.Default.Format(value);
}
private static NestedTestAllTypes GetMessage()
{
return new NestedTestAllTypes
{
Payload = new TestAllTypes
{
SingleBool = true,
SingleInt32 = 100,
SingleString = "multiple fields",
RepeatedString = { "string1", "string2" },
},
Child = new NestedTestAllTypes
{
Payload = new TestAllTypes
{
SingleString = "single field",
},
},
RepeatedChild =
{
new NestedTestAllTypes { Payload = new TestAllTypes { SingleString = "child 1", RepeatedString = { "string" } } },
new NestedTestAllTypes { Payload = new TestAllTypes { SingleString = "child 2" } },
},
};
}
}
} |
Hi @elharo, thank you for running the pipelines. Is there anything I should still do, or will a reviewer be assigned at some point from now? |
Hey @jskeet! Would you be interested in reviewing or merging this PR? Cheers~ |
@boukeversteegh: This is something for the protobuf team to consider. In particular, I would look at whether it's supported by other languages (and particularly Java as the closest cousin). |
Alright! Here are the languages that seem to support indentation at the moment:
|
Okay, that's good to know. I'll have a look at the implementation when I can, but I can't make any promises. (This isn't a library I work on day-to-day.) |
FormatInternal(message, writer, depth: 0); | ||
} | ||
|
||
private void FormatInternal(IMessage message, TextWriter writer, int depth) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To calculate indentation, Format
needs to know the current depth.
I added a private overload to make sure the public API is not changed.
However, it would be useful to make it public, to allow embedding the json inside another json string (which is what I need personally). If the reviewer sees no objection I would prefer to add a public overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be okay, yes - the docs for depth
will just need to be really clear. I think it's also entirely reasonable for the docs for the existing method to say (probably in a remarks element) that it delegates to this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the overload public, with <remark>
explaining the delegation.
Renamed depth
to indentationLevel
for clarity
@@ -356,7 +362,8 @@ private static bool IsDefaultValue(FieldDescriptor descriptor, object value) | |||
/// </summary> | |||
/// <param name="writer">The writer to write the value to. Must not be null.</param> | |||
/// <param name="value">The value to write. May be null.</param> | |||
public void WriteValue(TextWriter writer, object value) | |||
/// <param name="depth">The current indentation depth. Optional when writing non-indented values.</param> | |||
public void WriteValue(TextWriter writer, object value, int depth = 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the time of writing, I had not realized this method was public, so the public API is changed unintentionally.
This could be breaking when WriteValue
is called through reflection.
Let me know if this should be avoided, and a new private overload is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please introduce a new private overload - or even a public one (without the parameter being optional). We definitely, definitely don't want to break binary compatibility. (If it were only a problem calling via reflection, I wouldn't be too bothered - but code using the new version uses a library which in turn uses the old version, this could fail without any reflection being involved.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binary compatibility is a great point that I hadn't even thought of. Fixed it!
ec0e840
to
787395b
Compare
Hi @jskeet, thank you for remembering this!
|
Apologies for not responding earlier - I didn't get notified of your comment for some reason. I'll try to get this finalized ASAP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nits - looks great otherwise. (Apologies if they're nits I should have caught before.) Given that they're all formatting etc, let me know if you'd prefer me to get this merged then create a new PR that just fixes the nits.
Argh, I failed to notice before - as you've added a file (JsonFormatterSettingsTest.cs) you'll need to modify |
Fixed the nits and the missing file. I don't mind to keep iterating until it's good, but if there's anything trivial left, and you find it easier to apply the change yourself, please don't refrain out of courtesy. I'd be glad if this branch can get cleared out one way or another :-) |
Rerunning Kokoro now - fingers crossed. And thanks for the go-ahead to add fixes myself - will do that if there's anything more. I've worked out why I wasn't being notified, too - I'd accidentally muted the thread in gmail. Now unmuted, so you shouldn't get radio silence from me any more :) |
Hmm... I suspect some of the failures may be due to the branch being out of date and hard to rebase. My plan is to wait until tomorrow (as I've only got an hour now, and this might take longer than that), reset and recreate the change as a single commit, then create a new PR against my own repo just so that I don't cause any problems on your branch first. If that works, I'll push the "single commit rebased against main" commit on your branch (hopefully so it shows us both as authors) and we can both double check it in this PR. Once we're both happy and it's green, we can get it in. Does that sound about right? |
Strange, I had already squashed and rebased everything from the previous review round, as it became hard to preserve every commit. I'm fine with squashing everything once again and then rebasing, if you think it helps. Tomorrow evening I'll have time, but no need to wait for me to do it. If you proceed though, please do make sure to preserve the commit author, as for me it's a great honor to contribute to a google repository and I would like the event recorded in its history. |
Indeed - my local copy of your branch was out of date. I thought I'd fetched it, but a conflict kept it from being updated. It looks like it's still 75 commits behind main - but it does rebase cleanly. Unfortunately I don't have permission to push to your branch, but I've created #10315 which is literally just "rebase main, git push" and which I'm using to run Kokoro test. I'll try to get that green (it may already go green from the start). We'll get there :) |
Okay, the good news is that my other PR is green - so hopefully if you just fetch main, rebase to it, then force push again, then I should be able to rerun kokoro and get green lights. Sorry for all the administrivia involved, but we're getting there! |
f6072ee
to
97ffad8
Compare
97ffad8
to
69dcd1e
Compare
Alright! Squashed and rebased again. Still as optimistic as you are! |
Kokoro is green - hooray! @fowles please could you run the maintainer workflow and merge this if everything passes? |
Yay! Thank you so much for your help getting this through! |
Any ideas when this will arrive in a published Nuget package? I am currently on 3.21.7 and it is not there. |
@kwaclaw: I wouldn't expect it until 3.22.0. I don't have a timeframe for when that will come out. |
This adds indented formatting (pretty printing) support for c#.
Notes
JsonFormatter.Settings.WithIndentation(string indentation = " ")
.JsonFormatter.Format
that specify the initial indentation level, for embedding the json inside a nested json string.WithIndentation
is called.Usage
Output:
Feedback welcome!