Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzer: Recommend upgrading to CompositeFormat for enhanced performance #85525

Closed
geeknoid opened this issue Apr 28, 2023 · 10 comments
Closed
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer help wanted [up-for-grabs] Good issue for external contributors partner-impact This issue impacts a partner who needs to be kept updated
Milestone

Comments

@geeknoid
Copy link
Member

geeknoid commented Apr 28, 2023

.NET 8 introduces the CompositeFormat abstraction which can substantially improve performance of code that does a lot of formatting (we have some services that spend their day doing this). Having a performance-centric analyzer discovering uses of String.Format/StringBuilder.AppendFormat and recommending updates to CompositeFormat would help the ecosystem evolve. A fixer here should be relatively simple.

Category : Performance
Severity : Hidden

@geeknoid geeknoid added code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer labels Apr 28, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Apr 28, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Apr 28, 2023
@geeknoid
Copy link
Member Author

Here's the core part of our existing analyzer around recommending string format -> composite format. It's not a full analyzer class since it's built around some higher-level convenience wrappers we use for analyzers, but it has the core logic.

internal sealed class StringFormat
{
    public StringFormat(CallAnalyzer.Registrar reg)
    {
        foreach (var method in reg.Compilation.GetSpecialType(SpecialType.System_String).GetMembers("Format").OfType<IMethodSymbol>())
        {
            reg.RegisterMethod(method, Handle);
        }

        var type = reg.Compilation.GetTypeByMetadataName("System.Text.StringBuilder");
        if (type != null)
        {
            foreach (var method in type.GetMembers("AppendFormat").OfType<IMethodSymbol>())
            {
                reg.RegisterMethod(method, Handle);
            }
        }

        static void Handle(OperationAnalysisContext context, IInvocationOperation op)
        {
            var format = GetFormatArgument(op);
            if (format.ChildNodes().First().IsKind(SyntaxKind.StringLiteralExpression))
            {
                var properties = new Dictionary<string, string?>();
                if (op.TargetMethod.Name == "Format")
                {
                    properties.Add("StringFormat", null);
                }

                var diagnostic = Diagnostic.Create(DiagDescriptors.StringFormat, op.Syntax.GetLocation(), properties.ToImmutableDictionary());
                context.ReportDiagnostic(diagnostic);
            }

            static SyntaxNode GetFormatArgument(IInvocationOperation invocation)
            {
                var sm = invocation.SemanticModel!;
                var arguments = invocation.Arguments;
                var typeInfo = sm.GetTypeInfo(arguments[0].Syntax.ChildNodes().First());

                // This check is needed to identify exactly which argument of string.Format is the format argument
                // The format might be passed as first or second argument
                // if there are more than 1 arguments and first argument is IFormatProvider then format is second argument otherwise it is first
                if (arguments.Length > 1 && typeInfo.Type != null && typeInfo.Type.AllInterfaces.Any(i => i.MetadataName == "IFormatProvider"))
                {
                    return arguments[1].Syntax;
                }

                return arguments[0].Syntax;
            }
        }
    }

@geeknoid
Copy link
Member Author

And here's a corresponding fixer:

[ExportCodeFixProvider(LanguageNames.CSharp, Name = nameof(StringFormatFixer))]
[Shared]
public sealed class StringFormatFixer : CodeFixProvider
{
    private const string TargetClass = "CompositeFormat";
    private const string TargetMethod = "Format";
    private const string VariableName = "_sf";
    private const int ArgumentsToSkip = 2;
    private static readonly IdentifierNameSyntax _textNamespace = SyntaxFactory.IdentifierName("Microsoft.R9.Extensions.Text");

    /// <inheritdoc/>
    public override ImmutableArray<string> FixableDiagnosticIds => ImmutableArray.Create(DiagDescriptors.StringFormat.Id);

    /// <inheritdoc/>
    public override FixAllProvider GetFixAllProvider() => WellKnownFixAllProviders.BatchFixer;

    /// <inheritdoc/>
    public override Task RegisterCodeFixesAsync(CodeFixContext context)
    {
        var diagnostics = context.Diagnostics.First();
        context.RegisterCodeFix(
               CodeAction.Create(
                   title: Resources.StringFormatTitle,
                   createChangedDocument: cancellationToken => ApplyFixAsync(context.Document, diagnostics.Location, diagnostics.Properties, cancellationToken),
                   equivalenceKey: nameof(Resources.StringFormatTitle)),
               context.Diagnostics);

        return Task.CompletedTask;
    }

    private static async Task<Document> ApplyFixAsync(Document document, Location diagnosticLocation, IReadOnlyDictionary<string, string?> properties, CancellationToken cancellationToken)
    {
        var editor = await DocumentEditor.CreateAsync(document, cancellationToken).ConfigureAwait(false);
        if (editor.OriginalRoot.FindNode(diagnosticLocation.SourceSpan) is InvocationExpressionSyntax expression)
        {
            var classDeclaration = GetTypeDeclaration(expression);
            if (classDeclaration != null)
            {
                var (format, argList) = GetFormatAndArguments(editor, expression);
                var formatKind = format.ChildNodes().First().Kind();

                if (formatKind is SyntaxKind.StringLiteralExpression)
                {
                    var (identifier, field) = GetFieldDeclaration(editor, classDeclaration, format);
                    var invocation = properties.ContainsKey("StringFormat")
                        ? CreateInvocationExpression(editor, identifier, argList.Arguments, expression)
                        : GetStringBuilderExpression(editor, identifier, argList.Arguments, expression);
                    ApplyChanges(editor, expression, invocation, classDeclaration, field);
                }
            }
        }

        return editor.GetChangedDocument();
    }

    private static TypeDeclarationSyntax? GetTypeDeclaration(SyntaxNode node)
    {
        return node.FirstAncestorOrSelf<TypeDeclarationSyntax>(n => n.IsKind(SyntaxKind.ClassDeclaration) || n.IsKind(SyntaxKind.StructDeclaration));
    }

    private static (string identifier, FieldDeclarationSyntax? field) GetFieldDeclaration(SyntaxEditor editor, SyntaxNode classDeclaration, SyntaxNode format)
    {
        var members = classDeclaration.DescendantNodes().OfType<FieldDeclarationSyntax>();
        int numberOfMembers = 1;

        var strExp = format.ToString();

        var arguments = SyntaxFactory.Argument(SyntaxFactory.ParseExpression(strExp));
        HashSet<string> fields = new HashSet<string>();

        foreach (var member in members)
        {
            var fieldName = member.DescendantNodes().OfType<VariableDeclaratorSyntax>().First().Identifier.ToString();
            _ = fields.Add(fieldName);

            if (member.Declaration.Type.ToString() == "CompositeFormat")
            {
                if (member.DescendantNodes().OfType<ObjectCreationExpressionSyntax>().First().ArgumentList!.Arguments.First().ToString() == strExp)
                {
                    return (member.DescendantNodes().OfType<VariableDeclaratorSyntax>().First().Identifier.ToString(), null);
                }

                numberOfMembers++;
            }
        }

        string variableName;
        do
        {
            variableName = $"{VariableName}{numberOfMembers}";
            numberOfMembers++;
        }
        while (!IsFieldNameAvailable(fields, variableName));

        return (variableName, editor.Generator.FieldDeclaration(
                        variableName,
                        SyntaxFactory.ParseTypeName(TargetClass),
                        Accessibility.Private,
                        DeclarationModifiers.Static | DeclarationModifiers.ReadOnly,
                        SyntaxFactory.ObjectCreationExpression(
                                SyntaxFactory.Token(SyntaxKind.NewKeyword),
                                SyntaxFactory.IdentifierName(TargetClass),
                                SyntaxFactory.ArgumentList().AddArguments(arguments),
                                null)) as FieldDeclarationSyntax);
    }

    private static bool IsFieldNameAvailable(ICollection<string> fields, string field)
    {
        return !fields.Contains(field);
    }

    private static (ArgumentSyntax argument, ArgumentListSyntax argumentList) GetFormatAndArguments(DocumentEditor editor, InvocationExpressionSyntax invocation)
    {
        var arguments = invocation.ArgumentList.Arguments;
        var first = arguments[0];
        var typeInfo = editor.SemanticModel.GetTypeInfo(first.ChildNodes().First());
        SeparatedSyntaxList<ArgumentSyntax> separatedList;
        if (arguments.Count > 1 && typeInfo.Type!.AllInterfaces.Any(i => i.MetadataName == "IFormatProvider"))
        {
            separatedList = SyntaxFactory.SingletonSeparatedList(first).AddRange(arguments.Skip(ArgumentsToSkip));
            return (arguments[1], SyntaxFactory.ArgumentList(separatedList));
        }

        var nullArgument = SyntaxFactory.Argument(SyntaxFactory.LiteralExpression(SyntaxKind.NullLiteralExpression));
        separatedList = SyntaxFactory.SingletonSeparatedList(nullArgument).AddRange(arguments.Skip(1));
        return (first, SyntaxFactory.ArgumentList(separatedList));
    }

    private static SyntaxNode CreateInvocationExpression(SyntaxEditor editor, string identifierName, IEnumerable<SyntaxNode> arguments, SyntaxNode invocation)
    {
        var gen = editor.Generator;
        var identifier = gen.IdentifierName(identifierName);
        var memberAccessExpression = gen.MemberAccessExpression(identifier, TargetMethod);
        return gen.InvocationExpression(memberAccessExpression, arguments).WithTriviaFrom(invocation);
    }

    private static void ApplyChanges(SyntaxEditor editor, SyntaxNode oldInvocation, SyntaxNode newInvocation, SyntaxNode classDeclaration, SyntaxNode? field)
    {
        if (field != null)
        {
            editor.AddMember(classDeclaration, field);
        }

        editor.ReplaceNode(oldInvocation, newInvocation);
        editor.TryAddUsingDirective(_textNamespace);
    }

    private static SyntaxNode GetStringBuilderExpression(SyntaxEditor editor, string identifierName, IEnumerable<ArgumentSyntax> arguments, InvocationExpressionSyntax invocation)
    {
        var gen = editor.Generator;
        var identifier = gen.IdentifierName(identifierName);
        var memberAccessExpression = gen.Argument(identifier);
        var list = SyntaxFactory.SingletonSeparatedList(memberAccessExpression).AddRange(arguments);
        return invocation.WithArgumentList(SyntaxFactory.ArgumentList(list));
    }
}

@ghost
Copy link

ghost commented Apr 28, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

.NET 8 introduces the CompositeFormat abstraction which can substantially improve performance of code that does a lot of formatting (we have some services that spend their day doing this). Having a performance-centric analyzer discovering uses of String.Format/StringBuilder.AppendFormat and recommending updates to CompositeFormat would help the ecosystem evolve. A fixer here should be relatively simple.

Author: geeknoid
Assignees: -
Labels:

area-System.Runtime, untriaged, code-analyzer, code-fixer, needs-area-label

Milestone: -

@jkotas jkotas removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Apr 28, 2023
@jeffhandley jeffhandley added api-suggestion Early API idea and discussion, it is NOT ready for implementation partner-impact This issue impacts a partner who needs to be kept updated labels May 17, 2023
@buyaa-n buyaa-n added this to the 8.0.0 milestone May 17, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label May 17, 2023
@jeffhandley
Copy link
Member

Triage note: We could consider having the/a fixer create the CompositeFormat lazily upon the first execution so that the static startup cost penalty isn't paid for exception-related call sites (or other call sites that might not get hit).

@buyaa-n
Copy link
Contributor

buyaa-n commented Jun 1, 2023

[Triage note:] It might be not a deoptimization for exception code paths, which is the majority of case where String.Format used, therefore we are not sure if the default severity should be info or hidden by default.

Check CompositeFormat APIs proposal for more info.

@buyaa-n buyaa-n added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jun 1, 2023
@bartonjs
Copy link
Member

bartonjs commented Jun 1, 2023

The intent of the analyzer is to replace code like

...
string formatted = string.Format(ResourceTable.SomeExampleMessage, count);

with

private static readonly CompositeFormat s_someExampleMessageFormat = new CompositeFormat(ResourceTable.SomeExampleMessage);
...

string formatted = s_someExampleMessageFormat.Format(count);

It would seemingly also apply to a literal format string, but literal formats should instead prefer being converted to interpolated strings.

@bartonjs
Copy link
Member

bartonjs commented Jun 1, 2023

Video

In discussion we identified three different cases, that should have three different diagnostic IDs:

  • The format string is a literal.
    • The suggested fix for this is to use an interpolated string.
    • We believe we've already approved (and perhaps implemented) this one, but we can't find it. If it doesn't exist, it's now approved.
    • Default visibility: Info
  • The format string comes from a static reference (a static property, method, or field).
    • The suggested fix is to introduce a static CompositeFormat field and change the string.Format calls to use that field.
    • Default visibility: Hidden
  • The format string comes from a const
    • Both fixers should be suggested.
    • There may be nuance... member-local consts, consts from a foreign assembly, ...
    • Default visibility: Hidden

Category: Performance

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jun 1, 2023
@buyaa-n buyaa-n added the help wanted [up-for-grabs] Good issue for external contributors label Jun 1, 2023
@stephentoub
Copy link
Member

The format string is a literal. The suggested fix for this is to use an interpolated string. We believe we've already approved (and perhaps implemented) this one, but we can't find it. If it doesn't exist, it's now approved. Default visibility: Info

This does exist as a built-in refactoring in VS. So we should simply skip the cases where this would trigger.

@stephentoub stephentoub self-assigned this Jun 6, 2023
@stephentoub stephentoub removed the help wanted [up-for-grabs] Good issue for external contributors label Jun 7, 2023
@stephentoub stephentoub removed their assignment Jun 7, 2023
@stephentoub stephentoub added the help wanted [up-for-grabs] Good issue for external contributors label Jun 7, 2023
@Youssef1313
Copy link
Member

This does exist as a built-in refactoring in VS

If it's only a refactoring, there is no way to enforce it in command-line builds, which isn't ideal if it's a performance analyzer rather than a code-style analyzer.

@stephentoub
Copy link
Member

stephentoub commented Jun 7, 2023

This does exist as a built-in refactoring in VS

If it's only a refactoring, there is no way to enforce it in command-line builds, which isn't ideal if it's a performance analyzer rather than a code-style analyzer.

Even so, we're not going to create another analyzer that does the same thing and leads to the same option twice in VS.

EDIT: Actually, thinking more about it, we can just have the additional diagnostic as off-by-default and without a fixer, leaving that up to the high-quality one in VS.

EDIT 2: I opened dotnet/roslyn#68469 instead. I don't want two of these.

@tannergooding tannergooding modified the milestones: 8.0.0, Future Jul 24, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Oct 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime code-analyzer Marks an issue that suggests a Roslyn analyzer code-fixer Marks an issue that suggests a Roslyn code fixer help wanted [up-for-grabs] Good issue for external contributors partner-impact This issue impacts a partner who needs to be kept updated
Projects
None yet
Development

No branches or pull requests

8 participants