Performance regression when deserializing compressed binary XML streams using data contract serializers #75437

Open
dev991301 opened this issue Sep 12, 2022 · 11 comments
Labels
area-System.IO documentation Documentation bug or enhancement, does not impact product or test code needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration tenet-performance Performance related issue

@dev991301

Description

Compared to .NET Framework 4.8, BenchmarkDotNet shows that execution time has greatly increased in .NET 6/7 when deserializing compressed binary XML streams with data contract serializers for large collections of objects (1,000 or more). For example, on my configuration, deserializing 10,000 objects is around 200 times slower in .NET 6/7 than in .NET Framework 4.8.

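Data Contract Code

using System.Collections.Generic;
using System.Runtime.Serialization;

// IsReference = true preserves object identity, so the Parent/Children cycle round-trips.
[DataContract(IsReference = true)]
public class Person
{
    [DataMember]
    public Person Parent { get; set; }

    [DataMember]
    public List<Person> Children { get; set; }
}
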
Benchmark Code

using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Runtime.Serialization;
using System.Xml;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

[SimpleJob(RuntimeMoniker.Net48, baseline: true)]
[SimpleJob(RuntimeMoniker.Net60)]
[SimpleJob(RuntimeMoniker.Net70)]
public class Benchmark
{
    // Total number of Person objects in the graph (one root plus N - 1 children).
    [Params(10, 100, 1000, 10000)]
    public int N { get; set; }

    private byte[] _serialized;
    private readonly DataContractSerializer _serializer = new(typeof(Person));

    // Builds the object graph once per N and stores it as Deflate-compressed binary XML.
    [GlobalSetup]
    public void Serialize()
    {
        var person = new Person();
        person.Children = Enumerable.Range(0, N - 1).Select(_ => new Person { Parent = person }).ToList();

        using var compressed = new MemoryStream();
        using var compressor = new DeflateStream(compressed, CompressionMode.Compress);

        using var writer = XmlDictionaryWriter.CreateBinaryWriter(compressor);
        _serializer.WriteObject(writer, person);
        // Closing the writer also closes the DeflateStream it owns, which flushes the final compressed block.
        writer.Close();

        _serialized = compressed.ToArray();
    }

    // Measured path: the binary XML reader reads directly from the DeflateStream.
    [Benchmark]
    public object Deserialize()
    {
        using var compressed = new MemoryStream(_serialized);
        using var decompressor = new DeflateStream(compressed, CompressionMode.Decompress);

        using var reader = XmlDictionaryReader.CreateBinaryReader(decompressor, XmlDictionaryReaderQuotas.Max);

        return _serializer.ReadObject(reader);
    }
}

Configuration

BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22000.856/21H2)
11th Gen Intel Core i7-11800H 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.100-preview.7.22377.5
  [Host]             : .NET 7.0.0 (7.0.22.37506), X64 RyuJIT AVX2
  .NET 6.0           : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT AVX2
  .NET 7.0           : .NET 7.0.0 (7.0.22.37506), X64 RyuJIT AVX2
  .NET Framework 4.8 : .NET Framework 4.8 (4.8.4510.0), X64 RyuJIT VectorSize=256

Data

| Method | Job | Runtime | N | Mean | Error | StdDev | Ratio | RatioSD |
|---|---|---|---:|---:|---:|---:|---:|---:|
| Deserialize | .NET 6.0 | .NET 6.0 | 10 | 29.06us | 0.256us | 0.227us | 0.82 | 0.01 |
| Deserialize | .NET 7.0 | .NET 7.0 | 10 | 28.99us | 0.318us | 0.297us | 0.82 | 0.01 |
| Deserialize | .NET Framework 4.8 | .NET Framework 4.8 | 10 | 35.42us | 0.253us | 0.211us | 1.00 | 0.00 |
| Deserialize | .NET 6.0 | .NET 6.0 | 100 | 323.17us | 2.265us | 2.008us | 1.19 | 0.01 |
| Deserialize | .NET 7.0 | .NET 7.0 | 100 | 319.55us | 2.648us | 2.348us | 1.17 | 0.01 |
| Deserialize | .NET Framework 4.8 | .NET Framework 4.8 | 100 | 272.22us | 2.486us | 2.204us | 1.00 | 0.00 |
| Deserialize | .NET 6.0 | .NET 6.0 | 1000 | 310,428.48us | 914.629us | 810.795us | 116.82 | 0.81 |
| Deserialize | .NET 7.0 | .NET 7.0 | 1000 | 312,653.03us | 2,976.487us | 2,638.578us | 117.73 | 1.04 |
| Deserialize | .NET Framework 4.8 | .NET Framework 4.8 | 1000 | 2,657.57us | 18.909us | 15.790us | 1.00 | 0.00 |
| Deserialize | .NET 6.0 | .NET 6.0 | 10000 | 6,167,079.07us | 28,492.478us | 26,651.882us | 214.41 | 2.02 |
| Deserialize | .NET 7.0 | .NET 7.0 | 10000 | 6,140,992.62us | 13,397.507us | 11,187.525us | 213.47 | 2.17 |
| Deserialize | .NET Framework 4.8 | .NET Framework 4.8 | 10000 | 28,765.16us | 293.030us | 274.100us | 1.00 | 0.00 |

@dev991301 dev991301 added the tenet-performance Performance related issue label Sep 12, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 12, 2022
@filipnavara
Member

Can you try wrapping the DeflateStream with BufferedStream? I have seen half a dozen similar regressions due to the fact that the buffering works differently. For consumers that use small reads it could often make a huge difference.
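
To illustrate why small reads matter, here is a minimal sketch that counts how many `Read` calls reach an underlying stream when a consumer reads one byte at a time, with and without a `BufferedStream` in between. `CountingStream` and `ReadCallCounter` are hypothetical names introduced only for this illustration, and a `MemoryStream` stands in for the `DeflateStream`, so this shows only the call pattern, not the actual decompression cost.

```csharp
using System;
using System.IO;

// Hypothetical wrapper (not part of .NET) that counts Read calls reaching the inner stream.
public sealed class CountingStream : Stream
{
    private readonly Stream _inner;

    public int ReadCalls { get; private set; }

    public CountingStream(Stream inner) => _inner = inner;

    public override int Read(byte[] buffer, int offset, int count)
    {
        ReadCalls++;
        return _inner.Read(buffer, offset, count);
    }

    // Read-only, forward-only stream: everything else is unsupported.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class ReadCallCounter
{
    // Reads totalBytes one byte at a time and returns how many Read calls hit the inner stream.
    // Pass null for bufferSize to read without a BufferedStream.
    public static int CountInnerReads(int totalBytes, int? bufferSize)
    {
        var counting = new CountingStream(new MemoryStream(new byte[totalBytes]));
        Stream source = bufferSize.HasValue
            ? new BufferedStream(counting, bufferSize.Value)
            : counting;

        using (source)
        {
            while (source.ReadByte() != -1) { }
        }

        return counting.ReadCalls;
    }
}
```

Comparing `ReadCallCounter.CountInnerReads(4096, null)` with `ReadCallCounter.CountInnerReads(4096, 1024)` should show far fewer inner reads in the buffered case, because each refill of the buffer serves many one-byte reads.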

@stephentoub
Member

Yes, this is the same as #39233. Here Deserialize is your benchmark and Deserialize2 is your benchmark but with a BufferedStream added in between:

| Method | Runtime | N | Mean | Ratio |
|---|---|---:|---:|---:|
| Deserialize | .NET Framework 4.8 | 10000 | 37.77 ms | 1.00 |
| Deserialize | .NET 6.0 | 10000 | 381.80 ms | 10.10 |
| Deserialize2 | .NET Framework 4.8 | 10000 | 14.35 ms | 1.00 |
| Deserialize2 | .NET 6.0 | 10000 | 11.17 ms | 0.78 |
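
The code for Deserialize2 was not posted in the thread. A minimal sketch of what such a variant might look like, assuming the only change is a `BufferedStream` between the `DeflateStream` and the binary XML reader, is shown below. The class name, the method signature, and the 80 KB buffer size are assumptions made for illustration, not values taken from the benchmark.

```csharp
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;
using System.Xml;

public static class BufferedDeserializeSketch
{
    // Hypothetical helper: deserializes Deflate-compressed binary XML, reading through a buffer.
    // bufferSize is an assumed value; tune it for your own payloads.
    public static object Deserialize(DataContractSerializer serializer, byte[] serialized, int bufferSize = 81920)
    {
        using var compressed = new MemoryStream(serialized);
        using var decompressor = new DeflateStream(compressed, CompressionMode.Decompress);

        // The reader's many small reads are served from this buffer,
        // so the DeflateStream is asked for data in larger chunks.
        using var buffered = new BufferedStream(decompressor, bufferSize);

        using var reader = XmlDictionaryReader.CreateBinaryReader(buffered, XmlDictionaryReaderQuotas.Max);

        return serializer.ReadObject(reader);
    }
}
```

In the original benchmark this would correspond to passing `_serializer` and `_serialized` to the helper from a second `[Benchmark]` method.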

@dev991301
Author

Thank you for the help. Using the BufferedStream does indeed significantly improve the performance.

I am porting a large code base from .NET Framework 4.8 to .NET 6 and came across this issue. I am not sure if there are other issues that could have negative performance implications when porting from Framework. Is there documentation I should have read that highlights potential performance regressions when porting from Framework and the solutions?

Thanks again for your help.

@danmoseley
Member

danmoseley commented Sep 13, 2022

@PriyaPurkayastha do we have a place for such info? @StephenBonikowsky I know such migration bumps are something you have an interest in also. This is similar to the discussion we had in #72266 - where do we record speed bumps that aren't breaking by a stricter definition. Cc @ericstj

@PriyaPurkayastha

@danmoseley I am aware of "What's New in .NET 6", "Known Issues" and "Breaking changes" published per release. It appears that nobody is comfortable using the existing documentation channels for such issues, so we might just need to figure out a path forward. Since such speed bumps might be spread out over different Fundamentals areas, I think it would be valuable to hear the opinions and thoughts of the respective Fundamentals area owners. For example, this one, as well as #72266, would be something that we can talk to @sblom about. I will start a discussion on this and also include the documentation team, since they have valuable input as well.
cc @marklio

@stephentoub
Member

> I am aware of "What's New in .NET 6", "Known Issues" and "Breaking changes" published per release. It appears that nobody is comfortable using the existing documentation channels for such issues,

I think this makes sense as a known issue; we certainly didn't want this particular usage to be so much slower, and it happened because of a change in one of our dependencies. The downside is that it'll likely remain a known issue across several releases, for as long as we're using that dependency and it keeps making the same tradeoffs it currently does.

@ghost

ghost commented Sep 22, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: dev991301
Assignees: -
Labels: area-Serialization, area-System.IO, tenet-performance, untriaged

Milestone: -

@StephenMolloy
Member

Moving this to System.IO so the appropriate team can decide how they want to document this recommendation for BufferedStream when moving from 6.0 -> 7.0.

@jozkee jozkee added this to the 7.0.0 milestone Sep 27, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Sep 27, 2022
@jozkee jozkee added untriaged New issue has not been triaged by the area owner needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Sep 27, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Sep 27, 2022
@jozkee jozkee added the documentation Documentation bug or enhancement, does not impact product or test code label Sep 27, 2022
@jozkee
Member

jozkee commented Sep 27, 2022

I assume this regression has been around since we bumped zlib to v1.2.11 (dotnet/corefx#32732)?

@jozkee jozkee modified the milestones: 7.0.0, Future Sep 27, 2022
@stephentoub
Member

yes

8 participants