-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug in System.IO.Compression.Inflater or the native zlib dependency - invalid distance too far back. #50235
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
@zlatanov, I tried your repro on both .NET 5 and .NET 6 recent bits, and both ran successfully without exception for me on Windows. What version are you testing against? On what OS? I can try on Linux as well. |
I just tried as well on Ubuntu and got "System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method." |
@stephentoub I'm sorry, I attached the working version of the code. On line 19, change the rent'ed buffer size to be 512 or 1024 instead of 2048 please. I will reattach the project. |
How did you create this data file? |
Using using System;
using System.Buffers;
using System.IO;
using System.IO.Compression;
using System.Reflection;
class Program
{
static void Main()
{
const BindingFlags flags = BindingFlags.Instance | BindingFlags.NonPublic;
var constructor = typeof(DeflateStream).GetConstructor(flags, binder: null,
types: new Type[] { typeof(Stream), typeof(CompressionMode), typeof(bool), typeof(int), typeof(long) }, modifiers: null);
var stream = new MemoryStream();
using (var inflator = (DeflateStream)constructor.Invoke(new object[] {
stream, CompressionMode.Compress, /*leaveOpen*/true, /*windowBits*/-10, /*uncompressedSize*/-1L }))
{
var data = new byte[ushort.MaxValue];
new Random(0).NextBytes(data);
inflator.Write(data);
}
stream.Position = 0;
using (var deflator = (DeflateStream)constructor.Invoke(new object[] {
stream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-10, /*uncompressedSize*/-1L }))
{
typeof(DeflateStream).GetField("_buffer", flags).SetValue(deflator, ArrayPool<byte>.Shared.Rent(512));
var buffer = new byte[512];
var count = 0;
while (true)
{
var byteCount = deflator.Read(buffer);
count += byteCount;
if (count == ushort.MaxValue)
{
break;
}
}
}
}
} |
@carlossanlop @adamsitnik @jozkee The same error also appeared several times during WebSocket compression testing with https://github.com/crossbario/autobahn-testsuite |
I have created https://github.com/zlatanov/websockets-compliance repository with instructions how to run the testsuite. To demonstrate the issue we can look only at one of the cases which is pretty simple: Case 13.3.1 (id: 428) Running the project in the aforementioned repository as follows: Not all cases have this problem, only some of them, and only those which use 9 window bits. I cannot confidently say that this isn't an issue with the way I use the deflater, it's just that I can't find the reason for it. |
Here is a minimal repro with data captured from the testsuite for test case 428. using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Reflection;
class Program
{
static readonly ConstructorInfo Constructor = typeof(DeflateStream).GetConstructor(BindingFlags.Instance | BindingFlags.NonPublic, binder: null,
types: new Type[] { typeof(Stream), typeof(CompressionMode), typeof(bool), typeof(int), typeof(long) }, modifiers: null);
static void Main()
{
Working();
Failing();
}
static void Working()
{
var memoryStream = new MemoryStream(Compress().SelectMany(x => x).ToArray());
var deflate = (DeflateStream)Constructor.Invoke(new object[] {
memoryStream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-9, /*uncompressedSize*/-1L });
var buffer = new byte[1024];
while (true)
{
var bytesRead = deflate.Read(buffer);
if (bytesRead == 0)
{
break;
}
}
Console.WriteLine("Works");
}
static void Failing()
{
var memoryStream = new MemoryStream();
var deflate = (DeflateStream)Constructor.Invoke(new object[] {
memoryStream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-9, /*uncompressedSize*/-1L });
var buffer = new byte[1024];
foreach (var segment in Compress())
{
memoryStream.Write(segment);
memoryStream.Position -= segment.Length;
deflate.Read(buffer);
}
Console.WriteLine("Should not see this...");
}
static List<byte[]> Compress()
{
var stream = new MemoryStream();
var result = new List<byte[]>();
using var inflator = (DeflateStream)Constructor.Invoke(new object[] {
stream, CompressionMode.Compress, /*leaveOpen*/true, /*windowBits*/-9, /*uncompressedSize*/-1L });
var reader = new StringReader(Data);
var line = reader.ReadLine();
while (line is not null)
{
inflator.Write(Convert.FromHexString(line));
inflator.Flush();
result.Add(stream.ToArray());
stream.SetLength(0);
line = reader.ReadLine();
}
return result;
}
const string Data =
@"7B0A202020224175746F6261686E5079
74686F6E2F302E362E30223A207B0A20
202020202022312E312E31223A207B0A
20202020202020202022626568617669
6F72223A20224F4B222C0A2020202020
20202020226265686176696F72436C6F
7365223A20224F4B222C0A2020202020
20202020226475726174696F6E223A20
322C0A2020202020202020202272656D
6F7465436C6F7365436F6465223A2031
3030302C0A2020202020202020202272
65706F727466696C65223A2022617574
6F6261686E707974686F6E5F305F365F
305F636173655F315F315F312E6A736F
6E220A2020202020207D2C0A20202020
202022312E312E32223A207B0A202020
202020202020226265686176696F7222
3A20224F4B222C0A2020202020202020
20226265686176696F72436C6F736522
3A20224F4B222C0A2020202020202020
20226475726174696F6E223A20322C0A
2020202020202020202272656D6F7465
436C6F7365436F6465223A2031303030
2C0A202020202020202020227265706F
727466696C65223A20226175746F6261
686E707974686F6E5F305F365F305F63
6173655F315F315F322E6A736F6E220A
2020202020207D2C0A20202020202022
312E312E33223A207B0A202020202020
202020226265686176696F72223A2022
4F4B222C0A2020202020202020202262
65686176696F72436C6F7365223A2022
4F4B222C0A2020202020202020202264
75726174696F6E223A20322C0A202020
2020202020202272656D6F7465436C6F
7365436F6465223A20313030302C0A20
2020202020202020227265706F727466
696C65223A20226175746F6261686E70
7974686F6E5F305F365F305F63617365
5F315F315F332E6A736F6E220A202020
2020207D2C0A20202020202022312E31
2E34223A207B0A202020202020202020
226265686176696F72223A20224F4B22
2C0A2020202020202020202262656861
76696F72436C6F7365223A20224F4B22
2C0A2020202020202020202264757261
74696F6E223A20322C0A202020202020";
} The difference between the working and the failing method is that in the first I am reading everything at once. In the second I am reading line by line. |
I can confirm this only happens with the intel version of the zlib library. I've attached a c++ project that replicates the test case above. I've included in the project the source files from I've added a project property that enables to build with either one of the versions - However, if we change the project to use the Intel's build, we should see this: I've written the main function of the project so that it's easy to set a breakpoint where things go wrong (see line 29): This should confirm that there is indeed a problem with zlib and not the websockets source code. /cc @CarnaViire, @stephentoub |
|
@stephentoub After looking around I found this: runtime/src/libraries/Native/Windows/System.IO.Compression.Native/zlib-intel/deflate_medium.c Line 233 in 01b7e73
runtime/src/libraries/Native/Windows/System.IO.Compression.Native/zlib-intel/deflate_medium.c Line 268 in 01b7e73
If we replace the usage of the I am also unsure if this is correct too: runtime/src/libraries/Native/Windows/System.IO.Compression.Native/zlib-intel/deflate_medium.c Line 145 in 01b7e73
To me it seems this would work only when the window size is 32KB. |
cc @jtkukunas for zlib-intel |
Now that the PR for this bug has been merged in the original repo, how do you suggest we proceed so we can bring it here as well? Should I create a PR for it here, or do you have other procedure for keeping the zlib-intel version in sync with the original repo? |
I can produce a new release of intel zlib, which will make it easier to consume. |
A new release will give a tag to reference. Last time I did this I think I used patch files to port the changes. @vkvenkat set it up originally IIRC. |
@jkotas You mentioned here #49194 (comment) that the sources are somehow refreshed once in a while. Is it possibly an automated process for updating? Can we just run it after @jtkukunas creates a new release? |
It is a question for area-System.IO.Compression area owners - @adamsitnik @carlossanlop @jozkee according to https://github.com/dotnet/runtime/blob/main/docs/area-owners.md - on whether they believe that automating this process would be a net win long term. |
Ok, I just got an impression that an automated process exists. If there's none, I guess we'd have to update the sources manually... |
No, nothing automatic: changes don't happen often. Don't do it "manually", create patch files for the last commits and apply them in a PR. |
This is related to the work I am doing to enable compression in WebSockets (here). I have written several tests that make sure the websocket works well with all supported window sizes. To test edge cases I also have created data, which when compressed results in bigger payload.
All tests work fine, except one and I isolated the issue to the following reproducible code:
Ignore the reflection, in
System.Net.WebSockets
we have access to theInterop.zlib.cs
, but it would not be easy to write code that shows the problem without a lot of boilerplate stuff.The code in the example will throw an exception, and the native zlib error will be "invalid distance too far back". If we change the size of the
_buffer
to 2048, the error disappears and the data is inflated without a problem.The default buffer size for
DeflateStream
is 8KB and I so far have been unable to create data that would cause the same error, but nevertheless I think there is a bug somewhere.In the
WebSocket
right now I have dynamic buffer size, depending on how big the user buffer is. I can easily use 8KB as minimum, but I think this would only hide the problem or make it rarer.//cc @CarnaViire, @carlossanlop, @stephentoub
ConsoleApp.zip
The text was updated successfully, but these errors were encountered: