Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

P/Invoke and GCHandle pinning regression on .NET 5.0.301 #55944

Closed
ied206 opened this issue Jul 19, 2021 · 6 comments
Closed

P/Invoke and GCHandle pinning regression on .NET 5.0.301 #55944

ied206 opened this issue Jul 19, 2021 · 6 comments

Comments

@ied206
Copy link

ied206 commented Jul 19, 2021

Description

Summary

Updating .NET runtime to 5.0.301 from 5.0.202 breaks the existing p/invoke code.

I have a zlib pinvoke library, Joveler.Compression.Zlib.
Its zlib inflate/deflate streaming API has broken since upgrading .NET SDK to 5.0.301.

I highly suspect a bug of .NET runtime in handling pinned GCHandle.

Details

zlib uses C struct z_stream as its context object, and requires the address of z_stream must not change while the object is alive. This is undocumented, but apparent on zlib's s->strm != strm check code. (s->strm is the zlib allocated address, and strm is parameter address).

Thus the wrapper must pin z_stream with GCHandle to prevent GC from moving the object.
If it doesn't, zlib deflate()/inflate() occasionally returns Z_STREAM_ERROR code, as GC changed the address of z_stream.

// This code worked prior to .NET SDK 5.0.301
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // GCHandle pinning
...
ZLibRet ret = ZLibInit.Lib.L64.Deflate(_zs64, ZLibFlush.NoFlush); // returns Z_STREAM_ERROR on .NET SDK 5.0.301

If I change the code not to pin GCHandle, its test code randomly fails on every .NET platform.

// This code randomly fails on .NET Framework 4.8, .NET Core 3.1, .NET 5.0
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // Does not pins GCHandle
...
ZLibRet ret = ZLibInit.Lib.L64.Deflate(_zs64, ZLibFlush.NoFlush); // randomly returns Z_STREAM_ERROR 

Therefore, it is highly suspected that there is a bug around pinned GCHandle handling since .NET SDK 5.0.301.

Configuration

  • .NET 5.0.301 on Windows x86, x64
  • .NET 5.0.302 on Ubuntu 20.04 x64

Regression?

The test code worked on every .NET platform prior to 5,0.301.
You can look at Azure Pipelines build log, which was performed on 2021-04-11 with .NET 5.0.202 SDK.

  • .NET 5.0
    • .NET 5.0.202 on Windows x64
    • .NET 5.0.202 on Ubuntu 20.04 x64
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 6.0)
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 4.1)
  • .NET Core 3.1
  • .NET Framework 4.8

Other information

Reproduce

To reproduce this issue, simply checkout Joveler.Compression repo and run the tests on .NET SDK 5.0.301 or higher.
The tests should fail like this:

image

To test GCHandle pinning effect, change L182 and L173 of ZLibStreams.cs like this:

// BEFORE (pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Pinned);

// AFTER (not pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Normal);

Why suspect GCHandle pinning?

Different from my zlib wrapper, my xz-utils wrapper and lz4 wrapper works fine on the latest .NET SDK. The only thing zlib wrapper does differently is the GCHandle pinning of context object (z_stream).

Credit to @Luzifix who noticed and reported test failure.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO.Compression untriaged New issue has not been triaged by the area owner labels Jul 19, 2021
@ghost
Copy link

ghost commented Jul 19, 2021

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Summary

Updating .NET runtime to 5.0.301 from 5.0.202 broke the existing p/invoke code.

I have written a zlib pinvoke library, Joveler.Compression.Zlib.
Its zlib inflate/deflate streaming API was broken after upgrading .NET SDK to 5.0.301.

I highly suspect a bug of .NET runtime in handling pinned GCHandle.

Details

zlib uses C struct z_stream as its context object, and requires the address of z_stream must not change while the object is alive. This is undocumented, but apparent on zlib's s->strm != strm check code. (s->strm is the zlib allocated address, and strm is parameter address).

Thus the wrapper must pin z_stream with GCHandle to prevent GC from moving the object.
If it doesn't, zlib deflate()/inflate() occasionally returns Z_STREAM_ERROR code, as GC changed the address of z_stream.

// This code worked prior to .NET SDK 5.0.301
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // GCHandle pinning
...
ZLibRet ret = ZLibInit.Lib.L32.Deflate(_zs32, ZLibFlush.NoFlush); // returns Z_STREAM_ERROR on .NET SDK 5.0.301

If I change the code not to pin GCHandle, its test code randomly fails on every .NET platform.

// This code randomly fails on .NET Framework 4.8, .NET Core 3.1, .NET 5.0
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // Does not pins GCHandle
...
ZLibRet ret = ZLibInit.Lib.L32.Deflate(_zs32, ZLibFlush.NoFlush); // randomly returns Z_STREAM_ERROR 

Thus I think there is a high possibility on .NET SDK 5.0.301 or higher has a bug with handling pinned GCHandle.

Configuration

  • .NET 5.0.301 on Windows x86, x64
  • .NET 5.0.302 on Ubuntu 20.04 x64

Regression?

The test code worked on every .NET platform prior to 5,0.203.
You can look at Azure Pipelines build log, which was performed on 2021-04-11 with .NET 5.0.202 SDK.

  • .NET 5.0
    • .NET 5.0.202 on Windows x64
    • .NET 5.0.202 on Ubuntu 20.04 x64
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 6.0)
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 4.1)
  • .NET Core 3.1
  • .NET Framework 4.8

Other information

Reproduce

To reproduce this issue, simply checkout Joveler.Compression repo and run the tests on .NET SDK 5.0.301 or higher.
The tests should fail like this:

image

To test GCHandle pinning effect, change L182 and L173 of ZLibStreams.cs like this:

// BEFORE (pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Pinned);

// AFTER (not pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Normal);

Why suspect GCHandle pinning?

Different from my zlib wrapper, my xz-utils wrapper and lz4 wrapper works fine on the latest .NET SDK. The only thing zlib wrapper does differently is the GCHandle pinning of context object (z_stream).

Credit to @Luzifix who noticed and reported test failure.

Author: ied206
Assignees: -
Labels:

area-System.IO.Compression, untriaged

Milestone: -

@ied206
Copy link
Author

ied206 commented Jul 19, 2021

Please tag @dotnet/area-GC-coreclr, as it is likely to be an issue with GC.

@ghost
Copy link

ghost commented Jul 19, 2021

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Summary

Updating .NET runtime to 5.0.301 from 5.0.202 breaks the existing p/invoke code.

I have a zlib pinvoke library, Joveler.Compression.Zlib.
Its zlib inflate/deflate streaming API has broken since upgrading .NET SDK to 5.0.301.

I highly suspect a bug of .NET runtime in handling pinned GCHandle.

Details

zlib uses C struct z_stream as its context object, and requires the address of z_stream must not change while the object is alive. This is undocumented, but apparent on zlib's s->strm != strm check code. (s->strm is the zlib allocated address, and strm is parameter address).

Thus the wrapper must pin z_stream with GCHandle to prevent GC from moving the object.
If it doesn't, zlib deflate()/inflate() occasionally returns Z_STREAM_ERROR code, as GC changed the address of z_stream.

// This code worked prior to .NET SDK 5.0.301
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // GCHandle pinning
...
ZLibRet ret = ZLibInit.Lib.L32.Deflate(_zs64, ZLibFlush.NoFlush); // returns Z_STREAM_ERROR on .NET SDK 5.0.301

If I change the code not to pin GCHandle, its test code randomly fails on every .NET platform.

// This code randomly fails on .NET Framework 4.8, .NET Core 3.1, .NET 5.0
_zs64 = new ZStreamL64(); // .NET translated z_stream
_zsPin = GCHandle.Alloc(_zs64, GCHandleType.Pinned); // Does not pins GCHandle
...
ZLibRet ret = ZLibInit.Lib.L32.Deflate(_zs64, ZLibFlush.NoFlush); // randomly returns Z_STREAM_ERROR 

Therefore, it is highly suspected that there is a bug around pinned GCHandle handling since .NET SDK 5.0.301.

Configuration

  • .NET 5.0.301 on Windows x86, x64
  • .NET 5.0.302 on Ubuntu 20.04 x64

Regression?

The test code worked on every .NET platform prior to 5,0.301.
You can look at Azure Pipelines build log, which was performed on 2021-04-11 with .NET 5.0.202 SDK.

  • .NET 5.0
    • .NET 5.0.202 on Windows x64
    • .NET 5.0.202 on Ubuntu 20.04 x64
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 6.0)
    • .NET 5.0.102 on Debian 10 arm64 (emulated with QEMU 4.1)
  • .NET Core 3.1
  • .NET Framework 4.8

Other information

Reproduce

To reproduce this issue, simply checkout Joveler.Compression repo and run the tests on .NET SDK 5.0.301 or higher.
The tests should fail like this:

image

To test GCHandle pinning effect, change L182 and L173 of ZLibStreams.cs like this:

// BEFORE (pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Pinned);

// AFTER (not pinned)
_zsPin = GCHandle.Alloc(<HANDLE>, GCHandleType.Normal);

Why suspect GCHandle pinning?

Different from my zlib wrapper, my xz-utils wrapper and lz4 wrapper works fine on the latest .NET SDK. The only thing zlib wrapper does differently is the GCHandle pinning of context object (z_stream).

Credit to @Luzifix who noticed and reported test failure.

Author: ied206
Assignees: -
Labels:

area-GC-coreclr, regression-from-last-release, untriaged

Milestone: -

@mangod9
Copy link
Member

mangod9 commented Jul 19, 2021

@cshung @Maoni0. There was a change around moving statics to the POH, but not sure whether that would affect this scenario.

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jul 19, 2021
@mangod9 mangod9 added this to the 6.0.0 milestone Jul 19, 2021
@jkotas
Copy link
Member

jkotas commented Jul 19, 2021

@jkoritzinsky Fixed by #54244 ?

@jkoritzinsky
Copy link
Member

Yes

@ghost ghost locked as resolved and limited conversation to collaborators Aug 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants