Migrate to zlib-ng, part 2: consume it in runtime (second attempt) #104454

Merged: 6 commits merged on Jul 8, 2024

carlossanlop
Member

@carlossanlop carlossanlop commented Jul 4, 2024

Contributes to: #101465

This PR reapplies the changes that were reverted because of a wasm break in the official build: #102403

It also includes an extra commit that fixes the NativeAOT build failure that only happened in the official build.

Contributor

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Member Author

@carlossanlop carlossanlop left a comment

@pavelsavara @ilonatommy @kg @jkotas @jkoritzinsky @akoeplinger I need some help, I am still unable to get a successful build in wasm.

I can repro the exact same scenario used in the CI by executing these commands inside my Ubuntu WSL:

docker run -ti -d -v /home/carlos/repos/runtime:/home/carlos/repos/runtime -e ROOTFS_DIR=/crossrootfs/x64 -w /home/carlos/repos/runtime  mcr.microsoft.com/dotnet-buildtools/prereqs:azurelinux-3.0-webassembly-amd64-net9.0
docker attach <GUID_HERE>

git config --global --add safe.directory /home/carlos/repos/runtime
./build.sh mono+libs+host+packs+libs.tests -c Release -arch wasm -os browser /p:MonoEnableAssertMessages=true /p:BrowserHost=linux /p:AotHostArchitecture=x64 /p:AotHostOS=linux

And the build is failing with the following error, which tells me that I am still unable to successfully link the system zlib (which I confirmed exists inside the crossrootfs folder) to mono/metadata:

 /home/carlos/repos/runtime/src/mono/mono/metadata/debug-mono-ppdb.c:33:10: fatal error: 'zlib.h' file not found
  #include <zlib.h>
           ^~~~~~~~
  1 error generated.
  emcc: error: '/home/carlos/repos/runtime/src/mono/browser/emsdk/bin/clang -target wasm32-unknown-emscripten -fignore-exceptions -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -DEMSCRIPTEN -Werror=implicit-function-declaration --sysroot=/home/carlos/repos/runtime/src/mono/browser/emsdk/emscripten/cache/sysroot -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -DCOMPILER_SUPPORTS_W_RESERVED_IDENTIFIER -DHAVE_CONFIG_H -DHAVE_SGEN_GC -DMONO_DLL_EXPORT -D_THREAD_SAFE -I/home/carlos/repos/runtime/artifacts/obj -I/home/carlos/repos/runtime/src/native -I/home/carlos/repos/runtime/artifacts/obj/mono/browser.wasm.Release/mono/metadata/../.. -I/home/carlos/repos/runtime/src/mono/mono/metadata/../.. -I/home/carlos/repos/runtime/src/mono/mono/metadata/.. -I/home/carlos/repos/runtime/src/native/public/. -I/home/carlos/repos/runtime/artifacts/obj/mono/browser.wasm.Release/mono/eglib -I/home/carlos/repos/runtime/src/mono/mono/eglib -fno-strict-aliasing -fwrapv -Wall -Wunused -Wmissing-declarations -Wpointer-arith -Wno-cast-qual -Wwrite-strings -Wno-switch -Wno-switch-enum -Wno-unused-value -Wno-attributes -Wno-format-zero-length -Wno-unused-function -Qunused-arguments -Wno-tautological-compare -Wno-parentheses-equality -Wno-self-assign -Wno-return-stack-address -Wno-constant-logical-operand -Wno-zero-length-array -Wno-asm-operand-widths -Wmissing-prototypes -Wstrict-prototypes -Wnested-externs -Werror=return-type -Werror=implicit-function-declaration -Werror=incompatible-pointer-types -O3 -DNDEBUG -std=gnu11 -g3 -fPIC -fvisibility=hidden -Wno-strict-prototypes -Wno-unused-but-set-variable -Wno-single-bit-bitfield-constant-conversion -Os -ffp-contract=off -MD -MT mono/metadata/CMakeFiles/metadata_objects.dir/debug-mono-ppdb.c.o -MF CMakeFiles/metadata_objects.dir/debug-mono-ppdb.c.o.d -c /home/carlos/repos/runtime/src/mono/mono/metadata/debug-mono-ppdb.c -o 
CMakeFiles/metadata_objects.dir/debug-mono-ppdb.c.o' failed (returned 1)
  make[2]: *** [mono/metadata/CMakeFiles/metadata_objects.dir/build.make:257: mono/metadata/CMakeFiles/metadata_objects.dir/debug-mono-ppdb.c.o] Error 1
  make[1]: *** [CMakeFiles/Makefile2:415: mono/metadata/CMakeFiles/metadata_objects.dir/all] Error 2
  make: *** [Makefile:136: all] Error 2
/home/carlos/repos/runtime/src/mono/mono.proj(765,5): error MSB3073: The command "cmake --build . --target install --config Release --parallel 12" exited with code 2.

Build FAILED.

@carlossanlop
Member Author

carlossanlop commented Jul 5, 2024

@LoopedBard3 I think I need to remove zlib from the list of dependencies that get installed by this performance job. Can you please confirm? If so, which azp run command do I need to execute to validate that change?

https://github.com/dotnet/runtime/blob/main/eng/pipelines/coreclr/templates/run-performance-job.yml#L114

Azure Pipelines successfully started running 1 pipeline(s).

@pavelsavara
Member

pavelsavara commented Jul 8, 2024

zlib-ng is not ready for WASI Preview2, see #104414 (comment)

@ericstj
Member

ericstj commented Jul 8, 2024

> zlib-ng is not ready for WASI Preview2, see #104414 (comment)

Let's collaborate on getting those fixed in the WASI PR. It will be easier to test and iterate there. It's important that this change makes it into Preview 7.

Member

@ericstj ericstj left a comment

Approving this change to get it in and building. Please file a follow-up issue on the platform manifest cleanup, and work with @pavelsavara on the WASI fix.

@@ -279,6 +279,7 @@
<PlatformManifestFileEntry Include="libicuuc.a" IsNative="true" />
<!-- zlib-specific files -->
<PlatformManifestFileEntry Include="libz.a" IsNative="true" />
<PlatformManifestFileEntry Include="zlibstatic.lib" IsNative="true" />
Member

Is it intentional to ship zlibstatic.lib in our shared framework? That seems unusual, as it's a dev-time resource; however, I see other dev-time resources here, like main.c, driver.h, etc. I take it you needed to add this to build, but I bet a better fix is to get all of this out of the platform manifest, since I don't think it needs to be fed to the SDK for conflict resolution. @jkoritzinsky @dsplaisted

I see we have a metadata flag ExcludeFromDataFiles for that: https://github.com/dotnet/arcade/blob/76f733ee57811c38bb5b8e1ac9c6c50e92bc5dc9/src/Microsoft.DotNet.SharedFramework.Sdk/targets/sharedfx.targets#L249

I think we should review everything added here, remove any files that don't need to ship, and apply ExcludeFromDataFiles to the files that need to ship but don't need to go into the manifest. That can be done separately.

Member Author

I'll keep this in mind as a follow-up.

Member

> is it intentional to ship zlibstatic.lib in our shared framework?

zlibstatic.lib ships in the NativeAOT runtime package only. It does not ship in the regular shared framework.

Member

When I implemented the shared framework SDK, no one really knew what needed to go into the manifest, so it requires/lists all files. If we know what types of files need to be in it, I can update the SDK to only validate (for the template case) or include (for the generative case) files that matter.

@carlossanlop
Member Author

/ba-g All failures determined to be unrelated.

@carlossanlop carlossanlop merged commit cf08d43 into dotnet:main Jul 8, 2024
206 of 242 checks passed
@carlossanlop carlossanlop deleted the zlib-ng-reattempt branch July 8, 2024 23:11
@carlossanlop
Member Author

@LoopedBard3 @caaavik-msft @DrewScoggins where can I see the runtime perf chart before and after this change?

@LoopedBard3
Member

If you have specific tests in mind, we have an ADX dashboard and database; I have DMed these to you. Otherwise, if you just want to look at a few test results, we also have an allTestHistory that is updated once a day with the latest run data. We currently seem to have a pathing bug, though, so if the links return a 404 you will need to remove "reports/" from the URL to view the actual test.

@am11
Member

am11 commented Jul 10, 2024

@carlossanlop, VMR jobs are failing with:

      Generating native code
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_DeflateInit2_':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:122:(.text.CompressionNative_DeflateInit2_+0x89): undefined reference to `deflateInit2_'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_Deflate':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:134:(.text.CompressionNative_Deflate+0x2c): undefined reference to `deflate'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_DeflateEnd':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:145:(.text.CompressionNative_DeflateEnd+0x29): undefined reference to `deflateEnd'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_InflateInit2_':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:159:(.text.CompressionNative_InflateInit2_+0x6b): undefined reference to `inflateInit2_'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_Inflate':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:171:(.text.CompressionNative_Inflate+0x2c): undefined reference to `inflate'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_InflateEnd':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:182:(.text.CompressionNative_InflateEnd+0x29): undefined reference to `inflateEnd'
      /usr/bin/ld.bfd: /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a(pal_zlib.c.o): in function `CompressionNative_Crc32':
      /vmr/src/runtime/src/native/libs/System.IO.Compression.Native/pal_zlib.c:192:(.text.CompressionNative_Crc32+0x8): undefined reference to `crc32'
    clang : error : linker command failed with exit code 1 (use -v to see invocation) [/vmr/src/runtime/src/coreclr/tools/aot/crossgen2/crossgen2_publish.csproj]
    /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/build/Microsoft.NETCore.Native.targets(368,5): error MSB3073: The command ""/usr/bin/clang-14" "/vmr/src/runtime/artifacts/obj/coreclr/crossgen2_publish/linux.x64.Release/native/crossgen2.o" -o "/vmr/src/runtime/artifacts/bin/crossgen2_publish/x64/Release/native/crossgen2" -Wl,--version-script=/vmr/src/runtime/artifacts/obj/coreclr/crossgen2_publish/linux.x64.Release/native/crossgen2.exports -Wl,--export-dynamic -gz=zlib -fuse-ld=bfd /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libbootstrapper.o /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libRuntime.ServerGC.a /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libeventpipe-enabled.a /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libRuntime.VxsortEnabled.a /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libstandalonegc-disabled.a /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libstdc++compat.a /vmr/src/runtime/artifacts/bin/coreclr/linux.x64.Release/aotsdk/libz.a /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.Native.a /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.IO.Compression.Native.a /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.Net.Security.Native.a /vmr/src/runtime/artifacts/bin/microsoft.netcore.app.runtime.linux-x64/Release/runtimes/linux-x64/native/libSystem.Security.Cryptography.Native.OpenSsl.a -g -Wl,-rpath,'$ORIGIN' -Wl,--build-id=sha1 -Wl,--as-needed -pthread -ldl -lrt -lm -pie -Wl,-pie -Wl,-z,relro -Wl,-z,now -Wl,--eh-frame-hdr -Wl,--discard-all -Wl,--gc-sections" exited with code 1. [/vmr/src/runtime/src/coreclr/tools/aot/crossgen2/crossgen2_publish.csproj]

blocking SDK dotnet/sdk#42019. Do we need to set -p:UseSystemZlib=true?

@carlossanlop
Member Author

carlossanlop commented Jul 10, 2024

Yep, I've been notified and I'm looking into this.

> Do we need to set -p:UseSystemZlib=true?

No, that property needs to be set conditionally, plus it is already set to true in this failing case, so the problem is not there.
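
One detail worth checking in the link command quoted above: `aotsdk/libz.a` is listed before `libSystem.IO.Compression.Native.a`. GNU ld scans a static archive once, at its position on the command line, so an archive that appears before the objects needing it does not satisfy their undefined symbols (unless the archives are wrapped in `--start-group`/`--end-group`). Whether that ordering is the actual root cause here is not established in this thread. The Python sketch below only shows how the ordering could be checked mechanically; the function name is hypothetical and it ignores archive groups.

```python
import os
import shlex


def provider_precedes_consumer(link_command: str, consumer: str, provider: str):
    """Check static-archive ordering on a linker command line.

    Returns True if the last occurrence of `provider` comes before the last
    occurrence of `consumer` (the provider would then be scanned too early),
    False if the order is fine, and None if either archive is missing.
    Assumption: no --start-group/--end-group handling.
    """
    args = shlex.split(link_command)

    def positions(name: str) -> list[int]:
        # Match on the file name only, so full paths are accepted.
        return [i for i, a in enumerate(args) if os.path.basename(a) == name]

    consumers = positions(consumer)
    providers = positions(provider)
    if not consumers or not providers:
        return None
    return max(providers) < max(consumers)


# Shortened, illustrative command with the same archive order as the log.
cmd = (
    "clang crossgen2.o -o crossgen2 "
    "/aotsdk/libz.a "
    "/native/libSystem.IO.Compression.Native.a "
    "-ldl -lrt -lm"
)
print(provider_precedes_consumer(cmd, "libSystem.IO.Compression.Native.a", "libz.a"))
```

For the shortened command this reports that the zlib archive is scanned before the archive that references `deflate`, `inflate`, and `crc32`.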

@LoopedBard3
Member

Likely related improvements:
Linux arm64: dotnet/perf-autofiling-issues#38054

@sebastienros
Member

We are seeing different results in the compressed payload in aspnetcore tests (a 1-byte difference). Is that expected with this switch? I'm wondering whether a gzipped payload is supposed to be deterministic at the same compression level.

@stephentoub
Member

> We are seeing different results in the compressed payload in aspnetcore tests, 1 byte difference, is that expected with this switch? Wondering if a gzipped payload is supposed to be deterministic with the same compression level.

There's no guarantee about compressed bytes being exactly the same from version to version. It's expected that updating the zlib implementation may change the exact compressed bytes generated.
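
Since the exact compressed bytes may differ between zlib implementations, a test that compares output against a stored byte sequence is tied to one particular build. A more robust pattern is to assert on the round trip. The sketch below illustrates the idea in Python with the standard `gzip` module; it is not the aspnetcore test code, and the helper name is hypothetical.

```python
import gzip


def assert_gzip_roundtrip(payload: bytes, level: int = 6) -> bytes:
    """Compress, then verify the stream decodes back to the original payload.

    The compressed bytes are returned for inspection but are deliberately
    not compared against any expected value.
    """
    compressed = gzip.compress(payload, compresslevel=level)
    # The gzip magic number is fixed by the format, so it is safe to check.
    assert compressed[:2] == b"\x1f\x8b"
    # The property that matters: decompression restores the input exactly.
    assert gzip.decompress(compressed) == payload
    return compressed


if __name__ == "__main__":
    body = b"Hello, compression! " * 100
    for level in (1, 6, 9):
        assert_gzip_roundtrip(body, level)
```

Checks of this shape keep passing when the underlying deflate implementation changes, as long as the output remains a valid gzip stream for the same input.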

10 participants