Add support for compressed assemblies in APK #4686
@JonDouglas chimed in today and believes that we should have assembly compression enabled by default.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Draft release notes During the finalization of this PR before merge, when you get a chance, you can add a release note for it to
    // }

    data.DestinationPath = $"{data.SourcePath}.lz4";
    data.SourceSize = (uint)fi.Length;
Yet another place I think we should consider using checked()
    @@ -5,13 +5,13 @@
           "Size": 3684
         },
         "classes.dex": {
    -      "Size": 2200624
    +      "Size": 2198984
Somewhat odd that this shrank…
¯\_(ツ)_/¯
I'm not quite sure I understand the use/interaction of |
Either can be used. The item group simply makes this possible for assemblies you don't control (those coming from NuGet packages, for instance), since you wouldn't be able to add metadata items to them in any other way.
I find it surprising that Any idea why this is the case? |
    byte[] sourceBytes = null;
    byte[] destBytes = null;
    try {
        sourceBytes = bytePool.Rent ((int)fi.Length);
Another thought on the `checked()` front: 4GB is "too big", but what about "2GB+1"? That would be enough for this `(int)` cast to return a negative value, which can't be good.
Presumably `bytePool.Rent()` would throw, but given that `Compress()` could return `CompressionResult.InputTooBig`, an "unhandled exception" might be less than ideal.
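The narrowing concern raised here is not specific to C#. As a minimal sketch in C (the language used for the header struct in this PR's description), a length can be range-checked before it is narrowed to a signed 32-bit value, so that an oversized input is reported as a status instead of turning into an unrepresentable size. The names `narrow_status`, `NARROW_OK`, `NARROW_TOO_BIG` and `checked_narrow_to_int32` are hypothetical and are not part of Xamarin.Android.

```c
#include <stdint.h>

/* Hypothetical status codes for this sketch; not the PR's CompressionResult. */
typedef enum { NARROW_OK = 0, NARROW_TOO_BIG = 1 } narrow_status;

/* Range-check a 64-bit length before narrowing it to int32_t.
 * On success the narrowed value is stored in *out.
 * On failure *out is left untouched and NARROW_TOO_BIG is returned. */
static narrow_status checked_narrow_to_int32(uint64_t length, int32_t *out)
{
    if (length > (uint64_t)INT32_MAX) {
        /* Anything above 2^31 - 1 cannot be represented in int32_t. */
        return NARROW_TOO_BIG;
    }
    *out = (int32_t)length;
    return NARROW_OK;
}
```

A caller could map `NARROW_TOO_BIG` onto whatever "input too big" result it already reports, rather than letting the allocation call fail later.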
Currently, `Xamarin.Android` supports managed assembly compression in the APK archive if the application is bundled (with Mono's `mkbundle`) into a native shared library. Managed assemblies are compressed using gzip compression and placed in an array inside the data section of the shared library. However, support for `mkbundle` is possibly going to be removed, and we realized it is a feature some developers appreciate since the produced APKs are smaller and the impact on startup time isn't big enough to worry about.

This commit aims to be a replacement for `mkbundle` with a handful of improvements thrown in. First of all, the compression is performed using the [managed implementation][0] of the excellent [LZ4][1] algorithm. This gives us a decent compression ratio and a much faster (de)compression speed than gzip/zlib offer. Also, assemblies are stored directly in the APK in their usual directory, which allows us to `mmap` them at runtime directly from the APK. The build process calculates the size required to store the decompressed assemblies and adds a data section to `libxamarin-app.so`, which makes Android allocate all the required memory when the DSO is loaded, thus removing the need for dynamic memory allocation and making the startup faster.

Compression is supported only in `Release` builds and is enabled by default, but it can be turned off by setting the `$(AndroidEnableAssemblyCompression)` MSBuild property to `False`. If there's a need to turn compression off for an individual assembly, it can be done by adding the `AndroidSkipCompression` metadata item to the assembly in question, using code similar to this in the application's project file:

    <AndroidCustomMetaDataForReferences Include="MyAssembly.dll">
      <AndroidSkipCompression>true</AndroidSkipCompression>
    </AndroidCustomMetaDataForReferences>

The compressed assemblies still use their original name (e.g. `Mono.Android.dll`) so that we don't have to perform any string matching at runtime in order to detect whether the assembly we are asked to load is compressed or not. Instead, the compression code prepends a short header to each .dll file (in pseudo-code):

    uint32 magic = 0x5A4C4158; // 'XALZ', little-endian
    uint32 index;              // Index into an internal assembly descriptor table
    uint32 uncompressed_length;

The decompression code looks at the mmapped data and checks whether the above header is present. If yes, the assembly is decompressed, otherwise it's loaded as-is.

It is important to remember that the assemblies are compressed at build time using LZ4 block compression, which requires assembly data to be entirely loaded into memory before compression (we do this instead of using the LZ4 frame format to make decompression at run time faster). The compression output also requires a separate buffer, thus memory consumption will be roughly 1.5x the assembly size. However, since we use a byte buffer pool, memory consumption will not be the sum of all the assemblies but rather the size of the biggest one in the set.

~ Application Size ~

A Xamarin.Forms "Hello World" application APK shrank by 27% with this commit:

| Before   | After    | Δ       |
|----------|----------|---------|
| 23305194 | 16813034 | -27.85% |

Size comparison between this commit and APKs created with `$(BundleAssemblies) == True` depends on the number of enabled ABI targets in the application. For each ABI, `$(BundleAssemblies) == True` creates a separate shared library, so the amount of space consumed increases by the size of the bundle shared library. The new compression scheme shares the compressed assemblies among all the enabled ABIs, thus effectively creating smaller multi-ABI APKs.

In the tables below, `Before` refers to the APK created with `$(BundleAssemblies) == True`, `After` refers to the APK built with the new compression scheme.

All ABIs enabled:

| Before   | After    | Δ       |
|----------|----------|---------|
| 27130240 | 16813034 | -38.03% |

Single ABI enabled:

| Before  | After   | Δ       |
|---------|---------|---------|
| 7783449 | 8746878 | +11.01% |

~ Startup Performance ~

Startup time of the same application isn't affected too much by decompression (comparison between uncompressed application and one compressed using the new scheme):

~ Before ~

App configuration: **Release**

Xamarin.Android
  - Version: **10.4.100-12**
  - Branch: **master**
  - Commit: **3f438e46d7b166a3a3ef54c9ffafb5f426760468**

~ After ~

App configuration: **Release**

Xamarin.Android
  - Version: **10.4.100-18**
  - Branch: **compress-assemblies**
  - Commit: **cec90e936478f9afbbc31b43e52164ecd5182c79**

Device
  - Model: **Pixel 3 XL**
  - Native architecture: **arm64-v8a**
  - SDK version: **29**

~ Application Displayed Time ~

| Before  | After   | Δ        | Notes                          |
| ------- | ------- | -------- | ------------------------------ |
| 795.800 | 793.800 | -0.25% ✓ | preload enabled; 32-bit build  |
| 777.100 | 780.500 | +0.44% ✗ | preload disabled; 32-bit build |
| 779.000 | 791.500 | +1.58% ✗ | preload enabled; 64-bit build  |
| 776.000 | 781.400 | +0.69% ✗ | preload disabled; 64-bit build |

Comparison of startup times between the `$(BundleAssemblies) == True` scheme and the new one with the same device and application as above (once again `Before` refers to the `$(BundleAssemblies)` application):

| Before  | After   | Δ        | Notes                          |
| ------- | ------- | -------- | ------------------------------ |
| 855.600 | 793.800 | -7.22% ✓ | preload enabled; 32-bit build  |
| 843.000 | 780.500 | -7.41% ✓ | preload disabled; 32-bit build |
| 849.400 | 791.500 | -6.82% ✓ | preload enabled; 64-bit build  |
| 841.600 | 781.400 | -7.15% ✓ | preload disabled; 64-bit build |

[0]: https://www.nuget.org/packages/K4os.Compression.LZ4/
[1]: https://github.com/lz4/lz4
[2]: https://quixdb.github.io/squash-benchmark/#results-table
Weird intermixing of spaces following tabs…
Draft commit message:
…#4686)

Currently, Xamarin.Android supports compression of managed assemblies within the `.apk` if the app is built with [`$(BundleAssemblies)`=True][0], with the compressed assembly data stored within `libmonodroid_bundle_app.so` using gzip compression and placed in an array inside the data section of the shared library.

There are two problems with this approach:

 1. `mkbundle` emits C code, which requires a C compiler which requires the full Android NDK, and thus requires Visual Studio Enterprise.

 2. Reliance on Mono's `mkbundle` results in possible issues around [filename globbing][1] such that `Xamarin.AndroidX.AppCompat.Resources.dll` is improperly treated as a [satellite assembly][2].

Because of (2), we are planning on [removing support][3] for `$(BundleAssemblies)` in .NET 6 ([née .NET 5][4]), which resulted in [some pushback][5] because `.apk` size is very important for some customers, and the startup overheads we believed to be inherent to `$(BundleAssemblies)` turned out to be somewhat over-estimated.

To resolve the above issues, add an assembly compression mechanism that doesn't rely on `mkbundle` and the NDK: separately compress the assemblies and store the compressed data within the `.apk`.

Compression is performed using the [managed implementation][6] of the excellent [LZ4][7] algorithm. This gives us a decent compression ratio and a much faster (de)compression speed than gzip/zlib offer. Also, assemblies are stored directly in the APK in their usual directory, which allows us to [**mmap**(2)][8] them in the runtime directly from the `.apk`. The build process calculates the size required to store the decompressed assemblies and adds a data section to `libxamarin-app.so` which causes *Android* to allocate all the required memory when the DSO is loaded, thus removing the need for dynamic memory allocation and making the startup faster.

Compression is supported only in `Release` builds and is enabled by default, but it can be turned off by setting the `$(AndroidEnableAssemblyCompression)` MSBuild property to `False`. Compression can be disabled for an individual assembly by setting the `%(AndroidSkipCompression)` MSBuild item metadata to True for the assembly in question, e.g. via:

    <AndroidCustomMetaDataForReferences Include="MyAssembly.dll">
      <AndroidSkipCompression>true</AndroidSkipCompression>
    </AndroidCustomMetaDataForReferences>

The compressed assemblies still use their original name, e.g. `Mono.Android.dll`, so that we don't have to perform any string matching at runtime in order to detect whether the assembly we are asked to load is compressed or not. Instead, the compression code *prepends* a short header to each `.dll` file (in pseudo C code):

    struct CompressedAssemblyHeader {
        uint32_t magic;               // 0x5A4C4158; 'XALZ', little-endian
        uint32_t descriptor_index;    // Index into an internal assembly descriptor table
        uint32_t uncompressed_length; // Size of assembly, uncompressed
    };

The decompression code looks at the `mmap`ed data and checks whether the above header is present. If yes, the assembly is decompressed, otherwise it's loaded as-is.

It is important to remember that the assemblies are compressed at build time using LZ4 block compression, which requires assembly data to be entirely loaded into memory before compression; we do this instead of using the LZ4 frame format to make decompression at runtime faster. The compression output also requires a separate buffer, thus memory consumption at *build* time will be roughly 1.5x the size of the largest assembly, which is reused across all assemblies.

~~ Application Size ~~

A Xamarin.Forms "Hello World" application `.apk` shrinks by 27% with this approach for a single ABI:

| Before (bytes)    | LZ4 (bytes)   | Δ         |
|------------------:|--------------:|:---------:|
|        23,305,194 |    16,813,034 |  -27.85%  |

Size comparison between this commit and `.apk`s created with `$(BundleAssemblies)`=True depends on the number of enabled ABI targets in the application. For each ABI, `$(BundleAssemblies)`=True creates a separate shared library, so the amount of space consumed increases by the size of the bundle shared library. The new compression scheme shares the compressed assemblies among all the enabled ABIs, thus effectively creating smaller multi-ABI `.apk`s.

In the tables below, `mkbundle` refers to the APK created with `$(BundleAssemblies)`=True, `lz4` refers to the `.apk` built with the new compression scheme:

| ABIs                                  | mkbundle (bytes)  | LZ4 (bytes)   | Δ       |
|--------------------------------------:|------------------:|--------------:|---------|
| armeabi-v7a, arm64-v8a, x86, x86_64   |        27,130,240 |    16,813,034 | -38.03% |
| arm64-v8a                             |         7,783,449 |     8,746,878 | +11.01% |

The single ABI case is ~11% larger because gzip offers better compression, at the cost of higher runtime startup overhead.

~~ Startup Performance ~~

When launching the Xamarin.Forms "Hello World" application on a Pixel 3 XL, the use of LZ4-compressed assemblies has at worst a ~1.58% increase in the Activity Displayed time (64-bit app w/ assembly preload enabled), while slightly faster on 32-bit apps with preload enabled, but is *always* faster than the mkbundle startup time for all configurations:

| Description                       | None (ms) | mkbundle (ms) |  LZ4 (ms) | LZ4 vs None Δ | LZ4 vs mkbundle Δ |
|----------------------------------:|----------:|--------------:|----------:|:-------------:|:-----------------:|
| preload enabled; 32-bit build     |     795.8 |         855.6 |     793.8 |   -0.25% ✓    |     -7.22% ✓      |
| preload disabled; 32-bit build    |     777.1 |         843.0 |     780.5 |   +0.44% ✗    |     -7.41% ✓      |
| preload enabled; 64-bit build     |     779.0 |         849.4 |     791.5 |   +1.58% ✗    |     -6.82% ✓      |
| preload disabled; 64-bit build    |     776.0 |         841.6 |     781.4 |   +0.69% ✗    |     -7.15% ✓      |

[0]: https://docs.microsoft.com/en-us/xamarin/android/deploy-test/release-prep/?tabs=windows#bundle-assemblies-into-native-code
[1]: dotnet/android-libraries#64
[2]: https://github.com/mono/mono/blob/9b4736d4c271e9d4e04cafa258ddd58961f1a39f/mcs/tools/mkbundle/mkbundle.cs#L1315-L1317
[3]: dotnet/android-libraries#64 (comment)
[4]: https://devblogs.microsoft.com/dotnet/announcing-net-5-preview-4-and-our-journey-to-one-net/
[5]: dotnet/android-libraries#64 (comment)
[6]: https://www.nuget.org/packages/K4os.Compression.LZ4/
[7]: https://github.com/lz4/lz4
[8]: https://linux.die.net/man/2/mmap
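The header check described in the commit message can be sketched in C as follows. This is only an illustration under stated assumptions: the three little-endian `uint32` fields come from the commit message, while the helper names `read_le32` and `xalz_parse_header`, the `xalz_header` type, and the two macros are hypothetical and are not the runtime's actual code. LZ4 decompression of the payload that follows the header is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* 0x5A4C4158 is stored little-endian, so the file begins with 'X','A','L','Z'. */
#define XALZ_MAGIC       0x5A4C4158u
#define XALZ_HEADER_SIZE 12u   /* three uint32 fields, per the commit message */

/* Hypothetical in-memory view of the header fields. */
typedef struct {
    uint32_t magic;
    uint32_t descriptor_index;
    uint32_t uncompressed_length;
} xalz_header;

/* Read a little-endian uint32 without assuming host byte order or alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out when data begins with the compressed-assembly
 * header; returns 0 when the buffer is too short or the magic is absent,
 * in which case the caller would treat the data as a plain assembly. */
static int xalz_parse_header(const uint8_t *data, size_t size, xalz_header *out)
{
    if (data == NULL || out == NULL || size < XALZ_HEADER_SIZE) {
        return 0;
    }
    if (read_le32(data) != XALZ_MAGIC) {
        return 0;
    }
    out->magic = XALZ_MAGIC;
    out->descriptor_index = read_le32(data + 4);
    out->uncompressed_length = read_le32(data + 8);
    return 1;
}
```

Reading the fields byte by byte keeps the sketch independent of host endianness and of the alignment of the mapped data; a real implementation targeting only little-endian Android ABIs might simply overlay a struct.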
By enabling this by default, you have broken every post-package obfuscator. Mine was spitting out a cryptic "invalid assembly" error. It took 3 days to figure out this was a problem with the packaging of the APK, not a problem with my obfuscator.

I use a robust obfuscator that renames the public interfaces/methods of my two .NET Standard DLLs, and propagates those name changes up to the referencing DLLs above. It is more complex than simple obfuscators and sadly only runs post-packaging.

It would be really nice if Microsoft would incorporate security directly into the platform, rather than forcing me to use an external tool. It is easy to forget about external tools that might suffer from a packaging change. It's unfortunate when those tools are for something as important as securing your apps.

Everything is working again after I discovered the tag. This should be part of the Visual Studio project UI, with an info bubble that says, "This may need to be disabled to allow for post-packaging obfuscation or instrumentation tools."
@scottkdavis can you file a new issue describing how you hook into MSBuild? Include a diagnostic MSBuild log. It is possible you are referencing a "private" MSBuild target that is prefixed with an underscore. This means it could be renamed/reordered, etc. We have some documented extension points that should be used instead.
Hello @jonathanpeppers, thanks for the response. This is post-package. My obfuscator takes the APK as the input, obfuscates the DLLs, then repackages and re-signs the APK. I have a very robust obfuscator that renames all public methods and classes and maps those changes into the upstream DLLs.

I tried to make this part of the build process years ago, but my project is structured like this:

1: Android csproj -> 2: Xamarin Forms UI csproj -> 3: Core logic and service layer csproj

When I tried to incorporate the public renaming into the build process, DLL3's public interfaces were renamed when DLL2 was compiled, then renamed again when DLL1 was compiled. The naming map is different (random) each time, and therefore the double renaming breaks things. When incorporated into the build process, the obfuscator doesn't know if another project is also going to do renaming as part of the compile, since they run in separate processes. That's why the obfuscator must run after packaging, so a single obfuscation process can handle the public renaming and mapping across all DLLs at once.

I had written a long response about app security, my past meetings with the Xamarin product team about the lack of good security practices among Xamarin app developers, and how to help developers write secure apps. I deleted it; that is a whole other conversation.

Happy to help any way I can, but the point of my first comment was to alert the team to the fact that this change breaks post-packaging tools, and it can be a very expensive process to discover what happened. I realize this impacts very few developers since, in all the conference talks I've given on app security, I've only met one other Xamarin developer who is obfuscating their app. I've never met anyone who understood the importance of stopping a hacker from using the public methods on their DLLs to automate activity, or robo-call their servers. It's unlikely most developers will ever care about securing their apps until it is just part of Visual Studio.

No further action needed for me. I have my solution working. I was simply passing on a side-effect you may not have anticipated. Best wishes.
@scottkdavis if your product post-processes APK files, you should be able to use the same K4os.Compression.LZ4 NuGet package used here: You should be able to lz4-decompress, run your obfuscation, lz4-compress and put things back the way they were. I would recommend adding this feature, as it is enabled for Release builds of Xamarin.Android apps by default going forward.

Does your product obfuscate AOT'd assemblies? I would think you would hit similar problems there. Does it support Android App Bundles, as well?
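As a rough sketch of the "put things back the way they were" step, a post-processing tool would have to re-emit the 12-byte header in front of the recompressed data. The sketch below assumes the header layout given in the commit message; the LZ4 compression call itself is omitted, and `write_le32` and `xalz_write_header` are hypothetical names rather than anything provided by Xamarin.Android or K4os.Compression.LZ4. Presumably the original `descriptor_index` would need to be preserved, since it refers to a table generated at build time.

```c
#include <stddef.h>
#include <stdint.h>

#define XALZ_MAGIC       0x5A4C4158u  /* 'XALZ' when stored little-endian */
#define XALZ_HEADER_SIZE 12u

/* Store a uint32 as four little-endian bytes. */
static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Write the header at the start of out. Returns the number of bytes written
 * (XALZ_HEADER_SIZE), or 0 if out is NULL or capacity is too small.
 * The compressed payload would be appended by the caller afterwards. */
static size_t xalz_write_header(uint8_t *out, size_t capacity,
                                uint32_t descriptor_index,
                                uint32_t uncompressed_length)
{
    if (out == NULL || capacity < XALZ_HEADER_SIZE) {
        return 0;
    }
    write_le32(out, XALZ_MAGIC);
    write_le32(out + 4, descriptor_index);
    write_le32(out + 8, uncompressed_length);
    return XALZ_HEADER_SIZE;
}
```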
Hi Jonathan, I should have been more clear. When I say "my obfuscator" I meant the one I have purchased. I don't have access to their pipeline for processing files. I have, however, forwarded the compression information to them.

I sincerely appreciate all your responses to help me find an optimal solution. I'm fine with distributing uncompressed assemblies for now. The ROI of changing my build process again is negative. Thank you for the attention you have given this.