Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Error and Result<()> to be the same size as HRESULT #3126

Merged
merged 17 commits into from
Jun 26, 2024

Conversation

sivadeilra
Copy link
Collaborator

Many codebases use HRESULT (whether they use COM or not) and yet never use IErrorInfo. For those codebases, the size of the Result<()> object can be a substantial cost. Currently on 64-bit platforms, size_of<Result<()>> is 16 bytes, when ideally it would be basically a smartly-typed version of HRESULT. For example, Direct3D, DirectWrite, etc. never use IErrorInfo (they don't set any error info objects on failures), so all of the code that uses these codebases will suffer in performance.

This PR allows you to statically disable the error info within Error, by enabling the slim-errors feature in the windows-result crate. (The windows and windows-core crates will also propagate the feature to windows-result, so that you don't need to add an extra crate dependency.)

The performance costs come from the larger code and larger stack area needed. There is also a significant penalty because IErrorInfo requires that the Error object have a Drop implementation. This requires the compiler to precisely track the lifetime of Error values and even of the Result<()> value. This adds a lot of extra code that almost never runs, to many function bodies.

This PR also adds a new NonZeroHRESULT, which is equivalent to HRESULT except that it contains a NonZeroI32, not an i32. This means that it cannot represent S_OK, which is fine. This gives the Rust compiler a "niche", which is a byte pattern that cannot be represented by one variant of an enum, which frees up that byte pattern for a different variant or for enum's "tag". In our case, the compiler will generate type layouts for Option<NonZeroHRESULT> and Result<(), NonZeroHRESULT> that are the same size and alignment as the underlying HRESULT. This is great because it allows them to be stored in a single register, and in many situations the compiler can totally avoid type layout conversions and can produce much better code.

The Error type now uses NonZeroHRESULT. When the slim-errors feature is enabled, size_of::<Error>() == 4 and size_of::<Result<()>>() == 4. Before this change, size_of::<Error>() == 16 and size_of::<Result<()>>() == 24.

This PR improves these types even when not using slim-errors. After this PR, with slim-errors disabled, size_of::<Result<()>>() == 16, 8 bytes less, because of the niche optimization relating to NonZeroHRESULT.

@riverar
Copy link
Collaborator

riverar commented Jun 25, 2024

I'm not very familiar in this area. Is that a guaranteed and/or documented compiler niche optimization? Would be awkward if this changed and resulted in net zero or negative savings.

@sivadeilra
Copy link
Collaborator Author

I'm not very familiar in this area. Is that a guaranteed and/or documented compiler niche optimization? Would be awkward if this changed and resulted in net zero or negative savings.

It's a guaranteed optimization. It's the reason that the NonZeroI32 type (and similar) were added to the language.

There are static assertions in the PR that verify this. If anything changed, we would see a build break. But these optimizations are guaranteed, so these assertions are more about documentation than anything else.

dpaoliello
dpaoliello previously approved these changes Jun 25, 2024
crates/libs/result/src/error.rs Outdated Show resolved Hide resolved
crates/libs/result/src/error.rs Outdated Show resolved Hide resolved
crates/libs/result/src/error.rs Outdated Show resolved Hide resolved
crates/tests/result/tests/error.rs Outdated Show resolved Hide resolved
@riverar
Copy link
Collaborator

riverar commented Jun 26, 2024

Is there any benchmark data that demonstrates this is more than just a theoretical optimization? Not trying to be a blocker, just trying to prevent unnecessary downstream churn.

@sivadeilra
Copy link
Collaborator Author

sivadeilra commented Jun 26, 2024

Is there any benchmark data that demonstrates this is more than just a theoretical optimization? Not trying to be a blocker, just trying to prevent unnecessary downstream churn.

Yes. This change alone removes 1.2% of the entire binary size for DWriteCore. The code is smaller, the exception handling tables are smaller. Text layout benchmarks show meaningful increases in speed as well, which makes sense because these components make extensive use of COM, including purely inside DWriteCore, not just between DWriteCore and external apps.

Keep in mind that DWriteCore is not yet pure Rust; it's about 65% Rust. So that number will go even higher as I convert more of DWriteCore to Rust.

Or, to put it a different way, I see this much regression when I move from com-rs to windows-rs. 1.2% increase in binary size, for a feature that provides absolutely no benefit to DWriteCore or Office, is not something that the Office team is willing to accept. They have pretty strict size gates, especially because they ship on mobile platforms (where we also ship DWriteCore).

@kennykerr
Copy link
Collaborator

The size optimization looks neat. As someone who spent a great deal of effort optimizing code gen for call sites in C++, I can see how this would make a meaningful improvement. Mainly I'm concerned about the following:

  • The use of features here just doesn't seem right. Features are meant to be additive, as I mentioned above, and not fundamentally change behavior. I think this would be better served by a cfg flag like windows_raw_dylib.

  • This PR seemingly rewrites all of the windows-result crate, greatly increasing complexity and approachability for what boils down to an HRESULT. I think we can have the cfg option to trim down the overhead by removing the IErrorInfo field and possibly the HRESULT optimization without actually changing the public interface to the crate at all. Previously, this crate provided precisely two types; an HRESULT and an Error struct. Now there's ErrorInfo, ErrorT, NoDetail, NonZeroHRESULT, ErrorDetail, and DefaultDetail added to the mix. Ideally this stuff can be simplified, less generic, and there should be no new public types. In other words, we should be able to achieve the main goal here essentially by simply removing the original info field with a cfg attribute.

  • The use of S_OK to report a failure without error information is critical to the way null interfaces are modeled in windows-rs. This change essentially turns them into E_POINTER errors. This means that you cannot distinguish between a COM call that returns E_POINTER from a call that returns S_OK while returning a null interface pointer.

Copy link
Collaborator

@dpaoliello dpaoliello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the change to crates/libs/windows/Cargo.toml intentional?

crates/libs/windows/Cargo.toml Outdated Show resolved Hide resolved
crates/libs/result/src/error.rs Outdated Show resolved Hide resolved
@riverar
Copy link
Collaborator

riverar commented Jun 26, 2024

Something seems wrong with the Error::code natvis, is this expected?

let error_from_win32 = windows_result::Error::from_win32();
0:000> dx -r1 error_from_win32
error_from_win32                 [Type: windows_result::error::Error]
    [+0x008] code             : 1398755147 [Type: core::num::nonzero::NonZero<i32>]
    [+0x000] info             [Type: windows_result::error::error_info::ErrorInfo]

@sivadeilra
Copy link
Collaborator Author

sivadeilra commented Jun 26, 2024

Something seems wrong with the Error::code, is this expected?

Hmmm. I may have to re-introduce the NonZeroHRESULT type, so that NatVis has something to activate it...

It's strange, though, because the NatVis tests pass locally.

@riverar
Copy link
Collaborator

riverar commented Jun 26, 2024

I wouldn't spend too much time on it. I've spent a heavy amount of time with natvis and due to the split in expression engines (Visual Studio / windbg), it's very difficult to get right. Adding VS expression engine tests to start chipping away at this problem is on my TODO.

Copy link
Collaborator

@kennykerr kennykerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet - thanks!

@kennykerr kennykerr merged commit 41d2e99 into microsoft:master Jun 26, 2024
75 checks passed
@sivadeilra sivadeilra deleted the user/ardavis/feature-get-error branch June 26, 2024 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants