HTTP2: Optimize huffman encoding static table initialization #45297
Comments
Tagging subscribers to this area: @dotnet/ncl
Issue Details
Currently the Huffman encoding static table is compiled into ~5.5K IL bytes, which is JITed into ~9.5K of ASM code. It might be beneficial to somehow optimize the above.
|
My recommendation is to create a ReadOnlySpan<byte> containing the source data from which the encoding table will be initialized.
private static ReadOnlySpan<byte> s_encodingTableData => new byte[257*5]
{
0b11111111, 0b11000000, 0b00000000, 0b00000000, 13,
0b11111111, 0b11111111, 0b10110000, 0b00000000, 23,
....
};
private static readonly (uint code, int bitLength)[] s_encodingTable = GenerateEncodingTable();
private static (uint code, int bitLength)[] GenerateEncodingTable()
{
// TODO: build it from s_encodingTableData bytes by converting 4 bytes into 'uint code' and the 5th into 'int bitLength'
} |
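The TODO in the comment above could be filled in roughly as follows. This is only a minimal sketch, not the code that was merged: the class name `HuffmanTableSketch` and the methods `BuildTable` and `BuildSampleTable` are hypothetical, and the sample data contains only the two rows quoted above rather than the full 257 rows. It assumes each row is a big-endian, left-aligned 32-bit code followed by one bit-length byte.

```csharp
using System;
using System.Buffers.Binary;

static class HuffmanTableSketch
{
    // Assumption: only the two rows quoted in the comment above.
    // The real table would hold 257 rows of 5 bytes each.
    private static ReadOnlySpan<byte> SampleData => new byte[]
    {
        0b11111111, 0b11000000, 0b00000000, 0b00000000, 13,
        0b11111111, 0b11111111, 0b10110000, 0b00000000, 23,
    };

    // Converts packed rows (4 code bytes + 1 bit-length byte) into a tuple table.
    public static (uint code, int bitLength)[] BuildTable(ReadOnlySpan<byte> data)
    {
        const int RowSize = 5;
        if (data.Length % RowSize != 0)
        {
            throw new ArgumentException("Data length must be a multiple of 5.", nameof(data));
        }

        var table = new (uint code, int bitLength)[data.Length / RowSize];
        for (int i = 0; i < table.Length; i++)
        {
            // Slice once per row, then read the code as a big-endian uint.
            ReadOnlySpan<byte> row = data.Slice(i * RowSize, RowSize);
            table[i] = (BinaryPrimitives.ReadUInt32BigEndian(row), row[4]);
        }
        return table;
    }

    // Hypothetical convenience wrapper over the two sample rows.
    public static (uint code, int bitLength)[] BuildSampleTable() => BuildTable(SampleData);
}
```

Slicing one row at a time keeps the index arithmetic in one place; whether this is faster or slower than the alternatives discussed in this thread is not something the sketch tries to show.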
@rokonec would you like to take this one? |
The strategy outlined looks good to me. |
I'd suggest just starting with a |
@scalablecory I'd like to take this one if I could. Please assign me to it if you feel it's OK. |
@stephentoub I was thinking about something like this: https://sharplab.io/#gist:c13c371e129cb662d20d6c2d3ab0f312 Sharplab does not support static constructors and initialization in its JIT ASM view, so the above sample is 'unstaticisied'. |
You can do that, but you're now incurring lots of additional bounds checks and division and the like.
[Benchmark]
public (uint code, int bitLength)[] Create1()
{
var data = s_encodingTableData;
var table = new (uint code, int bitLength)[data.Length / 5];
for (int i = 0; i < s_encodingTableData.Length;)
{
table[i / 5] = ((uint)((data[i++] << 24) | (data[i++] << 16) | (data[i++] << 8) | data[i++]), data[i++]);
}
return table;
}
[Benchmark]
public (uint code, int bitLength)[] Create2()
{
var uintData = s_encodingTableUintData;
var byteData = EncodingTableByteData;
var table = new (uint code, int bitLength)[uintData.Length];
for (int i = 0; i < uintData.Length; i++)
{
table[i] = (uintData[i], byteData[i]);
}
return table;
}
That said, in retrospect this probably doesn't matter, since this code is only ever going to be invoked once. And with the uint[] approach, there will also be the initial cost to allocate and blit the contents of the array, but that would go away with #24961. That's all to say, what you proposed seems fine for now, but I do think we should change it when #24961 arrives. I'd even be curious to know if at that point it's necessary at all to build up this single table, rather than just indexing into each of the two spans to get the two distinct values needed by the call site. And, actually, is that necessary even now? What's the impact on perf of the call site if you just use the following:
uint code = s_encodingTableCodes[octet];
int bitLength = EncodingTableBitLengths[octet]; |
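To make the two-table idea concrete, here is a minimal sketch of a call site that indexes two parallel tables instead of a prebuilt tuple array. The field names `s_encodingTableCodes` and `EncodingTableBitLengths` come from the comment above; the class name `HuffmanLookupSketch`, the method `Lookup`, and the two-entry tables are assumptions made for illustration only.

```csharp
using System;

static class HuffmanLookupSketch
{
    // Assumption: only the first two left-aligned codes from the rows quoted
    // earlier in the thread; the real table would have 257 entries.
    private static readonly uint[] s_encodingTableCodes =
    {
        0xFFC00000, // 0b11111111_11000000_00000000_00000000
        0xFFFFB000, // 0b11111111_11111111_10110000_00000000
    };

    // A ReadOnlySpan<byte> property over a constant byte array lets the C#
    // compiler point at static data instead of allocating an array.
    private static ReadOnlySpan<byte> EncodingTableBitLengths => new byte[] { 13, 23 };

    // Hypothetical call site: two independent lookups, no combined table.
    public static (uint code, int bitLength) Lookup(int octet)
    {
        uint code = s_encodingTableCodes[octet];
        int bitLength = EncodingTableBitLengths[octet];
        return (code, bitLength);
    }
}
```

With this layout there is nothing to build at startup beyond the `uint[]` itself, at the price of two bounds-checked lookups per call. The sketch does not measure that trade-off.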
For now, the encoding table is only needed to generate the decoding table (once) and in unit tests. Maybe we can skip building that table and just use those two arrays (uint[] and ReadOnlySpan<byte>) directly when generating the decoding table. If we decide to support Huffman encoding in our HTTP2 clients later, we can optimize for it at that time. |
I'll create a PR with @stephentoub's latest recommendation:
uint code = s_encodingTableCodes[octet];
int bitLength = EncodingTableBitLengths[octet];
We can iterate on it in the PR, as those are really elementary changes. |
Great. Thanks. |
PR #45303 |
Linked PR was merged, closing. Please reopen/refile if there is something else to look at here. |
Currently the Huffman encoding static table is compiled into ~5.5K IL bytes, which is JITed into ~9.5K of ASM code.
Sharplab proof
It might be beneficial to somehow optimize the above.