Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(s2n-quic-core): add IP checksum function #1643

Merged
merged 3 commits into from
Mar 2, 2023
Merged

Conversation

camshaft
Copy link
Contributor

@camshaft camshaft commented Feb 28, 2023

Description of changes:

When constructing raw IP and UDP packets, we need a way to compute the checksum for the appropriate fields. This PR implements the internet checksum function, as described in RFC1071.

I wanted to see if using SIMD/x86 intrinsics could improve performance at all over the generic version. Turns out that it's pretty significant! Below are the comparisons to Fuchsia's internet-checksum crate, which itself is pretty optimized.

inet/s2n/checksum/1500  time:   [27.122 ns 27.129 ns 27.136 ns]
                        thrpt:  [51.482 GiB/s 51.494 GiB/s 51.508 GiB/s]
inet/fuchsia/checksum/1500
                        time:   [59.451 ns 59.478 ns 59.504 ns]
                        thrpt:  [23.477 GiB/s 23.488 GiB/s 23.498 GiB/s]
inet/s2n/checksum/9000  time:   [146.39 ns 146.41 ns 146.44 ns]
                        thrpt:  [57.238 GiB/s 57.249 GiB/s 57.258 GiB/s]
inet/fuchsia/checksum/9000
                        time:   [359.83 ns 359.86 ns 359.89 ns]
                        thrpt:  [23.290 GiB/s 23.292 GiB/s 23.294 GiB/s]
inet/s2n/checksum/65536 time:   [1.0878 µs 1.0879 µs 1.0882 µs]
                        thrpt:  [56.091 GiB/s 56.102 GiB/s 56.110 GiB/s]
inet/fuchsia/checksum/65536
                        time:   [2.3346 µs 2.3348 µs 2.3349 µs]
                        thrpt:  [26.140 GiB/s 26.142 GiB/s 26.143 GiB/s]

Screen Shot 2023-02-28 at 2 36 48 PM

Call-outs:

I tried adding a differential test using the internet-checksum crate, but it panics on integer overflow. This has been fixed in the repository but has not been published to crates.io.

Testing:

  • Differential test with the RFC implementation
    • Runs under kani and miri
  • Differential test to compare the x86 implementation to the generic implementation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@camshaft camshaft marked this pull request as ready for review February 28, 2023 21:45
}

/// Minimum size for a payload to be considered for platform-specific code
const LARGE_WRITE_LEN: usize = 32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd always be above this with the min_indistinguishable_packet_len being enforced. Though I'm fine keeping this here as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that's for the whole payload, yes. But smaller values (like just the IPv4 pseudo headers) can also be written.

unsafe fn add(&mut self, rhs: __m128i) {
// Reads pairs of bytes into a 32-bit value
//
// Since we have 16 bytes as input, we need to outputs since we're doubling the bit-width
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Since we have 16 bytes as input, we need to outputs since we're doubling the bit-width
// Since we have 16 bytes as input, we need two outputs since we're doubling the bit-width

@camshaft camshaft enabled auto-merge (squash) March 1, 2023 23:39
@camshaft camshaft merged commit e6c45d1 into main Mar 2, 2023
@camshaft camshaft deleted the camshaft/checksum branch March 2, 2023 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants