Fix TSAN s2n_shutdown failures #4055

lrstewart · 2023-06-14T21:05:27Z

Resolved issues:

Related to #4026

Description of changes:

s2n-tls has always used "informal" atomic variables, since C99 technically has no concept of concurrency. However, if we want to use TSAN to test for concurrency issues without worrying about filtering out false positives, we need to formalize our atomic operations.

Even if we stick to the C99 standard, GCC provides builtins that implement atomic operations. For small variables like int in sane environments, those operations are non-locking. That's why s2n-tls's "informal" atomics work in the first place ;)

I added an "s2n_atomic" type to s2n-tls that uses the GCC builtins if supported and enabled. This type has a couple of benefits:

Explicitly marks in our code which variables are touched by both the reader and the writer.
Provides a limited set of operations that should keep even our "informal" atomics safe. For example, using s2n_atomic_set would force us to fix this thread-safety bug.
Allows us to use TSAN to detect problems.
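
To make the idea concrete, here is a minimal sketch of a set/clear/test flag built on the GCC `__atomic` builtins. The `demo_atomic_flag` type and its functions are hypothetical names invented for this illustration and are not the actual s2n_atomic definition; the use of `sig_atomic_t` and of `__ATOMIC_SEQ_CST` ordering are also assumptions.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag type for illustration only; not the real s2n_atomic. */
typedef struct {
    sig_atomic_t val;
} demo_atomic_flag;

/* Mark the flag. No value parameter, so a "set" can never act as a clear. */
static void demo_atomic_flag_set(demo_atomic_flag *flag)
{
    assert(flag != NULL);
    /* Strongest ordering chosen for simplicity; an assumption of this sketch. */
    __atomic_store_n(&flag->val, 1, __ATOMIC_SEQ_CST);
}

/* Reset the flag. Kept separate from "set" so the intent is explicit. */
static void demo_atomic_flag_clear(demo_atomic_flag *flag)
{
    assert(flag != NULL);
    __atomic_store_n(&flag->val, 0, __ATOMIC_SEQ_CST);
}

/* Read the flag without modifying it. */
static bool demo_atomic_flag_test(demo_atomic_flag *flag)
{
    assert(flag != NULL);
    return __atomic_load_n(&flag->val, __ATOMIC_SEQ_CST) != 0;
}
```

Because every access goes through one of these three functions, a tool like TSAN sees only atomic operations on the shared variable, and a reviewer can grep for every place the flag is touched.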

Call-outs:

For now, I'm erring on the side of caution and only using the new behavior for TSAN testing. We could probably enable atomics outside of testing if available, but I'm not sure how we'd handle odd environments where the atomic operations on s2n_atomic require locking. Right now, if you enable atomic, s2n_init will fail if locking is required.
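
As a rough sketch of what "fail if locking is required" could look like, the snippet below uses the GCC builtin `__atomic_always_lock_free` to reject targets where the flag's underlying type would need a lock. The `demo_checked_flag` type and `demo_checked_flag_init` function are hypothetical and exist only for this example; the real check in this PR may be structured differently.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag type for illustration only. */
typedef struct {
    sig_atomic_t val;
} demo_checked_flag;

/* Returns 0 on success, -1 if atomic access to the flag could require a lock. */
static int demo_checked_flag_init(demo_checked_flag *flag)
{
    assert(flag != NULL);
    /* The size argument must be a compile-time constant. Passing 0 as the
     * pointer asks about a typically aligned object of that size. */
    if (!__atomic_always_lock_free(sizeof(flag->val), 0)) {
        return -1;
    }
    __atomic_store_n(&flag->val, 0, __ATOMIC_SEQ_CST);
    return 0;
}
```

On common targets an int-sized value is lock-free, so the check is expected to pass there; the failure path only matters for the odd environments mentioned above.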

Testing:

ThreadSanitizer passes for all tests now: https://us-west-2.console.aws.amazon.com/codesuite/codebuild/024603541914/projects/s2nGeneralBatch/batch/s2nGeneralBatch%3A0ecc09d3-bd8d-4fe7-9380-cba0b9e823ca?region=us-west-2
Previously it failed for s2n_examples_test due to the shutdown issues.
I will add a test for KeyUpdates in another PR, once I've officially enabled the ThreadSanitizer CI job.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

utils/s2n_atomic.h

camshaft · 2023-06-14T22:09:06Z

+ * rely on setting / clearing a small value generally being atomic in practice.
+ */
+S2N_RESULT s2n_atomic_init();
+void s2n_atomic_set(s2n_atomic *var);


It's probably worth copying the naming conventions from rust or cpp. That would be store instead of set, load instead of check.

Idk, store implies that you can specify a value, instead of it always storing true.
And how about "test" instead of "check"? That'd be in line with the builtin __atomic_test_and_set

After our offline conversation, I replaced set/clear with store/load. But I've reverted that-- I think set/clear is safer. See the KeyUpdate bug I use as an example in the description: because the reader blindly sets conn->key_update_pending to key_update_request, it could be setting key_update_pending back to 0. But only the writer is allowed to set key_update_pending back to 0, or else we lose key updates. Developers need to be very clear on whether they're setting or clearing the flag.

set/clear/test doesn't match atomic operations, but it does match our bitflag operations.
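
The lost-update hazard described above can be sketched in a few lines. This is a simplified, single-threaded illustration and not the actual s2n-tls code: `demo_conn` and the three `demo_` functions are hypothetical, and only the field name key_update_pending and the value key_update_request come from the discussion.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection state discussed above. */
typedef struct {
    sig_atomic_t key_update_pending;
} demo_conn;

/* "store" style: the reader copies the request value into the flag.
 * A request of 0 silently erases an earlier pending update. */
static void demo_reader_store(demo_conn *conn, int key_update_request)
{
    assert(conn != NULL);
    __atomic_store_n(&conn->key_update_pending, key_update_request, __ATOMIC_SEQ_CST);
}

/* "set" style: the reader can only raise the flag, never lower it. */
static void demo_reader_set_if_requested(demo_conn *conn, int key_update_request)
{
    assert(conn != NULL);
    if (key_update_request) {
        __atomic_store_n(&conn->key_update_pending, 1, __ATOMIC_SEQ_CST);
    }
}

/* Writer side: the only place the flag is cleared. Returns whether an
 * update was pending, clearing it in the same atomic step. */
static bool demo_writer_take_pending(demo_conn *conn)
{
    assert(conn != NULL);
    return __atomic_exchange_n(&conn->key_update_pending, 0, __ATOMIC_SEQ_CST) != 0;
}
```

If the reader handles a request of 1 followed by a request of 0 before the writer runs, the store variant leaves nothing for the writer to act on, while the set variant still reports the pending update.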

CMakeLists.txt

camshaft

Very nice 👍

maddeleine · 2023-06-15T18:40:42Z

> For now, I'm erring on the side of caution and only using the new behavior for TSAN testing.

TSAN isn't going to catch errors unless we have threading tests right? Would we actually be able to catch the key_update issue since our thread tests don't send enough data to do a keyupdate?

lrstewart · 2023-06-15T19:01:57Z

> For now, I'm erring on the side of caution and only using the new behavior for TSAN testing.
>
> TSAN isn't going to catch errors unless we have threading tests right? Would we actually be able to catch the key_update issue since our thread tests don't send enough data to do a keyupdate?

I have a KeyUpdate thread test written and blocked on this PR. That will trigger TSAN.

That's also why this PR doesn't actually fix the KeyUpdate problems yet ;)

github-actions bot added the s2n-core team label Jun 14, 2023

Fix TSAN s2n_shutdown failures

d0322c0

lrstewart force-pushed the threads_close branch from 5bd890c to d0322c0 Compare June 14, 2023 21:24

lrstewart requested a review from camshaft June 14, 2023 22:00

lrstewart marked this pull request as ready for review June 14, 2023 22:00

lrstewart requested a review from maddeleine June 14, 2023 22:00

camshaft reviewed Jun 14, 2023

lrstewart added 2 commits June 14, 2023 17:19

PR comments: renames + move to struct

6b7cadd

PR comments: rework compiler flags

255447e

lrstewart force-pushed the threads_close branch from 0b34580 to 255447e Compare June 15, 2023 00:19

Make feature test check for locking

b113101

lrstewart requested a review from camshaft June 15, 2023 01:08

lrstewart added 2 commits June 14, 2023 18:11

Use same methods in feature test as in source

15445c2

Change store/load back to set/clear/test

04149ee

lrstewart force-pushed the threads_close branch from 5a0fce9 to 04149ee Compare June 15, 2023 05:50

camshaft approved these changes Jun 15, 2023

maddeleine approved these changes Jun 15, 2023

Merge branch 'main' into threads_close

ec81b5e

lrstewart enabled auto-merge (squash) June 15, 2023 19:15

lrstewart merged commit c9dd66e into aws:main Jun 15, 2023

lrstewart deleted the threads_close branch June 15, 2023 20:40
