Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on ODR between instrumented and non-instrumented libraries #398

Open
ramosian-glider opened this issue Sep 1, 2015 · 21 comments
Open

Comments

@ramosian-glider
Copy link
Member

Originally reported on Google Code with ID 398

When the same global variable is defined in ASan-instrumented and in a non-instrumented
libraries, there is a 50% chance that the linker will pick the non-instrumented symbol
and __asan_register_globals will attempt to poison redzones around it.

Negative effects range from cryptic out-of-bounds reports to startup CHECK failures
(ex. because the uninstrumented variable is not 32-byte aligned).

I wonder if this can be mitigated by making the reference in asan global descriptor
point to a local symbol for the same global.

Reported by [email protected] on 2015-07-14 18:09:45

@ramosian-glider
Copy link
Member Author

FYI that's what GCC does.

Reported by tetra2005 on 2015-07-14 19:03:02

@ramosian-glider
Copy link
Member Author

Hm, I've tried the following (draft) patch:

diff --git a/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/lib/Transforms/Instrumentation/AddressSanitizer.cpp
index f7b7a71..71b65c3 100644
--- a/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ b/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -1338,8 +1338,14 @@ bool AddressSanitizerModule::InstrumentGlobals(IRBuilder<> &IRB,
Module &M) {
       SourceLoc = ConstantInt::get(IntptrTy, 0);
     }

+    // Create local alias for NewGlobal to avoid crash on ODR between
+    // instrumented and non-instrumented libraries.
+    // https://code.google.com/p/address-sanitizer/issues/detail?id=398
+    auto *GA = GlobalAlias::create(
+        GlobalValue::InternalLinkage, MD.Name + M.getName(), NewGlobal);
+
     Initializers[i] = ConstantStruct::get(
-        GlobalStructTy, ConstantExpr::getPointerCast(NewGlobal, IntptrTy),
+        GlobalStructTy, ConstantExpr::getPointerCast(GA, IntptrTy),
         ConstantInt::get(IntptrTy, SizeInBytes),
         ConstantInt::get(IntptrTy, SizeInBytes + RightRedzoneSize),
         ConstantExpr::getPointerCast(Name, IntptrTy),

on this testcase:

$ cat main.c

int f = 4;
int g = 5;
int foo (int *);
int main ()
{
  return foo (&f) - 4;
}

$ cat libfoo.c

int f = 444;
int g = 555;
int foo (int *p)
{
  return *p;
}

and got interesting results. If we force clang to use external (GNU) AS via '-no-integrated-as'
option, everything works well:

$ clang -no-integrated-as -fsanitize=address libfoo.c -shared -fpic -fsanitize=address
-o libfoo.so
$ clang main.c -c -o main.o
$ clang -fsanitize=address main.o ./libfoo.so -o main
$ ASAN_OPTIONS=report_globals=3  ./main 
    #0 0x4b49b7 in __asan_register_globals /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_globals.cc:226
    #1 0x7fb518d1fa0d in asan.module_ctor (libfoo.so+0xa0d)
    #2 0x7fb518f31139 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:78
    #3 0x7fb518f31222 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:36
    #4 0x7fb518f31222 in _dl_init /build/buildd/eglibc-2.19/elf/dl-init.c:126
    #5 0x7fb518f22309  (/lib64/ld-linux-x86-64.so.2+0x1309)

=== ID 629145601; 0x7fb518f20100 0x7fb518f20138
==18620==Added Global[0x7fb518f20100]: beg=0x7fb518f20080 size=4/64 name=f module=foo.c
dyn_init=0
==18620==  location (0x7fb518f1fdf8): name=foo.c[0x7fb518d1fa39], 1 5
==18620==Added Global[0x7fb518f20138]: beg=0x7fb518f200c0 size=4/64 name=g module=foo.c
dyn_init=0
==18620==  location (0x7fb518f1fe08): name=foo.c[0x7fb518d1fa39], 2 5
==18620==Removed Global[0x7fb518f20100]: beg=0x7fb518f20080 size=4/64 name=f module=foo.c
dyn_init=0
==18620==  location (0x7fb518f1fdf8): name=foo.c[0x7fb518d1fa39], 1 5
==18620==Removed Global[0x7fb518f20138]: beg=0x7fb518f200c0 size=4/64 name=g module=foo.c
dyn_init=0
==18620==  location (0x7fb518f1fe08): name=foo.c[0x7fb518d1fa39], 2 5

For f and g, beg is located in libfoo.so and this is fine.

However, when use embedded clang asm, I've got this:

$ clang -fsanitize=address libfoo.c -shared -fpic -fsanitize=address -o libfoo.so
$ clang main.c -c -o main.o
$ clang -fsanitize=address main.o ./libfoo.so -o main

$ ASAN_OPTIONS=report_globals=3   ./main 
    #0 0x4b49b7 in __asan_register_globals /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_globals.cc:226
    #1 0x7f09ab41da0d in asan.module_ctor (libfoo.so+0xa0d)
    #2 0x7f09ab62f139 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:78
    #3 0x7f09ab62f222 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:36
    #4 0x7f09ab62f222 in _dl_init /build/buildd/eglibc-2.19/elf/dl-init.c:126
    #5 0x7f09ab620309  (/lib64/ld-linux-x86-64.so.2+0x1309)

=== ID 662700033; 0x7f09ab61e100 0x7f09ab61e138
==18635==Added Global[0x7f09ab61e100]: beg=0x00000070bb10 size=4/64 name=f module=foo.c
dyn_init=0
==18635==  location (0x7f09ab61ddf8): name=foo.c[0x7f09ab41da39], 1 5
==18635==Added Global[0x7f09ab61e138]: beg=0x00000070bb14 size=4/64 name=g module=foo.c
dyn_init=0
==18635==  location (0x7f09ab61de08): name=foo.c[0x7f09ab41da39], 2 5
==18635==AddressSanitizer CHECK failed: /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_globals.cc:146
"((AddrIsAlignedByGranularity(g->beg))) != (0)" (0x0, 0x0)
    #0 0x4bca0d in __asan::AsanCheckFailed(char const*, int, char const*, unsigned
long long, unsigned long long) /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_rtl.cc:71
    #1 0x4c3b03 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned
long long, unsigned long long) /home/max/src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common.cc:136
    #2 0x4b520b in RegisterGlobal /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_globals.cc:146
    #3 0x4b520b in __asan_register_globals /home/max/src/llvm/projects/compiler-rt/lib/asan/asan_globals.cc:230
    #4 0x7f09ab41da0d in asan.module_ctor (libfoo.so+0xa0d)
    #5 0x7f09ab62f139 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:78
    #6 0x7f09ab62f222 in call_init /build/buildd/eglibc-2.19/elf/dl-init.c:36
    #7 0x7f09ab62f222 in _dl_init /build/buildd/eglibc-2.19/elf/dl-init.c:126
    #8 0x7f09ab620309  (/lib64/ld-linux-x86-64.so.2+0x1309)

Here, beg for f and g beg=0x00000070bb**, that is located in main. So, for embedded
asm local aliases didn't work. I wonder if this is an assembler bug? Or maybe there
is another way to create local alias that would work with both external an embedded
as?

Reported by chefMax7 on 2015-08-06 16:39:41

@ramosian-glider
Copy link
Member Author

The patch looks good in principle, but it has to work with intergrated-as.
Could you try reproducing the difference in the assembler behavior and file a bug with
llvm? 

Reported by [email protected] on 2015-08-07 18:27:51

@chefmax
Copy link

chefmax commented Oct 5, 2015

Well, since pr24486 was fixed (https://llvm.org/bugs/show_bug.cgi?id=24486), the patch I've posted above works fine for both integrated/GNU as. But if use it as it is, the ODR violation detection would be broken, because globals would have different addresses. There is a long discussion about this in GCC bugzilla (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888) and I don't know what solution is preferable here. Yury suggested to use names of globals instead of their addresses to detect ODR violation, is it an option here?

@kcc
Copy link
Contributor

kcc commented Oct 5, 2015

Dunno, maybe.
At the very least it will be slower. Do you want to make a prototype and prove otherwise?

Frankly, I am not motivated to invest time into this because in llvm everything works for the most common case (everything is instrumented) and in GCC the ODR checking is currently disabled. (Right?)

@ghost
Copy link

ghost commented Oct 6, 2015

I'd say that sanitized is an important usecase as well and silently dropping support for it (especially in such an ugly way as in this bug) is a huge step back. How about we pass both global and local addresses to asan_register_global? Local address can then be used for poisoning and global for odr.

@kcc
Copy link
Contributor

kcc commented Oct 6, 2015

sanitized is an important usecase
Typo?

How about we pass both global and local addresses to asan_register_global?
That'll be another ABI break. Not a huge deal, but still..
I am still not sure how this is going to look. Want to prototype?

@ghost
Copy link

ghost commented Oct 6, 2015

Yeah, I meant sanitized library.

I'm not sure why you care about ABI breaks but we could keep existing function and just add a new "extended" version to preserve compatibility. Max?

@chefmax
Copy link

chefmax commented Oct 6, 2015

So, we would just compare names to detect ODR violation? I think this can work (although we can still have ODR violation FP for C and others, but this is a different story). I can prepare a prototype and post it here.

@ghost
Copy link

ghost commented Oct 6, 2015

No, we'll still use global addresses for odr but locals for poisoning. That'll require a change to Asan ABI (register_global function).

@kcc
Copy link
Contributor

kcc commented Oct 6, 2015

I'm not sure why you care about ABI breaks
I don't care too much, I just don't want to do it w/o a good reason.
I can prepare a prototype and post it here.
Not here, at llvm-commits, please

@ygribov
Copy link

ygribov commented Oct 26, 2015

LLVM's current approach to poisoning globals may also cause another tricky bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68016 (yes, we've seen it in production).

@chefmax
Copy link

chefmax commented Dec 17, 2015

One another approach here:

  1. Use private aliases for globals to be instrumented.
  2. Pass an address to this alias to runtime to avoid poisoning of innocent globals in non-sanitized modules.
  3. To preserve ODR violation checking, we introduce another global symbol (say, _asan_gen_XXX) and use it as indicator in runtime (say, 0 and 1).

Cons:

  • The size of .bss section will be increased, we have doubled number of global symbols.
  • ASan won't detect global buffer overflow in code with copy relocations due to using private aliases (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68016).
  • ABI change.

Proc:

  • This way, mixing sanitized code with non-sanitzed would be safe.

I have a prototype for this and if there aren't strong objections here, I'll post it at llvm-commits shortly.

@ghost
Copy link

ghost commented Dec 17, 2015

The size of .bss section will be increased, we have doubled number of global symbols.

On the other hand, new globals would be 1-byte bools.

ASan won't detect global buffer overflow in code with copy relocations due to using private aliases

Why? You could still poison copy-relocated global if it's __asan_gen_XXX is present.

@chefmax
Copy link

chefmax commented Dec 17, 2015

ASan won't detect global buffer overflow in code with copy relocations due to using private aliases
Why? You could still poison copy-relocated global if it's __asan_gen_XXX is present.

Sorry, I've meant non-sanitized binary and sanitized library use case. For sanitized binary we indeed can poison copy-relocated global.

@ghost
Copy link

ghost commented Dec 17, 2015

Sorry, I've meant non-sanitized binary and sanitized library use case.

Well, this case currently segfaults so I wouldn't treat this as a regression)

llvm-mirror pushed a commit to llvm-mirror/llvm that referenced this issue Feb 8, 2016
As discussed in google/sanitizers#398, with current
implementation of poisoning globals we can have some CHECK failures or false
positives in case of mixing instrumented and non-instrumented code due to ASan
poisons innocent globals from non-sanitized binary/library. We can use private
aliases to avoid such errors. In addition, to preserve ODR violation detection,
we introduce new __odr_asan_gen_XXX symbol for each instrumented global that
indicates if this global was already registered. To detect ODR violation in
runtime, we should only check the value of indicator and report an error if it
isn't equal to zero.

Differential Revision: http://reviews.llvm.org/D15642


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260075 91177308-0d34-0410-b5e6-96231b3b80d8
earl pushed a commit to earl/llvm-mirror that referenced this issue Feb 8, 2016
As discussed in google/sanitizers#398, with current
implementation of poisoning globals we can have some CHECK failures or false
positives in case of mixing instrumented and non-instrumented code due to ASan
poisons innocent globals from non-sanitized binary/library. We can use private
aliases to avoid such errors. In addition, to preserve ODR violation detection,
we introduce new __odr_asan_gen_XXX symbol for each instrumented global that
indicates if this global was already registered. To detect ODR violation in
runtime, we should only check the value of indicator and report an error if it
isn't equal to zero.

Differential Revision: http://reviews.llvm.org/D15642


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260075 91177308-0d34-0410-b5e6-96231b3b80d8
@chefmax
Copy link

chefmax commented Feb 11, 2016

This should be fixed on trunk now, can we close this?

@ghost
Copy link

ghost commented Feb 11, 2016

Another option is waiting until this is enabled by default.

@morehouse
Copy link
Contributor

@kcc: Do we want to enable -asan-use-private-alias by default?

@rnk
Copy link
Contributor

rnk commented Jun 6, 2018

I think this is the same as https://bugs.llvm.org/show_bug.cgi?id=37545

delcypher pushed a commit to swiftlang/swift that referenced this issue Dec 12, 2020
Previously Swift enabled the "UseOdrIndicator" ASan instrumentation mode
and gave no option to disable this. This probably wasn't intentional but
happened due to the fact the
`createModuleAddressSanitizerLegacyPassPass()` function has a default
value for the `UseOdrIndicator` parameter of `true` and in Swift we
never specified this parameter explicitly.

Clang disables the "UseOdrIndicator" mode by default but allows it to be
enabled using the `-fsanitize-address-use-odr-indicator` flag.
Having "UseOdrIndicator" off by default is probably the right
default choice because it bloats the binary. So this patch changes the
Swift compiler to match Clang's behavior.

This patch disables the "UseOdrIndicator" mode by default but adds a
hidden driver and frontend flag (`-sanitize-address-use-odr-indicator`)
to enable it. The flag is hidden so that we can remove it in the future
if needed.

A side of disabling "UseOdrIndicator" is that by we will no longer use
private aliases for poisoning globals. Private aliases were introduced
to avoid crashes (google/sanitizers#398) due
to ODR violations with non-instrumented binaries. On Apple platforms the
use of two-level namespaces probably means that using private aliases
wasn't ever really necessary to avoid crashes. On platforms with a flat
linking namespace (e.g. Linux) using private aliases might matter more
but should users actually run into problems they can either:

* Fix their environment to remove the ODR, thus avoiding the crash.
* Instrument the previously non-instrumented code to avoid the crash.
* Use the new `-sanitize-address-use-odr-indicator` flag

rdar://problem/69335186
delcypher pushed a commit to swiftlang/swift that referenced this issue Dec 14, 2020
Previously Swift enabled the "UseOdrIndicator" ASan instrumentation mode
and gave no option to disable this. This probably wasn't intentional but
happened due to the fact the
`createModuleAddressSanitizerLegacyPassPass()` function has a default
value for the `UseOdrIndicator` parameter of `true` and in Swift we
never specified this parameter explicitly.

Clang disables the "UseOdrIndicator" mode by default but allows it to be
enabled using the `-fsanitize-address-use-odr-indicator` flag.
Having "UseOdrIndicator" off by default is probably the right
default choice because it bloats the binary. So this patch changes the
Swift compiler to match Clang's behavior.

This patch disables the "UseOdrIndicator" mode by default but adds a
hidden driver and frontend flag (`-sanitize-address-use-odr-indicator`)
to enable it. The flag is hidden so that we can remove it in the future
if needed.

A side of disabling "UseOdrIndicator" is that by we will no longer use
private aliases for poisoning globals. Private aliases were introduced
to avoid crashes (google/sanitizers#398) due
to ODR violations with non-instrumented binaries. On Apple platforms the
use of two-level namespaces probably means that using private aliases
wasn't ever really necessary to avoid crashes. On platforms with a flat
linking namespace (e.g. Linux) using private aliases might matter more
but should users actually run into problems they can either:

* Fix their environment to remove the ODR, thus avoiding the crash.
* Instrument the previously non-instrumented code to avoid the crash.
* Use the new `-sanitize-address-use-odr-indicator` flag

rdar://problem/69335186
delcypher pushed a commit to swiftlang/swift that referenced this issue Dec 15, 2020
Previously Swift enabled the "UseOdrIndicator" ASan instrumentation mode
and gave no option to disable this. This probably wasn't intentional but
happened due to the fact the
`createModuleAddressSanitizerLegacyPassPass()` function has a default
value for the `UseOdrIndicator` parameter of `true` and in Swift we
never specified this parameter explicitly.

Clang disables the "UseOdrIndicator" mode by default but allows it to be
enabled using the `-fsanitize-address-use-odr-indicator` flag.
Having "UseOdrIndicator" off by default is probably the right
default choice because it bloats the binary. So this patch changes the
Swift compiler to match Clang's behavior.

This patch disables the "UseOdrIndicator" mode by default but adds a
hidden driver and frontend flag (`-sanitize-address-use-odr-indicator`)
to enable it. The flag is hidden so that we can remove it in the future
if needed.

A side effect  of disabling "UseOdrIndicator" is that by we will no
longer use private aliases for poisoning globals. Private aliases were
introduced to avoid crashes
(google/sanitizers#398) due to ODR violations
with non-instrumented binaries. On Apple platforms the use of two-level
namespaces probably means that using private aliases wasn't ever really
necessary to avoid crashes. On platforms with a flat linking namespace
(e.g. Linux) using private aliases might matter more but should users
actually run into problems they can either:

* Fix their environment to remove the ODR, thus avoiding the crash.
* Instrument the previously non-instrumented code to avoid the crash.
* Use the new `-sanitize-address-use-odr-indicator` flag

rdar://problem/69335186
MaskRay added a commit to MaskRay/llvm-project that referenced this issue Nov 3, 2022
This enables odr indicators on all platforms and private aliases on non-Windows.
Note that GCC also uses private aliases: this fixes spurious
`The following global variable is not properly aligned.` errors for interposed global variables
(PR37545 (this patch should allow us to restore D46665) and
google/sanitizers#1017)

Global variables of non-hasExactDefinition() linkages (i.e.
linkonce/linkonce_odr/weak/weak_odr/common/external_weak) are not instrumented.
If an instrumented variable gets interposed to an uninstrumented variable due to
symbol interposition (e.g. in PR37545, _ZTS1A in foo.so is resolved to _ZTS1A in
the executable), there may be a bogus error.

With private aliases, the register code will not resolve to a definition in
another module, and thus prevent the issue.

google/sanitizers#398 Similar to the above, but about
an instrumented global variable gets interposed to an uninstrumented global
variable (not using address sanitizer) in another module.

Cons: negligible size increase. On ELF, this is mainly due to extra `__odr_asan_gen_*` symbols.
In relocatable files, private aliases replace some relocations referencing
global symbols with .L symbols. This may introduce some STT_SECTION symbols.

For lld, with -g0, the size increase is 0.07~0.09% for many configurations I
have tested: -O0, -O1, -O2, -O3, -O2 -ffunction-sections -fdata-sections
-Wl,--gc-sections. With -g1 or above, the size increase ratio will be even smaller.

This replaces D92078.

Don't migrate Windows for now: the static data member of a specialization
`std::num_put<char>::id` is a weak symbol, as well as its ODR indicator.
Unfortunately, link.exe (and lld without -lldmingw) generally doesn't support duplicate
weak definitions (weak symbols in different TUs likely pick different defined external symbols
and conflict).

Differential Revision: https://reviews.llvm.org/D137227
MaskRay added a commit to llvm/llvm-project that referenced this issue Nov 3, 2022
This enables odr indicators on all platforms and private aliases on non-Windows.
Note that GCC also uses private aliases: this fixes bogus
`The following global variable is not properly aligned.` errors for interposed global variables

Fix google/sanitizers#398
Fix google/sanitizers#1017
Fix #36893 (we can restore D46665)

Global variables of non-hasExactDefinition() linkages (i.e.
linkonce/linkonce_odr/weak/weak_odr/common/external_weak) are not instrumented.
If an instrumented variable gets interposed to an uninstrumented variable due to
symbol interposition (e.g. in issue 36893, _ZTS1A in foo.so is resolved to _ZTS1A in
the executable), there may be a bogus error.

With private aliases, the register code will not resolve to a definition in
another module, and thus prevent the issue.

Cons: minor size increase. This is mainly due to extra `__odr_asan_gen_*` symbols.
(ELF) In addition, in relocatable files private aliases replace some relocations
referencing global symbols with .L symbols and may introduce some STT_SECTION symbols.

For lld, with -g0, the size increase is 0.07~0.09% for many configurations I
have tested: -O0, -O1, -O2, -O3, -O2 -ffunction-sections -fdata-sections
-Wl,--gc-sections. With -g1 or above, the size increase ratio will be even smaller.

This patch obsoletes D92078.

Don't migrate Windows for now: the static data member of a specialization
`std::num_put<char>::id` is a weak symbol, as well as its ODR indicator.
Unfortunately, link.exe (and lld without -lldmingw) generally doesn't support
duplicate weak definitions (weak symbols in different TUs likely pick different
defined external symbols and conflict).

Differential Revision: https://reviews.llvm.org/D137227
@MaskRay
Copy link

MaskRay commented Nov 3, 2022

This issue can be closed now that -fsanitize-address-use-odr-indicator is the default for non-Windows platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants