-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiler_rt: Port UBSan mini-runtime #5165
Conversation
Need to make sure this doesn't end up breaking on targets where the zig standard library's |
This allows the user to know what went wrong whenever the UBSan-inserted checks fail.
The problem with depending on the std lib is that compiler_rt.a and the application are in different compilation units. So all the stdlib stuff (including the mutex protecting stderr) are duplicated. Big waste of bytes, and potentially a race with writing to stderr. That's where A problem we have to solve in order to merge this is that there are two different use cases:
The latter case is easy; compiler-rt uses The former case is a bit trickier. How much debug info code should compiler-rt include to make UBSAN errors nicer in C code? On one side of the scale we have I think the key here is to identify which version of compiler-rt we want. If we can identify whether we are in the former or latter case, it will inform which compiler-rt to generate. Zig is uniquely positioned to generate compiler-rt tailored for specific use cases like this. So here's my proposal to move forward on this: Add detection of which "compiler-rt mode" is desirable: zig-as-the-main-application tailored compiler-rt or C-as-the-main-application tailored compiler-rt. Float this mode down into compiler-rt somehow so it can make a comptime branch on that value. Depending on the mode, call I believe what this pull request currently does is unconditionally assume the C-as-the-main-application tailored compiler-rt. |
Follow-up: the UBSAN handling code could unconditionally call // Avoid dragging in the runtime safety mechanisms into this .o file,
// unless we're trying to test this file.
pub fn panic(msg: []const u8, error_return_trace: ?*builtin.StackTrace) noreturn {
@setCold(true);
if (is_test) {
std.debug.panic("{}", .{msg});
} else {
unreachable;
}
} Regardless of this PR, this should probably be changed to trap rather than unreachable. Anyway, here is where the comptime logic would go - it would be |
It's slightly more complex than that as we also have to provide the "recovery" handlers that only record/print the problem so panicking doesn't really work there. |
Do we though? Panic is noreturn so we could simply omit the recovery mechanisms and crash. This seems like a perfectly acceptable way to handle undefined behavior to me. |
Yes, the contract between clang and the compiler_rt implementation is pretty clear and you need to export both the aborting and non-aborting variants to allow the compiler to respect the If yu're not interested in providing a complete minimal ubsan runtime then we can add |
A feature I wish the current version of GCC had for ubsan is that the version that calls __builtin_trap on detection of undefined behavior would at least call a function that would pass an enum to indicate what the failure is. We are on a bare-metal target so we are using the ubsan runtime but not linking the library. We define a handler with just a breakpoint instruction for each of the potential handlers. The app on the debugger stops immediately which is nice, and I can go up the callstack to find the line where the offense occurred. The issue is that the binary gets bloated a bit extra from the unnecessary setting up of the arguments to the ubsan runtime functions, which we completely ignore. Space is premium in embedded obviously, so this is sub-optimal. We have 16MB of flash space on this target which is extremely unusual for us or else we wouldn't be able to use ubsan at all. An alternative would be to just call the different abort handlers without passing anything on failure. Really I just need the name of the function to provide the hint as to what the issue was. |
I think this is an important idea to explore, but it's not ready to merge as is. Closing for now, with the understanding that it can be re-opened in the future whenever you have time to work on it. |
@LemonBoy is it OK if I look at picking this up using your code as a basis? :) |
You have my blessing, cc me if you want a review or opinion on something. |
commit 7c9c37c12ec5f726e8298c47bac0e699b2880f49
Author: Francisco Demartino <[email protected]>
Date: Sat Feb 12 22:07:55 2022 -0300
update to zig 0.10.0-dev.85+c0ae9647f
diff --git a/zigubsan/ubsan.zig b/zigubsan/ubsan.zig
index 07acdd35..9495ab0e 100644
--- a/zigubsan/ubsan.zig
+++ b/zigubsan/ubsan.zig
@@ -19,7 +19,7 @@
// Runtime support for Clang's Undefined Behavior sanitizer
const std = @import("std");
const ubsan_value = @import("ubsan_value.zig");
-const builtin = std.builtin;
+const builtin = @import("builtin");
// for setting breakpoints with gdb / lldb
export fn ubsanbreak() void {}
@@ -30,10 +30,10 @@ export fn ubsanbreak() void {}
fn makeHandler(comptime error_msg: []const u8) type {
return struct {
pub fn recover_handler() callconv(.C) void {
- std.debug.warn("ubsan: " ++ error_msg, .{});
+ std.debug.print("ubsan: " ++ error_msg, .{});
}
pub fn abort_handler() callconv(.C) noreturn {
- std.debug.warn("ubsan: " ++ error_msg, .{});
+ std.debug.print("ubsan: " ++ error_msg, .{});
// was panic here instead, but panicked while panicking. let's add a silly hook for breaking
ubsanbreak();
std.os.exit(234);
@@ -49,12 +49,12 @@ const OverflowData = extern struct {
const DynamicTypeCacheMissData = extern struct {
source_location: ubsan_value.SourceLocation,
type_descriptor: *const ubsan_value.TypeDescriptor,
- type_info: ?*c_void,
+ type_info: ?*opaque{},
type_check_kind: u8,
};
fn exportHandlers(comptime handlers: anytype, comptime export_name: []const u8) void {
- const linkage: builtin.GlobalLinkage = if (std.builtin.is_test) .Internal else .Weak;
+ const linkage: std.builtin.GlobalLinkage = if (builtin.is_test) .Internal else .Weak;
{
const handler_symbol = std.builtin.ExportOptions{ .name = "__ubsan_" ++ export_name, .linkage = linkage };
@export(handlers.recover_handler, handler_symbol);
@@ -66,7 +66,7 @@ fn exportHandlers(comptime handlers: anytype, comptime export_name: []const u8)
}
const abort_log = std.debug.panic;
-const warn_log = std.debug.warn;
+const warn_log = std.debug.print;
fn handleOverflow(comptime report_log: anytype, comptime name: []const u8, comptime symbol: []const u8, overflow_data: *OverflowData, lhs: ubsan_value.ValueHandle, rhs: ubsan_value.ValueHandle) void {
const source_location = overflow_data.source_location.acquire();
@@ -174,7 +174,7 @@ export fn __ubsan_handle_dynamic_type_cache_miss_abort(dynamic_type_cache_miss_d
export fn __ubsan_handle_dynamic_type_cache_miss(dynamic_type_cache_miss_data: *DynamicTypeCacheMissData, pointer: ubsan_value.ValueHandle, hash: ubsan_value.ValueHandle) callconv(.C) void {
if (handleDynamicTypeCacheMiss(dynamic_type_cache_miss_data, pointer, hash)) {
- std.debug.warn("ubsan: " ++ "handle_dynamic_type_cache_miss", .{});
+ std.debug.print("ubsan: " ++ "handle_dynamic_type_cache_miss", .{});
}
}
@@ -220,7 +220,7 @@ comptime {
.{ "handle_add_overflow_abort", "add-overflow-abort", .Abort, .Full },
};
- const linkage: builtin.GlobalLinkage = if (std.builtin.is_test) .Internal else .Weak;
+ const linkage: std.builtin.GlobalLinkage = if (builtin.is_test) .Internal else .Weak;
inline for (HANDLERS) |entry| {
const handler = makeHandler(entry[1]);
diff --git a/zigubsan/ubsan_value.zig b/zigubsan/ubsan_value.zig
index 4d01e12c..16ba6867 100644
--- a/zigubsan/ubsan_value.zig
+++ b/zigubsan/ubsan_value.zig
@@ -54,7 +54,7 @@ pub const TypeDescriptor = extern struct {
pub fn getFloatSize(type_descriptor: *const TypeDescriptor) u64 {
if (type_descriptor.kind != .float) unreachable;
- return info;
+ return type_descriptor.info;
}
fn getIntValueBits(type_descriptor: *const TypeDescriptor, value_handle: ValueHandle) u128 {
@@ -89,4 +89,4 @@ pub const TypeDescriptor = extern struct {
// If the type is small enough - its value is stored in
// in the ValueHandle pointer, otherwise it is what the
// the ValueHandle points at.
-pub const ValueHandle = *c_void;
+pub const ValueHandle = *opaque{}; |
commit 8d01d9ca4566322b16b0d3bcb4c33d0ac4fa0806
Author: Francisco Demartino <[email protected]>
Date: Thu Mar 16 22:06:06 2023 -0300
fix zig ubsan for 0.10.1
diff --git a/zigubsan/ubsan.zig b/zigubsan/ubsan.zig
index 9495ab0e..ec0026de 100644
--- a/zigubsan/ubsan.zig
+++ b/zigubsan/ubsan.zig
@@ -72,18 +72,20 @@ fn handleOverflow(comptime report_log: anytype, comptime name: []const u8, compt
const source_location = overflow_data.source_location.acquire();
const format_string = "ubsan: " ++ name ++ " Overflow: {} " ++ symbol ++ " {} in type {s}\n{s}:{}:{}\n";
const type_name = overflow_data.type_descriptor.getNameAsString();
-
- if (overflow_data.type_descriptor.isSignedInteger()) {
- const lhs_value = overflow_data.type_descriptor.getSignedIntValue(lhs);
- const rhs_value = overflow_data.type_descriptor.getSignedIntValue(rhs);
- report_log(format_string, .{ lhs_value, rhs_value, type_name, source_location.file_name, source_location.line, source_location.column });
- } else {
- const lhs_value = overflow_data.type_descriptor.getUnsignedIntValue(lhs);
- const rhs_value = overflow_data.type_descriptor.getUnsignedIntValue(rhs);
- report_log(format_string, .{ lhs_value, rhs_value, type_name, source_location.file_name, source_location.line, source_location.column });
+ if (source_location.file_name) |file_name| {
+ if (overflow_data.type_descriptor.isSignedInteger()) {
+ const lhs_value = overflow_data.type_descriptor.getSignedIntValue(lhs);
+ const rhs_value = overflow_data.type_descriptor.getSignedIntValue(rhs);
+ report_log(format_string, .{ lhs_value, rhs_value, type_name, file_name, source_location.line, source_location.column });
+ } else {
+ const lhs_value = overflow_data.type_descriptor.getUnsignedIntValue(lhs);
+ const rhs_value = overflow_data.type_descriptor.getUnsignedIntValue(rhs);
+ report_log(format_string, .{ lhs_value, rhs_value, type_name, file_name, source_location.line, source_location.column });
+ }
}
}
+
fn makeOverflowHandler(comptime export_name: []const u8, comptime name: []const u8, comptime symbol: []const u8) void {
const handlers = struct {
pub fn recover_handler(overflow_data: *OverflowData, lhs: ubsan_value.ValueHandle, rhs: ubsan_value.ValueHandle) callconv(.C) void {
@@ -107,12 +109,14 @@ fn handleNegateOverflow(comptime report_log: anytype, overflow_data: *OverflowDa
const format_string = "ubsan: Negation of {} cannot be represented in type {s}\n{s}:{}:{}\n";
const type_name = overflow_data.type_descriptor.getNameAsString();
- if (overflow_data.type_descriptor.isSignedInteger()) {
- const int_value = overflow_data.type_descriptor.getSignedIntValue(value);
- report_log(format_string, .{ int_value, type_name, source_location.file_name, source_location.line, source_location.column });
- } else {
- const int_value = overflow_data.type_descriptor.getUnsignedIntValue(value);
- report_log(format_string, .{ int_value, type_name, source_location.file_name, source_location.line, source_location.column });
+ if (source_location.file_name) |file_name| {
+ if (overflow_data.type_descriptor.isSignedInteger()) {
+ const int_value = overflow_data.type_descriptor.getSignedIntValue(value);
+ report_log(format_string, .{ int_value, type_name, file_name, source_location.line, source_location.column });
+ } else {
+ const int_value = overflow_data.type_descriptor.getUnsignedIntValue(value);
+ report_log(format_string, .{ int_value, type_name, file_name, source_location.line, source_location.column });
+ }
}
}
@@ -130,19 +134,22 @@ comptime {
fn handleDivremOverflow(comptime report_log: anytype, overflow_data: *OverflowData, lhs: ubsan_value.ValueHandle, rhs: ubsan_value.ValueHandle) void {
const source_location = overflow_data.source_location.acquire();
- if (overflow_data.type_descriptor.isSignedInteger() and (overflow_data.type_descriptor.getSignedIntValue(rhs) == -1)) {
- const format_string = "ubsan: Divsion of {} by -1 cannot be represented in type {s}\n{s}:{}:{}\n";
- const type_name = overflow_data.type_descriptor.getNameAsString();
- const int_value = overflow_data.type_descriptor.getSignedIntValue(lhs);
- report_log(format_string, .{ int_value, type_name, source_location.file_name, source_location.line, source_location.column });
- } else switch (overflow_data.type_descriptor.kind) {
- .float, .integer => {
- const format_string = "ubsan: Division by 0\n{s}:{}:{}\n";
- report_log(format_string, .{ source_location.file_name, source_location.line, source_location.column });
- },
- .unknown => {
- unreachable;
- },
+
+ if (source_location.file_name) |file_name| {
+ if (overflow_data.type_descriptor.isSignedInteger() and (overflow_data.type_descriptor.getSignedIntValue(rhs) == -1)) {
+ const format_string = "ubsan: Divsion of {} by -1 cannot be represented in type {s}\n{s}:{}:{}\n";
+ const type_name = overflow_data.type_descriptor.getNameAsString();
+ const int_value = overflow_data.type_descriptor.getSignedIntValue(lhs);
+ report_log(format_string, .{ int_value, type_name, file_name, source_location.line, source_location.column });
+ } else switch (overflow_data.type_descriptor.kind) {
+ .float, .integer => {
+ const format_string = "ubsan: Division by 0\n{s}:{}:{}\n";
+ report_log(format_string, .{ file_name, source_location.line, source_location.column });
+ },
+ .unknown => {
+ unreachable;
+ },
+ }
}
} |
gah, the diffs are for this (kinda related) gist: https://gist.github.com/moosichu/ee3dd8407653640a36fb9625ac586082 |
commit 5389c0a3be05c52df56eb18ab4f3d9984a4e6e02
Author: Francisco Demartino <[email protected]>
Date: Tue Sep 5 22:54:47 2023 -0300
fix zig ubsan for 0.11.0
diff --git a/src/ubsan_value.zig b/src/ubsan_value.zig
index 0133e60f..769b2f25 100644
--- a/src/ubsan_value.zig
+++ b/src/ubsan_value.zig
@@ -40,7 +40,7 @@ pub const TypeDescriptor = extern struct {
name: [1]u8,
pub fn getNameAsString(type_descriptor: *const TypeDescriptor) [*:0]const u8 {
- return @ptrCast([*:0]const u8, &type_descriptor.name);
+ return @as([*:0]const u8, @ptrCast(&type_descriptor.name));
}
pub fn isSignedInteger(type_descriptor: *const TypeDescriptor) bool {
@@ -50,7 +50,7 @@ pub const TypeDescriptor = extern struct {
pub fn getIntegerSize(type_descriptor: *const TypeDescriptor) u64 {
if (type_descriptor.kind != .integer) unreachable;
// Bit-shift the signed value away and then get 2^n out
- return @as(u64, 1) << @intCast(u6, type_descriptor.info >> 1);
+ return @as(u64, 1) << @as(u6, @intCast(type_descriptor.info >> 1));
}
pub fn getFloatSize(type_descriptor: *const TypeDescriptor) u64 {
@@ -63,12 +63,12 @@ pub const TypeDescriptor = extern struct {
const size = type_descriptor.getIntegerSize();
const max_inline_size = @sizeOf(ValueHandle) * 8;
if (size <= max_inline_size) {
- return @ptrToInt(value_handle);
+ return @intFromPtr(value_handle);
} else {
if (size == 64) {
- return @ptrCast(*u64, @alignCast(8, value_handle)).*;
+ return @as(*u64, @alignCast(@ptrCast(value_handle))).*;
} else if (size == 128) {
- return @ptrCast(*u128, @alignCast(16, value_handle)).*;
+ return @as(*u128, @alignCast(@ptrCast(value_handle))).*;
} else {
unreachable;
}
@@ -77,7 +77,7 @@ pub const TypeDescriptor = extern struct {
pub fn getSignedIntValue(type_descriptor: *const TypeDescriptor, value_handle: ValueHandle) i128 {
if (!type_descriptor.isSignedInteger()) unreachable;
- return @bitCast(i128, type_descriptor.getIntValueBits(value_handle));
+ return @as(i128, @bitCast(type_descriptor.getIntValueBits(value_handle)));
}
pub fn getUnsignedIntValue(type_descriptor: *const TypeDescriptor, value_handle: ValueHandle) u128 { |
This allows the user to know what went wrong whenever the UBSan-inserted
checks fail.
Consider this as a working MVP, there's no de-duplication as in the original miniubsan and the dependency on the stdlib may be too much.
We can do much better than trapping.
Addresses #5163