Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non matching mangled name for C functions using structs #2782

Open
jacob-carlborg opened this issue Jul 18, 2018 · 32 comments
Open

Non matching mangled name for C functions using structs #2782

jacob-carlborg opened this issue Jul 18, 2018 · 32 comments

Comments

@jacob-carlborg
Copy link
Contributor

Example:

module main;

struct Foo
{
    int a;
}

extern (C) void bar(Foo);
module foo;

struct Foo
{
    int a;
}

extern (C) void bar(Foo);
$ ldc2 main.d foo.d
main.d(8): Error: Function type does not match previously declared function with the same mangled name: bar

C functions doesn't have a mangling. I suspect the mangled name that would be used for a D function for the Foo struct is used.

@kinke
Copy link
Member

kinke commented Jul 18, 2018

This is a duplicate of long-standing #1020.

A quick look at the IR shows the problem - 2 equivalent Foo structs in different D modules:

; ldc2 -output-ll main.d
%main.Foo = type { i32 }
declare void @bar(%main.Foo) #0
; ldc2 -output-ll foo.d
%foo.Foo = type { i32 }
declare void @bar(%foo.Foo) #0

This is only mostly a problem with (mostly default/enforced) -singleobj though (i.e., -c should work), where code from all D modules is emitted into a single IR module (=> object file), and we can only have one IR function for a particular (mangled) name. extern(D) functions always feature the module prefix in their mangle, so this only applies to other linkages.

We make sure the IR types match if there are multiple declarations (and only emit the first definition in that case). If we didn't, we'd have to adapt the call sites, e.g., repainting IR foo.Foo call args as main.Foo ones.

@jacob-carlborg
Copy link
Contributor Author

Wouldn't the proper way to do this be to not look at the actual name of the struct but instead its members? Basically building up a mangled/IR name combining the mangled names of the members, recursively.

@kinke
Copy link
Member

kinke commented Jul 18, 2018

That's definitely something I haven't thought of before, but only imagining the length of some of these names (need to take alignments into account as well, unions...) makes my head ache. I look at IR a lot, and having type names >1k isn't bearable IMO.

@jacob-carlborg
Copy link
Contributor Author

Yeah, the might be a problem. But as I mentioned in the other issue #1020 it's easy to detect if a class/struct is extern(D) or some other linkage.

@kinke
Copy link
Member

kinke commented Jul 18, 2018

it's easy to detect if a class/struct is extern(D) or some other linkage.

That has just been added with 2.081 AFAIK. But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.

We could keep a map of struct identifier (excl. module prefix) to IR types, per IR module. When trying to emit a new IR type, we could check if its IR layout is equivalent to an existing type with the same identifier, and use that existing one instead.

@jacob-carlborg
Copy link
Contributor Author

That has just been added with 2.081 AFAIK.

It's been present in ClassDeclaration for a long time, it was recently moved up to AggregateDeclaration to support structs as well. It has also been changed from several bools to one enum. It has existed as long as D has supported C++ classes. But there is no way to identify extern(C) struct because that doesn't have any other semantic differences compared to extern(D) struct.

But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.

True.

We could keep a map of struct identifier (excl. module prefix) to IR types, per IR module. When trying to emit a new IR type, we could check if its IR layout is equivalent to an existing type with the same identifier, and use that existing one instead.

As long it works I'm happy.

@JohanEngelen
Copy link
Member

But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.

This only applies to C functions, right? For C++ functions, the type is part of the mangling: in the OP example when extern(C) is replaced with extern(C++) the two modules would declare two different extern(C++) bar functions, because the two Foos are different types.
I am wary of collapsing types that are different into the same thing; I don't know whether that may interfere badly with LTO+some_future_metadata_for_types+optimization.
Could we emit a struct literal type (instead of named type) for extern(C) functions? I.e. something like this:

declare void @bar({i32}) #0

@kinke
Copy link
Member

kinke commented Jul 18, 2018

C++ functions are affected too, the struct doesn't even need to be extern(C++) then, as the C++ mangle (_Z3bar3Foo in both cases) doesn't contain the D module (just extern(C++) namespaces, which can be defined across multiple D modules too).

@jacob-carlborg
Copy link
Contributor Author

Could we emit a struct literal type (instead of named type) for extern(C) functions?

Doesn't have the same problem as I suggested #2782 (comment)?

@JohanEngelen
Copy link
Member

C++ functions are affected too, the struct doesn't even need to be extern(C++) then, as the C++ mangle (_Z3bar3Foo in both cases) doesn't contain the D module (just extern(C++) namespaces, which can be defined across multiple D modules too).

Wow indeed. That's a bad problem that will lead to miscompiles when templates are involved: consider two template instantiations in different compilation units but with different types, they will mangled the same although the types are different and just happen to have the same name, and linking will merge the symbols because that's needed for templates. This sounds like a bug to me: the mangling of C++ functions with D types should use the D type mangling.

Simple example that shows the bug:

static import core.thread;

struct Thread {
    int a;
}

extern (C++) void bar(Thread*) {}

extern (C++) void bar(core.thread.Thread) {}

@kinke
Copy link
Member

kinke commented Jul 19, 2018

I don't necessarily see that as bug, just one more thing to consider for C++ interop. 'Fixing' it would break a ton of code I guess. Our error for mismatching IR signatures may help in preventing such bugs (mainly with -singleobj).

Anyway, wrt. to the original issue, I'm not sure a special handling (collapsing types etc.) pays off; after all, this is almost always triggered by duplicate equivalent struct definitions and function declarations, and code duplication is smelly. Extracting the duplicate C[++] types/functions into a dedicated D module solves the problem.

@dnadlinger
Copy link
Member

If we want to allow this (signatures incompatible on the D level), I'd probably go for implementing it using repainting (replaceAllUsesWith and casting the function pointer), as trying to regularize the type on the LLVM level and getting that right in all cases seems like a lot of complexity.

We could then have a D-type-system implementation of structural type comparison/whatever heuristics we want to come up with to decide whether to present the user with an error message or not.

@dnadlinger
Copy link
Member

[...] and code duplication is smelly.

Agreed, but what if two separate D libraries used in a large project wrap the same C/C++ code?

@kinke
Copy link
Member

kinke commented Jul 19, 2018

If compiled in separate invokations (and no LTO), there's no problem. - But yeah, there'll always be some legitimate corner cases (template instantiations...).

@JohanEngelen
Copy link
Member

@kinke
Copy link
Member

kinke commented Jul 21, 2018

replaceAllUsesWith and casting the function pointer

Yeah, repainting the function pointer instead of the args should be a lot simpler indeed.

@burner
Copy link

burner commented May 23, 2019

I ran in this today as well:

/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8): Error: Function type does not match previously declared function with the same mangled name: fwrite
/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8):        Previous IR type: i64 (i8*, i64, i64, %libxlsxd.xlsxwrap._IO_FILE*)
/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8):        New IR type:      i64 (i8*, i64, i64, %core.stdc.stdio._IO_FILE*)

this breaks xlsxd builds. this is with ldc 1.15.0

the structs are different though.

@kinke
Copy link
Member

kinke commented May 23, 2019

the structs are different though

So xlsxd is duplicating the _IO_FILE struct definition and fwrite() prototype from core.stdc.stdio? Why? ;) - If it was using the druntime struct, the problem would go away.

@burner
Copy link

burner commented May 23, 2019

because I use dpp to include the headers of a c library called libxlsxwriter which in turn pulls in stdio.h which pulls in _IO_FILE.

another problem I see is that the _IO_FILE in druntine is out-of-date with the one in glibc.
It was last updated in 2008.
I have a way to hack around that, but its another manual step I would like to avoid.

@kinke
Copy link
Member

kinke commented May 23, 2019

Ah, yeah dpp doesn't work (at all?) with LDC due to its architecture and this issue - it causes lots of duplicate aggregate definitions (and so init symbol + TypeInfos bloat etc.), Atila knows about it.

@burner
Copy link

burner commented May 23, 2019

luckily for me, for xlsxd its only _IO_FILE. Commenting that out makes it compile and pass the tests.

Has then been any idea/progress been made on this issue in general?

@kinke
Copy link
Member

kinke commented May 23, 2019

No progress, but there are some ideas, mainly in this issue here (see #3021 (comment) for links to the other duplicate issues). It's not trivial, and the most promising idea (casting the function pointer after verifying that the signatures are basically equivalent, i.e., that the used aggregates are 'equivalent', although distinct separate types) wouldn't work in your case, as your _IO_FILE diverges from druntime's.

@jacob-carlborg
Copy link
Contributor Author

Has then been any idea/progress been made on this issue in general?

Yes, use DStep 😉

@etcimon
Copy link
Contributor

etcimon commented Feb 10, 2020

Why not have a sort of @TolerantMangling flag in the meantime so we can get around library conflicts

@dnadlinger
Copy link
Member

Sure, we can implement that as an attribute. I wonder whether that's worth it, though, or whether we should just allow it outright for simplicity (possibly with a warning hidden behind a command-line flag).

@etcimon
Copy link
Contributor

etcimon commented Feb 10, 2020

As I've mentioned in #3318 (comment) it would be important to be able to override wrong declarations as well for legacy purposes.

@kinke
Copy link
Member

kinke commented Feb 10, 2020

+1 for the optional warning without extra UDA. Casting the function pointer should implicitly allow slightly wrong declarations, such as int/long for x64 targets if passed in a register.

@etcimon
Copy link
Contributor

etcimon commented Feb 10, 2020

I would rather not have warnings that can't be removed if my code is superior to the one it's overriding

@kinke
Copy link
Member

kinke commented Feb 10, 2020

Those warnings would be almost exclusively caused by duplicate struct declarations in multiple analyzed modules, so any 'superiority' would come with TypeInfos bloat (at best, just more stripping work for the linker). Settling for exactly one central D binding, either in druntime or some dub package, for each C(++) library would prevent this issue (and enforce DRY), but that's probably utopic.

@etcimon
Copy link
Contributor

etcimon commented Feb 10, 2020

Exactly, but the issue with type bloat is doomed to stay because of workarounds on top of workarounds due to compatibility reasons. If we had the perfect solution from the start it would've helped

kinke added a commit to kinke/ldc that referenced this issue Jun 10, 2021
(Only) when compiling them all-at-once anyway; primarily for implicit cross-
module inlining across the same library.

This won't work as long as there are multiple conflicting declarations
across a library's modules - issue ldc-developers#2782.
kinke added a commit to kinke/ldc that referenced this issue Jun 10, 2021
(Only) when compiling them all-at-once anyway; primarily for implicit cross-
module inlining across the same library.

This won't work as long as there are multiple conflicting declarations
across a library's modules - issue ldc-developers#2782.
@huglovefan
Copy link
Contributor

this also affects importC:

# make the C file - this can be stdio.h or some real header that includes it
cpp -std=c11 -D__restrict= -D__asm__\(x\)= /usr/include/stdio.h > cmodule.c
// run: ldc2 -i example.d
//  or: ldc2 example.d cmodule.c
import core.stdc.stdio;
static import cmodule;
void main() { fprintf(stdout, "hello\n"); }
/opt/ldc2-1.28.1-linux-x86_64/bin/../import/core/stdc/stdio.d(1247): Error: Function type does not match previously declared function with the same mangled name: `fprintf`
/opt/ldc2-1.28.1-linux-x86_64/bin/../import/core/stdc/stdio.d(1247):        Previous IR type: i32 (%cmodule._IO_FILE*, i8*, ...)
/opt/ldc2-1.28.1-linux-x86_64/bin/../import/core/stdc/stdio.d(1247):        New IR type:      i32 (%core.stdc.stdio._IO_FILE*, i8*, ...)

(the existence of fprintf in the C header is causing a conflict when trying to use the version from core.stdc.stdio)

kinke added a commit to kinke/ldc that referenced this issue Jun 9, 2023
Instead of eagerly IR-declaring all such functions in root modules
being compiled.

This might reduce compile times in some cases, and is a cheap way
to improve the situation wrt. issue ldc-developers#2782.
kinke added a commit that referenced this issue Jun 10, 2023
Instead of eagerly IR-declaring all such functions in root modules
being compiled.

This might reduce compile times in some cases, and is a cheap way
to improve the situation wrt. issue #2782.
@kinke
Copy link
Member

kinke commented Jun 10, 2023

The check is now less pedantic - an object file now needs to actually reference multiple conflicting functions (e.g., calling them), for the error to be emitted.

cschlote pushed a commit to cschlote/ldc that referenced this issue Jun 30, 2023
Instead of eagerly IR-declaring all such functions in root modules
being compiled.

This might reduce compile times in some cases, and is a cheap way
to improve the situation wrt. issue ldc-developers#2782.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants