symbol lookup error #7177
Do you know if it works well in a regular Ubuntu? |
Good question. As soon as I get home I'll check. |
I just compiled it in an actual Ubuntu system and the same thing happens. |
Sorry, closed accidentally. I don't know what could be going on. You could check whether the missing symbol is indeed missing from the generated |
The symbol |
This only reproduces on one program, right? Compiling other programs works fine? |
It seems I had a similar problem once, fixed by removing |
Yes, only this one. But it happens on 3 different Linux installations, and not on macOS.
That was among the first things I tried but nothing changed. -- I've tried commenting out different code paths to pinpoint the source of the problem, but I didn't get anywhere. Things that are unrelated to each other, like logging or using a class that wraps a Hash and a Mutex instead of a normal Hash, look like triggers (the program runs if I remove either), but I have been using both extensively for a while without this happening. At one point even removing a single call to |
@exilor Did you find the solution, or did you give up? |
As I said in my last comment:
I didn't find a singular cause. I'll reopen if I have something tangible to report. |
The only way I've been able to reproduce the error is with this code:
{% for i in 0..35000 %}
class Foo{{i}}
end
Foo{{i}}.new
{% end %}
Which gives the same kind of errors unless compiled with |
Each class is compiled into a separate dot o file, so it's possible we are hitting some limit. |
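To probe where such a limit might sit, the macro reproduction can be unrolled into plain Crystal source with a chosen class count. The Python sketch below is only an illustration: `generate_classes` and the output file name `many_classes.cr` are hypothetical, and it assumes that unrolled classes trigger the same one-object-file-per-class behaviour as the macro version.

```python
def generate_classes(count: int) -> str:
    """Return Crystal source defining and instantiating `count` empty classes."""
    lines = []
    for i in range(count):
        lines.append(f"class Foo{i}")
        lines.append("end")
        lines.append(f"Foo{i}.new")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file name; vary the count to bisect the failing threshold.
    with open("many_classes.cr", "w") as f:
        f.write(generate_classes(5000))
```

Compiling the generated file at different counts would show whether the failure tracks the number of classes rather than anything specific to the real program.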
After deleting the contents of the crystal cache I've compiled the application and there were 8193 files in it afterwards. It's a suspicious number ((1024 * 8) + 1) but it could be a coincidence. |
My experience with the compiler tells me that such numbers are never a coincidence :-) |
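The "suspicious number" check can be repeated on any machine with a small script. This is a rough sketch under assumptions: the cache directory is passed in by the caller (its location is not stated in this thread), object files are assumed to end in `.o`, and both function names are made up for illustration.

```python
from pathlib import Path


def count_object_files(cache_dir: str) -> int:
    """Count files ending in .o anywhere under cache_dir."""
    return sum(1 for p in Path(cache_dir).rglob("*.o") if p.is_file())


def nearest_power_of_two(n: int) -> int:
    """Return the power of two closest to n (n must be positive)."""
    lower = 1 << (n.bit_length() - 1)
    upper = lower << 1
    return lower if n - lower <= upper - n else upper
```

For the figure reported above, `nearest_power_of_two(8193)` returns 8192, so the count sits one past a power of two.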
I've been commenting out classes to get the code size to decrease and eventually the executable runs. And then uncommenting that code and commenting out some other code just in case a few times. I'm pretty certain that those classes don't call any code that isn't also called elsewhere.
Apparently if it goes above 4096 the executable throws the error. Edit: It appears that as I get closer to 4096 (like 4098) the error changes from a
And these are "relocation errors" instead. |
@exilor Could you try to compile a compiler using this branch and then compile the program that fails (not the one using a macro to generate many classes, but the real one that was failing for you) and tell me whether it works now? The branch tries to reduce the number of generated object files. It's not perfect, meaning that it's still possible to reach that limit, but the compiler's source code reaches about 1200 object files, and it's a pretty big project. |
@asterite I've never compiled the compiler before but I'll try tomorrow (it's bedtime here). I'll keep you informed. By the way, to be clear, this:
Was from compiling the actual program not the macro one. |
@asterite It didn't work and it took more classes commented out to get it to work compared to Crystal 0.27.
After taking out one more class it worked. As before, when approaching the point of working the error changed from symbol lookup error to relocation error. |
I'm hitting this with the crystal compiler:
Could it be that we're exceeding the maximum command-line size for the linker? |
We should probably generate one obj file for all generic instantiations of the same generic type. That means less reuse but also less of an explosion of obj files. |
@asterite would be nice to be able to confirm that's the problem somehow... |
You can compile with -v and see the list of linked objects. If it's something close to a power of 2 that might be it. |
I counted 5077 files with --verbose. |
looks close enough to 4096 to me, especially taking into account the |
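Both hypotheses raised here (too many objects, or an over-long linker command line) can be checked from the verbose output. The sketch below assumes the linker invocation has been copied out of the verbose log as a single shell-style command string and that object files appear as arguments ending in `.o`; `link_line_stats` is a hypothetical helper, not part of any tool.

```python
import os
import shlex


def link_line_stats(command: str) -> tuple[int, int]:
    """Return (number of .o arguments, command length in bytes)."""
    args = shlex.split(command)
    objects = [a for a in args if a.endswith(".o")]
    return len(objects), len(command.encode())


if __name__ == "__main__":
    import sys

    count, size = link_line_stats(sys.stdin.read())
    # SC_ARG_MAX is the POSIX limit on argument size; available on Unix only.
    print(f"{count} object files, {size} bytes (ARG_MAX: {os.sysconf('SC_ARG_MAX')})")
```

If the byte count is far below `ARG_MAX` while the object count is above 4096, the command-line size explanation becomes less likely than the object-count one.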
A workaround: install This is a binutils bug (for fucks sake) |
binutils LD is producing an invalid executable,
|
Where should that go? |
it's an environment variable, you can prefix the crystal command with it in a shell |
It fails:
|
@exilor you need at least LLVM 7, or use |
Using lld-8 seems to work locally. Using
Should we update the |
@RX14 I already had LLVM 7 installed and using that command outputs a warning from clang that |
@bcardiff this is a serious bug in binutils, and has to be reported upstream. These workarounds should be temporary, not attempted to be applied automatically.
I'd prefer either workaround be applied in crystal's CI setup, instead of at the docker image level though. |
I was just checking with --single-module right now. I think we might need to add that to the current CI scripts. The current infrastructure uses |
I feel comfortable requiring that for developing crystal and collaborating with the std-lib I vouch for:
|
How much does this reduce the number of object files in practice? I feel any granularity change is going to hurt incremental build times.
It won't be helpful in the short term, but it is necessary in the long term. Supporting
Are the
I think this should remain a workaround clearly stated/documented in this github issue, I'm not sure it belongs in the official docs (though I don't feel strongly about this). Especially since this bug only shows up on really big projects. i.e. the crystal compiler test suite itself.
I think that the crystal |
current std_spec uses 3826 obj files, 198 are tuples/named_tuples, ~1400 seems to be generic instances
I have no idea where to diagnose what is happening with
Only in the CI. They are built at https://github.com/crystal-lang/crystal-dist/blob/master/docker/crystal/Dockerfile#L16-L25 . I was planning on adding This is equivalent to let contributors set up
The thing is that contributors might come to this issue while running
lld in Ubuntu is 6.0, I guess it should work. lld-8 does not symlink to That is why I would add the symlink of |
Me neither, but it's not practical to simply not support the linker with 90% market share. Not supporting In practical terms, this means at the least submitting a bug report to them. We can leave this till later but I really don't want to forget.
This is the wrong way to do it, export
We definitely need to document it somewhere in the contributing docs.
I think
Well, shit. I just played around with this, and clang supports |
So I will rebuild 0.32.1-build images with https://github.com/crystal-lang/crystal-dist/pull/8/files And after that the following changes to the Makefile to build the specs https://github.com/crystal-lang/crystal/compare/ci/use-lld using lld If
In osx if
Sound good? I've also updated the https://github.com/crystal-lang/crystal/wiki/All-required-libraries#ubuntu |
@bcardiff I was thinking of only adding |
After updating to 0.33.0 the error persists with the same message. |
Even with lld? |
|
After installing lld, you need to do one of the following:
|
Oh my. I can't believe it worked after so long! The executable went from 181 MB to 198 MB, but the program seems to be working just fine. |
On Arch Linux, for Crystal master, running
|
After installing |
Yep, installing |
After compiling successfully with 0.27 and LLVM 4.0.0 (Ubuntu on the WSL) running the executable gives this error:
./game_server: symbol lookup error: ./game_server: undefined symbol: errno, version Object::to_s<String::Builder>:String::Builder
The specific method referenced can change between computers (computers that were able to run the executables before this error appeared).
Unfortunately I can't provide code to reproduce because the program has over 150k lines. I'm trying to comment out code in the hopes of narrowing down where this comes from.
The problem doesn't happen if I compile with --single-module or on macOS.
It's not much to go on, but does anyone have an inkling of where this could be coming from?