-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: cmd/link: redesign format of intermediate object files #11123
Comments
Let's change this to follow the proposal process. I've done some more hacking and have a branch (https://github.com/mwhudson/go/tree/symbol-table-2) that uses an ad-hoc format and makes linking godoc about 30% faster, reduces linker memory consumption by about 10% and has negligible effect on the compiler runtime or memory usage. This seems like enough to me to make the effort of changing format worthwhile. An open question is whether to use a more established format like ELF or macho or PE (the formats Go has general support for; I only really understand ELF of these) or something more tailored to Go's needs. The main advantage of using a standard format that I can see would be being able to use standard tools to inspect them, which would be nice but not super compelling. Maybe I'm missing something else though. |
I'm curious whether this is a consequence of the object file format or of the mechanically translated linker. Do you see comparable speed-ups using cmd/newlink? (It might be a fair amount of work to make similar changes to cmd/newlink. I hate to ask you do that unnecessarily, so if you see another good way to answer the original question, by all means do that instead.) |
I think it's really the format. I guess it would be possible to transplant my code into cmd/internal/goobj and write some more focused benchmarks. I've been a bit guilty of making a whole bunch of changes and not measuring the impact of each one, but I'm pretty sure the biggest change is using a symbol table, which means many fewer calls to expandpkg and many fewer map lookups and consequent string hashing. It also makes the .a files quite a bit (20%?) smaller, which obviously reduces the IO required to read them... Having a string block so you are converting []byte to string just the once and allocating all the strings/data read from the file as slices of one big string/data block certainly helps a bit too, and those are dependent on changing the format too. |
I should be clear that in all of this, I will defer to @rsc, @ianlancetaylor, @minux, and others who know the linker well.
This might be a good idea. Given that there is lots of value in object file format stability, it makes sense to make one big change rather than many small ones. The easier it is to experiment and generate good numbers, the easier it will be to have confidene that we've settled on a good design. It is probably also worth surveying to see whether there are other changes to make at the same time. For example, the current object file format doesn't support jump tables. And there are several issues that might be helped by object file format changes (or might not--I haven't thought it through very far): #7384, #7599, #6993, #10170 (and thus #6643).
I'm surprised that there was so much headroom available, since IIRC, the current design was intended to be optimized for I/O. I also seem to recall (but can't seem to find a reference or details now) that there was a goal to keep some important info (what?) at the beginning of the file, to enable peeking. Does the symbol table impact that? (I guess I need to figure out the details to answer that.) Is it safe to assume that you have looked at the Go 1.3 Linker Overhaul doc and the discussion of the file format there?
Good to know. When I played with this a while ago, I got the sense that the many []byte/string conversions were just a limitation of the code base, not the file format, but I am definitely prepared to believe that I was wrong. :) Thanks for pushing on this. My comments are questions here stem not from resistance but from wanting the next generation (if there is one) to be awesome. |
On 19 June 2015 at 11:42, Josh Bleecher Snyder [email protected]
and they must have been on my mind too, because basically I've addressed
There are lots of ways you could speed things up if you are willing to give
|
I made my goobj changes work (see previously mentioned branch): mwhudson@glamdring:src$ go test cmd/internal/goobj -test.bench .
PASS
BenchmarkLoadRuntime-4 30 38951883 ns/op
ok cmd/internal/goobj 1.226s
mwhudson@glamdring:src$ go test cmd/internal/goobj -test.bench .
PASS
BenchmarkLoadRuntime-4 100 14326216 ns/op
ok cmd/internal/goobj 1.460s That's 2.7x faster :-) |
We're not ready to process proposals under the new proposal process yet, but once we are this would be a good one. I never intended the current object file format to last this long. There are many things that can be fixed up to make processing them faster, and I'd love to see this happen. FWIW, I don't believe it makes snse to ELF or some other standard object file format. Shoehorning Go's needs into that object file will, I expect, cause more problems than it would bring benefits. But if you can make the case otherwise, I'm certainly willing to listen. |
So now we're in very early 1.6, I feel I should ask: should I polish up my somewhat ad-hoc format for submission, or is there someone who wants to do some serious design? I'm not sure I'm the right person to think about how to represent jump tables in the .a files, for example. |
/cc @griesemer |
Just FYI: I'm currently experimenting with a redesigned export data format for the export data (found in .a files) for more compact size, and more efficient reading. Whatever the new format, it should be very efficient to find that export data section since it's read by the compiler when processing an import declaration. |
I was only thinking about the .o files here I guess. Hadn't thought about changing the .a part at all, although I guess it shouldn't be completely off the table. |
@mwhudson I was just mentioning this since .a files were also brought up before. (And please ignore the close-reopening of the issue. I always press the wrong button in this UI.) |
Why doesn't the ELF format suit Go's requirements well? Just curious. |
Go uses its own object fornat because it's much simpler than ELF and also
means newer Go versions could completely redesign the object file format
without being limited by ELF's feature set.
|
@mwhudson are you still interested in writing a proposal for this? |
I'm interested but I don't know that I'll get to it any time soon. Is it OK to leave it open for now? |
Sure On 28 September 2015 at 06:17, Michael Hudson-Doyle <
|
Please keep this proposal about the object part of the file. The container format is a separate issue. I intend to replace both the inner .o structure and the outer .a structure with standard zip. |
CL https://golang.org/cl/21020 mentions this issue. |
Several piecemeal changes that have been proposed to the object format. Some of them are already submitted, one, cl/21020, is effectively blocked on this. @mwhudson do you want to redesign the object format from some new principle, or should we stick to small changes with individually measurable performance improvement? |
I objected to CL 21020 not because of this issue but because it ignores and On Mon, Mar 28, 2016 at 10:08 AM David Crawshaw [email protected]
|
I'm sorry, I made a typo which is adding to the confusion. I agree that CL 21020 should be on hold. I meant CL 21026. It is compatible with your proposal, but Ian objected to piecemeal object format changes while this proposal for a redesign exists. I think CL 21026 is reasonable and would like to see it move forward, but how it fits with this should be sorted first. |
I think most of the changes I had in mind have been done by now and there's not a lot of value in keeping this open, so I'm going to close it. |
Although this is closed now, @ianlancetaylor wanted me to also clear up the direction i'm taking with future changes to the format and this seems like the right place to do it.
|
@rsc would it work to have the object file kept uncompressed in the .a file? That way seeking could happen without extracting the whole file beforehand. Did you have any other incompatibilities in mind? |
I need to look more carefully at the CL you prepared. I will try to find On Tue, Mar 29, 2016 at 5:34 AM skohanim [email protected] wrote:
|
That would be great. |
CL https://golang.org/cl/21752 mentions this issue. |
I've sent in CL https://golang.org/cl/21752 which splits up the object file sections into small independent units. |
This is a good idea, but right now we don't have the bandwidth to help redesign the object format and support the resulting churn in the code base. We should revisit the object format at some point, but it won't be soon. Marking this declined for now, with the expectation that we may come back to this later. |
https://golang.org/cl/10833 is a simplistic change that changes the object files consumed by the linker to have a symbol table rather that writing the symbol names out each time they are referenced and it improves all.bash times by 30s on one laptop. We should investigate further when 1.6 opens.
The text was updated successfully, but these errors were encountered: