Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: needs hardening and refactoring #12638

Closed
dsnet opened this issue Sep 16, 2015 · 10 comments
Closed

archive/tar: needs hardening and refactoring #12638

dsnet opened this issue Sep 16, 2015 · 10 comments

Comments

@dsnet
Copy link
Member

dsnet commented Sep 16, 2015

I'm filing this issue as the "capture all" ticket for all currently known archive/tar issues. I've discovered quite a few issues and it's getting unwieldy filing an issue for them.

Over time, the tar library has become increasingly buggy, especially with the inclusion of support for PAX, GNU, and sparse files. There are many edge conditions where these various formats interact in unexpected ways leading to bugs. The fact that the fuzzer has detected a large of bugs in archive/tar goes to show its fragility. Also, most of the "fixes" done to resolve each issue were bandaids on a larger problem.

I have gone through the Reader implementation several times now and have read parts of the GNU, BSD, and STAR codebases to get a good understanding of how all the different utilities handle all the oddities when it comes to the TAR format.

Some of the major issues with the library are:

Currently open tickets are:

The plan is to refactor the Reader code first to be as reliable as possible, and then to tackle various out-standing Writer issues.

I'll be filing no more archive/tar related issues after this. I promise :]

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/14624 mentions this issue.

@dsnet
Copy link
Member Author

dsnet commented Sep 18, 2015

@dsymonds, what is the point of the Writer.Flush method?

At first, I thought that the function was used to ignore the rest of the current file and automatically fill it in with zeros. In fact, the for loop on L57 seems to indicate that this was the original purpose. However, the check on L51 prevents use of the method unless all file contents have been written already. Is the purpose of this method really just to flush out the final padding? If that's the case, why can't Writer.Write just do that when nb hits zero?

@dsymonds
Copy link
Contributor

You're looking at some pretty ancient code. I wrote the original archive/tar in mid 2009, before we even really had a good grasp of what good Go code looked like. It's accreted changes ever since as people have had need. It probably needs a good tidy up.

I managed to find an email from the code review where Writer.Flush got added. @rsc said:

Line 51: func (tw *Writer) writePadding() {
rename to Flush and return os.Error.
clients may want to call this explicitly
to finish one file even if they're not
ready to call Close or WriteHeader.

In its original form it did indeed write out zeros to the end of the current file. It seems a little bogus now, but we can't drop it.

The neutering on line 51 does indeed look silly. It was added in d75abb7 (early 2012), and @rsc noticed that it was silly after it was submitted (https://codereview.appspot.com/5777064). I can't remember why we didn't revise it back then.

With the benefit of hindsight, I can't see the use case for writing the zeros automatically. The user of tar.Writer needs to know the size in advance (to put it in the header), and can easily write zeros if they want. The 2012 change made Flush only write the trailing padding anyway, so we could simplify it now, but it's also somewhat of a historical artifact at this point.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/14723 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/14823 mentions this issue.

@dsymonds
Copy link
Contributor

Sorry, I don't have time to review all these changes. I'll finish off the one I started, but you'll need to track down a different reviewer.

@dsnet
Copy link
Member Author

dsnet commented Sep 23, 2015

Is there anyone you recommend for doing the reviews?

@dsymonds
Copy link
Contributor

@davecheney might be interested, or you can ask around on golang-dev.

dsymonds pushed a commit that referenced this issue Sep 23, 2015
Convert splitUSTARPath to return a bool rather than an error since
the caller never ever uses the error other than to check if it is
nil. Thus, we can remove errNameTooLong as well.

Also, fold the checking of the length <= fileNameSize and whether
the string is ASCII into the split function itself.

Lastly, remove logic to set the MAGIC since that's already done on
L200. Thus, setting the magic is redundant.

There is no overall logic change.

Updates #12638

Change-Id: I26b6992578199abad723c2a2af7f4fc078af9c17
Reviewed-on: https://go-review.googlesource.com/14723
Reviewed-by: David Symonds <[email protected]>
Run-TryBot: David Symonds <[email protected]>
@dsnet
Copy link
Member Author

dsnet commented Sep 25, 2015

@davecheney, would you be willing to do this? Otherwise, I will ask around.

@rsc
Copy link
Contributor

rsc commented Oct 23, 2015

Please file separate, focused issues for bugs you find. One giant issue does not help anyone. It doesn't give a crisp statement of something to do, and it doesn't help when you see it in a CL description.

@rsc rsc closed this as completed Oct 23, 2015
@golang golang locked and limited conversation to collaborators Oct 24, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants