Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with parsing rust-ar generated archives after ranlib #10

Closed
bjorn3 opened this issue Apr 1, 2019 · 13 comments · Fixed by #11
Closed

Problems with parsing rust-ar generated archives after ranlib #10

bjorn3 opened this issue Apr 1, 2019 · 13 comments · Fixed by #11

Comments

@bjorn3
Copy link
Contributor

bjorn3 commented Apr 1, 2019

//! ```cargo
//! [dependencies]
//! ar = "0.6.2"
//! ```

use std::io::Read;

// 64 gives Invalid file size field in entry header
// 32 gives unexpected EOF in the middle of archive entry header
const METADATA_LEN: usize = 64;

fn main() {
    let mut builder = ar::Builder::new(std::fs::File::create("test.a").expect("create"));

    // Remove this append and there is no problem.
    let header = ar::Header::new(b"core-fc675157235ea9.dummy_name.rcgu.o".to_vec(), 0);
    // Remove any of the characters in the filename and ! from will show up in the error message.
    // Making it shorter than 17 chars will fix the problem though.

    builder.append(&header, &mut (&[] as &[u8])).expect("add rcgu");

    let mut buf: Vec<u8> = vec!['!' as u8; 28];
    buf.extend(b"hello worl");
    buf.extend(&[0u8; 26] as &[u8]);
    assert!(buf.len() >= METADATA_LEN);

    let header = ar::Header::new(b"rust.metadata.bin".to_vec(), METADATA_LEN as u64);
    builder.append(&header, &mut (&buf[0..METADATA_LEN])).expect("add meta");

    std::mem::drop(builder);

    // Remove this ranlib invocation and there is no problem.
    assert!(
        std::process::Command::new("ranlib")
            .arg("test.a")
            .status()
            .expect("Couldn't run ranlib")
            .success()
    );

    let mut archive = ar::Archive::new(std::fs::File::open("test.a").expect("open"));
    while let Some(entry) = archive.next_entry() {
        entry.unwrap();
    }
}
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidData, error: StringError("Invalid file size field in entry header (\"hello worl\")") }', src/libcore/result.rs:997:5
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: UnexpectedEof, error: StringError("unexpected EOF in the middle of archive entry header") }', src/libcore/result.rs:997:5

Edit: change example to be clearer.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 2, 2019

Ar produced file:

$ ar r test.a core-fc675.rcgu.o rust.metadata.bin
$ hexyl test.a
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 21 3c 61 72 63 68 3e 0a ┊ 2f 2f 20 20 20 20 20 20 │!<arch>_┊//      │
│00000010│ 20 20 20 20 20 20 20 20 ┊ 20 20 20 20 20 20 20 20 │        ┊        │
│00000020│ 20 20 20 20 20 20 20 20 ┊ 20 20 20 20 20 20 20 20 │        ┊        │
│00000030│ 20 20 20 20 20 20 20 20 ┊ 33 38 20 20 20 20 20 20 │        ┊38      │
│00000040│ 20 20 60 0a 63 6f 72 65 ┊ 2d 66 63 36 37 35 2e 72 │  `_core┊-fc675.r│
│00000050│ 63 67 75 2e 6f 2f 0a 72 ┊ 75 73 74 2e 6d 65 74 61 │cgu.o/_r┊ust.meta│
│00000060│ 64 61 74 61 2e 62 69 6e ┊ 2f 0a 2f 30 20 20 20 20 │data.bin┊/_/0    │
│00000070│ 20 20 20 20 20 20 20 20 ┊ 20 20 30 20 20 20 20 20 │        ┊  0     │
│00000080│ 20 20 20 20 20 20 30 20 ┊ 20 20 20 20 30 20 20 20 │      0 ┊    0   │
│00000090│ 20 20 36 34 34 20 20 20 ┊ 20 20 30 20 20 20 20 20 │  644   ┊  0     │
│000000a0│ 20 20 20 20 60 0a 2f 31 ┊ 39 20 20 20 20 20 20 20 │    `_/1┊9       │
│000000b0│ 20 20 20 20 20 20 30 20 ┊ 20 20 20 20 20 20 20 20 │      0 ┊        │
│000000c0│ 20 20 30 20 20 20 20 20 ┊ 30 20 20 20 20 20 36 34 │  0     ┊0     64│
│000000d0│ 34 20 20 20 20 20 36 35 ┊ 20 20 20 20 20 20 20 20 │4     65┊        │
│000000e0│ 60 0a 21 21 21 21 21 21 ┊ 21 21 21 21 21 21 21 21 │`_!!!!!!┊!!!!!!!!│
│000000f0│ 21 21 21 21 21 21 21 21 ┊ 21 21 21 21 21 21 48 65 │!!!!!!!!┊!!!!!!He│
│00000100│ 6c 6c 6f 20 77 6f 72 6c ┊ 2a 2a 2a 2a 2a 2a 2a 2a │llo worl┊********│
│00000110│ 2a 2a 2a 2a 2a 2a 2a 2a ┊ 2a 2a 2a 2a 2a 2a 2a 2a │********┊********│
│00000120│ 2a 2a 0a 0a             ┊                         │**__    ┊        │ 
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

rust-ar produced file:

$ hexyl test.a
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 21 3c 61 72 63 68 3e 0a ┊ 23 31 2f 32 30 20 20 20 │!<arch>_┊#1/20   │
│00000010│ 20 20 20 20 20 20 20 20 ┊ 30 20 20 20 20 20 20 20 │        ┊0       │
│00000020│ 20 20 20 20 30 20 20 20 ┊ 20 20 30 20 20 20 20 20 │    0   ┊  0     │
│00000030│ 30 20 20 20 20 20 20 20 ┊ 32 30 20 20 20 20 20 20 │0       ┊20      │
│00000040│ 20 20 60 0a 63 6f 72 65 ┊ 2d 66 63 36 37 35 2e 72 │  `_core┊-fc675.r│
│00000050│ 63 67 75 2e 6f 00 00 00 ┊ 23 31 2f 32 30 20 20 20 │cgu.o000┊#1/20   │
│00000060│ 20 20 20 20 20 20 20 20 ┊ 30 20 20 20 20 20 20 20 │        ┊0       │
│00000070│ 20 20 20 20 30 20 20 20 ┊ 20 20 30 20 20 20 20 20 │    0   ┊  0     │
│00000080│ 30 20 20 20 20 20 20 20 ┊ 38 34 20 20 20 20 20 20 │0       ┊84      │
│00000090│ 20 20 60 0a 72 75 73 74 ┊ 2e 6d 65 74 61 64 61 74 │  `_rust┊.metadat│
│000000a0│ 61 2e 62 69 6e 00 00 00 ┊ 21 21 21 21 21 21 21 21 │a.bin000┊!!!!!!!!│
│000000b0│ 21 21 21 21 21 21 21 21 ┊ 21 21 21 21 21 21 21 21 │!!!!!!!!┊!!!!!!!!│
│000000c0│ 21 21 21 21 68 65 6c 6c ┊ 6f 20 77 6f 72 6c 2a 2a │!!!!hell┊o worl**│
│000000d0│ 2a 2a 2a 2a 2a 2a 2a 2a ┊ 2a 2a 2a 2a 2a 2a 2a 2a │********┊********│
│000000e0│ 2a 2a 2a 2a 2a 2a 2a 2a ┊                         │********┊        │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 10, 2019

Ping @mdsteele

@mdsteele
Copy link
Owner

Thanks for the detailed writeup; sorry for the delayed response.

When a filename within an archive file is longer than 16 characters, there are two prevailing formats for encoding it: the GNU format and the BSD format. Currently, rust-ar can read both, but can only write the BSD format. (https://docs.rs/ar/0.6.2/ar/#format-variants)

Looking at your second comment, I can see that your ar tool is generating the GNU format, while rust-ar is generating BSD. So I suspect that what is happening is that your ranlib is expecting a GNU archive, and is mangling the BSD archive produced by rust-ar.

I think the right solution here would be to give rust-ar the capability to produce GNU-formatted archives. This is a little tricky with rust-ar's current API, since (IIRC) the GNU format requires you to know all the long filenames in advance when generating the archive. But we could always have two different builder structs, one for each format, with different APIs as needed.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 11, 2019

sorry for the delayed response.

No problem. I won't need large filenames until rust-lang/rust#59564 is merged anyway.

But we could always have two different builder structs, one for each format, with different APIs as needed.

What about passing in a reader when adding a file, instead of returning a writer. The readers can then be stored when producing a GNU archive and can be written all at once after all files have been added when the Builder is dropped. It can still immediately write them when a BSD archive is produced.

@mdsteele
Copy link
Owner

What about passing in a reader when adding a file, instead of returning a writer. The readers can then be stored when producing a GNU archive and can be written all at once after all files have been added when the Builder is dropped. It can still immediately write them when a BSD archive is produced.

I like this idea, but unfortunately in some cases it's not feasible to have all the readers in scope at once. For example, if you're reading files out of one archive and writing them into another, you have to drop the first reader out of scope before you can get the next one, so there's no way for the builder to be storing all those readers at once.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 11, 2019

you have to drop the first reader out of scope before you can get the next one

:( Copying from one to another archive is very important for me.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 11, 2019

Maybe something like this.

struct DelayedWritable(u64);

struct WriteDelayed(...);

impl<W: Write> Builder<W> {
    fn new(W) -> Builder<W>;
    fn write_delayed(self) -> Result<WriteDelayed<W>>; // new

    fn append<'a, R: Read + 'a>(&'a mut self, &Header, R) -> Result<()>; // reader outlives self
    fn append_path<P: AsRef<Path>>(&mut self, P) -> Result<()>; // unchanged
    fn append_file(&'a mut self, Cow<'a, [u8]>, &mut File) -> Result<()>; // Cow name

    /// When used you must call .write_delayed() and then .set_data for every append_delayed or there will be a panic.
    fn append_delayed(&mut self, Header) -> DelayedWritable; // new
}

impl <W: Write> WriteDelayed<W> {
    fn set_data<R: Read>(&mut self, DelayedWritable, R) -> Result<()>;
    fn into_inner(self) -> Result<W>;
}

It still stores a reader where possible, but gives also the option to provide the reader after all filenames are known.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 15, 2019

Oops didn't notice 1cfa68d :)

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 16, 2019

Unfortunately I got:

ranlib: /home/bjorn/Documenten/rustc_codegen_cranelift/build_sysroot/target/x86_64-unknown-linux-gnu/debug/deps/libcompiler_builtins-78891cf83a7d3547.rlib: Malformed archive

when trying GnuBuilder.

Repro:

//! ```cargo
//! [dependencies]
//! ar = { git = "https://github.com/mdsteele/rust-ar.git" }
//! ```

use std::io::Read;

fn main() {
    let filenames = vec![b"rust.metadata.bin".to_vec(), b"compiler_builtins-78891cf83a7d3547.dummy_name.rcgu.o".to_vec()];
    let mut builder = ar::GnuBuilder::new(std::fs::File::create("test.a").expect("create"), filenames.clone());

    for filename in filenames {
        builder.append(&ar::Header::new(filename, 0), &mut (&[] as &[u8])).expect("add file");
    }

    std::mem::drop(builder);

    // Remove this ranlib invocation and there is no problem.
    assert!(
        std::process::Command::new("ranlib")
            .arg("test.a")
            .status()
            .expect("Couldn't run ranlib")
            .success()
    );

    let mut archive = ar::Archive::new(std::fs::File::open("test.a").expect("open"));
    while let Some(entry) = archive.next_entry() {
        entry.unwrap();
    }
}
$ hexyl test.a
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 21 3c 61 72 63 68 3e 0a ┊ 2f 2f 20 20 20 20 20 20 │!<arch>_┊//      │
│00000010│ 20 20 20 20 20 20 20 20 ┊ 20 20 20 20 20 20 20 20 │        ┊        │
│00000020│ 20 20 20 20 20 20 20 20 ┊ 20 20 20 20 20 20 20 20 │        ┊        │
│00000030│ 20 20 20 20 20 20 20 20 ┊ 37 33 20 20 20 20 20 20 │        ┊73      │
│00000040│ 20 20 60 0a 72 75 73 74 ┊ 2e 6d 65 74 61 64 61 74 │  `_rust┊.metadat│
│00000050│ 61 2e 62 69 6e 2f 0a 63 ┊ 6f 6d 70 69 6c 65 72 5f │a.bin/_c┊ompiler_│
│00000060│ 62 75 69 6c 74 69 6e 73 ┊ 2d 37 38 38 39 31 63 66 │builtins┊-78891cf│
│00000070│ 38 33 61 37 64 33 35 34 ┊ 37 2e 64 75 6d 6d 79 5f │83a7d354┊7.dummy_│
│00000080│ 6e 61 6d 65 2e 72 63 67 ┊ 75 2e 6f 2f 0a 2f 30 20 │name.rcg┊u.o/_/0 │
│00000090│ 20 20 20 20 20 20 20 20 ┊ 20 20 20 20 20 30 20 20 │        ┊     0  │
│000000a0│ 20 20 20 20 20 20 20 20 ┊ 20 30 20 20 20 20 20 30 │        ┊ 0     0│
│000000b0│ 20 20 20 20 20 30 20 20 ┊ 20 20 20 20 20 30 20 20 │     0  ┊     0  │
│000000c0│ 20 20 20 20 20 20 20 60 ┊ 0a 2f 31 39 20 20 20 20 │       `┊_/19    │
│000000d0│ 20 20 20 20 20 20 20 20 ┊ 20 30 20 20 20 20 20 20 │        ┊ 0      │
│000000e0│ 20 20 20 20 20 30 20 20 ┊ 20 20 20 30 20 20 20 20 │     0  ┊   0    │
│000000f0│ 20 30 20 20 20 20 20 20 ┊ 20 30 20 20 20 20 20 20 │ 0      ┊ 0      │
│00000100│ 20 20 20 60 0a          ┊                         │   `_   ┊        │ 
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

@mdsteele
Copy link
Owner

Hmm. Could you try producing the same example archive with both rust-ar and your command-line ar, and seeing how the files differ? Probably there's some formatting bug in the initial version of GnuBuilder.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 24, 2019

ar:

$ touch rust.metadata.bin
$ touch compiler_builtins-78891cf83a7d3547.dummy_name.rcgu.o
$ ar r test.ar rust.metadata.bin
ar: creating test.ar
$ ar r test.ar compiler_builtins-78891cf83a7d3547.dummy_name.rcgu.o
$ cat test.ar
!<arch>
//                                              74        `
rust.metadata.bin/
compiler_builtins-78891cf83a7d3547.dummy_name.rcgu.o/

/0              0           0     0     644     0         `
/19             0           0     0     644     0         `

rust-ar:

!<arch>
//                                              73        `
rust.metadata.bin/
compiler_builtins-78891cf83a7d3547.dummy_name.rcgu.o/
/0              0           0     0     0       0         `
/19             0           0     0     0       0         `

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 24, 2019

I am trying to fix this.

@bjorn3
Copy link
Contributor Author

bjorn3 commented Apr 24, 2019

I found the root problem: it an identifier doesn't have a size of a multiple of 2, then the name table doesn't have a multiple of 2 size either. PR #11 will fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants