Modifying PDBs #16
Comments
Correct, this library is read-only.
The bad news is that writing PDBs would be all new code. The good news is that it would be adjacent to code which can already read PDBs, and it's easy to test symmetrical transforms. I think it makes more sense to have PDB-writing code here than in a separate library. There are two main pieces:
If you want to stab at this, #10 is the place to start, and it would definitely be good to approach that with read/write fidelity in mind. This would prompt some user-facing API changes. Take this code for example:
    S_LDATA32 | S_LDATA32_ST |
    S_GDATA32 | S_GDATA32_ST |
    S_LMANDATA | S_LMANDATA_ST |
    S_GMANDATA | S_GMANDATA_ST => {
        Ok(SymbolData::DataSymbol(DataSymbol {
            global: match kind { S_GDATA32 | S_GDATA32_ST | S_GMANDATA | S_GMANDATA_ST => true, _ => false },
            managed: match kind { S_LMANDATA | S_LMANDATA_ST | S_GMANDATA | S_GMANDATA_ST => true, _ => false },
            type_index: buf.parse_u32()?,
            offset: buf.parse_u32()?,
            segment: buf.parse_u16()?,
        }))
    }
The What happens if
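To make the read/write fidelity concern concrete, here is a minimal sketch of a symbol that could be written back exactly as it was read. It assumes the fix is to keep the record kind on the parsed value and derive `global` and `managed` from it. The `DataKind` enum and the `parse`/`write` methods are hypothetical and are not this library's API, and the sketch only covers the three fixed-size fields shown above, not the rest of the record.

```rust
/// Hypothetical stand-in for the eight record kinds matched in the snippet above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataKind {
    LData32,
    LData32St,
    GData32,
    GData32St,
    LManData,
    LManDataSt,
    GManData,
    GManDataSt,
}

/// Hypothetical symbol that remembers which record kind it came from.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DataSymbol {
    kind: DataKind,
    type_index: u32,
    offset: u32,
    segment: u16,
}

impl DataSymbol {
    /// Derived from `kind` instead of being stored as a lossy bool.
    fn global(&self) -> bool {
        matches!(
            self.kind,
            DataKind::GData32 | DataKind::GData32St | DataKind::GManData | DataKind::GManDataSt
        )
    }

    /// Derived from `kind` instead of being stored as a lossy bool.
    fn managed(&self) -> bool {
        matches!(
            self.kind,
            DataKind::LManData | DataKind::LManDataSt | DataKind::GManData | DataKind::GManDataSt
        )
    }

    /// Reads the three little-endian fields; returns None if `buf` is too short.
    fn parse(kind: DataKind, buf: &[u8]) -> Option<DataSymbol> {
        if buf.len() < 10 {
            return None;
        }
        Some(DataSymbol {
            kind,
            type_index: u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]),
            offset: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
            segment: u16::from_le_bytes([buf[8], buf[9]]),
        })
    }

    /// Appends the same three fields in the same order and byte order.
    fn write(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.type_index.to_le_bytes());
        out.extend_from_slice(&self.offset.to_le_bytes());
        out.extend_from_slice(&self.segment.to_le_bytes());
    }
}
```

With only the two booleans, a writer cannot tell an `_ST` record from its non-`_ST` counterpart; keeping `kind` lets a parse-then-write round trip be checked byte for byte.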
Thank you for the detailed response! This sounds like a good design to me. For my use-case, I'm fine with sequential writes to a PDB. In fact, I want all the streams to be sequential instead of having their pages scattered all over the file. I think concurrent writes are possible, but it's definitely more complicated and not something I want to implement. (IIRC, pages are written atomically using the pairs of pages in the free page map.) I'm not yet sure exactly what the I'll take a look at solving #10 first though.
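As a rough illustration of the "sequential streams" idea, the sketch below assigns each stream a contiguous run of pages once its length is known. The function names are hypothetical, and it deliberately ignores the header, free page map, and stream directory pages that a real file also has to account for.

```rust
use std::ops::Range;

/// Number of pages needed to hold `len` bytes (zero-length streams need none).
fn pages_needed(len: u32, page_size: u32) -> u32 {
    len / page_size + u32::from(len % page_size != 0)
}

/// Gives every stream a contiguous page range, in stream order,
/// starting at `first_page`. Hypothetical helper, not part of any library.
fn layout_sequential(stream_lens: &[u32], page_size: u32, first_page: u32) -> Vec<Range<u32>> {
    let mut next = first_page;
    stream_lens
        .iter()
        .map(|&len| {
            let count = pages_needed(len, page_size);
            let range = next..next + count;
            next += count;
            range
        })
        .collect()
}
```

For example, with 4096-byte pages and streams of 0, 1, 4096, and 4097 bytes starting at page 3, the ranges come out as `3..3`, `3..4`, `4..5`, and `5..7`.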
The simplifying conceit used by
Carrying this idea through to the output side would suggest passing around a
If we did know how long a stream would be, it's straightforward to imagine an implementation for
If we don't know how long every stream will be, then we need a way to get a growable
Thinking aloud some more: the main reason
FYI, while looking at the Microsoft source again (no thanks to you 😉) I noted this comment about the "storing the stream table page list in a list of pages":
I don't know if this is what Microsoft's tools do in practice, but it seems like it'd be fairly straightforward to support this for cases of updating an existing PDB file: you just write out all the new data to empty pages, write out a new stream table page list (also to empty pages), and then write out the header with the new stream table pages.
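The ordering described above (new data, then a new stream table, then the header) can be sketched with a toy page store. This is not the real MSF layout: `PageStore` is hypothetical, it holds a single stream, and its "header" is just one field pointing at a directory page that stores the data page number.

```rust
/// Toy page store illustrating the update order; NOT the real MSF format.
struct PageStore {
    pages: Vec<Vec<u8>>,
    free: Vec<usize>,      // pages no longer referenced by the committed state
    directory_page: usize, // stands in for the header's stream table pointer
}

impl PageStore {
    fn new(initial: Vec<u8>) -> PageStore {
        let mut store = PageStore { pages: Vec::new(), free: Vec::new(), directory_page: 0 };
        let data_page = store.alloc(initial);
        store.directory_page = store.alloc((data_page as u32).to_le_bytes().to_vec());
        store
    }

    /// Writes `data` into a free page if one exists, else appends a new page.
    fn alloc(&mut self, data: Vec<u8>) -> usize {
        match self.free.pop() {
            Some(index) => {
                self.pages[index] = data;
                index
            }
            None => {
                self.pages.push(data);
                self.pages.len() - 1
            }
        }
    }

    /// Page number recorded in the currently committed directory.
    fn data_page(&self) -> usize {
        let dir = &self.pages[self.directory_page];
        u32::from_le_bytes([dir[0], dir[1], dir[2], dir[3]]) as usize
    }

    fn read(&self) -> &[u8] {
        &self.pages[self.data_page()]
    }

    fn update(&mut self, new_stream: Vec<u8>) {
        let old_dir = self.directory_page;
        let old_data = self.data_page();
        // 1. New data goes to an empty page; committed pages are untouched.
        let data_page = self.alloc(new_stream);
        // 2. A new directory goes to another empty page.
        let dir_page = self.alloc((data_page as u32).to_le_bytes().to_vec());
        // 3. Only now is the "header" switched to the new directory.
        self.directory_page = dir_page;
        // The old pages become reusable for a later update.
        self.free.push(old_dir);
        self.free.push(old_data);
    }
}
```

Because step 3 is the only write that changes what a reader sees, an update interrupted before it leaves the previous contents reachable.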
Bumping this issue with some new insights. At our company we found ourselves in need of adding new symbol information into the pdb. We only cared to add new We forked this library and did the following:
The PDB code is from the 90s; it's an Elder Scroll. Figured I'd put my $0.02 here and say what we did. :)
Writing anything more complex to the PDB is going to require major changes to pdb-rs; as has been said before, it's read-centric. The
Is it possible to modify parts of a PDB or rewrite it entirely with this? Browsing through the docs, it looks like it only reads PDBs.
The reason I ask is that I'm considering rewriting my C++ tool Ducible in Rust using this crate. The tool rewrites PDBs to remove non-deterministic data. By far, the greatest effort was in deciphering the PDB format from Microsoft's pile of 💩. (It's good that the LLVM guys have documented this a little bit.) So, I'd be happy to switch to a good PDB parsing library and gain Rust's ability to produce rock-solid software.
If you think writing PDBs falls into the purview of this library and it isn't too difficult to add, I could take a stab at implementing it with some guidance.
I'm currently doing it by having an abstract stream type where the stream could either be on-disk or in-memory. Then, an MSF stream can be replaced with an in-memory stream before writing everything back out to disk. In this way, streams are essentially copy-on-write. Doing it like this in Rust could be difficult with the ownership system, so I don't think this is the best approach. I'm definitely open to any good ideas about how to do this.
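One way the on-disk-or-in-memory stream idea might map onto Rust's ownership model is `std::borrow::Cow`, which is either a borrowed slice or an owned buffer. The sketch below is only an illustration under that assumption: `StreamSet` is a hypothetical type, the "file" is a byte slice already in memory, and writing out simply concatenates streams rather than producing MSF pages.

```rust
use std::borrow::Cow;

/// Hypothetical set of streams; each is borrowed from the original
/// file bytes until it is replaced with an owned in-memory buffer.
struct StreamSet<'a> {
    streams: Vec<Cow<'a, [u8]>>,
}

impl<'a> StreamSet<'a> {
    /// All streams start out borrowed from the source file.
    fn from_file(streams: Vec<&'a [u8]>) -> Self {
        StreamSet {
            streams: streams.into_iter().map(Cow::Borrowed).collect(),
        }
    }

    /// Swaps one stream for new in-memory contents.
    fn replace(&mut self, index: usize, data: Vec<u8>) {
        self.streams[index] = Cow::Owned(data);
    }

    /// True if the stream no longer refers to the original file bytes.
    fn is_modified(&self, index: usize) -> bool {
        matches!(self.streams[index], Cow::Owned(_))
    }

    /// Writes every stream in order, regardless of where its bytes live.
    fn write_all(&self, out: &mut Vec<u8>) {
        for stream in &self.streams {
            out.extend_from_slice(stream);
        }
    }
}
```

The lifetime `'a` ties the set to the original file bytes, so the borrow checker ensures they outlive every untouched stream, while replaced streams own their data.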
P.S. Thanks for writing this library. The PDB format is a real pain in the arse to parse.