-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support setting custom vmcoreinfo #396
Conversation
The same issue exists with guest memory dumps from QEMU that don't include VM translations. Currently, drgn (without libkdumpfile) works out of the box with QEMU dumps generated with the paging flag (dump-guest-memory -p). With the flag, QEMU performs page table walks and replicates the VM mapping via ELF segments in the dump. This appears to be slow and highly inefficient: In the best case this takes about a minute and the resulting ELF has just a little over 60000 segments, but in some cases the QEMU process hangs indefinitely (I suppose while the guest is executing a process with a huge address space). The physical memory dumps (without paging) are much faster and work with drgn when Without libkdumpfile, I propose applying the kernel mapping offset to |
Ooof, yeah, this option is heavily discouraged due to performance. Also, a malicious guest could cause a bit of a denial of service by finagling the page tables in just the right way so that the page mappings are way too large & complex for QEMU to handle. But, drgn should also work out-of-the-box with vmcores generated by
Oops, I hope you didn't spend too long on that since I had a similar (but less polished) workaround in my vmcoreinfo_wip branch. Glad to know we came a similar solution :D That said, I like your idea of moving the fixes into the page table iterator, that can allow the architecture code to be self-contained. - table_physical = false;
+ table_physical = true; The above is from your changes to the x86_64 pgtable iterator, I think there's a risk that unconditionally setting diff --git a/libdrgn/arch_x86_64.c b/libdrgn/arch_x86_64.c
index a0a51c36..d7a2a4a4 100644
--- a/libdrgn/arch_x86_64.c
+++ b/libdrgn/arch_x86_64.c
@@ -616,7 +616,10 @@ linux_kernel_pgtable_iterator_next_x86_64(struct drgn_program *prog,
for (;; level--) {
uint64_t table;
bool table_physical;
- if (level == levels) {
+ if (level == levels && it->it.pgtable == prog->vmcoreinfo.swapper_pg_dir) {
+ table = it->it.pgtable + prog->vmcoreinfo.phys_base - START_KERNEL_MAP;
+ table_physical = true;
+ } else if (level == levels) {
table = it->it.pgtable;
table_physical = false;
} else { That way the changes are entirely contained within the arch-specific section of the page table code, and we can add similar code for other architectures as we see fit. I've pushed a new commit with this change and added a coauthor tag for you on it. |
ea77cc5
to
8d97df1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I kind of hate that this is necessary, but that's not your fault, and I'm happy with how you handled it. A few minor comments throughout. The last commit also needs a commit message.
libdrgn/kdump.c
Outdated
attr.val.blob = blob; | ||
ks = kdump_set_attr(ctx, "linux.vmcoreinfo.raw", &attr); | ||
if (ks != KDUMP_OK) { | ||
free(vmcoreinfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of the libkdumpfile documentation, blob
owns vmcoreinfo
now, so this would cause kdump_block_decref()
to double free, right?
free(vmcoreinfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're definitely correct, and it really obviously looks wrong too. Good catch.
More interestingly however is that the libkdumpfile code seems to indicate that even in the error case, the reference to the blob is already stolen, and thus we cannot free it in our error path. See this comment. However, there are clearly cases of errors inside of kdump_set_attr()
where the function returns and the blob is not decref'd. In summary: on some failures, we can't even decref the blob. On other failures, we should. And there's no way to tell 😜
So my vote here (I know it's gross) is to heavily comment this and get rid of both the free(vmcoreinfo)
and the kdump_blob_decref()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I filed ptesarik/libkdumpfile#81 for Petr, we'll see if he agrees too. Assuming he does, I'll remove the error handling and document the leak .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If kdump_set_attr()
is supposed to take ownership, then yes, we should definitely remove kdump_blob_decref()
, and I don't think we even need a comment; we just need to report any leaks as a libkdumpfile bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, dropped the comment and the kdump_blob_decref()
, we can revisit it if Petr says the ownership semantic is supposed to be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more things I noticed in this round.
Currently set_core_dump() expects to be initializing the vmcoreinfo itself. But it could be beneficial to let callers set the vmcoreinfo with something else, e.g. if the vmcoreinfo can't be found in the ELF notes or kdump metadata, but has been extracted via other means. So update these initialization steps to only setup vmcoreinfo information if it's not already present. Signed-off-by: Stephen Brennan <[email protected]>
Signed-off-by: Stephen Brennan <[email protected]>
Ideally these could be put under some sort of "advanced" settings so they don't confuse new users. Signed-off-by: Stephen Brennan <[email protected]>
It's not immediately obvious from the API, but libkdumpfile allows setting the vmcoreinfo attribute. However, setting the vmcoreinfo is not enough, we must also set the platform information given by the user. Further, we need to specify these elements in the correct order with respect to the file descriptor. If done correctly, then libkdumpfile can successfully handle a core whose vmcoreinfo is not present in the diskdump or ELF metadata. Of course, the user must find the vmcoreinfo note and manually give this to Drgn, along with the platform architecture. Signed-off-by: Stephen Brennan <[email protected]>
Signed-off-by: Stephen Brennan <[email protected]>
Most core dumps contain some virtual address mappings: usually at a minimum, the kernel's direct map is represented in ELF vmcores via a segment. So normally, drgn can rely on the vmcore to read the virtual address of swapper_pg_dir. However, some vmcores only contain physical address information, so when drgn reads memory at swapper_pg_dir, it needs to first translate that address, thus causing a recursive translation error like below: >>> prog["slab_caches"] Traceback (most recent call last): File "<console>", line 1, in <module> File "/home/stepbren/repos/drgn/drgn/cli.py", line 141, in _displayhook text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns) _drgn.FaultError: recursive address translation; page table may be missing from core dump: 0xffffffff9662aff8 Debuggers like crash, as well as libkdumpfile, contain fallback code which can translate swapper_pg_dir in order to bootstrap this address translation. In fact, the above error does not occur in drgn when using libkdumpfile. So, let's add this fallback case to drgn as well. Other architectures will need to have equivalent support added. Co-authored-by: Illia Ostapyshyn <[email protected]> Signed-off-by: Stephen Brennan <[email protected]>
Merged, thanks! |
This branch has sat on the back burner for a while, but new discussion in #350 led me to update it and get it ready to merge. It adds the ability to give drgn some data to be interpreted as the vmcoreinfo when the Program is created. Later, when a core dump is attached, drgn will respect the pre-existing info rather than trying to get it from the core dump.
There's a bit of a delicate dance to be had with libkdumpfile, because it also wants to use the vmcoreinfo to find the page tables. For vmcores missing the vmcoreinfo, it seems that libkdumpfile doesn't know how to find their architecture, so we need to manually provide the architecture, then the dump FD, then the vmcoreinfo.
While I was at it, I threw in RISC-V support for libkdumpfile, which has been available since 0.5.4, but since we have some if-statements that enumerate all the libkdumpfile architectures, we never updated those and thus haven't been able to use it (I think... I don't actually have a RISC-V core dump handy).
Testing this is of course the important part. One obvious candidate where this code makes drgn work is hypervisor core dumps that are missing vmcoreinfo. QEMU will produce them if you don't have
-device vmcoreinfo
, which drgn already notes and produces a nice error message about.Thankfully, drgn's own test framework runs QEMU without that option, so it's easy to generate a good test vmcore:
Running drgn against it will fail:
But with this branch, you can use my dumpphys tool like below to get a text file containing the vmcoreinfo:
And then, you can go ahead and run drgn, but add new arguments
--vmcoreinfo file.vmcoreinfo
to specify the vmcoreinfo, and--architecture X86_64
to specify the architecture (since libkdumpfile cannot detect it). At that point, drgn works like normal.For ELF vmcores, drgn requires some added bootstrapping/knowledge to find the physical address of the root page table (i.e. swapper_pg_dir). I've implemented this ability within the x86_64 page table iterator. Other architectures will need this support added. libkdumpfile already seems to have this ability for x86_64, so for other architectures it stands to reason that
DRGN_USE_LIBKDUMPFILE_FOR_ELF=1
might help them avoid the following error: