-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize memory handling #1963
Optimize memory handling #1963
Conversation
3d204e4
to
aa296ad
Compare
I have additional added the flatview to the context. This way flatview doesn't need to be rebuild on I have tested and profiled this. It does in fact save us relevant time. |
@wtdcode have you overseen this PR? It would be nice to get some feedback. |
Sorry for not leaving a comment. I had a quick review on it and didn't find any obvious problems. I planned to have further debugging to have a better understanding before commenting further. |
I ported d010357 which generally allows unicorn to go fast I have twinkled a bit on the code to ensure that when snapshot is enabled, no TLB is marked clean. However, I'm not sure if that works for you, you may have a look. Regarding this PR, I locally tested and think it is good to go if you don't have more to add. |
Thanks for the update. As far as I understand this patch allows faster writes, if the tlb cache entry exists. So for snapshots it might be enough to clear the tlb cache. This way the store_helper would be called again and everything works as expected. I'll test this.
I would like to rebase this to the dev branch to remove the "fixup" commits. Also because I haven't looked at this for a few weeks, I would like to test this again with the current HEAD. I'll ping you when I'm finished testing. |
aa296ad
to
de2ae81
Compare
@wtdcode I have tested the code again and removed the fixup commits so for me this is ready to merge. For the notdirty write I'm working on a optimization, but I'll open another PR for it. |
Will check today or tomorrow, thanks for your work! |
@@ -2418,6 +2431,11 @@ uc_err uc_context_restore(uc_engine *uc, uc_context *context) | |||
return ret; | |||
} | |||
uc_snapshot(uc); | |||
uc->ram_list.freed = context->ramblock_freed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I should have noticed this earlier. With this change (and actually the early snapshot_level
change), with UC_CTL_CONTEXT_MEMORY
, the context is no longer safe to keep as a file (it contains pointers) and be restored to another uc_engine
instance.
It is super non-trivial to adapt the memory context to a value type and thus I suggest adding a few documents in the header (uc_context_save/restore
, UC_CTL_CONTEXT_MEMORY
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we had this question already in the snapshots feature. I'll write some documentation about this.
uc.c
Outdated
if (context->fv) { | ||
free(context->fv->ranges); | ||
} | ||
g_free(context->fv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this shall move to the if (context->fv)
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g_free() and free() with a NULL is a noop so it's not necesary. I can move it if you prefer it this way better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please do so because it looks more straightforward.
qemu/softmmu/memory.c
Outdated
} | ||
} | ||
memory_region_transaction_commit(mr); | ||
// memory_region_transaction_commit(mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to remove this commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This function is called in two cases.
- unmap/free complete memory region (as part of destroy)
Then the parrent unmap will trigger the commit - restore the memory state
Then the restore also restores the old flatview so no rebuild is nesecary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick check, and it seems to make sense.
Could you also add these as comments here? So that I know the preconditions here if we are going to bump qemu version.
Okay, I finished the review, and it's safe to go now with minor fixes and a few document updates. Thanks again for your brilliant work. |
Save the last element of the ram_list. This allows to faster find where to add new elements when they are not bigger then page size.
this keeps the optimization for find_ram_offset() intact after snapshot restore.
Building each flatview new when the memory has changed is quite expensive when many MemoryRegions are used. This is an issue when using snapshots.
this avoids rebuilding the flatview when restore a context.
Specialy stress that with UC_CTL_CONTEXT_MEMORY it is not possible to use the context with a different unicorn object.
de2ae81
to
bd33da3
Compare
I have added the last fixes and rebased it to the dev branch. The failed test doesn't look like it's a bug by my changes. |
I will rerun and it seems Github Action bug. |
Thanks! |
By using the memory snapshots with real examples we found some bottlenecks. Therefor I have implemented some optimizations.
We found that the flatview creation is quite expensive (around 1/4 of all runtime including emulation) when you have a lot of memory regions. By permanently creating new regions with CoW we have a lot of regions and rebuild for each cow the flatview again. I have added a update function does only change affected ranges.
Also we found that the
find_ram_offset_last
optimization doesn't work after a restore. To fix this the context now includes also theramblock_freed
bool.This also includes two smaller optimizations: only clear the affected tlb on memory_cow and save the last
ramblock
of theram_list
It's currently only a draft, because we need to do a few more tests to see if the optimization works as expected.