add migration capability to bypass the shared memory
When the migration capability 'bypass-shared-memory'
is set, shared memory is bypassed during migration.

It is the key building block for several features on top of
QEMU, such as qemu-local-migration, qemu-live-update,
extremely-fast-save-restore, vm-template, vm-fast-live-clone,
and yet-another-post-copy-migration.

The philosophy behind this capability and the features built
on it is that part of the memory management is separated out
from QEMU, so that other toolkits such as libvirt,
runv (https://github.com/hyperhq/runv/) or the next qemu-cmd
can directly access the memory, manage it, and provide
features on top of it.

HyperHQ (http://hyper.sh, http://hypercontainer.io/) introduced
the vm-template (vm-fast-live-clone) feature to hyper containers
several months ago, and it has worked well (see hyperhq/runv#297).

With vm-template, containers (VMs) can be started in 130 ms and
save 80 MB of memory per container (VM), making hyper containers
as fast and as dense as ordinary containers.

On the current QEMU command line, shared memory has to be
configured via a memory backend object. A -mem-path-share option
could be added to combine with -mem-path for this feature; that
change is not included in this patch.

Advanced features:
1) qemu-local-migration, qemu-live-update
Set the mem-path on tmpfs and set share=on for it when starting
the VM, for example:
-object \
memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
-numa node,nodeid=0,cpus=0-7,memdev=mem

When you want to migrate the VM locally (after fixing a security
bug in the QEMU binary, or for any other reason), start a new QEMU
with the same command line plus -incoming, then migrate the VM
from the old QEMU to the new one with the migration capability
'bypass-shared-memory' set. The migration transfers the device state
*ONLY*; the memory remains the original memory backed by the tmpfs file.
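The local-migration flow can be sketched as commands (an illustrative,
hedged example; the socket path /tmp/qemu.sock is an assumption,
not part of the patch):

```
# destination: same command line as the running VM, plus -incoming
qemu-system-x86_64 \
  -object memory-backend-file,id=mem,size=128M,mem-path=/dev/shm/memory,share=on \
  -numa node,nodeid=0,cpus=0-7,memdev=mem \
  -incoming unix:/tmp/qemu.sock

# on the source VM's HMP monitor: enable the capability, then migrate
(qemu) migrate_set_capability bypass-shared-memory on
(qemu) migrate unix:/tmp/qemu.sock
```

Because the guest RAM lives in /dev/shm/memory, only the device
state crosses the migration socket.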

2) extremely-fast-save-restore
The same as 1), but with the mem-path on a persistent file system.

3) vm-template, vm-fast-live-clone
The template VM is started as in 1) and paused when the guest
reaches the template point (for example, when the guest app is
ready); then the template VM is saved. (The QEMU process of the
template can be killed at this point, because only the memory and
the device-state files, in tmpfs, are needed.)

We can then launch one or more VMs based on the template VM state.
The new VMs are started without share=on, so they all share the
initial memory from the memory file and save a lot of memory.
All the new VMs start from the template point, so the guest app
can get to work quickly.

A new VM booted from the template VM can't become a template
again; if you need that special feature, you could write a
cloneable-tmpfs kernel module for it.

The libvirt toolkit can't manage vm-template currently; in
hyperhq/runv we use a QEMU wrapper script to do it. I hope someone
adds a "libvirt-managed template" feature to libvirt.

4) yet-another-post-copy-migration
It is a possible feature, though no toolkit does it well yet.
Using an NBD server/client on the memory file is workable but
inconvenient; a dedicated tmpfs feature might be needed to
complete it fully. Nobody needs yet another post-copy migration
method today, but it becomes possible should someone ever want it.

Signed-off-by: Lai Jiangshan <[email protected]>
laijs authored and 0day robot committed Aug 9, 2016
1 parent ab861f3 commit 126ad16
Showing 7 changed files with 52 additions and 10 deletions.
5 changes: 5 additions & 0 deletions exec.c
@@ -1402,6 +1402,11 @@ static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
}
}

bool qemu_ram_is_shared(RAMBlock *rb)
{
return rb->flags & RAM_SHARED;
}

const char *qemu_ram_get_idstr(RAMBlock *rb)
{
return rb->idstr;
1 change: 1 addition & 0 deletions include/exec/cpu-common.h
@@ -58,6 +58,7 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
void qemu_ram_set_idstr(RAMBlock *block, const char *name, DeviceState *dev);
void qemu_ram_unset_idstr(RAMBlock *block);
const char *qemu_ram_get_idstr(RAMBlock *rb);
bool qemu_ram_is_shared(RAMBlock *rb);

void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
int len, int is_write);
1 change: 1 addition & 0 deletions include/migration/migration.h
@@ -290,6 +290,7 @@ void migrate_add_blocker(Error *reason);
*/
void migrate_del_blocker(Error *reason);

bool migrate_bypass_shared_memory(void);
bool migrate_postcopy_ram(void);
bool migrate_zero_blocks(void);

9 changes: 9 additions & 0 deletions migration/migration.c
@@ -1189,6 +1189,15 @@ void qmp_migrate_set_downtime(double value, Error **errp)
max_downtime = (uint64_t)value;
}

bool migrate_bypass_shared_memory(void)
{
MigrationState *s;

s = migrate_get_current();

return s->enabled_capabilities[MIGRATION_CAPABILITY_BYPASS_SHARED_MEMORY];
}

bool migrate_postcopy_ram(void)
{
MigrationState *s;
37 changes: 28 additions & 9 deletions migration/ram.c
@@ -605,6 +605,28 @@ static void migration_bitmap_sync_init(void)
num_dirty_pages_period = 0;
xbzrle_cache_miss_prev = 0;
iterations_prev = 0;
migration_dirty_pages = 0;

[Review comment from @zhanghaoyu1986, Sep 25, 2017: This initialization is not necessary.]

}

static void migration_bitmap_init(unsigned long *bitmap)
{
RAMBlock *block;

bitmap_clear(bitmap, 0, last_ram_offset() >> TARGET_PAGE_BITS);
rcu_read_lock();
QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
bitmap_set(bitmap, block->offset >> TARGET_PAGE_BITS,
block->used_length >> TARGET_PAGE_BITS);

/*
* Count the total number of pages used by ram blocks not including
* any gaps due to alignment or unplugs.
*/
migration_dirty_pages += block->used_length >> TARGET_PAGE_BITS;
}
}
rcu_read_unlock();
}

static void migration_bitmap_sync(void)
@@ -631,7 +653,9 @@ static void migration_bitmap_sync(void)
qemu_mutex_lock(&migration_bitmap_mutex);
rcu_read_lock();
QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
migration_bitmap_sync_range(block->offset, block->used_length);
if (!migrate_bypass_shared_memory() || !qemu_ram_is_shared(block)) {
migration_bitmap_sync_range(block->offset, block->used_length);
}
}
rcu_read_unlock();
qemu_mutex_unlock(&migration_bitmap_mutex);
@@ -1926,19 +1950,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
bitmap_set(migration_bitmap_rcu->bmap, 0, ram_bitmap_pages);
migration_bitmap_init(migration_bitmap_rcu->bmap);

if (migrate_postcopy_ram()) {
migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
bitmap_set(migration_bitmap_rcu->unsentmap, 0, ram_bitmap_pages);
bitmap_copy(migration_bitmap_rcu->unsentmap,
migration_bitmap_rcu->bmap, ram_bitmap_pages);
}

/*
* Count the total number of pages used by ram blocks not including any
* gaps due to alignment or unplugs.
*/
migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS;

memory_global_dirty_log_start();
migration_bitmap_sync();
qemu_mutex_unlock_ramlist();
6 changes: 5 additions & 1 deletion qapi-schema.json
@@ -553,11 +553,15 @@
# been migrated, pulling the remaining pages along as needed. NOTE: If
# the migration fails during postcopy the VM will fail. (since 2.6)
#
# @bypass-shared-memory: the shared memory region will be bypassed on migration.
# This feature allows the memory region to be reused by new qemu(s)
# or be migrated separately. (since 2.8)
#
# Since: 1.2
##
{ 'enum': 'MigrationCapability',
'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
'compress', 'events', 'postcopy-ram'] }
'compress', 'events', 'postcopy-ram', 'bypass-shared-memory'] }

##
# @MigrationCapabilityStatus
3 changes: 3 additions & 0 deletions qmp-commands.hx
@@ -3723,6 +3723,7 @@ Enable/Disable migration capabilities
- "compress": use multiple compression threads to accelerate live migration
- "events": generate events for each migration state change
- "postcopy-ram": postcopy mode for live migration
- "bypass-shared-memory": bypass shared memory region

Arguments:

@@ -3753,6 +3754,7 @@ Query current migration capabilities
- "compress": Multiple compression threads state (json-bool)
- "events": Migration state change event state (json-bool)
- "postcopy-ram": postcopy ram state (json-bool)
- "bypass-shared-memory": bypass shared memory state (json-bool)

Arguments:

@@ -3767,6 +3769,7 @@ Example:
{"state": false, "capability": "compress"},
{"state": true, "capability": "events"},
{"state": false, "capability": "postcopy-ram"},
{"state": false, "capability": "bypass-shared-memory"}
]}

EQMP

1 comment on commit 126ad16

@zhanghaoyu1986 commented:
If memory is hotplugged during migration, the calculation of
migration_dirty_pages may be incorrect:

void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
{
    ...
    migration_dirty_pages += new - old;
    call_rcu(old_bitmap, migration_bitmap_free, rcu);
    ...
}
