Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uffd: Disable image deduplication after fork #2492

Open
wants to merge 2 commits into
base: criu-dev
Choose a base branch
from

Conversation

liusdu
Copy link

@liusdu liusdu commented Oct 12, 2024

After a fork, both the child and parent processes may trigger a page fault (#PF) at the same virtual address, referencing the same position in the page image. If deduplication is enabled, the last process to trigger the page fault will fail.

Therefore, deduplication should be disabled after a fork to prevent this issue.

@rst0git
Copy link
Member

rst0git commented Oct 13, 2024

If deduplication is enabled, the last process to trigger the page fault will fail.

Would it be possible to add a ZDTM test for this use-case?

After a fork, both the child and parent processes may trigger a page fault (#PF)
at the same virtual address, referencing the same position in the page image.
If deduplication is enabled, the last process to trigger the page fault will fail.

Therefore, deduplication should be disabled after a fork to prevent this issue.

Signed-off-by: Liu Hua <[email protected]>
@liusdu
Copy link
Author

liusdu commented Oct 14, 2024

If deduplication is enabled, the last process to trigger the page fault will fail.

Would it be possible to add a ZDTM test for this use-case?

I will add a ZDTM testcase later. The follow test case can reproduce this issue.

  • source code
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>
#include <signal.h>
#include <sched.h>


#define VALUE 0x1001
#define STACK_SIZE (1024 * 1024)

int validate_memory(void *arg)
{
    if  (*(int *)arg != VALUE) {
        printf("memory is corrupted\n");
        return -1;
    }
    printf("memory: %x\n", *(int *)arg);
    return 0;
}

int main() {
    int pid = 0;
    char *stack;
    int *buffer;;
     int flags = CLONE_VFORK;
    stack = malloc(STACK_SIZE + 4096);
    if (stack == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    buffer = (int *)(stack + STACK_SIZE);
    buffer[0] = VALUE;

    kill(getpid(), 19);
    if (clone(validate_memory, buffer, flags, (void *)buffer) == -1) {
        perror("clone");
        free(stack);
        exit(EXIT_FAILURE);
    }
    validate_memory(buffer);

    return 0;
}
  • steps
  gcc -o clone clone.c 
  setsid ./clone
  criu dump --shell-job  -vvvv --tree `pidof ./clone` -o dumo.log  --images-dir ./ckp 
  criu lazy-pages -vvvv -o lp.log  --auto-dedup --images-dir ckp/ &
  criu  restore -vvvv -o restore.log  --images-dir  ./ckp/ --lazy-pages -j --no-background-fetch  &
  kill -s SIGCONT `ps aux | grep '\[clone\]' | awk '{print $2}'`
 

@@ -703,6 +703,7 @@ int parse_options(int argc, char **argv, bool *usage_error, bool *has_exec_cmd,
BOOL_OPT("mntns-compat-mode", &opts.mntns_compat_mode),
BOOL_OPT("unprivileged", &opts.unprivileged),
BOOL_OPT("ghost-fiemap", &opts.ghost_fiemap),
BOOL_OPT("no-background-fetch", &opts.no_background_fetch),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liusdu What is the use-case for this option?

cc: @rppt

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the test case. We know that CRIU can set up pages in the background or on demand in lazy-page mode. We must disable the background mode to ensure that both the child and parent processes trigger two page faults (#PF). Otherwise, by the time the child and parent processes start running, the lazy page server may have already completed its tasks and exited

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. We typically add command-line options to enable new CRIU features and document these options in the man page and criu --help. However, having a large number of options may lead to confusion for users who are not familiar with the available CRIU features.

Would it be possible to use an environment variable instead? For example, something like the following:

CRIU_DISABLE_BACKGROUND_LAZY_PAGE_FETCH=1  criu restore ...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok~

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. We typically add command-line options to enable new CRIU features and document these options in the man page and criu --help. However, having a large number of options may lead to confusion for users who are not familiar with the available CRIU features.

Would it be possible to use an environment variable instead? For example, something like the following:

CRIU_DISABLE_BACKGROUND_LAZY_PAGE_FETCH=1  criu restore ...

Or maybe even fault injection mechanism, because normally we don't want to disable the background fetch

@duguhaotian
Copy link
Contributor

can you show some error message of this issuse?

@@ -2213,7 +2213,7 @@ static int restore_root_task(struct pstree_item *init)
goto out_kill_network_unlocked;
}

if (lazy_pages_finish_restore())
if (!opts.no_background_fetch && lazy_pages_finish_restore())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand how this would prevent background fetch

@@ -703,6 +703,7 @@ int parse_options(int argc, char **argv, bool *usage_error, bool *has_exec_cmd,
BOOL_OPT("mntns-compat-mode", &opts.mntns_compat_mode),
BOOL_OPT("unprivileged", &opts.unprivileged),
BOOL_OPT("ghost-fiemap", &opts.ghost_fiemap),
BOOL_OPT("no-background-fetch", &opts.no_background_fetch),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. We typically add command-line options to enable new CRIU features and document these options in the man page and criu --help. However, having a large number of options may lead to confusion for users who are not familiar with the available CRIU features.

Would it be possible to use an environment variable instead? For example, something like the following:

CRIU_DISABLE_BACKGROUND_LAZY_PAGE_FETCH=1  criu restore ...

Or maybe even fault injection mechanism, because normally we don't want to disable the background fetch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants