Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Device: Don't remove an existing target directory when unmounting a disk device if the original dir hasn't been created by LXD #12700

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

gabrielmougard
Copy link
Contributor

@gabrielmougard gabrielmougard commented Jan 5, 2024

closes #12648
closes #12716


Overall description

When a disk device is removed while relying on a host directory and mapped to a target within a container or a VM,
we detect if the target directory has been created by LXD or not in order to not delete the content of a target
directory during the unmount. Additionaly, with the VM case, we cleanly unmount the target path inside the VM.

Container case

Example:

  • Let say we mounted a custom host empty directory (test) on the existing /opt directory of the container,
    when unmounted (lxc device remove ...), the target /opt won't be removed, because we marked it as NOT being created
    by LXD at mount time.

  • If, however, we created a custom empty target directory at mount time:
    lxc config device add u1 test disk source=/home/user/test path=/new_dir,
    the directory new_dir will be created on the target instance and if we decide to unmount test,
    the target /new_dir will be removed because is has been created by LXD

Particular case for VMs

In addition to that, this fix also cleanly unmount the target directory inside the VM through an new LXD-agent API call. Before that, here is what would have happened:

mkdir /tmp/empty-dir
lxc launch ubuntu:jammy v1 --vm
lxc config device add v1 empty-dir disk source=/tmp/empty-dir path=/opt
lxc config device remove v1 empty-dir

lxc shell v1 -- stat /opt
stat: cannot statx '/opt': Transport endpoint is not connected

This happens because the mounted device and its associated char device were removed using QEMU's QMP without unmounting the target.

Now, we inspect for the mounts and the over-mounts if any on the VM, and unmount them in the right order.

@gabrielmougard
Copy link
Contributor Author

Working on the integration tests now.

@github-actions github-actions bot added the Documentation Documentation needs updating label Jan 5, 2024
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 9ee6693 to 20ba221 Compare January 5, 2024 11:52
@gabrielmougard gabrielmougard marked this pull request as ready for review January 5, 2024 11:52
@gabrielmougard
Copy link
Contributor Author

Tests should be ready

shared/instance.go Outdated Show resolved Hide resolved
shared/instance.go Outdated Show resolved Hide resolved
test/suites/container_devices_disk.sh Show resolved Hide resolved
doc/config_options.txt Outdated Show resolved Hide resolved
shared/instance.go Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 20ba221 to 06cfd51 Compare January 8, 2024 14:20
@github-actions github-actions bot added the API Changes to the REST API label Jan 8, 2024
@gabrielmougard
Copy link
Contributor Author

Also, @tomponline, the deviceVolatileSetFunc function, which calls the VolatileSet, updates the DB. I think this is good to persist this information though. Removing an existing target dir like /opt for example (see the original post of the issue author) through an umount seems quite bad.

Or, we can introduce a new device option to say "not to track that kind of changes" so that I can set d.volatileSetPersistDisable = true before my volatile set call and then restore it to false after. This will surely be faster (no DB call) but the user accept a potential data loss on the target instance if he mounted on an existing target dir.

doc/config_options.txt Outdated Show resolved Hide resolved
doc/config_options.txt Outdated Show resolved Hide resolved
doc/api-extensions.md Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch 2 times, most recently from 9504543 to db861a7 Compare January 8, 2024 17:08
ru-fu
ru-fu previously approved these changes Jan 9, 2024
Copy link
Contributor

@ru-fu ru-fu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look good now. :)

@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from db861a7 to eedfede Compare January 9, 2024 17:52
@gabrielmougard gabrielmougard marked this pull request as draft January 9, 2024 17:54
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from eedfede to 8838e7c Compare January 10, 2024 15:48
@gabrielmougard gabrielmougard marked this pull request as ready for review January 10, 2024 15:49
@gabrielmougard
Copy link
Contributor Author

@tomponline I implemented the logic for the VM as well within the lxd-agent. I tested it on my side without an issue.

@tomponline
Copy link
Member

tomponline commented Jan 10, 2024

Thanks @gabrielmougard please can you add a PR description with what has changed so the reviewer can compare the code to the intended design.

Please do include any new API endpoints you've added to the lxd-agent.

Cheers

@tomponline
Copy link
Member

Please can you rebase @gabrielmougard

@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch 2 times, most recently from 9140527 to 844ac0f Compare January 19, 2024 14:39
Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gabrielmougard when the disk device is hot-plugged its mount inside the guest is triggered by an event in eventsProcess(). Is there a reason you didn't feel that we could unmount the disk in the same fashion and required the API endpoint instead?

lxd/instance/instancetype/instance.go Outdated Show resolved Hide resolved
shared/version/api.go Outdated Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 844ac0f to f4c1453 Compare January 26, 2024 11:26
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 789a10f to cdc2492 Compare March 25, 2024 13:11
@gabrielmougard
Copy link
Contributor Author

@tomponline @roosterfish can you review this please?

lxd-agent/devices.go Fixed Show resolved Hide resolved
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from cdc2492 to e9989e6 Compare March 27, 2024 15:21
Copy link

github-actions bot commented Mar 27, 2024

Heads up @mionaalex - the "Documentation" label was applied to this issue.

lxd-agent/devices.go Fixed Show fixed Hide fixed
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from e9989e6 to 1db74aa Compare April 11, 2024 12:17
@gabrielmougard
Copy link
Contributor Author

@tomponline @roosterfish updated

@tomponline
Copy link
Member

Please can you rebase

@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 1db74aa to 4139a48 Compare May 23, 2024 15:35
Some devices like disk devices with a target path need to be cleanly unmounted.
In the case of a VM instance, we prefer to handle the unmount logic within the agent
and handle the unmounting of the potential overmounts of the guest.

In some cases, we also need to remove a path resource (or not) on the guest side. That is
why we also pass some extra metadata contained in the `Volatile` field, to handle the resource
removal process in the agent and not on the LXD side (this would result in spawning aan SFTP client, which is a wasteful approach)

Signed-off-by: Gabriel Mougard <[email protected]>
…k_dev_name>.last_state.created` for the disk device

Signed-off-by: Gabriel Mougard <[email protected]>
…ory when unmounting

When a disk device is removed while relying on a host directory and mapped to a target within a container,
we detect if the target directory has been created by LXD or not in order to not delete the content of a target
directory during the unmount.

Example:

- Let say we mounted a custom host empty directory (`test`) on the existing `/opt` directory of the container,
when unmounted (`lxc device remove ...`), the target `/opt` won't be removed, because we marked it as NOT being created
by LXD at mount time.

- If, however, we created a custom empty target directory at mount time:
`lxc config device add u1 test disk source=/home/user/test path=/new_dir`,
the directory `new_dir` will be created on the target instance and if we decide to unmount `test`,
the target `/new_dir` will be removed because is has been created by LXD

Signed-off-by: Gabriel Mougard <[email protected]>
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch 2 times, most recently from fc5d572 to 370bb39 Compare November 4, 2024 14:28
lxd-agent/devices.go Fixed Show fixed Hide fixed
lxd-agent/devices.go Fixed Show fixed Hide fixed
@gabrielmougard gabrielmougard force-pushed the fix/rm-disk-dev-deletes-empty--dir branch from 370bb39 to 96cd966 Compare November 4, 2024 14:40
lxd-agent/devices.go Dismissed Show dismissed Hide dismissed
@gabrielmougard
Copy link
Contributor Author

@tomponline updated

Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up from our meeting, we need to:

  • Check for performance reductions with this approach when starting instances.
  • Consider what happens when part of the target path exists but part of it doesn't. What gets deleted?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Changes to the REST API Documentation Documentation needs updating
Projects
None yet
4 participants