Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protection of virtual hosts from "fat finger" deletion #12772

Closed
michaelklishin opened this issue Nov 21, 2024 · 6 comments · Fixed by #13015
Closed

Protection of virtual hosts from "fat finger" deletion #12772

michaelklishin opened this issue Nov 21, 2024 · 6 comments · Fixed by #13015
Assignees
Milestone

Comments

@michaelklishin
Copy link
Member

michaelklishin commented Nov 21, 2024

Problem Definition

Every so often we see users delete virtual hosts by accident. This can happen both over the HTTP API
and with rabbitmqctl delete_vhost.

When this happen, certain kinds of virtual host data can be
restored relatively easily. The definitions data set is small (compared to tens or hundreds of
GiBs a stream can contain, for example). Unfortunately, for messages stored in queues (not streams),
the backup and restore problem is a very different in scope.

Tanzu RabbitMQ has a Warm Standby Replication feature that replicates both schema and messages (for QQs, streams, CQs) to a remote warm standby cluster or multiple clusters. This does help with certain fat finger errors but not all of them,
and only if a reasonably large schema synchronization interval is used.

Since virtual host deletion is one of the most destructive operations (up there with node reset),
it could use some extra protection, similarly to how opt-in (experimental) feature flag enablement was made much more difficult to do accidentally in #11998, #12643.

In addition, specific clusters, systems, and RabbitMQ-based distributions such as Tanzu RabbitMQ can
have system virtual host. "System" here means .

Proposed Solution

Virtual hosts can be marked with additional metadata. This metadata today stores the DQT (default queue type), a description, a and set of tags.

The same metadata map can be used to enable protection from deletion. If set,
rabbitmqctl delete_vhost and DELETE /api/vhosts/{name} will fail with a reasonably specific
error asking the user to delete the metadata key from the target virtual host first, then retry.

Deletion of the key will act as a confirmation for CLI tools, and will require the same permissions
as DELETE /api/vhosts/{name} in the case of HTTP API.

Dedicated CLI commands and HTTP API endpoints will be provided for locking and unlocking deletion.

Currently Available Options

DELETE /api/vhosts/{name} requires the user to be tagged as administrator.

For CLI tools, such "roles" do not exist, so an explicit confirmation is particularly important to have.

@michaelklishin michaelklishin self-assigned this Nov 21, 2024
@michaelklishin michaelklishin changed the title Consider a mechanism for protection of virtual hosts from "fat finger" deletion Protection of virtual hosts from "fat finger" deletion Nov 21, 2024
@sateeshkvk7
Copy link

Thank you, @michaelklishin, for developing/enhancing the extra-protection feature! Additionally, is it possible to add another feature option to duplicate virtual hosts? This would enable users to create exact copies of virtual hosts, including their configurations, messages, and resources. Such a feature would be incredibly useful for testing, backup, recovery, and migration, as it allows for modifications without affecting the original setup or provides a quick restore option.

@Zerpet
Copy link
Contributor

Zerpet commented Nov 21, 2024

I like this idea. I'm wondering about a "force" parameter both both rabbitmqctl delete_vhost and DELETE /api/vhosts/{name}. Similar how rm has rm -f flag. The user would be explicitly requesting JustDoIt ❗ By default, the protection, if present, will stop the deletion.

A "force" option could improve UX, because I only have to remember to add --force; whereas to remove the metadata, I have to remember 2 additional pieces of information: 1) remove vhost metadata command + 2) the metadata protection tag name. I hear the argument that a "force" option would partly defeat the purpose of this protection. I think it would be a matter of preference how robust/strict we want to make this protection.

@michaelklishin
Copy link
Member Author

@sateeshkvk7 a virtual host can have many GiBs of data, so "duplicating" it is not something we plan on doing. WSR in Tanzu RabbitmQ "continuously duplicates" a cluster and it took several person-years to get there.

This issue is for OSS RabbitMQ and only has to do with a deletion protection mechanism.

@michaelklishin
Copy link
Member Author

@Zerpet that's the point, make you remember and have to run an additional command. You have opted in for deletion protection after all. There isn't really a -f equivalent for the HTTP API.

For CLI tools, we can also add a -y or --confirm flag but that seems a bit pointless to me because it will become a part of all scripts, shell histories, muscle memory and in the end won't really have any effect.

Explicitly opting in for deletion protection and the requirement to explicitly lift it won't be used "as automatically".

@Zerpet
Copy link
Contributor

Zerpet commented Nov 21, 2024

@Zerpet that's the point, make you remember and have to run an additional command. You have opted in for deletion protection after all. There isn't really a -f equivalent for the HTTP API.

For CLI tools, we can also add a -y or --confirm flag but that seems a bit pointless to me because it will become a part of all scripts, shell histories, muscle memory and in the end won't really have any effect.

I was thinking about an additional query parameter to support "force" delete via the HTTP API. I see your point, and I think it's reasonable. If we go down this route, it would be useful to have dedicated commands to add or remove the metadata protection tag. I agree with having to remember one command (to update vhost tags), but I think remembering the specific protection tag (in addition to the command) would lead to bad UX; specially if we expect this command to be used infrequently i.e. how often do you create/delete vhosts?

@michaelklishin
Copy link
Member Author

@Zerpet yes, dedicated CLI commands and HTTP API endpoints will be provided.

@michaelklishin michaelklishin added this to the 4.1.0 milestone Jan 2, 2025
michaelklishin added a commit that referenced this issue Jan 2, 2025
Accidental "fat finger" virtual deletion accidents
would be easier to avoid if there was a protection mechanism
that would apply equally even to CLI tools and external
applications that do not use confirmations for deletion
operations.

This introduce the following changes:

 * Virtual host metadata now supports a new queue,
   'protected_from_deletion', which, when set,
   will be considered by key virtual host deletion function(s)
 * DELETE /api/vhosts/{name} was adapted to handle
   such blocked deletion attempts to respond with
   a 412 Precondition Failed status
 * 'rabbitmqctl list_vhosts' and 'rabbitmqctl delete_vhost'
   were adapted accordingly
 * DELETE /api/vhosts/{name}/deletion/protection
   is a new endpoint that can be used to remove
   the protective seal (the metadata key)
 * POST /api/vhosts/{name}/deletion/protection
   marks the virtual host as protected

In the case of the HTTP API, all operations on
virtual host metadata require administrative
privileges from the target user.

Other considerations:

 * When a virtual host does not exist, the behavior
  remains the same: the original, protection-unaware
  code path is used to preserve backwards compatibility

References #12772.
michaelklishin added a commit that referenced this issue Jan 2, 2025
mergify bot pushed a commit that referenced this issue Jan 4, 2025
Accidental "fat finger" virtual deletion accidents
would be easier to avoid if there was a protection mechanism
that would apply equally even to CLI tools and external
applications that do not use confirmations for deletion
operations.

This introduce the following changes:

 * Virtual host metadata now supports a new queue,
   'protected_from_deletion', which, when set,
   will be considered by key virtual host deletion function(s)
 * DELETE /api/vhosts/{name} was adapted to handle
   such blocked deletion attempts to respond with
   a 412 Precondition Failed status
 * 'rabbitmqctl list_vhosts' and 'rabbitmqctl delete_vhost'
   were adapted accordingly
 * DELETE /api/vhosts/{name}/deletion/protection
   is a new endpoint that can be used to remove
   the protective seal (the metadata key)
 * POST /api/vhosts/{name}/deletion/protection
   marks the virtual host as protected

In the case of the HTTP API, all operations on
virtual host metadata require administrative
privileges from the target user.

Other considerations:

 * When a virtual host does not exist, the behavior
  remains the same: the original, protection-unaware
  code path is used to preserve backwards compatibility

References #12772.

(cherry picked from commit f62d46c)
mergify bot pushed a commit that referenced this issue Jan 4, 2025
(cherry picked from commit 6281dcb)
@michaelklishin michaelklishin modified the milestones: 4.1.0, 4.0.6 Jan 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants