Tool for Emergency Master-Key Recovery #15952
Below is a copy of my related CVE submission (https://github.com/openzfs/zfs/security/advisories/GHSA-5wqj-fcr9-j434)

Encrypted ZFS Volumes Vulnerable to Encryption Re-Key Attack

Summary
The lack of an offline backup mechanism for decrypting the immutable master key means encrypted ZFS volumes can be weaponized. A malicious actor who gains privileged access can re-key any available encrypted volumes, their children, and their snapshots. By extension, systems that then pull snapshots for offline backup will also have the associated volumes re-keyed. This can result in the complete loss of data on the related volumes, or a perfect scenario for ransom.

Details
As a feature, any unlocked, encrypted ZFS volume, its children, and all dependent snapshots can be re-keyed via "zfs change-key -i". When snapshots are transmitted to offline backup systems, those volumes and systems also inherit the new key. The only requirements for re-keying are that the existing key is already loaded, that the user has permission to perform encryption operations (the root account, or one granted via zfs allow), and that the new key is available. Any system that is up and running and relies on data within an encrypted ZFS volume will presumably have its keys loaded. Attack and escalation vectors that compromise a system with elevated permissions are not a daily occurrence, but they are common enough that even a system kept current with security patches will have windows of opportunity for this attack chain. The consequence of a successful attack is the complete loss of not only the active host but also the snapshots/volumes pulled from the affected host. For many, these are the only backups kept for member systems.

@tcaputi, a non-active but previously significant contributor to openzfs/zfs, indicates in #6824 that this attack vector could be mitigated: "...there is an answer here that wouldn't take too much work to implement. The current filesystem decryption code works using the user-supplied key material and the parameters saved on disk to decrypt the immutable master key. In the event of the attack described here, the stored parameters would no longer be available, but theoretically we could allow the user to save these parameters to a file that they could save securely on their own. Then they could decrypt the master key even if the on-disk encryption keys were altered via a new ioctl... ...it would probably be useful as a break-glass kind of solution."

PoC
Step 1: Host is accessed by a malicious insider, or compromised with privileged permissions via an unrelated attack vector.

Impact
All users utilizing encrypted ZFS.
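For concreteness, the re-key step described above amounts to a single "zfs change-key" invocation once the existing key is loaded. Below is a minimal sketch, not a working exploit: the dataset name and passphrase are hypothetical, and the command runner is injectable so the logic can be exercised without a live pool.

```python
import subprocess

def rekey_tree(dataset, run=subprocess.run):
    """Re-key an unlocked encrypted dataset with attacker-chosen key
    material. The immutable master key itself never changes; every child
    and snapshot is instantly re-wrapped under the new user key.

    Preconditions (per the CVE text): the current key is loaded, and the
    caller has change-key permission (root, or granted via 'zfs allow').
    """
    # 'tank/secret' and the passphrase below are illustrative only.
    cmd = ["zfs", "change-key", "-o", "keyformat=passphrase", dataset]
    run(cmd, input=b"attacker-chosen-passphrase\n", check=True)
    return cmd
```

Once this runs against the top of an encrypted hierarchy, every raw send pulled afterward carries the attacker's wrapping parameters, which is the knock-on effect on backups described above.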
I'm an outsider to this project; I've seen your original comments back then in the other threads.
Personally, I don't think this reaches the threshold for a CVE; frankly, this is really a feature request. It might only merit a CVE if ZFS advertised protection against the scenario you describe, and I have not seen that to be the case so far. Also, you are assuming that backups are being created with raw sends ("zfs send -w").
I agree that would be a handy feature indeed. A workaround for disaster recovery might be to use the provided "zstream dump" command to extract and save the on-disk key parameters.
This can always happen. The only proper protection against ransomware attacks is having actual physical offline backups; anything that is online can be affected once an attacker is in. The definition of offline is that the storage is physically disconnected/airgapped: not reachable by other computers, accessible only by physical access. Please see the US CISA Ransomware Guide. SOC 2 and common DPIAs will likely require you to have such disaster recovery backups in place.
This wouldn't be an offline backup any longer if it can "pull" new changes. Only if you are using…
@norpol: I'm not getting the impression you have significant experience with network security or complex multi-user enterprise environments, certainly not with zfs. Or, for that matter, with the general point of this issue.
The name for any type of attack doesn't exist...until one day it does. Your inability to google a similar attack vector has no bearing; the name is arbitrary, the vector is very real. "Re-key" as a "security feature" is only a feature if you are the one doing the re-keying. Every tool is a "feature" until it is in the wrong hands. The problem here is that it has extreme knock-on effects: if I re-key your system without your knowledge, you are SOL. I hope your critical data is mighty static and your "offline backups" were recent. This vulnerability is acknowledged by @tcaputi, who is certainly an authority. If it should not be a CVE, then how about we let the ZFS owners state as much?
So, your point here is that no one should use native encryption, that the -w flag exists for no valid reason, and that old-style copy-pasta backups are a good solution?
Your narrow concept of enterprise environments does not cover many real use cases. I could go on, but the point is made.
What exactly is a "physical offline backup"? Do the packets to this magical datastore teleport to their destination while the power is off and no networks are connected? Your comments remind me of the age-old chestnuts: "If they're on your internal network you have bigger problems anyway" and "If the attacker gains root it doesn't matter (so what's the point)". A configuration using a secure, centralized backup system with one-way inbound access to the systems being backed up, pulling snapshots which are then further exported to an offsite location, certainly meets the metric for very strong disaster recovery...except for the vulnerability described herein. Remove that vulnerability, and this setup is about 1000x faster, deeper, and more flexible than any "offline backup" you can come up with. Restic cannot compete with instantaneous block-level snapshots. The ability to recover the master security key should have been baked in at inception; it is an oversight that it hasn't been made available so far.
No. Not everything. Not always. Please try and break into our centralized backup system that has no ports open from anywhere but does outbound connections via ssh only. If you can write an exploit for openssh on a target system that results in a shell on an inbound connecting system, please call the NSA today, they need your skills immediately. Also: good luck.
So, for several hundred datasets, we should set up a mechanism to send them, zstreamdump each, and then somehow, hopefully, with the help of the community or "a professional" (what do you think we are?), maybe extract the key for each of them, which will save our bacon in the future? This is not much of a disaster recovery plan.
This is the only thing you've said of merit. If I had the bandwidth I'd put together a PR myself. It is important enough that I will take some time and see what I can pull together in any case, but I suspect, based on prior comments by Tom, that someone with actual experience in the codebase could put together a small tool with minimal effort. Hence this separate, single-issue issue. A small tool to fix this significant vulnerability can likely be produced quickly. With the ability to extract/save/use the master key for a change-key operation, the entire attack chain goes away. The problem in #12649 is muddled and its resolution is certainly a heavier lift, and therefore a potentially long wait.
I apologize that my reply appears to have made you uncomfortable, but I'd also appreciate it if you'd stay civil and abstain from getting too personal. Suggesting similarities to "age-old chestnuts", telling me that I have an "inability to google", or sarcastic/harsh asides such as "Do the packets to this magical datastore teleport" and "please call the NSA today, they need your skills immediately. Also: good luck." should not be necessary to communicate your underlying needs. Since there was not a lot of context about the environment/threat model in which you are utilizing zfs encryption, it is of course impossible to give tailored advice in a simple reply; I apologize for that attempt. I just thought it would help others decide what to use when. Perhaps that context can help others better understand why this is important to you, and prioritize it a bit more.
No. I just wanted to clarify the situation in which the issue arises, since this wasn't covered by the CVE report, though I've now noticed that the quoted part of your original issue contains the phrasing "user does a raw zfs send", which actually covers this.
So I assume that is the situation you're dealing with, sounds cool.
Perhaps that is something you would like to elaborate on; I would not have expected, from your previous statements, that you are a senior C developer with familiarity in cryptography or the relevant zfs codebase. I apologize if that came across as rude; it was certainly not my intention. Anyway, I hope this issue can be dealt with soon. Unfortunately it appears to me that at the moment there is a lack of resources for the ZFS encryption parts. I guess one possible workaround for the time being is to come up with additional safeguards on the receiving end, such as using "zstream dump" to verify the key parameters before receiving.
Great, I'm sure everyone would appreciate that.
@norpol: Nothing you have said makes me uncomfortable; nothing to apologize for. I also was not trying to be personal. By "inability to google" I meant that whether or not a single prior example exists in the whole of computing history is irrelevant; it was not a comment on your ability to find one. And yes, as someone who runs a penetration-testing/red-team consultancy, I get a lot of pushback from well-intentioned but half-informed admins with a very similar perspective, commenting on externalities that have no bearing on the specific topic at hand. I am generally brusque with distractions.
The context is simple and covered 100% in the detail of the CVE submission. I don't think there is any advice to give, unless you personally know how to extract and back up the master key, and then use it to recover a dataset whose master key is encrypted with an unknown key. This isn't a forum for how to improve your enterprise deployment or pass random-government-regulation. Other backup schemes, and why you might use them, are not helpful and do nothing but muddy the topic. Please review Tom Caputi's comment in #6824, which is very succinct; anyone interested in this topic should absorb it.
The third paragraph of that comment should pique the interest of anyone this affects, and it is the driver behind my creating this issue. As for my background:
I started my career as a C developer, and while I am not Bruce Schneier, I read and understand his work and can similarly wade through whatever codebase as needed. If my life depended on it, I could do the work. As it stands, much like everyone, I have other priorities, and I will have to leave it to more nimble hands with time to dedicate. Until a tool can be created, any workaround that enables similar functionality would be welcome, regardless of complexity...just as long as it is 100% going to work. As far as I remember, no details regarding the encryption keys are visible via zstream, but I'll double-check ASAP. It would be tremendous if they are: a great stopgap measure.
What do you mean by that, exactly? I followed this #12649 (comment) and it is also available on my local zfs pool. Writing something that first validates that the key, iv, mac, and salt look as expected, and then continues with zfs receive, should be relatively trivial, no? Of course it depends on how/what you are retrieving.
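The validation suggested here can be sketched as a small parser over the "zstream dump" text. A hedged sketch: it assumes the key parameters appear as "name = value" lines, and the DSL_CRYPTO_* field names are an assumption about the dump format that may differ across ZFS versions.

```python
import re

# Illustrative field names; confirm against your zstream dump output.
KEY_FIELDS = ("DSL_CRYPTO_IV", "DSL_CRYPTO_MAC", "salt")

def extract_key_params(dump_text, fields=KEY_FIELDS):
    """Pull 'name = value' lines for the given fields out of a
    zstream dump transcript."""
    params = {}
    for field in fields:
        m = re.search(rf"{re.escape(field)}\s*=\s*(\S+)", dump_text)
        if m:
            params[field] = m.group(1)
    return params

def keys_match(expected, dump_text):
    """True only if every expected parameter is present and identical;
    a mismatch would suggest the stream was re-keyed."""
    return extract_key_params(dump_text, expected.keys()) == expected
```

A receiving script could run this check on the incoming stream's dump and abort before "zfs receive" if keys_match() returns False.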
I don't think I've ever reviewed encryption information using zstream; clearly a failing on my part. Upon review of the #12649 comments you mentioned, I now see the key parameters are indeed visible. This does not eliminate the need for a tool, but an available workaround certainly lowers the urgency.
I would think a simpler solution would be for zfs receive to have an option, -x encryptkey, that prevents it from honoring requests to change the key. That sounds really easy to do, and well worth it.
Absolutely. Given that it was trivial for me to implement a check externally based on @norpol's observation, it should be equally trivial to give zfs receive this option. I'd say it should be something other than -x, just because the relevant encryption key data aren't properties; but otherwise, yes. Maybe "zfs receive -K"? When I have a few minutes I will create a separate issue for this...or, if you have the time @clhedrick, please do. Both of these features should be implemented, but a "zfs receive -K" (or whatever) PR could be done just as trivially with the right hands, I imagine.
I would still like to see a disaster-recovery mechanism for the master key, so I will keep this issue open. There are many use cases where this would be of value above and beyond 'evilhacker'. For instance, I would like to be able to delegate key authority over several of our encrypted datasets to various personnel. Unfortunately, we cannot risk any mistakes (because of said lack of a DRP method), and therefore our keys must stay accessible/deployable only through our elaborate-but-safe system™, which is not friendly to multi-user access.
Regarding the safety of backups using pull replication: see #5341 (comment) for possible ways to mess with backup servers. There was also a discussion a good while back (sorry, I was not able to find it) about the possibility of completely destroying backup servers by feeding a manipulated datastream to the zfs receive side.
@GregorKopka I did not have time to read the entire thread you referenced (on phone at airport). Does your hypothetical scenario work where the snapshots are only generated by the receiving system? To clarify: in our setup the source systems do zero snapshotting themselves. It is all done by the backup system, which connects, makes the snapshots, normalizes them (gets rid of incompatible snapshots, or panics and messages an admin if something unexpected happens), and then pulls them, never with zfs receive -F (and now also doing a zfs send | zstream dump on both the sending and receiving datasets to compare keys before proceeding). I don't currently see a scenario in this configuration where property manipulation or other skulduggery on the sending system would result in actual data loss in the datasets stored on the backup system. Thoughts?
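The send/receive key comparison mentioned above could look roughly like this. Same caveat as before: the DSL_CRYPTO_* labels are assumed, and the "zfs send -w | zstream dump" pipeline is shown for context only, with the pure comparison logic kept separate so it can be checked without a live pool.

```python
import re
import subprocess

# Assumed label pattern for the wrapped-key parameters in a dump.
CRYPT_LINE = re.compile(r"(DSL_CRYPTO_\w+)\s*=\s*(\S+)")

def crypt_params(dump_text):
    """Collect every crypto-looking 'name = value' pair from a dump."""
    return dict(CRYPT_LINE.findall(dump_text))

def rekey_suspected(src_dump, dst_dump):
    """Flag any divergence in wrapped-key parameters between the
    sending and receiving datasets' streams."""
    return crypt_params(src_dump) != crypt_params(dst_dump)

def dump_snapshot(snapshot):
    """Run: zfs send -w <snap> | zstream dump
    (raw send, -w, is what preserves the key metadata in the stream)."""
    send = subprocess.Popen(["zfs", "send", "-w", snapshot],
                            stdout=subprocess.PIPE)
    dump = subprocess.run(["zstream", "dump"], stdin=send.stdout,
                          capture_output=True, text=True)
    send.wait()
    return dump.stdout
```

The backup host would call dump_snapshot() on its own copy and on the source, and refuse to proceed when rekey_suspected() is true.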
TL;DR: yes. Plus the (so far) theoretical option of delivering a stream that passes the consistency checks on recv but contains enough garbage to mess up the target pool for good... maybe even a…
Interesting. I don't see where the first couple would apply to our situation...we do not use receive -F anywhere and nothing is ever mounted on the backup systems, but the theoretical attack warrants review.
The only damaging attack I've been able to actually execute is changing the key... |
Then it should be reasonably safe from the snapshots on the target side being wiped. |
@GregorKopka: This is a novel approach I had not considered. We do not pull empty snapshots, but the general proof of concept still bears significant consideration. All of our logic resides solely on the pulling/backup hosts, and this type of zeroing-out, or zeroing-then-minorly-changing, of 1000-odd arbitrary snapshots should be trivial to plan for and prevent...but it needs to be mitigated ahead of time; post-mortem would just be tears in the rain. As it stands, I believe our setup only pulls snapshots created by the backup system, and any others on the target are destroyed. In theory someone should notice that 'critical data' has changed long before our own rules run their course to the point of being unrecoverable; otherwise the data probably wasn't that critical, n'est-ce pas?
Rustic can be made to take remote encrypted snapshots of local ZFS snapshots. Rustic can make incremental encrypted snapshots, and rustic can delete any snapshot. For now, this is the only reliable way to send encrypted copies of ZFS snapshots to untrusted remote machines; ZFS native encrypted backups aren't stable yet. Rustic can also do deduplication, so it is not too slow, as long as you use an Intel CPU to accelerate encryption.
Describe the feature you would like to see added to OpenZFS
A mechanism for backup and restore of the master key of an encrypted dataset
How will this feature improve OpenZFS?
When the encryption key is changed for a given dataset, that key is also immediately changed for all of its snapshots. The change is then propagated to any dataset that receives a snapshot as part of a backup process. Without a backup of the key, a malicious actor with sufficient access can potentially lock the active filesystem and all backups. This could also happen innocently: an administrator manages to fat-finger a password twice, has caps-lock on, pastes incorrectly, etc. In any case the result is the same: loss of the affected dataset and all backups.
A mechanism to recover the master key is essential to the safe use of ZFS native encryption.
Additional context
In November 2023 I submitted a CVE for this vulnerability (https://github.com/openzfs/zfs/security/advisories/GHSA-5wqj-fcr9-j434); however, it has not yet been reviewed. The text is not visible to the public until approved by the OpenZFS owners, so I will post a comment with the full text below.
This concept is already covered in Encryption keys/roots management tools needed #12649, but that request is much larger. Emergency key recovery could, and should, be a small, simple tool made available ASAP, versus a significant overhaul of the encryption tools.