Usage of tarfile.extractall() function without validating tarfile members #226
Comments
So this is interesting as the tarballs that Tern works on come from container images built by tools like Docker. It would be really cool to detect such issues with container images. The flip side to this is if most container images folks work with are subject to this attack, Tern can only sternly warn and continue until the problem is fixed in the container images. A note: It seems that this issue might be fixed in Python 3.8 with the addition of a class |
On further review of the conversations in the Python issue, I think they're not going to merge their fix any time soon, mostly because the security issues with the Python library inherit from the underlying security issues with However, tricking someone into downloading a naughty image from Dockerhub is definitely a thing, so I don't think we should ignore it completely. There's a patch which takes care of most of the issues: https://bugs.python.org/file47826/safetarfile-4.diff |
@nishakm I would like to work on this if it is not assigned yet. Please let me know. Thank you, Ravi. |
@nishakm, thanks for looking into the best approach to solve this. I don't know all the current usages of tern, only that if there is a way of reaching one of the highlighted functions, an attack is possible. Since it is difficult to enumerate every use case of a project, I would suggest that we consider the worst-case scenario, where tern receives untrusted tars on a protected system. Also, the fact that tern runs in a container shouldn't be considered sufficient protection for a system; plenty of things can go wrong, such as an insecure container runtime configuration. The fix from the pyvcloud team was just taken as an example: since they are dealing with OVAs, they can simply discard symlinks. In tern's case this might indeed require more thought. However, there is one case we can already protect against: arbitrary file overwrite using directory traversal sequences in the member filename. I don't see any valid use case for it. Thanks |
Agree 100%. There are three ways I see we can resolve this:
I would think that the warning on the README is sufficient indication that running Tern in a container is not secure :)
I would rather all the protections we know about at this point (stdlib patch above) arrive in one REVERTME patch, hopefully with something in place to track REVERTMEs.
|
So here's how we can proceed:
|
@rparikh Do you want to take this on? |
@nishakm yes. I would love to work on this. Thank you, Ravi. |
Awesome!! Note that this is quite a big task, so a series of small commits in the PR is fine. It would be great if you could also write a test for all the modules in this file in |
@nishakm Just wanted to update that I have started working on this. I will be submitting a review soon. As you mentioned, full things will be committed in small batches. Thank you so much, Ravi. |
@nishakm So we will create a class with methods to examine the content? And we can take a tarfile object as an argument, iterate over each member, and decide whether it's a risk or not? I am wondering whether we need a class at all if we do not subclass TarFile; we could have an independent function to examine the content. Also, are we only looking at the relative name/symlink issue, or also at the maximum number of files and total size? Thank you, Ravi. |
As mentioned, we don't need to create a class that is a subclass of Tarfile. Individual
We can set different limits than what is given.
|
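For illustration, the independent-function idea discussed above (take a tarfile object, walk its members, apply limits) could look roughly like the sketch below. The function name `exceeds_limits` and both limit values are hypothetical, chosen only for this example; they are not taken from the stdlib patch or from Tern.

```python
import tarfile

# Hypothetical limits, chosen only for illustration.
MAX_MEMBERS = 10000
MAX_TOTAL_BYTES = 1024 ** 3


def exceeds_limits(tar, max_members=MAX_MEMBERS,
                   max_total_bytes=MAX_TOTAL_BYTES):
    """Return a reason string if the archive is too large, else None.

    `tar` is an open tarfile.TarFile; only member headers are read,
    nothing is extracted.
    """
    count = 0
    total = 0
    for member in tar:
        count += 1
        total += member.size  # size as declared in the member header
        if count > max_members:
            return "too many members"
        if total > max_total_bytes:
            return "uncompressed size too large"
    return None
```

A caller would open the archive with `tarfile.open(path)`, pass the object in, and refuse to extract when a reason is returned. Because no subclass of `TarFile` is involved, the check can be dropped in front of any existing extraction call.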
@rparikh Are you still working on this? We need this solved urgently as it does pose a security issue. You can submit a PR with "WIP" in the title and I can take over from where you left off if you are not finding the time to finish. Would this be OK with you? |
@nishakm I am really sorry for the late reply on this thread. I have worked on this but could not complete it. Will it be okay if I start the review over this weekend? Though I could not complete it, I would still like to contribute. Please let me know. |
@rparikh sure! Like I said before, if you could submit a PR with |
@rparikh I still haven't seen a PR from you. I'm going to continue to work on this. You can work on something else when you get the time. Thanks for trying! |
#435 will also resolve this
|
|
@nishakm I have absolutely no problem with this solution, as it will indeed mitigate the issue by safely processing the archive. On your end you will lose the stacktrace info after the |
Depends on #435 |
This resolves tern-tools#226

This change does the following:
1. Replace python's tarfile with a call to the system's tar utility. We do this to take advantage of the CVE-2013-4420 fix to libtar. Python's tarfile module has a fix in the works but it is yet to be merged as of this change.
2. We check for EOF errors and empty tarballs. This is mostly to address a few instances where we have seen Docker images that were malformed.
3. We modify some functions around loading and analyzing Docker images and layers, including catching the extra errors that we raise for No. 2.

- rootfs: Moved some functionality from check_tar_permissions into a new function called shell_command. This function simply runs shell commands as the current user and returns the result and error to be dealt with by the calling function.
- rootfs: check_tar_permissions will now use shell_command.
- rootfs: Created a new function called check_tar_members which will list the elements in the tarball to see if there are any EOF errors or empty tarballs.
- rootfs: Repurposed extract_layer_tar to be a general purpose extract_tarfile function which can be used throughout the code.
- container: Use extract_tarfile to extract image metadata.
- image_layer: Use extract_tarfile to extract image layer tarballs.
- analyze: Set up mount points after the image is loaded.
- report: In general setup, don't create directories. extract_tarfile will now do it.
- report: Catch all the appropriate errors that might get thrown when trying to load an image.

Signed-off-by: Nisha K <[email protected]>
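As a rough sketch of the approach this commit message describes (shelling out to the system's tar instead of using Python's tarfile), something like the following could work. The function names come from the commit message, but their signatures, return values, and the exception types raised here are assumptions made for illustration and may not match Tern's actual code.

```python
import os
import subprocess


def shell_command(cmd):
    """Run cmd (a list of arguments) as the current user.

    Returns (result, error) as bytes; error is empty on success.
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
    if proc.returncode != 0:
        return proc.stdout, proc.stderr or b"command failed"
    return proc.stdout, b""


def check_tar_members(tar_path):
    """List members with the system tar; raise on errors or empty tarballs."""
    result, error = shell_command(["tar", "-tf", tar_path])
    if error:
        # Assumed error type: truncated archives usually fail here.
        raise EOFError("tar failed: " + error.decode(errors="replace"))
    if not result.strip():
        raise ValueError("empty tarball: " + tar_path)
    return result.decode().splitlines()


def extract_tarfile(tar_path, dest_dir):
    """Extract tar_path into dest_dir, creating the directory if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    _, error = shell_command(["tar", "-xf", tar_path, "-C", dest_dir])
    if error:
        raise EOFError("tar failed: " + error.decode(errors="replace"))
```

Passing the command as an argument list rather than a shell string avoids quoting problems with unusual file names. The caller decides what to do with the raised errors, which is in line with the "catch all the appropriate errors" item above.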
Describe the bug
The usage of the tarfile.extractall() function without validating members could be used to perform archive attacks.
This happens in two places in tern:
Expected behavior
Two things can be done to protect against this attack:
(see warning under https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall)
We should validate that the tarfile member names listed in its header don't contain directory traversal sequences (../) and that the members are not symlinks.
You can find a good description of how to protect against this here https://stackoverflow.com/a/10077309 and also an example of how to protect against these types of attacks here vmware-archive/pyvcloud#268
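A minimal sketch of that validation, assuming the archive is opened with Python's tarfile module: the helper names `unsafe_members` and `extract_if_safe` are hypothetical, and rejecting every link outright is a simplification borrowed from the pyvcloud-style fix. Container layers legitimately contain symlinks, so tern may need something more permissive.

```python
import os
import tarfile


def unsafe_members(tar, dest):
    """Return names of members that would land outside dest or are links."""
    dest_root = os.path.realpath(dest)
    flagged = []
    for member in tar.getmembers():
        # Resolve where the member would end up; "../" sequences and
        # absolute names both resolve to a path outside dest_root.
        target = os.path.realpath(os.path.join(dest_root, member.name))
        inside = os.path.commonpath([dest_root, target]) == dest_root
        if not inside or member.issym() or member.islnk():
            flagged.append(member.name)
    return flagged


def extract_if_safe(tar_path, dest):
    """Extract only when no member is flagged; raise otherwise."""
    with tarfile.open(tar_path) as tar:
        flagged = unsafe_members(tar, dest)
        if flagged:
            raise ValueError("refusing to extract: %s" % flagged)
        tar.extractall(dest)
```

Checking the resolved path rather than searching the name for the string "../" also catches absolute member names such as /etc/passwd.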