Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tarfile: Traversal attack vulnerability #65308

Closed
DanielGarcia mannequin opened this issue Mar 31, 2014 · 39 comments
Closed

tarfile: Traversal attack vulnerability #65308

DanielGarcia mannequin opened this issue Mar 31, 2014 · 39 comments
Assignees
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-security A security issue

Comments

@DanielGarcia
Copy link
Mannequin

DanielGarcia mannequin commented Mar 31, 2014

BPO 21109
Nosy @birkenfeld, @jcea, @gustaebel, @vstinner, @taleinat, @tiran, @benjaminp, @jwilk, @ned-deily, @vadmium, @serhiy-storchaka, @psyker156, @shanxS, @epicfaace, @websurfer5
PRs
  • bpo-21109: Add SafeTarFile #15244
  • Dependencies
  • bpo-17102: tarfile extract can write files outside the destination path
  • bpo-29788: [Security] tarfile: Add absolute_path option to tarfile, disabled by default
  • Files
  • prevent-tar-traversal-attack.diff: patch to prevent
  • safetarfile-1.diff: New SafeTarFile class and documentation
  • safetarfile-2.diff
  • safetarfile-3.diff
  • safetarfile-4.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gustaebel'
    closed_at = None
    created_at = <Date 2014-03-31.08:14:19.090>
    labels = ['type-security', 'library', '3.9']
    title = 'tarfile: Traversal attack vulnerability'
    updated_at = <Date 2021-02-27.08:56:06.564>
    user = 'https://bugs.python.org/DanielGarcia'

    bugs.python.org fields:

    activity = <Date 2021-02-27.08:56:06.564>
    actor = 'vstinner'
    assignee = 'lars.gustaebel'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2014-03-31.08:14:19.090>
    creator = 'Daniel.Garcia'
    dependencies = ['17102', '29788']
    files = ['34676', '35127', '47800', '47803', '47826']
    hgrepos = []
    issue_num = 21109
    keywords = ['patch', 'security_issue']
    message_count = 35.0
    messages = ['215222', '215223', '215224', '215225', '215226', '215237', '215239', '215242', '215656', '215658', '216675', '217188', '217189', '217690', '277339', '289438', '324193', '324198', '324262', '324908', '325229', '325329', '325491', '325607', '325635', '326423', '326437', '327451', '327458', '334921', '335078', '335292', '349517', '349583', '387772']
    nosy_count = 19.0
    nosy_names = ['georg.brandl', 'jcea', 'lars.gustaebel', 'vstinner', 'taleinat', 'christian.heimes', 'benjamin.peterson', 'jwilk', 'ned.deily', 'Arfrever', 'martin.panter', 'serhiy.storchaka', 'edulix', 'Daniel.Garcia', 'Philippe.Godbout', 'shanxS', 'epicfaace', 'uhei3nn9', 'Jeffrey.Kintscher']
    pr_nums = ['15244']
    priority = 'high'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue21109'
    versions = ['Python 3.9']

    @DanielGarcia
    Copy link
    Mannequin Author

    DanielGarcia mannequin commented Mar 31, 2014

    The application does not validate the filenames inside the tar archive, allowing to extract files in arbitrary path. An attacker can craft a tar file to override files.

    I've view this vulnerability in libtar:
    http://lwn.net/Vulnerabilities/587141/
    I've checked that python tarfile doesn't validate the filenames so python tarfile is vulnerable to this attack.

    @DanielGarcia DanielGarcia mannequin added stdlib Python modules in the Lib dir type-security A security issue labels Mar 31, 2014
    @DanielGarcia
    Copy link
    Mannequin Author

    DanielGarcia mannequin commented Mar 31, 2014

    The solution in the patch is based on the gnutar solution to this, removing the prefix when extracting and adding.

    @ned-deily
    Copy link
    Member

    Setting as release blocker pending evaluation.

    @tiran
    Copy link
    Member

    tiran commented Mar 31, 2014

    It's a known and well-documented behavior of the tar module:

    https://docs.python.org/2.7/library/tarfile.html#tarfile.TarFile.extractall

    @vstinner
    Copy link
    Member

    It's a known and well-documented behavior of the tar module

    Would it possible to disable this behaviour by default, and only enable ti explicitly? The tar command line program has for example the -P / --absolute-paths option.

    @serhiy-storchaka
    Copy link
    Member

    Yes, this behavior is documented, but still it is desirable to fix it. The tar utility has a lot of switches which controls extracting and by default it prevents three ways of attack (absolute names, '..' and symlinks), but there are other possible ways of attack. This is complex issue and I'm working on it. See also bpo-19974.

    In any case we should be very careful because every protection against attack changes a behavior (which can be safe if you know what you do), so perhaps we should add parameters which controls behavior. This is possible only in new Python version.

    @bitdancer
    Copy link
    Member

    Note that any issues here should also be considered for zipfile and shutil. (Well, shutil can just use the other two once the security is available.) See bpo-20907.

    @tiran
    Copy link
    Member

    tiran commented Mar 31, 2014

    Don't forget about SUID and SGID, too.

    @gustaebel
    Copy link
    Mannequin

    gustaebel mannequin commented Apr 6, 2014

    In the past, our answer to these kinds of bug reports has always been that you must not extract an archive from an untrusted source without making sure that it has no malicious contents. And that tarfile conforms to the posix specifications with respect to extraction of files and pathname resolution. That's why we put this prominent warning in the documentation, and I think its advice still holds.

    I don't think that this issue should be marked as a release blocker, because the way tarfile currently works was a conscious decision, not an accident. tarfile does what it is designed to do: it processes a sequence of instructions to store a number of files in the filesystem. So the attack that is described by Daniel Garcia exploits neither a bug in tarfile nor a loophole in the tar archive format. A necessary condition for this attack to work is that the attacker has to trick the user into extracting the malicious archive first. After that, tarfile interprets the contained instructions word-for-word but still only within the boundaries defined by the user's privileges.

    I think it is obvious that it is potentially dangerous to extract tar archives we didn't create ourselves, because we actually give another person direct access to our filesystem. tarfile could mitigate some of the adverse effects, but this will not change the fact that it remains unsafe to use tarfile to a certain degree unless you use it with your own data or take reasonable precautions.

    Anyway, if we come to the conclusion that we want to eliminate this kind of attack, we must be aware that there is a lot more to do than that. tarfile as it is today is vulnerable to all known attacks against tar programs, and maybe even a few more that rely on its specific implementation.

    1. Path traversal:

      The archive contains files names e.g. /etc/passwd or ../etc/passwd.

    2. Symlink file attack:

      foo links to /etc/passwd.
      Another member named foo follows, its data overwrites the target file's data.

    3. Symlink directory attack:

      foo links to /etc.
      The following member foo/passwd overwrites /etc/passwd.

    4. Hardlink attack:

      Hardlink member foo links to /etc/passwd.
      tarfile creates the hardlink to /etc/passwd because it cannot find it inside the archive and falls back to the one in the filesystem.
      Another file named foo follows, its data overwrites /etc/passwd's data.

    5. Permission manipulation:

      The archive contains an executable that is placed somewhere in PATH with its setuid flag set, so that an unprivileged user is able to gain root privileges.

    6. Device file attacks:

      The archive contains a device node foo with the same major and minor numbers as an attached device.
      Another member named foo follows, its data is written to the device.

    7. Huge zero file attacks:

      Bzip2 and lzma allow it to store huge blobs of repetetive data in tiny archives. When unpacked this data may fill up an entire filesystem.

    8. Excessive memory usage:

      tarfile saves one TarInfo object per member it finds in an archive. If the archive contains several millions of members, this may fill up the memory.

    9. Saving a huge sparse file:

      tarfile is unable to detect holes in sparse files and thus cannot store them efficiently. Archiving a huge sparse file can take very long and may lead to a very big archive that fills up the filesystem.

    Additionally, there are more issues mentioned in the GNU tar manual:

    https://www.gnu.org/software/tar/manual/html_node/Security.html

    In conclusion, I like to emphasize that tarfile is a library, it is no replacement for GNU tar. And as a library it has a different focus, it is merely a building block for an application, and has to be used with a little bit of responsibility. And even if we start to implement all possible checks, I'm afraid we never can do without a warning in the documentation that reminds everyone to keep an eye on what they're doing.

    @larryhastings
    Copy link
    Contributor

    Thank you Lars for your thorough reply.

    While I agree that this isn't a release blocker, as it was clearly designed to behave this way... it seems to me that it wouldn't take much to make the tarfile module a lot safer. Specifically:

    • Don't allow creating files whose absolute path is not under the
      destination.
    • Don't allow creating links (hard or soft) which link to a path
      outside of the destination.
    • Don't create device nodes.

    This would fix your listed attacks 1-6. The remaining attacks you cite are denial-of-service attacks; while they're undesirable, they shouldn't compromise the security of the machine. (I suppose we could even address those, adding "reasonable" quotas for disk space and number of files.)

    I doubt that would make tarfile secure. But maybe "practicality beats purity"?

    @vadmium
    Copy link
    Member

    vadmium commented Apr 17, 2014

    Seems like shutil._unpack_tarfile() is affected. I guess it could at least do with one of those warnings in the documentation for make_archive().

    The patch for this bug looks a bit over enthusiastic, for example skip_prefixes("blaua../stuff") would incorrectly strip the first bit and just return "stuff".

    It seems there might already be plenty of existing code to check for bad paths. Examples that come to mind:

    • http.server.SimpleHTTPRequestHandler.translate_path()
    • zipfile.ZipFile._extract_member()
    • shutil._unpack_zipfile()

    This code either ignores the bad path elements, or ignores the whole path. Perhaps some of it could be recycled into a common function somewhere, rather than implementing it all over again for tar files.

    I have written my own function joinpath() to do this sort of checking, which you are welcome to use:

    https://bitbucket.org/vadmium/pyrescene/src/34264f6/rescene/utility.py#cl-217

    You would call it with something like joinpath(tarpath.split("/"), osdir).

    @edulix
    Copy link
    Mannequin

    edulix mannequin commented Apr 26, 2014

    Do we have any final decision on what's the best approach to solve this? I see some possibilities:

    a) leave the issue to the library user. I think that's a not good solution security-wise as many will be unaware of the problem and this promotes code duplication for the fix. On the other hand, this does not change default behavior.

    b) fix the problem as proposed in the patch sent by Daniel. This makes the tarfile secure against this kind of attacks. It does change the behavior and doesn't allow to extract in arbitrary paths, though.

    c) fix the problem so that by default extracting in arbitrary paths is not allowed, but allow somehow to do that optionally. This way we change the default behavior but provide an easy fix for those that depend on that functionality.

    d) do not change the default, but provide a well documented and easy way to activate the safety checks that fix this kind of attacks. The advantage is that it doesn't change the default behavior, the disadvantage is that many people will have to modify their code to be secure, and that the default is not very secure.

    For what is worth, I believe either b or c should be chosen to fix this issue.

    @edulix
    Copy link
    Mannequin

    edulix mannequin commented Apr 26, 2014

    Also, I guess this patch solves and is closely related to bpo-1044 which was, at the time (2007), considered "not a bug".

    @gustaebel
    Copy link
    Mannequin

    gustaebel mannequin commented May 1, 2014

    Let me present for discussion a proposal (and a patch with documentation) with an approach that is a little different, but in my opinion the most effective. I hope that it will appeal to all involved.

    My proposal consists of a new class SafeTarFile, that is a subclass and drop-in replacement for the TarFile class and can be employed whenever the user feels the necessity. It can be used the same way as TarFile, with the difference that SafeTarFile is equipped with a wide range of tests and as soon as it detects anything bad it interrupts the current operation with a SecurityError exception. This way damage is effectively averted, and it is up to the developer to decide whether he rejects the archive altogether (which is the obvious and recommended measure) or he wants to continue to process it in a subsequent step (on his own responsibility).

    To simplify a few common operations, SafeTarFile has three more methods: analyze(), filter() and is_safe(). These methods will allow access to the archive without SecurityError exceptions being raised. The analyze() method is a kind of low-level iterator that produces each TarInfo object together with a list of warnings (if the member is bad) as a tuple. This gives a developer access to all the information he needs to implement his own more differentiated way of handling bad archives. The filter() method is a convenience method that provides an iterator over all the "good" members of an archive leaving out all the "bad" ones. It can be used as an argument to SafeTarFile.extractall() for example. is_safe() is a high-level shortcut method that reduces the result of the analysis to a simple True or False.

    SafeTarFile has a variety of checks that test e.g. for bad pathnames, bad permissions and duplicate files. Also, to prevent denial-of-service scenarios, it enforces user-defined limits upon the archive, such as a maximum number of files or a maxmimum size of unpacked data.

    The main advantage of this approach is the higher degree of security. The practice of rewriting paths (e.g. like in Daniel.Garcia's patch) is error-prone, has side-effects and is hard to maintain because of its tendency towards regression. It just adds another layer of complexity to an already complex and delicate problem.

    SafeTarFile (or whatever it will be called) is backward compatible and easy to maintain, because it is an isolated addition to the tarfile module. It is easily subclassable to add more tests. It can be used as a standalone tool to check for bad archives and possible denial-of-service scenarios. Its analyze() method tells the user exactly what's wrong with the archive instead of keeping it away from him. Instead of silently extracting files to locations they weren't expected to be stored (i.e. after "fixing" their pathnames), SafeTarFile simply refuses to extract them at all. This way it is far more transparent and understandable to the user what happens.

    Feedback is welcome.

    @gustaebel gustaebel mannequin removed the release-blocker label May 1, 2014
    @gustaebel gustaebel mannequin self-assigned this May 1, 2014
    @tiran tiran added the 3.7 (EOL) end of life label Sep 24, 2016
    @vadmium
    Copy link
    Member

    vadmium commented Sep 24, 2016

    bpo-17102 is open about the specific problem of escaping the destination directory. Maybe it is a duplicate, but this bug also discusses other problems.

    @vadmium
    Copy link
    Member

    vadmium commented Mar 11, 2017

    bpo-29788 proposes an option to disable the vulnerability in the CLI

    @taleinat
    Copy link
    Contributor

    Lars, a huge +1 from me for your suggested approach and patch. I'd like to work this into a review-ready PR.

    The patch is a great step forward but still a ways from being ready for a PR: It is missing tests entirely and there are still several important methods of SafeTarFile which don't use the safety logic, e.g. next() and getmemebers().

    Lars, would you be interested in continuing to work on this?

    @taleinat taleinat added the 3.8 (EOL) end of life label Aug 27, 2018
    @psyker156
    Copy link
    Mannequin

    psyker156 mannequin commented Aug 27, 2018

    Lars, I think the suggested approach is great. Documentation for the tarfile class should be changed in order to direct user to the "safe" version with an relevant warning. A bit like what is done for PRNG safety.
    As stated by Eduardo an optional "safe" parameter to opt into safe mode could also be an interesting approach.

    @jwilk
    Copy link
    Mannequin

    jwilk mannequin commented Aug 28, 2018

    I've tested Lars's patch against my collection of sly tarballs:
    https://github.com/jwilk/path-traversal-samples

    SafeTarFile defeated most, but not all attacks.
    It still allows directory traversal for these two tarfile:

    1. https://github.com/jwilk/path-traversal-samples/releases/download/0/dirsymlink2a.tar

    lrwxrwxrwx cur -> .
    lrwxrwxrwx par -> cur/..
    -rw-r--r-- par/moo

    1. https://github.com/jwilk/path-traversal-samples/releases/download/0/dirsymlink2b.tar

    lrwxrwxrwx cur -> .
    lrwxrwxrwx cur/par -> ..
    -rw-r--r-- par/moo

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Sep 10, 2018

    A. Regrading Jakub's tests, I suppose the changes needed are to
    for every name in tar
    i) find reasonable occurrence of symlink's name and replace it with smylink's linkname
    ii) convert it to normal path and then check for relative / absolute paths

    B. Jakub, are your tests exhaustive or there maybe some corner cases missing? I think we don't have tests for hardlinks, yet. Also I think, we will need tests for CHRTYPE, FIFOTYPE and BLKTYPE of tar's typeflag[0]

    [0] http://www.gnu.org/software/tar/manual/html_node/Standard.html

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Sep 13, 2018

    1. I have done some changes to Lar's patch to address class of bugs which Jakub found.
      Attached patch safetarfile-2.diff
      Patch is for code only and is work in progress.

    2. However, there maybe several edge cases which have not been covered. Going by types of bugs we want to catch:

    • Don't allow creating files whose absolute path is not under the
      destination.
    • Don't allow creating links (hard or soft) which link to a path
      outside of the destination.
    • Don't create device nodes.

    I suspect there may be more so which haven't been mentioned yet, one of which I have listed below.

    1. Now, on to patch, Safetar now tries to keep a list of symlinks it has seen so far and tries to figure effective path of current name which may use symlinks. This approach does address bugs found in Jakub's comment, details below, but when symlink's link has a symlink in it, there are cases which this impl. let slip.

    For example:
    # tar -tvf dirsym_rouge.tar
    drwxr-xr-x root/root 0 2018-09-12 19:03 dirsym/
    lrwxrwxrwx root/root 0 2018-09-12 18:39 dirsym/sym -> .
    lrwxrwxrwx root/root 0 2018-09-12 19:02 dirsym/symsym3 -> ../dirsym/sym/../..

    This escapes the check since, given name "../dirsym/sym/../.." translates to "..", ideally this should have given relative link warning.
    Above symlink is valid.
    

    But for:
    # tar -tvf dirsym.tar
    drwxr-xr-x root/root 0 2018-09-12 18:44 dirsym/
    lrwxrwxrwx root/root 0 2018-09-12 18:44 dirsym/sym -> .
    lrwxrwxrwx root/root 0 2018-09-12 18:44 dirsym/symsym -> dirsym/sym/../..

    This fails with warning of relative link name, as expected.
    given name "dirsym/sym/../.." translates to "../.."
    However, the symlink points to invalid path which may or maynot be useful.
    
    1. Regarding bugs reported by Jakub, following enumerates the effective name that gets computed by Safetar:

      absolute1.tar
      "/tmp/moo" translates to "/tmp/moo"

      absolute2.tar
      "//tmp/moo" translates to "//tmp/moo"

      dirsymlink.tar
      "tmp" translates to "tmp"
      "/tmp" translates to "tmp"

      dirsymlink2a.tar
      "cur" translates to "cur"
      "." translates to "."
      "par" translates to "par"
      "cur/.." translates to ".."
      "par/moo" translates to "../moo"

      dirsymlink2b.tar
      "cur" translates to "cur"
      "." translates to "."
      "cur/par" translates to "par"
      ".." translates to ".."
      "par/moo" translates to "../moo"

      relative0.tar
      "../moo" translates to "../moo"

      relative2.tar
      "tmp/../../moo" translates to "../moo"

      symlink.tar
      "moo" translates to "moo"
      "/tmp/moo" translates to "/tmp/moo"

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Sep 14, 2018

    Figured a fix for the bug I found, trick was to keep track of current working dir of symlink it was trying to evaluate.

    Attached patch: safetarfile-3.diff
    Patch is for code only.

    I'd like to see this go thorough, and would appreciate feedback.

    @taleinat
    Copy link
    Contributor

    For one thing, the new diffs are still missing tests.

    Tests should include, at the least:

    1. *Safely* testing SafeTarFile against examples of problematic tarballs. Perhaps from Jakub's collection of "sly" tarballs could be used, assuming those could be licensed appropriately.
    2. SafeTarFile should pass all of the normal TarFile tests.

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Sep 18, 2018

    I can't use Jakub's repo (or Makefile from that repo) directly because it relies on tar, which doesn't look like dependency for building Python. I can make similar tarballs but I am not sure how licensing will work. I can add tarballs for the cases I discovered.

    As of now, I'll move ahead with my version of tarballs and will change them for licensing requirement as needed before merging it to source code.

    Would you prefer a PR over a patch?

    @taleinat
    Copy link
    Contributor

    I am not a lawyer, but to the best of my understanding, using such tarballs would be fine. Since Jakub's repo only provides code to generate archive files but doesn't include actual archive files, and the generation code is licensed via the MIT license, we are free to use the code to generate archive files, which would then not fall under the copyright of the author of the generation code (Jakub).

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Sep 26, 2018

    Added tests.
    Patch file: safetarfile-4.diff

    Following works with 456 tests passed, after doing make clean && make

    ./python -m unittest -v test.test_tarfile

    Attached patch is on top of master's commit:
    commit 2aaf98c (HEAD -> master, origin/master, origin/HEAD)
    Author: INADA Naoki <[email protected]>
    Date: Wed Sep 26 12:59:00 2018 +0900

    @taleinat
    Copy link
    Contributor

    shashank, you're making good progress on this!

    The tests should also put SafeTarFile through all of the tests for TarFile, considering that it is being described as a drop-in replacement. You should look through the existing tests for other modules which do similar things for methods to implement this cleanly, e.g. Lib/tests/test_binascii.py and Lib/tests/datetimetester.py.

    @shanxS
    Copy link
    Mannequin

    shanxS mannequin commented Oct 10, 2018

    It won't exactly be drop-in replacement.

    I mean if users decide to replace Tarfile with SafeTarFile, existing code may break since there might be cases where dodgy tarballs are acceptable and/or used then SafeTarFile.open will throw an exception.

    Having said that, I am refactoring the tests right now since the test file is ~3000 lines and adding SafeTarFile tests for every TarFile test is cluttering it.

    @taleinat
    Copy link
    Contributor

    Having said that, I am refactoring the tests right now since the test file is ~3000 lines and adding SafeTarFile tests for every TarFile test is cluttering it.

    This must be done without adding much test code and with minimal changes to the existing tests. That's why I referenced some existing places in our code where similar things are done in tests, please take a look at those.

    @tiran
    Copy link
    Member

    tiran commented Feb 6, 2019

    There is some new research on the topic, see https://snyk.io/research/zip-slip-vulnerability, snyk/zip-slip-vulnerability#4 (comment) and BPO bpo-35909

    @uhei3nn9
    Copy link
    Mannequin

    uhei3nn9 mannequin commented Feb 8, 2019

    Is there any update on this? Will this be fixed in the next release?

    Having a code execution vulnerability (yes it is!) in python for 5 years does not really spark confidence...

    @taleinat
    Copy link
    Contributor

    Is there any update on this? Will this be fixed in the next release?

    There was progress made as described on this issue, but there is yet work to be done, and no-one seems to be taking this upon themselves at the moment.

    I agree that it would be great to have this in 3.8.

    @taleinat taleinat removed the 3.7 (EOL) end of life label Feb 12, 2019
    @epicfaace
    Copy link
    Mannequin

    epicfaace mannequin commented Aug 13, 2019

    I've added a PR in which I'm working on adding in the tests. Wanted to make sure this is the approach you had in mind? It wasn't as simple as how tests are handled in, say, test_binascii.py, because over there there was only one class that handled the main test suite, while in this file, there are multiple.

    @epicfaace
    Copy link
    Mannequin

    epicfaace mannequin commented Aug 13, 2019

    SafeTarFile does not pass the existing tests, mainly because the existing file Lib/test/tarfiletestdata/testtar.tar seems to be "unsafe", producing errors like these:

    tarfile.SecurityError: <TarInfo 'ustar/blktype' at 0x7fb9119b3bb0>: block device

    tarfile.SecurityError: <TarInfo 'ustar/regtype' at 0x7fb9119b3910>: duplicate name

    It seems like the solution here is to remove block devices and duplicate names from testtar.tar. However, is this desirable -- do we need to keep these in for the tests for TarFile?

    @epicfaace epicfaace mannequin added 3.9 only security fixes and removed 3.8 (EOL) end of life labels Aug 13, 2019
    @vstinner
    Copy link
    Member

    What is the status of this issue?

    @gpshead
    Copy link
    Member

    gpshead commented Sep 21, 2022

    fwiw this is likely https://nvd.nist.gov/vuln/detail/CVE-2007-4559.

    @brandsimon
    Copy link

    There is now a bot (TrellixVulnTeam) which automatically opens pull requests [0,1] to fix this problem.
    It seams like the bot took code from here [2], but the code itself is still vulnerable if the archive contains symlinks to outside of the target directory.
    This just demonstrates how difficult it is for a project to check for malicious tar files.

    Is anyone working on this right now?

    [0] https://github.com/brandsimon/verify-squash-root/pull/10
    [1] Project-Awaken/android_external_boringssl#1
    [2] https://gist.github.com/Kasimir123/9bdc38f2eac9a19f83ef6cc52c7ce82e?

    @encukou
    Copy link
    Member

    encukou commented May 3, 2023

    PEP-706 (Filter for tarfile.extractall) has been implemented in #102950. See the added docs.

    Python 3.12, and security updates to some earlier releases, will allow users to avoid CVE-2007-4559 by changing their code/settings.
    Python 3.12 will emit a warning urging people to do that.
    Python 3.14 will fix CVE-2007-4559 by making safer behaviour the default.

    @vstinner
    Copy link
    Member

    vstinner commented May 4, 2023

    @encukou encukou closed this as completed yesterday

    Yahoo! Thanks for your tenacity to go to the root of these issues, design an API for it and a smooth transition, and implementing it!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-security A security issue
    Projects
    Status: Done
    Development

    No branches or pull requests