zipfile library doesn't extract windows zip files properly on linux #91036

Open
nimrodf mannequin opened this issue Feb 28, 2022 · 6 comments
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@nimrodf
Mannequin

nimrodf mannequin commented Feb 28, 2022

BPO 46880
Nosy @not-my-profile

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-02-28.12:48:44.967>
labels = ['type-bug', '3.8', '3.9', '3.10', '3.11', '3.7', 'library']
title = "zipfile library doesn't extract windows zip files properly on linux"
updated_at = <Date 2022-02-28.17:38:55.773>
user = 'https://bugs.python.org/nimrodf'

bugs.python.org fields:

activity = <Date 2022-02-28.17:38:55.773>
actor = 'push-f'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2022-02-28.12:48:44.967>
creator = 'nimrodf'
dependencies = []
files = []
hgrepos = []
issue_num = 46880
keywords = []
message_count = 2.0
messages = ['414193', '414211']
nosy_count = 2.0
nosy_names = ['nimrodf', 'push-f']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue46880'
versions = ['Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

@nimrodf
Mannequin Author

nimrodf mannequin commented Feb 28, 2022

Created a zip file using Powershell's Compress-Archive method.
Moved the file to Debian.
Used zipfile's extractall method to extract.
The result was a flat directory with long file names such as:
"migrated-image952821\\m4a\\runiis.ps".
I would expect instead for a "migrated-image952821" directory to be created, containing an "m4a" directory which contains "runiis.ps"
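
The report can be reproduced without PowerShell by writing an entry whose stored name uses backslashes. A minimal sketch, assuming a POSIX system and reusing the file name from the report; the helper names `build_backslash_archive` and `extract_and_list` are made up for illustration and are not part of zipfile:

```python
import io
import os
import tempfile
import zipfile

# Entry name taken from the report; the backslashes are stored literally.
ENTRY = "migrated-image952821\\m4a\\runiis.ps"


def build_backslash_archive():
    """Return the bytes of a zip whose single entry name uses backslashes.

    Assumption: this runs on POSIX. On Windows, zipfile converts the
    backslashes to '/' when writing, so the archive would be well-formed.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(ENTRY, b"placeholder contents")
    return buf.getvalue()


def extract_and_list(data, dest):
    """Extract the archive into dest and return its top-level entries."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest)
    return sorted(os.listdir(dest))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dest:
        print(extract_and_list(build_backslash_archive(), dest))
```

On Linux this lists a single top-level file whose name contains the backslashes, rather than a "migrated-image952821" directory.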

@nimrodf nimrodf mannequin added 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Feb 28, 2022
@not-my-profile
Mannequin

not-my-profile mannequin commented Feb 28, 2022

Can you attach such a .zip file so that others can reproduce the bug?

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@nimrodf1

artifacts.zip

@dignissimus
Contributor

From the Zip File Specification

4.4.17 file name: (Variable)

4.4.17.1 The name of the file, with optional relative path.
The path stored MUST NOT contain a drive or
device letter, or a leading slash. All slashes
MUST be forward slashes '/' as opposed to
backwards slashes '\' for compatibility with Amiga
and UNIX file systems etc. If input came from standard
input, there is no file name field.

Opening the archive with file-roller produces the same result as the zipfile library while displaying files but upon extraction, it creates the directories. Extracting the files with unzip also creates the directories.

I think it would be justified to leave the behaviour as it is currently, but I also think there's an argument for accommodating archives like these.

Relevant stack exchange post: https://superuser.com/questions/1382839/zip-files-expand-with-backslashes-on-linux-no-subdirectories
Relevant Microsoft.PowerShell.Archive GitHub issue: PowerShell/Microsoft.PowerShell.Archive#48

@galtgendo

galtgendo commented Nov 11, 2022

Actually, the more relevant entry is https://learn.microsoft.com/en-us/dotnet/framework/migration-guide/mitigation-ziparchiveentry-fullname-path-separator.
Given the date/version, this was fixed relatively recently.

...
Sorry, that part was mentioned in the superuser answer. It's just that it's more relevant than the article on PowerShell, as it has a far larger affected area.

@serhiy-storchaka
Member

See also #117084.

Backslashes are already partially supported as path component separators in extraction on Windows. Partially -- because it does not work for explicit directories (what #117084 is about). You can enable this on non-Windows platforms by temporarily setting os.path.altsep = '\\'. But this is a hack, and there is no obligation to support this in future.
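
A minimal sketch of that hack, wrapped in a context manager so the original value is always restored. The name `backslash_as_altsep` is hypothetical. Because it mutates a module-level attribute, it affects every caller of os.path while active and is not safe to use alongside other threads; it relies on zipfile internals that may change:

```python
import contextlib
import os
import zipfile


@contextlib.contextmanager
def backslash_as_altsep():
    """Temporarily make os.path.altsep a backslash (global, not thread-safe)."""
    saved = os.path.altsep
    os.path.altsep = "\\"
    try:
        yield
    finally:
        # Restore the platform default even if extraction fails.
        os.path.altsep = saved


def extract_windows_zip(path, dest):
    """Extract path into dest, treating backslashes in names as separators."""
    with zipfile.ZipFile(path) as zf, backslash_as_altsep():
        zf.extractall(dest)
```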

We may add an option to better control this behavior on all platforms. It should allow treating a backslash as a path component separator on non-Windows platforms or as an error on Windows, or emitting warnings, or raising errors. It may also control the processing of absolute paths, paths containing the .. component, the null character, or special Windows names like NUL or CON.
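
Until such an option exists, similar control can be approximated in user code. The sketch below is not a zipfile API: the function `extract_with_backslash_policy` and its `policy` values are hypothetical. It assumes that rewriting `ZipInfo.filename` before calling `extract()` changes only the destination path, which holds in current CPython because the entry is still located through its original name:

```python
import warnings
import zipfile


def extract_with_backslash_policy(zf, dest, policy="separator"):
    """Extract every member of an open ZipFile, handling '\\' per policy.

    policy (hypothetical values):
      "separator" - treat a backslash as a path component separator
      "warn"      - same as "separator", but emit a warning per entry
      "error"     - refuse to extract names that contain a backslash
    """
    if policy not in ("separator", "warn", "error"):
        raise ValueError(f"unknown policy: {policy!r}")
    for info in zf.infolist():
        if "\\" in info.filename:
            if policy == "error":
                raise ValueError(f"backslash in member name: {info.filename!r}")
            if policy == "warn":
                warnings.warn(f"backslash in member name: {info.filename!r}")
            # Note: this modifies the ZipInfo object held by zf in place.
            info.filename = info.filename.replace("\\", "/")
        # extract() still applies its usual handling of absolute paths
        # and '..' components to the rewritten name.
        zf.extract(info, dest)
```

This leaves the other cases mentioned above (null characters, reserved Windows names) to zipfile's existing behaviour.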
