Support for compressed phar #7028

Closed
wants to merge 22 commits into from

nazar-pc
Contributor

@nazar-pc nazar-pc commented May 1, 2016

This is a big PR, so it needs an extended description.

Features:

  • Support for *.phar.gz, *.phar.bz2, *.phar.tar, *.phar.tar.gz, *.phar.tar.bz2 and *.phar.zip kinds of Phar archives, detected by content regardless of their actual extension
  • Implemented methods:
    • Phar::isValidPharFilename() with tests ported from PHP's sources
    • Phar::isCompressed() now returns the real value instead of a hardcoded false
    • Phar::setAlias() implemented as a proxy method with the necessary checks, but not actually used until write support is implemented properly
    • Phar::setStub() implemented as a proxy method, but not actually used until write support is implemented properly
    • Phar::isFileFormat() previously checked only whether the file is a plain Phar; it now also checks for Tar and Zip
    • Phar::getSupportedCompression() now returns the supported compression formats instead of an empty array
    • Phar::getSupportedSignatures() now returns the supported signatures instead of an empty array
    • PharFileInfo::isCRCChecked() is just a stub for now
  • Tar archive support was fixed to handle archives that contain entries with type flag 5 (directory entries); such archives are produced by the popular file-roller tool, which I suspect uses GNU Tar under the hood
  • Added support for include 'phar:///path/to.phar', include 'phar:///path/to.phar.gz', include 'phar:///path/to.phar.bz2' (gz and bz2 variations do not work without phar:// protocol yet)
  • For include 'phar:///path/to.phar' the stub will be used, even though the path to it is not specified.
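
The content-based format detection mentioned in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `detect_archive_format()` and its return values are hypothetical. It relies only on well-known magic bytes (gzip, bzip2, zip) and the `ustar` marker at offset 257 of a Tar header. Detection of a plain Phar, which would require scanning for the stub's `__HALT_COMPILER();` token, is deliberately left out.

```php
<?php
// Hypothetical helper (not part of the PR): guess the container or
// compression format of an archive from its leading bytes instead of
// trusting the file extension.
function detect_archive_format(string $path): string
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // A tar header block is 512 bytes; the "ustar" magic sits at offset 257.
    $head = (string) fread($fp, 512);
    fclose($fp);

    if (strncmp($head, "\x1f\x8b", 2) === 0) {
        return 'gzip';
    }
    if (strncmp($head, 'BZh', 3) === 0) {
        return 'bzip2';
    }
    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    if (substr($head, 257, 5) === 'ustar') {
        return 'tar';
    }
    // Plain Phar detection is omitted in this sketch; anything else is
    // reported as unknown.
    return 'unknown';
}
```

A caller would typically unwrap a gzip or bzip2 layer first and then run the same check on the decompressed stream to tell a Tar-based Phar from a plain one.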

Architecture decisions:

  • There was initial Tar and Zip support in the PharData class, using the __SystemLib\TarArchiveHandler and __SystemLib\ZipArchiveHandler classes.
    Similarly to those classes, a new class __SystemLib\PharArchiveHandler was added; it provides support for the plain Phar format (absorbing Phar-specific functionality from the Phar class). All three classes were refactored and unified internally.
  • The Phar and PharData classes are very similar but had different implementations of similar methods (especially the offset*() methods); they are now unified and share much more common code.
  • A lot of memory was wasted by storing the entire contents of archives in memory; streams are now used instead of strings.
    The Zip-based implementation relies on ZipArchive::getStream() and its efficiency.
    The Tar-based implementation was rewritten to be stream-based, but for maximum efficiency it needs some sort of stream slicing mechanism (otherwise there is still some overhead from copying bytes to php://temp, which is not too bad and very unlikely to eat all memory).
    The Phar-based implementation was also rewritten to use streams and, like the Tar-based one, needs stream slicing for maximum efficiency, but it is already good enough.
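
The copy-to-php://temp approach described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `slice_to_temp()` is hypothetical and is not taken from the PR. It shows why the overhead is bounded: only one entry is copied at a time, and php://temp spills to a temporary file once its in-memory limit (2 MB by default) is exceeded.

```php
<?php
// Hypothetical helper (not from the PR): expose one archive entry as its
// own stream by copying $length bytes, starting at $offset, into php://temp.
function slice_to_temp($archive, int $offset, int $length)
{
    if (fseek($archive, $offset) !== 0) {
        throw new RuntimeException("Cannot seek to offset $offset");
    }
    $entry = fopen('php://temp', 'w+b');
    // The copy is done in chunks, so the whole archive is never held in a
    // PHP string.
    $copied = stream_copy_to_stream($archive, $entry, $length);
    if ($copied !== $length) {
        fclose($entry);
        throw new RuntimeException("Expected $length bytes, copied " . var_export($copied, true));
    }
    rewind($entry);
    return $entry;
}
```

A true stream slicing mechanism would instead return a view onto the original stream without copying any bytes, which is the optimization the description says is still missing.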

Fixes #4263

nazar-pc and others added 19 commits April 30, 2016 03:32
It should fail if the `.phar` extension is not found, regardless of contents.
* added automatic content-based checks for distinguishing each format type regardless of its extension
* phar archive constructor decoupled into separate method, stubs for tar/zip-based phars added
* `Phar::isCompressed()` now returns correct result even though tar/zip-based phars are not properly supported yet
…ethods, iterators and have some common properties.

Functionality specific to Phar format moved from `Phar` class to new class `__SystemLib\PharArchiveHandler` which extends `__SystemLib\ArchiveHandler`.
`__SystemLib\ArchiveHandler` was extended with additional necessary methods in order to fulfill the need to access additional information from `Phar` class.
`__SystemLib\TarArchiveHandler`, `__SystemLib\ZipArchiveHandler` and `__SystemLib\PharArchiveHandler` unified to use the same naming for the same stuff.
`__SystemLib\PharArchiveHandler` is partially stream-based, but more work is still needed to handle big files nicely.
Added missing `PharFileInfo::isCRCChecked()` method as defined in PHP docs.
Public methods `PharFileInfo::getSize()` and `::getTimestamp()` added for internal use by the `Phar` class (PHP for whatever reason defines, for instance, `::getCompressedSize()`, but not these useful methods, so I've added them on top anyway).
…tested and unit tests should be added).

Fix for iterating Tar archives that contain entries with type flag `5` (the existing test case used a file without entries with this flag).
`Archive_Tar-1.3.11.tar.gz` has exactly the same contents as `Archive_Tar-1.3.11.tgz`, but with 3 directory entries (created using the popular GUI tool `file-roller`).
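
For context, the type flag is a single byte at offset 156 of each 512-byte Tar header block, and `5` marks a directory entry. A minimal sketch of an iterator that tolerates such entries might look like this. The function name `list_tar_files()` is hypothetical and this is not the handler code from the PR; it also ignores header checksums, the ustar name prefix field and GNU long-name extensions.

```php
<?php
// Hypothetical helper (not from the PR): list regular files in an
// uncompressed tar stream, skipping directory entries (type flag '5')
// instead of failing on them.
function list_tar_files($tar): array
{
    $files = [];
    while (true) {
        $header = fread($tar, 512);
        // A short read or an all-zero block marks the end of the archive.
        if ($header === false || strlen($header) < 512 || trim($header, "\0") === '') {
            break;
        }
        $name = rtrim(substr($header, 0, 100), "\0");
        // The size field is an octal string padded with NULs or spaces.
        $size = (int) octdec(trim(substr($header, 124, 12), "\0 "));
        $type = $header[156];
        // '0' (or NUL in old archives) is a regular file; '5' is a directory.
        if ($type === '0' || $type === "\0") {
            $files[$name] = $size;
        }
        // Entry data is padded to a multiple of 512 bytes.
        fseek($tar, (int) (ceil($size / 512) * 512), SEEK_CUR);
    }
    return $files;
}
```

Directory entries carry a size of zero, so skipping them only requires not treating them as files; the data-skipping step then moves straight to the next header.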
…s efficiently, others require stream slicing or something similar to actually avoid copying data
…`) using `new Phar()`.

Added support for `include 'phar:///path/to.phar'`, `include 'phar:///path/to.phar.gz'`, `include 'phar:///path/to.phar.bz2'` (gz and bz2 variations do not work without `phar://` protocol yet).
For `include 'phar:///path/to.phar'` the stub will be used.
* .phar.tar
* .phar.tar.gz
* .phar.tar.bz2
* .phar.zip
* .phar which is inside any of above
…thod itself adjusted to pass all the tests
…Tar-based archive.

More unification of Phar- and Tar-based archives handling.
@ghost

ghost commented May 1, 2016

This pull request has been imported into Phabricator, and discussion and review of the diff will take place at https://reviews.facebook.net/D57495

@ghost ghost added the CLA Signed label May 1, 2016
… ignoring it silently.

Alias correctness check moved into its own method.
Restored accidentally removed method description.
@hhvm-bot hhvm-bot closed this in e0713e7 May 11, 2016
if (!ret.isResource()) {
return nullptr;
}
return dyn_cast_or_null<File>(ret.asResRef());
}

virtual int access(const String& path, int mode) {

No way!!! That's so cool

@nazar-pc nazar-pc deleted the phar-compression-support branch February 15, 2017 22:55
Successfully merging this pull request may close these issues.

HHVM doesn't support tar, tar.gz, tar.bz2 and zip versions of Phar archives
3 participants