is_file check on html #307

blaaat · 2019-03-11T13:42:23Z

PHP Warning is_file(): open_basedir restriction in effect. File(<!DOCTYPE HTML>
<html>

phpwkhtmltopdf/src/Pdf.php:325 mikehaertl\wkhtmlto\Pdf::processOptions

On my production machine (CentOS 7, PHP 7.2) PHP_MAXPATHLEN is quite high. Somehow the HTML passed as header-html / footer-html triggers the open_basedir warning in the is_file() call.

Maybe better to do the other checks (html/xml/url regex or strip_tags) before the is_file call?

blaaat · 2019-03-11T13:44:05Z

For now, if anyone is interested, this is my workaround:

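use mikehaertl\tmp\File;

class Pdf extends \mikehaertl\wkhtmlto\Pdf
{
	public function setHeader($input)
	{
		return $this->setOptionHtml('header-html', $input);
	}

	public function setFooter($input)
	{
		return $this->setOptionHtml('footer-html', $input);
	}

	// Write the input to a temp file up front and pass the File object as the option value
	protected function setOptionHtml($option, $input)
	{
		// Also track the File in $_tmpFiles, like the parent class does for its own temp files
		$options[$option] = $this->_tmpFiles[] = new File($input, '.html', self::TMP_PREFIX, $this->tmpDir);
		// Return the result so setHeader()/setFooter() do not return null
		return $this->setOptions($options);
	}
}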

mikehaertl · 2019-03-11T13:56:15Z

> Maybe better to do the other checks (html/xml/url regex or strip_tags) before the is_file call?

It's tricky. We simply can't know whether what you passed is static text or a file name. Using is_file() was the only option we saw. Note that what you pass here doesn't need to be HTML. If you see a better way, feel free to suggest a change. Also see #100 where we already changed this.

mikehaertl · 2019-03-11T14:01:06Z

We could maybe add another check for HTML via the regex we already have. This would at least help in cases where a full HTML document is passed.
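The ordering discussed here can be sketched as follows. This is only an illustration, not the library's code: `classifyInput()` and the `ILLUSTRATIVE_HTML_REGEX` pattern are hypothetical names and values made up for this example. The idea is that a cheap string check runs first, so `is_file()` (and with it the open_basedir check) is only reached for strings that do not look like a full HTML document.

```php
<?php
// Hypothetical pattern for this sketch; the regex used by the library may differ.
const ILLUSTRATIVE_HTML_REGEX = '/<(?:!doctype\s+)?html/i';

// Hypothetical helper: classify an option value as 'html', 'file' or 'text'.
function classifyInput(string $input): string
{
    // Cheap string check first: no filesystem access for full HTML documents.
    if (preg_match(ILLUSTRATIVE_HTML_REGEX, $input) === 1) {
        return 'html';
    }
    // Only strings that did not match the pattern reach the filesystem check.
    if (is_file($input)) {
        return 'file';
    }
    // Anything else is treated as plain static text.
    return 'text';
}
```

Note that HTML fragments without an `<html>` tag would still fall through to `is_file()` in this sketch, which matches the limitation mentioned above.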

blaaat · 2019-03-11T14:04:27Z

Isn't the processInput() method suitable to reuse?

mikehaertl · 2019-03-11T14:23:26Z

Yeah, you're probably right. It's been a while since I wrote this. I'll try to modify things a little and find better method names. Will let you know when I've updated it. It would be great if you could then help with testing.

blaaat · 2019-03-11T14:24:58Z

Thanks! And of course!

mikehaertl · 2019-03-11T17:19:02Z

@blaaat I've created a MR with my refactorings, see the comments there. Could you give it a try?

As a side note: This changes the default behavior. Since we follow Semver this change will probably also require another major release.

blaaat · 2019-03-11T17:28:20Z

Thanks, I'll try it and let you know.

The strip_tags way to determine HTML looks risky. A path might include a tag which could be stripped:

/home/files<2018>/test.html

I'd prefer the old regex (maybe even prefixed with ^).
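A minimal sketch of the concern, assuming the check works by comparing a string with its `strip_tags()` result (the variable names are made up for this example):

```php
<?php
// A path containing something that looks like a tag.
$path = '/home/files<2018>/test.html';

// strip_tags() removes the "<2018>" segment from the path.
$stripped = strip_tags($path);

// A check of the form "the string changed, so it must be HTML"
// would therefore misclassify this path as HTML content.
$looksLikeHtml = ($stripped !== $path);

var_dump($stripped);
var_dump($looksLikeHtml);
```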

blaaat · 2019-03-11T18:18:56Z

> @blaaat I've created a MR with my refactorings, see the comments there. Could you give it a try?

Fixes my problem! Thanks.

> As a side note: This changes the default behavior. Since we follow Semver this change will probably also require another major release.

I think it's possible to fix the problem without changing the default behavior. An extra regex (prefixed with ^) won't affect any paths, because I don't think a filesystem exists that allows a path starting with <html or <!doctype.

mikehaertl · 2019-03-11T19:28:15Z

> The strip_tags way to determine HTML looks risky. A path might include a tag which could be stripped:

Good point, thanks. I somehow assumed < and > are not valid filename characters, but they are. So I've reverted to the old regex HTML check and removed the strip_tags() call completely. I've not added the ^ though, as this would definitely break BC: before, it also accepted strings with content before <html>.
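To illustrate the BC concern, here is a small sketch comparing an unanchored and an anchored pattern. Both patterns and the sample input are assumptions made up for this example and are not copied from the library:

```php
<?php
// Illustrative patterns only; the library's actual regex may differ.
const UNANCHORED_HTML = '/<(?:!doctype\s+)?html/i';
const ANCHORED_HTML   = '/^<(?:!doctype\s+)?html/i';

// Input with content (a comment) before the <html> tag.
$input = "<!-- generated header -->\n<html><body>Header</body></html>";

// The unanchored pattern finds <html> anywhere in the string.
$matchesUnanchored = preg_match(UNANCHORED_HTML, $input) === 1;

// The anchored pattern requires the string to start with <html or <!doctype html.
$matchesAnchored = preg_match(ANCHORED_HTML, $input) === 1;

var_dump($matchesUnanchored, $matchesAnchored);
```

With the anchored variant, input like this would no longer be detected as HTML and would fall through to the later checks.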

Right now it tries to stay away from is_file() as long as possible. There may still be situations where you hit your initial problem. So I've now also included the option to pass a File instance if all other methods fail.

$pdf->setOptions([
    'header-html' => new File('some complex content', '.html'),
]);

See the updated MR here: #308

Maybe you can take another look?

blaaat · 2019-03-12T06:24:27Z

Looks good and works perfectly! Thanks.

> Right now it tries to stay away from is_file() as long as possible. There may still be situations where you hit your initial problem. So I've now also included the option to pass a File instance if all other methods fail.

Nice solution!

mikehaertl · 2019-03-12T07:05:48Z

Great. Just released 2.4.0 including this change. Thanks for your help!

mikehaertl added a commit that referenced this issue Mar 11, 2019

Issue #307 Refactor check for temp file creation

0902b3e

mikehaertl added a commit that referenced this issue Mar 11, 2019

Issue #307 Revert HTML detection change

4318898

mikehaertl added a commit that referenced this issue Mar 11, 2019

Issue #307 Allow to pass File instance as option

f35cc7c

mikehaertl added a commit that referenced this issue Mar 11, 2019

Issue #307 Save reference to passed File instances

c09a20d

mikehaertl added a commit that referenced this issue Mar 12, 2019

Issue #307 Update README

96c78ad

mikehaertl added a commit that referenced this issue Mar 12, 2019

Merge pull request #308 from mikehaertl/307-is-file-check

dcd1236

mikehaertl closed this as completed Mar 12, 2019
