Using addElement to add support for unsupported elements does not appear to be consistent #435

Open
ebssalvail opened this issue Jan 15, 2025 · 0 comments

ebssalvail commented Jan 15, 2025

Hello, I am trying to add support for some additional elements (many of them HTML5) that are not supported by default. To do this, I followed the customization documentation here: http://htmlpurifier.org/docs/enduser-customize.html
However, I am seeing inconsistent results when adding elements this way.

The example below demonstrates the problem.
I am using the latest version, 4.18.0.
My basic setup looks like this:

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'html5-definitions');
$config->set('HTML.DefinitionRev', 1);
if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addElement('figure', 'Block', 'Flow', 'Common');
    $def->addElement('figcaption', 'Inline', 'Flow', 'Common');
    $def->addElement('article', 'Block', 'Flow', 'Common');
}

$purifier = HTMLPurifier::instance($config);
$purifiedData = $purifier->purify($html);

The raw HTML I am testing with looks like:

<figure>test</figure>
<figcaption>test</figcaption>
<article>
    <h1>Test article heading</h1>
</article>

After running this HTML through the purifier, this is what I see being returned:

<figure>test</figure>
<figcaption>test</figcaption>
    <h1>Test article heading</h1>

As you can see, the figure and figcaption elements are preserved, as expected. However, the article element is still being removed.
How am I meant to add support for this? Is something in my config incorrect?
I have also tested with other elements such as input, label, header, footer, and section, and got the same results.

Is my only option to use another library with more HTML5 support, such as https://github.com/xemlock/htmlpurifier-html5 ?

EDIT:
After further testing, I have found that this config works and does not remove the article element:

$def = $config->getDefinition('HTML', true);
$def->addElement('figure', 'Block', 'Flow', 'Common');
$def->addElement('figcaption', 'Inline', 'Flow', 'Common');
$def->addElement('article', 'Block', 'Flow', 'Common');

However, I'm still confused about why it didn't work with my prior config, since that is the approach described in the docs.
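
One possible explanation, offered as an assumption and not confirmed in this thread: maybeGetRawHTMLDefinition() returns null when a definition with the same HTML.DefinitionID and HTML.DefinitionRev is already in the serializer cache, so addElement() calls added after the first run would be skipped, whereas getDefinition('HTML', true) rebuilds the definition each time. If that is what happened here, a minimal sketch of the first config with the revision bumped and the definition cache disabled during development might look like this (the Composer autoload path and the revision number 2 are illustrative assumptions):

```php
<?php
// Assumes HTML Purifier was installed via Composer (ezyang/htmlpurifier).
require_once 'vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'html5-definitions');
// Increment the revision whenever the addElement() calls below change,
// so a previously cached definition is not reused.
$config->set('HTML.DefinitionRev', 2);
// Development only: disable the definition cache so the block below
// runs on every request. Remove this once the definition is stable.
$config->set('Cache.DefinitionImpl', null);

if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addElement('figure', 'Block', 'Flow', 'Common');
    $def->addElement('figcaption', 'Inline', 'Flow', 'Common');
    $def->addElement('article', 'Block', 'Flow', 'Common');
}

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<article><h1>Test article heading</h1></article>');
```

If article survives with the cache disabled but is stripped again once caching is re-enabled without a revision bump, that would point to a stale cached definition as the cause.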
