Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow more granular HTML filter options for HTML attributes, not just tags #6100

Open
Tracked by #378
jenlampton opened this issue May 2, 2023 · 6 comments
Open
Tracked by #378

Comments

@jenlampton
Copy link
Member

Input filters are used to control what HTML tags a user can use. For example, the default "filtered input" filter may allow users to use <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>. This means that the node author can use tags such as <strong> to make some portion of the text bold as I've done here.

The problem is that accessible HTML should use CSS for formatting. Most Rich Text editors do this properly by writing CSS instead of the (deprecated) HTML tags. But our input filter system strips CSS classes and style attributes, leaving HTML only. And pages don't display as intended by the Rich Text editor. The only way to fix this is to replace the CSS style directives with HTML. However, many of the HTML formatting tags have been deprecated (for example <center> <font> <u> <i> are all deprecated). You can still use them, but it's not advisable. And there's just no way of accomplishing certain basic text formatting without CSS (for example, the CSS attribute "text-align: right" doesn't have an equivalent HTML tag).

So, we need to rethink the filter.module.

Class attributes should not filtered. For example, <p class="right"> seems like reasonable HTML to allow. Is there any security risk here? I can't think of one.
We need a way of specifying what attributes can or can not be used with the filter, rather than just what HTML tags can or can not be used. This is the tricky part, and will involve an entirely new way of defining filters.

See the comparable Drupal 8 issue.

@klonos
Copy link
Member

klonos commented May 2, 2023

...I thought that we already allowed attributes in the HTML filter, by adding them within the tags. Or am I thinking Drupal 9/10? 🤔

@klonos
Copy link
Member

klonos commented May 2, 2023

...for example, the "Limit allowed HTML tags and correct faulty HTML" filter in D9/10 has the following allowed tags configured out of the box for the "Basic HTML" text format:

<br> <p> <h2 id> <h3 id> <h4 id> <h5 id> <h6 id> <cite> <dl> <dt> <dd> <a hreflang href> <blockquote cite> <ul type> <ol start type> <strong> <em> <code> <li> <img src alt data-entity-uuid data-entity-type height width data-caption data-align>

Notice things like the following within that configuration:

  • <h* id> (which means that the id attribute is allowed within the various individual <h*> tags)
  • <a hreflang href>
  • <blockquote cite>
  • <ul type>
  • <ol start type>
  • <img src alt data-entity-uuid data-entity-type height width data-caption data-align>

Don't we allow the same?

@olafgrabienski
Copy link

I thought that we already allowed attributes in the HTML filter, by adding them within the tags.

Just tested in a Backdrop demo sandbox:

  • Changed <p> to <p class> in the Allowed HTML tags setting of the Filtered HTML text format.
  • Edited the About page using CKEditor source mode, and added a class to a paragraph.
  • Saved the page and checked the rendered HTML. Result:
    <p>This is a page with static content.</p>
    <p class="right">You may edit or delete it.</p>

I didn't test other attributes, but classes are allowed. However, the help text of the Allowed HMTL tags setting doesn't mention this feature.

@jenlampton
Copy link
Member Author

Don't we allow the same?

No, our default filters are all tags only. No other attributes. I created this issue because I really like the flexibility of the filter system in D9/10 and I want that for Backdrop too.

I'm pretty sure we don't allow all attributes to be defined via the filter system like Drupal 9/10, as we don't include the changes from the Drupal issue, and didn't have our own dedicated issue for it. It's possible it happened without an issue or patch (especially if it was a simplified version that, for example, always allowed class and data-* attributes -- this might have happened along with the addition of ckeditor.)

<p class="right"> might not be a good test case 1) because it's a class and 2) because ckeditor might have its own right/left classes.

I'd recommend using crazier class names, IDs on headings (as in @klonos's example) and other attributes like type, lang, autoplay, or even something random like okdnv="oeFNVPORWNG"

@olafgrabienski
Copy link

<p class="right"> might not be a good test case (...) I'd recommend using crazier class names, IDs on headings (...) and other attributes like type, lang, autoplay, or even something random (...)

As a start, I have tested other classes, e.g. "blue", and IDs on headings, like in <h3 id="first">. They work out of the box, without having to allow them.

@indigoxela
Copy link
Member

indigoxela commented Aug 25, 2023

@jenlampton per default, attributes are allowed, only "style" and "on*" are stripped out, when displaying the filtered content. (Note: They're not removed from input. Only on output.)

However, it's a bit unclear, whether it should be possible to also provide an attributes allowlist via Filter admin UI (as Drupal does).
Technically, this preg_split prevents it from working, as attributes end up in the tags list, not as nested array values.

The filter module does support attribute allow/disallow (with some quirks, I can provide examples). And that's also documented. Additionally, there are some API examples

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants