Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embed style sheets after opening embedStyleSheets should not be deleted all. #108

Closed
LiuXing-R opened this issue Oct 19, 2021 · 11 comments
Closed

Comments

@LiuXing-R
Copy link
Contributor

Set'embedStyleSheets' in the configuration:

<directive name="embedStyleSheets" value="true"/>

Input:

<!DOCTYPE html>
<html>
	<head>
		<style type='text/css'>
			@import url(https://unpkg.com/element-ui/lib/theme-chalk/index.css)
			h1 {font: 15pt "Arial"; color: blue;}
			p {font: 10pt "Arial"; color: black;}
		</style>
	</head>
	<body>
		<div>
			<h1>Title</h1>
			<p>content</p>
		</div>
	</body>
</html>

Out:

<html>
  <head>
    <style type="text/css"><![CDATA[/* */]]></style></head>
  <body>
    <div>
      <h1>Title</h1>
      <p>content</p></div></body></html>

Result: All embedded styles are deleted。

protected void parseImportedStylesheets(LinkedList<?> stylesheets, CssHandler handler,
ArrayList<String> errorMessages, int sizeLimit) throws ScanException {

I found that the parseImportedStylesheets method signature cannot override the parseImportedStylesheets method of the parent class and will never be called. This seems to be a bug.

Hi, @davewichers @spassarop Is this check deprecated?

@davewichers
Copy link
Collaborator

@spassarop - Can you review. I have no background on this.

@spassarop
Copy link
Collaborator

Apparently the ExternalCssScanner only implementation is parseImportedStylesheets method, which is not implemented in CssScanner class (empty code, just a comment saying that ExternalCssScanner implements that). Also there is no other CssScanner extension, so currently it doesn't make any sense to have two scanners.

The only difference is that by policy you can parse imported stylesheets or not, and that is when the ExternalCssScanner should come into play. As a scanner, you parse the imported sheets or not, that's it. But currently it never does. I guess I'll change this logic to make things right.

@spassarop
Copy link
Collaborator

Ok, so I checked this by debugging. Many things to note:

  1. @LiuXing-R about your test:
    a. @import syntax requires ending the rule with ;.
    b. The CSS file of that URL has a size over 200000 which is the maximum specified by the default policy.
    c. The policy must enable (by directive) imported CSS parsing, just in case you didn't enable it.
    Although regarding this considerations, the root problem persists because of how it's programmed.
  2. I moved the ExternalCssScanner logic to CssScanner and used a boolean in the constructor to know if parse imported CSS or not.
  3. By spec, @import also must be at the beginning, didn't know that, but it came up testing. Batik-CSS respects this.
  4. Although it now works, the imported styles are added after other styles that were under the import rule, so order is changed. Not sure if that makes sense.
  5. Results for some CSS URLs were very short, like parsing was stopped a few lines from the start. That may be because of two reasons or a combination of them: my tested CSS is not so standard, or Batik-CSS parser is pretty bad/outdated. Whatever the reason is, if there is an exception, parsing stops. Errors are not logged because there is no error handler set in CssScanner, unlike parser.setDocumentHandler(handler) for overriding some parsing methods. So for now there is no explanation about why it fails but you can more or less know because you have results until some imported line.

I'll make a PR that everyone can review and make any suggestions.

@LiuXing-R
Copy link
Contributor Author

Enable embedStyleSheets, and @import is at the beginning, this is no problem, I set it like this,
But what I expect is to keep these styles:

h1 {font: 15pt "Arial"; color: blue;}
p {font: 10pt "Arial"; Color: black;}

@LiuXing-R
Copy link
Contributor Author

Hi,@spassarop
There is still a problem to be aware of here.
The order of @import and h1, the order between multiple @imports, they have priority, the previous rules will be overwritten by the latter, so AntiSamy cannot change the order.

@spassarop
Copy link
Collaborator

You're right, I mentioned before that AntiSamy adds the imported style after. It first parses all styles, gathers URLs in the process, requests each style and parses it adding to the previous result. We should see if it can be done easily or if parsing results cannot be rearranged by design. However, this is a different issue, the original one was that import was not working at all. @davewichers, you think we should close+merge this here an open a new issue for that?

@davewichers
Copy link
Collaborator

Whatever you prefer. I don't care either way.

@spassarop
Copy link
Collaborator

Let's open a new one then.

@spassarop
Copy link
Collaborator

@LiuXing-R go check #113, it's OK to me from what I've tested.

@ashishkataria86
Copy link

ashishkataria86 commented Nov 25, 2024

@spassarop @davewichers
Do I need to enable <directive name="embedStyleSheets" value="true"/> to make the @media rule work?

@spassarop
Copy link
Collaborator

@ashishkataria86 I don't see how importing a remote stylesheet would work differently. The media rule still seems to not be working on the CSS parser dependency (you may have already read the explanation here) which is a pity. Unless until a big refactor is made by changing the 3rd party CSS parser or the current one gets updated by its maintainer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants