diff --git a/.gitignore b/.gitignore index 3ceaf73e..7820c03f 100644 --- a/.gitignore +++ b/.gitignore @@ -5,3 +5,4 @@ target/ .classpath .java-version +antisamy.iml diff --git a/README.md b/README.md index 15f84fbd..3b36e3eb 100644 --- a/README.md +++ b/README.md @@ -24,11 +24,11 @@ Chances are that your site’s use case for AntiSamy is at least roughly compara Slashdot is a techie news site that allows users to respond anonymously to news posts with very limited HTML markup. Now, Slashdot is not only one of the coolest sites around, it’s also one that’s been subject to many different successful attacks. The rules for Slashdot are fairly strict: users can only submit the following HTML tags and no CSS: ``, ``, ``, ``, `
`. -Accordingly, we’ve built a policy file that allows fairly similar functionality. All text-formatting tags that operate directly on the font, color or emphasis have been allowed. +Accordingly, we’ve built a policy file that allows fairly similar functionality. All text-formatting tags that operate directly on the font, color, or emphasis have been allowed. 2) antisamy-ebay.xml -eBay is the most popular online auction site in the universe, as far as I can tell. It is a public site so anyone is allowed to post listings with rich HTML content. It’s not surprising that given the attractiveness of eBay as a target that it has been subject to a few complex XSS attacks. Listings are allowed to contain much more rich content than, say, Slashdot -- so it’s attack surface is considerably larger. +eBay is the most popular online auction site in the universe, as far as I can tell. It is a public site so anyone is allowed to post listings with rich HTML content. It’s not surprising that given the attractiveness of eBay as a target that it has been subject to a few complex XSS attacks. Listings are allowed to contain much more rich content than, say, Slashdot -- so it’s attack surface is considerably larger. 3) antisamy-myspace.xml @@ -38,6 +38,18 @@ MySpace was, at the time this project was born, the most popular social networki I don’t know of a possible use case for this policy file. If you wanted to allow every single valid HTML and CSS element (but without JavaScript or blatant CSS-related phishing attacks), you can use this policy file. Not even MySpace was this crazy. However, it does serve as a good reference because it contains base rules for every element, so you can use it as a knowledge base when using tailoring the other policy files. +### NOTE: Schema validation behavior change starting with AntiSamy 1.6.0 + +While working on some improvements to AntiSamy's XML Schema Definition (XSD) for AntiSamy policy files, we noticed that AntiSamy was NOT actually enforcing the XSD. So, we've CHANGED the default behavior starting with AntiSamy 1.6.0 to enforce the schema, and not continue if the AntiSamy policy is invalid. However ... + +we recognize that it might not be possible for developers to fix their AntiSamy policies right away if they are non-compliant, and yet still want to upgrade AntiSamy to pick up any security improvements, feature enhancements, and bug fixes. As such, we've provided two ways to (temporarily!) disable schema validation: + +1) Set the Java System property: owasp.validator.validateschema to false. This can be done at the command line (e.g., -Dowasp.validator.validateschema=false) or via the Java System properties file. Neither requires a code change. + +2) Change the code using AntiSamy to invoke: Policy.setSchemaValidation(false) before loading the AntiSamy policy. This is a static call so once disabled, it is disabled for all new Policy instances. + +To encourage AntiSamy users to only use XSD compliant policies, AntiSamy will always issue some type of warning when schema validation is disabled. It will either WARN that the policy is non-compliant so it can be fixed, or it will WARN that the policy is compliant, but schema validation is OFF, so validation should be turned back on (i.e., stop disabling it). + ### 3. Tailoring the policy file You may want to deploy AntiSamy in a default configuration, but it’s equally likely that a site may want to have strict, business-driven rules for what users can allow. The discussion that decides the tailoring should also consider attack surface - which grows in relative proportion to the policy file. @@ -86,6 +98,11 @@ __Important Note__: There has been much confusion about the `getErrorMessages()` The serialization and deserialization process that is critical to the effectiveness of the sanitizer is purposefully lossy and will filter out attacks via a number of attack vectors. Unfortunately, one of the tradeoffs of this strategy is that we don't always know in retrospect that an attack was seen. Thus, the `getErrorMessages()` API is there to help users understand their well-intentioned input meet the requirements of the system, not help a developer detect if an attack was present. +## Other Documentation + +Additional documentation is available on this Github project's wiki page: https://github.com/nahsra/antisamy/wiki +and the OWASP AntiSamy Project Page: https://owasp.org/www-project-antisamy/ + ## Contributing to AntiSamy ### Find an Issue? diff --git a/SECURITY.md b/SECURITY.md index 28cdc7b6..0597dc88 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -13,8 +13,8 @@ to dropping a 0-day on all applications using AntiSamy. Instead, we encourage responsible disclosure. If you wish to be acknowledged for finding the vulnerability, then please follow -this process. One of the project leaders will try to contact you within -at least 5 business days. +this process. One of the project leaders will try to contact you within 1-2 business days. + If you eventually wish to have it published as a CVE, we will also work with you to ensure that you are given proper credit with MITRE and NIST. Even if you do not wish to report the vulnerability as a CVE, we will acknowledge you when we @@ -30,4 +30,3 @@ These are the known CVEs reported for AntiSamy: * AntiSamy CVE #1 - CVE-2016-10006: XSS Bypass in AntiSamy before v1.5.5 - https://www.cvedetails.com/cve/CVE-2016-10006 * AntiSamy CVE #2 - CVE-2017-14735: XSS via HTML5 Entities in AntiSamy before v1.5.7 - https://www.cvedetails.com/cve/CVE-2017-14735 - diff --git a/pom.xml b/pom.xml index 6ad50bbe..4afb802a 100644 --- a/pom.xml +++ b/pom.xml @@ -3,8 +3,7 @@ org.owasp.antisamy antisamy jar - 1.5.13 - + 1.6.0 ossrh @@ -43,9 +42,10 @@ UTF-8 + 2021-05-03T17:04:00Z true 4.2.0 - 4.2.0 + 4.2.2 @@ -59,22 +59,9 @@ - xml-apis - xml-apis - - 1.4.01 - - - org.apache.xmlgraphics - batik-css - 1.13 - - - - commons-logging - commons-logging - - + commons-codec + commons-codec + 1.15 net.sourceforge.nekohtml @@ -100,15 +87,48 @@ + + org.apache.logging.log4j + log4j-slf4j-impl + + 2.12.1 + + + + org.slf4j + slf4j-api + + + + + org.apache.xmlgraphics + batik-css + 1.14 + + + + commons-logging + commons-logging + + + + + org.slf4j + slf4j-api + 1.7.30 + + xerces xercesImpl 2.12.1 - commons-codec - commons-codec - 1.15 + xml-apis + xml-apis + + 1.4.01 @@ -129,20 +149,35 @@ junit junit - 4.13.1 + 4.13.2 test + + + org.apache.maven.plugins + maven-assembly-plugin + 3.3.0 + org.apache.maven.plugins maven-dependency-plugin 3.1.2 + + org.apache.maven.plugins + maven-javadoc-plugin + 3.2.0 + + + true + + org.apache.maven.plugins maven-release-plugin @@ -150,6 +185,7 @@ + org.apache.maven.plugins @@ -249,7 +285,6 @@ org.apache.maven.plugins maven-javadoc-plugin - 3.2.0 attach-javadocs @@ -285,6 +320,11 @@ maven-surefire-plugin 3.0.0-M5 + + org.cyclonedx + cyclonedx-maven-plugin + 2.3.0 + com.github.spotbugs spotbugs-maven-plugin diff --git a/src/main/java/org/owasp/validator/css/CssHandler.java b/src/main/java/org/owasp/validator/css/CssHandler.java index f36dc3bb..6eee70fd 100644 --- a/src/main/java/org/owasp/validator/css/CssHandler.java +++ b/src/main/java/org/owasp/validator/css/CssHandler.java @@ -1,5 +1,5 @@ /* - * Copyright (c) 2007-2019, Arshan Dabirsiaghi, Jason Li + * Copyright (c) 2007-2021, Arshan Dabirsiaghi, Jason Li * * All rights reserved. * @@ -30,13 +30,18 @@ import java.net.URI; import java.net.URISyntaxException; -import java.util.*; +import java.util.ArrayList; +import java.util.Collection; +import java.util.LinkedList; +import java.util.List; +import java.util.ResourceBundle; import org.owasp.validator.html.InternalPolicy; import org.owasp.validator.html.Policy; import org.owasp.validator.html.ScanException; import org.owasp.validator.html.util.ErrorMessageUtil; import org.owasp.validator.html.util.HTMLEntityEncoder; + import org.w3c.css.sac.CSSException; import org.w3c.css.sac.DocumentHandler; import org.w3c.css.sac.InputSource; @@ -144,6 +149,7 @@ public CssHandler(Policy policy, LinkedList embeddedStyleSheets, */ public CssHandler(Policy policy, LinkedList embeddedStyleSheets, List errorMessages, String tagName, ResourceBundle messages) { + assert policy instanceof InternalPolicy : policy.getClass(); this.policy = (InternalPolicy) policy; this.errorMessages = errorMessages; this.messages = messages; @@ -168,9 +174,9 @@ public String getCleanStylesheet() { * @return the error messages generated during parsing */ public Collection getErrorMessages() { - return new ArrayList(errorMessages); + return new ArrayList(errorMessages); } - + /* * (non-Javadoc) * @@ -178,9 +184,9 @@ public Collection getErrorMessages() { */ public void comment(String text) throws CSSException { errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_COMMENT_REMOVED, - new Object[] { HTMLEntityEncoder.htmlEntityEncode(text) })); + messages, + ErrorMessageUtil.ERROR_COMMENT_REMOVED, + new Object[] { HTMLEntityEncoder.htmlEntityEncode(text) })); } /* @@ -194,7 +200,7 @@ public void ignorableAtRule(String atRule) throws CSSException { // CSS2+ stuff if (tagName != null) { errorMessages.add(ErrorMessageUtil.getMessage( - messages, + messages, ErrorMessageUtil.ERROR_CSS_TAG_RULE_NOTFOUND, new Object[] { HTMLEntityEncoder.htmlEntityEncode(tagName), @@ -202,11 +208,11 @@ public void ignorableAtRule(String atRule) throws CSSException { })); } else { errorMessages.add(ErrorMessageUtil.getMessage( - messages, + messages, ErrorMessageUtil.ERROR_STYLESHEET_RULE_NOTFOUND, new Object[] { HTMLEntityEncoder.htmlEntityEncode(atRule) - })); + })); } } @@ -221,22 +227,22 @@ public void importStyle(String uri, SACMediaList media, if (!policy.isEmbedStyleSheets()) { errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_IMPORT_DISABLED, - new Object[] {})); + messages, + ErrorMessageUtil.ERROR_CSS_IMPORT_DISABLED, + new Object[] {})); return; } - + try { // check for non-nullness (validate after canonicalization) if (uri == null) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, + errorMessages.add(ErrorMessageUtil.getMessage( + messages, ErrorMessageUtil.ERROR_CSS_IMPORT_URL_INVALID, new Object[] {})); - return; + return; } - + URI importedStyleSheet = new URI(uri); // canonicalize the URI @@ -244,44 +250,41 @@ public void importStyle(String uri, SACMediaList media, // validate the URL - if (!policy.getCommonRegularExpressions("offsiteURL").matches(importedStyleSheet.toString()) + if (!policy.getCommonRegularExpressions("offsiteURL").matches(importedStyleSheet.toString()) && !policy.getCommonRegularExpressions("onsiteURL").matches(importedStyleSheet.toString())) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, + errorMessages.add(ErrorMessageUtil.getMessage( + messages, ErrorMessageUtil.ERROR_CSS_IMPORT_URL_INVALID, new Object[] { HTMLEntityEncoder.htmlEntityEncode(uri) })); - return; - } - + return; + } if (!importedStyleSheet.isAbsolute()) { // we have no concept of relative reference for free form // text as an end user can't know where the corresponding // free form will end up - if (tagName != null) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_TAG_RELATIVE, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(tagName), - HTMLEntityEncoder.htmlEntityEncode(uri) })); - } else { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_STYLESHEET_RELATIVE, - new Object[] { HTMLEntityEncoder.htmlEntityEncode(uri) })); - } + if (tagName != null) { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_TAG_RELATIVE, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(tagName), + HTMLEntityEncoder.htmlEntityEncode(uri) })); + } else { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_STYLESHEET_RELATIVE, + new Object[] { HTMLEntityEncoder.htmlEntityEncode(uri) })); + } return; } - importedStyleSheets.add(importedStyleSheet); } catch (URISyntaxException use) { errorMessages.add(ErrorMessageUtil.getMessage( messages, ErrorMessageUtil.ERROR_CSS_IMPORT_URL_INVALID, new Object[] { HTMLEntityEncoder.htmlEntityEncode(uri) })); - return; } } @@ -393,22 +396,22 @@ public void startSelector(SelectorList selectors) throws CSSException { isValidSelector = validator.isValidSelector(selectorName, selector); } catch (ScanException se) { - if (tagName != null) { - errorMessages.add(ErrorMessageUtil.getMessage( + if (tagName != null) { + errorMessages.add(ErrorMessageUtil.getMessage( messages, - ErrorMessageUtil.ERROR_CSS_TAG_SELECTOR_NOTFOUND, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(selector.toString()) - })); - } else { - errorMessages.add(ErrorMessageUtil.getMessage( + ErrorMessageUtil.ERROR_CSS_TAG_SELECTOR_NOTFOUND, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(selector.toString()) + })); + } else { + errorMessages.add(ErrorMessageUtil.getMessage( messages, - ErrorMessageUtil.ERROR_STYLESHEET_SELECTOR_NOTFOUND, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(tagName), - HTMLEntityEncoder.htmlEntityEncode(selector.toString()) - })); - } + ErrorMessageUtil.ERROR_STYLESHEET_SELECTOR_NOTFOUND, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(tagName), + HTMLEntityEncoder.htmlEntityEncode(selector.toString()) + })); + } } // if the selector is valid, add to list @@ -421,25 +424,22 @@ public void startSelector(SelectorList selectors) throws CSSException { selectorCount++; - } else { - if (tagName != null) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_TAG_SELECTOR_DISALLOWED, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(tagName), - HTMLEntityEncoder.htmlEntityEncode(selector.toString()) - })); - - } else { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_STYLESHEET_SELECTOR_DISALLOWED, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(selector.toString()) - })); - } + } else if (tagName != null) { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_TAG_SELECTOR_DISALLOWED, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(tagName), + HTMLEntityEncoder.htmlEntityEncode(selector.toString()) + })); + } else { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_STYLESHEET_SELECTOR_DISALLOWED, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(selector.toString()) + })); } } } @@ -501,27 +501,23 @@ public void property(String name, LexicalUnit value, boolean important) styleSheet.append(';'); if (!isInline) { styleSheet.append('\n'); } + } else if (tagName != null) { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_TAG_PROPERTY_INVALID, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(tagName), + HTMLEntityEncoder.htmlEntityEncode(name), + HTMLEntityEncoder.htmlEntityEncode(validator + .lexicalValueToString(value)) })); } else { - - if (tagName != null) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_TAG_PROPERTY_INVALID, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(tagName), - HTMLEntityEncoder.htmlEntityEncode(name), - HTMLEntityEncoder.htmlEntityEncode(validator - .lexicalValueToString(value)) })); - } else { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_STYLESHEET_PROPERTY_INVALID, - new Object[] { - HTMLEntityEncoder.htmlEntityEncode(name), - HTMLEntityEncoder.htmlEntityEncode(validator - .lexicalValueToString(value)) })); - } - + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_STYLESHEET_PROPERTY_INVALID, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(name), + HTMLEntityEncoder.htmlEntityEncode(validator + .lexicalValueToString(value)) })); } } } diff --git a/src/main/java/org/owasp/validator/css/ExternalCssScanner.java b/src/main/java/org/owasp/validator/css/ExternalCssScanner.java index d20646c4..a5c165ea 100644 --- a/src/main/java/org/owasp/validator/css/ExternalCssScanner.java +++ b/src/main/java/org/owasp/validator/css/ExternalCssScanner.java @@ -1,5 +1,5 @@ /* - * Copyright (c) 2007-2019, Arshan Dabirsiaghi, Jason Li + * Copyright (c) 2007-2021, Arshan Dabirsiaghi, Jason Li * * All rights reserved. * @@ -32,6 +32,7 @@ import java.io.IOException; import java.io.InputStreamReader; import java.net.URI; +import java.nio.charset.Charset; import java.util.ArrayList; import java.util.LinkedList; import java.util.ResourceBundle; @@ -42,11 +43,13 @@ import org.apache.http.client.methods.HttpGet; import org.apache.http.impl.client.HttpClientBuilder; import org.apache.http.util.EntityUtils; + import org.owasp.validator.html.InternalPolicy; import org.owasp.validator.html.Policy; import org.owasp.validator.html.ScanException; import org.owasp.validator.html.util.ErrorMessageUtil; import org.owasp.validator.html.util.HTMLEntityEncoder; + import org.w3c.css.sac.InputSource; public class ExternalCssScanner extends CssScanner { @@ -66,106 +69,95 @@ public ExternalCssScanner(InternalPolicy policy, ResourceBundle messages) { * @throws ScanException if an error occurs during scanning */ protected void parseImportedStylesheets(LinkedList stylesheets, CssHandler handler, - ArrayList errorMessages, int sizeLimit) throws ScanException { - - int importedStylesheets = 0; - - // if stylesheets were imported by the inline style declaration, - // continue parsing the nested styles. Note this only happens - // if CSS importing was enabled in the policy file - if (!stylesheets.isEmpty()) { - // Ensure that we have appropriate timeout values so we don't - // get DoSed waiting for returns - int timeout = DEFAULT_TIMEOUT; - try { - timeout = Integer.parseInt(policy.getDirective(Policy.CONNECTION_TIMEOUT)); - } catch (NumberFormatException nfe) { - } - - RequestConfig requestConfig = RequestConfig.custom() - .setSocketTimeout(timeout) - .setConnectTimeout(timeout) - .setConnectionRequestTimeout(timeout) - .build(); - - HttpClient httpClient = HttpClientBuilder.create(). - disableAutomaticRetries(). - disableConnectionState(). - disableCookieManagement(). - setDefaultRequestConfig(requestConfig). - build(); - - int allowedImports = Policy.DEFAULT_MAX_STYLESHEET_IMPORTS; - try { - allowedImports = Integer.parseInt(policy.getDirective("maxStyleSheetImports")); - } catch (NumberFormatException nfe) { - } - - while (!stylesheets.isEmpty()) { - + ArrayList errorMessages, int sizeLimit) throws ScanException { + + int importedStylesheets = 0; + + // if stylesheets were imported by the inline style declaration, + // continue parsing the nested styles. Note this only happens + // if CSS importing was enabled in the policy file + if (!stylesheets.isEmpty()) { + // Ensure that we have appropriate timeout values so we don't + // get DoSed waiting for returns + int timeout = DEFAULT_TIMEOUT; + try { + timeout = Integer.parseInt(policy.getDirective(Policy.CONNECTION_TIMEOUT)); + } catch (NumberFormatException nfe) { + } + + RequestConfig requestConfig = RequestConfig.custom() + .setSocketTimeout(timeout) + .setConnectTimeout(timeout) + .setConnectionRequestTimeout(timeout) + .build(); + + HttpClient httpClient = HttpClientBuilder.create(). + disableAutomaticRetries(). + disableConnectionState(). + disableCookieManagement(). + setDefaultRequestConfig(requestConfig). + build(); + + int allowedImports = Policy.DEFAULT_MAX_STYLESHEET_IMPORTS; + try { + allowedImports = Integer.parseInt(policy.getDirective("maxStyleSheetImports")); + } catch (NumberFormatException nfe) { + } + + while (!stylesheets.isEmpty()) { + URI stylesheetUri = (URI) stylesheets.removeFirst(); - + if (++importedStylesheets > allowedImports) { - errorMessages.add(ErrorMessageUtil.getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_IMPORT_EXCEEDED, - new Object[] { - HTMLEntityEncoder - .htmlEntityEncode(stylesheetUri - .toString()), - String.valueOf(allowedImports) })); - continue; + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_IMPORT_EXCEEDED, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(stylesheetUri.toString()), + String.valueOf(allowedImports) })); + continue; } - + HttpGet stylesheetRequest = new HttpGet(stylesheetUri); - + byte[] stylesheet = null; try { - // pull down stylesheet, observing size limit - HttpResponse response = httpClient.execute(stylesheetRequest); - stylesheet = EntityUtils.toByteArray(response.getEntity()); - if(stylesheet != null && stylesheet.length > sizeLimit) { - errorMessages - .add(ErrorMessageUtil - .getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_IMPORT_INPUT_SIZE, - new Object[] { - HTMLEntityEncoder - .htmlEntityEncode(stylesheetUri - .toString()), - String.valueOf(policy - .getMaxInputSize()) })); - stylesheet = null; - } + // pull down stylesheet, observing size limit + HttpResponse response = httpClient.execute(stylesheetRequest); + stylesheet = EntityUtils.toByteArray(response.getEntity()); + if (stylesheet != null && stylesheet.length > sizeLimit) { + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_IMPORT_INPUT_SIZE, + new Object[] { + HTMLEntityEncoder.htmlEntityEncode(stylesheetUri.toString()), + String.valueOf(policy.getMaxInputSize()) })); + stylesheet = null; + } } catch (IOException ioe) { - errorMessages.add(ErrorMessageUtil - .getMessage( - messages, - ErrorMessageUtil.ERROR_CSS_IMPORT_FAILURE, - new Object[] { HTMLEntityEncoder - .htmlEntityEncode(stylesheetUri - .toString()) })); + errorMessages.add(ErrorMessageUtil.getMessage( + messages, + ErrorMessageUtil.ERROR_CSS_IMPORT_FAILURE, + new Object[] { HTMLEntityEncoder.htmlEntityEncode(stylesheetUri.toString()) })); } finally { - stylesheetRequest.releaseConnection(); + stylesheetRequest.releaseConnection(); } - + if (stylesheet != null) { - // decrease the size limit based on the - sizeLimit -= stylesheet.length; - - try { - InputSource nextStyleSheet = new InputSource( - new InputStreamReader(new ByteArrayInputStream( - stylesheet))); - parser.parseStyleSheet(nextStyleSheet); - - } catch (IOException ioe) { - throw new ScanException(ioe); - } - + // decrease the size limit based on the + sizeLimit -= stylesheet.length; + + try { + InputSource nextStyleSheet = new InputSource( + new InputStreamReader(new ByteArrayInputStream(stylesheet), + Charset.defaultCharset())); + parser.parseStyleSheet(nextStyleSheet); + + } catch (IOException ioe) { + throw new ScanException(ioe); + } } - } - } - } + } // end while + } // end if + } // end parseImportedStylesheets() } diff --git a/src/main/java/org/owasp/validator/html/Policy.java b/src/main/java/org/owasp/validator/html/Policy.java index 292ca4be..02916780 100644 --- a/src/main/java/org/owasp/validator/html/Policy.java +++ b/src/main/java/org/owasp/validator/html/Policy.java @@ -1,5 +1,5 @@ /* - * Copyright (c) 2007-2020, Arshan Dabirsiaghi, Jason Li, Kristian Rosenvold + * Copyright (c) 2007-2021, Arshan Dabirsiaghi, Jason Li, Kristian Rosenvold * * All rights reserved. * @@ -28,16 +28,32 @@ import java.io.File; import java.io.IOException; +import java.io.FileNotFoundException; import java.io.InputStream; import java.net.MalformedURLException; import java.net.URI; import java.net.URL; -import java.util.*; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.Callable; import java.util.regex.Pattern; +import javax.xml.XMLConstants; import javax.xml.parsers.DocumentBuilder; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.parsers.ParserConfigurationException; +import javax.xml.transform.Source; +import javax.xml.transform.stream.StreamSource; +import javax.xml.validation.Schema; +import javax.xml.validation.SchemaFactory; + +import org.apache.logging.log4j.Logger; +import org.apache.logging.log4j.LogManager; import org.owasp.validator.html.model.AntiSamyPattern; import org.owasp.validator.html.model.Attribute; @@ -45,11 +61,14 @@ import org.owasp.validator.html.model.Tag; import org.owasp.validator.html.scan.Constants; import org.owasp.validator.html.util.URIUtils; + import org.w3c.dom.Document; import org.w3c.dom.Element; import org.w3c.dom.NodeList; +import org.xml.sax.ErrorHandler; import org.xml.sax.InputSource; import org.xml.sax.SAXException; +import org.xml.sax.SAXParseException; import static org.owasp.validator.html.util.XMLUtil.getAttributeValue; @@ -61,8 +80,11 @@ public class Policy { - public static final Pattern ANYTHING_REGEXP = Pattern.compile(".*"); + private static final Logger logger = LogManager.getLogger(Policy.class); + public static final Pattern ANYTHING_REGEXP = Pattern.compile(".*", Pattern.DOTALL); + + private static final String POLICY_SCHEMA_URI = "antisamy.xsd"; protected static final String DEFAULT_POLICY_URI = "resources/antisamy.xml"; private static final String DEFAULT_ONINVALID = "removeAttribute"; @@ -101,9 +123,27 @@ public class Policy { private final TagMatcher allowedEmptyTagsMatcher; private final TagMatcher requiresClosingTagsMatcher; + /** + * XML Schema for policy validation + */ + private static volatile Schema schema = null; + private static boolean validateSchema = true; // Default is to validate schemas + public static final String VALIDATIONPROPERTY = "owasp.validator.validateschema"; + + // Support the ability to change the default schema validation behavior by setting the + // System property "owasp.antisamy.validateschema". + static { + String validateProperty = System.getProperty(VALIDATIONPROPERTY); + if (validateProperty != null) { + validateSchema = Boolean.getBoolean(validateProperty); + logger.warn("Setting AntiSamy policy schema validation to '" + validateSchema + + "' because '" + VALIDATIONPROPERTY + "' system property set to: '" + validateProperty + "'"); + } + } + /** * Get the Tag specified by the provided tag name. - * + * * @param tagName * The name of the Tag to return. * @return The requested Tag, or null if it doesn't exist. @@ -140,6 +180,25 @@ public Property getPropertyByName(String propertyName) { return cssRules.get(propertyName.toLowerCase()); } + /** + * Is XSD schema validation across all policies enabled or not? It is enabled by default. + * + * @return True if schema validation enabled. False otherwise. + */ + public static boolean getSchemaValidation() { + return validateSchema; + } + + /** + * This can enable/disable the schema validation against AntiSamy XSD for the instantiated + * policies. It is enabled by default. + * + * @param enable boolean value to specify if the schema validation should be performed. Use false to disable. + */ + public static void setSchemaValidation(boolean enable) { + validateSchema = enable; + } + /** * This retrieves a Policy based on a default location ("resources/antisamy.xml") * @@ -170,8 +229,11 @@ public static Policy getInstance(String filename) throws PolicyException { * @throws PolicyException If there is a problem parsing the input stream. */ public static Policy getInstance(InputStream inputStream) throws PolicyException { + final String logMsg = "Attempting to load policy from an input stream."; + // If schema validation is disabled, we elevate this msg to the warn level to match the + // level of the mandatory warning that will follow. We do the same below. + if (validateSchema) logger.info(logMsg); else logger.warn(logMsg); return new InternalPolicy(null, getSimpleParseContext(getTopLevelElement(inputStream))); - } /** @@ -201,10 +263,11 @@ public static Policy getInstance(File file) throws PolicyException { * @throws PolicyException If the file is not found or there is a problem parsing the file. */ public static Policy getInstance(URL url) throws PolicyException { + String logMsg = "Attempting to load policy from URL: " + url.toString(); + if (validateSchema) logger.info(logMsg); else logger.warn(logMsg); return new InternalPolicy(url, getParseContext(getTopLevelElement(url), url)); } - protected Policy(ParseContext parseContext) throws PolicyException { this.allowedEmptyTagsMatcher = new TagMatcher(parseContext.allowedEmptyTags); this.requiresClosingTagsMatcher = new TagMatcher(parseContext.requireClosingTags); @@ -230,8 +293,8 @@ protected Policy(Policy old, Map directives, Map ta protected static ParseContext getSimpleParseContext(Element topLevelElement) throws PolicyException { ParseContext parseContext = new ParseContext(); if (getByTagName(topLevelElement, "include").iterator().hasNext()) { - throw new IllegalArgumentException("A policy file loaded with an InputStream cannot contain include references"); - + throw new IllegalArgumentException( + "A policy file loaded with an InputStream cannot contain include references"); } parsePolicy(topLevelElement, parseContext); return parseContext; @@ -258,9 +321,19 @@ protected static ParseContext getParseContext(Element topLevelElement, URL baseU return parseContext; } + protected static Element getTopLevelElement(final URL baseUrl) throws PolicyException { + final InputSource source = getSourceFromUrl(baseUrl); + return getTopLevelElement(source, new Callable() { + @Override + public InputSource call() throws PolicyException { + return getSourceFromUrl(baseUrl); + } + }); + } + @SuppressFBWarnings(value = "SECURITY", justification="Opening a stream to the provided URL is not " - + "a vulnerability because it points to a local JAR file.") - protected static Element getTopLevelElement(URL baseUrl) throws PolicyException { + + "a vulnerability because it points to a local JAR file.") + protected static InputSource getSourceFromUrl(URL baseUrl) throws PolicyException { try { InputSource source = resolveEntity(baseUrl.toExternalForm(), baseUrl); if (source == null) { @@ -270,7 +343,7 @@ protected static Element getTopLevelElement(URL baseUrl) throws PolicyException source.setSystemId(baseUrl.toExternalForm()); } - return getTopLevelElement( source); + return source; } catch (SAXException | IOException e) { // SAXException can't actually happen. See JavaDoc for resolveEntity(String, URL) throw new PolicyException(e); @@ -278,26 +351,70 @@ protected static Element getTopLevelElement(URL baseUrl) throws PolicyException } private static Element getTopLevelElement(InputStream is) throws PolicyException { - return getTopLevelElement(new InputSource(is)); + final InputSource source = new InputSource(is); + source.getByteStream().mark(0); + return getTopLevelElement(source, new Callable() { + @Override + public InputSource call() throws IOException { + source.getByteStream().reset(); + return source; + } + }); } - protected static Element getTopLevelElement(InputSource source) throws PolicyException { + protected static Element getTopLevelElement(InputSource source, Callable getResetSource) throws PolicyException { + // Track whether an exception was ever thrown while processing policy file + Exception thrownException = null; try { - DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); - - /** - * Disable external entities, etc. - */ - dbf.setFeature(EXTERNAL_GENERAL_ENTITIES, false); - dbf.setFeature(EXTERNAL_PARAM_ENTITIES, false); - dbf.setFeature(DISALLOW_DOCTYPE_DECL, true); - dbf.setFeature(LOAD_EXTERNAL_DTD, false); - DocumentBuilder db = dbf.newDocumentBuilder(); - Document dom = db.parse(source); - return dom.getDocumentElement(); - } catch (SAXException | ParserConfigurationException | IOException e) { + return getDocumentElementFromSource(source, true); + } catch (SAXException e) { + thrownException = e; + if (!validateSchema) { + try { + source = getResetSource.call(); + Element theElement = getDocumentElementFromSource(source, false); + // We warn when the policy has an invalid schema, but schema validation is disabled. + logger.warn("Invalid policy file: " + e.getMessage()); + return theElement; + } catch (Exception e2) { + throw new PolicyException(e2); + } + } else throw new PolicyException(e); + } catch (ParserConfigurationException | IOException e) { + thrownException = e; throw new PolicyException(e); + } finally { + if (!validateSchema && (thrownException == null)) { + // We warn when the policy has a valid schema, but schema validation is disabled. + logger.warn("XML schema validation is disabled for a valid policy. Please reenable policy validation."); + } + } + } + + private static Element getDocumentElementFromSource(InputSource source, boolean schemaValidationEnabled) + throws ParserConfigurationException, SAXException, IOException { + + DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); + + /** + * Disable external entities, etc. + */ + dbf.setFeature(EXTERNAL_GENERAL_ENTITIES, false); + dbf.setFeature(EXTERNAL_PARAM_ENTITIES, false); + dbf.setFeature(DISALLOW_DOCTYPE_DECL, true); + dbf.setFeature(LOAD_EXTERNAL_DTD, false); + + if (schemaValidationEnabled) { + getPolicySchema(); + dbf.setNamespaceAware(true); + dbf.setSchema(schema); } + + DocumentBuilder db = dbf.newDocumentBuilder(); + db.setErrorHandler(new SAXErrorHandler()); + Document dom = db.parse(source); + + return dom.getDocumentElement(); } private static void parsePolicy(Element topLevelElement, ParseContext parseContext) @@ -316,7 +433,7 @@ private static void parsePolicy(Element topLevelElement, ParseContext parseConte parseCSSRules(getFirstChild(topLevelElement, "css-rules"), parseContext.cssRules, parseContext.commonRegularExpressions); parseAllowedEmptyTags(getFirstChild(topLevelElement, "allowed-empty-tags"), parseContext.allowedEmptyTags); - parseRequiresClosingTags(getFirstChild(topLevelElement, "require-closing-tags"), parseContext.requireClosingTags); + parseRequireClosingTags(getFirstChild(topLevelElement, "require-closing-tags"), parseContext.requireClosingTags); } /** @@ -324,70 +441,105 @@ private static void parsePolicy(Element topLevelElement, ParseContext parseConte */ @SuppressFBWarnings(value = "SECURITY", justification="Opening a stream to the provided URL is not " + "a vulnerability because only local file URLs are allowed.") - private static Element getPolicy(String href, URL baseUrl) - throws PolicyException { - + private static Element getPolicy(String href, URL baseUrl) throws PolicyException { + // Track whether an exception was ever thrown while processing policy file + Exception thrownException = null; try { - InputSource source = null; + return getDocumentElementByUrl(href, baseUrl, true); + } catch (SAXException e) { + thrownException = e; + if (!validateSchema) { + try { + Element theElement = getDocumentElementByUrl(href, baseUrl, false); + // We warn when the policy has an invalid schema, but schema validation is disabled. + logger.warn("Invalid policy file: " + e.getMessage()); + return theElement; + } catch (SAXException | ParserConfigurationException | IOException e2) { + throw new PolicyException(e2); + } + } else { + throw new PolicyException(e); + } + } catch (ParserConfigurationException | IOException e) { + thrownException = e; + throw new PolicyException(e); + } finally { + if (!validateSchema && (thrownException == null)) { + // We warn when the policy has a valid schema, but schema validation is disabled. + logger.warn("XML schema validation is disabled for a valid policy. Please reenable policy validation."); + } + } + } - // Can't resolve public id, but might be able to resolve relative - // system id, since we have a base URI. - if (href != null && baseUrl != null) { + // TODO: Add JavaDocs for this new method. + @SuppressFBWarnings(value = "SECURITY", justification="Opening a stream to the provided URL is not " + + "a vulnerability because only local file URLs are allowed.") + private static Element getDocumentElementByUrl(String href, URL baseUrl, boolean schemaValidationEnabled) + throws IOException, ParserConfigurationException, SAXException { - if (!"file".equals(baseUrl.getProtocol())) { - throw new MalformedURLException( - "Only local files can be accessed with the baseURL. Illegal value supplied was: " + baseUrl); - } + InputSource source = null; + + // Can't resolve public id, but might be able to resolve relative + // system id, since we have a base URI. + if (href != null && baseUrl != null) { + + if (!"file".equals(baseUrl.getProtocol())) { + throw new MalformedURLException( + "Only local files can be accessed with the baseURL. Illegal value supplied was: " + baseUrl); + } - URL url; + URL url; + try { + url = new URL(baseUrl, href); + final String logMsg = "Attempting to load policy from URL: " + url.toString(); + if (validateSchema) logger.info(logMsg); else logger.warn(logMsg); + source = new InputSource(url.openStream()); + source.setSystemId(href); + } catch (MalformedURLException | FileNotFoundException e) { try { - url = new URL(baseUrl, href); + String absURL = URIUtils.resolveAsString(href, baseUrl.toString()); + url = new URL(absURL); source = new InputSource(url.openStream()); source.setSystemId(href); - - } catch (MalformedURLException | java.io.FileNotFoundException e) { - try { - String absURL = URIUtils.resolveAsString(href, baseUrl.toString()); - url = new URL(absURL); - source = new InputSource(url.openStream()); - source.setSystemId(href); - - } catch (MalformedURLException ex2) { - // nothing to do - } + } catch (MalformedURLException ex2) { + // nothing to do + // TODO: Is this true? Or should we at least log the original exception, or + // rethrow it? } } + } - DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); - - /** - * Disable external entities, etc. - */ - dbf.setFeature(EXTERNAL_GENERAL_ENTITIES, false); - dbf.setFeature(EXTERNAL_PARAM_ENTITIES, false); - dbf.setFeature(DISALLOW_DOCTYPE_DECL, true); - dbf.setFeature(LOAD_EXTERNAL_DTD, false); - DocumentBuilder db = dbf.newDocumentBuilder(); - Document dom; - - /** - * Load and parse the file. - */ - if (source != null) { - dom = db.parse(source); - - /** - * Get the policy information out of it! - */ - - return dom.getDocumentElement(); - } + DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); - return null; - } catch (SAXException | ParserConfigurationException | IOException e) { - throw new PolicyException(e); + /** + * Disable external entities, etc. + */ + dbf.setFeature(EXTERNAL_GENERAL_ENTITIES, false); + dbf.setFeature(EXTERNAL_PARAM_ENTITIES, false); + dbf.setFeature(DISALLOW_DOCTYPE_DECL, true); + dbf.setFeature(LOAD_EXTERNAL_DTD, false); + + // This code doesn't have the retry logic if schema validation fails because schemaValidationEnabled is + // passed in. It is up to the caller to try again, if this fails the first time (if they want to). + if (schemaValidationEnabled) { + getPolicySchema(); + dbf.setNamespaceAware(true); + dbf.setSchema(schema); + } + + DocumentBuilder db = dbf.newDocumentBuilder(); + db.setErrorHandler(new SAXErrorHandler()); + + // Load and parse the file. + if (source != null) { + Document dom = db.parse(source); + + // Get the policy information out of it! + return dom.getDocumentElement(); } + + return null; } /** @@ -422,40 +574,38 @@ private static void parseDirectives(Element root, Map directives * @param allowedEmptyTagsListNode Top level of * @param allowedEmptyTags The tags that can be empty */ - private static void parseAllowedEmptyTags(Element allowedEmptyTagsListNode, List allowedEmptyTags) throws PolicyException { + private static void parseAllowedEmptyTags(Element allowedEmptyTagsListNode, + List allowedEmptyTags) throws PolicyException { if (allowedEmptyTagsListNode != null) { - for (Element literalNode : getGrandChildrenByTagName(allowedEmptyTagsListNode, "literal-list", "literal")) { + for (Element literalNode : + getGrandChildrenByTagName(allowedEmptyTagsListNode, "literal-list", "literal")) { String value = getAttributeValue(literalNode, "value"); - if (value != null && value.length() > 0) { allowedEmptyTags.add(value); } } - } else { - allowedEmptyTags.addAll(Constants.defaultAllowedEmptyTags); - } + } else allowedEmptyTags.addAll(Constants.defaultAllowedEmptyTags); } /** * Go through section of the policy file. * - * @param requiresClosingTagsListNode Top level of - * @param requiresClosingTags The list of tags that require closing + * @param requireClosingTagsListNode Top level of + * @param requireClosingTags The list of tags that require closing */ - private static void parseRequiresClosingTags(Element requiresClosingTagsListNode, List requiresClosingTags) throws PolicyException { - if (requiresClosingTagsListNode != null) { - for (Element literalNode : getGrandChildrenByTagName(requiresClosingTagsListNode, "literal-list", "literal")) { + private static void parseRequireClosingTags(Element requireClosingTagsListNode, + List requireClosingTags) throws PolicyException { + if (requireClosingTagsListNode != null) { + for (Element literalNode : + getGrandChildrenByTagName(requireClosingTagsListNode, "literal-list", "literal")) { String value = getAttributeValue(literalNode, "value"); - if (value != null && value.length() > 0) { - requiresClosingTags.add(value); + requireClosingTags.add(value); } } - } else { - requiresClosingTags.addAll(Constants.defaultRequiresClosingTags); - } + } else requireClosingTags.addAll(Constants.defaultRequireClosingTags); } /** @@ -472,11 +622,9 @@ private static void parseGlobalAttributes(Element root, Map g String name = getAttributeValue(ele, "name"); Attribute toAdd = commonAttributes.get(name.toLowerCase()); - if (toAdd != null) { - globalAttributes1.put(name.toLowerCase(), toAdd); - } else { - throw new PolicyException("Global attribute '" + name + "' was not defined in "); - } + if (toAdd != null) globalAttributes1.put(name.toLowerCase(), toAdd); + else throw new PolicyException("Global attribute '" + name + + "' was not defined in "); } } @@ -497,9 +645,8 @@ private static void parseDynamicAttributes(Element root, Map if (toAdd != null) { String attrName = name.toLowerCase().substring(0, name.length() - 1); dynamicAttributes.put(attrName, toAdd); - } else { - throw new PolicyException("Dynamic attribute '" + name + "' was not defined in "); - } + } else throw new PolicyException("Dynamic attribute '" + name + + "' was not defined in "); } } @@ -513,15 +660,15 @@ private static void parseCommonRegExps(Element root, Map commonAttributes1, Map commonRegularExpressions1) { - for (Element ele : getByTagName(root, "attribute")) { + private static void parseCommonAttributes(Element root, Map commonAttributes1, + Map commonRegularExpressions1) { + for (Element ele : getByTagName(root, "attribute")) { String onInvalid = getAttributeValue(ele, "onInvalid"); String name = getAttributeValue(ele, "name"); @@ -531,12 +678,11 @@ private static void parseCommonAttributes(Element root, Map c final String onInvalidStr; if (onInvalid != null && onInvalid.length() > 0) { onInvalidStr = onInvalid; - } else { - onInvalidStr = DEFAULT_ONINVALID; - } - String description = getAttributeValue(ele, "description"); - Attribute attribute = new Attribute(getAttributeValue(ele, "name"), allowedRegexps, allowedValues, onInvalidStr, description); + } else onInvalidStr = DEFAULT_ONINVALID; + String description = getAttributeValue(ele, "description"); + Attribute attribute = new Attribute(getAttributeValue(ele, "name"), allowedRegexps, + allowedValues, onInvalidStr, description); commonAttributes1.put(name.toLowerCase(), attribute); } } @@ -551,7 +697,6 @@ private static List getAllowedLiterals(Element ele) { } else if (literalNode.getNodeValue() != null) { allowedValues.add(literalNode.getNodeValue()); } - } return allowedValues; } @@ -564,14 +709,13 @@ private static List getAllowedRegexps(Map comm if (regExpName != null && regExpName.length() > 0) { allowedRegExp.add(commonRegularExpressions1.get(regExpName).getPattern()); - } else { - allowedRegExp.add(Pattern.compile(value)); - } + } else allowedRegExp.add(Pattern.compile(value, Pattern.DOTALL)); } return allowedRegExp; } - private static List getAllowedRegexps2(Map commonRegularExpressions1, Element attributeNode, String tagName) throws PolicyException { + private static List getAllowedRegexps2(Map commonRegularExpressions1, + Element attributeNode, String tagName) throws PolicyException { List allowedRegexps = new ArrayList(); for (Element regExpNode : getGrandChildrenByTagName(attributeNode, "regexp-list", "regexp")) { String regExpName = getAttributeValue(regExpNode, "name"); @@ -585,16 +729,14 @@ private static List getAllowedRegexps2(Map com * one or the other, not both. */ if (regExpName != null && regExpName.length() > 0) { - AntiSamyPattern pattern = commonRegularExpressions1.get(regExpName); if (pattern != null) { allowedRegexps.add(pattern.getPattern()); } else throw new PolicyException("Regular expression '" + regExpName + "' was referenced as a common regexp in definition of '" + tagName + "', but does not exist in "); - } else if (value != null && value.length() > 0) { - allowedRegexps.add(Pattern.compile(value)); + allowedRegexps.add(Pattern.compile(value, Pattern.DOTALL)); } } return allowedRegexps; @@ -613,7 +755,7 @@ private static List getAllowedRegexp3(Map comm if (pattern != null) { allowedRegExp.add(pattern.getPattern()); } else if (value != null) { - allowedRegExp.add(Pattern.compile(value)); + allowedRegExp.add(Pattern.compile(value, Pattern.DOTALL)); } else throw new PolicyException("Regular expression '" + regExpName + "' was referenced as a common regexp in definition of '" + name + "', but does not exist in "); @@ -627,7 +769,6 @@ private static void parseTagRules(Element root, Map commonAtt if (root == null) return; for (Element tagNode : getByTagName(root, "tag")) { - String name = getAttributeValue(tagNode, "name"); String action = getAttributeValue(tagNode, "action"); @@ -639,20 +780,18 @@ private static void parseTagRules(Element root, Map commonAtt } } - private static Map getTagAttributes(Map commonAttributes1, Map commonRegularExpressions1, NodeList attributeList, String tagName) throws PolicyException { + private static Map getTagAttributes(Map commonAttributes1, Map commonRegularExpressions1, NodeList attributeList, String tagName) throws PolicyException { + Map tagAttributes = new HashMap(); for (int j = 0; j < attributeList.getLength(); j++) { - Element attributeNode = (Element) attributeList.item(j); String attrName = getAttributeValue(attributeNode, "name").toLowerCase(); if (!attributeNode.hasChildNodes()) { - Attribute attribute = commonAttributes1.get(attrName); - /* - * All they provided was the name, so they must want a common attribute. - */ + // All they provided was the name, so they must want a common attribute. if (attribute != null) { /* * If they provide onInvalid/description values here they will @@ -665,11 +804,9 @@ private static Map getTagAttributes(Map co commonAttributes1.put(attrName, changed); tagAttributes.put(attrName, changed); - } else { - throw new PolicyException("Attribute '" + getAttributeValue(attributeNode, "name") + + } else throw new PolicyException("Attribute '" + getAttributeValue(attributeNode, "name") + "' was referenced as a common attribute in definition of '" + tagName + "', but does not exist in "); - } } else { List allowedRegexps2 = getAllowedRegexps2(commonRegularExpressions1, attributeNode, tagName); @@ -678,20 +815,16 @@ private static Map getTagAttributes(Map co String description = getAttributeValue(attributeNode, "description"); Attribute attribute = new Attribute(getAttributeValue(attributeNode, "name"), allowedRegexps2, allowedValues2, onInvalid, description); - /* - * Add fully built attribute. - */ + // Add fully built attribute. tagAttributes.put(attrName, attribute); } - } return tagAttributes; } private static void parseCSSRules(Element root, Map cssRules1, Map commonRegularExpressions1) throws PolicyException { - for (Element ele : getByTagName(root, "property")){ - + for (Element ele : getByTagName(root, "property")) { String name = getAttributeValue(ele, "name"); String description = getAttributeValue(ele, "description"); @@ -711,11 +844,9 @@ private static void parseCSSRules(Element root, Map cssRules1, final String onInvalidStr; if (onInvalid != null && onInvalid.length() > 0) { onInvalidStr = onInvalid; - } else { - onInvalidStr = DEFAULT_ONINVALID; - } - Property property = new Property(name,allowedRegexp3, allowedValue, shortHandRefs, description, onInvalidStr ); + } else onInvalidStr = DEFAULT_ONINVALID; + Property property = new Property(name,allowedRegexp3, allowedValue, shortHandRefs, description, onInvalidStr); cssRules1.put(name.toLowerCase(), property); } } @@ -732,8 +863,7 @@ public Attribute getGlobalAttributeByName(String name) { } /** - * A method for returning one of the dynamic <global-attribute> entries by - * name. + * A method for returning one of the dynamic <global-attribute> entries by name. * * @param name The name of the dynamic global-attribute we want to look up. * @return An Attribute associated with the global-attribute lookup name specified, @@ -781,7 +911,7 @@ public String getDirective(String name) { /** * Resolves public and system IDs to files stored within the JAR. - * + * * @param systemId The name of the entity we want to look up. * @param baseUrl The base location of the entity. * @return A String object containing the directive associated with the lookup name, or null if none is found. @@ -845,9 +975,8 @@ private static Iterable getGrandChildrenByTagName(Element parent, Stri } private static Iterable getByTagName(Element parent, String tagName) { - if (parent == null) { - return Collections.emptyList(); - } + if (parent == null) return Collections.emptyList(); + final NodeList nodes = parent.getElementsByTagName(tagName); return new Iterable() { public Iterator iterator() { @@ -875,4 +1004,33 @@ public AntiSamyPattern getCommonRegularExpressions(String name) { return commonRegularExpressions.get(name); } + private static void getPolicySchema() throws SAXException { + if (schema == null) { + InputStream schemaStream = Policy.class.getClassLoader().getResourceAsStream(POLICY_SCHEMA_URI); + Source schemaSource = new StreamSource(schemaStream); + schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI) + .newSchema(schemaSource); + } + } + + /** + * This class is implemented to just throw an exception when + * validating the policy schema while parsing the document. + */ + static class SAXErrorHandler implements ErrorHandler { + @Override + public void error(SAXParseException arg0) throws SAXException { + throw arg0; + } + + @Override + public void fatalError(SAXParseException arg0) throws SAXException { + throw arg0; + } + + @Override + public void warning(SAXParseException arg0) throws SAXException { + throw arg0; + } + } } diff --git a/src/main/java/org/owasp/validator/html/scan/AbstractAntiSamyScanner.java b/src/main/java/org/owasp/validator/html/scan/AbstractAntiSamyScanner.java index 371557fe..8821dfdc 100644 --- a/src/main/java/org/owasp/validator/html/scan/AbstractAntiSamyScanner.java +++ b/src/main/java/org/owasp/validator/html/scan/AbstractAntiSamyScanner.java @@ -1,5 +1,5 @@ /* - * Copyright (c) 2007-2020, Arshan Dabirsiaghi, Jason Li + * Copyright (c) 2007-2021, Arshan Dabirsiaghi, Jason Li * * All rights reserved. * @@ -25,10 +25,19 @@ package org.owasp.validator.html.scan; import java.io.Writer; -import java.util.*; +import java.util.ArrayList; +import java.util.List; +import java.util.Locale; +import java.util.MissingResourceException; +import java.util.ResourceBundle; import org.apache.xml.serialize.OutputFormat; -import org.owasp.validator.html.*; + +import org.owasp.validator.html.CleanResults; +import org.owasp.validator.html.InternalPolicy; +import org.owasp.validator.html.Policy; +import org.owasp.validator.html.PolicyException; +import org.owasp.validator.html.ScanException; import org.owasp.validator.html.util.ErrorMessageUtil; public abstract class AbstractAntiSamyScanner { @@ -47,6 +56,7 @@ public abstract class AbstractAntiSamyScanner { public abstract CleanResults getResults(); public AbstractAntiSamyScanner(Policy policy) { + assert policy instanceof InternalPolicy : policy.getClass(); this.policy = (InternalPolicy) policy; } @@ -58,15 +68,15 @@ private static ResourceBundle getResourceBundle() { try { return ResourceBundle.getBundle("AntiSamy", Locale.getDefault()); } catch (MissingResourceException mre) { - return ResourceBundle.getBundle("AntiSamy", new Locale(Constants.DEFAULT_LOCALE_LANG, Constants.DEFAULT_LOCALE_LOC)); + return ResourceBundle.getBundle("AntiSamy", new Locale(Constants.DEFAULT_LOCALE_LANG, + Constants.DEFAULT_LOCALE_LOC)); } } protected void addError(String errorKey, Object[] objs) { errorMessages.add(ErrorMessageUtil.getMessage(messages, errorKey, objs)); } - - + protected OutputFormat getOutputFormat() { OutputFormat format = new OutputFormat(); @@ -74,22 +84,22 @@ protected OutputFormat getOutputFormat() { format.setOmitDocumentType(policy.isOmitDoctypeDeclaration()); format.setPreserveEmptyAttributes(true); format.setPreserveSpace(policy.isPreserveSpace()); - + if (policy.isFormatOutput()) { format.setLineWidth(80); format.setIndenting(true); format.setIndent(2); } - + return format; } - + protected org.apache.xml.serialize.HTMLSerializer getHTMLSerializer(Writer w, OutputFormat format) { - if(policy.isUseXhtml()) { + if (policy.isUseXhtml()) { return new ASXHTMLSerializer(w, format, policy); } - + return new ASHTMLSerializer(w, format, policy); } @@ -103,7 +113,7 @@ protected String trim(String original, String cleaned) { } } } - + return cleaned; } } diff --git a/src/main/java/org/owasp/validator/html/scan/Constants.java b/src/main/java/org/owasp/validator/html/scan/Constants.java index 5487a349..328bac25 100644 --- a/src/main/java/org/owasp/validator/html/scan/Constants.java +++ b/src/main/java/org/owasp/validator/html/scan/Constants.java @@ -34,7 +34,7 @@ public class Constants { public static final String DEFAULT_ENCODING_ALGORITHM = "UTF-8"; public static final Tag BASIC_PARAM_TAG_RULE; public static final List defaultAllowedEmptyTags; - public static final List defaultRequiresClosingTags; + public static final List defaultRequireClosingTags; private static final String[] allowedEmptyTags = { "br", "hr", "a", "img", "link", "iframe", "script", "object", "applet", @@ -61,7 +61,7 @@ public class Constants { List requiresClosingTagsList = new ArrayList(); requiresClosingTagsList.addAll(Arrays.asList(requiresClosingTags)); - defaultRequiresClosingTags = Collections.unmodifiableList(requiresClosingTagsList); + defaultRequireClosingTags = Collections.unmodifiableList(requiresClosingTagsList); } public static final String DEFAULT_LOCALE_LANG = "en"; diff --git a/src/main/resources/antisamy.xsd b/src/main/resources/antisamy.xsd index 4e8ceda8..40d9502b 100644 --- a/src/main/resources/antisamy.xsd +++ b/src/main/resources/antisamy.xsd @@ -4,54 +4,59 @@ - - - - - - - - - + + + + + + + + + + - + + + + + - + - + - + - + - + - + @@ -62,7 +67,7 @@ - + @@ -96,13 +101,13 @@ - + - + - + @@ -123,7 +128,7 @@ - + @@ -136,7 +141,7 @@ - + diff --git a/src/main/resources/log4j2.xml b/src/main/resources/log4j2.xml new file mode 100644 index 00000000..d2a6c7ae --- /dev/null +++ b/src/main/resources/log4j2.xml @@ -0,0 +1,14 @@ + + + + + + + + + + + + + + diff --git a/src/test/java/org/owasp/validator/html/test/AntiSamyTest.java b/src/test/java/org/owasp/validator/html/test/AntiSamyTest.java index 4477b345..0c1c7c02 100644 --- a/src/test/java/org/owasp/validator/html/test/AntiSamyTest.java +++ b/src/test/java/org/owasp/validator/html/test/AntiSamyTest.java @@ -1,8 +1,8 @@ /* * Copyright (c) 2007-2021, Arshan Dabirsiaghi, Jason Li - * + * * All rights reserved. - * + * * Redistribution and use in source and binary forms, with or without modification, * are permitted provided that the following conditions are met: * @@ -1480,3 +1480,4 @@ public void testGithubIssue62() { } } } + diff --git a/src/test/java/org/owasp/validator/html/test/PolicyTest.java b/src/test/java/org/owasp/validator/html/test/PolicyTest.java index 7b2b562a..c5643d13 100644 --- a/src/test/java/org/owasp/validator/html/test/PolicyTest.java +++ b/src/test/java/org/owasp/validator/html/test/PolicyTest.java @@ -1,19 +1,48 @@ -package org.owasp.validator.html.test; +/* + * Copyright (c) 2007-2021, Jacob Coulter, Mark Oberhaus + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without modification, + * are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this list + * of conditions and the following disclaimer. Redistributions in binary form must + * reproduce the above copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided with the distribution. + * Neither the name of OWASP nor the names of its contributors may be used to endorse + * or promote products derived from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +package org.owasp.validator.html.test; import junit.framework.TestCase; +import org.owasp.validator.html.AntiSamy; import org.owasp.validator.html.Policy; import org.owasp.validator.html.PolicyException; import org.owasp.validator.html.TagMatcher; import org.owasp.validator.html.scan.Constants; import java.io.ByteArrayInputStream; +import java.net.URL; +import static org.hamcrest.CoreMatchers.containsString; +import static org.hamcrest.MatcherAssert.assertThat; /** * This class tests the Policy functionality to show that we can successfully parse the policy file. - * - * @author Jacob Coulter & Mark Oberhaus */ public class PolicyTest extends TestCase { @@ -31,6 +60,7 @@ public class PolicyTest extends TestCase { private static final String COMMON_REGEXPS = "\n"; private static final String FOOTER = ""; + // Returns a valid policy file with the specified allowedEmptyTags private String assembleFile(String allowedEmptyTagsSection) { return HEADER + DIRECTIVES + COMMON_REGEXPS + COMMON_ATTRIBUTES + GLOBAL_TAG_ATTRIBUTES + DYNAMIC_TAG_ATTRIBUTES + TAG_RULES + CSS_RULES + allowedEmptyTagsSection + FOOTER; @@ -47,7 +77,6 @@ public void testGetAllowedEmptyTags() throws PolicyException { policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); - TagMatcher actualTags = policy.getAllowedEmptyTags(); assertTrue(actualTags.matches("td")); @@ -67,8 +96,7 @@ public void testGetAllowedEmptyTags_emptyList() throws PolicyException { } public void testGetAllowedEmptyTags_emptySection() throws PolicyException { - String allowedEmptyTagsSection = "\n" + - "\n"; + String allowedEmptyTagsSection = "\n" + "\n"; String policyFile = assembleFile(allowedEmptyTagsSection); policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); @@ -85,4 +113,155 @@ public void testGetAllowedEmptyTags_NoSection() throws PolicyException { assertTrue(policy.getAllowedEmptyTags().size() == Constants.defaultAllowedEmptyTags.size()); } + + public void testInvalidPolicies() { + + // Default is to now enforce schema validation on policy files. These tests verify + // various schema violations are detected and flagged. + String notSupportedTagsSection = "\n" + "\n"; + String policyFile = assembleFile(notSupportedTagsSection); + try { + policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); + fail("No PolicyException thrown for with schema validation enabled."); + } catch (PolicyException e) { + assertNotNull(e); + } + + String duplicatedTagsSection = "\n" + "\n"; + policyFile = assembleFile(duplicatedTagsSection); + try { + policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); + fail("No PolicyException thrown when duplicated and schema validation enabled."); + } catch (PolicyException e) { + assertNotNull(e); + } + + policyFile = assembleFile("").replace("", "").replace("", ""); + try { + policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); + fail("No PolicyException thrown when missing and schema validation enabled."); + } catch (PolicyException e) { + assertNotNull(e); + } + } + + public void testSchemaValidationToggleWithSource() { + String notSupportedTagsSection = "\n" + "\n"; + String policyFile = assembleFile(notSupportedTagsSection); + + // Disable validation + Policy.setSchemaValidation(false); + + try { + System.out.println("TESTING: A schema invalid WARNING should mention the invalid tag: "); + policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); + assertNotNull(policy); + } catch (PolicyException e) { + fail("Policy creation should not fail when schema validation is disabled."); + } + + // This one should only print a warning on the console because validation is disabled + try { + System.out.println("TESTING: A WARNING should mention that schema validation should not be disabled."); + policy = Policy.getInstance(new ByteArrayInputStream(assembleFile("").getBytes())); + assertNotNull(policy); + } catch (PolicyException e) { + fail("Policy creation should not fail when schema validation is disabled."); + } + + // Enable validation again + Policy.setSchemaValidation(true); + + try { + policy = Policy.getInstance(new ByteArrayInputStream(policyFile.getBytes())); + fail("Not supported tag on policy, but no PolicyException occurred."); + } catch (PolicyException e) { + assertNotNull(e); + } + } + + public void testSchemaValidationToggleWithUrl() { + URL urlOfValidPolicy = getClass().getResource("/antisamy.xml"); + URL urlOfInvalidPolicy = getClass().getResource("/invalidPolicy.xml"); + + // Disable validation + Policy.setSchemaValidation(false); + + try { + System.out.println("TESTING: A schema invalid WARNING should follow:"); + policy = TestPolicy.getInstance(urlOfInvalidPolicy); + assertNotNull(policy); + } catch (PolicyException e) { + fail("Policy creation should not fail for invalid policy when schema validation disabled."); + } + + // This one should only print a warning on the console because validation is disabled + try { + System.out.println("TESTING: A WARNING should mention that schema validation should not be disabled."); + policy = TestPolicy.getInstance(urlOfValidPolicy); + assertNotNull(policy); + } catch (PolicyException e) { + fail("Policy creation should not fail for valid policy when schema validation disabled."); + } + + // Enable validation again + Policy.setSchemaValidation(true); + + try { + policy = TestPolicy.getInstance(urlOfInvalidPolicy); + fail("PolicyException not thrown for policy w/invalid schema and schema validation enabled."); + } catch (PolicyException e) { + assertNotNull(e); + } + } + + public void testSchemaValidationToggleWithInclude() { + // This policy will also include invalidPolicy.xml + URL url = getClass().getResource("/emptyPolicyWithInclude.xml"); + + // Disable validation + Policy.setSchemaValidation(false); + + try { + System.out.println("TESTING: A schema invalid WARNING should follow:"); + policy = TestPolicy.getInstance(url); + assertNotNull(policy); + } catch (PolicyException e) { + fail("Policy creation should not fail for invalid policy when schema validation disabled."); + } + + // Enable validation again + Policy.setSchemaValidation(true); + + try { + policy = TestPolicy.getInstance(url); + fail("PolicyException not thrown for policy w/invalid schema and schema validation enabled."); + } catch (PolicyException e) { + assertNotNull(e); + } + } + + public void testGithubIssue66() { + // Concern is that LSEP characters are not being considered on .* pattern + // Note: Change was done in Policy loading, so test is located here + String tagRules = "" + + "" + + " " + + " " + + " " + + " " + + " " + + "" + + ""; + String rawPolicy = HEADER + DIRECTIVES + COMMON_REGEXPS + COMMON_ATTRIBUTES + GLOBAL_TAG_ATTRIBUTES + tagRules + CSS_RULES + FOOTER; + + try { + policy = Policy.getInstance(new ByteArrayInputStream(rawPolicy.getBytes())); + assertThat(new AntiSamy().scan("Content", policy, AntiSamy.DOM).getCleanHTML(), containsString("Line 1")); + assertThat(new AntiSamy().scan("Content", policy, AntiSamy.SAX).getCleanHTML(), containsString("Line 1")); + } catch (Exception e) { + fail("Policy nor scan should fail:" + e.getMessage()); + } + } } + diff --git a/src/test/resources/emptyPolicyWithInclude.xml b/src/test/resources/emptyPolicyWithInclude.xml new file mode 100644 index 00000000..9c552ac2 --- /dev/null +++ b/src/test/resources/emptyPolicyWithInclude.xml @@ -0,0 +1,11 @@ + + + + + + + + + + + \ No newline at end of file diff --git a/src/test/resources/invalidPolicy.xml b/src/test/resources/invalidPolicy.xml new file mode 100644 index 00000000..80756268 --- /dev/null +++ b/src/test/resources/invalidPolicy.xml @@ -0,0 +1,11 @@ + + + + + + + + + + + \ No newline at end of file diff --git a/src/test/resources/log4j2.xml b/src/test/resources/log4j2.xml new file mode 100644 index 00000000..4ea96fb2 --- /dev/null +++ b/src/test/resources/log4j2.xml @@ -0,0 +1,14 @@ + + + + + + + + + + + + + +