-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserve the casing of tags and attributes names. #182
Comments
Please add an option to preserve casing of tags and attributes. I've the same issue with sanitizing svg. This is one of the best tools out there and this particular shortcoming is a show stopper for my project. Any possible workarounds is also appreciated until this is fixed. Thanks! |
@zeeneir a while back I did something like this and actually forgot to open a pull request for it. I'll review it on monday and open a PR. |
@wookie41 excellent and thanks 👌 |
Can you explain how case sensitivity is a problem? <b> and <B> render the same as a simple example. Where is this a problem for you, can you give us a test case?
Regards.
--
Jim Manico
@manicode
… On Mar 27, 2020, at 10:50 AM, zeeneir ***@***.***> wrote:
Please add an option to preserve casing of tags and attributes. I've the same issue with sanitizing svg. This is one of the best tools out there and this particular shortcoming is a show stopper for my project. Any possible workarounds is also appreciated until this is fixed. Thanks!
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
@jmanico SVG renderers actually care about the casing (at least the one in Firefox) and the image simply doesn't show, when |
That’s a pretty solid test case of you ask me.
…--
Jim Manico
@manicode
On Mar 27, 2020, at 11:03 AM, Łukasz Bogaczyński ***@***.***> wrote:
@jmanico SVG renderers actually care about the casing (at least the one in Firefox) and the image simply doesn't show, when viewBox is sanitized to viewbox. I know that this isn't an svg sanitizer, but it does the job I need it to do, after creating an svg policy.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Same comment as @wookie41, Agree that this is HTML sanitizer, but I think this small change would allow proper sanitizing of SVG documents. Some renderers don't care about case, but others do. Mozilla states:
|
any updates for this issue? |
What's the list of names that need to be preserved? HTML uppercase qualified name provides a notion that we could use for a canonical name that prefers lower-case ASCII. If an element/attribute name is not defined in the HTML namespace and it is defined in the corresponding SVG or MathML namespace, then we use the preferred casing from that namespace, otherwise we convert to lower-case in a locale-insensitive manner. I don't think the xlink namespace matters. Would that be sufficient? |
Ideal would be a setting on the policy builder that indicates whether case should be preserved for tags/attributes, however, this proposal is good as well. If you go this route, you don't need us to provide a list, do you? |
@zeeneir If the bug results from a mismatch in how converted names are interpreted by HTML parsers then I don't see how more policy settings would help. |
@mikesamuel, I see. Would you proposal cover overlapping tags/attributes? |
Most of the overlaps are innocuous: I was planning on coming up with whitelists of foreign element and attribute names, and just not lower-casing if an identifier is in that set. Unless there's a need for Tiny 1.2 I'd rather go with what spec.html references:
IIUC, for SVG, the lists are Appendix F and Appendix G But I'm not super familiar with which drafts of SVG are supported by which browser vendors. |
Sounds good. Could you please include support for SVG Tiny. SVG Tiny is lightweight version of SVG, is suitable for mobile devices and is necessary for my use case. Here’s the latest spec: Element List: Attribute/Property list: |
For the record, JS used to extract mixed-case attribute and element names from the specs in lieu of data files. // https://www.w3.org/TR/SVGMobile/single-page.html#chapter-elementTable
let elementTable = document.getElementById('elementTable-elements');
let elementCells = Array.from(elementTable.querySelectorAll('tr > :first-child')).map(x => x.innerText);
function isMixedCase(x) { return !(/:/.test(x)) && /[a-z]/.test(x) && /[A-Z]/.test(x) }
elementCells.map(x => x.replace(/^\'|\'$/g, '')).filter(x => x != 'Elements' && isMixedCase(x))
// (8) ["animateColor", "animateMotion", "animateTransform", "foreignObject", "linearGradient", "radialGradient", "solidColor", "textArea"]
let attributeCells = Array.from(elementTable.querySelectorAll('tr > :nth-child(2)')).map(x => x.innerText);
Array.from(new Set(attributeCells.flatMap(x => x.split(/, /g)).filter(x => x != 'Attributes' && isMixedCase(x))));
// (37) ["externalResourcesRequired", "focusHighlight", "requiredExtensions", "requiredFeatures", "requiredFonts", "requiredFormats", "systemLanguage", "attributeName", "attributeType", "calcMode", "keySplines", "keyTimes", "repeatCount", "repeatDur", "keyPoints", "initialVisibility", "preserveAspectRatio", "syncBehavior", "syncMaster", "syncTolerance", "gradientUnits", "defaultAction", "pathLength", "mediaCharacterEncoding", "mediaContentEncodings", "mediaSize", "mediaTime", "baseProfile", "contentScriptType", "playbackOrder", "snapshotTime", "syncBehaviorDefault", "syncToleranceDefault", "timelineBegin", "viewBox", "zoomAndPan", "transformBehavior"]
//https://svgwg.org/svg2-draft/eltindex.html
function isMixedCase(x) { return !(/:/.test(x)) && /[a-z]/.test(x) && /[A-Z]/.test(x) }
Array.from(document.querySelectorAll('.element-name')).map(x => x.innerText.replace(/^\u2018|\u2019$/g, '')).filter(isMixedCase)
// (32) ["animateMotion", "animateTransform", "clipPath", "feBlend", "feColorMatrix", "feComponentTransfer", "feComposite", "feConvolveMatrix", "feDiffuseLighting", "feDisplacementMap", "feDistantLight", "feDropShadow", "feFlood", "feFuncA", "feFuncB", "feFuncG", "feFuncR", "feGaussianBlur", "feImage", "feMerge", "feMergeNode", "feMorphology", "feOffset", "fePointLight", "feSpecularLighting", "feSpotLight", "feTile", "feTurbulence", "foreignObject", "linearGradient", "radialGradient", "textPath"]
// https://svgwg.org/svg2-draft/attindex.html
function isMixedCase(x) { return !(/:/.test(x)) && /[a-z]/.test(x) && /[A-Z]/.test(x) }
Array.from(new Set(Array.from(document.querySelectorAll('.attr-name')).map(x => x.innerText.replace(/^\u2018|\u2019$/g, '')).filter(isMixedCase)))
// (52) ["attributeName", "baseFrequency", "calcMode", "clipPathUnits", "diffuseConstant", "edgeMode", "filterUnits", "gradientTransform", "gradientUnits", "kernelMatrix", "kernelUnitLength", "keyPoints", "keySplines", "keyTimes", "lengthAdjust", "limitingConeAngle", "markerHeight", "markerUnits", "markerWidth", "maskContentUnits", "maskUnits", "numOctaves", "pathLength", "patternContentUnits", "patternTransform", "patternUnits", "pointsAtX", "pointsAtY", "pointsAtZ", "preserveAlpha", "preserveAspectRatio", "primitiveUnits", "refX", "refY", "repeatCount", "repeatDur", "requiredExtensions", "specularConstant", "specularExponent", "spreadMethod", "startOffset", "stdDeviation", "stitchTiles", "surfaceScale", "systemLanguage", "tableValues", "targetX", "targetY", "textLength", "viewBox", "xChannelSelector", "yChannelSelector"]
// https://www.w3.org/Math/draft-spec/appendixi.html
let [elementList, attributeList] = document.querySelectorAll('dl');
function isMixedCase(x) { return !(/:/.test(x)) && /[a-z]/.test(x) && /[A-Z]/.test(x) }
Array.from(elementList.querySelectorAll('dt')).map(x => x.innerText).filter(isMixedCase)
// []
Array.from(attributeList.querySelectorAll('dt')).map(x => x.innerText).filter(isMixedCase)
// (2) ["definitionURL", "schemaLocation"] |
Unioning these lists gives us function union(lists) {
let a = Array.from(new Set(lists.flatMap(x => x)));
a.sort();
return a;
} Mixed case element names in SVG Tiny, SVG 2, and MathMLunion([
["animateColor", "animateMotion", "animateTransform", "foreignObject", "linearGradient", "radialGradient", "solidColor", "textArea"],
["animateMotion", "animateTransform", "clipPath", "feBlend", "feColorMatrix", "feComponentTransfer", "feComposite", "feConvolveMatrix", "feDiffuseLighting", "feDisplacementMap", "feDistantLight", "feDropShadow", "feFlood", "feFuncA", "feFuncB", "feFuncG", "feFuncR", "feGaussianBlur", "feImage", "feMerge", "feMergeNode", "feMorphology", "feOffset", "fePointLight", "feSpecularLighting", "feSpotLight", "feTile", "feTurbulence", "foreignObject", "linearGradient", "radialGradient", "textPath"],
[]
]) ⇒ [
"animateColor",
"animateMotion",
"animateTransform",
"clipPath",
"feBlend",
"feColorMatrix",
"feComponentTransfer",
"feComposite",
"feConvolveMatrix",
"feDiffuseLighting",
"feDisplacementMap",
"feDistantLight",
"feDropShadow",
"feFlood",
"feFuncA",
"feFuncB",
"feFuncG",
"feFuncR",
"feGaussianBlur",
"feImage",
"feMerge",
"feMergeNode",
"feMorphology",
"feOffset",
"fePointLight",
"feSpecularLighting",
"feSpotLight",
"feTile",
"feTurbulence",
"foreignObject",
"linearGradient",
"radialGradient",
"solidColor",
"textArea",
"textPath"
] Mixed case attribute names in SVG Tiny, SVG 2, and MathMLunion([
["externalResourcesRequired", "focusHighlight", "requiredExtensions", "requiredFeatures", "requiredFonts", "requiredFormats", "systemLanguage", "attributeName", "attributeType", "calcMode", "keySplines", "keyTimes", "repeatCount", "repeatDur", "keyPoints", "initialVisibility", "preserveAspectRatio", "syncBehavior", "syncMaster", "syncTolerance", "gradientUnits", "defaultAction", "pathLength", "mediaCharacterEncoding", "mediaContentEncodings", "mediaSize", "mediaTime", "baseProfile", "contentScriptType", "playbackOrder", "snapshotTime", "syncBehaviorDefault", "syncToleranceDefault", "timelineBegin", "viewBox", "zoomAndPan", "transformBehavior"] ,
["attributeName", "baseFrequency", "calcMode", "clipPathUnits", "diffuseConstant", "edgeMode", "filterUnits", "gradientTransform", "gradientUnits", "kernelMatrix", "kernelUnitLength", "keyPoints", "keySplines", "keyTimes", "lengthAdjust", "limitingConeAngle", "markerHeight", "markerUnits", "markerWidth", "maskContentUnits", "maskUnits", "numOctaves", "pathLength", "patternContentUnits", "patternTransform", "patternUnits", "pointsAtX", "pointsAtY", "pointsAtZ", "preserveAlpha", "preserveAspectRatio", "primitiveUnits", "refX", "refY", "repeatCount", "repeatDur", "requiredExtensions", "specularConstant", "specularExponent", "spreadMethod", "startOffset", "stdDeviation", "stitchTiles", "surfaceScale", "systemLanguage", "tableValues", "targetX", "targetY", "textLength", "viewBox", "xChannelSelector", "yChannelSelector"],
["definitionURL", "schemaLocation"]
]) ⇒ [
"attributeName",
"attributeType",
"baseFrequency",
"baseProfile",
"calcMode",
"clipPathUnits",
"contentScriptType",
"defaultAction",
"definitionURL",
"diffuseConstant",
"edgeMode",
"externalResourcesRequired",
"filterUnits",
"focusHighlight",
"gradientTransform",
"gradientUnits",
"initialVisibility",
"kernelMatrix",
"kernelUnitLength",
"keyPoints",
"keySplines",
"keyTimes",
"lengthAdjust",
"limitingConeAngle",
"markerHeight",
"markerUnits",
"markerWidth",
"maskContentUnits",
"maskUnits",
"mediaCharacterEncoding",
"mediaContentEncodings",
"mediaSize",
"mediaTime",
"numOctaves",
"pathLength",
"patternContentUnits",
"patternTransform",
"patternUnits",
"playbackOrder",
"pointsAtX",
"pointsAtY",
"pointsAtZ",
"preserveAlpha",
"preserveAspectRatio",
"primitiveUnits",
"refX",
"refY",
"repeatCount",
"repeatDur",
"requiredExtensions",
"requiredFeatures",
"requiredFonts",
"requiredFormats",
"schemaLocation",
"snapshotTime",
"specularConstant",
"specularExponent",
"spreadMethod",
"startOffset",
"stdDeviation",
"stitchTiles",
"surfaceScale",
"syncBehavior",
"syncBehaviorDefault",
"syncMaster",
"syncTolerance",
"syncToleranceDefault",
"systemLanguage",
"tableValues",
"targetX",
"targetY",
"textLength",
"timelineBegin",
"transformBehavior",
"viewBox",
"xChannelSelector",
"yChannelSelector",
"zoomAndPan"
] |
…s exactly > Currently all names are converted to lowercase which is ok when > you're using it for HTML only, but if there is an SVG image nested > inside the HTML it breaks. For example, when `viewBox` attribute is > converted to `viewbox` the image is not displayed correctly. This commit splits *HtmlLexer*.*canonicalName* into variants which preserve items on whitelists derived from the SVG and MathML specifications, and adjusts callers of *canonicalName* to use the appropriate variant. Fixes #182
#206) * Do not lcase element or attribute names that match SVG or MathML names exactly > Currently all names are converted to lowercase which is ok when > you're using it for HTML only, but if there is an SVG image nested > inside the HTML it breaks. For example, when `viewBox` attribute is > converted to `viewbox` the image is not displayed correctly. This commit splits *HtmlLexer*.*canonicalName* into variants which preserve items on whitelists derived from the SVG and MathML specifications, and adjusts callers of *canonicalName* to use the appropriate variant. Fixes #182 * add unittests for mixed-case SVG names
Included in the latest releas |
It would be nice if the the policy builder had an option that disabled the "sanitization" of tags and attributes names.
Currently all names are converted to lowercase which is ok when you're using it for HTML only, but if there is an SVG image nested inside the HTML it breaks.
For example, when
viewBox
attribute on is converted toviewbox
and the image is not displayed correctly.The text was updated successfully, but these errors were encountered: