decodeForHTML returns same character for Ù and ù #11

GoogleCodeExporter · 2016-03-23T00:43:09Z

What steps will reproduce the problem?
1. decodeForHTML returns the same character for &Ugrave; and &ugrave;. This is true 
for all named entities that have both upper- and lower-case versions.

What is the expected output? What do you see instead?

&Ugrave; should return an upper-case U with a grave accent (Ù), and &ugrave; 
should return a lower-case u with a grave accent (ù).

What version of the product are you using? On what operating system?

Latest version on Linux.

Please provide any additional information below.

In HTMLEntityCodec.js, you should probably not do a case-insensitive look-up at 
the end of the getNamedEntity function.
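
To illustrate why a case-insensitive look-up collapses the two entities, here is a minimal sketch. The three-entry table and both helper names are assumptions made for this example; the real map in HTMLEntityCodec.js is much larger and its look-up helper may be implemented differently.

```javascript
// Hypothetical miniature of the entity table (entity name -> character code).
const entityToCode = {
  "&Ugrave": 217, // U+00D9, upper-case U with grave accent
  "&ugrave": 249, // U+00F9, lower-case u with grave accent
  "&lt": 60       // U+003C, less-than sign
};

// Case-insensitive look-up: returns the value of the first key that matches
// once case is ignored, so "&Ugrave" and "&ugrave" become indistinguishable.
function lookupIgnoringCase(map, key) {
  const wanted = key.toLowerCase();
  for (const candidate of Object.keys(map)) {
    if (candidate.toLowerCase() === wanted) {
      return map[candidate];
    }
  }
  return undefined;
}

// Exact look-up: only an identical key matches.
function lookupExact(map, key) {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
}

const folded = String.fromCharCode(lookupIgnoringCase(entityToCode, "&ugrave"));
const exact = String.fromCharCode(lookupExact(entityToCode, "&ugrave"));

console.log(folded); // upper-case variant, because "&Ugrave" is found first
console.log(exact);  // lower-case variant, as expected
```

In this sketch the folded look-up hands back whichever variant happens to come first in the table, which matches the behaviour described above.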

Thanks!

Original issue reported on code.google.com by [email protected] on 5 Aug 2012 at 9:19

GoogleCodeExporter · 2016-03-23T00:43:09Z

Hi,

I found an issue with the decodeForHTML function. I tried the steps below:

org.owasp.esapi.ESAPI.initialize();

$ESAPI.encoder().encodeForHTML("<script>alert('123');</script>");
"<script>alert('123');</script>"

$ESAPI.encoder().decodeForHTML("<script>alert('123');</script>");
"<script>alert4039123394159<47script>"

Issue: decodeForHTML does not return the data I originally encoded.

Solution: In org.owasp.esapi.codecs.HTMLEntityCodec, the functions parseNumber 
and parseHex return the parsed number directly (return parseInt(out);). They should 
return the character for that code instead (return String.fromCharCode(parseInt(out));).
Below are the functions as I have modified them:

var parseNumber = function(input) {
    var out = '';
    // Collect decimal digits up to an optional terminating ';'.
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            input.next();
            break;
        } else {
            break;
        }
    }

    try {
        // Return the character for the parsed code, not the number itself.
        return String.fromCharCode(parseInt(out, 10));
        // Original line, commented out to fix the ESAPI bug:
        //return parseInt(out);
    } catch (e) {
        return null;
    }
};

var parseHex = function(input) {
    var out = '';
    // Collect hexadecimal digits up to an optional terminating ';'.
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9A-Fa-f]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            input.next();
            break;
        } else {
            break;
        }
    }

    try {
        // Return the character for the parsed code, not the number itself.
        return String.fromCharCode(parseInt(out, 16));
        // Original line, commented out to fix the ESAPI bug:
        //return parseInt(out, 16);
    } catch (e) {
        return null;
    }
};
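
One caveat with the try/catch approach, as far as I can tell: parseInt does not throw on bad input, it returns NaN, and String.fromCharCode(NaN) yields the NUL character (U+0000) rather than an error, so the catch branch is unlikely to ever run. A minimal sketch of a stricter conversion follows. The helper name decodeNumericEntity is hypothetical and not part of ESAPI, and String.fromCodePoint assumes an ES2015 or newer runtime.

```javascript
// Hypothetical helper (not part of ESAPI): convert the digits of a numeric
// character reference into the character they name, or null if invalid.
function decodeNumericEntity(digits, radix) {
  const codePoint = parseInt(digits, radix);
  // parseInt signals failure with NaN, so check the result explicitly.
  if (Number.isNaN(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    return null;
  }
  // String.fromCodePoint also handles code points above U+FFFF,
  // which String.fromCharCode would truncate to 16 bits.
  return String.fromCodePoint(codePoint);
}

console.log(decodeNumericEntity("40", 10)); // "("
console.log(decodeNumericEntity("2F", 16)); // "/"
console.log(decodeNumericEntity("", 10));   // null
```

With a guard like this, parseNumber and parseHex could return null for an empty or out-of-range value and let the caller leave the input untouched.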

I have fixed this issue in my copy of esapi.js and am using it in my project.

Thanks
Bikesh Kumar

Original comment by [email protected] on 19 Mar 2013 at 8:22

GoogleCodeExporter · 2016-03-23T00:43:09Z

I think all we did in HTMLEntityCodec.js was change

return String.fromCharCode(entityToCharacterMap.getCaseInsensitive('&' + entity));

to

return String.fromCharCode(entityToCharacterMap['&' + entity]);
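
One thing to watch with a plain property look-up: if the entity is not in the map, the look-up gives undefined, and String.fromCharCode(undefined) returns the NUL character (U+0000) rather than failing. The surrounding code may already guard against this; if it does not, a check along these lines might help. This is only a sketch: the two-entry map stands in for the real entityToCharacterMap, and the function name is made up for the example.

```javascript
// Hypothetical stand-in for the real entityToCharacterMap.
const entityToCharacterMap = { "&Ugrave": 217, "&ugrave": 249 };

// Exact, case-sensitive look-up that reports unknown entities as null
// instead of silently producing U+0000.
function namedEntityToChar(entity) {
  const key = "&" + entity;
  if (!Object.prototype.hasOwnProperty.call(entityToCharacterMap, key)) {
    return null; // unknown name: the caller can leave the input as-is
  }
  return String.fromCharCode(entityToCharacterMap[key]);
}

console.log(namedEntityToChar("Ugrave") === "\u00D9"); // true
console.log(namedEntityToChar("ugrave") === "\u00F9"); // true
console.log(namedEntityToChar("UGRAVE"));              // null
```

The trade-off is that non-standard casings such as &UGRAVE; are no longer decoded at all.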

Original comment by [email protected] on 19 Mar 2013 at 10:58

GoogleCodeExporter added Priority-Medium auto-migrated Type-Defect labels Mar 23, 2016
