Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decodeForHTML returns same character for Ù and ù #11

Open
GoogleCodeExporter opened this issue Mar 23, 2016 · 2 comments
Open

Comments

@GoogleCodeExporter
Copy link

What steps will reproduce the problem?
1. decodeForHTML returns same character for Ù and ù  This is true 
for all named entities with upper/lower case versions. 

What is the expected output? What do you see instead?

Ù should return upper case U with accent, and ù should return 
lower case u with accent.

What version of the product are you using? On what operating system?

Latest version on Linux.

Please provide any additional information below.

In HTMLEntityCodec.js, you should probably not do a case insensitive look-up at 
the end of the getNamedEntity function.

Thanks!

Original issue reported on code.google.com by [email protected] on 5 Aug 2012 at 9:19

@GoogleCodeExporter
Copy link
Author

Hi,

I found one issue with decodeForHTML function. I tried below steps

org.owasp.esapi.ESAPI.initialize();

$ESAPI.encoder().encodeForHTML("<script>alert('123');</script>");
"<script>alert('123');</script>"

$ESAPI.encoder().decodeForHTML("<script>alert('123');</script>");
"<script>alert4039123394159<47script>"

Issue:- decodeForHTML is not giving me the actual data which i had encoded.

Solution:- In org.owasp.esapi.codecs.HTMLEntityCodec, the function parseNumber 
and parseHex returning number directly(return parseInt(out);). it should return 
char code(return String.fromCharCode(parseInt(out));).
Below are the function i have modified

var parseNumber = function(input) {
        var out = '';
        while (input.hasNext()) {
            var c = input.peek();
            if (c.match(/[0-9]/)) {
                out += c;
                input.next();
            } else if (c == ';') {
                input.next();
                break;
            } else {
                break;
            }
        }

        try {
            return String.fromCharCode(parseInt(out));
            //Commented to fix esapi bug
            //return parseInt(out);
        } catch (e) {
            return null;
        }
    };

    var parseHex = function(input) {
        var out = '';
        while (input.hasNext()) {
            var c = input.peek();
            if (c.match(/[0-9A-Fa-f]/)) {
                out += c;
                input.next();
            } else if (c == ';') {
                input.next();
                break;
            } else {
                break;
            }
        }
        try {
            return String.fromCharCode(parseInt(out, 16));
            //Commented to fix esapi bug
            //return parseInt(out, 16);
        } catch (e) {
            return null;
        }
    };

I have fixed this issue in esapi.js and using it for my project.

Thanks
Bikesh Kumar

Original comment by [email protected] on 19 Mar 2013 at 8:22

@GoogleCodeExporter
Copy link
Author

I think all we did was change in HTMLEntityCodec.js

return String.fromCharCode(entityToCharacterMap.getCaseInsensitive('&' + 
entity));

to

return String.fromCharCode(entityToCharacterMap['&' + entity]);

Original comment by [email protected] on 19 Mar 2013 at 10:58

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant