-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON string escaping support #753
Conversation
protected int charToCodepoint( Character ch ) { | ||
|
||
final String s = Character.toString(ch); | ||
assert (s.length() == 1) : "Ooops"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserts are enabled by the Surefire plugin by default, so if this triggers we will see the output in the test dump. Would you consider adding more context to the assert failure output to help identify this as the cause when inspecting from a terminal?
EG "JSONCodec - Unable to convert specified character to codepoint. Character String conversion must contain more than one character"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel strongly about this. There is a lot of output in the tests, so the likelihood is that the stacktrace from any assertion failure will be more useful than the message itself. This will get us in the right area, and I think most developers will be able to fill in the rest with the logic being applied.
Resolving this review thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #753 (comment), @jeremiahjstacey wrote:
Asserts are enabled by the Surefire plugin by default, so if this triggers we will see the output in the test dump. Would you consider adding more context to the assert failure output to help identify this as the cause when inspecting from a terminal?
That still doesn't mean that we should rely on 'asserts'. The are not generally enabled in runtime by default and even if we are doing so in our Surefire plugin, I would feel better if anything that can be caused by a user accidentally providing incorrect input or not using the classes as per their Javadoc, should be an actual runtime check and not rely on assert
. Rather, there should be an explicit check and if input is amok, then throw an IllegalArgumentException
. In other words, do NOT use assert
to check for preconditions. Using asserts
for postconditions or invariants, I think is okay, as that generally means a program error on our part (assume we've validated all the inputs for sanity), but don't use them for precondition checking.
So, @noloader, if this is a precondition check using assertions, could you change it to an explicit check? Thanks. (I haven't checked, as I'm talking to my mother and trying to listen to her and type at the same time.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwwall per your comment on the assert statement I've unresolved that conversation. Suggest waiting to merge pending updated resolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment here. If Unicode characters can extend from 0x0 to 0x10ffff (rather than 0xffff), is there any concern that this could result in a string whose length is >1?
I don't think we need to resolve this before merging this PR, but if it is a problem, we should create a GitHub issue to address it later and to note it somewhere in the JSONCodec
Javadoc.
@noloader, I scanned the existing bugs and I don't see a task to associate this PR to. (I may have missed it, sorry) If something does not exist already, would you please create an issue in the project that encapsulate the intent and value of this change? Once identified or created, please associate it with this PR for project traceability. |
I believe there is an old, closed GitHub issue for this that was closed
because he thought lost the code when he did a rebase, but he was able to
recover it. We should find that old GitHub issue and just reopen it.
…-kevin
On Wed, Oct 26, 2022, 6:34 AM jeremiahjstacey ***@***.***> wrote:
@noloader <https://github.com/noloader>, I scanned the existing bugs and
I don't see a task to associate this PR to. (I may have missed it, sorry)
If something does not exist already, would you please create an issue in
the project that encapsulate the intent and value of this change? Once
identified or created, please associate it with this PR for project
traceability.
—
Reply to this email directly, view it on GitHub
<#753 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAO6PG7ZS5YYM6HZPNOIGI3WFECJ5ANCNFSM6AAAAAARORONOE>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Logic adjustements to the implementatation of JsonCodec.decodeCharacter to consolidate method return events to four: 1 - First character is null (fast return/no Exception) 2 - First character is not null, Second character is Null (fast return w/Exception) 3 - First & Second not null, but no decode can be performed (Exception) 4 - First & Second not null, decode performed and returned (normal response) Also added the two-character output to the Exception message in case 3 to aid with diagnostics/debugging.
Its kind of a winding road to get here. This started with Kevin and I talking offline because we need a The mailing list message points to the dead PR at #722 . The original PR is dead because I accidentally blew away the code changes on a rebase. I was able to recover the deleted changes and files through Git's |
I looked at the issue history and didn't find one. No worries. I will
create one for you tonight, and reference this PR so they will get linked
together.
…-kevin
On Thu, Oct 27, 2022, 3:02 PM Jeffrey Walton ***@***.***> wrote:
@jeremiahjstacey <https://github.com/jeremiahjstacey>
I scanned the existing bugs and I don't see a task to associate this PR
to. (I may have missed it, sorry)
If something does not exist already, would you please create an issue in
the project that encapsulate the intent and value of this change? Once
identified or created, please associate it with this PR for project
traceability.
Its kind of a winding road to get here.
This started with Kevin and I talking offline because we need a
JSONEncoder at $dayjob. We did not want to use a JavaScript encoder
because the grammars are slightly different. Then, the discussion moved to
the mailing list at *JSON codec discussion on esapi-github*,
https://groups.google.com/a/owasp.org/g/esapi-project-users/c/Q0LTNGxbTQc/m/xoyjMNwCAgAJ
.
The mailing list message points to the dead PR at #722
<#722> . The original PR
is dead because I accidentally blew away the code changes on a rebase. I
was able to recover the deleted changes and files through Git's reflog,
which eventually lead to this [new] PR.
—
Reply to this email directly, view it on GitHub
<#753 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAO6PG3GHNDHPT5W7LY3OPTWFLGTTANCNFSM6AAAAAARORONOE>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Using the more appropriate %c syntax for the String.format when displaying the first and second characters in the exception message.
Updating comments pertaining to handling of invalid code points in UTF8, per request in PR ESAPI#753 ESAPI#753 (comment)
@noloader, @jeremiahjstacey, & @xeno6696 - Call for final comments and/ commits on this PR. I'd like to get this and @jeremiahjstacey's PR #756 merged soon so we can do another release. I know Dave Wichers is planning on a AntiSamy point release to fix a CVE in a dependency so we should release when AntiSamy does to prevent people from claiming ESAPI is vulnerable. |
On Wed, Nov 16, 2022 at 6:07 AM jeremiahjstacey ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/main/java/org/owasp/esapi/codecs/JSONCodec.java
<#753 (comment)>
:
> + // Do nothing. Fall into throw below.
+ }
+ catch (IllegalArgumentException e)
+ {
+ // Do nothing. Fall into throw below.
+ }
+
+ // Catch all. The escaped character sequence was invalid.
+ input.reset();
+ throw new IllegalArgumentException( errorMessage );
+ }
+
+ protected int charToCodepoint( Character ch ) {
+
+ final String s = Character.toString(ch);
+ assert (s.length() == 1) : "Ooops";
@kwwall <https://github.com/kwwall> per your comment on the assert
statement I've unresolved that conversation. Suggest waiting to merge
pending updated resolution.
The code does not depend on behavior from the assert. The idea behind the
assert is to alert the developer when the code is being debugged. That's
all I wanted to do there: "Hey! Something looks sideways. Take a look at
this!".
In C/C++/ObjC apps, an assert should snap the debugger. In release builds
the assert should be removed. If my understanding is mistaken or a web app
does not alert the developer during a debug session, then would someone
correct it, please.
Now the behavior when asserts are not in effect may be wrong. The behavior
is to return the first character in the pair. I feel like that's not quite
correct. I think the docs say an empty string will be returned on encoding
and decoding errors. Maybe that code path should return "" instead.
… Message ID: ***@***.***
com>
|
In Java, assertions are disabled by default. We enable them via the
Surefire plugin for our JUnit tests, but for others development where
ESAPI is used, they would be disabled. So, just consider that as a
possibility.
…-kevin
On Wed, Nov 16, 2022, 10:51 AM Jeffrey Walton ***@***.***>
wrote:
On Wed, Nov 16, 2022 at 6:07 AM jeremiahjstacey ***@***.***>
wrote:
> ***@***.**** commented on this pull request.
> ------------------------------
>
> In src/main/java/org/owasp/esapi/codecs/JSONCodec.java
> <
#753 (comment)
>
> :
>
> > + // Do nothing. Fall into throw below.
> + }
> + catch (IllegalArgumentException e)
> + {
> + // Do nothing. Fall into throw below.
> + }
> +
> + // Catch all. The escaped character sequence was invalid.
> + input.reset();
> + throw new IllegalArgumentException( errorMessage );
> + }
> +
> + protected int charToCodepoint( Character ch ) {
> +
> + final String s = Character.toString(ch);
> + assert (s.length() == 1) : "Ooops";
>
> @kwwall <https://github.com/kwwall> per your comment on the assert
> statement I've unresolved that conversation. Suggest waiting to merge
> pending updated resolution.
>
The code does not depend on behavior from the assert. The idea behind the
assert is to alert the developer when the code is being debugged. That's
all I wanted to do there: "Hey! Something looks sideways. Take a look at
this!".
In C/C++/ObjC apps, an assert should snap the debugger. In release builds
the assert should be removed. If my understanding is mistaken or a web app
does not alert the developer during a debug session, then would someone
correct it, please.
Now the behavior when asserts are not in effect may be wrong. The behavior
is to return the first character in the pair. I feel like that's not quite
correct. I think the docs say an empty string will be returned on encoding
and decoding errors. Maybe that code path should return "" instead.
> Message ID: ***@***.***
> com>
>
—
Reply to this email directly, view it on GitHub
<#753 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAO6PG2FKDVUTMQZZRJCWR3WIT7ILANCNFSM6AAAAAARORONOE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@kwwall i know I need to review this, I’m on travel for work (again) and I can’t sit down with this until this weekend. If we need to push faster than that I’ll trust the group judgment here. |
I did want to call myself out to building some unit tests that handled the full Unicode range. If we stayed with the AbstractCharacterCodec I have a minor frowny-face just because it was only supposed to be a bridge until all the codecs were converted to Integer-based codecs... however this is my fault as I should have marked that abstract class as deprecated. (It could be that there are some codecs where for some reason, it makes zero sense at all to expand to handle the full UTF-8 range, but I figured we'd handle those case by case.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like a response to my comment before I approve merging this PR. Changes may or may not be required. I just want to be sure our assumptions about code points are still legitimate as @xeno6696 did the AbstractCodec
work many years ago when we were using Java 7 and that may have been an older version of Unicode.
@@ -25,6 +25,9 @@ | |||
* points cannot be represented by a {@code char}, this class remedies that by parsing string | |||
* data as codePoints as opposed to a stream of {@code char}s. | |||
* | |||
* WARNING: This class will silently discard an invalid code point according to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment in AbstractCodec.java. In particular, it seems as though Character.isValidCodepoint( int ) allows a broader range that what was anticipated.
protected int charToCodepoint( Character ch ) { | ||
|
||
final String s = Character.toString(ch); | ||
assert (s.length() == 1) : "Ooops"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment here. If Unicode characters can extend from 0x0 to 0x10ffff (rather than 0xffff), is there any concern that this could result in a string whose length is >1?
I don't think we need to resolve this before merging this PR, but if it is a problem, we should create a GitHub issue to address it later and to note it somewhere in the JSONCodec
Javadoc.
This commit adds JSON string escaping support according to RFC 8259, Section 7.
See GitHub Issue
closes #754. [added by @kwwall]