Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce setDefaultCharacterEncoding() in MockHttpServletResponse #27214

Closed
jhyot opened this issue Jul 26, 2021 · 7 comments
Closed

Introduce setDefaultCharacterEncoding() in MockHttpServletResponse #27214

jhyot opened this issue Jul 26, 2021 · 7 comments
Assignees
Labels
in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@jhyot
Copy link

jhyot commented Jul 26, 2021

Since: Spring 5.2 and the deprecation of APPLICATION_JSON_UTF8 Mediatype.

Although the content type 'application/json' can only be encoded in UTF-8 (which is why the explicit encoding was deprecated), MockHttpServletResponse#getCharacterEncoding() returns 'ISO-8859-1'.

This breaks any test code that inspects or uses the getCharacterEncoding() directly. But more unexpectedly, it breaks MockHttpServletResponse#getContentAsString(), because the response is decoded with ISO-8859-1 instead of UTF-8. Explicitly using MockHttpServletResponse#getContentAsString(Charset fallbackCharset) works correctly.

The Javadoc for getContentAsString() is (to me) somewhat ambiguous:

Get the content of the response body as a String, using the charset specified for the response by the application, either through HttpServletResponse methods or through a charset parameter on the Content-Type.

"using the charset specified for the response by the application" could imply UTF-8 for 'application/json', because that must always be the value for JSON. If using this interpretation, then the method breaks its Javadoc contract.

Previously, this was fixed for ContentResultMatchers#json(String) but not for MockHttpServletResponse#getContentAsString() (#23622).
Another report was closed by the submitter without any resolution except using an explicit charset (#23851).

If it is not possible to fix this for JSON specifically, I would appreciate if somehow a default value could be set (instead of always defaulting to ISO-8859-1).

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Jul 26, 2021
@sbrannen
Copy link
Member

I would appreciate if somehow a default value could be set (instead of always defaulting to ISO-8859-1).

Are you aware of the org.springframework.mock.web.MockHttpServletResponse.setCharacterEncoding(String) method?

@sbrannen sbrannen added in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) status: waiting-for-feedback We need additional information before we can continue labels Jul 26, 2021
@jhyot
Copy link
Author

jhyot commented Jul 27, 2021

@sbrannen Yes I saw the method, but I didn't think of a way to do it by default. I guess I could write a custom org.springframework.test.web.servlet.ResultHandler and set it there. Thanks.

The rest of the ticket is still valid with the somewhat unexpected behaviour.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Jul 27, 2021
@sbrannen
Copy link
Member

sbrannen commented Jul 29, 2021

@sbrannen Yes I saw the method, but I didn't think of a way to do it by default. I guess I could write a custom org.springframework.test.web.servlet.ResultHandler and set it there. Thanks.

You could write a custom ResultHandler to set the character encoding for the response, but you would probably be better off setting it as early as possible -- for example, via a custom Filter registered via .addFilter() in the MockMvc builder.

Note, however, that setting the character encoding via setCharacterEncoding() or setContentType() results in ;charset=... being appended to the Content-Type header, and that may be an unacceptable side effect for your use case.

In light of that, I am going to repurpose this issue to introduce support for changing the default character encoding used in MockHttpServletResponse.

The rest of the ticket is still valid with the somewhat unexpected behaviour.

Although I can see how you might have expected getContentAsString() to take into account the Content-Type header, I don't think we should alter the behavior to provide special treatment for application/json to infer UTF-8 as the character encoding since getContentAsString(Charset) was introduced to support use cases where the character encoding used to write the response body is not reflected in the value returned from getCharacterEncoding().

@rstoyanchev, do you have any further input here?

@sbrannen sbrannen removed status: feedback-provided Feedback has been provided status: waiting-for-triage An issue we've not yet triaged or decided on labels Jul 29, 2021
@sbrannen sbrannen changed the title Unexpected 'ISO-8859-1' encoding in MockHttpServletResponse for response with 'application/json' content type Introduce setDefaultCharacterEncoding() in MockHttpServletResponse Jul 29, 2021
@sbrannen sbrannen self-assigned this Jul 29, 2021
@sbrannen sbrannen added the type: enhancement A general enhancement label Jul 29, 2021
@sbrannen sbrannen added this to the 5.3.10 milestone Jul 29, 2021
@sbrannen
Copy link
Member

sbrannen commented Jul 29, 2021

You could write a custom ResultHandler to set the character encoding for the response, but you would probably be better off setting it as early as possible -- for example, via a custom Filter registered via .addFilter() in the MockMvc builder.

Note, however, that setting the character encoding via setCharacterEncoding() or setContentType() results in ;charset=... being appended to the Content-Type header, and that may be an unacceptable side effect for your use case.

In light of that, I am going to repurpose this issue to introduce support for changing the default character encoding used in MockHttpServletResponse.

This has been addressed via a new setDefaultCharacterEncoding(String characterEncoding) method in e4b9b1f.

@jhyot, feel free to try this out in an upcoming 5.3.10 snapshot.

@jhyot
Copy link
Author

jhyot commented Jul 29, 2021

Thank you for adding this.

@sbrannen
Copy link
Member

See #27230 for support for this new feature in MockMvc.

lxbzmy pushed a commit to lxbzmy/spring-framework that referenced this issue Mar 26, 2022
Prior to this commit, it was possible to set the character encoding
in MockHttpServletResponse via setCharacterEncoding() or
setContentType(); however, those methods append "charset=..." to the
Content-Type header which may not be an acceptable side effect.

This commit addresses this shortcoming by introducing a new
setDefaultCharacterEncoding() in MockHttpServletResponse which allows
one to override the previously hard coded value of "ISO-8859-1". In
addition, setDefaultCharacterEncoding() does not modify the Content-Type
header.

The reset() method has also been updated to reset the character encoding
to the configured default character encoding.

Closes spring-projectsgh-27214
@zhemaituk
Copy link

@sbrannen sorry for excavating an old issue. I can log new one if it has a chance to live.
I just encountered the same problem and was surprised to see the problem occurs in tests only (as browsers and rest of Spring framework defaults to UTF-8 for application/json as per RFC 4627).

Is there a chance the test framework can follow RFC 4627 out of the box? If it did - It would make tests behavior more representable of real-world behavior.

Workaround is not complex, but it felt as a "least surprise principle" violation, and I saw way too many imprecise workarounds:

  • some people change controllers/filters and add ;charset=utf-8 back, even though Spring stopped doing that due to RFC 4627
  • some people do setDefaultCharacterEncoding("UTF-8"); or getContentAsString(StandardCharsets.UTF_8); unconditionally (potentially hiding encoding issues with other content types).

I ended up adding such code, but it felt quite wordy:

if (MediaType.APPLICATION_JSON_VALUE.equals(response.getContentType())) {
    // Workaround of https://github.com/spring-projects/spring-framework/issues/27214
    // The default encoding for application/json is UTF-8 per RFC 4627, and handled as UTF-8 by web browsers and spring framework (except MockMvc).
    response.setDefaultCharacterEncoding("UTF-8");
}
return response.getContentAsString();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

4 participants