Broken encoding in response body if application is running on OpenShift #21700

Closed
fedinskiy opened this issue Nov 25, 2021 · 20 comments · Fixed by #22644
Labels
area/kubernetes area/rest kind/bug Something isn't working triage/needs-reproducer We are waiting for a reproducer.

@fedinskiy
Contributor

fedinskiy commented Nov 25, 2021

Describe the bug

Given:

  1. The application uses quarkus-resteasy-reactive-jackson
  2. The application runs on OpenShift
  3. The application responds to an HTTP request with status 200 and the word "Slovník" in the body (the 6th letter is U+00ED, "LATIN SMALL LETTER I WITH ACUTE")

Expected behavior

The client should receive an HTTP response with "Slovník" in the body.

Actual behavior

The client receives an HTTP response with "Slovn?k" in the body.

How to Reproduce?

Reproducer:
https://github.com/fedinskiy/quarkus-test-suite/blob/openshift_fail/sql-db/hibernate-reactive/src/test/java/io/quarkus/ts/reactive/AbstractReactiveDatabaseIT.java#L37

Minimal reproducer:
https://github.com/fedinskiy/quarkus-reproducer/tree/openshift_fail
Scripts run.sh and stop.sh contain commands to start and stop the application on the OCP cluster respectively.
Run curl ${path to your app}/hello/broken to get the result.

Output of uname -a or ver

4.18.0-305.el8.x86_64

Output of java -version

Java version: 11.0.12, vendor: Oracle Corporatio

GraalVM version (if different from Java)

No response

Quarkus version or git rev

2c1be38

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.8.3 (ff8e977a158738155dc465c6a97ffaf31982d739)

Additional information

OpenShift 4.9

@fedinskiy fedinskiy added the kind/bug Something isn't working label Nov 25, 2021
@quarkus-bot

quarkus-bot bot commented Nov 25, 2021

/cc @geoand, @iocanel

@fedinskiy fedinskiy changed the title Broken encoding in request body if application is running on OpenShift Broken encoding in response body if application is running on OpenShift Nov 25, 2021
@geoand
Contributor

geoand commented Nov 25, 2021

Do you have a cluster I could access to reproduce this?
Furthermore, what are the exact steps to reproduce it?

@fedinskiy
Contributor Author

@geoand the issue was found in this Jenkins run[1], so the answers would be:

  1. OCP at https://api.ocp49.dynamic.quarkus:6443 is the best thing I can offer.
  2. Run the Jenkins job again or, more precisely, run the test OpenShiftPostgresql10IT#testI18N from the io.quarkus.ts.qe:hibernate-reactive module while connected to an OCP cluster.

[1] https://main-jenkins-csb-quarkusqe.apps.ocp-c1.prod.psi.redhat.com/job/quarkus-main-rhel8-jdk11-openshift-new-ts-jvm/159/consoleFull

@geoand
Contributor

geoand commented Nov 25, 2021

BTW, +100 for Thinking fast and slow in the tests :)

@geoand
Contributor

geoand commented Nov 25, 2021

I tested a sample RESTEasy Reactive application on OpenShift 4.9 that simply does the following:

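import javax.ws.rs.GET;
import javax.ws.rs.Path;
import io.smallrye.mutiny.Multi;

@Path("/hello")
public class TestResource {
    // No @Produces here, so no explicit media type is declared for the response.
    @GET
    public Multi<String> hello() {
        return Multi.createFrom().items("Γιώργος", "Slovník");
    }
}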

and the response contains the proper characters.

So in short, I can't reproduce this.

@fedinskiy
Copy link
Contributor Author

@geoand I wasn't able to reproduce the error with your example either. But this one does reproduce it:

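import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import io.smallrye.mutiny.Uni;

@Path("/hello")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class TestResource {

    // The entity is a plain String, but the declared media type is application/json.
    @GET
    @Path("constant")
    public Uni<Response> hello() {
        return Uni.createFrom().item(Response.ok("Slovník").build());
    }
}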

I presume the game changer is the javax.ws.rs.Produces annotation.

@geoand
Contributor

geoand commented Nov 26, 2021

I still could not reproduce it

@stuartwdouglas
Member

Any chance you can capture the actual data that is being sent over the wire (e.g. via wireshark)?

@fedinskiy
Contributor Author

@stuartwdouglas sniffing VPN traffic is beyond my wireshark-fu capabilities, but I managed to capture the request and response with curl's tracing. The results of --trace and --trace-ascii can be found here: https://gist.github.com/fedinskiy/567f6ccd14f83558d5b281268f2ca49e
The body is 53 6c 6f 76 6e 3f 6b, so the server output does indeed contain a question mark ('?', U+003F) in place of the 'í'.
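The captured bytes match what the JDK produces when a string is encoded with a charset that cannot represent U+00ED. The following is a minimal sketch using only the standard library, not Quarkus code; the class name AsciiReplacementDemo and the helper toHex are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class AsciiReplacementDemo {

    // Formats bytes as space-separated lowercase hex, similar to curl's --trace output.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Slovník", written with an escape so the source file encoding does not matter.
        String word = "Slovn\u00edk";
        // US-ASCII cannot represent U+00ED, so getBytes substitutes '?' (0x3f).
        System.out.println(toHex(word.getBytes(StandardCharsets.US_ASCII)));
        // UTF-8 encodes U+00ED as the two bytes c3 ad.
        System.out.println(toHex(word.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line printed is 53 6c 6f 76 6e 3f 6b, the same sequence as in the trace, which suggests that the replacement happens during encoding on the server rather than somewhere in transit.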

@gsmet
Member

gsmet commented Jan 1, 2022

@fedinskiy does your Quarkus app run with a non-default locale/encoding?

@gsmet gsmet added the triage/needs-reproducer We are waiting for a reproducer. label Jan 1, 2022
@fedinskiy
Contributor Author

@gsmet I do not pass any parameters to the application, and the locale on the OpenShift node looks like this:

sh-4.4$ locale
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
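To check how a locale like the one above affects a JVM, it can help to print the default charset. This is a minimal sketch under some assumptions: on JDK 11 (the version in the report) the default charset is derived from the environment, and under a POSIX locale on Linux it is typically reported as US-ASCII; newer JDKs (18 and later) default to UTF-8 regardless. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Encodes with the JVM default charset, which depends on the environment.
    public static byte[] encodeWithDefault(String s) {
        return s.getBytes(Charset.defaultCharset());
    }

    // Encodes with an explicit charset, independent of the environment.
    public static byte[] encodeExplicit(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "Slovn\u00edk"; // "Slovník"
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("bytes with default charset: " + encodeWithDefault(word).length);
        System.out.println("bytes with explicit UTF-8:  " + encodeExplicit(word).length);
    }
}
```

Running it once normally and once with the locale forced (for example LC_ALL=POSIX java DefaultCharsetDemo.java, using the single-file launcher available since JDK 11) should show whether the two byte counts diverge. The explicit UTF-8 variant always yields 8 bytes for this word.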

@gsmet
Member

gsmet commented Jan 4, 2022

I wouldn't be surprised if it could be related to using a non-UTF-8 locale. I have no idea what should be used as the default charset for content types if you're not defining any charset in it.

@fedinskiy
Contributor Author

Well, the IETF considers[1] ISO-8859-1 (yikes!) the default, but the W3C recommends[2] using UTF-8. To my understanding, this particular example should work with both, since U+00ED exists in both charsets, but the latter option looks like a better idea.

As a side note: since Quarkus is advertised as a Kubernetes-native framework[3], novice users (e.g. yours truly) expect that the behaviour of their applications doesn't depend on the environment.

[1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1
[2] https://www.w3.org/International/questions/qa-choosing-encodings
[3] https://quarkus.io/kubernetes-native/

P.S. I am not sure if anyone gets notifications about edits in the original post, so I will add it here: there is also a "minimal" reproducer: https://github.com/fedinskiy/quarkus-reproducer/tree/openshift_fail .
Scripts run.sh and stop.sh contain commands to start and stop the application on the OCP cluster respectively.
Run curl ${path to your app}/hello/broken to get the result.
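The point above that this word survives both ISO-8859-1 and UTF-8, but not plain ASCII, can be checked with a round trip. A minimal standard-library sketch follows; the class name and the roundTrips helper are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1VsUtf8Demo {

    // Returns true if encoding and then decoding with the same charset gives back the input.
    public static boolean roundTrips(String s, Charset charset) {
        return new String(s.getBytes(charset), charset).equals(s);
    }

    public static void main(String[] args) {
        String word = "Slovn\u00edk"; // "Slovník"
        // U+00ED is a single byte (0xed) in ISO-8859-1 and two bytes (0xc3 0xad) in UTF-8.
        System.out.println("ISO-8859-1 bytes: " + word.getBytes(StandardCharsets.ISO_8859_1).length);
        System.out.println("UTF-8 bytes:      " + word.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("ISO-8859-1 round trip: " + roundTrips(word, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 round trip:      " + roundTrips(word, StandardCharsets.UTF_8));
        System.out.println("US-ASCII round trip:   " + roundTrips(word, StandardCharsets.US_ASCII));
    }
}
```

Only the US-ASCII round trip fails, because the 'í' is replaced by '?' during encoding. Note that a client still has to decode with the same charset the server used, which is why declaring it in the Content-Type matters.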

@geoand
Contributor

geoand commented Jan 5, 2022

I am not sure, if anyone gets notifications about edits in the original post

nope, GitHub does not notify on updates

@geoand
Contributor

geoand commented Jan 5, 2022

The version of Quarkus you are using in the reproducer is very outdated. Can you try with 2.6.1.Final?

@fedinskiy
Contributor Author

I updated the Quarkus version in the reproducer. The result is the same.

@gsmet
Member

gsmet commented Jan 5, 2022

@geoand did you try locally with a POSIX locale?

@geoand
Contributor

geoand commented Jan 5, 2022

I will give it a shot

@geoand
Contributor

geoand commented Jan 5, 2022

So indeed, changing the locale reproduces this. I think I know what is causing it.

geoand added a commit to geoand/quarkus that referenced this issue Jan 5, 2022
This was already being done when producing text/plain, but in cases
where application/json was being used and the actual entity was a String,
the encoding was not being specified

Fixes: quarkusio#21700
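The idea described in the commit message, specifying the encoding instead of relying on the platform default, can be sketched roughly as follows. This is not the code from the referenced commit or from RESTEasy Reactive; charsetFor and encodeBody are hypothetical helpers that only illustrate the principle of falling back to UTF-8 when the content type carries no charset parameter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSelectionSketch {

    private static final String PARAM = "charset=";

    // Hypothetical helper: picks the charset named in a Content-Type value,
    // falling back to UTF-8 (not the JVM default) when none is given.
    // Charset.forName throws if the name is illegal or unsupported.
    public static Charset charsetFor(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, PARAM, 0, PARAM.length())) {
                    String name = p.substring(PARAM.length()).trim().replace("\"", "");
                    return Charset.forName(name);
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Hypothetical helper: encodes a String entity for the response body.
    public static byte[] encodeBody(String entity, String contentType) {
        return entity.getBytes(charsetFor(contentType));
    }

    public static void main(String[] args) {
        String word = "Slovn\u00edk"; // "Slovník"
        // No charset parameter: UTF-8 is used, regardless of the locale of the node.
        System.out.println(encodeBody(word, "application/json").length);
        // Explicit charset parameter: that charset is honoured.
        System.out.println(encodeBody(word, "text/plain; charset=ISO-8859-1").length);
    }
}
```

With this approach the encoded body no longer depends on the locale of the node the application happens to run on.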
@geoand
Contributor

geoand commented Jan 5, 2022

#22644 fixes the issue

gsmet added a commit that referenced this issue Jan 5, 2022
Ensure that UTF8 is used as the default encoding in RESTEasy Reactive
@quarkus-bot quarkus-bot bot added this to the 2.7 - main milestone Jan 5, 2022
@gsmet gsmet modified the milestones: 2.7 - main, 2.6.2.Final Jan 7, 2022
gsmet pushed a commit to gsmet/quarkus that referenced this issue Jan 7, 2022
This was already being done when producing text/plain, but in cases
where application/json was being used and the actual entity was a String,
the encoding was not being specified

Fixes: quarkusio#21700
(cherry picked from commit fe79da7)