Quarkus doesn’t send X-Ray traces to AWS for sub-segments in Native build while using RESTEasy “controller” based approach #22595
Comments
/cc @Ladicek, @radcortez |
My guess is that we might need to register something more for reflection. But what... If you can debug in JVM which objects are serialized that contain the information you want, that might help. |
A reproducer could also help. Thanks! |
@gsmet @radcortez We can't see any application for xray in your quarkus-quickstarts repo. |
Well, if you want it fixed, please provide more information. I don't have the time to set up everything to reproduce it. Just giving us the name of the classes in AWS Xray containing the missing information would help if you can debug your JVM app and see where they are. |
@amitkumar7566 could you just provide a minimal reproducer? It shouldn't require any of your restricted code. I may be able to look into it since I'm planning to rework the config portion of tracing, but I'm still not there yet. |
Hi @gsmet @radcortez Here is a sample reproducer git project repo which has 2 resources covering both of the below types of scenarios:
Git repo link: https://github.com/amitkumar7566/xray-native-image-test
Here are the findings to help you:
A. Findings in Normal JVM Build
B. Findings in Native Build
1a. AWS DynamoDB subsegments DO NOT get traced. In the AWS X-Ray console, neither the subsegment nor the X-Ray graph shows anything.
1b. CloudWatch shows the below error message:
2a. External API call subsegments DO NOT get traced. In the AWS X-Ray console, neither the subsegment nor the X-Ray graph shows anything. |
The class "com.amazonaws.xray.contexts.LambdaSegmentContext" seems to contain the missing information. It is where the below error, which can be seen in CloudWatch, arises:
I also tried logging two of the pre-defined AWS Lambda variables (AWS_REGION and _X_AMZN_TRACE_ID), and below are the results:
A. In normal JVM build: both the AWS_REGION and _X_AMZN_TRACE_ID values can be found in logs.
B. In Native build: the AWS_REGION value can be found in logs, but the _X_AMZN_TRACE_ID value logs as null. |
Thanks! I'll have a look. |
@amitkumar7566 This may seem related with:
How are you setting the |
@radcortez The X-ray seems to work when we use lambda "handler" but does not work when we use "controller". Anyways, you have the sample code. |
I did some additional digging, and most likely this is related to the following substitution:
Probably this only surfaces in subsegments, because there is a direct call to
@amitkumar7566 can you please provide me with more information about how you are deploying your Quarkus application? Thanks! |
I think the issue is that the trace id is stored in a thread local. Resteasy requests are processed in a different thread from the native pull thread so the trace id is lost. |
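The thread-local explanation above can be illustrated with a minimal, self-contained sketch. It does not use Quarkus or the X-Ray SDK: `TRACE_ID` is a hypothetical stand-in for the thread-local holder, and the single-thread executor stands in for the thread that processes the RESTEasy request. It only shows the general Java behaviour that a value set in a `ThreadLocal` on one thread is not visible from another.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalTraceDemo {
    // Hypothetical stand-in for a thread-local trace id holder (not the real Quarkus class).
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Sets the trace id on the calling thread and reads it back on the same thread. */
    public static String readFromSameThread(String traceId) {
        TRACE_ID.set(traceId);
        try {
            return TRACE_ID.get();
        } finally {
            TRACE_ID.remove();
        }
    }

    /** Sets the trace id on the calling thread, then reads it from a different (worker) thread. */
    public static String readFromWorker(String traceId) throws Exception {
        TRACE_ID.set(traceId);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = TRACE_ID::get; // runs on the worker thread
            return worker.submit(task).get();
        } finally {
            worker.shutdown();
            TRACE_ID.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        String id = "Root=1-5759e988-bd862e3fe1be46a994272793"; // example value only
        System.out.println("same thread:   " + readFromSameThread(id));
        System.out.println("worker thread: " + readFromWorker(id)); // prints "worker thread: null"
    }
}
```

If the value has to be visible on the processing thread, it must be handed over explicitly or stored somewhere that is not thread-scoped, which is the direction the thread takes next with the system property.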
@radcortez I am using AWS cloudformation to deploy. You can find the template in reproducer code. @patriot1burke What is the solution for the issue that you have mentioned above? |
@amitkumar7566 I don't have a solution yet. Trying to figure out the best way to do it. |
@patriot1burke isn't the trace id thread-local storage used only for native mode? At least the only reference I can find is in
If X-Ray now allows propagation using system properties, we can probably revert to the original code and drop the thread-local plus substitution? |
Native would only ever be able to process one request at a time though. Was eventually going to look at adding concurrent request processing and rewrite the poll loop to be non-blocking. |
I'll revert it for now. |
@radcortez @amitkumar7566 Please see linked PR if you want to verify this works. |
@amitkumar7566 Did you test tracing? Or did you just look at the output of your example program? I'm pretty sure custom lambda runtimes (binary lambdas) do not set the _X_AMZN_TRACE_ID environment variable. The system property com.amazonaws.xray.traceHeader is set by the quarkus lambda integration and should be picked up by the xray library. Your example code prints out this env var, but not the system property.
FYI: Here's the code from the XRAY amazon library that obtains the trace id.
Also, you can speed up the quarkus build by running:
|
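To check both sources mentioned above from inside the function, a small diagnostic along these lines could be logged at request time. This is a sketch only: the two key names come from the comment above, but the class, its method names, and the lookup order (environment variable first, then system property) are assumptions for illustration and are not taken from the X-Ray library.

```java
import java.util.Map;
import java.util.Properties;

public class TraceHeaderDiagnostics {
    // Key names as quoted in the discussion above.
    static final String ENV_KEY = "_X_AMZN_TRACE_ID";
    static final String PROP_KEY = "com.amazonaws.xray.traceHeader";

    /**
     * Returns the environment value if it is non-empty, otherwise the system
     * property value (which may be null). The order is an assumption of this sketch.
     */
    public static String resolve(Map<String, String> env, Properties props) {
        String fromEnv = env.get(ENV_KEY);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        return props.getProperty(PROP_KEY);
    }

    /** Builds a one-line summary showing both raw values and the resolved one. */
    public static String describe(Map<String, String> env, Properties props) {
        return ENV_KEY + "=" + env.get(ENV_KEY)
                + ", " + PROP_KEY + "=" + props.getProperty(PROP_KEY)
                + ", resolved=" + resolve(env, props);
    }

    public static void main(String[] args) {
        // In a real function this line would go to the application log.
        System.out.println(describe(System.getenv(), System.getProperties()));
    }
}
```

Logging both values side by side makes it clear whether a null env var actually means the trace header is missing, or only that it arrived through the system property instead.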
@patriot1burke I checked in AWS Xray but could not find any sub-segment traces. However, I got the TraceId in logs when I pulled it from the system property:
Also, we are getting the below error in cloudwatch. Full error logs below for DynamoDB call:
Full error logs below for REST Client call:
|
Have to exclude the jackson-module-afterburner artifact. Let me look into how to do that. |
So, let me explain what's happening. Afterburner is an addon for json serialization that an aws library transitive dependency is using. It creates classes dynamically at runtime. You cannot do this with Graal/native binaries. What I think will work is to exclude the afterburner module. Change this:
to this:
|
@patriot1burke When I excluded Afterburner like mentioned above, here is what I get as error: |
Excluding afterburner won't work (obviously :) ) I have another incoming PR that will hopefully fix the problem. The issue is that quarkus-amazon-rest depends on aws-serverless artifact. There is a static block that registers afterburner and code substitutions don't run until after static initialization (static initialization happens at build time with our Graal integration). I'm removing the aws-serverless library dependency and that should solve the problem. |
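The ordering problem described above rests on a general Java rule: a class's static block runs when the class is initialized, before any of its methods execute. The sketch below shows only that rule in plain Java. `LibraryHolder` is a hypothetical stand-in for a library class with a registering static block; it does not model GraalVM build-time initialization or Quarkus substitutions.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitOrderDemo {
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for a library class whose static block registers a module.
    static class LibraryHolder {
        static {
            // Runs once, when the class is first initialized.
            EVENTS.add("static block: module registered");
        }

        static String handle() {
            EVENTS.add("handle() called");
            return "ok";
        }
    }

    /** Records the order of events around the first use of LibraryHolder. */
    public static List<String> run() {
        EVENTS.clear();
        EVENTS.add("before first use");
        LibraryHolder.handle(); // on first use, triggers the static block before the method body
        return new ArrayList<>(EVENTS);
    }
}
```

On the first call to `run()`, the static block's entry lands between "before first use" and "handle() called". Anything that only takes effect after class initialization comes too late to prevent what the static block already did, which is why removing the dependency was chosen over excluding the module.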
Ok, merge is in, try it now (you don't need the excludes anymore FYI). |
@patriot1burke I tried as instructed above.. No luck !! |
Same error? |
@patriot1burke No error at build time... |
@amitkumar7566 I was able to see an xray trace using latest quarkus and with a native binary build. Here's the project I used: https://github.com/patriot1burke/xray-demo README has details. I did see the subsegment I created within a trace call within the AWS console. |
@patriot1burke
Also please find below some analysis:
Hope this helps you with some clarification. |
@amitkumar7566 That helps a lot and narrows down the problem. quarkus-amazon-lambda-http vs. quarkus-amazon-lambda-rest wouldn't make a difference. So it looks like the instrumentor isn't working in native. I worry that the instrumentor requires the JVM and won't work in native. Depends on whether it's a compile-time or runtime thing. I'll look into how XRAY works more. |
@amitkumar7566 I updated my example to use quarkus-amazon-lambda-rest and I'm still able to see the subsegment in the trace. I'll look into getting your example to work. |
@amitkumar7566 I was able to get your example to work with latest Quarkus "main" branch build. No new fixes. I executed both the /books/db and /books/restclient endpoints. Here's a screen shot: and |
@amitkumar7566 Here's the pom. I made some minor changes to point to 999-SNAPSHOT. https://gist.github.com/patriot1burke/5c89ca20b04eae87269845cd540967cd |
@patriot1burke I validated and now I can see sub-segments for both AWS Services (DynamoDB) & External API Call. As this test has been done on 999-SNAPSHOT, below are our questions/concerns:
Our APIs that went to production last year use 2.0.x and 2.3.x, and the ones currently being developed do not have this defect fix either. |
You are using unsupported community versions and not using our product bits. From our product manager: "What needs to be discussed with the customer is what version they should use. The customer is using unsupported extensions in an unsupported configuration so to me it probably makes the most sense that the customer updates to the latest community 2.7.1. The fix will then be added to RHBQ 2.7 when it arrives (planned for April)." |
@patriot1burke so can this issue be closed? From what I understand, the fix has already landed, correct? |
@amitkumar7566 This patch to our xray extension is already within community release 2.7.0.Final. I strongly suggest you upgrade your production systems to either the Red Hat supported bits of Quarkus or to the latest community version of quarkus. |
Thanks Bill. @amitkumar7566: We don't have a Red Hat supported version of Quarkus 2.7.0.Final released yet. But upgrading to community Quarkus 2.7.0.Final would put you in the best position ready for when the Red Hat Build of Quarkus 2.7 product is released. |
@patriot1burke @paulrobinson |
Yes, the fix is available since |
Describe the bug
I am using the "Controller" based approach with RESTEasy. I have added the Quarkus AWS Xray dependency to my pom.xml.
We then instrument calls to AWS services using the below dependency:
com.amazonaws:aws-xray-recorder-sdk-aws-sdk-v2-instrumentor:2.10.0
We also create manual sub-segments for external API calls.
Quarkus sends sub-segment traces to XRay in the JVM build.
In the Native build, however, it does not send any sub-segments, and no error is thrown. The cloudwatch logs say "TRACE_ID missing".
It works for the native build with the AWS Lambda "Handler" based approach, but not with the "Controller" based approach in RESTEasy.
Attaching the screenshots of JVM and Native Build below:
Expected behavior
It should create and show sub-segments in the graph also.
Actual behavior
It does not show the sub-segments in Native build.
How to Reproduce?
https://github.com/amitkumar7566/xray-native-image-test
Output of uname -a or ver
No response
Output of java -version
java 11
GraalVM version (if different from Java)
21.3-java11
Quarkus version or git rev
2.6.1.Final
Build tool (ie. output of mvnw --version or gradlew --version)
maven
Additional information
No response