-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-27981 Add connection and request attributes to slow log #5335
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a closer look later, but wondering if we should be sending the attributes along in their raw proto form instead of parsed strings. This goes back to the same reasons we had when adding the scan payload. But also we don't know what bytes people put into the values, since it's NameBytePairs.
If we send them back as NameBytePairs we can present to the client as Map<String, byte[]>
and let them parse how they want.
This applies to all three attribute types
Good idea, should be a pretty easy change to make |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
On second thought I'm wondering if we should leave request attributes out of this jira for now. We added scan previously, and I think we might want to add at least Get if not also Mutate (with payload removed). If we did that, those objects would include the request attributes. I know we haven't figured out what to do about multi, but that still applies here since you didnt add request attributes for that. WDYT? |
Just to clarify, do you mean operation attributes? If yes, I think that’s a good point. I’d be interested in adding Get/Mutate requests to the payload in a follow up PR. Still not sure exactly what to do about multi but we could figure that out too |
oh yes, sorry, operation attributes |
Perfect, yeah I’m in agreement, it’d be silly to have them in two places and it’d be nice to have the whole operation. I can strip the operation attribute logic out of this PR next time I’m at my keyboard |
96b4ce8
to
6758f70
Compare
I updated the commit to change two major things:
|
Nice, thanks! Now we just need to expose them with getters on the client side for the full end to end |
We might actually want to expose them in string form in the slow log UI though. I would guess many people use slow log without querying the api like we do. |
6758f70
to
30de63c
Compare
💔 -1 overall
This message was automatically generated. |
I opened up https://issues.apache.org/jira/browse/HBASE-28002 to handle operation payloads |
if (slowLogPayload.getRequestAttributes().isEmpty()) { | ||
jsonObj.remove("requestAttributes"); | ||
} else { | ||
jsonObj.add("requestAttributes", | ||
gson.toJsonTree(deserializeAttributes(slowLogPayload.getRequestAttributes()))); | ||
} | ||
if (slowLogPayload.getConnectionAttributes().isEmpty()) { | ||
jsonObj.remove("connectionAttributes"); | ||
} else { | ||
jsonObj.add("connectionAttributes", | ||
gson.toJsonTree(deserializeAttributes(slowLogPayload.getConnectionAttributes()))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If present, this will spit out a blob like:
{
"startTime": 1,
"processingTime": 2,
"queueTime": 3,
"responseSize": 4,
"blockBytesScanned": 5,
"multiGetsCount": 6,
"multiMutationsCount": 7,
"requestAttributes": {
"r": "1"
},
"connectionAttributes": {
"c": "1"
}
}
public static Map<String, String> deserializeAttributes(Map<String, byte[]> attributes) { | ||
return attributes.entrySet().stream() | ||
.collect(Collectors.toMap(Map.Entry::getKey, entry -> Bytes.toString(entry.getValue()))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this public because we'll want to reuse this logic when deserializing attributes for the RS UI's Operation Details page
@@ -25,6 +25,7 @@ | |||
import="org.apache.hadoop.util.StringUtils" | |||
import="org.apache.hadoop.hbase.regionserver.HRegionServer" | |||
import="org.apache.hadoop.hbase.HConstants" | |||
import="org.apache.hadoop.hbase.client.OnlineLogRecord.deserializeAttributes" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested the changes in this class yet. My plan at the moment is to load this branch onto a test cluster and manually validate that the RS UI is working as intended. Please let me know if there's a better testing approach for these jsp changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know whether non-pretty-printed JSON is really what we want in the UI. We could create a String like we do for slow log params, but I think that's often hard to read too. At least with JSON one can copy & paste into a pretty printer if they really want to inspect in details.
I think there might be a larger/separate ticket here regarding a refactor of the operation details table — for example:
- I don't think the
Call Details
section is particularly useful. Rather than a toString representation of the request type, which can already be inferred by method type, we could consider making this a conglomerate payload of, for example, both request attributes, connection attributes, and any other call metadata that's appropriate. - we should omit table headers that are always empty in the given page. This is easier said than done and relatively low value I suppose, but it's not easy to read the current UI in my opinion.
- we should omit
MultiX
columns that are not relevant to the payloads at hand. This, like omitting empty columns, wouldn't help a use-case with a wide variation in their slow logs' access patterns. I think this could be an argument for having separate, more readable, tables for each method type
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
1616d2e
to
5a5fe19
Compare
OnlineLogRecord toJson example: {
"startTime": 1690836447064,
"processingTime": 1,
"queueTime": 1,
"responseSize": 4,
"blockBytesScanned": 0,
"clientAddress": "127.0.0.1:49570",
"serverClass": "HRegionServer",
"methodName": "Get",
"callDetails": "Get(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$GetRequest)",
"param": "region\u003d doot,,1690823041624.08b6ff7707ce046b301ab2c397ea8060., row\u003d 0",
"userName": "rmattingly",
"requestAttributes": {
"attributeType": "requestAttribute",
"another": "one",
"correlationId": "123"
}
} |
5a5fe19
to
3c22285
Compare
💔 -1 overall
This message was automatically generated. |
227399f
to
6711c9a
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
f139824
to
46a6c1a
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think this overall looks good, but has 1 big problem -- As fixed in #5366, request attributes can also become corrupted if they aren't deserialize within the context of a request. Since we are publishing these to an async Lmax Disruptor, it's possible they will sometimes happen fast enough but not always.
So within that in mind, we should make sure to call rpcCall.getRequestAttributes() in the context of the request. The best place for that might be in constructing the RpcLogDetails in RpcServer -- either by passing it into the constructor or by making that call in the constructor. Please add a comment to say why this is important
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Signed-off-by: Bryan Beaudreault <[email protected]>
…#5335) Signed-off-by: Bryan Beaudreault <[email protected]>
…#5335) Signed-off-by: Bryan Beaudreault <[email protected]>
…#5335) Signed-off-by: Bryan Beaudreault <[email protected]>
…#5335) Signed-off-by: Bryan Beaudreault <[email protected]>
In HBASE-27657 we added support for request and connection attributes. By pushing this metadata to the slow log we can provide more flexible and meaningful visibility for those debugging via the slow log.
cc @bbeaudreault @hgromer @eab148 @bozzkar