trim quotes for collected metadata #19

Merged
merged 1 commit into from
Oct 25, 2023

Conversation

lu-zhengda
Collaborator

This PR trims the quotes from quoted identifiers before collecting metadata.
This change is necessary because metadata is often processed as tags: a table name such as "public"."users", with the double quotes kept in the string, would be processed as the tag table:_public_._users_.

Worth mentioning: we only want to trim the quotes for collected metadata, and retain the exact format in obfuscated and normalized queries.

(Per best practice, tags may contain alphanumerics, underscores, minuses, colons, periods, and slashes. Other characters are converted to underscores.)
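
To make the motivation concrete, here is a minimal sketch of how a quoted table name turns into a tag with and without trimming. `sanitizeTag` is a hypothetical helper written only to mimic the tag rule quoted above; it is not part of this PR or of any tagging library. The `trimQuotes` signature and its first line are taken from the diff, but the `return` statement is an assumption, since the excerpt shows only the start of the function.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeTag is a hypothetical stand-in for the tag rule quoted above:
// keep alphanumerics, underscores, minuses, colons, periods, and slashes,
// and convert every other character to an underscore.
func sanitizeTag(tag string) string {
	var b strings.Builder
	for _, r := range tag {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			b.WriteRune(r)
		case strings.ContainsRune("_-:./", r):
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

// trimQuotes removes every occurrence of the opening and closing delimiters.
// The signature and the replacer line come from the diff; the return
// statement is assumed.
func trimQuotes(input string, delim string, closingDelim string) string {
	replacer := strings.NewReplacer(delim, "", closingDelim, "")
	return replacer.Replace(input)
}

func main() {
	table := `"public"."users"`

	// Without trimming, each double quote becomes an underscore.
	fmt.Println(sanitizeTag("table:" + table)) // table:_public_._users_

	// With trimming, the tag carries only the identifier itself.
	fmt.Println(sanitizeTag("table:" + trimQuotes(table, `"`, `"`))) // table:public.users
}
```

Only the collected metadata value passes through `trimQuotes` in this sketch; the query text itself is left untouched, which matches the intent stated in the description.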

@lu-zhengda lu-zhengda marked this pull request as ready for review October 25, 2023 02:36
@lu-zhengda lu-zhengda requested a review from a team as a code owner October 25, 2023 02:36
@@ -240,3 +240,8 @@ func replaceDigits(input string, placeholder string) string {

	return builder.String()
}

func trimQuotes(input string, delim string, closingDelim string) string {
	replacer := strings.NewReplacer(delim, "", closingDelim, "")

We could probably move the replacer to a global variable initialized via func init()? Not a big deal but it would save on the cost of building the replacer instance.

Collaborator Author

@lu-zhengda lu-zhengda Oct 25, 2023

I thought about that, but the replacer is somewhat dynamic: it depends on the opening and closing delimiters, such as " or [ and ]. One way to do that is to create all possible replacers as global variables and look one up by delimiter.

Oh sorry, I missed that. Yeah, I guess you could keep a mapping of replacers by delimiter, but it might become more tedious than it's worth. I think it's fine as-is.
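
For reference, the alternative discussed in this thread (and not adopted in the PR) could look roughly like the sketch below. The `delimPair` type, the `quoteReplacers` map, and the fallback path are all hypothetical names and choices made for illustration; the two delimiter styles listed are the ones mentioned in the comment above, and a real implementation would need whichever set the library actually supports.

```go
package main

import (
	"fmt"
	"strings"
)

// delimPair identifies a quoting style by its opening and closing delimiter.
// Hypothetical type, introduced only for this sketch.
type delimPair struct {
	opening, closing string
}

// quoteReplacers is built once at package initialization, so the cost of
// constructing each replacer is paid a single time. A *strings.Replacer is
// safe for concurrent use, so sharing these values across goroutines is fine.
var quoteReplacers = map[delimPair]*strings.Replacer{
	{`"`, `"`}: strings.NewReplacer(`"`, ""),
	{"[", "]"}: strings.NewReplacer("[", "", "]", ""),
}

// trimQuotes looks up a prebuilt replacer for the delimiter pair and falls
// back to building one on the fly for pairs that were not registered.
func trimQuotes(input, delim, closingDelim string) string {
	if r, ok := quoteReplacers[delimPair{delim, closingDelim}]; ok {
		return r.Replace(input)
	}
	return strings.NewReplacer(delim, "", closingDelim, "").Replace(input)
}

func main() {
	fmt.Println(trimQuotes(`"public"."users"`, `"`, `"`)) // public.users
	fmt.Println(trimQuotes("[dbo].[users]", "[", "]"))    // dbo.users
}
```

The extra type, map, and fallback are the kind of bookkeeping the reviewers judged not worth the saving, which is why the merged version builds the replacer per call.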

@lu-zhengda lu-zhengda merged commit 7c204fb into main Oct 25, 2023
3 checks passed
@lu-zhengda lu-zhengda deleted the zhengda.lu/trim-quotes branch October 25, 2023 14:11