Implement versioning for Assess Migration callhome payload #2234

Merged
merged 11 commits into from
Jan 23, 2025

@sanyamsinghal sanyamsinghal commented Jan 22, 2025

Describe the changes in this pull request

  • The payload version starts at 1.0.
  • With this version we also send MigrationComplexityExplanation and a struct of assessment issues, instead of separate fields for each category.
  • For each assessment issue we obfuscate sensitive information, which can appear in:
    • Description
    • ObjectName
    • the issue Suggestion (which eventually becomes part of the assessment issue's description)
  • https://yugabyte.atlassian.net/browse/DB-14988
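
A minimal sketch of what a versioned payload could look like, assuming a version field carried alongside the explanation and the issues. The type names, field names, and JSON keys here are hypothetical and are not taken from the yb-voyager code; the point is only that a consumer (such as a dashboard query) can read the version first and branch on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical version constant; the PR states only that versioning starts at 1.0.
const assessMigrationPayloadVersion = "1.0"

// callhomeIssue is a hypothetical, trimmed-down assessment issue record.
type callhomeIssue struct {
	Category   string `json:"category"`
	Type       string `json:"type"`
	Impact     string `json:"impact"`
	ObjectType string `json:"object_type"`
}

// assessMigrationPayload is a hypothetical payload shape: one version field,
// the complexity explanation, and a single list of issues.
type assessMigrationPayload struct {
	PayloadVersion                 string          `json:"payload_version"`
	MigrationComplexityExplanation string          `json:"migration_complexity_explanation"`
	AssessmentIssues               []callhomeIssue `json:"assessment_issues"`
}

// newPayload stamps every payload with the current version.
func newPayload(explanation string, issues []callhomeIssue) assessMigrationPayload {
	return assessMigrationPayload{
		PayloadVersion:                 assessMigrationPayloadVersion,
		MigrationComplexityExplanation: explanation,
		AssessmentIssues:               issues,
	}
}

// encodePayload serializes the payload to JSON.
func encodePayload(p assessMigrationPayload) ([]byte, error) {
	return json.Marshal(p)
}

// payloadVersionOf decodes only the version field, so a consumer can pick
// the right parsing logic before touching the rest of the payload.
func payloadVersionOf(raw []byte) (string, error) {
	var header struct {
		PayloadVersion string `json:"payload_version"`
	}
	if err := json.Unmarshal(raw, &header); err != nil {
		return "", err
	}
	return header.PayloadVersion, nil
}

func main() {
	issues := []callhomeIssue{{Category: "unsupported_features", Type: "EXAMPLE_ISSUE", Impact: "LEVEL_1", ObjectType: "INDEX"}}
	raw, err := encodePayload(newPayload("example explanation", issues))
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	version, err := payloadVersionOf(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("payload version:", version)
}
```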

Describe if there are any user-facing changes

The callhome payload has changed.
So the dashboard queries need to be updated for payload version 1.0 (create a ticket for this?)

How was this pull request tested?

Tested manually on a local callhome server.
cc @shubham-yb for second round of testing before

Does your PR have changes that can cause upgrade issues?

Component                        Breaking changes?
MetaDB                           Yes/No
Name registry json               Yes/No
Data File Descriptor Json        Yes/No
Export Snapshot Status Json      Yes/No
Import Data State                Yes/No
Export Status Json               Yes/No
Data .sql files of tables        Yes/No
Export and import data queue     Yes/No
Schema Dump                      Yes/No
AssessmentDB                     Yes/No
Sizing DB                        Yes/No
Migration Assessment Report Json Yes/No
Callhome Json                    Yes/No
YugabyteD Tables                 Yes/No
TargetDB Metadata Tables         Yes/No

@sanyamsinghal sanyamsinghal self-assigned this Jan 23, 2025
@sanyamsinghal sanyamsinghal force-pushed the sanyam/callhome-assessment branch from 4fbd267 to b0a2b9c Compare January 23, 2025 05:52
@sanyamsinghal sanyamsinghal marked this pull request as ready for review January 23, 2025 06:48
yb-voyager/cmd/assessMigrationCommand.go
unsupportedDatatypesList := lo.Map(assessmentReport.UnsupportedDataTypes, func(datatype utils.TableColumnsDataTypes, _ int) string {
	return datatype.DataType
})
explanation, err := buildMigrationComplexityExplanation(source.DBType, assessmentReport, "")
Collaborator

Why are we building again? It should already be available in the assessment report, right?

Collaborator Author

The reason is that we build this twice, once each for the JSON and HTML reports, at report-generation time.
Whichever is built later is what ends up stored in the struct's field.

So, just to be on the safer side, I am building it again.

Collaborator

Ah, right. See, here's another reason why having different values for the same field (depending on report type) is not a good idea. AssessmentReport.MigrationComplexityExplanation should be properly defined in our struct (without including any html/json specific logic).

  • If we want any HTML specific logic, it should be in the template.
  • If we want any json-specific logic, it should be in a custom marshaller.
  • If we want any callhome specific logic, it should be in the callhome layer (this function)

Pls add a comment here explaining why we're re-building. And let's clean this up soon
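
One way the "custom marshaller" suggestion could look, as a rough sketch only. The `report` type and the whitespace-collapsing rule are hypothetical stand-ins for whatever JSON-specific logic is actually needed; the idea illustrated is that the struct field holds a single canonical value and any JSON-only adjustment lives in `MarshalJSON`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// report is a hypothetical stand-in for the assessment report struct.
// The explanation field holds one canonical value for all consumers.
type report struct {
	MigrationComplexity            string
	MigrationComplexityExplanation string
}

// MarshalJSON applies a JSON-only adjustment (here, hypothetically,
// collapsing whitespace in the explanation) without changing the struct.
func (r report) MarshalJSON() ([]byte, error) {
	// plain has the same fields as report but no methods, so marshalling
	// it does not call MarshalJSON again.
	type plain report
	p := plain(r)
	p.MigrationComplexityExplanation = strings.Join(strings.Fields(p.MigrationComplexityExplanation), " ")
	return json.Marshal(p)
}

// toJSON returns the JSON form of the report as a string.
func toJSON(r report) (string, error) {
	raw, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	r := report{MigrationComplexity: "HIGH", MigrationComplexityExplanation: "  two\n  lines "}
	out, err := toJSON(r)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```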

Collaborator Author

Yeah, agreed. The only limitation that stopped me was that we can pass only one struct to the template.
In the original assessment report struct, I wanted to avoid adding a new child struct just for the HTML report case.

Collaborator

Ah I see. It might not be such a bad idea to have a child struct, actually. But alternatively, you could just have all the fields in the struct directly, and then mark them as json:"-" if you don't want them in the JSON.
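
A small sketch of the `json:"-"` idea, under stated assumptions: the struct and the `ExplanationHTML` field are hypothetical, and `text/template` is used here only to keep the example short (a real HTML report would more likely go through `html/template`). The single struct is passed to the template, which can read the HTML-only field, while `encoding/json` skips it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// assessmentReport is a hypothetical, simplified report struct.
type assessmentReport struct {
	MigrationComplexity            string `json:"MigrationComplexity"`
	MigrationComplexityExplanation string `json:"MigrationComplexityExplanation"`
	// ExplanationHTML is a hypothetical HTML-only field; the "-" tag
	// keeps it out of the JSON output.
	ExplanationHTML string `json:"-"`
}

var reportTemplate = template.Must(template.New("report").Parse(`<div>{{.ExplanationHTML}}</div>`))

// renderHTML executes the template against the single report struct.
func renderHTML(r assessmentReport) (string, error) {
	var sb strings.Builder
	if err := reportTemplate.Execute(&sb, r); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// toJSON marshals the report; ExplanationHTML is omitted because of its tag.
func toJSON(r assessmentReport) (string, error) {
	raw, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	r := assessmentReport{
		MigrationComplexity:            "HIGH",
		MigrationComplexityExplanation: "3 issues",
		ExplanationHTML:                "<b>3</b> issues",
	}
	js, err := toJSON(r)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	html, err := renderHTML(r)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(js)
	fmt.Println(html)
}
```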

yb-voyager/cmd/assessMigrationCommand.go (outdated)
objects = lo.Map(feature.Objects, func(o ObjectInfo, _ int) string {
	return o.ObjectName
})
for _, sensitiveDescription := range descriptionsIncludingSensitiveInformationToCallhome {
Collaborator

We had discussed that it does not make sense to send description, right? 🤔
Do we see any value in sending "description"? It just contains a more verbose explanation of the issue type + sensitive object names. It's more for the user, not helpful when collecting data.

Collaborator Author

No, remember we discussed that some of the info in it might be required.
Column names and object names are not required,
but the object type or datatype (citext, etc.) can also appear there, and that can be useful.
Although this might not be a good way to send that type of info to callhome.

Comment on lines 215 to 223
for _, sensitiveDescription := range descriptionsIncludingSensitiveInformationToCallhome {
	if sensitiveDescription == UNSUPPORTED_PG_SYNTAX_ISSUE_REASON && strings.HasPrefix(obfuscatedIssue.Description, sensitiveDescription) {
		obfuscatedIssue.Description = sensitiveDescription
	} else {
		match, err := utils.MatchesFormatString(sensitiveDescription, obfuscatedIssue.Description)
		if match {
			obfuscatedIssue.Description, err = utils.ObfuscateFormatDetails(sensitiveDescription, obfuscatedIssue.Description, constants.OBFUSCATE_STRING)
		}
		if err != nil {
			log.Errorf("error while matching issue description with sensitive descriptions: %v", err)
			obfuscatedIssue.Description = constants.OBFUSCATE_STRING
		}
Contributor

I think we don't need to send the description at all in the assessment issues, as issue.Name/issue.Type should be sufficient. In cases where that is not sufficient (index on complex datatypes, etc.), we should not worry about sending it for now, since we were not sending it before either; instead, we should separate out the issues so that issue.Type/Name differentiates them.
cc: @makalaaneesh

Contributor

We did this for analyze because we didn't want to regress there by sending less information, but I think here we can go without descriptions as well.

Collaborator Author

The only problem is that we won't have info such as the index access method or the unsupported datatype in the issue... anywhere.

The description, after obfuscation of the object name, can even act as a key for cases like an index on the complex datatype 'citext'.
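
To illustrate the idea of an obfuscated description acting as a key, here is a rough sketch of format-string obfuscation. It is not the implementation of `utils.MatchesFormatString` or `utils.ObfuscateFormatDetails`, whose internals are not shown in this PR; the function below is hypothetical, handles only `%s` verbs, and the description text is made up.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// obfuscateFormatDetails is a hypothetical helper: if text was produced by
// filling the %s placeholders of format, it returns format with every %s
// replaced by mask, plus true. Otherwise it returns text unchanged and false.
// Only the %s verb is handled in this sketch.
func obfuscateFormatDetails(format, text, mask string) (string, bool) {
	// Escape the literal pieces of the format and let each %s match anything.
	parts := strings.Split(format, "%s")
	for i, part := range parts {
		parts[i] = regexp.QuoteMeta(part)
	}
	pattern, err := regexp.Compile("(?s)^" + strings.Join(parts, ".*") + "$")
	if err != nil {
		return text, false
	}
	if !pattern.MatchString(text) {
		return text, false
	}
	return strings.ReplaceAll(format, "%s", mask), true
}

func main() {
	// Made-up format: the column name is sensitive, the datatype is a literal.
	format := "index on column '%s' with datatype citext"
	description := "index on column 'users.email' with datatype citext"

	obfuscated, ok := obfuscateFormatDetails(format, description, "XXX")
	fmt.Println(ok, obfuscated)
}
```

In this sketch the datatype survives only because it is a literal part of the format string; a datatype filled in through a placeholder would be masked along with the object name.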

Collaborator

@makalaaneesh makalaaneesh Jan 23, 2025

Agreed but based on what we had discussed earlier, the solution would be for us to define separate issues (index-not-supported-json, index-not-supported-citext, etc).

Only in cases where that's not possible (for instance, extensions), we could somehow pass that information along (using the TYPE field for example)

Impact: issue.Impact,
ObjectType: issue.ObjectType,
ObjectName: constants.OBFUSCATE_STRING,
SqlStatement: constants.OBFUSCATE_STRING, // TODO(future): we can obfuscate sensitive info in SQL statement
Contributor

I am not sure whether it would be easy to obfuscate the SQL statement, or whether there is any need to do that.

Collaborator Author

@sanyamsinghal sanyamsinghal Jan 23, 2025

will remove SqlStatement field, and keep ObjectName since we need that for extensions case

Comment on lines 211 to 209
// object name(i.e. extension name here) might be qualified with schema so just taking the last part
obfuscatedIssue.ObjectName = strings.Split(issue.ObjectName, ".")[len(strings.Split(issue.ObjectName, "."))-1]
Contributor

Generally the extension name is not qualified, as the CREATE EXTENSION DDL has a separate option for specifying the schema name, and I believe we are not using the schema name.

CREATE EXTENSION IF NOT EXISTS citext WITH SCHEMA public;
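
If a defensive "take the last part" step is kept anyway, it could be written with a single scan instead of splitting twice. This is only a sketch with a hypothetical helper name; it assumes a plain dot-separated name and does not handle quoted identifiers that themselves contain dots.

```go
package main

import (
	"fmt"
	"strings"
)

// unqualifiedName is a hypothetical helper that returns the part of name
// after the last dot, or name itself when it is not qualified.
func unqualifiedName(name string) string {
	if i := strings.LastIndex(name, "."); i >= 0 {
		return name[i+1:]
	}
	return name
}

func main() {
	fmt.Println(unqualifiedName("public.citext"))
	fmt.Println(unqualifiedName("citext"))
}
```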

Comment on lines 205 to 206
DocsLink: issue.DocsLink,
MinimumVersionFixedIn: issue.MinimumVersionFixedIn,
Contributor

Maybe we can also skip sending these two as they are static information.

return len(constructs)
})
// allowing object name for unsupported extension issue type
if issue.Type == UNSUPPORTED_EXTENSION_ISSUE_TYPE {
Collaborator

As part of https://yugabyte.atlassian.net/browse/DB-14476, we had decided to create separate items for each extension type. Let's keep that behavior as is. Shall we modify the Type field to also include the extension name?

Collaborator

@makalaaneesh makalaaneesh left a comment

LGTM!

@sanyamsinghal sanyamsinghal force-pushed the sanyam/callhome-assessment branch from e8534c7 to b5b7c5d Compare January 23, 2025 10:39
@sanyamsinghal sanyamsinghal merged commit ae842f5 into main Jan 23, 2025
68 checks passed
@sanyamsinghal sanyamsinghal deleted the sanyam/callhome-assessment branch January 23, 2025 10:49

3 participants