-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Cassandra db schema on session initialization #5922
base: main
Are you sure you want to change the base?
Create Cassandra db schema on session initialization #5922
Conversation
plugin/storage/cassandra/factory.go
Outdated
datacenter string | ||
trace_ttl int | ||
dependencies_ttl int | ||
keyspace string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should already be defined in the main config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about this config: jaeger/cmd/jaeger/config-cassandra.yaml ? How would I know what all config key-value pairs are already defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yurishkuro , I needed some help here. Is this: jaeger/cmd/jaeger/internal/all-in-one.yaml the example for the config file which would be used in v2? If yes, under what section should I expect the cassandra specific configs to be put? I couldn't find any example for this in the current code base.
Just to get the complete picture, would the schema configs I am adding look something like this:
schema:
datacenter: <datacenterName>
keyspace: <keyspace>
.......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd/jaeger/config-cassandra.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if the current place of defining it works?
plugin/storage/cassandra/factory.go
Outdated
return result | ||
} | ||
|
||
func constructQueriesFromTemplateFiles(session cassandra.Session, params *StorageConfigParams) ([]cassandra.Query, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is cassandra.Session not able to execute multiple queries at once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about running individual queries in parallel or a batch query option? I didn't try executing parallel queries.
69275fb
to
bcad4c0
Compare
cmd/jaeger/config-cassandra.yaml
Outdated
@@ -24,6 +24,12 @@ extensions: | |||
cassandra: | |||
schema: | |||
keyspace: "jaeger_v1_dc1" | |||
datacenter: "test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add these values somewhere in create default config function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to func DefaultConfiguration
. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case do you need the values to still be defined here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I was thinking this acts as a quick reference to how the config file should look. I will remove this since this may not be required.
pkg/cassandra/config/config.go
Outdated
@@ -51,13 +51,26 @@ type Connection struct { | |||
ProtoVersion int `mapstructure:"proto_version"` | |||
} | |||
|
|||
type SchemaConfig struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add comments to each field with sample values/units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
} | ||
|
||
for _, query := range casQueries { | ||
err := query.Exec() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a corresponding table already exists, will this be a no-op or will it update the settings in the table? We need to allow users to be able to customize the settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the table creation queries use: CREATE TABLE IF NOT EXISTS
, so this will be a no-op.
f9dd90e
to
90368b1
Compare
…ution for initialize database Signed-off-by: Alok Kumar Singh <[email protected]>
90368b1
to
afc786d
Compare
Signed-off-by: Alok Kumar Singh <[email protected]>
Signed-off-by: Alok Kumar Singh <[email protected]>
Signed-off-by: Alok Kumar Singh <[email protected]>
//Datacenter is the name for network topology | ||
Datacenter string `mapstructure:"datacenter" valid:"optional"` | ||
// TraceTTL is Time To Live (TTL) for the trace data in seconds | ||
TraceTTL int `mapstructure:"trace_ttl" valid:"optional"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the type here be time.Duration
so that the user could specify 72h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming that there should be a validation to make sure that user can't specify something like "ms"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's the point of using strong type - the business logic does not need to worry about validations, they should happen separately during parsing. I believe if you simply change int
to time.Duration
and use 10ms
as a value in YAML, the parser will do the right thing.
// CasVersion is version of cassandra used | ||
CasVersion int `mapstructure:"cas_version" valid:"optional"` | ||
// CompactionWindow of format "^[0-9]+[mhd]$" tells the compaction window of the db | ||
CompactionWindow string `mapstructure:"compaction_window" valid:"optional"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is defining a time interval? Can we then also use time.Duration type?
pkg/cassandra/config/config.go
Outdated
// CompactionWindowSize is the numberical part of CompactionWindow. Extracted from CompactionWindow | ||
CompactionWindowSize int `mapstructure:"compaction_window_size" valid:"optional"` | ||
// CompactionWindowUnit is the time unit of the CompactionWindow. Extracted from CompactionWindow | ||
CompactionWindowUnit string `mapstructure:"compaction_window_unit" valid:"optional"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we use time.Duration type we will not need the unit separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, Size and Unit seems to be derived fields, if so they should not be declared in the Config struct. You can create a different struct for use with Go templates, e.g.
type TemplateParams struct {
config.Schema // embed config
// add other derived fields
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added TemplateParams
with embedded config.Schema
and other derived fields.
@@ -0,0 +1,84 @@ | |||
-- There are total 9 queries here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is "test"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Improved test and changed name in recent commit] The test is to check if the extracted query strings are same as the expected query strings.
traceTTLMinutes := cfg.TraceTTL / 60 | ||
|
||
cfg.CompactionWindowSize = (traceTTLMinutes + 30 - 1) / 30 | ||
cfg.CompactionWindowUnit = "m" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
below you use "MINUTES", is "m" also valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, m, h and d are valid.
Signed-off-by: Alok Kumar Singh <[email protected]>
Signed-off-by: Alok Kumar Singh <[email protected]>
Signed-off-by: Alok Kumar Singh <[email protected]>
Signed-off-by: Alok Kumar Singh <[email protected]>
@@ -0,0 +1,43 @@ | |||
-- There are total 4 queries here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the point of test schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of testing on all the queries and manually creating expected result for all of them, I reduced the number of queries to ease the creation of expected result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that doesn't really make sense to me. We can either test that the template is correct or that the code using the template is correct. You're not doing the former by having another template, and the latter you can do with the primary template. We don't need to validate that the output of running template is "as expected", that's like testing the Go template package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to validate that the output of running template is "as expected", that's like testing the Go template package.
We are also removing comments and constructing individual query strings out of it by iterating over lines in the template output. The test checks this "individual query string" construction, which not just involves using the template package.
and the latter you can do with the primary template.
Sure, I think I can write an integration test for this.
Create Schema (if not present) on Session Initialization
Once a session is established with cassandra db, the added code parses the template file containing queries for creating schema and create queries out of it. Post which it executes those queries to create the required types and tables.
Which problem is this PR solving?
Resolves #5797
Description of the changes
How was this change tested?
Checklist
jaeger
:make lint test
jaeger-ui
:yarn lint
andyarn test