-
Notifications
You must be signed in to change notification settings - Fork 749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support MySql driver #2356
Support MySql driver #2356
Conversation
@VeronikaSolovei9 could you please take a look at this mysql implementation? We have a huge pain with administering pgsql, also in the future we would like to contribute support for nosql databases like mongodb or similar. |
@victorshevtsov Thank you for the PR. The team is a bit busy at the moment but we will definitely give this a look. |
@bsardo how long usually does it take to review a PR in here? Can our company do something about this? |
Hi @holms, @victorshevtsov, sorry for the delay. We'll start reviewing this today and will be in touch shortly with feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for putting this PR up, you did a great job! Changes you made totally make sense with current fetchers architecture.
I even installed and set up local MySQL server to test and review your PR :)
I left some comments where I had hick ups while testing it, those are minor.
To make these changes to production we will need to reconfigure something on our side. We need to discuss it with the team.
|
||
var queryArgs []interface{} | ||
if e.cfg.DBDriver == "mysql" { | ||
queryArgs = append(queryArgs, e.lastUpdate, e.lastUpdate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queryArgs
expects 1 parameter, not 2. If I modify it to queryArgs = append(queryArgs, e.lastUpdate)
all works good. There is no reason to pass two e.lastUpdate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason of passing the same parameter twice (in case of MySql) is in the difference of how it processes the parameters. Here are poll_for_updates
queries for both cases:
mysql:
SELECT id, requestData, 'request' AS type FROM stored_requests WHERE last_updated > ?
UNION ALL
SELECT id, impData, 'imp' AS type FROM stored_imps WHERE last_updated > ?
postgres:
SELECT id, requestData, 'request' AS type FROM stored_requests WHERE last_updated > $1
UNION ALL
SELECT id, impData, 'imp' AS type FROM stored_imps WHERE last_updated > $1
Postgres references parameter by its number in the array of passed parameters, so one is enough.
MySql takes parameter from the passed array based on the number of question marks in the query. In other words, the number of parameters passed must equal to the number of question marks in the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the problem now.
May I discuss this with a team to get ideas how this can be solved and I'll get back to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VeronikaSolovei9 maybe any progress? we need mysql support a lot in here. is there's anyway we could speed up a process?
config/stored_requests.go
Outdated
if wildcard == "$" { | ||
if !strings.Contains(cfg.Query, "$1") || strings.Contains(cfg.Query, "$3") { | ||
errs = append(errs, fmt.Errorf("%s: database.poll_for_updates.query must contain exactly one wildcard", section)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This validation, specifically line 449 leads to error with a valid sql query. I don't think this condition should exist. For example my poll_for_updates.query
looks like
"SELECT uuid AS id, config as data, 'request' as dataType FROM videoconfig_config WHERE updated_at > ?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original validation at line #454 (for postgres) makes sure that the query has exactly one
parameter.
The validation at line #449 does the same but for mysql but in this case the query should have two
parameters.
See my notes here
I see here a mistake. Line #454 should be:
if !strings.Contains(cfg.Query, "$1") || strings.Contains(cfg.Query, "$2") {
I will change it and commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, thank you for this fix
Fixed the change mistakenly made by the previous commit.
Hi @VeronikaSolovei9. There is a chance that I'm using wrong queries in |
We have different poll for update queries based on data we want to fetch. We have I understand the problem. Need discuss it with the team. |
// Database configures Fetchers and EventProducers which read from a Database. | ||
// Fetchers are in stored_requests/backends/db_fetcher/fetcher.go | ||
// EventProducers are in stored_requests/events/database | ||
Database DatabaseConfig `mapstructure:"database"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks good, however I think we need to be backwards compatible. Hosts with stored requests housed in a database will have the postgres
configuration so this would be a breaking change as currently written.
I'd like to see an approach here that we've used quite a bit where we map the old field to the new field if the old field is present and the new field isn't defined. If both the old and new field are defined, we ignore the old field but issue a warning.
Take a look at config/config.go#migrateConfigPurposeOneTreatment
or config/config.go#migrateConfigSpecialFeature1
for reference.
This approach will allow us to give hosts some time to make the necessary adjustments to their config. We will eventually remove the backwards compatibility logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I will follow the proposed approach.
Sorry for the delay and thank you for your interest in contributing. The team has been busy and needed to discuss this internally. We’ve had a chance to review and these are what we consider to be the objectives of this PR:
I’d like to preface this by pointing out that we intend to refactor fetchers in the future so we aren’t asking you to refactor the package, but rather make some design choices that are steps in the right direction as part of this PR. Regarding items 2 and 3, PBS currently only supports a single query parameter, the last updated at timestamp, and by nature of how query parameters are expressed in a postgres query, it supports an unlimited number of occurrences of that query parameter.
Regarding item 4, conceptually it might make sense here to create an interface implemented by a MySQL object and a Postgres object with the objects encapsulating any query parsing and construction, and potentially any other logic (
The instantiated object could then be used downstream in all database operations with no concern for which type of database it is interacting with under the hood. This approach should allow for heavy reuse of the core fetcher logic and is extensible as we would theoretically only need to implement the interface when adding support for other databases in the future. I'd like to hear your thoughts. Thanks again for your patience. |
config/stored_requests.go
Outdated
if wildcard == "?" { | ||
if strings.Count(cfg.Query, "?") != 2 { | ||
errs = append(errs, fmt.Errorf("%s: database.poll_for_updates.query must contain exactly two wildcards", section)) | ||
} | ||
} | ||
if wildcard == "$" { | ||
if !strings.Contains(cfg.Query, "$1") || strings.Contains(cfg.Query, "$2") { | ||
errs = append(errs, fmt.Errorf("%s: database.poll_for_updates.query must contain exactly one wildcard", section)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this introduces a lack of flexibility in how hosts are able to structure their data. Ideally the number of parameters would not be constrained based on the driver, much less at all.
config/stored_requests.go
Outdated
if cfg.Driver == "mysql" { | ||
return cfg.connStringMySql() | ||
} | ||
|
||
if cfg.Driver == "postgres" { | ||
return cfg.connStringPosgres() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better approach that would limit the number of conditional checks on drivers would be to create a database connection interface that is implemented by a mysql object and a postgres object. These objects would then have driver-specific methods on them that build the connection string and build the ID list maker.
@victorshevtsov I noticed you pulled from master and you're getting a test failure now due to a broken test. You can merge with master again to pick up the test fix. |
@bsardo I merged the with master and it fixed the failing test. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The migration code looks good. Just a couple of minor comments on that.
config/config.go
Outdated
if !v.IsSet(driverField) { | ||
glog.Warning(fmt.Sprintf("%s is not set, using default (postgres) ", driverField)) | ||
v.Set(driverField, "postgres") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it makes sense to only ever show this warning if we believe the host intends to use a database for stored requests to reduce noise and confusion on startup. We could lean on the old postgres dbname
config field to make that determination:
if !v.IsSet(driverField) && v.IsSet("postgres.connection.dbname") {
glog.Warning(fmt.Sprintf("%s is not set, using default (postgres) ", driverField))
v.Set(driverField, "postgres")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
config/config_test.go
Outdated
`), | ||
} | ||
|
||
storedReqestsTests := []struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: typo - storedRequestsTests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
config/config_test.go
Outdated
migrateConfigDatabaseConnection(v) | ||
|
||
if len(tt.config) > 0 { | ||
assert.Equal(t, tt.want_connection_dbname, v.Get("stored_requests.database.connection.dbname").(string), tt.description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: instead of using type assertions you can call the viper methods v.GetString
and v.GetInt
instead of v.Get
with the type assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
1. Use named parameters in queries for each DB driver 2. Allow any number of query parameters for each DB driver 3. Lay foundation for support of other databases
@victorshevtsov sorry for the delay. This is still on my radar. I'll give this a closer look tomorrow and come back with feedback. |
@bsardo, @VeronikaSolovei9 I've pushed all the required changes accordingly to the review and wrote some comments with explanations where no changes needed. I really appreciate your ideas for improvements. Please take a look. Thank you, |
@victorshevtsov thank you for the updates, you did a great work!
I think it makes sense to extract it to variable or constant and not to duplicate for every migration. |
config/config.go
Outdated
{ | ||
name: "fetcher.query", | ||
params: []QueryParamMigration{ | ||
{ | ||
old: "%REQUEST_ID_LIST%", | ||
new: "$REQUEST_ID_LIST", | ||
}, | ||
{ | ||
old: "%IMP_ID_LIST%", | ||
new: "$IMP_ID_LIST", | ||
}, | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to migrate fetcher.amp_query
which should have the same parameters as fetcher.query
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
config/config.go
Outdated
{ | ||
name: "poll_for_updates.query", | ||
params: []QueryParamMigration{ | ||
{ | ||
old: "$1", | ||
new: "$LAST_UPDATED", | ||
}, | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to migrate poll_for_updates.amp_query
which should have the same parameter as poll_for_updates.query
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
@@ -1607,6 +1607,587 @@ func TestMigrateConfigTCF2EnforcePurposeFlags(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestMigrateConfigDatabaseConnection(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The query migration strategy looks good to me. Can you please add test coverage for it? Maybe it makes sense to create another test that just focuses on that instead of mixing it into this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test coverage add by a separate test TestMigrateConfigDatabaseQueryParams
Updated migration of: - `fetcher.query` - `fetcher.amp_query` - `poll_for_updates.query` - `poll_for_updates.amp_query` for `stored_requests`, `stored_video_req`, `stored_responses` sections Added tests for SQL queries migration
All done. Please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@victorshevtsov, this is looking good. Just a couple of additional comments regarding documentation, testing and a bug I found. Then I think we're good to go!
config/stored_requests_test.go
Outdated
const sampleQueryTemplate = "SELECT id, requestData, 'request' as type FROM stored_requests WHERE id in $REQUEST_ID_LIST UNION ALL SELECT id, impData, 'imp' as type FROM stored_requests WHERE id in $IMP_ID_LIST" | ||
const sampleResponsesQueryTemplate = "SELECT id, responseData, 'response' as type FROM stored_responses WHERE id in $ID_LIST" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these are being used. Can we delete them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that can be deleted.
Done.
```yaml | ||
stored_requests: | ||
postgres: | ||
host: localhost | ||
port: 5432 | ||
user: db-username | ||
dbname: database-name | ||
query: SELECT id, requestData, 'request' as type FROM stored_requests WHERE id in %REQUEST_ID_LIST% UNION ALL SELECT id, impData, 'imp' as type FROM stored_imps WHERE id in %IMP_ID_LIST%; | ||
database: | ||
connection: | ||
driver: postgres | ||
host: localhost | ||
port: 5432 | ||
user: db-username | ||
dbname: database-name | ||
fetcher: | ||
query: SELECT id, requestData, 'request' as type FROM stored_requests WHERE id in $REQUEST_ID_LIST UNION ALL SELECT id, impData, 'imp' as type FROM stored_imps WHERE id in $IMP_ID_LIST; | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a note here about supported databases and query syntax. Something like:
Supported Databases
- postgres
- mysql
Query Syntax
All database queries should be expressed using the native SQL syntax of your supported database of choice with one caveat.
For all supported database drivers, wherever you need to specify a query parameter, you must not use the native syntax (e.g. $1
, %%
, ?
, etc.), but rather a PBS-specific syntax to represent the parameter which is of the format $VARIABLE_NAME
. PBS currently supports just four query parameters, each of which pertains to particular config queries, and here is how they should be specified in your queries:
- last updated at timestamp -->
$LAST_UPDATED
- stored request ID list -->
$REQUEST_ID_LIST
- stored imp ID list -->
$IMP_ID_LIST
- stored response ID list -->
$ID_LIST
See the query defined at stored_requests.database.connection.fetcher.query
in the yaml config above as an example of how to mix these variables in with native SQL syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
dataSourceName := provider.ConnString() | ||
assertStringsEqual(t, dataSourceName, "someuser:somepassword@tcp(somehost.com:20)/TestDB") | ||
} | ||
|
||
func assertStringsEqual(t *testing.T, actual string, expected string) { | ||
if actual != expected { | ||
t.Errorf("Strings did not match.\n\"%s\" -- expected\n\"%s\" -- actual", expected, actual) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I suggest using the testify package and shorting this to:
connString := provider.ConnString()
expectedConnString := "someuser:somepassword@tcp(somehost.com:20)/TestDB"
assert.Equal(t, expectedConnString, connString, "Strings did not match")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
dataSourceName := provider.ConnString() | ||
paramList := strings.Split(dataSourceName, " ") | ||
params := make(map[string]string, len(paramList)) | ||
for _, param := range paramList { | ||
keyVals := strings.Split(param, "=") | ||
if len(keyVals) != 2 { | ||
t.Fatalf(`param "%s" must only have one equals sign`, param) | ||
} | ||
if _, ok := params[keyVals[0]]; ok { | ||
t.Fatalf("found duplicate param at key %s", keyVals[0]) | ||
} | ||
params[keyVals[0]] = keyVals[1] | ||
} | ||
|
||
assertHasValue(t, params, "dbname", db) | ||
assertHasValue(t, params, "host", host) | ||
assertHasValue(t, params, "port", strconv.Itoa(port)) | ||
assertHasValue(t, params, "user", username) | ||
assertHasValue(t, params, "password", password) | ||
assertHasValue(t, params, "sslmode", "disable") | ||
} | ||
|
||
func assertHasValue(t *testing.T, m map[string]string, key string, val string) { | ||
t.Helper() | ||
realVal, ok := m[key] | ||
if !ok { | ||
t.Errorf("Map missing required key: %s", key) | ||
} | ||
if val != realVal { | ||
t.Errorf("Unexpected value at key %s. Expected %s, Got %s", key, val, realVal) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know all of this was copied from config/stored_requests_test.go
but I think a lot of it is unnecessary. I suggest simplifying the body of this test so it mirrors what you have in the postgres connection string function test.
Also, I suggest converting this and the mysql connection string tests to table based and then have test cases where each part that makes up the string is the only part set with a final test case where they are all set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I put these tests into separate files is to completely separate the logic because different implementations of DbProvider
can have different unique parameters that produce connection strings in a variety of formats. Keeping a unified test for all of them sounds reasonable but can turn into a mesh in future. For example, MongoDB can have multiple hosts with ports in a connection string for a replica set and a sharded cluster.
@bsardo, please confirm if you think I should make it more unified anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@victorshevtsov oh maybe we have a misunderstanding; sorry about that. I wasn't questioning your test organization nor suggesting that we merge the connection string tests for postgres and mysql. I was just suggesting two things:
- I think a lot of
db_provider/postgres_dbprovider_test.go#TestConnStringPostgres
can be deleted so that the test structure is similar to that ofdb_provider/mysql_dbprovider_test.go#TestConnStringMySql
. That means the logic inTestConnStringPostgres
that iterates over theparamList
checking for equal sign occurrences and duplicate keys can be deleted and the calls toassertHasValue
can be replaced with simpler assertion logic similar to that inTestConnStringMySql
. - I think both
db_provider/postgres_dbprovider_test.go#TestConnStringPostgres
anddb_provider/mysql_dbprovider_test.go#TestConnStringMySql
should maybe be converted to a table-based format, each with cases that set just one of the database connection fields (db
,host
,port
,username
,password
) in isolation, with a final test case that has them all set as each test currently does.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see now. Sorry for misunderstanding.
Yes, that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if err := provider.Close(); err != nil { | ||
glog.Errorf("Error closing DB connection: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this if
we need to add the following check so that we don't get a panic due to a nil pointer dereference when shutting down a server configured to use some means other than a database for stored requests/responses:
if provider == nil {
return
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
config/config.go
Outdated
queries: []QueryMigration{ | ||
{ | ||
name: "fetcher.query", | ||
params: []QueryParamMigration{ | ||
{ | ||
old: "%ID_LIST%", | ||
new: "$ID_LIST", | ||
}, | ||
}, | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to migrate %ID_LIST%
, which is a list of stored response IDS, for stored responses. Even though this only applies to stored responses, we could still migrate it for all fetcher.query
and fetcher.amp_query
occurrences to keep this migration simple. If a host specifies it for some other resource type like stored requests, the query wouldn't work anyway, which is the current behavior.
@VeronikaSolovei9 thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VeronikaSolovei9 and I spoke offline and she agreed with this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes and local testing with both Postgres and MySql DBs look good. Thank you for your contribution!
So... Is it ready to be merged to |
@victorshevtsov thanks so much for all of your work on this! |
Nice work guys!!! Cant wait to attempt to implement. |
@victorshevtsov I'd like to share your contribution at the next Prebid Server Webinar. May I ask what company you work for / contributed this change on behalf of? |
I contributed this PR on behalf of "Setupad" company. |
Fantastic. Thank you. |
In "SIA Setupad" we have a huge need for MySQL supported by PBS-Go like PBS-Java does.
I’m submitting a pull request with an implementation of MySQL driver support for PBS-Go.
The changes affected the configuration keys. So, all the keys
*.postgres.*
renamed to*.database.*
.For example:
stored_requests.postgres.connection.dbname
renamed to:
stored_requests.database.connection.dbname
Also there is a new configuration key
stored_requests.database.connection.driver
to set the database driver:mysql
orpostgres
Please provide any missing requirements if any of them missing.