-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cdc): support alter cdc table schema #17334
Conversation
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9425213 | Triggered | Generic Password | 44bebe4 | ci/scripts/e2e-source-test.sh | View secret |
9425213 | Triggered | Generic Password | 44bebe4 | ci/scripts/e2e-source-test.sh | View secret |
9425213 | Triggered | Generic Password | 4b55d07 | ci/scripts/e2e-source-test.sh | View secret |
9425213 | Triggered | Generic Password | 4b55d07 | ci/scripts/e2e-source-test.sh | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
6ce9713
to
44bebe4
Compare
9f0fee7
to
4b55d07
Compare
hostname = '${MYSQL_HOST:localhost}', | ||
port = '${MYSQL_TCP_PORT:8306}', | ||
username = 'root', | ||
password = '123456', | ||
database.name = 'testdb1', | ||
server.id = '5185' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use ${RISEDEV_MYSQL_WITH_OPTIONS_COMMON}
?
Seems that similar comments have been made in #16986 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it but it doesn't work when I run test locally with risedev slt
. It seems we have to launch the mysql via risedev
to let the RISEDEV_MYSQL_WITH_OPTIONS_COMMON
work, but I use docker compose
to launch mysql locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we have to launch the mysql via
risedev
to let theRISEDEV_MYSQL_WITH_OPTIONS_COMMON
work, but I usedocker compose
to launch mysql locally.
This is true. However, I would still recommend launching MySQL with RiseDev as it makes it easier for other developers to test locally without needing to follow any other conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I use docker compose to launch mysql locally.
You CAN still use your own setup, by using user-managed
.
- use: mysql
port: 3306
user-managed: true
# - In user-managed mode, these configs are not validated by risedev.
# They are passed as-is to risedev-env default user for MySQL operations.
# - This is not used in RISEDEV_MYSQL_WITH_OPTIONS_COMMON.
user: root
password: ""
database: "risedev"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why you don't use the mysql started by risedev
? Have you tried it and met any problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can stop and start one or more db services with clicks, no need to edit the yaml.
You still uses risedev
to start RisingWave. So relying on risedev
can also save the click.. Why not?
Anyway, I encourage you to just give it a try. I believe basically no effort is needed to change the workflow and the DX improvement is gained for free.
If it doesn't work well, please share some feedbacks about what is not good enough, so that we can keep improving the DX for all developers.
If the reasons above are still not convincing enough to you, please also tell me why, so that I can understand it better.
Finally, if you just don't want to try it, it's also OK, but please use the suggested style for the benefits of others. For it to work in your machine, I think it's just some additional env vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to manually set MYSQL_ env vars manually.
When you change the configs, the env vars are also updated automatically.
Thanks for the clarification. I got your idea.
I find that the mysql -e
can read the env vars in the risedev-env
file, but psql -c
cannot. I still need to set those $PGxxx
env vars in my shell to make the psql -c
work locally. I checked that postgres related env vars has written into the risedev-env
file. Do you have any idea? @xxchan
Here is the config I used:
default:
steps:
- use: meta-node
- use: compute-node
- use: frontend
- use: postgres
port: 8432
user-managed: true
application: "connector"
user: myuser
password: "123456"
database: "mydb"
- use: mysql
port: 8306
user-managed: true
user: root
password: "123456"
database: "mydb"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing this comment.
Just to double confirm: Did you run with risedev slt
? And what's the errors you found?
Perhaps you can try this first:
system ok
echo $PGPORT
----
placeholder
Then running risedev slt
will return errors "result mismatch", and shows what's the output of the echo
command, i.e., the value of PGPORT
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing this comment.
Wait, why this reply is "2 days ago", but I really just received it just now. 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for missing this comment.
Wait, why this reply is "2 days ago", but I really just received it just now. 😂
Maybe due to I forget to submit the pending comment. 😄
// Add the new column to the table definition if it is not created by `create table (*)` syntax. | ||
if !columns.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look good to me. 😕
- I remember that we had a discussion long ago (frontend: refactor source schema resolution #9828) that we should purify the definition of a table/source before persisting it into the catalog. In this case, for example,
*
is completely a syntax-sugar and should be first expanded to column definitions. cc @st1page - Considering empty columns as created with
*
is also a fragile assumption. Users are allowed to create a table without columns likeCREATE TABLE t;
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for forgetting this point during my previous review of the #16986.
I think we should expand on CREATE TABLE (*)
. This is because operations such as CREATE SINK INTO TABLE, ALTER TABLE RENAME, and others rely on reparsing the SQL statement, and the parsed columns should remain consistent with the previous plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cdc table was the first to meet this problem, because in other scenarios with a schema registry, we simply prevented users from doing things like add columns.
https://github.com/risingwavelabs/risingwave/blob/main/src/frontend/src/handler/alter_table_column.rs#L117
https://github.com/risingwavelabs/risingwave/blob/main/src/frontend/src/handler/alter_source_column.rs#L73
And the only way to change their schema was to refresh the schema registry using alter table, which implied the assumption that their which implies the assumption that their schema needs to always be aligned upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before finding a elegent way to support that. We can tentatively assume that the cdc table of create table(*) is similar to the table with connector of the schema registry. Let's temporarily prevent users from using add columns on such a table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this PR we can prevent users from adding/drop columns on a table with wildcard.
And here is the future design #17472
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this PR we can prevent users from adding/drop columns on a table with wildcard.
ok, we can ban ALTER
for tables created by create table(*)
in this pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, we can ban
ALTER
for tables created bycreate table(*)
in this pr.
Does this also mean that we no longer need let mut columns_altered = original_catalog.columns.clone();
?
/test |
Failed: 3 Please ignore this, the pipelines got bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code impl generally LGTM
@@ -1971,7 +2000,6 @@ impl DdlController { | |||
} = actor_graph_builder | |||
.generate_graph(&self.env, stream_job, expr_context) | |||
.await?; | |||
assert!(dispatchers.is_empty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we still keep this assertion depending on the table job type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM!
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Support
ALTER ADD/DROP COLUMN
for cdc table based on shared sourceclose #16287
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.