Initial work for replicated ClickHouse testing #4149
Conversation
Thanks for starting on this @karencfv! Can you explain in more detail what you mean when you say the tests become "flaky"? It's been really useful to have each test run against an independent database, and so I think we should try to keep that insofar as possible.
This part is a bit confusing to me. The documentation indicates that it's possible to have replicated and non-replicated tables in the same server. Each server also watches its configuration file for changes, and so one can modify the cluster in various ways after each program starts. From that last link:
Given all that, and the fact that
If I'm understanding all the documentation correctly, this would let us start each server and then describe the cluster after the OS has assigned port numbers. We can avoid hard-coding anything, and keep running the tests in parallel.
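To make that concrete, here's a minimal sketch of rendering a `remote_servers` fragment only after the OS has handed out ports, assuming a single-shard, multi-replica layout; the struct, function name, and the `oximeter_cluster` name are all hypothetical:

```rust
/// One data server in the cluster; the port is whatever the OS handed
/// out when that server started (names here are hypothetical).
struct Replica {
    host: String,
    tcp_port: u16,
}

/// Render a ClickHouse `remote_servers` fragment for a single-shard,
/// multi-replica cluster, which could then be written into the config
/// file each server is already watching for changes.
fn render_remote_servers(cluster: &str, replicas: &[Replica]) -> String {
    let mut xml = format!("<remote_servers><{cluster}><shard>");
    for r in replicas {
        xml.push_str(&format!(
            "<replica><host>{}</host><port>{}</port></replica>",
            r.host, r.tcp_port
        ));
    }
    xml.push_str(&format!("</shard></{cluster}></remote_servers>"));
    xml
}

fn main() {
    let replicas = vec![
        Replica { host: "::1".to_string(), tcp_port: 45001 },
        Replica { host: "::1".to_string(), tcp_port: 45002 },
    ];
    println!("{}", render_remote_servers("oximeter_cluster", &replicas));
}
```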
@@ -610,7 +610,7 @@ impl DbWrite for Client {
    /// Wipe the ClickHouse database entirely from a replicated set up.
    async fn wipe_replicated_db(&self) -> Result<(), Error> {
This makes me realize something. I'm not sure we want the database initialization to be done by the client at all. Consider the case where we start 2 data servers, but only one instance of `oximeter`. The ClickHouse client that `oximeter` instance uses will initialize the one replica that it happens to connect to, but there's nothing to initialize the other replica.
I'm really thinking that we would like a way to manage the cluster more holistically. Rather than starting individual server programs, and relying on the client to run the SQL initializing the tables, it might help to have a notion of a ClickHouse deployment. This could describe the IP addresses and ports (if they're known); the replica IDs; and probably the SQL we want to run at startup. It might support single and multi-node, if we still want that.
I would imagine a "runner" program would be helpful here too, which would start the ClickHouse program itself; write out or update the XML based on the cluster configuration; and probably also initialize the database tables itself, again based on the cluster deployment. This would obviate the need for the client to initialize the tables at all.
This is all pretty pie-in-the-sky at this point. But I'm reasonably certain we can strip down the configuration to the minimal set; write out the needed XML; and initialize the database tables, with everything derived from the cluster configuration.
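For what it's worth, a very rough sketch of the kind of deployment description being imagined here, just to make the idea concrete (all of these type names and fields are hypothetical, not anything that exists in the tree):

```rust
use std::net::SocketAddr;

/// Hypothetical description of a whole ClickHouse deployment, which a
/// "runner" could use to write out the XML configs, start the servers,
/// and run the initialization SQL itself.
struct ClickHouseDeployment {
    /// Single-node or replicated.
    mode: DeploymentMode,
    /// Data servers in the deployment.
    replicas: Vec<ReplicaConfig>,
    /// Keeper nodes, if this is a replicated deployment.
    keepers: Vec<SocketAddr>,
    /// SQL to run once the cluster is up (table definitions, etc.).
    init_sql: String,
}

enum DeploymentMode {
    SingleNode,
    Replicated,
}

struct ReplicaConfig {
    replica_id: u64,
    /// `None` means "let the OS pick", filled in once the server starts.
    http_addr: Option<SocketAddr>,
    tcp_addr: Option<SocketAddr>,
}
```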
That sounds like a good idea! I'll link this comment to the relevant issue
So one additional note to add here, from a conversation I had with @davepacheco. It's probably a good idea in the long run to have this front-end runner have very little configuration itself, and instead get that all from Nexus. It would only need to know how to reach Nexus (via DNS) and probably an ID of some kind, and then hit an endpoint in Nexus with that ID. That request could return the configuration that node needs to run (addresses, ports, whether to start keepers and/or data servers, etc.).
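A hedged sketch of what that bootstrap flow could look like; the endpoint path, the `NodeRunConfig` shape, and the `reqwest`/`uuid`-based lookup are all assumptions for illustration, not an existing Nexus API:

```rust
use serde::Deserialize;
use uuid::Uuid;

/// Hypothetical response from a Nexus endpoint describing what this
/// ClickHouse node should run.
#[derive(Deserialize)]
struct NodeRunConfig {
    listen_http_port: u16,
    listen_tcp_port: u16,
    start_keeper: bool,
    start_data_server: bool,
    /// Addresses of the other cluster members.
    cluster: Vec<String>,
}

/// Fetch this node's configuration from Nexus, given its ID and the
/// Nexus address resolved via internal DNS.
async fn fetch_run_config(
    nexus: &str,
    node_id: Uuid,
) -> Result<NodeRunConfig, reqwest::Error> {
    let url = format!("http://{nexus}/clickhouse-nodes/{node_id}/config");
    reqwest::get(&url).await?.json().await
}
```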
Thanks for the review @bnaecker!
The thing is, it's the servers themselves that clash. Single node ClickHouse servers cannot run alongside those that are part of a cluster on the same host (I've tried this manually and the clustered servers just hang and don't start up). This means that if we are spinning up several single node data servers in parallel, the moment the replicated cluster begins to be built, it may or may not start up if the single node tests are still running (this is what was happening on the Ubuntu build-and-test job).
I would absolutely love to do that, and thought of several approaches, but the problem is that the moment we port all of the current tests to replicated clusters the test times are going to go through the roof. Unlike single node servers, the ClickHouse cluster takes much longer to spin up. Because of this, it's not as feasible to have a bunch of clusters being spun up. My plan for this is the following:
What do you think?
Do we understand what's actually causing the servers that are part of the cluster to start slowly?
How much longer? The real question, I think, is whether the additional startup time is more than the cost of running the tests serially against a single cluster.
Ha! There was a rogue port being set up automatically in the single node servers, which I can tweak in the clustered config. The two tests don't need to run serially anymore.
Quite a bit: a single node takes ~0.537s to start up, whereas a cluster takes ~3.128s, as it checks that all keeper/server nodes are ready and listening for connections. Additionally, I just realised that we would have an issue with ports as well. While we can have OS-assigned HTTP ports for single node servers, clustered servers have several other ports that we need to define and share across configs (TCP, interserver, keeper ports, etc.). I can see this port allocation becoming quite complex and potentially unreliable if it were set dynamically. That said, I believe we can find a way to run all of these tests in parallel once we work on #3982. When the init SQL is dynamically created, it will give us greater flexibility to test. For example, to avoid consistency issues I can see us using a different database name for each test.
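To make that last idea a bit more concrete, here's a minimal sketch of per-test database naming, assuming the init SQL becomes dynamically generated; the helper names, the prefix, and the `oximeter_cluster` name are all hypothetical:

```rust
use uuid::Uuid;

/// Build a unique database name for one test, so tests sharing the same
/// cluster don't step on each other's data.
fn test_database_name(test_name: &str) -> String {
    // Strip the hyphens so the name is a plain identifier.
    let suffix = Uuid::new_v4().to_string().replace('-', "");
    format!("oximeter_test_{test_name}_{suffix}")
}

/// Statement the dynamically generated init SQL could start with; the
/// cluster name is assumed here.
fn create_database_sql(db_name: &str) -> String {
    format!("CREATE DATABASE IF NOT EXISTS {db_name} ON CLUSTER oximeter_cluster")
}
```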
Heya @bnaecker 👋 😄 Just wanted to follow up on this and see if you have any thoughts on my last comment?
I've a few more questions / comments, but I think this is mostly good to go!
oximeter/db/src/client.rs
async fn test_build_replicated() {
    let log = slog::Logger::root(slog::Discard, o!());
    // Tests that data can be inserted via the client
    insert_samples_test(address, &log).await.unwrap();
IIUC, a lot of this diff is moving the implementation of the tests themselves into one actual `#[test]` function. I was under the impression that was to work around an issue with port conflicts for the servers. Has that conflict been addressed? Do we still need to have all tests inside a single `#[test]`?
What I have in mind here is to get the single node tests ready to run with the replicated cluster.
Until single node ClickHouse is removed entirely, and the init SQL file is created dynamically, the replicated cluster tests will still need to run serially. This means that if we want to run the same tests on both single node and replicated clusters, it'll have to be this way temporarily.
In a follow-up PR to this one, I will be running every one of these tests against the replicated cluster. I divided this work into two PRs so it doesn't become a monster like my last PR 😅
// TODO(https://github.com/oxidecomputer/omicron/issues/4001): With distributed
// engine, it can take a long time to sync the data. This means it's tricky to
// test that the data exists on both nodes.
Looks like we have some control over this from here, which might be helpful in test contexts.
Nice! I'd like to punt on this item until my follow-up PR, which will deal more directly with replicated testing, if that's OK with you.
Thanks for the review @bnaecker! I think I've addressed all of your comments; let me know what you think :)
…up (#4372)

This is a follow up to #4149. In this commit all single node tests are now run against a clustered setup. Additionally, a few bugs in the replicated init SQL file have been fixed.

Note: All of the individual field and measurement tests now run under a single test. The reason for this is test times: each test needs to wipe out its database before the next test runs, and in a replicated installation that means all nodes must sync, which takes time. Before I merged the field/measurement tests into a single one, the testing time for a replicated setup was about 10 minutes, which really is too much. Now it takes ~3.5 minutes. It is still a lot, but this is only a temporary measure until we can run these tests in parallel again. An unintended bonus of this approach is that running the tests this way also verifies that the data is being inserted into the correct table.

Related: #4148
Related: #4001
Structural changes
Instead of spinning up a ClickHouse server per test, only two servers are now spun up to run all tests against: one for single node mode and another for replicated testing. Even though it may look odd to do this now, there is a good reason for it. ClickHouse servers that have been configured to be part of a cluster cannot be spun up on the same host as single node servers. This means that if we spin up several servers, the replicated test becomes incredibly flaky. It's important to note that this setup will not live forever. Once we can remove single node mode and start the replicated ClickHouse cluster as part of the build process, the replicated tests will be able to run independently.
All tests now run sequentially against a single server. For correctness, the database is wiped after each test to give us a clean slate for the next one. This only adds a few more seconds (<10s) to the overall testing time. Again, this will not be the final state of things, but it is necessary to be able to test alongside replicated clusters until we remove single node functionality.
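As a rough illustration of the shape this takes, here is a sketch of the kind of single `#[test]` that runs the sub-tests back to back against one shared server; the helper names (`start_single_node_server`, `wipe_db`, `field_record_count_test`) are placeholders rather than the exact functions in this PR:

```rust
#[tokio::test]
async fn test_client_suite() {
    let log = slog::Logger::root(slog::Discard, slog::o!());
    // One shared server for the whole suite (placeholder helper).
    let address = start_single_node_server(&log).await;

    // Each sub-test runs against a clean database; wiping between them
    // is what adds the extra (<10s) to the overall run time.
    insert_samples_test(address, &log).await.unwrap();
    wipe_db(address, &log).await.unwrap();

    field_record_count_test(address, &log).await.unwrap();
    wipe_db(address, &log).await.unwrap();
}
```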
Caveat
GitHub does some really weird things with the diffs on oximeter/db/src/client.rs. Frankly, it looks like a mess. I would recommend checking out the branch itself to see the final state of the tests.
Follow up work
Given the size of this PR, I've not added all of the tests to run against the replicated cluster. There will be a follow-up PR that does this.
Related: #4001
Closes: #4037