Scale out Clickhouse to a multinode cluster #3494

Merged · 69 commits · Sep 5, 2023

Changes from 1 commit

Commits (69)
0b21df2
Initial functional 2 replica 3 coordinator cluster
karencfv Jul 5, 2023
3281cbd
Create config templates and pseudocode for updated init config
karencfv Jul 5, 2023
b628be6
Dynamically build configs for servers and keepers
karencfv Jul 6, 2023
cdfb1d1
Create a separate service for keepers
karencfv Jul 10, 2023
41518c8
Update manifest and file location
karencfv Jul 11, 2023
795327e
clean up
karencfv Jul 11, 2023
73e5e22
make linter happy
karencfv Jul 11, 2023
83449a2
Zone image is clickhouse-keeper.tar.gz not clickhouse_keeper.tar.gz
karencfv Jul 11, 2023
38fb86d
Merge branch 'main' into ch-replicated-engine
karencfv Jul 19, 2023
06b8f21
Only use underscores to simplify
karencfv Jul 20, 2023
b5dd484
Merge remote-tracking branch 'upstream' into ch-replicated-engine
karencfv Jul 20, 2023
ca7ad33
Create composite packages to include internal-dns tar
karencfv Jul 24, 2023
bbbbd28
Get internal DNS working
karencfv Jul 25, 2023
7b7b245
Add datastore to keeper service
karencfv Jul 26, 2023
72cc038
Append default and custom configs
karencfv Jul 31, 2023
34f370b
Give keepers dynamic discoverable IDs
karencfv Jul 31, 2023
79dd329
Clean up scripts and configs
karencfv Aug 1, 2023
94f8376
Clean up
karencfv Aug 1, 2023
cea0612
First pass at making tests pass
karencfv Aug 1, 2023
95be228
gargh linter
karencfv Aug 1, 2023
e24a2dd
Add additional zpools for dev envs
karencfv Aug 2, 2023
f6aac77
Add flag to internal-dns-cli to output host name only
karencfv Aug 2, 2023
758fd39
Revert testing configuration and clean up
karencfv Aug 2, 2023
ef914b1
Run oximeter on replicated or single node set ups
karencfv Aug 2, 2023
7eb06dd
fmt
karencfv Aug 2, 2023
1abe9dd
Merge branch 'main' into ch-replicated-engine
karencfv Aug 2, 2023
e2a4060
Small fix after merge with main branch
karencfv Aug 3, 2023
9c759e6
expectoration
karencfv Aug 3, 2023
bc33e97
Address comments
karencfv Aug 4, 2023
1ebcf14
fmt
karencfv Aug 4, 2023
23df4ef
address review comments
karencfv Aug 7, 2023
80eb1d1
Merge branch 'main' into ch-replicated-engine
karencfv Aug 8, 2023
2b1edd9
save config env vars to file
karencfv Aug 8, 2023
5fd1e75
fix scripts and configuration for bench gimlet
karencfv Aug 9, 2023
3bda6b3
Explicitly declare if a database is single node or replicated
karencfv Aug 9, 2023
2492ae8
foundation to test replicated nodes
karencfv Aug 9, 2023
8541d17
Testing utils
karencfv Aug 10, 2023
148dda9
Test replicated nodes
karencfv Aug 10, 2023
cb0cd66
First try at testing
karencfv Aug 11, 2023
b9e64cd
Keeper doesn't like absolute paths :(
karencfv Aug 11, 2023
9d8d019
Get test keepers going
karencfv Aug 14, 2023
81ab2ad
Make the test work
karencfv Aug 14, 2023
a8a02d4
Correct way to check whether a replicated server is ready for connect…
karencfv Aug 14, 2023
c116370
Clean up
karencfv Aug 14, 2023
28354be
Rename test config directories
karencfv Aug 14, 2023
9562f0e
fmt
karencfv Aug 15, 2023
9520449
fix tests
karencfv Aug 15, 2023
8872b09
Refine testing
karencfv Aug 16, 2023
3af0769
Revert bench gimlet configuration and fmt
karencfv Aug 16, 2023
f621b80
Bump clickhouse readyness testing timeout and make clippy happy
karencfv Aug 16, 2023
e51bd0f
Merge branch 'main' into ch-replicated-engine
karencfv Aug 17, 2023
298ca4e
Give end to end tests more time to bring up nexus
karencfv Aug 21, 2023
1930e14
Merge branch 'main' into ch-replicated-engine
karencfv Aug 21, 2023
e7a4635
Automatically detect whether ClickHouse set up is replicated or singl…
karencfv Aug 23, 2023
b8ccf29
Works on my machine, increase timeout
karencfv Aug 23, 2023
d83dc85
Merge branch 'main' into ch-replicated-engine
karencfv Aug 29, 2023
691d9d5
Update CRDB with new service enums
karencfv Aug 29, 2023
4a1c179
Disable replicated ClickHouse
karencfv Aug 30, 2023
fe124fd
Make clippy happy
karencfv Aug 30, 2023
7f67c7b
Merge branch 'main' into ch-replicated-engine
karencfv Aug 31, 2023
251df8a
Small fix after merge
karencfv Aug 31, 2023
c58c8fe
Revert e2e timeout duration
karencfv Aug 31, 2023
4832dda
Address review comments
karencfv Aug 31, 2023
c7e3598
make the linter happy
karencfv Aug 31, 2023
0adffb7
Address comments
karencfv Sep 1, 2023
98c705b
Create distributed tables
karencfv Sep 1, 2023
77a3492
Stop forgetting to run cargo fmt before pushing the commit
karencfv Sep 1, 2023
5933f51
Also don't forget about clippy :facepalm:
karencfv Sep 1, 2023
6c82c8a
Small fix to referenced macro in SQL
karencfv Sep 5, 2023
First try at testing
karencfv committed Aug 11, 2023
commit cb0cd66d3fe7d9e9a21f003ac4da230ada35b939
12 changes: 10 additions & 2 deletions oximeter/db/src/client.rs
@@ -657,10 +657,12 @@ mod tests {

#[tokio::test]
async fn test_build_replicated() {
use std::net::{IpAddr, Ipv4Addr};

let log = slog::Logger::root(slog::Discard, o!());

// Start all Keeper coordinator nodes
let keeper_config = String::from("./configs/keeper_config.xml");
let keeper_config = String::from("oximeter/db/src/configs/keeper_config.xml");

// Start Keeper 1
let k1_port = String::from("9181");
@@ -696,7 +698,9 @@ mod tests {
.expect("Failed to start ClickHouse keeper 3");

// Start all replica nodes
let replica_config = String::from("./configs/replica_config.xml");
let cur_dir = std::env::current_dir().unwrap();
let replica_config = cur_dir.as_path().join("src/configs/replica_config.xml");
// let replica_config = String::from("oximeter/db/src/configs/replica_config.xml");

// Start Replica 1
let r1_port = String::from("8123");
@@ -714,6 +718,10 @@ mod tests {
.expect("Failed to start ClickHouse node 1");
let r1_address = SocketAddr::new("::1".parse().unwrap(), db_1.port());





// Start Replica 2
let r2_port = String::from("8124");
let r2_tcp_port = String::from("9001");
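The config-path churn above reflects that the working directory during `cargo test` depends on how and where the tests are invoked. A common alternative (a sketch under that assumption, not what this PR ultimately settles on) is to resolve the file relative to the crate root:

use std::path::PathBuf;

// Sketch: resolve the test config relative to the crate, not the process CWD.
// CARGO_MANIFEST_DIR expands at compile time to the oximeter/db crate directory.
fn replica_config_path() -> PathBuf {
    PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("src/configs/replica_config.xml")
}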
3 changes: 1 addition & 2 deletions oximeter/db/src/configs/keeper_config.xml
@@ -1,6 +1,5 @@
<!-- This configuration file is used for testing only. -->
<clickhouse>

<!-- This configuration file is used for testing only. -->
<logger>
<level>trace</level>
<log from_env="CH_LOG"/>
48 changes: 36 additions & 12 deletions oximeter/db/src/configs/replica_config.xml
@@ -1,5 +1,7 @@
<!-- This configuration file is used for testing only. -->
<?xml version="1.0"?>
<!-- This configuration file is used for testing only. -->


<!--
NOTE: User and query level settings are set up in "users.xml" file.
If you have accidentally specified user-level settings here, server won't start.
@@ -37,7 +39,7 @@
(this protocol is also informally called as "the TCP protocol");
See also 'tcp_port_secure' for secure connections.
-->
<tcp_portreplace="true" from_env="CH_TCP_PORT"/>
<tcp_port replace="true" from_env="CH_TCP_PORT"/>

<!-- Compatibility with MySQL protocol.
ClickHouse will pretend to be MySQL for applications connecting to this port.
@@ -238,16 +240,7 @@
<user_files_path replace="true" from_env="CH_USER_FILES_PATH"/>

<!-- Sources to read users, roles, access rights, profiles of settings, quotas. -->
<user_directories>
<users_xml>
<!-- Path to configuration file with predefined users. -->
<path>users.xml</path>
</users_xml>
<local_directory>
<!-- Path to folder where users created by SQL commands are stored. -->
<path replace="true" from_env="CH_USER_LOCAL_DIR"/>
</local_directory>
</user_directories>


<access_control_improvements>
<!-- Enables logic that users without permissive row policies can still read rows using a SELECT query.
@@ -274,6 +267,37 @@
<select_from_information_schema_requires_grant>false</select_from_information_schema_requires_grant>
</access_control_improvements>

<profiles>
<default>
<load_balancing>random</load_balancing>
</default>

</profiles>

<users>
<default>
<password></password>
<networks>
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
</default>
</users>

<quotas>
<default>
<interval>
<duration>3600</duration>
<queries>0</queries>
<errors>0</errors>
<result_rows>0</result_rows>
<read_rows>0</read_rows>
<execution_time>0</execution_time>
</interval>
</default>
</quotas>

<!-- Default profile of settings. -->
<default_profile>default</default_profile>

53 changes: 45 additions & 8 deletions test-utils/src/dev/clickhouse.rs
@@ -111,10 +111,35 @@ impl ClickHouseInstance {
tcp_port: String,
name: String,
r_number: String,
config_path: String,
//config_path: String,
config_path: PathBuf,
) -> Result<Self, anyhow::Error> {
let data_dir = TempDir::new()
.context("failed to create tempdir for ClickHouse data")?;

// Copy config template to new temporary directory
// let cur_dir = std::env::current_dir()?;
// let config_path = cur_dir.as_path().join("clickhouse_configs/replica_config.xml");
let con = data_dir.path().join("replica_config.xml");

// debug
println!("config source: {}", config_path.display());
println!("config destination: {}", con.display());

std::fs::copy(&config_path, &con)?;

assert!(con.exists());

use tokio::io::AsyncReadExt;

//debug
let mut file = File::open(&con).await
.expect("err File not found");
let mut data = String::new();
file.read_to_string(&mut data).await
.expect("Error while reading file");
println!("{}", data);

let log_path = data_dir.path().join("clickhouse-server.log");
let err_log_path = data_dir.path().join("clickhouse-server.errlog");
let tmp_path = data_dir.path().join("/tmp/");
@@ -124,20 +149,21 @@ impl ClickHouseInstance {
let args = vec![
"server".to_string(),
"--config-file".to_string(),
format!("{}", config_path),
//format!("{}", config_path),
format!("{}", con.display()),
];

let child = tokio::process::Command::new("clickhouse")
.args(&args)
.stdin(Stdio::null())
.stdout(Stdio::null())
.stderr(Stdio::null())
// .stdin(Stdio::null())
// .stdout(Stdio::null())
// .stderr(Stdio::null())
.env("CLICKHOUSE_WATCHDOG_ENABLE", "0")
Collaborator: We should probably do .env_clear() here to be safe.

Contributor Author: same
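A minimal sketch of that suggestion, assuming the CH_* variables below still need to be set explicitly and that PATH may have to be restored so the binary can be found (not part of the diff):

// Sketch only: clear the inherited environment, then set exactly what the test needs.
let child = tokio::process::Command::new("clickhouse")
    .args(&args)
    .env_clear()
    // Assumption: PATH may need to be re-added if `clickhouse` is resolved via lookup.
    .env("PATH", std::env::var("PATH").unwrap_or_default())
    .env("CLICKHOUSE_WATCHDOG_ENABLE", "0")
    .env("CH_LOG", &log_path)
    .spawn()?;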

.env("CH_LOG", &log_path)
.env("CH_ERROR_LOG", err_log_path)
.env("CH_REPLICA_DISPLAY_NAME", name)
.env("CH_LISTEN_ADDR", "::1")
.env("CH_LISTEN_PORT", port)
.env("CH_LISTEN_PORT", port.clone())
.env("CH_TCP_PORT", tcp_port)
.env("CH_DATASTORE", data_dir.path())
.env("CH_TMP_PATH", tmp_path)
@@ -155,8 +181,19 @@ impl ClickHouseInstance {
format!("failed to spawn `clickhouse` (with args: {:?})", &args)
})?;

println!("{}", log_path.display());
// println!("config path: {}", config_path);

//debugging
// let srcdir = PathBuf::from(config_path);
// println!("{:?}", std::fs::canonicalize(&srcdir));




let data_path = data_dir.path().to_path_buf();
let port = wait_for_port(log_path).await?;
// let port = wait_for_port(log_path).await?;
let port: u16 = port.parse()?;

Ok(Self {
data_dir: Some(data_dir),
@@ -176,7 +213,7 @@ impl ClickHouseInstance {
// We assume that only 3 keepers will be run, and the ID of the keeper can only
// be one of "1", "2" or "3". This is to avoid having to pass the IDs of the
// other keepers as part of the function's parameters.
if k_id != "1" || k_id != "2" || k_id != "3" {
if !["1", "2", "3"].contains(&k_id.as_str()) {
return Err(ClickHouseError::InvalidKeeperId.into());
}
let data_dir = TempDir::new()
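For reference, the original keeper-ID condition is always true (any single value differs from at least two of the three literals), which is why the membership check replaces it. A standalone sketch of the two predicates, outside the diff:

// Sketch: the old check rejects every ID; the new one only rejects IDs outside the set.
fn old_check_is_invalid(k_id: &str) -> bool {
    k_id != "1" || k_id != "2" || k_id != "3" // true even for "1", "2", and "3"
}

fn new_check_is_invalid(k_id: &str) -> bool {
    !["1", "2", "3"].contains(&k_id)
}

#[test]
fn keeper_id_check() {
    assert!(old_check_is_invalid("2"));  // valid ID incorrectly rejected by the old check
    assert!(!new_check_is_invalid("2")); // valid ID accepted by the new check
    assert!(new_check_is_invalid("7"));  // invalid ID still rejected
}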
43 changes: 43 additions & 0 deletions test-utils/src/dev/clickhouse_configs/keeper_config.xml
@@ -0,0 +1,43 @@
<clickhouse>
<!-- This configuration file is used for testing only. -->
<logger>
<level>trace</level>
<log from_env="CH_LOG"/>
<errorlog from_env="CH_ERROR_LOG"/>
<size>1000M</size>
<count>3</count>
</logger>

<!-- To allow all use :: -->
<listen_host from_env="CH_LISTEN_ADDR"/>
<path replace="true" from_env="CH_DATASTORE"/>

<keeper_server>
<tcp_port from_env="CH_LISTEN_PORT"/>
<server_id from_env="CH_KEEPER_ID_CURRENT"/>
<log_storage_path from_env="CH_LOG_STORAGE_PATH"/>
<snapshot_storage_path from_env="CH_SNAPSHOT_STORAGE_PATH"/>
<coordination_settings>
<operation_timeout_ms>10000</operation_timeout_ms>
<session_timeout_ms>30000</session_timeout_ms>
<raft_logs_level>trace</raft_logs_level>
</coordination_settings>
<raft_configuration>
<server>
<id from_env="CH_KEEPER_ID_01"/>
<hostname from_env="CH_KEEPER_HOST_01"/>
<port>9234</port>
</server>
<server>
<id from_env="CH_KEEPER_ID_02"/>
<hostname from_env="CH_KEEPER_HOST_02"/>
<port>9235</port>
</server>
<server>
<id from_env="CH_KEEPER_ID_03"/>
<hostname from_env="CH_KEEPER_HOST_03"/>
<port>9236</port>
</server>
</raft_configuration>
</keeper_server>
</clickhouse>
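The keeper configuration above is driven entirely by from_env substitutions. A minimal sketch of how a test could launch one keeper against it (the environment variable names come from the config; the hostname values, port choices, config_path/data_dir variables, and the `clickhouse keeper` invocation are assumptions for a local test):

// Sketch: start one keeper, supplying every env var the config reads via from_env.
let keeper = tokio::process::Command::new("clickhouse")
    .args(["keeper", "--config-file", config_path.to_str().unwrap()])
    .env("CH_LOG", data_dir.path().join("keeper.log"))
    .env("CH_ERROR_LOG", data_dir.path().join("keeper.errlog"))
    .env("CH_LISTEN_ADDR", "::1")
    .env("CH_LISTEN_PORT", "9181")
    .env("CH_DATASTORE", data_dir.path())
    .env("CH_LOG_STORAGE_PATH", data_dir.path().join("log"))
    .env("CH_SNAPSHOT_STORAGE_PATH", data_dir.path().join("snapshots"))
    .env("CH_KEEPER_ID_CURRENT", "1")
    .env("CH_KEEPER_ID_01", "1")
    .env("CH_KEEPER_HOST_01", "::1")
    .env("CH_KEEPER_ID_02", "2")
    .env("CH_KEEPER_HOST_02", "::1")
    .env("CH_KEEPER_ID_03", "3")
    .env("CH_KEEPER_HOST_03", "::1")
    .spawn()?;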