Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration support for human-readable bytes #1455

Merged
merged 1 commit into from
Apr 24, 2024
Merged

Configuration support for human-readable bytes #1455

merged 1 commit into from
Apr 24, 2024

Conversation

AhmedSoliman
Copy link
Contributor

@AhmedSoliman AhmedSoliman commented Apr 23, 2024

Configuration support for human-readable bytes

  • Add human-readable bytes support to the configuration
  • Full review of naming
  • Remove values the default to MAX and use Option instead
  • Various fixes for better default values and hiding some internal options

Default config:

roles = [
    "worker",
    "admin",
    "metadata-store",
]
cluster-name = "localcluster"
allow-bootstrap = true
metadata-store-address = "http://127.0.0.1:5123/"
bind-address = "0.0.0.0:5122"
advertised-address = "http://127.0.0.1:5122/"
shutdown-timeout = "1m"
tracing-filter = "info"
log-filter = "info"
log-format = "pretty"
log-disable-ansi-codes = false
disable-prometheus = false
rocksdb-total-memory-size = "4.0 GB"
rocksdb-total-memtables-size = "0 B"
rocksdb-high-priority-bg-threads = 2

[http-keep-alive-options]
interval = "40s"
timeout = "20s"

[worker]
internal-queue-length = 64
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = true
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"
bootstrap-num-partitions = 64

[worker.invoker]
inactivity-timeout = "1m"
abort-timeout = "1m"
message-size-warning = "10.0 MB"

[worker.invoker.retry-policy]
type = "exponential"
initial-interval = "50ms"
factor = 2.0
max-interval = "10s"

[admin]
bind-address = "0.0.0.0:9070"

[admin.query-engine]
pgsql-bind-address = "0.0.0.0:9071"

[ingress]
bind-address = "0.0.0.0:8080"
kafka-clusters = []

[bifrost]
default-provider = "local"

[bifrost.local]
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = false
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"
writer-batch-commit-count = 500
writer-batch-commit-duration = "5ns"
sync-wal-before-ack = true

[metadata-store]
bind-address = "0.0.0.0:5123"
request-queue-length = 32

[metadata-store.rocksdb]
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = true
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"

Copy link

github-actions bot commented Apr 23, 2024

Test Results

 95 files  ±0   95 suites  ±0   7m 27s ⏱️ -38s
 82 tests ±0   80 ✅ ±0  2 💤 ±0  0 ❌ ±0 
212 runs  ±0  206 ✅ ±0  6 💤 ±0  0 ❌ ±0 

Results for commit e330750. ± Comparison against base commit 8139946.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@jackkleeman jackkleeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads up! I don't think we are setting any of these in cloud yet but let me know if we should be

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for serializing our memory sizes in human readable form. This is a really cool improvement! I had a few questions concerning the Serialize/DeserializeAs implementations and whether to enforce non-zero constraints directly on the fields of an option struct.


struct ByteCountVisitor<const CAN_BE_ZERO: bool>;

// because we do what appears to be runtime-check fo the const CAN_BE_ZERO but
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// because we do what appears to be runtime-check fo the const CAN_BE_ZERO but
// because we do what appears to be runtime-check of the const CAN_BE_ZERO but

Comment on lines 122 to 132
impl<const CAN_BE_ZERO: bool> From<u64> for ByteCount<CAN_BE_ZERO> {
fn from(value: u64) -> Self {
ByteCount(value)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this only be implemented for ByteCount<true>?

D: Deserializer<'de>,
{
if deserializer.is_human_readable() {
deserializer.deserialize_any::<ByteCountVisitor<CAN_BE_ZERO>>(ByteCountVisitor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we hint that the expected type is a str? Is it to support that users can specify it as a pure number rocksdb-total-memory-size = 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both are supported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :-)

Comment on lines +300 to +337
#[track_caller]
fn check(s: &str) {
assert_eq!(
serde_json::from_str::<Config>(s).unwrap().0,
s.parse().unwrap()
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method different from check_str? Looks like I can replace check with check_str in the test below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_str adds surrounds the value with \" so it becomes json string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I didn't know that debug output of str includes the "". Thanks for the clarification!

Comment on lines 316 to 317
// can deserialize directly from numbers
assert_eq!(serde_json::from_str::<Config>("5").unwrap().0, ByteCount(5));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this part covered by check("123")?

Comment on lines 222 to 241
impl<const CAN_BE_ZERO: bool> SerializeAs<u64> for ByteCount<CAN_BE_ZERO> {
fn serialize_as<S>(source: &u64, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
Self(*source).serialize(serializer)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't all SerializeAs/DeserializeAs implementations for zero-able types be only implemented for ByteCount<true>? Otherwise, when deserializing a 0 value using ByteCount<false>, it will probably crash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see how you are using it in the options structs to enforce that a u64 is actually non-zero when deserializing it. I am wondering whether we shouldn't change the type from u64 to NonZeroU64 for those fields instead. The reason is that we can now construct a CommonOptions with 0 valued fields which we can serialize but fail to deserialize because of the non-zero constraint that is only enforced at deserialization time.

/// The memory size used for rocksdb caches.
#[serde_as(as = "ByteCount<false>")]
#[cfg_attr(feature = "schemars", schemars(with = "ByteCount<false>"))]
pub rocksdb_total_memory_size: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my previous comment. If we want this field to be non zero, should we change it to NonZeroU64?

@@ -124,25 +125,25 @@ impl Default for RetryPolicy {
}

impl RetryPolicy {
pub fn fixed_delay(interval: Duration, max_attempts: usize) -> Self {
pub fn fixed_delay(interval: Duration, max_attempts: Option<usize>) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should max_attempts have the type Option<NonZeroUsize> to avoid the expect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those were left in an attempt to stop the refactoring from cascading down tbh. My goal was to be able to ship this by end of day. I can definitely go over those types and follow the rabbit hole :)

}
}

pub fn exponential(
initial_interval: Duration,
factor: f32,
max_attempts: usize,
max_attempts: Option<usize>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to avoid refactoring the high number of call sites at this PR.

- Add human-readable bytes support to the configuration
- Full review of naming
- Remove values the default to MAX and use Option instead
- Various fixes for better default values and hiding some internal options

Default config:

```toml
roles = [
    "worker",
    "admin",
    "metadata-store",
]
cluster-name = "localcluster"
allow-bootstrap = true
metadata-store-address = "http://127.0.0.1:5123/"
bind-address = "0.0.0.0:5122"
advertised-address = "http://127.0.0.1:5122/"
shutdown-timeout = "1m"
tracing-filter = "info"
log-filter = "info"
log-format = "pretty"
log-disable-ansi-codes = false
disable-prometheus = false
rocksdb-total-memory-size = "4.0 GB"
rocksdb-total-memtables-size = "0 B"
rocksdb-high-priority-bg-threads = 2

[http-keep-alive-options]
interval = "40s"
timeout = "20s"

[worker]
internal-queue-length = 64
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = true
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"
bootstrap-num-partitions = 64

[worker.invoker]
inactivity-timeout = "1m"
abort-timeout = "1m"
message-size-warning = "10.0 MB"

[worker.invoker.retry-policy]
type = "exponential"
initial-interval = "50ms"
factor = 2.0
max-interval = "10s"

[admin]
bind-address = "0.0.0.0:9070"

[admin.query-engine]
pgsql-bind-address = "0.0.0.0:9071"

[ingress]
bind-address = "0.0.0.0:8080"
kafka-clusters = []

[bifrost]
default-provider = "local"

[bifrost.local]
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = false
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"
writer-batch-commit-count = 500
writer-batch-commit-duration = "5ns"
sync-wal-before-ack = true

[metadata-store]
bind-address = "0.0.0.0:5123"
request-queue-length = 32

[metadata-store.rocksdb]
rocksdb-write-buffer-size = "256.0 MB"
rocksdb-max-total-wal-size = "2.0 GB"
rocksdb-disable-wal = true
rocksdb-disable-statistics = false
rocksdb-max-background-jobs = 12
rocksdb-compaction-readahead-size = "2.0 MB"
rocksdb-statistics-level = "except-detailed-timers"
```
@AhmedSoliman
Copy link
Contributor Author

@tillrohrmann thanks for the great review. I've followed your advice and the I like how the new version provides better guards by the compiler. The annoyance with NonZero* is that there is no compiler-time check on literal assignments, so you'll see a few unwrap()s and I also limited the cascading effect of using those types, but at least, the configuration is fully covered.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @AhmedSoliman. It is a really good improvement for our users :-) +1 for merging.

D: Deserializer<'de>,
{
if deserializer.is_human_readable() {
deserializer.deserialize_any::<ByteCountVisitor<CAN_BE_ZERO>>(ByteCountVisitor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :-)

Comment on lines +300 to +337
#[track_caller]
fn check(s: &str) {
assert_eq!(
serde_json::from_str::<Config>(s).unwrap().0,
s.parse().unwrap()
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I didn't know that debug output of str includes the "". Thanks for the clarification!

@AhmedSoliman AhmedSoliman merged commit ba593f8 into main Apr 24, 2024
13 checks passed
@AhmedSoliman AhmedSoliman deleted the pr1455 branch April 24, 2024 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants